https://typecraft.org/w/api.php?action=feedcontributions&user=Asger+Hagerup&feedformat=atomTypeCraft - User contributions [en]2024-03-28T11:26:56ZUser contributionsMediaWiki 1.24.2https://typecraft.org/w/index.php?title=User:Asger_Hagerup&diff=10395User:Asger Hagerup2012-01-20T09:06:36Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:Asger.jpg|300px|thumbnail|left]]<br />
I started studying linguistics at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at [http://www.ntnu.no NTNU] in autumn 2006, and finished my Master's degree in spring 2011. My [http://ntnu.diva-portal.org/smash/get/diva2:419552/FULLTEXT01 Master's thesis] was on the phonology of [http://www.ethnologue.com/show_language.asp?code=kal Western Greenlandic]. In addition to studying at NTNU, I have worked there as a teaching assistant in an introductory course in basic linguistics and phonetics and in a course on Old Norse/Norwegian diachronics.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=User:Asger_Hagerup&diff=10386User:Asger Hagerup2012-01-18T14:08:33Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:Asger.jpg|300px|thumbnail|left]]<br />
I started studying linguistics at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at [http://www.ntnu.no NTNU] in autumn 2006, and finished my Master's degree in spring 2011. My [http://ntnu.diva-portal.org/smash/get/diva2:419552/FULLTEXT01 Master's thesis] was on the phonology of [http://www.ethnologue.com/show_language.asp?code=kal Western Greenlandic]. In addition to studying at NTNU, I have worked there as a teaching assistant in an introductory course in basic linguistics and phonetics and in a course on Old Norse/Norwegian diachronics.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=User:Asger_Hagerup&diff=10314User:Asger Hagerup2011-12-11T14:23:08Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:Asger.jpg|300px|thumbnail|left]]<br />
I started studying linguistics at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at [http://www.ntnu.no NTNU] in autumn 2006, and finished my Master's degree in spring 2011. My [http://ntnu.diva-portal.org/smash/get/diva2:419552/FULLTEXT01 Master's thesis] was on the phonology of [http://www.ethnologue.com/show_language.asp?code=kal Western Greenlandic]. In addition to studying at NTNU, I have worked there as a teaching assistant in an introductory course in basic linguistics and phonetics and in a course on Old Norse/Norwegian diachronics.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=User:Asger_Hagerup&diff=10313User:Asger Hagerup2011-12-11T14:22:14Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:Asger.jpg|300px|thumbnail|left]]<br />
I started studying linguistics at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at [http://www.ntnu.no NTNU] in autumn 2006, and finished my Master's degree in spring 2011. My [http://ntnu.diva-portal.org/smash/get/diva2:419552/FULLTEXT01 Master's thesis] was on the phonology of [http://www.ethnologue.com/show_language.asp?code=kal Western Greenlandic]. In addition to studying at NTNU, I have worked there as a teaching assistant in an introductory course in basic linguistics and phonetics and in a course on Old Norse/Norwegian diachronics.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4724Parallel Annotation of Speech and Text2010-04-23T12:04:17Z<p>Asger Hagerup: </p>
<hr />
<div>== Project Description==<br />
The goal of this short pilot was the parallel annotation of sound and text. The study was conducted by Professor [[User:Wim van Dommelen|Wim van Dommelen]] and Assoc. Professor [[User:Dorothee Beermann|Dorothee Beermann]] at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at the [http://www.ntnu.no Norwegian University of Science and Technology]. The scientific assistant for the project was [[User:Asger Hagerup|Asger Hagerup]]. The project was funded by [http://www.ntnu.no/hf/satsing/sstl the SSTL]. <br />
<br />
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining [http://www.fon.hum.uva.nl/praat/ Praat] and [[Main Page|TypeCraft]]. <br />
<br />
'''Praat''' is a signal analysis program developed by [http://www.fon.hum.uva.nl/paul/ Paul Boersma] and [http://www.fon.hum.uva.nl/david/ David Weenink] from the [http://www.example.com University of Amsterdam]. It is widely used for the annotation of sound objects. For the present study we have taken advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Using annotated tiers allows easy referencing of data across applications. At present our sound signal representations are static and selective; that is, they focus on the presentation of one selected feature to illustrate interesting correlations across phonetic and linguistic categories. Further funding will allow us to develop an interactive representation of speech data.<br />
<br />
On this page and the pages [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|Parallel Annotation of Speech and Text - Part 2]] and [[Parallel_Annotation_of_Speech_and_Text_-_Part_3|Parallel Annotation of Speech and Text - Part 3]] we present some of our data. A sample collection of annotated text can be found by following this link: [http://typecraft.org/TCEditor/965/ Parallel speech text annotation]. <br />
<br />
The corresponding Praat annotations can be found on this page and the following pages; we have embedded sound and TextGrid files which can be downloaded for further inspection in Praat. The data presented here allows, for example, the inspection of '''Cliticization''' (syntax), '''Vowel Reduction''' (phonology) and '''Voice Onset Time''' (phonetics) in Norwegian. The material covers three Norwegian dialects, which may be of particular interest in the context of dialectology. In each case morpho-syntactic and phonetic/phonological annotations are presented in parallel. Given a larger data set, our approach to speech and text annotation will allow a comparison of dialects that takes parameters from different fields of linguistics, as well as the phonetic annotation, into account.<br />
<br />
===Description of the material=== <br />
For our study we selected 10 sentences from the phonetic database of the [http://www.sound2sense.eu/ Sound to Sense] project. <br />
<br />
To illustrate some of the differences between Norwegian dialects we looked at both segmental and suprasegmental phenomena that divide the Norwegian language into a Western and an Eastern dialect group. At the segmental level we can examine the pronunciation of the phoneme /r/. As documented in the sound data presented here, the Bergen (Western) speaker pronounces /r/ as a voiced uvular fricative, while the two other speakers (Eastern) pronounce the phoneme as a voiced alveolar tap (although the segment may also appear as an approximant in rapid speech for all three speakers). In addition, the Eastern Norwegian speakers show assimilation between /r/ and a following alveolar consonant: the consonant sequence surfaces as a retroflex version of the latter consonant. This is not the case for the Bergen speaker, for whom the two segments are preserved in the surface form.<br />
<br />
To illustrate a suprasegmental phenomenon we can look at the pitch contour for bisyllabic words with initial stress. For these words there are two possible pitch contours in Norwegian, with either two or three tones. These two pitch contours are commonly called toneme 1 and toneme 2, respectively. In [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|sentence 7]] we look more closely at how toneme 1 and 2 are used in inflection, but here we shall briefly look at how toneme 1 is realised in the different dialects.<br />
<br />
<br />
<br />
{| border="1"<br />
|+'''Tone realisation across Norwegian dialects'''<br />
|-<br />
| valign="top"|<br />
[[Image:Peker.jpg|thumb|left|350px|Bergen]]<br />
| valign="bottom"|<br />
[[Image:Døra.jpg|thumb|left|350px| Trondheim]]<br />
|}<br />
<br />
<br />
<br />
[[Image:Vasken.jpg|thumb|left|350px|Eastern Norwegian dialect (south of Trøndelag)]]<br />
<br />
'''Description of picture material'''<br />
<br />
The screenshots above and to the left illustrate three words represented using Praat. The data is taken from sentences 3, 7 and 9, respectively. The blue curve in the middle of each screenshot shows the fundamental frequency, or pitch, throughout the pronunciation of the word (it has gaps because unvoiced sounds do not have any pitch). Examining the pitch contour of the words, we see that the Bergen speaker pronounces /pe:ker/ with an HL pitch contour, i.e. a high tone on the first syllable and a low tone on the last syllable, while the pattern is the opposite (LH) for the Trondheim pronunciation of /dø:ra/. The last screenshot illustrates the pitch of the speaker of an Eastern Norwegian dialect south of Trøndelag, also with an LH tone contour on /vaska/. Because of the high tone on the stressed syllable, Western Norwegian dialects are often referred to as high-tone dialects and, conversely, Eastern Norwegian dialects as low-tone dialects. However, there are differences between dialects in the same group as well: comparing the Trondheim speaker and the other Eastern dialect, we see that the former has a gradual rise from L to H, while the latter has a more abrupt rise at the end of the word.<br />
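The HL/LH distinction described above can be illustrated with a small sketch. It is only a rough approximation, assuming that per-syllable F0 samples (in Hz) have already been exported from Praat; the function name <code>contour_label</code> and the sample values are hypothetical, and comparing syllable means is a simplification rather than the analysis used in the pilot.

```python
from statistics import mean

def contour_label(first_syllable_f0, last_syllable_f0):
    """Label a bisyllabic word as 'HL' or 'LH' from per-syllable F0 samples (Hz).

    Unvoiced frames are passed as None and skipped, mirroring the gaps
    in the pitch curve. Equal means are labelled 'LH' by convention here.
    """
    first = mean(v for v in first_syllable_f0 if v is not None)
    last = mean(v for v in last_syllable_f0 if v is not None)
    return "HL" if first > last else "LH"

# Made-up F0 values, for illustration only.
print(contour_label([210, 205, None, 198], [160, 155, 150]))
print(contour_label([150, None, 155], [190, 200, 210]))
```

A comparison of means cannot distinguish a gradual rise from an abrupt one; for that, the shape of the curve within the last syllable would have to be inspected as well.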
<br />
<br />
<br />
<br />
<br />
==Speaker dialect: ''Bergen''==<br />
<br />
''Sentence 1''<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for viewing in Praat ([[#Downloading Help|Downloading Help]]):<br />
<br />
*[[Media:PSTA01.mp3|Sound]]<br />
*[[Media:PSTA01.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 2''<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
* [[Media:PSTA02.mp3|Sound]]<br />
* [[Media:PSTA02.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 3''<br />
<phrase>10905</phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
*[[Media:PSTA03.mp3|Sound]]<br />
*[[Media:PSTA03.txt|TextGrid]]<br />
<br />
==Speaker Dialect: Trondheim==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
==Speaker Dialect: Eastern Norway==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
==About the TextGrid files==<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. Each TextGrid file consists of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (showing underlying segments) and 'Note' (showing surface realisation in IPA symbols, and other notes). In the phoneme tier, a hash (#) represents a word boundary, and a segment inside <angle brackets> is an underlying segment that is syncopated or otherwise missing in the surface form.<br />
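The phoneme-tier conventions can be sketched in a few lines of Python. This is a minimal illustration, assuming that the tier content has been joined into one string with segments separated by spaces; that layout, the function name <code>parse_phoneme_tier</code> and the example string are assumptions, not the format of the downloadable files.

```python
import re

def parse_phoneme_tier(label):
    """Split a phoneme-tier string into words of (segment, realised) pairs.

    '#' marks a word boundary; '<x>' marks an underlying segment x that is
    missing in the surface form (realised = False).
    """
    words, current = [], []
    for token in label.split():
        if token == "#":                      # word boundary
            if current:
                words.append(current)
                current = []
            continue
        deleted = re.fullmatch(r"<(.+)>", token)
        if deleted:                           # segment absent on the surface
            current.append((deleted.group(1), False))
        else:
            current.append((token, True))
    if current:                               # last word without closing '#'
        words.append(current)
    return words

# Hypothetical tier content, for illustration only.
print(parse_phoneme_tier("# d ø: r a # <e> r #"))
```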
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br><br />
<br />
The note tier may also show an IPA symbol inside square brackets; this represents the actual realisation of the underlying segment(s).<br />
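As a sketch of how the 'Note' tier could be read programmatically, the glosses listed above can be kept in a lookup table and bracketed IPA symbols extracted with a regular expression. The assumption that glosses and bracketed symbols are separated by whitespace within one label, as well as the names <code>GLOSSES</code> and <code>decode_note</code>, are ours for illustration.

```python
import re

# Gloss abbreviations as listed for the 'Note' tier.
GLOSSES = {
    "BrV": "segment realised with breathy voice",
    "CrV": "segment realised with creaky voice",
    "DV": "underlying voiced segment realised devoiced",
    "EPN": "epenthesis",
    "RD": "reduction of segment",
    "V": "underlying non-voiced segment realised voiced",
    "CL": "clitic",
    "ERR": "the speaker errs and corrects himself",
    "HES": "(audible) hesitation from speaker",
}

def decode_note(note):
    """Return (expanded glosses, IPA realisations) found in a 'Note' label."""
    ipa = re.findall(r"\[([^\]]+)\]", note)         # symbols in [brackets]
    rest = re.sub(r"\[[^\]]+\]", " ", note)         # drop bracketed parts
    glosses = [GLOSSES[t] for t in rest.split() if t in GLOSSES]
    return glosses, ipa

# Hypothetical label: a reduced vowel realised as schwa.
print(decode_note("RD [ə]"))
```

Tokens that are not in the table are ignored rather than reported, so free-text notes in the tier do not cause errors.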
<br />
==Downloading Help==<br />
When you click the file links called '''Sound''' and '''TextGrid''', the files open in a separate window in your browser.<br />
<br />
Go to *FILE*, right-click and select *Save this Page as*.<br />
<br />
You can now save the file to a location of your choice in your home directory.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4723Parallel Annotation of Speech and Text2010-04-23T11:59:58Z<p>Asger Hagerup: </p>
<hr />
<div>== Project Description==<br />
The goal of this short pilot was the parallel annotation of sound and text. The study was conducted by Professor [[User:Wim van Dommelen|Wim van Dommelen]] and Assoc. Professor [[User:Dorothee Beermann|Dorothee Beermann]] at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at the [http://www.ntnu.no Norwegian University of Science and Technology]. The scientific assistant for the project was [[User:Asger Hagerup|Asger Hagerup]]. The project was funded by [http://www.ntnu.no/hf/satsing/sstl the SSTL]. <br />
<br />
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining [http://www.fon.hum.uva.nl/praat/ Praat] and [[Main Page|TypeCraft]]. <br />
<br />
'''Praat''' is a signal analysis program developed by [http://www.fon.hum.uva.nl/paul/ Paul Boersma] and [http://www.fon.hum.uva.nl/david/ David Weenink] from the [http://www.example.com University of Amsterdam]. It is widely used for the annotation of sound objects. For the present study we have taken advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Using annotated tiers allows easy referencing of data across applications. At present our sound signal representations are static and selective; that is, they focus on the presentation of one selected feature to illustrate interesting correlations across phonetic and linguistic categories. Further funding will allow us to develop an interactive representation of speech data.<br />
<br />
On this page and the pages [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|Parallel Annotation of Speech and Text - Part 2]] and [[Parallel_Annotation_of_Speech_and_Text_-_Part_3|Parallel Annotation of Speech and Text - Part 3]] we present some of our data. A sample collection of annotated text can be found by following this link: [http://typecraft.org/TCEditor/965/ Parallel speech text annotation]. <br />
<br />
The corresponding Praat annotations can be found on this page and the following pages; we have embedded sound and TextGrid files which can be downloaded for further inspection in Praat. The data presented here allows, for example, the inspection of '''Cliticization''' (syntax), '''Vowel Reduction''' (phonology) and '''Voice Onset Time''' (phonetics) in Norwegian. The material covers three Norwegian dialects, which may be of particular interest in the context of dialectology. In each case morpho-syntactic and phonetic/phonological annotations are presented in parallel. Given a larger data set, our approach to speech and text annotation will allow a comparison of dialects that takes parameters from different fields of linguistics, as well as the phonetic annotation, into account.<br />
<br />
===Description of the material=== <br />
For our study we selected 10 sentences from the phonetic database of the [http://www.sound2sense.eu/ Sound to Sense] project. <br />
<br />
To illustrate some of the differences between Norwegian dialects we looked at both segmental and suprasegmental phenomena that divide the Norwegian language into a Western and an Eastern dialect group. At the segmental level we can examine the pronunciation of the phoneme /r/. As documented in the sound data presented here, the Bergen (Western) speaker pronounces /r/ as a voiced uvular fricative, while the two other speakers (Eastern) pronounce the phoneme as a voiced alveolar tap (although the segment may also appear as an approximant in rapid speech for all three speakers). In addition, the Eastern Norwegian speakers show assimilation between /r/ and a following alveolar consonant: the consonant sequence surfaces as a retroflex version of the latter consonant. This is not the case for the Bergen speaker, for whom the two segments are preserved in the surface form.<br />
<br />
To illustrate a suprasegmental phenomenon we can look at the pitch contour for bisyllabic words with initial stress. For these words there are two possible pitch contours in Norwegian, with either two or three tones. These two pitch contours are commonly called toneme 1 and toneme 2, respectively. In [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|sentence 7]] we look more closely at how toneme 1 and 2 are used in inflection, but here we shall briefly look at how toneme 1 is realised in the different dialects.<br />
<br />
<br />
<br />
{| border="1"<br />
|+'''Tone realisation across Norwegian dialects'''<br />
|-<br />
| valign="top"|<br />
[[Image:Peker.jpg|thumb|left|350px|Bergen]]<br />
| valign="bottom"|<br />
[[Image:Døra.jpg|thumb|left|350px| Trondheim]]<br />
|}<br />
<br />
<br />
<br />
[[Image:Vasken.jpg|thumb|left|350px|Eastern Norwegian dialect (south of Trøndelag)]]<br />
<br />
'''Description of picture material'''<br />
<br />
The screenshots above and to the left illustrate three words represented using Praat. The data is taken from sentences 3, 7 and 9, respectively. The blue curve in the middle of each screenshot shows the fundamental frequency, or pitch, throughout the pronunciation of the word (it has gaps because unvoiced sounds do not have any pitch). Examining the pitch contour of the words, we see that the Bergen speaker pronounces /pe:ker/ with an HL pitch contour, i.e. a high tone on the first syllable and a low tone on the last syllable, while the pattern is the opposite (LH) for the Trondheim pronunciation of /dø:ra/. The last screenshot illustrates the pitch of the speaker of an Eastern Norwegian dialect south of Trøndelag, also with an LH tone contour on /vaska/. Because of the high tone on the stressed syllable, Western Norwegian dialects are often referred to as high-tone dialects and, conversely, Eastern Norwegian dialects as low-tone dialects. However, there are differences between dialects in the same group as well: comparing the Trondheim speaker and the other Eastern dialect, we see that the former has a gradual rise from L to H, while the latter has a more abrupt rise at the end of the word.<br />
<br />
<br />
<br />
<br />
<br />
==Speaker dialect: ''Bergen''==<br />
<br />
''Sentence 1''<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for viewing in Praat ([[#Downloading Help|Downloading Help]]):<br />
<br />
*[[Media:PSTA01.mp3|Sound]]<br />
*[[Media:PSTA01.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 2''<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
* [[Media:PSTA02.mp3|Sound]]<br />
* [[Media:PSTA02.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 3''<br />
<phrase>10905</phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
*[[Media:PSTA03.mp3|Sound]]<br />
*[[Media:PSTA03.txt|TextGrid]]<br />
<br />
==Speaker Dialect: Trondheim==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
==Speaker Dialect: Eastern Norway==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
==About the TextGrid files==<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. Each TextGrid file consists of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (showing underlying segments) and 'Note' (showing surface realisation in IPA symbols, and other notes). In the phoneme tier, a hash (#) represents a word boundary, and a segment inside <angle brackets> is an underlying segment that is syncopated or otherwise missing in the surface form.<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br><br />
<br />
The note tier may also show an IPA symbol inside square brackets; this represents the actual realisation of the underlying segment(s).<br />
<br />
==Downloading Help==<br />
When you click the file links called '''Sound''' and '''TextGrid''', the files open in a separate window in your browser.<br />
<br />
Go to *FILE*, right-click and select *Save this Page as*.<br />
<br />
You can now save the file to a location of your choice in your home directory.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text_-_Part_2&diff=4722Parallel Annotation of Speech and Text - Part 22010-04-23T11:58:57Z<p>Asger Hagerup: </p>
<hr />
<div>go back to [[Parallel Annotation of Speech and Text]]<br />
<br />
go to [[Parallel_Annotation_of_Speech_and_Text_-_Part_3|Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
<br />
'''Speaker dialect: Trondheim'''<br />
<br />
''Sentence 4''<br />
<Phrase>10906</Phrase><br />
<flashmp3>PSTA04.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: <br />
*[[Media:PSTA04.mp3| Sound]] <br />
*[[Media:PSTA04.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 5''<br />
<Phrase>10907</Phrase><br />
<flashmp3>PSTA05.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA05.mp3|Sound]] <br />
*[[Media:PSTA05.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 6''<br />
<Phrase>10908</Phrase><br />
<flashmp3>PSTA06.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA06.mp3|Sound]]<br />
*[[Media:PSTA06.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 7''<br />
<Phrase>10909</Phrase><br />
<br><br />
An interesting phenomenon in the sentence above is how toneme 1 and toneme 2 are used in inflection.<br />
This is best documented by combining morphological annotation with speech annotation. Only the latter allows a look at the actual pronunciation. <br />
<br><br />
<br />
{| border="1"<br />
|+'''Difference in tone contour as a result of inflection'''<br />
|-<br />
| valign="top"|<br />
[[Image:Taket.jpg|thumb|left|350px|/ta:k+e/]]<br />
| valign="bottom"|<br />
[[Image:Pære.jpg|thumb|left|350px|/pæ:r+e/]]<br />
|}<br />
<br><br />
Above are screenshots of two of the words from the sentence, “taket” and “pære” in orthography, viewed in the Praat application. The latter word is part of a compound, but for the sake of simplicity we can disregard this. The pitch is shown as a blue curve in the middle. Phonemically these words are transcribed /ta:k+e/ and /pæ:r+e/, where the suffix /e/ denotes definite singular in the first word (“the roof”) and indefinite singular in the second (“bulb”). It would seem that we have two different morphemes realised by the same morph /e/; however, this is not the case. As we can see, the inflected words have different pitch contours: in /ta:k+e/ there is an HLH contour (the word has toneme 2), while in /pæ:r+e/ there is an LH contour (toneme 1). One way to analyse this is to say that the suffix /e/ in /ta:k+e/ carries an extra tone (or tone-bearing unit) with it, while the suffix /e/ in /pæ:r+e/ does not. This means that the two suffixes are not phonologically the same.<br />
<br />
<flashmp3>PSTA07.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA07.mp3|Sound]]<br />
*[[Media:PSTA07.txt|TextGrid]]</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text_-_Part_3&diff=4721Parallel Annotation of Speech and Text - Part 32010-04-23T11:57:58Z<p>Asger Hagerup: </p>
<hr />
<div>go back to [[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
<br />
go back to [[Parallel Annotation of Speech and Text]]<br />
<br />
<br />
'''Speaker dialect: Eastern Norway'''<br />
<br />
''Sentence 8''<br />
<Phrase>10910</Phrase><br />
<flashmp3>PSTA08.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA08.mp3|Sound]]<br />
*[[Media:PSTA08.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 9''<br />
<Phrase>10911</Phrase><br />
<flashmp3>PSTA09.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA09.mp3|Sound]] <br />
*[[Media:PSTA09.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
''Sentence 10''<br />
<Phrase>10912</Phrase><br />
<flashmp3>PSTA10.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA10.mp3|Sound]]<br />
*[[Media:PSTA10.txt|TextGrid]]</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text_-_Part_2&diff=4720Parallel Annotation of Speech and Text - Part 22010-04-23T11:56:15Z<p>Asger Hagerup: </p>
<hr />
<div>go back to [[Parallel Annotation of Speech and Text]]<br />
<br />
go to [[Parallel_Annotation_of_Speech_and_Text_-_Part_3|Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
<br />
'''Speaker dialect: Trondheim'''<br />
<br />
<br><br />
'''Sentence 4'''<br />
<Phrase>10906</Phrase><br />
<flashmp3>PSTA04.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: <br />
*[[Media:PSTA04.mp3| Sound]] <br />
*[[Media:PSTA04.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 5'''<br />
<Phrase>10907</Phrase><br />
<flashmp3>PSTA05.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA05.mp3|Sound]] <br />
*[[Media:PSTA05.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 6'''<br />
<Phrase>10908</Phrase><br />
<flashmp3>PSTA06.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA06.mp3|Sound]]<br />
*[[Media:PSTA06.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 7'''<br />
<Phrase>10909</Phrase><br />
<br><br />
An interesting phenomenon in the sentence above is how toneme 1 and toneme 2 are used in inflection.<br />
This is best documented by combining morphological annotation with speech annotation. Only the latter allows a look at the actual pronunciation. <br />
<br><br />
<br />
{| border="1"<br />
|+'''Difference in tone contour as a result of inflection'''<br />
|-<br />
| valign="top"|<br />
[[Image:Taket.jpg|thumb|left|350px|/ta:k+e/]]<br />
| valign="bottom"|<br />
[[Image:Pære.jpg|thumb|left|350px|/pæ:r+e/]]<br />
|}<br />
<br><br />
Above are screenshots of two of the words from the sentence, “taket” and “pære” in orthography, viewed in the Praat application. The latter word is part of a compound, but for the sake of simplicity we can disregard this. The pitch is shown as a blue curve in the middle. Phonemically these words are transcribed /ta:k+e/ and /pæ:r+e/, where the suffix /e/ denotes definite singular in the first word (“the roof”) and indefinite singular in the second (“bulb”). It would seem that we have two different morphemes realised by the same morph /e/; however, this is not the case. As we can see, the inflected words have different pitch contours: in /ta:k+e/ there is an HLH contour (the word has toneme 2), while in /pæ:r+e/ there is an LH contour (toneme 1). One way to analyse this is to say that the suffix /e/ in /ta:k+e/ carries an extra tone (or tone-bearing unit) with it, while the suffix /e/ in /pæ:r+e/ does not. This means that the two suffixes are not phonologically the same.<br />
<br />
<flashmp3>PSTA07.mp3</flashmp3><br><br />
Download files for viewing in the Praat application:<br />
*[[Media:PSTA07.mp3|Sound]]<br />
*[[Media:PSTA07.txt|TextGrid]]</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4719Parallel Annotation of Speech and Text2010-04-23T11:50:51Z<p>Asger Hagerup: </p>
<hr />
<div>== Project Description==<br />
The goal of this short pilot was the parallel annotation of sound and text. The study was conducted by Professor [[User:Wim van Dommelen|Wim van Dommelen]] and Assoc. Professor [[User:Dorothee Beermann|Dorothee Beermann]] at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at the [http://www.ntnu.no Norwegian University of Science and Technology]. The scientific assistant for the project was [[User:Asger Hagerup|Asger Hagerup]]. The project was funded by [http://www.ntnu.no/hf/satsing/sstl the SSTL]. <br />
<br />
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining [http://www.fon.hum.uva.nl/praat/ Praat] and [[Main Page|TypeCraft]]. <br />
<br />
'''Praat''' is a signal analysis program developed by [http://www.fon.hum.uva.nl/paul/ Paul Boersma] and [http://www.fon.hum.uva.nl/david/ David Weenink] from the [http://www.example.com University of Amsterdam]. It is widely used for the annotation of sound objects. For the present study we have taken advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Using annotated tiers allows easy referencing of data across applications. At present our sound signal representations are static and selective; that is, they focus on the presentation of one selected feature to illustrate interesting correlations across phonetic and linguistic categories. Further funding will allow us to develop an interactive representation of speech data.<br />
<br />
On this page and the pages [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|Parallel Annotation of Speech and Text - Part 2]] and [[Parallel_Annotation_of_Speech_and_Text_-_Part_3|Parallel Annotation of Speech and Text - Part 3]] we present some of our data. A sample collection of annotated text can be found by following this link: [http://typecraft.org/TCEditor/965/ Parallel speech text annotation]. <br />
<br />
The corresponding Praat annotations can be found on this page and the following pages; we have embedded sound and TextGrid files which can be downloaded for further inspection in Praat. The data presented here allows, for example, the inspection of '''Cliticization''' (syntax), '''Vowel Reduction''' (phonology) and '''Voice Onset Time''' (phonetics) in Norwegian. The material covers three Norwegian dialects, which may be of particular interest in the context of dialectology. In each case morpho-syntactic and phonetic/phonological annotations are presented in parallel. Given a larger data set, our approach to speech and text annotation will allow a comparison of dialects that takes parameters from different fields of linguistics, as well as the phonetic annotation, into account.<br />
<br />
===Description of the material=== <br />
For our study we selected 10 sentences from the phonetic database of the [http://www.sound2sense.eu/ Sound to Sense] project. <br />
<br />
To illustrate some of the differences between Norwegian dialects we looked at both segmental and suprasegmental phenomena that divide the Norwegian language into a Western and an Eastern dialect group. At the segmental level we can examine the pronunciation of the phoneme /r/. As documented in the sound data presented here, the Bergen (Western) speaker pronounces /r/ as a voiced uvular fricative, while the two other speakers (Eastern) pronounce the phoneme as a voiced alveolar tap (although the segment may also appear as an approximant in rapid speech for all three speakers). In addition, the Eastern Norwegian speakers show assimilation between /r/ and a following alveolar consonant: the consonant sequence surfaces as a retroflex version of the latter consonant. This is not the case for the Bergen speaker, for whom the two segments are preserved in the surface form.<br />
<br />
To illustrate a suprasegmental phenomenon we can look at the pitch contour for bisyllabic words with initial stress. For these words there are two possible pitch contours in Norwegian, with either two or three tones. These two pitch contours are commonly called toneme 1 and toneme 2, respectively. In [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|sentence 7]] we look more closely at how toneme 1 and 2 are used in inflection, but here we shall briefly look at how toneme 1 is realised in the different dialects.<br />
<br />
<br />
<br />
{| border="1"<br />
|+'''Tone realisation across Norwegian dialects'''<br />
|-<br />
| valign="top"|<br />
[[Image:Peker.jpg|thumb|left|350px|Bergen]]<br />
| valign="bottom"|<br />
[[Image:Døra.jpg|thumb|left|350px| Trondheim]]<br />
|}<br />
<br />
<br />
<br />
[[Image:Vasken.jpg|thumb|left|350px|Eastern Norwegian dialect (south of Trøndelag)]]<br />
<br />
'''Description of picture material'''<br />
<br />
The screenshots above and to the left illustrate three words represented using Praat. The data is taken from sentences 3, 7 and 9, respectively. The blue curve in the middle of each screenshot shows the fundamental frequency, or pitch, throughout the pronunciation of the word (it has gaps because unvoiced sounds do not have any pitch). Examining the pitch contour of the words we see that the Bergen speaker pronounces /pe:ker/ with an HL pitch contour, i.e. a high tone on the first syllable and a low tone on the last syllable, while the pattern is the opposite (LH) for the Trondheim pronunciation of /dø:ra/. The last screenshot illustrates the pitch of the speaker of an Eastern Norwegian dialect south of Trøndelag, also with an LH tone contour on /vaska/. Because of the high tone on the stressed syllable, Western Norwegian dialects are often referred to as high-tone dialects and, conversely, Eastern Norwegian dialects as low-tone dialects. However, there are differences between dialects in the same group as well: comparing the Trondheim speaker and the other Eastern dialect, we see that the former has a gradual rise from L to H, while the latter has a more abrupt rise at the end of the word.<br />
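
The HL versus LH distinction described above can be illustrated with a small Python sketch. This is not part of the project's tooling: the function name <code>contour_label</code>, the frame-based input and the syllable boundary index are assumptions made for illustration, and the F0 values in the usage note are invented rather than measured. Unvoiced frames, which appear as gaps in the pitch curve, are represented as <code>None</code>.

```python
from statistics import mean
from typing import Optional, Sequence


def contour_label(f0: Sequence[Optional[float]], boundary: int) -> str:
    """Label a bisyllabic word as "HL" or "LH".

    f0       -- F0 values in Hz, one per analysis frame; None marks an
                unvoiced frame (a gap in the pitch curve).
    boundary -- index of the first frame of the second syllable
                (hypothetical input; it would come from the annotation).
    """
    first = [v for v in f0[:boundary] if v is not None]
    last = [v for v in f0[boundary:] if v is not None]
    if not first or not last:
        raise ValueError("each syllable needs at least one voiced frame")
    # Crude simplification: compare the mean pitch of the two syllables.
    return "HL" if mean(first) > mean(last) else "LH"
```

With invented values, <code>contour_label([220, 215, None, 180, 170], 3)</code> gives <code>"HL"</code> and <code>contour_label([170, None, 175, 210, 230], 3)</code> gives <code>"LH"</code>. Comparing syllable means cannot distinguish a gradual rise from an abrupt one; that would require looking at the shape of the curve within the word.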
<br />
<br />
<br />
<br />
<br />
==Speaker dialect: ''Bergen''==<br />
<br />
'''Sentence 1'''<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for viewing in Praat ([[#Downloading Help|Downloading Help]]):<br />
<br />
*[[Media:PSTA01.mp3|Sound]]<br />
*[[Media:PSTA01.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 2'''<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
* [[Media:PSTA02.mp3|Sound]], <br />
* [[Media:PSTA02.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 3'''<br />
<phrase>10905</phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
*[[Media:PSTA03.mp3|Sound]], <br />
*[[Media:PSTA03.txt|TextGrid]]<br />
<br />
==Speaker Dialect: Trondheim==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
==Speaker Dialect: Eastern Norway==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
==About the TextGrid files==<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes). In the phoneme tier, a hash (#) represents a word boundary and a segment inside <angle brackets> is an underlying segment that is syncopated or otherwise missing in the surface form.<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br><br />
<br />
The note tier may also show an IPA symbol inside square brackets; this represents the actual realisation of the underlying segment(s).<br />
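
The phoneme-tier conventions described above (a hash for a word boundary, angle brackets for an underlying segment missing in the surface form) can be read mechanically. The following Python sketch assumes, purely for illustration, that the tier content is available as one space-separated string of symbols; the actual TextGrid files may store each segment as a separate interval, and the function name <code>parse_phoneme_string</code> and the sample string are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, bool]  # (symbol, realised in the surface form?)


def parse_phoneme_string(tier: str) -> List[List[Segment]]:
    """Split a space-separated phoneme string into words and segments.

    '#' closes the current word; '<x>' is an underlying segment that
    is missing in the surface form and is returned as ('x', False).
    """
    words: List[List[Segment]] = []
    current: List[Segment] = []
    for token in tier.split():
        if token == "#":
            if current:  # ignore empty words from repeated boundaries
                words.append(current)
            current = []
        elif token.startswith("<") and token.endswith(">") and len(token) > 2:
            current.append((token[1:-1], False))
        else:
            current.append((token, True))
    if current:
        words.append(current)
    return words
```

For the made-up input <code>"d e <t> # e r"</code> the function returns two words, the first of which ends in an unrealised <code>t</code>.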
<br />
==Downloading Help==<br />
When clicking on the file links called '''Sound''' and '''TextGrid''' the files will open in a separate window in your browser.<br />
<br />
Go to *FILE*, right click and select *Save this Page as*.<br />
<br />
You now are able to save the file to a place of your choice in your home directory.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4718Parallel Annotation of Speech and Text2010-04-23T11:50:06Z<p>Asger Hagerup: </p>
<hr />
<div>== Project Description==<br />
The goal of this short pilot has been parallel sound and text annotation. The study has been conducted by Professor [[User:Wim van Dommelen|Wim van Dommelen]] and Assoc. Professor [[User:Dorothee Beermann|Dorothee Beermann]] at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at the [http://www.ntnu.no Norwegian University of Science and Technology]. The scientific assistant for the project was [[User:Asger Hagerup|Asger Hagerup]]. The project has been funded by [http://www.ntnu.no/hf/satsing/sstl the SSTL]. <br />
<br />
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining [http://www.fon.hum.uva.nl/praat/ Praat] and [[Main Page|TypeCraft]]. <br />
<br />
'''Praat''' is signal analysis software developed by [http://www.fon.hum.uva.nl/paul/ Paul Boersma] and [http://www.fon.hum.uva.nl/david/ David Weenink] from the [http://www.example.com University of Amsterdam]. It is a tool widely used for the annotation of sound objects. For the present study we have taken advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Using annotated tiers allows easy referencing of data across applications. At present our sound signal representations are static and selective; that is, they focus on the presentation of one selected feature to illustrate interesting correlations across phonetic and linguistic categories. Further funding will allow us to develop an interactive representation of speech data.<br />
<br />
On this page and the pages [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|Parallel Annotation of Speech and Text - Part 2]] and [[Parallel_Annotation_of_Speech_and_Text_-_Part_3|Parallel Annotation of Speech and Text - Part 3]] we present some of our data. A sample collection of annotated text can be found by following this link: [http://typecraft.org/TCEditor/965/ Parallel speech text annotation]. <br />
<br />
The corresponding Praat annotations can be found on this and the following pages; we have embedded sound and TextGrid files which can be downloaded for further inspection in Praat. The data presented here allow, for example, the inspection of '''Cliticization''' (syntax), '''Vowel Reduction''' (phonology) and '''Voice Onset Time''' (phonetics) in Norwegian. Three Norwegian dialects are represented, which may be of particular interest in the context of dialectology. In each case morpho-syntactic and phonetic/phonological annotation are presented in parallel. On the basis of a larger data set, our approach to speech and text annotation will allow a comparison of dialects that takes parameters from different fields of linguistics, as well as the phonetic annotation, into account.<br />
<br />
===Description of the material=== <br />
For our study we selected 10 sentences from the phonetic database of the [http://www.sound2sense.eu/ Sound to Sense] project. <br />
<br />
To illustrate some of the differences between Norwegian dialects we looked at both segmental and suprasegmental phenomena that divide the Norwegian language into a Western and an Eastern dialect group. On the segmental level we can examine the pronunciation of the phoneme /r/. As documented in the sound data presented here, the Bergen (Western) speaker pronounces /r/ as a voiced uvular fricative, while the two other speakers (Eastern) pronounce the phoneme as a voiced alveolar tap (although the segment may also appear as an approximant in rapid speech for all three speakers). In addition, the Eastern Norwegian speakers assimilate /r/ and a following alveolar consonant: the consonant sequence surfaces as a retroflex version of the latter consonant. This is not the case for the Bergen speaker, where the two segments are preserved in the surface form.<br />
<br />
To illustrate a suprasegmental phenomenon we can look at the pitch contour for bisyllabic words with initial stress. For these words there are two possible pitch contours in Norwegian, with either two or three tones. These two pitch contours are commonly called toneme 1 and toneme 2, respectively. In [[Parallel_Annotation_of_Speech_and_Text_-_Part_2|sentence 7]] we look more closely at how toneme 1 and 2 are used in inflection, but here we shall briefly look at how toneme 1 is realised in the different dialects.<br />
<br />
<br />
<br />
{| border="1"<br />
|+'''Tone realisation across Norwegian dialects'''<br />
|-<br />
| valign="top"|<br />
[[Image:Peker.jpg|thumb|left|350px|Bergen]]<br />
| valign="bottom"|<br />
[[Image:Døra.jpg|thumb|left|350px| Trondheim]]<br />
|}<br />
<br />
<br />
<br />
[[Image:Vasken.jpg|thumb|left|350px|Eastern Norwegian dialect (south of Trøndelag)]]<br />
<br />
'''Description of picture material'''<br />
<br />
The screenshots above and to the left illustrate three words represented using Praat. The data is taken from sentences 3, 7 and 9, respectively. The blue curve in the middle of each screenshot shows the fundamental frequency, or pitch, throughout the pronunciation of the word (it has gaps because unvoiced sounds do not have any pitch). Examining the pitch contour of the words we see that the Bergen speaker pronounces /pe:ker/ with an HL pitch contour, i.e. a high tone on the first syllable and a low tone on the last syllable, while the pattern is the opposite (LH) for the Trondheim pronunciation of /dø:ra/. The last screenshot illustrates the pitch of the speaker of an Eastern Norwegian dialect south of Trøndelag, also with an LH tone contour on /vaska/. Because of the high tone on the stressed syllable, Western Norwegian dialects are often referred to as high-tone dialects and, conversely, Eastern Norwegian dialects as low-tone dialects. However, there are differences between dialects in the same group as well: comparing the Trondheim speaker and the other Eastern dialect, we see that the former has a gradual rise from L to H, while the latter has a more abrupt rise at the end of the word.<br />
<br />
<br />
<br />
<br />
<br />
==Speaker dialect: ''Bergen''==<br />
<br />
'''Sentence 1'''<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for viewing in Praat ([[#Downloading Help|Downloading Help]]):<br />
<br />
*[[Media:PSTA01.mp3|Sound]]<br />
*[[Media:PSTA01.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 2'''<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
* [[Media:PSTA02.mp3|Sound]], <br />
* [[Media:PSTA02.txt|TextGrid]]<br />
<br />
<br><br />
<br><br />
'''Sentence 3'''<br />
<phrase>10905</phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
*[[Media:PSTA03.mp3|Sound]], <br />
*[[Media:PSTA03.txt|TextGrid]]<br />
<br />
==Speaker Dialect: Trondheim==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
==Speaker Dialect: Eastern Norway==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
==About the TextGrid files==<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes). In the phoneme tier, a hash (#) represents a word boundary and a segment inside <angle brackets> is an underlying segment that is syncopated or otherwise missing in the surface form.<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br><br />
<br />
The note tier may also show an IPA symbol inside square brackets; this represents the actual realisation of the underlying segment(s).<br />
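
The gloss list above can be turned into a simple lookup table. The Python sketch below is only an illustration: the label layout (glosses and bracketed IPA symbols separated by spaces), the name <code>decode_note</code> and the sample labels are assumptions, not a description of how the project's TextGrid files are actually formatted.

```python
import re
from typing import List, Tuple

# Gloss abbreviations as listed for the 'Note' tier.
GLOSSES = {
    "BrV": "Segment realised with breathy voice",
    "CrV": "Segment realised with creaky voice",
    "DV": "Underlying voiced segment realised devoiced",
    "EPN": "Epenthesis",
    "RD": "Reduction of segment",
    "V": "Underlying non-voiced segment realised voiced",
    "CL": "Clitic",
    "ERR": "The speaker errs and corrects himself",
    "HES": "(Audible) hesitation from speaker",
}


def decode_note(label: str) -> Tuple[List[str], List[str]]:
    """Return (gloss descriptions, IPA realisations) for one note label."""
    # IPA symbols appear inside square brackets, e.g. "[ə]".
    ipa = re.findall(r"\[([^\]]*)\]", label)
    # Remove the bracketed parts, then look up the remaining tokens.
    rest = re.sub(r"\[[^\]]*\]", " ", label)
    glosses = [GLOSSES[t] for t in rest.split() if t in GLOSSES]
    return glosses, ipa
```

Under these assumptions, the hypothetical label <code>"RD [ə]"</code> decodes to the gloss "Reduction of segment" together with the surface realisation <code>ə</code>. Tokens that are not in the table are silently skipped, which may or may not be the desired behaviour for free-text notes.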
<br />
==Downloading Help==<br />
When clicking on the file links called '''Sound''' and '''TextGrid''' the files will open in a separate window in your browser.<br />
<br />
Go to *FILE*, right click and select *Save this Page as*.<br />
<br />
You now are able to save the file to a place of your choice in your home directory.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4677Parallel Annotation of Speech and Text2010-04-16T19:00:37Z<p>Asger Hagerup: /* About the TextGrid files */</p>
<hr />
<div><span style="color:red"> '''This page is under construction'''</span><br />
<br />
== Project Description==<br />
The goal of this short pilot has been parallel sound and text annotation. The study has been conducted by Professor [[User:Wim van Dommelen|Wim van Dommelen]] and Assoc. Professor [[User:Dorothee Beermann|Dorothee Beermann]] at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at the [http://www.ntnu.no Norwegian University of Science and Technology]. The scientific assistant for the project was [[User:Asger Hagerup|Asger Hagerup]]. The project has been funded by [http://www.ntnu.no/hf/satsing/sstl the SSTL]. <br />
<br />
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining [http://www.fon.hum.uva.nl/praat/ Praat] and [[Main Page|TypeCraft]]. <br />
<br />
'''Praat''' is signal analysis software developed by [http://www.fon.hum.uva.nl/paul/ Paul Boersma] and [http://www.fon.hum.uva.nl/david/ David Weenink] from the [http://www.example.com University of Amsterdam]. It is a tool widely used for the annotation of sound objects. For the present study we have taken advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Using annotated tiers allows easy referencing of data across applications. At present our sound signal representations are static and selective; that is, they focus on the presentation of one selected feature to illustrate interesting correlations across phonetic and linguistic categories. Further funding will allow us to develop an interactive representation of speech data in the near future.<br />
<br />
Below are some annotated sentences uttered by speakers of three different dialects of Norwegian, which could serve as data for linguists in many different fields. A syntactician working on, for example, how clitics are used cross-linguistically would find some of these sentences by searching for clitics in the TypeCraft database. A phonologist interested in vowel reduction would be able to download the sentences and TextGrid files to view them in Praat and get an idea of where and how vowel reduction happens in Norwegian. A phonetician interested in examining Voice Onset Time in Norwegian would also benefit from the data. For a linguist working with dialectology, the combination of morphosyntactic annotation with phonetic/phonological annotation would allow a thorough comparison of the dialects in any linguistic field.<br />
<br />
===Description of the material=== <br />
For our study we selected 10 sentences from the phonetic database of the [http://www.sound2sense.eu/ Sound to Sense] project. <br />
<br />
To illustrate some of the differences between Norwegian dialects we could look at both segmental and suprasegmental phenomena that are used in dividing the Norwegian language into a western and an eastern dialect group. On the segmental level we can examine the pronunciation of the phoneme /r/. As can be determined from the attached sound data, the Bergen (western) speaker pronounces this as a voiced uvular fricative, while the Trondheim and X (eastern) speakers pronounce it as a voiced alveolar tap (although the segment may also appear as an approximant in rapid speech for all three speakers). In addition, the Trondheim and X speakers assimilate /r/ and a following alveolar consonant: the consonant sequence surfaces as a retroflex version of the latter consonant. This is not the case for the Bergen speaker, where the two segments are preserved in the surface form.<br />
<br />
To illustrate a suprasegmental phenomenon we can look at the pitch contour for bisyllabic words with initial stress. For these words there are two possible pitch contours in Norwegian, with either two or three tones. These two pitch contours are commonly called toneme 1 and toneme 2, respectively. In one of the sentences uttered by the Trondheim speaker we look more closely at how toneme 1 and 2 are used in inflection, but here we shall briefly look at how toneme 1 is realised in the different dialects.<br />
<br />
[[Image:Peker.jpg|350px]]<br />
[[Image:Døra.jpg|350px]]<br />
[[Image:Vasken.jpg|350px]]<br />
<br />
Above are screenshots of three words taken from the sentences and viewed in the Praat application. The blue curve in the middle of the pictures shows the fundamental frequency, or pitch, throughout the pronunciation of the word (it has gaps because unvoiced sounds do not have any pitch). Examining the pitch contour of the words we see that the Bergen speaker pronounces /pe:ker/ with an HL pitch contour, i.e. a high tone on the first syllable and a low tone on the last syllable, while the pattern is the opposite (LH) for the Trondheim (/dø:ra/) and X (/vaska/) speakers. Because of the high tone on the stressed syllable, western Norwegian dialects are often referred to as high-tone dialects and, conversely, eastern Norwegian dialects as low-tone dialects. However, there are differences between dialects in the same group as well: comparing the Trondheim speaker and the X speaker, we see that the former has a gradual rise from L to H, while the latter has a more abrupt rise at the end of the word.<br />
<br />
'''Sentences 1 to 3'''<br />
<br />
==Speaker dialect: ''Bergen''==<br />
<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for viewing in Praat ([[#Downloading Help|Downloading Help]]):<br />
<br />
*[[Media:PSTA01.mp3|Sound]]<br />
*[[File:PSTA01.txt|TextGrid]]<br />
<br />
<br />
<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
* [[Media:PSTA02.mp3|Sound]], <br />
* [[Media:PSTA02.txt|TextGrid]]<br />
<br />
<br />
<phrase>10905</phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
*[[Media:PSTA03.mp3|Sound]], <br />
*[[Media:PSTA03.txt|TextGrid]]<br />
<br />
==Speaker Dialect: Trondheim==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
==Speaker Dialect: Eastern Norway==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
==About the TextGrid files==<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes). In the phoneme tier, a hash (#) represents a word boundary and a segment inside angle brackets (<>) is an underlying segment that is syncopated or otherwise missing in the surface form.<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br><br />
<br />
The note tier may also show an IPA symbol inside square brackets; this represents the actual realisation of the underlying segment(s).<br />
<br />
==Downloading Help==<br />
When clicking on the file links called '''Sound''' and '''TextGrid''' the files will open in a separate window in your browser.<br />
<br />
Go to *FILE*, right click and select *Save this Page as*.<br />
<br />
You now are able to save the file to a place of your choice in your home directory.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=User:Asger_Hagerup&diff=4676User:Asger Hagerup2010-04-16T18:49:31Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:Asger.jpg|300px|thumbnail|left]]<br />
I have been studying linguistics at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at [http://www.ntnu.no NTNU] since Autumn 2006. I am currently in the process of writing my Master's Thesis, which is going to be about phonology in [http://www.ethnologue.com/show_language.asp?code=kal Western Greenlandic]. In addition to studying at NTNU I have also worked there as a teaching assistant in an introductory course to basic linguistics and phonetics and a course in Old Norse/Norwegian diachronics.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=User:Asger_Hagerup&diff=4675User:Asger Hagerup2010-04-16T18:48:04Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:Asger.jpg|300px|thumbnail|left]]<br />
I have been studying linguistics at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at [http://www.ntnu.no NTNU] since Autumn 2006. I am currently in the process of writing my Master's Thesis, which is going to be about phonology in [http://www.ethnologue.com/show_language.asp?code=kal Western Greenlandic]. I have also worked as a teaching assistant in an introductory course to basic linguistics and phonetics and a course in Old Norse/Norwegian diachronics.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text_-_Part_2&diff=4674Parallel Annotation of Speech and Text - Part 22010-04-16T18:32:16Z<p>Asger Hagerup: </p>
<hr />
<div>go back to [[Parallel Annotation of Speech and Text]]<br />
<br />
<br />
Sentences 4 to 7:<br>Speaker dialect: Trondheim<br />
<br />
<Phrase>10906</Phrase><br />
<flashmp3>PSTA04.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA04.mp3]], TextGrid: [[File:PSTA04.txt]]<br />
<br><br />
<Phrase>10907</Phrase><br />
<flashmp3>PSTA05.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA05.mp3]], TextGrid: [[File:PSTA05.txt]]<br />
<br><br />
<Phrase>10908</Phrase><br />
<flashmp3>PSTA06.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA06.mp3]], TextGrid: [[File:PSTA06.txt]]<br />
<br><br />
<Phrase>10909</Phrase><br />
In this sentence we can look at how toneme 1 and toneme 2 are used in inflection. It is also a good example of how morphological annotation can be incomplete without a more thorough look at the actual pronunciation of words.<br />
<br />
[[Image:Taket.jpg|350px]]<br />
[[Image:Pære.jpg|350px]]<br />
<br />
Here we have screenshots of two of the words from the sentence, “taket” and “pære” in orthography, viewed in the Praat application. The latter word is part of a compound word, but for the sake of simplicity we can disregard this. The pitch is shown as a blue curve in the middle. Phonemically these words are transcribed /ta:k+e/ and /pæ:r+e/, where the suffix /e/ in the first word denotes definite singular (“the roof”) and in the second indefinite singular (“bulb”). It would seem that we have two different morphemes realised by the same morph /e/; however, this is not the case. As we can see, the inflected words have different pitch contours: in /ta:k+e/ there is an HLH contour (the word has toneme 2), while in /pæ:r+e/ there is an LH contour (toneme 1). One way to analyse this is to say that the suffix /e/ in /ta:k+e/ carries an extra tone (or tone-bearing unit) with it, while the suffix /e/ in /pæ:r+e/ does not. This means that the two suffixes are not phonologically the same.<br />
<br />
<flashmp3>PSTA07.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA07.mp3]], TextGrid: [[File:PSTA07.txt]]</div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4673Parallel Annotation of Speech and Text2010-04-16T18:29:14Z<p>Asger Hagerup: </p>
<hr />
<div><span style="color:red"> '''This page is under construction'''</span><br />
<br />
== Project Description==<br />
The goal of this short pilot has been parallel sound and text annotation. The study has been conducted by Professor [[User:Wim van Dommelen|Wim van Dommelen]] and Assoc. Professor [[User:Dorothee Beermann|Dorothee Beermann]] at the [http://www.ntnu.no/isk Institute of Languages and Communication Studies] at the [http://www.ntnu.no Norwegian University of Science and Technology]. The scientific assistant for the project was [[User:Asger Hagerup|Asger Hagerup]]. The project has been funded by [http://www.ntnu.no/hf/satsing/sstl the SSTL]. <br />
<br />
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining [http://www.fon.hum.uva.nl/praat/ Praat] and [[Main Page|TypeCraft]]. <br />
<br />
'''Praat''' is signal analysis software developed by [http://www.fon.hum.uva.nl/paul/ Paul Boersma] and [http://www.fon.hum.uva.nl/david/ David Weenink] from the [http://www.example.com University of Amsterdam]. It is a tool widely used for the annotation of sound objects. For the present study we have taken advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Using annotated tiers allows easy referencing of data across applications. At present our sound signal representations are static and selective; that is, they focus on the presentation of one selected feature to illustrate interesting correlations across phonetic and linguistic categories. Further funding will allow us to develop an interactive representation of speech data in the near future.<br />
<br />
Below are some annotated sentences uttered by speakers of three different dialects of Norwegian, which could serve as data for linguists in many different fields. A syntactician working on, for example, how clitics are used cross-linguistically would find some of these sentences by searching for clitics in the TypeCraft database. A phonologist interested in vowel reduction would be able to download the sentences and TextGrid files to view them in Praat and get an idea of where and how vowel reduction happens in Norwegian. A phonetician interested in examining Voice Onset Time in Norwegian would also benefit from the data. For a linguist working with dialectology, the combination of morphosyntactic annotation with phonetic/phonological annotation would allow a thorough comparison of the dialects in any linguistic field.<br />
<br />
===Description of the material=== <br />
For our study we selected 10 sentences from the phonetic database of the [http://www.sound2sense.eu/ Sound to Sense] project. <br />
<br />
To illustrate some of the differences between Norwegian dialects we could look at both segmental and suprasegmental phenomena that are used in dividing the Norwegian language into a western and an eastern dialect group. On the segmental level we can examine the pronunciation of the phoneme /r/. As can be determined from the attached sound data, the Bergen (western) speaker pronounces this as a voiced uvular fricative, while the Trondheim and X (eastern) speakers pronounce it as a voiced alveolar tap (although the segment may also appear as an approximant in rapid speech for all three speakers). In addition, the Trondheim and X speakers assimilate /r/ and a following alveolar consonant: the consonant sequence surfaces as a retroflex version of the latter consonant. This is not the case for the Bergen speaker, where the two segments are preserved in the surface form.<br />
<br />
To illustrate a suprasegmental phenomenon we can look at the pitch contour for bisyllabic words with initial stress. For these words there are two possible pitch contours in Norwegian, with either two or three tones. These two pitch contours are commonly called toneme 1 and toneme 2, respectively. In one of the sentences uttered by the Trondheim speaker we look more closely at how toneme 1 and 2 are used in inflection, but here we shall briefly look at how toneme 1 is realised in the different dialects.<br />
<br />
[[Image:Peker.jpg|350px]]<br />
[[Image:Døra.jpg|350px]]<br />
[[Image:Vasken.jpg|350px]]<br />
<br />
Above are screenshots of three words taken from the sentences and viewed in the Praat application. The blue curve in the middle of the pictures shows the fundamental frequency, or pitch, throughout the pronunciation of the word (it has gaps because unvoiced sounds do not have any pitch). Examining the pitch contour of the words we see that the Bergen speaker pronounces /pe:ker/ with an HL pitch contour, i.e. a high tone on the first syllable and a low tone on the last syllable, while the pattern is the opposite (LH) for the Trondheim (/dø:ra/) and X (/vaska/) speakers. Because of the high tone on the stressed syllable, western Norwegian dialects are often referred to as high-tone dialects and, conversely, eastern Norwegian dialects as low-tone dialects. However, there are differences between dialects in the same group as well: comparing the Trondheim speaker and the X speaker, we see that the former has a gradual rise from L to H, while the latter has a more abrupt rise at the end of the word.<br />
<br />
'''Sentences 1 to 3'''<br />
<br />
==Speaker dialect: ''Bergen''==<br />
<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for viewing in Praat ([[#Downloading Help|Downloading Help]]):<br />
<br />
*[[Media:PSTA01.mp3|Sound]]<br />
*[[Media:PSTA01.txt|TextGrid]]<br />
<br />
<br />
<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
* [[Media:PSTA02.mp3|Sound]]<br />
* [[Media:PSTA02.txt|TextGrid]]<br />
<br />
<br />
<phrase>10905</phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
<br />
<br><br />
<br><br />
File download for Praat ([[#Downloading Help|Downloading Help]]): <br />
<br />
*[[Media:PSTA03.mp3|Sound]]<br />
*[[Media:PSTA03.txt|TextGrid]]<br />
<br />
==Speaker Dialect: Trondheim==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 2]]<br />
<br />
==Speaker Dialect: Eastern Norway==<br />
<br />
[[Parallel Annotation of Speech and Text - Part 3]]<br />
<br />
==About the TextGrid files==<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes).<br />
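<br />
For readers who want to inspect the tiers outside Praat, the following Python sketch reads the intervals of each tier from a TextGrid saved in Praat's long text format. It is a minimal illustration, not a full parser: the function name <code>parse_textgrid</code> is our own, labels that span several lines and point tiers are not handled, and the sample content is invented and does not come from the PSTA files.<br />

```python
import re

# Patterns for the lines of interest in Praat's long TextGrid text format.
NAME_RE = re.compile(r'^\s*name = "(.*)"\s*$')
XMIN_RE = re.compile(r'^\s*xmin = (\S+)\s*$')
XMAX_RE = re.compile(r'^\s*xmax = (\S+)\s*$')
TEXT_RE = re.compile(r'^\s*text = "(.*)"\s*$')


def parse_textgrid(content):
    """Return {tier name: [(xmin, xmax, label), ...]} for interval tiers."""
    tiers = {}
    current = None
    xmin = xmax = None
    for line in content.splitlines():
        m = NAME_RE.match(line)
        if m:
            current = tiers.setdefault(m.group(1), [])
            continue
        m = XMIN_RE.match(line)
        if m:
            xmin = float(m.group(1))
            continue
        m = XMAX_RE.match(line)
        if m:
            xmax = float(m.group(1))
            continue
        m = TEXT_RE.match(line)
        if m and current is not None:
            # Praat writes a double quote inside a label as two quotes.
            current.append((xmin, xmax, m.group(1).replace('""', '"')))
    return tiers


# Invented, simplified sample with the three tier names used on this page.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.5
tiers? <exists>
size = 3
item []:
    item [1]:
        class = "IntervalTier"
        name = "Word"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "døra"
    item [2]:
        class = "IntervalTier"
        name = "Phoneme"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "dø:ra"
    item [3]:
        class = "IntervalTier"
        name = "Note"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "CrV"
'''

if __name__ == "__main__":
    for tier, intervals in parse_textgrid(SAMPLE).items():
        print(tier, intervals)
```

When reading a downloaded file, note that Praat may save TextGrids in UTF-8 or UTF-16, so the file has to be opened with the matching encoding before the content is passed to the function.<br />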
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br><br />
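<br />
The gloss list can also be kept as a lookup table when processing the 'Note' tier automatically. The Python sketch below is an illustration only: the names <code>GLOSSES</code> and <code>expand_note</code> are our own, and we assume that abbreviations and IPA symbols within one label are separated by spaces, which may not hold for every label in the files.<br />

```python
# Glosses used in the 'Note' tier, copied from the list above.
GLOSSES = {
    "BrV": "segment realised with breathy voice",
    "CrV": "segment realised with creaky voice",
    "DV": "underlying voiced segment realised devoiced",
    "EPN": "epenthesis",
    "RD": "reduction of segment",
    "V": "underlying non-voiced segment realised voiced",
    "CL": "clitic",
    "ERR": "the speaker errs and corrects himself",
    "HES": "(audible) hesitation from speaker",
}


def expand_note(label):
    """Expand known gloss abbreviations in a 'Note' label.

    Tokens that are not in GLOSSES (for example IPA symbols) are returned
    unchanged. Splitting on whitespace is an assumption of this sketch.
    """
    return [GLOSSES.get(token, token) for token in label.split()]
```

For example, <code>expand_note("ɾ DV")</code> returns the IPA symbol unchanged followed by the spelled-out gloss for DV.<br />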
<br />
==Downloading Help==<br />
When clicking on the file links called '''Sound''' and '''TextGrid''' the files will open in a separate window in your browser.<br />
<br />
Go to *FILE*, right click and select *Save this Page as*.<br />
<br />
You now are able to save the file to a place of your choice in your home directory.</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:Peker.jpg&diff=4672File:Peker.jpg2010-04-16T18:16:34Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:P%C3%A6re.jpg&diff=4671File:Pære.jpg2010-04-16T18:16:24Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:Taket.jpg&diff=4670File:Taket.jpg2010-04-16T18:16:10Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:Vasken.jpg&diff=4669File:Vasken.jpg2010-04-16T18:15:56Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:D%C3%B8ra.jpg&diff=4668File:Døra.jpg2010-04-16T18:15:07Z<p>Asger Hagerup: uploaded a new version of "File:Døra.jpg":&#32;Reverted to version as of 14:06, 15 April 2010</p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:Asger.jpg&diff=4667File:Asger.jpg2010-04-16T18:13:16Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4412Parallel Annotation of Speech and Text2010-03-16T16:26:38Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:PraatTest.png|frameless|500px|left]]<br />
<br />
The following is a test picture produced in Praat. The test picture will be replaced by a Praat picture that reflects Asger's text grids. Every Praat picture will correspond to a TC sentence, which we can load very easily into this wiki by exporting it from<br />
the TC database. The size of the Praat picture can be adjusted.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Here we can add some text that discusses our annotations and makes the reader aware of some of their interesting aspects.<br />
<br />
<br />
<Phrase>10894</Phrase><br />
<br />
Not all of your text grids and annotation tables need to be displayed at the same size. The reader of this page may always click on one<br />
of the Praat representations to view a larger version of the Praat picture, while TC annotations can be searched in the TC database in order to be inspected more carefully.<br />
<br />
{| border="1"<br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|0.52 - 0.67]]<br />
|-<br />
<Phrase>10894</Phrase><br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|]]<br />
| valign="top"|<br />
|-<br />
<Phrase>10894</Phrase><br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|]]<br />
| valign="top"|<br />
|-<br />
<Phrase>10894</Phrase><br />
|}<br />
<Phrase>10905</Phrase><br />
<br />
----<br />
<Phrase>on</Phrase><br />
== Test sentences ==<br />
''(taken from the "Sound to Sense" project)''<br />
<br />
Sentences 1 to 3:<br>Speaker dialect: Bergen<br />
<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA01.mp3]], TextGrid: [[File:PSTA01.txt]]<br />
<br><br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA02.mp3]], TextGrid: [[File:PSTA02.txt]]<br />
<br><br />
<Phrase>10905</Phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA03.mp3]], TextGrid: [[File:PSTA03.txt]]<br />
<br><br />
----<br />
<br><br />
Sentences 4 to 7:<br>Speaker dialect: Trondheim<br />
<br />
<Phrase>10906</Phrase><br />
<flashmp3>PSTA04.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA04.mp3]], TextGrid: [[File:PSTA04.txt]]<br />
<br><br />
<Phrase>10907</Phrase><br />
<flashmp3>PSTA05.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA05.mp3]], TextGrid: [[File:PSTA05.txt]]<br />
<br><br />
<Phrase>10908</Phrase><br />
<flashmp3>PSTA06.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA06.mp3]], TextGrid: [[File:PSTA06.txt]]<br />
<br><br />
<Phrase>10909</Phrase><br />
<flashmp3>PSTA07.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA07.mp3]], TextGrid: [[File:PSTA07.txt]]<br />
<br><br />
----<br />
<br><br />
Sentences 8 to 10:<br>Speaker dialect: Eastern Norway<br />
<br />
<Phrase>10910</Phrase><br />
<flashmp3>PSTA08.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA08.mp3]], TextGrid: [[File:PSTA08.txt]]<br />
<br><br />
<Phrase>10911</Phrase><br />
<flashmp3>PSTA09.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA09.mp3]], TextGrid: [[File:PSTA09.txt]]<br />
<br><br />
<Phrase>10912</Phrase><br />
<flashmp3>PSTA10.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA10.mp3]], TextGrid: [[File:PSTA10.txt]]<br />
<br><br />
----<br />
<br><br />
'''About the TextGrid files:'''<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes).<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br></div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4411Parallel Annotation of Speech and Text2010-03-16T16:23:16Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:PraatTest.png|frameless|500px|left]]<br />
<br />
The following is a test picture produced in Praat. The test picture will be replaced by a Praat picture that reflects Asger's text grids. Every Praat picture will correspond to a TC sentence, which we can load very easily into this wiki by exporting it from<br />
the TC database. The size of the Praat picture can be adjusted.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Here we can add some text that discusses our annotations and makes the reader aware of some of their interesting aspects.<br />
<br />
<br />
<Phrase>10894</Phrase><br />
<br />
Not all of your text grids and annotation tables need to be displayed at the same size. The reader of this page may always click on one<br />
of the Praat representations to view a larger version of the Praat picture, while TC annotations can be searched in the TC database in order to be inspected more carefully.<br />
<br />
{| border="1"<br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|0.52 - 0.67]]<br />
|-<br />
<Phrase>10894</Phrase><br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|]]<br />
| valign="top"|<br />
|-<br />
<Phrase>10894</Phrase><br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|]]<br />
| valign="top"|<br />
|-<br />
<Phrase>10894</Phrase><br />
|}<br />
<Phrase>10905</Phrase><br />
<br />
----<br />
<Phrase>on</Phrase><br />
== Test sentences ==<br />
''(taken from the "Sound to Sense" project)''<br />
<br />
Sentences 1 to 3:<br />
<br />
Speaker dialect: Bergen<br />
<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA01.mp3]], TextGrid: [[File:PSTA01.txt]]<br />
<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA02.mp3]], TextGrid: [[File:PSTA02.txt]]<br />
<br />
<Phrase>10905</Phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA03.mp3]], TextGrid: [[File:PSTA03.txt]]<br />
<br />
----<br />
<br />
Sentences 4 to 7:<br />
<br />
Speaker dialect: Trondheim<br />
<br />
<Phrase>10906</Phrase><br />
<flashmp3>PSTA04.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA04.mp3]], TextGrid: [[File:PSTA04.txt]]<br />
<br />
<Phrase>10907</Phrase><br />
<flashmp3>PSTA05.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA05.mp3]], TextGrid: [[File:PSTA05.txt]]<br />
<br />
<Phrase>10908</Phrase><br />
<flashmp3>PSTA06.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA06.mp3]], TextGrid: [[File:PSTA06.txt]]<br />
<br />
<Phrase>10909</Phrase><br />
<flashmp3>PSTA07.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA07.mp3]], TextGrid: [[File:PSTA07.txt]]<br />
<br />
----<br />
<br />
Sentences 8 to 10:<br>Speaker dialect: Eastern Norway<br />
<br />
<Phrase>10910</Phrase><br />
<flashmp3>PSTA08.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA08.mp3]], TextGrid: [[File:PSTA08.txt]]<br />
<br />
<Phrase>10911</Phrase><br />
<flashmp3>PSTA09.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA09.mp3]], TextGrid: [[File:PSTA09.txt]]<br />
<br />
<Phrase>10912</Phrase><br />
<flashmp3>PSTA10.mp3</flashmp3><br><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA10.mp3]], TextGrid: [[File:PSTA10.txt]]<br />
<br />
----<br />
<br />
'''About the TextGrid files:'''<br />
<br />
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes).<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br><br />
BrV = Segment realised with breathy voice<br><br />
CrV = Segment realised with creaky voice<br><br />
DV = Underlying voiced segment realised devoiced<br><br />
EPN = Epenthesis<br><br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br><br />
V = Underlying non-voiced segment realised voiced<br><br />
<br />
''Morphophonology/Syntax''<br><br />
CL = Clitic<br><br />
<br />
''Other''<br><br />
ERR = The speaker errs and corrects himself<br><br />
HES = (Audible) hesitation from speaker<br></div>Asger Hageruphttps://typecraft.org/w/index.php?title=Parallel_Annotation_of_Speech_and_Text&diff=4410Parallel Annotation of Speech and Text2010-03-16T16:19:08Z<p>Asger Hagerup: </p>
<hr />
<div>[[Image:PraatTest.png|frameless|500px|left]]<br />
<br />
The following is a test picture produced in Praat. The test picture will be replaced by a Praat picture that reflects Asger's text grids. Every Praat picture will correspond to a TC sentence, which we can load very easily into this wiki by exporting it from<br />
the TC database. The size of the Praat picture can be adjusted.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Here we can add some text that discusses our annotations and makes the reader aware of some of their interesting aspects.<br />
<br />
<br />
<Phrase>10894</Phrase><br />
<br />
Not all of your text grids and annotation tables need to be displayed at the same size. The reader of this page may always click on one<br />
of the Praat representations to view a larger version of the Praat picture, while TC annotations can be searched in the TC database in order to be inspected more carefully.<br />
<br />
{| border="1"<br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|0.52 - 0.67]]<br />
|-<br />
<Phrase>10894</Phrase><br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|]]<br />
| valign="top"|<br />
|-<br />
<Phrase>10894</Phrase><br />
|-<br />
[[File:PraatTest.png|thumb|left|150px|]]<br />
| valign="top"|<br />
|-<br />
<Phrase>10894</Phrase><br />
|}<br />
<Phrase>10905</Phrase><br />
<br />
----<br />
<br />
== Test sentences ==<br />
''(taken from the "Sound to Sense" project)''<br />
<br />
Sentences 1 to 3:<br />
<br />
Speaker dialect: Bergen<br />
<br />
<Phrase>10903</Phrase><br />
<flashmp3>PSTA01.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA01.mp3]], TextGrid: [[File:PSTA01.txt]]<br />
<br />
<Phrase>10904</Phrase><br />
<flashmp3>PSTA02.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA02.mp3]], TextGrid: [[File:PSTA02.txt]]<br />
<br />
<Phrase>10905</Phrase><br />
<flashmp3>PSTA03.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA03.mp3]], TextGrid: [[File:PSTA03.txt]]<br />
<br />
----<br />
<br />
Sentences 4 to 7:<br />
<br />
Speaker dialect: Trondheim<br />
<br />
<Phrase>10906</Phrase><br />
<flashmp3>PSTA04.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA04.mp3]], TextGrid: [[File:PSTA04.txt]]<br />
<br />
<Phrase>10907</Phrase><br />
<flashmp3>PSTA05.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA05.mp3]], TextGrid: [[File:PSTA05.txt]]<br />
<br />
<Phrase>10908</Phrase><br />
<flashmp3>PSTA06.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA06.mp3]], TextGrid: [[File:PSTA06.txt]]<br />
<br />
<Phrase>10909</Phrase><br />
<flashmp3>PSTA07.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA07.mp3]], TextGrid: [[File:PSTA07.txt]]<br />
<br />
----<br />
<br />
Sentences 8 to 10:<br />
<br />
Speaker dialect: Eastern Norway<br />
<br />
<Phrase>10910</Phrase><br />
<flashmp3>PSTA08.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA08.mp3]], TextGrid: [[File:PSTA08.txt]]<br />
<br />
<Phrase>10911</Phrase><br />
<flashmp3>PSTA09.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA09.mp3]], TextGrid: [[File:PSTA09.txt]]<br />
<br />
<Phrase>10912</Phrase><br />
<flashmp3>PSTA10.mp3</flashmp3><br />
Download files for viewing in the Praat application: Sound: [[File:PSTA10.mp3]], TextGrid: [[File:PSTA10.txt]]<br />
<br />
----<br />
<br />
'''About the TextGrid files:'''<br />
<br />
The TextGrid files are opened together with the matching sound file for viewing in the Praat application. The TextGrid files consist of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes).<br />
<br />
Here is a list of glosses used in the 'Note' tier:<br />
<br />
''Phonology/Phonetics:''<br />
BrV = Segment realised with breathy voice<br />
CrV = Segment realised with creaky voice<br />
DV = Underlying voiced segment realised devoiced<br />
EPN = Epenthesis<br />
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).<br />
V = Underlying non-voiced segment realised voiced<br />
<br />
''Morphophonology/Syntax''<br />
CL = Clitic<br />
<br />
''Other''<br />
ERR = The speaker errs and corrects himself<br />
HES = (Audible) hesitation from speaker</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA10.txt&diff=4409File:PSTA10.txt2010-03-16T15:32:36Z<p>Asger Hagerup: Praat TextGrid for tenth test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for tenth test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA09.txt&diff=4408File:PSTA09.txt2010-03-16T15:32:24Z<p>Asger Hagerup: Praat TextGrid for ninth test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for ninth test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA08.txt&diff=4407File:PSTA08.txt2010-03-16T15:32:10Z<p>Asger Hagerup: Praat TextGrid for eighth test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for eighth test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA07.txt&diff=4406File:PSTA07.txt2010-03-16T15:31:54Z<p>Asger Hagerup: Praat TextGrid for seventh test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for seventh test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA06.txt&diff=4405File:PSTA06.txt2010-03-16T15:31:38Z<p>Asger Hagerup: Praat TextGrid for sixth test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for sixth test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA05.txt&diff=4404File:PSTA05.txt2010-03-16T15:31:20Z<p>Asger Hagerup: Praat TextGrid for fifth test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for fifth test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA04.txt&diff=4403File:PSTA04.txt2010-03-16T15:31:04Z<p>Asger Hagerup: Praat TextGrid for fourth test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for fourth test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA03.txt&diff=4402File:PSTA03.txt2010-03-16T15:30:48Z<p>Asger Hagerup: Praat TextGrid for third test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for third test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA02.txt&diff=4401File:PSTA02.txt2010-03-16T15:30:31Z<p>Asger Hagerup: Praat TextGrid for second test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for second test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA01.txt&diff=4400File:PSTA01.txt2010-03-16T15:30:17Z<p>Asger Hagerup: Praat TextGrid for first test sentence in parallel speech/text project</p>
<hr />
<div>Praat TextGrid for first test sentence in parallel speech/text project</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA10.mp3&diff=4399File:PSTA10.mp32010-03-16T15:29:29Z<p>Asger Hagerup: Tenth test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Tenth test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA09.mp3&diff=4398File:PSTA09.mp32010-03-16T15:29:13Z<p>Asger Hagerup: Ninth test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Ninth test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA08.mp3&diff=4397File:PSTA08.mp32010-03-16T15:28:39Z<p>Asger Hagerup: uploaded a new version of "File:PSTA08.mp3":&#32;Eighth test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA08.mp3&diff=4396File:PSTA08.mp32010-03-16T15:27:04Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA07.mp3&diff=4395File:PSTA07.mp32010-03-16T15:26:51Z<p>Asger Hagerup: Seventh test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Seventh test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA06.mp3&diff=4394File:PSTA06.mp32010-03-16T15:26:12Z<p>Asger Hagerup: Sixth test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Sixth test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA05.mp3&diff=4393File:PSTA05.mp32010-03-16T15:25:56Z<p>Asger Hagerup: Fifth test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Fifth test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA04.mp3&diff=4392File:PSTA04.mp32010-03-16T15:25:39Z<p>Asger Hagerup: Fourth test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Fourth test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA03.mp3&diff=4391File:PSTA03.mp32010-03-16T15:25:23Z<p>Asger Hagerup: Third test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Third test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA02.mp3&diff=4390File:PSTA02.mp32010-03-16T15:24:56Z<p>Asger Hagerup: Second test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>Second test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:PSTA01.mp3&diff=4389File:PSTA01.mp32010-03-16T15:23:35Z<p>Asger Hagerup: First test sentence for parallel speech/text project
mp3 format, 160 kbps</p>
<hr />
<div>First test sentence for parallel speech/text project<br />
mp3 format, 160 kbps</div>Asger Hageruphttps://typecraft.org/w/index.php?title=File:Bergen01.wav&diff=4340File:Bergen01.wav2010-03-10T12:21:08Z<p>Asger Hagerup: </p>
<hr />
<div></div>Asger Hagerup