Typecraft v2.5

Parallel Annotation of Speech and Text

Revision as of 21:26, 18 March 2010 by Dorothee Beermann (Talk | contribs) (Description of the material)

This page is under construction

Project Description

The goal of this short pilot was the parallel annotation of sound and text. The study was conducted by Professor Wim van Dommelen and Associate Professor Dorothee Beermann at the Institute of Languages and Communication Studies at the Norwegian University of Science and Technology. Asger Hagerup was the scientific assistant for the project. The project was funded by the SSTL.

The pilot investigated integrated presentations of linguistically annotated audio and text material, combining Praat and TypeCraft.

Praat is a signal analysis program developed by Paul Boersma and David Weenink at the University of Amsterdam. It is widely used for annotating sound objects. For the present study we took advantage of the fact that Praat stores annotation data in a TextGrid object that exists separately from the sound object. Specifying a sentence tier gave us an easy way to reference data across applications. At present our sound signal representations are static and selective: they focus on presenting one selected feature to illustrate interesting correlations across phonetic and linguistic categories.


Description of the material

For our study we selected 10 sentences from the phonetic database of the Sound to Sense project.

Sentences 1 to 3

Speaker dialect: Bergen

Jeg ser bildet, kan du si, litt på skrått ned, ovenifra.
“I see the picture, say, somewhat diagonally downwards, from above.”
Word        Transcription  Gloss             POS
Jeg         e              1SG               PN
ser         se:r           seePRES           V
bildet      bilde          pictureDEFSG      N
kan         kan:           canPRES           V
du          ʉ              2SG               CL
si          si:            sayINF            V
litt        lit:           a.little          ADVm
på          po             onDIR             PREP
skrått      skro:t         diagonalADJ>ADV   ADVm
ned         ned            downDIR           ADVm
ovenifra    ovenifra       from.aboveDIRSRC  ADVm
Download files for viewing in the Praat application: File:PSTA01.mp3, TextGrid
Det dekker omtrent hele det venstre…mest…altså, venstreste kortsiden.
“It covers approximately the whole left…most…that is, the leftest short side.”
Word        Transcription  Gloss             POS
Det         de             3SGNEUT           PN
dekker      dek:er         coverPRES         V
omtrent     umtrent        approximately     ADVm
hele        he:le          wholeDEF          ADJ
det         de             DEFSGNEUT         ART
venstre     venstre        left              ADVm
mest        mest           mostSUP           ADJ
altså       aso            that.isDM         ADVm
venstreste  venstreste     leftSUPMUDEF      ADJ
kortsiden   kortsiden      shortsideDEFSG    N
Download files for viewing in the Praat application: Sound, TextGrid
Hun står med ryggen mot veggen opp og ser på han som skal kaste ballen som står utenfor og peker på boksene.
“She's standing with her back up against the wall and looking at him, who is standing outside and about to throw the ball, and pointing towards the boxes.”
Word        Transcription  Gloss             POS
Hun         hun            3SGFEM            PN
står        sto:r          standPRES         V
med         med            withMNR           PREP
ryggen      ryɡ:en         backDEFSG         N
mot         mut            againstDIR        PREP
veggen      veɡ:en         wallDEFSG         N
opp         up             upDIRMU           PREP
og          o              and               CONJC
ser         se:r           seePRES           V
på          po             atDIR             PREP
han         han            3SGMASC           PN
som         som                              PNrel
skal        skal:          shallPRES         V
kaste       kaste          throwINF          V
ballen      bal:en         ballDEFSG         N
som         som                              PNrel
står        sto:r          standPRES         V
utenfor     ʉtenfor        outside           ADVm
og          o              and               CONJC
peker       pe:ker         pointPRES         V
på          po             atDIR             PREP
boksene     boksene        boxDEFPL          N


Download files for viewing in the Praat application: Sound, TextGrid

Speaker Dialect: Trondheim

Parallel Processing of Speech and Text Data - Part 2

Speaker Dialect:

Parallel Processing of Speech and Text Data - Part 3

About the TextGrid files

The TextGrid files are opened together with the matching sound files for viewing in the Praat application. Each TextGrid file consists of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (shows underlying segments), and 'Note' (shows the surface realisation in IPA symbols, plus other notes).
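For readers who want to inspect the tiers outside Praat, the following is a minimal sketch of reading interval tiers from a TextGrid saved in Praat's long text format. Only the tier names 'Word', 'Phoneme' and 'Note' come from the description above. The sample content, its times and labels, and the helper names `parse_textgrid` and `labels_at` are invented for illustration, and the regular expressions assume plain non-negative decimal times and interval tiers only.

```python
import re

# Invented sample in Praat's long TextGrid text format. Only the tier names
# follow the description above; times and labels are placeholders.
SAMPLE_TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.9
tiers? <exists>
size = 3
item []:
    item [1]:
        class = "IntervalTier"
        name = "Word"
        xmin = 0
        xmax = 0.9
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.3
            text = "Jeg"
        intervals [2]:
            xmin = 0.3
            xmax = 0.9
            text = "ser"
    item [2]:
        class = "IntervalTier"
        name = "Phoneme"
        xmin = 0
        xmax = 0.9
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.3
            text = "e"
        intervals [2]:
            xmin = 0.3
            xmax = 0.9
            text = "se:r"
    item [3]:
        class = "IntervalTier"
        name = "Note"
        xmin = 0
        xmax = 0.9
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.9
            text = "HES"
'''


def parse_textgrid(text):
    """Return {tier_name: [(xmin, xmax, label), ...]} for each interval tier."""
    tiers = {}
    # Split on the "item [n]:" headers; the first chunk is the file header.
    for chunk in re.split(r'item \[\d+\]:', text)[1:]:
        name = re.search(r'name = "(.*)"', chunk).group(1)
        # Only interval entries have xmin, xmax and text on consecutive lines,
        # so the tier-level xmin/xmax pair is skipped automatically.
        intervals = re.findall(
            r'xmin = ([\d.]+)\s+xmax = ([\d.]+)\s+text = "(.*)"', chunk)
        tiers[name] = [(float(lo), float(hi), lab) for lo, hi, lab in intervals]
    return tiers


def labels_at(tiers, time):
    """Return {tier_name: label} for the interval covering `time` in each tier."""
    return {
        name: next((lab for lo, hi, lab in ivs if lo <= time < hi), None)
        for name, ivs in tiers.items()
    }


tiers = parse_textgrid(SAMPLE_TEXTGRID)
aligned = labels_at(tiers, 0.5)
```

Here `aligned` holds the labels of all three tiers at one point in time, which is the kind of cross-tier lookup that parallel annotation relies on. When reading a real TextGrid from disk, the file encoding may need to be specified explicitly.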

Here is a list of glosses used in the 'Note' tier:

Phonology/Phonetics:
BrV = Segment realised with breathy voice
CrV = Segment realised with creaky voice
DV = Underlying voiced segment realised devoiced
EPN = Epenthesis
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).
V = Underlying non-voiced segment realised voiced

Morphophonology/Syntax:
CL = Clitic

Other:
ERR = The speaker makes an error and corrects it
HES = (Audible) hesitation from speaker
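
The gloss list above can also be kept as a lookup table when processing 'Note' tier labels programmatically. The sketch below is illustrative only: the names `NOTE_GLOSSES` and `expand_note` are hypothetical, and it assumes that several codes in one label are separated by whitespace, which the TextGrid files themselves may not do.

```python
# Codes and descriptions taken from the gloss list above.
NOTE_GLOSSES = {
    "BrV": "segment realised with breathy voice",
    "CrV": "segment realised with creaky voice",
    "DV": "underlying voiced segment realised devoiced",
    "EPN": "epenthesis",
    "RD": "reduction of segment",
    "V": "underlying non-voiced segment realised voiced",
    "CL": "clitic",
    "ERR": "the speaker errs and corrects himself",
    "HES": "(audible) hesitation from speaker",
}


def expand_note(label):
    """Expand known codes in a Note-tier label; other tokens pass through.

    Assumption: codes within one label are whitespace-separated.
    """
    return [NOTE_GLOSSES.get(token, token) for token in label.split()]


expanded = expand_note("DV HES")
```

Tokens that are not in the table, such as IPA symbols for the surface realisation, are returned unchanged, so the function can be applied to every label in the tier.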