Workshop on Text and Speech Annotation

Central Institute of Indian Languages

Linguistic Description and the Creation of Digital Language Resources

October 18th - 21st, 2010, Mysore, India.

Aim of the Workshop

The workshop will focus on the creation and use of digital speech and text resources, emphasizing the role that digital resources play in linguistic research. The course will give a general introduction to the generation of data collections, small language corpora and on-line linguistic knowledge bases.

The workshop is aimed at linguists interested in strengthening the empirical underpinnings of their typological and/or theoretical work. Graduate students as well as faculty members are welcome. The course will be held in collaboration with Professor Lars Hellan, who has been the head of Linguistics at the Norwegian University of Science and Technology (NTNU) for many years and has worked as a Fulbright Fellow at top research institutions in the United States. The main organisers of the course are Professor Dorothee Beermann, NTNU; Professor Gautam Sengupta, University of Hyderabad; and ____________, Central Institute of Indian Languages, Mysore. Prof. Beermann is one of the developers of TypeCraft, an online text annotation tool that has been presented at several computational conferences and in the context of language documentation. She is a versatile linguist working at the intersection of formal linguistics, NLP and Knowledge Representation Theory, typological linguistics and language documentation. Prof. Sengupta teaches linguistics and cognitive science at the University of Hyderabad.

Dr. Björn Köhnlein will teach Praat, a computer program for analysing speech. Köhnlein....

Pavel Mihaylov is the technical developer of TypeCraft. He ....

Our focus will be on collaborative speech and text annotation, making use of the new facilities that modern web technology and linguistic software offer to linguists.

The workshop will feature four days of course work covering practical issues in the linguistic annotation of speech and text, as well as web editing as a means of creating public linguistic knowledge bases through collaborative on-line editing.

We will offer a hands-on introduction to two tools: TypeCraft, a multi-lingual on-line database for text annotation developed at the Norwegian University of Science and Technology (NTNU), Trondheim, Norway, by Dorothee Beermann and Pavel Mihaylov; and Praat, freely available signal analysis software developed by Paul Boersma and David Weenink of the University of Amsterdam, The Netherlands.

In addition to the introduction to text and speech annotation, we are planning several guest lectures on selected linguistic topics of particular relevance to language annotation and linguistic analysis.

More information about the workshop and the workshop program will be published on this site in September!