Data-driven Valence Typology

21 December 2017


The multilingual Interlinear Glossed Text (IGT) Bank.

With TypeCraft you can freely access grammatically glossed examples (IGT) from more than 150 languages (see: Portal of Languages). Examples can be exported in various formats. You can also use TypeCraft to create your own Interlinear Glossed Text for any language you wish. You can store the data in your own private space, or share your work with a group. The TypeCraft Wiki consists of 286 articles, which mostly discuss less-described languages and address linguistic questions, often embedding Interlinear Glossed Texts drawn from the database. Viewing existing data in the database or reading articles in the wiki does not require a login, whereas creating data and writing in the wiki require a login (upper right corner). The Quick Start page gives a general introduction to the use of TypeCraft, and information about system updates. To access the database and data editing functions, click the TypeCraft Tools button in the upper left corner; to access the wiki pages, type part of the name of the page you want into the Search Mediawiki box in the upper right corner. For how to search in the respective domains, see below.

The TypeCraft Editor and TypeCraft Search

The TypeCraft IGT comes in the form illustrated below, an example from Akan illustrating verb serialization, aspectual preverbs and other features:

Boakye rekɔgye aba abεdi.
“Boakye is going to collect it, come back and eat it.”

TypeCraft data can be searched on any of the levels illustrated here, from the text level to the morph level, using TypeCraft Tools -> TypeCraft Search -> "Text search" or "Phrase search". You can search for individual glosses or for part-of-speech specifications. Search results can be freely exported in various formats, for instance to TypeCraft's own wiki, as in the example above. The Quick Start page provides information on how to proceed.

The TypeCraft Wiki

The category page Languages offers an overview of the TypeCraft Wiki pages, ordered by language. Use the wiki search box in the upper right corner to search this wiki.

TypeCraft Portal of Languages

The TypeCraft database currently contains 2940 texts, covering 145 unique languages. Our Portal of Languages is a dynamic list of those languages that have more than 5 public texts in the database. The Portal allows you to download texts and phrases in different export formats.


Mary Esther Kropp Dakubu and Lars Hellan

Nov. 7, 2011

Data-driven Valence Typology (DVT) is a project in which we seek to represent the characteristic sentence construction types of a language, called its c-profile, in a transparent, detailed and non-theory-biased format, drawing on a common, restricted repertory of analytic-descriptive primitives, cf. [1]. By adhering to a common classification system, DVT in principle allows its data to be searched in a relational database. DVT has so far been developed with a view to covering significantly different languages (Ga from the Kwa branch of Niger-Congo, Norwegian from Germanic, and Kistaninya from Ethio-Semitic). In its current phase the project has a more ‘micro-comparative’ focus, showing how a c-profile for one language of a given family can be derived from the c-profile of another language in the same family. In Germanic we envisage such extensions with regard to English and German, and in Kwa/Gur with regard to Dangme and Gurene.
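
The idea that a shared classification system makes c-profiles searchable in a relational database can be sketched as follows. This is only an illustration under stated assumptions: the schema (tables `construction_type` and `profile`), the function names, the placeholder languages `lang_1` and `lang_2`, and the labels `type_A`, `type_B`, `type_C` are all hypothetical and are not taken from the DVT label system or from any actual c-profile.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical c-profile schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE construction_type (
            label TEXT PRIMARY KEY          -- label from the shared repertory
        );
        CREATE TABLE profile (
            language TEXT NOT NULL,         -- language whose c-profile this row belongs to
            label    TEXT NOT NULL REFERENCES construction_type(label),
            PRIMARY KEY (language, label)
        );
        """
    )
    # Placeholder labels and languages; these are NOT real DVT data.
    conn.executemany(
        "INSERT INTO construction_type VALUES (?)",
        [("type_A",), ("type_B",), ("type_C",)],
    )
    conn.executemany(
        "INSERT INTO profile VALUES (?, ?)",
        [
            ("lang_1", "type_A"),
            ("lang_1", "type_B"),
            ("lang_2", "type_B"),
            ("lang_2", "type_C"),
        ],
    )
    return conn


def shared_types(conn, lang_a, lang_b):
    """Return construction types that occur in both c-profiles."""
    cur = conn.execute(
        """
        SELECT a.label
        FROM profile AS a
        JOIN profile AS b ON a.label = b.label
        WHERE a.language = ? AND b.language = ?
        ORDER BY a.label
        """,
        (lang_a, lang_b),
    )
    return [row[0] for row in cur]


def types_only_in(conn, lang_a, lang_b):
    """Return construction types in lang_a's c-profile but not in lang_b's."""
    cur = conn.execute(
        """
        SELECT label
        FROM profile
        WHERE language = ?
          AND label NOT IN (SELECT label FROM profile WHERE language = ?)
        ORDER BY label
        """,
        (lang_a, lang_b),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = build_demo_db()
    print(shared_types(conn, "lang_1", "lang_2"))
    print(types_only_in(conn, "lang_1", "lang_2"))
```

Because both profiles draw their labels from the same table, comparing two languages reduces to a join or a set difference over labels. The second query hints at the micro-comparative case, where the profile of one language is described by what it adds to or lacks relative to a related one.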

Among current projects and initiatives, DVT can perhaps be most directly related to VerbNet [2], to its non-computational predecessor in Levin's work [3], and to a cross-linguistic development of the latter, the Leipzig Valency Classes Project [4].

In future publications we will show how an inventory of verb classes in the Levin approach can be derived from a DVT c-profile and an accompanying verb construction lexicon, such as are available for Ga [5] and for Norwegian [6]. We will also assess the notion of ‘valence alternation’ as a unit of comparison, which is in itself notoriously difficult to define, and show that, of the 150 most salient frames in Ga, none are interconnected by any of the ‘alternation’ patterns commonly applied in the European setting. We will advocate DVT as a sounder general basis for valence typology, since it does not depend directly on notions like ‘alternation’.

Further pages at this site giving information about the project include:

1. The three parts of [1]: The system, Ga Appendix, Norwegian Appendix

2. Verbconstructions cross-linguistically - Introduction, a predecessor of [1], which introduces the system particularly as applied to Norwegian. Its wiki pages illustrate the c-profile of Norwegian as established at that point, with annotated examples for each type.

3. The following TypeCraft annotated texts:

 Ga sentence types	                Mary Esther Kropp Dakubu
 Norwegian verb constructions	        Lars Hellan
 Verb constructions in Kistaniniya	Bedilu Debela


