
Multilingual Verb Valence Lexicon

Revision as of 21:04, 26 September 2013

THIS PAGE IS BEING CONSTRUCTED--Lars Hellan 21:04, 26 September 2013 (UTC)


Lars Hellan, NTNU

September 2013


We present here an implemented design for a multilingual valence database, together with its web-demo.


A multilingual valence database should consist of the following:

- The Languages:

A selection of languages L1, ..., Ln;

- The Parameters:

A set of specification parameters defined across all the languages (i.e., common parameters: they are independent of any particular language, though not necessarily relevant to every language);

- The Valence-profiles:

For each language, an inventory of its valence types characterized in terms of the parameters available, called its valence-profile;

- The Valence-type suites:

For each language, a list of sentences instantiating each of its valence types, indexed according to the types;

- The Valence Lexicons:

For each language, a verb lexicon where each verb entry is classified according to its valence type (in addition to other lexical information);

- The Valence Corpora:

For each language, a sentence corpus instantiating each verb in each of the valence frames it can support.
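
As an illustration only, the six components listed above can be sketched as a small in-memory data model. This is not the schema of the implemented database: the class names, field names, the parameter name "transitivity" and the Norwegian example verb are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class LanguageData:
    """Per-language components (all names here are hypothetical)."""
    name: str
    # Valence-profile: valence type label -> {parameter name: value}
    valence_profile: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Valence-type suite: valence type label -> sentences instantiating it
    type_suite: Dict[str, List[str]] = field(default_factory=dict)
    # Valence lexicon: verb lemma -> valence types the verb supports
    lexicon: Dict[str, Set[str]] = field(default_factory=dict)
    # Valence corpus: (verb lemma, valence type) -> attested sentences
    corpus: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def add_verb(self, lemma: str, valence_type: str) -> None:
        """Classify a verb under a valence type from this language's profile."""
        if valence_type not in self.valence_profile:
            raise KeyError(f"{valence_type!r} is not in the valence-profile")
        self.lexicon.setdefault(lemma, set()).add(valence_type)

    def uncovered_frames(self) -> List[Tuple[str, str]]:
        """Return (verb, type) pairs that have no corpus sentence yet."""
        return sorted(
            (verb, vtype)
            for verb, types in self.lexicon.items()
            for vtype in types
            if not self.corpus.get((verb, vtype))
        )


@dataclass
class ValenceDatabase:
    # The Parameters: defined once, shared by all languages
    parameters: Set[str]
    # The Languages: language name -> its per-language components
    languages: Dict[str, LanguageData] = field(default_factory=dict)

    def add_valence_type(self, language: str, label: str,
                         spec: Dict[str, str]) -> None:
        """Add a valence type, allowing only the common parameters."""
        unknown = set(spec) - self.parameters
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        lang = self.languages.setdefault(language, LanguageData(language))
        lang.valence_profile[label] = dict(spec)


# Usage with invented illustration data (not taken from the web-demo)
db = ValenceDatabase(parameters={"transitivity"})
db.add_valence_type("Norwegian", "intr", {"transitivity": "intransitive"})
norwegian = db.languages["Norwegian"]
norwegian.add_verb("sove", "intr")
norwegian.type_suite["intr"] = ["Gutten sover."]
print(norwegian.uncovered_frames())
```

In this sketch a valence type may leave some common parameters unspecified, which mirrors the point that the parameters are shared across languages without necessarily being relevant for all of them.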


The notion of valence represents a perspective from the verb, and thus from the Lexicon. From the viewpoint of the sentence, and thus of the Corpora, the most closely corresponding term is argument structure, as when we talk about ‘the argument structure of a sentence’. Since both perspectives are represented here, we use both terms. The sentential perspective is necessary when the argument structure of a sentence is not determined by a single verb, for instance when it is determined by a verb plus a secondary predicate, or resides in a series of verbs. The argument structure then results from the interplay between the valence of the constituent items and constructional factors. To widen the scope of the database so that it fully recognizes constructional factors in such cases, we may think of it as a database of argument structure constructions, and use Construction-profile as an alternative to Valence-profile, and Construction-type suites as an alternative to Valence-type suites.

Within the word-based perspective on argument frames, one must in turn recognize argument structure alternation or derivation, as when ‘processes’ like passive, causative, applicative, etc. affect the valence of the verb. As grammatical processes, these are commonly not represented in a Lexicon (except perhaps by marking a verb’s potential for undergoing them), but they are obviously reflected in a Corpus, and it is plausible that they should also be reflected in the Valence-profile and the Valence-type suite of a given language.

In sum, the parameters to be represented in the database include notions like ‘intransitive’ and ‘transitive’, and concepts modeled on these, which are relevant from the lexical perspective as well as from the derivation- and construction-based perspectives. They also include derivation, and constructional factors like secondary predicates, serial verbs and ‘complex predicates’.

In general, the parameters selected for inclusion in the database must first be amenable to formalization in a relational database, and second be accessible in an understandable form to those who enter data and those who search for data. The latter point is connected to the need for a flexible inventory of terms, which on the one hand accommodates the terminologies of various frameworks, and on the other hand is based on consistent conversion systems between terminologies.
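
One possible way to make terminology conversion consistent in a relational setting is to link every framework-specific label to a shared canonical type, so that conversion between two frameworks always passes through that type. The following is a minimal sketch using Python's standard sqlite3 module. The table names, the framework names "framework_A" and "framework_B", and the labels are hypothetical and do not come from the implemented database.

```python
import sqlite3


def build_term_db() -> sqlite3.Connection:
    """Create an in-memory term inventory with hypothetical labels."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE canonical_type (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE term (
            framework    TEXT NOT NULL,
            label        TEXT NOT NULL,
            canonical_id INTEGER NOT NULL REFERENCES canonical_type(id),
            PRIMARY KEY (framework, label),
            -- at most one label per framework and canonical type, so that
            -- conversion is unambiguous in both directions
            UNIQUE (framework, canonical_id)
        );
    """)
    conn.executemany(
        "INSERT INTO canonical_type (id, name) VALUES (?, ?)",
        [(1, "intransitive"), (2, "transitive")],
    )
    conn.executemany(
        "INSERT INTO term (framework, label, canonical_id) VALUES (?, ?, ?)",
        [
            ("framework_A", "intr", 1),
            ("framework_A", "tr", 2),
            ("framework_B", "V-1", 1),
            ("framework_B", "V-2", 2),
        ],
    )
    return conn


def convert(conn, label, source, target):
    """Translate a label from one framework to another, or return None."""
    row = conn.execute(
        """
        SELECT t2.label
        FROM term AS t1
        JOIN term AS t2 ON t2.canonical_id = t1.canonical_id
        WHERE t1.framework = ? AND t1.label = ? AND t2.framework = ?
        """,
        (source, label, target),
    ).fetchone()
    return row[0] if row else None


conn = build_term_db()
print(convert(conn, "intr", "framework_A", "framework_B"))
```

Routing every conversion through the canonical table means that adding a further framework requires only one new set of labels, rather than a separate conversion table for each pair of frameworks.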

From the viewpoint of standard linguistic adequacy, the following factors may be expected in the representation of argument structure:

(1)

a. syntactic argument structure, i.e., whether there is a subject, an object, a second/indirect object, etc., referred to as grammatical functions, and the formal categories carrying them;

b. semantic argument structure, that is, how many participants are present in the situation depicted, and which roles they play (such as ‘agent’, ‘patient’, etc.);

c. linkage between syntactic and semantic argument structure, i.e., which grammatical functions express which roles, and possible roles not expressed; here also belong identity relations, part-whole relations, etc., between arguments;

d. aspect and Aktionsart, that is, properties of a situation expressed by a sentence with the valence in question in terms of whether it is dynamic/stative, continuous/instantaneous, completed/ongoing, etc.;

e. type of the situation expressed, in terms of some classificatory system.
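
To make the factors in (1) concrete, they can be pictured as fields of a single record. The sketch below is an illustration under stated assumptions, not the representation used in the database: the class name, the field names and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ArgumentStructure:
    # (1a) grammatical function -> formal category carrying it
    functions: Dict[str, str]
    # (1b) roles of the participants in the situation
    roles: List[str]
    # (1c) linkage: grammatical function -> role it expresses
    linking: Dict[str, str]
    # (1d) aspect / Aktionsart, if specified
    aspect: Optional[str] = None
    # (1e) situation type in some classificatory system, if specified
    situation_type: Optional[str] = None

    def unexpressed_roles(self) -> List[str]:
        """Roles present in the situation but linked to no function."""
        linked = set(self.linking.values())
        return [role for role in self.roles if role not in linked]

    def check(self) -> List[str]:
        """List linkage entries that refer to undeclared functions or roles."""
        problems = []
        for function, role in self.linking.items():
            if function not in self.functions:
                problems.append(f"function not declared: {function}")
            if role not in self.roles:
                problems.append(f"role not declared: {role}")
        return problems


# A transitive frame, and a passive-like frame whose agent is not expressed
active = ArgumentStructure(
    functions={"subject": "NP", "object": "NP"},
    roles=["agent", "patient"],
    linking={"subject": "agent", "object": "patient"},
    aspect="dynamic",
)
passive = ArgumentStructure(
    functions={"subject": "NP"},
    roles=["agent", "patient"],
    linking={"subject": "patient"},
)
print(passive.unexpressed_roles())
```

Keeping the linkage in (1c) separate from (1a) and (1b) is what allows a role to be recorded as present in the situation without being expressed by any grammatical function.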


Some, but not all, of these factors are represented in the present database.


A display of the database is given in the web-demo:

http://regdili.idi.ntnu.no:8080/multilanguage_valence_demo/multivalence


Here is a practical survey of the types of information offered in this web-demo. It currently offers three languages, which can be activated simultaneously, as indicated, or one by one:


Languages:

Norwegian 
Ga 
Spanish


It offers five drop-down menus and one write-in field; any combination of them can be activated in a search:


TO BE CONTINUED