
Norwegian HPSG grammar NorSource

Revision as of 16:12, 14 April 2014 by Lars Hellan

For the web demo of the Norwegian HPSG grammar NorSource, the sentences below can be used for illustration.

More complete test suites for basic verbal constructions are found in


License

Lesser General Public License For Linguistic Resources


History and Purpose of the grammar

NorSource is a so-called ‘deep’ computational grammar (‘DG’) of Norwegian, developed over the last 12 years. The grammar has been developed with a view to the following overall desiderata:


Desideratum 1. Encoding of Linguistic Meaning

As a ‘generic’ information repository, the DG should have a semantic component from which a reasoning capacity can be derived for any domain of discourse, possibly with the addition of concepts for the specific domains. It should be like a Fregean ‘Sinn’, acting as a function from domains of use to models of interpretation. However, unlike most artificial ‘reasoning’ devices, a DG must span the full complexity of a natural language, reflecting the size of its vocabulary and the complexity of its grammar. In this respect, the DG can also be seen as the materialization of a Generative Grammar, in the original sense of that notion.


Desideratum 2. Cross-grammar Generality

The content of the DG should, as far as possible, be phrased in terms that are used in, or can be aligned with those used in, other grammars and for other languages, thereby enabling linguistic comparison using the DG. By ‘content of the DG’ we mean both the content of the grammar files (formalism, notions used) and the content of its parse productions.


Desideratum 3. Interoperability

The DG should attain as much interoperability with other applications as possible. In general, a ubiquitous digital research environment for linguistics should enable an interconnectivity of data, researchers and processing facilities such that a contribution made at any point in the overall structure of components can have its ramifications immediately implemented throughout the entire structure. Such interconnectivity has to be manifested both on an ‘outer’ level, enabling data flow and easy access, and on an ‘inner’ level, ensuring information exchange from one system component to another. For a DG, this means that its files and productions (parses, etc.) should be transportable to other applications, and that the codes in which its files are written should be readable by other applications, or mappable into other codes.


Desideratum 4. Sustainability

The DG should be in such a format, and be situated in such an overall environment, that as much as possible of its capacity can be retained, independently of the particular persons maintaining it or the particular physical environments hosting it.


The first of the desiderata reflects a central concern throughout modern logic and philosophy of language, and in turn linguistics and Artificial Intelligence. Since semantics is inevitably the basis for significant progress in cross-linguistic modelling, this desideratum is also relevant to desideratum 2. The grammar to be discussed belongs to a family of DGs whose design quite explicitly caters for this concern. This family of DGs has HPSG as its formal and theoretical framework (Pollard and Sag 1994, Sag et al. 2003), and started as a computational project through the LinGO initiative at CSLI, Stanford, using the LKB platform (Copestake 2002). LKB is a general platform based on typed feature structures (TFS), with an integrated format of semantic representation called Minimal Recursion Semantics (‘MRS’; cf. Copestake et al. 2005). Before 2000 there were three grammars in this framework: the English Resource Grammar (ERG), the Japanese grammar Jacy, and a German grammar. Essential to the development of further grammars in the family was the ‘HPSG Grammar Matrix’ (‘the Matrix’; see Bender et al. 2002, 2010), which was mainly based on ERG, and had its first phase of deployment during the EU project DeepThought (2002-04). The grammar family is currently developed within the DELPH-IN consortium, and will in the following be referred to as the ‘DELPH-IN grammars’.

The DG to be discussed was started in 2001, by linguists versed in Generative Grammar since the late 1960s, and in formal semantics (‘Montague Grammar’) since the mid 1970s. From the mid 1980s the group developed a computational lexicon (under the acronym ‘TROLL’, see Hellan et al. 1989), mainly associated with research within ‘consolidated GB’. In the late 1990s the group reoriented itself towards HPSG, and started the DG as part of the LinGO initiative with the LKB platform. The DG was the first grammar to be built on the Matrix, during the EU project DeepThought (2002-04), and despite never receiving very substantial funding, it has retained a place among the medium-large DELPH-IN grammars. We can distinguish four main phases in its development:

 Phase 1, the Grounding phase (2001-04), 
 Phase 2, the Semantic Expansion phase (2005-07), 
 Phase 3, the Cross-Linguistic Coding phase (2008-10), and 
 Phase 4, the Interoperability phase (2010-14).