Converting a Toolbox lexical database to LKB format
Summary
Note: I am currently working on this page and hope to get it into a usable state by mid-October 2012. Hannes Hirzel
The LKB system (Linguistic Knowledge Builder) is a grammar and lexicon development environment for unification-based linguistic formalisms. It is mainly used for HPSG (Head-driven Phrase Structure Grammar). This page describes and provides the scripts that convert a lexicon database made with Toolbox into the lexicon format needed by LKB. The scripts were developed by Hannes Hirzel.
A presentation in Trondheim, 2005 (File:Toolbox-LKB-Link-slides - version 4.pdf) shows how this was applied to a lexicon file of the Ga language edited by Mary E. Kropp Dakubu.
The scripting language used is called 'Consistent changes' and is built into the Toolbox program, so you can run the conversion from within Toolbox. To do so, make the lexicon file the active window and then choose
- 'File' menu,
- 'Export'
- 'TBox-LKB-Step1'.
This runs all processing steps. The result is a lexicon file in LKB format.
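To make the goal of the conversion concrete, the following Python sketch shows roughly what the transformation amounts to. It is not the 'consistent changes' code used on this page. It assumes the lexicon uses the common Toolbox field markers \lx (lexeme), \ps (part of speech) and \ge (English gloss), and it writes a minimal TDL entry with a STEM list. The type names in POS_TO_TYPE and the sample records are hypothetical placeholders, not Ga data. Real entries need the types and features of the target grammar, and lexemes with spaces or special characters would need cleaning before they can serve as TDL identifiers.

```python
import re

# Placeholder records in Toolbox standard format (not real Ga data).
SAMPLE = r"""
\lx example
\ps n
\ge sample

\lx convert
\ps v
\ge change
"""

# Hypothetical mapping from part-of-speech labels to LKB lexical types.
POS_TO_TYPE = {"n": "noun-lex", "v": "verb-lex"}


def parse_toolbox(text):
    """Split standard-format text into records; each \\lx starts a new one."""
    records = []
    for line in text.splitlines():
        match = re.match(r"\\(\w+)\s*(.*)", line)
        if not match:
            continue  # blank and continuation lines are ignored in this sketch
        marker, value = match.group(1), match.group(2).strip()
        if marker == "lx":
            records.append({})
        if records:
            # Keep only the first occurrence of each marker per record.
            records[-1].setdefault(marker, value)
    return records


def to_tdl(record):
    """Render one record as a minimal TDL lexical entry."""
    lexeme = record["lx"]
    pos = record.get("ps", "x")
    lex_type = POS_TO_TYPE.get(pos, "unknown-lex")
    lines = []
    if "ge" in record:
        lines.append(f"; gloss: {record['ge']}")  # TDL line comment
    lines.append(f"{lexeme}_{pos} := {lex_type} &")
    lines.append(f'  [ STEM < "{lexeme}" > ].')
    return "\n".join(lines)


if __name__ == "__main__":
    for rec in parse_toolbox(SAMPLE):
        print(to_tdl(rec))
        print()
```

The entry identifier is built from the lexeme and its part of speech only to keep the example short; a real lexicon would also have to handle homonyms and multiple senses.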
A working portable setup is available from the author on request. However, the information and files needed to recreate the setup are included on this page. You might need to adapt them to your particular lexicon file.
Implementation
Setup
The files that belong to a Toolbox project may all be kept in the same folder. The following screenshot shows how Toolbox has to be set up to produce an LKB TDL lexicon file. The six 'consistent changes' script files are marked in green. They include a conversion from an 8-bit font to Unicode for the particular setup used for the Ga language as of 2005. As of 2012 most lexicons use a Unicode font, so these steps might be left out or adapted. The LKB lexicon is the result of the sixth step, marked in red.
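For readers who need to adapt the 8-bit-to-Unicode steps, the idea can be sketched in Python as a simple character table. This is an illustration only, not the conversion table from the script files. The byte values in LEGACY_TO_UNICODE are hypothetical and must be replaced with the code points actually used by the legacy font in question. The target characters ɛ, ɔ and ŋ are chosen because they occur in Ga orthography.

```python
# Hypothetical byte-to-character table for a legacy 8-bit font.
# The byte values on the left are assumptions, not the real Ga font layout.
LEGACY_TO_UNICODE = {
    0x8E: "\u025B",  # ɛ LATIN SMALL LETTER OPEN E
    0x8F: "\u0254",  # ɔ LATIN SMALL LETTER OPEN O
    0x90: "\u014B",  # ŋ LATIN SMALL LETTER ENG
}


def legacy_to_unicode(data: bytes) -> str:
    """Decode 8-bit data and replace the legacy font's special slots."""
    # latin-1 maps every byte to the code point with the same number,
    # so the table keys can be used directly as byte values.
    text = data.decode("latin-1")
    return text.translate(LEGACY_TO_UNICODE)


if __name__ == "__main__":
    print(legacy_to_unicode(b"\\lx k\x8f"))
```

All bytes that are not listed in the table pass through unchanged, which matches the usual situation where only a handful of slots in the legacy font were reassigned.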
Each step of the 'consistent changes' process chain must be defined. The screenshot shows the last dialog. It contains input fields for the
- input file
- 'consistent changes' script
- output file
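The structure of such a process chain can be sketched in Python as follows. The sketch assumes that the output file of one step serves as the input file of the next, which is how the chain appears to be set up here. The function name run_chain, the file names and the two toy transformations are hypothetical and merely stand in for the 'consistent changes' scripts.

```python
from pathlib import Path


def run_chain(first_input, steps, workdir):
    """Run a list of (transform, output_name) steps in order.

    Each step reads the current input file, applies its transform to the
    text, and writes the result to its own output file, which then becomes
    the input of the following step.
    """
    current = Path(first_input)
    for transform, output_name in steps:
        output = Path(workdir) / output_name
        text = current.read_text(encoding="utf-8")
        output.write_text(transform(text), encoding="utf-8")
        current = output
    return current  # path of the final output file


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as folder:
        source = Path(folder) / "lexicon.txt"
        source.write_text("\\lx example\n", encoding="utf-8")
        # Two toy steps standing in for the real scripts.
        steps = [
            (str.strip, "step1.txt"),
            (lambda text: text.replace("\\lx ", ""), "step2.txt"),
        ]
        final = run_chain(source, steps, folder)
        print(final.name, "->", final.read_text(encoding="utf-8"))
```

Keeping every intermediate file, as Toolbox does, makes it easy to inspect the result of each step when a script has to be adapted to a different lexicon.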
The script files
The script files are available in File:Toolbox-LKB-Link-CCT-tables-for-Ga-lexicon.zip
License
The presentation and this wiki page are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. The script code is under the MIT license.