Typecraft v2.5

Help:How to annotate in TypeCraft - a practical guide

Revision as of 20:32, 9 November 2012 by Dorothee Beermann (Talk | contribs) (Step 3 - sentence split)

How you create a new text, split it into sentences, and enter it into the database

You are logged in. There are no texts in your home domain called *My texts* because you have not yet started to annotate.

This is how you create a new text and split it into phrases that you can annotate.

Step 1 - open Text Editor in New Tab

To start, find *New text* in the upper navigation box on the left of your browser window. Right-click it and choose the second option from the drop-down menu, *Open Link in New Tab*. Having the TypeCraft wiki and the TypeCraft Editor open in two tabs makes it easier to go back and forth between them. You can, for example, keep the Help page open while you work in the TypeCraft Editor. You normally find your tabs in the upper part of your browser window.

Step 2 - enter text into the text field

Go to the Text Editor window by choosing either *New text* or *My texts* from your navigation bar. The Text Editor has a text field on its left side. Use this field either to import text from a file by copy and paste, or to enter sentences manually.

NOTE - any collection of phrases is a text.
By text we mean either real text or part of a real text, for example from a newspaper or a book, but also a collection of phrases, either sentences or smaller constituents such as noun phrases, which have been collected to illustrate a certain linguistic phenomenon.

Step 3 - sentence split

After you have entered your text, click on *Create phrases*. A dialog box appears. If you have not marked parts of the text (which you normally only do if you want to add more text to an already existing text), TypeCraft will ask: "Nothing selected. Should TypeCraft use the whole text instead?" Answer yes.

(Step 4 optional) - repeat sentence split

If you are unhappy with the way TypeCraft has split your sentences, clean up your text, for example by deleting blank lines between paragraphs or by inserting periods. If worst comes to worst, you will have to repeat this step several times to get the correct sentence split.
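To see why cleaning up the text helps, here is a minimal sketch of a naive sentence splitter. It is not TypeCraft's actual implementation, which this guide does not describe; the function name `split_sentences` and the rule it applies (break after a period, exclamation mark or question mark followed by whitespace) are assumptions made for illustration only.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Hypothetical splitter, NOT TypeCraft's own algorithm.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    """
    # Collapse all whitespace, including blank lines between paragraphs,
    # into single spaces.
    cleaned = re.sub(r"\s+", " ", text).strip()
    if not cleaned:
        return []
    # Split at whitespace that directly follows sentence-final punctuation.
    return re.split(r"(?<=[.!?])\s+", cleaned)


# A missing period merges two sentences into one phrase ...
merged = split_sentences("Kofi came Ama left.")
# ... and inserting the period separates them.
separated = split_sentences("Kofi came. Ama left.")
```

Under these assumptions, `merged` holds a single phrase and `separated` holds two, which is the kind of effect you can expect from inserting periods before you repeat the sentence split.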

How to annotate individual tokens

Step 1 - Initializing your annotation table

Click on one of the individual blue phrases on the right-hand side of your Text Editor window. A small window pops up, saying:

TypeCraft wants to know:
This phrase has no words yet.

You can initialise words and morphemes automatically. Separate words by spaces (" ") and morphemes by hyphens ("-").

If you click cancel you can insert words and morphemes manually.

It seems that we have added this window to confuse you ;); what we really mean is this:

You have three options for using the input mask, which now contains the sentence that you want to initialize:

  • If you want the phrase in the input mask inserted into a table without any further segmentation, click OK.
  • If you want the text in the input mask inserted into the table and also want to split some of the words into smaller segments, insert hyphens "-" or spaces " " in the phrase, then click OK. Do not be afraid of making mistakes when inserting hyphens at this point. You can always change your segmentation later.
  • If you do not want to start with the material in the input mask, but would rather fill in the table manually, click Cancel. A one-column skeleton of a table appears. Fill in the text in the top line, one word per column. To create a new column, click in an existing column: a menu appears with the options 'New word before', 'New word after' and 'Delete word'. Choosing either of the first two adds a new empty column in which you can write a word. Morphological units are entered on the second line of the table in the same way: the menu there offers 'New morpheme before', 'New morpheme after' and 'Delete morpheme', and the first two again create an empty column to be filled in manually.

You can also make these manual changes after choosing one of the first two options above. Thus you can at any point go back and correct mistakes, fill in more information, etc.

Note, however, that segments are split automatically only in option 2. Option 1 just gives you the text line in the first tier of the table.
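The initialisation rule quoted in the dialog (words separated by spaces, morphemes by hyphens) can be sketched in a few lines. This is only an illustration of the rule, not TypeCraft's own code; the function name `initialise_phrase` and the nested-list representation of the table are assumptions.

```python
def initialise_phrase(phrase: str) -> list[list[str]]:
    """Hypothetical illustration of the dialog's rule, NOT TypeCraft's code.

    Returns one inner list per word, holding that word's morphemes.
    """
    table = []
    for word in phrase.split():  # words are separated by spaces
        # Morphemes are separated by hyphens; skip empty pieces that
        # stray hyphens (as in "dog-") would otherwise produce.
        morphemes = [piece for piece in word.split("-") if piece]
        if morphemes:
            table.append(morphemes)
    return table


# Option 1: no hyphens inserted, so every word stays unsegmented.
plain = initialise_phrase("the dogs bark")
# Option 2: a hyphen inserted before clicking OK splits "dogs" in two.
segmented = initialise_phrase("the dog-s bark")
```

Here `plain` contains one segment per word, while `segmented` contains the two segments `dog` and `s` for the second word. This mirrors the difference between options 1 and 2.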