Difference between revisions of "Help:How to annotate in TypeCraft - a practical guide"
Revision as of 16:00, 19 November 2010
- 1 How you create a new text, split it into sentences, and enter it into the database
- 2 How to annotate individual tokens
How you create a new text, split it into sentences, and enter it into the database
You are logged in. There are no texts in your home domain called *My texts* because you have not yet started to annotate.
This is how you create a new text and split it into phrases that you can annotate.
Step 1 - open Text Editor in New Tab
To start, find *New text* in the upper navigation box on your left. Right-click it and choose *Open Link in New Tab*, the second option in the drop-down menu. This lets you switch more easily between your Text Editor and this Help Page. In my browser, tabs open right below the toolbar, in the upper part of the browser window.
Step 2 - enter text into the text field
Go to the Text Editor tab on your tab bar and click it. Now you are in your Text Editor. On its left side there is a Text field. You can either paste text from a file into this field or enter sentences manually. Both are possible.
- NOTE - any collection of phrases is a text.
- By Text we mean either real text, or part of a real text (for example from a newspaper or a book), or a set of phrases, either sentences or smaller constituents such as noun phrases or verb phrases, which you have collected to illustrate a certain linguistic phenomenon, or which simply represent the raw data that you have collected during your field work.
Step 3 - sentence split
After you have entered your text, click on *Create phrases*. A dialog box appears. If you have not marked parts of the text (which you normally only do if you want to add more text to an already existing text), TypeCraft will say: Nothing selected. Should TypeCraft use the whole text instead? Answer: yes!
(Step 4 optional) - repeat sentence split
If you are unhappy with the way TypeCraft has split your sentences, clean up your text, for example by deleting blank space between paragraphs or by inserting periods. If worst comes to worst, you have to repeat this step several times to get the correct sentence break-up.
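To see why inserting periods changes the result, here is a minimal sketch of a punctuation-based sentence splitter. It is only an illustration: the function name `split_into_phrases` is hypothetical, and the rule it uses (break after ".", "!" or "?" followed by white space) is an assumption, not TypeCraft's actual algorithm.

```python
import re


def split_into_phrases(text):
    """Split raw text into phrases at sentence-final punctuation.

    Assumption for illustration only: a phrase ends at ".", "!" or "?"
    followed by white space. TypeCraft's real rules may differ.
    """
    # Collapse line breaks and runs of spaces into single spaces.
    normalised = " ".join(text.split())
    # Break after sentence-final punctuation, keeping the punctuation.
    pieces = re.split(r"(?<=[.!?])\s+", normalised)
    # Drop empty pieces (for example when the input was empty).
    return [piece for piece in pieces if piece]


# Without a period the two sentences stay together as one phrase.
print(split_into_phrases("The dog barks The cat sleeps"))
# With a period inserted they are split into two phrases.
print(split_into_phrases("The dog barks. The cat sleeps"))
```

Under this assumed rule, a text without sentence-final punctuation comes out as one long phrase, which is the kind of result that Step 4 asks you to repair by editing the text and splitting again.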
How to annotate individual tokens
Step 1 - Initializing your annotation table
Click on one of the individual blue phrases on the right-hand side of your Text Editor window. A small window pops up, saying:
|TypeCraft wants to know:|
|This phrase has no words yet. You can initialise words and morphemes automatically. Separate words by spaces (" ") and morphemes by hyphens ("-"). If you click cancel you can insert words and morphemes manually.|
It seems that we have added this window to confuse you ;); what we really mean is this:
You have 3 different options for using the input mask, which now contains the sentence that you want to initialize:
- If you want the phrase in the input mask inserted into a table without any further segmentation, click OK.
- If you want the text in the input mask inserted into the table and in addition want to split some of the words into smaller segments, insert hyphens "-" or spaces " " in the phrase, then click OK. Do not be afraid of making mistakes when inserting hyphens at this point. You can always change them later.
- If you don't want to start off with the material in the input mask, but would rather fill all material into the table manually, click cancel. A one-column skeleton of a table appears, in which you fill in text in the top line, one word per column. To create a new column, click in an existing column: you get a menu in which you left-click to choose among the options 'New word before', 'New word after' and 'Delete word'. Choosing either of the first two creates a new empty column in which you can write a word. Morphological units are entered on the second line of the table in a similar manner: the menu there offers 'New morpheme before', 'New morpheme after' and 'Delete morpheme', and the first two again create an empty column to be filled in manually.
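The menu operations of the manual option can be pictured as insertions into and deletions from a row of columns. The sketch below models only the word line as a Python list. The function names are hypothetical, chosen to mirror the menu labels, and say nothing about how TypeCraft is implemented.

```python
def new_word_before(words, index):
    """Insert an empty column to the left of the column at `index`."""
    words.insert(index, "")
    return index  # position of the new, empty column


def new_word_after(words, index):
    """Insert an empty column to the right of the column at `index`."""
    words.insert(index + 1, "")
    return index + 1  # position of the new, empty column


def delete_word(words, index):
    """Remove the column at `index`."""
    del words[index]


words = [""]                      # the one-column skeleton after clicking cancel
words[0] = "dogs"                 # type a word into the only column
pos = new_word_after(words, 0)    # 'New word after' on "dogs"
words[pos] = "bark"
pos = new_word_before(words, 0)   # 'New word before' on "dogs"
words[pos] = "the"
print(words)  # ['the', 'dogs', 'bark']
```

The morpheme line could be modelled in the same way, with one such row per word.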
These manual operations are also available if you have chosen one of the first two options above. Thus, you can at any point go back and correct mistakes, fill in more information, etc.
Note, however, that segments are split automatically only in option 2, while option 1 just gives you the text line in the first tier of the table.
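The separator convention from the dialog (spaces between words, hyphens between morphemes) can be sketched in a few lines. The function `initialise_phrase` is a hypothetical name used for illustration. The sketch only shows how the separators divide a phrase and does not model how TypeCraft fills the individual tiers of the table.

```python
def initialise_phrase(phrase):
    """Return one list of morphemes per word.

    Words are separated by spaces and morphemes by hyphens, as the
    dialog describes. Illustrative sketch only, not TypeCraft's code.
    """
    return [word.split("-") for word in phrase.split()]


# Phrase left as it is (option 1): no word is segmented further.
print(initialise_phrase("the dogs barked"))
# [['the'], ['dogs'], ['barked']]

# Hyphens inserted before clicking OK (option 2): words are segmented.
print(initialise_phrase("the dog-s bark-ed"))
# [['the'], ['dog', 's'], ['bark', 'ed']]
```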