Description |
A compiled Matukar Panau corpus of 150,740 words, including words in context, speaker metadata, file metadata and, where available, parsing, glossing and translations. A subset of this corpus is included in a separate file as a morpheme corpus, with parsing and glossing of 20,359 morphemes.
Most files have been standardized for spelling. The spelling standardization script package for ELAN was developed by Jake Farrell, AI Specialist at Appen, for use by CoEDL researchers.
A lexicon from ELAN in XML format is included. An annotation guideline for clause chains is also included. Annotations are in tiers with the ELAN type "chain".
|