Item details
Item ID
DGB1-2023corpus_dict
Title Corpus and dictionary files for 2023
Description A compiled Matukar Panau corpus of 150,740 words, including words in context, speaker metadata, file metadata and, where available, parsing, glossing and translations. A subset of this corpus is included in a separate file as a morpheme corpus, with parsing and glossing of 20,359 morphemes.

Most files have been standardized for spelling. The spelling standardization script package for ELAN was developed by Jake Farrell, AI Specialist at Appen, for use by CoEDL researchers.

A lexicon from ELAN in XML format is included. An annotation guideline for clause chains is also included. Annotations are in tiers with the ELAN type "chain".
Origination date 2023-03-31
Origination date free form
Archive link https://catalog.paradisec.org.au/repository/DGB1/2023corpus_dict
URL
Collector
Danielle Barth
Countries
Language as given
Subject language(s)
Content language(s)
Dialect
Region / village Oceania
Originating university Australian National University
Operator Julia Colleen Miller
Data Categories
Data Types
Discourse type
Roles Danielle Barth : compiler
DOI 10.26278/9a0q-6423
Cite as Danielle Barth (collector), Danielle Barth (compiler), 2023. Corpus and dictionary files for 2023. PLAIN/RTF. DGB1-2023corpus_dict at catalog.paradisec.org.au. https://dx.doi.org/10.26278/9a0q-6423
Content Files (4)
Filename Type File size Duration File access
DGB1-2023corpus_dict-clause_chain_annos.txt text/plain 3.9 KB
DGB1-2023corpus_dict-dir_annos.rtf text/rtf 110 KB
DGB1-2023corpus_dict-morphs.txt text/plain 35 MB
DGB1-2023corpus_dict-words.txt text/plain 116 MB
4 files -- 152 MB -- --

Collection Information
Collection ID DGB1
Collection title Matukar Panau Language Documentation
Description Recordings collected during 2010-2022 for language documentation. Collection is ongoing. Includes traditional stories, descriptions of traditional practices, family stories, songs, myths and procedural texts including narration of videos of typical village activities.
Countries
Languages
Access Information
Edit access
View/Download access
Data access conditions Open (subject to agreeing to PDSC access conditions)
Data access narrative