Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
We evaluated a range of neural machine translation techniques developed specifically for low-resource scenarios. Unsuccessfully. In the end, we submitted two runs: (i) a standard phrase-based model, and (ii) a random babbling baseline using character trigrams. We found that it was surprisingly hard to beat (i), in spite of this model being, in theory, a bad fit for polysynthetic languages; and more interestingly, that (ii) was better than several of the submitted systems, highlighting how difficult low-resource machine translation for polysynthetic languages is.
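The abstract does not say how the babbling baseline in (ii) was built. As one possible reading, here is a minimal Python sketch of a character-trigram babbler: it counts which character follows each two-character context in target-language text, then samples output character by character without looking at the source sentence. The function names, the padding symbols, and the length cap are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict

BOS = "\x02"  # hypothetical start-of-sentence padding symbol (assumption)
EOS = "\x03"  # hypothetical end-of-sentence symbol (assumption)


def train_char_trigrams(sentences):
    """Count, for every two-character context, which character follows it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        padded = BOS * 2 + sentence + EOS
        for i in range(len(padded) - 2):
            counts[padded[i:i + 2]][padded[i + 2]] += 1
    return counts


def babble(counts, rng, max_len=200):
    """Sample one string from the trigram counts, ignoring any source input."""
    context, out = BOS * 2, []
    while len(out) < max_len:
        followers = counts.get(context)
        if not followers:  # empty model or unseen context: stop early
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        nxt = rng.choices(chars, weights=weights, k=1)[0]
        if nxt == EOS:
            break
        out.append(nxt)
        context = context[1] + nxt  # slide the two-character window
    return "".join(out)
```

A call such as `babble(train_char_trigrams(target_sentences), random.Random(0))` would produce one babbled "translation"; passing a seeded `random.Random` keeps runs reproducible. The actual system may differ in smoothing, length control, and training data.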
Original language | English |
---|---|
Title of host publication | Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021 |
Editors | Manuel Mager, Arturo Oncevay, Annette Rios, Ivan Vladimir Meza Ruiz, Alexis Palmer, Graham Neubig, Katharina Kann |
Publisher | Association for Computational Linguistics |
Publication date | 2021 |
Pages | 248-254 |
ISBN (Electronic) | 9781954085442 |
Publication status | Published - 2021 |
Event | 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021 - Virtual, Online. Duration: 11 Jun 2021 → … |
Conference
Conference | 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021 |
---|---|
City | Virtual, Online |
Period | 11/06/2021 → … |
Bibliographical note
Publisher Copyright:
© 2021 Association for Computational Linguistics
ID: 291814762