How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Final published version, 327 KB, PDF document
This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.
Original language | English
---|---
Title of host publication | Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Publisher | Association for Computational Linguistics
Publication date | 2021
Pages | 205-211
DOIs |
Publication status | Published - 2021
Event | 8th Workshop on Asian Translation (WAT2021), Online, 5 Aug 2021 → 6 Aug 2021
Conference
Conference | 8th Workshop on Asian Translation (WAT2021)
---|---
Location | Online
Period | 05/08/2021 → 06/08/2021