How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task

Publication: Contribution to book/anthology/report · Conference contribution in proceedings · Research · peer-reviewed

Documents

  • Full text

    Publisher's published version, 327 KB, PDF document

This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.
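
The abstract names the general approach (memory and compute optimization for single-GPU training) without spelling out the techniques; the full paper gives the details. As a non-authoritative illustration of that genre of optimization, the sketch below combines two standard single-GPU savers, mixed-precision training and gradient accumulation, in PyTorch. The stand-in model, batch size, learning rate, and ACCUM_STEPS value are assumptions made only to keep the example runnable, not the paper's configuration.

    import torch
    import torch.nn as nn

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Stand-in model: a real run would use a multilingual translation
    # Transformer; a two-layer MLP keeps the sketch self-contained.
    model = nn.Sequential(
        nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512)
    ).to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

    # Loss scaling for mixed precision; a no-op on CPU.
    scaler = torch.cuda.amp.GradScaler(enabled=(device.type == "cuda"))

    ACCUM_STEPS = 8  # assumed value: simulates an 8x larger batch on one GPU

    def training_step(src, tgt, step):
        # Autocast runs matmuls in float16, roughly halving activation
        # memory and speeding up compute on tensor-core GPUs.
        with torch.autocast(device_type=device.type, dtype=torch.float16,
                            enabled=(device.type == "cuda")):
            loss = nn.functional.mse_loss(model(src), tgt) / ACCUM_STEPS
        scaler.scale(loss).backward()  # accumulate scaled gradients
        if (step + 1) % ACCUM_STEPS == 0:
            scaler.step(optimizer)     # unscale gradients, apply update
            scaler.update()
            optimizer.zero_grad(set_to_none=True)  # free gradient memory

    # Tiny driver with random tensors, only to make the sketch runnable.
    for step in range(2 * ACCUM_STEPS):
        x = torch.randn(16, 512, device=device)
        y = torch.randn(16, 512, device=device)
        training_step(x, y, step)

Gradient checkpointing (torch.utils.checkpoint), which trades recomputation for activation memory, is another common lever in this setting and composes with both techniques above.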
Original language: English
Title: Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Publisher: Association for Computational Linguistics
Publication date: 2021
Pages: 205-211
DOI
Status: Published - 2021
Event: 8th Workshop on Asian Translation (WAT2021) - Online
Duration: 5 Aug 2021 - 6 Aug 2021

Conference

Conference: 8th Workshop on Asian Translation (WAT2021)
City: Online
Period: 05/08/2021 - 06/08/2021
