How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task

Publication: Contribution to book/anthology/report · Conference contribution in proceedings · Research · peer-reviewed

Documents

  • Full text

    Publisher's published version, 327 KB, PDF document

This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.
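
The abstract names the general approach (memory and compute optimization for single-GPU training) without spelling out the techniques; the full paper gives the details. As a non-authoritative illustration of that genre of optimization, the sketch below combines two standard single-GPU savers, mixed-precision training and gradient accumulation, in PyTorch. The stand-in model, batch size, learning rate, and ACCUM_STEPS value are assumptions made only to keep the example runnable, not the paper's configuration.

    import torch
    import torch.nn as nn

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Stand-in model: a real run would use a multilingual translation
    # Transformer; a two-layer MLP keeps the sketch self-contained.
    model = nn.Sequential(
        nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512)
    ).to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

    # Loss scaling for mixed precision; a no-op on CPU.
    scaler = torch.cuda.amp.GradScaler(enabled=(device.type == "cuda"))

    ACCUM_STEPS = 8  # assumed value: simulates an 8x larger batch on one GPU

    def training_step(src, tgt, step):
        # Autocast runs matmuls in float16, roughly halving activation
        # memory and speeding up compute on tensor-core GPUs.
        with torch.autocast(device_type=device.type, dtype=torch.float16,
                            enabled=(device.type == "cuda")):
            loss = nn.functional.mse_loss(model(src), tgt) / ACCUM_STEPS
        scaler.scale(loss).backward()  # accumulate scaled gradients
        if (step + 1) % ACCUM_STEPS == 0:
            scaler.step(optimizer)     # unscale gradients, apply update
            scaler.update()
            optimizer.zero_grad(set_to_none=True)  # free gradient memory

    # Tiny driver with random tensors, only to make the sketch runnable.
    for step in range(2 * ACCUM_STEPS):
        x = torch.randn(16, 512, device=device)
        y = torch.randn(16, 512, device=device)
        training_step(x, y, step)

Gradient checkpointing (torch.utils.checkpoint), which trades recomputation for activation memory, is another common lever in this setting and composes with both techniques above.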
Original language: English
Title: Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Publisher: Association for Computational Linguistics
Publication date: 2021
Pages: 205-211
DOI
Status: Published - 2021
Event: 8th Workshop on Asian Translation (WAT2021) - Online
Duration: 5 Aug 2021 - 6 Aug 2021

Conference

Conference: 8th Workshop on Asian Translation (WAT2021)
City: Online
Period: 05/08/2021 - 06/08/2021
