How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task. / Aralikatte, Rahul; Murrieta Bello, Héctor Ricardo; Hershcovich, Daniel; Bollmann, Marcel; Søgaard, Anders.

Proceedings of the 8th Workshop on Asian Translation (WAT2021). Association for Computational Linguistics, 2021. p. 205-211.

Harvard

Aralikatte, R, Murrieta Bello, HR, Hershcovich, D, Bollmann, M & Søgaard, A 2021, How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task. in Proceedings of the 8th Workshop on Asian Translation (WAT2021). Association for Computational Linguistics, pp. 205-211, 8th Workshop on Asian Translation (WAT2021), Online, 05/08/2021. https://doi.org/10.18653/v1/2021.wat-1.24

APA

Aralikatte, R., Murrieta Bello, H. R., Hershcovich, D., Bollmann, M., & Søgaard, A. (2021). How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task. In Proceedings of the 8th Workshop on Asian Translation (WAT2021) (pp. 205-211). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.wat-1.24

Vancouver

Aralikatte R, Murrieta Bello HR, Hershcovich D, Bollmann M, Søgaard A. How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task. In Proceedings of the 8th Workshop on Asian Translation (WAT2021). Association for Computational Linguistics. 2021. p. 205-211 https://doi.org/10.18653/v1/2021.wat-1.24

Author

Aralikatte, Rahul ; Murrieta Bello, Héctor Ricardo ; Hershcovich, Daniel ; Bollmann, Marcel ; Søgaard, Anders. / How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task. Proceedings of the 8th Workshop on Asian Translation (WAT2021). Association for Computational Linguistics, 2021. pp. 205-211

Bibtex

@inproceedings{c3d3230ac4a04faca8ee5677d0b6c2e0,
title = "How far can we get with one GPU in 100 hours?: CoAStaL at MultiIndicMT Shared Task",
abstract = "This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.",
author = "Rahul Aralikatte and {Murrieta Bello}, {H{\'e}ctor Ricardo} and Daniel Hershcovich and Marcel Bollmann and Anders S{\o}gaard",
year = "2021",
doi = "10.18653/v1/2021.wat-1.24",
language = "English",
pages = "205--211",
booktitle = "Proceedings of the 8th Workshop on Asian Translation (WAT2021)",
publisher = "Association for Computational Linguistics",
note = "8th Workshop on Asian Translation (WAT2021) ; Conference date: 05-08-2021 Through 06-08-2021",
}

RIS

TY - GEN
T1 - How far can we get with one GPU in 100 hours?
T2 - 8th Workshop on Asian Translation (WAT2021)
AU - Aralikatte, Rahul
AU - Murrieta Bello, Héctor Ricardo
AU - Hershcovich, Daniel
AU - Bollmann, Marcel
AU - Søgaard, Anders
PY - 2021
Y1 - 2021
N2 - This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.
AB - This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.
U2 - 10.18653/v1/2021.wat-1.24
DO - 10.18653/v1/2021.wat-1.24
M3 - Article in proceedings
SP - 205
EP - 211
BT - Proceedings of the 8th Workshop on Asian Translation (WAT2021)
PB - Association for Computational Linguistics
Y2 - 5 August 2021 through 6 August 2021
ER -
