Talk: Universality in Machine Translation

Speaker

Orhan Firat is a Research Scientist at Google Research working in the areas of sequence modelling, multilingual models, multi-task learning, and scaling neural networks.

Universality in Machine Translation: M4 - Massively Multilingual, Massive MT Models for the Next 1000 Languages

What does universality mean for machine translation? Massively multilingual models, jointly trained on hundreds of languages, have shown great success in processing different languages simultaneously in a single large model. These large multilingual models, which we call M4, are appealing for both efficiency and positive cross-lingual transfer: (1) training and deploying a single multilingual model requires far fewer resources than maintaining one model for each language considered, and (2) by transferring knowledge from high-resource languages, multilingual models are able to improve performance on low-resource languages. In this talk, we will discuss our efforts to scale machine translation models to more than 1000 languages (massive multilinguality) and beyond a trillion weights (massive size). We will detail several research (and some development) challenges the project has tackled: multi-task learning with hundreds of tasks, learning under heavy data imbalance, trainability of very deep networks, understanding the learned representations, and cross-lingual downstream transfer, along with many more insights.
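
As a point of reference for the data-imbalance challenge mentioned above, a common technique in massively multilingual training is temperature-based sampling over language pairs, which up-samples low-resource languages relative to the raw data distribution. The sketch below is a minimal illustration of that general idea, not a description of the speaker's actual training setup; the corpus sizes and function names are hypothetical.

```python
import numpy as np

def temperature_sampling_probs(example_counts, temperature=5.0):
    """Compute per-language-pair sampling probabilities.

    example_counts: dict mapping language pair -> number of training examples.
    temperature: T=1 reproduces the raw data distribution; larger T flattens
    it, up-sampling low-resource pairs (T -> infinity gives uniform sampling).
    """
    pairs = list(example_counts)
    counts = np.array([example_counts[p] for p in pairs], dtype=np.float64)
    probs = counts / counts.sum()            # raw data distribution
    probs = probs ** (1.0 / temperature)     # flatten with temperature
    probs = probs / probs.sum()              # renormalize to a distribution
    return dict(zip(pairs, probs))

# Hypothetical, heavily imbalanced corpus sizes (illustrative only).
counts = {"en-fr": 40_000_000, "en-hi": 1_500_000, "en-yo": 50_000}
print(temperature_sampling_probs(counts, temperature=5.0))
```

With temperature 1.0 the low-resource pair above would almost never be sampled; at temperature 5.0 its sampling probability rises substantially, which is the basic trade-off between serving high-resource quality and enabling transfer to low-resource languages.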