Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Documents

  • Fulltext

    Final published version, 2.19 MB, PDF document

Multimodal machine translation (MMT) systems have been successfully developed in recent years for a few language pairs. However, training such models usually requires tuples of a source language text, target language text, and images. Obtaining these data involves expensive human annotation, making it difficult to develop models for unseen text-only language pairs. In this work, we propose the task of zero-shot cross-modal machine translation, which aims to transfer multimodal knowledge from an existing multimodal parallel corpus into a new translation direction. We also introduce a novel MMT model with a visual prediction network that learns visual features grounded on multimodal parallel data and provides pseudo-features for text-only language pairs. With this training paradigm, our MMT model outperforms its text-only counterpart. In our extensive analyses, we show that (i) the selection of visual features is important, and (ii) training on image-aware translations and being grounded on a similar language pair are mandatory. Our code is available at https://github.com/toshohirasawa/zeroshot-crossmodal-mt.
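The abstract describes a visual prediction network that supplies pseudo visual features for language pairs without images. The sketch below is only an illustrative interpretation of that idea, not the authors' implementation (see the linked repository for the actual code); the class name, dimensions, and pooling scheme are assumptions chosen for clarity.

```python
import torch
import torch.nn as nn


class VisualPredictionNetwork(nn.Module):
    """Hypothetical sketch: predict a pseudo visual feature vector from
    source-side text encoder states, so that translation directions
    without paired images can still receive a "visual" input."""

    def __init__(self, d_model: int = 512, d_visual: int = 2048):
        super().__init__()
        # Project mean-pooled encoder states into the visual feature
        # space (e.g., the dimensionality of pre-extracted image features).
        self.proj = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.ReLU(),
            nn.Linear(d_model, d_visual),
        )

    def forward(self, enc_states: torch.Tensor, src_mask: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, src_len, d_model)
        # src_mask:   (batch, src_len), 1 for real tokens, 0 for padding
        mask = src_mask.unsqueeze(-1).float()
        pooled = (enc_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.proj(pooled)  # (batch, d_visual) pseudo visual feature


if __name__ == "__main__":
    batch, src_len, d_model = 2, 7, 512
    vpn = VisualPredictionNetwork(d_model=d_model, d_visual=2048)
    enc_states = torch.randn(batch, src_len, d_model)
    src_mask = torch.ones(batch, src_len)
    pseudo_visual = vpn(enc_states, src_mask)
    # During multimodal training, this output could be regressed toward
    # real image features (e.g., with an MSE loss); for text-only pairs
    # it would stand in for the missing image features at inference time.
    print(pseudo_visual.shape)  # torch.Size([2, 2048])
```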

Original language: English
Title of host publication: Proceedings of the 8th Conference on Machine Translation, WMT 2023
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2023
Pages: 520-533
ISBN (Electronic): 9798891760417
DOIs
Publication status: Published - 2023
Event: 8th Conference on Machine Translation, WMT 2023 - Singapore, Singapore
Duration: 6 Dec 2023 - 7 Dec 2023

Conference

Conference: 8th Conference on Machine Translation, WMT 2023
Country: Singapore
City: Singapore
Period: 06/12/2023 - 07/12/2023

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.

ID: 377814940