Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Multimodal machine translation (MMT) systems have been successfully developed in recent years for a few language pairs. However, training such models usually requires tuples of a source language text, target language text, and images. Obtaining these data involves expensive human annotations, making it difficult to develop models for unseen text-only language pairs. In this work, we propose the task of zero-shot cross-modal machine translation, which aims to transfer multimodal knowledge from an existing multimodal parallel corpus into a new translation direction. We also introduce a novel MMT model with a visual prediction network to learn visual features grounded on multimodal parallel data and provide pseudo-features for text-only language pairs. With this training paradigm, our MMT model outperforms its text-only counterpart. In our extensive analyses, we show that (i) the selection of visual features is important, and (ii) training on image-aware translations and being grounded on a similar language pair are mandatory. Our code is available at https://github.com/toshohirasawa/zeroshot-crossmodal-mt.
Original language | English |
---|---|
Title of host publication | Proceedings of the 8th Conference on Machine Translation, WMT 2023 |
Publisher | Association for Computational Linguistics (ACL) |
Publication date | 2023 |
Pages | 520-533 |
ISBN (Electronic) | 9798891760417 |
DOIs | |
Publication status | Published - 2023 |
Event | 8th Conference on Machine Translation, WMT 2023 - Singapore, Singapore. Duration: 6 Dec 2023 → 7 Dec 2023 |
Conference
Conference | 8th Conference on Machine Translation, WMT 2023 |
---|---|
Country | Singapore |
City | Singapore |
Period | 06/12/2023 → 07/12/2023 |
Bibliographical note
Publisher Copyright:
© 2023 Association for Computational Linguistics.
ID: 377814940