Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic

Research output: Chapter in Book/Report/Conference proceeding > Article in proceedings > Research > peer-review

Standard

Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic. / Snæbjarnarson, Vésteinn; Einarsson, Hafsteinn.

MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop. ed. / Akari Asai; Eunsol Choi; Jonathan H. Clark; Junjie Hu; Chia-Hsuan Lee; Jungo Kasai; Shayne Longpre; Ikuya Yamada; Rui Zhang. Association for Computational Linguistics (ACL), 2022. p. 29-36.

Harvard

Snæbjarnarson, V & Einarsson, H 2022, Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic. in A Asai, E Choi, JH Clark, J Hu, C-H Lee, J Kasai, S Longpre, I Yamada & R Zhang (eds), MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop. Association for Computational Linguistics (ACL), pp. 29-36, 2022 Workshop on Multilingual Information Access, MIA 2022, Seattle, United States, 15/07/2022.

APA

Snæbjarnarson, V., & Einarsson, H. (2022). Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic. In A. Asai, E. Choi, J. H. Clark, J. Hu, C-H. Lee, J. Kasai, S. Longpre, I. Yamada, & R. Zhang (Eds.), MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop (pp. 29-36). Association for Computational Linguistics (ACL).

Vancouver

Snæbjarnarson V, Einarsson H. Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic. In Asai A, Choi E, Clark JH, Hu J, Lee C-H, Kasai J, Longpre S, Yamada I, Zhang R, editors, MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop. Association for Computational Linguistics (ACL). 2022. p. 29-36

Author

Snæbjarnarson, Vésteinn ; Einarsson, Hafsteinn. / Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic. MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop. editor / Akari Asai ; Eunsol Choi ; Jonathan H. Clark ; Junjie Hu ; Chia-Hsuan Lee ; Jungo Kasai ; Shayne Longpre ; Ikuya Yamada ; Rui Zhang. Association for Computational Linguistics (ACL), 2022. pp. 29-36

Bibtex

@inproceedings{7fa579f47b2f4cd4acc08677af750a59,
title = "Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic",
abstract = "It can be challenging to build effective open question answering (open QA) systems for languages other than English, mainly due to a lack of labeled data for training. We present a data efficient method to bootstrap such a system for languages other than English. Our approach requires only limited QA resources in the given language, along with machine-translated data, and at least a bilingual language model. To evaluate our approach, we build such a system for the Icelandic language and evaluate performance over trivia style datasets. The corpora used for training are English in origin but machine translated into Icelandic. We train a bilingual Icelandic/English language model to embed English context and Icelandic questions following methodology introduced with DensePhrases (Lee et al., 2021). The resulting system is an open domain cross-lingual QA system between Icelandic and English. Finally, the system is adapted for Icelandic only open QA, demonstrating how it is possible to efficiently create an open QA system with limited access to curated datasets in the language of interest.",
author = "V{\'e}steinn Sn{\ae}bjarnarson and Hafsteinn Einarsson",
note = "Publisher Copyright: {\textcopyright} 2022 Association for Computational Linguistics.; 2022 Workshop on Multilingual Information Access, MIA 2022 ; Conference date: 15-07-2022",
year = "2022",
language = "English",
pages = "29--36",
editor = "Akari Asai and Eunsol Choi and Clark, {Jonathan H.} and Junjie Hu and Chia-Hsuan Lee and Jungo Kasai and Shayne Longpre and Ikuya Yamada and Rui Zhang",
booktitle = "MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop",
publisher = "Association for Computational Linguistics (ACL)",
address = "United States",

}

RIS

TY - GEN

T1 - Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic

AU - Snæbjarnarson, Vésteinn

AU - Einarsson, Hafsteinn

N1 - Publisher Copyright: © 2022 Association for Computational Linguistics.

PY - 2022

Y1 - 2022

AB - It can be challenging to build effective open question answering (open QA) systems for languages other than English, mainly due to a lack of labeled data for training. We present a data efficient method to bootstrap such a system for languages other than English. Our approach requires only limited QA resources in the given language, along with machine-translated data, and at least a bilingual language model. To evaluate our approach, we build such a system for the Icelandic language and evaluate performance over trivia style datasets. The corpora used for training are English in origin but machine translated into Icelandic. We train a bilingual Icelandic/English language model to embed English context and Icelandic questions following methodology introduced with DensePhrases (Lee et al., 2021). The resulting system is an open domain cross-lingual QA system between Icelandic and English. Finally, the system is adapted for Icelandic only open QA, demonstrating how it is possible to efficiently create an open QA system with limited access to curated datasets in the language of interest.

UR - http://www.scopus.com/inward/record.url?scp=85139142520&partnerID=8YFLogxK

M3 - Article in proceedings

AN - SCOPUS:85139142520

SP - 29

EP - 36

BT - MIA 2022 - Workshop on Multilingual Information Access, Proceedings of the Workshop

A2 - Asai, Akari

A2 - Choi, Eunsol

A2 - Clark, Jonathan H.

A2 - Hu, Junjie

A2 - Lee, Chia-Hsuan

A2 - Kasai, Jungo

A2 - Longpre, Shayne

A2 - Yamada, Ikuya

A2 - Zhang, Rui

PB - Association for Computational Linguistics (ACL)

T2 - 2022 Workshop on Multilingual Information Access, MIA 2022

Y2 - 15 July 2022

ER -

ID: 371184644