Parameter sharing between dependency parsers for related languages
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Parameter sharing between dependency parsers for related languages. / Lhoneux, Miryam de; Bjerva, Johannes; Augenstein, Isabelle; Søgaard, Anders.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018. p. 4992-4997.
RIS
TY - GEN
T1 - Parameter sharing between dependency parsers for related languages
AU - Lhoneux, Miryam de
AU - Bjerva, Johannes
AU - Augenstein, Isabelle
AU - Søgaard, Anders
PY - 2018
Y1 - 2018
N2 - Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to better performance, but there is no consensus on what parameters to share. We present an evaluation of 27 different parameter sharing strategies across 10 languages, representing five pairs of related languages, each pair from a different language family. We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies. Based on this result, we propose an architecture where the transition classifier is shared, and the sharing of word and character parameters is controlled by a parameter that can be tuned on validation data. This model is linguistically motivated and obtains significant improvements over a mono-lingually trained baseline. We also find that sharing transition classifier parameters helps when training a parser on unrelated language pairs, but we find that, in the case of unrelated languages, sharing too many parameters does not help.
AB - Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to better performance, but there is no consensus on what parameters to share. We present an evaluation of 27 different parameter sharing strategies across 10 languages, representing five pairs of related languages, each pair from a different language family. We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies. Based on this result, we propose an architecture where the transition classifier is shared, and the sharing of word and character parameters is controlled by a parameter that can be tuned on validation data. This model is linguistically motivated and obtains significant improvements over a mono-lingually trained baseline. We also find that sharing transition classifier parameters helps when training a parser on unrelated language pairs, but we find that, in the case of unrelated languages, sharing too many parameters does not help.
M3 - Article in proceedings
SP - 4992
EP - 4997
BT - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
PB - Association for Computational Linguistics
Y2 - 31 October 2018 through 4 November 2018
ER -
ID: 214507219
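
Illustrative sketch of the sharing scheme described in the abstract. This is not the authors' implementation; it is a minimal, hypothetical PyTorch example in which the transition classifier is always shared between two related languages, while the character and word BiLSTM encoders are shared or kept language-specific depending on flags (share_char_lstm, share_word_lstm) that stand in for the sharing parameter tuned on validation data. All class, parameter, and variable names are invented for illustration.

# Hypothetical sketch (PyTorch), not the paper's code: shared transition
# classifier, optionally shared word/char BiLSTMs for a pair of languages.
import torch
import torch.nn as nn

class SharedPairParser(nn.Module):
    def __init__(self, vocab_sizes, char_vocab_sizes, n_transitions,
                 share_word_lstm=True, share_char_lstm=False,
                 word_dim=100, char_dim=50, hidden_dim=200):
        super().__init__()
        langs = list(vocab_sizes)  # e.g. ["sv", "da"]
        # Embeddings stay language-specific in this sketch.
        self.word_emb = nn.ModuleDict(
            {l: nn.Embedding(vocab_sizes[l], word_dim) for l in langs})
        self.char_emb = nn.ModuleDict(
            {l: nn.Embedding(char_vocab_sizes[l], char_dim) for l in langs})

        def make_lstms(input_dim, share):
            if share:
                # One BiLSTM object reused for both languages = shared parameters.
                lstm = nn.LSTM(input_dim, hidden_dim // 2,
                               bidirectional=True, batch_first=True)
                return nn.ModuleDict({l: lstm for l in langs})
            return nn.ModuleDict(
                {l: nn.LSTM(input_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True) for l in langs})

        self.char_lstm = make_lstms(char_dim, share_char_lstm)
        self.word_lstm = make_lstms(word_dim + hidden_dim, share_word_lstm)

        # Transition classifier: always shared, per the paper's main finding.
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, n_transitions))

    def forward(self, lang, word_ids, char_ids):
        # char_ids: (n_words, max_chars); use the last char-BiLSTM state per word.
        char_vecs, _ = self.char_lstm[lang](self.char_emb[lang](char_ids))
        char_repr = char_vecs[:, -1, :]
        tokens = torch.cat([self.word_emb[lang](word_ids), char_repr], dim=-1)
        contextual, _ = self.word_lstm[lang](tokens.unsqueeze(0))
        # A real transition-based parser would score transitions from features of
        # stack/buffer configurations; here we simply score every token.
        return self.classifier(contextual.squeeze(0))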