Multi-task learning for historical text normalization: Size matters
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Multi-task learning for historical text normalization: Size matters. / Bollmann, Marc Marcel; Søgaard, Anders; Bingel, Joachim.
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP. Association for Computational Linguistics, 2018. p. 19–24.
Bibtex
@inproceedings{bollmann2018multitask,
  title     = {Multi-task learning for historical text normalization: Size matters},
  author    = {Bollmann, Marc Marcel and S{\o}gaard, Anders and Bingel, Joachim},
  booktitle = {Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP},
  publisher = {Association for Computational Linguistics},
  pages     = {19--24},
  year      = {2018},
}
RIS
TY - GEN
T1 - Multi-task learning for historical text normalization: Size matters
AU - Bollmann, Marc Marcel
AU - Søgaard, Anders
AU - Bingel, Joachim
PY - 2018
Y1 - 2018
N2 - Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. Previous work has been limited to datasets from a specific language and a specific historical period, and it is not clear whether results generalize. It therefore remains an open problem when historical text normalization benefits from multi-task learning. We explore the benefits of multi-task learning across 10 different datasets, representing different languages and periods. Our main finding, contrary to what has been observed for other NLP tasks, is that multi-task learning mainly works when target task data is very scarce.
AB - Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. Previous work has been limited to datasets from a specific language and a specific historical period, and it is not clear whether results generalize. It therefore remains an open problem when historical text normalization benefits from multi-task learning. We explore the benefits of multi-task learning across 10 different datasets, representing different languages and periods. Our main finding, contrary to what has been observed for other NLP tasks, is that multi-task learning mainly works when target task data is very scarce.
M3 - Article in proceedings
SP - 19
EP - 24
BT - Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP
PB - Association for Computational Linguistics
Y2 - 19 July 2018 through 19 July 2018
ER -