
Multi-task learning for historical text normalization: Size matters

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Marc Marcel Bollmann
Anders Søgaard
Joachim Bingel

Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. Previous work has been limited to datasets from a specific language and a specific historical period, and it is not clear whether results generalize. It therefore remains an open problem when historical text normalization benefits from multi-task learning. We explore the benefits of multi-task learning across 10 different datasets, representing different languages and periods. Our main finding, contrary to what has been observed for other NLP tasks, is that multi-task learning mainly works when target task data is very scarce.

Original language	English
Title of host publication	Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP
Publisher	Association for Computational Linguistics
Publication date	2018
Pages	19–24
Publication status	Published - 2018
Event	Workshop on Deep Learning Approaches for Low-Resource NLP - Melbourne, Australia. Duration: 19 Jul 2018 → 19 Jul 2018

Workshop

Workshop	Workshop on Deep Learning Approaches for Low-Resource NLP
Country	Australia
City	Melbourne
Period	19/07/2018 → 19/07/2018

ID: 214754949

Department of Computer Science
