Noisy Channel for Low Resource Grammatical Error Correction

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research

This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and a language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types, and 2) OpenAI’s GPT-2 model, which can take previous sentences as context. Finally, we use beam search to find the optimal combination of corrections.
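The noisy-channel idea in the abstract can be illustrated with a minimal sketch: for an observed sentence x, pick the correction c that maximizes P(x | c) · P(c), where P(x | c) comes from edit frequencies and P(c) from a language model. Everything concrete below is an assumption made for illustration and not taken from the paper: the edit counts in `EDIT_COUNTS` are invented, the bigram table `BIGRAM_LOGPROB` is a toy stand-in for BERT or GPT-2, and the function names are hypothetical.

```python
import math
from collections import Counter

# Invented edit counts (assumption): (intended word, observed word) -> frequency.
# In the paper these frequencies come from the Wikipedia edit history.
EDIT_COUNTS = Counter({
    ("their", "there"): 30, ("their", "their"): 970,
    ("there", "their"): 20, ("there", "there"): 980,
    ("than", "then"): 40, ("than", "than"): 960,
    ("then", "than"): 10, ("then", "then"): 990,
})

# Toy bigram language model (assumption), standing in for a pre-trained model.
BIGRAM_LOGPROB = {
    ("like", "their"): -1.0,
    ("their", "house"): -1.0,
    ("more", "than"): -1.0,
}
DEFAULT_LOGPROB = -6.0  # score for any bigram not listed above


def lm_logprob(prev, word):
    """Log-probability of `word` given the previous word."""
    return BIGRAM_LOGPROB.get((prev, word), DEFAULT_LOGPROB)


def channel_logprob(observed, intended):
    """log P(observed | intended), estimated from relative edit frequencies."""
    total = sum(c for (i, _), c in EDIT_COUNTS.items() if i == intended)
    if total == 0:
        # Word never seen in the edit data: assume it is only ever left unchanged.
        return 0.0 if observed == intended else float("-inf")
    count = EDIT_COUNTS[(intended, observed)]
    return math.log(count / total) if count else float("-inf")


def confusion_set(observed):
    """All intended words that were seen being written as `observed`, plus itself."""
    candidates = {i for (i, o), c in EDIT_COUNTS.items() if o == observed and c > 0}
    candidates.add(observed)
    return sorted(candidates)


def correct(tokens, beam_width=3):
    """Left-to-right beam search over per-token corrections."""
    beam = [(0.0, [])]  # each hypothesis: (log score, corrected tokens so far)
    for observed in tokens:
        expanded = []
        for score, prefix in beam:
            prev = prefix[-1] if prefix else "<s>"
            for cand in confusion_set(observed):
                new_score = (score
                             + channel_logprob(observed, cand)
                             + lm_logprob(prev, cand))
                expanded.append((new_score, prefix + [cand]))
        # Keep only the `beam_width` best partial corrections.
        expanded.sort(key=lambda h: h[0], reverse=True)
        beam = expanded[:beam_width]
    return beam[0][1]


if __name__ == "__main__":
    print(" ".join(correct("i like there house more then yours".split())))
```

With these toy numbers, the language-model gain for "like their" and "more than" outweighs the channel penalty for assuming an edit, so both words are rewritten. A real system would replace `lm_logprob` with scores from a pre-trained model and would typically weight the two log-probabilities against each other.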
Original language: English
Title: Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Publisher: Association for Computational Linguistics
Publication date: 2019
Pages: 191-196
DOI
Status: Published - 2019
Event: 14th Workshop on Innovative Use of NLP for Building Educational Applications - Florence, Italy
Duration: 2 Aug 2019 → …

Workshop

Workshop: 14th Workshop on Innovative Use of NLP for Building Educational Applications
City: Florence, Italy
Period: 02/08/2019 → …

ID: 240410261