Noisy Channel for Low Resource Grammatical Error Correction

Publication: Contribution to book/anthology/report › Article in proceedings › Research

Standard

Noisy Channel for Low Resource Grammatical Error Correction. / Flachs, Simon; Lacroix, Ophélie; Søgaard, Anders.

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, 2019. pp. 191-196.

Harvard

Flachs, S, Lacroix, O & Søgaard, A 2019, Noisy Channel for Low Resource Grammatical Error Correction. in Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, pp. 191-196, 14th Workshop on Innovative Use of NLP for Building Educational Applications, Florence, Italy, 02/08/2019. https://doi.org/10.18653/v1/W19-4420

APA

Flachs, S., Lacroix, O., & Søgaard, A. (2019). Noisy Channel for Low Resource Grammatical Error Correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 191-196). Association for Computational Linguistics. https://doi.org/10.18653/v1/W19-4420

Vancouver

Flachs S, Lacroix O, Søgaard A. Noisy Channel for Low Resource Grammatical Error Correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics. 2019. pp. 191-196. https://doi.org/10.18653/v1/W19-4420

Author

Flachs, Simon ; Lacroix, Ophélie ; Søgaard, Anders. / Noisy Channel for Low Resource Grammatical Error Correction. Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, 2019. pp. 191-196.

Bibtex

@inproceedings{cabe3103fd8c44a59af8830a77fb10be,
title = "Noisy Channel for Low Resource Grammatical Error Correction",
abstract = "This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types and 2) OpenAI’s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.",
author = "Simon Flachs and Oph{\'e}lie Lacroix and Anders S{\o}gaard",
year = "2019",
doi = "10.18653/v1/W19-4420",
language = "English",
pages = "191--196",
booktitle = "Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications",
publisher = "Association for Computational Linguistics",
note = "14th Workshop on Innovative Use of NLP for Building Educational Applications ; Conference date: 02-08-2019",

}

RIS

TY - GEN

T1 - Noisy Channel for Low Resource Grammatical Error Correction

AU - Flachs, Simon

AU - Lacroix, Ophélie

AU - Søgaard, Anders

PY - 2019

Y1 - 2019

N2 - This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types and 2) OpenAI’s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.

AB - This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types and 2) OpenAI’s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.

U2 - 10.18653/v1/W19-4420

DO - 10.18653/v1/W19-4420

M3 - Article in proceedings

SP - 191

EP - 196

BT - Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

PB - Association for Computational Linguistics

Y2 - 2 August 2019

ER -

ID: 240410261
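
The abstract above outlines the method's core idea: score each candidate correction by combining a language model P(candidate) with a channel model P(source | candidate) estimated from edit frequencies, and search the space of corrections with beam search. The following is a minimal toy sketch of that noisy-channel scoring scheme; the confusion sets, edit probabilities, and unigram LM below are invented stand-ins for illustration, not the paper's Wikipedia-derived statistics or its BERT/GPT-2 language models.

```python
import math

# Toy confusion sets: plausible corrections for each observed token.
# The paper derives these from the Wikipedia edit history.
CONFUSION = {
    "their": ["their", "there", "they're"],
    "is": ["is", "are"],
}

# Toy channel model P(observed | intended), as if estimated from edit counts.
CHANNEL = {
    ("their", "there"): 0.4,
    ("is", "are"): 0.6,
}

# Toy unigram LM; the paper uses BERT and GPT-2 instead.
LM = {"there": 0.05, "their": 0.02, "they're": 0.005,
      "is": 0.05, "are": 0.08, "two": 0.01, "cats": 0.001}

def channel_logp(observed, intended):
    if observed == intended:
        return math.log(0.9)  # most tokens are left unchanged
    return math.log(CHANNEL.get((observed, intended), 1e-6))

def lm_logp(token):
    return math.log(LM.get(token, 1e-6))

def correct(tokens, beam_size=3):
    """Beam search over per-token corrections, combining LM and channel scores."""
    beams = [([], 0.0)]  # (corrected prefix, cumulative log score)
    for tok in tokens:
        candidates = CONFUSION.get(tok, [tok])
        expanded = [
            (prefix + [cand], score + lm_logp(cand) + channel_logp(tok, cand))
            for prefix, score in beams
            for cand in candidates
        ]
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams[0][0]

print(correct(["their", "is", "two", "cats"]))
# → ['there', 'are', 'two', 'cats']
```

With these toy probabilities, the channel and LM scores jointly prefer "there are" over the observed "their is", illustrating how the two models trade off fidelity to the input against fluency of the output.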