Noisy Channel for Low Resource Grammatical Error Correction

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research

Standard

Noisy Channel for Low Resource Grammatical Error Correction. / Flachs, Simon; Lacroix, Ophélie; Søgaard, Anders.

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, 2019. p. 191-196.


Harvard

Flachs, S, Lacroix, O & Søgaard, A 2019, Noisy Channel for Low Resource Grammatical Error Correction. in Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, pp. 191-196, 14th Workshop on Innovative Use of NLP for Building Educational Applications, Florence, Italy, 02/08/2019. https://doi.org/10.18653/v1/W19-4420

APA

Flachs, S., Lacroix, O., & Søgaard, A. (2019). Noisy Channel for Low Resource Grammatical Error Correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 191-196). Association for Computational Linguistics. https://doi.org/10.18653/v1/W19-4420

Vancouver

Flachs S, Lacroix O, Søgaard A. Noisy Channel for Low Resource Grammatical Error Correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics. 2019. p. 191-196 https://doi.org/10.18653/v1/W19-4420

Author

Flachs, Simon ; Lacroix, Ophélie ; Søgaard, Anders. / Noisy Channel for Low Resource Grammatical Error Correction. Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, 2019. pp. 191-196

Bibtex

@inproceedings{cabe3103fd8c44a59af8830a77fb10be,
title = "Noisy Channel for Low Resource Grammatical Error Correction",
abstract = "This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google{\textquoteright}s BERT model, which we fine-tune for specific error types and 2) OpenAI{\textquoteright}s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.",
author = "Simon Flachs and Oph{\'e}lie Lacroix and Anders S{\o}gaard",
year = "2019",
doi = "10.18653/v1/W19-4420",
language = "English",
pages = "191--196",
booktitle = "Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications",
publisher = "Association for Computational Linguistics",
note = "14th Workshop on Innovative Use of NLP for Building Educational Applications ; Conference date: 02-08-2019",

}

RIS

TY - GEN

T1 - Noisy Channel for Low Resource Grammatical Error Correction

AU - Flachs, Simon

AU - Lacroix, Ophélie

AU - Søgaard, Anders

PY - 2019

Y1 - 2019

N2 - This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types and 2) OpenAI’s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.

AB - This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types and 2) OpenAI’s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.

U2 - 10.18653/v1/W19-4420

DO - 10.18653/v1/W19-4420

M3 - Article in proceedings

SP - 191

EP - 196

BT - Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

PB - Association for Computational Linguistics

Y2 - 2 August 2019

ER -
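The approach the abstract describes — a noisy channel that combines a channel model (edit frequencies over confusion sets) with a language model, searched with beam search — can be sketched as a minimal toy example. Everything below is an invented stand-in: the confusion sets, frequencies, and unigram LM are illustrative only (the paper estimates the channel model from Wikipedia edit history and uses BERT / GPT-2 as language models).

```python
import math

# Toy confusion sets with edit counts (channel model). CONFUSIONS[observed][candidate]
# counts how often 'candidate' was written as 'observed'. All numbers are invented
# for illustration; the paper derives these from Wikipedia edit history.
CONFUSIONS = {
    "their": {"there": 8, "their": 2},
    "there": {"their": 2, "there": 12},
    "is": {"are": 4, "is": 20},
}

def channel_logprob(observed, candidate):
    """log P(observed | candidate), estimated from relative edit frequencies."""
    table = CONFUSIONS.get(observed, {observed: 1})
    total = sum(table.values())
    return math.log(table.get(candidate, 0.1) / total)  # 0.1 = crude smoothing

# Stand-in unigram language model; the paper uses BERT / GPT-2 instead.
LM = {"there": 0.05, "their": 0.03, "is": 0.1, "are": 0.08, "cat": 0.01, "a": 0.1}

def lm_logprob(word):
    return math.log(LM.get(word, 1e-4))

def correct(sentence, beam_size=3):
    """Beam search over per-token corrections, scoring channel + LM jointly."""
    beams = [([], 0.0)]  # (prefix of corrected tokens, cumulative log score)
    for word in sentence.split():
        candidates = set(CONFUSIONS.get(word, {})) | {word}
        new_beams = []
        for prefix, score in beams:
            for cand in candidates:
                s = score + channel_logprob(word, cand) + lm_logprob(cand)
                new_beams.append((prefix + [cand], s))
        beams = sorted(new_beams, key=lambda b: -b[1])[:beam_size]
    return " ".join(beams[0][0])
```

With these toy counts, `correct("their is a cat")` yields `"there is a cat"`: the channel model makes "there → their" a frequent edit, and the LM prefers "there" in this slot, so the combined score favors the correction while leaving already-correct tokens alone.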
