Experiments with crowdsourced re-annotation of a POS tagging data set
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Crowdsourcing lets us collect multiple annotations for an item from several annotators. Typically, these are annotations for non-sequential classification tasks. While there has been some work on crowdsourcing named entity annotations, researchers have assumed that syntactic tasks such as part-of-speech (POS) tagging cannot be crowdsourced. This paper shows that workers can actually annotate sequential data almost as well as experts. Further, we show that the models learned from crowdsourced annotations fare as well as the models learned from expert annotations in downstream tasks.
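The abstract does not spell out how the multiple crowdsourced labels per token are combined into a single annotation. As one common baseline (not necessarily the aggregation used in the paper), a minimal majority-vote sketch over hypothetical worker data might look like this in Python:

```python
from collections import Counter

def aggregate_pos(annotations):
    """Majority-vote aggregation of token-level POS annotations.

    `annotations` is a list of label sequences, one per annotator,
    all aligned to the same token sequence.
    """
    aggregated = []
    for labels in zip(*annotations):  # labels for one token across annotators
        (winner, _count), = Counter(labels).most_common(1)
        aggregated.append(winner)
    return aggregated

# Hypothetical example: three crowd workers tag the same four-token sentence.
workers = [
    ["DET", "NOUN", "VERB", "ADV"],
    ["DET", "NOUN", "VERB", "ADJ"],
    ["DET", "NOUN", "VERB", "ADV"],
]
print(aggregate_pos(workers))  # ['DET', 'NOUN', 'VERB', 'ADV']
```

Plain voting ignores ties and differences in annotator reliability; weighted or model-based aggregation schemes are often preferred in practice.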
| Field | Value |
| --- | --- |
| Original language | English |
| Title of host publication | Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) |
| Place of publication | Baltimore, Maryland |
| Publisher | Association for Computational Linguistics |
| Publication date | Jun 2014 |
| Pages | 377-382 |
| Publication status | Published - Jun 2014 |