Guideline Bias in Wizard-of-Oz Dialogues

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Documents

Full text
Publisher's published version, 304 KB, PDF document

Victor Petrén Bach Hansen
Anders Søgaard

NLP models struggle with generalization due to sampling and annotator bias. This paper focuses on a different kind of bias that has received very little attention: guideline bias, i.e., the bias introduced by how our annotator guidelines are formulated. We examine two recently introduced dialogue datasets, CCPE-M and Taskmaster-1, both collected by trained assistants in a Wizard-of-Oz set-up. For CCPE-M, we show how a simple lexical bias for the word "like" in the guidelines biases the data collection. This bias, in effect, leads to poor performance on data without this bias: a preference elicitation architecture based on BERT suffers a 5.3% absolute drop in performance when "like" is replaced with a synonymous phrase, and a 13.2% drop in performance when evaluated on out-of-sample data. For Taskmaster-1, we show how the order in which instructions are presented biases the data collection.
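
The lexical perturbation mentioned in the abstract (replacing "like" with a synonymous phrase) can be illustrated with a minimal Python sketch. The replacement word `enjoy`, the function name `replace_like`, and the example utterances are hypothetical: the abstract does not state which phrase was used or how the substitution was implemented.

```python
import re

# Hypothetical substitute; the abstract does not name the synonymous phrase.
REPLACEMENT = "enjoy"

# Match "like" only as a standalone word, so "dislike" or "likely" are untouched.
# Inflected forms such as "likes" or "liked" are deliberately not handled here.
LIKE_PATTERN = re.compile(r"\blike\b", flags=re.IGNORECASE)


def replace_like(utterance: str, replacement: str = REPLACEMENT) -> str:
    """Return the utterance with every standalone 'like' replaced."""
    return LIKE_PATTERN.sub(replacement, utterance)


if __name__ == "__main__":
    # Illustrative utterances, not taken from CCPE-M.
    examples = [
        "What kind of movies do you like?",
        "I dislike horror films.",
    ]
    for utterance in examples:
        print(replace_like(utterance))
```

This pattern does not distinguish the verb "like" from the preposition (as in "movies like Inception"), so a faithful reproduction would need part-of-speech filtering or manual review of the substitutions.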

Original language	English
Title	BPPF 2021 - 1st Workshop on Benchmarking: Past, Present and Future, Proceedings
Editors	Kenneth Church, Mark Liberman, Valia Kordoni
Publisher	Association for Computational Linguistics
Publication date	2021
Pages	8-14
ISBN (electronic)	9781954085589
DOI	https://doi.org/10.18653/v1/2021.bppf-1.2
Status	Published - 2021
Event	1st Workshop on Benchmarking: Past, Present and Future, BPPF 2021 - Virtual, Bangkok, Thailand. Duration: 5 Aug 2021 → 6 Aug 2021

Conference

Conference	1st Workshop on Benchmarking: Past, Present and Future, BPPF 2021
Country	Thailand
City	Virtual, Bangkok
Period	05/08/2021 → 06/08/2021

Bibliographical note

ID: 291812390

Department of Computer Science