Claim Check-Worthiness Detection as Positive Unlabelled Learning

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

As the first step of automatic fact checking, claim check-worthiness detection is a critical component of fact checking systems. Multiple lines of research study this problem: check-worthiness ranking from political speeches and debates, rumour detection on Twitter, and citation needed detection from Wikipedia. To date, there has been no structured comparison of these various tasks to understand their relatedness, and no investigation into whether a unified approach to all of them is achievable. In this work, we illuminate a central challenge in claim check-worthiness detection underlying all of these tasks, namely that they hinge upon detecting both how factual a sentence is and how likely it is to be believed without verification. As such, annotators only mark those instances they judge to be clear-cut check-worthy. Our best performing method is a unified approach which automatically corrects for this using a variant of positive unlabelled learning that finds instances which were incorrectly labelled as not check-worthy. In applying this, we outperform the state of the art in two of the three tasks studied for claim check-worthiness detection in English.
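
To make the idea of positive unlabelled learning in the abstract more concrete, the following is a minimal Python sketch of one classic formulation (in the style of Elkan and Noto), not the specific variant used in the paper. It assumes a classifier has already been trained to separate sentences annotated as check-worthy from all other sentences, and that its scores g(x) = P(labelled | x) are available as plain lists. All function names and the 0.5 threshold are hypothetical choices made for illustration.

```python
from typing import List


def estimate_label_frequency(scores_on_labelled: List[float]) -> float:
    """Estimate c = P(labelled | truly check-worthy) as the mean score g(x)
    over held-out examples that annotators did mark as check-worthy."""
    if not scores_on_labelled:
        raise ValueError("need at least one labelled positive example")
    c = sum(scores_on_labelled) / len(scores_on_labelled)
    if not 0.0 < c <= 1.0:
        raise ValueError("label frequency estimate must lie in (0, 1]")
    return c


def positive_posterior(g: float, c: float) -> float:
    """P(truly check-worthy | x, unlabelled) given g = P(labelled | x).

    Uses the weight ((1 - c) / c) * (g / (1 - g)), clipped to [0, 1]
    because imperfect scores can push the raw value above 1.
    """
    if g >= 1.0:
        return 1.0
    raw = ((1.0 - c) / c) * (g / (1.0 - g))
    return min(1.0, max(0.0, raw))


def flag_likely_positives(unlabelled_scores: List[float], c: float,
                          threshold: float = 0.5) -> List[bool]:
    """Mark unlabelled sentences whose estimated posterior exceeds the
    (assumed, illustrative) threshold as candidates for relabelling."""
    return [positive_posterior(g, c) > threshold for g in unlabelled_scores]


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    c = estimate_label_frequency([0.6, 0.4, 0.5])
    print(flag_likely_positives([0.1, 0.2, 0.45], c))
```

The sketch treats sentences not marked as check-worthy as unlabelled rather than negative, which is the assumption that lets some of them be recovered as likely positives. How the recovered instances are then used for retraining is left out here.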

Original language	English
Title of host publication	Findings of the Association for Computational Linguistics: EMNLP 2020
Publisher	Association for Computational Linguistics
Publication date	2020
Pages	476-488
DOI	https://doi.org/10.18653/v1/2020.findings-emnlp.43
Status	Published - 2020
Event	The 2020 Conference on Empirical Methods in Natural Language Processing - online Duration: 16 Nov 2020 → 20 Nov 2020 http://2020.emnlp.org

Conference

Conference	The 2020 Conference on Empirical Methods in Natural Language Processing
Location	online
Period	16/11/2020 → 20/11/2020
Internet address	http://2020.emnlp.org

ID: 254996033

Department of Computer Science (Datalogisk Institut)