Claim Check-Worthiness Detection as Positive Unlabelled Learning

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

Claim Check-Worthiness Detection as Positive Unlabelled Learning. / Wright, Dustin; Augenstein, Isabelle.

Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, 2020. p. 476-488.

Harvard

Wright, D & Augenstein, I 2020, Claim Check-Worthiness Detection as Positive Unlabelled Learning. in Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, pp. 476-488, The 2020 Conference on Empirical Methods in Natural Language Processing, 16/11/2020. https://doi.org/10.18653/v1/2020.findings-emnlp.43

APA

Wright, D., & Augenstein, I. (2020). Claim Check-Worthiness Detection as Positive Unlabelled Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 476-488). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.43

Vancouver

Wright D, Augenstein I. Claim Check-Worthiness Detection as Positive Unlabelled Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics. 2020. p. 476-488 https://doi.org/10.18653/v1/2020.findings-emnlp.43

Author

Wright, Dustin ; Augenstein, Isabelle. / Claim Check-Worthiness Detection as Positive Unlabelled Learning. Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, 2020. pp. 476-488

Bibtex

@inproceedings{9afc2f5e14774d279b9366eaf69873c6,
title = "Claim Check-Worthiness Detection as Positive Unlabelled Learning",
abstract = "As the first step of automatic fact checking, claim check-worthiness detection is a critical component of fact checking systems. There are multiple lines of research which study this problem: check-worthiness ranking from political speeches and debates, rumour detection on Twitter, and citation needed detection from Wikipedia. To date, there has been no structured comparison of these various tasks to understand their relatedness, and no investigation into whether or not a unified approach to all of them is achievable. In this work, we illuminate a central challenge in claim check-worthiness detection underlying all of these tasks, being that they hinge upon detecting both how factual a sentence is, as well as how likely a sentence is to be believed without verification. As such, annotators only mark those instances they judge to be clear-cut check-worthy. Our best performing method is a unified approach which automatically corrects for this using a variant of positive unlabelled learning that finds instances which were incorrectly labelled as not check-worthy. In applying this, we out-perform the state of the art in two of the three tasks studied for claim check-worthiness detection in English.",
author = "Dustin Wright and Isabelle Augenstein",
year = "2020",
doi = "10.18653/v1/2020.findings-emnlp.43",
language = "English",
pages = "476--488",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
publisher = "Association for Computational Linguistics",
note = "The 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020 ; Conference date: 16-11-2020 Through 20-11-2020",
url = "http://2020.emnlp.org",

}

RIS

TY - GEN

T1 - Claim Check-Worthiness Detection as Positive Unlabelled Learning

AU - Wright, Dustin

AU - Augenstein, Isabelle

PY - 2020

Y1 - 2020

AB - As the first step of automatic fact checking, claim check-worthiness detection is a critical component of fact checking systems. There are multiple lines of research which study this problem: check-worthiness ranking from political speeches and debates, rumour detection on Twitter, and citation needed detection from Wikipedia. To date, there has been no structured comparison of these various tasks to understand their relatedness, and no investigation into whether or not a unified approach to all of them is achievable. In this work, we illuminate a central challenge in claim check-worthiness detection underlying all of these tasks, being that they hinge upon detecting both how factual a sentence is, as well as how likely a sentence is to be believed without verification. As such, annotators only mark those instances they judge to be clear-cut check-worthy. Our best performing method is a unified approach which automatically corrects for this using a variant of positive unlabelled learning that finds instances which were incorrectly labelled as not check-worthy. In applying this, we out-perform the state of the art in two of the three tasks studied for claim check-worthiness detection in English.

U2 - 10.18653/v1/2020.findings-emnlp.43

DO - 10.18653/v1/2020.findings-emnlp.43

M3 - Article in proceedings

SP - 476

EP - 488

BT - Findings of the Association for Computational Linguistics: EMNLP 2020

PB - Association for Computational Linguistics

T2 - The 2020 Conference on Empirical Methods in Natural Language Processing

Y2 - 16 November 2020 through 20 November 2020

ER -

ID: 254996033