
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Documents

OA-MultiFC
Final published version, 502 KB, PDF document

Isabelle Augenstein
Christina Lioma
Dongsheng Wang
Lucas Chaves Lima
Casper Hansen
Christian Hansen
Jakob Grue Simonsen

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 26 fact checking websites in English, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for joint ranking of evidence pages and predicting veracity that outperforms all baselines. Significant performance increases are achieved by encoding evidence, and by modelling metadata. Our best-performing model achieves a Macro F1 of 49.2%, showing that this is a challenging testbed for claim veracity prediction.
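The headline result above is reported as Macro F1. As a minimal sketch of what that metric measures, the snippet below computes the standard unweighted mean of per-class F1 scores. The function name `macro_f1` and the toy labels are hypothetical, chosen only for illustration; this is not the authors' evaluation code, and the record does not describe their exact evaluation setup.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute macro F1 on empty input")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label but was missed

    labels = set(gold) | set(predicted)
    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero here
        # because every label occurs at least once in gold or predicted.
        per_class.append(2 * tp[label] / (2 * tp[label] + fp[label] + fn[label]))
    return sum(per_class) / len(per_class)


# Hypothetical toy labels: always predicting the majority class gives
# 75% accuracy but a much lower macro F1, because the minority class scores 0.
gold = ["false", "false", "false", "true"]
majority = ["false"] * 4
print(round(macro_f1(gold, majority), 4))
```

Because every class contributes equally regardless of its frequency, a model cannot score well on this metric by favouring the most common veracity labels.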

Original language	English
Title of host publication	Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Publisher	Association for Computational Linguistics
Publication date	2019
Pages	4684-4697
DOIs	https://doi.org/10.18653/v1/D19-1475
Publication status	Published - 2019
Event	2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) - Hong Kong, China
Duration	3 Nov 2019 → 7 Nov 2019

Conference

Conference	2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Country	China
City	Hong Kong
Period	03/11/2019 → 07/11/2019


ID: 239563731

Department of Computer Science
