The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed


  • paper_98

    Final published version, 556 KB, PDF document

Given a set of political debate claims that have already been identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, or half-true, or whether the system is unsure of its factuality. We implement a variety of models, including Bayes, SVM, and RNN models, either to assist our model step-wise or to serve as potential baselines. We then develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true or unsure, as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google, keeping as much of the original claim as possible in the query. We strategically select the most relevant yet sufficient documents with respect to the claims, and extract features, such as the title, the total number of results returned, and the snippet, to train the prediction model. We submitted the results of the SVM and the CNNs. Overall, our techniques performed well, achieving the best-performing run overall in the competition (with the lowest error rate, 0.7050, from our SVM and the highest accuracy, 46.76%, from our CNNs).
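The step-wise retrieval described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the query is shortened when the full claim returns too few documents, so the shortest-token-first rule below is an assumption, not the team's method. The names `relaxed_queries` and `retrieve_features`, the `search` callable, and the response layout (`total_results`, `items`, `title`, `snippet`) are all hypothetical stand-ins for a real search engine client.

```python
from typing import Callable, Dict, List

# Hypothetical response layout (an assumption for this sketch):
# {"total_results": int, "items": [{"title": str, "snippet": str}, ...]}
SearchFn = Callable[[str], Dict]


def relaxed_queries(claim: str) -> List[str]:
    """Return the full claim first, then progressively shorter queries.

    Assumption: the shortest remaining token is dropped at each step, so
    that as much of the original claim as possible stays in the query.
    """
    tokens = claim.split()
    queries = [" ".join(tokens)]
    while len(tokens) > 1:
        # Index of the shortest token; ties resolve to the earliest one.
        shortest = min(range(len(tokens)), key=lambda i: len(tokens[i]))
        tokens = tokens[:shortest] + tokens[shortest + 1:]
        queries.append(" ".join(tokens))
    return queries


def retrieve_features(claim: str, search: SearchFn, top_n: int = 5) -> Dict:
    """Query step-wise until at least top_n documents are returned.

    Collects the features named in the abstract: titles, snippets, and
    the total number of results. If no query yields top_n documents, the
    response to the last (shortest) query is used.
    """
    response: Dict = {"total_results": 0, "items": []}
    query = claim
    for query in relaxed_queries(claim):
        response = search(query)
        if len(response["items"]) >= top_n:
            break
    items = response["items"][:top_n]
    return {
        "query_used": query,
        "total_results": response["total_results"],
        "titles": [item["title"] for item in items],
        "snippets": [item["snippet"] for item in items],
    }
```

Passing the search function as a parameter keeps the sketch independent of any particular search API; the resulting feature dictionary would then be vectorized and fed to a classifier such as the SVM or CNNs mentioned above.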

Title: CLEF 2018 Working Notes
Editors: Linda Cappellato, Nicola Ferro, Jian-Yun Nie, Laure Soulier
Number of pages: 10
Status: Published - 2018
Event: 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France
Duration: 10 Sep 2018 to 14 Sep 2018


Conference: 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018
Name: CEUR Workshop Proceedings


ID: 202539509