Explaining Interactions Between Text Spans

Publication: Contribution to book/anthology/report › Article in proceedings › Research › peer-reviewed

Documents

Fulltext
Publisher's published version, 903 KB, PDF document

Reasoning over spans of tokens from different parts of the input is essential for natural language understanding (NLU) tasks such as fact-checking (FC), machine reading comprehension (MRC), or natural language inference (NLI). However, existing highlight-based explanations primarily focus on identifying individual important features or interactions only between adjacent tokens or tuples of tokens. Most notably, there is a lack of annotations capturing the human decision-making process with respect to the necessary interactions for informed decision-making in such tasks. To bridge this gap, we introduce SpanEx, a multi-annotator dataset of human span interaction explanations for two NLU tasks: NLI and FC. We then investigate the decision-making processes of multiple fine-tuned large language models in terms of the employed connections between spans in separate parts of the input and compare them to the human reasoning processes. Finally, we present a novel community-detection-based unsupervised method to extract such interaction explanations. We make the code and the dataset available on [GitHub](https://github.com/copenlu/spanex). The dataset is also available on [Hugging Face Datasets](https://huggingface.co/datasets/copenlu/spanex).

Original language	English
Title	Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Publisher	Association for Computational Linguistics (ACL)
Publication date	2023
Pages	12709-12730
ISBN (Print)	979-8-89176-060-8
DOI	https://doi.org/10.18653/v1/2023.emnlp-main.783
Status	Published - 2023
Event	2023 Conference on Empirical Methods in Natural Language Processing - Singapore. Duration: 6 Dec 2023 → 10 Dec 2023

Conference

Conference	2023 Conference on Empirical Methods in Natural Language Processing
City	Singapore
Period	06/12/2023 → 10/12/2023

ID: 381512104

Datalogisk Institut