Rather a Nurse than a Physician - Contrastive Explanations under Investigation
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
Documents
- Fulltext
Publisher's published version, 1.84 MB, PDF document
Contrastive explanations, where one decision is explained *in contrast to another*, are supposed to be closer to how humans explain a decision than non-contrastive explanations, where the decision is not necessarily referenced to an alternative. This claim has never been empirically validated. We analyze four English text-classification datasets (SST2, DynaSent, BIOS and DBpedia-Animals). We fine-tune and extract explanations from three different models (RoBERTa, GPT-2, and T5), each in three different sizes, and apply three post-hoc explainability methods (LRP, GradientxInput, GradNorm). We furthermore collect and release human rationale annotations for a subset of 100 samples from the BIOS dataset for contrastive and non-contrastive settings. A cross-comparison between model-based rationales and human annotations, both in contrastive and non-contrastive settings, yields high agreement between the two settings for models as well as for humans. Moreover, model-based explanations computed in both settings align equally well with human rationales. Thus, we empirically find that humans do not necessarily explain in a contrastive manner.
| Original language | English |
|---|---|
| Title | Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing |
| Publisher | Association for Computational Linguistics (ACL) |
| Publication date | 2023 |
| Pages | 6907-6920 |
| ISBN (electronic) | 979-8-89176-060-8 |
| DOI | |
| Status | Published - 2023 |
| Event | 2023 Conference on Empirical Methods in Natural Language Processing - Singapore. Duration: 6 Dec 2023 → 10 Dec 2023 |
Conference

| Conference | 2023 Conference on Empirical Methods in Natural Language Processing |
|---|---|
| City | Singapore |
| Period | 06/12/2023 → 10/12/2023 |
ID: 383927536