What does the Failure to Reason with “Respectively” in Zero/Few-Shot Settings Tell Us about Language Models?

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Documents

Full text
Publisher's published version, 311 KB, PDF document

Humans can effortlessly understand the coordinate structure of sentences such as “Niels Bohr and Kurt Cobain were born in Copenhagen and Seattle, respectively”. In the context of natural language inference (NLI), we examine how language models (LMs) reason with respective readings (Gawron and Kehler, 2004) from two perspectives: syntactic-semantic and commonsense-world knowledge. We propose a controlled synthetic dataset WikiResNLI and a naturally occurring dataset NatResNLI to encompass various explicit and implicit realizations of “respectively”. We show that fine-tuned NLI models struggle with understanding such readings without explicit supervision. While few-shot learning is easy in the presence of explicit cues, longer training is required when the reading is evoked implicitly, leaving models to rely on common sense inferences. Furthermore, our fine-grained analysis indicates models fail to generalize across different constructions. To conclude, we demonstrate that LMs still lag behind humans in generalizing to the long tail of linguistic constructions.

Original language	English
Title	Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: Long Papers
Volume	1
Publisher	Association for Computational Linguistics (ACL)
Publication date	2023
Pages	8786-8800
ISBN (Electronic)	9781959429722
DOI	https://doi.org/10.18653/v1/2023.acl-long.489
Status	Published - 2023
Event	61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada. Duration: 9 Jul 2023 → 14 Jul 2023

Conference

Conference	61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country	Canada
City	Toronto
Period	9 Jul 2023 → 14 Jul 2023
Sponsor	Bloomberg Engineering, et al., Google Research, Liveperson, Meta, Microsoft

Bibliographical note

ID: 371030992

Department of Computer Science (Datalogisk Institut)