Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Documents

Can Language Models Encode Perceptual Structure Without Grounding
Publisher's published version, 4.54 MB, PDF document

Mostafa Abdou
Artur Kulmizev
Daniel Hershcovich
Stella Frank
Ellie Pavlick
Anders Søgaard

Pretrained language models have been shown to encode relational information, such as the relations between entities or concepts in knowledge bases, for example (Paris, Capital, France). However, simple relations of this type can often be recovered heuristically, and the extent to which models implicitly reflect topological structure that is grounded in the world, such as perceptual structure, is unknown. To explore this question, we conduct a thorough case study on color. Specifically, we employ a dataset of monolexemic color terms and color chips represented in CIELAB, a color space with a perceptually meaningful distance metric. Using two methods of evaluating the structural alignment of colors in this space with text-derived color term representations, we find significant correspondence. Analyzing the differences in alignment across the color spectrum, we find that warmer colors are, on average, better aligned to the perceptual color space than cooler ones, suggesting an intriguing connection to findings from recent work on efficient communication in color naming. Further analysis suggests that differences in alignment are, in part, mediated by collocationality and differences in syntactic usage, raising questions about how color perception relates to usage and context.
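
The abstract does not name its two evaluation methods, so the following is only a minimal sketch of one generic way to measure structural alignment between two spaces: compare the pairwise distances among colors in CIELAB with the pairwise distances among the corresponding term representations, then rank-correlate the two lists. Everything here is an assumption made for illustration: the function names, the choice of Euclidean distance in CIELAB and cosine distance for embeddings, the use of Kendall's tau, and the toy values, which are rough placeholders and not taken from the paper's dataset or models.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance; in CIELAB this is the classic Delta E (1976) metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity, a common distance for embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pairwise(vectors, dist):
    """Distances for every unordered pair of keys, in a fixed (sorted) order."""
    keys = sorted(vectors)
    return [dist(vectors[a], vectors[b]) for a, b in combinations(keys, 2)]


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of item pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def alignment_score(lab, emb):
    """Rank correlation between perceptual and text-derived distance structure."""
    if set(lab) != set(emb):
        raise ValueError("both spaces must cover the same color terms")
    return kendall_tau_a(pairwise(lab, euclidean), pairwise(emb, cosine_distance))


# Hypothetical toy inputs: approximate (L*, a*, b*) coordinates and made-up
# 3-d "embeddings". Real representations would come from a language model.
TOY_LAB = {
    "red": (53.0, 80.0, 67.0),
    "green": (88.0, -86.0, 83.0),
    "blue": (32.0, 79.0, -108.0),
    "yellow": (97.0, -22.0, 94.0),
}
TOY_EMB = {
    "red": (0.9, 0.1, 0.2),
    "green": (0.1, 0.8, 0.3),
    "blue": (0.2, 0.1, 0.9),
    "yellow": (0.6, 0.7, 0.1),
}

if __name__ == "__main__":
    print(f"toy alignment (Kendall tau-a): {alignment_score(TOY_LAB, TOY_EMB):.3f}")
```

A score near 1 would mean the two spaces order color pairs by distance in nearly the same way, and a score near 0 would mean no monotonic relationship. With only four toy colors the number carries no evidential weight; it only shows the shape of the computation.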

Original language	English
Title	Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Publisher	Association for Computational Linguistics
Publication date	2021
Pages	109–132
DOI	https://doi.org/10.18653/v1/2021.conll-1.9
Status	Published - 2021
Event	2021 Conference on Empirical Methods in Natural Language Processing - Duration: 7 Nov 2021 → 11 Nov 2021

Conference

Conference	2021 Conference on Empirical Methods in Natural Language Processing
Period	7 Nov 2021 → 11 Nov 2021


ID: 299824244

Datalogisk Institut (Department of Computer Science)
