PhD defence by Mostafa Abdou

Title

Language Understanding in Humans and Artificial Neural Networks: Parallels and Contrasts

Summary

Modern artificial neural networks (ANNs) are loosely inspired by the human brain. This raises the question: can they help us model aspects of (human) language understanding by serving as plausible hypotheses for its underlying representations and mechanisms? Although the way modern ANN language models (LMs) learn, via training on immense amounts of text to predict either future or masked tokens, is manifestly not human-like, they have demonstrated a remarkable ability to simulate human understanding on a wide range of tasks. In addition, they have recently been shown to predict or align with a variety of cognitive measurements.

This dissertation presents research that investigates ANN LMs, examining the linguistic properties of the representations they acquire and exploring parallels and disparities between their language processing capacities and those of humans. This is done based on three classes of analytical comparisons, viz. (i) behavioural data, (ii) linguistic theory, and (iii) neural response measurements. For the first class, we (a) analyse the representational similarity between ANN language model activation patterns and eye-tracking data and (b) evaluate the structural alignment of humans’ perceptual color space with LM-derived color name representations. In both cases, we find interesting correspondences.

For the second, we show that ANN LMs (a) when fine-tuned on task-specific data, are robust to linguistic perturbations that minimally affect human understanding, (b) can learn attention patterns that reflect linguistic structure, and (c) when trained on sentences with scrambled word order, still retain a notion of word order derived from statistical cues that persist in the scrambled data, offering an explanation for why they still perform well on language understanding tasks. For the final class of comparisons, we present a literature review of research linking computational models of language with human neural response measurements and conclude by introducing a framework for leveraging ANN LMs to enable the evaluation of targeted hypotheses about the composition of meaning in the human brain.

Overall, this dissertation works towards furthering our understanding of ANN LMs through comparisons to what we already know about how humans process language and, reciprocally, towards developing frameworks where LMs can help provide insights into human language.

Assessment Committee

Professor Serge Belongie, Department of Computer Science, University of Copenhagen.

Assistant Professor Tal Linzen, New York University.

Assistant Professor Laila Wehbe, Carnegie Mellon University.

Academic Supervisor

Professor Anders Søgaard, Department of Computer Science, UCPH.

Moderator for this defence will be

Tenure-Track Assistant Professor Daniel Hershcovich, Department of Computer Science, UCPH.

For a digital copy of the thesis, please visit our PhD page

Datalogisk Institut