15 October 2019

Researcher receives prestigious award for combining NLP and eye tracking

ELLIS PhD Award 2018

In her PhD thesis from 2018, Maria Barrett, postdoc at the Department of Computer Science, University of Copenhagen (DIKU), combined her research area, Natural Language Processing (NLP), with psycholinguistics to demonstrate that eye-tracking data from reading can inform NLP models about syntax. For this work, she has received a new prestigious award: the ELLIS PhD Award.

Maria Barrett with her ELLIS PhD Award 2018 diploma.

The new award has been established by the European Laboratory for Learning & Intelligent Systems (ELLIS) to recognise and encourage outstanding research achievements during the dissertation phase of students working in the field of artificial intelligence and machine learning, including related fields such as computer vision and robotics.

She received the award at the beginning of September 2019. The committee's reasoning was:

- The dissertation makes an important theoretical contribution to the field of NLP by showing that traces of human processing, particularly eye-tracking data, can be used to improve models for a range of language processing tasks. It represents impressive innovative work, makes significant novel contributions to various NLP tasks and each of the studies presents clear hypotheses and follows a well-justified methodology. Thus, the thesis as a whole advances the state-of-the-art NLP research by suggesting what can be defined a new and promising modelling paradigm.

Research with potential impact on low-resource languages

In her PhD thesis, Maria Barrett demonstrated that eye-tracking data holds the potential to improve weakly supervised models. This is especially promising for low-resource languages, since native readers are more readily available and cheaper than trained annotators.

She found that eye tracking can generate large amounts of rich data, which can inform us about the cognitive load associated with processing each unit in the text.

- My vision with the dissertation was to investigate how people's cognitive processing of text can be used to teach computers to understand language - and the results were positive, says Maria Barrett.

Specifically, this new method teaches a computer something about syntax, i.e. how a sentence is put together in a given language, so that the computer can analyse a sentence correctly. This is essential when the computer has to translate a text or work out what the text is about.

- My research showed, among other things, that differences in reading times between words were enough to inform the computer of differences in syntax and word classes without the need for annotated text. In this way, the computer can learn something about the language directly from eye-tracking data, and in the long term it can benefit the development of language technology for far more languages than today without the use of annotated text, says Maria Barrett.

The method is one of several methods that NLP researchers around the world are working on to help ensure that in the future, all NLP models can be scaled to all the languages in the world.

The video was recorded as part of the data collection for the thesis. It shows how a person's eye moves across the sentence and the fixations the eye makes while reading. The size of each circle indicates the length of the fixation. Between fixations, the eye moves on in a fast, leaping motion, a so-called saccade. The majority of saccades bring the reader further into the text, but 5-20% of all saccades move against the reading direction.
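
For readers who want to see how such a recording can be summarised, here is a minimal Python sketch. It is not taken from the thesis: the `Fixation` record, the function names and the numbers are made up for illustration, and it assumes a left-to-right script in which each fixation has already been mapped to a word position. It computes the share of saccades that move against the reading direction and the total reading time per word.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    word_index: int   # position of the fixated word in the sentence (hypothetical field)
    duration_ms: int  # how long the eye rested on that word (hypothetical field)


def regression_rate(fixations):
    """Fraction of saccades that land on an earlier word than the one just left."""
    saccades = list(zip(fixations, fixations[1:]))
    if not saccades:
        return 0.0
    backward = sum(1 for start, end in saccades if end.word_index < start.word_index)
    return backward / len(saccades)


def total_reading_time(fixations):
    """Sum of all fixation durations per word, in milliseconds."""
    totals = {}
    for fixation in fixations:
        totals[fixation.word_index] = totals.get(fixation.word_index, 0) + fixation.duration_ms
    return totals


# Made-up trace: the reader skips word 2, then jumps back to it from word 3.
trace = [
    Fixation(0, 180),
    Fixation(1, 210),
    Fixation(3, 250),
    Fixation(2, 190),
    Fixation(4, 230),
    Fixation(5, 200),
]

print(regression_rate(trace))
print(total_reading_time(trace))
```

In this toy trace, one of the five saccades moves backwards. Per-word reading times of this kind are the sort of signal the article describes as informative about word classes and syntax.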