Researcher receives prestigious award for combining NLP and eye tracking
In her 2018 PhD thesis, Maria Barrett, postdoc at the Department of Computer Science, University of Copenhagen (DIKU), combined her research area, Natural Language Processing (NLP), with psycholinguistics to demonstrate that eye-tracking data from reading can inform NLP models about syntax. For this work, she has received a new, prestigious award: the ELLIS PhD Award.
The new award has been established by the European Laboratory for Learning & Intelligent Systems (ELLIS) to recognise and encourage outstanding research achievements during the dissertation phase of outstanding students working in the field of artificial intelligence and machine learning, including related fields such as computer vision and robotics.
She received the award at the beginning of September 2019. The committee gave the following reasons:
- The dissertation makes an important theoretical contribution to the field of NLP by showing that traces of human processing, particularly eye-tracking data, can be used to improve models for a range of language processing tasks. It represents impressive innovative work, makes significant novel contributions to various NLP tasks and each of the studies presents clear hypotheses and follows a well-justified methodology. Thus, the thesis as a whole advances the state-of-the-art NLP research by suggesting what can be defined a new and promising modelling paradigm.
Research with potential impact on low-resource languages
In her PhD thesis, Maria Barrett demonstrated that eye-tracking data holds the potential to improve weakly supervised models, which is especially promising for low-resource languages, since native readers are more available and cheaper than trained annotators.
She found that eye tracking can generate large amounts of rich data, which can inform us about the cognitive load associated with processing each unit in the text.
- My vision with the dissertation was to investigate how people's cognitive processing of text can be used to teach computers to understand language - and the results were positive, says Maria Barrett.
Specifically, this new method teaches a computer something about syntax, i.e. how a sentence is put together in a given language, so that the computer can analyse a sentence correctly. This is essential when the computer has to translate a text or work out what the text is about.
- My research showed, among other things, that differences in reading times between words were enough to inform the computer of differences in syntax and word classes without the need for annotated text. In this way, the computer can learn something about the language directly from eye-tracking data, and in the long term it can benefit the development of language technology for far more languages than today without the use of annotated text, says Maria Barrett.
The method is one of several that NLP researchers around the world are working on to help ensure that, in the future, all NLP models can be scaled to all the languages in the world.
The video was recorded as part of the data collection for the thesis. It shows how a person's eye moves across the sentence and the fixations the eye makes during reading. The size of the circle indicates the length of the fixation. Between the fixations, the eye moves on in a fast, leaping motion, a so-called saccade. The majority of saccades bring the reader further into the text, but 5-20% of all saccades move against the reading direction.
Contact
Maria Jung Barrett
Postdoc, Department of Computer Science, University of Copenhagen
mjb@di.ku.dk
Tel. 29 72 22 69
Tina Virenfeldt Kristensen
Communication Consultant, Department of Computer Science, University of Copenhagen
tikr@di.ku.dk
Tel. 40 59 40 54
In the media
15 October 2019
Politiken: Ny dansk forskning kan gøre sprogteknologi mere udbredt ("New Danish research can make language technology more widespread"; only in Danish)
Subscription is required
Facts
Maria Barrett's PhD thesis, "Improving natural language processing with human data: Eye tracking and other data sources reflecting cognitive text processing", consists of eight papers. One of them, "Sequence classification with human attention", won a best paper award at the Conference on Computational Natural Language Learning (CoNLL) in 2018.
This year, the ELLIS PhD Award was given for the previous three years. From now on, it will be given annually. From a set of very strong nominations, an international committee of renowned experts chose the following scholars as ELLIS PhD Award winners:
- Maria Barrett (University of Copenhagen, Denmark)
- Wittawat Jitkrittum (Gatsby, UC London, UK)
- Alex Kendall (Cambridge University, UK)
- Diederik Kingma (University of Amsterdam, Netherlands)
- Anastasia Pentina (IST Austria, Austria)
- Siyu Tang (MPI Informatics, Germany)
The awards are sponsored by the Kühborth Stiftung GmbH.
Natural Language Processing is a growing field at the University of Copenhagen, and the university is among the top universities in Europe within NLP. The research group is affiliated with the Department of Computer Science and led by Professor Anders Søgaard and Assistant Professor Isabelle Augenstein.