PhD defence by Christian Hansen
Title
Summary
Sequential modelling entails making sense of sequential data, which occurs naturally in a wide array of domains. One example is systems that interact with users, log user actions and behaviour, and recommend items of potential interest to users on the basis of their previous interactions. In such cases, the sequential order of user interactions often indicates what the user is interested in next. Similarly, for systems that automatically infer the semantics of text, capturing the sequential order of words in a sentence is essential, as even a slight re-ordering could significantly alter its original meaning. This thesis contributes new methods and investigations in sequential modelling for three application areas: systems that recommend music tracks to listeners, systems that process text semantics in order to automatically fact-check claims, and systems that "speed read" text for efficient further classification.
For music recommendation, we make three contributions. Firstly, we study how the complexity of sequential music recommender methods relates to the diversity and relevance of the recommendations, and how diversification of recommendations can be used to control this trade-off. Secondly, we investigate how listening context impacts music consumption, which we use to motivate a new way of representing user profiles that captures sequential and contextual deviations from the user's typical music preferences. Thirdly, we improve the prediction of music skip behaviour in a listening session based on past skips.
For fact-checking, we make three contributions. Firstly, we construct what is currently the largest benchmark dataset of naturally occurring claims for training automatic fact-checking models. Secondly, we use eye-tracking data of humans reading news headlines and link it to automatic fact-checking predictions. Thirdly, we present two models for detecting check-worthy sentences for fact-checking, which, through weak supervision and contrastive ranking, take steps towards better model generalization in a domain with very limited training data.
Lastly, for speed reading, we contribute a new model that utilizes the inherent punctuation structure of text to learn how to ignore a large number of words, while being as effective as, or more effective than, processing every word in the text.
Assessment Committee
Associate Professor Tuukka Ruotsalo, Department of Computer Science, UCPH
Professor Marie Francine Moens, Department of Computer Science, KU Leuven
Full Professor Jimmy Lin, University of Waterloo
Moderator for this defence will be
Associate Professor Daniel Spikol, Machine Learning Section at Department of Computer Science, UCPH.
For a digital version of the thesis, please visit our PhD page.