Success at top NLP conference
Weakly supervised, multilingual and multi-modal learning are the central research themes of the seven papers that researchers from the Machine Learning section at the Department of Computer Science, University of Copenhagen (DIKU), have had accepted at NAACL, one of the top conferences in Natural Language Processing.
The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) is a leading conference in the area of Natural Language Processing. The authors of the seven accepted papers are researchers from the two research groups CoAStAL NLP and CopeNLU at DIKU, together with researchers from the University of Cambridge, Microsoft Research, Amazon AI and the University of Edinburgh.
Result can have positive implications for machine translation
One of the papers* proposes a method for automatically learning typological features of languages. Typological features can be thought of as attributes that characterise languages. Examples are word order or the number of genders a language has. Some of these characteristics are known and encoded in typological knowledge bases, but many are not, especially for smaller languages.
This paper shows that similarities between languages and features can be exploited by modelling them all in a generative model of language, based on exponential-family matrix factorisation. The study reaffirms what linguists have long hypothesised, i.e. that there are significant correlations between typological features and languages. The paper further advances the field by showing that such typological knowledge bases can be completed automatically.
This has significant implications for other areas of NLP, such as machine translation, that rely on multilingual learning and on understanding how languages are related to one another.
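To give a feel for what completing a typological knowledge base by matrix factorisation can look like, here is a minimal sketch in Python. It is not the authors' model: it assumes binary features, uses a Bernoulli (logistic) likelihood as one simple member of the exponential family, and trains with plain stochastic gradient ascent. The toy matrix, the function name `complete` and all hyperparameters are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def complete(matrix, rank=2, epochs=500, lr=0.1, l2=0.01, seed=0):
    """Fill in a languages-by-features matrix of 0/1 values.

    Missing cells are marked with None. Returns a matrix of
    probabilities that each feature is present in each language.
    """
    rng = random.Random(seed)
    n_langs, n_feats = len(matrix), len(matrix[0])
    # One small random embedding per language and per feature.
    langs = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_langs)]
    feats = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_feats)]
    observed = [
        (i, j, matrix[i][j])
        for i in range(n_langs)
        for j in range(n_feats)
        if matrix[i][j] is not None
    ]
    for _ in range(epochs):
        for i, j, y in observed:
            # Gradient of the Bernoulli log-likelihood w.r.t. the logit.
            err = y - sigmoid(dot(langs[i], feats[j]))
            for d in range(rank):
                u, v = langs[i][d], feats[j][d]
                langs[i][d] += lr * (err * v - l2 * u)
                feats[j][d] += lr * (err * u - l2 * v)
    return [
        [sigmoid(dot(langs[i], feats[j])) for j in range(n_feats)]
        for i in range(n_langs)
    ]


# Invented toy data: rows are languages, columns are binary features.
# Languages 0-2 share one profile, languages 3-5 another; one cell is unknown.
toy = [
    [None, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
probs = complete(toy)
print("Estimated probability for the unknown cell:", round(probs[0][0], 3))
```

Because language 0 agrees with languages 1 and 2 on every observed feature, the learned embeddings push the estimate for the unknown cell towards the value those related languages share. This mirrors, in miniature, the idea of exploiting correlations between languages and features.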
The seven accepted papers by DIKU researchers
- Marcel Bollmann
A Large-Scale Comparison of Historical Text Normalization Systems
- Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell and Isabelle Augenstein *
A Probabilistic Generative Model of Linguistic Typology
- Simon Flachs, Ophélie Lacroix, Marek Rei, Helen Yannakoudakis and Anders Søgaard
A Simple and Robust Approach to Detecting Subject-Verb Agreement Errors
- David Vilares, Mostafa Abdou and Anders Søgaard
Better, Faster, Stronger Sequence Tagging Constituent Parsers
- Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Ryan Cotterell and Isabelle Augenstein
Combining Disparate Sentiment Lexica with a Multi-View Variational Autoencoder
- Spandana Gella, Desmond Elliott and Frank Keller
Cross-lingual Visual Verb Sense Disambiguation
- Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein and Anders Søgaard
Issue Framing in Online Discussion Fora
The conference will take place on 2-7 June 2019 in Minneapolis, USA.
For more information about NLP activities in the Machine Learning Section, visit the NLP website.