Transfer Learning for Computational Content Analysis

Publication: Book/anthology/thesis/report - Ph.D. thesis

Standard

Transfer Learning for Computational Content Analysis. / Hartmann, Mareike.

Department of Computer Science, Faculty of Science, University of Copenhagen, 2019.

Harvard

Hartmann, M 2019, Transfer Learning for Computational Content Analysis. Department of Computer Science, Faculty of Science, University of Copenhagen.

APA

Hartmann, M. (2019). Transfer Learning for Computational Content Analysis. Department of Computer Science, Faculty of Science, University of Copenhagen.

Vancouver

Hartmann M. Transfer Learning for Computational Content Analysis. Department of Computer Science, Faculty of Science, University of Copenhagen, 2019.

Author

Hartmann, Mareike. / Transfer Learning for Computational Content Analysis. Department of Computer Science, Faculty of Science, University of Copenhagen, 2019.

Bibtex

@phdthesis{2aae790e8fc64749834c19e2c689591c,
title = "Transfer Learning for Computational Content Analysis",
abstract = "Content analysis is a research technique concerned with the discovery of trends, patterns and differences in artifacts of human communication. It requires reading and coding data according to annotation guidelines, which is a labor-intensive process. In an age of mass communication, huge amounts of content are produced every day. Analyzing this content with respect to the social phenomena it captures is of interest to researchers in many fields. However, manual coding is impractical for such large amounts of data, and automating the coding step could speed up the process significantly. Supervised machine learning is a promising approach in this direction, as such models can learn from human annotations and generalize to unseen data, making the coding of large amounts of content more feasible. However, labeled datasets are expensive to generate. On the one hand, this leads to small training datasets. On the other hand, it makes it valuable for a model to generalize across datasets from different domains and languages. Transfer learning is a machine learning method that enables such knowledge transfer between data from different distributions, leveraging as much data as possible while keeping additional annotation effort low. This thesis investigates the use of transfer learning for automated content coding. In the first part of the work, we directly apply transfer learning to content coding tasks. We investigate how these methods can improve performance on the task and show that transfer learning can overcome the problem of limited training data by leveraging additional resources. The second part of the work focuses on methods that enable knowledge transfer between languages. Such methods rely on word representations that capture meaning across languages. Unsupervised methods for learning such representations are attractive but unstable, and we investigate the causes of these instabilities.",
author = "Mareike Hartmann",
year = "2019",
language = "English",
publisher = "Department of Computer Science, Faculty of Science, University of Copenhagen",

}

RIS

TY - BOOK

T1 - Transfer Learning for Computational Content Analysis

AU - Hartmann, Mareike

PY - 2019

Y1 - 2019

N2 - Content analysis is a research technique concerned with the discovery of trends, patterns and differences in artifacts of human communication. It requires reading and coding data according to annotation guidelines, which is a labor-intensive process. In an age of mass communication, huge amounts of content are produced every day. Analyzing this content with respect to the social phenomena it captures is of interest to researchers in many fields. However, manual coding is impractical for such large amounts of data, and automating the coding step could speed up the process significantly. Supervised machine learning is a promising approach in this direction, as such models can learn from human annotations and generalize to unseen data, making the coding of large amounts of content more feasible. However, labeled datasets are expensive to generate. On the one hand, this leads to small training datasets. On the other hand, it makes it valuable for a model to generalize across datasets from different domains and languages. Transfer learning is a machine learning method that enables such knowledge transfer between data from different distributions, leveraging as much data as possible while keeping additional annotation effort low. This thesis investigates the use of transfer learning for automated content coding. In the first part of the work, we directly apply transfer learning to content coding tasks. We investigate how these methods can improve performance on the task and show that transfer learning can overcome the problem of limited training data by leveraging additional resources. The second part of the work focuses on methods that enable knowledge transfer between languages. Such methods rely on word representations that capture meaning across languages. Unsupervised methods for learning such representations are attractive but unstable, and we investigate the causes of these instabilities.

M3 - Ph.D. thesis

BT - Transfer Learning for Computational Content Analysis

PB - Department of Computer Science, Faculty of Science, University of Copenhagen

ER -

ID: 234994924