Tracking Typological Traits of Uralic Languages in Distributed Language Representations
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Tracking Typological Traits of Uralic Languages in Distributed Language Representations. / Bjerva, Johannes; Augenstein, Isabelle.
Proceedings, Fourth International Workshop on Computational Linguistics for Uralic Languages. Association for Computational Linguistics, 2018. p. 78-88.
RIS
TY - GEN
T1 - Tracking Typological Traits of Uralic Languages in Distributed Language Representations
AU - Bjerva, Johannes
AU - Augenstein, Isabelle
PY - 2018
Y1 - 2018
N2 - Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed representations in computational linguistics has also become increasingly popular. A recent development is to learn distributed representations of language, such that typologically similar languages are spatially close to one another. Although empirical successes have been shown for such language representations, they have not been subjected to much typological probing. In this paper, we first look at whether this type of language representations are empirically useful for model transfer between Uralic languages in deep neural networks. We then investigate which typological features are encoded in these representations by attempting to predict features in the World Atlas of Language Structures, at various stages of fine-tuning of the representations. We focus on Uralic languages, and find that some typological traits can be automatically inferred with accuracies well above a strong baseline.
AB - Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed representations in computational linguistics has also become increasingly popular. A recent development is to learn distributed representations of language, such that typologically similar languages are spatially close to one another. Although empirical successes have been shown for such language representations, they have not been subjected to much typological probing. In this paper, we first look at whether this type of language representations are empirically useful for model transfer between Uralic languages in deep neural networks. We then investigate which typological features are encoded in these representations by attempting to predict features in the World Atlas of Language Structures, at various stages of fine-tuning of the representations. We focus on Uralic languages, and find that some typological traits can be automatically inferred with accuracies well above a strong baseline.
U2 - 10.18653/v1/W18-02
DO - 10.18653/v1/W18-02
M3 - Article in proceedings
SP - 78
EP - 88
BT - Proceedings, Fourth International Workshop on Computational Linguistics for Uralic Languages
PB - Association for Computational Linguistics
T2 - Fourth International Workshop on Computational Linguistics for Uralic Languages
Y2 - 8 January 2018 through 9 January 2018
ER -
ID: 195046443