Contextually propagated term weights for document representation

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Standard

Contextually propagated term weights for document representation. / Hansen, Casper; Hansen, Christian; Alstrup, Stephen; Simonsen, Jakob Grue; Lioma, Christina.

SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2019. pp. 897-900 (SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval).

Harvard

Hansen, C, Hansen, C, Alstrup, S, Simonsen, JG & Lioma, C 2019, Contextually propagated term weights for document representation. in SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 897-900, 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, 21/07/2019. https://doi.org/10.1145/3331184.3331307

APA

Hansen, C., Hansen, C., Alstrup, S., Simonsen, J. G., & Lioma, C. (2019). Contextually propagated term weights for document representation. In SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 897-900). Association for Computing Machinery. SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval https://doi.org/10.1145/3331184.3331307

Vancouver

Hansen C, Hansen C, Alstrup S, Simonsen JG, Lioma C. Contextually propagated term weights for document representation. In SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery. 2019. p. 897-900. (SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval). https://doi.org/10.1145/3331184.3331307

Author

Hansen, Casper ; Hansen, Christian ; Alstrup, Stephen ; Simonsen, Jakob Grue ; Lioma, Christina. / Contextually propagated term weights for document representation. SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2019. pp. 897-900 (SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval).

Bibtex

@inproceedings{267930579ce2474dbf5c8555a32bb04c,
title = "Contextually propagated term weights for document representation",
abstract = "Word embeddings predict a word from its neighbours by learning small, dense embedding vectors. In practice, this prediction corresponds to a semantic score given to the predicted word (or term weight). We present a novel model that, given a target word, redistributes part of that word's weight (that has been computed with word embeddings) across words occurring in similar contexts as the target word. Thus, our model aims to simulate how semantic meaning is shared by words occurring in similar contexts, which is incorporated into bag-of-words document representations. Experimental evaluation in an unsupervised setting against 8 state of the art baselines shows that our model yields the best micro and macro F1 scores across datasets of increasing difficulty.",
keywords = "Contextual semantics, Document representation, Word embeddings",
author = "Casper Hansen and Christian Hansen and Stephen Alstrup and Simonsen, {Jakob Grue} and Christina Lioma",
year = "2019",
month = jul,
day = "18",
doi = "10.1145/3331184.3331307",
language = "English",
series = "SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval",
pages = "897--900",
booktitle = "SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval",
publisher = "Association for Computing Machinery",
note = "42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019 ; Conference date: 21-07-2019 Through 25-07-2019",

}

RIS

TY - GEN

T1 - Contextually propagated term weights for document representation

AU - Hansen, Casper

AU - Hansen, Christian

AU - Alstrup, Stephen

AU - Simonsen, Jakob Grue

AU - Lioma, Christina

PY - 2019/7/18

Y1 - 2019/7/18

N2 - Word embeddings predict a word from its neighbours by learning small, dense embedding vectors. In practice, this prediction corresponds to a semantic score given to the predicted word (or term weight). We present a novel model that, given a target word, redistributes part of that word's weight (that has been computed with word embeddings) across words occurring in similar contexts as the target word. Thus, our model aims to simulate how semantic meaning is shared by words occurring in similar contexts, which is incorporated into bag-of-words document representations. Experimental evaluation in an unsupervised setting against 8 state of the art baselines shows that our model yields the best micro and macro F1 scores across datasets of increasing difficulty.

KW - Contextual semantics

KW - Document representation

KW - Word embeddings

U2 - 10.1145/3331184.3331307

DO - 10.1145/3331184.3331307

M3 - Article in proceedings

T3 - SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

SP - 897

EP - 900

BT - SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery

T2 - 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019

Y2 - 21 July 2019 through 25 July 2019

ER -

ID: 239566043