Contextual compositionality detection with external knowledge bases and word embeddings
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Contextual compositionality detection with external knowledge bases and word embeddings. / Wang, Dongsheng; Li, Qiuchi; Lima, Lucas Chaves; Simonsen, Jakob Grue; Lioma, Christina.
The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019. Association for Computing Machinery, 2019. p. 317-323.Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Harvard
APA
Vancouver
Author
Bibtex
}
RIS
TY - GEN
T1 - Contextual compositionality detection with external knowledge bases and word embeddings
AU - Wang, Dongsheng
AU - Li, Qiuchi
AU - Lima, Lucas Chaves
AU - Simonsen, Jakob Grue
AU - Lioma, Christina
PY - 2019
Y1 - 2019
N2 - When the meaning of a phrase cannot be inferred from the individual meanings of its words (e.g., hot dog), that phrase is said to be non-compositional. Automatic compositionality detection in multiword phrases is critical in any application of semantic processing, such as search engines [9]; failing to detect non-compositional phrases can hurt system effectiveness notably. Existing research treats phrases as either compositional or non-compositional in a deterministic manner. In this paper, we operationalize the viewpoint that compositionality is contextual rather than deterministic, i.e., that whether a phrase is compositional or non-compositional depends on its context. For example, the phrase �green card� is compositional when referring to a green colored card, whereas it is non-compositional when meaning permanent residence authorization. We address the challenge of detecting this type of contextual compositionality as follows: given a multi-word phrase, we enrich the word embedding representing its semantics with evidence about its global context (terms it often collocates with) as well as its local context (narratives where that phrase is used, which we call usage scenarios). We further extend this representation with information extracted from external knowledge bases. The resulting representation incorporates both localized context and more general usage of the phrase and allows to detect its compositionality in a non-deterministic and contextual way. Empirical evaluation of our model on a dataset of phrase compositionality1, manually collected by crowdsourcing contextual compositionality assessments, shows that our model outperforms state-of-the-art baselines notably on detecting phrase compositionality.
AB - When the meaning of a phrase cannot be inferred from the individual meanings of its words (e.g., hot dog), that phrase is said to be non-compositional. Automatic compositionality detection in multiword phrases is critical in any application of semantic processing, such as search engines [9]; failing to detect non-compositional phrases can hurt system effectiveness notably. Existing research treats phrases as either compositional or non-compositional in a deterministic manner. In this paper, we operationalize the viewpoint that compositionality is contextual rather than deterministic, i.e., that whether a phrase is compositional or non-compositional depends on its context. For example, the phrase �green card� is compositional when referring to a green colored card, whereas it is non-compositional when meaning permanent residence authorization. We address the challenge of detecting this type of contextual compositionality as follows: given a multi-word phrase, we enrich the word embedding representing its semantics with evidence about its global context (terms it often collocates with) as well as its local context (narratives where that phrase is used, which we call usage scenarios). We further extend this representation with information extracted from external knowledge bases. The resulting representation incorporates both localized context and more general usage of the phrase and allows to detect its compositionality in a non-deterministic and contextual way. Empirical evaluation of our model on a dataset of phrase compositionality1, manually collected by crowdsourcing contextual compositionality assessments, shows that our model outperforms state-of-the-art baselines notably on detecting phrase compositionality.
KW - Compositionality detection
KW - Knowledge base
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=85066890169&partnerID=8YFLogxK
U2 - 10.1145/3308560.3316584
DO - 10.1145/3308560.3316584
M3 - Article in proceedings
AN - SCOPUS:85066890169
SP - 317
EP - 323
BT - The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery
T2 - 2019 World Wide Web Conference, WWW 2019
Y2 - 13 May 2019 through 17 May 2019
ER -
ID: 223251961