A cascaded classification approach to semantic head recognition
Publication: Contribution to book/anthology/report › Contribution to book/anthology › Research › peer-reviewed
Standard
A cascaded classification approach to semantic head recognition. / Michelbacher, L.; Kothari, A.; Lioma, Christina; Schütze, H.; Forst, M.
EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference. 2011. pp. 793-803.
RIS
TY - CHAP
T1 - A cascaded classification approach to semantic head recognition
AU - Michelbacher, L.
AU - Kothari, A.
AU - Lioma, Christina
AU - Schütze, H.
AU - Forst, M.
PY - 2011/1/1
Y1 - 2011/1/1
N2 - Most NLP systems use tokenization as part of preprocessing. Generally, tokenizers are based on simple heuristics and do not recognize multi-word units (MWUs) like hot dog or black hole unless a precompiled list of MWUs is available. In this paper, we propose a new cascaded model for detecting MWUs of arbitrary length for tokenization, focusing on noun phrases in the physics domain. We adopt a classification approach because - unlike other work on MWUs - tokenization requires a completely automatic approach. We achieve an accuracy of 68% for recognizing non-compositional MWUs and show that our MWU recognizer improves retrieval performance when used as part of an information retrieval system.
UR - http://www.scopus.com/inward/record.url?scp=80053237387&partnerID=8YFLogxK
M3 - Book chapter
AN - SCOPUS:80053237387
SN - 9781937284114
SP - 793
EP - 803
BT - EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
ER -
ID: 49502244