Random walk term weighting for information retrieval

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Christina Lioma
Roi Blanco

We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We run a graph-based random walk ranking algorithm on a graph that encodes terms and their co-occurrence dependencies in text, and derive from it term weights that quantify how much a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to traditional tf-idf term weighting, and outperform it when the distance between co-occurring terms is between 6 and 30 terms.
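
The abstract does not spell out the exact formulation, so the following is only a minimal sketch of the general idea, assuming a PageRank-style random walk over an undirected co-occurrence graph. The function names, the `window` parameter (maximum distance between co-occurring terms), the damping factor of 0.85 and the fixed iteration count are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=6):
    """Link two distinct terms if they occur at most `window` tokens apart.

    Hypothetical helper: the graph is undirected and unweighted.
    """
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        # Look ahead at the next `window` tokens only; edges are symmetric.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def random_walk_weights(graph, damping=0.85, iterations=50):
    """PageRank-style scores used as term weights (assumed formulation)."""
    n = len(graph)
    weights = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            # Each neighbour shares its current score equally among its links.
            incoming = sum(weights[nb] / len(graph[nb]) for nb in graph[term])
            updated[term] = (1 - damping) / n + damping * incoming
        weights = updated
    return weights


if __name__ == "__main__":
    text = "random walk term weights rank each term by its context terms"
    graph = cooccurrence_graph(text.split(), window=6)
    for term, score in sorted(random_walk_weights(graph).items(),
                              key=lambda item: -item[1]):
        print(f"{term}\t{score:.4f}")
```

In this sketch a term scores highly when it co-occurs with many terms that themselves score highly. How such a score would be combined with document-level statistics at retrieval time is left open here.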

Original language	English
Title of host publication	SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher	Association for Computing Machinery
Publication date	2007
Pages	829-830
Publication status	Published - 2007
Externally published	Yes

Bibliographical note

Copyright is held by the author/owner(s).
SIGIR’07, July 23–27, 2007, Amsterdam, The Netherlands.
ACM 978-1-59593-597-7/07/0007.

Department of Computer Science
