Random walk term weighting for information retrieval

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We apply a random walk graph-based ranking algorithm to a graph that encodes terms and their co-occurrence dependencies in text, and from it we derive term weights that quantify how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to traditional tf-idf term weighting, and outperform it when the distance between co-occurring terms is between 6 and 30 terms.
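The idea in the abstract can be illustrated with a minimal sketch. It assumes a PageRank-style random walk over an undirected, unweighted co-occurrence graph; the abstract does not give the exact formulation, so the function names, the damping factor of 0.85, the iteration count, and the window handling are all illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=6):
    """Link distinct terms that fall inside the same sliding window
    of `window` consecutive tokens (assumed graph construction)."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        # Look ahead at the next window - 1 tokens.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def random_walk_weights(graph, damping=0.85, iterations=50):
    """PageRank-style iteration; damping and iterations are assumptions."""
    n = len(graph)
    scores = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        # Each term receives an equal share of every neighbour's score.
        scores = {
            term: (1 - damping) / n
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[term])
            for term in graph
        }
    return scores


tokens = "a b a c a d".split()
weights = random_walk_weights(cooccurrence_graph(tokens, window=2))
top_term = max(weights, key=weights.get)
```

In this toy input the term `a` co-occurs with every other term, so it ends up with the largest weight. Widening `window` adds more edges to the graph, which corresponds to the co-occurrence distance that the evaluation varies.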
Original language: English
Title of host publication: SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher: Association for Computing Machinery
Publication date: 2007
Pages: 829-830
Publication status: Published - 2007
Externally published: Yes

Bibliographical note

Copyright is held by the author/owner(s).
SIGIR’07, July 23–27, 2007, Amsterdam, The Netherlands.
ACM 978-1-59593-597-7/07/0007.

ID: 38251957