Random walk term weighting for information retrieval

Research output: Chapter in Book/Report/Conference proceeding > Book chapter > Research > peer-review

We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We apply a random walk graph-based ranking algorithm to a graph that encodes terms and their co-occurrence dependencies in text, and derive term weights that quantify how much each term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to traditional tf-idf term weighting, and outperform it when the distance between co-occurring terms is between 6 and 30 terms.
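The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the graph construction, the ranking formula, or any parameters. The sketch assumes an undirected, unweighted co-occurrence graph and a PageRank-style update, and the names `cooccurrence_graph`, `random_walk_weights`, `window`, `damping`, and `iterations` are all hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=6):
    """Link two distinct terms if they occur within `window` tokens
    of each other (the window includes the term itself).
    Assumption: edges are undirected and unweighted."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        graph[term]  # make sure every term gets a node
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def random_walk_weights(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration over the term graph.
    Assumption: damping=0.85 is a conventional default,
    not a value taken from the paper."""
    n = len(graph)
    if n == 0:
        return {}
    scores = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            # Each neighbour passes on an equal share of its score.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[term])
            updated[term] = (1 - damping) / n + damping * rank
        scores = updated
    return scores


if __name__ == "__main__":
    text = "random walk term weighting uses term co-occurrence between terms"
    weights = random_walk_weights(cooccurrence_graph(text.split(), window=3))
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{weight:.4f}")
```

In this sketch, `window` plays the role of the co-occurrence distance mentioned in the abstract. A larger window links more distant terms and so produces a denser graph. A term with no neighbours keeps only the baseline `(1 - damping) / n` score.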
Original language: English
Title of host publication: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
Number of pages: 2
Publication date: 1 Jan 2007
Pages: 829-830
ISBN (Print): 9781595935977
Publication status: Published - 1 Jan 2007

ID: 49502390