A syntactically-based query reformulation technique for information retrieval

Department of Computer Science

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

A syntactically-based query reformulation technique for information retrieval. / Lioma, Christina; Ounis, I.

In: Information Processing & Management, Vol. 44, No. 1, 2008, p. 143-162.

Harvard

Lioma, C & Ounis, I 2008, 'A syntactically-based query reformulation technique for information retrieval', Information Processing & Management, vol. 44, no. 1, pp. 143-162. https://doi.org/10.1016/j.ipm.2006.12.005

APA

Lioma, C., & Ounis, I. (2008). A syntactically-based query reformulation technique for information retrieval. Information Processing & Management, 44(1), 143-162. https://doi.org/10.1016/j.ipm.2006.12.005

Vancouver

Lioma C, Ounis I. A syntactically-based query reformulation technique for information retrieval. Information Processing & Management. 2008;44(1):143-162. https://doi.org/10.1016/j.ipm.2006.12.005

Author

Lioma, Christina ; Ounis, I. / A syntactically-based query reformulation technique for information retrieval. In: Information Processing & Management. 2008 ; Vol. 44, No. 1. pp. 143-162.

Bibtex

@article{e80d33da74a04cf098528c252dcd493e,

title = "A syntactically-based query reformulation technique for information retrieval",

abstract = "Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society of Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computer Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We implement this finding to Information Retrieval, as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance, when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models. 
Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.",

author = "Christina Lioma and I. Ounis",

note = "Evaluation of Interactive Information Retrieval Systems",

year = "2008",

doi = "10.1016/j.ipm.2006.12.005",

language = "English",

volume = "44",

pages = "143--162",

journal = "Information Processing \& Management",

issn = "0306-4573",

publisher = "Elsevier",

number = "1",

}

RIS

TY - JOUR

T1 - A syntactically-based query reformulation technique for information retrieval

AU - Lioma, Christina

AU - Ounis, I.

N1 - Evaluation of Interactive Information Retrieval Systems

PY - 2008

Y1 - 2008

N2 - Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society of Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computer Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We implement this finding to Information Retrieval, as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance, when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models. 
Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.

AB - Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society of Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computer Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We implement this finding to Information Retrieval, as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance, when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models. 
Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.

U2 - 10.1016/j.ipm.2006.12.005

DO - 10.1016/j.ipm.2006.12.005

M3 - Journal article

AN - SCOPUS:35548957766

VL - 44

SP - 143

EP - 162

JO - Information Processing & Management

JF - Information Processing & Management

SN - 0306-4573

IS - 1

ER -
