Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation. / Hansen, Niels Dalum; Mølbak, Kåre; Cox, Ingemar Johansson; Lioma, Christina.
SIGIR '17 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2017. p. 1197-1200.
RIS
TY - GEN
T1 - Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation
AU - Hansen, Niels Dalum
AU - Mølbak, Kåre
AU - Cox, Ingemar Johansson
AU - Lioma, Christina
PY - 2017
Y1 - 2017
N2 - Influenza-like illness (ILI) estimation from web search data is an important web analytics task. The basic idea is to use the frequencies of queries in web search logs that are correlated with past ILI activity as features when estimating current ILI activity. It has been noted that since influenza is seasonal, this approach can lead to spurious correlations with features/queries that also exhibit seasonality, but have no relationship with ILI. Spurious correlations can, in turn, degrade performance. To address this issue, we propose modeling the seasonal variation in ILI activity and selecting queries that are correlated with the residual of the seasonal model and the observed ILI signal. Experimental results show that re-ranking queries obtained by Google Correlate based on their correlation with the residual strongly favours ILI-related queries.
U2 - 10.1145/3077136.3080760
DO - 10.1145/3077136.3080760
M3 - Article in proceedings
SP - 1197
EP - 1200
BT - SIGIR '17 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery
T2 - 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Y2 - 7 August 2017 through 11 August 2017
ER -
ID: 195769168
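
The abstract describes selecting queries by their correlation with the residual between a seasonal model and the observed ILI signal. The following is a minimal sketch of that idea, not the authors' implementation. It assumes weekly data, a single sine/cosine pair as the seasonal model (the record does not say which seasonal model the paper uses), and Pearson correlation as the ranking score. The function names, the query names, and the synthetic data are all hypothetical.

```python
import numpy as np


def seasonal_fit(y, period=52):
    """Least-squares fit of an intercept plus one sine/cosine pair.

    Assumption: this harmonic form stands in for "the seasonal model";
    the source record does not specify the model actually used.
    """
    t = np.arange(len(y))
    X = np.column_stack([
        np.ones(len(y)),
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef


def rank_by_residual(ili, queries, period=52):
    """Rank queries by Pearson correlation with (observed ILI - seasonal fit)."""
    residual = ili - seasonal_fit(ili, period)
    scores = {
        name: float(np.corrcoef(freq, residual)[0, 1])
        for name, freq in queries.items()
    }
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Synthetic illustration only (not data from the paper): four years of weekly values.
rng = np.random.default_rng(0)
n = 208
t = np.arange(n)
season = np.cos(2 * np.pi * t / 52)      # shared yearly cycle
outbreak = rng.normal(0.0, 0.5, n)       # non-seasonal ILI variation
ili = 2.0 + season + outbreak

queries = {
    # Hypothetical ILI-related query: follows the full ILI signal.
    "flu symptoms": ili + rng.normal(0.0, 0.05, n),
    # Hypothetical seasonal but unrelated query: follows only the yearly cycle.
    "ski holiday": season + rng.normal(0.0, 0.05, n),
}

ranking, scores = rank_by_residual(ili, queries)
raw = {name: float(np.corrcoef(freq, ili)[0, 1]) for name, freq in queries.items()}
```

In this synthetic setup both queries correlate strongly with the raw ILI series because they share the yearly cycle, but only the ILI-related query keeps a substantial correlation once the seasonal fit is subtracted. Comparing `raw` with `scores` shows the difference.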