How to Measure the Reproducibility of System-oriented IR Experiments

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

How to Measure the Reproducibility of System-oriented IR Experiments. / Breuer, Timo; Ferro, Nicola; Fuhr, Norbert; Maistro, Maria; Sakai, Tetsuya; Schaer, Philipp; Soboroff, Ian.

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2020. p. 349-358.

Harvard

Breuer, T, Ferro, N, Fuhr, N, Maistro, M, Sakai, T, Schaer, P & Soboroff, I 2020, How to Measure the Reproducibility of System-oriented IR Experiments. in SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, pp. 349-358, 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual, Online, China, 25/07/2020. https://doi.org/10.1145/3397271.3401036

APA

Breuer, T., Ferro, N., Fuhr, N., Maistro, M., Sakai, T., Schaer, P., & Soboroff, I. (2020). How to Measure the Reproducibility of System-oriented IR Experiments. In SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 349-358). Association for Computing Machinery. https://doi.org/10.1145/3397271.3401036

Vancouver

Breuer T, Ferro N, Fuhr N, Maistro M, Sakai T, Schaer P et al. How to Measure the Reproducibility of System-oriented IR Experiments. In SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery. 2020. p. 349-358. https://doi.org/10.1145/3397271.3401036

Author

Breuer, Timo ; Ferro, Nicola ; Fuhr, Norbert ; Maistro, Maria ; Sakai, Tetsuya ; Schaer, Philipp ; Soboroff, Ian. / How to Measure the Reproducibility of System-oriented IR Experiments. SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2020. pp. 349-358

Bibtex

@inproceedings{efbdda4032b244afb2caba97969ea0f2,
  title = "How to Measure the Reproducibility of System-oriented IR Experiments",
  abstract = "Replicability and reproducibility of experimental results are primary concerns in all the areas of science and IR is not an exception. Besides the problem of moving the field towards more reproducible experimental practices and protocols, we also face a severe methodological issue: we do not have any means to assess when reproduced is reproduced. Moreover, we lack any reproducibility-oriented dataset, which would allow us to develop such methods. To address these issues, we compare several measures to objectively quantify to what extent we have replicated or reproduced a system-oriented IR experiment. These measures operate at different levels of granularity, from the fine-grained comparison of ranked lists, to the more general comparison of the obtained effects and significant differences. Moreover, we also develop a reproducibility-oriented dataset, which allows us to validate our measures and which can also be used to develop future measures.",
  keywords = "measure, replicability, reproducibility",
  author = "Timo Breuer and Nicola Ferro and Norbert Fuhr and Maria Maistro and Tetsuya Sakai and Philipp Schaer and Ian Soboroff",
  note = "Publisher Copyright: {\textcopyright} 2020 ACM.; 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020 ; Conference date: 25-07-2020 Through 30-07-2020",
  year = "2020",
  doi = "10.1145/3397271.3401036",
  language = "English",
  pages = "349--358",
  booktitle = "SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval",
  publisher = "Association for Computing Machinery",
}

RIS

TY  - GEN
T1  - How to Measure the Reproducibility of System-oriented IR Experiments
AU  - Breuer, Timo
AU  - Ferro, Nicola
AU  - Fuhr, Norbert
AU  - Maistro, Maria
AU  - Sakai, Tetsuya
AU  - Schaer, Philipp
AU  - Soboroff, Ian
PY  - 2020
Y1  - 2020
N2  - Replicability and reproducibility of experimental results are primary concerns in all the areas of science and IR is not an exception. Besides the problem of moving the field towards more reproducible experimental practices and protocols, we also face a severe methodological issue: we do not have any means to assess when reproduced is reproduced. Moreover, we lack any reproducibility-oriented dataset, which would allow us to develop such methods. To address these issues, we compare several measures to objectively quantify to what extent we have replicated or reproduced a system-oriented IR experiment. These measures operate at different levels of granularity, from the fine-grained comparison of ranked lists, to the more general comparison of the obtained effects and significant differences. Moreover, we also develop a reproducibility-oriented dataset, which allows us to validate our measures and which can also be used to develop future measures.
AB  - Replicability and reproducibility of experimental results are primary concerns in all the areas of science and IR is not an exception. Besides the problem of moving the field towards more reproducible experimental practices and protocols, we also face a severe methodological issue: we do not have any means to assess when reproduced is reproduced. Moreover, we lack any reproducibility-oriented dataset, which would allow us to develop such methods. To address these issues, we compare several measures to objectively quantify to what extent we have replicated or reproduced a system-oriented IR experiment. These measures operate at different levels of granularity, from the fine-grained comparison of ranked lists, to the more general comparison of the obtained effects and significant differences. Moreover, we also develop a reproducibility-oriented dataset, which allows us to validate our measures and which can also be used to develop future measures.
KW  - measure
KW  - replicability
KW  - reproducibility
U2  - 10.1145/3397271.3401036
DO  - 10.1145/3397271.3401036
M3  - Article in proceedings
AN  - SCOPUS:85090158838
SP  - 349
EP  - 358
BT  - SIGIR '20
PB  - Association for Computing Machinery
T2  - 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020
Y2  - 25 July 2020 through 30 July 2020
ER  -

Department of Computer Science