The tipping point: F-score as a function of the number of retrieved items
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
The tipping point: F-score as a function of the number of retrieved items. / Guns, Raf; Lioma, Christina; Larsen, Birger.
In: Information Processing & Management, Vol. 48, No. 6, 2012, p. 1171-1180.
RIS
TY - JOUR
T1 - The tipping point
T2 - F-score as a function of the number of retrieved items
AU - Guns, Raf
AU - Lioma, Christina
AU - Larsen, Birger
PY - 2012
Y1 - 2012
N2 - One of the best known measures of information retrieval (IR) performance is the F-score, the harmonic mean of precision and recall. In this article we show that the curve of the F-score as a function of the number of retrieved items is always of the same shape: a fast concave increase to a maximum, followed by a slow decrease. In other words, there exists a single maximum, referred to as the tipping point, where the retrieval situation is ‘ideal’ in terms of the F-score. The tipping point thus indicates the optimal number of items to be retrieved, with more or less items resulting in a lower F-score. This empirical result is found in IR and link prediction experiments and can be partially explained theoretically, expanding on earlier results by Egghe. We discuss the implications and argue that, when comparing F-scores, one should compare the F-score curves’ tipping points.
KW - Information Retrieval
KW - Evaluation
U2 - 10.1016/j.ipm.2012.02.009
DO - 10.1016/j.ipm.2012.02.009
M3 - Journal article
VL - 48
SP - 1171
EP - 1180
JO - Information Processing & Management
JF - Information Processing & Management
SN - 0306-4573
IS - 6
ER -
ID: 38240608