Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study. / Rampisela, Theresia Veronika; Maistro, Maria; Ruotsalo, Tuukka; Lioma, Christina.
In: ACM Transactions on Recommender Systems, 2024.
RIS
TY - JOUR
T1 - Evaluation Measures of Individual Item Fairness for Recommender Systems
T2 - A Critical Study
AU - Rampisela, Theresia Veronika
AU - Maistro, Maria
AU - Ruotsalo, Tuukka
AU - Lioma, Christina
N1 - Accepted to ACM Transactions on Recommender Systems (TORS)
PY - 2024
Y1 - 2024
AB - Fairness is an emerging and challenging topic in recommender systems. In recent years, various ways of evaluating and therefore improving fairness have emerged. In this study, we examine existing evaluation measures of fairness in recommender systems. Specifically, we focus solely on exposure-based fairness measures of individual items that aim to quantify the disparity in how individual items are recommended to users, separate from item relevance to users. We gather all such measures and we critically analyse their theoretical properties. We identify a series of limitations in each of them, which collectively may render the affected measures hard or impossible to interpret, to compute, or to use for comparing recommendations. We resolve these limitations by redefining or correcting the affected measures, or we argue why certain limitations cannot be resolved. We further perform a comprehensive empirical analysis of both the original and our corrected versions of these fairness measures, using real-world and synthetic datasets. Our analysis provides novel insights into the relationship between measures based on different fairness concepts, and different levels of measure sensitivity and strictness. We conclude with practical suggestions of which fairness measures should be used and when. Our code is publicly available. To our knowledge, this is the first critical comparison of individual item fairness measures in recommender systems.
KW - cs.IR
U2 - 10.1145/3631943
DO - 10.1145/3631943
M3 - Journal article
JO - ACM Transactions on Recommender Systems
JF - ACM Transactions on Recommender Systems
SN - 2770-6699
ER -
ID: 380218431