Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study. / Rampisela, Theresia Veronika; Maistro, Maria; Ruotsalo, Tuukka; Lioma, Christina.

In: ACM Transactions on Recommender Systems, 2024.

Harvard

Rampisela, TV, Maistro, M, Ruotsalo, T & Lioma, C 2024, 'Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study', ACM Transactions on Recommender Systems. https://doi.org/10.1145/3631943

APA

Rampisela, T. V., Maistro, M., Ruotsalo, T., & Lioma, C. (2024). Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study. ACM Transactions on Recommender Systems. https://doi.org/10.1145/3631943

Vancouver

Rampisela TV, Maistro M, Ruotsalo T, Lioma C. Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study. ACM Transactions on Recommender Systems. 2024. https://doi.org/10.1145/3631943

Author

Rampisela, Theresia Veronika; Maistro, Maria; Ruotsalo, Tuukka; Lioma, Christina. / Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study. In: ACM Transactions on Recommender Systems. 2024.

Bibtex

@article{4982e603471447818001a1cbed800524,

title = "Evaluation Measures of Individual Item Fairness for Recommender Systems: A Critical Study",

abstract = "Fairness is an emerging and challenging topic in recommender systems. In recent years, various ways of evaluating and therefore improving fairness have emerged. In this study, we examine existing evaluation measures of fairness in recommender systems. Specifically, we focus solely on exposure-based fairness measures of individual items that aim to quantify the disparity in how individual items are recommended to users, separate from item relevance to users. We gather all such measures and we critically analyse their theoretical properties. We identify a series of limitations in each of them, which collectively may render the affected measures hard or impossible to interpret, to compute, or to use for comparing recommendations. We resolve these limitations by redefining or correcting the affected measures, or we argue why certain limitations cannot be resolved. We further perform a comprehensive empirical analysis of both the original and our corrected versions of these fairness measures, using real-world and synthetic datasets. Our analysis provides novel insights into the relationship between measures based on different fairness concepts, and different levels of measure sensitivity and strictness. We conclude with practical suggestions of which fairness measures should be used and when. Our code is publicly available. To our knowledge, this is the first critical comparison of individual item fairness measures in recommender systems.",

keywords = "cs.IR",

author = "Rampisela, {Theresia Veronika} and Maria Maistro and Tuukka Ruotsalo and Christina Lioma",

note = "Accepted to ACM Transactions on Recommender Systems (TORS)",

year = "2024",

doi = "10.1145/3631943",

language = "English",

journal = "ACM Transactions on Recommender Systems",

issn = "2770-6699",

publisher = "Association for Computing Machinery (ACM)",

}

RIS

TY - JOUR

T1 - Evaluation Measures of Individual Item Fairness for Recommender Systems

T2 - A Critical Study

AU - Rampisela, Theresia Veronika

AU - Maistro, Maria

AU - Ruotsalo, Tuukka

AU - Lioma, Christina

N1 - Accepted to ACM Transactions on Recommender Systems (TORS)

PY - 2024

Y1 - 2024

N2 - Fairness is an emerging and challenging topic in recommender systems. In recent years, various ways of evaluating and therefore improving fairness have emerged. In this study, we examine existing evaluation measures of fairness in recommender systems. Specifically, we focus solely on exposure-based fairness measures of individual items that aim to quantify the disparity in how individual items are recommended to users, separate from item relevance to users. We gather all such measures and we critically analyse their theoretical properties. We identify a series of limitations in each of them, which collectively may render the affected measures hard or impossible to interpret, to compute, or to use for comparing recommendations. We resolve these limitations by redefining or correcting the affected measures, or we argue why certain limitations cannot be resolved. We further perform a comprehensive empirical analysis of both the original and our corrected versions of these fairness measures, using real-world and synthetic datasets. Our analysis provides novel insights into the relationship between measures based on different fairness concepts, and different levels of measure sensitivity and strictness. We conclude with practical suggestions of which fairness measures should be used and when. Our code is publicly available. To our knowledge, this is the first critical comparison of individual item fairness measures in recommender systems.

AB - Fairness is an emerging and challenging topic in recommender systems. In recent years, various ways of evaluating and therefore improving fairness have emerged. In this study, we examine existing evaluation measures of fairness in recommender systems. Specifically, we focus solely on exposure-based fairness measures of individual items that aim to quantify the disparity in how individual items are recommended to users, separate from item relevance to users. We gather all such measures and we critically analyse their theoretical properties. We identify a series of limitations in each of them, which collectively may render the affected measures hard or impossible to interpret, to compute, or to use for comparing recommendations. We resolve these limitations by redefining or correcting the affected measures, or we argue why certain limitations cannot be resolved. We further perform a comprehensive empirical analysis of both the original and our corrected versions of these fairness measures, using real-world and synthetic datasets. Our analysis provides novel insights into the relationship between measures based on different fairness concepts, and different levels of measure sensitivity and strictness. We conclude with practical suggestions of which fairness measures should be used and when. Our code is publicly available. To our knowledge, this is the first critical comparison of individual item fairness measures in recommender systems.

KW - cs.IR

U2 - 10.1145/3631943

DO - 10.1145/3631943

M3 - Journal article

JO - ACM Transactions on Recommender Systems

JF - ACM Transactions on Recommender Systems

SN - 2770-6699

ER -

Department of Computer Science