Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

Standard

Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering. / Hansen, Christian; Hansen, Casper; Simonsen, Jakob Grue; Lioma, Christina.

The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021. Association for Computing Machinery, Inc, 2021. p. 261-269.

Harvard

Hansen, C, Hansen, C, Simonsen, JG & Lioma, C 2021, Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering. in The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021. Association for Computing Machinery, Inc, pp. 261-269, 2021 World Wide Web Conference, WWW 2021, Ljubljana, Slovenia, 19/04/2021. https://doi.org/10.1145/3442381.3450011

APA

Hansen, C., Hansen, C., Simonsen, J. G., & Lioma, C. (2021). Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering. In The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021 (pp. 261-269). Association for Computing Machinery, Inc. https://doi.org/10.1145/3442381.3450011

Vancouver

Hansen C, Hansen C, Simonsen JG, Lioma C. Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering. In The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021. Association for Computing Machinery, Inc. 2021. p. 261-269. https://doi.org/10.1145/3442381.3450011

Author

Hansen, Christian ; Hansen, Casper ; Simonsen, Jakob Grue ; Lioma, Christina. / Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering. The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021. Association for Computing Machinery, Inc, 2021. pp. 261-269

Bibtex

@inproceedings{fb49f3422ef541a39a10d1784b259a69,
title = "Projected {H}amming dissimilarity for bit-level importance coding in collaborative filtering",
abstract = "When reasoning about tasks that involve large amounts of data, a common approach is to represent data items as objects in the Hamming space where operations can be done efficiently and effectively. Object similarity can then be computed by learning binary representations (hash codes) of the objects and computing their Hamming distance. While this is highly efficient, each bit dimension is equally weighted, which means that potentially discriminative information of the data is lost. A more expressive alternative is to use real-valued vector representations and compute their inner product; this allows varying the weight of each dimension but is many magnitudes slower. To fix this, we derive a new way of measuring the dissimilarity between two objects in the Hamming space with binary weighting of each dimension (i.e., disabling bits): we consider a field-agnostic dissimilarity that projects the vector of one object onto the vector of the other. When working in the Hamming space, this results in a novel projected Hamming dissimilarity, which by choice of projection, effectively allows a binary importance weighting of the hash code of one object through the hash code of the other. We propose a variational hashing model for learning hash codes optimized for this projected Hamming dissimilarity, and experimentally evaluate it in collaborative filtering experiments. The resultant hash codes lead to effectiveness gains of up to +7% in NDCG and +14% in MRR compared to state-of-the-art hashing-based collaborative filtering baselines, while requiring no additional storage and no computational overhead compared to using the Hamming distance. ",
keywords = "Collaborative filtering, Hash codes, Importance coding",
author = "Christian Hansen and Casper Hansen and Simonsen, {Jakob Grue} and Christina Lioma",
year = "2021",
doi = "10.1145/3442381.3450011",
language = "English",
pages = "261--269",
booktitle = "The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021",
publisher = "Association for Computing Machinery, Inc",
note = "2021 World Wide Web Conference, WWW 2021 ; Conference date: 19-04-2021 Through 23-04-2021",
}

RIS

TY - CONF

T1 - Projected Hamming dissimilarity for bit-level importance coding in collaborative filtering

AU - Hansen, Christian

AU - Hansen, Casper

AU - Simonsen, Jakob Grue

AU - Lioma, Christina

PY - 2021

Y1 - 2021

AB - When reasoning about tasks that involve large amounts of data, a common approach is to represent data items as objects in the Hamming space where operations can be done efficiently and effectively. Object similarity can then be computed by learning binary representations (hash codes) of the objects and computing their Hamming distance. While this is highly efficient, each bit dimension is equally weighted, which means that potentially discriminative information of the data is lost. A more expressive alternative is to use real-valued vector representations and compute their inner product; this allows varying the weight of each dimension but is many magnitudes slower. To fix this, we derive a new way of measuring the dissimilarity between two objects in the Hamming space with binary weighting of each dimension (i.e., disabling bits): we consider a field-agnostic dissimilarity that projects the vector of one object onto the vector of the other. When working in the Hamming space, this results in a novel projected Hamming dissimilarity, which by choice of projection, effectively allows a binary importance weighting of the hash code of one object through the hash code of the other. We propose a variational hashing model for learning hash codes optimized for this projected Hamming dissimilarity, and experimentally evaluate it in collaborative filtering experiments. The resultant hash codes lead to effectiveness gains of up to +7% in NDCG and +14% in MRR compared to state-of-the-art hashing-based collaborative filtering baselines, while requiring no additional storage and no computational overhead compared to using the Hamming distance.

KW - Collaborative filtering

KW - Hash codes

KW - Importance coding

UR - http://www.scopus.com/inward/record.url?scp=85105115404&partnerID=8YFLogxK

U2 - 10.1145/3442381.3450011

DO - 10.1145/3442381.3450011

M3 - Article in proceedings

AN - SCOPUS:85105115404

SP - 261

EP - 269

BT - The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021

PB - Association for Computing Machinery, Inc

T2 - 2021 World Wide Web Conference, WWW 2021

Y2 - 19 April 2021 through 23 April 2021

ER -

ID: 300920300