Unsupervised Semantic Hashing with Pairwise Reconstruction

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

Unsupervised Semantic Hashing with Pairwise Reconstruction. / Hansen, Casper; Hansen, Christian; Simonsen, Jakob Grue; Alstrup, Stephen; Lioma, Christina.

SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2020. p. 2009-2012.

Harvard

Hansen, C, Hansen, C, Simonsen, JG, Alstrup, S & Lioma, C 2020, Unsupervised Semantic Hashing with Pairwise Reconstruction. in SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, pp. 2009-2012, 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual, Online, China, 25/07/2020. https://doi.org/10.1145/3397271.3401220

APA

Hansen, C., Hansen, C., Simonsen, J. G., Alstrup, S., & Lioma, C. (2020). Unsupervised Semantic Hashing with Pairwise Reconstruction. In SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2009-2012). Association for Computing Machinery. https://doi.org/10.1145/3397271.3401220

Vancouver

Hansen C, Hansen C, Simonsen JG, Alstrup S, Lioma C. Unsupervised Semantic Hashing with Pairwise Reconstruction. In SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery. 2020. p. 2009-2012. https://doi.org/10.1145/3397271.3401220

Author

Hansen, Casper ; Hansen, Christian ; Simonsen, Jakob Grue ; Alstrup, Stephen ; Lioma, Christina. / Unsupervised Semantic Hashing with Pairwise Reconstruction. SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 2020. pp. 2009-2012.

Bibtex

@inproceedings{d8079968f1d540ae8372e0ed073522ca,
title = "Unsupervised Semantic Hashing with Pairwise Reconstruction",
abstract = "Semantic Hashing is a popular family of methods for efficient similarity search in large-scale datasets. In Semantic Hashing, documents are encoded as short binary vectors (i.e., hash codes), such that semantic similarity can be efficiently computed using the Hamming distance. Recent state-of-the-art approaches have utilized weak supervision to train better performing hashing models. Inspired by this, we present Semantic Hashing with Pairwise Reconstruction (PairRec), which is a discrete variational autoencoder based hashing model. PairRec first encodes weakly supervised training pairs (a query document and a semantically similar document) into two hash codes, and then learns to reconstruct the same query document from both of these hash codes (i.e., pairwise reconstruction). This pairwise reconstruction enables our model to encode local neighbourhood structures within the hash code directly through the decoder. We experimentally compare PairRec to traditional and state-of-the-art approaches, and obtain significant performance improvements in the task of document similarity search.",
keywords = "pairwise reconstruction, semantic hashing, variational",
author = "Casper Hansen and Christian Hansen and Simonsen, {Jakob Grue} and Stephen Alstrup and Christina Lioma",
year = "2020",
doi = "10.1145/3397271.3401220",
language = "English",
pages = "2009--2012",
booktitle = "SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval",
publisher = "Association for Computing Machinery",
note = "43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020 ; Conference date: 25-07-2020 Through 30-07-2020",

}

RIS

TY - GEN

T1 - Unsupervised Semantic Hashing with Pairwise Reconstruction

AU - Hansen, Casper

AU - Hansen, Christian

AU - Simonsen, Jakob Grue

AU - Alstrup, Stephen

AU - Lioma, Christina

PY - 2020

Y1 - 2020

N2 - Semantic Hashing is a popular family of methods for efficient similarity search in large-scale datasets. In Semantic Hashing, documents are encoded as short binary vectors (i.e., hash codes), such that semantic similarity can be efficiently computed using the Hamming distance. Recent state-of-the-art approaches have utilized weak supervision to train better performing hashing models. Inspired by this, we present Semantic Hashing with Pairwise Reconstruction (PairRec), which is a discrete variational autoencoder based hashing model. PairRec first encodes weakly supervised training pairs (a query document and a semantically similar document) into two hash codes, and then learns to reconstruct the same query document from both of these hash codes (i.e., pairwise reconstruction). This pairwise reconstruction enables our model to encode local neighbourhood structures within the hash code directly through the decoder. We experimentally compare PairRec to traditional and state-of-the-art approaches, and obtain significant performance improvements in the task of document similarity search.

AB - Semantic Hashing is a popular family of methods for efficient similarity search in large-scale datasets. In Semantic Hashing, documents are encoded as short binary vectors (i.e., hash codes), such that semantic similarity can be efficiently computed using the Hamming distance. Recent state-of-the-art approaches have utilized weak supervision to train better performing hashing models. Inspired by this, we present Semantic Hashing with Pairwise Reconstruction (PairRec), which is a discrete variational autoencoder based hashing model. PairRec first encodes weakly supervised training pairs (a query document and a semantically similar document) into two hash codes, and then learns to reconstruct the same query document from both of these hash codes (i.e., pairwise reconstruction). This pairwise reconstruction enables our model to encode local neighbourhood structures within the hash code directly through the decoder. We experimentally compare PairRec to traditional and state-of-the-art approaches, and obtain significant performance improvements in the task of document similarity search.

KW - pairwise reconstruction

KW - semantic hashing

KW - variational

UR - http://www.scopus.com/inward/record.url?scp=85090121707&partnerID=8YFLogxK

U2 - 10.1145/3397271.3401220

DO - 10.1145/3397271.3401220

M3 - Article in proceedings

AN - SCOPUS:85090121707

SP - 2009

EP - 2012

BT - SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery

T2 - 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020

Y2 - 25 July 2020 through 30 July 2020

ER -
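
Illustration

The abstract states that documents are encoded as short binary hash codes so that semantic similarity can be computed with the Hamming distance. The sketch below shows only that retrieval step, not the PairRec model itself. The 8-bit codes and the function names hamming_distance and rank_by_hamming are assumptions made up for this example; in practice the codes would come from a trained hashing model.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    # XOR leaves a 1 exactly where the two codes disagree.
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, codes: List[int]) -> List[Tuple[int, int]]:
    """Return (document index, distance) pairs, nearest code first."""
    distances = [(i, hamming_distance(query, c)) for i, c in enumerate(codes)]
    # sorted() is stable, so ties keep their original document order.
    return sorted(distances, key=lambda pair: pair[1])


# Hypothetical 8-bit hash codes for three documents (illustrative values only).
doc_codes = [0b10110010, 0b10110011, 0b01001101]
query_code = 0b10110110

ranking = rank_by_hamming(query_code, doc_codes)
for index, distance in ranking:
    print(f"document {index}: Hamming distance {distance}")
```

Storing each code as an integer keeps the comparison to one XOR and one bit count, which is why Hamming-distance search over short codes is considered efficient for large collections.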
