Separating Self-Expression and Visual Content in Hashtag Supervision

Publication: Contribution to journal › Conference article › Research › peer-reviewed

Andreas Veit
Maximilian Nickel
Serge Belongie
Laurens Van Der Maaten

The variety, abundance, and structured nature of hashtags make them an interesting data source for training vision models. For instance, hashtags have the potential to significantly reduce the problem of manual supervision and annotation when learning vision models for a large number of concepts. However, a key challenge when learning from hashtags is that they are inherently subjective because they are provided by users as a form of self-expression. As a consequence, hashtags may have synonyms (different hashtags referring to the same visual content) and may be polysemous (the same hashtag referring to different visual content). These challenges limit the effectiveness of approaches that simply treat hashtags as image-label pairs. This paper presents an approach that extends the modeling of simple image-label pairs to a joint model of images, hashtags, and users. We demonstrate the efficacy of this approach in image tagging and retrieval experiments, and show how the joint model can be used to perform user-conditional retrieval and tagging.

Original language	English
Journal	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Pages (from-to)	5919-5927
Number of pages	9
ISSN	1063-6919
DOI	https://doi.org/10.1109/CVPR.2018.00620
Status	Published - 14 Dec 2018
Published externally	Yes
Event	31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, USA. Duration: 18 Jun 2018 → 22 Jun 2018

Conference

Conference	31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Country	USA
City	Salt Lake City
Period	18/06/2018 → 22/06/2018

Bibliographical note

ID: 301824866

Datalogisk Institut