Learning to Match Aerial Images with Deep Attentive Architectures
Publication: Contribution to journal › Conference article › Research › peer-reviewed
Standard
Learning to Match Aerial Images with Deep Attentive Architectures. / Altwaijry, Hani; Trulls, Eduard; Hays, James; Fua, Pascal; Belongie, Serge.
In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 09.12.2016, pp. 3539-3547.
Bibtex
@inproceedings{Altwaijry2016CVPR,
  title     = {Learning to Match Aerial Images with Deep Attentive Architectures},
  author    = {Altwaijry, Hani and Trulls, Eduard and Hays, James and Fua, Pascal and Belongie, Serge},
  booktitle = {Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  pages     = {3539--3547},
  doi       = {10.1109/CVPR.2016.385},
  issn      = {1063-6919},
  note      = {29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 ; Conference date: 26-06-2016 through 01-07-2016},
}
RIS
TY - GEN
T1 - Learning to Match Aerial Images with Deep Attentive Architectures
AU - Altwaijry, Hani
AU - Trulls, Eduard
AU - Hays, James
AU - Fua, Pascal
AU - Belongie, Serge
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/12/9
Y1 - 2016/12/9
N2 - Image matching is a fundamental problem in Computer Vision. In the context of feature-based matching, SIFT and its variants have long excelled in a wide array of applications. However, for ultra-wide baselines, as in the case of aerial images captured under large camera rotations, the appearance variation goes beyond the reach of SIFT and RANSAC. In this paper we propose a data-driven, deep learning-based approach that sidesteps local correspondence by framing the problem as a classification task. Furthermore, we demonstrate that local correspondences can still be useful. To do so we incorporate an attention mechanism to produce a set of probable matches, which allows us to further increase performance. We train our models on a dataset of urban aerial imagery consisting of 'same' and 'different' pairs, collected for this purpose, and characterize the problem via a human study with annotations from Amazon Mechanical Turk. We demonstrate that our models outperform the state-of-the-art on ultra-wide baseline matching and approach human accuracy.
AB - Image matching is a fundamental problem in Computer Vision. In the context of feature-based matching, SIFT and its variants have long excelled in a wide array of applications. However, for ultra-wide baselines, as in the case of aerial images captured under large camera rotations, the appearance variation goes beyond the reach of SIFT and RANSAC. In this paper we propose a data-driven, deep learning-based approach that sidesteps local correspondence by framing the problem as a classification task. Furthermore, we demonstrate that local correspondences can still be useful. To do so we incorporate an attention mechanism to produce a set of probable matches, which allows us to further increase performance. We train our models on a dataset of urban aerial imagery consisting of 'same' and 'different' pairs, collected for this purpose, and characterize the problem via a human study with annotations from Amazon Mechanical Turk. We demonstrate that our models outperform the state-of-the-art on ultra-wide baseline matching and approach human accuracy.
UR - http://www.scopus.com/inward/record.url?scp=84986275007&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2016.385
DO - 10.1109/CVPR.2016.385
M3 - Conference article
AN - SCOPUS:84986275007
SP - 3539
EP - 3547
JO - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
JF - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
SN - 1063-6919
T2 - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Y2 - 26 June 2016 through 1 July 2016
ER -
ID: 301828469