Learning to Match Aerial Images with Deep Attentive Architectures

Research output: Contribution to journal › Conference article › Research › peer-review

Image matching is a fundamental problem in Computer Vision. In the context of feature-based matching, SIFT and its variants have long excelled in a wide array of applications. However, for ultra-wide baselines, as in the case of aerial images captured under large camera rotations, the appearance variation goes beyond the reach of SIFT and RANSAC. In this paper we propose a data-driven, deep learning-based approach that sidesteps local correspondence by framing the problem as a classification task. Furthermore, we demonstrate that local correspondences can still be useful: we incorporate an attention mechanism to produce a set of probable matches, which allows us to further increase performance. We train our models on a dataset of urban aerial imagery consisting of 'same' and 'different' pairs, collected for this purpose, and characterize the problem via a human study with annotations from Amazon Mechanical Turk. We demonstrate that our models outperform the state of the art on ultra-wide-baseline matching and approach human accuracy.
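The abstract's core idea, casting pair matching as binary 'same'/'different' classification with a two-stream network, can be illustrated with a minimal sketch. This is not the authors' published architecture: the layer sizes, input resolution, and all names below are illustrative assumptions only.

```python
# Minimal sketch of pair matching as binary classification with a
# two-stream ("Siamese") CNN. All architectural details are assumptions.
import torch
import torch.nn as nn

class PairClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared convolutional trunk applied to each image in the pair.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Classification head over the concatenated pair descriptors:
        # one logit for 'same' (1) vs 'different' (0).
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, a, b):
        fa = self.trunk(a).flatten(1)  # 128-dim descriptor of image A
        fb = self.trunk(b).flatten(1)  # 128-dim descriptor of image B
        return self.head(torch.cat([fa, fb], dim=1)).squeeze(1)

# Usage: score a batch of aerial image pairs and compute a training loss.
model = PairClassifier()
a = torch.randn(4, 3, 128, 128)  # 4 RGB crops; 128x128 is an assumed size
b = torch.randn(4, 3, 128, 128)
logits = model(a, b)
loss = nn.BCEWithLogitsLoss()(logits, torch.ones(4))  # labels: all 'same'
```

The paper's attention mechanism, which additionally proposes a set of probable local matches on top of such a classifier, is omitted here for brevity.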

Original language: English
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Pages (from-to): 3539-3547
Number of pages: 9
ISSN: 1063-6919
Publication status: Published - 9 Dec 2016
Externally published: Yes
Event: 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 - Las Vegas, United States
Duration: 26 Jun 2016 – 1 Jul 2016

Conference

Conference: 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Country: United States
City: Las Vegas
Period: 26/06/2016 – 01/07/2016

Bibliographical note

Publisher Copyright: © 2016 IEEE.
