Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection

Publication: Contribution to journal, Conference article, Research, peer-reviewed

  • Grant Van Horn
  • Steve Branson
  • Ryan Farrell
  • Scott Haber
  • Jessie Barry
  • Panos Ipeirotis
  • Pietro Perona
  • Serge Belongie

We introduce tools and methodologies to collect high quality, large scale fine-grained computer vision datasets using citizen scientists - crowd annotators who are passionate and knowledgeable about specific domains such as birds or airplanes. We worked with citizen scientists and domain experts to collect NABirds, a new high quality dataset containing 48,562 images of North American birds with 555 categories, part annotations and bounding boxes. We find that citizen scientists are significantly more accurate than Mechanical Turkers at zero cost. We worked with bird experts to measure the quality of popular datasets like CUB-200-2011 and ImageNet and found class label error rates of at least 4%. Nevertheless, we found that learning algorithms are surprisingly robust to annotation errors and this level of training data corruption can lead to an acceptably small increase in test error if the training set has sufficient size. At the same time, we found that an expert-curated high quality test set like NABirds is necessary to accurately measure the performance of fine-grained computer vision systems. We used NABirds to train a publicly available bird recognition service deployed on the web site of the Cornell Lab of Ornithology.

Original language: English
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Pages (from-to): 595-604
Number of pages: 10
ISSN: 1063-6919
DOI
Status: Published - 14 Oct 2015
Published externally: Yes
Event: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, USA
Duration: 7 Jun 2015 to 12 Jun 2015

Conference

Conference: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Country: USA
City: Boston
Period: 07/06/2015 to 12/06/2015

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

ID: 301829133