COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images

Publication: Working paper, Preprint, Research

Standard

COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images. / Belongie, Serge; Veit, Andreas; Matera, Tomáš; Neumann, Lukas; Matas, Jiri.

2016.

Harvard

Belongie, S, Veit, A, Matera, T, Neumann, L & Matas, J 2016 'COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images'. https://doi.org/10.48550/arXiv.1601.07140

APA

Belongie, S., Veit, A., Matera, T., Neumann, L., & Matas, J. (2016). COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images. https://doi.org/10.48550/arXiv.1601.07140

Vancouver

Belongie S, Veit A, Matera T, Neumann L, Matas J. COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images. 2016 Jun 19. https://doi.org/10.48550/arXiv.1601.07140

Author

Belongie, Serge ; Veit, Andreas ; Matera, Tomáš ; Neumann, Lukas ; Matas, Jiri. / COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images. 2016.

Bibtex

@techreport{719f91d67d084f5aabd9f15f047f9b45,
title = "COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images",
abstract = "This paper describes the COCO-Text dataset. In recent years large-scale datasets like SUN and Imagenet drove the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance state-of-the-art in text detection and recognition in natural images. The dataset is based on the MS COCO dataset, which contains images of complex everyday scenes. The images were not collected with text in mind and thus contain a broad variety of text instances. To reflect the diversity of text in natural scenes, we annotate text with (a) location in terms of a bounding box, (b) fine-grained classification into machine printed text and handwritten text, (c) classification into legible and illegible text, (d) script of the text and (e) transcriptions of legible text. The dataset contains over 173k text annotations in over 63k images. We provide a statistical analysis of the accuracy of our annotations. In addition, we present an analysis of three leading state-of-the-art photo Optical Character Recognition (OCR) approaches on our dataset. While scene text detection and recognition enjoys strong advances in recent years, we identify significant shortcomings motivating future work.",
author = "Serge Belongie and Andreas Veit and Tom{\'a}{\v s} Matera and Lukas Neumann and Jiri Matas",
year = "2016",
month = jun,
day = "19",
doi = "10.48550/arXiv.1601.07140",
language = "English",
type = "WorkingPaper",

}

RIS

TY - UNPB

T1 - COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images

AU - Belongie, Serge

AU - Veit, Andreas

AU - Matera, Tomáš

AU - Neumann, Lukas

AU - Matas, Jiri

PY - 2016/6/19

Y1 - 2016/6/19

AB - This paper describes the COCO-Text dataset. In recent years large-scale datasets like SUN and Imagenet drove the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance state-of-the-art in text detection and recognition in natural images. The dataset is based on the MS COCO dataset, which contains images of complex everyday scenes. The images were not collected with text in mind and thus contain a broad variety of text instances. To reflect the diversity of text in natural scenes, we annotate text with (a) location in terms of a bounding box, (b) fine-grained classification into machine printed text and handwritten text, (c) classification into legible and illegible text, (d) script of the text and (e) transcriptions of legible text. The dataset contains over 173k text annotations in over 63k images. We provide a statistical analysis of the accuracy of our annotations. In addition, we present an analysis of three leading state-of-the-art photo Optical Character Recognition (OCR) approaches on our dataset. While scene text detection and recognition enjoys strong advances in recent years, we identify significant shortcomings motivating future work.

UR - https://vision.cornell.edu/se3/wp-content/uploads/2016/01/1601.07140v1.pdf

U2 - https://doi.org/10.48550/arXiv.1601.07140

DO - 10.48550/arXiv.1601.07140

M3 - Preprint

BT - COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images

ER -

ID: 307528242