Video text detection and recognition: Dataset and benchmark

Research output: Contribution to journal › Conference article › Research › peer-review

Standard

Video text detection and recognition: Dataset and benchmark. / Nguyen, Phuc Xuan; Wang, Kai; Belongie, Serge.

In: 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014, 2014, p. 776-783.

Harvard

Nguyen, PX, Wang, K & Belongie, S 2014, 'Video text detection and recognition: Dataset and benchmark', 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014, pp. 776-783. https://doi.org/10.1109/WACV.2014.6836024

APA

Nguyen, P. X., Wang, K., & Belongie, S. (2014). Video text detection and recognition: Dataset and benchmark. 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014, 776-783. https://doi.org/10.1109/WACV.2014.6836024

Vancouver

Nguyen PX, Wang K, Belongie S. Video text detection and recognition: Dataset and benchmark. 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014. 2014;776-783. https://doi.org/10.1109/WACV.2014.6836024

Author

Nguyen, Phuc Xuan; Wang, Kai; Belongie, Serge. / Video text detection and recognition: Dataset and benchmark. In: 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014. 2014; pp. 776-783.

Bibtex

@inproceedings{00def270c4934e6c83e0ec8059a6fff9,
title = "Video text detection and recognition: Dataset and benchmark",
abstract = "This paper focuses on the problem of text detection and recognition in videos. Even though text detection and recognition in images has seen much progress in recent years, relatively little work has been done to extend these solutions to the video domain. In this work, we extend an existing end-to-end solution for text recognition in natural images to video. We explore a variety of methods for training local character models and explore methods to capitalize on the temporal redundancy of text in video. We present detection performance using the Video Analysis and Content Extraction (VACE) benchmarking framework on the ICDAR 2013 Robust Reading Challenge 3 video dataset and on a new video text dataset. We also propose a new performance metric based on precision-recall curves to measure the performance of text recognition in videos. Using this metric, we provide early video text recognition results on the above mentioned datasets.",
author = "Nguyen, {Phuc Xuan} and Kai Wang and Serge Belongie",
year = "2014",
doi = "10.1109/WACV.2014.6836024",
language = "English",
pages = "776--783",
booktitle = "2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014",
note = "2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014 ; Conference date: 24-03-2014 Through 26-03-2014",

}

RIS

TY - CONF

T1 - Video text detection and recognition: Dataset and benchmark

T2 - 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014

AU - Nguyen, Phuc Xuan

AU - Wang, Kai

AU - Belongie, Serge

PY - 2014

Y1 - 2014

AB - This paper focuses on the problem of text detection and recognition in videos. Even though text detection and recognition in images has seen much progress in recent years, relatively little work has been done to extend these solutions to the video domain. In this work, we extend an existing end-to-end solution for text recognition in natural images to video. We explore a variety of methods for training local character models and explore methods to capitalize on the temporal redundancy of text in video. We present detection performance using the Video Analysis and Content Extraction (VACE) benchmarking framework on the ICDAR 2013 Robust Reading Challenge 3 video dataset and on a new video text dataset. We also propose a new performance metric based on precision-recall curves to measure the performance of text recognition in videos. Using this metric, we provide early video text recognition results on the above mentioned datasets.

UR - http://www.scopus.com/inward/record.url?scp=84904675660&partnerID=8YFLogxK

U2 - 10.1109/WACV.2014.6836024

DO - 10.1109/WACV.2014.6836024

M3 - Conference article

AN - SCOPUS:84904675660

SP - 776

EP - 783

JO - 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014

JF - 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014

Y2 - 24 March 2014 through 26 March 2014

ER -
