Detecting oriented text in natural images by linking segments

Publication: Contribution to journal › Conference article › Research › peer-reviewed

Most state-of-the-art text detection methods are specific to horizontal Latin text and are not fast enough for real-time applications. We introduce Segment Linking (SegLink), an oriented text detection method. The main idea is to decompose text into two locally detectable elements, namely segments and links. A segment is an oriented box covering a part of a word or text line; a link connects two adjacent segments, indicating that they belong to the same word or text line. Both elements are detected densely at multiple scales by an end-to-end trained, fully-convolutional neural network. Final detections are produced by combining segments connected by links. Compared with previous methods, SegLink improves along the dimensions of accuracy, speed, and ease of training. It achieves an f-measure of 75.0% on the standard ICDAR 2015 Incidental (Challenge 4) benchmark, outperforming the previous best by a large margin. It runs at over 20 FPS on 512×512 images. Moreover, without modification, SegLink is able to detect long lines of non-Latin text, such as Chinese.
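The combining step described in the abstract (grouping segments that are connected by links and merging each group into one detection) can be illustrated with a short, hedged sketch. This is not the authors' implementation: the Segment record, the union-find helper, and the axis-aligned merge below are assumptions for illustration; SegLink itself fits an oriented box to each grouped set of segments.

from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Segment:
    cx: float      # center x (illustrative fields, not the paper's exact encoding)
    cy: float      # center y
    w: float       # width
    h: float       # height
    theta: float   # orientation angle

def combine_segments(segments: List[Segment],
                     links: List[Tuple[int, int]]) -> List[Tuple[float, float, float, float]]:
    """Group linked segments with union-find, then merge each group into one detection."""
    parent = list(range(len(segments)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for a, b in links:                      # each link joins two adjacent segments
        parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for idx in range(len(segments)):
        groups.setdefault(find(idx), []).append(idx)

    detections = []
    for members in groups.values():
        # Simplified merge: axis-aligned bounding box of the group's segments.
        # The paper's combining algorithm produces an oriented box instead.
        x0 = min(segments[i].cx - segments[i].w / 2 for i in members)
        x1 = max(segments[i].cx + segments[i].w / 2 for i in members)
        y0 = min(segments[i].cy - segments[i].h / 2 for i in members)
        y1 = max(segments[i].cy + segments[i].h / 2 for i in members)
        detections.append((x0, y0, x1, y1))
    return detections

For example, two segments joined by a single link form one connected group and are merged into a single box covering both, which is how a long word or text line emerges from its locally detected parts.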

Original language: English
Journal: Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Pages (from-to): 3482-3490
Number of pages: 9
DOI
Status: Published - 6 Nov 2017
Externally published: Yes
Event: 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 - Honolulu, USA
Duration: 21 Jul 2017 - 26 Jul 2017

Conference

Conference: 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Country: USA
City: Honolulu
Period: 21/07/2017 - 26/07/2017

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

ID: 301827309