WordSup: Exploiting Word Annotations for Character based Text Detection

Hu, Han; Zhang, Chengquan; Luo, Yuxuan; Wang, Yuzhuo; Han, Junyu; Ding, Errui

arXiv:1708.06720 (cs)

[Submitted on 22 Aug 2017]

Abstract: Text in images is usually organized as a hierarchy of several visual elements, i.e. characters, words, text lines and text blocks. Among these elements, the character is the most basic one across many kinds of text, such as Western, Chinese and Japanese scripts and mathematical expressions. It is therefore natural and convenient to build a common text detection engine on character detectors. However, training character detectors requires a vast number of location-annotated characters, which are expensive to obtain, and the existing real text datasets are mostly annotated at the word or line level. To resolve this dilemma, we propose a weakly supervised framework that can use word annotations, either tight quadrangles or looser bounding boxes, for character detector training. When applied to scene text detection, we are thus able to train a robust character detector by exploiting word annotations in the rich large-scale real scene text datasets, e.g. ICDAR15 and COCO-text. The character detector plays a key role in the pipeline of our text detection engine. It achieves state-of-the-art performance on several challenging scene text detection benchmarks. We also demonstrate the flexibility of our pipeline in various scenarios, including deformed text detection and math expression recognition.

Comments:	2017 International Conference on Computer Vision
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1708.06720 [cs.CV]
	(or arXiv:1708.06720v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1708.06720

Submission history

From: Han Hu
[v1] Tue, 22 Aug 2017 16:55:24 UTC (5,805 KB)
