Reading Scene Text in Deep Convolutional Sequences

He, Pan; Huang, Weilin; Qiao, Yu; Loy, Chen Change; Tang, Xiaoou

arXiv:1506.04395 (cs)

[Submitted on 14 Jun 2015 (v1), last revised 20 Dec 2015 (this version, v2)]

Abstract: We develop a Deep-Text Recurrent Network (DTRN) that treats scene text reading as a sequence labelling problem. We leverage recent advances in deep convolutional neural networks to generate an ordered high-level sequence from a whole word image, avoiding the difficult character segmentation problem. A deep recurrent model, built on long short-term memory (LSTM), is then developed to robustly recognise the generated CNN sequences, departing from most existing approaches, which recognise each character independently. Our model has a number of appealing properties in comparison to existing scene text recognition methods: (i) it can recognise highly ambiguous words by leveraging meaningful context information, allowing it to work reliably without either pre- or post-processing; (ii) the deep CNN feature is robust to various image distortions; (iii) it retains the explicit order information in the word image, which is essential to discriminate word strings; (iv) the model does not depend on a pre-defined dictionary, and it can process unknown words and arbitrary strings. Code for the DTRN will be made available.
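
The segmentation-free idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the window width, the stride, the blank label, and the CTC-style collapsing rule are all assumptions made here for illustration, and the CNN and LSTM stages are omitted. The sketch only shows how a word image might be cut into an ordered sequence of windows, and how per-window labels could be turned into a string without locating individual characters.

```python
from itertools import groupby

BLANK = "-"  # hypothetical "no character" label for frames that fall between letters


def sliding_windows(image, width, stride):
    """Cut a word image (a list of pixel rows) into ordered, overlapping column windows."""
    n_cols = len(image[0])
    return [
        [row[x:x + width] for row in image]
        for x in range(0, n_cols - width + 1, stride)
    ]


def collapse(frame_labels, blank=BLANK):
    """Merge runs of identical frame labels, then drop blanks (a CTC-style rule, assumed here)."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


# A toy 2 x 8 "image"; in a real model each window would be passed to a CNN.
image = [list(range(8)), list(range(8, 16))]
windows = sliding_windows(image, width=4, stride=2)
print(len(windows))  # 3 windows, ordered left to right

# Hypothetical per-frame predictions that a recurrent model might emit for one word image.
print(collapse(list("--hh-e-ll-l-oo-")))  # hello
```

Because the windows stay in left-to-right order, the label sequence preserves character order, and the blank label lets repeated letters such as the double "l" survive the merge step.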

Comments:	To appear in the 30th AAAI Conference on Artificial Intelligence (AAAI-16), 2016
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1506.04395 [cs.CV]
	(or arXiv:1506.04395v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1506.04395

Submission history

From: Weilin Huang
[v1] Sun, 14 Jun 2015 13:34:10 UTC (4,510 KB)
[v2] Sun, 20 Dec 2015 21:06:23 UTC (4,840 KB)
