Robust Detection of Text in Natural Scene Images
PROF. G. UBALE,
Electronics and Telecommunication, Khurana Sawant Institute of Engineering and Technology, Hingoli, India
NOVATEUR PUBLICATIONS
International Journal of Research Publications in Engineering and Technology [IJRPET]
ISSN: 2454-7875
VOLUME 3, ISSUE 6, Jun.-2017
clustering algorithm with the help of the already learned parameters as computer program. In the next step, a character categorizer is used to estimate the posterior probabilities for the input image. Text regions that correspond to non-text regions in the image are identified in this process. The algorithm removes the text regions with a high non-text probability. This kind of elimination helps in training a more accurate, efficient and reliable text categorizer for identifying and recognizing text in the image. Lastly, a precise and robust system for detection of text in scene images has been built.
By applying various key improvements over traditional MSER-based methods, a novel MSER-based method for detection of text in scene images has been developed. The proposed method includes the following stages:
Fig. 1. Flowchart of the system used and results after each stage.
MSERs that have less variation and sharp borders are mostly characters.
B. TEXT CANDIDATE CONSTRUCTION:
Text regions are constructed from the received results. A metric learning algorithm is used to learn the distance weights and the grouping threshold of the detected text. Character regions are grouped into text regions through a single-link clustering algorithm. Such grouping produces elongated groups and is therefore particularly suitable for text region construction. Single-link clustering belongs to the hierarchical family of clustering: each input data point is initially taken as a single group, and the groups are successively merged until all points have been merged into a single group. In single-link clustering, the two groups with the smallest distance between their closest members are merged together. A threshold is set such that the grouping process terminates when the distance between the nearest clusters exceeds the threshold.
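The thresholded single-link grouping described above can be illustrated with a minimal sketch. It assumes plain Euclidean distance between hypothetical character centres, whereas the method in this paper uses learned distance weights; the function names, the sample points and the threshold value are illustrative assumptions only.

```python
import math
from itertools import combinations


def single_link_distance(group_a, group_b):
    """Single-link distance: distance between the closest members of two groups."""
    return min(math.dist(p, q) for p in group_a for q in group_b)


def single_link_clustering(points, threshold):
    """Merge the two nearest groups until their distance exceeds `threshold`."""
    # Every input point starts as its own group.
    groups = [[p] for p in points]
    while len(groups) > 1:
        # Find the pair of groups with the smallest single-link distance.
        (i, j), dist = min(
            (((i, j), single_link_distance(groups[i], groups[j]))
             for i, j in combinations(range(len(groups)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break  # the nearest groups are too far apart, so stop merging
        groups[i].extend(groups[j])  # i < j, so deleting j keeps i valid
        del groups[j]
    return groups


# Hypothetical character centres (x, y): two words on the same text line.
centres = [(0, 0), (10, 1), (20, 0), (80, 0), (90, 1)]
clusters = single_link_clustering(centres, threshold=15.0)
```

With this assumed threshold the three left-hand centres and the two right-hand centres end up in separate groups, because the gap between the two words is larger than the threshold. A very large threshold would merge everything into one group, and a very small one would leave every character on its own.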
C. TEXT REGIONS ELIMINATION:
Most of the non-text regions need to be removed before the text categorizer is trained, because it is not easy to train an effective text categorizer on such an unbalanced database. We propose the use of a character categorizer to estimate the posterior probabilities of text regions corresponding to non-text, and to remove the text regions with high non-text probabilities. The features used to train the character categorizer include the height and width of the text region, smoothness, aspect ratio and stroke width features. Characters with small aspect ratios, such as “i”, “j” and “l”, are labeled as negative samples.
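As a rough sketch of the elimination rule, the snippet below drops text regions whose non-text posterior probability exceeds a cut-off. The `TextRegion` class, the probability values and the 0.5 cut-off are hypothetical; in practice the probabilities would come from a trained character categorizer, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class TextRegion:
    """Bounding box of a text region plus an assumed categorizer output."""
    x: int
    y: int
    width: int
    height: int
    p_non_text: float  # assumed posterior probability of being non-text

    @property
    def aspect_ratio(self) -> float:
        # Width divided by height, one of the geometric features named above.
        return self.width / self.height


def eliminate_non_text(regions, max_p_non_text=0.5):
    """Keep only regions whose non-text posterior is at most the cut-off."""
    return [r for r in regions if r.p_non_text <= max_p_non_text]


# Hypothetical regions with made-up probabilities, for illustration only.
regions = [
    TextRegion(10, 20, 120, 30, p_non_text=0.08),
    TextRegion(200, 15, 40, 45, p_non_text=0.91),
    TextRegion(15, 80, 90, 28, p_non_text=0.35),
]
kept = eliminate_non_text(regions)
```

The surviving regions would then form a more balanced set for training the text categorizer.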
the background leading to stable intensity profiles makes it work well for the text.
B. USE OF BASIC GEOMETRIC PROPERTIES FOR REMOVAL OF NON-TEXT REGIONS:
Although the MSER technique picks out most of the text, it also detects many stable regions in the image that are not text. A rule-based approach is used to remove these non-text regions. For example, geometric properties of text can be used to filter out non-text regions with simple thresholds. Alternatively, a machine learning approach can be used to train a text versus non-text categorizer. A combination of the two approaches can produce better results. Geometric properties that can be used for discriminating between text and non-text regions include aspect ratio, eccentricity, Euler number, extent and solidity.
Fig. 3. Canny edges and intersection of Canny edges with MSER region.
Use of stroke width variation for removal of non-text regions
Stroke width can be another parameter to distinguish between the text and the non-text regions. Stroke width is a measure of the width of the curves and lines that make up a character. Non-text regions are likely to have larger stroke width variations, while text regions have little. We then estimate the stroke width of one of the detected MSER regions. This can be done by using a distance transform and a binary thinning operation.
Fig. 4. Text candidate before and after stroke width filtering
C. MERGE OF TEXT REGIONS FOR FINAL DETECTION:
Up to this point, the results consist of individual text characters. These individual letters must now be merged so that they are recognized as a single word and then as a line. This helps in recognizing the actual words in an image rather than just the individual characters. For example, compare recognizing the string 'EXIT' with the set of individual characters {'X','E','T','I'}, where the meaning of the word is lost without the correct ordering.
D. TEXT RECOGNITION:
Finally, the text is recognized using a template-based technique.
CONCLUSION:
This paper presents an MSER method for detection of text in scene images with several techniques. A fast and precise MSER pruning algorithm allows most characters to be detected even in low-quality images. Also, a novel distance metric technique is used, in which the distance between text regions serves as a parameter to distinguish between text and non-text regions. The individual text regions are then grouped together to form meaningful words. The text categorizer eliminates the non-text regions while retaining the text regions. A system with robust detection of text in scene images exhibits superior performance over state-of-the-art methods on a variety of public databases.
For further research on this method, we put forth several limitations. First, it is still difficult to detect highly blurred text in low-resolution scene images. Second, text parameters such as geometric parameters change between languages, for example between English and Chinese. Third, highly skewed or distorted text with multiple orientations needs further research. The best results have been observed with bright font colors (e.g. white) that are clearly distinguished from the background.
ACKNOWLEDGMENT:
I would like to express my deep sense of respect and gratitude towards my advisor Dr. R.R. Sawant and guide Prof. G.B. Ubale, Head of the Electronics & Telecommunication Department of our institute, who has been the guiding force behind this work. I am greatly indebted to him for his constant encouragement, invaluable advice and for propelling me further in every aspect of my academic life.