
2017 Fourth International Conference on Image Information Processing (ICIIP)

Text Detection and Localization in Natural Scene Images Using MSER and Fast Guided Filter
Rituraj Soni∗ , Bijendra Kumar† , Satish Chand‡
∗ Scholar, Department of Computer Engineering, NSIT, New Delhi, India
† Professor, Department of Computer Engineering, NSIT, New Delhi, India
‡ Professor, School of Computer and Systems Science, JNU, New Delhi, India
∗ [email protected], † [email protected], ‡ [email protected]

Abstract—Textual matter present in a natural scene image provides indispensable information about it. The semantics and information present in natural scene images can be perceived by extracting the text regions in them. Detection and localization of text in natural scene images is a challenging image-analysis task because of variations in font size, font type, and illumination. In this paper, we propose a hybrid approach for text detection and localization based on a text confidence score computed from three attributes, namely stroke width dissimilarity, color dissimilarity and occupy rate convex area, to discern text and non-text constituents. The aim of this paper is to achieve fast detection and localization of text regions in low-resolution and blurred images. To accomplish this, the possible candidate regions are extracted using edge smoothing by the fast guided filter followed by MSER. The text confidence score on these constituents is calculated using the Bayesian framework with the help of the above-mentioned three attributes. Experimental results on the benchmark ICDAR 2013 testing dataset show the efficacy of our method in terms of precision, recall, and f-measure.

Index Terms—Text Detection and Localization, Text Confidence Score, Edge Smoothing MSER, Fast Guided Filter.

I. INTRODUCTION
Extraction of text regions from natural scene images is one of the crucial tasks in computer vision. The information about scene images like notice boards, advertisement boards, road signs etc. is embedded in the form of text. These texts provide a rich amount of information about such images and can be used in heterogeneous applications such as license plate localization, robot navigation, content-based image retrieval, guidance for visually impaired people [1] etc. Our proposed work in this paper revolves around the text detection and localization process, which is focused on estimating text locations in the image and creating bounding boxes around them. Researchers over the years have made significant progress in this field; however, this domain is still open due to various challenges like variable font size, alignment of the text, color, complex natural scene, occlusion, noise, blur [1], illumination variations, viewpoint, distortion of the image, etc. Further, false positives detected due to the presence of background objects like bricks, windows, leaves, etc. may decrease precision.
In this paper, we present a hybrid approach for scene text detection and localization which consists of a) First, an edge smoothing process using the fast guided filter followed by extracting prospective text areas using MSER. b) Second, three attributes, namely Stroke Width Dissimilarity, Color Dissimilarity and Occupy Rate Convex Area, are calculated on these areas. c) Third, we blend these attributes using a Bayesian classifier to estimate the TCS (Text Confidence Score), which determines the feasibility of a region being text. d) Last, the labeling of constituents as text and non-text is carried out using graph cut and Markov Random Fields (MRFs) [2], followed by text line integration with the help of the mean-shift clustering approach.
The arrangement of the paper is as follows: Introduction is discussed in Section I, related work is reviewed in Section II. Section III defines the working of the proposed method. Experiments and results are discussed in Section IV, with concluding remarks mentioned in Section V.

II. RELATED WORK
Numerous methods have been developed and proposed in the past for scene text detection and localization. Based on an extensive survey [1], these methods can be divided into edge, stroke, texture, connected component (CC) and hybrid based methods.
In the edge based method [3], edges are detected by an edge detector and text components are extracted by morphological operations. It is suitable for images with uniform gradient and gives poor results for images having a complicated background.
The texture based method uses Wavelet Transform, Discrete Cosine Transform (DCT) [1], Histogram of Gradients (HOG) and Local Binary Pattern (LBP) to acquire texture features, as text regions have different texture properties as compared to non-text regions. These methods are sensitive to text arrangement, but efficient for distinguishing crowded characters.
The stroke based method [4] uses stroke width as the text-determining feature to extract text regions from an image, but it is unsuccessful on images with varying background.
The connected component (CC) based method [4] uses color clustering or edge detection for separating text components from the image. These methods have lower computation cost due to the smaller number of segmented candidate components, but they require advance information about the scale and position of the text. Maximally Stable Extremal Regions (MSER) based methods [5], which are sensitive to image blurriness [6], can be included in CC based methods.
The disadvantages present in each of these methods prompt

978-1-5090-6734-3/17/$31.00 ©2017 IEEE 351


researchers to choose hybrid methods [1] for text detection in order to achieve higher precision and recall. Yi and Tian [7] present a hybrid method to locate horizontal text in the steady colored image. The clustering of text at the pixel level is achieved by applying a bigram color uniformity based method and extraction of text is done by stroke segmentation. Li et al. [8] discuss a method that integrates three cues, namely stroke width variation, perceptual divergence and histogram of gradients, to classify text and non-text components.
Wang et al. [9] propose a method to design a confidence map by combining seed appearance and its relationship with adjoining candidates to extract text. Missing texts are recaptured by utilizing the context information. Fabrizio et al. [10] present a method depending on texture and connected component methods to detect letters after segmentation using a wavelet descriptor and form text areas by applying graph modeling. Gomez and Karatzas [11] detect text with discriminatory and probabilistic stopping rules by applying an agglomerative clustering process over individual regions.
Every method has some disadvantages associated with it. The methods based on MSER [12], [13], [7] are sensitive to blur and low-resolution images. The problem of strong reflection is not dealt with properly in [13]. Text with small, distorted, artistic and unconventional fonts in images cannot be detected properly by the methods in [8], [9], [11]. Texts cannot be properly segmented in [10] when they are stuck together. The method in [8] is slow, whereas the problem of low contrast between text and its background cannot be handled by [9]. These disadvantages inspire us to propose a new hybrid approach for text detection and localization in natural scene images to increase performance in terms of accuracy.

III. PROPOSED METHOD
In this work, we propose a hybrid method based on edge smoothing MSER using the fast guided filter to detect and localize text in natural scene images. The training is accomplished on the dataset for the text segmentation task (challenge 2, task 2.2) from the ICDAR 2013 robust reading competition [14] to generate the distribution of the three attributes on text and non-text constituents, which is needed for the calculation of TCS using the Bayesian framework. The proposed method is applied on the ICDAR 2013 test dataset. The flowchart in Fig. 1 depicts the working of the proposed method. Figure 2 shows the output of each stage on a sample image.

[Flowchart: Testing Images 2013 dataset → ESMSER and Constituent Filtering → Extract three attributes on constituents → Get TCS using attributes by Bayesian classifier → Text Labelling Process using MRF → Text Line Integration by Mean Shift Clustering]
Fig. 1: Flowchart of the proposed method

A. Edge Smoothing MSER and Constituent Filtering
1) ESMSER: MSER [5], with a time complexity of O(n log log n), where n is the number of pixels in the image, was originally used to determine resemblance points between images. It is accepted in numerous disciplines like object tracking, image matching, object recognition etc. The MSER algorithm generates stable regions across a range of thresholds which are either brighter or darker than their adjoining areas. Invariance to affine transformation, stability over the range of thresholds and resilience to multiscale detection [5] are a few advantages of MSER that make it suitable for the scene text detection and localization process [12].
Although the original version of the MSER algorithm detects regions with consistent intensity enclosed by a strikingly different background, its efficiency decreases in case of diversified contrast and blurriness in images. As a result, certain text constituents remain undiscovered. To resolve this issue, Chen et al. [12] locate and eliminate MSER pixels outside the edge boundary using the Canny edge detector. Li and Lu [6] have extracted text components using contrast-enhanced MSER (CEMSER), whereas Li et al. [8] extracted text components by applying eMSER (edge preserving MSER) using the guided filter [15], but the method of [8] is slow.
In this paper, we propose Edge Smoothing MSER (ESMSER) for detecting the possible text constituents using the fast guided filter [16] (see Algorithm 1). The eMSER [8] (using the guided filter [15]) takes more time for smoothing of edges as compared to the proposed ESMSER (using the fast guided filter [16]). The sensitivity of MSER to image blurriness due to diverse pixels, as discussed above, creates the need to get rid of these pixels so as to decrease the effect of the blurriness and improve detection of text in low-resolution and blurred images. To perform this, first an edge smoothing process is carried out on the sample image in HSI color space using the fast guided filter, and then the MSER detection is applied to the edge smoothened image to extract possible text constituents. The miscellaneous pixels around the boundary of the characters are removed by this edge smoothing process, which separates the characters. Figure 3(a) shows the sample image, the result of the original MSER is shown in Fig. 3(b) (characters are connected), and Fig. 3(c) shows the effect of the proposed ESMSER using the fast guided filter (characters are detached properly). The fast guided filter [16], having time complexity O(n/s^2) (s is the sub-sampling ratio), decreases the execution time for smoothing of edges as discussed in Section IV-2. The time complexity of Algorithm 1 is O(n log log n) + O(n/s^2). The space complexity is proportional to n (pixels in the image).
2) Constituent Filtering: The text-like constituents such as bricks, windows, boundaries of sign boards, doors, etc. [13],


[6] may contribute to false positives in the detection process; therefore, it is required to remove them. The basic structural properties of text and non-text are different from each other, therefore certain heuristics and basic rules are implemented to eliminate these non-text elements. These rules are as follows: (a) The aspect ratio for constituents is kept in the range 0.3 to 3. (b) The occupation ratio for constituents is kept between 0 and 1. (c) The skeleton of constituents is kept less than 18, as texts are smaller as compared to non-texts.

Fig. 2: Proposed method. (a) Sample image. (b) ESMSER. (c) Constituent filtering. (d) Labelling. (e) Grouping (detected text).

Algorithm 1 ESMSER: Prospective Text Constituents
Input: Sample image $I_s$ and corresponding parameters
Output: Prospective text constituents.
1: Transform image $I_s$ into intensity image $I_c$ utilizing HSI.
2: The edges of image $I_c$ are smoothened by using the Fast Guided Filter [16].
3: The gradient value map $\nabla I_c$ is calculated.
4: Normalize $\nabla I_c$ to [0, 255] and get the edge smoothened images $I_{es} = I_c + \lambda \nabla I_c$ and $I_{es} = I_c - \lambda \nabla I_c$.
5: Extract text constituents by applying MSER on $I_{es}$.

Fig. 3: (a) Sample image. (b) Original MSER. (c) Proposed ESMSER.

B. Attributes for text regions
We present three different attributes to discern between text and non-text constituents.
1) Stroke Width Dissimilarity (SWD): The text constituents have almost unvarying stroke width, and it denotes the text region in the image. It is extensively used in the field of text detection as an initial step [4], [6], [8], [13]. Stroke width transform (SWT) is specified as the length of the straight line in the perpendicular direction between two edge pixels [4]. A stroke [7] can be interpreted as a coupled region with consistent color, with a part of the text having approximately constant width. In [13], for a region r the width variation of a component (c) is defined as $wv(r) = \sigma(c)/\mu(c)$. Inspired by [13], we propose to use the attribute Stroke Width Dissimilarity (equation 1 and Algorithm 2) as follows:
$$SWD(r) = \frac{\sigma(l_{sw})}{\mu(l_{sw})} \tag{1}$$
In Fig. 4 color is used to show the stroke width ($l_{sw}$) similarity in text regions 4(a), whereas in non-text constituents 4(b), there is large dissimilarity of stroke width ($l_{sw}$).

Fig. 4: (a) Sample. (b) ESMSER. (c) SWD. (Best viewed in color.)

Algorithm 2 Stroke Width Dissimilarity (SWD)
Input: Prospective text region $C_t$.
Output: Stroke Width Dissimilarity.
1: Obtain the outline $S_o(r)$ of the candidate $C_t$ region r.
2: The distance transform is used to calculate the shortest path to the region boundary for every pixel $p \in S_o(r)$. This calculated shortest path $l_{sw}$ is called the stroke width.
3: Calculate $SWD(r) = \sigma(l_{sw})/\mu(l_{sw})$.

2) Color Dissimilarity (CD): The color of text regions is disparate from the background in images, so text can be effortlessly pointed out by humans. We propose the attribute color dissimilarity (CD) as the color distinction between text and its adjacent area. The color dissimilarity of two regions can be estimated with the assistance of the Jensen-Shannon Divergence (JSD) [17]. JSD is well defined in information theory, symmetric and a measure of discernibility. Its square root is a true metric for the probability distribution space. The JSD between the probability distributions M and N is calculated as:
$$DS_{JSD}(M \| N) = \frac{1}{2} DS_{KL}(M \| A) + \frac{1}{2} DS_{KL}(N \| A) \tag{2}$$
where $DS_{KL}(M \| A)$, the KL divergence [18] between M and A, is defined as
$$DS_{KL}(M \| A) = \sum_{i=1}^{b} M(i) \log \frac{M(i)}{A(i)} \tag{3}$$


where b denotes the number of bins and $A = \frac{1}{2}(M + N)$.

Fig. 5: (a) Convex area. (b) Area of bounding box.
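As an illustration of equations (2) and (3), the following minimal Python sketch computes the Jensen-Shannon divergence between two normalized histograms. It is not the authors' implementation: the function names are hypothetical, the natural logarithm is assumed (the paper does not state the base), and bins with $M(i) = 0$ are taken to contribute zero by the usual convention.

```python
import math


def kl_divergence(m, a):
    """KL divergence of equation (3): sum_i M(i) * log(M(i) / A(i)).

    Assumption: natural logarithm; empty bins of M contribute 0.
    """
    return sum(mi * math.log(mi / ai) for mi, ai in zip(m, a) if mi > 0)


def js_divergence(m, n):
    """Jensen-Shannon divergence of equation (2), with A = (M + N) / 2."""
    if len(m) != len(n):
        raise ValueError("histograms must have the same number of bins")
    a = [(mi + ni) / 2 for mi, ni in zip(m, n)]
    # A(i) > 0 whenever M(i) > 0 or N(i) > 0, so the logarithms are defined.
    return 0.5 * kl_divergence(m, a) + 0.5 * kl_divergence(n, a)


if __name__ == "__main__":
    print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # identical histograms: 0.0
    print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # disjoint histograms: log(2)
```

With the natural logarithm the value lies between 0 (identical histograms) and log 2 (histograms with no overlapping bins), and swapping the two arguments gives the same result, which reflects the symmetry property mentioned for JSD.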

The color dissimilarity of a region L against its surrounding $L^*$ is calculated as:
$$CD(L) = \sum_{RGB} \sum_{i=1}^{b} DS_{JSD}\big(C_i(L) \,\|\, C_i(L^*)\big) \tag{4}$$
where C(L) and C($L^*$) are the color histograms of the two regions L and $L^*$ in a channel (R, G, B). Here $L^*$ denotes the region outside L but within its bounding box, and i is the index of the histogram bins. The final color dissimilarity attribute for the region L is acquired by summing the color dissimilarity of each (R, G, B) channel (equation 4).
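The per-channel summation described above can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: it reads equation (4) as the Jensen-Shannon divergence between the histogram of L and the histogram of $L^*$, summed over the R, G and B channels. The helper names, the choice of 8 bins, the 8-bit value range and the natural logarithm are all assumptions made for the example.

```python
import math


def histogram(values, bins=8, max_value=256):
    """Normalized histogram of integer channel values in [0, max_value)."""
    if not values:
        raise ValueError("a region must contain at least one pixel")
    counts = [0] * bins
    for v in values:
        counts[v * bins // max_value] += 1
    return [c / len(values) for c in counts]


def js_divergence(m, n):
    """Jensen-Shannon divergence between two normalized histograms."""
    a = [(mi + ni) / 2 for mi, ni in zip(m, n)]

    def kl(p):
        # Empty bins contribute 0; A(i) > 0 whenever p(i) > 0.
        return sum(pi * math.log(pi / ai) for pi, ai in zip(p, a) if pi > 0)

    return 0.5 * kl(m) + 0.5 * kl(n)


def color_dissimilarity(region_pixels, surround_pixels, bins=8):
    """Sum over R, G, B of the JSD between the histograms of L and L*.

    Both arguments are lists of (R, G, B) tuples: the pixels of the region L
    and the pixels inside its bounding box that lie outside L.
    """
    total = 0.0
    for channel in range(3):
        hist_region = histogram([p[channel] for p in region_pixels], bins)
        hist_surround = histogram([p[channel] for p in surround_pixels], bins)
        total += js_divergence(hist_region, hist_surround)
    return total


if __name__ == "__main__":
    text = [(250, 250, 250)] * 20     # bright strokes
    background = [(10, 10, 10)] * 60  # dark surrounding pixels
    print(color_dissimilarity(text, background))
    print(color_dissimilarity(text, text))
```

Under these assumptions a region whose colors do not overlap with its surroundings in any channel reaches the maximum value of 3 log 2, while a region indistinguishable from its surroundings scores 0.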

Fig. 6: Observation feasibility (normalized histograms; x-axis: bins, y-axis: feasibility) of text components (green) in (a) and non-text components (magenta) in (b) for the three attributes SWD (first row), CD (second row) and $O_rC_a$ (third row); the distributions differ from each other.

3) Occupy Rate Convex Area attribute ($O_rC_a$): Inspired by [19], we propose $O_rC_a$ to discern between text and non-text constituents. It is calculated as the ratio of the convex area $C_A$ of a region r to the area of the bounding box enclosing the region r. The convex area is the area of the smallest convex hull that contains the region. The $O_rC_a$ attribute [19] for region r has range [0, 1] and is given as:
$$O_rC_a = \frac{C_A(r)}{A_{BB}} \tag{5}$$
where $C_A$ and $A_{BB}$ are the convex area and the area of the bounding box of the region r, respectively. This feature is used in [12] to discern between text and non-text constituents. Figure 5 displays both the convex area 5(a) and the area of the bounding box 5(b).

C. Text Confidence (TCS)
1) TCS: Among the three attributes, SWD discerns text and non-text constituents on structural dissimilarity, CD explores the color dissimilarity of region r with its surrounding, and $O_rC_a$ captures the dissimilarity in occupied area. As these three attributes examine different properties of text and non-text constituents, which are complementary and independent for a region r, we blend them in a naive Bayesian framework to learn the feasibility TCS of a region r being text (t) as follows:
$$TCS(t|\Psi) = \frac{p(\Psi|t)\,p(t)}{p(\Psi)} \tag{6}$$
$$= \frac{p(t) \prod_{atr \in \Psi} p(atr|t)}{\sum_{j \in \{t, nt\}} p(j) \prod_{atr \in \Psi} p(atr|j)} \tag{7}$$
where $\Psi = (SWD, CD, O_rC_a)$, atr stands for attribute, and p(t), p(nt) denote the prior probabilities of text and non-text, respectively, calculated on the basis of relative frequency.
2) Training on ICDAR 2013: Distribution of the attributes: The dataset of ICDAR 2013 (text segmentation task), challenge 2 of the robust reading competition [14], is selected for training in order to compute the observation feasibility p(atr|t) and p(atr|nt) via the distribution of attributes on text and non-text constituents. This dataset is available with pixel-level ground truth information. The distribution of text and non-text constituents is computed as follows:
1) The distribution of attributes on text constituents is computed directly by applying ground truth information.
2) The distribution of attributes on non-text constituents is computed by first applying ESMSER to obtain possible candidate regions. The text constituents are masked from them by using ground truth information, and the three attributes are calculated on the non-text constituents.
In Figure 6 the normalized histograms (50 bins) are used to show the distribution of the three attributes. The text constituents have smaller values (nearly within 10-15 bins) for the distribution of SWD as compared with non-text constituents. The text elements have higher values of CD as compared to non-text, and for $O_rC_a$, text constituents almost follow a half Gaussian distribution, but such an estimation does not exist for non-text constituents [19].

D. Labeling and Grouping
1) Overview of Labeling Model: We propose to use binary constituent association and unary constituent characteristics to classify constituents properly (into text and non-text categories). A standard image segmentation problem can be formulated using the graph cut model [2] for giving binary labels to text and


non-text constituents. A standard graph model $G_I = (V_I, E_I)$ is constructed for every input image I, where the vertex set associated with possible text regions is defined as $V_I = \{v_i\}$ and the edge set associated with the interaction between vertices is defined as $E_I = \{e_i\}$. Giving a label to each $v_i$ as either text ($k_i = 1$) or non-text ($k_i = 0$), i.e. $k_i \in \{0, 1\}$, is known as the binary labeling problem. Text and non-text can be isolated by means of the text labeling set $K = \{k_i\}$. In this paper, inspired by [2], the energy function (see equation (9)) is minimized to obtain the optimal labeling $K^*$.
$$K^* = \arg\min_{K} E(K) \tag{8}$$
$$E(K) = \sum_{i} u_i(k_i) + \sum_{i,j \in E_I} v_{ij}(k_i, k_j) \tag{9}$$
where $u_i(k_i)$ is the unary potential, which determines the expense of giving label $k_i$ to $v_i$, and $v_{ij}(k_i, k_j)$ is the pairwise potential, which determines the expense of assigning different labels to $v_i$ and $v_j$. The optimal labeling $K^*$ can be calculated efficiently using graph cut [2], as labeling is an energy minimization problem.
2) Estimation of unary potential: The Text Confidence Score (TCS) in equation (7) can be used for estimation of the unary potential for the region as:
$$u_i(k_i) = \begin{cases} TCS(t|\Psi), & k_i = 1 \\ 1 - TCS(t|\Psi), & k_i = 0 \end{cases} \tag{10}$$
3) Estimation of pairwise potential: Due to features like color, spatial distance, texture, geometry etc., neighboring text constituents appear to be similar to each other. Two features are used to quantify correspondence between regions.
Distance Feature (DF): The distance feature between two adjacent constituents of extracted possible text regions is calculated as the Euclidean distance $DF(tr_i, tr_j)$ between the m and n coordinates of the centroids of the possible text regions $tr_i$ and $tr_j$.
Color Distance Feature (CDF): The CDF [8] is defined as the average color distance between two regions $tr_i$, $tr_j$ in the LAB color space using the L2 norm. The joint difference (JD) using DF and CDF can be estimated as follows [8]:
$$JD(tr_i, tr_j) = \gamma\, DF(tr_i, tr_j) + (1 - \gamma)\, CDF(tr_i, tr_j) \tag{11}$$
where $\gamma$ specifies the relative weight of the two differences; its value is set to 0.5 to give equal weightage to DF and CDF. The joint difference is used to estimate the pairwise potential as follows:
$$v_{ij}(k_i, k_j) = \begin{cases} 1 - \tanh(JD(tr_i, tr_j)), & k_i \neq k_j \\ 0, & \text{otherwise} \end{cases} \tag{12}$$
4) Text Line Integration: The labeled text components can be integrated into text lines on the basis of homogeneous features such as average color, height, width, stroke width [4] etc. In this paper, text line integration is achieved by clustering the text regions with the mean-shift algorithm (bandwidth = 2.2) on two normalized features of each constituent: Eccentricity and Orientation [13]. The Eccentricity is the ratio of the distance between the foci of the ellipse and its major axis length. The Orientation is defined as the angle between the x-axis and the major axis of the ellipse that has the same second moments as the region. Integration of a text line is performed by taking at least two constituents on the basis of the spatial distance (calculated by the Euclidean norm) of the labeled constituents.

IV. EXPERIMENTAL RESULTS AND DISCUSSION
1) Performance Evaluation Measure and Dataset: To measure the usefulness of our proposed approach, we evaluate it on the ICDAR 2013 [14] dataset of the text localization task (challenge 2, task 2.1), which contains 233 and 229 images for the test and training set respectively. The detected bounding boxes around the texts are used to assess the performance in terms of three parameters, namely precision (p), recall (r) and f-measure [20]. The DetEval software [20] is used to calculate p and r by using many-to-one, one-to-many and one-to-one matches between ground truth and detected bounding boxes. The f-measure is calculated as the harmonic mean [20] of the recall and precision.
2) Comparison of Execution Time on Smoothing of Edges: The original MSER suffers due to the presence of varied pixels in the vicinity of edges, so it is imperative to smoothen edges to extract the text properly. The edge smoothing can be achieved by the guided filter [15] due to its perceivable quality. In this paper, the fast guided filter [16] is used for smoothing of edges; it reduces the time from O(n) to O(n/s^2) (n is the number of pixels) for a sub-sampling ratio s. Table I shows that the fast guided filter (s = 2) reduces the average execution time for smoothing of edges by 67%. In both experiments the value of the delta parameter of MSER (which controls how stability is calculated) is kept at 10.

TABLE I: Execution time for smoothing of edges.
Filter               Avg. Execution Time (in seconds)
Guided filter        0.56
Fast Guided filter   0.182

3) Effect of Proposed ESMSER: As mentioned in Section III-A1, the original MSER algorithm is unable to deal with blurriness present in the images, which creates a hurdle in detecting text properly in natural scene images. Therefore, in this paper we prefer to use Edge Smoothing MSER (ESMSER) to reduce the effect of blurriness in such images for efficient scene text detection. In Figure 7, the first and second rows display the prospective candidate regions detected by the original MSER and the proposed ESMSER (Algorithm 1) respectively. It is evident from the results shown in Fig. 7 that characters are properly separated by the proposed ESMSER (second row) as compared to MSER (first row), in which characters are interconnected. Thus, the proposed ESMSER helps in detecting text in blurred images.
4) Text Detection and Localization Results: The proposed method has been compared with a few methods like [21], [9], and some methods from the ICDAR 2013 [14] competition for scene text detection and localization on the ICDAR 2013 dataset. It is evident from Table II that the proposed method attains a precision of 82%, a recall of 64% and an f-measure of 72%. Figure 8 displays a few outputs of our method as applied on the ICDAR 2013 test dataset in the form


of detected text bounded by blue rectangles. The proposed method is able to detect text of various font sizes, distinct fonts, colors and orientations. It works effectively as compared to other state-of-the-art work under occlusion, jumbled scenes and divergent lighting conditions, but it needs improvement in a few cases (see Fig. 9), such as text whose color mixes with the background and text in uncommon fonts.

Fig. 7: Detection by original MSER (first row). Detection by ESMSER (second row).

TABLE II: Outcome on ICDAR 2013 dataset.
Method                Year   P      R      F
TCS (Proposed)        2017   0.82   0.64   0.72
Wang et al. [21]      2015   0.80   0.73   0.76
Wang et al. [9]       2015   0.77   0.60   0.68
Text Detection [14]   2013   0.74   0.53   0.62
TH-TextLoc [14]       2013   0.69   0.65   0.67
I2R-NUS-FAR [14]      2013   0.75   0.69   0.71
CASIA-NLPR [14]       2013   0.78   0.68   0.73

Fig. 8: Sample Images from ICDAR 2013 dataset.

Fig. 9: Failure Cases

V. CONCLUSION
The paper presents a hybrid method for text detection and localization based on the text confidence score, calculated by combining three complementary attributes blended in the Bayesian framework to discern between text and non-text areas. On the ICDAR 2013 dataset our method achieves good precision and recall for scene text detection and localization. The fast guided filter reduces the processing time for edge smoothing. In future work, we intend to increase the recall rate by improving MSER.

REFERENCES
[1] H. Zhang, K. Zhao, Y.-Z. Song, and J. Guo, “Text extraction from natural scene image: A survey,” Neurocomputing, vol. 122, pp. 310–323, 2013.
[2] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.
[3] S. Lee, M. S. Cho, K. Jung, and J. H. Kim, “Scene text extraction with edge constraint and text collinearity,” in Pattern Recognition (ICPR), 2010 20th International Conference on. IEEE, 2010, pp. 3983–3986.
[4] B. Epshtein, E. Ofek, and Y. Wexler, “Detecting text in natural scenes with stroke width transform,” in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010, pp. 2963–2970.
[5] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust wide baseline stereo from maximally stable extremal regions,” in British Machine Vision Conference. Citeseer, 2002.
[6] Y. Li and H. Lu, “Scene text detection via stroke width,” in Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012, pp. 681–684.
[7] C. Yi and Y. Tian, “Localizing text in scene images by boundary clustering, stroke segmentation, and string fragment classification,” IEEE Transactions on Image Processing, vol. 21, no. 9, pp. 4256–4268, 2012.
[8] Y. Li, W. Jia, C. Shen, and A. van den Hengel, “Characterness: An indicator of text in the wild,” IEEE Transactions on Image Processing, vol. 23, no. 4, pp. 1666–1677, 2014.
[9] R. Wang, N. Sang, and C. Gao, “Text detection approach based on confidence map and context information,” Neurocomputing, vol. 157, pp. 153–165, 2015.
[10] J. Fabrizio, M. Robert-Seidowsky, S. Dubuisson, S. Calarasanu, and R. Boissel, “Textcatcher: a method to detect curved and challenging text in natural scenes,” International Journal on Document Analysis and Recognition (IJDAR), vol. 19, no. 2, pp. 99–117, 2016.
[11] L. Gomez and D. Karatzas, “A fast hierarchical method for multi-script and arbitrary oriented scene text extraction,” International Journal on Document Analysis and Recognition (IJDAR), vol. 19, no. 4, pp. 335–349, 2016.
[12] H. Chen, S. S. Tsai, G. Schroth, D. M. Chen, R. Grzeszczuk, and B. Girod, “Robust text detection in natural images with edge-enhanced maximally stable extremal regions,” in Image Processing (ICIP), 2011 18th IEEE International Conference on. IEEE, 2011, pp. 2609–2612.
[13] C. Yao, X. Bai, W. Liu, Y. Ma, and Z. Tu, “Detecting texts of arbitrary orientations in natural images,” in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012, pp. 1083–1090.
[14] D. Karatzas, F. Shafait, S. Uchida, M. Iwamura, L. G. i Bigorda, S. R. Mestre, J. Mas, D. F. Mota, J. A. Almazan, and L. P. de las Heras, “ICDAR 2013 robust reading competition,” in Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013, pp. 1484–1493.
[15] K. He, J. Sun, and X. Tang, “Guided image filtering,” in European Conference on Computer Vision. Springer, 2010, pp. 1–14.
[16] K. He and J. Sun, “Fast guided filter,” arXiv preprint arXiv:1505.00996, 2015.
[17] A. Majtey, P. Lamberti, and D. Prato, “Jensen-Shannon divergence as a measure of distinguishability between mixed quantum states,” Physical Review A, vol. 72, no. 5, p. 052310, 2005.
[18] D. A. Klein and S. Frintrop, “Center-surround divergence of feature statistics for salient object detection,” in Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011, pp. 2214–2219.
[19] A. Gonzalez, L. M. Bergasa, J. J. Yebes, and S. Bronte, “Text location in complex images,” in Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012, pp. 617–620.
[20] C. Wolf and J.-M. Jolion, “Object count/area graphs for the evaluation of object detection and segmentation algorithms,” International Journal of Document Analysis and Recognition (IJDAR), vol. 8, no. 4, pp. 280–296, 2006.
[21] Q. Wang, Y. Lu, and S. Sun, “Text detection in nature scene images using two-stage nontext filtering,” in Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015, pp. 106–110.

6 pages
Rainarli 2020 IOP Conf. Ser. Mater. Sci. Eng. 879 012106
No ratings yet
Rainarli 2020 IOP Conf. Ser. Mater. Sci. Eng. 879 012106
9 pages
Kami Export - 1904.01941
No ratings yet
Kami Export - 1904.01941
5 pages
GoK2014_4
No ratings yet
GoK2014_4
6 pages
Text Detection and Recognition Using Enhanced MSER Detection and A Novel OCR Technique
No ratings yet
Text Detection and Recognition Using Enhanced MSER Detection and A Novel OCR Technique
7 pages
Tang_Few_Could_Be_Better_Than_All_Feature_Sampling_and_Grouping_CVPR_2022_paper
No ratings yet
Tang_Few_Could_Be_Better_Than_All_Feature_Sampling_and_Grouping_CVPR_2022_paper
10 pages
Text Detection
No ratings yet
Text Detection
17 pages
Deep Learning Approaches To Scene Text Detection A
No ratings yet
Deep Learning Approaches To Scene Text Detection A
61 pages
Research PaPer EAST
No ratings yet
Research PaPer EAST
10 pages
Real-Time Scene Text Detection Based On Global Level and Word Level Features
No ratings yet
Real-Time Scene Text Detection Based On Global Level and Word Level Features
12 pages
Stroke Width Transform
No ratings yet
Stroke Width Transform
8 pages
Project Report On 2factor Authentication
No ratings yet
Project Report On 2factor Authentication
91 pages
Cheng BorderNet An Efficient Border-Attention Text Detector ACCV 2022 Paper
No ratings yet
Cheng BorderNet An Efficient Border-Attention Text Detector ACCV 2022 Paper
17 pages
Detection of Text from Lecture Video Images
No ratings yet
Detection of Text from Lecture Video Images
5 pages
Cohesive Multi-Oriented Text Detection and Recognition Structure in Natural Scene Images Regions Has Exposed
No ratings yet
Cohesive Multi-Oriented Text Detection and Recognition Structure in Natural Scene Images Regions Has Exposed
15 pages
CRNN Model For Text Detection and Classification From Natural Scenes
No ratings yet
CRNN Model For Text Detection and Classification From Natural Scenes
11 pages
Image Segmentation: Unlocking Insights through Pixel Precision
From Everand
Image Segmentation: Unlocking Insights through Pixel Precision
Fouad Sabry
No ratings yet
Enhanced Scene Text Recognition Using Deep Learning Based Hybrid Attention Recognition Network
No ratings yet
Enhanced Scene Text Recognition Using Deep Learning Based Hybrid Attention Recognition Network
12 pages
Kang Orientation Robust Text 2014 CVPR Paper
No ratings yet
Kang Orientation Robust Text 2014 CVPR Paper
8 pages
Report
No ratings yet
Report
39 pages
Extraction Text From Camera Images
No ratings yet
Extraction Text From Camera Images
14 pages
CMRT09 Fabrizio Et Al
No ratings yet
CMRT09 Fabrizio Et Al
6 pages
3586a929
No ratings yet
3586a929
6 pages
Yerrijdnewpaper
No ratings yet
Yerrijdnewpaper
5 pages
Miriam Leon, Veronica Vilaplana, Antoni Gasull, Ferran Marques (Veronica - Vilaplana, Antoni - Gasull, Ferran - Marques) @upc - Edu
No ratings yet
Miriam Leon, Veronica Vilaplana, Antoni Gasull, Ferran Marques (Veronica - Vilaplana, Antoni - Gasull, Ferran - Marques) @upc - Edu
4 pages
Haramaya University Computer Science Student
No ratings yet
Haramaya University Computer Science Student
15 pages
10.36222-ejt.1407231-3609588
No ratings yet
10.36222-ejt.1407231-3609588
7 pages
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
No ratings yet
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
8 pages
Deep Scene Text Detection With Connected Component Proposals
No ratings yet
Deep Scene Text Detection With Connected Component Proposals
10 pages
IJERT Segmentation and Detection of Text
No ratings yet
IJERT Segmentation and Detection of Text
6 pages
Robustdetection of Text in Natural Scene Images
No ratings yet
Robustdetection of Text in Natural Scene Images
4 pages
Top-Down and Bottom-Up Cues For Scene Text Recognition: Anand Mishra Karteek Alahari C. V. Jawahar
No ratings yet
Top-Down and Bottom-Up Cues For Scene Text Recognition: Anand Mishra Karteek Alahari C. V. Jawahar
8 pages
turki2016-AICCSA
No ratings yet
turki2016-AICCSA
6 pages
Char RCG TH
No ratings yet
Char RCG TH
11 pages
Text Extraction and Localization From Captured Images: Taufin M Jeeralbhavi Dr. Jagadeesh D. Pujari Shivananda V. Seeri
No ratings yet
Text Extraction and Localization From Captured Images: Taufin M Jeeralbhavi Dr. Jagadeesh D. Pujari Shivananda V. Seeri
3 pages
1st Review
100% (1)
1st Review
14 pages
TraffSign-Multilingual-Traffic-Signboard-Text-Detection-and-Recognition-for-Urdu-and-English
No ratings yet
TraffSign-Multilingual-Traffic-Signboard-Text-Detection-and-Recognition-for-Urdu-and-English
15 pages
Scene Text Detection and Recognition USING DL PDF
No ratings yet
Scene Text Detection and Recognition USING DL PDF
20 pages
Jaderberg 16
No ratings yet
Jaderberg 16
20 pages
2021 PEREPU - Artificial Intelligence - Deep Learning For Detection of Text Polarity in Natural Scene Images
No ratings yet
2021 PEREPU - Artificial Intelligence - Deep Learning For Detection of Text Polarity in Natural Scene Images
6 pages
Deep Matching Prior Network: Toward Tighter Multi-Oriented Text Detection
No ratings yet
Deep Matching Prior Network: Toward Tighter Multi-Oriented Text Detection
8 pages
8e58227702cd9aaf
No ratings yet
8e58227702cd9aaf
15 pages
Document Mosaicing: Unlocking Visual Insights through Document Mosaicing
From Everand
Document Mosaicing: Unlocking Visual Insights through Document Mosaicing
Fouad Sabry
No ratings yet
20220715114528cbse Circular 1
No ratings yet
20220715114528cbse Circular 1
2 pages
Assignment 1 - DC
No ratings yet
Assignment 1 - DC
1 page
TDM
No ratings yet
TDM
10 pages
Analog Vs Digital
No ratings yet
Analog Vs Digital
2 pages
Digital Communication Advantages-Disadvantages
No ratings yet
Digital Communication Advantages-Disadvantages
4 pages
Acti9 Disbo - DBXROW2FDS
No ratings yet
Acti9 Disbo - DBXROW2FDS
2 pages
Exercise 4 - Режим Прохождение - Online Practice (Workbook) Unit 4 Home Sweet Home 4.4 Reading - Form 7-8 a - MyEnglishLab
No ratings yet
Exercise 4 - Режим Прохождение - Online Practice (Workbook) Unit 4 Home Sweet Home 4.4 Reading - Form 7-8 a - MyEnglishLab
1 page
Powell The Fighting Dragon How To
100% (5)
Powell The Fighting Dragon How To
147 pages
Pipeline Pre Trenching Pre Qua - Rev A 27june22 - Final
No ratings yet
Pipeline Pre Trenching Pre Qua - Rev A 27june22 - Final
57 pages
Elexp: Instruction Manual
No ratings yet
Elexp: Instruction Manual
11 pages
1st Grand Test Correct Answer Key1
No ratings yet
1st Grand Test Correct Answer Key1
3 pages
(China Academic Library) Huaiqi Wu (Auth.) - An Historical Sketch of Chinese Historiography-Springer-Verlag Berlin Heidelberg (2018)
No ratings yet
(China Academic Library) Huaiqi Wu (Auth.) - An Historical Sketch of Chinese Historiography-Springer-Verlag Berlin Heidelberg (2018)
504 pages
96 Corporate Safety - WHS Fatigue Management Procedure
No ratings yet
96 Corporate Safety - WHS Fatigue Management Procedure
19 pages
7JUMANGPAS SET UP As Stake-6 GDS-HVW
No ratings yet
7JUMANGPAS SET UP As Stake-6 GDS-HVW
1 page
ICRC
No ratings yet
ICRC
2 pages
6 Timers
No ratings yet
6 Timers
36 pages
03MCL013
No ratings yet
03MCL013
147 pages
Controlling The: Jatco Re5R05A
100% (1)
Controlling The: Jatco Re5R05A
8 pages
Case Study On HEAD INJURY
No ratings yet
Case Study On HEAD INJURY
11 pages
Unit 8 Packet Key
No ratings yet
Unit 8 Packet Key
21 pages
Infographics Global Gateway Flagship Projects 2023 2024 Eu Africa - en
No ratings yet
Infographics Global Gateway Flagship Projects 2023 2024 Eu Africa - en
6 pages
414 Tutorials
No ratings yet
414 Tutorials
2 pages
Nail The Mix EQ Guide
No ratings yet
Nail The Mix EQ Guide
12 pages
Development of Educational Aim & Objectives
100% (1)
Development of Educational Aim & Objectives
26 pages
16 S 10 Standard Earthing Philosophy of GETCO
No ratings yet
16 S 10 Standard Earthing Philosophy of GETCO
4 pages
Cremophor EL
100% (1)
Cremophor EL
8 pages
Cooking Korean Food With Maangchi - Books 1 and 2 (Revised 2nd Edition)
100% (8)
Cooking Korean Food With Maangchi - Books 1 and 2 (Revised 2nd Edition)
117 pages
Free Space Optics
No ratings yet
Free Space Optics
7 pages
Bajaj Electricals LTD
No ratings yet
Bajaj Electricals LTD
9 pages
Race Academy General English Part B
No ratings yet
Race Academy General English Part B
56 pages
7th Math Unit 15 Lec2
No ratings yet
7th Math Unit 15 Lec2
9 pages
Lotus
No ratings yet
Lotus
2 pages
Rotary Kiln Zones
No ratings yet
Rotary Kiln Zones
1 page