Character Recognition


IEEE Sponsored 2nd International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS) 2015

Text Recognition from Images


Mr. Pratik Madhukar Manwatkar Mr. Shashank H. Yadav
Department of Computer Technology, Department of Computer Technology,
YCCE, Nagpur (M.S.), 441 110, India. YCCE, Nagpur (M.S.), 441 110, India.
[email protected] [email protected]

Abstract: Text recognition in images is a research area that attempts to develop a computer system with the ability to automatically read text from images. These days there is a huge demand for storing the information available in paper documents on a computer storage disk and later reusing this information through searching. One simple way to store information from paper documents in a computer system is to first scan the documents and then store them as images. However, to reuse this information, it is very difficult to read the individual contents and to search the contents of these documents line by line and word by word. The challenges involved are the font characteristics of the characters in paper documents and the quality of the images. Due to these challenges, the computer is unable to recognize the characters while reading them. Thus there is a need for character recognition mechanisms to perform Document Image Analysis (DIA), which transforms documents in paper format to electronic format. In this paper we discuss a method for text recognition from images. The objective of this paper is the recognition of text from images, for better understanding by the reader, using a particular sequence of different processing modules.

Keywords: Document Image Analysis (DIA), electronic format, text recognition, font characteristics.

I. INTRODUCTION

Nowadays, there is a growing demand for software systems that recognize characters when information is scanned from paper documents, since a large number of newspapers and books on different subjects exist in printed format. These days there is a huge demand for “storing the information available in these paper documents on a computer storage disk and then later reusing this information by a searching process”. One simple way to store the information in these paper documents in a computer system is to first scan the documents. Whenever we scan documents through a scanner, they are stored as images in the computer system. These images containing text cannot be edited by the user. Moreover, to reuse this information, it is very difficult for the computer system to read the individual contents and to search the contents of these documents line by line and word by word. The reason for this difficulty is that the font characteristics of the characters in paper documents are different from the fonts of the characters in the computer system. As a result, the computer is unable to recognize the characters while reading them. This concept of storing the contents of paper documents in computer storage and then reading and searching the content is called document processing. Sometimes in document processing we need to process information in languages other than English. This process is also called Document Image Analysis (DIA). Thus our need is to develop a text recognition algorithm to perform Document Image Analysis, which transforms documents in paper format to electronic format.

The paper is organized as follows: Section 2 discusses the related work done in the field of image-to-text recognition. Section 3 gives an overview of the text recognition system. Section 4 discusses the experimental results of this system. Section 5 discusses the applications of text recognition, and Section 6 gives the conclusion.

II. LITERATURE REVIEW

As discussed earlier, text recognition from images is still an active research area in the field of pattern recognition. To address the issues related to text recognition, many researchers have proposed different technologies, and each approach or technology tries to address the issues in a different way. In this section we present a detailed survey of approaches proposed to handle the issues related to text recognition.

Yang et al. [1] proposed a novel adaptive binarization method based on a wavelet filter. This approach processes faster, so it is more suitable for real-time processing and applicable to mobile devices. They evaluated this adaptive method on complex scene images of the ICDAR 2005 database. Sankaran et al. [2] proposed a novel recognition approach that results in a 15% decrease in word error rate on heavily degraded Indian language document images.

Gur et al. [3] discussed some problems in text recognition and retrieval. Automated optical character recognition (OCR) tools do not supply a complete solution, and in most cases human inspection is required. They suggest a novel text recognition algorithm based on usage of fuzzy logic

978-1-4799-6818-3/15/$31.00 © 2015 IEEE



rules relying on statistical data of the analyzed font. The new approach combines letter statistics and correlation coefficients in a set of fuzzy-based rules, enabling the recognition of distorted letters that may not be retrieved otherwise. They focused on Rashi fonts associated with commentaries of the Bible that are actually handwritten calligraphy.

Rhead et al. [4] considered real-world UK number plates and related these to ANPR. Their work considers aspects of the relevant legislation and standards when applying them to real-world number plates. The varied manufacturing techniques and varied specifications of component parts are also noted. The varied fixing methodologies and fixing locations are discussed, as well as their impact on image capture.

Badawy, W. et al. [5] discussed automatic license plate recognition (ALPR), which is the extraction of vehicle license plate information from an image or a sequence of images. The extracted information can be used with or without a database in many applications, such as electronic payment systems (toll payment, parking fee payment) and freeway and arterial monitoring systems for traffic surveillance. ALPR uses either a color, black and white, or infrared camera to take images.

Jawahar et al. [6] proposed a recognition scheme for the Indian script of Devanagari. Their approach does not require word-to-character segmentation, which is one of the most common reasons for a high word error rate. They reported a reduction of more than 20% in word error rate and of over 9% in character error rate compared with the best available OCR system.

Ntirogiannis et al. [7] observed that document image binarization is of great importance in the document image analysis and recognition pipeline, since it affects further stages of the recognition process. The evaluation of a binarization method aids in studying its algorithmic behavior, as well as verifying its effectiveness, by providing qualitative and quantitative indication of its performance. They proposed a pixel-based binarization evaluation methodology for historical handwritten/machine-printed document images.

Malakar et al. [8] described the extraction of text lines from document images as one of the important steps in the process of an Optical Character Recognition (OCR) system. In the case of handwritten document images, the presence of skewed, touching or overlapping text lines makes this process a real challenge to the researcher.

III. TEXT RECOGNITION SYSTEM RELATED WORK

In this section we describe the overall architecture of the text recognition system. A text recognition system receives an input in the form of an image which contains some text information. The output of this system is in electronic format, i.e. the text information in the image is stored in computer-readable form. Our text recognition system is divided into the following modules:

A. Pre-processing Module
B. System Training Module
C. Text Recognition Module
D. Post-processing Module

The overall architecture is depicted in figure 1.

Fig.1: Architecture of text recognition.

A. Pre-processing Module

The paper document is generally scanned by an optical scanner and converted into the form of a picture. A picture is a combination of picture elements, which are also known as pixels. Each pixel basically takes one of two values, ON or OFF. The ON value indicates that the pixel is visible and the OFF value indicates that the pixel is not visible. At this stage we have the data in the form of an image, and this image can be further analyzed so that the important information can be retrieved. To improve the quality of the input image and make it suitable for further analysis, we perform operations on it such as grayscale conversion, binary image conversion and, most important, segmentation. The operations performed on the scanned image are as follows:

1) Pre-processing of images

This performs activities such as scanning documents and storing them as images. The module supports the following services:
c readable services:-

a. Scanning printed documents and storing the documents as snapshots or images.

b. Processing those image-based documents, i.e. converting them into a proper format (also called structured documents) such as greyscale and binary format.

2) Segmentation

Segmentation is the most important process in text recognition, and the performance of this project depends on it. Segmentation subdivides an image into its constituent regions or objects; here it is done to separate the individual characters of an image. In segmentation we try to extract the basic constituents of the script, which are the characters. This is needed because our classifier recognizes these characters only. In this project, we perform the segmentation of characters from the image by applying the line detection and character detection algorithms, which are discussed as follows:

Algorithm: Line detection from image

Step 1: Start scanning the image horizontally from the topmost left corner, row by row.
Step 2: If any black pixel is encountered in a row, mark the row status as '0'.
Step 3: If no black pixel is encountered in a row while tracing it, mark the row status as '1'.
Step 4: By counting and following the runs of continuous '0' in the row status vector, the number and position of lines can be obtained.

Algorithm: Character detection from the line

Step 1: Take a single line under consideration.
Step 2: Start scanning the line image vertically from the topmost left corner, column by column.
Step 3: If any black pixel is encountered in a column, mark the column status as '0'.
Step 4: If no black pixel is encountered in a column while tracing it, mark the column status as '1'.
Step 5: By counting and following the runs of continuous '0' in the column status vector, the number and position of characters can be obtained.

B. System Training Module

This module can be used to train the system for text recognition. Before converting the printed documents into editable and searchable documents, the first and mandatory step is providing training to the system. Here, training means that the font followed in the scanned document should be identified by the user. Then the user types all the characters that are required for recognition from the scanned document and saves them as an image file. This image file should be provided as an input during the training process.

C. Text Recognition Module

This module performs text recognition on the output image of the pre-processing module and gives output data in computer-understandable form. In this module the following techniques are used.

1) Feature Extraction

Feature extraction is the process of retrieving the most important data from the raw data. The most important data are those on the basis of which the characters can be represented accurately. To store the different features of a character, different classes are made. There are many techniques used for feature extraction, such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), Chain Code (CC), zoning, gradient-based features, histograms, etc.

Here we use the matrix feature extraction method. In this method we first convert the image to a binary matrix, i.e. the black and white image is converted to matrix form, as shown in figure 2, where the text image is converted into a matrix of 0's and 1's. From this matrix data we extract the text characters line by line and word by word by using the above segmentation method. After that, the segmented character data are normalized and stored in a fixed dimension as a feature of that character, as shown in figure 3.

2) Classification

Classification is the process of identifying each character and assigning it to the correct character class, so that text in images is converted into computer-understandable form. This process uses the extracted features of the text image for classification, i.e. the input to this stage is the output of the feature extraction process. Classifiers compare the input feature with stored patterns and find the best matching class for the input. There are many techniques used for classification, such as Artificial Neural Network (ANN), Template Matching, Support Vector Machine (SVM), etc.

Here we use an Artificial Neural Network (ANN) for classification, because a neural network can train itself automatically from examples and is an efficient tool for learning from large databases. This approach is non-algorithmic and trainable. Of the different types of neural networks that can be used for classification, we used the Kohonen neural network.
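The line detection and character detection algorithms of the Pre-processing Module (Section III-A) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the binary image is a list of rows in which 1 marks a black pixel and 0 a white pixel, and the function names (status_vector, runs_of_zero, detect_lines, detect_characters) are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black (text) pixel, 0 = white


def status_vector(rows: Image) -> List[int]:
    """Status per row as in the paper: '0' if a black pixel occurs, else '1'."""
    return [0 if any(row) else 1 for row in rows]


def runs_of_zero(status: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each continuous run of 0s."""
    runs = []
    start = None
    for i, s in enumerate(status):
        if s == 0 and start is None:
            start = i                    # a run of text rows/columns begins
        elif s == 1 and start is not None:
            runs.append((start, i))      # the run ended just before index i
            start = None
    if start is not None:                # run reaches the image border
        runs.append((start, len(status)))
    return runs


def detect_lines(image: Image) -> List[Tuple[int, int]]:
    """Line detection: scan row by row and group consecutive text rows."""
    return runs_of_zero(status_vector(image))


def detect_characters(image: Image, line: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Character detection: scan one line column by column."""
    top, bottom = line
    columns = [list(col) for col in zip(*image[top:bottom])]  # transpose the line
    return runs_of_zero(status_vector(columns))


# Tiny made-up page: two text lines, the first holding two characters.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
lines = detect_lines(page)
chars = [detect_characters(page, ln) for ln in lines]
print(lines)  # [(1, 3), (4, 5)]
print(chars)  # [[(1, 3), (4, 6)], [(2, 5)]]
```

Each returned pair gives the first row (or column) of a line (or character) and the index just past its last one, so the corresponding sub-image can be cut out with ordinary slicing before feature extraction.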
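To make the Kohonen-based classification step concrete, the following minimal sketch shows winner-take-all behaviour: the input pattern is normalized, every output neuron computes a response, and only the index of the strongest neuron is reported. The unit-length normalization, the dot-product response and the weight values are illustrative assumptions, not details given in this paper, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so every component lies in [-1, 1]."""
    length = math.sqrt(sum(v * v for v in vector))
    if length == 0.0:
        return [0.0 for _ in vector]
    return [v / length for v in vector]


def winning_neuron(inputs: Sequence[float], weights: Sequence[Sequence[float]]) -> int:
    """Return the index of the output neuron that responds most strongly."""
    x = normalize(inputs)
    # One response per output neuron: dot product of its weights with the input.
    responses = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return max(range(len(responses)), key=responses.__getitem__)


# Made-up weights for three output neurons over a four-value feature vector.
weights = [
    normalize([1, 1, 0, 0]),
    normalize([0, 0, 1, 1]),
    normalize([1, 0, 0, 1]),
]
winner = winning_neuron([0.9, 0.8, 0.1, 0.0], weights)
print(winner)  # 0
```

In a character recognizer the feature vector would be the fixed-dimension matrix of a segmented character flattened into a list, and each output neuron index would be mapped to a character class during training.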

Kohonen Neural Network:

The Kohonen neural network works differently from the feed forward neural network. The Kohonen neural network contains only an input layer and an output layer of neurons. There is no hidden layer in a Kohonen neural network. First we will examine the input and output of a Kohonen neural network.

The input to a Kohonen neural network is given to the network using the input neurons. These input neurons are each given the floating point numbers that make up the input pattern to the network. A Kohonen neural network requires that these inputs be normalized to the range between -1 and 1. Presenting an input pattern to the network will cause a reaction from the output neurons.

The output of a Kohonen neural network is very different from the output of a feed forward neural network. If we had a feed forward neural network with five output neurons, we would be given an output that consisted of five values. This is not the case with the Kohonen neural network. In a Kohonen neural network only one of the output neurons actually produces a value. Additionally, this single value is either true or false. When the pattern is presented to the Kohonen neural network, one single output neuron is chosen as the output neuron. Therefore, the output from the Kohonen neural network is usually the index of the neuron that fired (e.g. Neuron #5). The structure of a typical Kohonen neural network is shown in Figure 2.

Fig.2: The structure of a typical Kohonen neural network.

D. Post-processing Module

The output of the Text Recognition Module is text data in a form that is understood by the computer, so it needs to be stored in some proper format (e.g. txt or MS-Word) for further use, such as editing or searching in that data.

IV. EXPERIMENTAL RESULT

The paper document is generally scanned by an optical scanner and converted into the form of a picture. A picture is a combination of picture elements, which are also known as pixels. At this stage we have the data in the form of an image, and this image can be further analyzed so that the important information can be retrieved. We apply the text recognition method discussed in this paper, and the output results are shown in the following images.

Fig.3: Output of preprocessing.

Fig.4: Output of Line segmentation.

Fig.5: Output of Character segmentation.

Fig.6: Output of Text Recognition System.

V. APPLICATION

Text recognition technology may be applied throughout the entire spectrum of industries, revolutionizing the document management process. This technology enables scanned documents to become more than just image files, turning them into fully searchable documents with text content that is recognized by computers. With the help of this technology, people no longer need to manually retype important documents when entering them into electronic databases. Instead, the text recognition system extracts relevant information and enters it automatically. The result is accurate, efficient information processing in less time. In the following, we give an overview of some applications of text recognition systems.

A. Banking[18]

The uses of image text recognition vary across different fields. One widely known application is in banking, where it is used to process checks without human involvement. A check can be inserted into a machine, the writing on it is scanned instantly, and the correct amount of money is transferred. This technology has nearly been perfected for printed checks, and is fairly accurate for handwritten checks as well, though it occasionally requires manual confirmation. Overall, this reduces wait times in many banks.

B. Legal[18]

In the legal industry, there has also been a significant movement to digitize paper documents. In order to save space and eliminate the need to sift through boxes of paper files, documents are being scanned and entered into computer databases. Image text recognition further simplifies the process by making documents text-searchable, so that they are easier to locate and work with once in the database. Legal professionals now have fast, easy access to a huge library of documents in electronic format, which they can find simply by typing in a few keywords.

C. Healthcare[18]

Healthcare also makes use of image text recognition technology to process paperwork. Healthcare professionals always have to deal with large volumes of forms for each patient, including insurance forms as well as general health forms. To keep up with all of this information, it is useful to input relevant data into an electronic database that can be accessed as necessary. By using image recognition technology they are able to extract information from forms and put it into databases, so that every patient's data is promptly recorded. As a result, healthcare providers can focus on delivering the best possible service to every patient.

D. Image text recognition in Other Industries[18]

Image text recognition technology is widely used in many other fields, including education, finance, and government agencies. This technology has made countless texts available online, saving money for students and allowing knowledge to be shared. Invoice imaging applications are used in many businesses to keep track of financial records and prevent a backlog of payments from piling up. In government agencies and independent organizations, image text recognition technology simplifies data collection and analysis, among other processes.

As the technology continues to develop, more and more applications are found for it, including increased use of handwriting recognition.

VI. CONCLUSION

In this paper, we proposed and discussed a method for text recognition. OCR is a wide area for researchers in pattern recognition. A lot of research work has been done and is still being done in OCR for various languages. More and more researchers are attracted to this challenging field. Each stage of optical character recognition has its own significance and should be designed properly for better results.

REFERENCE

[1] Yang, Jufeng, Kai Wang, Jiaofeng Li, Jiao Jiao, and Jing Xu, “A fast adaptive binarization method for complex scene images,” 19th IEEE International Conference on Image Processing (ICIP), 2012.
[2] Shrey Dutta, Naveen Sankaran, Pramod Sankar K., C.V. Jawahar, “Robust Recognition of Degraded Documents Using Character N-Grams,” IEEE, 2012.
[3] Gur, Eran, and Zeev Zelavsky, “Retrieval of Rashi Semi-Cursive Handwriting via Fuzzy Logic,” IEEE International Conference on Frontiers in Handwriting Recognition (ICFHR), 2012.

[4] Rhead, Mike, “Accuracy of automatic number plate recognition (ANPR) and real world UK number plate problems,” IEEE International Carnahan Conference on Security Technology (ICCST), 2012.
[5] Badawy, W., “Automatic License Plate Recognition (ALPR): A State of the Art Review,” IEEE International Conference on Document Analysis and Recognition, 2012.
[6] Naveen Sankaran and C.V. Jawahar, “Recognition of Printed Devanagari Text Using BLSTM Neural Network,” IEEE, 2012.
[7] Ntirogiannis, Konstantinos, Basilis Gatos, and Ioannis Pratikakis, “A Performance Evaluation Methodology for Historical Document Image Binarization,” IEEE International Conference on Document Analysis and Recognition, 2013.
[8] Malakar, Samir, et al., “Text line extraction from handwritten document pages using spiral run length smearing algorithm,” IEEE International Conference on Communications, Devices and Intelligent Systems (CODIS), 2012.
[9] Application of OCR, from http://www.cvisiontech.com.
