
HANDWRITTEN DIGIT RECOGNITION USING

NEURAL NETWORKS WITH TENSORFLOW

The Internship Report submitted to the SRM Arts and Science College in partial fulfilment of

the requirements for the award of the Degree of Master of Computer Science

Submitted by

THILAK R

(832000320)

DEPARTMENT OF COMPUTER SCIENCE

SRM ARTS AND SCIENCE COLLEGE

NOVEMBER 2021

BONAFIDE CERTIFICATE

This is to certify that the project report entitled


"“INTERNSHIP TITLE"
being submitted to the University of Madras, Chennai - 600 005
by

THILAK R
(Register Number: 832000320)

in partial fulfillment of the requirements for the award of the degree of

MASTER OF COMPUTER SCIENCE

is a bonafide record of the work carried out by him,
under my guidance and supervision.

Date: Head of the Department

Submitted for the Viva-Voce Examination held on


…………………………..……….. at SRM Arts and Science College,
Kattankulathur – 603 203.

TABLE OF CONTENTS

1. Introduction
2. Company Profile
3. Handwritten Digit Recognition Using Neural Networks with TensorFlow
   I. Objectives and Methods
   II. Literature Survey
   III. Problem Description
   IV. System Requirements
   V. System Modeling and Design
   VI. Implementation
   VII. Testing
4. Work Done / Work Schedule
5. Approval of Supervisor
6. Conclusions and Observations
7. References

1. INTRODUCTION
The purpose of this project is to take handwritten English characters as input, process them, train a neural network algorithm to recognize the pattern, and render the character as a beautified version of the input.
This project is aimed at developing software that helps recognize characters of the English language. The project is restricted to English characters only; it can be further developed to recognize the characters of other languages. It builds on the concept of a neural network.
One of the primary means by which computers are endowed with humanlike abilities is the use of a neural network. Neural networks are particularly useful for solving problems that cannot be expressed as a series of steps, such as recognizing patterns, classifying them into groups, series prediction and data mining.
Pattern recognition is perhaps the most common use of neural networks. The neural network is presented with a target vector and a vector that contains the pattern information, such as an image of handwritten data. The network then attempts to determine whether the input data matches a pattern it has memorized.
A neural network trained for classification is designed to take input samples and classify them into groups. These groups may be fuzzy, without clearly defined boundaries. This project concerns detecting freely handwritten characters.

2. COMPANY PROFILE

The company known today as Teva started as a small business in Jerusalem in 1901. The young company, named after its pharmacist founders, was Salomon, Levin and Elstein Ltd., and it distributed imported medicines throughout the region using mule trains and camel caravans.

Over the following decades, the company's growth was spurred as demand for locally produced medicines grew. In 1976, the company became Teva ("טבע", the Hebrew word for "nature") Pharmaceutical Industries Ltd.

Teva grew significantly across the globe through numerous acquisitions, which integrated and enhanced its expertise in innovative and generic medicines, as well as new therapeutic areas and markets. Today, Teva is among the top 15 global pharmaceutical companies, a world leader in generic and specialty medicines.

As the global leader in generic medicine, nearly 200 million people across six continents take one of Teva's products every day. The company also invests hundreds of millions of dollars every year to help its scientists develop specialty and biopharmaceutical treatments that aim to increase access and improve patients' health. Since its establishment in 1901 in Jerusalem, Teva's leadership in healthcare has been marked by tenacity, entrepreneurial spirit, and an aspiration to improve people's lives; this defines how the company does business and motivates thousands of Teva employees all over the world, every single day. It specializes primarily in generic drugs, but other business interests include active pharmaceutical ingredients and, to a lesser extent, proprietary pharmaceuticals. Teva was the largest generic drug manufacturer until 2020, when it was surpassed by US-based Pfizer. Teva Pharmaceuticals has an overall rating of 3.6 out of 5, based on over 1,884 reviews left anonymously by employees; 67% of employees would recommend working at Teva to a friend, and 48% have a positive outlook for the business.

3. HANDWRITTEN DIGIT RECOGNITION USING NEURAL
NETWORKS WITH TENSORFLOW

3.1. OBJECTIVES AND METHODS


Objectives
• To provide an easy user interface to input the object image.
• User should be able to upload the image.
• System should be able to pre-process the given input to suppress the background.
• System should detect text regions present in the image.
• System should retrieve text present in the image and display them to the user.
Methods
The proposed method comprises four phases:
1. Pre-processing.
2. Segmentation.
3. Feature Extraction.
4. Classification and Recognition.

Figure 1.2.1: Process Flow

3.2. LITERATURE SURVEY
A few state-of-the-art approaches that use handwritten character recognition for text identification are summarized here:

A) Handwritten Character Recognition using Neural Network (Chirag I. Patel, Ripal Patel, Palak Patel)
The objective of this paper is to recognize the characters in a given scanned document and to study the effects of changing the models of the ANN. Today neural networks are mostly used for pattern recognition tasks. The paper describes the behaviour of different neural network models used in OCR, which is a widespread application of neural networks. Parameters such as the number of hidden layers, the size of the hidden layers and the number of epochs are considered. A multilayer feed-forward network with backpropagation is used. In pre-processing, basic algorithms are applied for segmentation of characters, normalization of characters and de-skewing. Different models of neural network are trained, and the test set is applied to each to find the accuracy of the respective network.

B) Handwritten Character Recognition Using Gradient Features (Ashutosh Aggarwal, Rajneesh Rani, Renu Dhir)
Feature extraction is an integral part of any recognition system. The aim of feature extraction is to describe the pattern by means of a minimum number of features that are effective in discriminating pattern classes. The gradient measures the magnitude and direction of the greatest change in intensity in a small neighbourhood of each pixel. (In what follows, "gradient" refers to both the gradient magnitude and direction.) Gradients are computed by means of the Sobel operator. In this paper an effort is made towards recognition of English characters, obtaining a recognition accuracy of 94%. Due to its logical simplicity, ease of use and high recognition rate, gradient features should be used for recognition purposes.

C) Neural based handwritten character recognition (Hanmandlu M., Murali Mohan K. R., Kumar H.)
This paper explores the existing ring based method (W. I. Reber, 1987), a new sector based method and the combination of these, termed the fusion method, for the recognition of handwritten English capital letters. The variability associated with the characters is accounted for by considering a fixed number of concentric rings in the ring based approach and a fixed number of sectors in the sector based approach. Structural features such as end points, junction points and the number of branches are used for the pre-classification of characters, while local features such as normalized vector lengths and angles derived from either the ring or the sector approach are used in training on the reference characters and in subsequent recognition of the test characters. The recognition rates obtained are encouraging.

D) A feature extraction technique based on character geometry for character recognition (Dinesh Dileep)
This paper describes a geometry based technique for feature extraction applicable to
segmentation-based word recognition systems. The proposed system extracts the geometric
features of the character contour. These features are based on the basic line types that form
the character skeleton. The system gives a feature vector as its output. The feature vectors
so generated from a training set were then used to train a pattern recognition engine based
on Neural Networks so that the system can be benchmarked.

E) A Review of Gradient-Based and Edge-Based Feature Extraction Methods for Object Detection (Sheng Wang)
In computer vision research, object detection based on image processing is the task of identifying a designated object in a static image or a sequence of video frames. Projects based on such research have been widely adapted to various industrial and social applications, in fields including, but not limited to, security surveillance, intelligent transportation systems, automated manufacturing, quality control and supply chain management. This paper reviews a few of the most popular computer vision methods based on image processing and pattern recognition. Those methods have been extensively studied in various research papers, and their significance to computer vision research has been proven by subsequent work. In general, the methods are categorized into gradient-based and edge-based feature extraction methods, depending on the low-level features they use. In this paper, the definitions of gradient and edge are extended: because an image can also be considered a grid of image patches, it is reasonable to incorporate the concept of granules into the gradient for the review.

3.3. PROBLEM DESCRIPTION


The purpose of this project is to take handwritten English characters as input, process them, train a neural network algorithm to recognize the pattern, and render the character as a beautified version of the input.
This project is aimed at developing software that helps recognize characters of the English language. The project is restricted to English characters and numerals, and it is also helpful in recognizing special characters. It can be further developed to recognize the characters of other languages. It builds on the concept of a neural network.
One of the primary means by which computers are endowed with humanlike abilities is the use of a neural network. Neural networks are particularly useful for solving problems that cannot be expressed as a series of steps, such as recognizing patterns, classifying them into groups, series prediction and data mining.
Pattern recognition is perhaps the most common use of neural networks. The neural network is presented with a target vector and a vector that contains the pattern information, such as an image of handwritten data. The network then attempts to determine whether the input data matches a pattern it has memorized.
A neural network trained for classification is designed to take input samples and classify them into groups. These groups may be fuzzy, without clearly defined boundaries. This project concerns detecting freely handwritten characters.

3.4. SYSTEM REQUIREMENTS
3.4.1 Hardware and Software Requirements
Software: Python 3.5, TensorFlow, Windows 7
Processor: Dual Core, Intel i3
RAM: 2 GB
Disk space: Varies depending on the size of the partition and installation of online help files; the Python installer will report the hard disk space requirement for your particular partition.
Graphics adapter: 8-bit graphics adapter and display (for 256 simultaneous colors)

Table 3.4.1.1: Minimum Requirements


Software: Python 3.5, TensorFlow
Processor: Intel i3
RAM: 2 GB
Disk space: 1 GB for Python only; 5 GB for a typical installation
Graphics adapter: A 32-bit or 64-bit OpenGL-capable graphics adapter is strongly recommended.

Table 3.4.1.2: Recommended Requirements

3.4.2 High Level Specifications
• Python 3.5
• Dual-core Intel i3, Windows 7 based personal computer
• 2 GB RAM recommended
• 8-bit graphics adapter and display (for 256 simultaneous colors); a 32-bit or 64-bit OpenGL-capable graphics adapter is strongly recommended.

3.4.3 Low Level Specifications


• Microsoft Windows supported graphics accelerator card, printer, and sound
card.
• Microsoft Word 13.0 (Office 365), Office 2013.
• TCP/IP is required on all platforms when using a license server.
3.4.4 Functional Requirements
• The system should process the input given by the user only if it is an image file (JPG, PNG, etc.).
• The system shall show an error message to the user when the input is not in the required format.
• The system should detect the characters present in the image.
• The system should retrieve the characters present in the image and display them to the user.
3.4.5 Non Functional Requirements
• Performance: Handwritten characters in the input image will be recognized with an accuracy of about 90% or more.
• Functionality: This software will deliver on the functional requirements mentioned in this document.
• Availability: The system will retrieve the handwritten text regions only if the image contains written text.
• Flexibility: It allows users to load images easily.
• Learnability: The software is very easy to use and reduces the learning effort.
• Reliability: This software will work reliably for low-resolution images, but not for graphical images.

3.5. SYSTEM MODELING AND DESIGN
Purpose
The purpose of this design document is to explore the logical view of the architecture design, the sequence diagram, the data flow diagram and the user interface design of the software for performing operations such as pre-processing, extracting features and displaying the text present in the images.
Scope
The scope of this design document is to achieve the features of the system, such as pre-processing the images, feature extraction, segmentation and displaying the text present in the image.

3.5.1 Block Diagram and Algorithm


The proposed methodology uses techniques to remove background noise and feature extraction to detect and classify the handwritten text.
The proposed method comprises four phases:
1. Pre-processing.
2. Segmentation.
3. Feature Extraction.
4. Classification and Recognition.
The block schematic diagram of the proposed model is given in Fig.3.5.1.1

Figure 3.5.1.1: Block diagram of proposed method

3.5.1.1 Pre-processing
Pre-processing is a series of operations performed on the scanned input image. It essentially enhances the image, rendering it suitable for segmentation. The role of pre-processing is to separate the interesting pattern from the background. Generally, noise filtering, smoothing and normalization are done in this step. Pre-processing also defines a compact representation of the pattern. A binarization process converts the gray scale image into a binary image, and dilation of the edges in the binarized image is done using the Sobel technique.
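A minimal sketch of this pre-processing chain is given below, using OpenCV (an assumption: the report does not name its image library; the file name is also a placeholder).

import cv2
import numpy as np

image = cv2.imread("input.png")                    # scanned input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)     # RGB -> gray scale
gray = cv2.GaussianBlur(gray, (3, 3), 0)           # noise filtering / smoothing

# Binarization: gray scale -> binary (Otsu chooses the threshold)
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Sobel edge magnitude, then dilation of the edges in the binarized image
sx = cv2.Sobel(binary, cv2.CV_64F, 1, 0, ksize=3)
sy = cv2.Sobel(binary, cv2.CV_64F, 0, 1, ksize=3)
edges = np.uint8(np.clip(np.hypot(sx, sy), 0, 255))
dilated = cv2.dilate(edges, np.ones((3, 3), np.uint8), iterations=1)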
3.5.1.2 Segmentation
In the segmentation stage, an image of a sequence of characters is decomposed into sub-images of individual characters. The pre-processed input image is segmented into isolated characters by assigning a number to each character using a labelling process. This labelling provides information about the number of characters in the image. Each individual character is then uniformly resized. Normalization: after extracting a character we need to normalize its size, since there are large variations in the sizes of the characters.
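A hedged sketch of this labelling and resizing step follows, using OpenCV's connected-component labelling (an assumption; the 32x32 output size and the noise threshold are placeholders, since the report does not state them).

import cv2

# binary: white characters (255) on a black background, as produced above
num, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)

characters = []
for i in range(1, num):                    # label 0 is the background
    x, y, w, h, area = stats[i]
    if area < 10:                          # ignore specks of noise
        continue
    char = binary[y:y + h, x:x + w]        # isolate one labelled character
    characters.append((x, cv2.resize(char, (32, 32))))  # uniform resize

characters.sort(key=lambda c: c[0])        # left-to-right reading order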

Original Image Normalized Image


Figure 3.5.1.2.1: Normalization of Image

Character Extraction Algorithm

1. Create a traverse list: a list of pixels which have already been traversed. This list is initially empty.
2. Scan each row pixel by pixel.
3. Whenever a black pixel is found, check whether the pixel is already in the traverse list; if it is, simply ignore it and move on, else apply the edge detection algorithm.
4. Add the list of pixels returned by the edge detection algorithm to the traverse list.
5. Repeat steps 2-4 for all rows.

Edge Detection Algorithm
The edge detection algorithm maintains a list called the traverse list: the list of pixels already traversed by the algorithm. It is invoked as EdgeDetection(x, y, TraverseList):
1) Add the current pixel position (x, y) to the traverse list:
   NewTraverseList = TraverseList + current position (x, y).
2) For each neighbouring pixel, recurse if it has not been visited:
   if there is a pixel at (x-1, y-1) not in NewTraverseList then
       EdgeDetection(x-1, y-1, NewTraverseList);
   end if
   if there is a pixel at (x-1, y) not in NewTraverseList then
       EdgeDetection(x-1, y, NewTraverseList);
   end if
   if there is a pixel at (x, y+1) not in NewTraverseList then
       EdgeDetection(x, y+1, NewTraverseList);
   end if
   (and similarly for the remaining neighbours)
3) Return the traverse list.
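A hedged Python rendering of the two algorithms above is sketched here; it uses an explicit stack rather than recursion (to avoid Python's recursion limit) and assumes binary is a 2-D NumPy array with black pixels stored as 1.

def extract_characters(binary):
    h, w = binary.shape
    traversed = set()                          # the traverse list
    characters = []
    for x in range(h):                         # scan row pixel-by-pixel
        for y in range(w):
            if binary[x, y] and (x, y) not in traversed:
                # edge detection: visit every connected pixel exactly once
                stack, component = [(x, y)], set()
                while stack:
                    px, py = stack.pop()
                    if (px, py) in component:
                        continue
                    component.add((px, py))
                    for dx in (-1, 0, 1):      # all eight neighbours
                        for dy in (-1, 0, 1):
                            nx, ny = px + dx, py + dy
                            if (0 <= nx < h and 0 <= ny < w
                                    and binary[nx, ny]
                                    and (nx, ny) not in component):
                                stack.append((nx, ny))
                traversed |= component         # add returned pixels
                characters.append(component)   # one pixel set per character
    return characters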

3.5.1.3 Feature Extraction


Two feature extraction techniques were employed, chosen according to the efficiencies obtained while training the neural network. They are as follows:
• Feature Extraction based on Character Geometry.
• Feature Extraction Using Gradient Features.

3.5.1.3.1 Feature Extraction Based on Character Geometry.


It extracts different line types that form a particular character. It also
concentrates on the positional features of the same. The feature extraction technique
explained was tested using a Neural Network which was trained with the feature
vectors obtained from the system proposed.
Universe of Discourse
The universe of discourse is defined as the shortest matrix that fits the entire character skeleton. The universe of discourse is selected because the features extracted from the character image include the positions of different line segments in the character image, so every character image should be independent of its image size.

Original Image Universe of Discourse

Figure 3.5.1.3.1.1: Universe of Discourse


Zoning
After the universe of discourse is selected, the image is divided into windows of equal size, and feature extraction is done on each individual window. For the system implemented, two types of zoning were used. The image was zoned into 9 equal-sized windows, and feature extraction was applied to individual zones rather than to the whole image. This gives more information about the fine details of the character skeleton. Also, the positions of different line segments in a character skeleton become a feature if zoning is used, because a particular line segment of a character occurs in a particular zone in almost all cases. For instance, the horizontal line segment in character 'A' almost always occurs in the central zone of the entire character zone.
To extract the different line segments in a particular zone, the entire skeleton in that zone should be traversed. For this purpose, certain pixels in the character skeleton are defined as starters, intersections and minor starters.
Starters
Starters are those pixels with one neighbour in the character skeleton. Before character traversal starts, all the starters in the particular zone are found and populated into a list.

Figure 3.5.1.3.1.2: Starters are rounded

Intersections
The definition of intersections is somewhat more complicated. The necessary but insufficient criterion for a pixel to be an intersection is that it should have more than one neighbour. A new property called true neighbours is defined for each pixel; based on the number of true neighbours for a particular pixel, it is classified as an intersection or not. For this, neighbouring pixels are classified into two categories: direct pixels, which are all the pixels in the neighbourhood of the pixel under consideration in the horizontal and vertical directions, and diagonal pixels, which are the remaining neighbouring pixels, lying in a diagonal direction to the pixel under consideration. To find the number of true neighbours, the pixel under consideration is classified further by the number of neighbours it has in the character skeleton:
3 neighbours: if any one of the direct pixels is adjacent to any one of the diagonal pixels, the pixel under consideration cannot be an intersection; if none of the neighbouring pixels are adjacent to each other, it is an intersection.
4 neighbours: if each and every direct pixel has an adjacent diagonal pixel, or vice versa, the pixel under consideration cannot be considered an intersection.
5 or more neighbours: the pixel under consideration is always considered an intersection.
Once all the intersections are identified in the image, they are populated into a list.

Figure 3.5.1.3.1.3: Intersections

Minor Starters
Minor starters are found along the course of traversal of the character skeleton. They are created when the pixel under consideration has more than two neighbours. Two situations can occur:
Intersections: when the current pixel is an intersection, the current line segment ends there and all the unvisited neighbours are populated into the minor starters list.
Non-intersections: the pixel under consideration may have more than two neighbours and still not be an intersection. In such cases, the current direction of traversal is found from the position of the previous pixel. If any of the unvisited pixels in the neighbourhood lies in this direction, it is taken as the next pixel and all other pixels are populated into the minor starters list. If none of the pixels lies in the current direction of traversal, the current segment ends there and the entire neighbourhood is populated into the minor starters list.
When the proposed algorithm is applied to character 'A', the minor starters found in most cases are shown in the image.

Figure 3.5.1.3.1.4: Minor Starters


After the line type of each segment is determined, a feature vector is formed from this information. Every zone has a feature vector corresponding to it. Under the proposed algorithm, every zone has a feature vector of length nine.
The contents of each zone feature vector are:
1) Number of horizontal lines.
2) Number of vertical lines.
3) Number of right diagonal lines.
4) Number of left diagonal lines.
5) Normalized length of all horizontal lines.
6) Normalized length of all vertical lines.
7) Normalized length of all right diagonal lines.
8) Normalized length of all left diagonal lines.
9) Normalized area of the skeleton.
The number of any particular line type is normalized using the following method: Value = 1 - ((number of lines / 10) × 2).
The normalized length of any particular line type is found using the following method: Length = (total pixels in that line type) / (total zone pixels).
The feature vector explained here is extracted individually for each zone, so if there are N zones there will be 9N elements in the combined feature vector. For the system proposed, the original image was first zoned into 9 zones by dividing the image matrix, and the features were extracted for each zone. The original image was then divided into 3 zones by dividing in the horizontal direction, and features were again extracted for each such zone.
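A small sketch of how one zone's nine-element feature vector could be assembled from the formulas above; the segments input is a hypothetical summary of the traversal (line type mapped to the pixel counts of its segments), not a function from the report.

def zone_feature_vector(segments, zone_pixels):
    # segments: dict mapping line type -> list of per-segment pixel counts
    types = ["horizontal", "vertical", "right_diagonal", "left_diagonal"]
    features = []
    for t in types:                            # features 1-4: normalized counts
        count = len(segments.get(t, []))
        features.append(1 - (count / 10) * 2)
    for t in types:                            # features 5-8: normalized lengths
        length = sum(segments.get(t, []))
        features.append(length / zone_pixels if zone_pixels else 0)
    skeleton = sum(sum(v) for v in segments.values())
    features.append(skeleton / zone_pixels if zone_pixels else 0)  # feature 9
    return features                            # length nine, as listed above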
After zonal feature extraction, certain features were extracted for the entire image based on its regional properties. Euler number: defined as the difference between the number of objects and the number of holes in the image. For instance, a perfectly drawn 'A' has an Euler number of zero, since the number of objects is 1 and the number of holes is 1, whereas 'B' has an Euler number of -1, since it has two holes. Regional area: defined as the ratio of the number of pixels in the skeleton to the total number of pixels in the image. Eccentricity: defined as the eccentricity of the smallest ellipse that fits the skeleton of the image.
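These whole-image properties are available off the shelf; a hedged sketch using scikit-image (an assumption, since the report does not name a library for these measurements):

from skimage.measure import label, regionprops

def regional_features(skeleton):
    # skeleton: 2-D array, 1 on the character skeleton and 0 elsewhere
    props = regionprops(label(skeleton))
    euler = sum(p.euler_number for p in props)           # objects minus holes
    area = skeleton.sum() / skeleton.size                # regional area
    ecc = max(props, key=lambda p: p.area).eccentricity  # best-fitting ellipse
    return [euler, area, ecc]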
3.5.1.3.2 Gradient Feature Extraction.
The gradient measures the magnitude and direction of the greatest change in
intensity in a small neighbourhood of each pixel. (In what follows, "gradient" refers to both
the gradient magnitude and direction). Gradients are computed by means of the Sobel
operator. The Sobel templates used to compute the horizontal (X) & vertical (Y)
components of the gradient are shown in Fig.

Horizontal Component Vertical Component


Figure 3.5.1.3.2.1: Sobel masks for Gradient
Given an input image of size D1×D2, each pixel neighbourhood is convolved with
these templates to determine these X and Y components, Sx and Sy, respectively.
Equations (1) and (2) give their mathematical form:
Sx(i, j) = I(i-1, j+1) + 2·I(i, j+1) + I(i+1, j+1) - I(i-1, j-1) - 2·I(i, j-1) - I(i+1, j-1)   (1)
Sy(i, j) = I(i-1, j-1) + 2·I(i-1, j) + I(i-1, j+1) - I(i+1, j-1) - 2·I(i+1, j) - I(i+1, j+1)   (2)
Here, (i, j) range over the image rows (D1) and columns (D2), respectively. The gradient strength and direction can be computed from the gradient vector [Sx, Sy]. After obtaining the gradient vector of each pixel, the gradient image is decomposed into four orientation planes or eight direction planes (chain code directions), as shown in Figure 3.5.1.3.2.2.

Figure 3.5.1.3.2.2: Directions of chain codes

Generation of Gradient Feature Vector

A gradient feature vector is composed of the strength of the gradient accumulated separately in different directions, as described below:
(1) The direction of the gradient detected as above is decomposed along the 8 chain code directions.
(2) The character image is divided into 81 (9 horizontal × 9 vertical) blocks. The strength of the gradient is accumulated separately in each of the 8 directions, in each block, to produce 81 local spectra of direction.
(3) The spatial resolution is reduced from 9×9 to 5×5 by down-sampling every two horizontal and every two vertical blocks with a 5×5 Gaussian filter, producing a feature vector of size 200 (5 horizontal × 5 vertical × 8 directions).
(4) The variable transformation y = x^0.4 is applied to make the distribution of the features Gaussian-like.
The 5×5 Gaussian filter is a high-cut filter used to reduce the aliasing caused by the down-sampling.
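A hedged NumPy sketch of these steps (plain striding stands in for the Gaussian-filtered down-sampling, and the input is assumed to be a 2-D gray image array):

import numpy as np

def gradient_feature_vector(img):
    I = img.astype(float)
    # Sobel components Sx, Sy per equations (1) and (2)
    sx = (I[:-2, 2:] + 2 * I[1:-1, 2:] + I[2:, 2:]
          - I[:-2, :-2] - 2 * I[1:-1, :-2] - I[2:, :-2])
    sy = (I[:-2, :-2] + 2 * I[:-2, 1:-1] + I[:-2, 2:]
          - I[2:, :-2] - 2 * I[2:, 1:-1] - I[2:, 2:])
    strength = np.hypot(sx, sy)
    # (1) decompose the gradient direction into 8 chain code bins
    bins = ((np.arctan2(sy, sx) + np.pi) / (2 * np.pi) * 8).astype(int) % 8
    # (2) accumulate strength per direction over 9 x 9 blocks
    h, w = strength.shape
    spectra = np.zeros((9, 9, 8))
    for r in range(h):
        for c in range(w):
            spectra[r * 9 // h, c * 9 // w, bins[r, c]] += strength[r, c]
    # (3) reduce 9x9 -> 5x5 (the paper applies a 5x5 Gaussian filter first)
    reduced = spectra[::2, ::2, :]
    # (4) variable transformation to make the distribution Gaussian-like
    return reduced.flatten() ** 0.4            # feature vector of size 200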

3.5.1.4 Classification
Artificial Neural Network
Animals recognize various objects and make sense of a large amount of visual information, apparently requiring very little effort. Simulating the recognition task performed by animals, to the extent allowed by physical limitations, is enormously profitable for the system, and this motivates the study and simulation of artificial neural networks. In a neural network, each node performs some simple computation, and each connection conveys a signal from one node to another, labelled by a number called the "connection strength" or weight, indicating the extent to which a signal is amplified or diminished by the connection.
Different choices of weights result in different functions being evaluated by the network. Given a network whose weights are initially random, and given a task to be accomplished by the network, a learning algorithm must be used to determine the values of the weights that will achieve the desired task. The learning algorithm qualifies the computing system to be called an artificial neural network. The node function was predetermined to apply a specific function to the inputs, imposing a fundamental limitation on the capabilities of the network. Typical pattern recognition systems are designed using two passes. The first pass is a feature extractor that finds features within the data which are specific to the task being solved (e.g. finding bars of pixels within an image for character recognition). The second pass is the classifier, which is more general purpose and can be trained using a neural network and sample data sets. Clearly, the feature extractor typically requires the most design effort, since it usually must be hand-crafted based on what the application is trying to achieve.
Backpropagation was created by generalizing the Widrow-Hoff learning rule to multiple-layer networks and nonlinear differentiable transfer functions. Input vectors and the corresponding target vectors are used to train a network until it can approximate a function, associate input vectors with specific output vectors, or classify input vectors in an appropriate way as defined by the designer. Networks with biases, a sigmoid layer, and a linear output layer are capable of approximating any function with a finite number of discontinuities.
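A hedged TensorFlow/Keras sketch of such a network, sized to match the 85-35-26 character-geometry configuration from Table 3.7.4.1; the training arrays X_train and y_train (feature vectors and one-hot labels) are assumptions, not provided by the report.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(85,)),                # zonal feature vector
    tf.keras.layers.Dense(35, activation="sigmoid"),   # hidden layer
    tf.keras.layers.Dense(26, activation="softmax"),   # one unit per letter
])
model.compile(optimizer="sgd",                         # backpropagation
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=172, validation_split=0.1)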

Figure 3.5.1.4.1: Typical Neural Network

Figure 3.5.1.4.2: Neural Network

Once the network is trained, the match pattern is obtained to generate the associated
character.

Sample Input Sample Output

Figure 3.5.1.4.3: Sample Input & Output

The output will be the beautified version of the uploaded image and will be saved in a .doc or text file.

3.5.2 Use of tools for design


Image Processing Toolbox
Image Processing Toolbox™ provides a comprehensive set of reference-standard
algorithms, functions, and apps for image processing, analysis, visualization, and algorithm
development. You can perform image enhancement, image deblurring, feature
detection, noise reduction, image segmentation, geometric transformations, and image
registration. Many toolbox functions are multithreaded to take advantage of multicore and multiprocessor computers.

Image Processing Toolbox supports a diverse set of image types, including high dynamic range, gigapixel resolution, embedded ICC profiles and tomography. Visualization
functions let you explore an image, examine a region of pixels, adjust the contrast, create
contours or histograms, and manipulate regions of interest (ROIs). With toolbox
algorithms you can restore degraded images, detect and measure features, analyze shapes and
textures, and adjust color balance.
Neural Network Toolbox
Neural Network Toolbox™ provides functions and apps for modeling
complex nonlinear systems that are not easily modeled with a closed-form equation. Neural
Network Toolbox supports supervised learning with feed forward, radial basis, and dynamic
networks. It also supports unsupervised learning with self-organizing maps and competitive
layers. With the toolbox you can design, train, visualize, and simulate neural networks. You can
use Neural Network Toolbox for applications such as data fitting, pattern
recognition, clustering, time-series prediction, and dynamic system modeling and control.
Rational Rose
Rational Rose is an object-oriented Unified Modeling Language (UML) software
design tool intended for visual modeling and component construction of enterprise-level
software applications. In much the same way a theatrical director blocks out a play, a
software designer uses Rational Rose to visually create (model) the framework for an
application by blocking out classes with actors (stick figures), use case elements (ovals),
objects (rectangles) and messages/relationships (arrows) in a sequence diagram using drag-
and-drop symbols. Rational Rose documents the diagram as it is being constructed and then
generates code in the designer's choice of C++, Visual Basic, Java, Oracle8, Corba or
Data Definition Language.

3.5.3 Flow Chart

Figure 3.5.3.1: Flow chart


3.5.4 UML diagrams
3.5.4.1 Use Case Diagram

Figure 3.5.4.1: Use Case Diagram

Figure 3.5.4.1.1: User Module


Use Case and Description
Actor: User
Precondition: Input image should be available.
Main Scenario: User uploads an image.
Post Condition: Image successfully uploaded.
Extension Scenario: If the image is not compatible, it is not possible to upload the file.

Pre-processing Module:

Figure 3.5.4.1.2: Pre-Processing Module


Use Case and Description
Actor: System
Precondition: Uploaded input image.
Main Scenario: Pre-processing is carried out by converting the image from RGB format to binary format.
Post Condition: Characters are extracted before segmentation.

Segmentation Module

Figure 3.5.4.1.3: Segmentation Module
Use Case and Description
Actor: System
Precondition: Pre-processed image should be available.
Main Scenario: The pre-processed input image is segmented into isolated characters by assigning a number to each character using a labelling process. This labelling provides information about the number of characters in the image. Each individual character is uniformly resized.
Extension Scenario: If the image is not compatible, it is not possible to upload the file.
Post Condition: Image successfully segmented.

3.5.4.2 Sequential Diagram

Figure 3.5.4.2: Sequence Diagram

3.5.4.3 Activity Diagram

Figure 3.5.4.3: Activity Diagram

3.5.4.4 Architecture of the system

Figure 3.5.4.4: Architecture of the proposed system

3.5.5 Design of the User Interface


The Python GUI development environment provides a set of tools for creating graphical user interfaces (GUIs). These tools simplify the process of laying out and programming GUIs.
The layout editor enables you to populate a GUI by clicking and dragging GUI components - such as buttons, text fields, sliders, axes, and so on - into the layout area. It also enables you to create menus and context menus for the GUI.
The three principal elements required in the creation of a graphical user interface are:
• Components: Each item on the Python GUI is a graphical component. The types of components include graphical controls (pushbuttons, edit boxes, lists, sliders, etc.), static elements (frames and text strings), menus and axes. Axes, which are used to display graphical data, are created using the function axes; graphical controls and static elements are created by the function uicontrol, and menus are created by the functions uimenu and uicontextmenu.
• Figures: The components of the GUI must be arranged within a figure, which is a window on the computer screen. In the past, a figure has been created automatically whenever we plotted data; however, an empty figure can be created with the function figure and used to hold any combination of components.
• Callbacks: Finally, there must be some way to perform an action when a user clicks a mouse button or types information on a keyboard. A mouse click or a key press is an event, and the program must respond to each event if it is to perform its function. The code executed in response to an event is called a callback. There must be a callback to implement the function of each graphical user component on the GUI.
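A hedged sketch of these three elements (component, figure window, callback) using Python's standard tkinter library; this is an illustration only, since the report does not show its actual GUI code.

import tkinter as tk
from tkinter import filedialog

def load_image():                       # callback: runs on the button's event
    path = filedialog.askopenfilename(filetypes=[("Images", "*.png *.jpg")])
    status.config(text="Loaded: " + path)

root = tk.Tk()                          # the figure (window)
root.title("Handwritten Character Recognition")
button = tk.Button(root, text="LOAD IMAGE", command=load_image)  # component
status = tk.Label(root, text="No image loaded")                  # component
button.pack()
status.pack()
root.mainloop()                         # wait for and dispatch user events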

3.5.6 Risks Identified
Python is an interpreted language. The main disadvantage of interpreted languages
is execution speed. When a language is compiled, all of the code is analyzed and
processed efficiently, before the programmer distributes the application. With an interpreted
language, the computer running the program has to analyze and interpret the code (through the
interpreter) before it can be executed (each and every time), resulting in slower processing
performance.
The values of 39 and 35 hidden neurons for gradient features and character geometry, respectively, were chosen based on experiments conducted on several different images and are used by the classifiers to produce correct classification results. Variations in the hidden neuron values may produce wrong results, so these values should be chosen carefully.

3.6. IMPLEMENTATION
3.6.1 Software and Hardware Used
Software: Python 3.7, TensorFlow
Processor: Dual Core, Intel i3
RAM: 2048 MB
Disk space: 1 GB for Python only; 5 GB for a custom installation

Table 3.6.1: Software and Hardware Used


3.6.2 Software Development Platform/Environment/framework
Python is a high-level language well suited to technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar notation.

3.6.2.1 The Python language

Python is a general-purpose interpreted, interactive, object-oriented, high-level programming language. It was created by Guido van Rossum during 1985-1990. Like Perl, Python source code is available under the GNU General Public License (GPL).
3.6.2.2 The Python working environment
This is the set of tools and facilities that you work with as the Python user or
programmer. It includes facilities for managing the variables in your workspace and
importing and exporting data. It also includes tools for developing, managing, debugging,
and profiling .py files, Python's applications.
3.6.2.3 Python Image Processing Toolbox
We have used a Python image processing toolbox for the development of this software. Image processing involves changing the nature of an image in order to improve its pictorial information for human interpretation or for autonomous machine perception. The image processing toolbox is a collection of functions that extend the capability of the Python numeric computing environment, and it supports a wide range of operations on the image.

Key Features
• Image enhancement, including filtering, filter design, deblurring and contrast enhancement.
• Image analysis, including feature detection, morphology, segmentation and measurement.
• Spatial transformations and image registration.
• Support for multidimensional image processing.
• Support for ICC version 4 color management system.
• Modular interactive tools including ROI selection, histograms and distance
measurements.
• Interactive image and video display.
• DICOM import and export.
3.6.2.4 Python Neural Network Toolbox
Neurolab is a simple and powerful neural network library for Python. It contains basic neural network types, training algorithms and a flexible framework to create and explore other neural network types.
Key Features
• Pure Python + NumPy.
• API similar to the Neural Network Toolbox (NNT) from MATLAB.
• Interface to use training algorithms from scipy.optimize.
• Flexible network configurations and learning algorithms; you may change the training, error, initialization and activation functions.
• Unlimited number of neural layers and number of neurons in layers.
• Variety of supported types of artificial neural networks and learning algorithms.
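A hedged sketch following neurolab's quick-start pattern; the network size, data and training parameters are assumptions for illustration, not the report's configuration.

import numpy as np
import neurolab as nl

x = np.random.uniform(0, 1, (50, 9))         # 9 geometry features per sample
y = np.random.uniform(0, 1, (50, 1))         # dummy targets for illustration

net = nl.net.newff([[0, 1]] * 9, [35, 1])    # feed-forward net: 35 hidden, 1 output
err = net.train(x, y, epochs=200, show=50, goal=0.01)  # gradient-based training
out = net.sim(x)                             # simulate the trained network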
3.6.2.5 Working of the Modules
When you save your GUI layout, a .py file is generated automatically that you can use to control how the GUI works. This file provides code to initialize the GUI and contains a framework for the GUI callbacks, the routines that execute in response to user-generated events such as a mouse click. Using an editor, you can add code to the callbacks to perform the functions you want.
Home Page
1. The system displays the Home page with some options.
2. The user will browse for the input image.
3. User clicks on LOAD IMAGE Button to upload the image.
Processing Module
1. This GUI will receive the query image.
2. Displays the noiseless image of the uploaded image.
3. The characters are extracted from the image and displayed.
4. Output will be displayed in a .txt /.doc file.
3.6.3 Training and Decoding
• Calculate the HOG features for each sample in the database.
• Train a multi-class linear SVM with the HOG features of each sample along with the corresponding label.
• Save the classifier in a file (a sketch of these steps is given below).
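A hedged sketch of these training steps using scikit-image and scikit-learn (an assumption about the libraries; the samples, labels and model file name are placeholders, not the report's actual train.py):

import joblib
from skimage.feature import hog
from sklearn.svm import LinearSVC

def train_classifier(samples, labels, model_path="digits_svm.joblib"):
    # samples: 2-D grayscale digit images; labels: the matching digits
    features = [hog(img, orientations=9, pixels_per_cell=(8, 8),
                    cells_per_block=(2, 2)) for img in samples]
    clf = LinearSVC()                  # multi-class linear SVM (one-vs-rest)
    clf.fit(features, labels)
    joblib.dump(clf, model_path)       # save the classifier in a file
    return clf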
python train.py: This will generate a text log file and a TensorFlow summary.

Figure 3.6.3: Training an image

python test.py: This will generate, for each image, the line transcription. The output will be
written to decoded.txt by default.
python compute_probs.py: This will generate, for each image, the posterior probabilities at
each timestep. Files will be stored in Problem by default.

3.6.4 Screen Shots of Project
Home Page

Figure 3.6.4.1: Home Page

Uploaded Image

Figure 3.6.4.2: Uploading Image

Character Extraction

Figure 3.6.4.3: Extracting Character

Output File

Figure 3.6.4.4: Output File

3.6.5 Snapshot and Description of Experimental Setup
Requirements
• An internet connection
• Python Software
1. Go to python.org/downloads and download the version of Python that you want. In
these examples, I'm downloading Python 3.7.

Figure 3.6.5.1a: Setup 1

Figure 3.6.5.1b: Setup 2

2. Install Python, selecting the option to add Python to the PATH environment variable.

Figure 3.6.5.2: Setup 3

3. Select all fields under Python features.

Figure 3.6.5.3: Setup 4

4. Choose the location for installation.

Figure 3.6.5.4: Setup 5

5. Confirm the installation settings by pressing Install.

Figure 3.6.5.5: Setup 6

6. Finish the installation, then open IDLE for Python and check the Python version.

Figure 3.6.5.6: Setup 7

3.7. TESTING

3.7.1 Verification
The set of Test Cases are used to test the functionality of each module if that
module works properly then that Test Cases marked as Pass or else Fail.

Test Id: 1
Test Case Description: Uploading image
Input: When the user clicks the Open button, a file-selection box opens to select the image file
Expected Output: Image file is selected and uploaded
Test Status: Pass

Test Id: 2
Test Case Description: Pre-processing images
Input: Image taken for pre-processing
Expected Output: Conversion from RGB to B/W image (binarization)
Test Status: Pass

Test Id: 3
Test Case Description: Feature extraction
Input: A gray scale image
Expected Output: Character features are extracted
Test Status: Pass

Test Id: 4
Test Case Description: Output file
Input: Normalized characters passed to the neural network
Expected Output: File containing only the text
Test Status: Pass

Table 3.7.1: Verification

3.7.2 Validation
The table below is used to determine whether the system satisfies the acceptance criteria and whether to accept the system.

Serial Number: 1
Function: Upload the image with valid format
Required Output: Image should be uploaded if the format is supported
Actual Output: Valid image is uploaded successfully

Serial Number: 2
Function: Invalid image format
Required Output: Error message should be displayed
Actual Output: Error message is displayed if the image format is not supported

Serial Number: 3
Function: Pre-processing of the uploaded image
Required Output: Image should be pre-processed in order to convert it to gray scale
Actual Output: Image is pre-processed

Serial Number: 4
Function: Extraction of features
Required Output: Character features such as edges and curves are calculated
Actual Output: Image features are extracted

Serial Number: 5
Function: Displaying result
Required Output: Text of the file is displayed
Actual Output: Text contained in the file is displayed

Table 3.7.2: Validation


3.7.3 Failure modes and action on failure
Serial Number: 1
Event: Wrong input file uploaded
Action: An error message should be displayed to the user and the home screen must be displayed.

Serial Number: 2
Event: System shut down
Action: Tasks should be cancelled and the process should be restarted.

Table 3.7.3: Failure modes of system

3.7.4 Test Results
Character Geometry

Epochs    Hidden Neurons    Config       Classification %
43 10 85-10-26 27.8
104 20 85-20-26 77.4
148 30 85-30-26 93.8
172 35 85-35-26 94.3
117 39 85-39-26 93.3
53 45 85-45-26 12.5
60 50 85-50-26 83.7
76 55 85-55-26 78.3
71 60 85-60-26 18
110 65 85-65-26 49.8
112 70 85-70-26 49.2

Table 3.7.4.1: Using Character Geometry

Gradient Features

Epochs    Hidden Neurons    Config        Classification %
47 10 108-10-26 10
189 20 108-20-26 87.4
144 30 108-30-26 82
137 35 108-35-26 82.2
148 39 108-39-26 94.5
109 45 108-45-26 76.8
113 50 108-50-26 86.6
98 55 108-55-26 55.2
94 60 108-60-26 68.3
102 65 108-65-26 89.1
130 70 108-70-26 91.1
Table 3.7.4.2 Using Gradient Features

3.7.5 Evaluation
• The handwritten character recognition system was tested on several different scanned images containing handwritten text written in different styles, and the results were highly encouraging.
• The proposed method performs pre-processing on the image to remove noise and then extracts features using the gradient technique or character geometry, which gives relatively good classification compared to OCR.
• The method is advantageous as it uses nine features to train the neural network using character geometry and twelve features using the gradient technique. The advantage lies in the small amount of computation involved in the feature extraction, training and classification phases of the method.
• The proposed methodology has produced good results for images containing handwritten text written in different styles, sizes and alignments with varying backgrounds. It classifies most handwritten characters correctly if the image contains little noise in the characters and in the background; characters written with legible handwriting are classified more accurately.
