
Final Finaldoc

This document provides an introduction to a project on digitalization of handwritten text using neural networks. The objective is to take handwritten English text or digit images as input, process the text, train a neural network algorithm to recognize the text. A convolutional neural network, recurrent neural network and connectionist temporal classification layers are used. The project aims to reduce manual effort required to convert old literature into digital form and make text recognition more accurate.

Uploaded by

Anusha Kandula
Copyright
© © All Rights Reserved

Digitalization of Handwritten Text using Neural Networks
An Industry Oriented Mini Project report submitted
in partial fulfillment of requirements
for the award of degree of

Bachelor of Technology
In
Information Technology
By

KANDULA ANUSHA (Reg No: 17131A1251)


ALLAMSETTY SAILAJA (Reg No: 17131A1206)
CHITLURI ANJANI NOOKAMBICA (Reg No: 17131A1223)
AMULYA NANDANA (Reg No: 17131A1207)

Under the esteemed guidance of

Dr. CH. Sita Kumari


Sr. Assistant Professor
Department of Information Technology

GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING


(AUTONOMOUS)
(Affiliated to JNTU-K, Kakinada)
VISAKHAPATNAM
2020 – 2021

Gayatri Vidya Parishad College of Engineering (Autonomous)

Visakhapatnam

CERTIFICATE
This report on “DIGITALIZATION OF HANDWRITTEN TEXT
USING NEURAL NETWORKS” is a bonafide record of the mini
project work submitted

By
KANDULA ANUSHA (Reg No: 17131A1251)
ALLAMSETTY SAILAJA (Reg No: 17131A1206)
CHITLURI ANJANI NOOKAMBICA (Reg No: 17131A1223)
AMULYA NANDANA (Reg No: 17131A1207)

in their VII semester in partial fulfillment of the requirements for the Award of Degree of
Bachelor of Technology
In
Information Technology
During the academic year 2020-2021

Dr. CH. Sita Kumari                Dr. K. B. Madhuri
Sr. Assistant Professor            Head of the Department
Project Guide                      Department of Information Technology

DECLARATION

We hereby declare that this industry oriented mini project entitled “Digitalization of Handwritten Text using Neural Networks” is a bonafide work done by us and submitted to the Department of Information Technology, G.V.P. College of Engineering (Autonomous), Visakhapatnam, in partial fulfilment of the requirements for the award of the degree of B.Tech. It is our own work and has not been submitted to any other university or published at any time before.

PLACE: VISAKHAPATNAM
DATE:

K. ANUSHA (17131A1251)
A. SAILAJA (17131A1206)
CH. ANJANI NOOKAMBICA (17131A1223)
N. AMULYA (17131A1207)

ACKNOWLEDGEMENT

We thank the faculty of our department, particularly Dr. CH. SITA KUMARI, Sr. Assistant Professor, Information Technology, for her kind suggestions and guidance throughout the in-house mini project work. Her guidance as our project guide enabled the successful completion of our project work.

We wish to express our deep gratitude to Dr. K. B. MADHURI, Professor and Head of the Department of IT, GVPCOE (Autonomous), for giving us the opportunity to do the project in college.

We wish to express our deep gratitude to Dr. A. B. KOTESWARA RAO, Principal, GVPCOE (Autonomous), for giving us the opportunity to do the project work and a chance to explore and learn new technologies in the form of mini projects.

Finally, we would like to thank all those who helped us in many ways in completing this project.

K. Anusha(17131A1251)

A. Sailaja(17131A1206)

Ch. Anjani Nookambica (17131A1223)

N. Amulya (17131A1207)

ABSTRACT

Handwritten Text Recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, which include reading aids for the partially blind, processing of bank cheques, and conversion of any handwritten document into structured text. In this project, an attempt is made to recognize handwritten characters of the English alphabet. We use a neural network for this task. It consists of Convolutional Neural Network (CNN) layers, Recurrent Neural Network (RNN) layers and a final Connectionist Temporal Classification (CTC) layer.

The data set contains text and digits. In this project, each text image is resized to 128×32 pixels and is then used directly for training. That is, each resized image has 4096 pixels, and these pixels are given as input to the neural network. The trained network is used for classification and recognition. We used PyCharm to run the code. 95% of the images are used for training and 5% for testing. We trained the network for 25 epochs and achieved an accuracy of 84%.

CONTENTS:

1. INTRODUCTION

1.1 Objective

1.2 Theory

1.2.1. Benefits of Text Recognition

1.2.2. Implementation of Handwritten Text Recognition

1.2.3. What is Neural Network

1.2.4. Why we use Neural Network

1.3 About the algorithm

1.3.1. Convolutional Neural Networks

1.3.2. Recurrent Neural Network

1.3.3. Connectionist Temporal Classification

1.4 Purpose

1.5 Scope

2. SRS DOCUMENT

2.1 Functional Requirements

2.2 Non-Functional Requirements

3. ALGORITHM ANALYSIS

3.1 Existing Algorithm

3.2 Proposed Algorithm

3.3 Feasibility Study

3.4 Cost Benefit Analysis

4. SOFTWARE DESCRIPTION

4.1 Python IDE

4.2 Flask Server

4.3 Pycharm

4.4 TensorFlow

4.5 Numpy

5. PROJECT DESCRIPTION

5.1 Problem Definition

5.2 Project Overview

5.3 Module Description

5.3.1. Flask Framework

5.3.2. Model

5.3.2.1. Image Acquisition

5.3.2.2. Pre-Processing

5.3.2.3. Classification and Recognition

5.3.2.4. Post-Processing

6. SYSTEM DESIGN

6.1 Introduction to UML

6.2 Building blocks of UML

6.2.1. Things

6.2.2. Relationships

6.2.3. Diagrams

6.3 UML Diagram

6.3.1. Use Case Diagram

6.3.2. Sequence Diagram

6.3.3. Activity Diagram

7. DEVELOPMENT

7.1 Data Set Used

7.2 Sample Code

7.2.1. Main.py

7.2.2. App.py

7.2.3. Upload.html

7.3 Input Output Screens

8. SYSTEM MAINTENANCE

9. CONCLUSION

10. BIBLIOGRAPHY

1. INTRODUCTION

1.1. OBJECTIVE:

● The objective of this project is to take handwritten English text or digit images as input, process the text, and train a neural network algorithm to recognize the text.

● Reduce the manpower required to convert old literature into digitized form manually.

● Serve as a guide in the area of text recognition.

● Enrich digitized libraries with English-language content.

1.2. THEORY:

1.2.1. BENEFITS OF TEXT RECOGNITION:


Text recognition helps in reading and understanding various combined styles of written text. In forensic applications, it can be an effective method for evidence collection. It also helps to reduce noise from the original text. Our method improves accuracy in recognizing text in diverse fonts and sizes. More sample sets tend to yield higher accuracy, because the network goes through more extensive training and testing.

Fig.1.3.1. Convolutional Neural Network Diagram

1.2.2. IMPLEMENTATION OF HANDWRITTEN TEXT RECOGNITION:

Handwritten Character Recognition (HCR) works in stages: preprocessing, segmentation, feature extraction, and recognition using neural networks. Preprocessing includes a series of operations carried out on the document image to make it ready for segmentation. During segmentation, the document image is segmented into individual character images, and then a feature extraction technique is applied to each character image. Finally, the resulting feature vector is presented to the selected recognition algorithm. Here, the extracted features are provided to a neural network (NN) for recognition of the text.

1.2.3. WHAT IS NEURAL NETWORK:

An Artificial Neural Network (ANN) is an information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information-processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons; in an ANN, the corresponding adjustments are made to the connection weights.

1.2.4. WHY WE USE NEURAL NETWORK:

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an “expert” in the category of information it has been given to analyze. This expert can then be used to provide projections given new situations of interest and answer “what if” questions. Other advantages include:

● Adaptive Learning: An ability to learn how to do tasks based on the data given for training or initial experience.
● Self-Organization: An ANN can create its own organization or representation of the information it receives during learning time.
● Real Time Operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
● Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.

1.3. ABOUT THE ALGORITHM:

We use Neural Networks for our tasks. The network consists of Convolutional Neural Network (CNN) layers, Recurrent Neural Network (RNN) layers and a final Connectionist Temporal Classification (CTC) layer.

1.3.1. CONVOLUTIONAL NEURAL NETWORK(CNN):

A Convolutional Neural Network (CNN) is a neural network that has one or more convolutional layers and is used mainly for image processing, classification and segmentation. Each convolutional layer contains a series of filters known as convolutional kernels. A filter is a small matrix of numbers (weights) that is applied to a subset of the input pixel values of the same size as the kernel. Each pixel is multiplied by the corresponding value in the kernel, and the results are summed into a single value that represents one grid cell, like a pixel, in the output channel (feature map).
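The multiply-and-sum step described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the project's code: the function name `convolve2d_valid`, the 4×4 image and the 2×2 kernel are made-up values, and the kernel is applied without padding and with a stride of 1 (strictly speaking a cross-correlation, which is what deep learning libraries usually call convolution).

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            # patch of the input with the same size as the kernel
            patch = image[row:row + kh, col:col + kw]
            # multiply element-wise, then sum into one output cell
            feature_map[row, col] = np.sum(patch * kernel)
    return feature_map

# hypothetical 4x4 gray-value image and 2x2 kernel
image = np.array([[1, 2, 3, 0],
                  [4, 5, 6, 1],
                  [7, 8, 9, 2],
                  [1, 0, 1, 3]], dtype=float)
kernel = np.array([[1, 0],
                   [0, -1]], dtype=float)

feature_map = convolve2d_valid(image, kernel)
print(feature_map)
```

Each output cell depends only on a small neighbourhood of the input, which is why the same kernel can detect the same local pattern anywhere in the image.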

1.3.2. RECURRENT NEURAL NETWORK (RNN):

A Recurrent Neural Network remembers the past, and its decisions are influenced by what it has learnt from the past. RNNs learn during training like other networks, but in addition they remember things learnt from prior input(s) while generating output(s). This memory is part of the network. RNNs can take one or more input vectors and produce one or more output vectors, and the output(s) are influenced not just by weights applied to the inputs, as in a regular NN, but also by a “hidden” state vector representing the context based on prior input(s)/output(s). So, the same input could produce a different output depending on previous inputs in the series.
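The role of the hidden state can be illustrated with a single step of a vanilla RNN. The sketch below is an assumption-laden toy, not the LSTM used in the project: the function name `rnn_step` and all weight values are invented for illustration.

```python
import numpy as np

def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One vanilla RNN step: combine the current input with the previous hidden state."""
    return np.tanh(W_xh @ x + W_hh @ h_prev + b_h)

# hypothetical fixed weights: 2 input features, 3 hidden units
W_xh = np.array([[0.5, -0.3],
                 [0.1, 0.8],
                 [-0.6, 0.2]])
W_hh = np.array([[0.2, 0.0, 0.1],
                 [0.0, 0.3, 0.0],
                 [0.1, 0.0, 0.2]])
b_h = np.zeros(3)

x = np.array([1.0, -1.0])

# the same input x, seen first with an empty history and then with some history
h_fresh = rnn_step(x, np.zeros(3), W_xh, W_hh, b_h)
h_after_history = rnn_step(x, h_fresh, W_xh, W_hh, b_h)

print(h_fresh)
print(h_after_history)
```

Both calls receive the same `x`, yet the results differ because the second call starts from a non-zero hidden state. This is the sense in which the output depends on previous inputs in the series.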

1.3.3. CONNECTIONIST TEMPORAL CLASSIFICATION (CTC):

If you want a computer to recognize text, neural networks (NN) are a good choice, as they outperformed other approaches at the time of writing. The NN for such use-cases usually consists of convolutional layers (CNN) to extract a sequence of features and recurrent layers (RNN) to propagate information through this sequence. It outputs character scores for each sequence element, which are simply represented by a matrix. Now, there are two things we want to do with this matrix:

● train: calculate the loss value to train the NN

● infer: decode the matrix to get the text contained in the input image

Both tasks are achieved by the CTC operation.

Fig.1.3.3. Connectionist Temporal Classification Diagram
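The "infer" task can be sketched with best path decoding, the simplest CTC decoder: take the most likely label at each time-step, collapse repeated labels, and remove blanks. The code below is a minimal sketch under stated assumptions: the function name `ctc_best_path`, the two-character alphabet and the score matrix are hypothetical, and the blank label is assumed to be the last column.

```python
def ctc_best_path(matrix, chars):
    """Decode a T x (len(chars) + 1) score matrix; the last column is assumed to be the CTC blank."""
    blank = len(chars)
    # 1. pick the most likely label at every time-step
    best = [max(range(len(row)), key=row.__getitem__) for row in matrix]
    # 2. collapse repeated labels and 3. drop blanks
    decoded = []
    prev = None
    for label in best:
        if label != prev and label != blank:
            decoded.append(chars[label])
        prev = label
    return "".join(decoded)

chars = "ab"  # hypothetical alphabet
matrix = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank
    [0.6, 0.3, 0.1],  # a (a new 'a' because a blank came in between)
    [0.1, 0.8, 0.1],  # b
]

text = ctc_best_path(matrix, chars)
print(text)  # aab
```

The blank label is what lets the decoder tell a genuinely doubled character ("aa") apart from one character that simply spans several time-steps.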

1.4. PURPOSE:

● Document reading.

● Conversion of any handwritten document into structured text form.

1.5. SCOPE:

● The system will be designed to ensure offline handwritten recognition of English characters.
● Old and epic handwritten literature can be restored in digital form.
● A neural network is used for classification.
● A large number of training data sets will improve the efficiency of the suggested approach.

2. SRS DOCUMENT
2.1. FUNCTIONAL REQUIREMENTS:

● The system should process the input given by the user only if it is an image file.

● The system should show an error message to the user when the input given is not in the required format.

● The system should detect the characters present in the image.

● The system should retrieve the characters present in the image and display them to the user.
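The first two requirements can be sketched as a small validation helper. This is an illustrative sketch only: the function name `check_upload` and the message texts are hypothetical, and the set of accepted extensions is taken from the sample code in Section 7.2.2.

```python
import os

# accepted image types, as listed in the sample code of Section 7.2.2
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif"}

def check_upload(filename):
    """Return (accepted, message) for an uploaded file name."""
    if not filename:
        return False, "No file selected"
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, "Unsupported format: please upload an image file"
    return True, "OK"

print(check_upload("scan.PNG"))
print(check_upload("notes.txt"))
```

Checking only the extension is a weak test of whether a file really is an image; a stricter system could also try to decode the file before passing it to the recognizer.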

2.2. NON-FUNCTIONAL REQUIREMENTS:

● PERFORMANCE: Handwritten characters in the input image will be recognized with high accuracy.

● FUNCTIONALITY: This software will deliver on the functional requirements mentioned in this document.
● AVAILABILITY: This system will retrieve the handwritten character regions only if the image contains written characters.
● RECOGNITION ABILITY: The software is easy to use and recognizes the characters from the image.
● RELIABILITY: This software will work reliably for any type of character image.
3. ALGORITHM ANALYSIS

3.1. EXISTING ALGORITHM:

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, or a scene photo.

Drawbacks:

● OCR works well with printed text only and not with handwritten text. Handwriting needs to be learnt by the computer.
● OCR systems are expensive.
● Images produced by a scanner consume a lot of memory space.
● Images lose some quality during the scanning and digitizing process.
● Quality of the final image depends on the quality of the original image.
● All the documents need to be checked over carefully and then manually corrected.
● Direct use of OCR on handwriting remains a difficult problem, as it leads to low reading accuracy.

3.2. PROPOSED SYSTEM:

We use a NN for our task. It consists of convolutional NN (CNN) layers, recurrent NN (RNN) layers and a final Connectionist Temporal Classification (CTC) layer.

CNN: The input image is fed into the CNN layers. These layers are trained to extract relevant features from the image. Each layer consists of three operations. First, the convolution operation applies a filter kernel of size 5×5 in the first two layers and 3×3 in the last three layers to the input. Then, the non-linear ReLU function is applied. Finally, a pooling layer summarizes image regions and outputs a downsized version of the input. While the image height is downsized by 2 in each layer, feature maps (channels) are added, so that the output feature map (or sequence) has a size of 32×256.
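The shape bookkeeping behind the 32×256 output can be checked with a short calculation. The sketch below rests on assumptions that the text does not state: the pooling window per layer and the channel counts in `LAYERS` are hypothetical values chosen to be consistent with a 128×32 input and a 32×256 output, and convolutions are assumed to keep the image size ("same" padding).

```python
def cnn_output_shape(width, height, layers):
    """Track (width, height, channels) through a stack of conv + pool layers.

    Convolutions are assumed to use 'same' padding, so only pooling shrinks the image.
    """
    channels = 1  # gray-value input image
    for pool_w, pool_h, out_channels in layers:
        width //= pool_w
        height //= pool_h
        channels = out_channels
    return width, height, channels

# (pool width, pool height, output channels) per layer: assumed values
LAYERS = [(2, 2, 32), (2, 2, 64), (1, 2, 128), (1, 2, 128), (1, 2, 256)]

shape = cnn_output_shape(128, 32, LAYERS)
print(shape)  # (32, 1, 256)
```

Under these assumptions the height shrinks from 32 to 1 over the five layers, leaving a sequence of 32 time-steps with 256 features each, which is the 32×256 feature sequence handed to the RNN.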

RNN: The feature sequence contains 256 features per time-step, and the RNN propagates relevant information through this sequence. The popular Long Short-Term Memory (LSTM) implementation of RNNs is used, as it is able to propagate information over longer distances and provides more robust training characteristics than a vanilla RNN. The RNN output sequence is mapped to a matrix of size 32×80. The IAM dataset consists of 79 different characters, and one additional character is needed for the CTC operation (the CTC blank label); therefore there are 80 entries for each of the 32 time-steps.

CTC: While training the NN, the CTC is given the RNN output matrix and the ground truth text, and it computes the loss value. While inferring, the CTC is only given the matrix, and it decodes it into the final text. Both the ground truth text and the recognized text can be at most 32 characters long.
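The 32-character limit follows from the 32 time-steps: each character needs at least one time-step. A general property of CTC tightens this slightly, because two identical neighbouring characters must be separated by a blank. The helper below is a hypothetical sketch of that check (the names `min_time_steps` and `fits` are not part of the project code).

```python
def min_time_steps(text):
    """Smallest number of time-steps CTC needs to encode `text`.

    One step per character, plus one blank between each pair of
    identical neighbouring characters (e.g. the 'll' in 'hello').
    """
    repeats = sum(1 for a, b in zip(text, text[1:]) if a == b)
    return len(text) + repeats

def fits(text, time_steps=32):
    """True if `text` can be represented within the given number of time-steps."""
    return min_time_steps(text) <= time_steps

print(min_time_steps("hello"))  # 6
print(fits("a" * 16), fits("a" * 17))  # True False
```

A data loader could use such a check to skip or truncate ground truth texts that cannot be encoded, since such texts cannot produce a valid CTC loss.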

3.3. FEASIBILITY STUDY:

During system analysis, the feasibility study of the proposed system is carried out so that the system will not be a burden for the company. For feasibility analysis, some understanding of the major requirements of the system is essential. The dimensions of software feasibility are:
1. Is this project technically feasible?
2. Is it financially feasible?
3. Will the project’s time to market beat the competition?

3.4. COST BENEFIT ANALYSIS:

This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited, so the expenditures must be justified. The developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.

4. SOFTWARE DESCRIPTION

4.1. PYTHON IDE:

IDLE (Integrated Development and Learning Environment) is an integrated development environment (IDE) for Python. The Python installer for Windows contains the IDLE module by default. IDLE can be used to execute a single statement, just like the Python shell, and also to create, modify and execute Python scripts.
4.2. FLASK SERVER:
Flask is a web framework. This means Flask provides you with tools, libraries and technologies that allow you to build a web application. This web application can be some web pages, a blog, a wiki, or something as big as a web-based calendar application or a commercial website. Flask is built on Werkzeug, a WSGI utility library. Flask is used for the backend, and it makes use of a templating language called Jinja2, which is used to create HTML, XML or other markup formats that are returned to the user in an HTTP response.
4.3. PYCHARM:
PyCharm is one of the most widely used IDEs for programming in Python. It lets the developer concentrate on the actual coding, with an intuitive software environment where one can just do the work and not worry about putting things in place. For writing simple and efficient Python code, we need an Integrated Development Environment (IDE). We use PyCharm as the IDE for developing Python-based applications.
4.4. TENSORFLOW:

TensorFlow is a Python-friendly open source library for numerical computation that makes machine learning faster and easier. It is called TensorFlow because it takes input as multidimensional arrays, also known as tensors. You can construct a sort of flowchart of operations (called a graph) that you want to perform on that input. The input goes in at one end, flows through this system of multiple operations, and comes out the other end as output. TensorFlow is a popular choice because it is built to be accessible for everyone. The TensorFlow library incorporates different APIs to build deep learning architectures such as CNNs or RNNs at scale. TensorFlow is based on graph computation. It allows the developer to visualize the construction of the neural network with TensorBoard, a tool that is also helpful for debugging the program. Finally, TensorFlow is built to be deployed at scale. It runs on CPU and GPU.

4.5. NUMPY:

NumPy is a Python library used for working with arrays. NumPy provides a high-performance multidimensional array and basic tools to compute with and manipulate these arrays. NumPy stands for Numerical Python. Unlike lists, NumPy arrays are stored in one contiguous block of memory, so processes can access and manipulate them very efficiently. This is the main reason why NumPy is faster than lists. It is also optimized to work with modern CPU architectures.
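A small comparison may make the difference concrete. The sketch below scales a few hypothetical gray values to the range 0 to 1, once with a plain Python list and once with a NumPy array; the variable names and pixel values are illustrative only.

```python
import numpy as np

pixels_list = [0, 64, 128, 255]  # hypothetical gray values

# plain Python: an explicit loop over list elements
normalized_list = [p / 255.0 for p in pixels_list]

# NumPy: one vectorized operation over a contiguous block of memory
pixels_array = np.array(pixels_list, dtype=np.uint8)
normalized_array = pixels_array / 255.0

print(normalized_array)
print(pixels_array.flags["C_CONTIGUOUS"])  # True: elements sit side by side in memory
```

Both versions compute the same numbers. The NumPy version pushes the loop into compiled code, which is where the speed advantage on large images comes from.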
5. PROJECT DESCRIPTION

5.1. PROBLEM DEFINITION:

The purpose of this project is to take handwritten English characters as input, process the characters, train the neural network to recognize the pattern, and convert each character into a beautified (clean, printed) version of the input. This project is aimed at developing software which will be helpful in recognizing the characters of the English language. This project is restricted to English characters only. It can be further developed to recognize numerals and the characters of different languages. It is built on the concept of neural networks.

5.2. PROJECT OVERVIEW:

We use a NN for our task. It consists of convolutional NN (CNN) layers, recurrent NN (RNN) layers and a final Connectionist Temporal Classification (CTC) layer.

Fig.5.2. Neural Network Diagram

5.3. MODULE DESCRIPTION:

5.3.1. FLASK FRAMEWORK:

Flask is a web framework. Flask provides the tools, libraries and technologies that allow us to build a web application. This web application can be some web pages, a blog, a wiki, or something as big as a web-based calendar application or a commercial website. Flask belongs to the category of micro-frameworks. Micro-frameworks are normally frameworks with little to no dependencies on external libraries. The framework is light, so there are few dependencies to update and watch for security bugs.

5.3.2. MODEL:

BREAKDOWN MODEL

Fig.5.3.2. Breakdown Model Diagram

5.3.2.1. Image Acquisition:

• In image acquisition, the recognition system acquires a scanned image as an input image.

• The image should be in .png format.

5.3.2.2. Pre-processing:

• The input to the network is a gray-value image of size 128×32.

• Usually, the images from the dataset do not have exactly this size, therefore we resize each image (without distortion) until it either has a width of 128 or a height of 32.

• Then, we copy the image into a (white) target image of size 128×32.
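The resize-and-copy steps above can be sketched as follows. This is a minimal sketch, not the project's `preprocess` function: the name `fit_into_target` is hypothetical, and a simple nearest-neighbour resize with NumPy indexing stands in for a proper image-library resize.

```python
import numpy as np

def fit_into_target(img, target_w=128, target_h=32):
    """Resize `img` (H x W gray values) without distortion and paste it onto a white canvas."""
    h, w = img.shape
    # one common scale factor for both axes keeps the aspect ratio
    scale = max(w / target_w, h / target_h)
    new_w = max(min(target_w, int(w / scale)), 1)
    new_h = max(min(target_h, int(h / scale)), 1)
    # nearest-neighbour resize via index arrays (stand-in for an image-library resize)
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    resized = img[rows][:, cols]
    # white target image; the resized image goes into the top-left corner
    target = np.full((target_h, target_w), 255, dtype=img.dtype)
    target[:new_h, :new_w] = resized
    return target

# hypothetical all-black scan of size 512 x 64 (width x height)
scan = np.zeros((64, 512), dtype=np.uint8)
result = fit_into_target(scan)
print(result.shape)  # (32, 128)
```

In this example the width is the limiting dimension, so the image is scaled to 128×16 and the lower half of the target stays white.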

5.3.2.3. Classification and Recognition:

• The classification stage is the decision-making part of the recognition system.
• A feed-forward back-propagation neural network is used in this work for classifying and recognizing the handwritten characters.
• The total number of neurons in the output layer is 36, as the proposed system is designed to recognize English alphabets and digits.

5.3.2.4. Post-Processing:

● The post-processing stage is the final stage of the proposed recognition system.
● It prints the corresponding recognized characters in structured form.

6. SYSTEM DESIGN

6.1. INTRODUCTION TO UML:

Unified Modeling Language (UML) is a general-purpose modeling language. The main aim of UML is to define a standard way to visualize the way a system has been designed. It is quite similar to blueprints used in other fields of engineering.

UML is not a programming language; rather, it is a visual language. We use UML diagrams to portray the behavior and structure of a system. UML helps software engineers, business people and system architects with design and analysis.

6.2. BUILDING BLOCKS OF UML:

UML is composed of three main building blocks: things, relationships, and diagrams. Together, these building blocks make up a complete UML model, and they play an essential role in developing UML diagrams. The basic UML building blocks are:
1. Things
2. Relationships
3. Diagrams
6.2.1. THINGS:
Any real-world entity or object is termed a thing.

6.2.2. RELATIONSHIPS:
Relationships illustrate the meaningful connections between things. They show the association between the entities and define the functionality of an application.

6.2.3. DIAGRAMS:
Diagrams are the graphical implementation of the models and incorporate symbols and text. Each symbol has a different meaning in the context of the UML diagram. There are thirteen different types of UML diagrams available in UML 2.0, and each diagram has its own set of symbols. Each diagram presents a different dimension, perspective, and view of the system.

6.3. UML DIAGRAMS:

6.3.1. USE CASE DIAGRAM:

Fig.6.3.1. Use Case Diagram for Handwritten Text Recognition

6.3.2. SEQUENCE DIAGRAM:

Fig.6.3.2. Sequence Diagram for Handwritten Text Recognition

6.3.3. ACTIVITY DIAGRAM:

Fig.6.3.3. Activity Diagram for Handwritten Text Recognition

7. DEVELOPMENT

7.1. DATA SET USED:


The dataset consists of handwritten text images from the IAM dataset (see Section 3.2).

7.2. SAMPLE CODE:

7.2.1. MAIN.PY:

from __future__ import division
from __future__ import print_function

import sys
import argparse

import cv2
import editdistance

from DataLoader import DataLoader, Batch
from Model import Model, DecoderType
from SamplePreprocessor import preprocess


class FilePaths:
    "filenames and paths to data"
    fnCharList = 'H:/IAM/HTR/model/charList.txt'
    fnAccuracy = 'H:/IAM/HTR/model/accuracy.txt'
    fnTrain = 'H:/IAM/HTR/data/'
    fnInfer = 'H:/IAM/HTR/data/test.png'
    fnCorpus = 'H:/IAM/HTR/data/corpus.txt'

def train(model, loader):
    "train NN"
    epoch = 0  # number of training epochs since start
    bestCharErrorRate = float('inf')  # best validation character error rate so far
    noImprovementSince = 0  # number of epochs without improvement of the character error rate
    earlyStopping = 5  # stop training after this number of epochs without improvement
    while True:
        epoch += 1
        print('Epoch:', epoch)

        # train
        print('Train NN')
        loader.trainSet()
        while loader.hasNext():
            iterInfo = loader.getIteratorInfo()
            batch = loader.getNext()
            loss = model.trainBatch(batch)
            print('Batch:', iterInfo[0], '/', iterInfo[1], 'Loss:', loss)

        # validate
        charErrorRate = validate(model, loader)

        # if best validation accuracy so far, save model parameters
        if charErrorRate < bestCharErrorRate:
            print('Character error rate improved, save model')
            bestCharErrorRate = charErrorRate
            noImprovementSince = 0
            model.save()
            with open(FilePaths.fnAccuracy, 'w') as f:
                f.write('Validation character error rate of saved model: %f%%' % (charErrorRate * 100.0))
        else:
            print('Character error rate not improved')
            noImprovementSince += 1

        # stop training if no more improvement in the last x epochs
        if noImprovementSince >= earlyStopping:
            print('No more improvement since %d epochs. Training stopped.' % earlyStopping)
            break

def validate(model, loader):
    "validate NN"
    print('Validate NN')
    loader.validationSet()
    numCharErr = 0
    numCharTotal = 0
    numWordOK = 0
    numWordTotal = 0
    while loader.hasNext():
        iterInfo = loader.getIteratorInfo()
        print('Batch:', iterInfo[0], '/', iterInfo[1])
        batch = loader.getNext()
        (recognized, _) = model.inferBatch(batch)

        print('Ground truth -> Recognized')
        for i in range(len(recognized)):
            # a word counts as correct only if it matches the ground truth exactly
            numWordOK += 1 if batch.gtTexts[i] == recognized[i] else 0
            numWordTotal += 1
            # character errors are measured as the edit distance to the ground truth
            dist = editdistance.eval(recognized[i], batch.gtTexts[i])
            numCharErr += dist
            numCharTotal += len(batch.gtTexts[i])
            print('[OK]' if dist == 0 else '[ERR:%d]' % dist, '"' + batch.gtTexts[i] + '"', '->', '"' + recognized[i] + '"')

    # print validation result
    charErrorRate = numCharErr / numCharTotal
    wordAccuracy = numWordOK / numWordTotal
    print('Character error rate: %f%%. Word accuracy: %f%%.' % (charErrorRate * 100.0, wordAccuracy * 100.0))
    return charErrorRate

def infer(model, fbImg):
    "recognize text in image provided by file path"
    img = preprocess(cv2.imread(fbImg, cv2.IMREAD_GRAYSCALE), Model.imgSize)
    batch = Batch(None, [img])
    (recognized, probability) = model.inferBatch(batch, True)
    print('Recognized:', '"' + recognized[0] + '"')
    print('Probability:', probability[0])
    # return text and probability as one comma-separated string (split again in app.py)
    return recognized[0] + ',' + str(probability[0])

def main(path):
    "main function"
    # optional command line args
    parser = argparse.ArgumentParser()
    parser.add_argument('--train', help='train the NN', action='store_true')
    parser.add_argument('--validate', help='validate the NN', action='store_true')
    parser.add_argument('--beamsearch', help='use beam search instead of best path decoding', action='store_true')
    parser.add_argument('--wordbeamsearch', help='use word beam search instead of best path decoding', action='store_true')
    parser.add_argument('--dump', help='dump output of NN to CSV file(s)', action='store_true')
    args = parser.parse_args()

    #args, unknown = parser.parse_known_args() de…
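The two metrics reported by validate() above, character error rate and word accuracy, can be reproduced on a toy example. The sketch below is self-contained and only illustrative: `edit_distance` is a plain dynamic-programming Levenshtein distance standing in for `editdistance.eval`, and the function name `score` and the sample words are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming, one row at a time)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def score(ground_truths, recognized):
    """Return (character error rate, word accuracy) for paired lists of strings."""
    char_errors = sum(edit_distance(r, g) for g, r in zip(ground_truths, recognized))
    char_total = sum(len(g) for g in ground_truths)
    words_ok = sum(g == r for g, r in zip(ground_truths, recognized))
    return char_errors / char_total, words_ok / len(ground_truths)

cer, word_acc = score(["hello", "world"], ["hallo", "world"])
print(cer, word_acc)  # 0.1 0.5
```

One wrong character out of ten gives a character error rate of 10%, while word accuracy drops to 50% because one of the two words is not an exact match. This is why word accuracy is usually the stricter of the two measures.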

7.2.2. APP.PY:

import os

from flask import Flask, render_template, request
from main import main

UPLOAD_FOLDER = '/static/uploads/'
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg', 'gif'}

app = Flask(__name__)


def allowed_file(filename):
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route('/')
def home_page():
    return render_template('index.html')  # return "Hi"


@app.route('/upload', methods=['GET', 'POST'])
def upload_page():
    if request.method == 'POST':
        # check if the post request has the file part
        if 'file' not in request.files:
            return render_template('upload.html', msg='No file selected')
        file = request.files['file']
        # if user does not select file, browser also
        # submits an empty part without filename
        if file.filename == '':
            return render_template('upload.html', msg='No file selected')

        if file and allowed_file(file.filename):
            path = os.path.join(os.getcwd() + UPLOAD_FOLDER, file.filename)
            file.save(path)

            # call the OCR function on it; split on the last comma only,
            # because the recognized text may itself contain commas
            extracted_text, probability = main(path).rsplit(',', 1)

            # extract the text and display it
            return render_template('upload.html',
                                   msg='Successfully processed',
                                   extracted_text=extracted_text,
                                   probability=probability,
                                   img_src=UPLOAD_FOLDER + file.filename)

        # file type not allowed: report it instead of returning nothing
        return render_template('upload.html', msg='File type not allowed')
    elif request.method == 'GET':
        return render_template('upload.html')


if __name__ == '__main__':
    app.run()

7.2.3. UPLOAD.HTML:

<!DOCTYPE html>
<html>
<head>
    <title>Upload Image</title>
</head>
<body>
    {% if msg %}
    <h1>{{ msg }}</h1>
    {% endif %}

    <h1>Upload new File</h1>
    <form method="post" enctype="multipart/form-data">
        <p><input type="file" name="file">
        <input type="submit" value="Upload"></p>
    </form>

    <h1>Result:</h1>
    {% if img_src %}
    <img src="{{ img_src }}">
    {% endif %}

    {% if extracted_text %}
    <p> The extracted text from the image above is: <b> {{ extracted_text }} </b></p>
    {% else %}
    The extracted text and
    {% endif %}

    {% if probability %}
    <p> The probability is: <b> {{ probability }} </b></p>
    {% else %}
    probability will be displayed here
    {% endif %}
</body>
</html>

7.3. INPUT OUTPUT SCREENS:

8. SYSTEM MAINTENANCE

Software maintenance is far more than finding mistakes. Provision must be made for environment changes, which may affect either the computer or other parts of the computer-based system. Such activity is normally called maintenance. It includes both the improvement of the system functions and the correction of faults that arise during the operation of a new system. It may involve the continuing involvement of a large proportion of the computer department's resources. The main task may be to adapt existing systems to a changing environment. Backups of the entire set of database files are taken and stored on storage devices such as flash drives, pen drives and disks, so that it is possible to restore the system at the earliest. If there is a breakdown or collapse, the system gives provision to restore the database files. Storing data on a separate secondary device leads to effective and efficient maintenance of the system. The nominated person has sufficient knowledge of the organization’s computer-based system to be able to judge the relevance of each proposed change.

9. CONCLUSION

● Handwritten Text Recognition is a complex problem, which is not easily solvable. Much depends on the dataset and database available.
● This model is built to analyze handwritten text and convert it into computer text.
● This application is applicable in many sectors, including healthcare and the consumer sector.
● This type of model can be used in health applications to save the effort of deciphering people's handwriting and to store each and every record digitally.
● Recognition of text depends on writing style.
● Salt and pepper noise can throw off results.

10. BIBLIOGRAPHY

WEB REFERENCES:

1. https://towardsdatascience.com/2326a3487cd5

2. https://repositum.tuwien.ac.at/obvutwhs/download/pdf/2874742

3. https://arxiv.org/pdf/1507.05717.pdf
