Handwritten Text Recognition Using Deep Learning

M HARINI, G SUBHIKSHA, R SRIHARINI, V ABINAYA


ABSTRACT
The system recognizes handwritten text in images and converts the recognized text into digital form
using deep learning, an advanced technique that can improve efficiency and approach human-level
prediction. Handwritten text arrives in image format, so we use convolutional neural networks (CNNs),
which are well suited to analysing images, to predict handwritten text in real time. Recognition is
treated as an Optical Character Recognition (OCR) task and implemented with a Convolutional Recurrent
Neural Network (CRNN) model. OCR is an image-based sequence recognition problem: Recurrent Neural
Networks (RNNs) are best suited to sequence recognition, while CNNs are best suited to image-based
problems, so handling OCR requires combining the two. Deep learning can give higher recognition
accuracy than conventional methods. The aim of our project is to build an application that recognizes
handwriting using deep learning concepts, and we approach the problem with CNNs because they provide
better accuracy on such tasks. Image processing, the manipulation of images, is part of computer
vision, and many techniques for it are now available.
PROBLEM STATEMENT

• In this system we use a convolutional neural network to predict handwritten digits in real time,
because these networks are well suited to analysing images.
EXISTING SYSTEM

● Relies on conventional classification methods for handwritten digit recognition.
● Acknowledges the progress made in recognizing handwritten digits but highlights the limitations in
accuracy impacting work efficiency.
● Utilizes a two-layer CNN network with two fully connected layers.
● Employs the ReLU function to mitigate gradient disappearance and saturation challenges.
PROPOSED SYSTEM

● Introduces a novel approach using deep learning techniques for handwritten text recognition and
conversion into digital form.
● Aims to enhance efficiency and achieve human-level prediction accuracy by leveraging advancements in
deep learning.
● Integrates Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a Convolutional
Recurrent Neural Network Model.
● Utilizes Optical Character Recognition (OCR) algorithm, leveraging the strengths of CNN for
image-based problems and RNN for sequence recognition.
SYSTEM ARCHITECTURE
MODULES

● Pre processing
● OCR Algorithm
● Convolutional Layer
● Recurrent Layer
● Transcription Layer
PRE PROCESSING

● The preprocessing unit in the architecture diagram prepares input data for the
neural network model.

● It includes resizing, normalization, noise reduction, contrast enhancement, and
segmentation. These steps ensure the input images are standardized, clean, and
optimized for effective recognition by the neural network.
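A few of the preprocessing steps listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the function names, the 4x8 target size, the 0.5 threshold, and the sample patch are all assumptions, thresholding stands in for noise reduction, and segmentation is left out.

```python
def resize_nearest(img, out_h, out_w):
    """Resize a grayscale image (list of rows) with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def stretch_contrast(img):
    """Min-max contrast stretch; also normalizes pixel values into [0.0, 1.0]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1  # avoid division by zero on a flat image
    return [[(p - lo) / span for p in row] for row in img]


def binarize(img, threshold=0.5):
    """Crude noise reduction: snap each normalized pixel to 0.0 or 1.0."""
    return [[1.0 if p >= threshold else 0.0 for p in row] for row in img]


def preprocess(img, out_h=4, out_w=8):
    """Resize, then stretch contrast, then binarize."""
    return binarize(stretch_contrast(resize_nearest(img, out_h, out_w)))


# Hypothetical 2x4 grayscale patch with 8-bit intensities.
patch = [
    [30, 40, 200, 210],
    [35, 45, 190, 220],
]
clean = preprocess(patch)
```

Every image leaves `preprocess` with the same shape and value range, which is what lets a fixed-size network input accept handwriting scanned under different conditions.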
OCR ALGORITHM
● OCR (Optical Character Recognition): OCR is a technology used to recognize text within images, including
scanned documents and photos. It converts various types of text images (typed, handwritten, or printed) into
machine-readable text data.
● OCR Process: OCR involves converting digital or handwritten text images into machine-readable text that
computers can process, store, and edit. This enables the manipulation of text as part of data entry and processing
software.
● Feature Extraction Methods: There are two main methods for extracting features in OCR: one evaluates
characters based on lines and strokes, while the other identifies entire characters through pattern recognition.
● Pattern-Matching Algorithms: OCR software uses pattern-matching algorithms to compare text images
character by character with its internal database. If the system matches the text word by word, it's called optical
word recognition. OCR software essentially "reads" text and converts it into digital form.
● Evolution of OCR: OCR is one of the earliest computer vision tasks to be addressed, and it doesn't always
require deep learning techniques, as it can be accomplished with traditional algorithms and methods.
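The character-by-character pattern matching described above can be sketched as a nearest-template lookup. This is only an illustration of the traditional approach, assuming glyphs are already segmented: the 3x3 templates, the Hamming distance measure, and the function names are hypothetical, not taken from any particular OCR package.

```python
# Hypothetical "internal database": 3x3 binary templates for three characters.
TEMPLATES = {
    "I": ["010",
          "010",
          "010"],
    "L": ["100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010"],
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_char(glyph):
    """Return the label of the template closest to the glyph."""
    return min(TEMPLATES, key=lambda label: hamming(glyph, TEMPLATES[label]))


def recognize(glyphs):
    """Character-by-character recognition of pre-segmented glyphs."""
    return "".join(match_char(g) for g in glyphs)


# The middle glyph is an "I" with one noisy pixel flipped.
glyphs = [
    ["111", "010", "010"],
    ["010", "011", "010"],
    ["100", "100", "111"],
]
text = recognize(glyphs)
```

Fixed templates tolerate a flipped pixel but not the shape variation of real handwriting, which is why the proposed system learns its features with a CNN instead.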
CONVOLUTIONAL LAYER

● This layer is used for image feature extraction. In the CRNN model, the convolutional
component is built from convolutional and max-pooling layers, and it extracts a
sequential feature representation from the input image.
● The first layer of a Convolutional Neural Network is typically a convolutional layer.
Convolutional layers apply a convolution operation to the input, passing the result to
the next layer. A convolution converts all the pixels in its receptive field into a single
value.
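The operations named above can be written out in plain Python to show how each receptive field collapses to one value. This is a sketch under stated assumptions: the kernel is hand-picked rather than learned, the 5x5 image is hypothetical, and `map_to_sequence` (reading the feature map column by column) is one common way to obtain a sequential representation, not necessarily the one used in this project.

```python
def conv2d(img, kernel):
    """Valid (no padding) 2-D convolution as used in CNNs (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def map_to_sequence(fmap):
    """Read the feature map column by column: one feature frame per column."""
    return [list(col) for col in zip(*fmap)]


# Hypothetical 5x5 image containing a vertical stroke in column 2.
image = [[0, 0, 1, 0, 0] for _ in range(5)]
# Hand-picked kernel that responds to a dark-to-bright step from left to right.
edge_kernel = [[-1, 1],
               [-1, 1]]

features = max_pool(relu(conv2d(image, edge_kernel)))
frames = map_to_sequence(features)
```

Each entry of `frames` summarizes one vertical slice of the image from left to right, which is the form a recurrent layer expects as input.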
TRANSCRIPTION LAYER
Transcription Process:

● The transcription layer converts per-frame predictions made by the RNN into a sequence of labels or text. This process is crucial
for transforming the output of the neural network into readable text.
● Connectionist Temporal Classification (CTC) is a commonly used technique in the transcription process. It helps decode the
output from the RNN and convert it into text labels.

Role of Transcription Layer:

● The transcription layer operates after the recognition model, taking the output probabilities from the model.
● Its primary function is to convert these probabilities into a sequence of recognized text or characters. This involves applying
decoding algorithms to determine the most probable sequence based on the output probabilities.

Mapping Probabilities to Symbols:

● The transcription layer maps the continuous probability distributions generated by the recognition model into discrete symbols,
such as characters or words, representing the recognized text.
● By converting probabilities into discrete symbols, the transcription layer enables the neural network to output readable text that
accurately represents the input sequence.
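The mapping from per-frame probabilities to text can be sketched with CTC best-path (greedy) decoding: take the most probable label in each frame, merge consecutive repeats, then drop the blank label. The alphabet, the "-" blank symbol, and the probabilities below are made up for illustration, and greedy decoding is only the simplest of the decoding algorithms mentioned above.

```python
BLANK = "-"  # assumed symbol for the CTC blank label


def ctc_greedy_decode(frame_probs, alphabet):
    """Best-path decoding: argmax per frame, merge repeats, drop blanks."""
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]
    decoded = []
    previous = None
    for symbol in best:
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


alphabet = [BLANK, "a", "t", "c"]
# Hypothetical per-frame distributions over (blank, a, t, c) for 7 frames.
frame_probs = [
    [0.1, 0.1, 0.1, 0.7],  # c
    [0.1, 0.1, 0.1, 0.7],  # c (repeat, merged with the previous frame)
    [0.6, 0.2, 0.1, 0.1],  # blank
    [0.1, 0.7, 0.1, 0.1],  # a
    [0.1, 0.1, 0.7, 0.1],  # t
    [0.7, 0.1, 0.1, 0.1],  # blank
    [0.1, 0.1, 0.7, 0.1],  # t (kept: a blank separates it from the earlier t)
]
text = ctc_greedy_decode(frame_probs, alphabet)
```

The blank label is what allows genuinely doubled letters to survive: without a blank between the two "t" frames they would be merged into one.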
RECURRENT LAYER
1. Bi-directional RNN for Sequence Labelling:

● Bi-directional RNNs are used on top of convolutional layers to label sequences.
● They capture information from both directions, enhancing sequence understanding.

2. Fully Connected Layer:

● Connects every neuron from the previous layer to every neuron in the next.
● In a fully connected recurrent layer, the output is fed back to the input at the next
time step, and the number of units determines the output dimensionality.
● Typically uses the hyperbolic tangent (tanh) activation function.

3. Recurrent Layer:

● Composed of recurrent units that process the current input and the previous hidden
state to produce an output.
● Output can be further processed or sent to subsequent layers.
● Captures temporal dependencies within sequences, aiding pattern recognition.
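The recurrent update and the bi-directional arrangement described above can be sketched with a single tanh unit. This is a deliberately reduced illustration: real layers use weight matrices over feature vectors and learn them during training, whereas the scalar weights, the input frames, and the function names here are hand-picked assumptions.

```python
import math


def rnn_pass(frames, w_in, w_rec, bias):
    """Run a single-unit tanh RNN over scalar frames; return all hidden states."""
    hidden = 0.0
    states = []
    for x in frames:
        # The new state depends on the current input and the previous hidden state.
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        states.append(hidden)
    return states


def bidirectional(frames, fwd_params, bwd_params):
    """Pair each frame's forward state with its backward state."""
    forward = rnn_pass(frames, *fwd_params)
    # Process the reversed sequence, then flip the states back into frame order.
    backward = rnn_pass(frames[::-1], *bwd_params)[::-1]
    return list(zip(forward, backward))


# Hypothetical scalar features and hand-picked weights (w_in, w_rec, bias).
frames = [1.0, 0.0, 0.0]
outputs = bidirectional(frames, (1.0, 0.5, 0.0), (1.0, 0.5, 0.0))
```

At the second frame the forward state is still non-zero because it remembers the first frame, while the backward state is zero because nothing follows it. Pairing the two gives every frame context from both directions.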
CONCLUSION
● An adaptive method for handwritten text recognition is proposed: the dataset is pre-processed
and the model is trained with CNN layers followed by RNN layers.
● During recognition, the input word images are processed and fed into the layers of the neural
network model.
● The output of the CNN layers is further processed by the RNN layers. The results demonstrate
that using CNN and RNN consecutively can steadily improve accuracy.
FUTURE SCOPE
● In future we plan to extend this study so that different embedding models can be considered
on a large variety of datasets.
● We aim to enhance the work by implementing online recognition and extending it to different
languages. Additionally, the system could be extended to recognize degraded text or broken
characters.