Handwritten Text Recognition Using Deep Learning
EXISTING SYSTEM
● Relies on conventional classification methods for handwritten digit recognition.
● Acknowledges the progress made in recognizing handwritten digits but highlights the
limitations in accuracy impacting work efficiency.
● Utilizes a two-layer CNN network with two fully connected layers.
● Employs the ReLU function to mitigate gradient disappearance and saturation challenges.
PROPOSED SYSTEM
● Introduces a novel approach using deep learning techniques for handwritten text
recognition and conversion into digital form.
● Aims to enhance efficiency and achieve human-level prediction accuracy by leveraging
advancements in deep learning.
● Integrates Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN)
in a Convolutional Recurrent Neural Network model.
● Utilizes an Optical Character Recognition (OCR) algorithm, leveraging the strengths of
CNNs for image-based problems and RNNs for sequence recognition.
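The ReLU activation mentioned above can be sketched in a few lines of pure Python (a minimal illustration of the idea, not the project's actual code):

```python
def relu(x):
    """Rectified Linear Unit: passes positive values through, zeroes out negatives.
    Its gradient is 1 for x > 0, which is why it mitigates the vanishing-gradient
    and saturation problems of sigmoid/tanh activations."""
    return max(0.0, x)

# Applied element-wise to one row of a feature map:
row = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = [relu(v) for v in row]  # -> [0.0, 0.0, 0.0, 1.5, 3.0]
```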
SYSTEM ARCHITECTURE
MODULES
● Pre-processing
● OCR Algorithm
● Convolution Layer
● Recurrent Layer
● Transcription Layer
PRE-PROCESSING
● The pre-processing unit in the architecture diagram prepares the input data for the
neural network model.
● The convolutional layers are used for image feature extraction. In the CRNN model,
this component is built from stacked convolutional and max-pooling layers, which
extract a sequential feature representation from the input image.
● The first layer of a Convolutional Neural Network is always a convolutional layer.
Convolutional layers apply a convolution operation to the input and pass the result to
the next layer. A convolution converts all the pixels in its receptive field into a single
value.
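The convolution-plus-pooling idea described above can be illustrated with a small pure-Python sketch on a toy grayscale patch (the patch and kernel values are hypothetical; this is an illustration of the operation, not the CRNN implementation):

```python
def conv2d_valid(image, kernel):
    """2-D 'valid' convolution (CNN-style cross-correlation): each output cell
    is the weighted sum of the pixels in the kernel's receptive field
    (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2: keeps the strongest response in each
    window, halving both spatial dimensions."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# A 4x4 input patch and a 3x3 cross-shaped kernel (hypothetical values):
patch = [[1, 2, 0, 1],
         [0, 1, 3, 1],
         [2, 1, 0, 2],
         [1, 0, 1, 1]]
kernel = [[0, 1, 0],
          [1, 1, 1],
          [0, 1, 0]]
features = conv2d_valid(patch, kernel)  # 2x2 feature map
pooled = max_pool2x2(features)          # 1x1 after pooling
```

Stacking such convolution and pooling stages is what produces the sequential feature representation the CRNN feeds to its recurrent layers.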
TRANSCRIPTION LAYER
Transcription Process:
● The transcription layer converts per-frame predictions made by the RNN into a sequence of labels or text. This process is crucial
for transforming the output of the neural network into readable text.
● Connectionist Temporal Classification (CTC) is a commonly used technique in the transcription process. It helps decode the
output from the RNN and convert it into text labels.
● The transcription layer operates after the recognition model, taking the output probabilities from the model.
● Its primary function is to convert these probabilities into a sequence of recognized text or characters. This involves applying
decoding algorithms to determine the most probable sequence based on the output probabilities.
● The transcription layer maps the continuous probability distributions generated by the recognition model into discrete symbols,
such as characters or words, representing the recognized text.
● By converting probabilities into discrete symbols, the transcription layer enables the neural network to output readable text that
accurately represents the input sequence.
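The CTC decoding step described above can be sketched as a greedy (best-path) decoder in pure Python, assuming the per-frame label indices have already been taken as the argmax of the RNN's output probabilities (a simplified illustration, not the project's code):

```python
BLANK = 0  # CTC reserves one label index for the 'blank' symbol

def ctc_greedy_decode(frame_labels, alphabet):
    """Best-path CTC decoding: collapse consecutive repeated labels, then drop
    blanks. `frame_labels` are per-frame argmax indices; `alphabet` maps
    index-1 to a character (index 0 is the blank)."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            decoded.append(alphabet[label - 1])
        prev = label
    return "".join(decoded)

# Hypothetical per-frame predictions for the word "cat":
# blank, c, c, blank, a, t, t, blank  ->  "cat"
alphabet = "abcdefghijklmnopqrstuvwxyz"
frames = [0, 3, 3, 0, 1, 20, 20, 0]
text = ctc_greedy_decode(frames, alphabet)  # -> "cat"
```

Blanks between repeated labels are what allow genuine double letters (e.g. "ll") to survive the collapsing step.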
RECURRENT LAYER
1. Bi-directional RNN for Sequence Labelling:
● Connects every neuron of the previous layer to every neuron of the next.
● The output is fed back into the input, with the number of units determining the output
dimensionality.
● Typically uses the hyperbolic tangent (tanh) activation function.
2. Recurrent Layer:
● Composed of recurrent units that process the current input together with the previous
hidden state to produce an output.
● The output can be processed further or passed to subsequent layers.
● Captures temporal dependencies within sequences, aiding pattern recognition.
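A single recurrent-unit update as described above can be sketched in pure Python with a tanh activation (a toy one-unit illustration with hypothetical weights, not the actual network):

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One vanilla-RNN update: the new hidden state mixes the current input
    with the previous hidden state through a tanh activation:
        h_t = tanh(w_x * x_t + w_h * h_prev + b)"""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

# Run a short input sequence through the unit; the hidden state carries
# information forward, which is how temporal dependencies are captured.
w_x, w_h, b = 0.5, 0.8, 0.0  # hypothetical weights
h = 0.0
for x in [1.0, -1.0, 0.5]:
    h = rnn_step(x, h, w_x, w_h, b)
```

A bi-directional layer simply runs one such unit over the sequence left-to-right and another right-to-left, concatenating the two hidden states per frame.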
CONCLUSION
● An adaptive method is proposed for handwritten text recognition by pre-processing the
dataset and training it consecutively with CNN and RNN layers.
● During recognition, the input word images are processed and fed into the layers of the
neural network model.
● The output of the CNN layers is further processed by the RNN layers. The results demonstrate
that the consecutive use of CNN and RNN steadily improves accuracy.
FUTURE SCOPE
● In the future, we plan to extend this study by considering different embedding models
on a larger variety of datasets.
● We also aim to enhance the work by implementing online recognition and extending it to
different languages; additionally, the system can be extended to recognize degraded text
or broken characters.
REFERENCES
1. A. Graves and J. Schmidhuber, “Offline handwriting recognition with multidimensional recurrent neural networks,” in NIPS, 2009.
2. R. Vaidya, D. Trivedi, S. Satra, and M. Pimpale, “Handwritten character recognition using deep learning,” in ICICCT, 2018.
3. P. Voigtlaender, P. Doetsch, and H. Ney, “Handwriting recognition with large multidimensional long short-term memory recurrent
neural networks,” in ICFHR, 2016.
4. J. Puigcerver, “Are multidimensional recurrent layers really necessary for handwritten text recognition?” in ICDAR, 2017.
5. D. Keysers, T. Deselaers, H. A. Rowley, L. Wang, and V. Carbune, “Multi-language online handwriting recognition,” PAMI, vol.
39, no. 6, pp. 1180–1194, 2017.
6. V. Carbune, P. Gonnet, T. Deselaers, H. A. Rowley, A. Daryin, M. Calvo, L.-L. Wang, D. Keysers, S. Feuz, and P. Gervais, “Fast
multi-language LSTM-based online handwriting recognition,” arXiv, 2019.
7. U. Marti and H. Bunke, “The IAM-database: An English sentence database for off-line handwriting recognition,” Int. Journal on
Document Analysis and Recognition, vol. 5, pp. 39–46, 2002.
8. H. Bunke, M. Roth, and E. G. Schukat-Talamazzini, “Offline cursive handwriting recognition using hidden Markov models.”