
Handwritten Text Recognition and Digital Text Conversion

Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-3, April 2019, URL: https://www.ijtsrd.com/papers/ijtsrd23508.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-processing/23508/handwritten-text-recognition-and-digital-text-conversion/mr-b-ravinder-reddy


International Journal of Trend in Scientific Research and Development (IJTSRD)

Volume: 3 | Issue: 3 | Mar-Apr 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 - 6470

Handwritten Text Recognition and Digital Text Conversion


Mr. B. Ravinder Reddy, J. Nandini, P. Sowmya, Y. Sathwik
Department of Computer Science and Engineering, Anurag Group of Institutions, Telangana, India

ABSTRACT
It is often extremely difficult to keep handwritten documents secure in the real world. Documents can be misplaced, cannot be accessed from anywhere, and are exposed to physical damage. To keep the information secure and address these problems, we convert it into digital format. The main aim of our application is to recognize handwritten text and display it as digital text. Image processing is a very significant process for data analysis these days. In image processing, the visible text from the real world (the input) must be processed precisely in order to reproduce the same information (the output) accurately. To do this, the system must accurately recognize the text present in the image. The proposed system aims at achieving these results. The process works as follows: the image containing the handwritten text is fed to the system and passed into a neural network, which recognizes the handwritten text present in the image and displays it in the form of digital text. This output can be used for many purposes, such as copying the digital text for use elsewhere, producing formal documents, and serving as input for data processing. Using this process, we can store the information securely and access it from anywhere at any time, and there is no scope for physical damage because the information is in digital format.

How to cite this paper: Mr. B. Ravinder Reddy | J. Nandini | P. Sowmya | Y. Sathwik "Handwritten Text Recognition and Digital Text Conversion" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-3, April 2019, pp.1826-1827, URL: https://www.ijtsrd.com/papers/ijtsrd23508.pdf IJTSRD23508

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

Motivation
Text recognition in images is an active research area that attempts to develop computer applications able to read the text in images automatically. Nowadays there is a huge demand for storing the information available on paper documents in a computer-readable form for later use. One simple way to store information from these paper documents in a computer system is to first scan the documents and then store them as documents. The challenges involved are the font characteristics of the characters in the paper documents and the quality of the images. Character recognition mechanisms are needed to perform document image analysis, which transforms documents from paper format to electronic format. In this paper, we have reviewed and analyzed different methods for text recognition from images. The objective of this review paper is to summarize the well-known methods for better understanding of the reader.

Existing system
Character recognition originated as early as 1870, when Carey invented the retina scanner, an image transmission system using photocells. It was used as an aid to the visually handicapped by the Russian scientist Tyurin in 1900. However, the first generation machines appeared at the beginning of the 1960s with the development of digital computers. This was the first time OCR was realized as a data processing application for the business world [Mantas, 1986] [1]. The first generation machines are characterized by the "constrained" letter shapes which the OCRs could read. These symbols were specially designed for machine reading, and they did not even look natural. The first commercialized OCR of this generation was the IBM 1418, which was designed to read a special IBM font, 407. The recognition method was template matching, which compares the character image with a library of prototype images for each character of each font.

Proposed system
The proposed Handwritten Text Recognition (HTR) system is implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset [2]. This Neural Network (NN) model recognizes the text contained in images of segmented words. As these word images are smaller than images of complete text lines, the NN can be kept small and training on the CPU is feasible. Of the words in the validation set, 3/4 are correctly recognized, and the character error rate is around 10%.

Architecture
Figure 1: Project Architecture
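
The paper reports word accuracy and character error rate for the proposed system without stating how they are computed. The following is a minimal sketch under the common assumption that the character error rate is the total Levenshtein edit distance between recognized and ground-truth words divided by the total number of ground-truth characters. The function names `edit_distance` and `evaluate` are hypothetical and not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def evaluate(predictions, ground_truth):
    """Return (word_accuracy, character_error_rate) for paired word lists.

    Assumed definitions (not stated in the paper):
      word accuracy = exactly matching words / number of words
      CER           = summed edit distance / summed ground-truth length
    """
    if len(predictions) != len(ground_truth) or not ground_truth:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(predictions, ground_truth))
    words_correct = sum(pred == truth for pred, truth in pairs)
    char_errors = sum(edit_distance(pred, truth) for pred, truth in pairs)
    total_chars = sum(len(truth) for truth in ground_truth)
    return words_correct / len(pairs), char_errors / total_chars


if __name__ == "__main__":
    # Illustrative data only; these are not results from the paper.
    recognized = ["hello", "wor1d", "text", "image"]
    truth = ["hello", "world", "text", "image"]
    word_accuracy, cer = evaluate(recognized, truth)
    print(f"word accuracy: {word_accuracy:.2f}, CER: {cer:.3f}")
```

Normalizing by the ground-truth length keeps the rate comparable across validation sets of different sizes, which is why this definition is widely used for word-level recognizers.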

@ IJTSRD | Unique Paper ID – IJTSRD23508 | Volume – 3 | Issue – 3 | Mar-Apr 2019 Page: 1826
Flow Chart
Figure 2: Process flow

Methodology
This project is developed using the Tesseract tess-2 module, which is a computer vision API library [3]. The model is pretrained on a dataset containing the literals of the language, which are in turn compared with the input image file to produce the required output.

Advantages of System
1. Converting handwritten text to digital text.
2. We can store it in our versatile itself.
3. Copy the converted digital text.
4. Share the converted digital text via mail, WhatsApp, etc.

Improvements
1. It can be trained more to get accurate results.
2. It can be trained on multiple data sets to adapt to different languages.
3. Text to speech feature can be added.

Result and Analysis
We have tested the performance of our proposed system on many samples of handwritten text. Here are a few screenshots of the result.

Screenshot 1: output screen

Screenshot 2: Capturing handwritten text

Conclusion
Handwritten character recognition from images is essential these days. Character recognition from images uses feature extraction based on character geometry and the gradient technique [4]. The feature extraction methods have performed well in classification when fed to the neural network, and preprocessing of the image using edge detection and normalization is the ideal choice for degraded, noisy images. Training the neural network with features extracted from sample images of each character yields greater detection accuracy. The proposed methodology has produced good results for images containing handwritten text written in different styles, sizes and alignments with varying backgrounds. The system is developed and evaluated on a set of sample images containing handwritten text [5]. We discussed a NN which is able to recognize text in images. The NN consists of 5 CNN and 2 RNN layers and outputs a character-probability matrix. This matrix is used either for CTC loss calculation or for CTC decoding. An implementation using TF is provided.

References
[1] R. Smith, "A Simple and Efficient Skew Detection Algorithm via Text Row Accumulation", Proc. of the 3rd Int. Conf. on Document Analysis and Recognition (Vol. 2), IEEE 1995, pp. 1145-1148.
[2] S.V. Rice, F.R. Jenkins, T.A. Nartker, The Fourth Annual Test of OCR Accuracy, Technical Report 95-03, Information Science Research Institute, University of Nevada, Las Vegas, July 1995.
[3] R.W. Smith, The Extraction and Recognition of Text from Multimedia Document Images, PhD Thesis, University of Bristol, November 1987.
[4] Chirag I Patel, Ripal Patel, Palak Patel, "Handwritten Character Recognition Using Neural Networks", International Journal of Scientific & Engineering Research, Volume 2, Issue 5, May 2011.
[5] Kauleshwar Prasad, Devvrat C Nigam, Ashmika Lakhotiya, Dheeren Umre, "Character Recognition Using Neural Toolbox", International Journal of u- and e- Service, Science and Technology, Vol. 6, No. 1, February 2013.
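
As a supplementary illustration of the CTC decoding step named in the Conclusion, the sketch below shows best-path (greedy) decoding of a character-probability matrix. It is not the authors' TF implementation. It assumes one row per time step, one column per character plus a final column for the CTC blank label, and the simplest decoding rule: take the most probable label at each step, merge repeated labels, then drop blanks. The function name `ctc_best_path_decode` and the toy matrix are hypothetical.

```python
def ctc_best_path_decode(prob_matrix, charset):
    """Greedy CTC decoding of a character-probability matrix.

    prob_matrix: list of T rows, each holding len(charset) + 1 probabilities.
    charset:     string of recognizable characters; the blank label is
                 assumed to sit at the last index, len(charset).
    """
    blank = len(charset)
    decoded = []
    previous = None
    for row in prob_matrix:
        if len(row) != len(charset) + 1:
            raise ValueError("each row needs one entry per character plus blank")
        # Most probable label at this time step.
        label = max(range(len(row)), key=row.__getitem__)
        # Merge repeats, then skip blanks. A blank between two identical
        # labels keeps both, which is how CTC encodes doubled letters.
        if label != previous and label != blank:
            decoded.append(charset[label])
        previous = label
    return "".join(decoded)


if __name__ == "__main__":
    # Toy matrix over the characters "ab" plus blank; illustrative only.
    matrix = [
        [0.8, 0.1, 0.1],  # a
        [0.7, 0.2, 0.1],  # a (repeat, merged)
        [0.1, 0.1, 0.8],  # blank
        [0.6, 0.2, 0.2],  # a (new, separated by the blank)
        [0.1, 0.8, 0.1],  # b
        [0.1, 0.7, 0.2],  # b (repeat, merged)
    ]
    print(ctc_best_path_decode(matrix, "ab"))
```

Best-path decoding is fast and needs no language model. Beam search over the same matrix is a common alternative when higher accuracy matters more than speed.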
