HWTR
II. DATASETS
Data plays a central role in machine learning: past data
is used to predict future outcomes. The data relevant to
this project can be downloaded from the internet.
The data used in our project, HTR, consists of pixel
values. The data files are in CSV format.
Fig 1. Bar diagram showing the sizes of the different labels
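As a minimal sketch of reading such a file, the snippet below assumes a layout that the text does not spell out: no header row, an integer label in the first column, and 784 pixel values (a flattened 28 x 28 image) in the remaining columns. The names `load_pixel_csv` and `sample_csv` are hypothetical, and the two sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

# Assumed layout (not stated in the paper): column 0 is the label,
# columns 1..784 are pixel values (0-255) of a flattened 28 x 28 image.
SIDE = 28
sample_rows = [
    [0] + [0] * (SIDE * SIDE),
    [4] + [255] * (SIDE * SIDE),
]
sample_csv = "\n".join(",".join(str(v) for v in row) for row in sample_rows)


def load_pixel_csv(handle, side=SIDE):
    """Parse rows of `label, p0, p1, ...` into (labels, images)."""
    labels, images = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        values = [int(float(v)) for v in row]
        label, pixels = values[0], values[1:]
        if len(pixels) != side * side:
            raise ValueError(f"expected {side * side} pixels, got {len(pixels)}")
        # Reshape the flat pixel list into `side` rows of `side` values.
        images.append([pixels[r * side:(r + 1) * side] for r in range(side)])
        labels.append(label)
    return labels, images


labels, images = load_pixel_csv(io.StringIO(sample_csv))
label_sizes = Counter(labels)  # samples per label, the quantity a bar diagram plots
print(label_sizes)
```

For a real file, the same function could be called on `open(path, newline="")` instead of the in-memory sample. A file with a header row would need that row skipped first.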
III. ARCHITECTURE
The problem of converting handwritten text, which is in
the form of pixels, into its digital form calls for a
data-driven approach. Data that has already been collected
can be used to extract the features of each letter. The
availability of more powerful machine learning algorithms
offers a more efficient approach to solving this problem.
The project is divided into two modules.
Fig 2. Architecture
IV. METHODOLOGY
The research methodology in this project includes:
• Visualizing and understanding the data
• Choosing a suitable model
• Agreeing on a common evaluation metric
• Training and testing the models
• Implementing the final model
• Analyzing the result
Fig 4. An image of dimensions 351 X 232 pixels
2) Detecting the letters
Object detection is a computer vision technique that
detects certain components in an image or a video.
It makes use of machine learning and deep learning
algorithms to yield good results. Detecting the letters is
the same as detecting objects. We need to apply some
standard filters to the input image to achieve this task.
Step 1: Convert a BGR image to a grayscale image.
A BGR image has 3 channels, whereas a grayscale image
consists of a single channel. The channels form the third
dimension of an image.
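Step 1 can be illustrated with a dependency-free sketch. It assumes the image is a nested list of `[blue, green, red]` pixels with values 0 to 255, and it uses the common luma weights 0.299 R + 0.587 G + 0.114 B. Both the weights and the helper name `bgr_to_gray` are assumptions, not details from the paper. In practice a library call such as OpenCV's `cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)` would normally do this.

```python
def bgr_to_gray(image):
    """Collapse an H x W x 3 BGR image (nested lists, values 0-255) to H x W."""
    gray = []
    for row in image:
        gray_row = []
        for b, g, r in row:  # channel order is blue, green, red
            # Weighted sum of the three channels (assumed luma weights).
            gray_row.append(int(round(0.114 * b + 0.587 * g + 0.299 * r)))
        gray.append(gray_row)
    return gray


# A tiny hypothetical 1 x 3 image: pure blue, pure green, pure red pixels.
tiny_bgr = [[[255, 0, 0], [0, 255, 0], [0, 0, 255]]]
tiny_gray = bgr_to_gray(tiny_bgr)
print(tiny_gray)  # [[29, 150, 76]]
```

The result has one value per pixel instead of three, which is the single-channel form that the later filtering steps work on.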
Fig 7. Image with contours
Fig 9. A cropped image of letter E is one of the results of cropping
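The contour detection and cropping shown in these figures can be approximated with a short sketch. It swaps in connected-component labelling by flood fill for a library contour finder, and it assumes the image has already been thresholded to 1 (ink) and 0 (background). The helpers `find_letter_boxes` and `crop` and the sample `page` are hypothetical.

```python
from collections import deque


def find_letter_boxes(binary):
    """Return bounding boxes (top, left, bottom, right) of connected ink regions.

    `binary` is a list of rows with 1 for ink and 0 for background.
    Boxes are inclusive and sorted left to right.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over the 8-connected neighbourhood.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


def crop(image, box):
    """Cut the inclusive box out of a 2-D image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4 x 7 binary image containing two separate strokes.
page = [
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
]
boxes = find_letter_boxes(page)
letters = [crop(page, box) for box in boxes]
print(boxes)  # [(0, 1, 2, 2), (1, 5, 3, 5)]
```

Because regions are grouped purely by connectivity, two strokes that touch end up in a single box and are cropped as one letter.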
Figure An image with rectangular border around detected contours
Figure A subplot for cropped images of above image
(ii) If two letters touch each other (as in cursive writing),
they are recognized as a single letter.
VIII. CONCLUSION
A Convolutional Neural Network learns from real-time
data and simplifies the model by reducing the number of
parameters, and hence gives considerable accuracy.
Future Enhancements
We can increase the accuracy by:
→ Using larger datasets
→ Adopting more suitable algorithms
→ Training the model for a larger number of epochs
→ Tuning hyper-parameters (there are many parameters
that we can experiment with)
→ Using deeper architectures
This application can be taken to the next level by:
→ Extending its scope to different writing styles
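When weighing deeper architectures, it can help to count parameters, which is also where the saving from a convolutional layer noted in the conclusion comes from. The arithmetic below is a rough sketch with hypothetical sizes (a 28 x 28 single-channel input, 32 units or filters, 3 x 3 kernels). None of these figures come from the paper.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2-D convolutional layer (shared across positions)."""
    return kernel_h * kernel_w * in_channels * filters + filters


# Hypothetical sizes: a 28 x 28 single-channel input and 32 units / filters.
dense_count = dense_params(28 * 28, 32)
conv_count = conv2d_params(3, 3, 1, 32)
print(dense_count, conv_count)  # 25120 320
```

The convolutional layer reuses the same small kernel at every image position, so its parameter count does not grow with the image size.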
ACKNOWLEDGEMENT
We would like to acknowledge the help of ‘JNTUA
College of Engineering, Pulivendula’ for the kind support
provided and our faculty and friends for the helpful
discussions. We would also like to thank ‘Kaggle’ for
providing the datasets.