
HAND WRITTEN LETTER RECOGNITION WITH CNN

USING DEEP LEARNING

Major project report submitted


in partial fulfillment of the requirement for award of the degree of

Bachelor of Technology
in
Computer Science & Engineering

By

ATCHI BALASRINIVASARAO (20UECS0077) (VTU 17170)


YEGURU PENCHALA VINEELA (20UECS1041) (VTU 15373)

Under the guidance of


Dr. D. RAJESH, M.E., Ph.D.,
PROFESSOR

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


SCHOOL OF COMPUTING

VEL TECH RANGARAJAN DR. SAGUNTHALA R&D INSTITUTE OF


SCIENCE & TECHNOLOGY
(Deemed to be University Estd u/s 3 of UGC Act, 1956)
Accredited by NAAC with A++ Grade
CHENNAI 600 062, TAMILNADU, INDIA

May, 2024
CERTIFICATE
It is certified that the work contained in the project report titled “HAND WRITTEN LETTER RECOGNITION WITH CNN USING DEEP LEARNING” by “ATCHI BALASRINIVASARAO (20UECS0077), YEGURU PENCHALA VINEELA (20UECS1041)” has been carried out under my supervision and that this work has not been submitted elsewhere for a degree.

Signature of Supervisor Signature of Professor In-charge


Computer Science & Engineering Computer Science & Engineering
School of Computing School of Computing
Vel Tech Rangarajan Dr. Sagunthala R&D Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science & Technology Institute of Science & Technology
May, 2024 May, 2024

DECLARATION

We declare that this written submission represents our ideas in our own words and where others’
ideas or words have been included, we have adequately cited and referenced the original sources. We
also declare that we have adhered to all principles of academic honesty and integrity and have not
misrepresented or fabricated or falsified any idea/data/fact/source in our submission. We understand
that any violation of the above will be cause for disciplinary action by the Institute and can also
evoke penal action from the sources which have thus not been properly cited or from whom proper
permission has not been taken when needed.

ATCHI BALASRINIVASARAO
Date: / /

YEGURU PENCHALA VINEELA


Date: / /

APPROVAL SHEET

This project report entitled HAND WRITTEN LETTER RECOGNITION WITH CNN USING DEEP
LEARNING by ATCHI BALASRINIVASARAO (20UECS0077), YEGURU PENCHALA VINEELA
(20UECS1041) is approved for the degree of B.Tech in Computer Science & Engineering.

Examiners Supervisor

Dr. D. Rajesh, M.E., Ph.D.,
Professor

Date: / /
Place:

ACKNOWLEDGEMENT

We express our deepest gratitude to our respected Founder Chancellor and President Col. Prof.
Dr. R. RANGARAJAN, B.E. (EEE), B.E. (MECH), M.S. (AUTO), D.Sc., and Foundress President Dr.
R. SAGUNTHALA RANGARAJAN, M.B.B.S., Chairperson Managing Trustee and Vice President.

We are very much grateful to our beloved Vice Chancellor Prof. S. SALIVAHANAN, for provid-
ing us with an environment to complete our project successfully.

We record indebtedness to our Professor & Dean, Department of Computer Science & Engineering, School of Computing, Dr. V. SRINIVASA RAO, M.Tech., Ph.D., for immense care and
encouragement towards us throughout the course of this project.

We are thankful to our Head, Department of Computer Science & Engineering, Dr. M.S. MURALI DHAR, M.E., Ph.D., for providing immense support in all our endeavors.

We also take this opportunity to express a deep sense of gratitude to our Internal Supervisor Dr.
D. RAJESH, M.E., Ph.D., for his cordial support, valuable information and guidance; he helped us in
completing this project through its various stages.

A special thanks to our Project Coordinators Mr. V. ASHOK KUMAR, M.Tech., Ms. C.
SHYAMALA KUMARI, M.E., for their valuable guidance and support throughout the course of the
project.

We thank our department faculty, supporting staff and friends for their help and guidance to complete this project.

ATCHI BALASRINIVASARAO (20UECS0077)


YEGURU PENCHALA VINEELA (20UECS1041)

ABSTRACT

The “Handwritten Letter Recognition” project is a Python-based Graphical User Interface
(GUI) application designed to extract letters from images containing handwritten
characters. The project leverages deep learning techniques, specifically a pre-trained
Convolutional Neural Network (CNN) model, to accurately recognize handwritten
characters and convert them into letters. The GUI allows users to upload an image, which
undergoes several image processing steps, including resizing, grayscale conversion,
Gaussian blur, and thresholding. Once preprocessed, the image is fed into the CNN
model, trained on the A to Z dataset of handwritten letters. To adapt the model
to uppercase alphabets (A to Z), a mapping dictionary is used. The application
extracts the letter from the uploaded image, converts it to Telugu, and displays
it in the GUI. This project showcases the potential of deep learning with CNNs and
serves as a foundation for further improvements and extensions, such as recognizing
lowercase characters and integrating more diverse datasets for enhanced accuracy.
With a user-friendly interface and accurate letter extraction, the “Handwritten Letter
Converter” project demonstrates a practical and versatile solution for automating
handwritten character recognition tasks. It can detect a single letter and convert it to
Telugu. The goal is to create a model that is able to identify and determine the
letter from its image with good accuracy; by using a CNN, one can get about 92% accuracy.

Keywords: CNN, Handwritten Character, Neural Networks, Handwritten Recognition.

LIST OF FIGURES

4.1 Architecture Diagram . . . . . . . . . . . . . . . . . . . . . . . . 12


4.2 Data Flow Diagram . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.3 Use Case Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.4 Class Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.5 Sequence Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.6 Activity Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.7 Training Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.8 Training Accuracy . . . . . . . . . . . . . . . . . . . . . . . . . 24

5.1 Sample Input of Letter M . . . . . . . . . . . . . . . . . . . . . . 25


5.2 Sample Output of Letter M . . . . . . . . . . . . . . . . . . . . . 26
5.3 Unit Testing Input . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.4 Unit Testing Output . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.5 Integration Testing Output . . . . . . . . . . . . . . . . . . . . . 29
5.6 System Testing Result . . . . . . . . . . . . . . . . . . . . . . . . 30

6.1 Output of Training Data . . . . . . . . . . . . . . . . . . . . . . 34

8.1 Plagiarism Report . . . . . . . . . . . . . . . . . . . . . . . . . . 37

9.1 Poster Presentation . . . . . . . . . . . . . . . . . . . . . . . . . 40

LIST OF ACRONYMS AND
ABBREVIATIONS

ANN Artificial Neural Network


CNN Convolutional Neural Network
GUI Graphical User Interface
HCR Handwritten Character Recognition
HLR Handwritten Letter Recognition
LSTM Long Short Term Memory
NN Neural Network
OCR Optical Character Recognition
RNN Recurrent Neural Network

TABLE OF CONTENTS

Page No.

ABSTRACT v

LIST OF FIGURES vi

LIST OF ACRONYMS AND ABBREVIATIONS vii

1 INTRODUCTION 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Aim of the Project . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Project Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Scope of the Project . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 LITERATURE REVIEW 4

3 PROJECT DESCRIPTION 7
3.1 Existing System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Proposed System . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3 Feasibility Study . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.1 Economic Feasibility . . . . . . . . . . . . . . . . . . . . . 9
3.3.2 Technical Feasibility . . . . . . . . . . . . . . . . . . . . . 9
3.3.3 Social Feasibility . . . . . . . . . . . . . . . . . . . . . . . 9
3.4 System Specification . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.4.1 Hardware Specification . . . . . . . . . . . . . . . . . . . . 10
3.4.2 Software Specification . . . . . . . . . . . . . . . . . . . . 10
3.4.3 Standards and Policies . . . . . . . . . . . . . . . . . . . . 10

4 METHODOLOGY 12
4.1 General Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.2 Design Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2.1 Data Flow Diagram . . . . . . . . . . . . . . . . . . . . . . 14
4.2.2 Use Case Diagram . . . . . . . . . . . . . . . . . . . . . . 16
4.2.3 Class Diagram . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2.4 Sequence Diagram . . . . . . . . . . . . . . . . . . . . . . 18
4.2.5 Activity Diagram . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Algorithm & Pseudo Code . . . . . . . . . . . . . . . . . . . . . . 20
4.3.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.2 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Module Description . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4.1 Import libraries and Dataset . . . . . . . . . . . . . . . . . 21
4.4.2 The Data Preprocessing . . . . . . . . . . . . . . . . . . . 21
4.4.3 Create the Model . . . . . . . . . . . . . . . . . . . . . . . 22
4.4.4 Train The Model . . . . . . . . . . . . . . . . . . . . . . . 22
4.5 Steps to execute/run/implement the project . . . . . . . . . . . . . . 22
4.5.1 Import required Libraries . . . . . . . . . . . . . . . . . . . 22
4.5.2 Training Model . . . . . . . . . . . . . . . . . . . . . . . . 23
4.5.3 Testing Accuracy . . . . . . . . . . . . . . . . . . . . . . . 23

5 IMPLEMENTATION AND TESTING 25


5.1 Input and Output . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.1.1 Input Design . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.1.2 Output Design . . . . . . . . . . . . . . . . . . . . . . . . 26
5.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.3 Types of Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.3.1 Unit Testing . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.3.2 Integration Testing . . . . . . . . . . . . . . . . . . . . . . 28
5.3.3 System Testing . . . . . . . . . . . . . . . . . . . . . . . . 29
5.3.4 Test Result . . . . . . . . . . . . . . . . . . . . . . . . . . 30

6 RESULTS AND DISCUSSIONS 31


6.1 Efficiency of the Proposed System . . . . . . . . . . . . . . . . . . 31
6.2 Comparison of Existing and Proposed System . . . . . . . . . . . . 32
6.3 Sample Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

7 CONCLUSION AND FUTURE ENHANCEMENTS 35


7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.2 Future Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . 35

8 PLAGIARISM REPORT 37
9 SOURCE CODE & POSTER PRESENTATION 38
9.1 Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
9.2 Poster Presentation . . . . . . . . . . . . . . . . . . . . . . . . . . 40

References 40
Chapter 1

INTRODUCTION

1.1 Introduction

The project involving the extraction of handwritten characters from images using a
CNN algorithm represents a significant advancement in the field of computer vision
and character recognition. This project’s primary objective is to develop a robust
and accurate system that can automatically identify and extract handwritten charac-
ters from a wide range of sources, such as scanned documents, handwritten notes,
and images. Handwriting, being highly variable and unique, poses a complex chal-
lenge in OCR, making the utilization of CNNs and deep learning techniques pivotal.
The system’s workflow typically begins with the acquisition of images containing
handwritten letters, followed by preprocessing steps to enhance the quality of these
images. The CNN model, known for its exceptional feature extraction capabilities,
plays a central role in the project. Through a process of training and fine-tuning on a
diverse dataset of handwritten characters, the CNN learns to recognize and differen-
tiate various handwritten characters and symbols. Once trained, the CNN model can
be deployed to automatically process and extract characters from new, unseen im-
ages, achieving a level of accuracy that rivals or surpasses traditional OCR methods.
This technology holds immense potential in numerous applications, from digitiz-
ing historical handwritten documents and improving data entry processes to aiding
individuals with disabilities by converting handwritten text into digital form. Furthermore, as
handwriting is prevalent in a multitude of languages and forms, the project’s
impact transcends borders and domains. Its successful implementation can signifi-
cantly enhance our ability to efficiently convert handwritten content into digital data,
making it accessible, searchable, and editable, thereby revolutionizing the way we
interact with and manage handwritten information. In conclusion, the project’s aim
to employ CNN algorithms for handwritten character extraction is poised to con-
tribute significantly to the advancement of OCR technology and its broad range of
practical applications. We can detect single letter and convert to telugu.

tribute significantly to the advancement of OCR technology and its broad range of practical applications. The system can also detect a single letter and convert it to Telugu.
1.2 Aim of the Project

The project aims to implement a deep learning model specifically designed to rec-
ognize uppercase alphabets (A to Z) from handwritten images. High accuracy in
character recognition will be a primary focus, achieved through appropriate model
training on a suitable dataset. Overall, the objective is to create a practical and re-
liable solution that finds utility in digitizing handwritten content and assisting in
optical character recognition tasks with ease of use for all users.

1.3 Project Domain

Deep learning is a type of AI that allows software applications to become more
accurate at predicting outcomes without being explicitly programmed to do so. Machine
learning algorithms use historical data as input to predict new output values.
Machine learning is perhaps the principal technology behind two emerging domains:
data science and artificial intelligence. The rise of machine learning is coming about
through the availability of data and computation, but machine learning methodologies
are fundamentally dependent on models.
Deep learning takes the different approach of observing a system in practice and
emulating its behavior with mathematics. One of the design aspects in designing
machine learning solutions is where to put the mathematical function. Obtaining
complex behavior in the resulting system can require some imagination in the design
process. This project will give an overview of the approaches that people take
using the classical three subdivisions of the field: supervised learning, unsupervised
learning and reinforcement learning. Each of these approaches uses the mathematical
functions in a different way.

1.4 Scope of the Project

The scope behind developing the “Handwritten Letter Converter to Telugu”
project stems from the ubiquitous need to automate and streamline the recognition
of handwritten characters, a critical aspect of modern data processing. Handwriting,
often found in forms, notes, documents, and other media, presents a
challenge for efficient data extraction and utilization due to its variability and
complexity. Converting such handwritten content into editable and machine-readable
text is not only time-consuming when done manually but also prone to human errors.
The emergence of deep learning techniques, particularly CNNs, has revolutionized
OCR by offering highly accurate pattern recognition capabilities. The potential of
these techniques to decipher handwritten characters holds immense promise, as they
can enhance the efficiency of letter extraction from images, documents, and forms.

Chapter 2

LITERATURE REVIEW

Li Yuan, et al. [1], [2021], observed that Transformers, which are popular for language modeling, have recently been explored for solving vision tasks, e.g., the Vision
Transformer (ViT) for image classification. The ViT model splits each image into a
sequence of tokens with fixed length and then applies multiple Transformer layers
to model their global relation for classification. However, ViT achieves inferior per-
formance to CNNs when trained from scratch on a midsize dataset like ImageNet.
We find it is because: 1) the simple tokenization of input images fails to model the
important local structure such as edges and lines among neighboring pixels, leading
to low training sample efficiency; 2) the redundant attention backbone design of ViT
leads to limited feature richness for fixed computation budgets and limited training
samples.

Sixiao Zheng, et al. [2], [2021], noted that most recent semantic segmentation methods
adopt a fully-convolutional network (FCN) with an encoder-decoder architecture.
The encoder progressively reduces the spatial resolution and learns more abstract/semantic
visual concepts with larger receptive fields. Since context modeling is
critical for segmentation, the latest efforts have focused on increasing the receptive
field, through either dilated/atrous convolutions or inserting attention modules.
However, the encoder-decoder based FCN architecture remains unchanged. In this
paper, the authors aim to provide an alternative perspective by treating semantic
segmentation as a sequence-to-sequence prediction task.

Jiaji Yang, et al. [3], [2021], proposed a novel interactive framework suitable for
Humanoid Service Robots (HSRs). At present, industrial robotics focuses more
on motion control and vision, whereas HSRs are increasingly being investigated by
researchers and practitioners in the field of speech interaction. The problem and
quality of human-robot interaction (HRI) has become one of the pressing concerns
in academia. The proposed framework is grounded on a novel integration of
Trevarthen's Companionship Theory and a neural image generation algorithm in
computer vision. By integrating image-to-natural-interaction generation and
communicating with the environment to better interact with the stakeholder, the
system moves from plain interaction to a bionic companionship.

Danveer Rajpal, et al. [4], [2021], proposed a fusion-based hybrid-feature approach
for recognition of unconstrained offline handwritten Hindi characters. Hindi is the
official language of India and is used by a large population for several public services
like postal, banking, judiciary, and public surveys. Efficient management of these
services needs language-based automation. The proposed model addresses the problem
of handwritten Hindi character recognition using a machine learning approach. The
pre-trained DCNN models, namely InceptionV3-Net, VGG19-Net, and ResNet50,
were used for the extraction of salient features from the character images. A novel
approach of fusion is adopted in the proposed work; the DCNN-based features are
fused with handcrafted features obtained from the bi-orthogonal discrete wavelet
transform.

R. B. Arif, et al. [5], [2021], proposed SVM-based handwritten letter recognition.
The authors claim that SVM outperforms the multilayer perceptron classifier. The
experiment is carried out on the standard MNIST dataset. The advantage of the
multilayer perceptron is that it is able to segment non-linearly separable classes.
However, the multilayer perceptron can easily fall into a region of local minimum,
where training stops on the assumption that an optimal point on the error surface
has been reached.

R. B. Arif, et al. [6], [2021], proposed SVM-based offline digit recognition. As in
[5], the authors report that SVM outperforms the multilayer perceptron classifier
in experiments on the standard MNIST dataset, while noting that the multilayer
perceptron, though able to segment non-linearly separable classes, can easily fall
into a region of local minimum where training stops prematurely.

M. M. A. Ghosh, et al. [7], [2021], proposed in their study the recognition of Hindi
and English numerals by representing them in the form of exponential membership
functions which serve as a fuzzy model. The recognition is carried out by modifying
the exponential membership functions fitted to the fuzzy sets. These fuzzy sets are
derived from features consisting of normalized distances obtained using the Box
approach. The membership function is modified by two structural parameters that are
estimated by optimizing the entropy subject to the attainment of the membership
function to unity. The overall recognition rate is found to be 95% for Hindi numerals,
with a comparable rate for English numerals.

J. Pradeep, et al. [8], [2020], proposed diagonal feature extraction for offline
character recognition, based on an ANN model. Two approaches, using 54 features
and 69 features, are chosen to build this neural network recognition system.
To compare the recognition efficiency of the proposed diagonal method of feature
extraction, the neural network recognition system is also trained using horizontal and
vertical feature extraction methods.

N. Alrobah, et al. [9], [2021], proposed a hybrid deep model for recognizing Arabic
handwritten characters. Handwriting recognition for computer systems has been
researched for a long time, with different researchers having an extensive variety of
methods at their disposal. The problem is that most of these experiments are done
in English, as it is the most spoken language in the world; but other languages such
as Arabic, Mandarin, Spanish, French, and Russian also need research done on them,
since there are millions of people who speak them. In this work, recognizing Arabic
handwritten characters is addressed by cleaning the state-of-the-art Arabic dataset
called Hijaa and developing a Convolutional Neural Network (CNN) with a hybrid
model using Support Vector Machine (SVM) and eXtreme Gradient Boosting (XGBoost)
classifiers. The CNN is used for feature extraction from the Arabic character images,
which are then passed on to the machine learning classifiers. A recognition rate of
up to 96.3% across 29 classes is achieved, far surpassing the previous state-of-the-art
results on the Hijaa dataset.

Chapter 3

PROJECT DESCRIPTION

3.1 Existing System

The existing system is an implemented Python-based “Handwritten Letter Recog-


nition” application with a GUI. The GUI is created using the Tkinter library and
provides users with a simple interface to upload images containing handwritten char-
acters. Upon image upload, the system applies image processing techniques such as
resizing, grayscale conversion, Gaussian Blur, and thresholding to prepare the image
for character recognition. The core of the system relies on a pre-trained OCR technique,
originally trained on the MNIST dataset, for recognizing handwritten letters.
The model is adapted to recognize uppercase alphabets (A to Z) using a mapping
dictionary. After processing the image through the OCR technique, the system
predicts the character and displays the extracted letter in the GUI. Although the existing
system successfully recognizes handwritten uppercase characters, there is
potential for further enhancements, such as recognizing lowercase characters and
expanding the model’s training data for improved accuracy. Overall, the existing
system provides a foundation for building upon and refining the “Handwritten Let-
ter Recognition” application to meet future requirements and challenges in optical
character recognition.
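
For illustration, the mapping dictionary described above can be a simple index-to-letter table. The following minimal sketch assumes 0-25 class indices and is not the application's exact dictionary:

import string

# Hypothetical mapping from the model's 0-25 class indices to letters A-Z
word_dict = {i: ch for i, ch in enumerate(string.ascii_uppercase)}
print(word_dict[12])  # prints 'M'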

Disadvantages
• Depending on the complexity of the input handwriting and variations in writing
styles, the accuracy of character recognition may not always be reliable. The
system might produce incorrect or inaccurate results, impacting the overall use-
fulness of the application.
• The current code might be challenging to scale up for handling large-scale, batch
processing of multiple images simultaneously.
• It lacks the ability to recognize characters, numbers, or other symbols, limiting
its overall usefulness for handling diverse types of handwritten content.


3.2 Proposed System

The proposed system for the “Handwritten Letter Recognition” project aims to
overcome the limitations of the existing system and introduce significant improve-
ments. The key enhancements include extending character recognition beyond uppercase
characters, enabling the system to process diverse handwritten content effectively.
To achieve higher accuracy, the proposed system will leverage a more extensive
and diverse dataset for training the CNN deep learning model, accommodating
various handwriting styles. The proposed system will also support multiple scripts,
making it suitable for handling handwritten letters in different languages. An improved user inter-
face will provide an intuitive and seamless experience, while cloud-based processing
will offer scalability and efficient resource utilization. Adaptive preprocessing and
a quality check mechanism will optimize recognition results and prompt users to
re-upload low-quality images. Overall, these proposed enhancements will transform
the “Handwritten Letter Recognition” into a versatile and powerful application ca-
pable of accurately recognizing and converting various handwritten letters,
catering to a wider range of practical use cases and delivering an enhanced user ex-
perience.

Advantages
• By utilizing a larger and more diverse dataset for training the deep learning
model, the proposed system enhances recognition accuracy, ensuring reliable
and precise letter extraction from handwritten images.
• The proposed system achieves faster processing times and reduced resource con-
sumption, enhancing overall performance.

• The proposed system’s implementation of comprehensive error handling and
validation mechanisms ensures that it gracefully handles unexpected scenarios
and provides users with informative feedback, enhancing the overall user expe-
rience.

3.3 Feasibility Study

The feasibility of the project is analyzed in this phase, and a business proposal is
put forth with a very general plan for the project and some cost estimates. During
system analysis the feasibility study of the proposed system is to be carried out. This
is to ensure that the proposed system is not a burden to the company. For feasibility
analysis, some understanding of the major requirements for the system is essential.

3.3.1 Economic Feasibility

This study is carried out to check the economic impact that the system will have
on the organization. The amount of funds that the company can pour into the research
and development of the system is limited. The expenditures must be justified. Thus
the developed system is well within the budget, and this was achieved because most
of the technologies used are freely available. Only the customized products had to
be purchased.

3.3.2 Technical Feasibility

This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand
on the available technical resources, as this would lead to high demands being
placed on the client. The developed system must have modest requirements, as only
minimal or no changes are required for implementing this system.

3.3.3 Social Feasibility

This aspect of the study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The user
must not feel threatened by the system, instead must accept it as a necessity. The

level of acceptance by the users solely depends on the methods that are employed
to educate the user about the system and to make him familiar with it. His level of
confidence must be raised so that he is also able to make some constructive criticism,
which is welcomed, as he is the final user of the system.

3.4 System Specification

This project requires certain hardware and software to run on a computer or laptop
PC. The hardware and software requirements listed below must be met, along with
the user's toolkits.

3.4.1 Hardware Specification

• Processor: Intel(R) Core(TM) i3 or above, 2.00 GHz
• Speed: 1.1 GHz
• RAM: 8 GB
• Storage: 500 GB
• Internet connectivity: Yes (broadband or Wi-Fi)

3.4.2 Software Specification

• Operating System: Windows 7/8/10
• Python: Python 2.7 or above
• OpenCV: OpenCV 3.2.0 or above
• NumPy: compatible with Python 2.7/3.5
• TensorFlow: TensorFlow 2.1.6 or above
• PIL: PIL 1.1.5 or above

3.4.3 Standards and Policies

Anaconda Prompt
The Anaconda prompt is a command line interface that deals explicitly with ML
(machine learning) modules, and the Anaconda Navigator is available on Windows,
Linux and macOS. The Anaconda distribution bundles a number of IDEs which make
coding easier. The UI can also be implemented in Python.
Standard Used: ISO/IEC 27001
Jupyter Notebook
Jupyter Notebook is an open-source web application that allows us to create and share
documents containing live code, equations, visualizations and narrative text. It
can be used for data cleaning and transformation, numerical simulation, statistical
modeling, data visualization, and machine learning.
Standard Used: ISO/IEC 27001

Chapter 4

METHODOLOGY

4.1 General Architecture

Figure 4.1: Architecture Diagram

The Figure 4.1 describes the architecture diagram of letter recognition using
CNN. It shows the steps involved in recognizing the given input letter using
CNN.

Pre-Processing: The role of the pre-processing step is to perform various tasks
on the input image. It essentially enhances the image, making it suitable for
segmentation. The fundamental motivation behind pre-processing is to separate the
pattern of interest from the background. For the most part, noise filtering, smoothing
and normalization are done in this stage. Pre-processing additionally produces a
more compact representation of the pattern. Binarization converts a grayscale
image into a binary image.
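
A minimal sketch of this pre-processing pipeline with OpenCV is shown below; the file name, target size, kernel size and thresholding scheme are illustrative assumptions rather than the report's exact settings:

import cv2

# Illustrative pre-processing: resize, grayscale, Gaussian blur, binarization
img = cv2.imread("letter.png")                # assumed input image path
img = cv2.resize(img, (400, 400))             # resizing
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # grayscale conversion
blur = cv2.GaussianBlur(gray, (5, 5), 0)      # smoothing / noise filtering
_, binary = cv2.threshold(blur, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)  # binarization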

Segmentation: Once the pre-processing of the input images is completed, sub-images
of individual letters are formed from the sequence of images. Pre-processed
letter images are segmented into sub-images of individual letters, and a number is
assigned to each letter. Each individual letter is resized to a fixed pixel size. In this
step an edge detection technique is used for segmentation of the dataset images.
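
One common way to realize this step, sketched here as a contour-based variant of the edge detection idea (assuming the binarized image from the pre-processing sketch), is to detect the boundary of each letter and cut out a fixed-size sub-image:

import cv2

# Find the outer contour of each letter and resize each crop to 28 x 28
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
letters = []
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    roi = binary[y:y + h, x:x + w]      # one sub-image per letter
    letters.append(cv2.resize(roi, (28, 28)))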

Convolutional Layer: This layer is the first layer that is used to extract the various
features from the input images. In this layer, the mathematical operation of convolu-
tion is performed between the input image and a filter of a particular size MxM. By
sliding the filter over the input image, the dot product is taken between the filter and
the parts of the input image with respect to the size of the filter (MxM).

Pooling Layer: In most cases, a Convolutional Layer is followed by a Pooling


Layer. The primary aim of this layer is to decrease the size of the convolved fea-
ture map to reduce the computational costs. This is performed by decreasing the
connections between layers and independently operates on each feature map. De-
pending upon method used, there are several types of Pooling operations. In Max
Pooling, the largest element is taken from feature map. Average Pooling calculates
the average of the elements in a predefined sized Image section. The total sum of the
elements in the predefined section is computed in Sum Pooling. The Pooling Layer
usually serves as a bridge between the Convolutional Layer and the FC Layer.
Fully Connected Layer: The Fully Connected (FC) layer consists of the weights and
biases along with the neurons and is used to connect the neurons between two differ-
ent layers. These layers are usually placed before the output layer and form the last
few layers of a CNN Architecture.
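
As a tiny numeric illustration of the Max Pooling operation described above, the following sketch applies a 2x2 pooling window to a 4x4 feature map, halving each spatial dimension:

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 5, 7],
                 [1, 1, 3, 2]])

# Group the map into 2x2 blocks and take the maximum of each block
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 2]
               #  [2 7]]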

Activation Functions: An activation function in a neural network defines how


the weighted sum of the input is transformed into an output from a node or nodes
in a layer of the network. Sometimes the activation function is called a “transfer
function.” If the output range of the activation function is limited, then it may be
called a “squashing function.” Many activation functions are nonlinear and may be
referred to as the “nonlinearity” in the layer or the network design. The choice
of activation function has a large impact on the capability and performance of the
neural network, and different activation functions may be used in different parts of
the model.
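
The small sketch below illustrates two activation functions commonly used in this kind of CNN, ReLU in the hidden layers and softmax at the output; the input values are arbitrary:

import numpy as np

def relu(z):
    # Rectified Linear Unit: negative inputs become 0
    return np.maximum(0, z)

def softmax(z):
    # "Squashing" function: turns scores into probabilities summing to 1
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([-1.0, 0.5, 2.0])
print(relu(z))     # [0.  0.5 2. ]
print(softmax(z))  # approximately [0.04 0.18 0.79]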

4.2 Design Phase

4.2.1 Data Flow Diagram

Figure 4.2: Data Flow Diagram

The Figure 4.2 describes the data flow diagram of handwritten letter recognition.
Dataset: The dataset serves as the foundation of the project, containing handwrit-
ten letters in various forms, styles, and contexts. These letters serve as the input data
for the recognition system.
Data Preprocessing: In this stage, the dataset undergoes several preprocessing
steps to ensure its compatibility and suitability for the deep learning model. Pre-
processing steps may include resizing images to a standard size, converting images
to grayscale, normalization to adjust pixel values, and augmentation to increase the
diversity of the dataset. These steps aim to enhance the model’s ability to learn and
generalize from the data.
CNN Algorithm: The preprocessed data is fed into a CNN, a type of deep learning
algorithm specifically designed for analyzing visual data such as images. The CNN
consists of multiple layers, including convolutional layers, pooling layers, and fully
connected layers. These layers work together to extract relevant features from the
input images and learn patterns that distinguish different handwritten letters.

Feature Extraction: Within the CNN, feature extraction occurs as the convolu-
tional layers analyze the input images, detecting edges, shapes, and other visual
patterns that are characteristic of handwritten letters. These features are then passed
to subsequent layers for further processing.
Model Building: Using the extracted features, a neural network model is con-
structed. This may involve designing and configuring the architecture of the neural
network, including the number and configuration of layers, activation functions, and
other parameters. Additionally, techniques such as transfer learning may be em-
ployed, where a pre-trained neural network is adapted and fine-tuned for the specific
task of handwritten letter recognition.
Test Images: Separate images that were not used during training are input into
the trained model for testing. These test images serve as a benchmark to evaluate
the performance and accuracy of the model in recognizing handwritten letters. It’s
crucial to use unseen data for testing to assess how well the model generalizes to
new, unseen examples.
Predict Output: The trained model predicts outputs or labels for the test images
based on the patterns and features learned during training. The output may include
the recognized letter corresponding to each input image, along with a confidence
score indicating the model’s certainty in its prediction. This output can be further
analyzed and used for various applications, such as optical character recognition
(OCR), document analysis, and text processing.
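
A minimal sketch of this prediction step, assuming a saved Keras model (the file name here is hypothetical) and preprocessed 28 x 28 test images as described above:

import numpy as np
from tensorflow import keras

model = keras.models.load_model("letters_cnn.h5")  # assumed saved model file
probs = model.predict(x_test[:1])                  # x_test from preprocessing
letter = chr(ord('A') + int(np.argmax(probs)))     # class index -> letter
confidence = float(np.max(probs))                  # model's certainty
print(letter, confidence)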

4.2.2 Use Case Diagram

Figure 4.3: Use Case Diagram

The Figure 4.3 describes the use case diagram. A use case diagram at its
simplest is a representation of a user's interaction with the system that shows the
relationship between the user and the different use cases in which the user is involved.
The actor can be a human or another external system that provides the input letter. The
input given by the actor is sent to the letter recognition system, which contains various
use cases. The user can upload the image of the letter he wants to detect. The given input
images are pre-processed; pre-processing additionally produces a more compact
representation of the pattern, and binarization converts a grayscale image into a binary
image. Once the pre-processing of the input images is completed, sub-images of
individual letters are formed from the sequence of images. Pre-processed letter images
are segmented into sub-images of individual letters. The given input passes through
all the use cases and finally the letter is recognised.

4.2.3 Class Diagram

Figure 4.4: Class Diagram

The Figure 4.4 describes the class diagram of the model's class structure and contents
using design elements such as classes (handwritten frame, image frame, main screen),
packages and objects. A class diagram describes three perspectives when designing
a system: Conceptual, Specification, and Implementation. Classes are composed of three
things: a name, attributes and operations. The handwritten frame class contains
operations like trainAction() and recogniseAction(), and the mainframe class contains the
operations to load and recognise data. Class diagrams also display relations such as
containment, inheritance and associations. The association relationship is the most
common relationship in a class diagram; an association shows the relationship between
instances of classes. The purpose of a class diagram is to model the static view of an
application. Class diagrams are the only diagrams which can be directly mapped
to object-oriented languages and are thus widely used at the time of construction. The
data from the image class is loaded using the loadframe operation present in the mainframe
class, and using the action methods present in the main class the letter is recognized.

4.2.4 Sequence Diagram

Figure 4.5: Sequence Diagram

The Figure 4.5 represents the sequence diagram, a graphical view of a scenario
that shows object interaction in a time-based sequence: what happens first, what
happens next. Sequence diagrams establish the roles of objects and help provide essential
information to determine class responsibilities and interfaces. This type of diagram
is best used during the early analysis phase of design because it is simple and easy
to comprehend. Sequence diagrams are normally associated with use cases.

4.2.5 Activity Diagram

Figure 4.6: Activity Diagram

The Figure 4.6 describes how activities are coordinated to provide a service which
can be at different levels of abstraction. Typically, an event needs to be achieved by
some operations, particularly where the operation is intended to achieve a number
of different things that require coordination, or how the events in a single use case
relate to one another, in particular, use cases where activities may overlap and require
coordination. It is also suitable for modeling how a collection of use cases coordinate
to represent business workflows.

4.3 Algorithm & Pseudo Code

4.3.1 Algorithm

Step 1: Choose a Dataset
Choose a dataset of your interest, or create your own image dataset for solving your
own image classification problem. An easy place to find a dataset is kaggle.com.
This project uses the A to Z handwritten alphabets dataset, which contains 28 x 28
grayscale images of uppercase letters with accompanying class labels (CSV), grouped
into 26 classes (A to Z). All the libraries that we require are imported at this stage.

Step 2: Prepare Dataset for Training
Preparing our dataset for training involves assigning paths, creating categories
(labels), and resizing our images to a fixed size.

Step 3: Create Training Data
The training data is an array that contains the image pixel values and the index at
which each image's category appears in the CATEGORIES list.

Step 4: Shuffle the Dataset

Step 5: Assign Labels and Features
The shape of both of these lists will be used in classification using the neural network.

Step 6: Normalise X and convert the labels to categorical data

Step 7: Split X and Y for use in the CNN

Step 8: Define, compile and train the CNN model

Step 9: Compute the accuracy and score of the model


4.3.2 Pseudo Code

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier

# Load the dataset as a matrix: labels in column 0, pixel values after
dataset = pd.read_csv("train.csv").values
clf = DecisionTreeClassifier()

# Training data
train = dataset[0:21000, 1:]
train_label = dataset[0:21000, 0]
clf.fit(train, train_label)

# Testing data
testing = dataset[21000:, 1:]
actual_label = dataset[21000:, 0]

# Display one test sample and the predicted letter
d = testing[0].reshape(28, 28)
plt.imshow(d, cmap=plt.cm.gray)
print(clf.predict([testing[0]]))
plt.title("Sample Letter Recognized")
plt.show()

4.4 Module Description

4.4.1 Import libraries and Dataset

At the beginning of the project, we import all the modules needed for training our model.
We can easily import the dataset and start working on it because the Keras library
already contains many datasets, and MNIST is one of them. We call the mnist.load_data()
function to get the training data with its labels and also the testing data with its
labels.
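
As a minimal sketch of this step (assuming the TensorFlow-bundled Keras), the imports and dataset loading can be written as:

import numpy as np
from tensorflow import keras

# Load training and testing data with their labels
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)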

4.4.2 The Data Preprocessing

The model cannot take the image data directly, so we need to perform some basic
operations and process the data to make it ready for our neural network. The dimension
of the training data is (60000, 28, 28). One more dimension is needed for the CNN
model, so we reshape the matrix to shape (60000, 28, 28, 1). The role of the
preprocessing step is to perform various tasks on the input image; it essentially
enhances the image, making it suitable for segmentation. The fundamental motivation
behind pre-processing is to separate the pattern of interest from the background.
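
Continuing the sketch above, the reshape described here, together with the usual pixel scaling and one-hot encoding of the labels, can be written as:

from tensorflow import keras

# Add the channel dimension the CNN expects and scale pixels to [0, 1]
x_train = x_train.reshape(60000, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(10000, 28, 28, 1).astype("float32") / 255.0

# One-hot encode the labels for categorical cross-entropy
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)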

4.4.3 Create the Model

It is now time to create the CNN model for this Python-based data science
project. Convolutional layers and pooling layers are the two core building blocks of a CNN
model. The reason behind the success of CNNs for image classification problems is
their suitability for grid-structured data. We will use the Adadelta optimizer for the
model compilation.
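
One possible layout of such a model is sketched below; the exact layer sizes are illustrative assumptions, and the Adadelta optimizer named in this section is used for compilation:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),          # pooling halves the feature map
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 classes for MNIST digits
])
model.compile(optimizer="adadelta",
              loss="categorical_crossentropy",
              metrics=["accuracy"])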

4.4.4 Train The Model

To start the training of the model we can simply call the model.fit() function
of Keras. It takes the training data, validation data, epochs, and batch size as
parameters. The training of the model takes some time. After successful model training,
we can save the weights and model definition in the ‘mnist.h5’ file. Optimizers
are algorithms or methods used to change the attributes of the neural network, such
as the weights and learning rate, in order to reduce the losses. There are many types of
optimizers, and each optimizer has its own method of dealing with the weights
and bias inputs. In this project the back-propagation algorithm is paired with the
Adam optimizer instead of the plain gradient descent optimizer. At first gradient descent
was chosen, but the Adam (stochastic gradient descent based) optimizer performs
better here, as the inputs from the various pixels vary considerably.

4.5 Steps to execute/run/implement the project

4.5.1 Import required Libraries

• At the beginning of the project, all the modules needed for training the model are imported.
• One can easily import the dataset and start working on that because the Keras
library already contains many datasets and MNIST is one of them.
• The mnist.load_data() function is called to get the training data with its labels and
also the testing data with its labels.

4.5.2 Training Model

• After completing data preprocessing, the CNN model is created which consists
of various convolutional and pooling layers alongside a 3x3 sized kernel.
• The model is then trained on the training and validation data with the help of
several Python libraries, such as TensorFlow, Pillow, OpenCV, Tkinter and NumPy,
that were preloaded to perform these specific tasks.

Figure 4.7: Training Data

4.5.3 Testing Accuracy

• After the model is trained using the training dataset, the testing dataset is used
to evaluate how well the model works.
• A particular part of the overall OCR dataset is used as the testing dataset, on the
basis of which the accuracy is computed for the proposed model.
• By using the CNN algorithm one can get accuracy of up to 99%.

Figure 4.8: Training Accuracy

The Figure 4.8 shows the training accuracy obtained on the training dataset, while
the testing dataset is used to evaluate how well the model works, and illustrates how
the model is trained using the Sequential model.

Chapter 5

IMPLEMENTATION AND TESTING

5.1 Input and Output

5.1.1 Input Design

Figure 5.1: Sample Input of Letter M

The Figure 5.1 shows the home page of the letter recognition system, in which one
can give an input letter to recognize. The user can upload the image of the letter he
wants to detect. After providing the letter, one should click the Predict button to get the
result.
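
A hedged sketch of this upload-and-predict flow with Tkinter is given below; the widget layout and the predict callback body are illustrative assumptions, not the application's exact code:

import tkinter as tk
from tkinter import filedialog

def upload_image():
    # Let the user pick an image of a handwritten letter
    path = filedialog.askopenfilename(title="Select a letter image")
    if path:
        result_label.config(text="Selected: " + path)

def predict():
    # The real application would preprocess the image and run the CNN here
    result_label.config(text="Predicted letter would be shown here")

root = tk.Tk()
root.title("Handwritten Letter Recognition")
tk.Button(root, text="Upload Image", command=upload_image).pack()
tk.Button(root, text="Predict", command=predict).pack()
result_label = tk.Label(root, text="")
result_label.pack()
root.mainloop()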

5.1.2 Output Design

Figure 5.2: Sample Output of Letter M

The Figure 5.2 describes how the given input image is pre-processed. After
pre-processing, the image is sent through the CNN's convolutional filters, which turn the
three-dimensional image into feature maps that are eventually flattened into a
one-dimensional vector; in the pooling layer, using max pooling, the size of the data is
reduced to nearly half of its original size. The convolved representation is compared
with the patterns learned from the training images, and based on the resulting scores
the letter is recognised.

5.2 Testing

Testing is defined as an activity to check whether the actual results match the
expected results and to ensure that the software system is defect free. It involves the
execution of a software component or system component to evaluate one or more
properties of interest. Software testing also helps to identify errors, gaps, or missing
requirements in contrary to the actual requirements.

5.3 Types of Testing

5.3.1 Unit Testing

Input

Figure 5.3: Unit Testing Input

The Figure 5.3 represents the unit testing input. Unit testing involves testing individual
components or modules of the system to ensure they are working correctly. In this project,
unit testing could involve testing individual components such as the alphabet classes,
data collection, and the deep learning model to ensure they are working as expected.

Test Result

Figure 5.4: Unit Testing Output

5.3.2 Integration Testing

Integration tests are designed to test integrated software components to determine


if they actually run as one program. Testing is event driven and is more concerned
with the basic outcome of screens or fields. Integration tests demonstrate that al-
though the components were individually satisfactory, as shown by successful unit
testing, the combination of components is correct and consistent. Integration testing
is specifically aimed at exposing the problems that arise from the combination of
components.

Test Result

Figure 5.5: Integration Testing Output

5.3.3 System Testing

System testing ensures that the entire integrated software system meets require-
ments. It tests a configuration to ensure known and predictable results. An example
of system testing is the configuration oriented system integration test. System testing
is based on process descriptions and flows, emphasizing pre-driven process links and
integration points.

5.3.4 Test Result

Figure 5.6: System Testing Result

The Figure 5.6 describes the system testing result: the model trained on the training
dataset is evaluated on the testing dataset to check how well it works as a whole using
the Sequential model, and the figure also shows the different packages used in this model.

Chapter 6

RESULTS AND DISCUSSIONS

6.1 Efficiency of the Proposed System

The key rationale behind Optical Character Recognition (OCR) from images involves
a feature extraction technique supported by a classification algorithm for recognition
of characters based on those features. Previously, several algorithms for feature
classification and extraction were utilized for the purpose of character recognition.
But with the advent of CNNs in deep learning, no separate algorithms are required
for this purpose. In the area of computer vision, deep learning is one of
the outstanding performers for both feature extraction and classification. However,
a CNN architecture consists of many nonlinear hidden layers with an enormous number
of connections and parameters, so training the network with a very small number of
samples is a difficult task. In a CNN, only a small set of parameters is needed for
training the system. So, CNN is the key solution capable of correctly mapping datasets
from input to output by varying the trainable parameters and the number of hidden
layers, with high accuracy.
The new system we’re proposing for Optical Character Recognition (OCR) is a
big step forward. Instead of using lots of different algorithms to figure out what
characters are in an image, we’re using Convolutional Neural Networks (CNNs).
These networks are like smart filters that can learn to recognize patterns in images
on their own. It's similar to how our brains see things: we don't need someone to
explain every detail, we just know what we're looking at. With CNNs, the computer
can do the same thing. CNNs can extract local features and learn complex
representations from input images. Despite their complexity, CNNs require fewer
trainable parameters compared to traditional methods, making them more efficient
and scalable. Overall, the proposed system not only improves accuracy in OCR but
also enhances efficiency by simplifying feature extraction and classification.

6.2 Comparison of Existing and Proposed System

The existing system relies on conventional Optical Character Recognition (OCR),
whereas the proposed system relies on the latest deep learning techniques coupled
with image pre-processing mechanisms to recognize handwritten letters with high
accuracy. Additionally, the proposed system provides a user-friendly interface and
real-time recognition features that make it very convenient for users to convert
their handwritten notes into individual letters. With OCR, the text is typically
extracted as words, text lines, and paragraphs or text blocks, enabling access to a
letter-level version of the scanned text. This eliminates or significantly reduces
the need for manual data entry. Both Read versions available today in Computer
Vision support several languages for printed and handwritten text.
Model                    Accuracy   Proposed Model   Features
Convolutional Neural     95%        Yes              Excellent at capturing spatial
Network (CNN)                                        hierarchies; commonly used for
                                                     image-based data.
Recurrent Neural         90%        No               Good for sequence prediction, but
Network (RNN)                                        less effective for spatial data.
Long Short-Term          92%        No               Handles long dependencies in data
Memory (LSTM)                                        sequences; better suited for
                                                     sequential text.

Table 6.1: Comparison of Existing and Proposed System

6.3 Sample Code

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from imblearn.under_sampling import NearMiss
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import keras
from keras.callbacks import ReduceLROnPlateau
from keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout, BatchNormalization
import tensorflow as tf
from adabelief_tf import AdaBeliefOptimizer
import warnings
warnings.filterwarnings('ignore')

# Load the A-Z handwritten letters dataset; column '0' holds the class label
df = pd.read_csv('A_Z Handwritten Data.csv')
y = df['0']
del df['0']
x = y.replace([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25],
              ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
               'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'])

# Balance the classes by under-sampling
nM = NearMiss()
X_data, y_data = nM.fit_resample(df, y)

# One-hot encode the class labels
y = to_categorical(y_data)
num_classes = y.shape[1]

X_data = np.array(X_data)
X_data = X_data.reshape(-1, 28, 28, 1)
X_train, X_test, y_train, y_test = train_test_split(X_data, y, test_size=0.2, random_state=102)

X_train.shape, X_test.shape, y_train.shape, y_test.shape
# ((23296, 28, 28, 1), (5824, 28, 28, 1), (23296, 26), (5824, 26))

model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

label_smoothing = 1e-2
learning_rate = 1e-4
model.compile(optimizer=AdaBeliefOptimizer(learning_rate=learning_rate),
              loss=tf.keras.losses.BinaryCrossentropy(label_smoothing=label_smoothing),
              metrics=[tf.keras.metrics.AUC(name="AUC")])
history = model.fit(X_train, y_train, epochs=10, batch_size=128, validation_data=(X_test, y_test))

Output

Figure 6.1: Output of Training Data

Figure 6.1 shows the training output of the sequential model: the loss and AUC reported for each epoch on both the training and validation sets, which is used to evaluate how well the model is trained.
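
Beyond the training curves, the classification_report and confusion_matrix utilities imported in the sample code can give a per-letter breakdown on the held-out split. A minimal sketch, assuming the trained model and the X_test/y_test split from the code above are in scope:

import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Collapse the predicted probabilities and the one-hot targets to
# integer class indices before scoring.
y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)

# Per-letter precision, recall and F1, plus the 26x26 confusion matrix.
print(classification_report(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))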

Chapter 7

CONCLUSION AND FUTURE ENHANCEMENTS

7.1 Conclusion

The CNN performed strongly on the letter recognition task. The proposed method obtained 98% accuracy, is able to identify real-world images, and its loss in both training and evaluation stays below 0.1, which is negligible. The main remaining challenge is the noise present in real-world images, which still needs to be handled. The learning rate of the model depends strongly on the number of dense neurons and on the cross-validation measure. Letter recognition using a CNN with Rectified Linear Unit (ReLU) activation is implemented, and the proposed CNN framework is equipped with suitable parameters for high-accuracy OCR letter classification. The time needed to train the system is also considered. For further verification of accuracy, the system was additionally checked by changing the number of CNN layers; it is worth mentioning that the final CNN architecture consists of two convolutional layers. The experimental results demonstrate that the proposed CNN framework exhibits high performance on the OCR dataset in terms of time and accuracy compared to previously proposed systems. Consequently, letters are recognized with high accuracy (99.21%).
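
As a sketch of the layer-count check mentioned above, one can rebuild the network with a varying number of convolution blocks and compare validation accuracy on the same split; the build_cnn helper and the exact layer configurations below are illustrative assumptions, and X_train/X_test are assumed to come from the split in Section 6.3.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_cnn(num_conv_blocks, num_classes=26):
    # Hypothetical helper: stack num_conv_blocks conv+pool blocks
    # before the dense classifier head.
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    for _ in range(num_conv_blocks - 1):
        model.add(Conv2D(64, (3, 3), activation='relu'))
        model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Compare one- and two-block variants on the same split.
for blocks in (1, 2):
    cnn = build_cnn(blocks)
    hist = cnn.fit(X_train, y_train, epochs=10, batch_size=128,
                   validation_data=(X_test, y_test), verbose=0)
    print(blocks, "conv block(s):", max(hist.history['val_accuracy']))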

7.2 Future Enhancements

For future enhancements, the “Handwritten Letter Recognition” project can explore several avenues to further improve its capabilities. One potential direction involves extending the model’s recognition capabilities to cursive handwriting and other distinctive writing styles, broadening its applicability. Incorporating real-time learning mechanisms can improve adaptability to varying user inputs and handwriting styles that evolve over time. Integration with emerging deep learning architectures or transfer learning techniques could further boost the model’s performance (a minimal sketch follows below). Enhancements in multi-language support and the ability to recognize special characters would increase the project’s utility across diverse linguistic contexts. Collaborative efforts to create a continually expanding and diverse dataset could support ongoing model training, ensuring that the system remains current and effective. Moreover, exploring edge computing and optimizing the application for mobile platforms could improve accessibility and convenience. Continuous engagement with user feedback and a collaborative, open-source approach can foster a community-driven development process, allowing “Handwritten Letter Recognition” to evolve and stay at the forefront of advances in CNNs.
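
As a minimal sketch of the transfer-learning direction, assuming a MobileNetV2 backbone, 28x28 grayscale inputs scaled to [0, 1], and a resize to the backbone's 32x32 RGB minimum (all illustrative choices, not part of the proposed system):

import tensorflow as tf
from tensorflow.keras import layers, models

# Frozen ImageNet-pretrained backbone; only the new head is trained.
base = tf.keras.applications.MobileNetV2(input_shape=(32, 32, 3),
                                         include_top=False,
                                         weights='imagenet')
base.trainable = False

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Resizing(32, 32),                    # meet the backbone's minimum size
    layers.Lambda(tf.image.grayscale_to_rgb),   # replicate channel to RGB
    layers.Rescaling(scale=2.0, offset=-1.0),   # map [0, 1] pixels to [-1, 1]
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(26, activation='softmax'),     # new head for the 26 letters
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])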

Chapter 8

PLAGIARISM REPORT

Figure 8.1: Plagiarism Report

Chapter 9

SOURCE CODE & POSTER PRESENTATION

9.1 Source Code

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the MNIST dataset
mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize the pixel values to the range [0, 1]
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Add a channel dimension to the images
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)

# Convert the labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Define the CNN model architecture
model = keras.Sequential(
    [
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ]
)

# Compile the model
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Train the model
model.fit(x_train, y_train, batch_size=128, epochs=10, validation_split=0.1)

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)

# Save the trained model for later use
model.save("my_model.h5")
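
As a usage sketch, the model saved above can be reloaded later for inference; the snippet below (assuming x_test from the script above is still in scope) classifies a single test image:

import numpy as np
from tensorflow import keras

# Reload the trained network from disk and run it on one image.
loaded = keras.models.load_model("my_model.h5")
sample = x_test[:1]                      # batch of one 28x28x1 image
probs = loaded.predict(sample)
print("Predicted digit:", np.argmax(probs, axis=1)[0])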

9.2 Poster Presentation

Figure 9.1: Poster Presentation
