
Institute of Engineering and Management, Kolkata Artificial Intelligence Project (CS793C) On Handwriting Analysis

The document discusses developing a handwriting recognition tool using machine learning algorithms. It analyzes handwritten text samples with machine learning and provides an interface for users to provide feedback to improve recognition. The tool takes images as input and extracts characters using feature extraction and classification algorithms. It was tested on handwritten digit datasets and achieved over 97% accuracy using support vector machines. Experimental results showed SVMs performed better than other models like hidden Markov models for handwriting recognition.

Uploaded by

aslan


INSTITUTE OF ENGINEERING AND MANAGEMENT, KOLKATA

Artificial Intelligence Project (CS793C)
On
HANDWRITING ANALYSIS

SUBMITTED BY:
(CSE 4th Year, Section C)

PRANJAL CHOWDHURY (UNI. ROLL.: 10400116145)

PARTHO PROTIM SARKAR (UNI. ROLL.: 10400116150)

ABSTRACT

The aim of this work is to review existing methods for handwritten character recognition
using machine learning algorithms. The application provides handwriting recognition from touch
input, from live camera frames, or from a picture file; it can also learn new characters and learn
interactively based on the user's feedback.

Handwriting recognition has been one of the most fascinating and challenging research areas in the
field of image processing and pattern recognition in recent years. It contributes immensely to the
advancement of automation and improves the interface between man and machine in
numerous applications. The development of handwriting recognition systems began in the 1950s,
when human operators had the job of converting data from various documents into
electronic format, which made the process slow and error-prone.

Automatic text recognition aims to limit these errors by using image preprocessing techniques
that bring increased speed and precision to the entire recognition process. Here, we develop a
tool that takes an image as input and extracts characters such as letters, digits, and symbols from
it. The image can be of a handwritten or a printed document, so the tool can be used as a form of data
entry from printed records. The implementation of such a tool depends on two factors: the feature
extraction method and the classification algorithm.

This work discusses a method for analysing real-world handwritten text samples with the aid
of technology. The project is based on machine learning: a large data set is provided as input
to the software tool, which learns to recognize the samples and extracts the patterns they have in
common.

BACKGROUND
Handwritten character recognition is a field of research in artificial intelligence, computer vision,
and pattern recognition. A computer performing handwriting recognition is said to be able to
acquire and detect characters in paper documents, pictures, touch-screen devices and other
sources and convert them into machine-encoded form. Its application is found in optical
character recognition and more advanced intelligent character recognition systems. Most of these
systems nowadays implement machine learning mechanisms such as neural networks. Machine
learning is a branch of artificial intelligence inspired by psychology and biology that deals with
learning from a set of data and can be applied to solve a wide spectrum of problems. A supervised
machine learning model is given instances of data specific to a problem domain and an answer
that solves the problem for each instance. When learning is complete, the model can provide
answers not only for the data it has learned on, but also for yet unseen data, with high precision.

Handwritten character recognition can be thought of as a subset of the image recognition problem.

Figure: The general flow of an image recognition algorithm.

Basically, the algorithm takes an image (here, an image of a handwritten digit) as input and outputs the
likelihood that the image belongs to each of the classes (the machine-encoded digits, 0–9).
We will use Support Vector Machines (SVMs) to solve the problem.

We use the accuracy score to quantify the performance of our model. The accuracy
tells us what percentage of the test data was classified correctly. Accuracy is a good metric
choice because it makes our model's performance easy to compare with the benchmark, which
uses the same metric. Also, our dataset is balanced (a roughly equal number of examples for each
label), which makes accuracy appropriate for this problem.
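
As a minimal sketch of the metric described above, the following computes accuracy as the fraction of matching labels and counts the examples per class to check balance. The helper names `accuracy` and `class_counts` and the toy labels are illustrative assumptions, not part of the project code.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(y_true == y_pred))

def class_counts(labels):
    """Number of examples per label, to check how balanced a dataset is."""
    values, counts = np.unique(labels, return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}

# Toy labels, made up for illustration only
y_true = [0, 1, 2, 2, 1, 0, 3, 3]
y_pred = [0, 1, 2, 1, 1, 0, 3, 2]
print(accuracy(y_true, y_pred))   # 6 of 8 labels match
print(class_counts(y_true))       # two examples per class
```

When the per-class counts are far from equal, accuracy can look high even if a rare class is never predicted correctly, which is why the balance check matters.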
Use of Database: For pattern recognition applications, data patterns are one of the most
necessary requirements. If the data patterns for a particular recognition application are not
available, then the first task in implementing the recognition system is to collect
them. Data collection is one of the most tedious tasks in most pattern recognition
applications. The handwritten documents are collected and stored ..

The present version of the character image database consists of binary isolated character images
extracted from the collected handwritten data sheets using a character segmentation algorithm.

The created character image database is available on request and is released in the form of
comma-separated values (CSV) files. Three CSV files are available, representing the training, validation, and
testing images. Each row in the CSV files represents one character image.
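
The report does not specify the column layout of these CSV files. The sketch below therefore assumes a hypothetical layout in which the first column is the class label and the remaining columns are the flattened binary pixels of one square image. The function name `load_character_csv`, the 4x4 image size, and the sample rows are all invented for illustration.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical layout: column 0 is the class label, the remaining columns are
# the flattened binary pixels of one character image (here 4x4 = 16 values).
SAMPLE_CSV = """\
3,0,1,1,0,0,0,0,1,0,1,1,0,0,0,0,1
7,1,1,1,1,0,0,0,1,0,0,1,0,0,1,0,0
"""

def load_character_csv(source, side):
    """Read one image per row and return (images, labels)."""
    table = pd.read_csv(source, header=None)
    labels = table.iloc[:, 0].to_numpy()
    pixels = table.iloc[:, 1:].to_numpy(dtype=np.uint8)
    if pixels.shape[1] != side * side:
        raise ValueError("row length does not match the assumed image size")
    return pixels.reshape(-1, side, side), labels

images, labels = load_character_csv(io.StringIO(SAMPLE_CSV), side=4)
print(images.shape, labels)
```

With real files, `source` would be a file path for each of the training, validation, and testing CSVs, and `side` would be the actual image width.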

SOURCE CODE:
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import datasets
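In [2]:
# Load scikit-learn's bundled 8x8 handwritten digits dataset
digits = datasets.load_digits()
x = digits.data     # one flattened 64-value image per row
y = digits.target   # digit labels 0-9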
In [3]:
x[1]
Out[3]:
array([ 0., 0., 0., 12., 13., 5., 0., 0., 0., 0., 0., 11., 16.,
9., 0., 0., 0., 0., 3., 15., 16., 6., 0., 0., 0., 7.,
15., 16., 16., 2., 0., 0., 0., 0., 1., 16., 16., 3., 0.,
0., 0., 0., 1., 16., 16., 6., 0., 0., 0., 0., 1., 16.,
16., 6., 0., 0., 0., 0., 0., 11., 16., 10., 0., 0.])
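In [4]:
# Hold out 25% of the samples for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)
In [5]:
from sklearn import svm
# Support vector classifier with a polynomial kernel
clf = svm.SVC(kernel="poly", C=1, gamma=0.1)
clf.fit(x_train, y_train)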
Out[5]:
SVC(C=1, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma=0.1, kernel='poly',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
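In [6]:
# Predict labels for the held-out test images
pred = clf.predict(x_test)
In [8]:
from sklearn.metrics import accuracy_score
In [9]:
# accuracy_score expects (y_true, y_pred); the value is the same either way
accuracy_score(y_test, pred)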
Out[9]:
0.9888888888888889
In [10]:
clf.predict(digits.data[[100]])
Out[10]:
array([4])
In [11]:
clf.predict(digits.data[[50]])
Out[11]:
array([2])
In [12]:
clf.predict(digits.data[[500]])
Out[12]:
array([8])
In [13]:
plt.imshow(digits.images[100])
plt.show()

In [14]:
plt.imshow(digits.images[50])
plt.show()

In [15]:
plt.imshow(digits.images[500])
plt.show()


Experimental Results:
Test application analysis:
The test application accompanying the source code can perform the recognition of handwritten
digits. To do so, open the application (preferably outside Visual Studio, for better performance).
Click the File menu and select Open. This will load some entries from the Optdigits dataset
into the application. To perform the analysis, click the Run Analysis button. Please be aware that
it may take some time. After the analysis is complete, the other tabs in the sample application
will be populated with the analysis results. Experiments were
performed on different samples of numerals in mixed scripts, using a single
hidden layer.

Data set   Training   Testing    Validation   Training       Test           Validation
           set size   set size   set size     accuracy (%)   accuracy (%)   accuracy (%)
Digit      1778       6270       5430         96             97             96

Table: Detailed recognition performance of SVM

It is observed that the recognition rate using SVM is higher than that of the other model, i.e. the Hidden Markov
Model (HMM). However, free parameter storage for the SVM model is significantly higher. The memory
space required for SVM is the number of support vectors multiplied by the number of feature
values. This is significantly larger than for HMM, which only needs to store the weights. HMM
needs less space due to its weight-sharing scheme. However, in SVM, space can be saved
by storing only the original online signals and the pen-up/pen-down status in a compact
manner. During recognition, the model is expanded dynamically as required. SVM clearly
outperforms HMM in all three isolated character cases. The result for the isolated character cases above
indicates that the recognition rate of the hybrid word recognizer could be improved by using
SVM instead of HMM.
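
The storage estimate above (number of support vectors times number of feature values) can be checked on a fitted scikit-learn model. This is a minimal sketch that assumes the same digits data, split, and SVC settings as the source code section. It reports only the SVM side and makes no HMM comparison.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42)

clf = svm.SVC(kernel="poly", C=1, gamma=0.1)
clf.fit(x_train, y_train)

# support_vectors_ holds one row per retained training sample
sv = clf.support_vectors_
n_sv, n_features = sv.shape
stored_values = n_sv * n_features
print("support vectors:", n_sv, "of", len(x_train), "training samples")
print("feature values stored:", stored_values)
print("bytes for the support vector array:", sv.nbytes)
```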

Results:
After the analysis has been completed and validated, we can use it to classify the new digits
drawn directly in the application. We can see the analysis also performs rather well on
completely new and previously unseen data.
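
How the test application turns a drawn digit into features is not shown in the report. As a hedged sketch, the code below follows the Optdigits preprocessing, in which a 32x32 binary bitmap is divided into 4x4 blocks and the "on" pixels in each block are counted, giving 64 values in the range 0 to 16. The function `bitmap_to_features` and the synthetic stroke are assumptions for illustration, and the classifier is trained here on scikit-learn's bundled digits rather than on the application's own data.

```python
import numpy as np
from sklearn import datasets, svm

def bitmap_to_features(bitmap):
    """Reduce a 32x32 binary bitmap to 64 block counts in the range 0..16."""
    bitmap = np.asarray(bitmap)
    if bitmap.shape != (32, 32):
        raise ValueError("expected a 32x32 bitmap")
    blocks = bitmap.reshape(8, 4, 8, 4)          # 8x8 grid of 4x4 blocks
    return blocks.sum(axis=(1, 3)).reshape(1, 64).astype(float)

# Hypothetical "drawn" digit: a thick vertical stroke
drawing = np.zeros((32, 32), dtype=int)
drawing[4:28, 14:18] = 1

digits = datasets.load_digits()
clf = svm.SVC(kernel="poly", C=1, gamma=0.1).fit(digits.data, digits.target)

features = bitmap_to_features(drawing)
print(features.reshape(8, 8))
print("predicted digit:", clf.predict(features)[0])
```

A synthetic stroke like this differs from real pen input, so the predicted label is only a demonstration of the data flow, not evidence of recognition quality.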

Figure: Graph representation of accuracy of SVM. Chart title: Digit Recognition Using SVM; axes: Data Set and Accuracy (%). Values shown for the SVM series (data set label, accuracy): 382, 97; 390, 95; 376, 94; 387, 96.

CONCLUSION
In this article, we detailed and explored how (kernel) Support Vector Machines can be applied
to the problem of handwritten digit recognition with satisfying results. The suggested approach
does not suffer from the same limitations as Kernel Discriminant Analysis (KDA), and also achieves a
better recognition rate. Unlike KDA, the SVM solutions are sparse, meaning that only a generally
small subset of the training set is needed during model evaluation. This also means the
complexity of the evaluation phase is greatly reduced, since it depends only on the
number of vectors retained during training.

The SVM model requires the most space, since each support vector (SV) consists of many feature
values. However, space can be saved by storing only the original online signals and the
pen-up/pen-down status corresponding to the SV in a compact manner. During recognition, the
model is expanded dynamically as required. Experiments using SVMs with probabilistic
output were also performed on the same datasets for comparison. In many experiments, the
results have shown that, at character level, SVM recognition rates are significantly better due to
structural risk minimization, implemented by maximizing the margin of separation in the
decision function. However, the increase in recognition rate is not without cost. The
number of support vectors obtained in training determines the SVM model size. Storing these
support vectors for recognition requires more memory than neural network weights, since each
support vector is a multidimensional feature vector. The number of support vectors can be
reduced by selecting better C and gamma parameter values through a finer grid search and by
reduced set selection. The comparison of the recognition results of SVM with probabilistic output
and SVM with distance output shows that the two are comparable. On some datasets, the distance output gives
slightly higher recognition rates, while on others the probabilistic output does.
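
A grid search over C and gamma, as mentioned above, might look like the following sketch. It uses scikit-learn's `GridSearchCV` on the bundled digits data. The RBF kernel, the grid values, and the 3-fold cross-validation are illustrative assumptions, not the settings used in the reported experiments.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

digits = datasets.load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42)

# Illustrative grid; a finer search would add more values around the best ones
param_grid = {"C": [0.1, 1, 10], "gamma": [0.0001, 0.001, 0.01]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(x_train, y_train)

# best_estimator_ is refitted on the whole training split with the best parameters
best = search.best_estimator_
print("best parameters:", search.best_params_)
print("test accuracy:", best.score(x_test, y_test))
print("support vectors kept:", int(best.n_support_.sum()))
```

Printing the support vector count next to the accuracy makes the trade-off between recognition rate and model size visible for each candidate setting.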

FUTURE WORK
Future work on the database includes extending the character class collection to include all
the valid orthographic shapes presently used in a specific language script, and creating word-, line-,
and page-level collections of document images so that researchers can focus on other stages of
the document recognition system as well.

It has been shown that Support Vector Machines (SVMs) can be applied to image and handwritten
character recognition. However, SVMs do not scale well to large datasets, as the
training time can become cubic in the size of the dataset. This could be an issue for bigger
datasets containing thousands of samples. To deal with this issue, one
proposed technique is to train a support vector machine on the collection of nearest
neighbours, a solution its authors called “SVM-KNN”. Training an SVM on the entire data set is
slow, and the extension of SVM to multiple classes is not as natural as for Nearest Neighbor (NN).
However, in the neighbourhood of a small number of examples and a small number of classes,
SVMs often perform better than other classification methods.

We can use NN as an initial pruning stage and apply SVM to the smaller but more relevant set
of examples that require careful discrimination. This approach reflects the way humans perform
coarse categorization: when presented with an image, human observers can answer coarse queries
such as the presence or absence of an animal in as little as 150 ms and, of course, can tell what animal
it is given enough time. This process of quick categorization, followed by successively finer but
slower discrimination, was the inspiration behind the “SVM-KNN” technique.
