
Performance Analysis of Three Learning Classifiers

Tennessee Technological University


Cookeville, Tennessee
(931)372-3101

Abstract— This paper presents a performance analysis of
three supervised learning algorithms. Different algorithms
exhibit different characteristics in different domains, so to
analyze the classifiers I chose the metrics accuracy (the number
of correct classifications), error (the number of incorrect
classifications), ROC area, confusion matrix, and precision for
the Support Vector Machine, Naive Bayes, and Nearest
Neighbor classifiers.
Keywords- Performance Analysis; Support Vector Machine;
k-Nearest Neighbour; Naive Bayes; Classifier; accuracy
I. INTRODUCTION
Machine learning algorithms are used in a variety of
applications, from classification and regression to system
optimization. We always want classifier predictions with high
accuracy; however, an algorithm may be biased towards the
domain in which it is applied. The Naive Bayes algorithm, for
example, is reported to perform better in text domains than in
numerical domains.
II. METHODOLOGY
I used the accuracy of the classifier, its time complexity,
ROC area, and precision to evaluate the performance of each
algorithm. A brief introduction to each algorithm follows.
A. k-Nearest Neighbour
The Nearest Neighbor algorithm is a simple predictive
learning algorithm. When a new instance is presented to the
model, the algorithm assigns it the majority class of the k
most similar training instances stored in the model (based on
a distance metric).
Let us define the Euclidean distance between two input
vectors Xi and Yi, written component-wise as (Xi1, Xi2) and
(Yi1, Yi2). The distance between the two vectors is defined
as the length of the vector Xi - Yi:

D(Xi, Yi) = |Xi - Yi| = sqrt((Xi1 - Yi1)^2 + (Xi2 - Yi2)^2)

This equation evaluates the distance between the two
instances Xi and Yi. For instance, if Xi is to be classified, the
distance D is calculated between Xi and each training
instance Yi; if there were 500 training instances, Xi would be
compared against each of them. The instance Yi that returns
the smallest value of D is considered the closest, and Xi is
categorized with the label of that instance. If k is chosen as
five, the five training instances with the smallest distances are
selected and the input vector is assigned their majority label.
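
As an illustration, the following minimal Python sketch
carries out exactly this procedure: compute the Euclidean
distances, take the k smallest, and vote. It is not the WEKA
implementation used in the experiments, and the toy data and
choice of k are assumptions for demonstration only.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=5):
    # Euclidean distance from x_new to every stored training instance
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of the k closest instances
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage with two classes in a 2-D feature space (illustrative data)
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array(["A", "A", "B", "B", "B"])
print(knn_predict(X_train, y_train, np.array([5.0, 4.9]), k=3))  # prints "B"
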
B. Naive Bayes
Naive Bayes is based on Bayes' theorem. This
classification technique analyses the relationship between
each attribute and the class for each instance, to derive a
conditional probability relating the attribute values to the
class. We assume that X is a vector of instances, where each
instance is described by attributes {X1, X2, ..., Xk}, and that
a random variable Y denotes the class of an instance. Let Xi
be a particular instance and Yi its class.
Using Naive Bayes for classification is a very simple
process. During training, the probability of each class is
computed by counting how many times it occurs in the
training data set. This is called the prior probability
P(Y=Yi). In addition to the prior probability, the algorithm
also computes the probability of the instance Xi given the
class Yi. Under the assumption that the attributes are
independent, this probability becomes the product of the
probabilities of the individual attributes. Surprisingly, Naive
Bayes has achieved good results in many cases even when
this assumption is violated.
The probability that an instance Xi belongs to a class Yi
can be computed by combining the prior probability with the
probability of each attribute's value given the class, using
Bayes' formula:

P(Y=Yi | X=Xi) = P(Y=Yi) * P(X=Xi | Y=Yi) / P(X=Xi)

Since we want the class Yi that maximizes P(Y=Yi | X=Xi),
and the denominator P(X=Xi) is the same for every class, we
can ignore it. Hence the decision rule reduces to maximizing

P(Y=Yi) * P(X=Xi | Y=Yi)
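
A minimal Python sketch of this counting procedure follows,
for categorical attributes and without smoothing; the function
names and toy data are illustrative assumptions, not taken
from the paper.

from collections import Counter, defaultdict

def train_nb(instances, labels):
    # Prior P(Y=y): relative frequency of each class in the training data
    class_count = Counter(labels)
    prior = {y: c / len(labels) for y, c in class_count.items()}
    # cond[(j, v, y)]: how often attribute j takes value v within class y
    cond = defaultdict(int)
    for x, y in zip(instances, labels):
        for j, v in enumerate(x):
            cond[(j, v, y)] += 1
    return prior, cond, class_count

def predict_nb(x, prior, cond, class_count):
    # Pick the class maximizing P(Y=y) * prod_j P(Xj=xj | Y=y);
    # the independence assumption turns the joint into this product.
    scores = {}
    for y in prior:
        p = prior[y]
        for j, v in enumerate(x):
            p *= cond[(j, v, y)] / class_count[y]
        scores[y] = p
    return max(scores, key=scores.get)

# Toy usage with two categorical attributes (illustrative data)
X = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "cool"), ("rainy", "hot")]
Y = ["no", "no", "yes", "no"]
prior, cond, class_count = train_nb(X, Y)
print(predict_nb(("rainy", "cool"), prior, cond, class_count))  # prints "yes"
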

C. Support Vector Machine
The Support Vector Machine has gained prominence in
machine learning and pattern classification. The basic
concept behind the SVM is to find an optimal decision
function (a hyperplane) that separates the clusters of vectors
in such a way that cases with one category of the target
variable fall on one side of the plane and cases with the other
category fall on the other side. The vectors nearest the
hyperplane are the support vectors.

g(x) is a hyperplane in the feature space, where

g(x) = w^T x + b

A normal vector of the hyperplane is given by

n = w / ||w||

The margin is defined as the width by which the boundary
could be increased before hitting a data point. Given a set of
data points {(xi, yi)}, i = 1..n, we require

w^T xi + b > 0   for yi = +1
w^T xi + b < 0   for yi = -1

With a scale transformation on both w and b, the above is
equivalent to

w^T xi + b >= 1    for yi = +1
w^T xi + b <= -1   for yi = -1

We know that, for support vectors x+ and x- lying on the two
margin boundaries,

w^T x+ + b = 1
w^T x- + b = -1

Then the margin width is given by

M = (x+ - x-) . n = (x+ - x-) . w / ||w|| = 2 / ||w||

So the training problem is to

maximize 2 / ||w||

such that

w^T xi + b >= 1    for yi = +1
w^T xi + b <= -1   for yi = -1

or, equivalently,

minimize (1/2) ||w||^2

such that

yi (w^T xi + b) >= 1

Using Lagrange multipliers ai >= 0, this becomes

minimize Lp(w, b, ai) = (1/2) w^T w - sum_{i=1..n} ai [ yi (w^T xi + b) - 1 ]

Solving the optimization problem by setting the partial
derivatives to zero:

dLp/dw = 0  gives  w = sum_{i=1..n} ai yi xi
dLp/db = 0  gives  sum_{i=1..n} ai yi = 0
The above problem is a quadratic programming (QP)
problem, so training a support vector machine requires the
solution of a very large QP, which consumes a lot of time
with the standard chunking SVM algorithm. SMO (Sequential
Minimal Optimization) breaks this large QP problem into
smaller QP problems. These small QP problems are solved
analytically, resulting in a faster solution than the standard
chunking approach, which uses complex matrix computations.
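
As a brief, hedged illustration: scikit-learn's SVC is backed
by libsvm, whose solver is an SMO-style decomposition method,
so a linear SVM can be trained without forming the full QP
explicitly. The synthetic data below is an assumption for
demonstration only.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class data standing in for a real domain
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# libsvm (behind SVC) solves the dual QP with an SMO-style working-set method
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The training points on or inside the margin are the support vectors
print("number of support vectors:", len(clf.support_vectors_))
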
III. EXPERIMENT
I used two data sets: the Nursery database, to predict the
rank of nursery-school applications, and the Labour data set,
to predict the final settlement of labour negotiations in
industry. The Nursery database I used contains nine attributes
and 12,960 instances, whereas I chose only 57 instances and
16 attributes for the labour-negotiation prediction. In the case
of the labour data, one or more attributes are "don't care",
i.e. not defined, in most of the instances; in the Nursery data,
however, all attributes of every instance are defined.
I used the WEKA GUI tool, version 3.6.9, to analyse the
performance of the classifiers, with a test mode of 10-fold
cross-validation.
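
For readers without WEKA, a rough scikit-learn analogue of
this setup is sketched below. The Iris data is only a stand-in,
since loading the Nursery and Labour ARFF files is not shown
here.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in for the Nursery/Labour data
for name, clf in [("Naive Bayes", GaussianNB()),
                  ("k-NN (k=5)", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC())]:
    # 10-fold cross-validation, matching the WEKA test mode described above
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
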
IV. OBSERVATION

A. Nursery Classification Domain

Figure 1. Column chart for correctly and incorrectly
classified instances.

Figure 2. Line chart for different performance metrics

B. Labour Classification Domain



Figure 3. Column chart for correctly and incorrectly
classified instances.


Figure 4. Line chart for different performance metrics
V. EVALUATION

Comparing the column charts of the two domains, we may
notice that the higher the number of instances, the higher the
percentage of correct predictions, and the longer the time
taken by the classifier to model and evaluate the domain.
However, this also depends on the time complexity of the
algorithm used.



Figure 5. Time taken in each case to model the domain


The Naive Bayes classifier, even with a small number of
training instances, is able to estimate the parameters (the
means and variances of the variables) required for
classification, and it classifies the data with greater accuracy
than the SVM and k-NN.
k-NN is the more stable algorithm: variations in the data do
not affect it as much as they do the other classifiers.
VI. CONCLUSION

The performance of three different algorithms was
analyzed. It is concluded that each algorithm has its own
performance constraints, as does each data set; hence the
overall performance depends on both the particular algorithm
and the characteristics and features of the data used to train it.


ACKNOWLEDGMENT
I would like to thank my professor, Dr. Talbert, for his
support in providing the data sets and the concepts behind the
algorithms I used. I also want to extend my gratitude to the
University of Waikato, New Zealand, for creating such a
wonderful application, WEKA.






