
Performance Analysis of Three Learning Classifiers

Tennessee Technological University


Cookeville, Tennessee
(931)372-3101

Abstract— This paper presents a performance analysis of
three supervised learning algorithms. Different algorithms
exhibit different characteristics in different domains, so to
analyze the classifiers I chose the metrics accuracy (the number
of correct classifications), error (the number of incorrect
classifications), ROC area, confusion matrix, and precision for
the Support Vector Machine, Naive Bayes, and Nearest
Neighbor classifiers.
Keywords- Performance Analysis; Support Vector Machine;
k-Nearest Neighbour; Naive Bayes; Classifier; accuracy
I. INTRODUCTION
Machine learning algorithms are used in a variety of
applications, from classification and regression to system
optimization. We always want classifier predictions with high
accuracy; however, an algorithm may be biased towards the
domain in which it is applied. The Naive Bayes algorithm, for
example, is reported to perform better in text domains than in
numerical domains.
II. METHODOLOGY
I used the accuracy of the classifier, its time complexity,
ROC area, and precision to evaluate the performance of each
algorithm. A brief introduction to each algorithm follows.
A. k-Nearest Neighbour
The Nearest Neighbor algorithm is a simple predictive
learning algorithm. When a new instance is presented to the
model, the algorithm assigns it the majority class of the k
most similar training instances stored in the model (based on
a distance metric).
Let us define the Euclidean distance between two input
vectors Xi and Yi, written component-wise as (Xi1, Xi2) and
(Yi1, Yi2). The distance between the two vectors is defined
as the length of the vector Xi - Yi:

D(Xi, Yi) = |Xi - Yi| = sqrt((Xi1 - Yi1)^2 + (Xi2 - Yi2)^2)

This equation evaluates the distance between the two
instances Xi and Yi. For instance, if Xi is to be classified, the
distance D is calculated between Xi and each training
instance Yi; if there were 500 training instances, Xi would be
compared against each of them. The instance Yi that returns
the smallest value of D is considered the closest, and Xi is
categorized with the label of that instance. If k is chosen as
five, the five training instances with the smallest distances are
selected and the input vector is assigned their majority label.
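
As an illustration, the following minimal Python sketch
carries out exactly this procedure: compute the Euclidean
distances, take the k smallest, and vote. It is not the WEKA
implementation used in the experiments, and the toy data and
choice of k are assumptions for demonstration only.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=5):
    # Euclidean distance from x_new to every stored training instance
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of the k closest instances
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage with two classes in a 2-D feature space (illustrative data)
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array(["A", "A", "B", "B", "B"])
print(knn_predict(X_train, y_train, np.array([5.0, 4.9]), k=3))  # prints "B"
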
B. Naive Bayes
Naive Bayes is based on Bayes' theorem. This
classification technique analyses the relationship between
each attribute and the class for each instance, to derive a
conditional probability relating the attribute values to the
class. We assume that X is a vector of instances, where each
instance is described by attributes {X1, X2, ..., Xk}, and that
a random variable Y denotes the class of an instance. Let Xi
be a particular instance and Yi its class.
Using Naive Bayes for classification is a very simple
process. During training, the probability of each class is
computed by counting how many times it occurs in the
training data set. This is called the prior probability
P(Y=Yi). In addition to the prior probability, the algorithm
also computes the probability of the instance Xi given the
class Yi. Under the assumption that the attributes are
independent, this probability becomes the product of the
probabilities of the individual attributes. Surprisingly, Naive
Bayes has achieved good results in many cases even when
this assumption is violated.
The probability that an instance Xi belongs to a class Yi
can be computed by combining the prior probability with the
probability of each attribute's value given the class, using
Bayes' formula:

P(Y=Yi | X=Xi) = P(Y=Yi) * P(X=Xi | Y=Yi) / P(X=Xi)

Since we want the class Yi that maximizes P(Y=Yi | X=Xi),
and the denominator P(X=Xi) is the same for every class, we
can ignore it. Hence the decision rule reduces to maximizing

P(Y=Yi) * P(X=Xi | Y=Yi)
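
A minimal Python sketch of this counting procedure follows,
for categorical attributes and without smoothing; the function
names and toy data are illustrative assumptions, not taken
from the paper.

from collections import Counter, defaultdict

def train_nb(instances, labels):
    # Prior P(Y=y): relative frequency of each class in the training data
    class_count = Counter(labels)
    prior = {y: c / len(labels) for y, c in class_count.items()}
    # cond[(j, v, y)]: how often attribute j takes value v within class y
    cond = defaultdict(int)
    for x, y in zip(instances, labels):
        for j, v in enumerate(x):
            cond[(j, v, y)] += 1
    return prior, cond, class_count

def predict_nb(x, prior, cond, class_count):
    # Pick the class maximizing P(Y=y) * prod_j P(Xj=xj | Y=y);
    # the independence assumption turns the joint into this product.
    scores = {}
    for y in prior:
        p = prior[y]
        for j, v in enumerate(x):
            p *= cond[(j, v, y)] / class_count[y]
        scores[y] = p
    return max(scores, key=scores.get)

# Toy usage with two categorical attributes (illustrative data)
X = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "cool"), ("rainy", "hot")]
Y = ["no", "no", "yes", "no"]
prior, cond, class_count = train_nb(X, Y)
print(predict_nb(("rainy", "cool"), prior, cond, class_count))  # prints "yes"
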

C. Support Vector Machine
The Support Vector Machine has gained prominence in
machine learning and pattern classification. The basic
concept behind the SVM is to find an optimal decision
function (a hyperplane) that separates the clusters of vectors
in such a way that cases with one category of the target
variable fall on one side of the plane and cases with the other
category fall on the other side. The vectors nearest the
hyperplane are the support vectors.

g(x) is a hyperplane in the feature space, where

g(x) = w^T x + b

A normal vector of the hyperplane is given by

n = w / ||w||

The margin is defined as the width by which the boundary
could be increased before hitting a data point. Given a set of
data points {(xi, yi)}, i = 1..n, we require

w^T xi + b > 0   for yi = +1
w^T xi + b < 0   for yi = -1

With a scale transformation on both w and b, the above is
equivalent to

w^T xi + b >= 1    for yi = +1
w^T xi + b <= -1   for yi = -1

We know that, for support vectors x+ and x- lying on the two
margin boundaries,

w^T x+ + b = 1
w^T x- + b = -1

Then the margin width is given by

M = (x+ - x-) . n = (x+ - x-) . w / ||w|| = 2 / ||w||

So the training problem is to

maximize 2 / ||w||

such that

w^T xi + b >= 1    for yi = +1
w^T xi + b <= -1   for yi = -1

or, equivalently,

minimize (1/2) ||w||^2

such that

yi (w^T xi + b) >= 1

Using Lagrange multipliers ai >= 0, this becomes

minimize Lp(w, b, ai) = (1/2) w^T w - sum_{i=1..n} ai [ yi (w^T xi + b) - 1 ]

Solving the optimization problem by setting the partial
derivatives to zero:

dLp/dw = 0  gives  w = sum_{i=1..n} ai yi xi
dLp/db = 0  gives  sum_{i=1..n} ai yi = 0
The above problem is a quadratic programming (QP)
problem, so training a support vector machine requires the
solution of a very large QP, which consumes a lot of time
with the standard chunking SVM algorithm. SMO (Sequential
Minimal Optimization) breaks this large QP problem into
smaller QP problems. These small QP problems are solved
analytically, resulting in a faster solution than the standard
chunking approach, which uses complex matrix computations.
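
As a brief, hedged illustration: scikit-learn's SVC is backed
by libsvm, whose solver is an SMO-style decomposition method,
so a linear SVM can be trained without forming the full QP
explicitly. The synthetic data below is an assumption for
demonstration only.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class data standing in for a real domain
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# libsvm (behind SVC) solves the dual QP with an SMO-style working-set method
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The training points on or inside the margin are the support vectors
print("number of support vectors:", len(clf.support_vectors_))
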
III. EXPERIMENT
I used two data sets: the Nursery database, to predict the
rank of nursery-school applications, and the Labour data set,
to predict the final settlement of labour negotiations in
industry. The Nursery database I used contains nine attributes
and 12,960 instances, whereas I chose only 57 instances and
16 attributes for the labour-negotiation prediction. In the case
of the labour data, one or more attributes are "don't care",
i.e. not defined, in most of the instances; in the Nursery data,
however, all attributes of every instance are defined.
I used the WEKA GUI tool, version 3.6.9, to analyse the
performance of the classifiers, with a test mode of 10-fold
cross-validation.
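
For readers without WEKA, a rough scikit-learn analogue of
this setup is sketched below. The Iris data is only a stand-in,
since loading the Nursery and Labour ARFF files is not shown
here.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in for the Nursery/Labour data
for name, clf in [("Naive Bayes", GaussianNB()),
                  ("k-NN (k=5)", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC())]:
    # 10-fold cross-validation, matching the WEKA test mode described above
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
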
IV. OBSERVATION

A. Nursery Classification Domain

Figure 1. Column chart for correctly and incorrectly
classified instances.

Figure 2. Line chart for different performance metrics

B. Labour Classification Domain



Figure 3. Column chart for correctly and incorrectly
classified instances.


Figure 4. Line chart for different performance metrics
V. EVALUATION

Comparing the column charts of the two domains, we may
notice that the higher the number of instances, the higher the
percentage of correct predictions, and the longer the time
taken by the classifier to model and evaluate the domain.
However, this also depends on the time complexity of the
algorithm used.



Figure 5. Time taken in each case to model the domain


The Naive Bayes classifier, even with a small number of
training instances, is able to estimate the parameters (the
means and variances of the variables) required for
classification, and it classifies the data with greater accuracy
than the SVM and k-NN.
k-NN is the more stable algorithm: variations in the data do
not affect it as much as they do the other classifiers.
VI. CONCLUSION

The performance of three different algorithms was
analyzed. It is concluded that each algorithm has its own
performance constraints, as does each data set; hence the
overall performance depends on both the particular algorithm
and the characteristics and features of the data used to train it.


ACKNOWLEDGMENT
I would like to thank my professor, Dr. Talbert, for his
support in providing the data sets and the concepts behind the
algorithms I used. I also want to extend my gratitude to the
University of Waikato, New Zealand, for creating such a
wonderful application, WEKA.






