Evaluation of Different Classifiers
For the two classes, a maximum-margin separating hyperplane must satisfy

For $y_i = +1$: $\mathbf{w}^T \mathbf{x}_i + b \ge 1$
For $y_i = -1$: $\mathbf{w}^T \mathbf{x}_i + b \le -1$

The margin $M$ between the bounding hyperplanes $\mathbf{w}^T \mathbf{x} + b = 1$ and $\mathbf{w}^T \mathbf{x} + b = -1$, measured along the unit normal $\mathbf{n} = \mathbf{w} / \|\mathbf{w}\|$, is

$M = (\mathbf{x}^{+} - \mathbf{x}^{-}) \cdot \mathbf{n} = 2 / \|\mathbf{w}\|$

Maximizing $2 / \|\mathbf{w}\|$ is therefore equivalent to

minimize $\frac{1}{2} \|\mathbf{w}\|^2$ subject to $y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1$ for all $i$.

Introducing Lagrange multipliers $\alpha_i \ge 0$ gives the primal Lagrangian

$L_p(\mathbf{w}, b, \alpha) = \frac{1}{2} \|\mathbf{w}\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \right]$

and setting $\partial L_p / \partial \mathbf{w} = 0$ and $\partial L_p / \partial b = 0$ yields

$\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0$
Solving the optimization problem:
The above problem is a Quadratic Programming (QP) problem, so training a support vector machine requires the solution of a very large QP problem, which consumes a lot of time with the standard chunking SVM algorithm. SMO (Sequential Minimal Optimization) breaks this large QP problem into a series of smaller QP problems that can be solved analytically, giving a faster solution than standard chunking, which relies on expensive numerical matrix computation.
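The pairwise analytic step at the heart of SMO can be sketched in a few lines. Below is a minimal simplified-SMO trainer for a linear SVM (random choice of the second multiplier rather than Platt's full heuristics), run on an invented toy data set; it is an illustrative sketch, not WEKA's SMO implementation.

```python
import numpy as np

def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=20, seed=0):
    """Simplified SMO: repeatedly pick a pair of multipliers
    (alpha_i, alpha_j), solve the two-variable QP analytically, and
    keep the constraint sum(alpha * y) == 0 satisfied."""
    rng = np.random.default_rng(seed)
    n = len(y)
    K = X @ X.T                       # linear-kernel Gram matrix
    alpha, b, passes = np.zeros(n), 0.0, 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            E_i = (alpha * y) @ K[:, i] + b - y[i]      # prediction error on x_i
            if (y[i] * E_i < -tol and alpha[i] < C) or (y[i] * E_i > tol and alpha[i] > 0):
                j = int(rng.integers(n - 1))            # random second multiplier
                j += j >= i                             # ensure j != i
                E_j = (alpha * y) @ K[:, j] + b - y[j]
                a_i, a_j = alpha[i], alpha[j]
                if y[i] != y[j]:                        # box constraints on alpha_j
                    L, H = max(0.0, a_j - a_i), min(C, C + a_j - a_i)
                else:
                    L, H = max(0.0, a_i + a_j - C), min(C, a_i + a_j)
                eta = 2 * K[i, j] - K[i, i] - K[j, j]   # curvature along the pair
                if L == H or eta >= 0:
                    continue
                alpha[j] = np.clip(a_j - y[j] * (E_i - E_j) / eta, L, H)
                if abs(alpha[j] - a_j) < 1e-5:
                    continue
                alpha[i] = a_i + y[i] * y[j] * (a_j - alpha[j])
                # threshold update (Platt's rule)
                b1 = b - E_i - y[i]*(alpha[i]-a_i)*K[i, i] - y[j]*(alpha[j]-a_j)*K[i, j]
                b2 = b - E_j - y[i]*(alpha[i]-a_i)*K[i, j] - y[j]*(alpha[j]-a_j)*K[j, j]
                b = b1 if 0 < alpha[i] < C else (b2 if 0 < alpha[j] < C else (b1 + b2) / 2)
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    w = (alpha * y) @ X               # recover the primal weight vector
    return w, b

# invented, trivially separable toy data
X = np.array([[2., 2.], [3., 3.], [-2., -2.], [-3., -3.]])
y = np.array([1., 1., -1., -1.])
w, b = simplified_smo(X, y)
```

Each iteration optimizes exactly two multipliers in closed form, which is what removes the need for a numerical QP solver.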
III. EXPERIMENT
I used two data sets: the Nursery database, to predict the rank of nursery-school applications, and the Labour database, to predict the final settlement of industrial labour negotiations. The Nursery database I used contains nine attributes and 12,960 instances, whereas I chose only 57 instances and 16 attributes for the labour-negotiation prediction. In the labour data, most instances have one or more attributes marked "don't care", i.e. not defined; in the Nursery data, however, all attributes of every instance are defined.
I used the WEKA GUI tool, version 3.6.9, to analyse the performance of each classifier, with a test mode of 10-fold cross-validation.
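As a reminder of what that test mode does, 10-fold cross-validation can be sketched as follows. This is a hand-rolled stand-in, with a majority-class baseline in place of a real classifier and invented toy data, not WEKA's own evaluation code.

```python
import random

def ten_fold_accuracy(data, train, predict, k=10, seed=1):
    """Split data into k folds; for each fold, train on the other
    k-1 folds, test on the held-out fold, and average the accuracies."""
    data = data[:]
    random.seed(seed)
    random.shuffle(data)
    folds = [data[i::k] for i in range(k)]
    accuracies = []
    for i in range(k):
        held_out = folds[i]
        train_rows = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = train(train_rows)
        correct = sum(predict(model, x) == label for x, label in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# majority-class baseline standing in for a real classifier
def train_majority(rows):
    labels = [label for _, label in rows]
    return max(set(labels), key=labels.count)

def predict_majority(model, x):
    return model          # always predicts the majority class

# invented toy data: 70 'a' instances, 30 'b' instances
data = [((i,), 'a') for i in range(70)] + [((i,), 'b') for i in range(30)]
acc = ten_fold_accuracy(data, train_majority, predict_majority)
```

Because every instance is held out exactly once, the averaged accuracy uses all of the data for both training and testing, which matters for a data set as small as the 57-instance Labour set.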
IV. OBSERVATION
A. Nursery Classification Domain

Figure 1. Column chart of correctly and incorrectly classified instances.
Figure 2. Line chart for different performance metrics
B. Labour Classification Domain
Figure 3. Column chart of correctly and incorrectly classified instances.
Figure 4. Line chart for different performance metrics
V. EVALUATION
Comparing the column charts of the two domains, we may notice that the greater the number of instances, the higher the percentage of correct predictions, and also the longer the classifier takes to model and evaluate the domain. The time taken, however, depends more on the time complexity of the algorithm used.
Figure 5. Time taken in each case to model the domain
The Naive Bayes classifier, even with a small number of training instances, is able to estimate the parameters (the means and variances of the variables) required for classification, and classifies the data more accurately than SVM and k-NN.
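The parameter estimation described above can be sketched as a generic Gaussian Naive Bayes; the toy data and function names below are invented for the illustration, and this is not WEKA's NaiveBayes implementation.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the only parameters Gaussian Naive Bayes needs:
    per-class priors, and per-feature means and variances."""
    grouped = defaultdict(list)
    for features, label in zip(X, y):
        grouped[label].append(features)
    model = {}
    for label, rows in grouped.items():
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (len(rows) / len(y), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Pick the class maximising log P(class) + sum log P(feature | class)."""
    def log_gauss(v, m, var):
        return -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
    scores = {label: math.log(prior) +
              sum(log_gauss(v, m, var) for v, m, var in zip(x, means, variances))
              for label, (prior, means, variances) in model.items()}
    return max(scores, key=scores.get)

# invented toy data: two well-separated classes
X = [(1.0, 2.0), (1.2, 1.8), (5.0, 6.0), (5.2, 6.2)]
y = ['a', 'a', 'b', 'b']
model = fit_gaussian_nb(X, y)
```

Since the model is nothing but these per-class summary statistics, even a handful of instances per class is enough to fit it, which is the point made above.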
k-NN is the more stable algorithm: variations in the data do not affect it as much as they do the other classifiers.
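For comparison, the whole of k-NN classification fits in a few lines, since it has no training phase at all; the toy clusters below are invented for the sketch.

```python
def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training
    points (squared Euclidean distance; no sqrt needed for ranking)."""
    nearest = sorted(train,
                     key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

# invented toy data: two clusters of labelled points
train = [((0, 0), 'a'), ((0, 1), 'a'), ((1, 0), 'a'),
         ((5, 5), 'b'), ((5, 6), 'b'), ((6, 5), 'b')]
```

The majority vote over k neighbours is what gives the method its stability: a single perturbed or mislabelled point is outvoted by the other neighbours.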
VI. CONCLUSION
The performance of three different algorithms was analyzed. It is concluded that each algorithm has its own performance constraints and suitable data sets; the overall performance therefore depends on the particular algorithm and on the characteristics and features of the data used to train it.
ACKNOWLEDGMENT
I would like to thank my professor, Dr. Talbert, for his support in providing the data sets and the concepts of the algorithms I used. I also want to extend my gratitude to the University of Waikato, New Zealand, for creating such a wonderful application, WEKA.