


International Journal of Engineering Research & Technology (IJERT)
ISSN: 2278-0181
Vol. 2 Issue 3, March - 2013

Selection Of The Best Classifier From Different Datasets Using WEKA


Ranjita Kumari Dash
Assistant Professor, Institute Of Technical Education and Research,
SOA University

Abstract

In today's world a large amount of data is available in science, industry, business and many other areas. These data can provide valuable information which can be used by management for making important decisions, and by using data mining we can find that valuable information. Data mining is a popular topic among researchers, and there is a lot of work that has not been explored till now. This paper focuses on a fundamental concept of data mining, namely classification techniques. In this paper, Naive Bayes, Functions, Lazy, Meta, Nested dichotomies, Rules and Trees classifiers are used for the classification of data sets. The performance of these classifiers is analysed with the help of correctly classified instances, incorrectly classified instances and the time taken to build the model, and the results are shown statistically as well as graphically. The WEKA data mining tool is used for this purpose; WEKA stands for Waikato Environment for Knowledge Analysis. Three datasets are used, on which different classifiers are applied to check which classifier gives the best result under different measurements. 71 different classifiers are applied to these datasets, which are in ARFF format. 10-fold cross validation is used to provide better accuracy. Finally, the classification technique which provides the best result is suggested. The results show that no single algorithm performs best for every dataset.

KEY TERMS

BayesNet, J48, Mean Absolute Error, Naive Bayes, Root Mean-Squared Error

1. Introduction

Data mining is the process of extracting patterns from data [10, 11]. It is seen as an increasingly important tool by modern business for transforming data as technology advances and the need for efficient data analysis grows. Data mining involves the use of data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. It is currently used in a wide range of areas such as marketing, surveillance, fraud detection and scientific discovery.

In this paper we process a cancer dataset and use different classification methods to learn from the test data set. Classification is a basic task in data analysis that requires the construction of a classifier, that is, a function that assigns a class label to instances described by a set of attributes. It is one of the important applications of data mining, and it predicts categorical class labels. In this paper we compare various classification techniques using WEKA; our aim is to investigate the performance of the different classification methods. Classification of data is a very typical task in data mining, and a large number of classifiers can be used to classify the data, such as Bayes, Function, Lazy, Meta, Rule-based and Decision tree classifiers. The goal of classification is to correctly predict the class value.

For breast cancer, there is a substantial amount of research with machine learning algorithms [1]. Machine learning covers such a broad range of processes that it is difficult to define precisely [6]. Young women are being diagnosed in their teens, twenties and thirties, even if the percentage is very low compared to that of older women aged 40 years and older [7, 8, 9], and 1% of all diagnosed breast cancers are in men.
We report the case of a 34-year-old woman affected by breast cancer that had metastasized to the bone. Today, about one in eight women in the United States is affected by breast cancer over her lifetime, and in recent years the incidence rate keeps increasing. However, appropriate methods to predict breast cancer survival have not been established. In this study, we use these models to evaluate the prediction rate for breast cancer patients from the perspective of accuracy.
2. WEKA

WEKA stands for Waikato Environment for Knowledge Analysis. WEKA was created by researchers at the University of Waikato in New Zealand and was first implemented in its modern form in 1997. It is issued under the GNU General Public License (GPL). The software is written in the Java™ language and contains a GUI for interacting with data files. Deep knowledge of data mining is not needed to work with WEKA, which makes it a very popular data mining tool. WEKA provides a graphical user interface and many facilities, and in this paper we use it to compare various classification techniques. WEKA is a state-of-the-art facility for developing machine learning (ML) techniques and applying them to real-world data mining problems. The data file normally used by WEKA is in ARFF file format; ARFF stands for Attribute Relation File Format, and such a file consists of special tags to indicate the different elements in the data file. WEKA implements algorithms for data pre-processing, classification, regression, clustering and association rules. It also includes visualization tools, and it has a set of panels, each of which can be used to perform a certain task. New machine learning schemes can also be developed with this package. The main features of WEKA include:

• 49 data pre-processing tools
• 76 classification/regression algorithms
• 8 clustering algorithms
• 15 attribute/subset evaluators + 10 search algorithms for feature selection
• 3 algorithms for finding association rules
• 3 graphical user interfaces
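To make the workflow concrete, here is a minimal sketch of loading an ARFF data file through WEKA's Java API; the file name breast-cancer.arff is only a hypothetical placeholder, and the class being the last attribute is an assumption:

```java
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class LoadArff {
    public static void main(String[] args) throws Exception {
        // Load an ARFF file (the file name is a placeholder).
        DataSource source = new DataSource("breast-cancer.arff");
        Instances data = source.getDataSet();
        // ARFF does not mark the class attribute, so set it explicitly;
        // here we assume the class is the last attribute.
        data.setClassIndex(data.numAttributes() - 1);
        // Print a short summary of the attributes and instances.
        System.out.println(data.toSummaryString());
    }
}
```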
3. METHODS

This section describes the classification methods used in this paper. We discuss each method and explain how it has been used in our experiment. For this breast cancer dataset we have taken eight groups of methods: Bayes, Functions, Lazy, Meta, Misc, Nested dichotomies, Rules and Trees classifiers for the classification of the data set.
3.1. NAIVE BAYES CLASSIFIER

Bayes methods are also used as one of the classification solutions in data mining. In our work we use six main Bayesian methods, namely AODE, AODEsr, Naive Bayes, BayesNet, Naive Bayes Simple and Naive Bayes Updateable, as implemented in the WEKA software. Naive Bayes is an application of Bayes' theorem that assumes independence of attributes [3]. This assumption is not strictly correct when considering classification based on text extracted from a document, as there are relationships between the words that accumulate into concepts. Problems of this kind, called problems of supervised classification, are ubiquitous. Naive Bayes is sometimes also called idiot's Bayes, simple Bayes or independence Bayes. It is important for several reasons: it is easy to construct, without any need for complicated iterative parameter estimation schemes, which means it may be readily applied to huge datasets; and it is robust, easy to interpret, and often does surprisingly well even though it may not be the best classifier in any particular application.
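As an illustration, a sketch of evaluating Naive Bayes under the 10-fold cross validation used in this paper (same hypothetical dataset as above) reports the measurements we compare:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class NaiveBayesExample {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        NaiveBayes nb = new NaiveBayes();

        // 10-fold cross validation with a fixed random seed.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(nb, data, 10, new Random(1));
        System.out.println("Correctly classified:   " + eval.pctCorrect() + " %");
        System.out.println("Incorrectly classified: " + eval.pctIncorrect() + " %");

        // Time taken to build the model on the full dataset.
        long start = System.nanoTime();
        nb.buildClassifier(data);
        System.out.println("Build time: " + (System.nanoTime() - start) / 1e9 + " s");
    }
}
```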
3.2. FUNCTION CLASSIFIER

Function classifiers use the concepts of neural networks and regression. Here two examples, one from neural networks and one from regression, are taken to discuss the scenario [2]. A multilayer perceptron is a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. It is a modification of the standard linear perceptron in that it uses three or more layers of neurons with nonlinear activation functions, and it is more powerful than the perceptron in that it can distinguish data that are not linearly separable by a hyperplane [4]. A multilayer perceptron has distinctive characteristics: the model of each neuron in the network includes a nonlinear activation function, and the network contains one or more layers of hidden neurons that are not part of the input or output of the network.


These hidden neurons enable the network to learn complex tasks by extracting progressively more meaningful features from the input patterns. The network also exhibits a high degree of connectivity; a change in the connectivity of the network requires a change in the population of synaptic connections or their weights [5].
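A sketch of configuring WEKA's MultilayerPerceptron follows; the parameter values are illustrative assumptions, not the settings behind the reported results:

```java
import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MlpExample {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        MultilayerPerceptron mlp = new MultilayerPerceptron();
        mlp.setHiddenLayers("a");  // "a" = (attributes + classes) / 2 hidden neurons
        mlp.setLearningRate(0.3);  // illustrative value
        mlp.setTrainingTime(500);  // number of training epochs
        mlp.buildClassifier(data); // train the feed-forward network
        System.out.println(mlp);   // prints the learned weights
    }
}
```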
3.3. RULES CLASSIFIER

Association rules are used to find interesting relationships among the attributes, and they may predict more than one conclusion. The number of records an association rule predicts correctly is called its coverage, and support is defined as the coverage divided by the total number of records [5]. Accuracy is the number of records the rule predicts correctly, expressed as a percentage of all instances it is applied to. The methods of this family are ConjunctiveRule, DecisionTable, DTNB, JRip, NNge, OneR, Ridor and ZeroR. Rules are easier to understand than large trees: one rule is created for each path from the root to a leaf, each attribute-value pair along a path forms a conjunction, and the leaf holds the class prediction. Rules are mutually exclusive and are learned one at a time; each time a rule is learned, the tuples covered by the rule are removed.
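As a sketch of this family, the following builds the JRip rule learner (WEKA's RIPPER implementation) on the same hypothetical dataset and prints the induced rule set:

```java
import weka.classifiers.rules.JRip;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RuleExample {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        JRip ripper = new JRip();
        // Rules are learned one at a time; instances covered by a
        // learned rule are removed before the next rule is grown.
        ripper.buildClassifier(data);
        System.out.println(ripper); // prints one rule per line
    }
}
```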
3.4. LAZY CLASSIFIER

When making a classification or prediction, lazy learners can be computationally expensive. They require efficient storage techniques, are well suited to implementation on parallel hardware, and offer little explanation or insight into the structure of the data. Lazy learners, however, naturally support incremental learning, and they are able to model complex decision spaces with hyper-polygonal shapes that may not be as easily describable by other learning algorithms. The methods of this family are IB1, IBk, K-Star and LWL.
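Since a lazy learner defers almost all work to prediction time, a sketch with IBk (k-nearest neighbours; k = 3 is an arbitrary choice here) shows both the cheap build step and a single prediction:

```java
import weka.classifiers.lazy.IBk;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class LazyExample {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        IBk knn = new IBk(3);      // 3-nearest-neighbour classifier
        knn.buildClassifier(data); // lazy: essentially just stores the instances

        // The real work happens at prediction time.
        double label = knn.classifyInstance(data.instance(0));
        System.out.println("Predicted class: "
                + data.classAttribute().value((int) label));
    }
}
```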
3.5. META CLASSIFIER

Meta classifiers include a wide range of classifiers. They are useful when the attributes have a large number of values, because the time and space complexities of learning depend not only on the number of attributes but also on the number of values for each attribute.
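A meta classifier wraps other components. As a sketch, WEKA's FilteredClassifier (one of the methods in Table no-1) is shown below combining a supervised discretization filter with Naive Bayes; the choice of filter and base learner is our assumption for illustration:

```java
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.meta.FilteredClassifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.supervised.attribute.Discretize;

public class MetaExample {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // The meta classifier applies the filter to the data and then
        // trains the base learner on the filtered result.
        FilteredClassifier fc = new FilteredClassifier();
        fc.setFilter(new Discretize());     // supervised discretization
        fc.setClassifier(new NaiveBayes()); // base learner (illustrative)
        fc.buildClassifier(data);
        System.out.println(fc);
    }
}
```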
3.6. DECISION TREES

Decision tree induction has been studied in detail in both pattern recognition and machine learning [13, 14]. This work synthesizes the experience gained by people working in the area of machine learning and describes a computer program called ID3.
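J48, WEKA's implementation of C4.5 (the successor of ID3 [18]), is the tree learner reported in Table no-1; a minimal sketch, with the pruning confidence shown at WEKA's default value:

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class TreeExample {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        J48 tree = new J48();            // C4.5 decision tree learner
        tree.setConfidenceFactor(0.25f); // pruning confidence (WEKA default)
        tree.buildClassifier(data);
        System.out.println(tree);        // prints the induced tree
    }
}
```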


4. DISCUSSION AND RESULT

By investigating the performance of the selected classification methods or algorithms, namely Bayes, Function, Lazy, Meta, Rules, Misc, Nested dichotomies and Trees, we use the same experiment procedure as suggested by WEKA: 75% of the data is used for training and the remainder for testing purposes. In WEKA, all data are considered as instances, and features in the data are known as attributes. The simulation results are partitioned into several sub-items for easier analysis and evaluation. In the first part, correctly and incorrectly classified instances are given as numeric and percentage values, and subsequently the time taken to build the model is given in seconds. The results of the simulation are shown in the tables below, and the figures give their graphical representation. On the basis of a comparison over accuracy and error rates, the classification techniques with the highest accuracy are obtained for this dataset. We can clearly see that the highest accuracy is 75.52% and the lowest is 51.74%; in fact, the highest accuracy belongs to the Meta classifier. The total time required to build the model is also a crucial parameter in comparing the classification algorithms; in this experiment, the single conjunctive rule learner requires the shortest time, around 0.15 seconds, compared to the others.

With the help of the figures we show the working of the various algorithms used in WEKA, along with the advantages and disadvantages of each algorithm. Every algorithm has its own importance, and we choose among them based on the behaviour of the data. Deep knowledge of the algorithms is not required for working in WEKA, which is the main reason WEKA is a suitable tool for data mining applications. This paper shows only the classification operations in WEKA; we will try to make a complete reference paper on WEKA.
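The 75%/25% split evaluation described above can be sketched with the WEKA API as follows; the randomization seed, classifier choice and dataset name are assumptions, and the Mean Absolute Error and Root Mean-Squared Error named in the key terms are reported as well:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SplitEvaluation {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("breast-cancer.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);
        data.randomize(new Random(1)); // shuffle before splitting

        // 75% of the data for training, the remaining 25% for testing.
        int trainSize = (int) Math.round(data.numInstances() * 0.75);
        Instances train = new Instances(data, 0, trainSize);
        Instances test = new Instances(data, trainSize,
                data.numInstances() - trainSize);

        J48 cls = new J48(); // any of the compared classifiers fits here
        cls.buildClassifier(train);

        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(cls, test);
        System.out.println("Correctly classified:    " + eval.pctCorrect() + " %");
        System.out.println("Incorrectly classified:  " + eval.pctIncorrect() + " %");
        System.out.println("Mean Absolute Error:     " + eval.meanAbsoluteError());
        System.out.println("Root Mean-Squared Error: " + eval.rootMeanSquaredError());
    }
}
```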
Table no-1: Best algorithms on the breast cancer dataset

Name of algorithm     Correctly classified   Incorrectly classified   Time taken to build
                      instances (%)          instances (%)            the model (s)
BayesNet              72.028                 27.972                   0.03
Simple Logistic       75.1748                24.8252                  1.44
K-Star                73.4266                26.5734                  0
Filtered Classifier   75.5245                24.4755                  0
Ordinal Classifier    75.5245                24.4755                  0.01
Misc (HyperPipes)     69.9301                30.0699                  0
Decision Table        73.4266                26.5734                  0.5
J48                   75.5245                24.4755                  0.01

[Figure no-1: Incorrectly classified instances per classifier for the breast cancer dataset]

[Figure no-2: Time taken to build the model per classifier]

[Figure no-3: Correctly classified instances (%) per classifier]

4.2 Comparison between LUNG dataset, HEART dataset and DIABETES dataset


Table no-2: Comparison on the lung, heart and diabetes datasets

Algorithm             Correctly      Incorrectly    TP     FP     Time taken to
                      classified     classified     Rate   Rate   build model (s)
                      instances (%)  instances (%)
MultilayerPerceptron  100            0              0.75   0.436  0.2
MulticlassClassifier  77.2135        22.7865        0.772  0.321  0.02
SPegasos              77.7344        22.2656        0.777  0.327  0.19

[Figure no-4: Correctly classified instances (%) per algorithm]

[Figure no-5: Incorrectly classified instances (%) per algorithm]

[Figure no-6: TP rate per algorithm]

[Figure no-7: Time taken to build the model in seconds per algorithm]

[Figure no-8: FP rate per algorithm]


5. References

[1] D. Lavanya and K. Usha Rani, "Analysis of feature selection with classification: Breast cancer datasets," Indian Journal of Computer Science and Engineering (IJCSE), October 2011.

[2] E. Osuna, R. Freund and F. Girosi, "Training support vector machines: Application to face detection," Proceedings of Computer Vision and Pattern Recognition, Puerto Rico, pp. 130-136, 1997.

[3] W. Buntine, "Theory refinement on Bayesian networks," in B. D. D'Ambrosio, P. Smets and P. P. Bonissone (Eds.), Proceedings of the Seventh Annual Conference on Uncertainty in Artificial Intelligence, pp. 52-60, San Francisco, CA.

[4] S. V. Chakravarthy and J. Ghosh, "Scale based clustering using radial basis function networks," Proceedings of the IEEE International Conference on Neural Networks, Orlando, Florida, pp. 897-902, 1994; M. D. Buhmann, Radial Basis Functions: Theory and Implementations, 2003.

[5] A. J. Howell and H. Buxton, "RBF network methods for face detection and attentional frames," Neural Processing Letters (15), pp. 197-211, 2002; D. Grossman and P. Domingos, "Learning Bayesian network classifiers by maximizing conditional likelihood," Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, 2004.

[6] U.S. Cancer Statistics Working Group, United States Cancer Statistics: 1999-2008 Incidence and Mortality Web-based Report, Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute, 2012.

[7] IARC, Lyon: World Cancer Report, International Agency for Research on Cancer Press, pp. 188-193, 2003.

[8] I. Elattar, "Breast cancer: Magnitude of the problem," Egyptian Society of Surgical Oncology Conference, Taba, Sinai, Egypt, 30 March - 1 April 2005.

[9] D. F. Roses, "Clinical assessment of breast cancer and benign breast disease," in Breast Cancer, Vol. 2, Ch. 14, M. N. Harris (Ed.), Churchill Livingstone, Philadelphia, 2005.

[10] S. Aruna et al., "Knowledge based analysis of various statistical tools in detecting breast cancer," 2011.

[11] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.

[12] W. H. Wolberg, W. N. Street, D. M. Heisey and O. L. Mangasarian, "Computerized breast cancer diagnosis and prognosis from fine needle aspirates," Western Surgical Association meeting, Palm Desert, California, November 14, 1994.

[13] Y. Chen, A. Abraham and B. Yang, "Feature selection and classification using flexible neural tree," Journal of Neurocomputing 70(1-3): 305-313, 2006.

[14] K. Golnabi et al., "Analysis of firewall policy rules using data mining techniques," pp. 305-315, 2006.

[15] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, Wiley-Interscience, New York, 1973.

[16] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, New York, 1999.

[17] V. N. Vapnik, The Nature of Statistical Learning Theory, 1st ed., Springer-Verlag, New York, 1995.

[18] J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, San Mateo, CA, 1993.

[19] P. Cabena, P. Hadjinian, R. Stadler, J. Verhees and A. Zanasi, Discovering Data Mining: From Concept to Implementation, Prentice Hall, Upper Saddle River, N.J., 1998.

[20] E. Osuna, R. Freund and F. Girosi, "Training support vector machines: Application to face detection," Proceedings of Computer Vision and Pattern Recognition, Puerto Rico, pp. 130-136, 1997.

[21] V. N. Chunekar and H. P. Ambulgekar, "Approach of neural network to diagnose breast cancer on three different data sets," 2009.

[22] D. Lavanya, "Ensemble decision tree classifier for breast cancer data," International Journal of Information Technology Convergence and Services, vol. 2, no. 1, pp. 17-24, Feb. 2012.

[23] Y. Freund and R. E. Schapire, "Large margin classification using the perceptron algorithm," Machine Learning, 37(3), 1999.

[24] J. D. M. Rennie, L. Shih, J. Teevan and D. R. Karger, "Tackling the poor assumptions of naive Bayes text classification," in ICML 2003, pp. 616-623, 2003.

[25] K. Komiya, N. Sato et al., "Negation Naive Bayes for categorization of product pages on the web," Proceedings of Recent Advances in Natural Language Processing, pp. 586-591, Hissar, Bulgaria, 12-14 September 2011.

[26] J. Cheng and R. Greiner, "Learning Bayesian belief networks classifiers: Algorithms and systems," in E. Stroulia and S. Matwin (Eds.), AI 2001, pp. 141-151, LNAI 2056, 2001.
