Comparative Analysis of Classification Algorithms Using Weka

Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-5, August 2022. URL: https://www.ijtsrd.com/papers/ijtsrd50568.pdf. Paper URL: https://www.ijtsrd.com/computer-science/data-miining/50568/comparative-analysis-of-classification-algorithms-using-weka/sakshi-goel


International Journal of Trend in Scientific Research and Development (IJTSRD)

Volume 6 Issue 5, July-August 2022 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470

Comparative Analysis of Classification Algorithms using WEKA


Sakshi Goel¹, Neeraj Kumar², Saharsh Gera³
¹M Tech Scholar, CSE, MERI College of Engineering & Technology, Sampla, Haryana, India
²,³Assistant Professor, CSE, MERI College of Engineering and Technology, Sampla, Haryana, India

ABSTRACT

Data Mining is the process of drawing out useful information from raw data that is present in various forms; it is the central step of the Knowledge Discovery in Databases (KDD) process. Data mining techniques are relevant for extracting useful information from huge amounts of raw data. In this research work the accuracies of different classification algorithms, which are widely used to draw significant information from large amounts of raw data, are calculated. A comparative analysis of the classification algorithms has been done using criteria such as accuracy, execution time (in seconds) and the number of correctly and incorrectly classified instances.

KEYWORDS: Data Mining, J48, Random Tree, Naïve Bayes, Multilayer Perceptron, WEKA

How to cite this paper: Sakshi Goel | Neeraj Kumar | Saharsh Gera, "Comparative Analysis of Classification Algorithms using Weka", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-5, August 2022, pp. 858-869, URL: www.ijtsrd.com/papers/ijtsrd50568.pdf (IJTSRD50568)

Copyright © 2022 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

I. INTRODUCTION
Data Mining is the process of exploring patterns, with the help of various techniques, in data gathered from various sources [1]. Data mining also involves selection of the relevant data from the database, preprocessing of that data, transformation into a suitable form, data mining itself, evaluation of the data, and afterwards online updating and visualization [1]. It is the analysis step of the "Knowledge Discovery" process. The actual task of data mining is the semi-automatic or automatic investigation of large batches of data to extract previously unknown records and dependencies [1]. The Knowledge Discovery process includes several steps which help in the efficient extraction of useful information from large datasets. These steps are sequential and are repeated iteratively until the useful information is extracted. Data mining is one of the essential steps in the KDD process [2].

Step 1: Selection Step: In the first step, suitable data for the investigation task is fetched from the database [3]. On the basis of this extraction the objective (target) dataset is formed [2].

Step 2: Pre-Processing Step: The data collected in the selection step suffers from problems like vagueness, missing values and irrelevant data, owing to its size and complexity. The data is molded into a form suitable for the data mining techniques with the help of the different tools used for data mining [2].

Figure 1: Sequential Steps of KDD Process
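Taken together, the KDD steps form a pipeline. As a rough, from-scratch illustration (all function names and the toy records below are invented for this sketch, not from the paper; WEKA performs these stages internally):

```python
# Minimal sketch of the sequential KDD steps on invented toy records.

raw = [
    {"age": 25, "bp": 80, "label": "ckd"},
    {"age": 40, "bp": None, "label": "notckd"},   # record with a missing value
    {"age": 60, "bp": 90, "label": "ckd"},
]

def select(records, fields):
    """Step 1 (Selection): keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def preprocess(records, field):
    """Step 2 (Pre-Processing): drop records with missing values in `field`."""
    return [r for r in records if r[field] is not None]

def transform(records, field, scale):
    """Step 3 (Transformation): normalize one numeric attribute into [0, 1]."""
    return [{**r, field: r[field] / scale} for r in records]

def mine(records):
    """Step 4 (Data Mining): a trivial 'pattern' - the majority label."""
    labels = [r["label"] for r in records]
    return max(set(labels), key=labels.count)

data = select(raw, ["age", "bp", "label"])
data = preprocess(data, "bp")
data = transform(data, "bp", 100)
pattern = mine(data)   # Step 5 would evaluate and visualize this result
print(pattern)
```

Each function here stands in for an entire KDD stage; real pipelines would use far richer cleaning and transformation logic.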

@ IJTSRD | Unique Paper ID – IJTSRD50568 | Volume – 6 | Issue – 5 | July-August 2022 Page 858
Step 3: Transformation Step: In the third step the data is molded into a form suitable for classification by performing operations such as aggregation, induction, normalization, discretization and feature construction [2] [3]. The WEKA tool is used for this research work.

Step 4: Data Mining: In the fourth step the data mining techniques (algorithms) are applied to draw out figures; data mining is used to analyze the dataset [2] [3]. In this work the classification algorithms J48, Random Tree, Naïve Bayes and Multilayer Perceptron are used for the investigation, using the WEKA machine learning tool.

Step 5: Interpretation/Evaluation Step: In this step data patterns are identified on the basis of some measures. To figure out and interpret the mining results correctly, users need a visualization approach to work with [2].

II. RELATED WORK
K. Ahmed, T. Jesmin, 2014, analyze the accuracy of data mining algorithms using three testing beds: the Percentage Split method, the Training Data Set method and the Cross Validation method. The classification is performed on a type-2 diabetes disease dataset. According to this research paper the top algorithms for classifying diabetes patients are Bagging (accuracy 85%) and Logistic and Multiclass Classifier (accuracy 81.82%) [4].

C. Anuradha, T. Velmurugan, 2015, predict the future outcome of the final year results from a UG students' dataset. Cross fold validation and percentage split are the two testing beds used in the classification. According to the research, Naïve Bayes and Bayes Net perform well on the dataset taken, while K-NN and OneR perform poorly [5].

S. Gupta, N. Verma, 2014, analyze classification algorithms on the basis of Mean Absolute Error, Root Mean Squared Error and the confusion matrix. The performance evaluation is done on the Naïve Bayes classifier, and according to the research the Mean Absolute Error and Root Mean Squared Error are lower in the case of the training data set. According to the evaluated results Naïve Bayes comes out to be the best suited algorithm [6].

R. Sharma et al, 2015, comparatively analyze various data mining algorithms using criteria such as definitiveness (accuracy), execution time, different datasets and their applications. The algorithms compared are the M5P, K Star, M5 Rule and Multilayer Perceptron algorithms. For a large dataset, K Star comes out with the highest definitiveness [7].

N. Orsu et al, 2013, describe different classification algorithms and compare them on micro-array data that helps in predicting the occurrence of tumors. The authors compare 14 classification algorithms on the basis of accuracy; according to the research work all classifiers show significant performance in terms of accuracy [8].

S. Khare, S. Kashyap, 2015, provide an analysis of different classification algorithms, including decision trees, bayesian networks, k-nearest neighbor classifiers and artificial neural networks. A brief description of data mining and classification is given in the paper, and the Voting dataset is used for the analysis. According to the research work, decision tree accuracy is better than that of the other algorithms [9].

Md. N. Amin, Md. A. Habib, 2015, work on the comparative analysis of the J48 decision tree, multilayer perceptron and naïve bayes. According to the authors, the best algorithm is J48 with an accuracy of 97.61%, and the algorithm with the lowest error rate, 27.91%, is Naïve Bayes [10].

S. Carl et al, 2016, work on the comparative analysis of the k-means, k-nearest neighbor, decision tree and naïve bayes algorithms. From the research performed, the authors found that the k-means algorithm has a lower error rate and is easier than KNN and the Bayesian approach [11].

S. Vijayarani, M. Muthulakshmi, 2013, work on the performance analysis of bayesian and lazy algorithms. Various performance factors like ROC area, Kappa statistics, TP rate etc. are used for the analysis. From the comparison it can be concluded that lazy classifiers are more efficient than bayesian classifiers [12].

S. Nikam, 2015, works on the comparative analysis of classification algorithms like C4.5, ID3, k-nearest neighbor, Naïve Bayes, SVM and ANN. Each algorithm has its limitations and features, and based on the conditions we can choose the algorithm best suited to our dataset [13].

G. Raj et al, 2018, show a comparative analysis of classification algorithms using WEKA on hematological data of diabetic patients. The algorithms studied are the J48 decision tree, ZeroR and Naïve Bayes. From this comparison it can be concluded that Naïve Bayes is the best algorithm on the diabetic data, with 76.3021% accuracy.

The Naïve Bayes classifier can be used to enhance the traditional classification methods used in the medical or bioinformatics areas [14].

N. Jagtap et al, 2017, provide a comprehensive analysis of different classification algorithms like Support Vector Machines, bayesian networks, genetic algorithms, fuzzy logic etc. The comparative study of the algorithms is done on the basis of their advantages and disadvantages [15].

N. Nithya et al, 2014, describe the Logistics, Simple Logistics and SMO algorithms, which are compared on the basis of accuracy measurement, TP rate, FP rate, precision, Kappa statistics etc. According to the analysis the Logistics method suits best among the function classifier algorithms, but with respect to time SMO produces the best result [16].

S. Chiranjibi, 2015, works on the comparative analysis of Naïve Bayes, Bayes Network, Logistics, decision tree, Multilayer Perceptron, REPTree, ZeroR and AdaBoost. From the work it can be concluded that the logistic algorithm is best, working well for a higher number of attributes and a higher number of instances [17].

C. Fernandes et al, 2017, describe different decision tree classifiers, which are used to forecast students' proficiency. CHAID has the highest accuracy rate at 76.11%, followed by C4.5 with 73.13% [18].

S. Srivastava et al, 2013, work on the performance of classification algorithms; results are compared and evaluation is done on already existing datasets. The accuracy of the SPRINT algorithm is higher and its performance is satisfactorily good [19].

A. Lohani et al, 2016, work on the comparative analysis of algorithms, with the result of the analysis shown graphically using ROC (Receiver Operating Characteristic) curves. This paper shows that if ensemble methods are used then better results can be seen; the C4.5 algorithm is not stable [20].

S. Devi, M. Sundaram, 2016, describe data mining and its various research domains, and meta and tree classifiers. The paper provides an analysis between meta and tree classifiers, and as a result of the analysis it is shown that the meta classifier is more efficient than the tree classifier [21].

S. Priya, M. Venila, 2017, describe cancer diagnosis, a field of healthcare in which the diagnosis of the disease is done with the help of data mining classification algorithms on the basis of correctly and incorrectly classified instances [22].

K. Danjuma, A. Osofisan, 2014, describe various classification algorithms, comparatively analyzed using the cross-fold validation method and sets of performance metrics. The analysis shows an accuracy of 97.4% for Naïve Bayes, with Multilayer Perceptron at 96.6% and J48 at a much lower 93.5% [23].

N. Kaur, N. Dokania, 2018, work on the comparative analysis of k-means and y-means on the basis of features like efficiency, the number of clusters an item belongs to, performance, shape of clusters, detection rate etc. [24].

E. Sondakh, R. Pungus, 2017, work on the comparative analysis of three classification algorithms to compose the best suited algorithm for a model. The three resulting models show no significant difference between the performance of Naïve Bayes and Decision Tree, while SVM shows the lowest performance [25].

K. Kishore, M. Reddy, 2017, describe data mining and its different techniques. Two things are explained: the comparison between different datasets using one algorithm, and the comparison of different algorithms using a single dataset [26].

III. RESEARCH METHODOLOGY
In data mining, classification of a large dataset is a problem. Data mining has various techniques like classification, regression, clustering etc. This paper mainly focuses on classification techniques and on the algorithms which help in classifying records. The datasets contain instances of the classes and the attributes which help in classifying the records. Random Tree, J48 decision tree, Multilayer Perceptron and Naïve Bayes are the algorithms used for the analysis of the classification techniques.

The research work mainly focuses on the comparative analysis of the classification algorithms Naïve Bayes, Multilayer Perceptron, Random Tree and J48 on the Chronic Kidney Disease dataset. The results of the comparative analysis are examined to deduce the best suited algorithm on the basis of definitiveness (accuracy), execution time, correctly classified instances and incorrectly classified instances.

A. DATASET USED: In this research work we have used the Chronic Kidney Disease (CKD) dataset. The main focus of this research is the performance and evaluation of the Naïve Bayes, Multilayer Perceptron, J48 and Random Tree algorithms. The dataset contains 400 instances and 25 attributes. For analyzing the performance of the classification algorithms the WEKA data mining tool is used.

Chronic Kidney Disease is a disease in which the kidney loses its function over a period of months or years. Clinical diagnosis of Chronic Kidney Disease is done with the help of urine and blood samples as well as by examining samples of kidney tissue. Early diagnosis and detection of the disease is very important so that failure of the kidney can be prevented. For predicting chronic kidney disease, data mining and analytics techniques are applied to historical patient data and diagnosis records. Using the CKD dataset, comparative analysis of the algorithms is done on the basis of accuracy, properly graded instances, improperly graded instances, error rate and execution time [28].

Figure 2: Abbreviations used in dataset

Figure 3: Instances and Attributes in Dataset

B. CLASSIFICATION: Classification is a data mining technique and a form of supervised learning with broad applications. A classification technique assigns each item of a set to one of a predefined set of classes or groups. Among all the techniques in data mining, classification is the apex technique. The dataset is inspected by classification and each instance of the dataset is considered; the inspected instances are assigned to the appropriate class such that there is the least error in the model [29].
Models defining the influential data classes lying within a particular dataset are drawn out using the classification technique. The two stages of classification are the application of the algorithm to construct the model, after which the constructed model is tested against a predefined dataset to measure the performance and definitiveness (accuracy) of the model. In this research work we have analyzed the Naïve Bayes, Random Tree, J48 and Multilayer Perceptron algorithms on the Chronic Kidney Disease dataset. These algorithms are briefly described below:
NAÏVE BAYES: Naïve Bayes is one of the classifier algorithms in data mining under the Bayes class; it can be seen as an applied form of Bayes' theorem. The Bayesian classifier calculates the probability of each possible result for a given input. Naïve Bayes assumes that the features of a class are unrelated to any other feature of the class [29]. The working of the naïve bayes algorithm is as follows:

P(d|b) = [ P(b|d) × P(d) ] / P(b)

P(d|b) = P(b1|d) × P(b2|d) × … × P(bn|d) × P(d)

Figure 4: Naïve Bayes Theorem [30]

where P(d|b) is the posterior probability of the class (target) given the predictor (attribute), P(d) is the prior probability of the class, P(b|d) is the likelihood, i.e. the probability of the predictor given the class, and P(b) is the prior probability of the predictor.
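This computation can be made concrete with a small from-scratch sketch over categorical features (the toy training tuples below are invented for illustration; WEKA's own NaiveBayes classifier would be used in the actual experiments):

```python
from collections import Counter, defaultdict

# Tiny from-scratch naive Bayes, illustrating
# P(d|b) proportional to P(b1|d) * ... * P(bn|d) * P(d).
# The training data is invented for illustration only.
train = [
    (("high", "yes"), "ckd"),
    (("high", "no"),  "ckd"),
    (("low",  "no"),  "notckd"),
    (("low",  "yes"), "notckd"),
    (("high", "yes"), "ckd"),
]

prior = Counter(label for _, label in train)          # class counts -> P(d)
likelihood = defaultdict(Counter)                     # (feature idx, class) -> value counts
for features, label in train:
    for i, v in enumerate(features):
        likelihood[(i, label)][v] += 1

def posterior(features):
    """Unnormalized P(class | features); the argmax is the prediction.
    Dividing by P(b) is unnecessary since it is the same for every class."""
    scores = {}
    for label, n in prior.items():
        p = n / len(train)                            # P(d)
        for i, v in enumerate(features):
            p *= likelihood[(i, label)][v] / n        # P(b_i | d)
        scores[label] = p
    return scores

scores = posterior(("high", "yes"))
print(max(scores, key=scores.get))
```

Note the independence assumption in action: each feature contributes its own likelihood factor, multiplied as if the features were unrelated.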
J48: J48 is WEKA's Java implementation of the C4.5 classifier. J48 produces a decision tree as its result. A decision tree is a tree-like structure with different nodes; each node contains a judgment and each judgment leads to a particular outcome [10]. J48 follows a simple algorithm:
New items are classified by constructing a decision tree from the values of the available training datasets; the attributes that segregate the distinct instances most clearly are identified [30]. In this way the highest information can be gained from the data instances [30]. The dataset is partitioned into mutually exclusive regions, where each region has its own tag, values and associated actions to describe its data points. This partitioning helps in deciding which portion of the tree leads to a particular resulting node [10].
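The "highest information" criterion can be sketched as an entropy and information-gain computation (a from-scratch illustration on invented toy rows; C4.5/J48 actually uses gain ratio, a normalized variant of this quantity):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_index):
    """Entropy reduction obtained by splitting `rows` on one attribute."""
    labels = [label for _, label in rows]
    total = entropy(labels)
    by_value = {}
    for features, label in rows:
        by_value.setdefault(features[attr_index], []).append(label)
    remainder = sum(len(part) / len(rows) * entropy(part)
                    for part in by_value.values())
    return total - remainder

# Invented toy rows (features, label): attribute 0 separates the two
# classes perfectly, attribute 1 not at all.
rows = [
    (("a", "x"), "ckd"), (("a", "y"), "ckd"),
    (("b", "x"), "notckd"), (("b", "y"), "notckd"),
]
print(information_gain(rows, 0))   # perfect split of 2 balanced classes
print(information_gain(rows, 1))   # useless split
```

A tree learner in the C4.5 family would place the attribute with the best score at the root and recurse on each partition.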
MULTILAYER PERCEPTRON: A single layer perceptron can classify only linearly separable problems. For non-linearly-separable problems we use more than one layer, i.e. a multilayer network. The multilayer (feed-forward) network has multiple layers, including one or more hidden layers containing neurons; these are the hidden neurons. Using past data, the input is correctly mapped to the output even when the desired output is not known in advance. For each input, the output of the neural network is compared with the desired output in order to compute the error [10].
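A one-hidden-layer forward pass and its error term can be sketched as follows (the weights are invented for illustration; training by backpropagation would then adjust them to reduce this error):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_w, output_w):
    """One hidden layer of sigmoid neurons feeding a single output neuron."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in hidden_w]
    return sigmoid(sum(w * h for w, h in zip(output_w, hidden)))

# Invented weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
output_w = [1.2, -0.6]

prediction = forward([1.0, 0.0], hidden_w, output_w)
desired = 1.0
error = 0.5 * (desired - prediction) ** 2   # squared error vs the desired output
print(round(prediction, 3), round(error, 3))
```

The comparison in the last lines is exactly the error computation described above: network output against the desired output.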

Figure of the multilayer network is shown below:

Figure 5: Multilayer Perceptron


RANDOM TREE: Random Tree is a type of supervised learning algorithm that produces many individual learners. Random trees were introduced by Leo Breiman and Adele Cutler. A random tree belongs to a group of tree predictors known as a forest. The random tree algorithm works as follows: the classifier receives an input feature vector, the vector is classified by every tree in the forest, and the class label that receives the majority of votes is given as the output. Two machine learning ideas are combined to form the random forest: the ideas of bagging and of random attribute selection are combined with single modeled trees.
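The voting step can be sketched with hand-written stub "trees" (each a tiny function standing in for a trained tree; everything here is invented for illustration, since real random trees are grown on bootstrap samples with random attribute subsets):

```python
from collections import Counter

# Each "tree" is a stub classifier: feature vector -> class label.
def tree_a(fv): return "ckd" if fv[0] > 0.5 else "notckd"
def tree_b(fv): return "ckd" if fv[1] > 0.5 else "notckd"
def tree_c(fv): return "ckd" if fv[0] + fv[1] > 1.0 else "notckd"

forest = [tree_a, tree_b, tree_c]

def predict(forest, fv):
    """The forest's output is the class with the majority of votes."""
    votes = Counter(tree(fv) for tree in forest)
    return votes.most_common(1)[0][0]

print(predict(forest, (0.9, 0.2)))   # votes: ckd, notckd, ckd -> ckd
```

Each tree sees the same input vector, and the final label is decided purely by majority vote, exactly as described above.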
TOOL USED: WEKA, the Waikato Environment for Knowledge Analysis, was developed at the University of Waikato in New Zealand. This machine learning software is written in Java. WEKA is a collection of visualization tools and algorithms for predictive modeling [27]. Different types of data mining algorithms can be tested on different types of datasets. The techniques supported by WEKA are data processing, classification, clustering, regression, visualization and feature selection [21]. The tool has five interfaces; the main user interface is the Explorer, with which we work here, but the other interfaces provide the same functionality as the Explorer [27].
IV. EXPERIMENTAL RESULTS
This research work analyses the performance of different classification algorithms on the Chronic Kidney Disease dataset. The classifiers are compared using accuracy, correctly classified instances, incorrectly classified instances, error rate and execution time, and their application domains are also discussed. Models for each algorithm are constructed using two methods: Cross Validation with 10 folds, of which 9 folds form the training set and 1 fold is used for testing, and Percentage Split, in which 60% of the dataset is used for training and 40% for testing.
Figures are shown for the comparison of the different classifiers on the CKD dataset using the 10-fold cross validation testing bed, and applications of these classifiers are discussed in the table. According to the table and research, the execution time taken by the Random Tree algorithm is least at 0.02 seconds, followed by Naïve Bayes with 0.03 seconds and J48 with 0.1 seconds, while Multilayer Perceptron took much more time, 8.97 seconds. The accuracy of Multilayer Perceptron is 99.75%, J48 99%, Random Tree 95.5% and Naïve Bayes 95%. The accuracies of the algorithms do not differ much from one another; nevertheless, according to this data the Multilayer Perceptron algorithm is the most accurate in the case of the 10-fold cross validation method.
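The two testing beds can be sketched as index bookkeeping over a labeled dataset (a from-scratch illustration on invented toy data with a trivial majority-label stand-in for a classifier; WEKA performs the equivalent splitting internally):

```python
import random

def cross_validation_accuracy(data, train_and_score, folds=10, seed=1):
    """Split data into `folds` parts; each part is the test set once
    while the remaining folds together form the training set."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    parts = [idx[i::folds] for i in range(folds)]
    scores = []
    for i in range(folds):
        test = [data[j] for j in parts[i]]
        train = [data[j] for p in parts[:i] + parts[i + 1:] for j in p]
        scores.append(train_and_score(train, test))
    return sum(scores) / folds

def percentage_split_accuracy(data, train_and_score, train_frac=0.6, seed=1):
    """First 60% of a shuffled copy trains, the remaining 40% tests."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return train_and_score(shuffled[:cut], shuffled[cut:])

# Stand-in "classifier": always predict the majority label of the training set.
def majority_score(train, test):
    labels = [y for _, y in train]
    guess = max(set(labels), key=labels.count)
    return sum(1 for _, y in test if y == guess) / len(test)

# Invented toy data: 75 "ckd" and 25 "notckd" records.
data = [((i,), "ckd" if i < 75 else "notckd") for i in range(100)]
cv = cross_validation_accuracy(data, majority_score)
ps = percentage_split_accuracy(data, majority_score)
print(cv, ps)
```

Any real classifier can be plugged in by replacing `majority_score` with a function that trains on the first argument and scores accuracy on the second.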


Figure 6: Result evaluation for different classification algorithms on CKD dataset


For Chronic Kidney Disease

Classifier            | Testing Bed      | Applications                                                                          | Execution Time | Accuracy
Naïve Bayes           | Cross Validation | Text classification, spam filtering, online applications, hybrid recommender systems | 0.03 seconds   | 95%
Multilayer Perceptron | Cross Validation | Machine learning, genetic algorithms, fault diagnosis, rotating machinery [33]       | 8.97 seconds   | 99.75%
Random Tree           | Cross Validation | Speech recognition, image recognition, machine translation software [32]             | 0.02 seconds   | 95.5%
J48                   | Cross Validation | Emotion recognition, vertebral column pathologies                                    | 0.1 seconds    | 99%

Table 1: Comparison of classifiers for CKD dataset using cross validation testing bed

Figure 7: Graphical representation of different algorithms' accuracy and execution time using the cross validation method

In the graph the abbreviation NB stands for Naïve Bayes, MP for Multilayer Perceptron and RT for Random Tree. The number of correctly classified instances is 380 for Naïve Bayes, 399 for Multilayer Perceptron, 382 for Random Tree and 396 for J48. The incorrectly classified instances number 20 for Naïve Bayes, 1 for Multilayer Perceptron, 18 for Random Tree and 4 for J48. The analysis for CKD using the percentage split method is given below:


For Chronic Kidney Disease

Classifier     | Naïve Bayes      | Multilayer Perceptron | Random Tree      | J48
Testing Bed    | Percentage Split | Percentage Split      | Percentage Split | Percentage Split
Execution Time | 0 seconds        | 0 seconds             | 0 seconds        | 0.01 seconds
Accuracy       | 95%              | 98.125%               | 96.25%           | 100%

Table 2: Comparison of classifiers for CKD dataset using percentage split method
According to the percentage split test method, it can be concluded that Naïve Bayes, Random Tree and Multilayer Perceptron took 0 seconds for execution while J48 took 0.01 seconds. The accuracy of the J48 algorithm comes out to be 100%, while Multilayer Perceptron is 98.125%, Naïve Bayes 95% and Random Tree 96.25% accurate. The number of correctly classified instances is 152 for Naïve Bayes, 157 for Multilayer Perceptron, 154 for Random Tree and 160 for J48; the incorrectly classified instances number 8, 3, 6 and 0 respectively.
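These counts are consistent with the reported accuracies, since accuracy is simply the number of correctly classified instances divided by the total number of test instances (a 40% split of the 400 records gives 160 test instances). A quick arithmetic check using the counts from Table 2:

```python
# Accuracy = correctly classified instances / total test instances.
# Percentage split: 40% of 400 records = 160 test instances.
test_total = 160
correct = {"Naive Bayes": 152, "Multilayer Perceptron": 157,
           "Random Tree": 154, "J48": 160}

for name, c in correct.items():
    accuracy = 100 * c / test_total
    print(f"{name}: {accuracy}%  ({test_total - c} incorrect)")
```

Running this reproduces the four accuracy figures in Table 2 (95%, 98.125%, 96.25% and 100%).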


Figure 8: Graphical representation of different algorithms' accuracy and execution time in percentage split
The abbreviations in the chart stand for Naïve Bayes, Multilayer Perceptron and Random Tree.
Graphical representations of the correctly and incorrectly classified instances by the classifiers are:

Figure 9: Correctly and incorrectly classified instances in case of Percentage Split

Figure 10: Correctly and incorrectly classified instances in case of Cross Validation
From the graphs it is analyzed that there is no great difference between the performances of the classification algorithms; all perform significantly well on the chronic kidney disease dataset. On the basis of the graph analysis, however, the Multilayer Perceptron classifier is most accurate when using the cross validation method and the J48 classifier is most accurate when using percentage split.
V. CONCLUSION
Comparison and investigation of the performance of various classification algorithms is done using different criteria: accuracy, execution time, correctly classified instances, incorrectly classified instances and error rate. According to the result evaluation it can be concluded that Multilayer Perceptron is the most accurate, with 99.75%, when the 10-fold cross validation method is applied to the CKD dataset, while for the Percentage Split method the J48 algorithm is the most accurate, with 100% accuracy. From figures 7 and 8 it can be seen that the algorithms do not differ significantly in their accuracies; hence the type and size of the dataset are the factors on which an algorithm's performance depends. Further result evaluation can be done on the performance of other classification techniques with large dataset samples. Apart from classification, techniques such as clustering, association and sequential patterns can be used to draw more efficient results.

VI. FUTURE WORK
In future, the focus will be on how to improve classifier performance so that classification techniques require less time to execute. For enhancing the performance, different classification algorithms can be used together.

REFERENCES
[1] https://en.wikipedia.org/wiki/Data_mining
[2] R. Sharma et al, "Comparative Analysis of Classification Techniques in Data Mining Using Different Datasets". International Journal of Computer Science and Mobile Computing, vol. 4, pp. 125-134, no. 12 (2015).
[3] https://data-flair.training/blogs/data-mining-and-knowledge-discovery/
[4] K. Ahmed, T. Jesmin, "Comparative Analysis of Data Mining Classification Algorithms in Type-2 Diabetes Prediction Data Using Weka Approach". International Journal of Science and Engineering, vol. 7, pp. 155-160, no. 2 (2014).
[5] C. Anuradha, T. Velmurugan, "A Comparative Analysis on the Evaluation of Classification Algorithms in the Prediction of Students Performance". International Journal of Science and Technology, vol. 8, no. 15 (2015).
[6] S. Gupta, N. Verma, "Comparative Analysis of the Classification Algorithms using Weka Tool". International Journal of Scientific and Engineering Research, vol. 7, no. 8 (2014).
[7] R. Sharma et al, "Comparative Analysis of Classification Techniques in Data Mining using Different Datasets". International Journal of Computer Science and Mobile Computing, vol. 4, pp. 125-134, no. 12 (2015).
[8] N. Orsu et al, "Performance Analysis and Evaluation of Different Data Mining Algorithms used for Cancer Classification". International Journal of Advanced Research in Artificial Intelligence, vol. 2, pp. 49-55, no. 5 (2013).
[9] S. Khare, S. Kashyap, "A Comparative Analysis of Classification Techniques on Categorical Data in Data Mining". International Journal on Recent and Innovation Trends in Computing and Communication, vol. 3, pp. 5142-5147, no. 8 (2015).
[10] Md. N. Amin, Md. A. Habib, "Comparison of Different Classification Techniques using WEKA for Hematological Data". American Journal of Engineering Research, vol. 4, pp. 55-61, no. 3 (2015).
[11] S. Carl et al, "Implementation of Classification Algorithms and their Comparisons for Educational Datasets". International Journal of Innovative Science, Engineering and Technology, vol. 3, pp. 700-705, no. 3 (2016).
[12] S. Vijayarani, M. Muthulakshmi, "Comparative Analysis of Bayes and Lazy Classification Algorithms". International Journal of Advanced Research in Computer and Communication Engineering, vol. 2, pp. 3118-3124, no. 8 (2013).
[13] S. Nikam, "A Comparative Study of Classification Techniques in Data Mining Algorithms". Oriental Journal of Computer Science and Technology, vol. 8, pp. 13-19, no. 1 (2015).
[14] G. Raj et al, "Comparison of Different Classification Techniques using WEKA for Diabetic Diagnosis". International Journal of Innovative Research in Computer and Communication Engineering, vol. 6, pp. 509-516, no. 1 (2018).
[15] N. Jagtap et al, "A Comparative Study of Classification Techniques in Data Mining Algorithms". International Journal of Modern Trends in Engineering and Research, vol. 4, pp. 58-63, no. 10 (2017).
[16] N. Nithya et al, "Comparative Analysis of Classification Function Algorithms in Data Mining". International Conference on Information and Image Processing, pp. 272-275, no. 2 (2014).
[17] S. Chiranjibi, "A Comparative Study for Data Mining Algorithms in Classification". Journal of Computer Science and Control Systems, vol. 8, pp. 29-32, no. 1 (2015).
[18] C. Fernandes et al, "A Comparative Analysis of Decision Tree Algorithms for Predicting Student's Performance". International Journal of Engineering Science and Computing, vol. 7, pp. 10489-10492, no. 4 (2017).
[19] S. Srivastava et al, "Comparative Analysis of Decision Tree Classification Algorithms". International Journal of Current Engineering and Technology, vol. 3, pp. 334-337, no. 2 (2013).
[20] A. Lohani et al, "Comparative Analysis of Classification Methods Using Privacy Preserving Data Mining". International Journal of Recent Trends in Engineering and Research, vol. 2, pp. 677-682, no. 4 (2016).

[21] S. Devi, M. Sundaram, "A Comparative Analysis of Meta and Tree Classification Algorithms Using WEKA". International Research Journal of Engineering and Technology, vol. 3, pp. 77-83, no. 11 (2016).
[22] S. Priya, M. Venila, "A Study on Classification Algorithms and Performance Analysis of Data Mining Using Cancer Data to Predict Lung Cancer Disease". International Journal of New Technology and Research, vol. 3, pp. 88-93, no. 11 (2017).
[23] K. Danjuma, A. Osofisan, "Evaluation of Predictive Data Mining Algorithms in Erythemato-Squamous Disease Diagnosis". International Journal of Computer Science Issues, vol. 11, pp. 85-94, no. 1 (2014).
[24] N. Kaur, N. Dokania, "Comparative Study of Various Techniques in Data Mining". International Journal of Engineering Sciences and Research Technology, vol. 7, pp. 202-209, no. 5 (2018).
[25] E. Sondakh, R. Pungus, "Comparative Analysis of Three Classification Algorithms in Predicting Computer Science Students Study Duration". International Journal of Computer and Information Technology, vol. 6, pp. 14-18, no. 1 (2017).
[26] K. Kishore, M. Reddy, "Comparative Analysis between Classification Algorithms and Data Set (1:N and N:1) Through WEKA". Open Access International Journal of Science and Engineering, vol. 2, pp. 23-28, no. 5 (2017).
[27] https://en.wikipedia.org/wiki/Weka_(machine_learning)
[28] F. Aqlan, R. Markle, "Data Mining for Chronic Kidney Disease". Proceedings of the 2017 Industrial and Systems Engineering Conference, vol. 4, no. 3 (2017).
[29] https://data-flair.training/blogs/classification-algorithms/
[30] https://www.google.com/search?q=naive+bayes+theorem+formula&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjXtcSJr-zbAhXMMY8KHbBVBK0Q_AUICigB&biw=1366&bih=662#imgrc=kwLT20eBUyxVdM:
[31] Mishra, B. Ratha, "Study of Random Forest Data Mining Algorithms for Microarray Data Analysis". International Journal on Advanced Electrical and Computer Engineering, vol. 3, pp. 5-7, no. 4 (2016).
[32] https://en.wikipedia.org/wiki/Multilayer_perceptron#Applications
[33] https://link.springer.com/chapter/10.1007/978-1-84628-814-2_82
