Performance Analysis of Data Mining Classification Method Using Naïve Bayes Algorithm To Predict Student Graduation Timeliness
ISSN No:-2456-2165
Abstract:- Graduation rate is one of the parameters of the effectiveness of educational institutions. A decrease in the student graduation rate affects college accreditation. University databases store student administrative and academic data; if explored appropriately using data mining techniques, these data can reveal patterns or knowledge that support decision making. This study measures the accuracy of the Naive Bayes algorithm when applied to the case of student graduation timeliness. The Naive Bayes method is a classifier that uses probability and statistical methods to predict future opportunities based on past experience. This research uses student data from the Informatics Engineering Education program of Padang State University, class of 2011. The variables used in this study were: NIM, name, gender, entry status, GPA, area of origin and employment status. Based on test results measuring the performance of the method, Naive Bayes achieves an accuracy of 93.48%. From this accuracy value it can be concluded that the Naive Bayes algorithm performs well in predicting the timeliness of student graduation.

I. INTRODUCTION

The Naïve Bayes method is a simple probabilistic classifier that calculates a set of probabilities by counting the frequency and combination of values in a given data set [3]. Thus, the use of an appropriate data mining method is expected to give the best accuracy in data classification. Previous research has compared the performance of several data mining classification methods, namely the Decision Tree and Naive Bayes algorithms. That study aimed to predict which students would drop out. In accuracy testing of both algorithms, the highest accuracy was obtained with the Decision Tree algorithm.

Other research applied the J48, Naïve Bayes, Random Tree, and Decision Stump algorithms to identify students who are weak and likely to fail their exams. The tests showed that J48 had the highest accuracy among the four algorithms used [3].

II. RESEARCH METHODS
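The frequency-based probability calculation that characterizes the Naive Bayes method can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names (train, score, predict), the two attributes, and the four toy records are hypothetical and chosen only for illustration. The sketch assumes categorical attributes and applies no smoothing, so a value never seen with a class gives that class a probability of 0.

```python
from collections import Counter, defaultdict


def train(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # (attribute, label) -> Counter of attribute values seen with that label
    value_counts = defaultdict(Counter)
    for record, label in zip(records, labels):
        for attribute, value in record.items():
            value_counts[(attribute, label)][value] += 1
    return class_counts, value_counts


def score(record, class_counts, value_counts):
    """Return P(label) * product of P(value | label) for every label.

    The scores are unnormalised and no smoothing is applied (an assumption).
    """
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        p = n_label / total  # prior probability of the class
        for attribute, value in record.items():
            p *= value_counts[(attribute, label)][value] / n_label
        scores[label] = p
    return scores


def predict(record, class_counts, value_counts):
    """Pick the label with the largest score."""
    scores = score(record, class_counts, value_counts)
    return max(scores, key=scores.get)


# Hypothetical toy records, NOT taken from the study's data set.
toy_records = [
    {"gender": "F", "employment": "not working"},
    {"gender": "F", "employment": "not working"},
    {"gender": "M", "employment": "working"},
    {"gender": "M", "employment": "not working"},
]
toy_labels = ["On time", "On time", "Late", "Late"]

model = train(toy_records, toy_labels)
query = {"gender": "F", "employment": "not working"}
toy_scores = score(query, *model)
toy_prediction = predict(query, *model)
```

For the toy query, the "On time" score is the class prior 2/4 multiplied by the two conditional frequencies 2/2 and 2/2, while the "Late" score is 0 because no "Late" record has that gender value.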
Table 4.2 Results of Probability Calculation for On-Time and Late Graduation

              Probability
No  NIM       On time  Late    Prediction
1   1102628   0.016    0       On time
2   1102631   0.003    0       On time
3   1102632   0.004    0       On time
4   1102638   0.001    0.003   Late
5   1102644   0        0.014   Late
6   1102650   0.001    0.003   Late
7   1102651   0.004    0       On time
8   1102656   0.004    0       On time
9   1102663   0        0.014   Late
10  1102664   0        0.009   Late
11  1102668   0        0.014   Late
12  1102672   0.008    0       On time
13  1102675   0.008    0       On time
14  1102676   0        0.004   Late
15  1102678   0        0.004   Late
16  1102687   0        0.01    Late
17  1102688   0.01     0       On time
18  1102691   0        0.01    Late
19  1102692   0.01     0       On time
20  1102696   0.014    0.018   Late
21  1102697   0.017    0.007   On time
22  1102698   0.012    0       On time
23  1102703   0.002    0.004   Late
24  1102705   0        0.003   Late
25  1102707   0.002    0.004   Late
26  1106999   0        0.017   Late
27  1107001   0        0.005   Late
28  1107016   0.007    0       On time
29  1107017   0.002    0.004   Late
30  1107025   0.008    0.012   Late
31  1107033   0.003    0.006   Late
32  1202175   0.012    0.001   On time
33  1202183   0.012    0.001   On time
34  1202191   0.012    0       On time
35  1202196   0.002    0       On time
36  1202197   0.002    0       On time
37  1203244   0.015    0.002   On time
38  1203237   0.007    0.003   On time
39  1203238   0.003    0.001   On time
40  1203239   0.003    0.001   On time
41  1206507   0.016    0       On time
42  1206519   0.006    0       On time
43  1206520   0        0.005   Late
44  1206522   0        0.042   Late
45  1206538   0        0.008   Late
46  1206545   0.017    0.007   On time

Table 4.3 Comparison Table of Accuracy and Error Algorithms C4.5 and Naive Bayes

Algorithm     Accuracy  Error
Naive Bayes   93.48%    6.52%

In the Naive Bayes error calculation, the error value of the Naive Bayes algorithm is 6.52%.

V. IMPLEMENTATION AND RESULTS

This section describes the implementation, that is, the testing carried out to compare the results of the manual calculations with the results produced by software that supports the Naïve Bayes algorithm. The aim is to check whether the data have been analyzed and processed correctly. The software used is RapidMiner Studio 7.5.3, an open source data mining application. For predicting the timeliness of these students' graduation, the data used in RapidMiner amounted to 92 records.

5.1 Naive Bayes Algorithm Accuracy and Error Rates

a. Naive Bayes

Naive Bayes Accuracy Rate. The Naive Bayes accuracy calculation gives an accuracy of 93.48%, because 86 of the 92 records are classified correctly.

Figure 5.1 Accuracy of Naive Bayes

Naive Bayes Error Rate. The Naive Bayes error calculation gives an error rate of 6.52%, because 6 of the 92 records are classified incorrectly.

Figure 5.2 Naive Bayes Error
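The prediction labels and the accuracy and error rates reported in this paper follow from a simple comparison and two ratios. The minimal Python sketch below shows that arithmetic; it is not the RapidMiner workflow used by the authors. The function names predict_label and accuracy_and_error are hypothetical, and labelling a tie as "Late" is an assumption, since the paper does not state a tie rule.

```python
def predict_label(p_on_time, p_late):
    """Choose the class with the larger probability.

    Ties fall to "Late" here; that tie rule is an assumption.
    """
    return "On time" if p_on_time > p_late else "Late"


def accuracy_and_error(n_correct, n_total):
    """Return (accuracy %, error %) from the number of correct predictions."""
    accuracy = 100.0 * n_correct / n_total
    error = 100.0 * (n_total - n_correct) / n_total
    return accuracy, error


# Probabilities reported for NIM 1102628 and NIM 1102638 in Table 4.2.
label_first = predict_label(0.016, 0.0)
label_fourth = predict_label(0.001, 0.003)

# 86 correctly classified records out of the 92 records used in RapidMiner.
accuracy, error = accuracy_and_error(86, 92)
print(f"accuracy = {accuracy:.2f}%, error = {error:.2f}%")
```

With 86 correct predictions out of 92 records, the ratio 86/92 rounds to the reported 93.48% accuracy, and the remaining 6/92 rounds to the 6.52% error rate.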
REFERENCES