Admass, J Comput Eng Inf Technol 2021, 10:11

Journal of Computer Engineering & Information Technology
Mini Review, a SciTechnol journal
Volume 10 • Issue 11 • 1000296

Review on Predicting Student Academic Performance using Data Mining Classification Algorithm

Wasyihun Sema Admass*

*Corresponding author: Wasyihun Sema Admass, Faculty of Informatics and Department of Information Technology, University of Gondar, Gondar, Ethiopia, E-mail: [email protected]

Received: November 03, 2021 Accepted: November 17, 2021 Published: November 24, 2021

Citation: Admass WS (2021) Review on Predicting Student Academic Performance using Data Mining Classification Algorithm. J Comput Eng Inf Technol 10:11.

All articles published in Journal of Computer Engineering & Information Technology are the property of SciTechnol, International Publisher of Science, Technology and Medicine, and are protected by copyright laws. Copyright © 2021, SciTechnol, All Rights Reserved.

Abstract

This paper reviews previous studies on predicting students' performance with various analytical methods. Most researchers have used cumulative grade point average (CGPA) and internal assessment as data sets. Among prediction techniques, classification is the one most frequently used in educational data mining. Among classification techniques, Neural Network and Decision Tree are the two methods most used by researchers for predicting students' performance. In conclusion, the meta-analysis on predicting students' performance has motivated us to carry out further research to be applied in our environment. It will help the educational system to monitor students' performance in a systematic way.

Keywords

Student Performance, Prediction Technique, Data-Mining, Algorithms.

Introduction

The explanation and prediction of academic performance is a widely researched topic, and the prediction of student performance is a topical debate in education. There is increasing research interest in applying data mining in the education field. Data mining is concerned with developing methods that discover knowledge from data and uncover hidden or unknown information that is not apparent but potentially useful [1].

In educational institutions the volume of data is growing rapidly, and it has to be transformed into useful information and knowledge; data mining techniques therefore play a special role in extracting useful and hidden patterns from tremendous amounts of data. Educational data mining (EDM) has become an emerging area of research interest amongst scientists and researchers across the globe. EDM converts raw data from traditional and online education systems into important and useful information for educational institutes and research [2]. Different scholars have carried out research in education to predict the academic performance of students. All of them agree that predicting students' academic performance helps to identify the status of students as slow learner (poor), good learner (good), medium learner (average), very good learner and excellent learner. Classifying students by their academic performance status helps keep students from failure and helps teachers focus on poor learners. Students can improve their learning activities, which in turn allows the administration to improve the system's performance. Thus, the application of data mining techniques can be focused on the specific needs of different entities. This systematic review seeks to answer the following questions:

• Which important attributes did researchers focus on to predict students' academic performance?
• What methods did different researchers use to predict students' academic performance?
• What future work is indicated by the reviewed articles?
• Which data mining algorithms, when used, predict students' academic performance best?

Objectives

The objectives of this systematic review on the prediction of student academic performance using data mining classification techniques are the following:

• To identify the attributes used to predict the academic performance of students
• To identify the gaps in existing prediction work and indicate future work
• To identify the methods used in existing work to predict students' performance.

Methodology

The reason for performing a meta-analysis style systematic review is to find suitable methods for the existing parameters, to fill the gaps in existing research and to place new research activity in a suitable context.

• Searching: A large number of articles have been published in the area of education under different titles. To perform a meta-analysis style review, searching by keywords is important in order to obtain multiple articles from different journals.

Usage of Data mining to predict students' academic performance

Many researchers have carried out research in the area of education to predict the academic performance of students using different data mining techniques.

Surjeet Kumar Yadav and Saurabh Pal [3] conducted research on 400 students to predict the academic performance of engineering students using decision tree algorithms (ID3, C4.5 and CART). The researchers use the past performance of students to predict whether a new student will perform or not, and predict the result as pass or fail. The experiments were conducted to find the best classifier for predicting students' performance in the first-year engineering exam. From the classifiers' accuracy it is clear that the true positive rate of the model for the FAIL class is 0.786 for the ID3 and C4.5 decision trees, which means the model successfully identifies the students who are likely to fail. These students can be considered for proper counselling so as to improve their results. The study also focuses on identifying those students who need special attention.

The paper by Vrushali Mhetre and Prof. Mayura Nagar [4] focuses on predicting academic performance as slow learner, average learner and fast learner. They applied various data mining techniques and compared their accuracy based on student attributes. The work aims to identify the best feature selection and classification algorithms for distinguishing slow, average and fast learners in an education data set, and to find the best attributes by comparing the performance of various feature selection techniques with different classification algorithms (Naïve Bayes, J48, ZeroR and Random Tree) using the WEKA tool. The idea is to identify slow learners so that faculty can give individual students special attention to improve their overall performance. The study found that the Random Tree technique performs best, with an accuracy of 95.4545%, and identifies students who are slow learners, which provides a basis for deciding on special aid for them.

Sagardeep Roy and Anchal Garg [5] conducted research on predicting student academic performance using data mining techniques, with the goal of helping students improve their skills and finding out what hinders students from achieving success and how to improve it. The study was done on 32 student attributes using the Naïve Bayes classifier, J48 decision tree and MLP classification algorithms. The accuracies were 68.6% for Naïve Bayes, 73.92% for J48 and 51.13% for MLP; J48 therefore achieved the best accuracy. The results identify the abilities of students, their interests and their weaknesses. Student performance can be influenced by different types of attributes, which can be social, demographic or school-related.

M. Mayilvaganan and D. Kalpanadevi [6] conducted research to predict student academic performance using classification algorithms, classifying students as excellent learner, good learner, average learner or slow learner for diagnosis, using three main classification techniques: decision tree, Naïve Bayes and k-nearest neighbor. The decision tree experiment shows 30% slow learners, 20% average learners, 40% good learners and 10% excellent learners. For the Naïve Bayes algorithm, a classifier based on probabilistic inference, the performance analysis (Figure 1) also shows 30% slow learners, 20% average learners, 40% good learners and 10% excellent learners. The k-nearest neighbor experiment shows 45% slow learners, 10% average learners, 5% good learners and 40% excellent learners. From these experiments and the analysis of classification accuracy, k-nearest neighbor took less time to classify student performance as excellent, good, average or slow learner. Compared with the other techniques, k-nearest neighbor performs best in terms of the time taken for classification, with examination results and other activities being significant in the rule set. This study is very useful for identifying the proportion of slow learners so that failures can be rectified early and action taken to improve weaker students.

Figure 1: Accuracy result of different prediction methods.

V. Ramesh, P. Parkavi and K. Ramar [7] focus on identifying weak students, so that the identified students can be individually assisted by educators and perform better in the future. The study also investigates the accuracy of some classification techniques for predicting the performance of students. The researchers use four different classification algorithms: Naïve Bayes, Multilayer Perceptron, J48 and REPTree. The experiments show that the Multilayer Perceptron (MLP) classifier is the most appropriate for predicting student performance, with a prediction accuracy of 72.38%, and the paper concludes with the important school-related factors that affect student performance.

Sajadin Sembiring [8] conducted research on predicting the performance of students based on their grade (GPA). The researcher grouped all the grades into five groups (excellent, very good, good, average and poor) and categorized the value of each questionnaire item as high, medium or low. The researcher uses two data mining techniques, SSVM and the kernel k-means clustering algorithm. The study was done on a sample of 300 students, and every sample is expressed by ten characteristic parameters: five performance predictors proposed in the study and five demographic characteristics of the student. The experimental results show that the average testing accuracy is lowest, 61%, for the prediction of "good" performance
and highest, 93.7%, for the prediction of "poor" performance. The results obtained are sufficient to show that the proposed rule model for predicting student performance from the predictors of student performance is acceptable and good enough to serve as a predictor of student performance.

The research paper by Ahmed Mueen [9] was conducted to achieve three basic objectives: the first was to predict student academic performance, the second was to reduce the number of attributes, and the last was to compare the classification accuracy of different classifiers. The researchers use three classifiers to achieve these objectives: Naïve Bayes, Multilayer Perceptron (neural network) and C4.5 (J48). The experiments give an accuracy of 86% for Naïve Bayes, 82.7% for Multilayer Perceptron and 79.2% for the decision tree (J48). From this the researchers conclude that the Naïve Bayes classifier predicts student performance more accurately than the others. Finally, the researchers analyzed the dataset to identify factors that cause a student to lose academic status due to academic performance. They found that poor student performance was due to lack of participation in the online discussion forum.

Important Attributes used to Predict Student Performance

The meta-analysis style systematic review helps to identify the important attributes used to predict student academic performance. Attributes that are frequently used and that play a large role in predicting student academic performance are treated as important attributes (Table 1).

The most frequently used attributes are GPA and assessment. Researchers frequently used GPA, either directly or indirectly, to predict students' academic performance. GPA is a good predictor because it is a tangible measurement for future education and career mobility. CGPA is the most influential attribute in determining the survival of students in their study, that is, whether they can complete their study or not. In this review, assessment covers assignment marks, quizzes, lab work, class tests and attendance. All of these are grouped into one attribute called internal assessment. These attributes are the ones most used by researchers to predict students' performance.

The next important attributes used to predict student academic performance are student demographic factors, which include gender, age, family background and disability. The reason researchers used demographic factors is to identify which sex has a better attitude to learning and is more strategic in studying. Other attributes used by researchers are extracurricular activity and high school background. Several researchers in other studies have also used psychometric factors to predict students' performance. Psychometric factors are identified as student interest, study behavior, engage time and family support. They used these attributes to make a system look very clear, simple and user friendly. It helps the lecturer to evaluate students' achievement based on their personal interest and behavior. However, these attributes are rarely applied in predicting students' performance, because they focus more on qualitative data and it is hard to get valid data from respondents.

Prediction Methods used in Predicting Student Performance

Predictive modelling is used to predict student performance. To use predictive modelling in educational data mining, different tasks may be performed, such as classification, regression and categorization. The most popular task for predicting students' performance is classification. Among classification techniques, the researchers use Decision Tree, Artificial Neural Networks, Naive Bayes, K-Nearest Neighbor and Support Vector Machine.

The specific application of data mining techniques in predicting student performance, grouped by algorithm, is described in the following:

Decision Tree

The decision tree is the most widely used classification algorithm in data mining. Decision tree models are easily understood because of their reasoning process, and they can be directly converted into a set of IF-THEN rules. Of the seven papers, six used decision tree algorithms (Table 2).

Naive Bayes

The Naïve Bayes algorithm is the next option for researchers predicting students' academic performance. Among the seven (7) papers, five (5) used the Naïve Bayes algorithm as the prediction method to estimate student performance. Table 3 shows the results predicted by the Naïve Bayes algorithm.

Neural Network

The next prediction method used by researchers to estimate student performance is the neural network. The researchers use the multi-layer perceptron algorithm to predict student performance. Among the seven papers, three used this neural network technique. Table 4 shows the results estimated by neural network techniques.

K-Nearest Neighbor

Researchers have also used the K-nearest neighbor data mining algorithm as a method to predict student performance. Of the seven papers, one used the K-nearest neighbor algorithm as the prediction method. According to [6], the estimated result for the students using the k-nearest neighbor algorithm is 45% slow learners, 10% average learners, 5% good learners and 40% excellent learners from the required data set.

REPTree, Random Tree and ZeroR

Some researchers also used these algorithms to predict student academic performance. According to [7], REPTree is used to estimate student performance based on demographic and psychometric attributes, with a predicted result of 60.13%. ZeroR was also used by [4] to estimate student performance, with a result of 36.36%. The Random Tree algorithm was used by [4] to predict student performance, with a result of 95.45% (Table 5).

Support vector Machine and K-means Clustering

Sajadin Sembiring [8] used these two methods to estimate student performance. According to [8], after the data set is prepared it is passed to the k-means clustering algorithm. The number of clusters was set as an external parameter, and the data were grouped into five clusters. The researcher also used SVM as a prediction method; the average testing accuracy is lowest, 61%, for the prediction of "good" performance and highest, 93.7%, for the prediction of "poor" performance. Figure 1 indicates the best prediction methods used in this systematic review.
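Several of the figures quoted in this review are per-class measures (a true positive rate for the FAIL class, a testing accuracy for the "poor" class) rather than overall accuracy. The following minimal Python sketch shows how the two kinds of measure differ. The function names and the ten labels are assumptions invented for illustration; they are not data or code from any of the reviewed studies.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of all instances whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """True positive rate per class: correct predictions of a class
    divided by the number of instances that truly belong to that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical labels for ten students (illustrative only).
y_true = ["pass", "pass", "pass", "pass", "pass", "pass",
          "fail", "fail", "fail", "fail"]
y_pred = ["pass", "pass", "pass", "pass", "pass", "fail",
          "fail", "fail", "fail", "pass"]

accuracy = overall_accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(accuracy)        # 0.8
print(recall["fail"])  # 0.75
```

In this toy case the overall accuracy (0.8) is higher than the true positive rate for the "fail" class (0.75), which is why a per-class figure is the more relevant one when the goal is to find students at risk.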

Table 1: Attribute factors used by different researchers.

Attributes | Authors
Student's branch, student's grade in high school, student's grade in senior secondary, medium of teaching, living location of teaching, student family size, student family status, family income, family occupation, result (Pass, Pro, Fail) | Surjeet Kumar Yadav and Saurabh Pal [3]
Student CBGS, curricular and extra-curricular activity, quiz, assignment marks, projects, result, learner {slow, average, fast} | Vrushali Mhetre and Prof. Mayura Nagar [4]
School, type of address, parents' cohabitation status, family educational qualification, family employment type, reason for opting for a certain school, time taken to travel to school, weekly study time, educational support given by family, internet access, family relationship, free time out of school, workday alcohol consumption, weekly alcohol consumption, current health status, absences in school, first year grade, second year grade, grades | Sagardeep Roy and Anchal Garg [5]
Speciality, lower class grade, higher class grade, extra knowledge or skill, attendance, hours spent studying, resources, seminar performance, result, class test grade (internal), lab work, exercise, homework, quiz, overall semester mark | M. Mayilvaganan and D. Kalpanadevi [6]
Grade obtained at secondary level, father occupation, mother occupation, school area at secondary level, school area at higher secondary level, private tuition at secondary level, group of study, student's community, school area at elementary level, parent's education |
Interest, Study Behaviour, Engage Time, Believe, and Family Support, with GPA as dependent variable |
Grade Point Average (GPA), quiz1, quiz2, quiz average, assignment submit, assignment delay, labtest1, labtest2, lab test average, final exam grade, total time spent, hours spent studying daily, methods of study used, city of birth, transport method, distance to the college, subjects interest, motivation level, difficulty doing homework, facilities in college, having home tuition, level of father education, level of mother education, attendance |

Table 2: Decision Tree Accuracy Result.

Methods | Result | Authors
Decision Tree | ID3 = 62.2%, C4.5 = 67.7%, CART = 62.2% | Surjeet Kumar Yadav and Saurabh Pal [3]
Decision Tree (J48) | 72.7% | Vrushali Mhetre and Prof. Mayura Nagar [4]
Decision Tree (J48) | 73.92% | Sagardeep Roy and Anchal Garg [5]
Decision Tree (J48) | 79.2% | Ahmed Mueen, Bassam Zafar, Umar Manzoor [9]
Decision Tree (J48) | 64.8% | V. Ramesh, P. Parkavi and K. Ramar [7]
Decision Tree (C4.5) | 30% slow learners, 20% average learners, 40% good learners and 10% excellent learners | M. Mayilvaganan and D. Kalpanadevi [6]
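The remark that a decision tree can be directly converted into a set of IF-THEN rules can be sketched as follows. This is a minimal illustration in Python, assuming a tree stored as nested dictionaries. The attribute names, split values and the helper `tree_to_rules` are hypothetical and do not come from any of the reviewed papers.

```python
def tree_to_rules(node, conditions=()):
    """Walk a decision tree and return one IF-THEN rule per leaf."""
    if isinstance(node, str):  # a leaf holds the predicted class
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


# Hypothetical tree; attributes and splits are invented for illustration.
toy_tree = {
    "attribute": "internal_assessment",
    "branches": {
        "high": "pass",
        "low": {
            "attribute": "attendance",
            "branches": {"good": "pass", "poor": "fail"},
        },
    },
}

rules = tree_to_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each path from the root to a leaf becomes one rule, so the number of rules equals the number of leaves (three in this toy tree).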

Table 3: Naive Bayes Accuracy Result.

Methods | Result | Authors
Naïve Bayes | 49.5% | V. Ramesh, P. Parkavi and K. Ramar [7]
Naïve Bayes | 30% slow learners, 20% average learners, 40% good learners and 10% excellent learners | M. Mayilvaganan and D. Kalpanadevi [6]
Naïve Bayes | 68.60% | Sagardeep Roy and Anchal Garg [5]
Naïve Bayes | 68.1818% | Vrushali Mhetre and Prof. Mayura Nagar [4]
Naïve Bayes | 85.7% | Ahmed Mueen, Bassam Zafar, Umar Manzoor [9]
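For readers unfamiliar with how a Naïve Bayes classifier reaches a prediction, the following is a minimal sketch for categorical attributes, written in plain Python. It assumes add-one (Laplace) smoothing, which the reviewed papers do not specify. The two attributes, the six training rows and the function names are invented for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains


def predict(model, row):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(domains[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: (internal assessment, attendance) -> result.
rows = [("high", "good"), ("high", "good"), ("high", "poor"),
        ("low", "good"), ("low", "poor"), ("low", "poor")]
labels = ["pass", "pass", "pass", "pass", "fail", "fail"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("low", "poor")))   # fail
print(predict(model, ("high", "good")))  # pass
```

The "naïve" part is the assumption that attributes are independent given the class, which is what allows the per-attribute probabilities to be multiplied (here, summed as logarithms).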

Table 4: Neural Network Accuracy Result.

Methods | Result | Authors
Neural network (MLP) | 72.38% | V. Ramesh, P. Parkavi and K. Ramar [7]
Neural network (MLP) | 51.1392% | Sagardeep Roy and Anchal Garg [5]
Neural network (MLP) | 81.4% | Ahmed Mueen, Bassam Zafar, Umar Manzoor [9]

Table 5: Accuracy Result for REPTree, Random Tree and ZeroR Algorithm.

Method | Result | Authors
REPTree | 60.13% | V. Ramesh, P. Parkavi and K. Ramar [7]
Random Tree | 95.45% | Vrushali Mhetre and Prof. Mayura Nagar [4]
ZeroR | 36.36% | Vrushali Mhetre and Prof. Mayura Nagar [4]
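ZeroR ignores every attribute and always predicts the most frequent class in the training data, so its accuracy is usually read as a baseline that any useful classifier should beat. A minimal Python sketch of that behaviour is given below. The function names and class labels are assumptions made up for illustration and are not data from [4].

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent class in the training labels."""
    return Counter(labels).most_common(1)[0][0]


def zero_r_accuracy(majority_class, test_labels):
    """Accuracy obtained by predicting the majority class for every test instance."""
    hits = sum(label == majority_class for label in test_labels)
    return hits / len(test_labels)


# Hypothetical class labels (illustrative only).
train_labels = ["slow", "average", "fast", "average", "slow", "average", "fast"]
test_labels = ["average", "slow", "fast", "average"]

majority = zero_r_fit(train_labels)
baseline = zero_r_accuracy(majority, test_labels)
print(majority, baseline)  # average 0.5
```

Because the prediction never depends on the attributes, the ZeroR accuracy simply equals the share of the majority class in the test set.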

Discussion

The discussion section presents the analyzed results of all papers on predicting students' academic performance. This meta-analysis style systematic review mainly focuses on identifying the important attributes and methods used in predicting student academic performance. It also identifies research gaps and indicates future work for researchers.

This review analyzes the important attributes in three basic categories: tangible value attributes, demographic attributes and psychometric attributes. The tangible attributes are measurable values like GPA and assessments; the demographic attributes are gender, age, family background, disability and other related attributes; and the psychometric attributes are student interest, study behavior, engage time and family support [3-7, 9]. Of these attributes, the best predictor is GPA. All other attributes directly or indirectly predict the GPA of the student.

The next important factor is the prediction method. From the above graph, Random Tree has the highest prediction accuracy (95.45%), followed by SVM, whose highest prediction accuracy, for the poor class, is 93.70%. The reason the Random Tree algorithm has high prediction accuracy is that all the attributes are tangible and predict students' performance clearly. Next, Naïve Bayes and Neural Network have high prediction accuracy, with 85.7% and 81.4% respectively [9], but Naïve Bayes is also among the lowest prediction methods, with 45.6% accuracy in [7]. The lowest prediction method in this systematic review is ZeroR, with 36.36% accuracy.

Evaluation and critique

Strong side

All the papers estimate students' academic performance in order to identify weak or low-performing students, to alert teachers to focus on those students so as to prevent failure, and to indicate that teachers should be more interactive with students, provide proper guidance and motivate them. All of these researchers use tangible, demographic and psychometric variables to predict the performance of students (weak or low-performing students and strong or well-performing students).

Weak side

The researchers focused only on assessment attributes, demographic factors and psychometric attributes. These attributes are not the only important factors that predict student performance. Other factors, such as psychological factors, alcohol consumption and romantic (love) relationships, are also important for estimating the performance of students.

Conclusion

Estimating student performance is important for teachers and students in improving their teaching and learning process. This systematic review has reviewed previous studies on predicting student academic performance with various prediction methods. Most researchers used demographic, psychometric and tangible attributes, but the most frequently used attributes are student GPA and assessments (tangible), while among prediction methods classification is the most frequently used. Among classification methods, decision tree, Naïve Bayes and neural network are widely used by researchers. In conclusion, the review on predicting students' performance has motivated us to carry out further research to be applied in our environment. It will help the educational system to monitor students' performance in a systematic way and motivate research into the reasons why student skills are diminished (educational fraud for the student).

References

1. Sajadin Sembiring (2011) Prediction of student academic performance by an application of data mining techniques.
2. Pamela Chaudhury (2016) Enhancing the capabilities of Student Result Prediction System.
3. Surjeet Kumar Yadav (2012) Data Mining: A Prediction for Performance Improvement of Engineering Students using Classification. World of Comp Sci and Info Tech J (WCSIT) 2(2): 51-56.
4. Vrushali Mhetre (2017) Classification based data mining algorithms to predict slow, average and fast learners in educational system using Weka. IEEE International Conference on Computing Methodologies and Communication.
5. Sagardeep Roy (2017) Predicting Academic Performance of Student Using Classification Techniques.
6. Mayilvaganan (2014) Comparison of Classification Techniques for predicting the performance of Students Academic Environment. International Conference on Communication and Network Technologies (ICCNT).
7. Ramesh (2013) Predicting Student Performance: A Statistical and Data Mining Approach. Int J Comput Appl 63: 975-8887.
8. Sajadin Sembiring (2011) Prediction of Student Academic Performance by an Application of Data Mining Techniques. International Conference on Management and Artificial Intelligence 6.
9. Ahmed Mueen (2016) Modeling and Predicting Students' Academic Performance Using Data Mining Techniques. IJ Modern Education and Computer Science.

Author Affiliation


Department of information technology, University of Gondar, Gondar, Ethiopia.