
INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 9, ISSUE 04, APRIL 2020 ISSN 2277-8616

Feature Selection Algorithms For Predicting Students' Academic Performance Using Data Mining Techniques

Abeje Orsango Enaro, Dr. Sudeshna Chakraborty

Abstract: Educational Data Mining (EDM) is used by educational organizations to enhance the academic progress of students. For predicting the academic achievement of students, EDM offers many feature selection and Machine Learning techniques. The purpose of using these feature selection techniques is to remove from student academic datasets the unwanted elements that are not required for performance prediction. By using feature selection techniques, the quality of students' datasets is improved, and with it, the predictive accuracy of various data mining techniques is also enhanced. Taking these facts into consideration, four feature selection and six classification techniques are implemented on a student dataset to check the predictive accuracy. After the implementation of the FS and classification techniques, only the CfsSubsetEval and GainRatioAttributeEval feature selection algorithms gave improved efficiency, by up to 5%.

Index Terms: Educational Data Mining, Attribute Selection, Classification, Prediction, Accuracy
————————————————————

1 INTRODUCTION
Data Mining (DM) techniques are used to find concealed information inside larger bodies of data. Their use in education has become prevalent lately, and many researchers work in this area. The broad field of EDM ranges from predicting students' placement to predicting their academics. It is an evolving interdisciplinary area in which DM techniques are applied to academic data. Nowadays, educational systems store massive data that come from multiple sources and in diverse formats. In a real-life scenario, every educational problem requires different mining techniques, because traditional DM techniques cannot be applied directly to all issues. Many software tools have been developed, but not all of them handle educational problems, and an information officer cannot use these tools without an understanding of DM. EDM is an essential application of data mining techniques for solving research issues in education. In the educational research community, the main focus areas are Intelligent Tutoring Systems (ITS), Online Tutorial Systems (OTS) and e-learning, used to fabricate enhanced educational outcomes. A university can determine the academic performance of a student by using a number of parameters, which may be based on academic or non-academic factors. For example, students who excelled at the secondary education level can lose their interest due to social lifestyles and peer pressure, while those who were struggling earlier with family distractions might be able to concentrate from home and excel at the university.

Feature Selection (FS) is a productive and dynamic research area in the field of Machine Learning and Data Mining. The principal purpose of FS algorithms is to select the most predictive features from the chosen dataset for analysis and to ignore the rest of the attributes, which are non-predictive. Non-predictive elements do not affect the actual result, but removing them reduces the complexity of the analysis. The accuracy and effectiveness of a student-performance prediction model can be improved with the help of these feature selection algorithms. Feature selection algorithms can be further divided into three groups, namely filter, wrapper and embedded methods. Filter methods are one of the primary techniques; they depend on the general characteristics of the learning data and are performed during the pre-processing phase of the dataset. Wrapper methods evaluate feature subsets using the learning algorithm itself. Embedded methods are executed during the classifier's learning process and are more specific to the learning algorithm.

2 RELATED LITERATURE
This section is a short review of work done in the area of feature selection algorithms by different researchers. Many authors used feature selection (FS) algorithms in combination with classification algorithms to compare the prediction accuracy on varying student datasets. Some of the interesting work in this field of EDM is reviewed here. S. Sivakumar, S. Venkataraman, et al., in "Predictive Modeling of Student Dropout Indicators in Educational Data Mining using Improved Decision Tree," proposed an improved version of the decision tree algorithm to predict dropout students. A dataset of 240 students was collected by the authors via survey, and a correlation-based feature selection algorithm was then applied for pre-processing of the dataset. The classification accuracy on this dataset is more than 90%. K. W. Stephen et al., in the study "Data Mining Model for Predicting Student Enrolment in STEM Courses in Higher Education Institutions," predict fresh students' enrolment in STEM (Science, Technology, Engineering and Mathematics) courses. They selected 18 different features and collected data from students through a questionnaire. For the pre-processing

____________________________

 Abeje Orsango Enaro, Research Scholar, Department of Computer Science and Engineering, Sharda University, Greater Noida, Uttar Pradesh 201310. Email Id: [email protected]
 Dr. Sudeshna Chakraborty, Assistant Professor, Department of Computer Science and Engineering, Sharda University, Greater Noida, Uttar Pradesh 201310. Email Id: [email protected]

IJSTR©2020
www.ijstr.org
of data, the authors used the Chi-Square and IG feature selection algorithms and found the best prediction with the CART decision tree algorithm. E. Osmanbegović, et al., in the study "Determining Dominant Factor for students Performance Prediction by using Data Mining Classification Algorithms," calculate the academic performance of secondary school students at Tuzla. For the pre-processing phase of the collected dataset, they used the Gain Ratio (GR) feature selection algorithm. They found the best prediction accuracy with the Random Forest (RF) algorithm as compared to other classification algorithms. A. Figueira, et al., in "Predicting Grades by Principal Component Analysis: A Data Mining Approach to Learning Analytics," predict students' academic grades in a Bachelor Degree program. For the pre-processing phase, the authors used the Principal Component Analysis (PCA) feature selection algorithm; PCA was used to build a decision tree, which in turn predicts the student's academic grade. N. Rachburee and W. Punlumjeak et al., in the study "A comparison of feature selection approach between greedy, IG-ratio, Chi-square, and mRMR in educational mining," compare different feature selection algorithms, namely IG-ratio, Chi-Square, Greedy Forward selection and mRMR. This work was conducted on a first-year students' dataset (with 15 attributes) from the University of Technology, Thailand. The authors found better prediction accuracy by using Greedy Forward (GF) selection with an Artificial Neural Network (ANN) as compared to other classification algorithms (Decision Tree, K-NN and Naive Bayes). M. Zaffar, M. A. Hashmani et al., in the study "Performance analysis of feature selection algorithm for educational data mining," implemented different filter feature selection algorithms on selected student datasets. In this research, the authors used two different student datasets with a number of feature selection algorithms and analyzed the results for prediction accuracy.

3 RESEARCH METHODOLOGY
The dataset considered for this study is taken from Kalboard-360, which is a multi-agent Learning Management System. In this technological era, such an online learning platform provides the user with unlimited access to educational resources from several places and on any device with an internet connection. The dataset consists of 16 features and 480 records of students, of which 305 are male and 175 are female. These features can be classified into three major categories, namely demographics, academic background and behavioural characteristics. In Table 1, all 16 attributes are considered for the analysis of the dataset mentioned above, and the following output is achieved, where Random Forest provides the most correctly classified instances, with accuracy up to 76.67% (Naïve Bayes achieves 74.31%).

Table 1: Performance of Students' Academic Performance Dataset (xAPI-Edu-Data)

Algorithm            Precision  Recall  F-Measure  Correctly Classified Instances  Incorrectly Classified Instances
Naive Bayes          0.742      0.743   0.741      74.3056 %                       25.6944 %
JRip                 0.726      0.723   0.724      72.2917 %                       27.7083 %
Random Forest        0.766      0.767   0.766      76.6667 %                       23.3333 %
J48                  0.760      0.758   0.759      75.8333 %                       24.1667 %
DecisionTable        0.728      0.727   0.727      72.7083 %                       27.2917 %
Logistic Regression  0.738      0.738   0.738      73.75 %                         26.25 %
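The accuracy columns in Table 1 can be reproduced directly from raw predictions; a minimal sketch follows (the labels below are invented for illustration, not the actual xAPI-Edu-Data output):

```python
# Sketch: how "Correctly Classified Instances" percentages are derived
# from a list of actual vs. predicted class labels.
def classification_summary(actual, predicted):
    """Return (correctly, incorrectly) classified instances as percentages."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    cci = 100.0 * correct / len(actual)
    return cci, 100.0 - cci

# Hypothetical labels for a 3-class problem (the dataset's classes are
# Low/Medium/High achievement levels):
actual    = ["L", "M", "H", "M", "L", "H", "M", "M"]
predicted = ["L", "M", "M", "M", "L", "H", "H", "M"]
cci, ici = classification_summary(actual, predicted)
print(f"Correctly Classified: {cci:.4f} %, Incorrectly: {ici:.4f} %")
```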

Different Feature Selection (FS) Algorithms:
Here, four FS algorithms, CfsSubsetEval, GainRatioAttributeEval, InfoGainAttributeEval and ReliefAttributeEval, are evaluated. The classification algorithms Naive Bayes (NB), Logistic Regression (LR), DecisionTable (DT), JRip, J48 and Random Forest (RF) are evaluated with each of them. cfsSubsetEval: Attribute subsets are evaluated based on both the predictive ability and the degree of redundancy of each feature; subsets whose features are highly correlated with the class but have low intercorrelation with each other are preferred. Attribute Subset Evaluator (CfsSubsetEval) + Search Method (Best first (forwards)). In Table 2, the best seven attributes (gender, Relation, raisedhands, VisITedResources, AnnouncementsView, ParentAnsweringSurvey, StudentAbsenceDays) were selected based on the above FS algorithm. With these attributes, Random Forest provides the most correctly classified instances, with accuracy up to 77.29%, and the lowest is DecisionTable with accuracy up to 72.92%.
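This trade-off between class correlation and redundancy can be sketched with the standard CFS "merit" heuristic, Merit = k·r_cf / sqrt(k + k(k-1)·r_ff); this is an illustrative assumption of how cfsSubsetEval scores subsets, and the correlation values below are invented:

```python
import math

# Sketch of the CFS merit heuristic: a subset of k features scores well
# when the mean feature-class correlation (r_cf) is high and the mean
# feature-feature correlation (r_ff) is low.
def cfs_merit(feature_class_corrs, mean_feature_feature_corr):
    k = len(feature_class_corrs)
    r_cf = sum(feature_class_corrs) / k   # mean feature-class correlation
    r_ff = mean_feature_feature_corr      # mean feature-feature correlation
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# The same class correlations with lower redundancy score higher:
print(cfs_merit([0.6, 0.5, 0.55], 0.1))  # low redundancy
print(cfs_merit([0.6, 0.5, 0.55], 0.6))  # high redundancy -> lower merit
```

A best-first search, as used here, would repeatedly extend the current subset with whichever remaining attribute most improves this merit.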

Table-2: CfsSubsetEval algorithm used for Academic Performance Evaluation


Algorithm Precision Recall F-Measure Correctly Classified Instances Incorrectly Classified Instances
Naive Bayes 0.744 0.746 0.743 74.5833 % 25.4167 %
JRip 0.738 0.738 0.738 73.75 % 26.25 %
Random Forest 0.773 0.773 0.773 77.2917 % 22.7083 %
J48 0.760 0.760 0.760 76.0417 % 23.9583 %
DecisionTable 0.729 0.729 0.729 72.9167 % 27.0833 %
Logistic Regression 0.763 0.763 0.763 76.25 % 23.75 %

GainRatioAttributeEval: Estimates the value of an attribute by calculating the gain ratio with respect to the class attribute: GainR(Class, Attribute) = (H(Class) - H(Class | Attribute)) / H(Attribute). Attribute Subset Evaluator (GainRatioAttributeEval) + Search Method (Ranker). In Table 3, the best seven attributes (StudentAbsenceDays, raisedhands, VisITedResources, AnnouncementsView, ParentAnsweringSurvey, Discussion, Relation) were selected based on the above FS algorithm. RF and J48 provide the most correctly classified instances, with accuracy up to 76.46%, and the lowest is DecisionTable with accuracy up to 71.67%.
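The gain-ratio formula quoted above can be sketched in a few lines of Python; the toy attribute and class columns below are invented for illustration:

```python
import math
from collections import Counter

# GainR(Class, Attribute) = (H(Class) - H(Class | Attribute)) / H(Attribute)
def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(attribute, labels):
    n = len(labels)
    # H(Class | Attribute): label entropy within each attribute value, weighted
    h_cond = 0.0
    for v in set(attribute):
        subset = [l for a, l in zip(attribute, labels) if a == v]
        h_cond += len(subset) / n * entropy(subset)
    h_attr = entropy(attribute)
    return (entropy(labels) - h_cond) / h_attr if h_attr else 0.0

# Toy example: a StudentAbsenceDays-style attribute vs. an achievement class
absences = ["Under-7", "Under-7", "Above-7", "Above-7"]
grades   = ["High",    "High",    "Low",     "Low"]
print(gain_ratio(absences, grades))  # 1.0: the attribute fully determines the class
```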

Table-3: GainRatioAttributeEval algorithm used for Academic Performance Evaluation


Algorithm Precision Recall F-Measure Correctly Classified Instances Incorrectly Classified Instances
Naive Bayes 0.750 0.752 0.749 75.2083 % 24.7917 %
JRip 0.736 0.735 0.735 73.5417 % 26.4583 %
Random Forest 0.764 0.765 0.764 76.4583 % 23.5417 %
J48 0.764 0.765 0.764 76.4583 % 23.5417 %
DecisionTable 0.719 0.717 0.717 71.6667 % 28.3333 %
Logistic Regression 0.754 0.754 0.754 75.4167 % 24.5833 %

InfoGainAttributeEval: Estimates the value of an attribute by calculating the information gain with respect to the class attribute: InfoGain(Class, Attribute) = H(Class) - H(Class | Attribute). Attribute Subset Evaluator (InfoGainAttributeEval) + Search Method (Ranker). In Table 4, the best seven attributes (StudentAbsenceDays, raisedhands, VisITedResources, AnnouncementsView, ParentAnsweringSurvey, Discussion, and Relation) were selected based on the above FS algorithm. RF provides the most correctly classified instances, with accuracy up to 75.63%, and the lowest is DecisionTable with accuracy up to 69.33%.
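Combined with the Ranker search method, each attribute gets an independent information-gain score and the attributes are returned sorted by that score; a minimal sketch with invented columns:

```python
import math
from collections import Counter

# InfoGain(Class, Attribute) = H(Class) - H(Class | Attribute), then rank.
def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(attribute, labels):
    n = len(labels)
    h_cond = 0.0
    for v in set(attribute):
        subset = [l for a, l in zip(attribute, labels) if a == v]
        h_cond += len(subset) / n * entropy(subset)
    return entropy(labels) - h_cond

def rank_attributes(columns, labels):
    """Rank attribute columns by information gain, best first (Ranker-style)."""
    scores = {name: info_gain(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

labels = ["High", "High", "Low", "Low"]
columns = {
    "raisedhands":  ["many", "many", "few", "few"],  # fully predictive here
    "PlaceofBirth": ["A", "B", "A", "B"],            # uninformative here
}
print(rank_attributes(columns, labels))
```

Taking the top seven entries of such a ranking is what yields the attribute lists reported for the Ranker-based evaluators.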

Table-4: InfoGainAttributeEval algorithm used for Academic Performance Evaluation


Algorithm            Precision  Recall  F-Measure  Correctly Classified Instances  Incorrectly Classified Instances
Naive Bayes          0.721      0.722   0.721      72.2222 %                       27.7778 %
JRip                 0.702      0.702   0.701      70.2083 %                       29.7917 %
Random Forest        0.756      0.756   0.756      75.625 %                        24.375 %
J48                  0.715      0.713   0.714      71.25 %                         28.75 %
DecisionTable        0.695      0.693   0.693      69.3252 %                       30.6748 %
Logistic Regression  0.707      0.708   0.708      70.8333 %                       29.1667 %

ReliefAttributeEval: Operates on both continuous and discrete class data. It estimates the value of an attribute by repeatedly sampling an instance and considering the value of the given attribute for the nearest instances of the same and of a different class. Attribute Subset Evaluator (ReliefFAttributeEval) + Search Method (Ranker). In Table 5, the best seven attributes (NationaliTy, PlaceofBirth, raisedhands, VisITedResources, AnnouncementsView, ParentAnsweringSurvey, StudentAbsenceDays) were selected based on the above FS algorithm. RF provides the most correctly classified instances, with accuracy up to 75.63%, and the lowest is DecisionTable with accuracy up to 67.92%.
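A rough sketch of this sampling idea follows. This is a simplified Relief, not WEKA's full ReliefF (which uses k neighbours and m random samplings); the numeric toy data is invented and assumed scaled to [0, 1]:

```python
# Relief idea: sample an instance, find its nearest neighbour of the same
# class (hit) and of a different class (miss), and reward attributes that
# differ on the miss but agree on the hit.
def relief_weights(instances, labels):
    n_attrs = len(instances[0])
    weights = [0.0] * n_attrs
    for i, (x, y) in enumerate(zip(instances, labels)):
        def nearest(same_class):
            candidates = [
                j for j in range(len(instances))
                if j != i and (labels[j] == y) == same_class
            ]
            return min(candidates, key=lambda j: sum(
                abs(a, ) if False else abs(a - b)
                for a, b in zip(x, instances[j])))
        hit = instances[nearest(True)]
        miss = instances[nearest(False)]
        for k in range(n_attrs):
            weights[k] += abs(x[k] - miss[k]) - abs(x[k] - hit[k])
    return weights

# Attribute 0 separates the two classes; attribute 1 is noise-like:
data   = [[0.0, 0.5], [0.1, 0.45], [0.9, 0.5], [1.0, 0.45]]
labels = ["Low", "Low", "High", "High"]
print(relief_weights(data, labels))  # weight of attribute 0 dominates
```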

Table 5: ReliefAttributeEval used for Academic Performance Evaluation


Algorithm            Precision  Recall  F-Measure  Correctly Classified Instances  Incorrectly Classified Instances
Naive Bayes          0.704      0.706   0.703      70.625 %                        29.375 %
JRip                 0.702      0.702   0.701      70.2083 %                       29.7917 %
Random Forest        0.756      0.756   0.756      75.625 %                        24.375 %
J48                  0.715      0.713   0.713      71.25 %                         28.75 %
DecisionTable        0.682      0.679   0.680      67.9167 %                       32.0833 %
Logistic Regression  0.707      0.708   0.708      70.8333 %                       29.1667 %


4 RESULTS AND DISCUSSIONS
In the given work, our primary focus is on evaluating the performance of four FS algorithms on the student academic performance dataset. An FS algorithm's performance can be measured through various parameters like recall, precision, F-measure and predictive accuracy. The F-measure is defined as the harmonic mean of precision and recall. The results of the four FS algorithms are reported by applying the six classifiers, in Table 2 to Table 5; these tables give the results obtained with each FS algorithm along with their evaluation parameters.
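As a quick sketch of that definition (note that the weighted-average F-measure WEKA reports in the tables is aggregated per class, so it need not exactly equal the harmonic mean of the weighted precision and recall):

```python
# F-measure as the harmonic mean of precision and recall:
# F = 2 * P * R / (P + R)
def f_measure(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative values close to the Naive Bayes row of Table 1:
print(f_measure(0.742, 0.743))
```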

Table 6: Correctly Classified Instances (CCI) by all attribute selection algorithms

ML Algorithm  CCI (all attributes)  CCI (CfsSubsetEval)  CCI (GainRatioAttributeEval)  CCI (InfoGainAttributeEval)  CCI (ReliefAttributeEval)
NB            74.3056 %             74.5833 %            75.2083 %                     72.2222 %                    70.625 %
JRip          72.2917 %             73.75 %              73.5417 %                     70.2083 %                    70.2083 %
RF            76.6667 %             77.2917 %            76.4583 %                     75.625 %                     75.625 %
J48           75.8333 %             76.0417 %            76.4583 %                     71.25 %                      71.25 %
DT            72.7083 %             72.9167 %            71.6667 %                     69.3252 %                    67.9167 %
LR            73.75 %               76.25 %              75.4167 %                     70.8333 %                    70.8333 %
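The per-classifier pattern discussed below can be read directly off this summary table; a small sketch using the CCI values copied from it:

```python
# CCI values from the summary table above, used to find which attribute
# selection method gives each classifier its best accuracy.
cci = {
    "NB":   {"all": 74.3056, "Cfs": 74.5833, "GainRatio": 75.2083, "InfoGain": 72.2222, "Relief": 70.6250},
    "JRip": {"all": 72.2917, "Cfs": 73.7500, "GainRatio": 73.5417, "InfoGain": 70.2083, "Relief": 70.2083},
    "RF":   {"all": 76.6667, "Cfs": 77.2917, "GainRatio": 76.4583, "InfoGain": 75.6250, "Relief": 75.6250},
    "J48":  {"all": 75.8333, "Cfs": 76.0417, "GainRatio": 76.4583, "InfoGain": 71.2500, "Relief": 71.2500},
    "DT":   {"all": 72.7083, "Cfs": 72.9167, "GainRatio": 71.6667, "InfoGain": 69.3252, "Relief": 67.9167},
    "LR":   {"all": 73.7500, "Cfs": 76.2500, "GainRatio": 75.4167, "InfoGain": 70.8333, "Relief": 70.8333},
}

for clf, scores in cci.items():
    best = max(scores, key=scores.get)
    print(f"{clf}: best with {best} ({scores[best]} %)")
```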

In the summary table above, six algorithms are used, out of which Naïve Bayes and J48 give their maximum output with GainRatioAttributeEval, at 75.21% and 76.46% correctly classified instances. JRip, RF, DT and LR, on the other hand, reach their maximum output with CfsSubsetEval, at 73.75%, 77.29%, 72.92% and 76.25% correctly classified instances, respectively. The graphical representation of this table is given in Figure 1.

[Figure 1: Graphical representation of Correctly Classified Instances]

5 CONCLUSION
In this work, different FS algorithms were evaluated and analyzed with different classification algorithms (Random Forest, JRip, J48, DecisionTable, Logistic Regression and Naive Bayes). The implementation results of these FS algorithms do not show any significant change, ranging from 67.9167% to 77.2917% using the WEKA toolkit. The CfsSubsetEval algorithm with the Random Forest algorithm gave the highest accuracy, up to 77.2917%, and the ReliefAttributeEval algorithm with DecisionTable gave the lowest accuracy, 67.9167%. From Figure 1, it is clear that Random Forest, with almost all feature selection algorithms, shows better accuracy than the other algorithms in combination. In future work, more feature selection algorithms can be analyzed with different classification algorithms to get better efficiency, and the same work can also be done on different student academic datasets. Apart from this, we cannot overlook the benefits of feature selection techniques in Data Mining.

6 REFERENCES
[1]. S. Sivakumar, S. Venkataraman, and R. Selvaraj, "Predictive Modeling of Student Dropout Indicators in Educational Data Mining using Improved Decision Tree," Indian Journal of Science and Technology, vol. 9, 2016.
[2]. K. W. Stephen, "Data Mining Model for Predicting Student Enrolment in STEM Courses in Higher Education Institutions," 2016.
[3]. E. Osmanbegović, M. Suljić, and H. Agić, "Determining Dominant Factor for students Performance Prediction by using Data Mining Classification Algorithms," Tranzicija, vol. 16, pp. 147-158, 2015.
[4]. A. Figueira, "Predicting Grades by Principal Component Analysis: A Data Mining Approach to Learning Analytics," in Advanced Learning Technologies (ICALT), 2016 IEEE 16th International Conference on, 2016, pp. 465-467.
[5]. N. Rachburee and W. Punlumjeak, "A comparison of feature selection approach between greedy, IG-ratio, Chi-square, and mRMR in educational mining," in Information Technology and Electrical Engineering (ICITEE), 2015 7th International Conference on, 2015, pp. 420-424.
[6]. M. Zaffar, M. A. Hashmani, and K. Savita, "Performance analysis of feature selection algorithm for educational data mining," in Big Data and Analytics (ICBDA), 2017 IEEE Conference on, 2017, pp. 7-12.

[7]. A. Mueen, B. Zafar, and U. Manzoor, "Modeling and Predicting Students' Academic Performance Using Data Mining Techniques," International Journal of Modern Education and Computer Science, vol. 8, p. 36, 2016.
[8]. E. A. Amrieh, T. Hamtini, and I. Aljarah, "Pre-processing and analyzing educational data set using X-API for improving student's performance," in Applied Electrical Engineering and Computing Technologies (AEECT), 2015 IEEE Jordan Conference on, 2015, pp. 1-5.
[9]. N. Rachburee and W. Punlumjeak, "A comparison
of feature selection approach between greedy, IG-
ratio, Chi-square, and mRMR in educational
mining," in Information Technology and Electrical
Engineering (ICITEE), 2015 7th International
Conference on, 2015, pp. 420-424.
[10]. J. Novaković, "Toward optimal feature selection
using ranking methods and classification
algorithms," Yugoslav Journal of Operations
Research, vol. 21, 2016.
[11]. C. Anuradha and T. Velmurugan, "Feature
Selection Techniques to Analyse Student
Academic Performance using Naïve Bayes
Classifier," in The 3rd International Conference on
Small & Medium Business, 2016, pp. 345-350.
[12]. K. W. Stephen, "Data Mining Model for Predicting
Student Enrolment in STEM Courses in Higher
Education Institutions," 2016.
[13]. A. Figueira, "Predicting Grades by Principal Component Analysis: A Data Mining Approach to Learning Analytics," in Advanced Learning Technologies (ICALT), 2016 IEEE 16th International Conference on, 2016, pp. 465-467.
[14]. E. A. Amrieh, T. Hamtini, and I. Aljarah, "Mining Educational Data to Predict Student's academic Performance using Ensemble Methods," International Journal of Database Theory and Application, vol. 9, no. 8, pp. 119-136, 2016.
