
Multiclass Prediction Model For Student Grade Prediction Using Machine Learning

This document describes a study that aimed to improve the performance of predictive models for predicting student grades in an imbalanced multi-classification setting. The study compared the predictive accuracy of several machine learning techniques on a dataset of 1282 students' course grades. It then proposed a multiclass prediction model using synthetic minority oversampling and feature selection to address the imbalanced classification problem. The results showed that a random forest model with the proposed approach achieved the highest F-measure of 99.5%, indicating an improvement over models without addressing imbalanced data. The study concluded the proposed model provides a promising approach for enhancing predictive performance in imbalanced multi-classification problems for student grade prediction.


Received May 26, 2021, accepted June 12, 2021, date of publication June 30, 2021, date of current version July 13, 2021.

Digital Object Identifier 10.1109/ACCESS.2021.3093563

Multiclass Prediction Model for Student Grade Prediction Using Machine Learning
SITI DIANAH ABDUL BUJANG1 , ALI SELAMAT 1,2 , (Member, IEEE),
ROLIANA IBRAHIM2 , (Member, IEEE), ONDREJ KREJCAR 3 ,
ENRIQUE HERRERA-VIEDMA 4 , (Fellow, IEEE),
HAMIDO FUJITA 5 , (Life Senior Member, IEEE),
AND NOR AZURA MD. GHANI 6 , (Member, IEEE)
1 Malaysia-Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Kuala Lumpur 54100, Malaysia
2 Malaysia and Media and Games Center of Excellence (MagicX), Faculty of Engineering, School of Computing, Universiti Teknologi Malaysia,
Johor Baharu 81310, Malaysia
3 Faculty of Informatics and Management, University of Hradec Kralove, 50003 Hradec Kralove, Czech Republic
4 Department of Computer Science and AI, Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, 18071 Granada, Spain
5 i-SOMET Incorporated Association, Morioka 020-0104, Japan
6 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor 40450, Malaysia

Corresponding author: Ali Selamat ([email protected])


This work was supported in part by the Ministry of Higher Education through the Fundamental Research Scheme under
Grant FRGS/1/2018/ICT04/UTM/01/1, in part by the Specific Research Project (SPEV) at the Faculty of Informatics and Management,
University of Hradec Kralove, Czech Republic, under Grant 2102-2021, in part by the Universiti Teknologi Malaysia (UTM) under
Research University Grant Vot-20H04, and in part by the Malaysia Research University Network (MRUN) under Grant Vot 4L876.

ABSTRACT Today, predictive analytics applications have become an urgent need in higher educational institutions. Predictive analytics uses advanced analytics, including machine learning, to derive high-quality performance and meaningful information for all education levels. It is widely known that student grades are one of the key performance indicators that can help educators monitor students' academic performance. During the past decade, researchers have proposed many variants of machine learning techniques in education domains. However, there are severe challenges in handling imbalanced datasets when trying to improve the performance of student grade prediction. Therefore, this paper presents a comprehensive analysis of machine learning techniques to predict final student grades in first-semester courses while improving predictive accuracy. Two modules are highlighted in this paper. First, we compare the accuracy of six well-known machine learning techniques, namely Decision Tree (J48), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbor (kNN), Logistic Regression (LR) and Random Forest (RF), using a real dataset of 1282 student course grades. Second, we propose a multiclass prediction model that reduces the overfitting and misclassification caused by imbalanced multi-classification, based on the Synthetic Minority Oversampling Technique (SMOTE) combined with two feature selection methods. The results show that the proposed model integrated with RF gives a significant improvement, with the highest f-measure of 99.5%. The proposed model shows comparable and promising results that can enhance prediction performance for imbalanced multi-class student grade prediction.

INDEX TERMS Machine learning, predictive model, imbalanced problem, student grade prediction,
multi-class classification.

The associate editor coordinating the review of this manuscript and approving it for publication was Syed Islam.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
95608 VOLUME 9, 2021
S. D. Abdul Bujang et al.: Multiclass Prediction Model for Student Grade Prediction Using Machine Learning

I. INTRODUCTION
In higher education institutions (HEI), every institution has its own student academic management system to record all student data, including academic results such as final examination marks and grades in different courses and programs. All student marks and grades are recorded and used to generate a student academic performance report that evaluates course achievement every semester. The data kept in the repository can be used to discover

insightful information related to student academic performance. Solomon et al. [1] indicated that determining student academic performance is a crucial challenge in HEI. As a result, many previous researchers have clearly defined the factors that can strongly affect student academic performance [2]. However, the most common factors rely on socioeconomic background, demographics [3] and learning activities [4] rather than on final student grades in the final examination [5]. For this reason, we observe that predicting student grades can be one of the applicable solutions for improving student academic performance [6].

Predictive analytics has shown successful benefits in HEI. It is a potential approach that benefits the competitive educational domain by finding hidden patterns and predicting trends in a vast database [7]. It has been used in several educational areas, including student performance, dropout prediction, academic early warning systems, and course selection [8]. Moreover, the application of predictive analytics to predicting student academic performance has increased over the years [9].

The ability to predict student grades is one of the important areas that can help improve student academic performance. Much previous research has applied various machine learning techniques to predict student academic performance. However, related work on mechanisms for handling the imbalanced multi-classification problem in student grade prediction is difficult to find [10], [11]. Therefore, in this study, a comparative analysis has been carried out to find the best model for student grade prediction by addressing the following questions:

RQ1: Which predictive model among the selected machine learning algorithms achieves high accuracy in predicting students' final course grades?
RQ2: How can an imbalanced multi-classification dataset be addressed with the selected machine learning algorithms using the Synthetic Minority Oversampling Technique (SMOTE) and feature selection (FS) methods?

To address these questions, we collect students' final course grades from the final examination results of two core courses in the first semester. We present a descriptive analysis of the student dataset to visualize student grade trends, which can support strategic planning and decision making so that lecturers can help students more effectively. Then, we conduct a comparative analysis using six well-known machine learning algorithms, namely LR, NB, J48, SVM, kNN and RF, on real student data from the Diploma in Information Technology (Digital Technology) at one of the Malaysia Polytechnics. To address the imbalanced multi-classification, we endeavor to enhance the performance of each predictive model with data-level solutions using SMOTE oversampling and FS. The novel contributions of this paper are summarized as follows:
• We propose a combination of a modified SMOTE oversampling and two feature selection algorithms that automatically determines the sampling ratio with the best selected features to improve imbalanced multi-classification for student grade prediction.
• Our comparative analysis shows that the ratio of the minority classes in an imbalanced dataset does not necessarily need to approach the ratio of the majority class to obtain better performance in student grade prediction.
• Our proposed model shows a different impact on the performance of the student grade prediction model depending on which of the two feature selection algorithms is used after applying SMOTE.

This paper is organized as follows. Section II describes the related research that has been conducted on student grade prediction. Section III presents, phase by phase, the methodology for developing predictive models to predict final student grades. Section IV and Section V present the descriptive analysis and the prediction results of this study, respectively. Section VI discusses the findings. Lastly, Section VII highlights the main conclusions and some future directions.

II. RELATED WORKS
Several studies have been conducted in HEI on predicting student grades using various machine learning techniques. They involve the analysis of many attributes and data samples from a variety of sources to predict student grades for different outcomes. However, the performance of predictive models on imbalanced datasets in education domains is still rarely discussed. Related to this issue, the study in [12] used discretization and SMOTE oversampling to improve the accuracy of students' final grade prediction. Several classification algorithms, such as NB, DT and Neural Network (NN), were applied to classify students' final grades into five categories: A, B, C, D and F. The authors showed that NN and NB applied with SMOTE and optimal equal-width binning outperformed the other methods, with a similar highest accuracy of 75%. However, NB was found to be better than NN because it takes less time to run the prediction model. The research in [13] developed a method for predicting future course grades using data obtained from the Computer Science and Engineering (CSE) and Electrical and Computer Engineering (ECE) programs at the University of Minnesota. The results indicated that Matrix Factorization (MF) and Linear Regression (LinReg) gave more accurate predictions than the existing traditional methods. The authors also found that using a course-specific subset of the data can improve accuracy when predicting future course grades. Another study [14] applied MF, Collaborative Filtering (CF) and Restricted Boltzmann Machines (RBM) to real data from 225 undergraduate students to predict student grades in different courses. They observed that CF does not perform well compared to MF, especially when the dataset is very sparse. However, their overall findings show that the proposed RBM provides efficient learning and better prediction accuracy compared to CF and MF, with a minimum Root Mean Squared


Error (RMSE) of 0.3, especially for modeling tabular data. A study in [15] developed a predictive model that can predict students' final grades in introductory courses at an early stage of the semester. The authors compared eleven machine learning algorithms in five categories, consisting of Bayes, Function, Lazy (IBK), Rules-Based (RB) and Decision Tree (DT), using WEKA. To reduce high dimensionality and unbalanced data, they performed correlation-based and information-gain feature selection during data preprocessing. They also applied SMOTE to balance the distribution of instances across three different classes. Among the 11 algorithms, they indicated that the Decision Tree classifier (J48) had the highest accuracy of 88% compared to the other categories of algorithms. Al-Barrak [16] used the DT (J48) algorithm to discover classification rules for predicting students' final Grade Point Average (GPA) based on student grades in previous courses. The study used data on 236 students who graduated from the Computer Science College at King Saud University in 2012. It found that the classification rules produced by J48 can detect early predictors and extract useful knowledge about final student GPA, based on grades in all mandatory courses, to improve students' performance. Another study [17] predicted student grade performance using three different DT algorithms: Random Tree (RT), RepTree and J48. In this context, cross-validation was used to measure the performance of the predictive model. The results indicated that RT obtained the highest accuracy of 75.188%, better than the other algorithms. The accuracy of the predictive models could be improved by adding more samples and attributes to the dataset. The study in [18] proposed a framework for predicting student academic performance at University Sultan Zainal Abidin (UniSZA), Malaysia. It used 399 student records from the academic department database, covering eight years of intakes, that contained student demographics, previous academic records and family background information. The results indicated that Rule-Based (PART) is the best model with 71.3% accuracy compared to DT and NB. However, the small sample size affected accuracy because of incomplete and missing values found in the dataset. Anderson and Anderson [19] performed an experimental study on 683 students at the Craig School of Business at California State University from 2006 to 2015 by applying three machine learning algorithms to predict student grades. The study found that SVM is the best classifier. It consistently outperforms a simple average approach that obtained the lowest error rate to optimize each data class. The result could be different for a large dataset because of significant changes in the structure and format of the historical grade dataset. We have summarized the related studies in terms of sample size, data source, attributes, algorithm, best performance and limitation in Table 1.

III. FRAMEWORK OF MULTICLASS PREDICTION MODEL FOR STUDENT GRADE PREDICTION
This paper aims to identify the most effective predictive model, especially for addressing imbalanced multi-classification in student grade prediction. The framework consists of four main phases, as shown in Figure 1. The input of our framework contains students' final course grades, which we extract from the students' academic spreadsheet documents and the student academic repository. We apply two data-level solutions, SMOTE oversampling and two FS methods, to reduce the overfitting and misclassification of the imbalanced multi-classification dataset. Then, we design our proposed model by combining both techniques with the selected machine learning classifiers and evaluate it using performance metrics. Finally, data visualization is used to visualize the trends in the dataset and the final classification results. Each phase is described in the following subsections.

FIGURE 1. The framework of the proposed multiclass prediction model for predicting final student grades.

A. DATA PREPARATION
The dataset we used was collected by the Department of Information and Communication Technology (JTMK) at one of the Malaysia Polytechnics. The dataset contains 1282 instances, which is the total number of course grades of first-semester students taken from the final examinations of the June 2016 to December 2019 sessions. Students need to take some compulsory, specialization and core course modules to qualify for the next academic semester. However, in this study we selected only two core courses that contained the percentage of final examination and course assessment marks. All features used for prediction are listed in Table 2.

B. DATA PRE-PROCESSING AND DESIGN MODEL
In this phase, we applied data pre-processing to the collected dataset. For the convenience of data pre-processing, we have

ranked and grouped the students into 5 categories of grades: Exceptional (A+), Excellent (A), Distinction (A−, B+, B), Pass (B−, C+, C, C−, D+, D) and Fail (E, E−, F). These groups form the output classes of the prediction. However, the class distribution of the dataset is imbalanced, with 63 exceptional, 377 excellent, 635 distinction, 186 pass and 21 fail instances, giving a high ratio of 3:18:30:9:1 that can lead to overfitted results. Therefore, data-level solutions using SMOTE oversampling and two FS methods, wrapper based and filter based, were used as the benchmark methods in this study to overcome the problem of the imbalanced multi-classification dataset. The experiments used the open-source tool Waikato Environment for Knowledge Analysis (WEKA) version 3.8.3 because it provides many machine learning algorithms with easy graphical user interfaces for simple visualization [20], [21].

TABLE 1. The taxonomy of related studies on student grade prediction.

C. PERFORMANCE ANALYSIS
This paper aims to predict students' final grades based on their previous course performance records in the first semester's final examination. The proposed model applies different machine learning algorithms to evaluate which of them gives the highest performance in predicting students' final grades. Three experiments were conducted in four distinct phases based on the five different classes. Accuracy is evaluated using ten-fold cross-validation, in which our dataset is partitioned into 90% for the training set and 10% for the testing set on the same dataset [22].

Figure 2 illustrates the flowchart of the proposed multiclass prediction model applied in this study.

In particular, the following theoretical models were used as the basis for constructing our multiclass prediction model:
• Logistic Regression (LR) uses the logistic function, together with a cost function, as a mathematical model for solving classification problems. The model performs well in contextual analysis of categorical data to understand the relationship between variables [23].
• Naïve Bayes (NB) is based on Bayes' theorem and is widely used because it is simple and able to make fast predictions. It is suitable for small datasets and combines complexity with a flexible probabilistic model [24].


FIGURE 2. Flowchart of the proposed multiclass prediction model.
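The class ratio and the ten-fold partition described in the data pre-processing and performance analysis phases can be checked with a few lines of code. The sketch below is only an illustration under stated assumptions: the study itself ran its experiments in WEKA, the function names `imbalance_ratio` and `stratified_folds` are hypothetical, and the stratified assignment shown here is one simple way to keep each class spread evenly across folds, not necessarily the procedure WEKA uses. Only the class counts are taken from the text.

```python
import random

# Class counts reported in the text.
CLASS_COUNTS = {
    "Exceptional": 63,
    "Excellent": 377,
    "Distinction": 635,
    "Pass": 186,
    "Fail": 21,
}


def imbalance_ratio(counts):
    """Express each class size relative to the smallest class, rounded."""
    smallest = min(counts.values())
    return {label: round(n / smallest) for label, n in counts.items()}


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign every instance index to exactly one test fold, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0  # keeps running across classes so fold sizes stay even
    for indices in by_class.values():
        rng.shuffle(indices)
        for index in indices:
            folds[position % n_folds].append(index)
            position += 1
    return folds


# One label per instance (1282 in total), in class order.
labels = [label for label, n in CLASS_COUNTS.items() for _ in range(n)]
ratio = imbalance_ratio(CLASS_COUNTS)
folds = stratified_folds(labels)

print(ratio)
for k, test_idx in enumerate(folds):
    n_fail = sum(labels[i] == "Fail" for i in test_idx)
    print(f"fold {k}: train={len(labels) - len(test_idx)} test={len(test_idx)} fail={n_fail}")
```

Each fold holds roughly 10% of the instances for testing and leaves about 90% for training, and every fold still contains a few 'Fail' instances, which is the point of stratifying when one class has only 21 members.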

TABLE 2. The information of the input features.

Algorithm 1 Algorithm for Multiclass Prediction Model (SFS)
Input: The training dataset
Output: The predicted student's grade label, SG
1. Begin
2. Import necessary library packages and select dataset
3. Perform data preprocessing
   3.1 Select filters for oversampling
   3.2 Set parameter of SMOTE (nearest neighbor, k = 10)
   3.3 Select features with attribute evaluator & search method
   3.4 Select attribute selection mode (Use full training set)
4. Use classification models to predict the results
   4.1 Split data into training and testing datasets using 10-fold cross validation
   4.2 Use well-known classification models (J48, kNN, SVM, LR, NB, RF) to predict the SG (Exceptional, Excellent, Distinction, Pass, Fail)
5. Evaluate the accuracy of well-known classification models
6. End
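Step 3.2 of Algorithm 1 relies on SMOTE, which creates each synthetic instance by interpolating between a minority sample and one of its k nearest minority neighbours. The sketch below illustrates that interpolation in plain Python. It is a minimal illustration and not the authors' implementation, since the study used the SMOTE filter in WEKA. The function name `smote`, its parameters and the toy two-feature data are assumptions made for this example. Only k = 10 comes from Algorithm 1, and the 100% oversampling setting from the paper's later SMOTE subsection.

```python
import math
import random


def smote(minority, k=10, percentage=100, seed=0):
    """Return synthetic points interpolated between minority samples and neighbours.

    percentage=100 creates one synthetic point per original minority sample.
    """
    rng = random.Random(seed)
    per_sample = percentage // 100
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    synthetic = []
    for i, origin in enumerate(minority):
        # k nearest minority neighbours of `origin` by Euclidean distance
        others = minority[:i] + minority[i + 1:]
        neighbours = sorted(others, key=lambda p: math.dist(origin, p))[:k]
        for _ in range(per_sample):
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # uniform in [0, 1)
            # new = origin + gap * (neighbour - origin), feature by feature
            synthetic.append(
                tuple(o + gap * (n - o) for o, n in zip(origin, neighbour))
            )
    return synthetic


# Hypothetical two-feature 'Fail' samples (toy values, not the study's data).
minority = [(30.0 + i, 20.0 + (i * 7) % 11) for i in range(21)]
new_points = smote(minority, k=10, percentage=100)
print(len(minority), "+", len(new_points), "=", len(minority) + len(new_points))
```

Because every synthetic point lies on the segment between two existing minority samples, the new instances stay inside the region the minority class already occupies. As an observation rather than something the paper states, repeating a 100% pass doubles a class each time, so 21 'Fail' instances would reach 672 after five passes, which matches the 'Fail' count the paper reports after oversampling.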
• Decision Tree (J48) is widely used in multi-class classification and can handle missing values and high-dimensional data. It has been implemented effectively to give optimal accuracy with a minimum number of features [25].
• Support Vector Machine (SVM) is based on the notion of decision planes that define decision boundaries and handle classification problems successfully [11]. It takes a sorted dataset and predicts which of two possible classes includes the information, making the SVM a non-probabilistic binary linear classifier.
• K-Nearest Neighbor (kNN) is a non-parametric algorithm that classifies and calculates the difference between instances in the dataset based on their nearest vectors


where k refers to the distance in the n-dimensional space. It uses a distance function and performs suitably on datasets with a small number of features [11].
• Random Forest (RF) is a classifier based on ensemble learning that uses a number of decision trees on various subsets to find the best features for high accuracy and to prevent overfitting. RF is relatively robust to outliers and noise and operates effectively in classification [26].

A confusion matrix helps to visualize the classification performance of each predictive model. Table 3 presents the confusion matrix used for student grade prediction, where A, B, C, D and E represent the student grade (SG) classes 'exceptional', 'excellent', 'distinction', 'pass' and 'failure'. The class label is expressed as:

\[ SG \in \{A, B, C, D, E\} \tag{1} \]

TABLE 3. Confusion matrix for student grade prediction classification.

The performance metrics of the confusion matrix are accuracy, precision, recall and f-measure, determined by the following equations:

\[ \text{Accuracy}\ (A) = \frac{AA + BB + CC + DD + EE}{N} \tag{2} \]

where N is the number of samples,

\[ \text{Precision}\ (P) = \frac{1}{5}\left( \frac{AA}{AA + BA + CA + DA + EA} + \frac{BB}{AB + BB + CB + DB + EB} + \frac{CC}{AC + BC + CC + DC + EC} + \frac{DD}{AD + BD + CD + DD + ED} + \frac{EE}{AE + BE + CE + DE + EE} \right) \tag{3} \]

\[ \text{Recall}\ (R) = \frac{1}{5}\left( \frac{AA}{AA + AB + AC + AD + AE} + \frac{BB}{BA + BB + BC + BD + BE} + \frac{CC}{CA + CB + CC + CD + CE} + \frac{DD}{DA + DB + DC + DD + DE} + \frac{EE}{EA + EB + EC + ED + EE} \right) \tag{4} \]

\[ \text{F-Measure} = \frac{2PR}{P + R} \tag{5} \]

where the f-measure is the harmonic mean of precision and recall.

D. DATA VISUALIZATION
In this phase, after performing the data analysis, we extracted and visualized our findings using Python to view useful information and student grade performance trends in different courses. Data visualization helps to discover the features of and insights into the student dataset, so that lecturers can make better decisions to improve student academic performance in the future. We also compare the results of our proposed model graphically to better understand the findings.

IV. DESCRIPTIVE ANALYSIS OF STUDENT DATASET
Our dataset contains records of 641 students who took two core courses, namely Computer System Architecture (CSA) and Introduction to Computer System (ICS). Based on the analysis performed, we found that in the CSA course 362 students obtained a distinction grade (A−, B+, B), followed by the pass grade (B−, C+, C, C−, D+, D) with 176 students, the excellent grade (A) with 80 students, the fail grade (E, E−, F) with 19 students and finally the exceptional grade (A+) with 4 students. On the other hand, for the ICS course, most students obtained the excellent grade (A) with 297 students, followed by the distinction grade (A−, B+, B) with 273 students, the exceptional grade (A+) with 59 students, and the pass grade (B−, C+, C, C−, D+, D) and fail grade (E, E−, F) with 10 and 2 students respectively. Correspondingly, the mean and standard deviation of the final student grades were 68.95 and 9.189 for the CSA course, and 79.62 and 7.379 for the ICS course. Table 4 shows the number of students in both courses.

TABLE 4. Result of student performance by course.

Figure 3 shows the mean and standard deviation of students' final marks and grade achievement according to the course taken. The students' final marks were calculated based

on the total percentage from the continuous assessment marks evaluated during class and the final test marks in the final exam at the end of the semester. However, the students must earn more than 40 marks for both assessments in order to pass both courses.

FIGURE 3. Mean and standard deviation of student's final marks against student's final grades achievement according to the taken courses.

From the results, we recognize that there is a difference in student achievement between CSA and ICS, where the students obtained higher marks in the ICS course than in CSA. Figure 4 shows the normal trend of the final marks distribution achieved by the students. Out of the total number of failing students, we found that 3% of them are prominent in CSA compared to the ICS course. From these findings, we indicate that students who failed in both courses did not reach the minimum passing marks in the final examination, although their final marks were classified as good and pass grades.

FIGURE 4. Graph plot of student's final marks distribution.

Furthermore, we also visualized the average grade point trend for the ICS and CSA courses based on yearly achievement (2016 to 2019), as shown in Figure 5. From this observation, we found that the students' overall academic performance improved yearly for both the ICS and CSA courses. However, it clearly shows that the grade points obtained by the ICS students are higher than those for CSA. Therefore, from these findings, we indicate that the CSA course is more challenging for students who are weak in mathematics, whereas the ICS course is easier to understand for students who already have basic knowledge of computers before entering the polytechnic.

FIGURE 5. Analysis of average grade point trend for ICS and CSA courses by yearly basis.

V. EXPERIMENTAL RESULTS
In this section, the results of this study are divided into two subsections according to the research questions. We conducted a comprehensive performance analysis with three experiments run on the real dataset. The experimental results of J48, kNN, NB, SVM, LR and RF were explored and compared. Then, we also compared and evaluated the impact of using SMOTE oversampling and FS methods to improve the imbalanced multi-classification problem on the same dataset.

A. RQ1: COMPARISON OF THE PREDICTIVE MODEL USING MACHINE LEARNING ALGORITHMS
Our main objective in this section is to compare the predictive models based on their accuracy. Here, six selected algorithms were trained on the student dataset and their prediction accuracy was evaluated. In order to analyze the differences, we compare accuracy using ten-fold cross-validation with stratification as the testing method to derive the best predictive model. We measure performance using various metrics, including classification accuracy, precision, recall (sensitivity) and f-measure, to ensure that the predictive model is fit to produce accurate results. Table 5 summarizes the prediction performance measures of the different classifiers on the student dataset. It can be seen from Table 5 that J48 and RF achieve the best prediction performance with a precision score of 0.989, followed by kNN with 0.985. Meanwhile, LR and SVM obtained precision scores of 0.983 and 0.981 respectively. The lowest score is obtained by NB with 0.978. However, because the classes in our dataset were highly imbalanced, the prediction results often led to misclassification decisions of the minority class that was


created while training the dataset. For generalizability, further experiments dealing with this issue were conducted to reduce the ratio between the classes, as described in the next subsection.

TABLE 5. Performance comparison of predictive models.

B. RQ2: IMPACT OF OVERSAMPLING AND FEATURE SELECTION FOR IMBALANCED MULTI-CLASS DATASET
Here, we focus only on data-level solutions using SMOTE oversampling and two FS algorithms for addressing the imbalanced multi-classification dataset [27], [28]. To see the performance of each predictive model, we performed three experiments on the six selected machine learning algorithms to reduce the imbalance problem. First, we applied SMOTE to our dataset with the six selected machine learning algorithms independently. Secondly, the dataset was run through the two FS algorithms independently using three different attribute evaluators. Thirdly, the proposed multiclass prediction model (SFS) was run and tested on the same dataset with the six selected machine learning algorithms. For a better view of the prediction accuracy, the other performance metrics of precision, recall and f-measure were used to ensure that our predictive model was fit to produce accurate results.

1) SMOTE OVERSAMPLING TECHNIQUE
SMOTE, the Synthetic Minority Oversampling Technique, is the most commonly used method for improving the overfitting problem based on a random sampling algorithm [29]. It modifies an imbalanced dataset by generating new minority class instances with a synthetic sampling technique to make the distribution more balanced. In this study, we increase the default parameter of nearest neighbors (k) for a sample SG in the minority class, select N samples randomly and record them as SG_i. The new sample SG_new is defined by the following expression:

\[ SG_{new} = SG_{origin} + rand \times (SG_i - SG_{origin}), \quad i = 1, 2, 3, \ldots, n \tag{6} \]

where rand is a random value sampled within the range (0, 1), and the class value index is set to 0 so that the ratio of generated samples auto-detects the non-empty minority class. Then, the number of nearest neighbors was set to 10 (k = 10) with the percentage of instances at 100%, and the SMOTE filter was applied for ten iterations. Oversampling increased the number of instances from 1282 to 2932, where the SG class distribution after SMOTE becomes 504 exceptional, 377 excellent, 635 distinction, 744 pass and 672 fail, reducing the ratio to 1:1:2:2:2. In Table 6 we present the detailed comparison of all predictive models on all performance measures. When the classifiers were used with SMOTE oversampling, we found that the effectiveness of all predictive models consistently improved.

Among these predictive models, RF generated the most promising f-measure of 99.5%, followed by kNN with 99.3%, J48 with 99.1%, SVM with 98.9%, LR with 98.8% and NB with 98.3%. This result was statistically significant at a confidence level of 95% using the Paired T-Tester (corrected), as shown in Figure 6. We also observed that when SMOTE was applied, the minority class instances increased to balance with the other classes according to the number of iterations and the k value applied to our dataset. The detailed analysis of the accuracy is presented as confusion matrices in Table 7.

FIGURE 6. Result of predictive model performance with SMOTE.

It can clearly be seen that the confusion matrices of all predictive models derived from J48, NB, kNN, SVM, LR and RF show improved results for correctly classified 'Pass' and 'Fail' grades.

However, there is a small decrease in performance for SVM, where the predictive model correctly classified 97.2% of the students who obtained 'Pass' grades compared to 99.5% without SMOTE. For comparative analysis, Figure 7 and Figure 8 illustrate the actual scores and predictions for the five grade categories before and after applying SMOTE respectively. Each predictive model shows a significant improvement for the majority classes except for the minority class.

2) FEATURE SELECTION
Another experiment that we applied is feature selection (FS), which is effective in reducing dimensionality, removing irrelevant data and improving learning accuracy [30], [31]. In this experiment, two FS methods, wrapper based and filter based, were used as the benchmark methods to maximize the per-
new samples approximates 100%. In Weka, we implemented formance of six predictive models. The FS wrapper algo-
weka.filters.supervised.instance.SMOTE to insert synthetic rithm used to identify the best features set in this study
instances between minority class samples of neighbors to consist of two attribute evaluator using J48 classifier; Wrap-
our dataset. We set parameter of index class value 0 to perSubsetEval (FS-1) and ClassifierSubsetEval (FS-2) with

VOLUME 9, 2021 95615


S. D. Abdul Bujang et al.: Multiclass Prediction Model for Student Grade Prediction Using Machine Learning

TABLE 6. Result of oversampling SMOTE with different predictive models.
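As an illustration of the interpolation in Eq. (6), the following minimal Python sketch generates synthetic minority samples by moving from a minority sample toward one of its nearest minority neighbors. It is not the Weka filter used in the study. The function names (`smote_point`, `k_nearest`, `smote`), the toy data, the value of k and the seed are assumptions made for the example, and `random.random()` draws from [0, 1) rather than the open interval (0, 1).

```python
import random

def smote_point(origin, neighbor, rng):
    """Eq. (6): origin + rand * (neighbor - origin), applied per feature."""
    gap = rng.random()
    return [o + gap * (n - o) for o, n in zip(origin, neighbor)]

def k_nearest(sample, pool, k):
    """Return the k points in pool closest to sample (squared Euclidean distance)."""
    others = [p for p in pool if p is not sample]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(sample, p)))
    return others[:k]

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples; needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        origin = rng.choice(minority)
        neighbor = rng.choice(k_nearest(origin, minority, k))
        synthetic.append(smote_point(origin, neighbor, rng))
    return synthetic

# Hypothetical two-feature minority class.
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_points = smote(minority, n_new=4, k=2)
```

Every synthetic point lies on the line segment between two existing minority samples, so it stays inside the region already covered by the minority class.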

TABLE 7. Analysis of correctly classified based on confusion matrix.
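The per-class reading of a confusion matrix, as used for Table 7, can be sketched as follows. The matrix values and the three class labels are made up for illustration and are not the study's results. Rows are assumed to hold the actual classes and columns the predicted classes.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = instances of actual class i predicted as class j."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        actual = sum(matrix[i])                    # row total
        predicted = sum(row[i] for row in matrix)  # column total
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    return metrics

def accuracy(matrix):
    """Share of all instances that sit on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical counts, not taken from the paper.
labels = ["pass", "fail", "exceptional"]
matrix = [
    [8, 2, 0],
    [1, 9, 0],
    [1, 0, 4],
]
m = per_class_metrics(matrix, labels)
overall = accuracy(matrix)
```

Here recall is the "correctly classified" rate of each class, which is why a model can show high overall accuracy while still doing poorly on a small class.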

FIGURE 7. Comparison of correctly classified by class without applied SMOTE.

TABLE 8. Detailed selected features over different FS algorithms.
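A filter-style ranking in the spirit of InfoGainAttributeEval with a ranker threshold could look like the sketch below. It assumes nominal features and assumes that the 0.5 cutoff applies to the information gain score. The feature names and data are hypothetical, and Weka's own implementation is not reproduced here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(values, labels):
    """Class entropy minus the expected entropy after splitting on one nominal feature."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for x, lab in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def rank_and_filter(features, labels, threshold=0.5):
    """Rank features by information gain and keep those scoring above threshold."""
    scores = {name: info_gain(col, labels) for name, col in features.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, score) for name, score in ranked if score > threshold]

# Hypothetical data: "quiz" separates the grades, "section" does not.
labels = ["pass", "pass", "fail", "fail"]
features = {
    "quiz": ["hi", "hi", "lo", "lo"],
    "section": ["a", "b", "a", "b"],
}
selected = rank_and_filter(features, labels)
```

A filter such as this scores each feature on its own, whereas the wrapper evaluators (FS-1, FS-2) score whole feature subsets by training a classifier on them.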

FIGURE 8. Comparison of correctly classified by class with applied SMOTE.
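The corrected paired t-test mentioned for the SMOTE comparison is commonly computed with the corrected resampled t-statistic of Nadeau and Bengio, which inflates the variance term by the test/train size ratio. The sketch below assumes that formulation and assumes 10-fold cross-validation (ratio 1/9). The per-fold scores are invented for illustration.

```python
import math
from statistics import mean, variance

def corrected_resampled_t(scores_a, scores_b, test_train_ratio):
    """t = mean(d) / sqrt((1/k + n_test/n_train) * var(d)) over k paired scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    k = len(diffs)
    var = variance(diffs)  # sample variance, k - 1 in the denominator
    return mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * var)

# Hypothetical per-fold f-measures for two models (not the paper's numbers).
model_a = [0.99, 0.98, 0.99, 1.00, 0.97, 0.99, 0.98, 0.99, 1.00, 0.98]
model_b = [0.97, 0.97, 0.96, 0.98, 0.96, 0.98, 0.97, 0.97, 0.98, 0.97]

t_corrected = corrected_resampled_t(model_a, model_b, test_train_ratio=1 / 9)
t_uncorrected = corrected_resampled_t(model_a, model_b, test_train_ratio=0.0)
```

With a ratio of zero the function reduces to the ordinary paired t-statistic. The correction always gives a smaller value, which makes a difference between two models harder to declare significant.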

BestFirst search method. For the FS filter algorithm, InfoGainAttributeEval (FS-3) with the ranker search method was used, and features scoring more than 0.5 were selected as the best feature set. The number of features selected by each FS algorithm is presented in Table 8. For the analysis, we used the same dataset to find the predictive model that best fits the requirements for an optimal result.

Table 9 shows the overall results of the different predictive models for all FS algorithms. The results show that kNN exhibited the highest f-measure, up to 98.8% and 98.9%, with the optimal feature sets obtained from the FS-2 and FS-3 algorithms respectively, compared to the other predictive models.

As can also be seen from Table 9, NB shows the lowest accuracy, but the f-measure for NB shows slightly


TABLE 9. Classification performance of FS in different predictive models.

improvement varied from 97.8% to 98.2% after the FS-2 algorithm was applied. On the other hand, J48, LR, SVM and RF showed less promising performance than without any FS. Reducing the number of features in a multi-class dataset with a high imbalance ratio hinders the ability to learn to predict student grades. The highest accuracy and f-measure scores with the different FS algorithms are highlighted in Figure 9.

FIGURE 9. Classification performance of accuracy and f-measure with different FS.

3) PROPOSED MULTICLASS PREDICTION MODEL (SFS)
Then, we performed the third experiment, testing the proposed SFS model, which combines SMOTE oversampling and FS on the same dataset for pre-processing. The accuracy and f-measure of the proposed SFS model with all predictive models are compared in Figure 10. The highest accuracy and f-measure of each predictive model are highlighted in red. The results show that, with the proposed model, RF and J48 achieved their highest f-measures of 99.5% and 99.3% with SFS-1, whereas kNN and SVM obtained their highest f-measures of 99.4% and 98.9% with SFS-2, respectively. LR and NB share an f-measure of up to 98.7% with SFS-2. The integration of SMOTE oversampling and FS improves the performance of imbalanced multi-classification on our dataset, where SMOTE oversampling balances the selected features by increasing the number of minority class instances to match the majority class. The detailed performance results of the proposed SFS model are presented in Table 10.

VI. DISCUSSION
This study was conducted to address imbalanced multi-classification problems, focusing on a data-level solution for


TABLE 10. Classification performance of the proposed SFS in different predictive models.
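The general idea of chaining oversampling and feature selection before classification, as in the proposed SFS model, can be sketched in a few lines. This is a simplified stand-in and not the authors' Weka workflow. It assumes SMOTE is applied before selection, replaces J48 with a 1-nearest-neighbor classifier, replaces BestFirst with greedy forward selection, skips the neighbor search inside SMOTE, and uses invented two-feature data.

```python
import random

def smote(minority, n_new, rng):
    """Interpolate between random pairs of minority samples (neighbor search omitted)."""
    synthetic = []
    for _ in range(n_new):
        origin, neighbor = rng.sample(minority, 2)
        gap = rng.random()
        synthetic.append([o + gap * (n - o) for o, n in zip(origin, neighbor)])
    return synthetic

def loo_accuracy(rows, labels, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier restricted to feats."""
    hits = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in feats),
        )
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def forward_select(rows, labels, n_features):
    """Greedy wrapper: add the feature that most improves accuracy; stop when none does."""
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(rows, labels, selected + [f]), f) for f in candidates]
        score, feat = max(scored, key=lambda pair: pair[0])
        if score <= best:
            break
        selected.append(feat)
        best = score
    return selected, best

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
majority = [[1.0, 7.0], [1.2, 3.0], [1.4, 9.0], [1.6, 1.0], [1.8, 5.0], [2.0, 8.0]]
minority = [[5.0, 3.1], [5.5, 7.9]]

rng = random.Random(42)
minority_balanced = minority + smote(minority, len(majority) - len(minority), rng)

rows = majority + minority_balanced
labels = ["majority"] * len(majority) + ["minority"] * len(minority_balanced)
selected, best = forward_select(rows, labels, n_features=2)
```

In this toy run the classes are balanced before the wrapper scores any subset, so the selection step is not dominated by the majority class.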

FIGURE 10. Comparison of accuracy and f-measure of proposed SFS model.

student grade prediction. To handle the imbalance problem, we used a real dataset of students' final course grades from JTMK at one of the Malaysian polytechnics to analyze and compare the results of the proposed model. Similar studies [7], [32] also noted the significance of course grades for decision making in educational domains. To answer our research questions, we conducted a comprehensive experiment on the real student dataset by comparing the accuracy of the prediction model across the selected machine learning algorithms. We then applied SMOTE oversampling and two FS methods to compare the effectiveness of the predictive models, using the evaluation metrics of accuracy, precision, recall and f-measure.

Overall, the results indicate that all predictive models (J48, NB, kNN, SVM, LR and RF) perform better when SMOTE is applied independently to the imbalanced dataset. However, after we applied the wrapper-based FS method to the imbalanced dataset, only kNN and NB showed significant improvement, whereas SVM remained unchanged. This is due to the tendency toward overfitting and biased results caused by the imbalanced data when selecting the feature subsets. In addition, we noticed that SVM is not able to work independently in solving imbalanced multi-classification, owing to the limitation in computing the best hyperplane for a high-dimensional imbalanced dataset [33]. As for NB, the use of FS for predicting student grades is also supported in [30], where the authors found that NB shows the highest accuracy when wrapper-based subset feature selection is applied. However, we found that FS on its own is not able to

improve the accuracy of RF, which might be due to the imbalanced dataset. Thus, FS enabled the predictive model to be interpreted more quickly, but the improvement did not depend on having few features [34].

We then attempted to reduce overfitting and misclassification of the minority class by introducing the SFS model, which combines SMOTE with a selection of appropriate features for all predictive models. Here, the overall performance indicates that the proposed SFS model with RF outperformed the previous studies [12], [15]. The best accuracy, 99.5% obtained by RF, is slightly higher than that of kNN and J48, which shows that RF was the ideal algorithm for predicting students' final grades. Meanwhile, kNN was the ideal solution when it can work with the best value of k and optimal features [35]. The experimental results revealed that the proposed SFS model had a more significant effect on kNN, depending on the FS algorithm selected. These results are also similar to the best performance of kNN in handling imbalanced data in different case studies, as reported in [36]. In this context, we also observe that most of the predictive models considered benefit from SMOTE oversampling, but integrating accurate features with different FS algorithms can influence prediction effectiveness as well.

Despite these findings, we have identified several limitations: (1) the analysis is based on a single defined dataset, and other datasets should be tested for generalization, which could affect the analysis results; (2) the analysis is carried out only with certain well-known algorithms, but ensemble or advanced machine learning algorithms could be analyzed to compare their effectiveness for imbalanced multi-classification prediction; (3) we used only one oversampling method, SMOTE, and more methods could be used to analyze whether they can improve the multi-class imbalance problem.

Therefore, this study still needs to be improved in predicting students' final grades by improving the sampling techniques for imbalanced multi-class datasets, which might affect the accuracy of the prediction results. In addition, we will also consider using an SVM ensemble as part of the analysis, since it has produced greater accuracy when predicting students' final grades, as mentioned in [37].

VII. CONCLUSION AND FUTURE DIRECTIONS
Predicting student grades is one of the key performance indicators that can help educators monitor students' academic performance. Therefore, it is important to have a predictive model that can reduce the level of uncertainty in the outcome for an imbalanced dataset. This paper proposes a multiclass prediction model with six predictive models to predict students' final grades based on their previous final examination results in a first-semester course. Specifically, we have carried out a comparative analysis of combining SMOTE oversampling with different FS methods to evaluate the accuracy of student grade prediction. We have also shown that SMOTE oversampling alone improves performance more consistently than FS alone across all predictive models. However, our proposed multiclass prediction model performed more effectively than SMOTE oversampling or FS alone, with some parameter settings that can influence the accuracy of all predictive models. Our findings thus contribute a practical approach for addressing imbalanced multi-classification based on a data-level solution for student grade prediction.

In HEI, predictive analytics plays a significant role in governance, improving valuable information and developing trusted decision-making that contributes to data science [38]. Determining the quality of the collected dataset so as to reduce difficulties with imbalance and missing values is one of the challenging issues in selecting relevant and valuable predictive models [39]. Therefore, as future work, further investigation of appropriate emerging predictive techniques, such as advanced machine learning algorithms [40] and more ensemble algorithms, is recommended to optimize the results for predicting student grades. It is also essential to analyze several multi-class imbalanced datasets with appropriate sampling techniques and different evaluation metrics suitable for the imbalanced multi-class domain, such as Kappa, Weighted Accuracy and other measures. Thus, using machine learning in higher learning institutions for student grade prediction will ultimately enhance the decision support system to improve their students' academic performance in the future.

ACKNOWLEDGMENT
The authors are grateful for the support of Student Sebastien Mambou in consultations regarding application aspects.

REFERENCES
[1] D. Solomon, S. Patil, and P. Agrawal, “Predicting performance and potential difficulties of university student using classification: Survey paper,” Int. J. Pure Appl. Math, vol. 118, no. 18, pp. 2703–2707, 2018.
[2] E. Alyahyan and D. Düştegör, “Predicting academic success in higher education: Literature review and best practices,” Int. J. Educ. Technol. Higher Educ., vol. 17, no. 1, Dec. 2020.
[3] V. L. Miguéis, A. Freitas, P. J. V. Garcia, and A. Silva, “Early segmentation of students according to their academic performance: A predictive modelling approach,” Decis. Support Syst., vol. 115, pp. 36–51, Nov. 2018.
[4] P. M. Moreno-Marcos, T.-C. Pong, P. J. Munoz-Merino, and C. D. Kloos, “Analysis of the factors influencing Learners' performance prediction with learning analytics,” IEEE Access, vol. 8, pp. 5264–5282, 2020.
[5] A. E. Tatar and D. Düştegör, “Prediction of academic performance at undergraduate graduation: Course grades or grade point average?” Appl. Sci., vol. 10, no. 14, pp. 1–15, 2020.
[6] Y. Zhang, Y. Yun, H. Dai, J. Cui, and X. Shang, “Graphs regularized robust matrix factorization and its application on student grade prediction,” Appl. Sci., vol. 10, p. 1755, Jan. 2020.
[7] H. Aldowah, H. Al-Samarraie, and W. M. Fauzy, “Educational data mining and learning analytics for 21st century higher education: A review and synthesis,” Telematics Informat., vol. 37, pp. 13–49, Apr. 2019.
[8] K. L.-M. Ang, F. L. Ge, and K. P. Seng, “Big educational data & analytics: Survey, architecture and challenges,” IEEE Access, vol. 8, pp. 116392–116414, 2020.
[9] A. Hellas, P. Ihantola, A. Petersen, V. V. Ajanovski, M. Gutica, T. Hynninen, A. Knutas, J. Leinonen, C. Messom, and S. N. Liao, “Predicting academic performance: A systematic literature review,” in Proc. 23rd Annu. Conf. Innov. Technol. Comput. Sci. Educ., Jul. 2018, pp. 175–199.
[10] L. M. Abu Zohair, “Prediction of student's performance by modelling small dataset size,” Int. J. Educ. Technol. Higher Educ., vol. 16, no. 1, pp. 1–8, Dec. 2019, doi: 10.1186/s41239-019-0160-3.


[11] X. Zhang, R. Xue, B. Liu, W. Lu, and Y. Zhang, “Grade prediction of student academic performance with multiple classification models,” in Proc. 14th Int. Conf. Natural Comput., Fuzzy Syst. Knowl. Discovery (ICNC-FSKD), Jul. 2018, pp. 1086–1090.
[12] S. T. Jishan, R. I. Rashu, N. Haque, and R. M. Rahman, “Improving accuracy of students' final grade prediction model using optimal equal width binning and synthetic minority over-sampling technique,” Decis. Anal., vol. 2, no. 1, pp. 1–25, Dec. 2015.
[13] A. Polyzou and G. Karypis, “Grade prediction with models specific to students and courses,” Int. J. Data Sci. Anal., vol. 2, nos. 3–4, pp. 159–171, Dec. 2016.
[14] Z. Iqbal, J. Qadir, A. N. Mian, and F. Kamiran, “Machine learning based student grade prediction: A case study,” 2017, arXiv:1708.08744. [Online]. Available: https://arxiv.org/abs/1708.08744
[15] I. Khan, A. Al Sadiri, A. R. Ahmad, and N. Jabeur, “Tracking student performance in introductory programming by Means of machine learning,” in Proc. 4th MEC Int. Conf. Big Data Smart City (ICBDSC), Jan. 2019, pp. 1–6.
[16] M. A. Al-Barrak and M. Al-Razgan, “Predicting students final GPA using decision trees: A case study,” Int. J. Inf. Educ. Technol., vol. 6, no. 7, pp. 528–533, 2016.
[17] E. C. Abana, “A decision tree approach for predicting student grades in research project using WEKA,” Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 7, pp. 285–289, 2019.
[18] F. Ahmad, N. H. Ismail, and A. A. Aziz, “The prediction of students' academic performance using classification data mining techniques,” Appl. Math. Sci., vol. 9, pp. 6415–6426, Apr. 2015.
[19] T. Anderson and R. Anderson, “Applications of machine learning to student grade prediction in quantitative business courses,” Glob. J. Bus. Pedagog., vol. 1, no. 3, pp. 13–22, 2017.
[20] S. Hussain, N. A. Dahan, F. M. Ba-Alwib, and N. Ribata, “Educational data mining and analysis of students' academic performance using WEKA,” Indonesian J. Electr. Eng. Comput. Sci., vol. 9, no. 2, pp. 447–459, 2018.
[21] A. Verma, “Evaluation of classification algorithms with solutions to class imbalance problem on bank marketing dataset using WEKA,” Int. Res. J. Eng. Technol., vol. 6, no. 3, pp. 54–60, 2019.
[22] D. Berrar, “Cross-validation,” Comput. Biol., vols. 1–3, pp. 542–545, Jan. 2018, doi: 10.1016/B978-0-12-809633-8.20349-X.
[23] M. Hussain, W. Zhu, W. Zhang, S. M. R. Abidi, and S. Ali, “Using machine learning to predict student difficulties from learning session data,” Artif. Intell. Rev., vol. 52, no. 1, pp. 381–407, Jun. 2019.
[24] B. Predić, G. Dimić, D. Ranćić, P. Štrbac, N. Maček, and P. Spalević, “Improving final grade prediction accuracy in blended learning environment using voting ensembles,” Comput. Appl. Eng. Educ., vol. 26, no. 6, pp. 2294–2306, Nov. 2018, doi: 10.1002/cae.22042.
[25] K. Srivastava, D. Singh, A. S. Pandey, and T. Maini, “A novel feature selection and short-term price forecasting based on a decision tree (J48) model,” Energies, vol. 12, p. 3665, Jan. 2019.
[26] L. E. O. Breiman, “Random Forests,” Mach. Learn., vol. 45, pp. 5–32, Oct. 2001.
[27] T. M. Barros, P. A. SouzaNeto, I. Silva, and L. A. Guedes, “Predictive models for imbalanced data: A school dropout perspective,” Educ. Sci., vol. 9, no. 4, p. 275, Nov. 2019.
[28] T. Alam, C. F. Ahmed, S. A. Zahin, M. A. H. Khan, and M. T. Islam, “An effective recursive technique for multi-class classification and regression for imbalanced data,” IEEE Access, vol. 7, pp. 127615–127630, 2019.
[29] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” J. Artif. Intell. Res., vol. 16, pp. 321–357, Jun. 2002.
[30] C. Jalota and R. Agrawal, Feature Selection Algorithms and Student Academic Performance: A Study, vol. 1165. Singapore: Springer, 2021.
[31] G. A. Sharifai and Z. Zainol, “Feature selection for high-dimensional and correlation based redundancy and binary,” Genesm, vol. 11, pp. 1–26, Jun. 2020.
[32] Buenaño-Fernández, D. Gil, and S. Luján-Mora, “Application of machine learning in predicting performance for computer engineering students: A case study,” Sustain., vol. 11, no. 10, pp. 1–18, 2019.
[33] S. Chinna Gopi, B. Suvarna, and T. Maruthi Padmaja, “High dimensional unbalanced data classification Vs SVM feature selection,” Indian J. Sci. Technol., vol. 9, no. 30, Aug. 2016.
[34] R. Hasan, S. Palaniappan, S. Mahmood, A. Abbas, K. U. Sarker, and M. U. Sattar, “Predicting student performance in higher educational institutions using video learning analytics and data mining techniques,” Appl. Sci., vol. 10, no. 11, p. 3894, Jun. 2020.
[35] S. Zhang, X. Li, M. Zong, X. Zhu, and D. Cheng, “Learning k for kNN Classification,” ACM Trans. Intell. Syst. Technol., vol. 8, no. 3, pp. 1–9, 2017.
[36] P. Nair and I. Kashyap, “Optimization of kNN classifier using hybrid preprocessing model for handling imbalanced data,” Int. J. Eng. Res. Technol., vol. 12, no. 5, pp. 697–704, 2019.
[37] Brodic, A. Amelio, and R. Jankovic, “Comparison of different classification techniques in predicting a university course final grade,” in Proc. 41st Int. Conv. Inf. Commun. Technol. Electron. Microelectron., 2018, pp. 1382–1387.
[38] P. Brous and M. Janssen, “Trusted decision-making: Data governance for creating trust in data science decision outcomes,” Administ. Sci., vol. 10, no. 4, p. 81, Oct. 2020.
[39] H. Sun, M. R. Rabbani, M. S. Sial, S. Yu, J. A. Filipe, and J. Cherian, “Identifying big Data's opportunities, challenges, and implications in finance,” Mathematics, vol. 8, no. 10, p. 1738, Oct. 2020.
[40] M. Tsiakmaki, G. Kostopoulos, S. Kotsiantis, and O. Ragos, “Implementing autoML in educational data mining for prediction tasks,” Appl. Sci., vol. 10, no. 1, pp. 1–27, 2020.

SITI DIANAH ABDUL BUJANG received the B.S. degree in science (computer science) and the M.S. degree in science from Universiti Teknologi Malaysia (UTM), in 2006 and 2010, respectively. She is currently pursuing the Ph.D. degree in software engineering with the Malaysia-Japan International Institute of Technology, UTM, Kuala Lumpur. Her thesis focuses on the application of predictive analytics on student grade prediction in a higher education institution. From 2010 to 2019, she was a Senior Lecturer of Information and Communication Technology Department, Polytechnic Sultan Idris Shah, Sabak, Selangor, Malaysia. She has experience in developing the polytechnic curriculum for Diploma in information technology (technology digital), 2.5 years' program. She is one of the book authors that contribute for the Department of Polytechnic and Community College Education. Her research interests include data analytics, predictive analytics, learning analytics, educational data mining, and machine learning.

ALI SELAMAT (Member, IEEE) has been the Dean of the Malaysia-Japan International Institute of Technology (MJIIT), an academic institution established under the cooperation of the Japanese International Cooperation Agency (JICA) and the Ministry of Education Malaysia (MOE) to provide the Japanese style of education in Malaysia, Universiti Teknologi Malaysia (UTM), Malaysia, since 2018. He is currently a Full Professor with UTM, where he is also a Professor with the Software Engineering Department, Faculty of Computing. He has published more than 60 IF research articles. His H-index is 20, and his number of citations in WoS is more than 800. His research interests include software engineering, software process improvement, software agents, web engineering, information retrievals, pattern recognition, genetic algorithms, neural networks, soft computing, computational collective intelligence, strategic management, key performance indicator, and knowledge management. He is on the Editorial Board of the journal Knowledge-Based Systems (Elsevier). He has been serving as the Chair of the IEEE Computer Society Malaysia, since 2018.

ROLIANA IBRAHIM (Member, IEEE) received the B.Sc. degree (Hons.) in computer studies from Liverpool John Moores University, the M.Sc. degree in computer science from Universiti Teknologi Malaysia (UTM), and the Ph.D. degree in systems engineering from Loughborough University. She is currently the Director of applied computing at the School of Computing, formerly known as the Faculty of Computing. Previously, she was the Head of the Information Systems Department, for three years, at the Faculty of Computing, UTM. She has been an Academic Staff at the Information Systems Department, since 1999. She was previously a Coordinator of the B.Sc. Computer Science (Bioinformatics) Program and the Master of Information Technology (IT Management).


ONDREJ KREJCAR received the Ph.D. degree in technical cybernetics from the Technical University of Ostrava, Czech Republic, in 2008, and the Ph.D. degree in applied informatics from the University of Hradec Kralove. He is focusing on lecturing on smart approaches to the development of information systems and applications in ubiquitous computing environments with the University of Hradec Kralove. He is a Full Professor of systems engineering and informatics at the Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove, Czech Republic, and a Research Fellow at the Malaysia-Japan International Institute of Technology, University Technology Malaysia, Kuala Lumpur, Malaysia. From 2016 to 2020, he was the Vice-Dean of Science and Research at the Faculty of Informatics and Management, UHK. He has been the Vice-Rector of science and creative activities at the University of Hradec Kralove, since June 2020. He is also currently the Director of the Center for Basic and Applied Research, University of Hradec Kralove. His H-index is 20, with more than 1300 citations received in the Web of Science, where more than 100 IF journal articles is indexed in JCR index. In 2018, he was the 14th Top Peer Reviewer in multidisciplinary in the world according to Publons and a Top Reviewer in the Global Peer Review Awards 2019 by Publons. His research interests include control systems, smart sensors, ubiquitous computing, manufacturing, wireless technology, portable devices, biomedicine, image segmentation and recognition, biometrics, technical cybernetics, and ubiquitous computing. His second area of research interests include biomedicine (image analysis), biotelemetric system architecture (portable device architecture and wireless biosensors), and development of applications for mobile devices with use of remote or embedded biomedical sensors. He is currently on the Editorial Board of the Sensors (MDPI) IF journal (Q1/Q2 at JCR) and several other ESCI indexed journals. He has been the Vice-Leader and Management Committee Member at WG4 at Project COST CA17136, since 2018. He has also been a Management Committee Member Substitute at Project COST CA16226, since 2017. Since 2019, he has been the Chairman of the Program Committee of the KAPPA Program, Technological Agency of the Czech Republic, and was a Regulator of the EEA/Norwegian Financial Mechanism in the Czech Republic, from 2019 to 2024. Since 2020, he has been the Chairman of the Panel 1 (computer, physical and chemical sciences) of the ZETA Program, Technological Agency of the Czech Republic. From 2014 to 2019, he was the Deputy Chairman of the Panel 7 (processing industry, robotics, and electrical engineering) of the Epsilon Program, Technological Agency of the Czech Republic.

ENRIQUE HERRERA-VIEDMA (Fellow, IEEE) received the M.Sc. and Ph.D. degrees in computer science from the University of Granada, Granada, Spain, in 1993 and 1996, respectively. He is currently a Professor of computer science AI and the Vice-President of research and knowledge transfer, University of Granada. He has authored or coauthored more than 300 articles in JCR journals. In 2013, he has published in the prestigious journal science about the new role of digital libraries in the era of the information society. He was the Vice-President for Publications with the IEEE System Man and Cybernetics Society from 2019 to 2020. He is currently the VP of cybernetics, the Founder of the IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, and a Highly Cited Researcher by Clarivate Analytics in computer science and engineering from 2014 to 2020. His H-index is 101 in Google Scholar (more than 33000 citations) and 85 in WoS (more than 23000 citations). He is also an Associate Editor of several AI journals like IEEE TRANSACTIONS ON FUZZY SYSTEMS, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, Knosys, Applied Soft Computing, Fuzzy optimization and Decision Making, Information Sciences, and Soft Computing. He has also been a Guest Lecturer in plenary lectures and tutorials in multiple national and international conferences related to artificial intelligence.

HAMIDO FUJITA (Life Senior Member, IEEE) received the title of Honorary Professor and the Doctor Honoris Causa degree from Óbuda University, Budapest, Hungary, in 2011 and 2013, respectively, and the Doctor Honoris Causa degree from Timisoara Technical University, Timişoara, Romania, in 2018. He is an Emeritus Professor with Iwate Prefectural University, Takizawa, Japan. He is currently the Executive Chairman of i-SOMET Incorporated Association, www.i-SOMET.org, Morioka, Japan. He is a Distinguished Research Professor at the University of Granada and an Adjunct Professor with Stockholm University, Stockholm, Sweden; the University of Technology Sydney, Ultimo, NSW, Australia; National Taiwan Ocean University, Keelung, Taiwan, and others. He has supervised Ph.D. students jointly with the University of Laval, Quebec City, QC, Canada; the University of Technology Sydney; Oregon State University, Corvallis, OR, USA; the University of Paris 1 Pantheon-Sorbonne, Paris, France; and the University of Genoa, Italy. He has four international patents in software systems and several research projects with Japanese industry and partners. He was a recipient of the Honorary Scholar Award from the University of Technology Sydney, in 2012. He was a Highly Cited Researcher in cross-field for the year 2019 and in computer science field for the year 2020 by Clarivate Analytics. He headed a number of projects, including intelligent HCI, a project related to mental cloning for healthcare systems as an intelligent user interface between human-users and computers, and a SCOPE Project on virtual doctor systems for medical applications. He collaborated with several research projects in Europe. He is recently collaborating in OLIMPIA Project supported by Tuscany Region on therapeutic monitoring of Parkinson's disease. He has published more than 400 highly cited articles. He is the Emeritus Editor-in-Chief of Knowledge-Based Systems and currently the Editor-in-Chief of Applied Intelligence (Springer).

NOR AZURA MD. GHANI (Member, IEEE) is a Professor with the Center for Statistical Studies and Decision Sciences, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Malaysia, and the Chair of IEEE Computer Society Malaysia Chapter. She currently serves as the Deputy Director (Research Impact) at the Research Management Center, Universiti Teknologi MARA, Malaysia. Her expertise is big data, image processing, artificial neural networks, statistical pattern recognition, and forensic statistics. She is the author or coauthor of many journals and conference proceedings at national and international levels.
