1. Introduction
Predicting academic achievement is one of the main areas in educational data mining [1]. To date,
there are 13,653 documents in Scopus that discuss students' academic performance.
Research on student academic achievement was first conducted in 1954 by Reed M. Merrill, in order
to evaluate the academic achievement of students on probation. Articles related to student academic
achievement from 2002 to 2021 totaled 13,016. The country with the highest number of
articles on this topic is the United States with 4,098 articles, followed by India with 815 articles,
and China, Spain, and Australia with about 700 articles each. Meanwhile, Indonesia is ranked 8th
with 470 articles. Research focused on predicting students' academic achievement has also grown
remarkably in the last ten years, with 1,027 articles in Scopus from 2012 to 2021. Based on these
data, the prediction of student academic achievement is an important and interesting research area.
Student academic achievement prediction models generally use the GPA variable as the target,
classifying GPA into classes [2]–[5]. In addition, some researchers use regression models to predict
student academic achievement [6], [7]. Regression models such as the gradient boosting regression
tree (GBRT), random forest, and neural networks involve a number of hyperparameters that must be
set up before use [8].
http://ijair.id [email protected]
International Journal of Artificial Intelegence Research, ISSN 2579-7298, Vol. 8, No. 1.1 (2024)
The goal of GBRT is to enhance the regression performance of a single model by combining
many fitted models. GBRT therefore uses two algorithms: the regression tree, from the decision
tree (DT) family, and gradient boosting, a general meta-learning approach used to combine single
regression tree models [9]. We concentrate on this model because it is currently the best-performing
approach in the majority of Kaggle contests [10], [11] and because its performance is greatly
influenced by the selection of the hyperparameters.
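As a concrete sketch of this idea, a GBRT can be fitted with scikit-learn's GradientBoostingRegressor. The synthetic data below is an assumption for illustration, not the paper's actual student data set:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the student data set.
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# GBRT: an ensemble of shallow regression trees, each one fitted to the
# gradient (residuals) of the loss of the trees built so far.
gbrt = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbrt.fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, gbrt.predict(X_te))
```

Each of the constructor arguments above is one of the hyperparameters whose tuning this paper compares.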
Machine learning algorithms automatically pick up new information and, as a result,
modify their internal parameters in response to it. These parameters are referred to as
"model parameters," or just "parameters" for short. However, some settings must be
fixed beforehand rather than changed throughout the learning process; these are
commonly known as "hyperparameters." While model parameters describe how input data is
converted into the desired output, hyperparameters define the structure and organization of the
model itself. A machine learning model's performance can vary significantly depending on the
choice and values of its hyperparameters, and a model can reach significantly higher accuracy
with the right hyperparameter tuning [12], [13].
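The distinction can be illustrated with a minimal scikit-learn sketch (the library and the toy data are assumptions; the paper itself shows no code):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Toy data: 100 samples, 3 features.
X, y = make_regression(n_samples=100, n_features=3, random_state=0)

# A hyperparameter is fixed BEFORE training, as a constructor argument.
model = LinearRegression(fit_intercept=True)

# Model parameters are learned FROM the data during fit().
model.fit(X, y)
coefficients = model.coef_   # learned by the algorithm, not set by the researcher
intercept = model.intercept_
```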
Tuning hyperparameters is an important step before implementing a prediction algorithm [8],
[14]–[16]. Model performance depends on the selection of the model's hyperparameters [17], and
the choice of hyperparameter configuration is known to have a significant impact on the
performance of machine learning models [16]. Hyperparameter tuning has been applied to deep
learning [15], [18], to machine learning algorithms in general [12], [16], [19], to ensemble machine
learning [20], to random forest [21], to neural networks [22], [23], and to SVM [24]. Manual search
is one method for hyperparameter optimization; however, it requires a significant amount of
time [25].
The originality of this paper lies in its comparison of several hyperparameter tuning
techniques for predicting student academic achievement. Four techniques are applied in this
research: Grid Search, Random Search, Optuna, and Hyperopt.
2. Method
In previous studies we compared several prediction algorithms for student academic
achievement and found that the best technique is the Gradient Boosting Regression Tree
(GBRT) [6]. For this reason, in this study we tune the hyperparameters of that algorithm.
Figure 1 shows the steps of this study.
2.1. Hyperparameters
Model parameters acquire their values during training; they cannot be set manually, because
the model learns them from the provided data. The coefficients of a linear regression model, for
instance, are model parameters [26].
In contrast, hyperparameters are not determined from the data; researchers need to set
them manually. During the model-building phase, researchers must always specify the values of
the hyperparameters before starting the training process [26].
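For the GBRT used here, the hyperparameters fixed before training are the four later reported in Table 1. A hypothetical search space (the exact ranges used in the study are not given in this excerpt) might look like:

```python
# Illustrative search space over the four GBRT hyperparameters that
# Table 1 reports; the actual ranges used in the study are assumptions here.
param_grid = {
    "learning_rate": [0.01, 0.02, 0.05, 0.1],
    "subsample": [0.5, 0.7, 1.0],
    "n_estimators": [50, 100, 300],
    "max_depth": [4, 8, 12, 16],
}

# An exhaustive grid search must evaluate every combination:
n_candidates = 1
for values in param_grid.values():
    n_candidates *= len(values)   # 4 * 3 * 3 * 4 = 144 candidates
```

This combinatorial growth is exactly why the cheaper search strategies compared in this paper exist.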
Muhammad Arifin et.al (Hyperparameter Tuning in Machine Learning to Predicting Student Academic Achievement)
while registration data provides economic information. The student affairs department's co-curricular
data comes in the form of extraction results from university-based organization decrees.
Records with inconsistent values were eliminated: for example, students with a GPA below 1,
students who used the LMS very infrequently, and students with academic information but no LMS
record. Consequently, data from 4,436 students were examined in the study.
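The cleaning rules described above can be sketched in pandas; the column names, the login threshold, and the toy rows below are hypothetical, since the paper does not publish its schema:

```python
import pandas as pd

# Hypothetical schema; the paper does not list its actual column names.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4, 5],
    "gpa":        [3.2, 0.8, 2.9, 3.5, 2.1],
    "lms_logins": [120, 45, 0, 3, 80],   # LMS activity counts
})

MIN_LOGINS = 5  # assumed threshold for "very infrequent" LMS use

clean = students[
    (students["gpa"] >= 1.0)                  # drop GPA below 1
    & (students["lms_logins"] >= MIN_LOGINS)  # drop missing / near-zero LMS use
].reset_index(drop=True)
```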
example is 100 (10 × 10), which is significantly fewer than in the previous situation (1,280
iterations). We set the parameter ranges to be the same as in Grid Search. The results of the
experiments using Random Search are shown in Figure 3.
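A Random Search of this kind can be sketched with scikit-learn's RandomizedSearchCV. The data, the sampling ranges, and the reduced budget (n_iter is kept small so the sketch runs quickly; the study drew 100 candidates) are assumptions for illustration:

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=200, n_features=6, noise=5.0, random_state=0)

# Distributions to sample candidates from (illustrative ranges).
param_dist = {
    "learning_rate": uniform(0.01, 0.2),   # [0.01, 0.21)
    "subsample": uniform(0.5, 0.5),        # [0.5, 1.0)
    "n_estimators": randint(10, 120),
    "max_depth": randint(2, 10),
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_dist,
    n_iter=10,                             # the study used 100 (10 x 10)
    scoring="neg_mean_absolute_error",
    cv=3,
    random_state=0,
)
search.fit(X, y)
best_mae = -search.best_score_             # scikit-learn negates MAE
```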
limit the model's capacity to effectively capture data patterns. Finally, Hyperopt produced the
highest MAE of 0.368, with a very small number of estimators and a low max depth, indicating
that this model most likely suffers from underfitting and cannot adequately capture the complexity
of the data. Overall, GridSearchCV provides the best results and represents a more effective
approach for determining hyperparameters for this model.
Table 1. Hyperparameter Techniques Values

Technique            learning_rate  subsample  n_estimators  max_depth  MAE
GridSearchCV         0.02           0.5        300           16         0.249
RandomizedSearchCV   0.1            0.5        50            12         0.259
Optuna               0.1            0.666      106           8          0.278
Hyperopt             0.109          2          1             5          0.368
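The GridSearchCV procedure behind the first row of Table 1 can be sketched as follows; the synthetic data and the reduced grid (built around the reported winning values) are assumptions, not the study's actual data or full grid:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=6, noise=5.0, random_state=0)

# A reduced grid around the values Table 1 reports for GridSearchCV
# (learning_rate=0.02, subsample=0.5, n_estimators=300, max_depth=16).
param_grid = {
    "learning_rate": [0.02, 0.1],
    "subsample": [0.5, 1.0],
    "n_estimators": [50, 300],
    "max_depth": [4, 16],
}

grid = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=3,
)
grid.fit(X, y)                   # evaluates all 16 combinations
best_mae = -grid.best_score_
```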
3. Discussion
Based on the experiments that were carried out, GridSearchCV is the best hyperparameter
technique, with the lowest MAE value. Meanwhile, [17] reported that Optuna performs better
than HyperOpt, Optunity, and SMAC. [31] found that random search is faster than grid search
but cannot guarantee the results. [22] found Grid Search better than Random Search, although
the best method was Self-Tuning Networks. [32] showed that Grid Search has better stability
than Random Search; however, the difference is not large. [33] recommend Random Search for
finding the best hyperparameters. [34] found that the Hyperopt technique works better than the
Random Search and Grid Search methods due to its higher mean Gini score, a sign of more
accurate predictions. Random Search showed the best performance when compared with TPE,
Grid Search, and CMAES [35].
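The trade-off debated above, that grid search is exhaustive while random search is faster but offers no guarantee, can be illustrated with a toy objective standing in for cross-validated MAE (everything in this sketch is hypothetical):

```python
import itertools
import random

# Toy objective standing in for cross-validated MAE: lower is better,
# with its minimum at learning_rate=0.02, max_depth=16.
def objective(learning_rate, max_depth):
    return (learning_rate - 0.02) ** 2 + (max_depth - 16) ** 2 / 100

grid = {
    "learning_rate": [0.01, 0.02, 0.05, 0.1],
    "max_depth": [4, 8, 12, 16],
}

# Grid search: evaluates every combination (16 here) and is guaranteed
# to find the best point that lies on the grid.
grid_best = min(
    objective(lr, md)
    for lr, md in itertools.product(grid["learning_rate"], grid["max_depth"])
)

# Random search: evaluates a fixed budget of sampled points; cheaper,
# but with no guarantee of hitting the optimum.
rng = random.Random(0)
random_trials = [
    objective(rng.uniform(0.01, 0.1), rng.randint(4, 16))
    for _ in range(8)
]
random_best = min(random_trials)
```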
4. Conclusion
The results showed that the four hyperparameter techniques produced almost the same MAE
values; the range of MAE values between techniques is insignificant. GridSearchCV is the
technique with the lowest MAE value, but achieving this value requires a large number of
estimators, a large depth, and a very small learning_rate.
References
[1] C. Romero, S. Ventura, M. Pechenizkiy, and R. S. J. d. Baker, Handbook of
Educational Data Mining, 1st ed. Boca Raton, FL: CRC Press, 2010.
[2] R. O. Aluko, E. I. Daniel, O. Shamsideen Oshodi, C. O. Aigbavboa, and A. O. Abisuga,
“Towards reliable prediction of academic performance of architecture students using data
mining techniques,” J. Eng. Des. Technol., vol. 16, no. 3, pp. 385–397, 2018, doi:
10.1108/JEDT-08-2017-0081.
[3] H. Karalar, C. Kapucu, and H. Gürüler, “Predicting students at risk of academic failure
using ensemble model during pandemic in a distance learning system,” Int. J. Educ.
Technol. High. Educ., vol. 18, no. 1, 2021, doi: 10.1186/s41239-021-00300-y.
[4] R. Conijn, C. Snijders, A. Kleingeld, and U. Matzat, “Predicting student performance from
LMS data: A comparison of 17 blended courses using moodle LMS,” IEEE Trans. Learn.
Technol., vol. 10, no. 1, pp. 17–29, 2017, doi: 10.1109/TLT.2016.2616312.
[5] S. Helal et al., “Predicting academic performance by considering student heterogeneity,”
Knowledge-Based Syst., vol. 161, pp. 134–146, 2018, doi:
10.1016/j.knosys.2018.07.042.
[6] M. Arifin, Widowati, Farikhin, A. Wibowo, and B. Warsito, “Comparative Analysis on
Educational Data Mining Algorithm to Predict Academic Performance,” Proc. - 2021 Int.
Semin. Appl. Technol. Inf. Commun. IT Oppor. Creat. Digit. Innov. Commun. within Glob.
Pandemic, iSemantic 2021, pp. 173–178, 2021, doi:
10.1109/iSemantic52711.2021.9573185.
[7] L. W. Santoso and Yulia, “Predicting student performance in higher education using multi-
regression models,” Telkomnika (Telecommunication Comput. Electron. Control., vol. 18,
[26] H. Ma, X. Yang, J. Mao, and H. Zheng, “The Energy Efficiency Prediction Method Based
on Gradient Boosting Regression Tree,” 2nd IEEE Conf. Energy Internet Energy Syst.
Integr. EI2 2018 - Proc., vol. 1, no. 4, 2018, doi: 10.1109/EI2.2018.8581904.
[27] P. Datta, P. Das, and A. Kumar, “Hyper parameter tuning based gradient boosting algorithm
for detection of diabetic retinopathy: an analytical review,” Bull. Electr. Eng. Informatics,
vol. 11, no. 2, pp. 814–824, 2022, doi: 10.11591/eei.v11i2.3559.
[28] Z. M. Alhakeem, Y. M. Jebur, S. N. Henedy, H. Imran, L. F. A. Bernardo, and H. M.
Hussein, “Prediction of Ecofriendly Concrete Compressive Strength Using Gradient
Boosting Regression Tree Combined with GridSearchCV Hyperparameter-Optimization
Techniques,” Materials (Basel)., vol. 15, no. 21, p. 7432, 2022, doi: 10.3390/ma15217432.
[29] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A Next-generation
Hyperparameter Optimization Framework,” in Proceedings of the 25th ACM SIGKDD
International Conference on Knowledge Discovery & Data Mining, Jul. 2019, pp. 2623–
2631. doi: 10.1145/3292500.3330701.
[30] J. Bergstra, B. Komer, C. Eliasmith, D. Yamins, and D. D. Cox, “Hyperopt: A Python
library for model selection and hyperparameter optimization,” Comput. Sci. Discov., vol. 8,
no. 1, 2015, doi: 10.1088/1749-4699/8/1/014008.
[31] P. Liashchynskyi and P. Liashchynskyi, “Grid Search, Random Search, Genetic Algorithm:
A Big Comparison for NAS,” pp. 1–11, 2019.
[32] L. Villalobos-Arias, C. Quesada-López, J. Guevara-Coto, A. Martínez, and M. Jenkins,
“Evaluating Hyper-Parameter Tuning Using Random Search in Support Vector Machines
for Software Effort Estimation,” in Proceedings of the 16th ACM International Conference
on Predictive Models and Data Analytics in Software Engineering, 2020, pp. 31–40. doi:
10.1145/3416508.3417121.
[33] L. Villalobos-Arias and C. Quesada-López, “Comparative study of random search hyper-
parameter tuning for software effort estimation,” in Proceedings of the 17th International
Conference on Predictive Models and Data Analytics in Software Engineering, Aug. 2021,
pp. 21–29. doi: 10.1145/3475960.3475986.
[34] S. Putatunda and K. Rama, “A Comparative Analysis of Hyperopt as Against Other
Approaches for Hyper-Parameter Optimization of XGBoost,” in Proceedings of the 2018
International Conference on Signal Processing and Machine Learning - SPML ’18, 2018,
pp. 6–10. doi: 10.1145/3297067.3297080.
[35] J. Joy and M. P. Selvan, “A comprehensive study on the performance of different Multi-
class Classification Algorithms and Hyperparameter Tuning Techniques using Optuna,”
Proc. Int. Conf. Comput. Commun. Secur. Intell. Syst. IC3SIS 2022, 2022, doi:
10.1109/IC3SIS54991.2022.9885695.