Seminal Review Paper
MSC PROGRAM
SEMINAR PAPER
ON
PREDICTING STUDENTS' ACADEMIC SUCCESS USING
MACHINE LEARNING.
ID: PGE/49446/15
2016 E.C
This study aims to develop predictive models by leveraging large-scale educational datasets that
encompass various student attributes, academic performance metrics, and socio-economic factors.
The goal is to identify students at risk of academic failure, success, or dropout by considering
factors such as personal information, academic evaluation, student activities, environment, and
attendance.
Initially, several supervised machine learning algorithms, including Decision Trees, Neural
Networks, Random Forests, Logistic Regression, Naïve Bayes, KNN, and SVM, were evaluated.
However, due to the increasing number of students and data diversity, these algorithms proved
insufficient. To overcome this challenge, the study implemented Big Data technology, which
enhanced processing efficiency while maintaining accuracy.
The system achieved a high recognition rate for predicting student success, demonstrating its
effectiveness in monitoring and enhancing student academic achievement. The research also
discusses the challenges faced by educators in identifying at-risk students and provides a review
of the theoretical foundations of predictive modeling in education. The approach involves the
collection, analysis, and interpretation of data using advanced analytics techniques.
Furthermore, the study presents a machine learning model and algorithm that employ clustering
and classification techniques to train and test the machine on data from higher education
institutions. Big data analytics technology was leveraged to optimize processing time without
compromising algorithm efficiency. Data are stored in HDFS, and classification algorithms are
applied for the prediction of student success using MAPREDUCE. The results, compared before
and after the use of Educational Data Mining, showcase improved execution time and a recognition
rate of using the SVM algorithm.
Overall, these models demonstrate high efficiency and effectiveness in predicting student
outcomes, with a focus on success, performance, retention, and graduation rate. This paper offers
valuable insights and tools for educational practitioners to make timely interventions, personalize
instructional approaches, and support decision-making. By harnessing the power of machine
learning and predictive modeling, educators can enhance educational outcomes and address
challenges like student failures and dropouts.
TABLE OF CONTENTS
ABSTRACT
1. INTRODUCTION
1.1 BACKGROUND OF THE STUDY
1.2 Statement of Problem
1.3 Objective of the Study
1.3.1 General Objective
1.3.2 Specific Objectives
1.4 Formulated Research Question
1.5 Motivation
1.6 Importance of the study for local schools
1.7 Scope of the study
1.8 Applying Machine Learning in Education
1.8.1 Machine Learning Based Prediction Model
1.8.2 Machine learning predictive approaches
1.8.3 System Architecture used predicting student academic success
2. LITERATURE REVIEW
LIST OF TABLES AND FIGURES
1. INTRODUCTION
Big Data analysis is an analytical tool that includes a comprehensive and sophisticated set of
procedures and algorithms for extracting meaningful information from studied data. In recent
years, it has been employed in almost every sector, including health, economics, social services,
human resources, education, industry, and government [1]. Education is an area where a huge
amount of data is generated and accumulated. As a result, Big Data technologies are being used
in the education sector, where the practice of mining educational data is known as Educational
Data Mining. It supports prediction, grouping, association extraction, model
discovery, and presentation of data. It is utilized for diverse objectives, including assessing learners
and developing.
This seminar paper focuses on students' academic achievement. Student academic performance
is a key indicator of educational quality and institutional success. A successful student is one who
has completed their program and passed every semester at school. Student academic success is
characterized as a set of indicators that capture engagement, assessment completion, and learning.
However, I use the grade point average (GPA) of the semesters or total mark transfer to quantify
student academic progress.
In this paper, I review methods and machine learning algorithms for predicting academic
success at school. Students face many changes, both in teaching methods and in evaluation
methods, and need assistance to be successful throughout their educational life cycle. A variety of
elements influence students' academic progress. I divided them into five categories: personal
information, academic evaluation, student activities in school, psychological factors, and
environmental factors. I then used feature selection methods to identify the features that would be
effective for predicting student academic success.
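The feature selection step can be illustrated with a minimal sketch, assuming that features are ranked by the absolute Pearson correlation with GPA. The feature names and values below are hypothetical toy data, and correlation ranking is only one possible selection method; the reviewed papers may rely on other techniques.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)

def rank_features(features, target):
    """Return (name, |r|) pairs sorted from most to least correlated."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one value per student.
features = {
    "attendance_rate": [0.95, 0.80, 0.60, 0.90, 0.50],
    "study_hours": [10, 8, 3, 9, 2],
    "shoe_size": [40, 42, 41, 42, 40],  # deliberately irrelevant feature
}
gpa = [3.8, 3.1, 2.0, 3.5, 1.8]

ranking = rank_features(features, gpa)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data the irrelevant feature ends up at the bottom of the ranking, so a simple threshold on the score would be enough to drop it before training a model.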
The seminar paper is structured as follows. The opening section explains why BDA and
Machine Learning are vital for improving research; it covers the identified problem,
motivation, limitations, research questions, methodologies, findings, discussion and analysis,
and critiques of the papers. The Literature Review provides background for the
issue and summarizes past papers in this area. The methodology part highlights the technique used,
as well as the models and performance measures used to illustrate the big data and machine learning
approach. Finally, the Conclusion section presents the work's principal findings.
1.1 BACKGROUND OF THE STUDY
In Ethiopia, the academic year begins in September and ends in July, and the official primary
school entrance age is 7. The system is structured so that the primary school cycle lasts 6 years,
lower secondary lasts 4 years, and upper secondary lasts 2 years. Ethiopia has a total of 21,418,000
students enrolled in primary and secondary education. Of these students, about 16,200,000 (76%)
are enrolled in primary education. The study shows the highest level of education reached by youth
ages 15-24 in Ethiopia. Although youth in this age group may still be in school and working
towards their educational goals, it is notable that approximately 16% of youth have no formal
education and 54% of youth have attained at most incomplete primary education, meaning that in
total 70% of 15-24 year olds have not completed primary education in Ethiopia [2].
The Performance on Learning
This section provides information on indicators of learning, which lends insight into the quality of
educational provision. In this profile, learning is measured through literacy rates, which are
important because literacy is a foundational skill needed to attain secondary levels of learning, and
national performance on learning assessments. According to the UNESCO Institute for Statistics
(UIS), compared to other countries, Ethiopia ranks at the 29th percentile in access and at the 8th
percentile in learning. The data source compares youth and adult literacy rates and shows that, in
Ethiopia, the literacy rate is 55% among the youth population; this is lower than the average youth
literacy rate in other low-income countries.
According to the Adaba Woreda Education Office (AWEO), the number of students registered in
secondary education in 2015 was 5000, the lowest enrolment in the last decade (2013-2015).
In 2016 the trend continued, and the Woreda recorded a new enrolment of 4571 students in
Secondary Education (SE). In addition, grade repetition has been identified by the Office for
Education Growth and Development as one of the main problems of the Woreda education system.
The report states that the share of early school leavers is substantial and that many of them fail
to pursue additional training: many 15-24 year-olds in the Woreda have not completed 1st cycle
and 2nd cycle secondary education and are not enrolled in any further training or education.
One of the main goals set by AWEO is the reduction of student dropout and year repetition rates,
which creates a need for metrics to measure success in improving equity, performance, and school
dropout rates.
In general, the purpose of this seminar paper is to contribute to the field by developing
accurate and reliable prediction models that can improve educational practices and decision-
making, resulting in improved student performance and satisfaction.
1.2 Statement of Problem
Reviewing these seminar papers revealed several problems in predicting student academic success.
One major issue is the increasing number of students and the diversity of data sources, which make
traditional methods inefficient for analyzing data and making timely, effective decisions to
identify and improve student academic achievement. Additionally, previous studies have shown
that the factors influencing student success are crucial in predicting academic success.
Big data analytics and machine learning algorithms can be limited in handling the numerous
factors that impact student success. Furthermore, data mining methods are needed to construct
accurate prediction models, especially considering the growing complexity of student data, the
varying learning methods, the low rate of student knowledge retention, and the high rate of
student dropout from schools. Developing a system based on big data technologies that achieves a
high recognition rate in predicting student success, and that demonstrates the effectiveness of the
approach in monitoring and improving student outcomes, plays a vital role in minimizing these
problems.
1.4 Formulated Research Question
The questions raised in, or inferred from, the reviewed papers are summarized as follows:
Q1: What factors influence the prediction of students' academic success?
Q2: What BDA and Machine Learning methods can be utilized to predict student success?
Q3: How can BDA Tools be integrated to improve the efficiency and accuracy of models?
Q4: How effective are traditional models in predicting student success compared with Big Data approaches?
1.5 Motivation
The motivation behind the seminar on predicting student academic success using big data and
machine learning algorithms lies in monitoring students to prevent academic failure. By
developing a system that can accurately predict whether a student will succeed or fail academically,
teachers and educators can intervene early to support students in improving their learning and
performance. The seminar paper aims to utilize various factors such as personal information,
academic evaluation, student activities, environment, and attendance to create a predictive model
for student success. Ultimately, the goal of these papers is to leverage advanced technologies to
enhance the monitoring and support of students in achieving academic success.
This seminar paper on predicting student academic success addresses the critical issue
of academic failure among students by providing a tool for early prediction. By accurately
predicting student academic success, retention, dropout, or academic failure, teachers and
administrators can identify at-risk students and provide them with the necessary assistance to
improve their academic performance and success. The study has the following key significance:
it enables early intervention, supports personalized learning, optimizes resource allocation,
informs data-driven decision-making, promotes educational equity, facilitates continuous
improvement, and advances research in educational data analytics.
1.7 Scope of the study
The scope of the study on the prediction of student success using big data analytics and machine
learning can encompass various aspects related to student outcomes and the application of
predictive models. Here are some dimensions within the scope that the study considers:
Predictive Variables: These predictors can include academic-related factors (grades, test scores),
demographic information, socio-economic status, attendance records, behavioral data, learning
styles, engagement metrics, and other relevant data sources.
Prediction Models: The study may consider various techniques, such as Decision Trees, KNN, Neural
Networks, Random Forests, or SVM models, and compare their performance in predicting student
success outcomes.
Student Success Outcomes: The scope may include a specific focus on academic outcomes, such
as total grade mark, grade point average, retention rates, or graduation rates.
Evaluation Metrics: Common evaluation metrics include accuracy, precision, recall, F-score, or
other relevant measures that indicate the predictive power of the models.
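To make the evaluation metrics concrete, the following minimal sketch computes accuracy, precision, recall, and the F-score (F1) from true and predicted labels for a binary pass/fail task. The label encoding (1 = success, 0 = failure) and the toy labels are assumptions made for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 1 = student succeeded, 0 = student failed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, for example when dropouts are a small minority of the student population.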
The identified dimensions of the scope will allow researchers to generate meaningful insights and
contribute to the field of predicting student success using big data analytics and machine learning.
Using big data analysis and machine learning algorithms to analyze student data helps
educators understand individual learning styles, strengths, and areas for improvement. This
empowers students, teachers and schools to orient their instruction, provide targeted support, and
offer a truly customized learning experience for each student. With machine learning, education
becomes more effective, engaging, and adaptive, ensuring that every learner reaches their full
potential.
1.8.1 Machine Learning Based Prediction Model
Machine learning techniques utilize various methods to create prediction models, including:
1. Supervised Learning
• Linear Regression
• Decision Trees
• Random Forests
• K-Nearest Neighbors
• Support Vector Machines
2. Unsupervised Learning
• Clustering
• Hierarchical clustering
• K-means
3. Reinforcement Learning
• Model-based
• Model-free
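As an illustration of one of the supervised methods listed above, the following is a minimal sketch of K-Nearest Neighbors classification written with the Python standard library only. The two features (attendance rate and scaled average mark), the labels, and the choice of k = 3 are assumptions for the example and are not taken from the reviewed papers.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    # Sort training points by Euclidean distance to the query point.
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training data: (attendance_rate, average_mark / 100).
train_X = [(0.90, 0.80), (0.85, 0.75), (0.95, 0.90),
           (0.40, 0.35), (0.50, 0.45), (0.30, 0.50)]
train_y = ["pass", "pass", "pass", "fail", "fail", "fail"]

prediction_a = knn_predict(train_X, train_y, (0.88, 0.70))
prediction_b = knn_predict(train_X, train_y, (0.45, 0.40))
print(prediction_a, prediction_b)
```

Because KNN compares raw distances, features measured on different scales would normally need to be normalized first; here both features already lie between 0 and 1.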
4. Data Analysis: The reviewed papers apply various data analysis techniques, such as statistical
analysis, data mining, machine learning, and predictive modeling, to uncover patterns and
relationships in the data used for predicting student academic success.
5. Model Development: Develop prediction models using the selected features and the
appropriate machine learning algorithms. These models should be trained on a subset of the data
and validated using another subset to ensure their accuracy and reliability.
The system architecture for student prediction using big data analytics and machine learning may
involve several components working together to process, analyze, and predict student outcomes.
Here is a high-level overview of typical system architecture:
[Figure: System architecture. Components shown: Student Dataset (LMS, School Data), Data Analyzing, Feature Selection, Storage in HDFS, MAPREDUCE, Classification, Model Development, Model Evaluation, Decision Making.]
2. LITERATURE REVIEW
The potential of using big data analytics and machine learning approaches to predict student
academic success has been widely acknowledged in several review studies. Researchers have
explored diverse data sources and variables to enhance the accuracy of predictions, addressing the
challenge of effectively analyzing data due to the increasing number of students and the variety of
data sources [3].
In predicting student outcomes, previous studies have employed various algorithms, including
KNN, Neural Networks, Decision Trees, Regression, and SVM. These studies have emphasized
the significance of factors such as internet behavior and registration data [4].
The field of predicting student academic success can be divided into two main axes: understanding
the factors influencing academic success prediction and employing data mining methods for model
construction and validation. Various factors have been considered in these prediction models,
including personal information, academic evaluation, student activities, psychological aspects, and
environment. Big data analysis techniques and machine learning algorithms have been employed
to forecast student success [5].
Research in this area has explored multiple machine learning algorithms to predict different
outcomes, such as academic achievement, dropouts, course completion, graduation rates, and
engagement. Techniques like correlation analysis, feature importance ranking, and input from
domain experts have been utilized to identify the most relevant features for prediction.
Additionally, feature engineering approaches have been explored to capture complex relationships
within the data [6].
To evaluate the performance of prediction models, various metrics such as accuracy, precision,
recall, and F-score have been employed. Comparisons of different algorithms and the exploration of
ensemble models have been conducted to enhance prediction accuracy. Model performance and
generalizability have been assessed using cross-validation and holdout validation techniques [7].
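The cross-validation technique mentioned above can be sketched as follows. The code generates k-fold train/test index splits and, purely as a placeholder model, scores a majority-class baseline on each fold. The baseline, the number of folds, and the toy labels are assumptions for illustration; in practice the baseline would be replaced by the classifier under study.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        yield train_idx, test_idx

def majority_baseline_cv(labels, k=5, seed=0):
    """Mean accuracy of a predict-the-majority-class baseline over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        train_labels = [labels[i] for i in train_idx]
        majority = max(set(train_labels), key=train_labels.count)
        correct = sum(1 for i in test_idx if labels[i] == majority)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Hypothetical labels: 7 successful students (1) and 3 unsuccessful (0).
labels = [1] * 7 + [0] * 3
splits = list(k_fold_indices(len(labels), k=5))
mean_accuracy = majority_baseline_cv(labels, k=5)
print(mean_accuracy)
```

Every sample appears in exactly one test fold, so the averaged score uses all of the data for testing while never evaluating a model on the samples it was fitted to.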
While there are promising findings in the literature, it is important to note that the field is rapidly
evolving, and a continuous review of new advancements is necessary to advance the application
of big data analytics and machine learning in predicting student academic success.
Seminar Papers
Note: Hadoop MAPREDUCE is a software framework for easily writing applications which process vast amounts of data (multi-
terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
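The map, shuffle, and reduce phases described in the note can be imitated in memory with a short sketch. This is not the Hadoop API: the functions below are hypothetical stand-ins that only show the data flow, and the record format `student_id,course,mark` is an assumption made for the example.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (student_id, mark) pair from one 'student_id,course,mark' line."""
    student_id, _course, mark = record.split(",")
    yield student_id, float(mark)

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(student_id, marks):
    """Reduce: collapse the marks of one student into an average."""
    return student_id, sum(marks) / len(marks)

# Hypothetical input records.
records = [
    "s1,math,80",
    "s2,math,55",
    "s1,physics,90",
    "s2,physics,45",
    "s3,math,70",
]

mapped = [pair for record in records for pair in map_phase(record)]
averages = dict(reduce_phase(key, values)
                for key, values in shuffle(mapped).items())
print(averages)
```

On a real cluster the map and reduce functions would run in parallel on different nodes over blocks stored in HDFS; the logic of each function would stay the same.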
Table 1: Five seminar papers (1-5) evaluation and comparison

Paper Four: Supervised machine learning algorithms for predicting student dropout and academic success: a comparative study
SNO: 4
Level of Education: University students
Factors/Features: Various factors; student age, gender; demographic factors; academic evaluation; socioeconomic factor
BDA Approaches: Data Collection; Data Preprocessing; Feature Selection; Model Selection; Cross-Validation; Model Evaluation
Dataset: 4424
ML Selection: Compared ML models: LR, Decision Trees, Random Forest, GBM, SVM, Neural Networks
BDA Tools: Python, Apache Hadoop
Metrics: Accuracy, Precision, Recall, F-Score
Performance/Result: CatBoost

Paper Five: Machine learning model and ensemble algorithm for prediction of students' retention and graduation rate
SNO: 5
Level of Education: Primary school; Secondary schools; Highest University
Factors/Features: Average Grade Score; Identification; Assessment
BDA Approaches: CRISP-DM: Business Understanding; Data Understanding; Data Preparation; Modeling; Evaluation; Deployment
Dataset: 3623
ML Selection: ANN, KNN (best selected), Logistic Regression, Decision Trees, Random Forest, GBM, SVM, RG-DMML (new created model)
Metrics and Performance/Result: Retention vs Graduate: Retention = 0.909, Graduate = 0.817 (* accuracy of the knowledge retention)
Note: Hadoop and its frameworks, HDFS and MAPREDUCE, are current seminar topics in computer science and are investigated in
depth in the paper “Predicting student academic success using big data analytics and machine learning algorithms”.
3. STRENGTH, WEAKNESS AND CHALLENGES OF PAPERS
Table 2: Seminar Papers Evaluation on Strength, Weakness and Challenges
SNO | Strength | Weakness | Challenges/Limitations
1 | Machine Learning Based Predicting Student Academic Success
Early Intervention and Support | Bias and Fairness (Small dataset) | Data Quality and Availability
Student Capacity Identification | Lack of Context | Interpretable Models-unfair and reinforce
Early Resource Allocation for Schools | Inefficient factors/features for better accuracy | Dynamic nature of student development
Evidence-Based Decision Making | Overemphasis on Quantitative Data | Lack of contextual understanding
2 | Prediction of Student Success: A Smart Data-Driven Approach
Methodology | Data Quality and Representativeness | Data Integration and Quality
Performance Evaluation | Algorithm Selection and Evaluation | Algorithm Selection and Validation
Data Analysis | Ethical and Privacy Considerations | Interdisciplinary Collaboration
Practical Implications | Interpretability and Transparency | Ethical and Privacy Considerations
3 | Predicting students' academic success using Machine Learning
Innovation in Methodology | Data Limitations | Data Quality and Accessibility
Practical Implications | Overemphasis on Technical Aspects | Interpretability of Models
Data Interpretation | Ethical Considerations | Ethical and Privacy Concerns
Contributions to the Field | Validation and Reproducibility | Integration into Educational Practices
4 | Supervised machine learning algorithms for predicting student dropout and academic success: a comparative study
Comprehensive Comparative Analysis | Lack of gaining adequate and quality data. | Comprehensive Comparative Analysis
Robust Methodology | Imbalanced dataset issue results to train models | Robust Methodology
Large and Representative Dataset | Poor feature engineering and interoperability | Large and Representative Dataset
Practical Implications and Actionable Insights | Inadequate consideration of relevant features | Practical Implications and Actionable Insights
5 | Machine learning model and ensemble algorithm for prediction of students' retention and graduation rate
Integration of Diverse Data Sources | Lack of model evaluation in practical | Diversity Data Sources
Comprehensive Model Evaluation | There are no much factors/features addressed | Model Evaluation
Ensemble Learning Techniques | Small amount of dataset to train and test | Big Data Tools integration
Practical Relevance and Actionable Insights | Model selection is confused with features | Actionable Insights on model performance
4. SEMINAR PAPERS GAPS
Paper title: Supervised machine learning algorithms for predicting student dropout and
academic success: a comparative study
One potential gap could be related to the representativeness of the dataset used in the study. If the
dataset does not adequately represent the diversity of student populations or academic institutions,
it could limit the generalizability of the findings to broader contexts.
Paper title: Supervised machine learning algorithms for predicting student dropout and
academic success: a comparative study
The critical gap of this seminar paper relates to the selection and evaluation of the supervised
machine learning algorithms themselves.
Paper title: Machine learning model and ensemble algorithm for prediction of students' retention
and graduation rate
This paper lacks an in-depth exploration of the ensemble techniques and other machine learning
algorithms used for prediction. If the study does not adequately explain the rationale for choosing
a particular ensemble method or fails to compare the performance of ensemble approaches with
individual models, this could represent a gap in the paper.
Common Gaps and Paper Relationships
I want to highlight some common gaps that are often found in these seminar papers related to big
data and machine learning algorithms for predicting student dropout and academic success.
Authors should be aware of common gaps found in seminar papers on predictive modeling for
student success. One of the main gaps is inadequate sample size and data quality, which can hinder
the accuracy of the models' predictions. Another important point is the interpretation of machine
learning models, especially complex ones, which often lack interpretability and explanation,
making it crucial to clarify how findings are interpreted. Additionally, thorough validation and
consideration of generalizability across different contexts or populations are essential, especially
in diverse educational settings.
5. CRITIQUE ON REVIEWED PAPERS
Based on the context provided in Table 1 and the evaluation in Table 2, it is evident that the
strengths of the models vary across the papers and are often closely linked to their specific
implications.
General critiques of these seminar papers include the lack of detailed discussion of the specific
limitations or drawbacks of the proposed systems and methods. While the papers highlight
the use of big data technology to improve efficiency and accuracy in predicting student academic
success, they do not dig deeply into the potential challenges or constraints faced during
implementation. Additionally, the papers could benefit from more in-depth analysis
and comparison of the results obtained before and after the use of big data technology, providing
a clearer understanding of the impact of this technology on the predictive capabilities of the system.
Furthermore, the papers could have provided more context on the practical implications of their
findings, such as how the predicted student success rates can be translated into actionable
interventions to support student academic achievement. All five evaluated papers develop and
analyze predictive models mostly in the higher education sector. They do not
consider the problems found at primary and secondary schools.
6. DISCUSSION AND RESULTS
The investigation of the five seminar papers covers various systems developed and analyzed to
predict student academic success or failure using various factors and data mining techniques.
These papers focus on predicting the student academic success rate, minimizing student dropout
from schools, and student academic retention. The investigations are based on students' personal
information, academic evaluation, student activities, environment, and attendance, and they
initially used big data analytics tools and machine learning algorithms such as LightGBM,
Random Forest, KNN, Decision Trees, Logistic Regression, and SVM, together with the CRISP-DM
process model. However, these methods were deemed insufficient due to the increasing number of
students and data diversity, leading to the implementation of Big Data technology for more
efficient processing.
Paper | Algorithm | Recall | Precision | F-Measure | Accuracy | Specificity
Paper 1 | SVM | 88 | 98 | - | 87 | -
Paper 2 | LGBMC | 88 | 98 | - | 88 | -
Paper 3 | SVM | 87.09 | 87.82 | 87.45 | 87.32 | 87.57
Paper 4 | - | - | - | - | - | -
Paper 5 | LGBMC | - | - | 0.86 | 0.909 | -
Table 3: The results obtained by all models based on the best accuracy algorithms
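As a quick consistency check on Table 3, the F-measure can be recomputed from precision and recall as their harmonic mean, F = 2PR / (P + R). The sketch below applies this to the Paper 3 row, the only row that reports all three values; it assumes the reported F-Measure is the standard balanced F1.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values reported for Paper 3 (SVM) in Table 3, in percent.
paper3_precision = 87.82
paper3_recall = 87.09

paper3_f = f_measure(paper3_precision, paper3_recall)
print(round(paper3_f, 2))  # 87.45, matching the reported F-Measure
```

The recomputed value agrees with the table, which suggests the Paper 3 figures are internally consistent. The same check cannot be run for the other rows because at least one of the three values is missing.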
The results were compared before and after the use of Big Data, showing significant improvements
in execution time and prediction accuracy. These performance measures alone are not enough,
because the choice of model also depends on the execution time. The following
graph shows the execution time for each model. I evaluated and analyzed the models
based on machine learning algorithm and execution time; the paper that gave the maximum
accuracy, the highest evaluation metrics, and a well-organized structure for all classification
algorithms is considered the best paper evaluated.
[Figure: Execution Time of Algorithms. Bar chart for SVM, KNN, and C4.5; vertical axis ranges from 0 to 0.45.]
In summary, the paper discusses the development of a system for predicting student success using
Big Data and machine learning algorithms. By considering multiple factors and leveraging
advanced technologies, the system was able to effectively predict student academic outcomes,
showcasing the potential for enhancing student performance through data-driven approaches in
education. The paper “Predicting student academic success using big data analytics and machine
learning algorithms” was judged the best of the reviewed papers in terms of model performance,
model execution time, and the factors influencing the system, after several evaluations.
In summary, the findings of these seminar papers can be combined into a machine learning
model and algorithm that employ clustering and classification techniques to train (70%) and test
(30%) the machine on data from higher education institutions. Big data analytics technology was
leveraged to optimize processing time without compromising algorithm efficiency. Data are stored
in HDFS, and classification algorithms are applied for the prediction of student academic success,
retention, graduation rate, and failures using MAPREDUCE. The results,
compared before and after the use of Educational Data Mining, showcase improved execution time
and a recognition rate of using the SVM algorithm
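The 70% training and 30% testing split mentioned above can be sketched as follows. The shuffling seed and the toy student records are assumptions for illustration; the reviewed papers do not state how their splits were drawn.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of `rows` and split it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    # round() avoids losing a row to floating-point error in the product.
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical student records.
students = [{"id": i, "gpa": 2.0 + (i % 5) * 0.4} for i in range(20)]

train, test = train_test_split(students)
print(len(train), len(test))
```

Shuffling before the cut keeps the test set from being biased by the order in which records were collected, for example by enrolment year.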
7. CONCLUSION AND FUTURE WORK
1. Conclusion
The paper has provided valuable knowledge that can revolutionize educational practices and
empower students to reach their full potential. The seminar presents a system developed to predict
student academic success by leveraging big data technology and machine learning algorithms. The
study focused on utilizing various factors, such as personal information, academic evaluation,
student activities, environment, and attendance, to predict student success. Traditional models such
as KNN, Decision Trees, Regression, and SVM, along with the CRISP-DM process, were initially
used but were found to be insufficient due to the increasing number of students and data diversity.
To address this issue, the adoption of Big Data technology enabled distributed processing, improving
efficiency without compromising accuracy.
The reviewed paper demonstrates the importance of big data in education, the models used for
predicting student success, and the significance of monitoring and enhancing student performance.
Through the utilization of big data technologies, the system achieved a high recognition rate in
predicting student success, highlighting the effectiveness of the approach in predicting student
outcomes. Ultimately, the aim of these papers is to pave the way for a generation of successful
students who are equipped to excel in the challenges of the future.
2. Future Works
There are several areas for potential development and enhancement in the prediction of student
academic success using big data and machine learning algorithms. Future studies could
explore integrating additional data sources such as social media interactions, extracurricular
activities, and psychological factors to enhance the prediction model's accuracy and robustness.
Researchers could further investigate the implementation of more advanced machine learning
algorithms, such as deep learning algorithms or ensemble methods, to improve prediction accuracy.
There is also the possibility of developing a system that provides real-time predictions of student
academic success to enable immediate interventions and support for at-risk students.
Finally, future studies could conduct in-depth analysis to track the progress and changes in
student performance over time, providing insights into the effectiveness of interventions and
support strategies. By exploring these avenues for future research, the field of predicting student
academic success using big data and machine learning algorithms can continue to evolve and
contribute to improving educational outcomes for students throughout their educational and
extracurricular activities.
8. REFERENCES
[1] "The Future of Big Data and Analytics in K-12 Education," Education Week, big-data-and-analytics.
[2] "World Bank, National Education Profile," 2018.
[3] S. K. M. Hussain, "Student-Performulator," Predicting Students’ Academic Performance at
Secondary and Intermediate Level Using Machine Learning, 2021.
[4] J. W. H. P. R. W. Xing Xu, "Prediction of academic performance associated with internet usage
behaviors using machine learning algorithms," Computers in Human Behavior, 98 , p. 166–173,
2019.
[5] A. B. Zorić, "Predicting Students’ Academic Performance Based on Enrolment Data," International
Journal of Innovation and Economic Development, vol. 6, pp. 54-61, 2020.
[6] A. L. Samuel, "Some studies in machine learning using the game of checkers," IBM Journal of
Research and Development, vol. 3, p. 210–229, 1959.
[7] T. Mitchell, Machine Learning, McGraw-Hill Science/Engineering/Math, p. 432, 1997.
[8] S. B. Kotsiantis, "Use of machine learning techniques for educational proposes: a decision support
system for forecasting students’ grades.," Artificial Intelligence Review 37, pp. 331 - 344, 2012.
[9] S. B. C. J. P. a. P. E. P. Kotsiantis, "Preventing student dropout in distance learning using machine
learning techniques.," International conference on knowledge-based and intelligent information and
engineering systems. Springer, Berlin, Heidelberg, 2003.
[10] B. Z. M. Ahmed Mueen, "Modeling and Predicting Students' Academic Performance Using Data
Mining Techniques.," International Journal of Modern Education and Computer Science, vol. 11,
pp. 36-42 , 2016.
[11] L. S. Chung JY, "Dropout early warning systems for high school students using machine learning,"
Child Youth Serv Rev, vol. 3, p. 346–53, 2019.
[12] K. S. P. C. V. V. Gkontzis AF, "A predictive analytics framework as a countermeasure for attrition
of students," Interact Learn Environment, vol. 3, p. 1028–43, 2022.
[13] S. K. G. S. O. S. B. J. Berens J, "Early detection of students at risk–predicting student dropouts
using administrative student data and machine learning methods," SSRN J, 2018.
[14] T. D. M. J. B. L. R. V. Martins MV, "Early prediction of student’s performance in higher education:
a case study," Trends and applications in information systems and technologies, vol. 9, p. 166–75,
2021.
[15] A. H. M. M. A. H. R. A. A. M. N. A. M. S. &. S. T. Ali, "Big data classification based on improved
parallel k-nearest neighbor," TELKOMNIKA (Telecommunication Computing Electronics and
Control), vol. 21, p. 235–246, 2023.
[16] S. C. G. T. K. &. M. K. Amirtharaj, "A systematic approach for assessment of attainment in
outcome-based education.," Higher Education for the Future, 9(1), p. 8–29, 2021.
[17] S. M. Z. E. A. R. A. H. A. B. S. &. A.-N. S. S. Arqawi, "Predicting university student retention
using artificial intelligence," International Journal of Advanced Computer Science and Applications,
vol. 13, no. 9, p. 315–324, 2022.
[18] H. S. A. W. A. N. &. H. S.-U. Brdesee, "Predictive model using a machine learning approach for
enhancing the retention rate of students at-risk," International Journal on Semantic Web and
Information Systems , vol. 18(1), p. 1–21, 2022.