2015-Student Performance Prediction Using Machine Learning

This document summarizes a research paper that used machine learning techniques to predict student performance. Specifically: 1) The researchers reviewed student attributes such as prior grades, living location, and medium of instruction, and used neural networks to predict performance. 2) They found that neural networks tended to outperform Bayesian classification on this continuous data. 3) An application was created that reads student data and uses a neural network to predict marks; in the reported experiments, predictions were accurate about 70% of the time.

International Journal of Engineering Research & Technology (IJERT)

ISSN: 2278-0181
Vol. 4 Issue 03, March-2015

Student Performance Prediction using Machine Learning
Havan Agrawal, Harshil Mavani
Department of Information Technology
K. J. Somaiya College of Engineering
Mumbai, India

Abstract - In this paper, a model is proposed to predict the performance of students in an academic organization. The algorithm employed is a machine learning technique called Neural Networks. Further, the importance of several different attributes, or "features", is considered, in order to determine which of these are correlated with student performance. Finally, the results of an experiment follow, showcasing the power of machine learning in such an application.

Keywords - Artificial intelligence, machine learning, student performance, neural networks

I. INTRODUCTION
Many studies in the learning field have investigated ways of applying machine learning techniques for various educational purposes. One of the focuses of these studies is to identify high-risk students, as well as to identify features which affect the performance of students.

The study conducted by Kotsiantis et al [1] is one of the initial studies which investigated the application of machine learning techniques in distance learning for dropout prediction. The most significant contribution of this study was that it was a pioneer and carved the path for several such studies. While machine learning algorithms had previously been implemented in several settings, this was perhaps the first time that these techniques were applied to an academic environment.

Bhardwaj and Pal [2] conducted a study in Faizabad, India to determine the factors that most heavily affected student performance. They used Bayesian Classification for their study.

The study by Erkan Er [3] was based upon Kotsiantis' as well as other similar studies. It concluded that Naive Bayes indeed performed better than any other machine learning algorithm. However, the crucial contribution of this study was that time-invariant features may be detrimental to the machine learning process, and hence are better left out of the study entirely. He also concluded that "Instead of demographic characteristics of students, using initial attendance and homework grades produces better prediction rate at earlier stages."

II. BACKGROUND AND RELATED WORK
A. Algorithm
Classification is one of the problems most frequently studied by data mining and machine learning (ML) researchers. It consists of predicting the value of a (categorical) attribute (the class) based on the values of other attributes (the predicting attributes). There are different classification methods.

Bayesian classification is an algorithm that is based on Bayes rule of conditional probability. Bayes rule is a technique to estimate the likelihood of a property given the set of data as evidence or input. Bayes rule, or Bayes theorem, is:

$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}$$

where H is the property being estimated and E is the data observed as evidence.

A more recent development in classification is that of artificial neural networks. These networks are modeled after the human neural system (hence the name), and have proven to be as powerful as any other algorithm, if not more so. While implementations may be complex, these networks are capable of understanding non-linear patterns in data.

A detailed description of the algorithms can be found in [4].

Kotsiantis et al [1] compared five algorithms, viz. Decision Trees (C4.5), Naive Bayes algorithm (Bayesian networks), 3-NN (kNN), RIPPER (Rule Learning) and WINNOW (Perceptron based neural networks). This study was composed of two experimental stages, training and testing. During these stages, the number of attributes was increased step-by-step. For example, while only demographic data was included in the first step, performance attributes were added in the next step. The five algorithms were tested at each of these steps and then compared. This comparative study helped in narrowing down candidates for our own application.

However, classification of data into binary groups seems insufficient. The primary goal of this study was only detecting at-risk students instead of determining performance levels of students. Classifying students according to their performances in different levels (e.g. poor, average, good, excellent, etc.) might be more useful. In this way, instructors can provide more adaptive feedback for each student.
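The Bayes rule that underlies Bayesian classification can be illustrated with a small worked example. The sketch below is only an illustration: the function name `posterior` and all the probabilities are hypothetical values chosen for the example, not figures from any of the cited studies. It estimates the probability that a student is at risk (the property) given that low attendance was observed (the evidence).

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) using Bayes rule."""
    # Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs (assumptions for illustration only):
p_at_risk = 0.20                 # P(H): share of students who are at risk
p_low_att_given_at_risk = 0.70   # P(E | H): low attendance among at-risk students
p_low_att_given_not = 0.10       # P(E | not H): low attendance among the rest

p = posterior(p_at_risk, p_low_att_given_at_risk, p_low_att_given_not)
print(f"P(at-risk | low attendance) = {p:.3f}")  # prints 0.636
```

With these assumed numbers, observing low attendance raises the estimated probability of being at risk from 0.20 to roughly 0.64. A Naive Bayes classifier applies the same rule to several attributes at once, under the assumption that the attributes are conditionally independent given the class.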

IJERTV4IS030127 www.ijert.org 111

(This work is licensed under a Creative Commons Attribution 4.0 International License.)

B. Features
Bhardwaj and Pal [2] conducted a study on student performance by selecting 300 students from 5 different degree colleges conducting the BCA (Bachelor of Computer Application) course of Dr. R. M. L. Awadh University, Faizabad, India. By means of the Bayesian classification method on 17 attributes, it was found that factors like students' grade in senior secondary exam, living location, medium of teaching, mother's qualification, students' other habits, family annual income and student's family status were highly correlated with student academic performance.

In the present study, those variables whose probability values were greater than 0.70 were given due consideration, and the highly influencing variables with high probability values are shown in Table 1. These features were used for prediction model construction. For both variable selection and prediction model construction, the publishers have used MATLAB.

From the table, it is found that the second high potential variable for students' performance is their living location, and the third high potential variable for students' performance is medium of teaching. In Uttar Pradesh the mother tongue of students is Hindi. Hence, students tend to be more comfortable in Hindi and other languages than in the English language.

C. Uniqueness
The study conducted by Erkan Er [3] proved valuable in confirming the uniqueness of the proposed application. His work concluded that all current applications of machine learning in an academic setting were to predict dropout rates in a distance learning program. There is perhaps no application that attempts to predict the absolute performance of the student. If one does exist, it has not been published yet.

D. Inference
We analyzed the experiments and results of the aforementioned studies, and two prominent inferences were drawn. The first is that Naive Bayes Classification proves to be an excellent algorithm for the application of predicting student performance in an academic setting. Further, a worthy contender for the same is neural networks. Secondly, several factors contribute to a student's performance, apart from previous academic performance.

Table 1: Study Results

Variable   Description                               Probability
GSS        Student's Grade in Secondary Education    0.8642
LLoc       Living Location                           0.7862
Med        Medium of Teaching                        0.7225

III. EXPERIMENTAL WORK
Initially, the existence of a linear relationship between a student's previous academic performance levels and future marks was considered. This relationship can be expressed accurately using Multivariate Linear Regression. Multivariate Linear Regression uses past semester marks of a student and marks scored by this student's senior batches to predict future marks of this student. Octave was used for test purposes. The marks of 80 B.E. I.T (Bachelor of Engineering, Information Technology) students from semester 3 to semester 6 were used. The algorithm is trained on a training set of 60 students, and tested on a cross-validation set of 10 students, to predict marks in 6 subjects. This is done 7 times, varying the training and test sets each time (k-fold cross validation). An error of plus or minus 8 marks was considered as accurate. The error statistics were as follows:

Average error = 6
Accurate = 296
Erroneous = 124
Accuracy Rate = 70.48%

Once it was confirmed that the data conforms well to a machine learning algorithm, we conducted a comparative study of neural networks and Bayesian classification, on the basis of varying training and test sets. The results were fairly surprising. In general, the neural networks tend to outperform Bayesian classification. This is somewhat justified once one realizes that the input provided to the algorithm was on a continuous range, and Bayesian classification traditionally requires discrete data.

Finally, an application was made that employed neural networks (Figure 1). The application provides to and fro access of data from .csv (Comma Separated Values) files. When a prediction is required, it dynamically trains a network of 3 layers, and provides a prediction of marks in discrete classes of 20 marks.

Figure 1: Screenshot of Application

The training dataset size was increased in increments of 10, starting from 40, for 17 subjects. The test set was of 10 students, to predict a single subject. The accuracy results are summarized in Table 2.

Table 2: Neural Network Accuracy

Training Dataset Size   Accuracy
40                      50 %
50                      50 %
60                      60 %
70                      70 %
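The evaluation protocol described in this section (rotating 10-student test folds, with a prediction counted as accurate when it falls within plus or minus 8 marks) can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the Octave code used in the paper: the function names `k_fold_indices` and `tolerance_accuracy` are hypothetical, and the folds are taken as contiguous blocks of 70 students, which is an assumption consistent with 60 training plus 10 test students per round. The reported counts fit this reading, since 7 folds x 10 students x 6 subjects = 420 = 296 + 124.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, equally sized test folds.

    If n_samples is not divisible by k, the leftover samples stay in every
    training set and are never tested.
    """
    fold_size = n_samples // k
    for i in range(k):
        test = list(range(i * fold_size, (i + 1) * fold_size))
        held_out = set(test)
        train = [j for j in range(n_samples) if j not in held_out]
        yield train, test


def tolerance_accuracy(predicted, actual, tol=8):
    """Count predictions within +/- tol marks of the actual marks.

    Returns (accurate, erroneous, accuracy_rate).
    """
    accurate = sum(abs(p - a) <= tol for p, a in zip(predicted, actual))
    erroneous = len(predicted) - accurate
    return accurate, erroneous, accurate / len(predicted)


# Assumed layout: 70 students rotated through 7 folds, 6 subjects predicted each.
folds = list(k_fold_indices(n_samples=70, k=7))
n_predictions = sum(len(test) for _, test in folds) * 6

# Accuracy rate recomputed from the counts reported in the paper.
reported_rate = 296 / (296 + 124)
print(n_predictions, f"{reported_rate:.2%}")  # prints: 420 70.48%
```

In use, any model (linear regression or a neural network) would be fitted on the `train` indices of each fold, its predictions for the `test` indices collected, and the pooled predictions passed to `tolerance_accuracy`.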


IV. CONCLUSION
Present studies show that the academic performances of students are primarily dependent on their past performances. Our investigation confirms that past performances do indeed have a significant influence over students' performance. Further, we confirmed that the performance of neural networks increases with an increase in dataset size.

Machine learning has come far from its nascent stages, and can prove to be a powerful tool in academia. In the future, applications similar to the one developed, as well as any improvements thereof, may become an integrated part of every academic institution.

ACKNOWLEDGMENTS
We thank Prof. Yogita Borse. Without her guidance, this paper could never have been accomplished.

REFERENCES
[1] S. Kotsiantis, C. Pierrakeas, and P. Pintelas, "Preventing student dropout in distance learning systems using machine learning techniques," AI Techniques in Web-Based Educational Systems at Seventh International Conference on Knowledge-Based Intelligent Information & Engineering Systems, pp. 3-5, September 2003.
[2] B.K. Bhardwaj and S. Pal, "Data Mining: A prediction for performance improvement using classification," International Journal of Computer Science and Information Security (IJCSIS), Vol. 9, No. 4, pp. 136-140, 2011.
[3] Erkan Er, "Identifying At-Risk Students Using Machine Learning Techniques," International Journal of Machine Learning and Computing, Vol. 2, No. 4, pp. August 2012.
[4] S. Kotsiantis, I.D. Zaharakis, and P. Pintelas, "Assessing Supervised Machine Learning Techniques for Predicting Student Learning Preferences"
