Dissertation Part-I
Name: Kamalpreet Kaur
Roll No.: 2018CSB2015
Guide: Prof. Kiranbir Kaur
Index
Introduction
Base Paper
Literature survey
Comparison Table
Research Gap
Problem Definition
Methodology and Algorithm
Conclusion
References
Introduction
Monitoring student performance is critical for indicating which class a student should select
for better performance. Several techniques play a part in selecting the features that must
be considered when choosing a student's class. These techniques are part of data mining.
Data mining is a filtering mechanism for selecting the required data from a large dataset.
The mining approach for student class selection and prediction comprises the following
phases:
Integrity Check
Feature selection
Classification
1. Integrity Check
This phase checks whether the data is valid. Integrity constraints are used to determine
whether data is valid or not, and they fall into several categories.
Table 2 is the child table (fees) and Table 3 is the parent table (student). The rollno in the fees table
must match a rollno in the student table, or it must be null. This is known as the referential integrity
constraint.
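The referential integrity rule described here can be illustrated with a minimal Python sketch. The function name `check_referential_integrity`, the field names other than `rollno`, and the sample rows are assumptions made for illustration only; they do not come from the dataset used in this work.

```python
def check_referential_integrity(child_rows, parent_rows, key="rollno"):
    """Return child rows whose key is neither null nor present in the parent table."""
    parent_keys = {row[key] for row in parent_rows}
    return [
        row for row in child_rows
        if row[key] is not None and row[key] not in parent_keys
    ]


# Hypothetical sample rows (not taken from the real dataset).
student = [{"rollno": 1, "name": "A"}, {"rollno": 2, "name": "B"}]
fees = [
    {"rollno": 1, "amount": 500},     # matches a student row
    {"rollno": None, "amount": 300},  # null is allowed by the constraint
    {"rollno": 7, "amount": 450},     # no such student: violation
]

violations = check_referential_integrity(fees, student)
print(violations)
```

Only the row with rollno 7 is reported, because a null rollno is permitted and rollno 1 exists in the student table.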
2. Feature Selection
Feature selection is the mechanism of determining which features of the dataset to use. Individual
fields are extracted from the dataset, and feature extraction can be driven by an optimization
procedure such as a genetic approach, ant colony optimization, or particle swarm optimization.
A statistical approach can be used to select the features required for student performance prediction.
The features that can be used include mean, median, mode, kurtosis, regression, and correlation.
All of these features are extracted and selected for student performance prediction.
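As a rough illustration of the statistical features named above, the following Python sketch computes mean, median, mode, kurtosis, and Pearson correlation using only the standard library. The helper names (`kurtosis`, `pearson`, `summarize`) and the sample marks are assumptions for illustration; kurtosis here is the plain fourth standardized moment, which is one of several common definitions.

```python
import statistics


def kurtosis(values):
    """Fourth standardized moment (population form, not excess kurtosis)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def summarize(marks):
    """Collect the statistical features of one numeric field."""
    return {
        "mean": statistics.mean(marks),
        "median": statistics.median(marks),
        "mode": statistics.mode(marks),
        "kurtosis": kurtosis(marks),
    }


# Hypothetical marks for five students.
marks = [60, 70, 70, 80, 90]
features = summarize(marks)
print(features)
```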
3. Classification
This is the last phase in the student performance prediction process. The combined contribution of the
different features gives the class in which the result must lie. Class prediction often suffers from a
deviation that can be larger or smaller depending on the received value. The outcome of classification
is expressed as a confusion matrix. The critical parameter for observation is classification accuracy.
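A minimal Python sketch of how a confusion matrix and classification accuracy relate is given here. The class labels and the actual and predicted values are made-up examples, not results from this work.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels and predictions for five students.
labels = ["pass", "fail"]
actual = ["pass", "pass", "fail", "fail", "pass"]
predicted = ["pass", "fail", "fail", "fail", "pass"]

cm = confusion_matrix(actual, predicted, labels)
print(cm, accuracy(cm))
```

In this example one "pass" student is misclassified as "fail", so four of the five predictions lie on the diagonal.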
This work uses an associative learning based approach for student performance prediction. The
overall goal is to increase classification accuracy. The next section discusses existing literature
in this area.
Base Paper
(Al-Sudani 2019) proposed a neural network based approach for predicting student performance. A large dataset comprising 491
students is used. A feed-forward network is used to handle uncertainties present within the dataset. The classification
accuracy achieved is 83-85%, depending on the uncertainties present within the dataset. The result is compared against k-nearest
neighbour and Naïve Bayes approaches for validation. Although the prediction model is effective, it does not generate
recommendations for weak students that could resolve the retention problem.
Literature survey
(Khder 2018) proposed a classification model for university students. The dataset used in this work is real-time and
considerably large. The prediction tool is a mining-based random forest approach. The predictor achieves classification
accuracy of up to 89%. The problem of high execution time can be addressed in future work.
(Bekele and Menzel 2017) proposed a Bayesian network based student performance system. A case study of Ethiopian students is
presented using this approach. A Bayesian network is used as the tool for predicting student performance, but the dataset
is small in scale and no confusion matrix is formed. In future work, a confusion matrix could be incorporated into the
Bayesian network for student performance prediction.
(Eashwar and Venkatesan 2017) proposed student performance evaluation using SVM. The SVM based approach is based on
forming hyperplanes. Segmentation is used to divide the data into critical and non-critical segments, and k-means
segmentation is used for classification. The classification accuracy is 85%, which can be further improved by adding
missing data handling mechanisms to this approach.
(Zaffar and Hashmani 2017) discussed a big data approach for student performance monitoring through feature selection. Data about
students is gathered both in real time and through an offline dataset formation mechanism to form a synthetic dataset. Categorization
is applied for fast analysis of data. A parallel data analysis approach is used to analyze the distinct categories, also known as
clusters. Execution time is reduced, but reliability is at stake since no accuracy optimization mechanism is applied.
(Singh et al. 2016) proposed a data mining approach for predicting student performance. The mechanism employs a technique
to judge the ranking of universities so that a student can take admission in the best possible institution. Analysis is made using
supervised learning and a rapid mining tool. A noise handling mechanism is employed to handle noisy data.
(Muthukrishnan 2018) presented a survey of techniques for student performance prediction. These techniques include
mechanisms to handle prediction problems caused by missing data. Big data formation in education is yet
to be explored, but this paper studied distinct papers to explore the tools and techniques desirable for prediction. This literature,
however, does not present mechanisms for enhancing prediction accuracy.
(Dogan and Diri 2016) discussed mechanisms to handle predictions in MOOC analysis. The importance of the learner,
teacher, manager and policy makers in MOOC analysis is judged, and the result is presented in the form of classification accuracy.
It is a review paper, and no new mechanism is suggested to improve classification accuracy.
Comparison Table
Author and Reference | Technique | Parameters | Results | Shortcoming
Al-Sudani 2019 | Neural Network | Classification accuracy, specificity, sensitivity | 83% to 85% accuracy | No recommendation mechanism suggested to tackle retention problem
Research Gap
Predicting student performance using learning analytics is the need of the hour. It is required because
of uncertainty in course selection on the student's side. This uncertainty in course prediction can be
overcome with the help of technology. The research gap is listed below.
There are three phases associated with the prediction of student performance. The first phase
includes gathering data and handling noise. The existing system employs no format handling
mechanism, which means that if the format is not correct, prediction accuracy will be low.
In the second phase, the data is fed to a neural network based approach. The neural network
based approach uses supervised learning, but if the training and testing data do not match,
then not all instances of the data can be classified. In the third phase, classification is
performed using a hyperplane strategy, and weights are adjusted when a feature vector from
the test data does not match the training data. Feature vector formation for training can be
slow when the size of the dataset increases.
Methodology & Algorithm
This work proposes a three-phase approach to predict student performance along with
recommendations. In the first phase, a format handling mechanism based on normalization can be
employed. This phase identifies the initial percentage of each student and eliminates noisy
data. The mechanism uses the concept of domain and range for each specified field within the
dataset. Data not falling within the domain and range is identified as noisy data. Using the
student ID, each missing value is identified and replaced with the average value of the
corresponding field.
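The first phase can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `clean_field`, the field name `percentage`, the 0 to 100 range, and the sample records are hypothetical, and the sketch additionally assumes that out-of-range (noisy) values are replaced by the field average in the same way as missing values, which the text does not state explicitly.

```python
def clean_field(records, field, low, high):
    """Flag out-of-range values as noisy and fill gaps with the field average.

    records maps a student ID to a dict of field values.
    Assumption: noisy values are replaced by the average, like missing ones.
    """
    noisy_ids, missing_ids, valid = [], [], []
    for student_id, row in records.items():
        value = row.get(field)
        if value is None:
            missing_ids.append(student_id)
        elif not (low <= value <= high):
            noisy_ids.append(student_id)
        else:
            valid.append(value)
    if not valid:
        raise ValueError("no valid values available to compute an average")

    average = sum(valid) / len(valid)
    cleaned = {}
    for student_id, row in records.items():
        fixed = dict(row)
        if student_id in missing_ids or student_id in noisy_ids:
            fixed[field] = average
        cleaned[student_id] = fixed
    return cleaned, noisy_ids, missing_ids


# Hypothetical records keyed by student ID.
records = {
    "S1": {"percentage": 72.0},
    "S2": {"percentage": None},   # missing value
    "S3": {"percentage": 140.0},  # outside the 0-100 range: noisy
    "S4": {"percentage": 68.0},
}
cleaned, noisy, missing = clean_field(records, "percentage", 0.0, 100.0)
print(cleaned, noisy, missing)
```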
In the second phase, data from the pre-processing phase is fed into the neural network layers.
An associative learning based approach is used. In this approach, the learning of the previous
neuron is fed into the next neuron, and hence learning is less time consuming than with a
supervised learning mechanism. Once feature vectors are formed, the test data is compared
against the feature vectors formed from the training data.
In the third phase, the result is checked against the formed classes. If no class satisfies the
result, then the weights are adjusted. This process continues until a classification result is
obtained. The methodology is listed below.
Algorithm for Student performance prediction
Real time dataset formation
Applying pre-processing mechanism using integrity check approach
Feeding data into neural network
Input layer accepts input from pre-processing phase
Training phase extracts feature vectors from the variables by applying associative learning procedure
Applying weight adjustment to make result fall within particular class
Perform classification using weight adjustment mechanism
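The weight adjustment step in the algorithm above can be illustrated with a small Python sketch. It uses the classic single-layer perceptron update rule as a stand-in; this is an assumption made for illustration and is not the associative learning network proposed in this work. The two input features (for example, normalized attendance and marks), the learning rate, and the sample data are all hypothetical.

```python
def predict(weights, bias, features):
    """Return class 1 if the weighted sum reaches the threshold, else class 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation >= 0 else 0


def train_perceptron(samples, targets, learning_rate=0.1, max_epochs=100):
    """Adjust weights until every sample falls in its class or epochs run out."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for features, target in zip(samples, targets):
            delta = target - predict(weights, bias, features)
            if delta != 0:
                # Misclassified sample: move the decision boundary towards it.
                weights = [w + learning_rate * delta * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * delta
                errors += 1
        if errors == 0:
            break  # every sample now falls within its class
    return weights, bias


# Hypothetical normalized features; class 0 = weak, class 1 = good.
samples = [[0.2, 0.3], [0.4, 0.1], [0.8, 0.9], [0.9, 0.7]]
targets = [0, 0, 1, 1]

weights, bias = train_perceptron(samples, targets)
print([predict(weights, bias, s) for s in samples])
```

The loop stops as soon as one full pass produces no misclassification, which mirrors the idea of adjusting weights until a classification result is obtained. Convergence of this simple rule is only guaranteed when the classes are linearly separable.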
Dataset formation using real time approach
Conclusion
The proposed system modifies the existing system in three places. In the first phase, a
pre-processing mechanism is employed using a format setting strategy. Once missing data is
handled, classification accuracy at the final stage increases. In the second phase, a neural
network based approach using associative learning is applied. In the classification phase, a
weight adjustment approach is used.
This entire procedure can be implemented in MATLAB 2018b. The results include a
confusion matrix with accuracy, sensitivity, specificity and F-Score.
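For reference, the four reported measures can be derived from a two-class confusion matrix as in the following Python sketch. The function name `binary_metrics` and the counts passed to it are hypothetical and are not results of this work; the sketch assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive accuracy, sensitivity, specificity and F-score from the counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Hypothetical counts for 100 students.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(metrics)
```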
References
Al-Sudani S (2019) Predicting students' final degree classification using an extended profile. Springer 2357–2369
Dogan G, Diri B (2016) An Overview of Studies About Students' Performance Analysis and Learning Analytics in MOOCs. IEEE Access
Eashwar KB, Venkatesan R (2017) Student Performance Prediction Using SVM. IJMET 8:649–662
Khder M (2018) A Classification and Prediction Model for Student's Performance in University. Res Gate. doi: 10.3844/jcssp.2017.228.233
Muthukrishnan SM (2018) Big Data Framework for Students' Academic Performance Prediction: A Systematic Literature Review. 2018 IEEE Symp Comput Appl Ind Electron 376–382
Singh I, Sabitha ASAI, Bansal A, Cse A (2016) Student Performance Analysis Using. IEEE Access 294–299
Zaffar M, Hashmani MA (2017) Performance Analysis of Feature Selection Algorithm for Educational Data Mining. IEEE Access 7–12