Student Performance Analysis Using Educational Data Mining
Abstract— The software industry hires students from engineering colleges who are good in communication and programming and who also perform well academically. Most engineering institutions therefore focus on student performance with respect to these factors, and engineering students have to improve their academic performance, programming skills and communication skills. To help such students, we designed a project which can predict a student's performance before the semester examinations are attempted and before the results are announced. Students can thus know their likely performance in advance and improve their skills by proper planning or by making changes in their plans. This helps students improve academically, which eventually leads to good performance in their end examinations. It also reduces stress, and thereby student suicide rates, and it can contribute to the country's development by providing good and efficient engineers. We apply the Naive Bayes classification algorithm and a Weighted Naive Bayes algorithm to a student data set collected from the IT department of LBRCE, Mylavaram, to build this model. Based on the results we can identify weak students and take remedial measures to improve their performance.

Keywords: Educational Data Mining, Classification, Prediction.

I. INTRODUCTION

The advent of information technology in various fields has led to the storage of large volumes of data in various formats such as records, files, documents, images, sound, videos, scientific data and many new data formats. The data collected from different applications require proper methods of extracting knowledge from large repositories for better decision making. Knowledge discovery in databases (KDD), often called data mining, aims at the discovery of useful information from large collections of data [1]. The main function of data mining is to apply various methods and algorithms in order to discover and extract patterns from stored data [2]. Data mining and knowledge discovery applications have received a rich focus due to their significance in decision making, and they have become an essential component in various organizations. Data mining techniques have been introduced into fields such as Statistics, Databases, Machine Learning, Pattern Recognition, Artificial Intelligence and computational methods.

There is increasing research interest in using data mining in education. This new emerging field, called Educational Data Mining, is concerned with developing methods that discover knowledge from data originating from educational environments [3]. Educational Data Mining uses many techniques such as Decision Trees, Neural Networks, Naive Bayes, K-Nearest Neighbor and many others. Using these techniques, many kinds of knowledge can be discovered, such as association rules, classifications and clusterings. The discovered knowledge can be used for prediction regarding the enrolment of students in a particular course, alienation from the traditional classroom teaching model, detection of unfair means used in online examinations, detection of abnormal values in the result sheets of students, prediction of student performance, and so on.

The main aim of this project is to improve student performance in studies based on some important factors. Education is an essential element for the betterment and progress of a country; it makes the people of a country civilized and well mannered. Nowadays, new methods are being developed to discover knowledge from educational databases in order to analyse students' trends and behaviour towards education, to analyse the data from different dimensions, to categorize it and to summarize the relationships. This motivated us to work on the analysis of student data sets. At present, data collection, categorization and classification are performed manually. The main disadvantage of this process is the delay in results: remedial measures are not taken in time because the analysis of student performance comes late, and the delay in announcing results leads to poor performance in the next examination due to a lack of planning in the students' preparation. As the number of students increases, analysing the performance of each student becomes more difficult. To overcome this difficulty we introduce educational data mining. When institutes store their students' details in the cloud, it is difficult to analyse such large data, often called big data. By applying data mining to the stored data, we can easily categorize and analyse the results of a student in a short time without any difficulty. Here we concentrate mainly on the students' internal marks, ability to concentrate, attendance, awareness of course outcomes, tutorials, semester marks, content perception and assignments.

II. DATA MINING DEFINITION AND TECHNIQUES

Data mining, also popularly known as Knowledge Discovery in Databases, refers to extracting or "mining" knowledge from large amounts of data. Data mining techniques are used to operate on large volumes of data to discover hidden patterns and relationships helpful in decision making. While data mining and knowledge discovery in databases are frequently treated as synonyms, data mining is actually part of the knowledge discovery process. The sequence of steps identified in extracting knowledge from data is shown in Figure 1.
FIG 1: KDD PROCESS

Various algorithms and techniques such as Classification, Clustering, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees, Genetic Algorithms and the Nearest Neighbor method are used for knowledge discovery from databases. These techniques and methods need a brief mention here for better understanding.

A. Classification

Classification is one of the most important techniques used in data mining. It is a two-step process: first, a classification model is built; second, the class label is predicted. It employs a set of pre-classified examples to develop a model that can classify the population of records at large, and it regularly uses decision tree or neural network-based classification algorithms. The data classification process involves learning and classification. In learning, the training data are analysed by the classification algorithm; in classification, test data are used to estimate the accuracy of the classification rules. If the accuracy is acceptable, the rules can be applied to new data tuples. The classifier-training algorithm uses the pre-classified examples to determine the set of parameters required for proper discrimination, and it then encodes these parameters into a model called a classifier.

B. Clustering

Clustering can be defined as the discovery of similar classes of objects. Using clustering techniques we can identify dense and sparse regions in the object space and discover the overall distribution pattern and correlations among data attributes. The classification approach can also be used as an effective means of distinguishing groups or classes of objects, but it becomes costly, so clustering can be used as a preprocessing step for attribute subset selection and classification.

C. Prediction

The regression technique can be adapted for prediction. Regression analysis can be used to model the relationship between one or more independent variables and a dependent variable. In data mining, the independent variables are attributes that are already known and the response variable is what we want to predict. Unfortunately, many real-world problems are not simply prediction, so more complex techniques (e.g., logistic regression, decision trees, or neural nets) may be necessary to forecast future values. The same model types can often be used for both regression and classification. For example, the CART (Classification and Regression Trees) decision tree algorithm can be used to build both classification trees (to classify categorical response variables) and regression trees (to forecast continuous response variables). Neural networks too can create both classification and regression models.
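As a concrete illustration of the two-step classification workflow and of CART producing both classification and regression trees, the following sketch uses scikit-learn's DecisionTreeClassifier and DecisionTreeRegressor; the student records, attribute choices and labels are hypothetical and only illustrate the idea, not the data set used in this paper.

# Sketch: CART-style trees for classification (pass/fail) and regression (marks).
# The toy records are hypothetical, not drawn from the data set used in this paper.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Each row: [internal marks, attendance %, assignments submitted (1/0)]
X_train = [[25, 90, 1], [12, 55, 0], [28, 95, 1], [15, 60, 1], [8, 40, 0], [22, 80, 1]]
y_class = ["pass", "fail", "pass", "fail", "fail", "pass"]  # categorical response
y_marks = [68, 34, 75, 41, 22, 60]                          # continuous response

# Step 1: build (learn) the models from pre-classified training examples.
clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_class)  # classification tree
reg = DecisionTreeRegressor(max_depth=3).fit(X_train, y_marks)   # regression tree

# Step 2: predict the class label / forecast the value for a new student.
new_student = [[18, 70, 1]]
print(clf.predict(new_student))  # predicted label, e.g. ['pass']
print(reg.predict(new_student))  # forecast of the continuous response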
D. Association rule

Association and correlation analysis is usually used to find frequent itemsets among large data sets. This kind of finding helps businesses to make certain decisions, such as catalogue design, marketing and customer shopping behaviour analysis. Association rule algorithms need to be able to generate rules with confidence values less than one. However, the number of association rules for a given dataset is generally very large, and a high proportion of the rules are usually of little (if any) value.
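To make the support and confidence measures behind such rules concrete, the short sketch below evaluates one candidate rule over a handful of hypothetical student "transactions" (sets of observed behaviours); the items and values are illustrative only.

# Sketch: support and confidence of one candidate association rule over
# hypothetical "transactions" (sets of observed student behaviours).
transactions = [
    {"attends_tutorials", "submits_assignments", "passes_exam"},
    {"attends_tutorials", "passes_exam"},
    {"submits_assignments"},
    {"attends_tutorials", "submits_assignments", "passes_exam"},
    {"attends_tutorials", "submits_assignments"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

antecedent = {"attends_tutorials", "submits_assignments"}
consequent = {"passes_exam"}

rule_support = support(antecedent | consequent)
confidence = rule_support / support(antecedent)  # always <= 1
print(f"support = {rule_support:.2f}, confidence = {confidence:.2f}")
# In practice only rules exceeding user-defined support/confidence thresholds are
# kept, which prunes the very large number of low-value rules mentioned above.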
E. Neural networks

A neural network is a set of connected input/output units in which each connection has a weight associated with it. During the learning phase, the network learns by adjusting the weights so as to be able to predict the correct class labels of the input tuples. Neural networks have the remarkable ability to derive meaning from complicated or imprecise data and can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. They are well suited for continuous-valued inputs and outputs. Neural networks are best at identifying patterns or trends in data and are well suited for prediction or forecasting needs.
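As a sketch of the weight-adjusting learning phase described above, the following code fits a small multilayer perceptron with scikit-learn; the continuous-valued student attributes and labels shown are hypothetical.

# Sketch: a small neural network whose connection weights are adjusted during
# training so that it can predict class labels for new input tuples.
from sklearn.neural_network import MLPClassifier

# Hypothetical continuous-valued inputs: [internal marks %, attendance %, perception score]
X_train = [[85, 92, 0.8], [40, 55, 0.3], [78, 88, 0.7], [35, 45, 0.2], [65, 75, 0.6], [50, 60, 0.4]]
y_train = ["strong", "weak", "strong", "weak", "strong", "weak"]

# Training iteratively adjusts the connection weights (backpropagation).
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=1)
net.fit(X_train, y_train)

print(net.predict([[70, 80, 0.5]]))  # predicted class label for a new tuple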
F. Decision Trees

A decision tree is a tree-shaped structure that represents sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).

G. Nearest Neighbor Method

This technique classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k is greater than or equal to 1). It is sometimes called the k-nearest neighbor technique.
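The nearest-neighbor idea can be written directly in a few lines: for a new record, find the k most similar historical records and take a majority vote over their classes. The records below are hypothetical.

# Sketch: k-nearest-neighbor classification by Euclidean distance and majority vote.
from collections import Counter
from math import dist

# Historical records: (attribute vector, class label) -- hypothetical values.
history = [
    ((25, 90), "pass"), ((12, 55), "fail"), ((28, 95), "pass"),
    ((15, 60), "fail"), ((22, 80), "pass"), ((9, 45), "fail"),
]

def knn_predict(x, k=3):
    """Return the majority class among the k historical records nearest to x."""
    neighbors = sorted(history, key=lambda rec: dist(rec[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

print(knn_predict((18, 70), k=3))  # class of the new record by majority vote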
Content perception:
By knowing about the content perception of a student, the teacher can help the student to understand the subject further. Content perception also allows us to assess whether or not the student listens in class.

FIG 2: BLOCK DIAGRAM
P(Ci) = n(Ci)/m, where i = 1, 2, ..., m.

Step-3: Posterior probabilities
P(Ci | X) = [P(X | Ci) · P(Ci)] / P(X).

Step-4: Calculating the class-conditional probability
P(X | Ci) = ∏ (k = 1 to n) P(Xk | Ci).

Step-5: In order to predict the class label of X, P(X | Ci) · P(Ci) is evaluated for each class Ci. X is assigned to the class Ci for which
P(X | Ci) · P(Ci) > P(X | Cj) · P(Cj) for 1 ≤ j ≤ m, j ≠ i.

The table below contains the Boolean-valued attribute weights on a 0-1 scale. These weights are added into the weighted Naive Bayes algorithm so as to obtain more accurate results than those of the plain Naive Bayes classifier. Boolean-valued attributes are simply attributes with binary values such as yes or no, true or false.

Boolean value attribute weights on a 0-1 scale

               Awareness of COs   Assignments     Tutorials
               Yes      No        Yes     No      Yes     No
Professor 1    0.18     0.0       0.22    0.0     0.22    0.0
Professor 2    0.20     0.0       0.18    0.0     0.18    0.0
Professor 3    0.22     0.0       0.20    0.0     0.20    0.0
Average        0.20     0.0       0.20    0.0     0.20    0.0
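The scoring procedure in the steps above translates directly into code. The sketch below evaluates P(X | Ci) · P(Ci) for each class and, to show one common way per-attribute weights such as those in the table can enter the computation, raises each likelihood term to its attribute weight; this particular weighting scheme, and all probabilities shown, are illustrative assumptions rather than the exact formulation used to produce our results.

# Sketch: Naive Bayes scoring with optional per-attribute weights.
# The weighting scheme (likelihood raised to the attribute weight) is one common
# choice, shown only to illustrate the idea; the probabilities are hypothetical.

priors = {"pass": 0.6, "fail": 0.4}  # P(Ci), estimated from class frequencies

# P(Xk | Ci) for the attribute values observed for one student X.
likelihoods = {
    "pass": {"awareness_of_COs=yes": 0.80, "assignments=yes": 0.90, "tutorials=yes": 0.85},
    "fail": {"awareness_of_COs=yes": 0.30, "assignments=yes": 0.40, "tutorials=yes": 0.35},
}

# Average attribute weights for the "yes" values (cf. the table above).
weights = {"awareness_of_COs=yes": 0.20, "assignments=yes": 0.20, "tutorials=yes": 0.20}

def score(ci, weighted=True):
    """Compute P(X|Ci) * P(Ci); if weighted, raise each P(Xk|Ci) to its weight."""
    s = priors[ci]
    for attr, p in likelihoods[ci].items():
        s *= p ** weights[attr] if weighted else p
    return s

# Step-5: assign X to the class with the largest score.
scores = {ci: score(ci) for ci in priors}
print(scores, "->", max(scores, key=scores.get))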
Multi-value attribute weights on a 0-1 scale

               Basics              Ability to concentrate   Content perception
                                   in the class
               S     Avg   W       S     Avg   W             S     Avg   W
Professor 1    0.50  0.18  0.0     0.60  0.32  0.0           0.75  0.50  0.0
Professor 2    0.45  0.22  0.0     0.55  0.28  0.0           0.85  0.48  0.0
Professor 3    0.65  0.20  0.0     0.65  0.30  0.0           0.80  0.52  0.0
Average        0.50  0.20  0.0     0.60  0.30  0.0           0.80  0.50  0.0

S: Strong, Avg: Average, W: Weak
TABLE 4: MULTI VALUE ATTRIBUTE WEIGHTS

Multi-value attribute weights

               A      B      C      D
Professor 1    0.90   0.72   0.50   0.0
Professor 2    0.88   0.68   0.8    0.0
Professor 3    0.92   0.70   0.52   0.0
Average        0.90   0.70   0.50   0.0
TABLE 5: MULTI VALUE ATTRIBUTE WEIGHTS

Fig 3: Data set of 28 students

                      PREDICTED
                      Negative   Positive
ACTUAL   NEGATIVE     a          b
         POSITIVE     c          d

The entries in the confusion matrix have the following meaning in the context of our study:
a is the number of correct predictions that an instance is negative,
b is the number of incorrect predictions that an instance is positive,
c is the number of incorrect predictions that an instance is negative, and
d is the number of correct predictions that an instance is positive.

Several standard terms have been defined for the two-class matrix:
The accuracy (AC) is the proportion of the total number of predictions that were correct. It is determined using the equation
AC = (a + d) / (a + b + c + d).
The recall or true positive rate (TP) is the proportion of positive cases that were correctly classified. It is determined using the equation
TP = d / (c + d).
The false positive rate (FP) is the proportion of negative cases that were incorrectly classified as positive, as calculated using the formula
FP = b / (a + b).
The true negative rate (TN) is defined as the proportion of negative cases that were classified correctly, as calculated using the equation
TN = a / (a + b).
The false negative rate (FN) is the proportion of positive cases that were incorrectly classified as negative, as calculated using the equation
FN = c / (c + d).
Finally, the precision (P) is the proportion of the predicted positive cases that were correct, as calculated using the equation
P = d / (b + d).
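These measures follow directly from the four confusion-matrix counts, as in the sketch below; the counts used are illustrative and are not results from our data set.

# Sketch: deriving the standard measures from a two-class confusion matrix.
# The counts a, b, c, d follow the matrix layout above; values are illustrative only.
a, b = 10, 2   # actual negative instances: correctly / incorrectly classified
c, d = 3, 13   # actual positive instances: incorrectly / correctly classified

accuracy            = (a + d) / (a + b + c + d)   # AC
true_positive_rate  = d / (c + d)                 # TP (recall)
false_positive_rate = b / (a + b)                 # FP
true_negative_rate  = a / (a + b)                 # TN
false_negative_rate = c / (c + d)                 # FN
precision           = d / (b + d)                 # P

print(f"AC={accuracy:.2f} TP={true_positive_rate:.2f} FP={false_positive_rate:.2f} "
      f"TN={true_negative_rate:.2f} FN={false_negative_rate:.2f} P={precision:.2f}")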
REFERENCES

[2] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, "From Data Mining to Knowledge Discovery in Databases," AAAI Press / The MIT Press, Massachusetts Institute of Technology, ISBN 0-262-56097-6, 1996.
[3] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann, 2000.
[4] U. K. Pandey and S. Pal, "Data Mining: A prediction of performer or underperformer using classification," (IJCSIT) International Journal of Computer Science and Information Technology, Vol. 2(2), pp. 686-690, ISSN: 0975-9646, 2011.
[6] Alaa el-Halees, "Mining students data to analyze e-Learning behavior: A Case Study," 2009.
[14] J. R. Quinlan, "Induction of decision trees," Machine Learning, 1, pp. 81-106, 1986.
[15] S. Vashishta, "Efficient Retrieval of Text for Biomedical Domain using Data Mining Algorithm," IJACSA - International Journal of Advanced Computer Science and Applications, 2(4), pp. 77-80, 2011.
[16] V. Kumar, "An Empirical Study of the Applications of Data Mining Techniques in Higher Education," IJACSA - International Journal of Advanced Computer Science and Applications, 2(3), pp. 80-84, 2011. Retrieved from http://ijacsa.thesai.org.