


The 3rd International Conference on Small & Medium Business 2016
January 19 - 21, 2016, Nikko Saigon Hotel, Hochiminh, Vietnam

FEATURE SELECTION TECHNIQUES TO ANALYSE STUDENT ACADEMIC PERFORMANCE USING NAÏVE BAYES CLASSIFIER

C. Anuradha¹, T. Velmurugan²
¹Research Scholar, Bharathiar University, Coimbatore, India.
²Associate Professor, PG and Research Dept. of Computer Science, D.G. Vaishnav College, Chennai-600106, India.
¹[email protected]; ²[email protected]

Abstract: Data mining provides educational institutions with the capability to explore, visualize and analyze large amounts of data in order to reveal valuable patterns in students' learning behaviors. Turning raw data into useful information and knowledge also enables educational institutions to improve teaching and learning practices, and to facilitate decision making in educational settings. Educational data mining is thus becoming increasingly important, with a specific focus on exploiting the abundant data generated by various educational systems to enhance teaching, learning and decision making. In EDM, feature selection chooses a subset of input variables by eliminating irrelevant features. Feature selection algorithms have proven effective in enhancing learning efficiency, increasing predictive accuracy and reducing the complexity of learned results. The primary objective of this research work is to investigate the most relevant subset of features for achieving high predictive accuracy, by adopting the Correlation-based Feature Subset evaluation and Gain-Ratio attribute evaluation feature selection techniques. For classification, the Naïve Bayes classifier is implemented using the WEKA tool. The outcome shows improved predictive accuracy with a minimum number of attributes. The results also reveal that the selected features influence the classification process of the student performance model.

Keywords: Educational Data Mining (EDM), Classification algorithm, Naïve Bayes Algorithm, Feature Selection, Prediction.

I. INTRODUCTION

Nowadays the field of data analytics and data mining (DM) is taking on a new role: that of an enabler for educational institutions to improve their key performance indicators. The importance of data analytics is growing, and a new sub-field of study is in its infancy. This young field is called Educational Data Mining (EDM), and its main purpose is to analyze educational data using a number of different techniques. EDM integrates approaches from database systems, data warehousing, statistics, machine learning and other areas. In this work, an experiment is conducted with educational data: it starts from a description of the state of the art of EDM and continues with the development of a method for exploring data and predicting trends that can improve educational outcomes or address current problems to increase organizational performance. Educational Data Mining is an emerging discipline concerned with developing methods for exploring the unique types of data that come from educational settings, and with using those methods to better understand students and the settings in which they learn.

Data mining is the extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data. Large amounts of data are stored in educational databases, so different data mining techniques have been developed and used to retrieve the required data and to find hidden relationships. A variety of popular data mining tasks exist within educational data mining, e.g. classification, clustering, outlier detection, association rules and prediction. Data mining can be used in educational systems for: predicting drop-out students; relating students' university entrance examination results to their later success; predicting students' academic performance; discovering strongly related subjects in undergraduate syllabi; knowledge discovery on academic achievement; classifying students' performance in a computer programming course according to learning style; and investigating similarities and differences between colleges and schools. EDM develops methods and applies techniques from statistics, machine learning and data mining to analyze data collected during teaching and learning. EDM tests learning theories and informs educational practice. As a result, researchers try to determine the variables that are related to students' academic achievement and may affect the registration process. One of the most important challenges that higher education faces is recognizing the pattern of loyal students.


Effective feature selection techniques are required to support efficient classification algorithms. This research work attempts to foretell students' academic failure by reviewing various feature selection algorithms in combination with the Naïve Bayes classifier. The work is structured as follows. Section 2 reviews research that has been conducted in EDM. Section 3 describes the methods and materials of the domain of study. The process of building the model, including data collection and the tools used, is given in Section 4. Section 5 presents the experiments and the results obtained. Finally, the conclusion is given in Section 6.

II. RELATED WORKS

This section discusses some of the research carried out by various researchers in the same field. Humera Shaziya et al. present an approach, based on a Naïve Bayes classifier, to predict the performance of students in semester exams. The objective is to know what grades students may obtain in their end-of-semester results, which helps the educational institute, teachers and students, i.e., all the stakeholders involved in an education system. Students and teachers can take the necessary actions to improve the results of those students whose predicted result is not satisfactory. A training dataset of students is used to build the Naïve Bayes model, which is then applied to test data to predict the end-of-semester results; a number of attributes are considered to predict the grade of a student [1].

Tajunisha and Anjali discuss predicting student performance using MapReduce. The authors introduce the MapReduce concept to improve accuracy and reduce time complexity. A deadline constraint is also introduced, and based on it an extensional MapReduce Task Scheduling algorithm for Deadline constraints (MTSD) is proposed. It allows the user to specify a deadline for a job (a classification process in data mining) and tries to finish the job before that deadline. The proposed system achieves higher classification accuracy even on big data and also reduces time complexity [2]. Another study, by Mashael and Muna, focuses on predicting students' final GPA using decision trees [3]. The authors applied the J48 decision tree algorithm to discover classification rules, extracted useful knowledge, and identified the most important courses in the students' study plan based on their grades in the mandatory courses. Karthikeyan and Thangaraju proposed a work in which genetic algorithm and particle swarm optimization search techniques with correlation-based feature selection are used for evaluation, and a naïve Bayes classifier is used for classification. Accuracy and time are the outcomes of the classification model, and various measures like sensitivity, specificity, precision and recall are also calculated [4]. Lumbini and Pravin [5] propose an experiment that attempts to detect students' failure in order to improve their academic performance. They apply different approaches to resolve the problem of high dimensionality, using classification algorithms on an engineering-students data set.

In Predictive Analytics Using Data Mining Technique [6], Hina Gulati presents data mining work on predicting student dropout. The author also applied some feature selection algorithms; the tool used for feature selection and mining is WEKA. Another work, by Jai and David, discusses the analysis of influencing factors in predicting students' performance using MLP, a comparative study [7]. That paper focuses on analyzing the prediction accuracy of academic performance from influencing factors using the Multi-Layer Perceptron algorithm and comparing it with the prediction accuracy of other approaches. Anal and Devadatta discuss the application of feature selection methods in educational data mining. Different feature selection algorithms are applied to their data set, and the best results are obtained by the Correlation-Based Feature Selection algorithm with 8 features; classification algorithms may then be applied to this feature subset for predicting student grades [8]. In another work, the same authors discuss early prediction of students' performance using machine learning techniques. A set of attributes is first defined, then feature selection algorithms are applied to the data set to reduce the number of features. Five classes of machine learning algorithms are then applied to this data set, and the best results were obtained with the decision-tree class of algorithms [9].

III. MATERIALS AND METHODS

A feature selection algorithm can be seen as the combination of a search technique for proposing new feature subsets with an evaluation measure that scores the different feature subsets. The simplest algorithm is to test each possible subset of features, finding the one that minimizes the error rate. The choice of evaluation metric heavily influences the algorithm, and it is these evaluation metrics that distinguish the three main categories of feature selection algorithms: wrappers, filters and embedded methods. Wrapper methods use a predictive model to score feature subsets. Filter methods use a proxy measure instead of the error rate to score a feature subset; this measure is chosen to be fast to compute while still capturing the usefulness of the feature set. Embedded


methods are a catch-all group of techniques which perform feature selection as part of the model construction process [10].

A. Correlation Based Feature Subset Selection
CFS is a correlation-based filter method [11]. It gives high scores to subsets that include features that are highly correlated with the class attribute but have low correlation with each other. Let S be an attribute subset with k attributes, let r_cf be the average correlation of the attributes to the class attribute, and let r_ff be the average intercorrelation between attributes. Then:

Merit_S = k * r_cf / sqrt(k + k(k-1) * r_ff)

B. Gain Ratio Attribute Evaluator
The Gain Ratio Attribute Evaluator is a simple individual attribute ranking mechanism. In this technique, each attribute is assigned a score defined as the attribute's information gain (the difference between the class entropy and the class entropy conditioned on the attribute) divided by the attribute's own entropy [12]:

GainR(Class, Attribute) = (H(Class) - H(Class | Attribute)) / H(Attribute)

Classification is a data mining task that predicts group membership for data instances. In this research work, classification techniques are used to predict the class of the graduate student and to see how the other attributes affect performance. The classifier used in this study is the Naïve Bayes algorithm.

C. Naïve Bayes
The Naïve Bayes classifier technique is suited to cases where the dimensionality of the input is high. It is a simple algorithm, yet it often gives better output than more complex ones. This classifier is used to predict student dropout by calculating the probability of each input for a predictable state [13].

IV. EXPERIMENTAL DATA

The dataset is a collection of first-year students' information covering 5 undergraduate degree courses, collected from SSBSTAS College, Thiruvalluvar University, Tamilnadu, for the period 2013-2014. The student data set contains 257 records with 21 attributes, which include gender, category of admission, living location, family size, family type, annual income of the family, father's qualification and mother's qualification. The attributes referring to the students' pre-college characteristics include the student's grade in high school and grade in senior secondary school. The attributes describing other college features include the student's branch of study, place of stay, previous semester mark, class test performance, seminar performance, assignment, general proficiency, class attendance and performance in laboratory work. Table 1 shows the description of the attributes.
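The CFS merit and gain-ratio scores defined in Section III reduce to short computations. A minimal plain-Python sketch (an illustration of the two formulas, not WEKA's implementation; the toy class/attribute columns and the correlation values are made up for the example):

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy H(X) of a list of symbols, in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def conditional_entropy(cls, attr):
    """H(Class | Attribute): class entropy within each attribute value, weighted."""
    n = len(cls)
    h = 0.0
    for v in set(attr):
        subset = [c for c, a in zip(cls, attr) if a == v]
        h += (len(subset) / n) * entropy(subset)
    return h

def gain_ratio(cls, attr):
    """GainR = (H(Class) - H(Class | Attr)) / H(Attr), as used by the evaluator."""
    return (entropy(cls) - conditional_entropy(cls, attr)) / entropy(attr)

def cfs_merit(k, r_cf, r_ff):
    """Merit_S = k * r_cf / sqrt(k + k(k-1) * r_ff) for a k-attribute subset."""
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Toy data: class = pass/fail, one candidate attribute = attendance level
cls = ["pass", "pass", "fail", "pass", "fail", "fail"]
att = ["good", "good", "poor", "good", "poor", "good"]
print(round(gain_ratio(cls, att), 3))
print(round(cfs_merit(6, 0.4, 0.2), 3))   # k=6 mirrors the CFS subset size used later
```

The gain-ratio helper works on any pair of categorical columns, so the same code can rank each attribute of a data set individually, as the evaluator does.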

Table 1: Student Data Set Description

Variable | Description | Possible Values
Gender | Student's sex | {Male, Female}
Branch | Student's branch | {BCA, B.SC, B.COM, B.A}
Cat | Student's category | {BC, MBC, MSC, OC, SBC, SC}
HSG | Student's grade in High School | {O: 90%-100%, A: 80%-89%, B: 70%-79%, C: 60%-69%, D: 50%-59%, E: 35%-49%, FAIL: <35%}
SSG | Student's grade in Senior Secondary | {O: 90%-100%, A: 80%-89%, B: 70%-79%, C: 60%-69%, D: 50%-59%, E: 35%-49%, FAIL: <35%}
Medium | Medium of instruction | {Tamil, English, Others}
LLoc | Living location of student | {Village, Taluk, Rural, Town, District}
HOS | Student stays in hostel or not | {Yes, No}
FSize | Student's family size | {1, 2, 3, >3}
FType | Student's family type | {Joint, Individual}
FINC | Family annual income | {Poor, Medium, High}
FQual | Father's qualification | {No-education, Elementary, Secondary, UG, PG, Ph.D}
MQual | Mother's qualification | {No-education, Elementary, Secondary, UG, PG, Ph.D, NA}
PSM | Previous semester mark | {First: >60%, Second: >45% & <60%, Third: >36% & <45%, Fail: <36%}
CTG | Class test grade | {Poor, Average, Good}
SEM_P | Seminar performance | {Poor, Average, Good}
ASS | Assignment | {Yes, No}
GP | General proficiency | {Yes, No}
ATT | Attendance | {Poor, Average, Good}
LW | Lab work | {Yes, No}
ESM | End semester marks | {First: >60%, Second: >45% & <60%, Third: >36% & <45%, Fail: <36%}
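The experiments below load the student data from a CSV file. A hedged sketch of writing one record with the Table 1 attribute names (the sample values are invented examples drawn from the listed value sets, and the filename `students.csv` is an assumption, not from the paper):

```python
import csv

# Column order follows Table 1; the 21 attribute names match the data set description.
HEADER = ["Gender", "Branch", "Cat", "HSG", "SSG", "Medium", "LLoc", "HOS",
          "FSize", "FType", "FINC", "FQual", "MQual", "PSM", "CTG", "SEM_P",
          "ASS", "GP", "ATT", "LW", "ESM"]

# One hypothetical student record (illustration only, not real data)
sample = ["Female", "BCA", "BC", "A", "B", "English", "Town", "No",
          "3", "Joint", "Medium", "UG", "Secondary", "First", "Good", "Good",
          "Yes", "Yes", "Good", "Yes", "First"]

with open("students.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(HEADER)   # WEKA's CSV loader reads attribute names from row 1
    writer.writerow(sample)
```

A file in this shape can be opened directly in the WEKA Explorer, which infers the nominal attribute values from the column contents.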

For the purpose of designing and evaluating our experiments, we have used WEKA. It is open-source software, freely available for mining data, and it implements a large collection of mining algorithms. It can accept data in various formats and also has format converters, so we converted the student dataset into a CSV file. Under "Test options", 10-fold cross-validation is selected as our evaluation process. The various performance metrics are discussed as follows.

The accuracy of the predictive model is calculated from the true positive rate, false positive rate, precision and recall values [14].

TP rate (true positive rate): a positive result that accurately reflects the tested-for activity. If the outcome from a prediction is p and the actual value is also p, it is called a true positive (TP).
TP rate = TP / P, where P = (TP + FN)

TN rate (true negative rate): occurs when both the prediction outcome and the actual value are n.
TN rate = TN / N, where N = (TN + FP)

FP rate (false positive rate): if the outcome from a prediction is p and the actual value is n, it is said to be a false positive (FP).
FP rate = FP / (FP + TN)

Precision: the fraction of retrieved instances that are relevant.
Precision = TP / (TP + FP)

Recall: the fraction of relevant instances that are retrieved.
Recall = TP / (TP + FN)

V. RESULTS AND DISCUSSION

The present investigation focuses on two feature selection techniques, namely cfsSubsetEval and GainRatioAttributeEval, which are among the most important and frequently used preprocessing steps in data mining. Using these attribute selection algorithms we can select, out of a huge number of student attributes, the best attributes affecting the students' performance. The results are then obtained with the Naïve Bayes classifier. Table 2 shows the results of applying the two feature selection algorithms.

Table 2: Best Selected Attributes

Algorithm | Attributes Selected
cfsSubsetEval | Branch, SSG, FINC, PSM, GP, ATT
GainRatioAttributeEval | Age, Branch, Cat, SSG, Medium, ATT, GP, FINC, FQual, MQual, HSG, SEM_P, LLoc

A. Results of cfsSubset Evaluator
In this experiment the Correlation-Based Feature Selection algorithm, with its 6 selected attributes, was used together with the Naïve Bayes classifier on the data set, and the results are presented in Table 3. Naïve Bayes correctly classifies about 84.2% of instances under 10-fold cross-validation. The true positive rate is high for the classes Second and First, whereas it is very low for the class Third. Fig. 1 shows the graphical representation of the classifier.

Figure 1: Result of CfsSubset Evaluator

B. Results of GainRatioAttributeEvaluator
The present study implements the Gain Ratio Attribute Evaluator with its 13 selected attributes. The result of the Naïve Bayes classifier is shown in Table 4. The classifier correctly classifies about 74.4% of instances under 10-fold cross-validation. The true positive rate is high for the class Second and very low for the class Third. Fig. 2 shows the graphical representation of the Naïve Bayes classification algorithm.

Figure 2: Result of Gain-Ratio Attribute Evaluator
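The per-class figures reported in Tables 3 and 4 follow the metric definitions given earlier. A minimal sketch computing them from binary (one-vs-rest) confusion-matrix counts — the counts below are invented for illustration and do not come from the paper's data:

```python
def binary_metrics(tp, fn, fp, tn):
    """Evaluation metrics, as defined in Section V, from confusion-matrix counts."""
    tp_rate = tp / (tp + fn)            # recall / sensitivity
    fp_rate = fp / (fp + tn)
    tn_rate = tn / (tn + fp)            # specificity
    precision = tp / (tp + fp)
    recall = tp_rate
    f_measure = 2 * precision * recall / (precision + recall)
    return {"TP Rate": tp_rate, "FP Rate": fp_rate, "TN Rate": tn_rate,
            "Precision": precision, "Recall": recall, "F-Measure": f_measure}

# Hypothetical counts for one class treated as "positive"
m = binary_metrics(tp=80, fn=20, fp=10, tn=90)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Averaging these per-class values, weighted by class frequency, gives the "Weighted Avg." rows of Tables 3 and 4.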

Table 3: Classifier Result for CfsSubsetEvaluator

Naïve Bayes – 10 fold cross validation

Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area
Second | 0.888 | 0.205 | 0.664 | 0.888 | 0.759 | 0.885
Fail | 0.333 | 0.034 | 0.429 | 0.333 | 0.375 | 0.823
First | 0.759 | 0.162 | 0.859 | 0.759 | 0.806 | 0.875
Distinction | 0.25 | 0.016 | 0.859 | 0.25 | 0.316 | 0.835
Third | 0 | 0 | 0 | 0 | 0 | 0.059
Weighted Avg. | 0.842 | 0.359 | 0.844 | 0.842 | 0.835 | 0.869

Table 4: Classifier Result for GainRatioAttributeEvaluator

Naïve Bayes – 10 fold cross validation

Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area
Second | 0.873 | 0.25 | 0.683 | 0.873 | 0.767 | 0.859
Fail | 0 | 0.015 | 0 | 0 | 0 | 0.631
First | 0.764 | 0.155 | 0.848 | 0.764 | 0.804 | 0.886
Distinction | 0.143 | 0.015 | 0.25 | 0.143 | 0.182 | 0.825
Third | 0 | 0 | 0 | 0 | 0 | 0.053
Weighted Avg. | 0.744 | 0.179 | 0.72 | 0.744 | 0.726 | 0.867

Table 5: Overall Accuracy (TP Rate per class) of Feature Selection Algorithms with Naïve Bayes

Algorithm | Second | Fail | First | Distinction | Third | Weighted Avg.
cfsSEval | 0.888 | 0.333 | 0.759 | 0.25 | 0 | 0.842
GRAE | 0.873 | 0 | 0.764 | 0.143 | 0 | 0.744
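The Naïve Bayes classifier compared in Tables 3-5 admits a compact plain-Python sketch (an illustration only, not WEKA's implementation; the training records below are invented, with values drawn from the Table 1 categories):

```python
import math
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Tiny Naïve Bayes for categorical attributes, with Laplace smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.log_prior = {c: math.log(self.class_counts[c] / len(labels))
                          for c in self.classes}
        self.value_counts = defaultdict(Counter)   # (class, attr index) -> value counts
        self.attr_values = [set() for _ in rows[0]]
        for row, c in zip(rows, labels):
            for i, v in enumerate(row):
                self.value_counts[(c, i)][v] += 1
                self.attr_values[i].add(v)
        return self

    def predict(self, row):
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = self.log_prior[c]
            for i, v in enumerate(row):
                num = self.value_counts[(c, i)][v] + 1            # Laplace smoothing
                den = self.class_counts[c] + len(self.attr_values[i])
                lp += math.log(num / den)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Made-up training records over three Table 1-style attributes: ATT, PSM, GP
rows = [("Good", "First", "Yes"), ("Good", "First", "Yes"),
        ("Poor", "Fail", "No"), ("Average", "Second", "Yes"),
        ("Poor", "Third", "No"), ("Good", "Second", "Yes")]
labels = ["First", "First", "Fail", "Second", "Fail", "Second"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("Good", "First", "Yes")))
```

Feature selection plugs in directly: dropping attribute columns from `rows` before `fit` is the programmatic equivalent of keeping only the subsets listed in Table 2.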


C. Performance comparison between the Feature Selection Algorithms
The performance of the selected feature selection algorithms with the Naïve Bayes classifier is summarized in Table 5. The results reveal that the Correlation-Based Feature Subset Evaluator, with 6 attributes, performs very well in comparison with Gain Ratio, which uses 13 attributes. The overall accuracy of the CFS algorithm is about 84%, whereas Gain Ratio is less accurate at just 74%. The classification accuracy is very good for the classes Second and First. Further analysis of the prediction results shows that accuracy is low for the class Distinction and worst for the class Third.

Figure 3: Overall accuracy of Feature Selection Algorithms

VI. CONCLUSION

This research work presents a case study in educational data mining. The obtained results show that feature selection techniques can improve the accuracy and efficiency of the classification algorithm by removing irrelevant and redundant attributes; here they were used specifically to improve student performance prediction. The most relevant features were obtained using the GainRatio and CFS subset evaluators, and Naïve Bayes classifiers were applied on the selected features. From the results, it is concluded that the Correlation-Based Feature Subset Evaluator performs well with the Naïve Bayes classifier as compared with the Gain-Ratio Attribute Evaluator. In future, this work will extend the experiment with different data mining techniques, such as clustering, applied with other feature selection algorithms on larger data sets in the same educational field.

REFERENCES

[1] Humera Shaziya, Raniah Zaheer, Kavitha G., "Prediction of Students in Semester Exams using a Naïve Bayes Classifier", Int. Journal of Innovative Research in Science, Engineering and Technology, Vol. 4, Issue 10, 2015, pp. 9823-9829.
[2] Tajunisha N., Anjali M., "Predicting Student Performance Using MapReduce", Int. Journal of Engineering and Computer Science, Vol. 4, Issue 1, 2015, pp. 9971-9976.
[3] Mashael A. Al-Barrak, Muna Al-Razgan, "Predicting Students Final GPA Using Decision Trees: A Case Study", Int. Journal of Information and Education Technology, Vol. 6, No. 7, 2016, pp. 528-533.
[4] Karthikeyan T., Thangaraju P., "Genetic Algorithm based CFS and Naïve Bayes Algorithm to Enhance the Predictive Accuracy", Indian Journal of Science and Technology, Vol. 8, No. 26, 2015, pp. 1-8.
[5] Lumbini P. Khobragade, Pravin Mahadik, "Students Academic Failure Prediction Using Data Mining", Int. Journal of Advanced Research in Computer and Communication Engineering, Vol. 4, Issue 11, 2015, pp. 290-298.
[6] Hina Gulati, "Predictive Analytics Using Data Mining Technique", 2nd International Conference on Computing for Sustainable Global Development, 2015, pp. 713-716.
[7] Jai Ruby, K. David, "Analysis of Influencing Factors in Predicting Students Performance Using MLP - A Comparative Study", Int. Journal of Innovative Research in Computer and Communication Engineering, Vol. 3, Issue 2, 2015, pp. 1085-1092.
[8] Anal Acharya, Devadatta Sinha, "Application of Feature Selection Methods in Educational Data Mining", Int. Journal of Computer Applications, Vol. 103, No. 2, 2014, pp. 34-38.
[9] Anal Acharya, Devadatta Sinha, "Early Prediction of Students Performance using Machine Learning Techniques", Int. Journal of Computer Applications, Vol. 107, No. 1, 2014, pp. 37-43.
[10] Isabelle Guyon, André Elisseeff, "An Introduction to Variable and Feature Selection", The Journal of Machine Learning Research, Vol. 3, 2003, pp. 1157-1182.
[11] M. A. Hall, L. A. Smith, "Practical Feature Subset Selection for Machine Learning", Australian Computer Science Conference, Springer, 1998, pp. 181-191.
[12] Muhammad Naeem, "An Empirical Analysis and Performance Evaluation of Feature Selection Techniques for Belief Network Classification System", Int. Journal of Control and Automation, Vol. 8, No. 3, 2015, pp. 375-388.
[13] Mital Doshi, Setu K. Chaturvedi, "Correlation Based Feature Selection (CFS) Technique to Predict Student Performance", Int. Journal of Computer Networks & Communications, Vol. 6, No. 3, 2014, pp. 197-206.
[14] P. V. Praveen Sundar, "A Comparative Study for Predicting Students Academic Performance using Bayesian Network Classifiers", IOSR Journal of Engineering, Vol. 3, Issue 2, 2013, pp. 37-42.
