Chapter 3

Uploaded by

Tobi Odedele

Methodology

The methodology proposed in this study for analyzing student academic performance is based on
clustering algorithms, which are part of the data mining process. The stages in the process
include the following:

Data Mining Process

In the present-day educational system, a student’s performance is determined by the internal
assessment and the end-of-semester examination results. The end-of-semester examination result
is the mark obtained by the student in the examination at the end of the semester. Each student
has to obtain a minimum mark in both the internal assessment and the end-of-semester examination
to pass a semester course.

Data Collection

The data set used in this research was obtained from Ladoke Akintola University of Technology,
Ogbomoso, Oyo State, in the western part of Nigeria, with the sample drawn from the Department of
Information Systems in the Faculty of Computing and Informatics. The initial size of the data set
was 500. In this step, data stored in different tables were joined into a single table, and errors
were removed after the join.

Data Preprocessing

Data preprocessing is an important step in machine learning because the quality of the data, and
of the knowledge that can be extracted from it, directly affects a model's ability to learn. It is
therefore critical to preprocess the data before feeding it into the model. The aim is to transform
raw data into a format that mining algorithms can use. During this process, the following tasks are
completed.

1) DATA CLEANING
Data cleaning refers to the elimination of incomplete, missing or duplicate data. There are many
ways to handle missing attribute values, such as ignoring the tuple, filling in the missing value
with a global constant, or filling it in with the attribute mean. Score records with many missing
course scores are deleted, and records with few missing course scores are filled in. This project
follows these principles: score records with empty scores in more than two courses are deleted, and
if any remaining student still has an empty course score, it is filled with the average score of
that course. A course score of 0 is understood to mean that the student was absent from the exam,
and the corresponding student’s score record is deleted.
2) DATA INTEGRATION
To reduce data redundancy, related courses are merged. Some courses are divided across several
semesters; merging these courses and taking the average score over the semesters as the score of
the course reduces the number of features in the subsequent analysis.
3) DATA TRANSFORMATION
The original score data are on a percentage scale, so there is no difference in order of magnitude
between courses and no standardization is required. The K-means algorithm is only suitable for
numerical data. When the data for analysis are combined into a table, the student number is set to
character type, the data type of each subject score is converted to numeric, and the number of
decimal places is set to 0.
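
The three preprocessing tasks above can be sketched as follows. This is a minimal illustration
under stated assumptions, not the study's implementation: the study used R, while this sketch uses
plain Python, and the course codes, the student numbers, the grouping of MTH101 and MTH102 into one
course, and the order of the cleaning steps are all hypothetical choices made for the example.

```python
from statistics import mean

MAX_MISSING = 2  # from the text: drop records with empty scores in more than two courses


def clean(records):
    """Data cleaning. records maps student number -> {course: score}, None = empty score."""
    # Drop records containing a 0 score (treated as absence from the exam)
    # and records with more than MAX_MISSING empty scores.
    kept = {
        sid: dict(scores)
        for sid, scores in records.items()
        if 0 not in scores.values()
        and sum(v is None for v in scores.values()) <= MAX_MISSING
    }
    # Fill each remaining empty score with the mean of the known scores of that course.
    courses = {c for scores in kept.values() for c in scores}
    for course in courses:
        known = [s[course] for s in kept.values() if s.get(course) is not None]
        if not known:
            continue  # nothing to average; leave the gap as it is
        for scores in kept.values():
            if course in scores and scores[course] is None:
                scores[course] = mean(known)
    return kept


def integrate(records, groups):
    """Data integration. groups maps a merged course name -> its per-semester courses."""
    parts_used = {p for parts in groups.values() for p in parts}
    merged = {}
    for sid, scores in records.items():
        row = {c: v for c, v in scores.items() if c not in parts_used}
        for name, parts in groups.items():
            row[name] = mean(scores[p] for p in parts)  # average over the semesters
        merged[sid] = row
    return merged


def transform(records):
    """Data transformation: student number as text, scores numeric with 0 decimal places."""
    # Note: Python's round() rounds exact .5 ties to the nearest even integer.
    return {
        str(sid): {c: int(round(v)) for c, v in scores.items()}
        for sid, scores in records.items()
    }


# Hypothetical example data (not taken from the study's data set).
raw = {
    101: {"MTH101": 70, "MTH102": 60, "CSC101": 55, "GNS101": None},
    102: {"MTH101": 0, "MTH102": 50, "CSC101": 65, "GNS101": 40},
    103: {"MTH101": None, "MTH102": None, "CSC101": None, "GNS101": 45},
    104: {"MTH101": 80, "MTH102": 76, "CSC101": 45, "GNS101": 50},
}
result = transform(integrate(clean(raw), {"MTH": ["MTH101", "MTH102"]}))
print(result)
```

In this example, student 102 is dropped because of the 0 score and student 103 because of three
empty scores; the empty GNS101 score of student 101 is filled with the course average of the
remaining records.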

The Data Mining Tools

The experimental tool used was the R programming language. R is a powerful tool for data analysis,
statistical modeling, and visualization. Its extensive libraries and packages, together with
graphical user interfaces that give easy access to this functionality, make it a popular choice
among data scientists and researchers.

Proposed Clustering Methodology

In recent years, the effectiveness of clustering techniques in student performance analysis
has attracted the interest of many researchers. Clustering is a method of grouping similar
objects into one cluster and dissimilar objects into other clusters. It is particularly useful
when labels for the students in the data set are not available. In addition, dividing a large
data set into small, logical clusters makes it easier for researchers to examine and explain
the meaning of the data.
K-means Algorithm. The main choice in this research is the k-means algorithm, a popular
clustering technique. It is popular because it is simple to implement and its results are
easy to understand. The k-means algorithm groups each object with the nearest of k centroids,
so that nearby objects end up in the same cluster. The elbow method is a popular way to
choose the number of clusters. For each candidate number of clusters, k, this approach
calculates the total within-cluster variance, also known as inertia, and then plots the
variance against k. The best number of clusters can be taken as the k value at the first
turning point (the "elbow") of the curve.
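
As a sketch of the quantity plotted by the elbow method, the inertia for a given k is commonly
written as the within-cluster sum of squared distances. The symbols here are assumptions introduced
for illustration and do not appear in the original text: C_j is the j-th cluster, mu_j is its
centroid, and x_i is a student's score vector.

```latex
W(k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i
```

W(k) does not increase as k grows when each clustering is optimal, so the elbow is the value of k
after which adding clusters reduces W(k) only slightly.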
The alternative technique is silhouette plot analysis, which calculates a coefficient for
each data point to measure how similar the point is to its own cluster compared with other
clusters. The silhouette coefficient lies in the range [−1, 1], where a high value
indicates that the object is well matched to its cluster.
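
The silhouette coefficient of a point is usually computed as (b − a) / max(a, b), where a is the
mean distance from the point to the other members of its own cluster and b is the smallest mean
distance from the point to the members of any other cluster. The sketch below illustrates this in
plain Python (the study itself used R). The function name, the use of Euclidean distance, the
convention of scoring singleton clusters as 0, and the example scores are assumptions made for the
illustration; the function expects at least two clusters.

```python
import math


def silhouette_scores(points, labels):
    """Return one silhouette coefficient per point, given a cluster label per point."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for a cluster with a single member
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from the point to itself is 0, so it does not affect the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


# Hypothetical two-course scores for six students (not taken from the study's data).
points = [(40, 45), (42, 50), (38, 47), (85, 90), (88, 86), (90, 92)]
good = silhouette_scores(points, [0, 0, 0, 1, 1, 1])  # compact, well-separated clusters
bad = silhouette_scores(points, [0, 1, 0, 1, 0, 1])   # clusters that mix both groups
print(sum(good) / len(good), sum(bad) / len(bad))
```

Averaging the coefficients over all points gives a single score per clustering, which can be
compared across candidate values of k.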

The procedure of the K-means algorithm:
1) Randomly select k of the n samples as the initial clustering centers.
2) Assign every other sample to the nearest clustering center, using Euclidean distance as the
distance measure.
3) Recalculate the clustering center of each cluster as the mean of the samples assigned to it.
4) Repeat steps 2 and 3 until the clustering centers no longer change.
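
The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration
and not the study's implementation, which was carried out in R. The function name, the seed and
max_iter parameters, and the example scores are assumptions introduced for the sketch; the loop
repeats assignment and center recalculation until the assignments stop changing, which is the usual
stopping rule for k-means.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (tuples of numbers) into k groups; return (labels, centers)."""
    rng = random.Random(seed)
    # Step 1: pick k samples at random as the initial clustering centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each sample to the nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move again
        labels = new_labels
        # Step 3: recompute each center as the mean of the samples assigned to it.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical two-course scores for six students (not taken from the study's data).
points = [(40, 45), (42, 50), (38, 47), (85, 90), (88, 86), (90, 92)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Because the initial centers are chosen at random, different seeds can give different cluster
numbering and, on less clearly separated data, different clusterings; running the algorithm several
times and keeping the result with the lowest inertia is a common safeguard.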

Clustering Model Evaluation

In the clustering analysis phase, the quality of the clustering results is determined and
confirmed. This is an important measurement for deciding which algorithm achieved the best
performance on the input data of the study. Clustering evaluation is a stand-alone process
and is not part of the clustering process itself; it is always carried out after the final
output of the clustering is produced. Two methods are used to measure the quality of
clustering results: internal validation and external validation.
Internal validation evaluates a clustering using only the clustering result itself, namely
the relationships among the structures of the clusters that have been formed. This is more
realistic and efficient for problems involving educational data sets whose size and
dimensionality grow daily.
