Evaluation of Clustering Methods For Student Learn
Research paper
1Advanced Analytic Engineering Center (AAEC), Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah
Alam, Selangor, Malaysia.
2Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia.
*Corresponding author, e-mail: [email protected]
Abstract
Students’ performance is a key factor in making a good first impression on an employer during a job interview. However,
several factors affect students’ performance during their studies. One of them is their learning style, which falls under
the Neuro-Linguistic Programming (NLP) approach. Learning style is divided into three behavioral categories: Visual,
Auditory and Kinesthetic (VAK). This paper evaluates clustering methods for identifying learning style based on system
preferences. The study starts with the distribution of questionnaires to acquire VAK information for each student.
Responses were collected from 167 respondents in the Faculty of Computer and Mathematical Sciences. The data are then
pre-processed to prepare them for the evaluation of clustering methods. Three clustering methods are evaluated: Simple
K-Means, Hierarchical Clustering, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The findings
show that Simple K-Means offers the most accurate prediction. On this dataset, Simple K-Means estimated four clusters and
yielded the highest accuracy of 74.85%, compared with Hierarchical Clustering, which estimated four clusters with an
accuracy of 52.69%, and DBSCAN, which estimated three clusters with an accuracy of 61.68%. The clustering method
demonstrates the capability of categorizing students’ learning styles into three categories: visual, auditory and
kinesthetic. This outcome would benefit lecturers and teachers in universities and schools by automatically clustering
students’ learning styles, and would assist them in teaching and learning.
Keywords: Clustering; Hierarchical Clustering; K-Means; Learning Style; Neuro Linguistic Programming
2. Neuro Linguistic Programming System Preferences

Visual learners prefer information depicted as diagrams, graphs, or charts while they study. Symbols also help them
understand better than audio or physical activities do. This preference is distinctive in that they favor images and
symbols even for material that could have been conveyed in words or sentences. People with this learning style learn
best by reading and watching, as they must see something in order to understand it [7]. They tend to visualize whatever
they are trying to understand, as if a movie camera were playing in their mind while they think. Using this “movie
camera”, they easily recall what they have captured.

Auditory learners prefer information that is heard or spoken rather than seen and visualized. They learn best from
verbal discussions during classes instead of watching the slides. They must hear first in order to learn and understand
what is being explained [7]. A person with this learning style prefers to talk aloud, even to themselves, to recall
things accurately. Unlike visual learners, they prefer to speak first rather than organizing and sorting their ideas
beforehand. They are also good listeners, but this may become a disadvantage, as they are easily distracted by sounds.

Kinesthetic learners prefer to be involved in physical activities. They are poor listeners, as they prefer to learn by
doing, and they have an outgoing personality. They tend to learn best through hands-on activities [7]. By doing so, they
can remember most of what they have done, unlike visual and auditory learners. However, one disadvantage is that they
are easily distracted or have a hard time paying attention, as they prefer to do things while learning. They connect to
reality, as they need experiences, practice or simulations. These also include videos of how to do things,
demonstrations by an expert, or case studies.

Multimodality is the term used when a person prefers a combination of learning styles, such as visual and auditory,
kinesthetic and auditory, or writing and kinesthetic. A person with multimodal preferences adapts to their surroundings
more easily than someone with a single modality [8]. However, multimodal learners are uncommon, as they need to be able
to multitask in order to learn faster. Multimodality is divided into two types: the first is someone who can use all
their learning styles simultaneously, and the second is someone who can switch from one learning style to another.

3. Clustering Implementation

3.1 Data Acquisition and Pre-processing

A survey of VAK learning style was distributed to students in the Faculty of Computer and Mathematical Sciences (FSKM),
Universiti Teknologi MARA. A total of 167 questionnaires were answered. The responses were then pre-processed to prepare
the data for the evaluation of clustering methods. A VAK score is calculated for each respondent. The VAK scores are
transformed, normalized and organized in CSV format. A normalization technique, the z-score scaling formula, is used to
reduce the range of the data so that the differences between values are not too large. For categorical data such as
gender, a common technique that converts the data into a binomial form is also used. An example of the pre-processed
data is shown in Figure 1.

Gender  Semester  CGPA  V  A  K  Subject  Course  Style  Total
1       6         3.65  4  1  7  ITT575   CS245   K      12
1       6         2.68  2  6  5  ITT575   CS245   A      13
1       6         2.64  6  1  5  ITT575   CS245   V      12
1       6         3.31  7  1  4  ITT575   CS245   V      12
0       6         2.56  7  1  4  ITT575   CS245   V      12
0       6         2.97  3  7  2  ITT575   CS245   A      12
1       6         2.79  5  1  6  ITT575   CS245   K      12
1       6         3.75  7  1  4  ITT575   CS245   V      12
0       6         2.71  5  4  3  ITT575   CS245   V      12
1       6         2.77  4  5  3  ITT575   CS245   A      12
1       6         3.22  6  1  5  ITT575   CS245   V      12
0       6         3.35  9  3  0  ITT575   CS245   V      12

Fig. 1: Sample of pre-processed data

3.2 Clustering Methods

Clustering methods have been highlighted in much research and applied in many domains [9-13]. In clustering, the idea is
not to predict a target class, as in classification. Rather, it is to group similar items so that all items in the same
group are similar and items in different groups are not [14]. To group similar items, different similarity measures
should be considered. This paper evaluates three of the most common clustering techniques: Density-Based Spatial
Clustering of Applications with Noise (DBSCAN), Hierarchical Clustering and Simple K-Means. These clustering methods
were developed in Python, and the graphs were plotted using the Matplotlib library. Figure 2 is a flowchart of the steps
involved in applying a clustering technique. Firstly, the dataset is loaded into the system. Then, a clustering method
is chosen: DBSCAN, Simple K-Means, or Hierarchical clustering is used to cluster the data. With the technique chosen,
the number of clusters is then determined manually or by using the Elbow Method [15]. The distance between each
instance's coordinates and the centroids is calculated. This step determines the cluster to which each instance belongs.
Finally, the clusters are plotted on a graph for visualization.

Fig. 2: Flowchart of Clustering Employment Steps

4. Analysis of Results

The experiments were performed to measure the accuracy of the three methods: DBSCAN, Simple K-Means and Hierarchical
clustering. Table 1 shows the number of clusters identified by the three techniques. Simple K-Means and Hierarchical
clustering both predicted 4 clusters: V, A, K and Multimodality (M), where M is a mix of two or more learning styles.
DBSCAN, in contrast, predicted 3 clusters: V, A and K. This is because the method of calculating the number of clusters
differs for each clustering method.
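The difference in how cluster counts arise can be illustrated for Simple K-Means, which needs the number of clusters k as an input, chosen manually or by the Elbow Method (Section 3.2). The following is a minimal pure-Python sketch, not the authors' implementation. It applies z-score scaling to the V, A and K scores of the twelve sample rows in Figure 1, runs a basic K-Means for several values of k, and prints the within-cluster sum of squares (WCSS) that an elbow plot would show. The function names, the use of V, A and K as the only features, the random seed and the range of k are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

# (V, A, K) scores of the twelve sample rows shown in Fig. 1.
vak_scores = [
    (4, 1, 7), (2, 6, 5), (6, 1, 5), (7, 1, 4), (7, 1, 4), (3, 7, 2),
    (5, 1, 6), (7, 1, 4), (5, 4, 3), (4, 5, 3), (6, 1, 5), (9, 3, 0),
]

def zscore_columns(rows):
    """Scale every column to zero mean and unit population standard deviation."""
    columns = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in columns]  # 1.0 guards constant columns
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Basic K-Means (Lloyd's algorithm); returns labels, centroids and WCSS."""
    rng = random.Random(seed)          # assumed seed, for repeatable runs
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = None
    for _ in range(iters):
        # Assign every instance to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:       # assignments stable: converged
            break
        labels = new_labels
        # Move each centroid to the mean of its members (empty clusters keep theirs).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(dim) for dim in zip(*members)]
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wcss

scaled = zscore_columns(vak_scores)

# Elbow Method: compute WCSS for a range of k and look for the bend.
wcss_by_k = {k: kmeans(scaled, k)[2] for k in range(1, 7)}
for k, wcss in wcss_by_k.items():
    print(f"k={k}: WCSS={wcss:.3f}")
```

The elbow is read from the printed values by inspection: the k after which WCSS stops falling sharply is a reasonable candidate. With only twelve sample rows the curve is illustrative only and says nothing about the full 167-respondent dataset.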
International Journal of Engineering & Technology 65
Table 1: Number of clusters for each technique
Clustering Method   Number of Clusters   Learning Style
DBSCAN              3                    V, A, K
Simple K-Means      4                    V, A, K, M
Hierarchical        4                    V, A, K, M

4.1 Results of Simple K-Means

Figure 3 shows the data clusters produced by the Simple K-Means technique, which consist of 4 clusters. The data are
well represented in Table 3, which shows the exact values of the clustered data. Figure 3 plots the clustered data for
Visual against Cumulative Grade Point Average (CGPA), Auditory against CGPA and Kinesthetic against CGPA. The last plot
shows the clustered data and the identification of Visual for cluster 1, Auditory for cluster 2, Kinesthetic for
cluster 3 and Multimodal for cluster 4 for the students in FSKM.

Fig. 3: The cluster for SimpleKMeans

In Table 3, the confusion matrix shows the correctly clustered data out of a total of 167 instances. Cluster 0,
Cluster 1, Cluster 2 and Cluster 3 represent the learning styles M, A, V and K respectively. As demonstrated in
Table 2, 46 instances are incorrectly clustered: 34 of them were misclustered under Cluster 0, and 12 of them were
misclustered under Cluster 1.

misclustered data in Cluster 2. The performance of Hierarchical clustering is not as good as that of the Simple K-Means
technique.

Table 3: Confusion matrix for Hierarchical cluster
System Preferences   Cluster No
                     0    1    2    3
K                    62   0    0    0
A                    13   20   14   0
V                    57   0    0    0
M                    13   0    0    6

In Figure 4, the data are clustered into 3 clusters. These data are well represented in Table 4. Many noisy data points
are represented by the black spots.

4.3 Result of DBSCAN

Figure 5 shows the data clusters produced by the DBSCAN technique, which consist of 4 clusters. The data are well
represented in Table 5, which shows the exact values of the clustered data. Figure 5 plots the clustered data for Visual
against Cumulative Grade Point Average, Auditory against Cumulative Grade Point Average and Kinesthetic against
Cumulative Grade Point Average. The last plot shows the clustered data and the identification of Visual for cluster 1,
Auditory for cluster 2, Kinesthetic for cluster 3 and Multimodal for cluster 4 for the students in FSKM using the DBSCAN
technique.
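Accuracy figures such as those reported for the three methods can be derived from a confusion matrix once each cluster number is associated with a learning style. The sketch below shows that arithmetic on a small hypothetical matrix; the counts are invented for illustration and are not the study's results. The helper names and the majority-vote rule for mapping clusters to styles are assumptions, since the paper states its cluster-to-style mapping directly.

```python
def majority_mapping(confusion):
    """Map each cluster to the style that contributes most of its members.

    confusion[style][cluster] holds the number of respondents with that
    system preference who were placed in that cluster.
    """
    clusters = sorted({c for row in confusion.values() for c in row})
    return {c: max(confusion, key=lambda s: confusion[s].get(c, 0))
            for c in clusters}

def clustering_accuracy(confusion, cluster_to_style):
    """Share of instances whose cluster maps back to their own style."""
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[style].get(cluster, 0)
                  for cluster, style in cluster_to_style.items())
    return correct / total

# Hypothetical counts, NOT taken from the paper's tables.
toy_confusion = {
    "V": {0: 8, 1: 1, 2: 1},
    "A": {0: 2, 1: 6, 2: 0},
    "K": {0: 0, 1: 1, 2: 5},
}

toy_mapping = majority_mapping(toy_confusion)
toy_accuracy = clustering_accuracy(toy_confusion, toy_mapping)
print(toy_mapping)
print(f"accuracy = {toy_accuracy:.2%}")
```

Cells on the mapped diagonal count as correctly clustered and every other cell counts as misclustered, so the accuracy is the diagonal total divided by the number of instances.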
Fig. 7: CGPA against Visual graph

Figure 8 shows that the clustered data have a high cluster density in the range of 0.5 to 1.0. This indicates that the
students have a mix of the Kinesthetic learning style with the other two learning styles. With a mix of two or more
learning styles, the students are well adapted to their study environment and can utilize all the learning styles to
achieve a higher CGPA in their studies.

References
[1] Ali, S., Haider, Z., Munir, F., Khan, H., & Ahmed, A. (2013). Factors Contributing to the Students Academic
Performance: A Case Study of Islamia University Sub-Campus. American Journal of Educational Research, 1(8), 283-289.
[2] Curry, L. (1987). Integrating concepts of cognitive or learning style: A review with attention to psychometric
standards. Ottawa, ON: Canadian College of Health Service Executives.
[3] Reid, J. M. (1987). The Learning Style Preferences of ESL Students. TESOL Quarterly, 21(1), 87.
[4] Broadbent, J., & Poon, W. L. (2015). Self-regulated learning strategies & academic achievement in online higher
education learning environments: A systematic review. The Internet and Higher Education, 27, 1-13.
[5] Brown, H. D. (2000). Principles of language teaching and learning (4th ed.). White Plains, NY: Longman.
[6] O'Connor, J., & Seymour, J. (2011). Introducing NLP: Psychological skills for understanding and influencing people.
Conari Press.
[7] Kanar, C. C. (1995). The confident student. Boston: Houghton Mifflin Company.
[8] Krishna, T. S., Babu, A. Y., & Kumar, R. K. (2018). Determination of Optimal Clusters for a Non-hierarchical
Clustering Paradigm K-Means Algorithm. In Proceedings of International Conference on Computational Intelligence and
Data Engineering (pp. 301-316). Springer, Singapore.
[9] Ooi, T. H., Ngah, U. K., Khalid, N. E. A., & Venkatachalam, P. A. (2000). Mammagraphic Calcification Clusters Using
The Region