
This article was downloaded by: [Dalhousie University]

On: 27 December 2012, At: 00:42


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer
House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of Computational Intelligence Systems
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/tcis20

Using data mining on student behavior and cognitive style data for improving e-learning systems: a case study
Milos Jovanovic a, Milan Vukicevic a, Milos Milovanovic a & Miroslav Minovic a
a Faculty of Organizational Sciences, University of Belgrade, Jove Ilica 154, Belgrade, Serbia
Version of record first published: 28 May 2012.

To cite this article: Milos Jovanovic , Milan Vukicevic , Milos Milovanovic & Miroslav Minovic (2012): Using data mining
on student behavior and cognitive style data for improving e-learning systems: a case study, International Journal of
Computational Intelligence Systems, 5:3, 597-610

To link to this article: http://dx.doi.org/10.1080/18756891.2012.696923

International Journal of Computational Intelligence Systems, Vol. 5, No. 3 (June, 2012), 597-610

Using data mining on student behavior and cognitive style data for improving e-learning
systems: a case study

Milos Jovanovic, Milan Vukicevic, Milos Milovanovic, Miroslav Minovic


Faculty of Organizational Sciences, University of Belgrade, Jove Ilica 154
Belgrade, Serbia
E-mail: {milos.jovanovic, milan.vukicevic, milos.milovanovic, miroslav.minovic}@fon.bg.ac.rs
www.bg.ac.rs
Received 6 December 2011
Accepted 15 May 2012

Abstract

In this research we applied classification models to predict students’ performance, and clustering models to group students based on their cognitive styles in an e-learning environment. The classification models described in this paper should help teachers, students and business people to engage early with students who are likely to become excellent on a selected topic. Clustering students based on their cognitive styles and overall performance should enable better adaptation of the learning materials to their learning styles. The approach is tested using well-established data mining algorithms and evaluated by several evaluation measures. The model building process included data preprocessing, parameter optimization and attribute selection steps, which enhanced the overall performance. Additionally, we propose a Moodle module that allows automatic extraction of the data needed for educational data mining analysis and deploys the models developed in this study.

Keywords: educational data mining, prediction, students, performance, classification, clustering, Moodle.

1. Introduction

Moodle is an open source Learning Management System (LMS) that is mostly regarded as a Course Management System by the open community. It is predominantly used in higher education and has proven to be a successful tool in that setting.1,2 For that reason our faculty built a distance learning system (DLS) based on the Moodle LMS. The system was built and developed as an in-house solution at the University of Belgrade for students of Information Technology. One of the main requirements was to completely support the distance learning process in all its aspects. The system supports advanced courses that use multimedia lessons, advanced workshops and face-to-face communication through video conferencing.

Web-based learning management systems are extensively used nowadays and produce vast amounts of data that are potentially useful for improving the educational process.2,4,5 The newly emerging field called Educational Data Mining (EDM) is concerned with developing methods that discover knowledge from data originating from educational (traditional or distance learning) environments.6 Increasing research interest in using data mining in education has been recorded in the last decade,7,8,9,10,11 with a focus on different aspects of the educational process (e.g. students, teachers, teaching materials, organization of classes etc.).

Benefits from extracting knowledge from e-learning data are expected under the assumption that the trails of user actions can be used to identify specific information about users. We hope that the user behavior captured in log files and recorded in data structures can be used to create models that predict user behavior or describe its peculiarities. Several groups of people can leverage this knowledge and are potential stakeholders: students, teachers, e-learning system administrators and university management.

These stakeholders could use this knowledge for different goals:9

Published by Atlantis Press


Copyright: the authors

1. Applications dealing with the assessment of students’ learning performance.
2. Applications that provide course adaptation and learning recommendations based on the students’ learning behavior.
3. Approaches dealing with the evaluation of learning material and educational web-based courses.
4. Applications that involve feedback to both teachers and students of e-learning courses, based on the students’ learning behavior.
5. Developments for the detection of atypical students’ learning behavior.

These goals are achieved with the help of data mining techniques such as k-nearest neighbor, naive Bayes, decision trees, artificial neural networks, support vector machines, K-means, hierarchical clustering etc.12

Still, learning management systems are not primarily designed with data analysis and mining in mind, because usage data is not stored in a systematic way. A thorough analysis of this data requires long and tedious preprocessing.13 Furthermore, LMS systems usually produce statistical reports. These reports, however, do not help instructors draw useful conclusions about either the course potential or student abilities, and are useful only for platform administration purposes.2

This research shows how one can leverage the available data on student behavior in order to predict the success of students, as well as to profile students into groups, which may help improve existing learning material and collaborative learning. The study involves data from students attending online (distance learning) university courses, as suggested by Romero et al.,6 and extends the available data with students’ cognitive styles. Additionally, we propose a Moodle module that allows automatic extraction of the data needed for EDM analysis and deploys the models developed in this study.

The paper is structured as follows: Section 2 introduces related work on using e-learning data and applying data mining models. The architectural design of the decision-support system is given in Section 3, with experimental results of using data mining models presented in Section 4. Potential ways of using the knowledge gained by data mining models are described in Section 5, and Section 6 discusses open issues and related problems for these types of applications.

2. Background

Romero and Ventura gave a systematic survey of EDM from 1995 to 2005.10 Because of the increasing popularity and number of studies in this area, the same authors gave an extensive overview of the state of the art in this area until 2011, with over 300 references.12 In this paper we focus on the studies that are closest to our work. The study by Wang and Liao was performed in order to investigate how data mining techniques can be successfully used for adaptive learning.14 In academic institutions, the Moodle platform is often utilized as a significant part of e-learning systems. Romero et al. described how different data mining techniques can be used in that setting to improve the course and the students’ learning.6

Applications or tasks that have been resolved through data mining techniques are classified by Romero and Ventura in twelve categories: Analysis and visualization of data, Providing feedback for supporting instructors, Recommendations for students, Predict students’ performance, Student modeling, Detecting undesirable student behaviors, Grouping students, Social network analysis, Developing concept maps, Constructing courseware, Planning and scheduling.6

One of the most frequent research topics in the area of EDM (also investigated in this research) is the prediction of student performance.6,14,16 The main idea behind this research direction is that, based on student activity, one can predict the future outcome of student performance. For the purpose of predicting students’ final outcome on a course, researchers have used various techniques and algorithms. Kotsiantis et al. proposed an incremental ensemble of classifiers as a technique for predicting students’ performance in distance education.17 Neuro-fuzzy system rules were used for student knowledge diagnosis through a game learning environment.18 Kotsiantis also proposed a prototype version of a decision support system for prediction of students’ performance based on students’ demographic characteristics and their marks in a small number of written assignments.19 Myller et al. used neural networks (multilayer perceptron),20 and Traynor and Gibson used a combination of Artificial Neural Networks


and Evolutionary Computation models to predict students’ performance.21 Similarly, Minaei-Bidgoli et al. used a genetic algorithm for optimization of multiple classifier performance.22 Delgado et al. used neural networks to predict the success of students defined with binary classes (pass or fail).23

Grouping students is another important research task in educational environments. Tang and McCalla suggested data clustering as a basic resource to promote group-based collaborative learning and to provide incremental student diagnosis.14 Student grouping by a neural network based on affective factors in learning English was proposed by Bachtiar et al.23 A clustering technique based on the Bisection K-Means algorithm and Kohonen’s SOM algorithm was applied in several studies.25,26,27 These algorithms were used to group similar course materials with the aim of helping users to find and organize distributed course resources in the process of online learning. Also, the use of the k-means clustering algorithm for predicting students’ learning activities was described in Ayesha’s work, where the information generated by the data mining technique may be helpful for the instructor as well as for students.28 Ayers et al. used K-means and a model-based algorithm to group students with similar skill profiles on artificial data.29 Zakrzewska used hierarchical clustering for grouping students based on their learning styles,30 defined with the Felder and Silverman model,31 in order to build individual models of learners and adjust teaching paths and materials to their needs. The usefulness of combining cognitive and emotional aspects for investigations of students’ learning was emphasized in Heikkila et al.32 Perera et al. used a technique of grouping both similar teams and similar individual members, and sequential pattern mining was used to extract sequences of frequent events.33

A cognitive style approach for mining students’ learning patterns and performance in a Web-based environment was proposed by Chen and Liu.34

Adán-Coello et al. included learning styles in forming groups for collaborative learning of introductory computer programming.35 Learning styles are also included in the implementation of an adaptive system in electronic learning.36

Tang and McCalla proposed a clustering algorithm based on large generalized sequences to find groups of students with similar learning characteristics based on their traversal path patterns and the content of each page they have visited.14 Chen et al. used the K-means clustering algorithm for effectively grouping students who demonstrate similar behavior in an e-learning environment.

In this paper we applied the K-means algorithm to the data concerning students’ cognitive styles (gathered through a questionnaire), so the generated models could be applied in e-learning as well as in traditional teaching environments.37

3. University of Belgrade case study

In this section, we describe the data needed for EDM and propose an automatic procedure for data extraction and preparation. Further, we build and evaluate data mining models. Finally, a Moodle module for deployment of the models is described.

3.1. Automatic data extraction

For evaluation of our approach, we used the data from the Moodle system as recommended by Romero et al.6 Moodle CMS is specific with regard to its database model. Since Moodle is an open source solution, a large community of developers has formed around it. Many people are constantly developing this system and adding functionalities, and a module-based model of data management was adopted in order to enable easy expansion. What does this model imply? When a developer wants to add some functionality to an existing Moodle version, he develops the appropriate PHP pages and adds tables to the database model that will be used to manage data for the new set of functionalities. The new data is connected to user information by adding relations to the new sets of data. This aspect of the model complicates the extraction of information on students’ activities for each new module. Romero et al. gave directions for extraction and preparation of data for EDM analysis based on a series of queries defined in a MySQL database.6 Here we describe the problems of data extraction and preparation in more detail. Additionally, we propose a procedure for automatic extraction and preparation of Moodle data for data mining analyses (see Figure 1), implemented in RapidMiner.38
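As a rough illustration of the module-based layout described above, the following Python sketch joins per-module tables to a user table and aggregates them to one row per student and course. It is only a sketch under stated assumptions: the table and column names (`user`, `assignment_submission`, `quiz_attempt`, and the output columns) are hypothetical and much simpler than Moodle's actual schema, and an in-memory SQLite database stands in for MySQL.

```python
import sqlite3

# Hypothetical, simplified tables; these are NOT Moodle's real schema.
SCHEMA = """
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE assignment_submission (
    user_id INTEGER, course_id INTEGER, started INTEGER, submitted INTEGER);
CREATE TABLE quiz_attempt (
    user_id INTEGER, course_id INTEGER, started INTEGER, finished INTEGER,
    passed INTEGER);
"""

# Each module keeps its own table; both are related to the user by user_id.
QUERY = """
SELECT u.id, a.course_id, a.n_assignment, a.total_time_assignment,
       q.n_quiz, q.n_quiz_passed
FROM user u
JOIN (SELECT user_id, course_id,
             COUNT(*) AS n_assignment,
             SUM(submitted - started) AS total_time_assignment
      FROM assignment_submission
      GROUP BY user_id, course_id) a
  ON a.user_id = u.id
JOIN (SELECT user_id, course_id,
             COUNT(*) AS n_quiz,
             SUM(passed) AS n_quiz_passed
      FROM quiz_attempt
      GROUP BY user_id, course_id) q
  ON q.user_id = u.id AND q.course_id = a.course_id
ORDER BY u.id
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO user VALUES (?, ?)",
                     [(1, "student_a"), (2, "student_b")])
    conn.executemany("INSERT INTO assignment_submission VALUES (?, ?, ?, ?)",
                     [(1, 10, 1000, 1600), (1, 10, 2000, 2300),
                      (2, 10, 1000, 1900)])
    conn.executemany("INSERT INTO quiz_attempt VALUES (?, ?, ?, ?, ?)",
                     [(1, 10, 3000, 3300, 1), (2, 10, 3000, 3200, 0),
                      (2, 10, 4000, 4500, 1)])
    return conn


def extract_features(conn):
    """Return one aggregated row per (student, course)."""
    return conn.execute(QUERY).fetchall()


rows = extract_features(build_demo_db())
for row in rows:
    print(row)
```

Every additional module would need its own aggregating subquery of this kind, which is one way to see why the extraction effort grows with each new module.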


There are two alternative sources of student activity data in the Moodle database structure. The first source is an activity log that the Moodle system uses to track the activities of each student. Moodle is a Web-based system and is not able to track continuous usage of the system, since it is based on the HTTP request/reply model. It is very difficult to determine the time spent on an activity, since an activity is only logged when a user clicks on some link.

The second source of data is the set of tables that are created for each individual module. They keep track of the major student activities regarding that specific module. For instance, when a student performs an Assignment, that module keeps track of when the student reads the assignment, when he/she submits it, edits it, etc. Unfortunately, this data, as opposed to the first source, provides less information on each action a student performs during each activity. In our analysis we decided to combine data from both sources, since this combination gives a good foundation for EDM analysis.6

Figure 1 shows the stream for automatic extraction and aggregation of Moodle data for EDM analyses. In order to prepare the data in the format described in Table 1, we first extracted the data about students, courses and the grades achieved on every course. Second, the data from different modules is extracted, namely Assignment, Quiz and Forum. In the case of the Assignment module, tracking the time students spent on a specific assignment was trivial: the starting time and submission time were provided by that module. In the case of the Quiz module as well, the time on each test is measured and recorded in the appropriate data structure.

Table 1: Description of data used in experiments for each user per course.

Name                     Description
course                   Identification number of the course.
n_assigment              Number of assignments done.
n_quiz                   Number of quizzes taken.
n_quiz_a                 Number of quizzes passed.
n_quiz_s                 Number of quizzes failed.
n_posts                  Number of messages sent to the forum.
n_read                   Number of messages read on the forum.
total_time_assignment    Total time spent on assignments.
total_time_quiz          Total time spent on quizzes.
total_time_forum         Total time spent on forum.
mark                     Final mark obtained by the student in the course.

Since Moodle is primarily an LMS system, it faces the common issues of such systems. LMS systems use resources from many alternative sources. Additionally, Moodle is an open source system, which implies many developers working on the same task. This sometimes leads to data heterogeneity, especially syntactic heterogeneity. This means that information sources may use different representations and encodings for data. Syntactic interoperability can be achieved when compatible forms of encoding and access protocols are used to allow information systems to communicate.39

For our analysis we used data extracted from standard Moodle modules that have a long history of development. They are an integrated part of the production release of Moodle, and this helped in data standardization and prevention of data heterogeneity. This is why our work was not affected by this particular problem. On the other hand, data inconsistency is usually generated by improper use of the system, and almost every system suffers from this problem. Problems with data inconsistency are expected with open source systems. Regardless, Moodle is rather consistent with the data it uses. For instance, time values are always represented in a timestamp format, which significantly simplifies the preprocessing steps. This also enables easy data manipulation, such as retrieving the duration by subtracting the beginning time from the end time for each activity. Most issues are generated by improper use of the system. For instance, if a student does not complete the quiz and simply closes the browser, the quiz will be regarded as still open, and the end

time will be set to zero, since it is stored in a timestamp format. This is especially the case if the quiz is left open by the educator, as with self-evaluation tests that are open during the entire semester. By subtracting the beginning time from the end time one would get a negative time value. We had to exclude such cases, as if the students had not attempted the quiz at all. The same issue occurred with assignments that students did not finish and upload.

Fig. 1. Automatic data extraction model

Systems that have many users can suffer from data redundancy issues. In the case of Moodle, the most common case of redundancy is duplicate courses or user accounts. In our organization we minimized the occurrence of redundancy by using a centralized approach to the generation of courses and user accounts. The system administrator generates courses and user accounts upon teachers’ requests, and it is his duty to first check whether the new addition has already been entered in the system. This process practically excluded the possibility of redundancy affecting this analysis.

When it comes to the Forum module, extraction of the information about time spent is more complicated. Since a student can spend an undetermined amount of time reading a forum without providing feedback to the system, it is difficult to determine whether the user is active in the forum or not. In this case we used the activity log, which tracks each click a user makes on a link in the system.

Since Moodle provides the module name as one of the metadata items for each logged action, we were able to track the students’ movement through a forum. The time spent was determined as the sum of the times between consecutive clicks. If a student made a last click in a forum context and then was inactive for a prolonged period of time, we used the average time between two clicks in a forum context, computed over all users, as the reference time. This is needed because users often do not properly log out of the system. Usually they just close the web browser and move on to other activities. Unfortunately, this leaves no indication of when a certain activity ended. To calculate the time spent on forums for every student on every course, we designed a specialized application that is integrated into the stream (Figure 1) and uses extracted data from the Forum module


and Activity Log. Finally, the extracted data is aggregated at the student-course level.

Additionally, we used data about students’ cognitive styles gathered from a questionnaire that we administered through Moodle. We administered the self-report MBTI questionnaire, which has already been used successfully for the analysis of students’ profiles.40 The MBTI form has 95 forced-choice items that form four bipolar scales: Extraversion-Introversion (EI), Sensing-Intuition (SN), Thinking-Feeling (TF) and Judging-Perception (JP). A combination of these dimensions builds 16 different types of cognitive functioning. Introverts, who are oriented primarily to internal cues, and extroverts, who are oriented primarily to external events, show different patterns of performing intellectual tasks due to the differences in focusing psychical energy. The sensing mode of perceiving the world is characterized by respect for data obtained by one of the five senses. In contrast, the intuitive type is prone to lean on inner processes, perceiving the bigger picture that enables him to concentrate and to see hidden possibilities and implications of the subject matter.

Myers and McCaulley postulate two decision-making styles when assessing the validity of perception: thinking (assessment based on logical, impersonal processes) and feeling (assessment based on a personal, subjective process of mental evaluation).41 There are individual differences in the preferred quality of the environment one exists (learns) in, specifically the level of structure inherently given in it. So, there are two categories of subjects: judgers, who prefer structure and order that promote predictable surroundings in which decisions can be made quickly, and perceivers, who need to keep options open and are unconcerned with deadlines. As the MBTI is a theoretically well-conceptualized43 and metrically evaluated instrument,44,49 we believe that it might be useful to apply it to the problem of distance learning. Carlson examined a large body of reliability tests for this scale and found that coefficients for split-half reliability range from .66 to .92, and test-retest reliability shows that results are relatively stable (coefficients in different studies range from .69 to .89).44 Records about students are extended with the attributes resulting from determining their cognitive style.

3.2. Prediction of students’ success

We defined a classification model to predict whether a student would display excellent performance (i.e. highest grades) on a selected course. This problem is interesting since there are many stakeholders interested in recognizing students with excellent performance.45,46,47,48 As input for the prediction, the model uses the data describing student behavior on e-learning resources (e.g. forums, discussions, quizzes, posts, assignments), as described in the previous subsection. The dataset contains 260 instances. The preparation of data included extracting more features (such as grouping courses into math-oriented and social-oriented), normalizing features, and resolving missing values within the data. As opposed to Romero et al.,50 who discretized the class label into four categories (fail, pass, good and excellent), the class label for our model was a new binary attribute, which separated students with the highest grade (label value 1) from the rest (label value 0). This was done because the research of Romero et al.50 showed that classification algorithms could not achieve accuracy over 70% if the class is discretized into four categories (even with several preprocessing settings of the predictors). The goal is to devise a model that is able to predict whether a student will perform with excellent results, based on the input data.

The main use of this model would be the early detection of well-performing students on a course. People who could benefit from this model are:
• teachers, for identifying students they can collaborate with;
• students, for checking whether more effort is needed to achieve better results;
• business people, for engaging early with students who are likely to become excellent on a selected topic.

We decided to utilize eight different state-of-the-art algorithms for classification, which have often shown good results in the area of EDM.6 The algorithms used are:
• AdaBoost (with C4.5 algorithm) (abbr: “Boost”)
• Bagging (with C4.5 algorithm) (abbr: “Bag”)
• J4.8 (a C4.5 implementation in Java)
• Linear Discriminant Analysis (abbr: “LDA”)
• Logistic regression
• Naive Bayes
• Neural net (multi-layered perceptrons) (abbr: “NN”)


• Random Forests (abbr: “Forests”)

The results show the expected accuracy of the models as a percentage. Since this model would be used on future data, we used the 10-fold cross-validation technique to avoid choosing an over-trained model and to assess the models’ generalization ability. The cross-validation uses so-called “stratified” sampling, which means that a similar class distribution is kept in each fold. Additionally, we measured two other evaluation measures, Area Under Curve (AUC) and LIFT ratio. The AUC estimate can be interpreted as the probability that the classifier will assign a higher score to a randomly chosen positive example than to a randomly chosen negative example. The LIFT ratio measures the degree to which the predictions of a classification model are better than randomly generated predictions. It is defined as the ratio of true positives to total positives resulting from the classification process, compared to the fraction of true positives in the overall population. We used both of these measures to complement the evaluation based on accuracy. They are important because in this research we are dealing with imbalanced data, and accuracy often tends to overlook the classifier’s inability to predict all the classes when it concentrates on detecting only one class. Testing is done in the RapidMiner data mining platform,38 using default parameters and random seeds. The results are shown in Table 2.

Table 2: Performance results of different classification algorithms for predicting student excellence on a course.

Algorithm                     Accuracy   AUC      Lift
AdaBoost (J4.8)               91.74 %    0.8256   4.1071
Bagging (J4.8 unpruned)       90.87 %    0.7504   2.0536
J4.8                          93.04 %    0.5000   1
LDA                           93.04 %    0.5000   1
Logistic Regression           92.17 %    0.5181   1.0575
Naive Bayes                   53.48 %    0.7222   1.5375
Neural Net (Rapid default)    91.30 %    0.8346   4.7917
Random Forests                93.04 %    0.7498   7.1875

Looking at the results, several algorithms show good performance in generating the needed classification models. These are NeuralNet, AdaBoost and RandomForests, and all three are comparable with respect to accuracy. While RandomForests looks most useful by accuracy, the AUC evaluation measure does not favor this algorithm, since its AUC value is too small. The suggestion is then to avoid the RandomForests algorithm, since it does not have similar power in predicting both “excellent” and “other” students, resulting in lower AUC performance. Basic and simpler algorithms, such as Logistic regression, Naive Bayes and LDA, generally performed poorly. While J4.8 and LDA seem to have good accuracy, looking at the AUC measure we see that those classifiers have close to random performance, and the “accuracy” is due to class imbalance. Furthermore, we see poor results with all algorithms that produce linear models (LDA, Logistic regression and Naive Bayes), which might suggest that the decision boundary for excellent students is non-linear. Other methods, which produce more complex decision boundaries, performed better. Overall, both NeuralNet and AdaBoost gave quite good results, rendering models of a quality that allows further use. Since the results are obtained using cross-validation, we can expect to successfully predict the excellence of roughly 9 out of 10 students.

After selecting the three most promising classification algorithms for the task at hand, namely AdaBoost, NeuralNet and RandomForests, we tried to improve the performance, measured by AUC, through different preprocessing and parameter optimization steps. The setup for this preprocessing is shown in Figure 2.

Fig. 2. Preprocessing steps in the training phase, for the selected algorithms

The results of applying these steps are given in Table 3, where the steps build upon each other in a “chain” when they bring any performance gain.

Table 3: Improvement of AUC by preprocessing, for selected algorithms

Algorithm      No preprocess   Handle missing   Optimize parameters   Attribute selection
RandForests    0.750           0.858            0.890                 0.848
Adaboost       0.826           0.779            0.839                 0.838
NeuralNet      0.835           0.767            0.853                 0.812
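The evaluation measures reported in Tables 2 and 3 can be illustrated with a short Python sketch. It follows the verbal definitions given in the text (AUC as a pairwise ranking probability, lift as the precision among predicted positives relative to the overall positive rate). The function names and the toy data are assumptions made for illustration only; this is not the RapidMiner implementation used in the study.

```python
def accuracy(predicted, labels):
    """Fraction of examples whose predicted class equals the true class."""
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


def auc(scores, labels):
    """Probability that a randomly chosen positive example gets a higher
    score than a randomly chosen negative one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift(predicted, labels):
    """Share of true positives among predicted positives, divided by the
    share of positives in the whole population. Raises ZeroDivisionError
    if no example is predicted positive."""
    hits = [y for p, y in zip(predicted, labels) if p == 1]
    precision = sum(hits) / len(hits)
    base_rate = sum(labels) / len(labels)
    return precision / base_rate


# Toy, made-up data: one "excellent" student (label 1) out of ten.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always answers "other" looks accurate on imbalanced
# data, but it cannot rank students at all.
constant_predictions = [0] * 10
constant_scores = [0.0] * 10
print(accuracy(constant_predictions, labels))  # 0.9
print(auc(constant_scores, labels))            # 0.5

# A classifier that flags two students, one of them correctly.
flagged = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(lift(flagged, labels))
```

The constant classifier in this toy case shows the same pattern the text attributes to class imbalance: high accuracy together with an AUC of 0.5.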


For preprocessing, we first tried to average out the missing values present in the data, which usually occur for attributes such as total_time_assignment or total_time_quiz. Some of the algorithms clearly needed this, since we see an improvement of AUC for RandomForests. Next, we tried to optimize the learning parameters of each algorithm, to better fit the problem at hand. This step improved all three algorithms, and is clearly something to consider when applying these algorithms. Finally, we tried to perform attribute selection and remove attributes that act as “noise”. The selection is done as backward elimination, removing one attribute at a time, until there are some improvements in the performance. Here we see that, although removal of attributes did not contribute to AUC, for the AdaBoost algorithm it did not render AUC any lower, even after completely removing two attributes. These attributes are total_time_assignment and total_time_forum, and AdaBoost could build a good model even without those, which is an important consideration when using this particular algorithm.

Another interesting detail is that RandomForests used the preprocessing to produce the best AUC score overall, even though it was the least promising of the three algorithms before preprocessing. Still, we could not strongly prefer any of these algorithms, so the recommendation for practical application is to use one of these three algorithms.

3.3. Grouping students

In order to provide information for better adaptation of the learning material, we defined clustering models that would detect groupings of students with respect to their cognitive styles, as well as overall performance. Each student is described by cognitive styles and the score he achieved on a course. Data is separated by courses (a separate cluster model is defined for each course), so student profiles can be considered for each course separately.

Using this model, one would be able to see which profile of students (defined by cognitive styles) is having difficulties, and whether that profile performed poorly in the past. This way, each teacher is guided in the way he should adapt the learning materials, to enable poorly performing groups to increase performance in his course.

For this purpose we used the k-means clustering algorithm, adapted for use over categorical data, since the data on cognitive styles are categorical. Results of the clustering on different courses are given in Figures 3-6.

For each course, several student profiles are found based on similarities of students by cognitive style.

Fig. 3. Student profiles for course on Mathematics

Figures 3 and 4 show different groups of students on different courses. Each row represents a different cognitive property (described in detail in section 3.1). Each column represents one profile, which is a group (cluster) of students with similar cognitive properties. The variable “Success” describes the success of students of a particular profile, where “P” stands for poor performance, “G” for good, and “E” for excellent. Here we used three instead of two levels of success, in order to have a more detailed description of the found clusters. In essence, any number of success levels could be used, but empirically we detected the highest clarity of cluster interpretations when using three levels. This was also true for the number of clusters, which we set to three.

We see that, for example, in the course on Mathematics shown in Figure 3, students with profile “SEJF” had excellent results, while other profiles had moderate (“good”) success. This means that students with profiles other than “SEJF” had more trouble in delivering the best performance, which might be caused by many factors.
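
A minimal sketch of clustering categorical cognitive-style data is given below. The text only states that k-means was adapted for use over categorical data; the k-modes-style variant shown here (simple-matching distance, per-attribute mode as cluster center) is an assumption about what such an adaptation could look like. The function names, the deterministic initialization and the toy records are hypothetical; of the four-letter codes, only “SEJF” appears in the text.

```python
from collections import Counter


def mismatch(a, b):
    """Simple-matching distance: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_record(records):
    """Per-attribute most frequent value (ties go to the value seen first)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, k, max_iter=20):
    """Cluster categorical records; returns (centers, labels)."""
    # Assumed deterministic start: the first k distinct records.
    centers = list(dict.fromkeys(records))[:k]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by simple-matching distance.
        new_labels = [
            min(range(len(centers)), key=lambda c: mismatch(r, centers[c]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mode of its members.
        for c in range(len(centers)):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if members:
                centers[c] = mode_record(members)
    return centers, labels


def success_by_cluster(labels, success):
    """Count success levels ("P", "G", "E") inside each cluster."""
    table = {}
    for lab, level in zip(labels, success):
        table.setdefault(lab, Counter())[level] += 1
    return table


if __name__ == "__main__":
    # Hypothetical students: cognitive-style code and success level.
    codes = ["SEJF", "RIPT", "SEPT", "SEJF", "REJF", "RIPT", "RIPF", "SEPT"]
    success = ["E", "G", "G", "E", "E", "G", "P", "G"]
    centers, labels = k_modes([tuple(c) for c in codes], k=3)
    for lab, counts in sorted(success_by_cluster(labels, success).items()):
        print("".join(centers[lab]), dict(counts))
```

Each cluster center plays the role of one profile column in the figures, and the per-cluster success counts correspond to the “Success” row. Real data would normally call for several random restarts, which this sketch omits.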


Still, since we know the cognitive profile of these students, we can direct our effort in adapting our course materials to fit that target group. For example, analysis indicates that the course on Mathematics is more suitable to Empiric and Judging cognitive styles. This is probably due to the nature of the subject, which gives the upper hand to Empirics, who are better at deductive thinking and reasoning. The teacher can try to overcome that gap by adapting materials to the opposing cognitive styles. Intuitives might benefit from a learning-by-doing approach, through simulations or games. Perceivers could also find it more appealing to use interactive multimedia material. We could also offer different examination approaches since, for example, Introverts have more difficulty in verbal expression. Of course, to fully leverage this information, advice from a psychologist would be very useful, but this is now possible due to the new information we have on students attending our course.

Fig. 4. Student profiles for course on Psychology

Naturally, different courses will better fit different profiles, because of the different areas of research and different materials offered by the instructors. Figure 4 shows profiles for the course on Psychology, where, interestingly, profiles of “excellent” and “poor” students are similar, while “good” students differ in cognitive styles. It is interesting, though, that all the Introverts are grouped only in the “excellent” group, which suggests that Introvert students tend to understand this course material better. Also, the profile with “good” students is distinct, especially for the Rational students, who always passed the Psychology course with “good”, but not with “excellent” results.

Also, there are occasions when we cannot isolate a complete cognitive profile of successful students, as in Figure 5. However, partial information could be observed, for example, by looking at only the first two cognitive attributes. Here, Empiric, as well as Introvert, are part of only the first cluster of students (first column), all of whom turned out excellent by the end of the course. This links only those attributes (Empiric and Introvert) to the success of students.

Fig. 5. Student profiles for course on Management

For English language (Figure 6), it is interesting that in the first two clusters there were students with different success.

The first cluster mainly contains students with “excellent” and “poor” success. The second cluster mainly contains students with “good” and “poor” success. Both of these clusters have an overlap between cognitive styles. The third cluster resolved this confusion by identifying that students with “poor” success are Empiric and Judging, in contrast to Mathematics, where deductive thinking and reasoning (which characterize this combination of cognitive styles) gave an advantage.


Fig. 6. Student profiles for course on English language

These students should work on developing a “sense” for the language by getting more involved in listening to conversations between native speakers, or in reading and understanding books written by native speakers. On the other hand, Intuitive and Perceiving students have already developed a sense for the language and do not need additional activities.

3.4. Deployment of models

In order to provide educators with the information acquired by using the models defined in the previous sub-sections, we propose a Moodle module that utilizes the defined models. Educators can browse through the list of students involved in their course and see the prediction of success for each student on that course (Figure 7).

Based on this prediction, educators can change their approach in working with students who are predicted not to be excellent, or further involve students predicted as excellent in extra-curricular activities.

Fig. 7. New Moodle report for prediction of each student
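
The report in Figure 7 can be thought of as a per-student listing of the classifier output. The sketch below is a hypothetical, framework-free illustration: the Prediction record, the 0.5 threshold and the wording of the suggested actions are assumptions made for the example and are not taken from the proposed Moodle module.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    student: str        # display name (hypothetical)
    p_excellent: float  # model's estimated probability of "excellent"


def build_report(predictions, threshold=0.5):
    """Return report rows, listing students least likely to excel first."""
    rows = []
    for p in sorted(predictions, key=lambda item: item.p_excellent):
        excellent = p.p_excellent >= threshold
        label = "excellent" if excellent else "not excellent"
        # Assumed mapping from prediction to the two reactions named in the text.
        action = "extra-curricular activities" if excellent else "adapt approach"
        rows.append((p.student, label, action))
    return rows


if __name__ == "__main__":
    demo = [Prediction("Student A", 0.9), Prediction("Student B", 0.2)]
    for row in build_report(demo):
        print(*row, sep=" | ")
```

Sorting by the predicted probability puts the students who may need a changed approach at the top of the list, which is one plausible way to order such a report.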


By selecting the link “Cognitive style profiles on this course” (Figure 8), the educator is provided with a graphical presentation of student groups based on their cognitive styles and success (Figure 9).

Fig. 8. Moodle block for accessing prediction

Fig. 9. Moodle report for description of students’ cognitive style and the success they had on the course

The educator can then go further in the analysis and track each student according to the expected success based on cognitive style. The educator can change his approach to the specific student and attempt to adapt the material to better fit that specific student and his profile.

4. Discussion and future work

Prediction of students’ success and grouping of students are common tasks in educational data mining and are valuable for educators as well as for students. In this paper we presented a case study that involves these tasks for the web usage data gathered from the University of Belgrade distance learning system.

We defined an automatic procedure for extracting data from the Moodle LMS and pre-processing them into an appropriate form for analysis with EDM algorithms. Further, we created classification models accurate enough to predict whether students will have an excellent performance on different courses, based on web usage mining data. Valuable information can also be retrieved from students’ cognitive styles. Describing students with their cognitive styles seems natural in the educational context, and this research encourages further usage of this kind of data. So we built a clustering model that identifies the groups of students with similar cognitive styles and different success. The defined models are evaluated and used for construction of a Moodle module that can help educators for two purposes: distinguishing students they can collaborate with or identifying students that need extra attention on that course, and adapting learning materials to better fit some specific cognitive styles or even recommending courses to students that better fit their cognitive style.

One problem we had to overcome in our study is the lack of data for more thorough analysis. This issue affects many distance learning institutions.50 Usually, distance learning systems do not have many students enrolled, which implies a smaller number of potential participants for data mining research. As the distance learning system was introduced at our University only several years ago, future usage of the system will allow more analysis, and verification of the hypotheses tested in this paper. What is more, these early analyses of the e-learning system would benefit its faster advancement towards maturity, and offer all participants functionalities that make introducing such a system worthwhile.


In future work we plan to evaluate more classification and clustering algorithms in order to make even better fitting of models to web usage data.51,52,53 For this purpose reusable component based algorithms could also be used.54,55,56,57,58 Additionally, enriching the student data with even more descriptors (e.g. data gathered through social network analysis) of their behavior on the educational system is definitely a worthy investment. Specifically, informal learning becomes more and more important because learning can happen anywhere at any time and analysis of informal learning data in distance learning systems provides a growing research area.59 This will open a true potential for analysis of student behavior, more than has ever been possible in the traditional learning context.

Acknowledgements

This research is partially funded by a grant from the Serbian Ministry of Science and Technological Development, project ID III 47003.
This research is partially funded with support from the European Commission through Life Long Learning Programme project TRAILER No. 519141-LLP-1-2011-1-ES-KA3-KA3MP.
This publication reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained therein.

References

1. Y-C Lee, N. Terashima, A Distance Instructional System with Learning Performance Evaluation Mechanism: Moodle-Based Educational System Design, Distance Education Technologies 10 (2) (2012). doi: 10.4018/jdet.2012040104
2. T. Martin-Blas, A. Serano-Fernandez, The role of new technologies in the learning process: Moodle as a teaching tool in Physics, Computers & Education 52 (2009) pp. 35-44. doi: 10.1016/j.compedu.2008.06.005
3. I. Kazanidis, S. Valsamidis, T. Theodosiou and S. Kontogiannis, Proposed framework for data mining in e-learning: The case of Open e-Class, in Proc. IADIS International Conference of Applied Computing, (Rome, Italy, 2009), pp. 254–258.
4. F. J. García-Peñalvo, M. Á. Conde, M. Alier, María J. Casany, Opening Learning Management Systems to Personal Learning Environments, Journal of Universal Computer Science 17(9) (2011), pp. 1222-1240.
5. A. J. Berlanga, F. J. García-Peñalvo, P. B. Sloep, Towards eLearning 2.0 University. Interactive Learning Environments 18 (3) (2010), pp. 199-201.
6. C. Romero, S. Ventura and E. García, Data mining in course management systems: moodle case study and tutorial, Comput. Educ. 51(1) (2008) 368–384.
7. V. Kumar, An Empirical Study of the Applications of Data Mining Techniques in Higher Education, International Journal of Advanced Computer Science and Applications, 2(3) (2011) 80-84.
8. V. Ramesh, P. Parkavi, P. Yasodha, Performance Analysis of Data Mining Techniques for Placement Chance Prediction, International Journal of Scientific & Engineering Research 2 (8) (2011) pp. 1-7.
9. F. Castro, A. Vellido, À. Nebot and F. Mugica, Applying data mining techniques to e-learning problems. Evolution of teaching and learning paradigms in intelligent environment, 62 (2007) pp. 183–221.
10. A. C. Romero, and A. S. Ventura, Educational data mining: A survey from 1995 to 2005, Journal of Expert Systems Applications, 33(1) (2007) 135-146.
11. C-H Weng, Mining fuzzy specific rare itemsets for education data, Knowledge-Based Systems 24 (5) (2011) pp. 697-708.
12. C. Romero and S. Ventura, Educational data mining: a review of the state-of-the-art, IEEE Trans. Syst. Man Cybernet. C Appl. Rev., 40(6) (2011) 601–618.
13. A. Krueger, A. Merceron and B. Wolf, A Data Model to Ease Analysis and Mining of Educational Data, in Proc. Third International Conference on Educational Data Mining, (USA, Pittsburgh, 2010) pp. 131-140.
14. Y-H Wang, H-C Liao, Data mining for adaptive learning in a TESL-based e-learning, Expert Systems with Applications 38 (6) (2011), pp. 6480-6485.
15. V. Ramesh, P. Parkavi, P. Yasodha, Performance Analysis of Data Mining Techniques for Placement Chance Prediction, International Journal of Scientific & Engineering Research 2 (8) (2011).
16. C. Vialardi, J. Chue, J.P. Peche, G. Alvarado, B. Vinatea, J. Estrella and Á. Ortigosa, A data mining approach to guide students through the enrollment process based on academic performance, User modeling and user-adapted interaction 21 (1-2) (2011), pp. 217-248. doi: 10.1007/s11257-011-9098-4.
17. S. Kotsiantis, K. Patriarcheas and M. Xenos, A combinational incremental ensemble of classifiers as a technique for predicting students’ performance in distance education, Knowledge-Based Systems, 23(6) (2010) 529-535.
18. K. Kuk, P. Spalevic, S. Ilic, M. Caric, Z. Trajcevski, A Model for Student Knowledge Diagnosis through Game Learning Environment, Technics Technologies Education Management – TTEM, 7 (1) (2012) 103-110.
19. S. Kotsiantis, Use of machine learning techniques for educational proposes: a decision support system for forecasting students’ grades, Artificial Intelligence Review, (Online First) (2011) 1-14.


20. N. Myller, J. Suhonen and E. Sutinen, Using Data Mining for Improving Web-Based Course Design, in Proc. International Conference on Computers in Education, (USA, Washington, 2002) pp. 959-964.
21. D. Traynor and J.P. Gibson, Synthesis and Analysis of Automatic Assessment Methods in CS1, in Proc. The 36th SIGCSE Technical Symposium on Computer Science Education SIGCSE’05, (ACM Press, Louis Missouri, USA, 2005) pp. 495-499.
22. B. Minaei-bidgoli, D. A. Kashy, G. Kortmeyer and W. F. Punch, Predicting student performance: an application of data mining methods with an educational Web-based system, in Proc. 33rd International Conference on Frontiers in Education, (Colorado, Westminister, 2003) pp. 13-18.
23. M. Delgado, E. Gibaja, M.C. Pegalajar and O. Pérez, Predicting Students' Marks from Moodle Logs using Neural Network Models, in Proc. International Conference on Current Developments in Technology Assisted Education, (Sevilla, Spain, 2006) pp. 586-590.
24. F.A. Bachtiar, W.E. Cooper, K.K. Kamei, Student grouping by neural network based on affective factors in learning English, in Proc. International Conference on e-Education, Entertainment and e-Management (ICEEE), 2011. doi: 10.1109/ICeEEM.2011.6137792.
25. A. Drigas, J. Vrettaros, An Intelligent Tool for Building e-Learning Contend-Material Using Natural Language in Digital Libraries. WSEAS Transactions on Information Science and Applications 5(1) (2004) 1197-1205.
26. K. Hammouda, M. Kamel, Data Mining in e-Learning. In: Pierre, S. (ed.): e-Learning Networked Environments and Architectures: A Knowledge Processing Perspective. Springer-Verlag, Berlin Heidelberg New York (2005).
27. J. Tane, C. Schmitz, G. Stumme, Semantic Resource Management for the Web: An e-Learning Application. In: Fieldman, S., Uretsky, M. (eds.): The 13th World Wide Web Conference 2004, WWW2004. ACM Press, New York (2004) pp. 1-10.
28. S. Ayesha, T. Mustafa, A.R. Sattar, M.I. Khan, Data Mining Model for Higher Education System, European Journal of Scientific Research 43(1) (2010), pp. 24-29.
29. E. Ayers, R. Nugent, and N. Dean, A Comparison of Student Skill Knowledge Estimates, in Proc. International Conference On Educational Data Mining, (Cordoba, Spain, 2009), pp. 1-10.
30. D. Zakrzewska, Cluster analysis for user’s modeling in intelligent e-learning systems, in Proc. International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems: New Frontiers in Applied Artificial Intelligence (IEA/AIE '08), eds. N. T. Nguyen, L. Borzemski, A. Grzech, and M. Ali, (Springer-Verlag, Berlin, Heidelberg, 2008) pp. 209-214.
31. R.M. Felder and L.K. Silverman, Learning and teaching styles in engineering education, Eng. Educ., 78 (7) (1988) 674–681.
32. A. Heikkila, N. Markku, J. Nieminen, K. Lonka, Interrelations among university students’ approaches to learning, regulation of learning, and cognitive and attributional strategies: a person oriented approach, High Educ 61 (2011), pp. 513–529. doi: 10.1007/s10734-010-9346-2
33. D. Perera, J. Kay, I. Koprinska, K. Yacef and O. R. Zaïane, Clustering and Sequential Pattern Mining of Online Collaborative Learning Data, IEEE Transaction on Knowledge and Data Engineering, 21 (6) (2009), pp. 759-772.
34. S.Y. Chen and X. Liu, Mining students' learning patterns and performance in Web-based instruction: a cognitive style approach, Interactive Learning Environments 19 (2) (2011). doi: 10.1080/10494820802667256
35. J.M. Adán-Coello, C.M. Tobar, E.S.J. de Faria, W.S de Menezes, R.L. de Freitas, Forming Groups for Collaborative Learning of Introductory Computer Programming Based on Students’ Programming Skills and Learning Styles, International Journal of Information and Communication Technology Education 7 (4) (2011). doi: 10.4018/jicte.2011100104
36. S. Jevremovic, Implementation of the adaptive system in electronic learning, Management 14 (53) (2009), pp. 57-61.
37. C-M. Chen, M-C. Chen, Mobile formative assessment tool based on data mining techniques for supporting web-based learning, Computers & Education 52 (2009), pp. 256–273. doi: 10.1016/j.compedu.2008.08.005
38. I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz and T. Euler, YALE: Rapid prototyping for complex data mining tasks, in Proc. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (Philadelphia, USA, ACM Press, 2006) pp. 935-940.
39. J. Cardoso, Developing Course Management Systems Using The Semantic Web, The Semantic Web, Semantic Web and Beyond, 2008, Volume 6, Part IV, 169-188. doi: 10.1007/978-0-387-48531-7_8
40. M. Minović, M. Milovanović, I. Kovačević, J. Minović and D. Starčević, Game design as a learning tool for the course of Computer Networks, International Journal of Engineering Education, 27(3) (2011) 498-508.
41. I.B. Myers and M.H. McCaulley, Manual: a guide to the development and use of the Myers-Briggs Type Indicator, (Consulting Psychologists Press, Palo Alto, CA, 1985).
42. I.B. Myers, M.H. McCaulley, N.L. Quenk and A.L. Hammer, MBTI Manual. A guide to the Development and Use of Myers-Briggs Type Indicator, (Consulting Psychologists Press, Palo Alto, CA, 1998).
43. C.G. Jung, Psychological Types. The Collected Works, vol. 6, (Routledge and Kegan Paul, London, UK, 1971).
44. J.G. Carlson, Recent Assessments of Myers-Briggs Type Indicator, Journal of Personality Assessment, 49(4) (1985) 356-365.
45. R. Colomo Palacios, E. Tovar Caro, A. García Crespo, & J.M. Gómez Berbís, Identifying Technical Competences of IT Professionals: The Case of Software Engineers, International Journal of Human Capital and Information Technology Professionals 1(1) (2010), pp. 31-43.
46. R. Colomo-Palacios, E. Fernandes, P. Soto-Acosta & M. Sabbagh, Software product evolution for Intellectual Capital Management: The case of Meta4 PeopleNet, International Journal of Information Management 31(4) (2011), pp. 395-399.
47. A. García-Crespo, R. Colomo-Palacios, J.M Gómez-Berbís, & M. Mencke, BMR: Benchmarking Metrics Recommender for Personnel issues in Software Development Projects. International Journal of Computational Intelligence Systems 2(3) (2009), pp. 257-267.
48. S. Westlund, Leading Techies: Assessing Project Leadership Styles Most Significantly Related to Software Developer Job Satisfaction. International Journal of Human Capital and Information Technology Professionals 2(2) (2011), pp. 1-15. doi: 10.4018/jhcitp.2011040101
49. O.C.S. Tzeng, S.L. Ware, J-M. Chen, Measurement and Utility of Continuous Unipolar Ratings for the Myers-Briggs Type Indicator, Journal of Personality Assessment, 53(4) (1989) 727-738.
50. C. Romero, S. Ventura, P. G. Espejo and C. Hervás, Data Mining Algorithms to Classify Students, in Proc. 1st International Conference on Educational Data Mining (EDM’08), (Montreal, Canada, 2008) pp. 8–17.
51. P. Lingras, M. Joshi, Experimental Comparison of Iterative Versus Evolutionary Crisp and Rough Clustering, International Journal of Computational Intelligence Systems, 4(1) (2011), pp. 12-28.
52. Y.-C. Lin, T.-K. Wu, S.-C. Huang, Y.-R. Meng, W.-Y. Liang, Rough Sets as a Knowledge Discovery and Classification Tool for the Diagnosis of Students with Learning Disabilities, International Journal of Computational Intelligence Systems, 4(1) (2011), pp. 29-43.
53. M. Matijaš, M. Vukićević, S. Krajcar, Supplier Short Term Load Forecasting Using Support Vector Regression and Exogenous Input, Journal of Electrical Engineering 62(5) (2011) pp. 280-285. doi: 10.2478/v10187-011-0044-9
54. B. Delibašić, M. Jovanović, M. Vukićević, M. Suknović, Z. Obradović, Component-based decision trees for classification, Intelligent Data Analysis 15 (5) (2011) pp. 671-693. doi: 10.3233/IDA-2011-0489
55. M. Suknovic, B. Delibasic, M. Jovanovic, M. Vukicevic, D. Becajski-Vujaklija and Z. Obradovic, Reusable components in decision trees induction algorithms, Computational Statistics (2012). doi: 10.1007/s00180-011-0242-8.
56. B. Delibasic, K. Kirchner, J. Ruhland, M. Jovanovic, M. Vukicevic, Reusable components for partitioning clustering algorithms. Artificial Intelligence Review 32 (1-4) (2009) pp. 59-75. doi: 10.1007/s10462-009-9133-6
57. M. Vukicevic, M. Jovanovic, B. Delibasic, S. Isljamovic, M. Suknovic, Reusable component-based architecture for decision tree algorithm design, International Journal on Artificial Intelligence Tools (2012). doi: 10.1142/S0218213012500224
58. B. Delibasic, M. Vukicevic, M. Jovanovic, K. Kirchner, J. Ruhland, M. Suknovic, An architecture for component-based design of representative-based clustering algorithms, Data & Knowledge Engineering (2012). doi: 10.1016/j.datak.2012.03.005
59. B. Chen and T. Bryer, Investigating Instructional Strategies for Using Social Media in Formal and Informal Learning, The International Review of Research in Open and Distance Learning, ISSN: 1492-3831, 13 (1) (2012).
