2020-Student Performance Prediction Based On Blended Learning
Abstract—Contribution: This article explored blended learning by implementing a student-centered teaching method based on the flipped classroom and small private online course (SPOC). The impact of general online learning behavior on student performance was analyzed. This work is practical and provides enlightenment for learning analysis and individualized teaching in blended learning.
Background: Providing individualized teaching in a large class is an effective way to improve teaching quality, but the traditional teaching method makes it difficult for teachers to learn about each student's learning situation. Blended learning offers the possibility of individualized teaching for teachers. The combination of flipped classroom and SPOC is a good way to implement blended learning, but few studies have verified the predictability of learning performance in such a scenario to explore individualized teaching.
Intended Outcomes: Students' behavior in blended learning can be used to predict their learning outcomes, and the implementation method is reproducible. Teachers can implement individualized teaching in blended learning.
Application Design: The learning activities were designed and reconstructed to create a blended learning scenario; data that depict students' learning behavior were collected and used to predict their performance by a multiple regression model. Student performance was measured by the final offline exam, and its predictability at 1/4, 1/2, and 3/4 of the semester was tested for early intervention.
Findings: The results show that students' online behavior can be predictors of their performance, and with the advance of the course, the predicted results are more stable and reliable.

Index Terms—Blended learning, flipped classroom, individualized teaching, massive open online courses (MOOCs), small private online courses (SPOCs), student assessment, student performance.

Manuscript received March 13, 2019; revised November 12, 2019, January 20, 2020, and May 3, 2020; accepted July 4, 2020. This work was supported in part by the Graduate Education Innovation Program of Guangdong Province under Project 2015JGXM-ZD04 and in part by the Guangdong Higher Education Teaching Reform Project "Research on Constructivism and Its Application in the Teaching of Computer Networks." (Corresponding author: Zhuojia Xu.)
Zhuojia Xu and Hua Yuan are with the Communication and Computer Network Lab of Guangdong, School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China (e-mail: [email protected]; [email protected]).
Qishan Liu is with the Theoretical Physics Department, Johannes Gutenberg-Universität Mainz, 55099 Mainz, Germany (e-mail: [email protected]).
Digital Object Identifier 10.1109/TE.2020.3008751
0018-9359 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Wollongong. Downloaded on August 14,2020 at 14:38:59 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

I. INTRODUCTION
IN HIGHER education, traditional face-to-face teaching is still the mainstream [1]. It is teacher centered: teachers are the providers of knowledge and students are the recipients of knowledge. As technology advances, the chalk in the teacher's hand has been gradually replaced by a laser pointer with remote control, but the student's role as the audience has not essentially changed. The traditional teaching method is increasingly difficult to adapt to the development of the new era, and its quality is hard to guarantee. College students of this generation are digital natives who are growing up in the ever-changing digital and Internet age. It is not easy for them to adapt to traditional teaching. First, the Internet has made it easier for students to access information and knowledge [2]. The sources of knowledge are diverse, and teachers are no longer the only imparters of knowledge. Once students are not satisfied with the lecture, the content, the way of teaching, or even the lecturer's accent can become a reason for not listening, sleeping in class, or skipping classes. Second, mobile phones have become basic equipment for college students [2]. Excessive cellphone use turns students into classroom phubbers, affecting their attention and learning in class [3]. Traditional teaching needs to change with the interactions between technology and learning. Besides, large-class education is universal in some countries such as China [4]. Traditional large-class education may have difficulty in providing individualized teaching and could even have negative effects on student academic performance [5]–[7].

The emergence of the massive open online course (MOOC) seems to bring a wave of reform to traditional education. A MOOC attracts students at diverse levels of knowledge and abilities, and it covers a wide range of knowledge and reduces the depth of knowledge to set a lower threshold for learning [8]. However, low completion, difficulty in mutual recognition of credits, low social recognition, and dishonesty [9] in MOOCs have prompted college educators to introspect and explore constantly. The small private online course (SPOC) sets off an educational revolution of classroom teaching. SPOC is aimed at small-class teaching and is more suitable for further imparting professional knowledge [8]. Some SPOC practices have achieved good results, such as Copyright [10] and The Architectural Imaginary [11] set up by Harvard University and Circuit Principle of Tsinghua University [12]. In recent years, SPOC has also been popular in blended learning [13]–[15].

Blended learning based on SPOC provides an opportunity for teachers to explore personalized teaching, but the practice has also exposed some problems. First, most teachers use questionnaires to learn about students' learning behavior and
their attitude toward the curriculums [16], which is time consuming and not timely for teaching feedback. Is there a more direct and automated way to help teachers learn about students' learning and their performance for timely intervention? Second, in the learning process, students' learning behaviors, such as study time and assignment grades, are recorded and shown as diagrams on the SPOC platforms. What is the relationship between these data and student performance? Can this information be used to predict their performance to help individualized teaching?

To address the above issues, this article created a blended course, Computer Networks, based on SPOC and the flipped classroom to explore the possibility of using online learning behavior to predict student performance in blended learning. Considering the blend of online and offline learning, student performance was measured by the offline final exam and predicted by online learning data, which were generated on the SPOC platform. Multiple linear regression was used to analyze the impact of online behavior on student performance and the possibility of early prediction. It is noteworthy that student performance is not necessarily predictable in this case, as offline learning behavior is not taken into account and online learning data involve only part of the learning activities in blended learning. If student performance can be predicted only through online learning behavior, teachers can save time and effort in collecting offline data and make full use of educational data mining to assist personalized teaching in blended learning.

There are previous works on student performance prediction in blended learning [17]–[19], but most of them discussed this issue without teaching context. As the types of blended learning vary with the dimensions of the blend [20] and the prediction is data driven [21], it is not practical to predict student performance without a specific context. Besides, online learning data in such research were mostly collected through private and hidden learning logs [17]–[19], [22]–[25], which are unlikely to be obtained by teachers, bringing more difficulties to practice. Compared to related work, the contributions of this article are as follows.
1) This article explores the possibility of using online learning behavior to predict student performance and provide personalized teaching, providing an example for the design of blended learning and the application of educational data mining.
2) The online behavioral data in this article were general and accessible for teachers, making the research more practical.
3) Linear and nonlinear models were compared to further discuss the generalization of the predictions.

II. RELATED WORK
The prediction of student performance has been a hot topic in online learning. Many studies are devoted to finding online learning behavior that can be used to predict student performance. In [22], six types of features were extracted from click-stream logs within a MOOC and used to predict students' grades on their next assessments. Brinton et al. [23] developed frameworks to extract event-based and position-based sequences from student video-watching clickstreams in MOOC. Their experiments demonstrated that video-watching behavior can help improve student performance prediction. Due to declining participation over time in MOOC [26], Jiang et al. [27] utilized students' assessment performance and social data in week 1 to predict whether students would obtain certificates.

In addition to MOOC, there is research on the prediction of student performance in SPOC. The work [28] developed a linear regression model and a deep learning model to predict student performance in SPOC, and it stated that the model can be generalized for MOOC or other online learning. Wan et al. [29] used logistic regression to predict whether students would pass the weekly test in SPOC. Marcos et al. [25] used features related to platform visiting and interactions with videos and quizzes to predict whether a student can pass the admission test, providing enlightenment for student performance prediction in SPOC-based blended learning. Overall, there is a lack of studies on student performance prediction in SPOC-based blended learning.

Some studies predicted student performance in blended learning by using data directly from the learning management system (LMS). Raga and Raga [17] developed a deep neural network model for early prediction of student performance in blended learning. Kim et al. [18] developed linear and nonlinear prediction models based on the pedagogical types of blended learning. Figueira [19] used the number of online course accesses, coverage of digitally provided learning material, and the differences between student sequences and their golden standard to predict the online score.

In the literature, student performance is often quantified by final grade [17]–[19], [24], [30], [31], course engagement [32], certificate obtaining [33], or student dropouts [34], [35]. Linear models, such as multiple linear regression [18], [24], [30], [31], and nonlinear models, such as gradient boosting decision tree (GBDT) [36], K-nearest neighbors (KNN) [37], [38], and decision tree (DT) [39]–[41], have been widely used in the prediction of student performance. Multiple linear regression is popular due to its simplicity and convenience in analyzing the influence of multiple factors. Students' interactions with the course, such as video-watching clickstreams, visiting boards, login frequency, the number of posts, and behavior in online quizzes, are generally considered as predictors. However, most of the studies tend to use hidden and more detailed learning logs [17]–[19], [22]–[24], which may be hard to obtain due to privacy protection, so their proposals may be difficult to apply widely in practice. In addition, some studies tend to pay attention to the prediction accuracy of student performance while ignoring the teaching context [17]–[19]. As argued by [21], the predictors of student performance are data driven and the results are closely related to the teaching context.

Based on related work, this article mainly used multiple linear regression to analyze the predictability of student performance and verified the generality of the prediction through typical nonlinear models, namely GBDT, DT, and KNN. Students' online learning data were more general and
TABLE I
MAIN STUDENT BEHAVIORAL DATA SHEET GENERATED DURING THE BLENDED LEARNING PROCESS
A. Data Preprocessing
The original data of student behavior on the SPOC plat-
form must be preprocessed before analysis. Some features in
Table I were preprocessed in the following way to produce
more descriptive characteristics.
1) Time of the First Access: The time of first access
describes how soon students enter the course, which may be
linked to the enthusiasm for learning. The interval between the
time when SPOC was launched and the student’s first access
was calculated and defined as the time delay of study.
2) Submission Time: The submission time describes
whether the student procrastinates on the assignment, which
may be linked to the student’s learning attitude and initiative.
The time interval between the submission time of MCQ and
its deadline was calculated and defined as the time delay of
MCQ.
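The two delay features defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the timestamps are invented, the helper name `hours_between` is hypothetical, hours are an arbitrary choice of unit, and the sign convention for the MCQ delay (deadline minus submission time, so that an earlier submission gives a larger value) is an assumption rather than something the article specifies.

```python
from datetime import datetime


def hours_between(start: datetime, end: datetime) -> float:
    """Return the signed interval from start to end, in hours."""
    return (end - start).total_seconds() / 3600.0


# Hypothetical timestamps for one student (not taken from the article).
spoc_launch = datetime(2019, 3, 4, 8, 0)
first_access = datetime(2019, 3, 6, 20, 0)
mcq_deadline = datetime(2019, 3, 17, 23, 59)
mcq_submitted = datetime(2019, 3, 16, 23, 59)

# Time delay of study: SPOC launch -> first access.
study_delay_h = hours_between(spoc_launch, first_access)

# Time delay of MCQ: submission -> deadline (assumed sign convention).
mcq_delay_h = hours_between(mcq_submitted, mcq_deadline)

print(study_delay_h, mcq_delay_h)
```

With these invented timestamps the student first enters the course 60 hours after launch and submits the MCQ 24 hours before its deadline.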
3) Number of Submissions and Scores of MCQ: Each student had two opportunities to submit an MCQ assignment. Their final MCQ score was the higher one of the two submissions. For students who did not submit their assignments, their number of submissions was recorded as 0, regardless of whether they used the second opportunity to make up for the assignment.

After the preprocessing above, feature 1 in Table I becomes the time delay of study and feature 6 becomes the submission delay.

As mentioned in Section III, each student had multiple assignment-related records. In order to facilitate the prediction of the model, the average values of these quantities were calculated, as follows:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{3}$$

where $x_i$ can be one of the assignment features: the score, the number of submissions, the submission delay, the used time for MCQ, etc. The variable $n$ represents the number of assignments for all units. For example, when a student's MCQ scores are surveyed for a semester that has eight online tests in total, $n = 8$. Formula (3) represents the student's average score for a semester. Once a student misses an MCQ assignment, his/her final score is reduced, so the average score penalizes missing submissions.

The range of each feature is shown in Table I. In the last step, each feature value in Table I was normalized to the interval [0, 1] by min–max normalization so that all features share the same scale.

B. Empirical Study
The Matplotlib module [42] was used to visualize data. In Fig. 2, the distribution of study time, the used time for MCQ, submission delay, and the number of submissions in the spring and fall classes are shown.

Fig. 2. Distribution of (a) study time, (b) time for MCQ, (c) submission delay, and (d) number of submissions in two semesters.

In Fig. 2(a), the distribution of the spring class in study time is more uniform than that of the fall class. The study time of the fall class is shorter than that of the spring class, which is consistent with the fact that there were fewer requirements for the fall class to study online than for the spring class. In
Fig. 2(b), the used time for MCQ in the spring class is positively skewed, and the used time of most students is shorter, while in the fall class, the distribution is closer to a normal distribution, and the total answer time is longer than that in the spring class because there was no limit on the answer time in the fall class. The distribution of submission delays is similar between the two classes, and some students like to submit assignments before the deadline. In Fig. 2(d), the distribution of the spring class is relatively narrow, and the number of submissions is concentrated in 0.5–0.8 (note that this is the normalized value, so it is less than 1), while the variance of the fall class distribution is larger.

Overall, the spring class seems to be more active online than the fall class, which is consistent with expectations: the online learning requirements of the spring class were greater than those of the fall class.

C. Prediction of Student Performance
Linear analysis was used to discover the correlation between student behavior and performance. The Python modules Statsmodels [43] and SciPy [44] were used for the analysis of learning behavior.

1) Correlation Analysis: The Pearson coefficient describes the correlation between dependent variables and independent variables. Table II lists the Pearson coefficients with their p-values.

TABLE II
PEARSON COEFFICIENT WITH P-VALUE AND VIF BETWEEN THE VARIABLES AND THE OFFLINE FINAL GRADE

Pearson coefficients between the time delay of study and the final grades of the two classes were −0.258 and −0.386, respectively. This feature describes how soon students enter the course, and the result indicates that the longer the delay, the worse his/her academic performance is likely to be. It can be observed that the submission delay had a positive effect on student performance in the two classes, indicating that the earlier the homework is submitted, the better student academic performance may be. Except for the time delay of study, other variables were positively correlated with the final grade, ranging from weak correlation [0.2, 0.4) and medium correlation [0.4, 0.6) to strong correlation [0.6, 0.8). The correlation between the MCQ grade and the final grade was 0.3 in the spring class but was 0.679 in the fall class. The MCQ grade is more correlated with student performance in the fall class. Apart from the grades of MCQ and peer review, the correlation between most of the online behavior and the final grade in the fall class is weaker than that of the spring class, which was in line with the expectation that their online learning requirements were fewer than those for the spring semester and assignment-related activities became their major activities.

Through correlation analysis, teachers can understand which behaviors are correlated with student performance and screen learning behaviors for the prediction of student performance.

2) Prediction: The variance inflation factor (VIF) quantifies the severity of multicollinearity. A VIF larger than 5 indicates that the variable is collinear with other variables. The VIF was calculated to detect collinear variables before using multiple linear regression.

In Table II, the VIF of each variable was lower than 5, indicating no severe multicollinearity between the variables. Since there was no severe multicollinearity between the variables and there existed correlations between the variables and student performance, all features were considered in the prediction.

Multiple linear regression with forward selection used ordinary least squares (OLS) to learn the coefficient of each variable, whose significance was tested by Student's t-test. When fitting the model, forward selection would select the variable whose inclusion improves the fit significantly. The coefficients,
standard errors, and p-values of the variables in the model are shown in Table II.

The R² and adjusted R² of the spring class were 0.579 and 0.524, respectively, and were 0.567 and 0.547 in the fall class, respectively. Compared with the previous prediction models that had a reasonable amount of explained variance ranging from 0.22 to 0.52 [30], the results in this experiment show that online learning behavior is predictive for student performance.

It can be seen from the "coefficient" and "p of coeff" columns in Table II that the study time, the number of posts, MCQ grade, the used time for MCQ, the submission delay, and the grade of peer review are predictors in the spring class, but the coefficient of the MCQ grade was not significant, indicating that the other features are sufficient to predict the final grade. The higher the grade of peer review, the longer the study time, and the sooner the assignments are submitted, the better their final academic performance may be. In the fall class, the number of posts, the MCQ grade, and the grade of peer review become predictors. Although only three features played a role in the prediction model of the fall class, the prediction effect was not weak. Assignment-related features such as the MCQ grade and the grade of peer review seem to be important predictors as they appeared in the prediction models of both classes.

What is interesting is that in the spring class, the MCQ grade had a positive correlation with student performance, but its coefficient was negative. This does not mean the correlation between the MCQ grade and student performance has changed from positive to negative. Correlation analysis is like single-factor analysis, as it can provide point-to-point relationships for teachers to understand which behaviors are positive and which are negative. But when different behaviors are combined, they may influence each other and have different effects on student performance. The prediction allows teachers to understand the comprehensive effect of the combination and the performance of each student.

Fig. 3 shows the predicted grade and the true grade intuitively. The horizontal axis identifies ten tests, each of which randomly selects 20% of the original data as the predicted samples. The vertical axis represents the normalized value of the final grade. As shown in Fig. 3, most of the predicted grades are close to the true values, which means student performance can be well predicted by multiple linear regression.

Fig. 3. Predicted results and ground-truth values in the (a) spring class and (b) fall class.

In the experiment, the eight students with the lowest grades and the eight students with the highest grades were selected to further illustrate the fitting results of the model. In the spring class, predicted grades and true grades of the worst eight were both lower than 0.5 (note that the score was normalized between 0 and 1). The worst and the fourth-worst students got the predicted grades closest to their true grades. The true grade of the worst student was 0.1 while his/her predicted grade was 0.0, and the true grade of the fourth-worst student was 0.17 while his/her predicted grade was 0.176. The top eight in the spring class had high predicted grades, but the gaps between the predicted and true values were larger than those of the worst students. Overall, the prediction model of the spring class had a good prediction for the worst students, which means it can better discover at-risk students. In the fall class, the predicted grades of the worst and the top eight were both close to their true grades, with the largest gap of 0.162 and a minimum of 0.004. Students with final grades no more than 0.5 also had predicted grades that were no more than 0.5. The prediction model of the fall class had a good prediction for both poor and good students.

The experimental results show that online learning behavior can predict student performance. Teachers can learn about students' learning situations and discover at-risk students through the prediction. If the prediction can be realized earlier, it may be more helpful for personalized guidance.

D. Prediction in the Learning Process
To verify the predictability of student performance in the early stage, the behavioral data generated in the 1/4 semester, midsemester, and 3/4 semester were used to predict the final exam grade. It is essential to state that in the 1/4 semester, only
part of the records was ready for analysis, namely the MCQ assignments in the first and second units as well as the grades of the first peer review. In the midsemester, the records of MCQ assignments from unit one to unit four and the grades of two peer reviews were used. In the 3/4 semester, MCQ assignments from unit one to unit six and the grades of all peer reviews were accessible. In this process, the records of the study time and the numbers of posts were not accessible. The number of features may differ across learning stages. Since the adjusted R² takes the number of features into account, the experiment only examines this indicator.

Similarly, the forward stepwise regression was run to select predictors and produce prediction formulas. For comparison, the adjusted R² of multiple linear regressions in the 1/4 semester, midsemester, 3/4 semester, and the whole semester are plotted in Fig. 4. In the 1/4 semester, the adjusted R² of the spring class was only 0.133, and that of the fall class was higher, at 0.338. In the midsemester, the adjusted R² of the spring class increased by about 0.25 compared with its previous result, whereas that of the fall class increased slightly, by about 0.08. As the course advanced, the adjusted R² of the two classes rose to 0.466 and 0.479, respectively. Though only part of the data were used in the 3/4 semester, its results were close to those of the whole semester.

Fig. 4. Adjusted R² in the 1/4, 1/2, 3/4 semesters, and the whole semester.

The result shows that student performance can be predicted at the early stage, but the earlier the stage, the more unstable the prediction results are. In the middle stage, student performance can be preliminarily predicted to help teachers adjust teaching as soon as possible, and intervention can be performed to achieve personalized guidance and teach students in accordance with their aptitude.

E. Comparing With Nonlinear Models
Based on the related work, this article uses GBDT, DT, and KNN to verify the generalization of the prediction. R² is a common indicator to examine the goodness of fit of nonlinear prediction models, and therefore, the experiment only compared R² of these nonlinear models with that of the multiple linear regression. The nonlinear models were implemented through the machine learning tool [41] and fitted with default parameters.

In the spring class, R² of GBDT reached 0.999, DT reached 1.00, and for KNN with the number of neighbors set to 5, R² was 0.540. In the fall class, R² of GBDT reached 0.994, DT reached 1.00, and for KNN with the number of neighbors set to 5, R² was 0.499. R² of GBDT and DT was high, and R² of KNN was close to that of linear regression. GBDT and DT are more robust; they can perfectly fit the nonlinear relationship between learning behavior and student performance, but KNN cannot fit such a nonlinear relationship as well as GBDT and DT. The performance of KNN may be affected by the number of neighbors [45]. These results verify the generalization of the prediction. Although the performance of the nonlinear models is better than that of linear regression, their interpretability is poor as they utilize nonlinear combinations of behaviors and their nonlinear relationships with student performance.

V. DISCUSSION
This article implemented blended learning in two different semesters to explore the predictability of student performance and the possibility of early intervention.

The experiment found that online learning data involving part of the learning activities of blended learning can be used to predict student performance, and assignment-related features were potentially important predictors, which means teachers can save time and effort in collecting offline data. They can use general online learning behavior, especially assignment-related behavior, to learn about students' learning situations.

Student performance can also be predicted at the early stage, but the earlier the stage, the more unstable the prediction results are. The impact of learning behavior on student performance in the prediction is different from that in the correlation analysis, which means the combination of learning behaviors has different effects on student performance. Such effects are complex as they may be related to the learning attitude, learning method, learning process, accidental factors, etc. The correlation analysis provides candidate factors for prediction and allows teachers to learn about point-to-point relationships between behavior and performance. The prediction of student performance allows teachers to know who would have poor performance and who would have good performance so that teachers have the opportunity to learn about students' situations in advance and provide personalized intervention and guidance in offline face-to-face teaching.

VI. FUTURE WORK
There is still some work to explore in the future. The experiments can be developed to discover abnormal students so as to provide more abundant information for intervention. In terms of early intervention, the relationship between intervention time and predictive stability can be further explored through a large number of experiments. It would be possible to recommend intervention time intelligently based on the relationship.

It is believed that with the opening up of educational information, more and more teachers can collect educational data and participate in the research, making more contributions to improve the quality of higher education.
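As a rough illustration of how the relationship between intervention time and predictive stability might be examined, the following Python sketch picks the earliest stage whose adjusted R² reaches a chosen fraction of the whole-semester value. It is not the authors' method. The function names and the 0.85 threshold are hypothetical, and the midsemester values (0.383 and 0.418) are approximations derived from the reported increases of "about 0.25" and "about 0.08"; the other values are the adjusted R² figures reported in Section IV. The standard adjusted R² formula is included for reference.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Standard adjusted R^2: 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def earliest_stable_stage(stage_scores, final_score, ratio=0.85):
    """Return the first stage whose score reaches ratio * final_score.

    stage_scores is a chronologically ordered list of (stage, score) pairs.
    The ratio is a hypothetical threshold, not a value from the article.
    """
    for stage, score in stage_scores:
        if score >= ratio * final_score:
            return stage
    return None


# Adjusted R^2 per stage; the "1/2" entries are approximate (see lead-in).
spring = [("1/4", 0.133), ("1/2", 0.383), ("3/4", 0.466)]
fall = [("1/4", 0.338), ("1/2", 0.418), ("3/4", 0.479)]

# Whole-semester adjusted R^2: 0.524 (spring) and 0.547 (fall).
spring_stage = earliest_stable_stage(spring, final_score=0.524)
fall_stage = earliest_stable_stage(fall, final_score=0.547)
print(spring_stage, fall_stage)
```

Under this assumed threshold, both classes first reach a comparable level of explained variance at the 3/4 stage; a looser threshold would select an earlier stage, which is the trade-off between early intervention and stable prediction discussed above.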