
Can We Predict Student Performance Based On Tabular and Textual Data

This document discusses a study that aims to predict student performance using a newly collected multimodal dataset that combines student behavior data and course comments. The authors propose a Transformer-based framework for data fusion, demonstrating improved classification performance compared to existing methods. Empirical results indicate that the inclusion of textual features significantly enhances the model's effectiveness and generalization capabilities.


IEEE RELIABILITY SOCIETY SECTION

Received 22 July 2022, accepted 7 August 2022, date of publication 16 August 2022, date of current version 22 August 2022.
Digital Object Identifier 10.1109/ACCESS.2022.3198682

Can We Predict Student Performance Based on Tabular and Textual Data?
YUBIN QU 1,2, FANG LI 3, LONG LI 4 (Member, IEEE),
XIANZHEN DOU 2, AND HONGMEI WANG 5
1 Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China
2 School of Information Engineering, Jiangsu College of Engineering and Technology, Nantong 226001, China
3 School of Marxism, Jiangsu College of Engineering and Technology, Nantong 226001, China
4 School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
5 School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China

Corresponding author: Hongmei Wang ([email protected])


This work was supported in part by the Jiangsu Province Education Science “14th Five-Year Plan” under Project D/2021/01/133, in part
by the Philosophy and Social Science Research Projects in Jiangsu under Grant 2020SJB0836, in part by the Nantong Science and
Technology Project under Grant JC2021124, in part by the Guangxi Key Laboratory of Trusted Software under Grant kx202046, in part by
the Scientific Research Projects of Jiangsu College of Engineering and Technology under Grant GYKY/2020/4, in part by the Research
Project of Modern Educational Technology in Jiangsu Province under Grant 2021-R-94735, in part by the Special Project of China Higher
Education Association under Grant 21SZYB23, in part by the Fifth Jiangsu Province Vocational Education Teaching Reform Research
Project under Grant ZYB686, in part by the Special Foundation for Excellent Young Teachers and Principals Program of Jiangsu Province,
and in part by the Qing Lan Project of Jiangsu Province.

ABSTRACT With the emergence of new teaching systems, such as Massive Open Online Courses
(MOOCs), massive amounts of data are constantly being collected. These teaching data hold huge
value. However, the data, including both student behavior data and student comments about the
course, are not processed to discover models and paradigms that can be useful for school management.
There is no multimodal dataset with tabular and textual data for educational data mining yet. We first
collect a dataset that includes student behavior data and textual course comments. Then we fuse the
student behavior data with the course comment text to predict student performance, using a
Transformer-based framework with a uniform vector representation. The empirical results on the
collected dataset show the effectiveness of our proposed method. In terms of F1 and AUC, the
performance of our method improves by up to 3.33% and 4.37%, respectively. We find that the uniform
feature vector representation learned by our proposed method can indeed improve the classifier's
performance, compared with existing works. Further, we validate our approach on an open dataset.
The results of the empirical study show that our proposed method has a strong generalization
capability. Moreover, we perform interpretability analysis using the SHapley Additive exPlanation
(SHAP) method and find that text features have a more important influence on the classification
model. This further illustrates that fusing text features can improve the performance of
classification models.

INDEX TERMS Educational data mining, deep learning, multimodal, data fusion, random forest.

The associate editor coordinating the review of this manuscript and approving it for publication was Zhaojun Steven Li.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
86008 VOLUME 10, 2022
Y. Qu et al.: Can We Predict Student Performance Based on Tabular and Textual Data?

I. INTRODUCTION

Traditional educational institutions have accumulated much information about their students, including school number, age, gender, etc. This data is usually stored in a relational database and is called tabular data. With the development of the mobile Internet in recent years, web-based online educational systems have flourished exponentially, thus providing multiple data sources with different granularity levels [1], [2]. With the emergence of new teaching systems, such as MOOCs, massive amounts of data are constantly being collected. These teaching data hold huge value. However, they are not being processed in time to discover models and paradigms useful for school management. In fact, the tension between the sheer size of data and knowledge discovery is a huge challenge for educational institutions today [3]. The application of data mining techniques to the specific data from educational environments is called educational data mining (EDM) [4]. EDM data comes from a wide variety of educational systems, such as traditional face-to-face education, computer-based educational systems, and blended learning systems. Each of these educational systems provides different data sources [5]. Using machine learning techniques, such as clustering, text mining, and classification, these different types of data are analyzed to solve various educational problems. The taxonomy comprises thirteen tasks addressed by EDM systems: predicting student performance, detecting undesirable student behaviors, profiling and grouping students, social network analysis, providing reports, creating alerts for stakeholders, planning and scheduling, creating courseware, developing concept maps, generating recommendations, adaptive systems, evaluation, and scientific inquiry [6], [7].

D'Mello discussed the ubiquity and importance of emotion to learning [8]. Emotions may not always be consciously experienced, but they exist and influence cognition nonetheless [9]. Language can express feelings very well, so text mining-based sentiment analysis techniques have great potential for analyzing the relationship between students' thoughts and learning experiences. Yang et al. applied sentiment analysis techniques to students' posts in MOOCs courses. They found a negative correlation between the ratio of positive to negative terms and dropout across time [10]. Methods to automatically identify student confusion were developed from MOOCs posts [11]. These analysis methods use only unimodal data; MOOCs are now able to provide researchers with multimodal data, including students' behavioral data, textual data, audio, video, brainwave data, and more.

The DataShop dataset was one of the first and biggest datasets, and it also provided a tool for intelligent tutoring systems [12]. While the student learned from the software, the student's actions and the tutor's responses were stored in a log database or file, which was imported into DataShop for storage and analysis. Graphical Interactive Student Monitoring Tool for Moodle (GISMO) is another popular public dataset and is a graphical interactive monitoring tool that provides instructors with useful visualizations of students' activities in online courses. With GISMO, instructors can examine various aspects of distance students, such as attendance at courses, reading of materials, and submission of assignments. Users of the popular learning management system Moodle may benefit from GISMO for their teaching activities [13]. Unimodal sentiment features and classifications (e.g., text, audio, and video) are used for sentiment discovery and analysis (SDA) [14]. The Multimodal Teaching and Learning Analytics (MUTLA) dataset was very well described and covered many academic subjects (i.e., Mathematics, English, Physics, and Chemistry). User records from question-level logs of student responses, brainwave data, and webcam data were collected [15]. The MUTLA dataset is the first rich multimodal dataset for EDM, but it is not currently open. Cano et al. developed a multiview early warning system built with comprehensible Genetic Programming classification rules adapted to specifically target underrepresented and underperforming student populations. The system integrated many student information repositories using multi-view learning to improve the accuracy and timing of the predictions [16].

There are no open multimodal educational assessment datasets available. To address the lack of multimodal datasets, we collected multimodal data from several teaching management systems and MOOCs platforms. The data includes student behavior as well as students' course comments. We chose course comments instead of other data formats, such as audio or video data, because a course commenting module exists in most MOOCs platforms. This makes data collection less expensive and gives our proposed multimodal data fusion model a strong generalization capability.

Research on multimodal data fusion has focused more on the processing of text and images [17]; however, multimodal data fusion in education has not been fully exploited. To address the problem of heterogeneous data mining, students' behavior data and comment textual data are collected and manually aligned. Then a multimodal data fusion approach is designed to fuse structured student behavior data and unstructured student comment text into a unified semantic representation to predict student performance. Based on the dataset we collected, we conducted an empirical study. The study results show that the classification method can achieve better classification results in terms of recall, F1, and AUC.

In our study, to better elucidate our proposed research idea of multimodal data fusion for educational data mining, we design the following four research questions (RQs):
RQ1: Can a multimodal dataset be used to obtain a better classification model than a unimodal dataset?
RQ2: Can our proposed method outperform other data fusion methods when performing teaching effectiveness evaluation?
RQ3: Does our proposed model have a strong generalization ability?
RQ4: Can we perform interpretable analysis on our proposed deep multimodal data fusion model?

In summary, the contributions of this paper are as follows:

• To the best of our knowledge, we are the first to propose the use of student behavior data with course comments textual data to predict student performance.
• We are the first to propose an open dataset that includes student behavior data as well as course comments textual data.
• We are the first to propose a Transformer-based framework for creating deep multimodal data fusion algorithms with a uniform vector representation.


• Empirical results on real-world datasets show the effectiveness of our proposed method.

The rest of this paper is organized as follows. Section II introduces the background of educational data mining and multimodal data fusion. Section III describes our proposed method in detail, including the framework of deep teaching quality assessment based on multimodal data fusion, the Transformer-based semantic representation of course comment texts, and the deep multimodal data fusion algorithm. Section IV reports our experimental setup, including experimental subjects, performance evaluation measures, strategies for experimental comparison, and experimental design. Section V discusses the results of our experiments. Section VI analyzes the potential threats to the validity of our empirical results. Section VII concludes the paper with some future work.

II. BACKGROUND AND RELATED WORK

In this section, we mainly discuss the related studies on educational data mining, sentiment analysis, and multimodal data fusion.

A. EDUCATIONAL DATA MINING

Traditional educational institutions have accumulated a large amount of basic teaching data, such as basic information about students, through information transformation over the years. With the development of mobile Internet in recent years, web-based online educational systems have flourished exponentially, thus providing multiple data sources with different granularity levels [18]. With the emergence of new teaching systems, such as MOOCs, massive amounts of data are constantly being collected. There is considerable value in these enormous teaching data. Many new research areas have emerged around the new education systems. Educational Data Mining is concerned with developing methods for exploring the unique types of data that come from educational environments [4]. Learning Analytics can be defined as the measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs [19]. Academic Analytics and Institutional Analytics are concerned with the collection, analysis, and visualization of academic program activities such as courses, degree programs research, the revenue of students' fees, course evaluation, resource allocation, and management to generate institutional insight [20]. Educational Data Science is defined as the use of data gathered from educational environments/settings for solving educational problems [18]. These research areas share the same research interests, using a data-driven approach to educational research, and share the same goal of improving teaching and learning practices.

For intelligent tutoring systems, a Markov Decision Process (MDP) framework was used to analyze and explore the application and effect of pedagogical strategies with EDM/LA techniques in terms of chi-squared, information gain, symmetrical uncertainty, information gain ratio, and weighted information gain. The data were collected each semester from the Spring of 2015 to the Fall of 2017 [21]. Chui et al. proposed an improved conditional generative adversarial network based deep support vector machine (ICGAN-DSVM) algorithm to predict students' performance under supportive learning via school and family tutoring [22]. For learning management systems, a Partial Least Squares Structural Equation Model (PLS-SEM) was used to analyze collaborative learning and to predict the team grade in teamwork groups. The data were collected from a CS2 course [23]. For e-Learning Management Systems, an interpretable rule-based Genetic Programming classifier was used to predict student performance and identify students at risk as soon as possible, so as to intervene early and facilitate student success, evaluated in terms of geometric mean, AUC, and Kappa. The student data was from the Virginia Commonwealth University [16]. In addition to analyzing computer-based educational systems from students' behavior data, sentiment analysis during online learning was also used to predict learning performance.

B. SENTIMENT ANALYSIS

Sentiment analysis (SA), also called Opinion Mining (OM), is the task of extracting and analyzing people's opinions, sentiments, attitudes, perceptions, etc. Sentiment analysis offers researchers a powerful tool to extract and analyze public mood and views and finally make better decisions [24], [25]. SDA aims to automatically identify the underlying attitudes, sentiments, and subjectivity towards a certain entity such as learners and learning resources. Due to its enormous potential for smart education, SDA has been deemed a powerful technique for identifying and classifying sentiments from multimodal and multisource data over the whole process of education [14]. D'Mello discussed the ubiquity and importance of emotion to learning [8]. Emotions may not always be consciously experienced, but they exist and influence cognition nonetheless [9]. Language can express feelings very well, so text mining-based sentiment analysis techniques have great potential to analyze the relationship between students' thoughts and learning experiences. Yang et al. applied sentiment analysis techniques to students' posts on their MOOCs courses. They found a negative correlation between the ratio of positive to negative terms and dropout across time [10]. Methods to automatically identify student confusion were developed from MOOCs posts [11].

C. MULTIMODAL DATA FUSION

Han et al. argued that there were many studies on unimodal sentiment features and classifications (e.g., text, audio, and visual) [14]. Though they presented a novel SDA framework of multimodal fusions, together with the description of their crucial components, how to implement this multimodal framework had not been studied. The MUTLA dataset is the first rich multimodal dataset for EDM. Cano et al. developed


a multiview early warning system built with comprehensible Genetic Programming classification rules adapted to target underrepresented and underperforming student populations [15]. The system integrated many student information repositories using multi-view learning to improve the accuracy and timing of the predictions [16]. For MOOCs courses, student behavior data can be obtained from the logs of the software system, and course comments can reflect the emotional state of the student learning process. The datasets for student behavior and course comments are easy to collect, and the cost of collecting these data is manageable compared to collecting brainwave data, video data, etc. Therefore, fusing student behavior data with course comments can better reflect the learning process of students and enable the prediction of student performance. Previous research in educational data mining has been conducted in a relatively isolated manner, either from student behavioural data or from the perspective of student sentiment analysis. It is difficult for such studies to comprehensively measure the behaviour of students during their online learning process. Especially with the popularity of MOOCs, more and more students are involved in the learning process and they express their attitudes towards the course by leaving comments. These student comments and student behaviour provide a good basis for our data modelling. We can predict student performance based on tabular and textual data.

III. OUR PROPOSED METHOD

In this section, we first briefly describe the framework of deep teaching quality assessment based on multimodal data fusion; then, the Transformer-based semantic representation of review texts and the deep multimodal data fusion algorithm are proposed.

A. FRAMEWORK FOR DEEP EDUCATION QUALITY ASSESSMENT BASED ON MULTIMODAL FUSION DATA

Online education platforms, like MOOCs, provide a fast, interactive platform for educational data mining. From the MOOCs platform, students' learning process data can be collected, including both student behavior data and student interaction information, such as student comments on the course. The data can be extracted from relational databases at a low cost. Intuitively, students who study hard will be more motivated to complete their assignments and will eventually achieve better performance. In addition, we can also obtain students' learning status from their course comment text. For example, students who are more optimistic about their course tend to have more positive attitudes toward learning, leading to better academic performance. The process of extracting data from a relational database is shown in Figure 1. Based on the Transformer architecture's powerful capability for learning natural language, we have the potential to learn more information about students' learning status from course comments, which will ultimately enable deeper mining of student learning data.

The student learning process data of different modalities contain rich user information, and data mining can be performed on these data to build a student teaching quality assessment model. The framework for deep education quality assessment is shown in Figure 2; it is based on the semantic vector representation of students' comment text, as well as students' behavior data. In the deep teaching quality assessment framework, the problem of predicting student performance is formalized as a binary classification problem, and the model classifies the results as excellent learning effect or average learning effect. The feature vector classification function is defined as:

$$ y' = \arg\max_{c \in \{0,1\}} f_\theta(x) \tag{1} $$

In Equation 1, $x$ represents the input student learning status data, including student behavior data, such as MOOC learning progress, learning progression for objective practice questions, etc., and also the students' course comments, for example, the student's comment, “The course is rather obscure and covers a lot of underlying principles.” Student behavior data and course comments are persistently stored in a relational database from online education platforms, such as MOOC and SPOC Academy, as well as from third-party open data interfaces, such as Golden Classroom. $f_\theta(\cdot)$ denotes the classifier obtained from historical training data of student learning, such as a random forest. The training data of the model are obtained by aligning multiple databases; the excellent learning effect is labeled as 1, and the average learning effect is labeled as 0. The training dataset containing $N$ training samples is defined as $D_{tr} = \{x_n, y_n\}_{n=1}^{N}$, where the samples are labeled $y_n \in \{0, 1\}$ and the training samples are $x_n = (x_n^1, x_n^2, x_n^3, x_n^4, x_n^5, x_n^6, x_n^7)$. $x_n^1$ to $x_n^6$ denote the behavioral characteristics of student learning: learning progress (LP), learning progression for objective practice questions (LPO), learning progression for subjective practice questions (LPS), in-class discussion participation (DP), number of posts, and number of replies, respectively. The definitions of each behavioral characteristic are as follows.

$$ LP = \frac{\text{Number of studied chapters}}{\text{Total number of course chapters}} \tag{2} $$

$$ LPO = \frac{\text{Number of completed objective questions}}{\text{Total number of objective questions}} \tag{3} $$

$$ LPS = \frac{\text{Number of completed subjective questions}}{\text{Total number of subjective questions}} \tag{4} $$

$$ DP = \frac{\text{Number of submitted class exercises}}{\text{Total number of class exercises}} \tag{5} $$

The number of posts and the number of replies refer to the posts and replies made by students in the forum of the MOOC platform. The above data features are collected from the MOOC platform and are exported after students finish a course on the platform. $x_n^7$ represents the one-dimensional feature vector of students' course comments, as shown in Figure 2, which is computed from a deep semantic vector learning model based on Transformer. $x_n$ is


represented as a uniform feature vector of student learning state data. Multiple decision trees are constructed from the training data $D_{tr}$ to form a random forest, which uses ensemble learning. $x_n^7$ differs from the other features in that its conditional entropy must be calculated considering the domain feature migration of the Transformer network, whose parameters are determined based on the training data of the review text. Its specific calculation formula is given in Eq. 6.

$$ H(D \mid A) = w_\theta \left( \sum_{n=1}^{N} \frac{|D_n|}{|D|} H(D_n) \right) \tag{6} $$

$w_\theta$ represents the Transformer network with the optimal network parameters, $H(D \mid A)$ denotes the empirical conditional entropy under condition $A$ of the selected information entropy calculation, $|D_n|$ indicates the number of samples of a given class for a selected feature, $\frac{|D_n|}{|D|}$ indicates the probability of a class for a selected feature, and $H(D_n)$ denotes the empirical information entropy of $D_n$.

FIGURE 1. The process of extracting data from a relational database.

FIGURE 2. An overview of our study.

B. THE TRANSFORMER-BASED SEMANTIC REPRESENTATION OF REVIEW TEXTS

The Transformer architecture has gained wide application in natural language processing. Pre-trained BERT models can achieve better classification performance after fine-tuning on domain-specific data; classification is done by a linear classifier that computes a cross-entropy loss on the feature vectors [26]. The attention mechanism and the feature vector representation of text provide a unified representation for the fusion of multimodal data, such as


text and images [17]. The multidimensional feature vector representation of text is unstructured data, and fusing the table-based data from the teaching process with the feature vector directly, or calculating the attention between different features, does not fully use the features of the table-based data [27]. Conversely, integrating the multidimensional feature vector of the fine-tuned BERT into the table-based data may introduce feature redundancy. To construct the teaching quality assessment model, a Transformer-based 1D text feature vector representation learning method is designed; this deep semantic feature learning process is shown in Figure 3. We give an example of the learning process for the feature vectors of student comments. Consider the student comment “The course is rather obscure and covers a lot of underlying principles”. The student grade of 60 is converted to label 1. The student comment is fed as input to a pre-trained model for fine-tuning. A linear classifier is used at the feature vector layer to classify the output features of the multilayer neural network, and a loss function is calculated by comparing the result with the classification label 1.

FIGURE 3. Transformer-based learning method for 1D text feature vector representation.

The course review texts of students and the results of the teaching quality assessment are taken as input, and the Transformer-based model BERT is fine-tuned on the course review texts. The deep semantic connection between the input texts and the teaching quality assessment is established based on the attention mechanism. In the input layer, the word embedding vector representation of the review text is computed. The segment embedding representation of each text and the position embedding vector are also obtained, and their sum is the vector input of each review text. The key to learning the semantic vector of course review texts is to use the multi-head attention mechanism to obtain the connection between course review texts and teaching quality assessment results, which is done by fine-tuning BERT on the course review texts. The fine-tuned model outputs a state vector, which is used as the input to a linear classifier for learning. This linear classifier is defined as shown in Eq. 7.

$$ X_{output} = \mathrm{Linear}(\mathrm{ReLU}(\mathrm{Linear}(X_{attention}))) \tag{7} $$

$X_{attention}$ represents the state vector output of the pre-trained model, and $X_{output}$ is the output result, which represents the classification result of the comment text, with the value of 0 or 1. The loss function of the current training sample is calculated based on the outcome of the binary classification, and the cross-entropy loss function used in the calculation is shown in Eq. 8.

$$ L = -\frac{1}{2} \left( (1 - \alpha)\, y_i \log(p_i) + \alpha\, (1 - y_i) \log(1 - p_i) \right) \tag{8} $$

$L$ denotes the calculated loss function value, $y_i$ denotes the actual probability that the sample is $i$, $p_i$ denotes the predicted probability obtained from the fine-tuned BERT model trained on the historical dataset, and $\alpha$ denotes the proportion of classes with actual probability $p_i$ on the training dataset over the total dataset. The parameter $\alpha$ is used to address the class imbalance problem existing in the training dataset [28]. Fine-tuning of this teaching quality assessment model was completed after recording the best classification model on the validation dataset. The training dataset is then fed again into the final Transformer model, and the semantic representation of the review text for this historical data is computed by a forward pass.

C. DEEP MULTIMODAL DATA FUSION ALGORITHM

As shown in Figure 2, the deep multimodal fused data-based teaching quality assessment framework uses a random forest classifier to classify a uniform feature vector, creating multiple decision trees that vote to predict student learning effectiveness. This unified feature vector is the Transformer-based deep multimodal data fusion representation, which includes both behavioral data during student learning and Transformer-based comment text semantic vectors. The deep multimodal data fusion process is shown in Algorithm 1.

As shown in Algorithm 1, the algorithm can effectively use the tabular data of students' learning behavior and, meanwhile, embed the one-dimensional feature vector of the comment text into the tabular data. The random forest can therefore fully use the information entropy of the unified feature vector representation to build the students' learning quality assessment model. The algorithm can effectively integrate with the traditional online teaching platform to extract students' behavior from the relational database; at the same time, the algorithm introduces students' interaction behavior in the form of comment text, enriching the description of student learning status and describing the student learning process from more dimensions.
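To make the concatenation step of Algorithm 1 concrete, the following Python sketch builds a unified seven-dimensional vector from the four ratios of Eqs. 2 to 5, the post and reply counts, and a one-dimensional text feature. It is only an illustration under stated assumptions: the `BehaviorRecord` field names, the zero-denominator convention in `ratio`, and the `toy_text_score` placeholder are all hypothetical and are not taken from the paper, where the text feature is produced by the frozen, fine-tuned BERT model and the fused vectors are then used to train a random forest.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BehaviorRecord:
    """Raw counts exported for one student (field names are hypothetical)."""
    chapters_studied: int
    chapters_total: int
    objective_done: int
    objective_total: int
    subjective_done: int
    subjective_total: int
    exercises_submitted: int
    exercises_total: int
    posts: int
    replies: int


def ratio(done: int, total: int) -> float:
    """Completion ratio; returning 0.0 for an empty denominator is an assumption."""
    return done / total if total else 0.0


def behavior_features(r: BehaviorRecord) -> List[float]:
    """x^1..x^6: LP, LPO, LPS, DP (Eqs. 2-5), number of posts, number of replies."""
    return [
        ratio(r.chapters_studied, r.chapters_total),        # LP,  Eq. 2
        ratio(r.objective_done, r.objective_total),         # LPO, Eq. 3
        ratio(r.subjective_done, r.subjective_total),       # LPS, Eq. 4
        ratio(r.exercises_submitted, r.exercises_total),    # DP,  Eq. 5
        float(r.posts),
        float(r.replies),
    ]


def fuse(r: BehaviorRecord, comment: str,
         text_score: Callable[[str], float]) -> List[float]:
    """Concatenate the six behavioral features with the 1-D text feature v_text."""
    return behavior_features(r) + [text_score(comment)]


def toy_text_score(comment: str) -> float:
    """Placeholder for the frozen text model; NOT the paper's fine-tuned BERT."""
    return 0.0 if "obscure" in comment.lower() else 1.0


if __name__ == "__main__":
    record = BehaviorRecord(8, 10, 15, 20, 3, 4, 9, 10, 5, 2)
    comment = "The course is rather obscure and covers a lot of underlying principles."
    print(fuse(record, comment, toy_text_score))
```

Because the text modality contributes a single scalar, the fused vector stays tabular, so any tree-based classifier can consume it without further preprocessing.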

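The class-weighted cross-entropy of Eq. 8, which is the loss evaluated while the text model is fine-tuned, can be written out for a single sample as in the sketch below. The function name, the clamping constant `eps`, and the example values are assumptions added for illustration; only the weighting by $(1 - \alpha)$ and $\alpha$ and the leading factor of one half follow Eq. 8.

```python
import math


def weighted_cross_entropy(y: float, p: float, alpha: float,
                           eps: float = 1e-12) -> float:
    """Per-sample loss following Eq. 8.

    y     : label of the sample (0 or 1)
    p     : predicted probability of the positive class
    alpha : class-proportion weight used against class imbalance
    eps   : clamp to keep log() finite (an assumption, not part of Eq. 8)
    """
    p = min(max(p, eps), 1.0 - eps)
    positive_term = (1.0 - alpha) * y * math.log(p)
    negative_term = alpha * (1.0 - y) * math.log(1.0 - p)
    return -0.5 * (positive_term + negative_term)


if __name__ == "__main__":
    # With alpha = 0.25, an error on a positive sample is weighted three
    # times as heavily as the same error on a negative sample.
    print(weighted_cross_entropy(1, 0.5, 0.25))
    print(weighted_cross_entropy(0, 0.5, 0.25))
```

With this weighting, the term belonging to the class measured by $\alpha$ is scaled by $(1 - \alpha)$ and the other term by $\alpha$, which is how the loss compensates for an imbalanced label distribution.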

Algorithm 1 Deep Multimodal Data Fusion Algorithm

Input:
    training set $D_{tr} = \{(x_n, y_n)\}_{n=1}^{N}$;
    pre-trained model BERT;
Output:
    unified features vector representation $R^x$;
1  for data in $D_{tr}$ do
2      Feed forward $x_n^7$ in BERT, compute the loss value and back-propagate;
3      Record the neural network parameters and obtain the domain representation of the text;
4  end
5  for data in $D_{tr}$ do
6      Freeze the deep neural network and perform a forward pass;
7      Obtain a one-dimensional semantic vector of the comment text $v_{text}$;
8      Concatenate, $R^x_i = (x_i^1, x_i^2, x_i^3, x_i^4, x_i^5, x_i^6, v_{text})$;
9  end
10 Use $R^x$ to train the random forest classifier $RF_{quality}$;

TABLE 1. The teaching quality assessment dataset.

IV. EXPERIMENTAL SETUP
In this section, we introduce the experimental setup, including experimental subjects, performance evaluation metrics, multimodal data fusion methods and experimental design.

A. EXPERIMENTAL SUBJECTS
To compare the data fusion methods, we collected one dataset for predicting student performance and used one publicly available dataset to evaluate the generalizability of our proposed method.

The first dataset comes from the MOOCs platform we are using. The courses are intended for college students. The collected data comes from three teaching systems: a MOOCs platform, the student course evaluation system and the academic management system. Learning progress, learning progression for objective practice questions, learning progression for subjective practice questions, in-class discussion participation, number of posts and number of replies were obtained from the MOOCs platform; the students' comments were obtained from the student course evaluation system; students' course grades were obtained from the academic management system. A brief description of this teaching quality assessment dataset is given in Table 1.

The second dataset is the Women's E-Commerce Clothing Reviews dataset, collected by Nick Brooks in 2018. This dataset is used to evaluate the generalization ability of our proposed method on a publicly available dataset and has been used to perform binary classification [27]. The source of the reviews is anonymous. Data examples consist of a review, a rating, the clothing category of the product, etc.

B. PERFORMANCE EVALUATION METRICS
There is class imbalance in the teaching quality assessment dataset. We consider three performance metrics: recall, F1-measure (F1) and the area under the receiver operating characteristic curve (AUC). The confusion matrix for the teaching quality assessment dataset is shown in Table 2. TP (true positive) indicates that a sample with average learning effect is correctly predicted as average, FN (false negative) indicates that a sample with average learning effect is incorrectly predicted as excellent, FP (false positive) indicates that a sample with excellent learning effect is incorrectly predicted as average, and TN (true negative) indicates that a sample with excellent learning effect is correctly predicted as excellent.

TABLE 2. Confusion matrix for predicting student performance.

$$\text{precision} = \frac{TP}{TP + FP} \tag{9}$$

$$\text{recall} = \frac{TP}{TP + FN} \tag{10}$$

$$FPR = \frac{FP}{FP + TN} \tag{11}$$

$$F1 = \frac{2 \times (\text{precision} \times \text{recall})}{\text{precision} + \text{recall}} \tag{12}$$

The AUC is calculated as the area formed by the Receiver Operating Characteristic (ROC) curve and the coordinate axis, with the maximum value not exceeding 1. The larger the AUC value, the better the classification effect. The TPR indicates the percentage of samples that are correctly predicted as average learning effect among all samples that are actually average learning effect, and its value is equal to recall; the FPR indicates the percentage of samples that are incorrectly predicted as average learning effect among all samples that are actually excellent learning effect.

To statistically evaluate the detailed results, we first employ the Friedman test to determine whether there are statistically significant differences among the compared methods. If there is a statistically significant difference, the post-hoc Nemenyi test is applied to compare the differences. When the null hypothesis is rejected, the average ranks are calculated and their differences are compared with the critical distance (CD).

$$CD = q_{\alpha} \times \sqrt{\frac{k \times (k + 1)}{6N}} \tag{13}$$

Here $k$ is the number of compared algorithms and $N$ is the number of datasets on which they are compared. $q_{\alpha}$ is a critical value looked up in a table according to the significance level and $k$. The CD can therefore be computed according to Eq. 13.

FIGURE 4. The comparison results via box plot.

In addition, to evaluate the degree of difference among the compared methods in terms of recall, F1 and AUC, we apply Cohen's d to measure the effect size [29], [30], [31].

$$\text{Cohen's } d = \frac{M_1 - M_2}{\sqrt{\dfrac{\sigma_1^2 + \sigma_2^2}{2}}} \tag{14}$$

where $M_1$ and $M_2$ represent the means of the statistic for the two compared methods, and $\sigma_1$ and $\sigma_2$ represent the corresponding standard deviations.
If $d \in [0, 0.2)$, the effect size is negligible.
If $d \in [0.2, 0.5)$, the effect size is small.
If $d \in [0.5, 0.8)$, the effect size is medium.
If $d \geq 0.8$, the effect size is large.
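The evaluation statistics above can be checked with a short sketch. It implements Eqs. (9) to (14) directly; the confusion counts and score lists are hypothetical illustration values rather than results from the paper, the use of sample standard deviations in Cohen's d is an assumption, and 2.343 is the commonly tabulated Nemenyi critical value for three methods at a 0.05 significance level.

```python
import math
import statistics
from typing import Dict, Sequence


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Eqs. (9)-(12): precision, recall (= TPR), FPR and F1."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return {"precision": precision, "recall": recall, "fpr": fpr, "f1": f1}


def critical_distance(q_alpha: float, k: int, n: int) -> float:
    """Eq. (13): Nemenyi critical distance for k methods over n datasets."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))


def cohens_d(group1: Sequence[float], group2: Sequence[float]) -> float:
    """Eq. (14), using sample standard deviations (an assumption)."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    s1, s2 = statistics.stdev(group1), statistics.stdev(group2)
    return (m1 - m2) / math.sqrt((s1 ** 2 + s2 ** 2) / 2)


def effect_size_label(d: float) -> str:
    """Map |d| to the negligible / small / medium / large bands."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical confusion counts, for illustration only.
metrics = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
# k = 3 methods compared over N = 10 random samplings.
cd = critical_distance(q_alpha=2.343, k=3, n=10)
# Hypothetical score lists for two methods.
d = cohens_d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Two methods whose average ranks differ by more than `cd` would be regarded as significantly different under the Nemenyi test.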
C. MULTIMODAL DATA FUSION METHODS
To evaluate our proposed teaching quality prediction model (RfBERT) in a comprehensive manner, we chose the following data fusion methods as baseline methods for comparison.

text_only: As shown in Eq. 8, only the review text is used as the input of the teaching quality assessment model, and cross-entropy is used as the loss function to build the Transformer-based text classification model.

tabular_only: Using $x_n = (x_n^1, x_n^2, x_n^3, x_n^4, x_n^5, x_n^6)$ as input samples, we build a teaching quality assessment model based on random forest.

concat: As shown in Figure 2, the uniform feature vector is used as the input sample to build a teaching quality assessment model with a linear classifier.

MLPconcat: An implementation of the data fusion method proposed by Gu and Budhkar [27]. This method applies separate MLPs to the numerical features and then concatenates the transformer output with the processed numerical features before the final classifier layer(s).

MAG: An implementation of the data fusion method based on the attention mechanism proposed by Rahman et al. [32]. In the output layer, this method uses a gated summation of transformer outputs, numerical features, and categorical features before the final classifier layer(s).

D. EXPERIMENTAL DESIGN
The experiments run on Windows OS, and the hardware environment is an Intel Core i7-10700K CPU with 64 GB of RAM. The fine-tuning of BERT is completed on an NVIDIA GeForce RTX 2070 GPU. The deep neural network library used in the experiments is the PyTorch 1.8 stable version, and the open-source Hugging Face library is used to implement BERT. The pre-trained BERT model used in the experiments is bert-base-uncased, which has 12 layers, 12 attention heads, a word embedding dimension of 768 and about 110 M parameters. Sentences exceeding a specific length are truncated, and sentences shorter than that length are zero-padded. The open-source sklearn framework is used, and the random forest classifier uses the default hyperparameters, where the number of classification subtrees is 10. The ratio of training, validation and test datasets was 8:1:1 during the experiment, and training was repeated 10 times, with random stratified sampling each time to keep the data distribution consistent. During the validation process, early stopping was used to terminate the neural network training and prevent overfitting. In the fine-tuning of the Transformer model, a hyperparameter α is introduced to address the class imbalance problem in the training dataset; it reduces the impact of the majority class on the imbalanced data distribution by penalizing the loss value of the majority class.

V. EXPERIMENTAL RESULTS
In this section, we report experimental results for the four RQs.

A. RESULT ANALYSIS FOR RQ1
RQ1: Can a multimodal dataset be used to obtain a better classification model than a unimodal dataset?

Motivation: To verify whether better performance in predicting student performance can be obtained by fusing multimodal data, we compare our proposed method with classifiers that employ unimodal data. The text_only method is used to assess the contribution of the tabular data, and the tabular_only method is used to assess the contribution of the textual data.

To answer this RQ, we conduct the experiments on the collected dataset. The comparison results via box plot are shown in Figure 4. From these figures, we can observe that our proposed method RfBERT achieves the best performance in terms of recall, F1 and AUC. All three methods obtained high recall; the AUC value of the text_only method was only 0.8153 and the method also had the lowest performance on the F1 metric, which could indicate that student performance cannot be fully predicted using only the text of student course comments. The tabular_only method achieves the second-best performance in the F1 and AUC metrics; our proposed method achieves the best performance in all three metrics. This means that our proposed method can fully fuse two different kinds of data by learning feature vectors from text and then achieves the best performance.

To compare the performance of different data fusion methods from a statistical point of view, the non-parametric Friedman test at a confidence level of 95% is used to conduct a statistical analysis of the results. We find that the calculated value is smaller than the critical value for a 0.05 significance level. To reveal the differences between different data fusion methods, we further adopt a post hoc statistical analysis method. In this experiment, k = 3 is the number of compared algorithms, and N = 10 means that the collected dataset was randomly sampled 10 times. Finally, Cohen's d effect size is 0.75 between RfBERT and text_only in terms of AUC, which indicates a medium effect size; Cohen's d effect size is 0.82 between RfBERT and tabular_only in terms of AUC, which indicates a large effect size.

FIGURE 5. The box plot on different data fusion methods.

TABLE 3. Performance comparison of data fusion methods on open dataset.

Summary for RQ1: From the box plot as well as the statistical results, our proposed method can fully fuse two different kinds of data by learning feature vectors from text and then achieves the best performance. Course comment texts should be considered when creating student academic assessment models.
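The Friedman test used in the statistical analysis above can be sketched as follows. This is a minimal illustration of the textbook rank-based statistic without tie correction, not the authors' code; the score matrix is hypothetical, and 5.991 is the standard chi-square critical value for two degrees of freedom at a 0.05 significance level.

```python
from typing import List, Sequence


def average_ranks(scores: Sequence[Sequence[float]]) -> List[float]:
    """Average rank of each method over all runs (rank 1 = highest score).

    Tied scores within a run share the mean of the ranks they occupy.
    """
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared_rank = (i + j) / 2 + 1
            for pos in range(i, j + 1):
                totals[order[pos]] += shared_rank
            i = j + 1
    return [t / len(scores) for t in totals]


def friedman_statistic(scores: Sequence[Sequence[float]]) -> float:
    """Chi-square form of the Friedman statistic for N runs and k methods."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (
        sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


# Hypothetical AUC-like scores for three methods over ten samplings.
rows = [[0.95 - 0.001 * i, 0.90, 0.82 + 0.001 * i] for i in range(10)]
ranks = average_ranks(rows)
stat = friedman_statistic(rows)

CHI2_CRITICAL_DF2_ALPHA_005 = 5.991  # chi-square, 2 degrees of freedom
reject_null = stat > CHI2_CRITICAL_DF2_ALPHA_005
```

When the null hypothesis is rejected, the average ranks returned by `average_ranks` are the quantities that a post hoc Nemenyi comparison would check against the critical distance.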

B. RESULT ANALYSIS FOR RQ2
RQ2: Can our proposed method outperform other data fusion methods when performing teaching effectiveness evaluation?

Motivation: Following Gu and Budhkar [27], there are three main baselines for multimodal data fusion: concat, MLPconcat and MAG. We need to verify whether our proposed method can outperform these three baseline methods.

According to the box plot in Figure 5, all four data fusion methods obtained high recall, while our proposed data fusion method obtained the highest recall. In terms of the AUC and F1 metrics, the MAG data fusion method obtained the lowest performance and the concat data fusion method obtained the second-best performance. Compared with concat, MLPconcat incorporates an MLP on the tabular data output layer, which may be the reason for its performance degradation. We can clearly see that our proposed method obtains the best performance and has high stability.

The non-parametric Friedman test at a confidence level of 95% is used to conduct a statistical analysis of the results. We find that the calculated value is smaller than the critical value for a 0.05 significance level. The post hoc statistical analysis method was then adopted. In this experiment, k = 4 is the number of compared algorithms, and N = 10 means that the collected dataset was randomly sampled 10 times. Finally, Cohen's d effect size is 0.68, which indicates a medium effect size.

Summary for RQ2: Our proposed method achieves the best classification performance compared to the baseline methods. This implies that the uniform feature vector representation learned by our proposed method can indeed improve the classifier's performance.

C. RESULT ANALYSIS FOR RQ3
RQ3: Does our proposed model have strong generalization ability?

Motivation: Although our method achieves the best classification performance on the dataset we collected, to validate the generalization ability of our proposed method, we compare multiple feature fusion methods on an open dataset.

We conducted experiments on the open clothes review dataset to evaluate the feature fusion methods concat, MLPconcat, MAG and RfBERT. We performed a ten-fold cross-validation and obtained the mean F1 and AUC. The results are shown in Table 3.

Our proposed method RfBERT obtains the best classification performance on the F1 and AUC metrics. On the F1 metric, our method improves by 3.34% over MLPconcat, which achieves the worst performance. Meanwhile, on the AUC metric, our method improves by 2.37% over MLPconcat, which again achieves the worst performance. Similar to the results of the RQ2 experiment, concat obtained the second-best performance on both the F1 and AUC metrics. The performance of MAG is slightly stronger than that of MLPconcat and lower than that of our proposed method. This conclusion remains largely consistent with RQ2. This implies that for multimodal tabular and textual data, a unified feature vector representation can effectively improve the classification performance.

Summary for RQ3: Our proposed method has a strong generalization capability. The classification performance can
be improved by using a unified feature vector representation for multimodal tabular and textual data.

D. RESULT ANALYSIS FOR RQ4
RQ4: Can we perform interpretable analysis on our proposed deep multimodal data fusion model?

Motivation: Based on RQ1, RQ2 and RQ3, we can build a prediction model with strong generalization ability and higher performance from the historical dataset. The model contains multimodal data. To analyze students' academic performance in a timely manner and intervene accordingly, it is necessary to conduct an interpretable analysis of our model.

To evaluate the contribution of the seven features in the unified feature vector to the random forest classifier, we introduce the SHAP method. The calculation procedure is shown in Eq. 15. Suppose the i-th sample is $x_i$, the j-th feature of the i-th sample is $x_{i,j}$, the predicted value of the model for the i-th sample is $y_i$, and the baseline of the whole model (usually the mean of the target variable over all samples) is $y_{base}$.

$$y_i = y_{base} + f(x_{i,1}) + f(x_{i,2}) + \cdots + f(x_{i,7}) \tag{15}$$

where $f(x_{i,j})$ is the contribution (SHAP value) of the j-th feature to the prediction for the i-th sample.

We implemented the SHAP method on the random forest classifier and calculated the importance of each feature, as shown in Figure 6. We see that the feature "number of posts" is the largest contributor to the prediction result, followed by the text vector we learned from BERT. This also shows that the course comment texts we used can improve the performance of the classifier and make an important contribution to the classification task.

FIGURE 6. The importance of each feature using SHAP method.

Further, we can observe the contribution of different features to the prediction result of "average learning". The result is shown in Figure 7. A total of seven features influence the classification results of the model. When the classification result of the model is "average learning", different features contribute differently. The first feature and the sixth feature have a relatively even effect on the current classification result. The second feature has a negative effect on the current classification result most of the time, indicating a negative correlation between the student's course comments and the current classification result. The third feature has an average impact on the classification results. The fourth feature also shows a negative correlation with the classification results. The fifth and seventh features show a positive relationship with the classification results. We can see that the learned text feature vector plays an important role for the category "average learning". This is consistent with the distribution of our data.

FIGURE 7. The contribution of different features to the prediction result of "average learning".

Summary for RQ4: Based on the results of the interpretability analysis, we see that the unified feature vector fused with the text vector can indeed play a key role in model classification.

VI. THREATS TO VALIDITY
In this section, we mainly discuss potential threats to the validity of our study.

A. THREATS TO CONSTRUCT VALIDITY
To evaluate our proposed approach, we collected data from multiple instructional management systems and built experimental datasets by alignment. However, the size of these data is currently small, and the dataset will need to be continuously expanded later. When testing the generalization capability, the test was conducted on only one open dataset. As more datasets are shared, there is a need to validate on more datasets.

B. THREATS TO INTERNAL VALIDITY
We use several open-source software packages in our experiments, such as huggingface and sklearn. These packages provide default hyperparameter settings, for example for the pre-trained BERT model. Although we fine-tuned the deep neural network using the validation dataset, many hyperparameters still keep their default values. In addition, the machine learning classifiers we used, such as random forest, also used the default hyperparameter settings.

C. THREATS TO EXTERNAL VALIDITY
External validity is the degree to which the research results can be generalized to the population under study and
other research settings. There are no commercial datasets available for testing yet, and we need to keep an eye on developments based on multimodal tabular and textual data fusion.

VII. CONCLUSION AND FUTURE WORK
With the emergence of more new teaching systems, such as MOOCs, massive amounts of data are constantly being collected. This massive amount of data is a vast gold mine. However, the multimodal data, which includes both student behavior data and student course comment text, is not processed to discover models and paradigms that can be useful for school management. All these state data collected during the learning process can reflect the effectiveness of student learning. There is no multimodal dataset with tabular data and textual data yet. Therefore, we first collected an open dataset that included student behavior data as well as course comment text. We fused student behavior data with course comment text to predict student performance. Then a Transformer-based framework for creating deep multimodal data fusion algorithms with a uniform vector representation was proposed. The empirical results on the collected dataset show the effectiveness of our proposed method in terms of recall, F1 and AUC. The empirical research indicates that: (1) our proposed method can fully fuse two different kinds of data by learning feature vectors from text and then achieves the best performance, so course comment texts should be considered when creating student academic assessment models; (2) our proposed method achieves the best classification performance compared to the baseline methods, which implies that the uniform feature vector representation learned by our proposed method can indeed improve the classifier's performance.

Further, we validated our approach on an open clothing dataset. The results of the empirical study showed that our proposed method had a strong generalization capability. Moreover, we performed interpretability analysis using the SHAP method and found that text features had an important influence on the classification model. This further illustrated that fusing text features can improve the performance of classification models.

In the future, we will continue to expand our dataset and apply our proposed method to other domains to validate its generalization capability continuously. In addition, we will also continue our in-depth research on the representation of unified feature vectors based on natural language processing techniques. We will work on additional ways to fuse data to improve the classification performance of student learning classification models.

REFERENCES
[1] C. Romero and S. Ventura, “Educational data science in massive open online courses,” Wiley Interdiscipl. Rev., Data Mining Knowl. Discovery, vol. 7, no. 1, p. e1187, Jan. 2017. [Online]. Available: https://onlinelibrary.wiley.com/doi/full/10.1002/widm.1187 and https://onlinelibrary.wiley.com/doi/abs/10.1002/widm.1187 and https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1187
[2] R. S. Baker and P. S. Inventado, “Educational data mining and learning analytics,” in Learning Analytics: From Research to Practice. Springer, Jan. 2014, pp. 61–75. [Online]. Available: https://link.springer.com/chapter/10.1007/978-1-4614-3305-7_4
[3] M. I. Baig, L. Shuib, and E. Yadegaridehkordi, “Big data in education: A state of the art, limitations, and future research directions,” Int. J. Educ. Technol. Higher Educ., vol. 17, no. 1, pp. 1–23, Dec. 2020. [Online]. Available: https://educationaltechnologyjournal.springeropen.com/articles/10.1186/s41239-020-00223-0
[4] B. Bakhshinategh, O. R. Zaiane, S. ElAtia, and D. Ipperciel, “Educational data mining applications and tasks: A survey of the last 10 years,” Educ. Inf. Technol., vol. 23, no. 1, pp. 537–553, Jul. 2017. [Online]. Available: https://link.springer.com/article/10.1007/s10639-017-9616-z
[5] C. Romero and S. Ventura, “Educational data mining: A survey from 1995 to 2005,” Exp. Syst. Appl., vol. 33, no. 1, pp. 135–146, Jul. 2007.
[6] A. Hernández-Blanco, B. Herrera-Flores, D. Tomás, and B. Navarro-Colorado, “A systematic review of deep learning approaches to educational data mining,” Complexity, vol. 2019, May 2019, Art. no. 1306039.
[7] M. D. Laddha, V. T. Lokare, A. W. Kiwelekar, and L. D. Netak, “Performance analysis of the impact of technical skills on employability,” Int. J. Performability Eng., vol. 17, no. 4, p. 371, Apr. 2021. [Online]. Available: http://www.ijpe-online.com/EN/10.23940/ijpe.21.04.p5.371378
[8] C. Lang, G. Siemens, A. Wise, and D. Gasevic. (2017). Handbook of Learning Analytics. [Online]. Available: https://www.academia.edu/download/56326181/hla17.pdf
[9] A. Öhman and J. J. Soares, “‘Unconscious anxiety’: Phobic responses to masked stimuli,” J. Abnormal Psychol., vol. 103, no. 2, pp. 231–240, 1994.
[10] M. Wen, D. Yang, and C. P. Rosé. Sentiment Analysis in MOOC Discussion Forums: What Does It Tell Us? Citeseer. Accessed: Apr. 2022. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.660.5804&rep=rep1&type=
[11] D. Yang, M. Wen, I. Howley, R. Kraut, and C. Rosé, “Exploring the effect of confusion in discussion forums of massive open online courses,” in Proc. 2nd ACM Conf. Learn. Scale, Mar. 2015, pp. 121–130.
[12] (2010). A Data Repository for the EDM Community. [Online]. Available: https://www.researchgate.net/publication/254199600_A_Data_Repository_for_the_EDM_Community
[13] Graphical Interactive Student Monitoring Tool for Moodle. Accessed: Apr. 2022. [Online]. Available: http://gismo.sourceforge.net/index.html
[14] Z. Han, J. Wu, C. Huang, Q. Huang, and M. Zhao, “A review on sentiment discovery and analysis of educational big-data,” Wiley Interdiscipl. Rev., Data Mining Knowl. Discovery, vol. 10, no. 1, p. e1328, Jan. 2020. [Online]. Available: https://onlinelibrary.wiley.com/doi/full/10.1002/widm.1328 and https://onlinelibrary.wiley.com/doi/abs/10.1002/widm.1328 and https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1328
[15] F. Xu, L. Wu, K. P. Thai, C. Hsu, W. Wang, and R. Tong, “MUTLA: A large-scale dataset for multimodal teaching and learning analytics,” Oct. 2019, arXiv:1910.06078v1.
[16] A. Cano and J. D. Leonard, “Interpretable multiview early warning system adapted to underrepresented populations,” IEEE Trans. Learn. Technol., vol. 12, no. 2, pp. 198–211, Apr. 2019.
[17] D. Kiela, S. Bhooshan, H. Firooz, E. Perez, and D. Testuggine, “Supervised multimodal bitransformers for classifying images and text,” 2019, arXiv:1909.02950.
[18] C. Romero and S. Ventura, “Educational data science in massive open online courses,” Wiley Interdiscipl. Rev., Data Mining Knowl. Discovery, vol. 7, no. 1, p. e1187, Jan. 2017.
[19] C. Lang, G. Siemens, A. Wise, D. Gašević, and A. Research. Handbook of Learning Analytics Society for Learning. Accessed: Apr. 2022. [Online]. Available: https://www.solarresearch.com
[20] J. Campbell, P. DeBlois, D. Oblinger. (2007). Academic Analytics: A New Tool for a New Era. [Online]. Available: https://er.educause.edu/articles/2007/7/academic-analytics-a-new-tool-for-a-new-era
[21] Exploring Induced Pedagogical Strategies Through a Markov Decision Process Framework: Lessons Learned. Accessed: Apr. 2022. [Online]. Available: https://par.nsf.gov/biblio/10105557
[22] K. T. Chui, R. W. Liu, M. Zhao, and P. O. de Pablos, “Predicting students' performance with school and family tutoring using generative adversarial network-based deep support vector machine,” IEEE Access, vol. 8, pp. 86745–86752, 2020.

[23] Z. Li and S. Edwards. (2018). Applying Recent-Performance Factors Analysis to Explore Student Effort Invested in Programming Assignments. [Online]. Available: https://search.proquest.com/openview/1344abae126cd4240dfdce3764087786/1?pq-origsite=gscholar&cbl=1976352
[24] M. Birjali, M. Kasri, and A. Beni-Hssane, “A comprehensive survey on sentiment analysis: Approaches, challenges and trends,” Knowl.-Based Syst., vol. 226, Aug. 2021, Art. no. 107134.
[25] A. G. Etemad, A. I. Abidi, and M. Chhabra, “Fine-tuned T5 for abstractive summarization,” Int. J. Performability Eng., vol. 17, no. 10, pp. 900–906, Oct. 2021. [Online]. Available: http://www.ijpe-online.com/EN/10.23940/ijpe.21.10.p8.900906
[26] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” 2018, arXiv:1810.04805.
[27] K. Gu and A. Budhkar, “A package for learning on tabular and text data with transformers,” in Proc. 3rd Workshop Multimodal Artif. Intell., 2021, pp. 69–73. [Online]. Available: https://aclanthology.org/2021.maiworkshop-1.10
[28] L. Fang, Q. Yubin, C. Xiang, L. Long, and Y. Fan, “A sentiment analysis method based on class imbalance learning,” J. Jilin Univ. Sci. Ed., vol. 59, no. 4, pp. 929–935, 2021. [Online]. Available: http://xuebao.jlu.edu.cn/lxb/CN/abstract/abstract4404.shtml
[29] Y. Qu, X. Chen, F. Li, F. Yang, J. Ji, and L. Li, “Empirical evaluation on the impact of class overlap for EEG-based early epileptic seizure detection,” IEEE Access, vol. 8, pp. 180328–180340, 2020.
[30] D. Lakens, “Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for T-tests and ANOVAs,” Frontiers Psychol., vol. 4, p. 863, Nov. 2013. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3840331/
[31] S. S. Sawilowsky, “New effect size rules of thumb,” J. Modern Appl. Stat. Methods, vol. 8, no. 2, p. 26, Nov. 2009. [Online]. Available: https://digitalcommons.wayne.edu/jmasm/vol8/iss2/26
[32] W. Rahman, M. Kamrul Hasan, S. Lee, A. Zadeh, C. Mao, L.-P. Morency, and E. Hoque, “Integrating multimodal information in large pretrained transformers,” 2019, arXiv:1908.05787.

YUBIN QU was born in Nanyang, China, in 1981. He received the B.S. and M.S. degrees in computer science and technology from Henan Polytechnic University, China, in 2004 and 2008, respectively. Since 2009, he has been a Lecturer with the Information Engineering Institute, Jiangsu College of Engineering and Technology. He is the author of more than ten articles. His research interests include software maintenance, software testing, and machine learning.

FANG LI was born in Baoji, China, in 1982. She received the M.S. degree in computer science and technology from Henan Polytechnic University, China, in 2011. Since 2014, she has been a Lecturer with the Jiangsu College of Engineering and Technology. Her research interests include network ideological and political education and computer application.

LONG LI (Member, IEEE) received the Ph.D. degree from the Guilin University of Electronic Technology, Guilin, China, in 2018. He is currently a Lecturer with the School of Computer Science and Information Security, Guilin University of Electronic Technology. His research interests include cryptographic protocols, privacy-preserving technologies in big data, and the IoT.

XIANZHEN DOU was born in Xuzhou, China, in 1987. He received the M.S. degree from the School of Electronics and Information, Nantong University, China, in 2013. Since 2019, he has been a Lecturer with the Information Engineering Institute, Jiangsu College of Engineering and Technology. His research interests include software engineering and machine learning.

HONGMEI WANG was born in Lianyuan, China, in 1981. She received the B.S. and M.S. degrees in computer science and technology from Henan Polytechnic University, China, in 2005 and 2008, respectively. She is currently pursuing the Ph.D. degree with the Nanjing University of Posts and Telecommunications. She was a Visiting Scholar at the University of Hong Kong, from 2018 to 2019. Since 2008, she has been with the Jiangsu University of Science and Technology. Her research interests include information security, machine learning, and artificial intelligence.
