
Applied Soft Computing Journal 89 (2020) 106071

Contents lists available at ScienceDirect

Applied Soft Computing Journal


journal homepage: www.elsevier.com/locate/asoc

Semi-supervised learning quantization algorithm with deep features for motor imagery EEG recognition in smart healthcare application

Minjie Liu a,1, Mingming Zhou a,1, Tao Zhang b,c,∗, Naixue Xiong d

a School of Nursing, Taihu University of Wuxi, Qianrong street No.68, Wuxi, Jiangsu Province 214064, China
b Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources, Shenzhen, China
c Jiangsu Provincial Engineering Laboratory for Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214000, China
d School of Computer Science and Technology, Tianjin University, Tianjin, China

∗ Corresponding author at: Jiangsu Provincial Engineering Laboratory for Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214000, China.
E-mail address: [email protected] (T. Zhang).
1 Minjie Liu and Mingming Zhou contributed equally to this work and should be considered co-first authors.

https://doi.org/10.1016/j.asoc.2020.106071
1568-4946/© 2020 Elsevier B.V. All rights reserved.

Article info

Article history:
Received 29 August 2019
Received in revised form 24 December 2019
Accepted 3 January 2020
Available online 15 January 2020

Keywords:
Convolutional neural networks
Semi-supervised classification
EEG Recognition
Smart healthcare
Cartesian K-means

Abstract

This paper presents a novel semi-supervised classification model with convolutional neural networks (CNN) for EEG recognition. The performance of popular machine learning algorithms, such as deep learning, sparse classification and supervised learning approaches, usually relies on the number of labeled training samples. However, labeled samples are very difficult to obtain for electroencephalography (EEG) data. In addition, most deep learning algorithms are time-consuming to train. To address these problems, this article proposes a novel semi-supervised quantization algorithm based on the cartesian K-means algorithm, named semi-supervised cartesian K-means (SSCK). We use CNN models pre-trained on motor imagery samples to create deep features, and then apply them to motor imagery (MI) data classification. Unlike traditional semi-supervised learning models, in which label information can be cast directly into model training, label information can only be used implicitly in this setting: the supervised information is integrated into the quantization algorithm by means of a Laplacian regularizer constructed from the labels. Experimental results on four popular EEG datasets substantiate the efficiency and effectiveness of the proposed semi-supervised cartesian K-means.
© 2020 Elsevier B.V. All rights reserved.

1. Introduction

In recent years, deep learning technology has been widely used in image recognition, speech recognition, natural language processing and other fields [1–6]. With the development of smart healthcare, images have gradually come to play a key role in recognizing categories of diseases and in providing accurate predictions for patients. The brain is the body's central nervous system and behavior control center [7]. People mainly depend on brain responses to command their bodies in reaction to external stimuli. In patients with dyskinesia, brain consciousness is normal, but the motor intention cannot be transferred [8,9]. Electroencephalography (EEG) is an efficient and convenient technique for analyzing and studying the electrical activity of the brain. In most intelligent human–computer interaction systems, such as the brain computer interface (BCI), EEG signals are easy to acquire and are frequently used for detection [10]. EEG signals are translated into environmental control signals through the BCI system. Activity in different positions of the brain responds to different body movements and imagined activity. The ipsilateral (relative to the subject's unilateral limb) and contralateral sensorimotor cortex display phenomena referred to as event-related synchronization and event-related desynchronization, respectively [11]. This lays a foundation for the classification of electrophysiological motor imagery (MI) signals. Motor imagery is one of many paradigms that depend on existing patterns of the EEG signal.

Recently, with the development of machine learning (deep learning, semi-supervised learning and other quantization learning approaches [12–17]) and smart healthcare, the MI EEG data classification problem has triggered broad discussion in the BCI area [18,19]. However, MI EEG recognition is usually difficult, because samples of the same class differ considerably and because it is time-consuming. On the other hand, approximate nearest neighbor (ANN) search has grown into a hot research topic for its high retrieval performance on large-variation datasets. ANN search aims to find those instances whose Euclidean distance
to the query is relatively small. In ANN search, computing the Euclidean distance between the query vector q and all database vectors involves a large amount of computation, which is infeasible in large-scale and high-dimensional cases.

Many machine learning algorithms have been proposed and improved to eliminate those computations, such as the k-d tree [20], multiple k-d trees and priority search [21], as well as the cover tree and the trinary-projection tree. Hashing is also a very attractive method for ANN search: it converts the database vectors into short codes, so the storage cost is small, which makes in-memory search feasible and also reduces the cost of distance computation. Highly effective hashing approaches have been designed, such as locality sensitive hashing [22,23], kernelized locality-sensitive hashing [24], semi-supervised hashing [25,26], K-means hashing [27] and supervised hashing [28,29].

Quantization also plays an important role in ANN search; it tackles the problem through a data representation strategy. An unsupervised algorithm is first used to cluster the data so that the original data obtain labels. The instances in each cluster are then represented, or reconstructed, by the cluster center. In this way the distance between the query and a database vector can be approximated by the distance between the centers of the clusters to which the query and the database vector respectively belong. Product quantization (PQ) [30] decomposes the original vector space into a cartesian product of several low-dimensional vector spaces and quantizes each of these low-dimensional spaces separately. The computational complexity of brute-force exhaustive search is very high, which is obviously undesirable. Therefore, for similarity search over large amounts of high-dimensional data, efficient approximate nearest neighbor search techniques are needed, and PQ is one of them.

Since the quantization algorithm was proposed, many extensions have been developed to enhance its search performance. Cartesian K-means [31] extends the PQ algorithm and imposes appropriate constraints on the linear mapping of the classical K-means objective function, which makes the optimization more efficient and effective than traditional K-means solved with Lloyd's algorithm [32]. Unlike cartesian K-means, which optimizes with respect to the sub-codebooks only, OPQ [33] optimizes with respect to both the sub-codebooks and the space decomposition, so as to find the optimal space decomposition and rotation matrix simultaneously. By balancing the eigenvalues of the covariance matrix, OPQ can obtain the optimal space decomposition; however, the decomposition is prone to errors for strongly multimodal distributions [34]. Optimal cartesian K-means (OCK-means) [35] differs from the previous coding schemes, in which a single sub-codeword is chosen [35]: OCK-means encodes data points using multiple sub-codewords per subvector, and performance is significantly improved in ANN experiments; the same algorithm is also proposed in [36]. In recent years, research on approximate nearest neighbor (ANN) search has spread throughout the pattern recognition area. However, typical existing ANN-based approaches are designed for static databases only. Liu et al. [37] demonstrated the benefit of making full use of random kernel functions in a PQ strategy and proposed a kernelized product quantization method that decomposes the corresponding implicit eigenspace. Xu et al. [38] handled incoming streaming data by updating the quantization codebook and presented an effective online product quantization (OPQ) algorithm.

All the above-mentioned algorithms have significantly advanced research on product quantization. However, they all still belong to the unsupervised learning framework, which may restrict their performance. To effectively reduce the quantization error in each subspace and improve retrieval performance, we propose the semi-supervised cartesian K-means algorithm (SSCK). It first builds a Laplacian matrix from the labeled data: the similarity between two samples that share the same label is given a large value, while samples that belong to different clusters are assigned a small value. The Laplacian regularizer is then added to the loss term to construct a more discriminative cartesian K-means objective function. We also design a corresponding optimization algorithm to solve the resulting model. The main contributions of this paper are summarized as follows:

• Because the number of labeled samples is very small, we propose a semi-supervised cartesian K-means algorithm with deep features extracted by a CNN from EEG data.
• Labeled data are integrated into the quantization step to provide additional constraints that improve the data reconstruction ability.
• A Laplacian matrix is built from the labeled data and added to the optimal reverse prediction model to obtain a more discriminative cartesian K-means model.
• A strategy for optimizing the semi-supervised cartesian K-means is given so that the objective function reaches a minimum value.

The remainder of this paper is organized as follows. We review cartesian K-means and the optimal reverse prediction algorithm in Section 2. In Section 3 we describe the semi-supervised cartesian K-means algorithm, and then the supervised cartesian K-means is presented. Section 4 gives the interpretation and implementation details of the algorithm. Section 5 gives a detailed description of the algorithm design and the results, with a quantitative comparison with other algorithms. Finally, Section 6 summarizes the main points of this paper.

2. Related work

Before presenting the proposed algorithm, we briefly introduce previous MI EEG data classification schemes, optimal reverse prediction and the cartesian K-means algorithm.

2.1. MI EEG classification

In this subsection, we briefly review existing methods and models for EEG MI classification. Through our investigation of MI EEG classification methods, we found that feature extraction and recognition are the two key factors, and both kinds of methods have been used for MI EEG classification tasks. Many researchers focus on the feature extraction stage of MI EEG classification. Currently, the feature extraction algorithms mainly include filter bank common spatial patterns (FBCSPs), short time Fourier transform (STFT), wavelet transform (WT), common spatial pattern (CSP) and other approaches [39–42]. Feature recognition strategies mainly rely on linear discriminant analysis (LDA), support vector machine (SVM) and other classifiers [43–45].

The current approaches for feature extraction have been progressing at a dramatic pace, but they may not work well in some specific areas. Many studies have shown that, compared with traditional feature extraction methods, adaptive autoregressive models work better in extracting MI EEG features [46]. For STFT features, the frequency resolution and time resolution are determined by the window size. A larger window size means
lower time resolution and higher frequency resolution. In contrast, a smaller window size results in higher time resolution and lower frequency resolution. The limited window size causes a certain amount of frequency leakage, especially in relation to the discrete-time Fourier transform. Therefore, it is not possible to obtain both high time resolution and high frequency resolution. The wavelet transform has excellent time–frequency localization characteristics and can be used for multi-scale analysis. At present, however, most researchers manually screen wavelet-coefficient EEG features using statistical properties such as the average, maximum, minimum and standard deviation. Although the motor imagery signal pattern is not based on prior knowledge, its irregular behavior makes the signal unpredictable.

Recently, much attention has been paid to CNN and RNN methods, which have been put forward to achieve excellent performance on MI EEG data [47]. Comparative tests of DBN and SVM have been carried out [48] on two-class MI classification; the results showed that DBN has an advantage over SVM. CNN was also used in [49] for recognition of MI EEG. RNN and CNN were also applied in [50] to find cognitive events from MI EEG datasets by proposing multidimensional features. Autoencoders, a well-known family of deep learning schemes, have been used for emotion recognition from EEG signals [51]. Many studies have transformed EEG signals into images and applied deep learning models that perform well in image classification. In order to preserve the spatial, temporal and spectral structure of EEG data, [52] proposed a new kind of compound features, in which three chosen frequency bands of the power spectrum were estimated from the EEG signal of every electrode. The short-time Fourier transform was used to transform EEG time series into 2D images in [48]. Spectral features of MI signals were extracted, and a 1D CNN [53] and a stacked autoencoder (SAE) [54] were also used to obtain better performance on MI EEG datasets.

The above studies have tried to make full use of the essence of deep learning for EEG classification; however, this kind of model is not very suitable for other fields, such as image and signal processing. Therefore, research on designing novel learning models for EEG MI classification is urgently needed.

2.2. Optimal reverse prediction

In [55] Xu et al. proposed an optimal reverse prediction (ORP) algorithm whose objective function contains two terms. One is the conventional K-means formula, an unsupervised clustering term in which neither the cluster center matrix nor the label matrix is known. The other is a supervised learning term, which is similar to the conventional K-means formula except that the labels are known in advance. The label variable in the objective function adopts the 1-of-K encoding scheme. The optimal reverse prediction problem is usually solved by iteratively minimizing the least-squares loss with respect to the cluster center matrix and the unknown label matrix. The objective function is defined as:

$$L(C, B) = \min_{C} \min_{B} \ \mathrm{tr}\big((X^{(L)} - CY^{(L)})^{T}(X^{(L)} - CY^{(L)})\big)/N_L + \eta^{2}\, \mathrm{tr}\big((X^{(U)} - CB)^{T}(X^{(U)} - CB)\big)/N_U \tag{1}$$

where $X^{(L)} \in \mathbb{R}^{P \times N_L}$ and $Y^{(L)} \in \mathbb{R}^{K \times N_L}$ represent the training instance matrix and the label matrix respectively, $X^{(U)} \in \mathbb{R}^{P \times N_U}$ is the unlabeled data matrix, $B \in \mathbb{R}^{K \times N_U}$ is the unknown label matrix and $\eta^{2}$ is the trade-off parameter. $P$ is the dimension of the instances, $N_L$ and $N_U$ are the numbers of labeled and unlabeled samples respectively, and $K$ is the number of clusters.

The optimal reverse prediction algorithm unifies several supervised and unsupervised training principles through the concept of optimal reverse prediction: predict the inputs from the target labels, optimizing over both the model parameters and any missing labels. Supervised least squares, principal components analysis, K-means clustering and normalized graph-cut can all be expressed as instances of the same training principle.

2.3. Cartesian K-means

In [30], the product quantization algorithm is proposed for the ANN search task. The high-dimensional input space is evenly decomposed and represented as a cartesian product of M lower-dimensional subspaces. Each subspace forms a codebook using the conventional K-means algorithm. Hence K sub-codewords are generated for each sub-vector, and in this way the M sub-vectors form $K^{M}$ clusters, whereas encoding the whole input with conventional K-means at this number of clusters would require $O(K^{M}P)$ storage. At the same time, the computational complexity can be reduced to $O(KP)$.

In the PQ algorithm, the codewords in each subspace are generated by K-means clustering, which minimizes the squared distortion error (2) with respect to $b$ and $C$ iteratively. However, the PQ algorithm does not provide a method for obtaining the optimal space decomposition for the ANN search task.

$$\sum_{x \in X} \min_{b \in \mathcal{H}_{1/k}} \|x - Cb\|_2^2 \quad \text{s.t. } b \in \{0,1\}^{k} \text{ and } \|b\| = 1, \quad C_{:,i}^{\top} C_{:,j} = 0 \text{ for } i \neq j, \quad C_{:,i}^{\top} C_{:,i} > 0 \tag{2}$$

Cartesian K-means [31] solves this problem by designing appropriate constraints on the columns of the linear mapping $C$ in (2) to implicitly adjust the dimension information of the instances. This makes optimizing (2) with respect to $b$ more tractable in orthogonal cartesian K-means and, at the same time, finds a better subspace partitioning and achieves better ANN search performance; the same algorithm, optimized product quantization, is also proposed in [56]. The orthogonality constraint on the cluster centers guarantees that the cluster center matrix can be represented as $C \equiv RD$, where $R^{\top}R = RR^{\top} = I$, so (2) can be reformulated as (3). Minimizing (3) with respect to $R$, $D$ and $B$ yields an optimal rotation matrix $R$ and cluster centers $D$ that give (3) a lower distortion error.

$$\min_{R, D, B} \sum_{i} \left\| x_i - R \begin{bmatrix} D^{1} b_i^{1} \\ \vdots \\ D^{M} b_i^{M} \end{bmatrix} \right\|_2^2 \quad \text{s.t. } R^{T}R = I, \quad b_i^{m} \in \{0,1\}^{K \times 1}, \quad \|b_i^{m}\|_1 = 1, \ \forall i, m \tag{3}$$

where the number of subspaces is denoted as $M$.

3. Semi-Supervised Cartesian K-means (SSCK)

In the following part, we describe the discriminative semi-supervised cartesian K-means algorithm and then extend the semi-supervised concept to other ANN search algorithms.
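As a reference point for the model developed in this section, the cartesian K-means distortion of (3) can be written out in executable form. The following is a minimal NumPy sketch and not the authors' implementation: the function names `encode` and `decode`, the layout with samples stored as columns, and the representation of the sub-codebooks as a Python list are assumptions made here for illustration.

```python
import numpy as np

def encode(X, R, codebooks):
    """Pick one codeword per subspace for every column of X.

    X: (P, N) data, samples as columns (assumed layout).
    R: (P, P) orthogonal rotation.
    codebooks: list of M arrays D^m, each of shape (P_m, K), with sum(P_m) == P.
    Returns the (M, N) index matrix and the total distortion of (3).
    """
    Z = R.T @ X                      # rotate once; ||x - R d|| == ||R^T x - d||
    codes, distortion, start = [], 0.0, 0
    for D_m in codebooks:
        stop = start + D_m.shape[0]
        Z_m = Z[start:stop]          # sub-vectors of this subspace, (P_m, N)
        # squared distance from every codeword to every sub-vector, (K, N)
        d2 = ((D_m[:, :, None] - Z_m[:, None, :]) ** 2).sum(axis=0)
        idx = d2.argmin(axis=0)      # 1-of-K choice b_i^m, stored as an index
        codes.append(idx)
        distortion += d2[idx, np.arange(Z_m.shape[1])].sum()
        start = stop
    return np.stack(codes), distortion

def decode(codes, R, codebooks):
    """Reconstruct R [D^1 b^1; ...; D^M b^M] for every sample."""
    blocks = [D_m[:, idx] for D_m, idx in zip(codebooks, codes)]
    return R @ np.vstack(blocks)
```

Storing each $b_i^m$ as an integer index rather than a 1-of-K vector is what keeps the code short: with $K = 256$, each subspace costs one byte per sample.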
3.1. Semi-supervised Cartesian K-means

As presented in [55], classical K-means, principal component analysis (PCA) and normalized cut [57,58] can be considered special cases of the optimal reverse prediction algorithm. Based on this concept we can use (1) in place of (3) in the quantization procedure to obtain the semi-supervised cartesian K-means algorithm. We are given a labeled set $\{X^{(L)}, Y^{(L)}\} \in \mathbb{R}^{P \times N_L} \times \mathbb{R}^{K \times N_L}$ and an unlabeled dataset $\{X^{(U)}\} \in \mathbb{R}^{P \times N_U}$, where $P$ is the dimensionality of the instances, $K$ is the total number of quantization centers, and $N_L$ and $N_U$ are the numbers of labeled and unlabeled samples respectively. Based on the input space decomposition strategy, the semi-supervised cartesian K-means (SSCK) is defined as:

$$L_{SSCK} = \min_{R, D, B} \|R^{\top} X^{(L)} - DY^{(L\star)}\|_F^2 + \eta^{2} \|R^{\top} X^{(U)} - DB\|_F^2$$
$$\text{s.t. } R^{\top}R = RR^{\top} = I, \quad (y^{(L\star)})_i^{m} \in \{0,1\}^{K \times 1}, \quad \|(y^{(L\star)})_i^{m}\|_1 = 1, \quad b_i^{m} \in \{0,1\}^{K \times 1}, \quad \|b_i^{m}\|_1 = 1, \ \forall i, m \tag{4}$$

Based on the space decomposition concept, $X^{(L)}$, $X^{(U)}$, $DY$ and $DB$ can be individually represented as:

$$X^{(L)} = \begin{bmatrix} X_1^{(L)} \\ \vdots \\ X_M^{(L)} \end{bmatrix}, \quad X^{(U)} = \begin{bmatrix} X_1^{(U)} \\ \vdots \\ X_M^{(U)} \end{bmatrix}, \quad DY = \begin{bmatrix} D^{1} (Y^{(L\star)})^{1} \\ \vdots \\ D^{M} (Y^{(L\star)})^{M} \end{bmatrix}, \quad DB = \begin{bmatrix} D^{1} B^{1} \\ \vdots \\ D^{M} B^{M} \end{bmatrix},$$

$$(Y^{(L\star)})^{i} = \begin{bmatrix} (y^{(L\star)})_1^{i} & \cdots & (y^{(L\star)})_{N_L}^{i} \end{bmatrix}, \quad B^{i} = \begin{bmatrix} b_1^{i} & \cdots & b_{N_U}^{i} \end{bmatrix},$$

and $\|\cdot\|_F$ represents the Frobenius norm. From (4), given the labeled dataset $X^{(L)}$, $Y^{(L\star)}$ and the unlabeled dataset $X^{(U)}$, the discriminative cartesian K-means algorithm can be obtained. More specifically, (4) is only the quantization problem, which is one part of cartesian K-means. However, we cannot use (4) to quantize the labeled and unlabeled data right away, because we do not know the quantization label matrix $Y^{(L\star)}$.

When optimal reverse prediction is used in a clustering task, the label matrix $Y^{(L)}$ in (1) denotes the clustering or classification label, which can be obtained from the known labeled data. However, when optimal reverse prediction is used in our proposed semi-supervised cartesian K-means (4), its role is to quantize or encode the data; the label matrix $Y^{(L)}$ is used to indicate the quantization label, but it is unknown. In other words, the quantization label is different from the clustering label: the clustering label can be obtained directly from the labeled data, but the quantization label cannot, so (4) cannot be used as it stands for our semi-supervised cartesian K-means.

To solve this problem, we introduce a Laplacian regularizer term into the semi-supervised cartesian K-means model (4) and obtain the following formulation:

$$L_{SSCK} = \min_{R, D, B, Y^{(L)}, \mu} \|R^{\top} X_{\mu}^{(L)} - DY^{(L)}\|_F^2 + \eta^{2} \|R^{\top} X_{\mu}^{(U)} - DB\|_F^2 + \lambda\, \mathrm{tr}\big(Y^{(L)} L (Y^{(L)})^{\top}\big)$$
$$\text{s.t. } R^{\top}R = RR^{\top} = I, \quad (y^{(L)})_i^{m} \in \{0,1\}^{K \times 1}, \quad \|(y^{(L)})_i^{m}\|_1 = 1, \quad b_i^{m} \in \{0,1\}^{K \times 1}, \quad \|b_i^{m}\|_1 = 1, \ \forall i, m \tag{5}$$

where we denote $X_{\mu}^{(U)} \equiv X^{(U)} - \mu(1^{(U)})^{\top}$ and $X_{\mu}^{(L)} \equiv X^{(L)} - \mu(1^{(L)})^{\top}$, and $\mu$ is the mean vector of the input data. Both $Y^{(L)}$ and $B$ are quantization labels and are unknown. $L$ denotes the Laplacian term, $L = \bar{D} - W$, where $W$ is the similarity matrix and $\bar{D}$ is the diagonal degree matrix with $\bar{D}_{ii} = \sum_j W_{ij}$ (the bar distinguishes it from the codebook $D$). For symmetric $W$ this gives $\mathrm{tr}(Y^{(L)} L (Y^{(L)})^{\top}) = \frac{1}{2}\sum_{i,j} w_{ij} \|y_i^{(L)} - y_j^{(L)}\|_2^2$, which is the form used below. We construct the similarity matrix $W$ in a supervised manner: the entry $W_{ij}$ is assigned a large weight if $x_i$ and $x_j$ have the same clustering label, and a small weight if $x_i$ and $x_j$ belong to different clusters. In our proposed model, we construct the corresponding matrix by adopting the method presented in [59].

Optimizing (5) is generally intractable, because $Y^{(L)}$ and $B$ are discrete matrices with the 1-of-K encoding scheme and discrete optimization is difficult for non-submodular problems. Here we provide two methods to deal with this difficulty. The first is the exhaustive search algorithm, which is also adopted in [36,60]. The second first relaxes the discrete variable $Y^{(L)}$ to a continuous one and imposes a constraint on $Y^{(L)}$, then optimizes the cartesian K-means objective function under this constraint to obtain an optimal $\bar{Y}^{(L)}$, and finally applies a threshold to recover the discrete variable.

Update $Y^{(L)}$: To optimize (5) with respect to $Y^{(L)}$ using the exhaustive search method, we can rewrite (5) as follows (only the terms relevant to $Y^{(L)}$ are kept):

$$(5) = \sum_{i=1}^{N_L} \|R^{\top} x_{\mu,i}^{(L)} - Dy_i^{(L)}\|_2^2 + \frac{\lambda}{2} \sum_{i,j=1}^{N_L} w_{ij} \|y_i^{(L)} - y_j^{(L)}\|_2^2 = \sum_{i=1}^{N_L} \left( \|R^{\top} x_{\mu,i}^{(L)} - Dy_i^{(L)}\|_2^2 + \frac{\lambda}{2} \sum_{j=1}^{N_L} w_{ij} \|y_i^{(L)} - y_j^{(L)}\|_2^2 \right) \tag{6}$$

To optimize (6) with respect to $y_i$, we fix all the other $\{y_j^{(L)}\}_{j \neq i}$ and try $x_i$ in all the clusters to find the assignment that gives the loss function its minimum. Of course, optimizing (6) with respect to $y_i$ by exhaustive search is very expensive: we need to compute the loss function (5) $2^{8} \times N_L$ times in each iteration when adopting an 8-bit code length. For 32-bit, 64-bit and 128-bit codes we need to compute the loss function (5) $4 \times 2^{8} \times N_L$, $8 \times 2^{8} \times N_L$ and $16 \times 2^{8} \times N_L$ times respectively.

The second method we adopt to optimize (6) first relaxes the discrete variable $Y^{(L)}$ to a continuous one and imposes a constraint on $Y^{(L)}$, then optimizes the objective to its optimal value and finally thresholds the result. We rewrite (5) as follows (only the terms relevant to $Y^{(L)}$ are kept):

$$(5) = \sum_{i=1}^{N_L} \|R^{\top} x_{\mu,i}^{(L)} - Dy_i^{(L)}\|_2^2 + \frac{\lambda}{2} \sum_{i,j=1}^{N_L} w_{ij} \|y_i^{(L)} - y_j^{(L)}\|_2^2 = \sum_{i=1}^{N_L} \left( \|R^{\top} x_{\mu,i}^{(L)} - Dy_i^{(L)}\|_2^2 + \frac{\lambda}{2} \sum_{j=1}^{N_L} w_{ij} \|y_i^{(L)} - y_j^{(L)}\|_2^2 \right)$$
$$\text{s.t. } (y_i^{(L)})^{\top} 1 = 1, \quad i = 1, 2, \ldots, N_L \tag{7}$$

Denote the predicted $Y^{(L)}$ as $\bar{Y}^{(L)}$ and let $r_i = R^{\top} x_{\mu,i}^{(L)}$. To obtain the optimal solution $\bar{Y}^{(L)}$, we consider the Lagrange function $L(\bar{y}_i, \beta)$, which is defined as:

$$L(\bar{y}_i, \beta) = \sum_{i=1}^{N_L} \|r_i - D\bar{y}_i^{(L)}\|_2^2 + \frac{\lambda}{2} \sum_{i,j=1}^{N_L} w_{ij} \|\bar{y}_i^{(L)} - \bar{y}_j^{(L)}\|_2^2 + \beta\big((\bar{y}_i^{(L)})^{\top} 1 - 1\big) \tag{8}$$

Letting $\partial L / \partial \bar{y}_i^{(L)} = 0$, we have

$$\frac{\partial L}{\partial \bar{y}_i^{(L)}} = C\bar{y}_i^{(L)} + \lambda \sum_{j=1}^{N_L} w_{ij} (\bar{y}_i^{(L)} - \bar{y}_j^{(L)}) + \beta 1 = 0$$
$$\frac{\partial L}{\partial \bar{y}_i^{(L)}} = C^{*}\bar{y}_i^{(L)} - \lambda \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} + \beta 1 = 0 \tag{9}$$
where $C = (r_i 1^{\top} - D)^{\top}(r_i 1^{\top} - D)$ and

$$C^{*} = \begin{bmatrix} C_{11} - \sum_{j=1}^{N_L} w_{ij}, & C_{12}, & \ldots, & C_{1n} \\ C_{21} - \sum_{j=1}^{N_L} w_{ij}, & C_{22}, & \ldots, & C_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ C_{k1} - \sum_{j=1}^{N_L} w_{ij}, & C_{k2}, & \ldots, & C_{kn} \end{bmatrix}$$

Pre-multiplying (9) by $1^{\top}(C^{*})^{-1}$, we get:

$$1 - 1^{\top} \lambda (C^{*})^{-1} \left( \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} \right) + 1^{\top} \beta (C^{*})^{-1} 1 = 0 \tag{10}$$

From (10) we get

$$\beta = \frac{1^{\top} \lambda (C^{*})^{-1} \left( \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} \right) - 1}{1^{\top} (C^{*})^{-1} 1}.$$

From (9) we get the optimal $\bar{y}_i^{(L)}$ with respect to (8):

$$\bar{y}_i^{(L)} = (C^{*})^{-1} \left( \lambda \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} - \beta 1 \right) \tag{11}$$

Substituting $\beta$ into (11), we get the update equation for $\bar{y}_i^{(L)}$:

$$\bar{y}_i^{(L)} \leftarrow (C^{*})^{-1} \left( \lambda \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} - \frac{1^{\top} \lambda (C^{*})^{-1} \left( \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} \right) - 1}{1^{\top} (C^{*})^{-1} 1}\, 1 \right) \tag{12}$$

Having obtained the predicted $\bar{y}_i^{(L)}$, we can then get the discrete $y_i^{(L)}$ by the following:

$$y_i^{(L)} \leftarrow \max(\bar{y}_i^{(L)}) \tag{13}$$

Update $B$: Having obtained the quantization $Y^{(L)}$, we can get the cluster centers $D$ by computing the average value of all labeled data $X^{(L)}$ in each quantization cluster. Based on the predicted cluster centers $D$, the label matrix $B$ of the unlabeled data can be obtained by adopting the KNN clustering algorithm. More precisely, the solving process for $B$ is as follows:

$$D \leftarrow \arg\min_{D} \|R^{\top} X_{\mu}^{(L)} - DY^{(L)}\|_F^2$$
$$B \leftarrow \arg\min_{B} \|R^{\top} X_{\mu}^{(U)} - DB\|_F^2 \tag{14}$$

Update $D$: Based on the labeled and unlabeled data and their quantization labels $Y^{(L)}$ and $B$, the cluster centers $D$ can be updated in the following way:

$$D \leftarrow \arg\min_{D} \|R^{\top} X_{\mu} - DY\|_F^2 \tag{15}$$

where $X_{\mu} := \begin{bmatrix} X_{\mu}^{(L)} & \eta X_{\mu}^{(U)} \end{bmatrix}$ and $Y := \begin{bmatrix} Y^{(L)} & \eta B \end{bmatrix}$.

Update $R$: Many papers have addressed the difficulty of optimization under orthogonality constraints, such as [61], which uses a Crank–Nicolson-like update scheme to guarantee the orthogonality constraints. In our experiments we still adopt the algorithm of [62], as cartesian K-means does, for its high performance and efficiency. Based on the singular value decomposition

$$[U_R, S_R, V_R^{\top}] = \mathrm{SVD}\big(X_{\mu} (Y^{(L)})^{T} D^{T}\big) \tag{16}$$

$R$ can then be obtained by:

$$R \leftarrow U_R V_R^{T} \tag{17}$$

Update $\mu$: After we have obtained $R$, $D$, $Y^{(L)}$ and $B$, we can update $\mu$ in the following way:

$$\mu \leftarrow \mathrm{mean}(X - RDY) \tag{18}$$

where $X = \begin{bmatrix} X^{(L)} & \eta X^{(U)} \end{bmatrix}$ and $Y = \begin{bmatrix} Y^{(L)} & \eta B \end{bmatrix}$.

We present our algorithm for semi-supervised cartesian K-means in Table 1.

3.2. Other extensions

The semi-supervised concept can also be applied to other quantization algorithms for the ANN search task, such as optimized cartesian K-means (OCK) [56] and optimized product quantization (OPQ) [63]. For example, semi-supervised PQ can be obtained by optimizing (19). The optimization process is easier than for semi-supervised cartesian K-means, because there are no column-orthogonality constraints on the cluster center matrix.

$$L_{PQ} = \|X_{\mu}^{(L)} - DY^{(L)}\|_F^2 + \eta^{2} \|X_{\mu}^{(U)} - DB\|_F^2 + \lambda\, \mathrm{tr}\big(Y^{(L)} L (Y^{(L)})^{\top}\big)$$
$$\text{s.t. } (y^{(L)})_i^{m} \in \{0,1\}^{K \times 1}, \quad \|(y^{(L)})_i^{m}\|_1 = 1, \quad b_i^{m} \in \{0,1\}^{K \times 1}, \quad \|b_i^{m}\|_1 = 1, \ \forall i, m \tag{19}$$

Applying the semi-supervised concept to the optimized cartesian K-means (OCK) algorithm is more complicated than applying it to cartesian K-means, because the OCK algorithm uses multiple codewords to quantize the data. The semi-supervised optimized cartesian K-means can be obtained by solving (20).

$$L_{OCK} = \|X^{(L)} - R\hat{D}\hat{Y}^{(L)}\|_F^2 + \eta^{2} \|X^{(U)} - R\hat{D}\hat{B}\|_F^2 + \lambda\, \mathrm{tr}\big(Y^{(L)} L (Y^{(L)})^{\top}\big)$$
$$\text{s.t. } (y^{(L)})_i^{m} \in \{0,1\}^{K \times 1}, \quad \|(y^{(L)})_i^{m}\|_1 = C, \quad b_i^{m} \in \{0,1\}^{K \times 1}, \quad \|b_i^{m}\|_1 = C, \ \forall i, m \tag{20}$$

where

$$\hat{D} \triangleq \begin{pmatrix} \hat{D}^{1} & & \\ & \ddots & \\ & & \hat{D}^{M} \end{pmatrix}, \quad \hat{D}^{m} \triangleq \begin{pmatrix} \hat{D}^{m,1} & \cdots & \hat{D}^{m,C} \end{pmatrix}, \quad \hat{B} \triangleq \begin{pmatrix} \hat{B}^{1\top} & \cdots & \hat{B}^{M\top} \end{pmatrix}^{\top},$$

$$\hat{B}^{m} \triangleq \begin{pmatrix} \hat{b}_1^{m} & \cdots & \hat{b}_N^{m} \end{pmatrix}, \quad \hat{b}_i^{m} \triangleq \begin{pmatrix} \hat{b}_i^{m,1\top} & \cdots & \hat{b}_i^{m,C\top} \end{pmatrix}^{\top}.$$

Here $Y^{(L)}$ has the same notation as $B$, and only $B$ is given for simplicity.

Setting $\eta^{2} = 0$, we obtain the supervised Cartesian K-means quantization algorithm. The same concept can also be adopted in other quantization algorithms to obtain supervised quantization algorithms, such as supervised product quantization, supervised orthogonal K-means and supervised optimized Cartesian K-means. Generally speaking, supervised and semi-supervised algorithms perform better than unsupervised algorithms. If our semi-supervised concept is workable, this paper will open a new research direction on semi-supervised quantization algorithms.

In the above model training, much effort is needed to obtain the optimal $Y^{(L)}$ and $B$ for the objective function. In our research, we propose two methods, exhaustive search and relaxation, to optimize the semi-supervised quantization objective function with respect to $Y^{(L)}$. The exhaustive search strategy requires a large amount of computation, but the global minimum can be guaranteed. The relaxation-plus-constraint method is more efficient, and in this article we adopt the relaxation method to solve our semi-supervised cartesian K-means problem.

4. Interpretation and implementation details

The algorithm we propose is called semi-supervised Cartesian K-means (SSCK). It makes use of deep CNN features, which fuse all the extracted convolutional features through fully connected layers. The input is organized as time samples across EEG channels, and a 2D feature map is obtained after the convolutions are executed. After finishing the feature learning phase,

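Table 1
Algorithm for Semi-Supervised Cartesian K-means.

Semi-Supervised Cartesian K-means
Input: Labeled samples $(X^{(L)}, Y^{(L)})$ and unlabeled samples $X^{(U)}$, balance factors $\eta$, $\lambda$, $\beta$, maximum number of iterations $T$, convergence factor $\tau$ and initial value $Y_0^{(L)}$
$t = 0$
While $t \le T$ && $\|L((Y^{(L)})^{t+1}, B^{(t+1)}, R^{(t+1)}, D^{(t+1)}, \mu^{(t+1)}) - L((Y^{(L)})^{t+1}, B^{(t+1)}, R^{(t)}, D^{(t)}, \mu^{(t)})\| \ge \tau$
    $t \leftarrow t + 1$;
    1 Update $Y^{(L)}$:
        Update all columns of $Y^{(L)}$ separately. Fix the columns $y_j^{(L)}$, $j \neq i$, and update $y_i^{(L)}$ based on the following: $y_i^{(L)} \leftarrow \max(\bar{y}_i^{(L)})$,
        where $\beta = \dfrac{1^{\top} \lambda (C^{*})^{-1} \left( \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} \right) - 1}{1^{\top} (C^{*})^{-1} 1}$ and
        $\bar{y}_i^{(L)} \leftarrow (C^{*})^{-1} \left( \lambda \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} - \dfrac{1^{\top} \lambda (C^{*})^{-1} \left( \sum_{j=1}^{N_L} w_{ij} \bar{y}_j^{(L)} \right) - 1}{1^{\top} (C^{*})^{-1} 1}\, 1 \right)$
    2 Update $B$:
        $D \leftarrow \arg\min_{D} \|R^{\top} X_{\mu}^{(L)} - DY^{(L)}\|_F^2$
        $B \leftarrow \arg\min_{B} \|R^{\top} X_{\mu}^{(U)} - DB\|_F^2$
    3 Update $D$:
        $D \leftarrow \arg\min_{D} \|R^{\top} X_{\mu} - DY\|_F^2$
    4 Update $R$:
        $R \leftarrow U_R V_R^{T}$, where $[U_R, S_R, V_R^{\top}] = \mathrm{SVD}(X_{\mu} (Y^{(L)})^{T} D^{T})$
    5 Update $\mu$:
        $\mu \leftarrow \mathrm{mean}(X - RDY)$
End while
Output: The predicted label $D$, $R$ and $\mu$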

Fig. 1. Detailed process of our proposed algorithm.
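Steps 1, 4 and 5 of Table 1 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, `C` stands in for the matrix $C^*$ (assumed to be given), `b` stands in for the vector $\lambda \sum_j w_{ij} \bar{y}_j^{(L)}$, the code matrix `Y` is assumed to have one column per column of `X_mu`, and the mean in Eq. (18) is assumed to be taken over samples.

```python
import numpy as np

def relaxed_label_update(C, b):
    """Relaxed update of one label column under the constraint 1^T y = 1.

    C : (K, K) matrix standing in for C* in Table 1 (assumed given).
    b : (K,) vector standing in for lambda * sum_j w_ij * y_bar_j.
    Returns the continuous solution and its 1-of-K thresholded version.
    """
    C_inv = np.linalg.pinv(C)                 # Moore-Penrose pseudo-inverse
    ones = np.ones(C.shape[0])
    beta = (ones @ C_inv @ b - 1.0) / (ones @ C_inv @ ones)
    y_bar = C_inv @ (b - beta * ones)         # entries sum to 1 by construction
    y = np.zeros_like(y_bar)
    y[np.argmax(y_bar)] = 1.0                 # threshold to a 1-of-K code
    return y_bar, y

def update_rotation(X_mu, Y, D):
    """R <- U V^T from the SVD of X_mu Y^T D^T, as in Eqs. (16)-(17)."""
    U, _, Vt = np.linalg.svd(X_mu @ Y.T @ D.T, full_matrices=False)
    return U @ Vt

def update_mean(X, R, D, Y):
    """mu <- mean over samples (columns) of X - R D Y, as in Eq. (18)."""
    return (X - R @ D @ Y).mean(axis=1, keepdims=True)
```

Because `R` is built from the two orthogonal factors of an SVD, it satisfies the orthogonality constraint without a separate projection step, which is the reason this kind of update is attractive in practice.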

in the next phase we execute the multilayer feature extraction and fusion phase. In the first phase of this model, the raw cropped input EEG signal is taken, and Laplacian matrix construction is used to process these features. The features used in the quantization process are taken from the convolution layers after max pooling, in order to reduce the feature size while keeping the relevant information extracted by the convolutional layers. Therefore, the feature maps are all taken from the Pool-1, Pool-2, Pool-3 and Pool-4 (max pooling) layers. Each pooling layer represents convolutional layers at a distinct level of abstraction in the CNN architecture. For example, the initial layers capture simpler features and the deeper layers capture more complex features. In the end, we use a Laplacian regularizer constructed in a supervised manner to integrate the semi-supervised information into the quantization algorithm. The detailed steps of our proposed algorithm are displayed in Fig. 1. In our experiment, we use the optimal reverse prediction model as the quantization distortion function, bring the samples' label information into the quantization step, and propose the semi-supervised cartesian K-means model to improve on the performance of the previous cartesian K-means. Optimal reverse prediction is a simple semi-supervised learning algorithm; many algorithms such as least squares, PCA, K-means and Normalized Cut are special cases of it. In the semi-supervised cartesian K-means algorithm, optimal reverse prediction is used to quantize or encode the sample instances into short codewords, which differs from its application to classification tasks. In the quantization step of ANN search, we generally quantize the data into 8 bits, which results in 256 clusters; however, the number of real labels is usually not equal to 256. So the first key problem in semi-supervised cartesian K-means is to obtain the quantization label, which is a very hard problem. How can the label information of the labeled data be converted into the quantization label? In this article, we introduce a Laplacian regularizer to solve this problem.

Given a dataset $\{X\} =: \{X^{(L)}, X^{(U)}\}$ that includes a small number of labeled samples $X^{(L)} \in \mathbb{R}^{P \times N_L}$ and a large number of unlabeled samples $X^{(U)} \in \mathbb{R}^{P \times N_U}$, we construct the Laplacian matrix in a supervised manner in order to transfer the clustering label information into the quantization label. In the construction of the Laplacian similarity matrix, samples with the same label are assigned a large similarity weight, and a lower similarity value is given to samples that do not share the same clustering label. Assuming we have found a method to optimize (5) with respect to $Y^{(L)}$, the regularizer term $\mathrm{tr}(Y^{(L)} L (Y^{(L)})^\top)$ in (5) will force labeled samples that share the same label to tend to receive the same quantization label $Y^{(L)}$, or quantization centers with minimal distance between them. However, as mentioned above, optimizing (5) is an NP-hard

problem and intractable. Two kinds of algorithms have emerged to ease this difficulty. The first is the exhaustive search method, which is very commonly used and has been adopted in optimized cartesian K-means [56], composite quantization [36] and sparse quantization [60]. The second is to relax the discrete variable into a continuous one and impose the corresponding constraint on it, so that the objective optimization becomes tractable.

Take the SEED dataset for example. We use $Y_C$ to denote the clustering label of the digital number, with $Y_C \in \{0, 1, \ldots, 9\}$. $Y_Q$ denotes the quantization label, with $Y_Q \in \{1, 2, 3, \ldots, 256\}$, and $Y^{(L)}$ denotes the quantization label matrix with 1-of-K encoding that is to be estimated, as mentioned above. In other words, $Y^{(L)}$ is constructed with the 1-of-K encoding scheme based on the $Y_Q$ value. $Y_C$ is known, while $Y_Q$ and $Y^{(L)}$ are unknown.

The regularizer term $\mathrm{tr}(Y^{(L)} L (Y^{(L)})^\top)$ in (5) implicitly constrains the quantization label $Y_Q$ to stay close to the clustering label $Y_C$, and the estimated $Y^{(L)}$ keeps the labeled data that share the same label as close as possible in terms of their quantization centers. In this way, label information is passed indirectly from $Y_C$ to the quantization label $Y_Q$.

4.1. ANN search

For any query $q$, we search the base set $S = \{s_1, \ldots, s_N\}$ to find the instances that are closest to $q$. If the query $q$ is a real-valued vector, we use the asymmetric quantizer distance (AQD) method to compute the distance between the query instance $q$ and a base instance $s_i$, $i \in \{1, \ldots, N\}$, where the distance $\|q - s_i\|$ is approximated by $\|q - k(s_i)\|$. If the query $q$ is a codeword, the symmetric quantizer distance (SQD) is used, where $\|k(q) - k(s_i)\|$ approximates the distance $\|q - s_i\|$. Here $k(q)$ and $k(s_i)$ denote the reconstructed query vector $q$ and base set vector $s_i$ respectively. For the orthogonal cartesian K-means and ITQ algorithms, the experiments use the asymmetric Hamming (AH) distance when the query is original data and the symmetric Hamming (SH) distance when the query is encoded data.

4.2. Computation analysis

We now give the computational complexity of our algorithm in the training and information retrieval tasks.

In the model training stage, we solve the objective function to obtain $Y^{(L)}$, $R$, $D$, $\mu$ and $B$ iteratively. Optimizing $R$, $D$, $\mu$ and $B$ has the same computational complexity as in the original Cartesian K-means algorithm; beyond that cost, however, our proposed semi-supervised Cartesian K-means needs additional computation. As mentioned previously, optimizing $Y^{(L)}$ is an NP-hard problem, and exhaustive search is a very commonly used strategy for dealing with it. Three quantization algorithms, composite quantization [36], sparse quantization [60] and optimized product quantization [33,63], have all adopted the exhaustive search strategy to optimize the label variable. In our proposed semi-supervised cartesian K-means, exhaustive search can also be used to optimize the objective function with respect to $Y^{(L)}$. Exhaustive search guarantees that the objective function reaches the global minimum, but it is a very expensive choice. In each iteration of the semi-supervised cartesian K-means optimization model, the objective function (5) is computed about $256 \times N_L \times L_C / 8$ times, where $N_L$ is the number of labeled instances and $L_C$ is the code length. The computational complexity of one loss function evaluation is $O(PN(K + P + 1) + 2KN_L^2)$. Assuming that we adopt a 32-bit code length, the computational complexity of optimizing (5) with respect to $Y^{(L)}$ is at least about $O(4N_L(PN(K + P + 1) + 2KN_L^2))$, where $P$ is the dimensionality of the samples and $K$ is the number of quantization centers, here $K = 256$. Here $N = N_L + N_U$, where $N_L$ and $N_U$ are the numbers of labeled and unlabeled samples respectively.

Based on the above analysis, we propose another algorithm to replace the exhaustive search method. We relax the discrete variable to a continuous one, impose the constraint on it, and finally threshold the continuous variable to obtain the 1-of-K code. In each iteration the computational complexity is $O(N_L^4)$, which is the cost of computing the Moore–Penrose pseudo-inverse of $C^\star$.

Our proposed algorithm therefore has lower computational complexity than the exhaustive search algorithm. Because of the pseudo-inverse computation, however, it still has a large computational cost. Developing a new algorithm with lower computational complexity and better model performance is the main task of our future research.

5. Experiment

In this experiment, we use Intel Xeon E5-2680 2.40 GHz CPUs with 12 cores and 64 GB RAM. For more efficient deep learning, we use a TITAN RTX GPU with 24 GB memory. We use the PyTorch deep learning framework to construct the CNN and MNE-Python to process the EEG data. In order to assess the advantage of our proposed algorithm SSCK, we choose three datasets on which to run a range of ANN search tests: SEED [64], with EEG and eye movement signals from fifteen volunteers and three sentiments (positive, neutral and negative); DEAP [65], with electroencephalogram and peripheral physiological signals from 32 participants; and BCI [66], in which the MI EEG recordings were collected from 9 subjects and partitioned into 4 classes of physical activity (left hand, right hand, foot and tongue).

Previous works show that CNN models pre-trained on large datasets can be transferred to extract CNN features for other image datasets. We use VGG-Net [67] to extract features. As for selecting features from the CNN, those from the shallow layers have too many dimensions and are too sparse to give appropriate classification results. Moreover, the features of the deepest layer correspond closely to the original dataset and are hard to transfer to other tasks. Therefore we select some middle layers of the CNN to extract features for the recognition task.

Selecting appropriate data is a key problem in this paper. For example, we can randomly select the data of eight subjects for training and one for subject-specific testing. In this manner, the system is tested without having seen the subject beforehand, so the subject is a completely new test case for the system. The testing is challenging, and it can also be further generalized. These sets are retained until training and testing are complete. Finally, we calculate the results by averaging the values obtained at each stage. We use a supervised learning strategy to extract effective features: the softmax classification function uses the output of the feature extraction function, so that the CNN can optimize both functions simultaneously.

We compare our semi-supervised cartesian K-means with many typical and recent approaches: product quantization (PQ) [30], cartesian K-means (CKmeans) [31], Correlated Attention Network (CAN) [68], Kernelized product quantization (KPQ) [37], online product quantization (OPQ) [38], semi-supervised extreme learning machine (SS-ELM) [69], orthogonal K-means (OKmeans) and iterative quantization (ITQ) [70], as well as composite quantization (CQ) [36].
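The subject-wise hold-out protocol described earlier in this section (eight subjects for training, one unseen subject for testing, results averaged over stages) can be sketched as follows. This is only an illustration under assumptions: the text says the held-out subject is chosen randomly, whereas the sketch rotates through every subject once, and `evaluate` is a hypothetical callable supplied by the caller that trains on the given subjects and returns an accuracy for the held-out one.

```python
from statistics import mean

def subject_holdout_splits(subjects):
    """Yield (train_subjects, test_subject) pairs, holding out each subject once."""
    for i, test_subject in enumerate(subjects):
        train_subjects = subjects[:i] + subjects[i + 1:]
        yield train_subjects, test_subject

def average_over_splits(subjects, evaluate):
    """Average the score returned by evaluate(train_subjects, test_subject).

    evaluate is a hypothetical user-supplied function; it is not part of
    the paper or of any library.
    """
    return mean(evaluate(train, test)
                for train, test in subject_holdout_splits(subjects))
```

With nine subject identifiers this produces nine splits of eight training subjects each, so the held-out subject is never seen during training.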

5.1. Experiments on SEED

The stimuli of the SEED dataset were selected from a material database. An ESI NeuroScan system was used to record the EEG signals at a sampling rate of 1000 Hz with a 62-channel electrode cap. SMI ETG eye-tracking glasses were used to record the eye movement signals. The signals recorded while the subjects watch the first 9 movie clips serve as the training dataset of each experiment, and the rest serve as the test dataset. We used DE features as the eye movement features, which are the same as in Lu et al. 2015 [64]. Code lengths of 32 and 128 bits are used in the experiments.

Table 2 shows an example of the selection criteria. The selection guidelines for film clips are: (a) the whole experiment should not be too long, to avoid fatigue of the subjects; (b) the video should be easy to understand; (c) the video should trigger a single target emotion. Each clip is about four minutes long and is carefully edited to create coherent emotions.

Table 2
An example of selection criteria.
Number | Emotion label | Film clips sources
1 | Negative | Tangshan Earthquake
2 | Negative | Back to 1942
3 | Positive | Lost in Thailand
4 | Positive | Flirting Scholar
5 | Positive | Just Another Pandora Box
6 | Neutral | World Heritage in China

In Fig. 2 we compare the recall rates of SSCK and other typical and recent approaches on SEED with 32 and 128 bits code length. From Fig. 2 we can see that the performance of SSCK surpasses the other comparative algorithms on the SEED dataset. That is because we have found an appropriate method to optimize (4) with respect to $Y$. The Cartesian K-means algorithm does not keep the advantage it shows on other datasets [31], and ITQ also performs better than Cartesian K-means. That is because we do not use SIFT or GIST features for the search task and only a small number of samples are used for model training. We only present the precision of the 128-bit encoding experimental result in 3, because all algorithms perform almost the same with 32 and 128 bits when using the pixel feature. Compared with the conventional cartesian K-means, our proposed semi-supervised cartesian K-means improves the performance effectively, which shows the efficiency of our semi-supervised cartesian K-means model. The orthogonal K-means model gives the worst performance.

Fig. 2. Performance comparisons of eight methods on SEED dataset.

From Fig. 2 we can observe a curious phenomenon: the semi-supervised cartesian K-means algorithm performs better with the symmetric quantizer distance (SQD) than with the asymmetric quantizer distance (AQD), which differs from previous conclusions. We think this is because we have introduced much locality information of the samples. Besides the recall rate comparisons, we also compare the classification performance (average accuracy and standard deviation, acc ± std) of these algorithms on this data; detailed experimental results are given in Table 3. The results show the advantage of the semi-supervised cartesian K-means over the other algorithms. In this experiment the parameters α and η for SSCK are tuned, and we set α = 0.2 and η = 0.5.

Table 3
Comparisons of classification results using 32, 64, and 128 bits.
Algorithms | acc ± std (32 bits) | acc ± std (64 bits) | acc ± std (128 bits)
SSCK | 94.95 ± 0.68 | 95.16 ± 0.67 | 95.35 ± 0.70
CKmeans | 89.15 ± 1.17 | 89.78 ± 1.22 | 90.13 ± 1.18
PQ | 90.73 ± 1.13 | 91.05 ± 1.08 | 91.54 ± 0.99
OKmeans | 91.09 ± 1.08 | 91.58 ± 1.01 | 91.89 ± 0.98
ITQ | 93.03 ± 0.81 | 93.38 ± 0.82 | 93.55 ± 0.79
CAN | 93.25 ± 0.83 | 93.87 ± 0.79 | 94.03 ± 0.77
KPQ | 93.85 ± 0.80 | 94.12 ± 0.78 | 94.21 ± 0.74
OPQ | 93.95 ± 0.81 | 94.27 ± 0.77 | 94.76 ± 0.76

5.2. Experiments on DEAP

The DEAP dataset is a collection of signals recorded while participants watch one-minute-long emotional music videos. We chose 5 as the threshold and divided the experiments into two categories according to the levels of arousal and valence. The downloaded preprocessed data is used in our experiment. We then selected 1000 samples of each class from the base dataset and divided them into two parts, one as labeled data and the other as unlabeled data, to train our proposed semi-supervised cartesian K-means. For the other comparative algorithms, we use the whole 10 000 instances to train the model. The comparisons of precision are given in 4 and 5 respectively. The classification performance is measured by average accuracy and standard deviation (acc ± std).

Table 4 shows an example of the samples we used. This file is available in Open-Office Calc (online_ratings.ods), Microsoft Excel (online_ratings.xls), and comma-separated values (online_ratings.csv) formats. The scores were collected through an online self-assessment software as cited in [71]. Participants used the Self-Assessment Manikin (SAM) to rate arousal, valence and dominance on a discrete nine-point scale. Participants also rated their feelings through the emotional wheel tool (refer to [8]).

In this experiment, the AQ and SQ ANN searches appear to perform the same for all the algorithms, so we only present the AQD recall precision of all the algorithms using the 128 bits code length (see Fig. 3).

Orthogonal K-means and cartesian K-means give unsatisfying results, which differ from the results obtained on the SEED dataset. We think the AQD and SQD methods perform the same because we do not use a large number of samples for model training and the samples are high-dimensional data. The experimental comparisons between SSCK and cartesian K-means show the efficiency of SSCK on the image retrieval task. The parameter values are α = 0.5 and η = 0.6.
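The labeling and splitting steps described for DEAP can be sketched as below. This is a minimal illustration rather than the authors' preprocessing: the function names are hypothetical, the text does not say whether a rating exactly equal to 5 counts as high or low (the sketch treats it as low), and the even labeled/unlabeled split with a fixed seed is an assumption.

```python
import random

def binarize_rating(rating, threshold=5.0):
    """Map a 1-9 self-assessment rating to a binary class (1 = high, 0 = low).

    Assumption: ratings strictly above the threshold are 'high'.
    """
    return 1 if rating > threshold else 0

def split_labeled_unlabeled(indices, labeled_fraction=0.5, seed=0):
    """Shuffle sample indices and split them into labeled and unlabeled parts."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * labeled_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Applied to the 1000 samples selected for one class, `split_labeled_unlabeled(range(1000))` returns two disjoint index lists of 500 samples each; only the first list would keep its class labels during training.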

Fig. 3. Performance comparisons of eight methods on DEAP dataset.

Table 4
An example of dataset summary.
File name | Format | Section | Details
Online_ratings | Spreadsheet | Online self-assessment | Individual ratings.
Video_list | Spreadsheet | Both parts | Music videos in YouTube links
Participant_1 | Spreadsheet | Experiment | All participants rated video.
Participant_2 | Spreadsheet | Experiment | Participants' answers.
Face_video | Zip file | Experiment | Face video on the front.
Data_original | Zip file | Experiment | Raw physiological data.
Data_preprocessed | Python, Matlab | Experiment | Preprocessed physiological data.

Table 5
Comparisons of classification results using 32, 64, and 128 bits.
Algorithms | acc ± std (32 bits) | acc ± std (64 bits) | acc ± std (128 bits)
SSCK | 89.96 ± 1.51 | 90.16 ± 1.47 | 90.36 ± 1.41
CKmeans | 84.16 ± 2.11 | 84.73 ± 2.06 | 85.11 ± 2.03
PQ | 85.23 ± 1.99 | 85.75 ± 1.98 | 86.51 ± 1.94
OKmeans | 85.89 ± 1.96 | 86.58 ± 1.91 | 87.39 ± 1.88
ITQ | 86.63 ± 1.91 | 87.38 ± 1.89 | 88.15 ± 1.85
CAN | 87.29 ± 1.83 | 87.87 ± 1.87 | 88.51 ± 1.79
KPQ | 88.05 ± 1.83 | 88.32 ± 1.81 | 88.89 ± 1.74
OPQ | 88.55 ± 1.81 | 88.97 ± 1.77 | 89.16 ± 1.72

5.3. Experiments on BCI datasets

The dataset is a collection of EEG data from 9 subjects. The cue-based BCI paradigm contains four different motor imagery tasks, namely the imagined movement of the left hand (class 1), right hand (class 2), foot (class 3), and tongue (class 4). Each session contains six runs separated by short breaks. A run contains 48 trials (12 for each of the four possible classes), resulting in a total of 288 trials per session.

The performance of the different algorithms is analyzed for different proportions of labeled training samples on the BCI dataset. Of the 288 training trials, the testing set contains 28 samples randomly chosen from the whole data, 10%–90% of the remaining 260 samples are selected as the labeled training samples, and the rest are used as the unlabeled training samples. Moreover, the 288 samples are also used as test samples. The sample allocation process is repeated ten times. Fig. 4 gives the results of representative subjects (A01, A02, A03, A04, A05, A06, A07, A08).

Our experiments randomly choose 50 samples of each object as the queries and use the remainder as the base data. 500 samples of each class are then selected from the base dataset to train the comparative models, such as the cartesian K-means, orthogonal K-means, product quantization and iterative quantization algorithms. For the SSCK model, the 1000 samples are divided evenly into two sections: one acts as the labeled samples and the other as the unlabeled samples. The corresponding contrast experiments are summarized in Fig. 4 and Table 6 respectively.

Table 6
Classification accuracy on the BCI Dataset using 128 bits.
Algorithms | SSCK | CKmeans | PQ | OKmeans | ITQ | SS-ELM | KPQ | OPQ | HSS-ELM
A01 | 83.37 | 73.77 | 74.82 | 76.66 | 78.65 | 79.27 | 81.93 | 82.74 | 81.14
A02 | 51.04 | 43.17 | 44.57 | 46.24 | 47.61 | 48.68 | 48.78 | 49.84 | 49.86
A03 | 78.69 | 70.23 | 71.59 | 77.23 | 76.12 | 77.95 | 77.77 | 77.81 | 78.02
A04 | 63.93 | 59.09 | 58.55 | 60.29 | 61.53 | 62.99 | 63.78 | 62.89 | 63.33
A05 | 45.36 | 39.25 | 40.62 | 41.28 | 43.56 | 44.86 | 44.80 | 44.02 | 44.03
A06 | 50.33 | 41.35 | 43.90 | 45.39 | 46.59 | 48.95 | 48.81 | 48.98 | 49.44
A07 | 82.02 | 73.12 | 75.54 | 76.23 | 79.03 | 80.83 | 80.77 | 81.94 | 81.11
A08 | 82.06 | 74.03 | 76.51 | 78.21 | 80.51 | 81.21 | 81.76 | 81.87 | 81.49
A09 | 82.62 | 72.12 | 74.44 | 76.16 | 79.48 | 80.69 | 80.96 | 81.81 | 81.38
MC | 68.82 | 60.68 | 62.28 | 64.19 | 65.90 | 67.27 | 67.70 | 67.99 | 67.76
MK | 0.5886 | 0.5021 | 0.5232 | 0.5417 | 0.5571 | 0.5636 | 0.5756 | 0.5792 | 0.5701

Experimental results show that SSCK performs better than its competitors on all three experiments using 32 and 128 bits code length respectively. Product quantization has the worst performance except in the 32-bit code experiment. That is because the features change greatly with the code length and this kind of algorithm is no longer suitable; all the algorithms except product quantization adopt a rotation matrix to adjust the dimensions of the samples and reduce the errors. In this experiment we use the common spatial pattern (CSP) approach in [66] for the ANN search. The algorithms that adopt the rotation matrix strategy can implicitly reduce the objective function error and improve the ANN search performance, whereas the product quantization algorithm does not use a rotation matrix to adjust the features and cannot obtain a satisfying result. Compared with the cartesian K-means, our proposed SSCK effectively improves the ANN search performance on the BCI dataset.

Unlike the circumstances presented in [31], when all the algorithms use image pixels as the feature, the results cannot confirm the advantage of cartesian K-means over iterative quantization, and the AQ (AH) and SQ (SH) methods have almost the same performance.

Additionally, in order to emphasize the necessity of our research and make our experiments more complete, we have added more comparative experiments and statistical hypothesis tests to demonstrate the effectiveness of the algorithm. We compare the proposed model to four recent and representative algorithms: the hierarchical semi-supervised extreme learning machine method (HSS-ELM) [66], the wavelet transform time–frequency image and convolutional network based approach (WTT-CNN) [72], Multilevel Weighted Feature Fusion Using Convolutional Neural Networks (MWF-CNN) [73], and the novel fused convolutional neural network (FCNN) [74]. The classification performance is evaluated

in terms of average accuracy and standard deviation (acc ± std). Tables 7–9 summarize the classification results. The results in these three tables show that our proposed algorithm achieves the best performance, while the pure supervised learning algorithms with deep architecture, such as HSS-ELM, WTT-CNN, MWF-CNN, and FCNN, perform worse. This demonstrates that our semi-supervised cartesian K-means algorithm enhances classification accuracy, suggesting effective exploitation of unlabeled data. Our model with deep features also outperforms the other existing deep learning algorithms on all three datasets, indicating the successful incorporation of a deep network architecture to extract compact and high-level features. Although its computational efficiency is not the best, the computational cost of our algorithm is very low compared with the deep learning algorithms on all of the datasets.

Fig. 4. Performance comparisons of nine approaches on BCI datasets.

Table 7
Comparisons of classification results and training time on SEED dataset.
Algorithms | acc ± std | Time (s)
OurAlgorithm | 95.35 ± 0.71 | 58
HSS-ELM | 92.16 ± 1.21 | 28.175
WTT-CNN | 91.83 ± 1.39 | 2800
MWF-CNN | 90.89 ± 1.46 | 2280
FCNN | 91.63 ± 1.41 | 1300

Table 8
Comparisons of classification results and training time on DEAP dataset.
Algorithms | acc ± std | Time (s)
OurAlgorithm | 90.36 ± 1.41 | 49
HSS-ELM | 87.86 ± 1.71 | 26.218
WTT-CNN | 87.83 ± 1.79 | 2710
MWF-CNN | 87.17 ± 1.83 | 2190
FCNN | 88.03 ± 1.65 | 1220

Table 9
Comparisons of classification results and training time on BCI dataset.
Algorithms | acc ± std | Time (s)
OurAlgorithm | 76.86 ± 1.78 | 42
HSS-ELM | 72.16 ± 2.21 | 24.852
WTT-CNN | 71.89 ± 2.39 | 2500
MWF-CNN | 73.68 ± 2.06 | 2056
FCNN | 74.29 ± 1.91 | 1170

For further statistical analysis, we perform a t-test on the precision obtained by the HSS-ELM, WTT-CNN, MWF-CNN and FCNN methods and by our approach on the SEED, DEAP and BCI datasets, under the null hypothesis and using a significance level of 0.05. The p-values are 0.00082, 0.00102 and 0.00221 respectively, indicating that the accuracy of our approach is significantly better than that of the four recently proposed methods above.

6. Conclusion

EI EMM recognition could promote the interaction between human and intelligent devices. We propose a semi-supervised

cartesian K-means architecture based on CNN for EEG MI classification. This kind of recognition model not only captures the local activities responding to emotion, but also exploits the interactions among intracranial nerve areas. In our research work, we have designed the semi-supervised cartesian K-means model based on the cartesian K-means and the orthogonal optimal reverse prediction algorithms. To improve on the performance of the traditional cartesian K-means, we use the samples' label information to construct the similarity matrix. Samples that share the same label are given a high weight, and a lower weight is given to those that have different labels. Instances that have the same label will be assigned to the same or the nearest cluster in the quantization step. However, optimizing the objective function with respect to the label matrices is a relatively difficult problem. We optimize the equation for each variable separately, and the performance is validated on three public datasets. Detailed experimental analysis indicates that our proposed algorithm performs better than the other comparative algorithms. We have obtained good results in our research on the semi-supervised cartesian K-means algorithm; however, more effort is needed to solve the remaining problems, such as finding a more suitable algorithm for optimizing the semi-supervised cartesian K-means objective function and proposing better semi-supervised quantization algorithms to improve the performance. Incorporating the deep features of a deep network model proves to be more effective for improving the classification and evaluation of MI EEG signals.

The main contributions of our work are reflected in four aspects. Since the number of labeled samples is very small, we propose a semi-supervised cartesian K-means algorithm with deep features extracted by a CNN from EEG data. Labeled data is integrated into the quantization step to provide additional constraints that improve the data reconstruction ability. A Laplacian matrix is built from the labeled data and added to the optimal reverse prediction model to obtain a more discriminative cartesian K-means model. A strategy for optimizing the semi-supervised cartesian K-means is given so that the function can reach a minimum value. The proposed framework is designed to extract spectral and temporal features from EEG motor data while learning general spatially invariant characteristics of MI tasks. The multilayer feature fusion methods based on CNN have yet to be tested on other EEG datasets, but our method could also be used for other EEG applications.

Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2020.106071.

CRediT authorship contribution statement

Minjie Liu: Investigation, Methodology, Software, Writing - original draft. Mingming Zhou: Investigation, Data curation, Methodology, Visualization. Tao Zhang: Conceptualization, Writing - review & editing, Supervision. Naixue Xiong: Visualization, Data curation, Validation.

Acknowledgments

This research was partly supported by National Science Foundation, China (No. 61702226, 21365008, 61562013), the Natural Science Foundation of Jiangsu Province, China (Grant no. BK20170200, BK20161135), the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources, China (KF-2018-03-065), the Fundamental Research Funds for the Central Universities, China (JUSRP11854).

References

[1] G. Wenzhong, J. Li, C. Guolong, N. Yuzhen, C. Chengyu, A pso-optimized real-time fault-tolerant task allocation algorithm in wireless sensor networks, Trans. Parallel Distrib. Syst. 26 (2015) 3236–3249.
[2] S. Zhirong, P. P. C. Lee, S. Jiwu, G. Wenzhong, Encoding-aware data placement for efficient degraded reads in xor-coded storage systems: Algorithms and evaluation, Trans. Parallel Distrib. Syst. 29 (2018) 2757–2770.
[3] C. Yongli, J. Hong, W. Fang, H. Yu, F. Dan, G. Wenzhong, W. Yunxiang, Using high-bandwidth networks efficiently for fast graph computation, Trans. Parallel Distrib. Syst. 30 (2019) 1170–1183.
[4] H. Xing, G. Wenzhong, L. Genggeng, C. Guolong, Fh-oaos: a fast 4-step heuristic for obstacle-avoiding octilinear architecture router construction, ACM Trans. Des. Autom. Electron. Syst. 21 (2016) 1–31.
[5] H. Xing, L. Genggeng, G. Wenzhong, N. Yuzhen, C. Guolong, Obstacle-avoiding algorithm in x-architecture based on discrete particle swarm optimization for vlsi design, ACM Trans. Des. Autom. Electron. Syst. 20 (2015) 24–28.
[6] L. Bin, G. Wenzhong, X. Naixue, C. Guolong, V.V. Athanasios, Z. Hong, A pretreatment workflow scheduling approach for big data applications in multi-cloud environments, Trans. Netw. Serv. Manage. 13 (2016) 581–594.
[7] X. Kelvin, B. Jimmy, K. Ryan, C. Kyunghyun, C. A.C, Brain-computer interfaces: principles and practice, 2012.
[8] K.D. Sidney, M.K. Jacqueline, A review and meta-analysis of multimodal affect detection systems, ACM Comput. Surv. 47 (2015) 43–50.
[9] X. Kelvin, B. Jimmy, K. Ryan, C. Kyunghyun, C. A.C, Show, attend and tell: Neural image caption generation with visual attention, in: ICML, Vol. 14, 2015, pp. 77–81.
[10] P. Gert, N. Christa, Motor imagery and direct brain-computer communication, Proc. IEEE 89 (2001) 1123–1134.
[11] S. Saeid, A.C. Jonathon, EEG Signal Processing, John Wiley & Sons, 2007.
[12] Z. Weiping, G. Wenzhong, Y. Zhiyong, X. Haoyi, Multitask allocation to heterogeneous participants in mobile crowd sensing, Wirel. Commun. Mobile Comput. 2018 (2018) 721–731.
[13] M. Yuchang, X. Liudong, L. Yi-Kuei, G. Wenzhong, Efficient analysis of repairable computing systems subject to scheduled checkpointing, Trans. Dependable Secure Comput. 2018 (2018) 286–293.
[14] Y. Yang, L. Ximeng, Z. Xianghan, R. Chunming, G. Wenzhong, Efficient traceable authorization search system for secure cloud storage, Trans. Cloud Comput. Online Publ. 2018 (2018) 282–294.
[15] W. Shiping, G. Wenzhong, Sparse multi-graph embedding for multimodal feature representation, Trans. Multimedia 19 (2017) 1454–1466.
[16] L. Fangfang, G. Wenzhong, Y. Yuanlong, C. Guolong, A multi-label classification algorithm based on kernel extreme learning machine, Neurocomputing 260 (2016) 313–320.
[17] N. Yuzhen, C. Jianer, G. Wenzhong, Meta-metric for saliency detection evaluation metrics based on application preference, Multimedia Tools Appl. 77 (2018) 26351–26369.
[18] S. Wojciech, K. Motoaki, M. Klausrobert, Divergence-based framework for common spatial patterns algorithms, IEEE Rev. Biomed. Eng. 7 (2014) 50–72.
[19] S. Qingshan, G. Haitao, M. Yuliang, L. Zhizeng, Scale-dependent signal identification in low-dimensional subspace: motor imagery task classification, Neural Plast. 2016 (2016) 743–752.
[20] H.F. Jerome, J.L. Bentley, F. Raphael Ari, An algorithm for finding best matches in logarithmic expected time, ACM Trans. Math. Softw. (TOMS) 3 (3) (1977) 209–226.
[21] C. Silpa-Anan, R. Hartley, Optimised kd-trees for fast image descriptor matching, in: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, 2008, pp. 1–8.
[22] A. Gionis, P. Indyk, R. Motwani, et al., Similarity search in high dimensions via hashing, in: VLDB, Vol. 99, 1999, pp. 518–529.
[23] M. Datar, N. Immorlica, P. Indyk, V.S. Mirrokni, Locality-sensitive hashing scheme based on p-stable distributions, in: Proceedings of the Twentieth Annual Symposium on Computational Geometry, 2004, pp. 253–262.
[24] B. Kulis, K. Grauman, Kernelized locality-sensitive hashing, IEEE Trans. Pattern Anal. Mach. Intell. 34 (6) (2012) 1092–1104.
[25] W. Jun, K. Sanjiv, C. Shih-Fu, Semi-supervised hashing for scalable image retrieval, in: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, 2010, pp. 3424–3431.
[26] T. Zhang, W. Jia, C. Gong, J. Sun, X. Song, Semi-supervised dictionary learning via local sparse constraints for violence detection, Pattern Recognit. Lett. 107 (2018) 98–104.
[27] H. Kaiming, F. Wen, J. Sun, K-means hashing: An affinity-preserving quantization method for learning binary compact codes, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 2938–2945.
[28] L. Wei, W. Jun, J. Rongrong, J. Yu-Gang, C. Shih-Fu, Supervised hashing with kernels, in: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, 2012, pp. 2074–2081.

[29] T. Zhang, W. Jia, X. He, J. Yang, Discriminative dictionary learning with motion Weber local descriptor for violence detection, IEEE Trans. Circuits Syst. Video Technol. 27 (3) (2017) 696–709.
[30] H. Jegou, M. Douze, C. Schmid, Product quantization for nearest neighbor search, IEEE Trans. Pattern Anal. Mach. Intell. 33 (1) (2011) 117–128.
[31] M. Norouzi, D.J. Fleet, Cartesian k-means, in: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 3017–3024.
[32] S. Lloyd, Least squares quantization in PCM, Inf. Theory IEEE Trans. 28 (2) (1982) 129–137.
[33] G. Tiezheng, H. Kaiming, K. Qifa, S. Jian, Optimized product quantization for approximate nearest neighbor search, in: Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, 2013, pp. 2946–2953.
[34] Y. Kalantidis, Y. Avrithis, Locally optimized product quantization for approximate nearest neighbor search, in: Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, 2014, pp. 2329–2336.
[35] W. Jianfeng, J. Song, X. Xu, H. Shen, S. Li, Optimized cartesian k-means, 2014.
[36] Z. Ting, C. Du, W. Jingdong, Composite quantization for approximate nearest neighbor search, in: ICML, 2014, pp. 838–846.
[37] J. Liu, Y. Zhang, J. Zhou, Kernelized product quantization, Neurocomputing 235 (2017) 15–26.
[38] D. Xu, I.W. Tsang, Y. Zhang, Online product quantization, IEEE Trans. Knowl. Data Eng. 30 (2018) 2185–2198.
[39] S. Hui-Wen, F. Yun-Fa, X. Xin, Identification of EEG induced by motor imagery based on Hilbert-Huang transform, Acta Automat. Sinica (2015) 1686–1692.
[40] F. Yun-Fa, X. Bao-Lei, L. Yong-Cheng, Recognition of actual grip force movement modes based on movement-related cortical potentials, Acta Automat. Sinica (2014) 1045–1057.
[41] A.M. Ebrahim, J.M. Jerome, B.F. Paul, J.L. Brian, Wavelet common spatial pattern in asynchronous offline brain computer interfaces, Biomed. Signal Process. Control (2011) 121–128.
[42] H. Wei-Chun, L. Li-Fong, C. Chun-Wei, H. Yu-Tsung, L. Yi-Hung, EEG classification of imaginary lower limb stepping movements based on fuzzy support vector machine with kernel-induced membership function, Int. J. Fuzzy Syst. (2017) 566–579.
[43] H. Lianghua, H. Die, W. Meng, W. Ying, D. Karen M. von, Z. MengChu, Common Bayesian network for classification of EEG-based multiclass motor imagery BCI, IEEE Trans. Syst. Man Cybern.: Syst. (2016) 843–854.
[44] I. Md Rabiul, T. Toshihisa, M. Islam, Multiband tangent space mapping and feature selection for classification of EEG during motor imagery, J. Neural Eng. (2018) 1333–1344.
[45] J.D. López, V. Litvak, J.J. Espinosa, K. Friston, G. Barnes, Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM, NeuroImage 84 (1) (2014) 476–487.
[46] B.P. Hayes, J.K. Gruber, M. Prodanovic, A closed-loop state estimation tool for MV network monitoring and operation, IEEE Trans. Smart Grid 6 (4) (2015) 2116–2125.
[47] C. Eduardo, S. Miho, N. Isao, W. Yasuhiro, Convolutional neural networks with 3D input for P300 identification in auditory brain-computer interfaces, Comput. Intell. Neurosci. (2017) 16–25.
[48] Y.R. Tabar, U. Halici, A novel deep learning approach for classification of EEG motor imagery signals, J. Neural Eng. (2017) 159–169.
[49] X. An, D. Kuang, X. Guo, Y. Zhao, L. He, A deep learning method for classification of EEG data based on motor imagery, Intell. Comput. Bioinform. (2014) 203–210.
[50] H. Yang, S. Sakhavi, K.K. Ang, C. Guan, On the use of convolutional neural networks and augmented CSP features for multi-class motor imagery of EEG signals classification, in: 2015 37th Annual Int. Conf. of the IEEE Engineering in Medicine and Biology Society, 2015, pp. 2620–2623.
[51] V.J. Lawhern, A.J. Solon, N.R. Waytowich, EEGNet: a compact convolutional neural network for EEG-based brain–computer interfaces, J. Neural Eng. (2018) 369–374.
[52] Y. Zhang, Y. Qian, D. Wu, M.S. Hossain, A. Ghoneim, M. Chen, Emotion-aware multimedia system security, IEEE Trans. Multimed. (2018) 1251–1259.
[53] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient Based Learning Applied to Document Recognition, IEEE, 1998, pp. 2278–2324.
[54] Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle, Greedy layer-wise training of deep networks, Adv. Neural Inform. Process. Syst. (2007) 136–147.
[55] X. Linli, M. White, D. Schuurmans, Optimal reverse prediction: a unified perspective on supervised, unsupervised and semi-supervised learning, in: Proceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 1137–1144.
[56] W. Jianfeng, J. Song, X. Xu, H. Shen, S. Li, Optimized cartesian k-means, IEEE Trans. Knowl. Data Eng. 27 (1) (2015) 180–192.
[57] J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 22 (8) (2000) 888–905.
[58] U. Von Luxburg, A tutorial on spectral clustering, Stat. Comput. 17 (4) (2007) 395–416.
[59] M. Zheng, J. Bu, C. Chen, C. Wang, L. Zhang, G. Qiu, D. Cai, Graph regularized sparse coding for image representation, IEEE Trans. Image Process. 20 (5) (2011) 1327–1336.
[60] Z. Ting, Q. Guo-Jun, T. Jinhui, W. Jingdong, Sparse composite quantization, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 4548–4556.
[61] W. Zaiwen, Y. Wotao, A feasible method for optimization with orthogonality constraints, Math. Program. 142 (1–2) (2013) 397–434.
[62] P.H. Schönemann, A generalized solution of the orthogonal Procrustes problem, Psychometrika 31 (1) (1966) 1–10.
[63] G. Tiezheng, H. Kaiming, K. Qifa, S. Jian, Optimized product quantization, 2014.
[64] Y. Lu, W. Zheng, B. Li, Combining eye movements and EEG to enhance emotion recognition, in: IJCAI'15 Proceedings of the 24th International Conference on Artificial Intelligence, 2015, pp. 1170–1176.
[65] Z. Yin, M. Zhao, Y. Wang, Recognition of emotions using multimodal physiological signals and an ensemble deep learning model, Comput. Methods Programs Biomed. 140 (2017) 93–110.
[66] Q. She, Z. Luo, T. Nguyen, Y. Zhang, A hierarchical semi-supervised extreme learning machine method for EEG recognition, Med. Biol. Eng. Comput. 57 (2019) 147–157.
[67] S. Christian, L. Wei, J. Yangqing, Going deeper with convolutions, in: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–9.
[68] J. Qiu, X. Li, K. Hu, Correlated attention networks for multimodal emotion recognition, in: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018, pp. 2656–2660.
[69] G. Huang, S. Song, J. Gupta, Semi-supervised and unsupervised extreme learning machines, IEEE Trans. Cybern. 44 (2014) 2405–2417.
[70] G. Yunchao, S. Lazebnik, Iterative quantization: A procrustean approach to learning binary codes, in: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, 2011, pp. 817–824.
[71] S. Mohammad, P. Maja, P. Thierry, Multimodal emotion recognition in response to videos, Affect. Comput. Intell. Interact. (2015) 491–497.
[72] X. Baoguo, S. Linlin, W. Aiguo, Wavelet transform time-frequency image and convolutional network based motor imagery EEG classification, IEEE Access (2018) 1–11.
[73] A. Syed-Umar, A. Mansour, M. Ghulam, Multilevel weighted feature fusion using convolutional neural networks for EEG motor imagery classification, IEEE Access (2019) 16–26.
[74] P. Shutao, A novel fused convolutional neural network for biomedical image classification, Med. Biol. Eng. Comput. (2019) 107–121.

Minjie Liu received the bachelor's degree from Changjiang University, Jingzhou, China, in 2006, and the M.S. degree in Physiology from Liaoning Medical University, Jinzhou, China, in 2013. She is currently a lecturer with the School of Nursing, Taihu University of Wuxi, Wuxi, China. Her research interests focus on understanding the mechanisms of electromagnetic activities in biological tissue and systems, and on computational modeling and analysis of organ systems to aid clinical diagnosis of dysfunction in the human body.

Mingming Zhou received the M.S. degree in Anatomy and Histoembryology from Nantong University Medical School, Jiangsu Province, China, in 2004, and the Ph.D. degree in Molecular Medicine from Fudan University Medical School, Shanghai, China, in 2009. He is currently an associate professor with the School of Nursing, Taihu University of Wuxi, Wuxi, China. He has authored over twenty quality journal articles and conference papers. His research interests focus on understanding the mechanisms of electromagnetic activities in biological tissue and systems, and on computational modeling and analysis of organ systems to aid clinical diagnosis of dysfunction in the human body.

Tao Zhang received the bachelor's degree from Henan Polytechnic University, Jiaozuo, China, in 2008, and the Ph.D. degree from the Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China, in 2016. He is currently an associate professor with the Jiangsu Provincial Engineering Laboratory for Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China. He has led many research projects (e.g., the National Science Foundation and the National Joint Fund). He has authored over thirty quality journal articles and conference papers. His current research interests include medical image processing, medical data analysis, visual surveillance, scene understanding, behavior analysis, object detection, and pattern analysis.

Naixue Xiong is currently a Professor in the College of Intelligence and Computing, Tianjin University, China. He received two Ph.D. degrees, one from Wuhan University (in sensor system engineering) and one from the Japan Advanced Institute of Science and Technology (in dependable sensor networks). Before joining Tianjin University, he worked for about 10 years at Northeastern State University, Georgia State University, Wentworth Technology Institution, and Colorado Technical University (as a full professor for about 5 years). His research interests include Cloud Computing, Security and Dependability, Parallel and Distributed Computing, Networks, and Optimization Theory.
