
A Transfer Learning Based Cross-Subject Generic Model For Continuous Estimation of Finger Joint Angles From A New User


1914 IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, VOL. 27, NO. 4, APRIL 2023

A Transfer Learning Based Cross-Subject Generic Model for Continuous Estimation of Finger Joint Angles From a New User

Yucheng Long, Student Member, IEEE, Yanjuan Geng, Member, IEEE, Chenyun Dai, Member, IEEE, and Guanglin Li, Senior Member, IEEE

Abstract—Continuous estimation of finger joints based on surface electromyography (sEMG) has attracted much attention in the field of human-machine interface (HMI). A couple of deep learning models were proposed to estimate the finger joint angles for a specific subject. When applied to a new subject, however, the performance of the subject-specific model would degrade significantly due to the inter-subject differences. Therefore, a novel cross-subject generic (CSG) model was proposed in this study to estimate continuous kinematics of finger joints for new users. Firstly, a multi-subject model based on the LSTA-Conv network was built by using sEMG and finger joint angles data from multiple subjects. Then, the subjects adversarial knowledge (SAK) transfer learning strategy was adopted to calibrate the multi-subject model with the training data from a new user. With the updated model parameters and the testing data from the new user, multiple finger joint angles could be estimated afterwards. The overall performance of the CSG model for new users was validated on three public datasets from Ninapro. The results showed that the newly proposed CSG model significantly outperformed five subject-specific models and two transfer learning models in terms of Pearson correlation coefficient, root mean square error, and coefficient of determination. Comparison analysis showed that both the long short-term feature aggregation (LSTA) module and the SAK transfer learning strategy contributed to the CSG model. Moreover, increasing the number of subjects in the training set improved the generalization capability of the CSG model. The novel CSG model would facilitate the application of robotic hand control and other HMI settings.

Index Terms—Surface electromyography (sEMG), finger joint angles, motor intent estimation, generic model, multiuser, cross-subject, transfer learning.

Manuscript received 13 September 2022; revised 30 November 2022; accepted 4 January 2023. Date of publication 6 January 2023; date of current version 5 April 2023. This work was supported in part by the National Natural Science Foundation of China under Grants U1913601 and 82161160341, in part by the Natural Science Foundation of Guangdong Province under Grant 2021A1515011892, and in part by the Basic Research Program of Shenzhen Scientific Plan under Grant JCYJ20210324101607022. (Yucheng Long and Yanjuan Geng contributed equally to this work.) (Corresponding authors: Yanjuan Geng; Guanglin Li.)
Yucheng Long, Yanjuan Geng, and Guanglin Li are with the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China, and also with the University of Chinese Academy of Sciences, Beijing 101400, China (e-mail: [email protected]; [email protected]; [email protected]).
Chenyun Dai is with the Center for Intelligent Medical Electronics, School of Information Science and Technology, Fudan University, Shanghai 200433, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/JBHI.2023.3234989

I. INTRODUCTION

ESTIMATION of human motor intent based on surface electromyogram (sEMG) plays an important role in the intuitive human-machine interface (HMI). This technique has recently been increasingly applied in several scenarios, such as space teleoperation, industrial robots, intelligent prostheses, and neurorehabilitation robots. However, the state-of-the-art literature has mostly focused on the subject-specific model for motor intent identification so far. Namely, this model requires data from a specific subject for model training. The model performance would significantly degrade when applied to a new subject independent of model training or even to the same subject on a different day due to inter-subject and inter-day differences and the nonstationary characteristics of sEMG. Thus, building a generic model for motor intent estimation across multiple users becomes essential to facilitate the application of sEMG-based HMI.

Recently, the framework of feature engineering and pattern recognition (FE-PR) has been mostly investigated towards building a cross-subject generic myoelectric interface. Specifically, several feature transformation methods were proposed aiming at reducing the discrepancies between the target-domain set and the source-domain set [1], [2], [3], [4]. For example, Khushaba et al. implemented a style-independent feature transformation method named canonical correlation analysis (CCA), in which different users’ data is projected onto a unified-style space [2]. To realize multiuser hand gesture recognition, Xue et al. proposed another novel framework. They used the CCA to extract inherent user-independent properties from multiuser EMG signals, then an optimal transport was applied to reduce the discrepancies in data distribution between the transformed feature matrix from the training and the testing sets [3]. Similarly, Sheng et al. presented a common spatial-spectral analysis framework to overcome the cumbersome training and retraining procedures that are required across multiple days and multiple users [4]. They used a linear projection to minimize the objective function, which was formulated to measure the diversity of spatial-spectral EMG signals among multiple days and multiple users. All these
2168-2194 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: Indian Institute of Technology Patna. Downloaded on June 17,2023 at 13:21:30 UTC from IEEE Xplore. Restrictions apply.
LONG et al.: TRANSFER LEARNING BASED CROSS-SUBJECT GENERIC MODEL 1915

methods proved effective in improving the cross-subject identification performance of motor intents. However, it is noteworthy that with the PR framework, only discrete recognition of predefined motion classes could be output for a new user. It remains unclear how the motor intent identification performance would be if the number of degrees of freedom (DOF) increased.

Compared with the conventional FE-PR framework, continuous kinematics estimation from multiple DOF, such as the joint angles or torques, is more promising from the perspective of robotic control, but also challenging. In order to provide simultaneous and proportional control (SPC) of robotic hands for a new user, both model-based and model-free approaches have been investigated recently. For instance, Pan et al. [5] proposed a user-generic musculoskeletal model to continuously predict the coordinated motion between metacarpophalangeal and wrist flexion/extension. A minimum calibration procedure that only involves capturing maximal voluntary muscle contraction was required for individual users. Note that the model-based approaches depend on the existence of “proven laws”, which need to accurately describe the dynamics of the constitutive units and the functional relationship between observed variables, and are thus limited by the uncertainty that internal physiological parameters cannot always be measured. When facing practical control problems, researchers tend to use model-free approaches, which treat the relationship between the EMG and kinematics as a black box model. More recently, Jiang et al. utilized linear and quadratic regression models with a time lag to build the EMG-torque models [6]. They proposed an unsupervised cross-subject model calibration scheme via correlation-based data weighting. The results showed significantly reduced root mean square error of EMG-torque regression in cross-subject settings. These recent studies made important steps towards building generic models across multiple users [5], [6]; however, they are limited to only shoulder, elbow, and wrist joints. Given the hand’s complicated anatomical structure and kinesiology, building a cross-subject generic model for finger kinematics estimation is much more challenging. In one more recent work, a BERT-based network was proposed for continuous estimation of cross-subject hand kinematics, wherein the testing data were from the individuals used to build the proposed BERT model [7]; essentially, they were not from a new user independent of model training.

Therefore, herein we proposed a novel transfer learning based cross-subject generic (CSG) model to estimate the finger kinematics for new users independent of model training. Firstly, the multi-subject model based on the LSTA-Conv network was built using sEMG and finger joint angles data from multiple subjects. Then, the subjects adversarial knowledge (SAK) transfer learning strategy was adopted to calibrate the multi-subject model with the training data from a new user. With the updated model parameters and the testing data from the new user, multiple finger joint angles could be estimated afterwards. The generalization performance of the proposed CSG model to new users was validated on three public datasets from Ninapro. The main novelties of this work are:
1) A transfer learning based CSG model (LSTA-Conv+SAK) for continuous estimation of finger joint angles from a new user was proposed for the first time.
2) The LSTA-Conv network was used to build the multi-subject model, and the newly proposed long short-term feature aggregation (LSTA) module could extract more representative sEMG features.
3) The SAK based transfer learning strategy was proposed to transfer information from multiple users to a new user by means of adversarial learning. It improved the generalization ability of the multi-subject model significantly.

To the best of our knowledge, this is the first study to build a CSG model for continuous estimation of finger kinematics from a new user. The outcomes obtained in this work would facilitate the application of sEMG-based robot hand control and other HMI settings.

II. METHODS

A. Datasets

Ninapro is a well-known public database designed for research on artificial intelligence robots and prosthetic hands [8]. For easy access, the first, second and seventh datasets of Ninapro, named DB1, DB2, and DB7, were selected to verify the generalization ability of our proposed method in this study [9], [10]. Table I summarizes the general description of the three datasets. In this study, only right-handed intact subjects were included, while the left-handed and amputated subjects were excluded. Specifically, a total of 25, 35, and 16 subjects from DB1, DB2, and DB7 were included, respectively.

TABLE I
DATASETS DESCRIPTION

Fig. 1. Dataset description. (a) Six grasping movements, and (b) 10 finger joint angles denoted as yellow dots were selected in this work.

In this work, six hand grasping movements based on their representative shapes and diameters were selected from each dataset (Fig. 1(a)). The shapes included a cylinder, a ball and a flat object. The diameters included large, medium and small diameter objects. Each movement was repeated 6 times, held


for 5 seconds and then followed by 3 seconds of rest for DB2 and DB7, while it was repeated 10 times for DB1. During movement execution, 10-12 channels of sEMG were recorded with different sampling rates and different devices. Meanwhile, the Cyber Glove II data-glove was used to record 22 finger joint angles with a sampling rate of 20 Hz. Herein, a total of 10 finger joint angles (Fig. 1(b)), namely those of the proximal interphalangeal and metacarpophalangeal joints, were included in this study, because they are the main active joints in the grasping movement.

B. Data Preprocessing

1) Denoising: A fourth-order Butterworth bandpass filter with cutoff frequencies of 5–450 Hz was used to remove the direct current component and high-frequency noise of sEMG, and a notch filter was used to attenuate the power line interference (50 Hz). As for the joint angle signals, a second-order low-pass filter was used to smooth the curve.

2) Amplification: Given the small amplitudes of several sEMG channels, a logarithmic scaling algorithm called the μ-law transformation was used to amplify the sEMG signals [13], [14], which is formulated as:

$$F(x_t) = \operatorname{sign}(x_t)\,\frac{\ln(1 + \mu |x_t|)}{\ln(1 + \mu)} \tag{1}$$

where $t$ is the time, $x_t$ is the sEMG signal, and $\mu$ is an artificially set parameter. In this experiment, $\mu$ was set as 256.

3) Window Segmentation: Considering the requirement of data sufficiency for deep learning, the amplified sEMG and the smoothed kinematics data were converted into multi-channel long exposure by means of window segmentation. For DB2 and DB7, we used a sliding window with a time length of 1000 ms and an increment of 50 ms, given the tradeoff between acceptable accuracy and controller delay. While for DB1, the sliding window size was set as 1600 ms, and the increment was set as 100 ms because of its lower sampling frequency (100 Hz). The last joint angle value within each window was used to construct the joint angle feature matrix.

4) Normalization: Next, the sEMG was normalized to [−1, 1]. Specifically, for each window $S_F$, the sEMG was normalized to $[-S_r, +S_r]$ to keep an appropriate amplitude range, and then linearly mapped to [−1, 1]. The value of $S_r$ was chosen according to the formula:

$$S_r = \max\bigl(\max(S_F),\ \operatorname{abs}(\min(S_F))\bigr) + \alpha \tag{2}$$

where $\alpha$ is an artificially set parameter.

C. Transfer Learning Based Cross-Subject Generic Model

With the preprocessed EMG and finger joint angles, the performance of the transfer learning based CSG model could be evaluated. Fig. 2 illustrates the overall block diagram and the structure of the newly proposed CSG model. It was composed of two main parts, the LSTA-Conv network and the SAK transfer learning strategy. The LSTA-Conv network could serve as a multi-subject model or a subject-specific model, wherein four LSTA modules were used to extract sEMG features on different temporal scales. The SAK transfer learning strategy was proposed to map the data from multiple subjects to a new user and calibrate the multi-subject model. In this section, the construction of the LSTA module and the SAK transfer learning strategy will be mainly described as follows.

Fig. 2. The overall structure of the newly proposed cross-subject generic (CSG) model. It was composed of two main parts, the LSTA-Conv network and the subject adversarial knowledge (SAK) based transfer learning strategy. The multi-subject model was firstly built based on the sEMG and finger joint angles from multiple subjects, and then calibrated by the SAK transfer learning strategy and the training data from a new subject. With the updated multi-subject model, the finger joint angles from the new user could be estimated. Herein, the LSTA-Conv network was composed of the long short-term feature aggregation (LSTA) module, multi-scale convolution [11], and logistic regressor. Four LSTA modules (N = 4) constitute the backbone of the multiuser model. Multi-scale convolution was designed to aggregate features across different time frames, where DC-n denotes the dilated convolution [12] with a dilation rate of n. A logistic regressor was then applied for finger joint angle estimation, where conv denotes convolution, mean denotes dimension mean operation, and MLP denotes fully connected layer.

1) Long Short-Term Feature Aggregation Module: As shown in Fig. 3(a), a reconstruct operation at the beginning of the LSTA


was used to reduce the input sequence $X \in \mathbb{R}^{c \ast l}$ by half in the time dimension and double the channel dimension, yielding the output $\tilde{X} \in \mathbb{R}^{2c \ast \frac{l}{2}}$ without information loss [15]. The reconstructed data was then encoded by means of convolution, and split into two equal parts, which were fed into two parallel branches separately. The long-term branch could capture the sparse long-term relationships from all the temporal signals, while the short-term branch could extract local information from neighboring patches. In this work, the Chunked Transformer was used in the long-term branch for its low computational cost [16]. For the short-term branch, the Cross Stage Partial (CSP) structure was employed for high computational efficiency [17]. Herein, the feature receptive fields on the long-term branch and the short-term branch of LSTA were set as 25 and 3, respectively. Then, the outputs from the two parallel branches were concatenated together, followed by a residual connection. At last, the multi-level features were obtained via convolution.

Fig. 3. The construction of the long short-term feature aggregation (LSTA) module. (a) Two parallel branches, the long-term branch and the short-term branch, comprise the LSTA module. The long-term branch mainly consists of n transformer blocks in stacks, while the core structure of the short-term branch is the cross stage partial (CSP_N) block. The size of the input sequence is c ∗ l, where c denotes the number of features, and l denotes the length of the features in the time dimension. Conv denotes convolution, and concat denotes the concatenate operation. (b) The structure of a transformer block, where MSA denotes multi-head self-attention, LN denotes layer norm, and MLP denotes multilayer perceptron. (c) The structure of the CSP_N block, where N denotes that the cross stage partial (CSP) block has N residual convolutions. Conv-1 and Conv-3 denote convolution operations with filter sizes of 1 and 3, respectively.

Fig. 3(b) shows the structure of the Chunked Transformer in the long-term branch. At first, the standard multi-head self-attention (MSA) was used to divide the sEMG features into individual non-overlapped chunks and compute self-attention within each chunk, followed by a two-layer channel Multilayer Perceptron (MLP). Then the MSA with shifted windows was used to extract the cross-chunk connection information across every two consecutive chunks. The output of the Shifted MSA went through the channel MLP to give the output of the Chunked Transformer. Note that a Layer Norm (LN) layer and a residual connection were applied after each MSA and channel MLP. In this work, the depth-wise separable convolution was used to replace the linear mapping and position embedding in MSA, because it could improve the computational efficiency of the model [18], [19], [20], [21]. In addition, one-dimensional convolution was used in the MSA to fit the temporal information from sEMG instead of the two-dimensional convolution. All receptive fields of MSA were set to 25 in this work. The LeakyReLU nonlinearity was applied to the output of MLP for gradient optimization during model training [22].

Fig. 3(c) illustrates the structure of the CSP block in the short-term branch. The input of the CSP block was fed into two parallel branches. The input of the former branch was linearly encoded by a convolution with filter size of 1 and then directly linked to the end of the CSP block. The input of the latter branch went through an encoding convolution with filter size of 1, two convolutions with filter size of 3, and then a residual connection [23]. Then, the output went through a convolution layer with filter size of 1, and was concatenated with the output from the former branch. Note that after each convolution, the obtained data were normalized by using batch normalization [24], and followed by the LeakyReLU activation function.

2) SAK Transfer Learning Strategy: Transfer learning strategies have proved effective in transferring prior information from a source domain to a target domain [25], [26], [27]. In this study, the SAK transfer learning strategy was proposed to transfer information from multiple subjects to new users by means of adversarial learning [28], so that the parameters of the multi-subject model could be calibrated at the same time. Fig. 4(a) illustrates the calibration process when using the SAK transfer learning strategy, which comprised a multiple-subject source domain network (Mnet-s), a new-subject target domain network (Nnet-t), and a domain feature discriminator (DFD). The Mnet-s was used to extract features from the source domain (multi-subject training data), the Nnet-t was used to extract features from the target domain (new subject training data), and the DFD was used to minimize the feature distance between the source domain and the target domain.
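
The reconstruct operation at the start of the LSTA module halves the time dimension and doubles the channel dimension without losing information. The paper does not spell out the exact sample layout, so the following is only a minimal NumPy sketch under one assumed layout: even-indexed and odd-indexed time samples are moved into separate channel groups. The function names `reconstruct` and `invert_reconstruct` are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def reconstruct(x: np.ndarray) -> np.ndarray:
    """Map a (c, l) sequence to (2c, l/2) by splitting even/odd time samples.

    Assumed layout (not specified in the paper): the first c output channels
    hold the even-indexed samples, the last c hold the odd-indexed samples.
    """
    c, l = x.shape
    if l % 2 != 0:
        raise ValueError("time dimension must be even")
    return np.concatenate([x[:, 0::2], x[:, 1::2]], axis=0)


def invert_reconstruct(x_tilde: np.ndarray) -> np.ndarray:
    """Undo `reconstruct`, showing that no information was discarded."""
    c2, half = x_tilde.shape
    c = c2 // 2
    x = np.empty((c, 2 * half), dtype=x_tilde.dtype)
    x[:, 0::2] = x_tilde[:c]   # even-indexed samples
    x[:, 1::2] = x_tilde[c:]   # odd-indexed samples
    return x


# Toy input: 3 feature channels, 8 time steps.
x = np.arange(24, dtype=float).reshape(3, 8)
x_tilde = reconstruct(x)
restored = invert_reconstruct(x_tilde)
print(x_tilde.shape)                 # (6, 4)
print(np.array_equal(restored, x))   # True
```

Because the mapping is a pure rearrangement, it can be inverted exactly, which is one way to read the "without information loss" property stated above.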


Fig. 4. (a) The process of the subjects adversarial knowledge (SAK) transfer learning strategy, where the dashed line indicates the direction of parameter optimization during transfer learning. LSTA denotes long short-term feature aggregation module, Conv denotes convolution, mean denotes dimension mean operation, MLP denotes fully connected layer, and DCR denotes densely-connected residual block. (b) The structure of the densely-connected residual (DCR) block.

Before transfer learning, the Mnet-s was trained with data from multiple subjects using a supervised mean-squared loss function. The Nnet-t had the same network architecture as the Mnet-s, and it was initialized by the weights of the trained Mnet-s. During transfer learning, the weights of the Mnet-s were frozen, while the weights of Nnet-t and DFD were trained and optimized towards minimizing the feature distance between the source domain and the target domain. When it was difficult to distinguish the two domain features using the DFD, the minimum distance between these two domains was found, and the knowledge transfer was finished.

Based on the idea of adversarial learning [28], the process of SAK transfer learning was implemented end-to-end with the following formulas (3)–(7).

Given that conventional transfer learning usually focuses on the alignment of the domain marginal distribution, but ignores feature consistency [29], a novel SAK transfer learning loss function $L_{total}$ was proposed to optimize the DFD and Nnet-t in this work:

$$L_{total} = L_D + L_N \tag{3}$$

where $L_D$ denotes the loss function of the DFD. It was used to represent the domain distance between Mnet-s and Nnet-t:

$$L_D = -\log\bigl(DFD(F_s)\bigr) - \log\bigl(1 - DFD(F_t)\bigr) \tag{4}$$

where $F_s$ represents the source domain features, and $F_t$ represents the target domain features.

$L_N$ denotes the loss function of Nnet-t. It was defined as:

$$L_N = L_{map} + L_{subject} \tag{5}$$

where $L_{map}$ indicates the extent of the feature mapping from Mnet-s to Nnet-t:

$$L_{map} = -\log\bigl(DFD(F_t)\bigr) \tag{6}$$

$L_{subject}$ was used to reweight the loss function, towards making the Nnet-t sensitive to the features of the new subject during transfer learning:

$$L_{subject} = \alpha \sum_{n}^{N} \bigl(Nnet(x) - \hat{x}\bigr)^2 \tag{7}$$

where $Nnet(x)$ denotes the estimated finger joint angles, $\hat{x}$ denotes the labels of the finger joint angles, $N$ represents the number of finger joints, and $\alpha$ is a weighting coefficient.

Fig. 4(b) illustrates the structure of the densely-connected residual block (DCR) [30], which constituted the DFD. The DCR block was composed of a convolution layer, a densely-connected layer, a transition layer, and a residual connection. The convolution layers were used to extract features, and the densely-connected layer was a direct connection between convolution layers. The features extracted from all previous layers were concatenated as the input of the subsequent layers, which could improve information propagation within the network so as to speed up the model convergence [30]. Then, a transition layer used 1×1 convolution to compress the number of features and compact the network. The last residual connection gave the network a deeper layer and a more vigorous hierarchical representation. Note that in this study, three representative features from Mnet-s (denoted with red lines with arrows), and three representative features from Nnet-t (denoted with blue lines with arrows) were separately fed into the DCRs, as shown in Fig. 4(a). These three representative features were the output of the last LSTA module, the output of the multi-scale convolution module, and the output of the last convolution.

In this study, all algorithms were built on the PyTorch 1.8.0 framework, and trained on an NVIDIA RTX 3090 GPU with 24 GB memory and an Intel Core i9 CPU at 3.50 GHz. The Adam optimizer was utilized for model training. During the training phase of the subject-specific model and multi-subject model, the batch size was set as 2048, the learning rate was set to 1e-3, and the training epoch was set to 600. While during calibration of the multi-subject model with a new user, the batch size was set


to 64, the calibration rate was set to 1e-3, and the calibration epoch was set to 30.

D. Evaluation Process

1) Evaluating the Effect of LSTA on the LSTA-Conv Network: With the preprocessed sEMG and finger joint kinematics, the LSTA-Conv based multi-subject model could then be built. First off, we investigated the effect of the newly proposed LSTA module on the LSTA-Conv network. For comparison, the Only Long-term Feature (OLT) module and the Only Short-term Feature (OST) module were used to replace the LSTA module in the LSTA-Conv network. As the names suggest, the OLT kept only the long-term branch, the OST kept only the short-term branch, while the other branch was chopped off in comparison with the LSTA module. Despite the structural difference, the parameters in OLT and OST were kept the same as those in LSTA.

In this section, both the training data and testing data were from each of the 35 right-handed intact subjects (from DB2), respectively. In other words, the impact of the LSTA module on the performance of the LSTA-Conv network was examined for subject-specific usage, because training a subject-specific model has much lower computational cost than building a multi-subject model.

2) Evaluating the Effect of Number of Subjects in Training Set on the CSG Model: Whether the number of subjects in the training set affects the performance of the CSG model is also an important issue to be clarified. For this purpose, by increasing the number of subjects to train the multi-subject model, the generalization performance of the corresponding CSG model was tested on new users.

At first, the 35 right-handed intact subjects (from DB2) were ranked according to the sEMG repeatability in terms of standard deviation, and then divided into seven groups with five subjects in each group (Fig. 5). For each group, four subjects were used for model training and the remaining one subject was used to test the performance of the CSG model. In this section, the number of subjects gradually increased as the number of groups increased by one. That is to say, a minimum of 4 subjects, and a maximum of 28 subjects were included as the training data with an increment of four, while the number of subjects for testing started from one and reached seven at maximum.

Fig. 5. The choice of training data to build the multi-subject model. 35 right-handed intact subjects were ranked according to the sEMG repeatability in terms of standard deviation (from left to right), and divided into 7 groups.

For each of the seven sets of training data and testing data, the SAK transfer learning strategy was adopted to map the training data from multiple subjects to the testing subject (new user) and to calibrate the model meanwhile. During model calibration for each new user, five of the six motion repetitions were randomly selected, and the remaining one repetition was used for evaluation. That is, six-fold cross validation was adopted for each new user during model calibration. For comparison, the subject-specific model based on the LSTA-Conv network, where the number of training subjects was one, was also included.

3) Evaluating the Effect of SAK Transfer Learning Strategy on the CSG Model: Next, we mainly investigated the effect of the SAK transfer learning strategy on the generalization performance of the CSG model. For comparison, another two previously reported transfer learning strategies were included.
1) Transfer learning strategy proposed by Qin et al. [31] (abbreviated as QTL): The QTL only updated the parameters of the MLP layer in the multi-subject model, while other parameters were retained. In this way, the calibrated model retained previous information from the multiple subjects, and became adaptive to the new subject.
2) Transfer learning strategy proposed by Xu et al. [32] (abbreviated as XTL): To avoid overfitting caused by limited data from the new subject, the XTL constrained the effective search space during multi-subject model calibration by adding an L2-regularisation penalty function.

With the multi-subject model trained by using 28 subjects’ data from DB2, each of the three transfer learning strategies was separately applied to calibrate the multi-subject model and then evaluate the continuous estimation performance of finger joint angles for new users, which refers to another seven subjects in the DB2 dataset.

4) Overall Validation of the CSG Model Based on Three Datasets: For overall validation of the newly proposed CSG model on new users, two more datasets (DB1 and DB7) in addition to the commonly used DB2, were adopted in this study. With each dataset, the three aforementioned transfer learning based methods (abbreviated as LSTA-Conv+QTL, LSTA-Conv+XTL, and LSTA-Conv+SAK, respectively), as well as five subject-specific models which served as the baseline, were all included for comprehensive comparison.

Herein, the five subject-specific models were the recurrent neural network (RNN) model, the long short-term memory (LSTM) model, the sparse pseudo-input Gaussian processes (SPGP) model, and the CNN-Attention model proposed in our previous work [33]. The LSTM has been proved suitable for regression-based problems, especially with input data containing long-term time information. The RNN and LSTM algorithms used in this study were consistent with those in our previous work [33]. Five stacked LSTM layers, each with four interactive layers in each cell, were used to increase the depth of the model for improving the fitting ability of the model. The hidden dimension of each layer was set to 256, and the dropout rate was set as 0.1 to prevent overfitting. The last layer outputs 10 joint angles through linear mapping. Meanwhile, the parameters of the RNN are consistent with those of the LSTM. The SPGP is the improvement of Gaussian process regression (GPR), which has been widely used for non-linear non-parametric regression

Authorized licensed use limited to: Indian Institute of Technology Patna. Downloaded on June 17,2023 at 13:21:30 UTC from IEEE Xplore. Restrictions apply.

in high-dimensional space for its low computational complexity and time [34]. In this work, we used the SPGP proposed by Edward Snelson et al. with the same parameters as in [35]. The CNN-Attention model mainly consists of a multi-scale convolution and three multi-headed self-attention modules; the model parameters are consistent with those in our previous work [33].
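To make the stated LSTM baseline hyperparameters concrete (five stacked layers, hidden dimension 256, dropout 0.1, linear output to 10 joint angles), the following minimal sketch records them in a small configuration object and counts the resulting trainable parameters. It is an illustration only, not the authors' implementation: the class and function names are hypothetical, the input size of 12 (one RMS feature per sEMG channel) is an assumption not stated in this section, and the per-layer count assumes the common four-gate LSTM parameterisation with two bias vectors per layer, as used for example by `torch.nn.LSTM`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LstmBaselineConfig:
    """Hyperparameters quoted in the text; input_size is an assumption."""
    input_size: int = 12      # ASSUMPTION: one RMS feature per sEMG channel
    hidden_size: int = 256    # hidden dimension of each layer (from the text)
    num_layers: int = 5       # five stacked LSTM layers (from the text)
    dropout: float = 0.1      # dropout rate (from the text); adds no parameters
    output_size: int = 10     # 10 finger joint angles (from the text)


def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Parameters of one LSTM layer: four gates, each with input weights,
    recurrent weights, and two bias vectors (assumed convention)."""
    weights = 4 * hidden_size * (input_size + hidden_size)
    biases = 2 * 4 * hidden_size
    return weights + biases


def count_parameters(cfg: LstmBaselineConfig) -> int:
    """Total trainable parameters of the stacked LSTM plus the linear head."""
    total = 0
    layer_input = cfg.input_size
    for _ in range(cfg.num_layers):
        total += lstm_layer_params(layer_input, cfg.hidden_size)
        # Deeper layers read the hidden state of the previous layer.
        layer_input = cfg.hidden_size
    # Final linear mapping from the last hidden state to the joint angles.
    total += cfg.hidden_size * cfg.output_size + cfg.output_size
    return total


if __name__ == "__main__":
    cfg = LstmBaselineConfig()
    print("trainable parameters:", count_parameters(cfg))
```

Under these assumptions most of the parameters sit in the four deeper layers, so the exact number of input channels has only a minor effect on the overall model size.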
In this section, we first investigated the performance of the five subject-specific models. For each model, 7, 5, and 5 repetitions of each movement (corresponding to DB1, DB2, and DB7, respectively) were randomly selected for each subject as the training data, and the remaining repetitions were used as the testing data. Then, the
performance of multi-subject models was investigated; that is to say, the training data were from multiple subjects, while the testing data were from individuals in the training set and from new subjects, respectively. Lastly, three transfer learning based CSG models were investigated. For each CSG model, 20, 28, and 12 subjects from DB1, DB2, and DB7, respectively, were used to train the multi-subject model, and each of the remaining subjects was used as the new user.
For consistency with previous work, the root mean square (RMS) was used as the sEMG feature for the RNN, LSTM, SPGP, and CNN-Attention models [36], [37], whereas for the LSTA-Conv based approaches the sEMG signals were used directly as the input.

E. Performance Metrics

For each of the three aforementioned deep learning models, three performance metrics, namely the Pearson correlation coefficient (CC), the root mean square error (RMSE), and the coefficient of determination (R2), were used.

1) Pearson correlation coefficient (CC): CC was used to measure the linear correlation between the finger joint angle estimates and the corresponding actual data [38]. It is calculated as:

$$CC = \frac{\sum_{i=1}^{N} (p_i - \bar{p})(g_i - \bar{g})}{\sqrt{\sum_{i=1}^{N} (p_i - \bar{p})^2}\,\sqrt{\sum_{i=1}^{N} (g_i - \bar{g})^2}} \tag{8}$$

The closer the CC value is to 1, the closer the predicted finger motion trajectory is to the actual trajectory, and the higher the estimation accuracy of the method.

2) Root mean square error (RMSE) (°): RMSE was used to evaluate the deviation between the estimated and measured finger joint angles in degrees (°) [39]. RMSE is calculated as:

$$RMSE = \sqrt{\sum_{i=1}^{N} \frac{(p_i - g_i)^2}{N}} \tag{9}$$

3) Coefficient of determination (R2): As a comprehensive evaluation metric of the model's overall accuracy [40], it is defined as the percentage of variability in the true values explained by the estimated values. The larger the R2 value is, the better the estimation performance would be. It is formulated as:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (g_i - p_i)^2}{\sum_{i=1}^{N} (g_i - \bar{g})^2} \tag{10}$$

In the above formulas, $N$ denotes the sample size, $p_i$ denotes the sample point of the predicted finger joint angle, and $g_i$ denotes the sample point of the actual finger joint angle; $\bar{p}$ and $\bar{g}$ denote the corresponding means.

F. Statistical Analysis

The Friedman test, a non-parametric alternative to one-way ANOVA with repeated measures, was used to examine the effect of the LSTA module on the LSTA-Conv network, the effect of the SAK transfer learning strategy on the CSG model, and the overall validation of the CSG model. The CC, RMSE, and R2 were separately used as the dependent variables. Once the Friedman test showed a statistically significant effect, post-hoc comparisons were performed using the Wilcoxon signed-rank test, with the p-value adjusted by the Bonferroni correction for multiple comparisons. In this work, p-value < 0.05 was taken as the threshold of statistical significance.

Fig. 6. Effect comparisons of LSTA, OLT and OST on the subject-specific LSTA-Conv network. (a) Averaged CC, (b) Averaged RMSE, and (c) Averaged R2 across 35 subjects (from DB2) were used as the performance metrics.

III. RESULTS

A. The Effect of LSTA on the LSTA-Conv Network

Fig. 6 illustrates the continuous estimation performance of finger joint angles when using the LSTA-Conv network for each specific subject, wherein the newly proposed LSTA module was compared with the OLT module and the OST module. The results show that the LSTA module outperformed the other two modules, with averaged CC of 0.825 ± 0.015, averaged RMSE of 7.926 ± 0.325, and averaged R2 of 0.727 ± 0.041. These values were 3.9% higher, 1.072 lower, and 6.3% higher when compared with OLT, and 12.8% higher, 1.659 lower, and 6.6% higher when compared with OST, in terms of CC, RMSE, and R2, respectively. The statistical analysis indicated that the proposed LSTA module outperformed the OLT in terms of RMSE (p-value < 0.05), and outperformed the OST in terms of CC (p-value < 0.01) and RMSE (p-value < 0.01). These findings

LONG et al.: TRANSFER LEARNING BASED CROSS-SUBJECT GENERIC MODEL 1921

suggest that, in comparison with the OLT and OST, the LSTA module could extract more representative sEMG features.

Fig. 7. Performance of the CSG model when using an increasing number of subjects (from DB2) for training. The number of subjects ranges from 1 to 28 with an increment of 4, where 1 corresponds to the LSTA-Conv based subject-specific model without multi-subject training. (a) Averaged CC, (b) Averaged RMSE, and (c) Averaged R2 were used as the performance metrics.

B. The Effect of Number of Subjects in Training Set on the CSG Model

The effect of the number of subjects in the training set on the CSG model was evaluated with DB2. The results show that with the increase of the number of subjects (from 1 to 28), both the averaged CC and averaged R2 increased, while the averaged RMSE decreased (Fig. 7). The linear regression indicated that the model performance with increasing number of subjects could be approximated by a straight line. The coefficient of determination of the linear regression, r2, corresponding to the CC, RMSE, and R2 was 0.844, 0.877, and 0.536, respectively. Moreover, the fitted line for RMSE declines fastest, while the fitted lines for CC and R2 are flatter but still increase linearly. The outcomes imply that including more subjects to build the multi-subject model would improve the generalization capability of the CSG model and the accuracy of continuous estimation of finger kinematics.

Fig. 8. Comparison of transfer learning strategies. With the same multi-subject model already built, each of the three transfer learning strategies, abbreviated as QTL, XTL, and SAK, was applied to calibrate the multi-subject model, which was then tested on seven new subjects (corresponding results are denoted as seven dots with different colors). (a) Averaged CC, (b) Averaged RMSE, and (c) Averaged R2 were used as the performance metrics.

C. The Effect of SAK Transfer Learning Strategy on the CSG Model

The effect of the newly proposed SAK transfer learning strategy on the CSG model was evaluated on DB2, and compared with two previously reported transfer learning strategies. After a quick calibration, the continuous estimation performance of finger kinematics for seven new users was tested. As shown in Fig. 8, the SAK transfer learning strategy achieved the best performance, with averaged CC, averaged RMSE, and averaged R2 of 0.859 ± 0.012, 7.344 ± 0.242, and 0.776 ± 0.033, respectively. Compared with QTL, the SAK transfer learning strategy had 8.4% higher CC, 1.610 lower RMSE, and 13.3% higher R2. Also, the SAK strategy outperformed XTL with 2.9% higher CC, 0.630 lower RMSE, and 5.1% higher R2. The statistical analysis shows that the SAK strategy was significantly better than the QTL in terms of CC (p-value < 0.001) and RMSE (p-value < 0.001).

D. Overall Validation of the CSG Model

To further validate the performance of the newly proposed CSG model on new users, two more datasets, DB1 and DB7, were included. Each of the three transfer learning based CSG approaches and each of the five subject-specific methods were applied to each dataset. All results are summarized in Table II, where the bold data denote the best performance under each condition.
It can be seen that under the subject-specific condition, the newly proposed LSTA-Conv network outperformed the other four methods (RNN, LSTM, SPGP, and CNN-Attention) regardless of the dataset. Once applied as a multi-subject model, the estimation performance degraded significantly (p-value < 0.05) for all five methods, especially when the testing data were from new users independent of the training set. When applied for new users, however, the adoption of SAK transfer learning


TABLE II
SUMMARY OF THE RESULTS WHEN APPLYING EACH ALGORITHM TO THREE DIFFERENT DATASETS, RESPECTIVELY


strategy significantly improved the generalization ability of the LSTA-Conv based multi-subject model. It also outperformed the other two transfer learning strategies, the QTL and the XTL. Statistical analysis showed that compared with the five algorithms (RNN, LSTM, SPGP, CNN-Attention, and LSTA-Conv) which served as subject-specific models, the newly proposed CSG model (LSTA-Conv+SAK) achieved significantly better performance in terms of CC (p-value < 0.05) and RMSE (p-value < 0.05) when using DB2. Similar results were found when using DB1 and DB7. These outcomes suggest that the novel CSG model could provide sound generalization performance for new users, and yield higher accuracy for continuous estimation of finger kinematics.
Among all three datasets, DB1 brought the worst estimation performance, while DB7 brought the best estimation accuracy, which suggests that the estimation performance of finger kinematics was also dependent on the dataset quality. Note that although DB7 brought the best estimation accuracy, the improvement of estimation performance with DB2 was consistently higher when comparing the newly proposed CSG model with the other two transfer learning based CSG models and the five subject-specific models.

IV. DISCUSSION

Aiming at establishing a CSG model to estimate the finger kinematics for new users, a novel transfer learning based model was proposed in this study. Firstly, a novel LSTA module was used to extract richer temporal features of sEMG when building the multi-subject model. Then, the SAK transfer learning strategy was used to update the parameters of the multi-subject model, which was afterwards used to estimate the finger kinematics of new users. The overall performance of the newly proposed CSG model was validated on three public datasets. Compared with five subject-specific models, as well as two transfer learning based approaches, our CSG model brought not only higher accuracy for subject-specific finger kinematics estimation, but also better generalization performance for new users. We also investigated the effect of LSTA on the LSTA-Conv network, the effect of the number of subjects in the training set on the CSG model, and the effect of the SAK transfer learning strategy on the CSG model. This work made an important step towards promoting deep learning approaches in robotic hand control and other HMI scenarios.
To examine whether the newly proposed LSTA module could extract richer sEMG features, the finger kinematics estimation performance when using LSTA was compared with that when using OLT and OST. Our results in Fig. 6 showed that the LSTA achieved the best performance. Moreover, the OLT outperformed the OST, which suggests that the long-term features contribute more than the short-term features to accurate estimation of finger joint angles. The possible reason is that a larger temporal scale could provide more representative sEMG features, while little useful information could be obtained from short temporal scales, especially for sEMG signals, which are easily contaminated by multiple factors during acquisition [41], [42]. This finding is also consistent with previous conclusions that filters with larger receptive fields on temporal scales could provide more informative features to the model [43], [44]. In this work, the newly proposed LSTA module extracted sEMG features by using both a long temporal scale and a short temporal scale. This might explain why our LSTA module achieved better performance than the OLT and OST.
How to utilize the training data from multiple subjects to improve the generalization capability of the CSG model is a crucial question. Therefore, how the increasing number of subjects affects the generalization ability of the CSG model was studied. Results shown in Fig. 7 illustrated that the accuracy of the CSG model increased linearly with the number of subjects used to build the multi-subject model. This finding confirmed the previous conclusion that the diversity of subjects’ data plays an important role in improving the generalization performance of the CSG model [6]. However, due to the limited number of subjects in DB2 (N = 35), we only observed a steady improvement of CC, RMSE, and R2 in Fig. 7 (CC and R2 increasing, RMSE decreasing), but no turning point at which the performance reaches a plateau. Therefore, the optimal number of subjects to reach maximum generalization needs to be further examined.
We further investigated the contribution of the SAK transfer learning strategy to the CSG model by means of comparison. Results in Fig. 8 showed that in terms of CC and RMSE, the SAK based transfer learning strategy improved the generalization ability of the CSG model significantly more than the QTL and XTL. The QTL only updated the parameters of the MLP layer of the multi-subject model [14], while SAK updated all parameters of the multi-subject model as a whole. Better parameter optimization might account for the superiority of the SAK over QTL. On the other hand, an L2-regularisation penalty function was adopted to map the training data from multiple subjects to a new subject when using the XTL, whereas the SAK was designed to work in an adversarial fashion by following the idea of adversarial learning. It could overcome the problem of domain shifting in comparison to the XTL. More importantly, it could exploit the information from the source domain to deal with the problem of insufficient training data in the target domain [45]. These reasons might explain the better performance of the SAK transfer learning strategy in comparison to XTL.
Finally, we conducted overall validation of the CSG model based on three datasets, DB1, DB2, and DB7. Unsurprisingly, once applied as the multi-subject model, the performance of each algorithm (RNN, LSTM, SPGP, CNN-Attention, or LSTA-Conv) degraded significantly in comparison to the subject-specific model. In particular, the performance degraded dramatically when tested with new subjects independent of the training set. However, the adoption of the SAK transfer learning strategy improved the generalization capability of the LSTA-Conv network significantly regardless of the dataset, and even outperformed all five subject-specific models. Among all three datasets, DB1 brought the worst performance in comparison to DB2 and DB7. The possible reason is that the sampling rate in DB1 was as low as 100 Hz, whereas DB2 and DB7 both have a sampling rate of 2000 Hz. The low sampling rate resulted in insufficient ability to represent the complicated neuromuscular activities, and then caused lower


estimation performance of finger kinematics. As for the higher generalization performance achieved by the LSTA-Conv+SAK model in comparison to the LSTA-Conv network when using DB2, the possible reason is that with DB2, more subjects were used to build the multi-subject model, which could bring better generalization performance.
There were also some limitations in this work. First, the newly proposed CSG model is essentially a supervised algorithm, i.e., calibrating the built multi-subject model requires data from the new user during transfer learning. Unsupervised transfer learning without data from the new user is a promising research direction for future work. Secondly, during window segmentation, we used a sliding window with a length of 1000 ms and an increment of 50 ms for DB2 and DB7, whereas for DB1 the sliding window size was set to 1600 ms and the increment to 100 ms because of its lower sampling frequency (100 Hz). The selection of these window parameters was based on the tradeoff between acceptable accuracy and controller delay. In our ongoing work, the optimal window configurations, which would affect the online performance, need to be intensively investigated. Lastly, all algorithms were only implemented offline in this work. The newly proposed CSG model took less than 2 ms to calculate each finger joint angle sample point. Evaluating the real-time performance of this algorithm in practical scenarios and accelerating its computation will be conducted in our future work.

REFERENCES

[1] T. Matsubara and J. Morimoto, “Bilinear modeling of EMG signals to extract user-independent features for multiuser myoelectric interface,” IEEE Trans. Biomed. Eng., vol. 60, no. 8, pp. 2205–2213, Aug. 2013, doi: 10.1109/tbme.2013.2250502.
[2] R. N. Khushaba, “Correlation analysis of electromyogram signals for multiuser myoelectric interfaces,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 4, pp. 745–755, Jul. 2014, doi: 10.1109/tnsre.2014.2304470.
[3] B. Xue et al., “Multiuser gesture recognition using sEMG signals via canonical correlation analysis and optimal transport,” Comput. Biol. Med., vol. 130, Mar. 2021, Art. no. 104188, doi: 10.1016/j.compbiomed.2020.104188.
[4] X. Sheng, B. Lv, W. Guo, and X. Zhu, “Common spatial-spectral analysis of EMG signals for multiday and multiuser myoelectric interface,” Biomed. Signal Process. Control, vol. 53, Aug. 2019, Art. no. 101572, doi: 10.1016/j.bspc.2019.101572.
[5] L. Pan, D. Crouch, and H. Huang, “Myoelectric control based on a generic musculoskeletal model: Towards a multi-user neural-machine interface,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 7, pp. 1435–1442, Jul. 2018, doi: 10.1109/TNSRE.2018.2838448.
[6] X. Jiang, B. Bardizbanian, C. Dai, W. Chen, and E. A. Clancy, “Data management for transfer learning approaches to elbow EMG-torque modeling,” IEEE Trans. Biomed. Eng., vol. 68, no. 8, pp. 2592–2601, Aug. 2021, doi: 10.1109/TBME.2021.3069961.
[7] C. Lin, X. Chen, W. Guo, N. Jiang, D. Farina, and J. Su, “A BERT based method for continuous estimation of cross-subject hand kinematics from surface electromyographic signals,” IEEE Trans. Neural Syst. Rehabil. Eng., early access, Oct. 21, 2022, doi: 10.1109/TNSRE.2022.3216528.
[8] M. Atzori et al., “Characterization of a benchmark database for myoelectric movement classification,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 23, no. 1, pp. 73–83, Jan. 2015, doi: 10.1109/TNSRE.2014.2328495.
[9] M. Atzori et al., “Electromyography data for non-invasive naturally-controlled robotic hand prostheses,” Sci. Data, vol. 1, no. 1, pp. 1–13, 2014.
[10] A. Krasoulis, I. Kyranou, M. S. Erden, K. Nazarpour, and S. Vijayakumar, “Improved prosthetic hand control with concurrent use of myoelectric and inertial measurements,” J. Neuroeng. Rehabil., vol. 14, no. 1, pp. 1–14, 2017.
[11] B. Qian et al., “Dynamic multi-scale convolutional neural network for time series classification,” IEEE Access, vol. 8, pp. 109732–109746, 2020, doi: 10.1109/ACCESS.2020.3002095.
[12] F. Yu and V. Koltun, “Multi-scale context aggregation by dilated convolutions,” in Proc. 4th Int. Conf. Learn. Representations, (Conference Track Proceedings), San Juan, Puerto Rico, May 2-4, 2016. [Online]. Available: http://arxiv.org/abs/1511.07122
[13] E. Rahimian, S. Zabihi, S. F. Atashzar, A. Asif, and A. Mohammadi, “XceptionTime: Independent time-window xceptiontime architecture for hand gesture classification,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2020, pp. 1304–1308.
[14] E. Rahimian, S. Zabihi, A. Asif, D. Farina, S. F. Atashzar, and A. Mohammadi, “Hand gesture recognition using temporal convolutions and attention mechanism,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2022, pp. 1196–1200, doi: 10.1109/ICASSP43922.2022.9746174.
[15] J. Wang, T. Xiao, Q. Gu, and Q. Chen, “YOLOv5_CSL_F: YOLOv5’s loss improvement and attention mechanism application for remote sensing image object detection,” in Proc. Int. Conf. Wireless Commun. Smart Grid, 2021, pp. 197–203, doi: 10.1109/ICWCSG53609.2021.00045.
[16] Z. Liu et al., “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 9992–10002, doi: 10.1109/ICCV48922.2021.00986.
[17] C.-Y. Wang, H.-Y. Mark Liao, Y.-H. Wu, P.-Y. Chen, J.-W. Hsieh, and I.-H. Yeh, “CSPNet: A new backbone that can enhance learning capability of CNN,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops, 2020, pp. 1571–1580, doi: 10.1109/CVPRW50498.2020.00203.
[18] H. Wu et al., “CvT: Introducing convolutions to vision transformers,” in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 22–31, doi: 10.1109/ICCV48922.2021.00009.
[19] K. Yuan, S. Guo, Z. Liu, A. Zhou, F. Yu, and W. Wu, “Incorporating convolution designs into visual transformers,” in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 559–568, doi: 10.1109/ICCV48922.2021.00062.
[20] X. Chu et al., “Conditional positional encodings for vision transformers,” vol. abs/2102.10882, 2021. [Online]. Available: https://arxiv.org/abs/2102.10882
[21] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 1800–1807, doi: 10.1109/CVPR.2017.195.
[22] H.-C. Shin et al., “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning,” IEEE Trans. Med. Imag., vol. 35, no. 5, pp. 1285–1298, May 2016, doi: 10.1109/TMI.2016.2528162.
[23] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770–778.
[24] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. Int. Conf. Mach. Learn., 2015, pp. 448–456.
[25] H.-C. Shin et al., “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning,” IEEE Trans. Med. Imag., vol. 35, no. 5, pp. 1285–1298, May 2016, doi: 10.1109/TMI.2016.2528162.
[26] H. Shan et al., “3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1522–1534, Jun. 2018, doi: 10.1109/TMI.2018.2832217.
[27] A. van Opbroek, M. A. Ikram, M. W. Vernooij, and M. de Bruijne, “Transfer learning improves supervised image segmentation across imaging protocols,” IEEE Trans. Med. Imag., vol. 34, no. 5, pp. 1018–1030, May 2015, doi: 10.1109/TMI.2014.2366792.
[28] I. Goodfellow et al., “Generative adversarial networks,” Commun. ACM, vol. 63, no. 11, pp. 139–144, 2020.
[29] Y. Luo, L. Zheng, T. Guan, J. Yu, and Y. Yang, “Taking a closer look at domain shift: Category-level adversaries for semantics consistent domain adaptation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2019, pp. 2502–2511, doi: 10.1109/CVPR.2019.00261.
[30] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 2261–2269, doi: 10.1109/CVPR.2017.243.
[31] Z. Qin, S. Stapornchaisit, Z. He, N. Yoshimura, and Y. Koike, “Multi–joint angles estimation of forearm motion using a regression model,” Front. Neurorobot., vol. 103, 2021, Art. no. 685961.
[32] L. Xuhong, Y. Grandvalet, and F. Davoine, “Explicit inductive bias for transfer learning with convolutional networks,” in Proc. Int. Conf. Mach. Learn., 2018, pp. 2825–2834.


[33] Y. Geng et al., “A CNN-attention network for continuous estimation of finger kinematics from surface electromyography,” IEEE Robot. Automat. Lett., vol. 7, no. 3, pp. 6297–6304, Jul. 2022, doi: 10.1109/LRA.2022.3169448.
[34] M. Xiloyannis, C. Gavriel, A. A. C. Thomik, and A. A. Faisal, “Gaussian process autoregression for simultaneous proportional multi-modal prosthetic control with natural hand kinematics,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 10, pp. 1785–1801, Oct. 2017.
[35] E. Snelson and Z. Ghahramani, “Sparse Gaussian processes using pseudo-inputs,” in Proc. Adv. Neural Inf. Process. Syst., 2006, vol. 18, pp. 1257–1264.
[36] K. Englehart and B. Hudgins, “A robust, real-time control scheme for multifunction myoelectric control,” IEEE Trans. Biomed. Eng., vol. 50, no. 7, pp. 848–854, Jul. 2003.
[37] M. G. Asogbon et al., “Appropriate feature set and window parameters selection for efficient motion intent characterization towards intelligently smart EMG-PR system,” Symmetry, vol. 12, no. 10, 2020, Art. no. 1710.
[38] W. Guo et al., “Long exposure convolutional memory network for accurate estimation of finger kinematics from surface electromyographic signals,” J. Neural Eng., vol. 18, no. 2, 2021, Art. no. 026027.
[39] H. Mao, P. Fang, and G. Li, “Simultaneous estimation of multi-finger forces by surface electromyography and accelerometry signals,” Biomed. Signal Process. Control, vol. 70, 2021, Art. no. 103005.
[40] S. Holm, “A simple sequentially rejective multiple test procedure,” Scand. J. Statist., vol. 6, pp. 65–70, 1979.
[41] M. Z. Jamal, “Signal acquisition using surface EMG and circuit design considerations for robotic prosthesis,” Comput. Intell. Electromyogr. Anal.-A Perspective Curr. Appl. Future Challenges, vol. 18, pp. 427–448, 2012.
[42] J. Li and Q. Wang, “Multi-modal bioelectrical signal fusion analysis based on different acquisition devices and scene settings: Overview, challenges, and novel orientation,” Inf. Fusion, vol. 79, pp. 229–247, 2022.
[43] L. Wu, X. Zhang, K. Wang, X. Chen, and X. Chen, “Improved high-density myoelectric pattern recognition control against electrode shift using data augmentation and dilated convolutional neural network,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 28, no. 12, pp. 2637–2646, Dec. 2020, doi: 10.1109/TNSRE.2020.3030931.
[44] A. Vaswani et al., “Attention is all you need,” in Proc. Adv. Neural Inf. Process. Syst., 2017, vol. 30, pp. 6000–6010.
[45] W.-C. Hung, Y.-H. Tsai, Y.-T. Liou, Y.-Y. Lin, and M.-H. Yang, “Adversarial learning for semi-supervised semantic segmentation,” in Proc. Brit. Mach. Vis. Conf., Newcastle, U.K., Sep. 3-6, 2018, p. 65. [Online]. Available: http://bmvc2018.org/contents/papers/0200.pdf
