A Review of Recurrent Neural Network-Based Methods in Computational Physiology
Abstract— Artificial intelligence and machine learning techniques have progressed dramatically and become powerful tools required to solve complicated tasks, such as computer vision, speech recognition, and natural language processing. Since these techniques have provided promising and evident results in these fields, they emerged as valuable methods for applications in human physiology and healthcare. General physiological recordings are time-related expressions of bodily processes associated with health or morbidity. Sequence classification, anomaly detection, decision making, and future status prediction drive the learning algorithms to focus on the temporal pattern and model the nonstationary dynamics of the human body. These practical requirements give birth to the use of recurrent neural networks (RNNs), which offer a tractable solution in dealing with physiological time series and provide a way to understand complex time variations and dependencies. The primary objective of this article is to provide an overview of current applications of RNNs in the area of human physiology for automated prediction and diagnosis within different fields. Finally, we highlight some pathways of future RNN developments for human physiology.
Index Terms— Deep learning, human physiology, recurrent neural network (RNN), signal processing.

I. INTRODUCTION
In another scenario, physiological recordings refer to sequential data, rather than images. Such datasets commonly have the following characteristics: 1) they refer to collective electrical/mechanical signals representing physical variables of interest, such as electrical activity produced by the brain or skeletal muscles; 2) these data reflect the status variation of a subject/subjects in a given period of time; and 3) they are naturally in the format of time-related recordings (e.g., time series), and latent causality governs two (or more) successive occurrences. In practice, detecting an event in real time or the future is critical, and the results might be sensitive to the temporal dynamics determined by physiological conditions. Our literature survey found that most sensors used for signal acquisition were noninvasive. For example, electrocardiography (ECG) or electroencephalogram (EEG) signals were collected from electrodes attached to the skin. The data collection procedures are patient-friendly and ubiquitous for practical healthcare systems. However, interpreting these signals is not an easy task. The underlying complexity within the signals and actual physiological mechanisms are generally not visible or easy to understand. Therefore, it is challenging to predict outcomes solely based on a human expert’s
Authorized licensed use limited to: Johns Hopkins University. Downloaded on November 01,2023 at 17:56:18 UTC from IEEE Xplore. Restrictions apply.
6984 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023
MAO AND SEJDIĆ: REVIEW OF RNN-BASED METHODS IN COMPUTATIONAL PHYSIOLOGY 6985
Fig. 2. LSTM RNN. The computational graph is shown in (a). The LSTM
has an extra pathway for the cell state. A recurrent unit of LSTM is shown
in (b). The arrows in blue represent the internal cell state.
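As a minimal sketch of the recurrent unit described in Fig. 2(b), the snippet below implements one time step of an LSTM in NumPy. It assumes the standard LSTM formulation (input, forget, and output gates plus a candidate update), since the article's own equations are not shown here; the function name `lstm_step`, the stacked weight layout, and the gate order are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (standard formulation, assumed here).

    W: (4d, n_in), U: (4d, d), b: (4d,), with gates stacked in the
    order input (i), forget (f), output (o), candidate (g).
    """
    d = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:d])            # input gate
    f = sigmoid(z[d:2 * d])        # forget gate
    o = sigmoid(z[2 * d:3 * d])    # output gate
    g = np.tanh(z[3 * d:4 * d])    # candidate cell update
    c = f * c_prev + i * g         # cell state: the extra pathway in Fig. 2
    h = o * np.tanh(c)             # hidden state passed to the next step
    return h, c

# Run the unit over a short random sequence (synthetic data).
rng = np.random.default_rng(0)
n_in, d, T = 3, 5, 8
W = rng.standard_normal((4 * d, n_in)) * 0.1
U = rng.standard_normal((4 * d, d)) * 0.1
b = np.zeros(4 * d)
h, c = np.zeros(d), np.zeros(d)
for x in rng.standard_normal((T, n_in)):
    h, c = lstm_step(x, h, c, W, U, b)
```

The cell state `c` is updated additively, which is the separate pathway the caption refers to; the hidden state `h` is a gated, squashed view of it.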
Fig. 5. Implementations of RNN models are determined by the label structure of each signal sample. (a) Signal sequence with a sequential label, for which the RNN can be designed as in (c). (b) Signal sequence with only one annotated label, for which the RNN can be designed as in (d). Although (c) and (d) show a one-layer unidirectional RNN, multiple stacked layers or bidirectional RNNs can also be adopted.
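The single-label design in Fig. 5(d) needs a fusion step that turns a sequence of hidden states into one vector. The following NumPy sketch illustrates four options discussed in Section III-A (last hidden state, attention-style weighted sum, flattening, and averaging). The hidden-state array `H`, the scoring vector `w`, and all function names are hypothetical; in a trained model `w` would be learned and the fused vector would be passed to dense layers.

```python
import numpy as np

def fuse_last(H):
    """Use only the hidden state at the last time step."""
    return H[-1]

def fuse_attention(H, w):
    """Weighted sum of all hidden states; weights are a softmax over H @ w."""
    scores = H @ w                       # one score per time step, shape (T,)
    scores = scores - scores.max()       # stabilize the exponentials
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ H                     # shape (d,)

def fuse_flatten(H):
    """Concatenate all hidden states into one 1-D vector of length T * d."""
    return H.reshape(-1)

def fuse_mean(H):
    """Average the hidden states over time."""
    return H.mean(axis=0)

# Synthetic hidden states: T time steps, d hidden units.
rng = np.random.default_rng(0)
T, d = 6, 4
H = rng.standard_normal((T, d))
w = rng.standard_normal(d)

fused = {
    "last": fuse_last(H),
    "attention": fuse_attention(H, w),
    "flatten": fuse_flatten(H),
    "mean": fuse_mean(H),
}
```

In this sketch only the flattened vector has a length that depends on `T`, so sequences of different lengths would need padding or truncation before that fusion.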
physiological signals may give us more evidence for decision-making at the current time point [7], [11]–[13].

III. EXPERIMENT DESIGN WITH RNN
In physiological applications, supervised learning is commonly used because the expected output (label) is usually available. Besides, some unsupervised learning techniques, such as autoencoder and clustering, may also help the physiological studies, which will be introduced in Section IV with examples. The experimental design depends on practical requirements and special considerations, which are sometimes beyond the scope of computer science or engineering. We discuss the experiment design at two levels.

A. Model Implementation
For model implementation, one should first analyze the data structure on hand. The physiological signals, which are always used as input data, are time sequences. As shown in Fig. 5(a) and (b), the annotated label structure leads to two scenarios. In scenario I, a sequence sample is annotated with a sequential label, framed as a “many-to-many” problem, such as the study of sleep stage classification [14]. The model output is also a sequence with the same length as the label sequence. In scenario II, a sequence sample is annotated with a single label. For example, the ECG signal in a segment has only one label [15], [16]. This scenario is also described as a “many-to-one” problem since the considered output is a scalar value for each sequential input.
1) Model Output Construction: The RNN model structures differ slightly at the final output part in these two scenarios. For scenario I, one can use the structure shown in Figs. 1, 2(a), and 4, since the output in these figures is also sequential. For scenario II, there are several ways to construct a fusion function (layer) to obtain the output, as shown in Fig. 5(d).
1) The easiest way to construct the output is to use the hidden state at the last step with one or more dense layers. For example, to diagnose arrhythmia based on the ECG signal, Oh et al. [17] fed the LSTM’s hidden state at the last time step to three dense layers for the final output. Chang et al. [18] and Hofmann et al. [19] also applied a similar configuration. For bidirectional RNNs, one can concatenate the last hidden states of both the forward and backward paths into a single vector, and then feed it into dense layers for the final output, as introduced in the studies of Lynn et al. [20] and Supratak et al. [21].
2) Another way of designing a fusion function is to employ an attention layer and then a dense layer as the final output [11], [14], [22], [23]. The attention layer is calculated as a weighted sum of all the hidden state vectors from the RNN. A study from Shashikumar et al. [11] suggested that the attention mechanism could improve the accuracy of classifying paroxysmal atrial fibrillation.
3) The third way of constructing the output also uses the information of all the hidden states. They could be flattened first and concatenated into a 1-D vector, then fed to dense layer(s) for the final output. For example, the studies reported by Yildirim et al. [24], [25], Liu et al. [26], and Tsiouris et al. [27] applied this approach for arrhythmia classification; Xing et al. [28] also designed one dense layer with all hidden states as input for emotion recognition from EEG.
There could be other designs for the fusion function, such as sparse projection on hidden states [29] or averaging all the hidden states [30]. Methods 2) and 3) may not be compatible with variable-length samples since the concatenated vector should have a uniform length. Padding zeros might be a solution. However, more studies are needed to discuss the padding effects.
2) Input Construction: The input constructions of the two scenarios shown in Fig. 5(a) and (b) are also slightly different. In scenario I, the signals are manually divided into consecutive chunks (or slices, epochs, and segments) with clinical or other practical purposes, as shown in Fig. 5(a). To construct an
Fig. 7. Representative applications of RNN in the human body for diagnosis and event detection.
1) Mixed Manner: Each subject provided several samples, and the whole sample pool was collected from multiple subjects. All the samples were mixed and randomly divided into training, validation, and testing sets, as shown in Fig. 6(b). This manner assumed that all the samples were independent and identically distributed regardless of the subject effect. Some ECG classification studies adopted this manner [17], [24], [26], [37]. In the applications of emotion recognition with EEG signals, some studies also applied “trial-oriented” recognition, which was similar to the mixed manner [28], [30], [47].
2) Subject-Specific (Subject-Dependent) Manner: This manner trained a specific model for just one subject due to the variability among the subjects [32]. The training and validation sets were collected from the same subject for training just one model, and a group of participants thus required multiple models, as shown in Fig. 6(c). This manner assumed that only the samples collected from the same subject share an identical pattern. Some epileptic seizure prediction studies preferred this manner, such as the studies reported in [27] and [48].
3) Fine-Tuning Manner: This manner attempted to balance the information of other subjects and the testing subject. The model could be first trained on the data collected from other subjects, and then fine-tuned on part of the data of the tested subject (also known as the target domain). This manner assumed that the training set collected from the training group helped model the common patterns, but that it had insufficient personalized information about unseen subjects due to the interuser differences. To build a blood glucose (BG) prediction model, Dong et al. [49] used this manner to train a model on multiple patients and then fine-tuned the model for one patient. Similarly, Phan et al. [14] fine-tuned the SeqSleepNet and DeepSleepNet [21] models, which are well-developed RNN-based models in sleep stage classification [31].
The strategy choice is largely determined by the study purposes, practical requirements, data structure, and physiological considerations. However, different strategies on the same dataset could lead to significantly different results. We will discuss this issue in Section V-C.

IV. APPLICATIONS IN PHYSIOLOGY
The main machine learning task in physiology is to develop an automatic diagnostic or patient status monitoring system. Analyzing the disorder, predicting the onset of a seizure, classifying a subject’s state, and even forecasting a possible disease from the time-related signals are all desired tasks. RNNs are the top options for dealing with temporal information and learning the relationship between the signals and the symptoms. As mentioned earlier, such relationships are generally not well understood and are hard to evaluate with existing human knowledge. The RNN framework endows physiological data analysis with a highly flexible, inductive, nonlinear modeling ability. Based on our survey, we found that RNNs have already served human physiology from the top (brain) to the bottom (gait), as shown in Fig. 7.
The works cited in this review were found by use of aggregate research databases, including PubMed (MEDLINE), Springer Link, Google Scholar, and IEEE Xplore. Keyword searches were conducted through these databases with search terms, such as “RNN,” “long short-term memory,” and “GRU” (and their acronyms) in combination with “physiology,” “electrocardiogram,” “electromyography (EMG),” “EEG,” “photoplethysmogram,” “epileptic seizure,” “emotion recognition,” “sleep stage,” “BG,” and “gait” (and their acronyms). In the following sections, we will summarize the majority of studies in ECG classification, emotion recognition, epileptic seizure detection, sleep stage classification, and BG level prediction with RNNs. To present up-to-date studies in these applications, we focus on the papers published after 2015. Besides, we also summarize the studies in other physiological fields from the most representative studies published after 2010. We mainly cover the human-subject studies that applied RNNs for analyzing physiological time sequences. Additional studies, such as biomedical image processing and document/tabular analysis, will not be included.
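The data-partitioning manners described at the end of Section III can be contrasted with a short sketch. The snippet below builds a mixed (sample-level) split, a subject-specific split, and leave-one-subject-out folds (the cross-subject validation mentioned for sleep staging in Section IV-D) on hypothetical `(subject_id, sample_index)` pairs. All names, the 25% test fraction, and the synthetic sample pool are illustrative assumptions.

```python
import random

def mixed_split(samples, test_frac=0.25, seed=0):
    """Mixed manner: shuffle all samples, ignoring which subject they came from."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

def subject_specific_split(samples, subject, test_frac=0.25):
    """Subject-specific manner: train and test on one subject only."""
    own = [x for x in samples if x[0] == subject]
    n_test = max(1, int(len(own) * test_frac))
    return own[:-n_test], own[-n_test:]

def leave_one_subject_out(samples):
    """Cross-subject validation: each fold holds out all samples of one subject."""
    for held_out in sorted({s for s, _ in samples}):
        train = [x for x in samples if x[0] != held_out]
        test = [x for x in samples if x[0] == held_out]
        yield held_out, train, test

# Synthetic pool: 4 subjects with 5 samples each.
samples = [(f"S{i}", j) for i in range(4) for j in range(5)]

train_mixed, test_mixed = mixed_split(samples)
train_s1, test_s1 = subject_specific_split(samples, "S1")
folds = list(leave_one_subject_out(samples))
```

With the mixed split, samples from one subject can land in both the training and test sets, whereas each leave-one-subject-out fold keeps the held-out subject entirely unseen during training.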
TABLE I
SUMMARY OF WORKS CARRIED OUT USING RNN STRUCTURE WITH ECG SIGNALS
TABLE II
SUMMARY OF CONTRIBUTIONS DEALING WITH THE APPLICATIONS OF RNNS TO EMOTION RECOGNITION WITH EEG SIGNALS
activities for specific emotion [29], [47], [61]–[63]. For the SEED and DEAP datasets and other EEG data recorded with the 10–20 system, an additional concern is how to represent not only the temporal dependency but also the spatial connection of the multichannel signals. The EEG components from different brain regions may also correlate with emotions. In combination with RNNs, some studies listed in Table II attempted to model the spatially adjacent dependency according to the positions of electrodes. For example, the study reported in [47] mapped the multichannel signals to 2-D image sequences and further extracted deep features by CNN layers, thus improving the accuracy compared with the study in [30].
Although various approaches have been proposed for EEG-based emotion recognition, most experimental results cannot be compared directly because of different experimental setups. Several emotional EEG datasets have recently become publicly available, but there is still a lack of a standard protocol for evaluating the performance. For example, the studies of Li et al. [30], [47] employed the trial-oriented fivefold cross validation (mixed manner) to implement their model. In contrast, Alhagry et al. [64] and Yang et al. [65] trained the model in a subject-specific manner. Nevertheless, some studies compared their performances with non-RNN methods under similar experiment setups. The C-RNN architecture proposed by Li et al. [30] achieved better performance than the random forest and. On both DEAP and SEED datasets, Li et al. [66] also suggested that an RNN with a variational autoencoder (VAE) outperformed SVM, random forest, k-nearest neighbors, simple logistic regression, naive Bayes classifier, and DNN.

C. Epileptic Seizure Detection
Epilepsy is a chronic neurological disorder caused by abnormal excessive or synchronous neuronal activities in the patient’s brain [68]. Based on the EEG recording, the study of the brain activity and the neurodynamic behavior of epileptic seizures provides the required clinical diagnostic information. However, EEG analysis is time-consuming for the neurologist through qualitative visual inspection of raw data. Current studies in automatically detecting epileptic seizures have already utilized the merit of RNNs, which help to explore the characteristics of EEG, as summarized in Table III.
Almost all the studies in Table III adopted the within-subject strategy, leading to relatively fair comparisons. Only the study carried out by Thodoroff et al. [69] considered cross-subject detection. For reasons similar to those in the EEG-based emotion classification tasks, some studies constructed handcrafted features from time-frequency representations [27], [69], [70]. Alternatively, deep models, such as CNN and autoencoder, were also applied to learn the lower level features [48], [72].
On the CHB-MIT dataset, the model designed by Daoud et al. gave the state-of-the-art performance (99.72% accuracy) when the CNN layers and pretrained encoder were used to extract the features [48]. Their model outperformed the CNN and DNN models. On the Uni Bonn dataset, a state-of-the-art performance was achieved by directly feeding the reshaped raw signal into an RNN. The accuracy reached 100% [74], suggesting that the handcrafted features were not robust enough. Other studies also applied non-RNN-based methods on the same dataset, but they did not exceed the best performance. For example, Acharya et al. [77] used a 13-layer deep CNN and obtained 88.7% accuracy. Lu and Triesch [78] proposed a deep CNN with residual structure, and the system gave 99.0% accuracy.

D. Sleep Stage Classification
Sleep plays a vital role in human health. Abnormalities in sleep timing and circadian rhythm are common comorbidities in numerous disorders, such as apnea, insomnia, and narcolepsy [79]. Automatically monitoring the sleep stage would significantly benefit the clinical research and practice for evaluating a subject’s neurocognitive performance. Many studies have been trying to automate sleep stage scoring based on multichannel signals from electrodes. These signals are generally called polysomnogram (PSG), typically consisting of EEG, EOG, EMG, and ECG. Most sleep stage classification problems could be described as “many-to-many” problems,¹ since the labels were commonly in the sequential form synchronized with the PSG signals. Meanwhile, as described in Fig. 5(a), each signal chunk had one corresponding label annotated by the human expert.²
The current studies for sleep staging are summarized in Table IV. The RNN was indispensable for this task when deep learning methods were used. Similar to the task of emotion recognition, some studies for sleep staging used the frequency domain features for constructing the input, such as the log-power spectrum, based on the frequency bands of the rhythms of EEG signals. Meanwhile, according to the American Academy of Sleep Medicine (AASM) standard, the five sleeping stages are highly characterized by the frequency bands [81], [86]. Most of the studies applied a cross-subject strategy. For example, Supratak et al. [21] applied k-fold subject-wise cross validation, and Phan et al. [23] used leave-one-subject-out cross validation. The DeepSleepNet developed by Supratak et al. [21] obtained higher overall accuracy than the non-RNN sparse autoencoder [87] and CNN-based method [88]. Based on LSTM, Phan et al. [89] further improved the accuracy with SeqSleepNet with learned features. They suggested that such a model worked better than the CNN-only [88], DNN-only, and regular machine learning methods, such as SVM and random forest [38].

E. Blood Glucose Level Prediction
Diabetes mellitus is a common public health issue, and the prevalence of diabetes diagnoses has increased substantially over the past 30 years among adults in the U.S. [90]. The cause of this metabolic disorder is a defect in insulin release or action, which leads to hyperglycemia. Managing the BG levels of a

¹ Some studies in this field also named such a scenario as the “sequence-to-sequence” sleep staging problem, which has a similar name to the well-known Seq2seq model [80]. For disambiguation, we use the term “many-to-many” to describe the scenario in Fig. 5(a).
² Some EEG or PSG studies named a signal slice in a specific window as an “epoch.” However, in deep learning studies, “epoch” usually refers to an iteration in which the entire dataset is used for training a model. For disambiguation, we use “chunk” instead of “epoch” to represent the signal slice.
TABLE III
SUMMARY OF CONTRIBUTIONS DESCRIBING THE APPLICATION OF RNNS FOR EPILEPTIC SEIZURE DETECTION
diabetic patient can benefit glycemic control and reduce costly complications [91].
Continuous subcutaneous glucose monitoring is becoming the most popular tool with a micro-invasive sensor to measure BG, such as an adhesive patch. One primary task for machine learning algorithms is to forecast abnormal changes in glucose concentration to take preventive action in time and avoid life-threatening risks [92]. To implement this, there are two types of inputs: using the past BG concentration only, or using the past BG concentration and external factors, such as food, drug, insulin intake, and activity, as shown in Table V. The features are commonly extracted from physiological models when the external factors are included as input modalities [91], [92]. All these features describe the effects of carbohydrate, insulin, exercise, sleep, and glucose dynamics, and they are characterized as glucose-related variables.
Predicting the value at a future step is traditionally an autoregressive problem. When an RNN is applied, this problem could be reformulated as a “many-to-one” scenario. For the prediction with multiple steps, such as predicting the BG level in
TABLE IV
SUMMARY OF CONTRIBUTIONS DESCRIBING THE APPLICATION OF RNNS FOR SLEEP STAGE CLASSIFICATION
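For the sleep staging studies that build their input from frequency-domain features, a minimal sketch of the preprocessing is given here: a single-channel signal is cut into fixed-length chunks, and the log power in a few EEG rhythm bands is computed per chunk, giving one feature vector per chunk for a many-to-many RNN. The 30-s chunk length, the 100-Hz sampling rate, the band edges in `BANDS`, and the function names are assumptions chosen for illustration, not values taken from the cited studies.

```python
import numpy as np

# Assumed band edges in Hz (conventional EEG rhythms; illustrative only).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def chunk_signal(x, fs, chunk_sec):
    """Split a 1-D signal into consecutive non-overlapping chunks."""
    n = int(fs * chunk_sec)
    n_chunks = len(x) // n
    return x[: n_chunks * n].reshape(n_chunks, n)

def log_band_power(chunks, fs):
    """Return an array (n_chunks, n_bands) of log power per band."""
    n = chunks.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(np.fft.rfft(chunks, axis=1)) ** 2 / n
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(np.log(power[:, mask].sum(axis=1) + 1e-12))
    return np.stack(feats, axis=1)

# Synthetic example: 90 s of a 10-Hz sine, which falls in the alpha band.
fs = 100
t = np.arange(90 * fs) / fs
x = np.sin(2 * np.pi * 10 * t)

chunks = chunk_signal(x, fs, chunk_sec=30)   # one row per 30-s chunk
features = log_band_power(chunks, fs)        # one feature row per chunk
```

Each row of `features` would be one time step of the RNN input, paired with the expert label of the corresponding chunk.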
15, 30, and 60 min, one can construct the model output as a multidimensional vector [93], [94]. Most RNN-based BG level prediction studies have reported that the RNN models offered better performance (lower errors) than the other methods, such as the autoregressive model and support vector regression [92], [94], [95], indicating that the RNN was more robust in searching the historical information with sufficient nonlinearities. Besides, one innovative design structured this prediction problem as a seq2seq model using an RNN-based autoencoder, as introduced in the study of Fox et al. [95]. Their comparative studies suggested that the “many-to-one” structure (DeepMO) could not offer the best performance. Instead, the PolySeqMO outperformed all other structures. More details can be found in Table V.
A critical challenge for the BG studies is modeling the typical pattern among all the subjects while considering the subject-specific characteristic. To address this issue, the practitioners in BG prediction generally design the model with two (or more) branches: one learns the personalized pattern with different weights and biases, and the other learns the common dynamics with a shared RNN [91]–[93]. Other studies also attempt to apply the idea of fine-tuning manner, as shown
TABLE V
SUMMARY OF EXISTING STUDIES APPLYING RNNS FOR BG LEVEL PREDICTION
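For the BG studies that predict several horizons at once (for example 15, 30, and 60 min), the training pairs can be sketched as below: each input is a window of past BG readings, and each target is a vector with one future value per horizon. The 5-min sampling period, the window length, the function name `make_windows`, and the synthetic series are assumptions for illustration only.

```python
def make_windows(series, history, horizons):
    """Build (inputs, targets) for multi-horizon, many-to-one prediction.

    Each input holds `history` consecutive past values; each target holds
    the values h steps after the last value of that window, for h in horizons.
    """
    inputs, targets = [], []
    max_h = max(horizons)
    for end in range(history, len(series) - max_h + 1):
        inputs.append(series[end - history:end])
        targets.append([series[end - 1 + h] for h in horizons])
    return inputs, targets

SAMPLE_MIN = 5                                        # assumed sampling period (min)
horizons = [m // SAMPLE_MIN for m in (15, 30, 60)]    # horizons in steps: 3, 6, 12

series = list(range(100, 130))                        # synthetic BG readings (30 points)
X, Y = make_windows(series, history=6, horizons=horizons)
```

An RNN would read each window in `X` and output a three-dimensional vector to be compared with the matching row of `Y`.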
in Fig. 6(c) [49], [94]. Glucose metabolism in the human body is a long-term process. For example, the effect of insulin intake can be present for more than ten hours, and measuring BG level variation demands days. Therefore, it is challenging to obtain a large number of samples. Meanwhile, the within-subject experiment is the main strategy in the existing studies since the interpatient variability makes it hard to find a generic model. We will discuss more in Section V-C.

F. Other Applications
Besides the applications mentioned above, the RNN also exhibited its power in other physiological fields. Singh et al. [98] attempted to classify automotive drivers’ stress levels based on the galvanic skin response and photoplethysmography signals. Their study compared the traditional DNN and the Elman RNN and pointed out that the RNN was the optimal structure for stress level detection. Futoma et al. [44] applied an LSTM combined with the multitask Gaussian process to detect the onset of sepsis. Mastorocostas and Theocharis [99] designed a block-diagonal RNN, a modified version of the Elman RNN, to analyze lung sounds. Cheng et al. [100] used a deep LSTM to detect obstructive sleep apnea based on ECG signals. Su et al. [34] used ECG and PPG to predict blood pressure. Their study applied a res-RNN architecture with a residual connection similar to the ResNet based on the CNNs [101]. Liu et al. [102] involved historical blood pressure records, heart rate, and temperature in predicting future blood pressure with LSTMs. More importantly, they used the subject’s profile as an extra input vector to address the cross-subject issue, and we will discuss it in Section V-C3. Hussain et al. investigated the preterm prediction for pregnant women using electrohysterography
(EHG) technique [36]. Bahrami Rad et al. [103] also used PSG to detect nonapneic and nonhypopneic arousals with a three-layer bidirectional LSTM. Yang and Hsieh [35] detected heartbeat anomalies based on heart sounds with a two-layer GRU.
The EEG signals are also valuable measures for stroke detection and rehabilitation, and an increasing number of studies have attempted to analyze this disease with RNNs. Choi et al. [104] designed a hybrid model with CNN and bidirectional LSTM for early stroke detection. Fawaz et al. [105] proposed a learnable fast Fourier transform method combined with an LSTM RNN to classify stroke/nonstroke patients. To identify post-stroke patients based on EEG, Sansiagi et al. [106] applied a three-layer LSTM with discrete wavelet representations.
Another physiological application of RNNs is pain assessment, which is challenging to complete in clinical practice. One method to objectively assess the pain level is to detect the protective behaviors with wearable motion sensors and surface EMG signals. Based on these modalities, Wang et al. [107] applied a three-layer LSTM model to identify the patients with chronic lower back pain; Li et al. [108] proposed a similar structure with extra dense layers; Yuan and Mahmoud [109] constructed an LSTM-based autoencoder to extract the latent features, and then applied an attention mechanism to the feature sequences. Another physiological modality is functional near-infrared spectroscopy, which measures the hemodynamic response in the brain. Rojas et al. [110] employed this modality and a bidirectional LSTM to classify thermal-induced pain perceptions.
Some studies have also attempted to use RNNs in EMG signal analysis. These signals reflect the muscular electrical activities and offer a widely adopted method for evaluating the neuromuscular status and identifying body movement. Xia et al. [43] proposed an EMG-based forearm movement estimation system with LSTM. By analyzing the EMG collected from seven upper body muscles, Bengoetxea et al. [111] identified different figure-eight movements. Wang et al. [42] classified the left-hand postures using LSTM. Li et al. [46] built the relationship between EMG and stimulated muscular torque with a NARX strategy.
Some physiological signals can also be used for identification with the help of RNNs. Salloum et al. [112] and Lynn et al. [20] used ECG signals to conduct biometric identification with RNNs, and the accuracies were more than 98%. Moreover, Zhang et al. [113] used the ballistocardiogram as input for a similar task. The above-mentioned studies all compared the accuracies between GRU and LSTM and reported that there were no significant differences between these two units. More discussion will be presented in Section V-B.
Recently, some studies have attempted to measure swallowing-induced events with on-neck sensor signals and RNN models. Mao et al. applied a multilayer Elman RNN to track the hyoid bone movement during swallowing [114]. They then combined CNN-1D and GRUs to identify the laryngeal vestibule status (opening or closure) [115]. Importantly, they pointed out that the RNN-based model performed better than the CNN model. Khalifa et al. [116], [117] proposed a hybrid CNN-RNN model to detect the upper esophageal sphincter opening with swallowing acceleration signals. The proposed model used a GRU-based RNN for modeling time dependencies after time-localized feature extraction from raw signals using CNN [116], [117].
Another critical physiological task is human gait analysis, as the data can hold information about medical and neurodegenerative disorders. Zhao et al. [118] applied LSTMs on force-sensitive resistor signals to identify neurodegenerative diseases, such as Parkinson’s disease, Huntington’s disease, and amyotrophic lateral sclerosis. Zhen et al. [119] used LSTM with accelerometer signals collected from the thigh, calf, and foot to identify swing and stance phases in the gait cycles. Gao et al. [120] proposed a structure that combined LSTM and 1-D CNN to classify the abnormal gait with wearable inertial measurement units. Tortora et al. [121] attempted to decode the gait patterns from EEG signals using LSTM. In this case, the input was the sequential features of EEG, and the labels were the swing or stance phase. The application of RNNs in human gait analysis still has a long way to go in physiology since the related studies are relatively limited.

V. EXISTING ISSUES AND FUTURE WORK
Although many studies have reported using RNNs to solve a wide range of problems, as introduced earlier, there remain several issues facing the further development of RNNs in physiological applications.

A. Finding Features for RNN Models
1) Knowledge-Based Feature Engineering: The principal idea of feature extraction is selecting the meaningful components of sequential data to predict events of interest. These features are supposed to be related to these events, and feature extraction requires external knowledge. Involving this knowledge in the RNN model design is a natural methodology, especially when the signal features of some typical applications have been previously explored. For example, in the epilepsy detection studies, wavelet-based methods were prevalent in constructing features from EEG signals because wavelet transforms were extensively studied and well established to analyze brain activity [70], [122]. The study by Schwab et al. [22] aimed to classify cardiac arrhythmias based on ECG and manually extracted features from engineering and clinical perspectives, such as the amplitude of R point and QRS duration in the ECG waveform. All these features were widely studied biomarkers for cardiac disorders. They designed a five-layer GRU or bidirectional LSTM with a Markov model and attention mechanism. Although it was considerably complicated, such a sophisticated structure indeed provided state-of-the-art performance. All the previously reported studies in feature extraction will help the design of RNNs, especially in ECG and EEG-related tasks. However, seeking features might be intractable when the domain knowledge is insufficient [123].
2) Finding Features Through Deep RNNs: Besides extracting the features by human knowledge, scholars were also aware of the merit in deep learning: it is possible to seek features via the deep architecture itself. In computer vision,
deep CNN architecture has been historically successful by generating “feature maps” in intermediate layers. However, the situation is more complicated when dealing with physiological data. If seeking features from the raw data is desirable, increasing the model capacity may be needed. Chauhan and Vig [124] first attempted to feed raw electrocardiographic signals into a three-layer LSTM RNN to conduct anomaly detection. It was quite a deep structure for processing physiological temporal data. Qiu et al. [125] proposed a three-layer LSTM to remove the power line interference in ECG, in which the input was also raw data.

The drawback of the raw signal input is the large number of time steps through which the error signal of RNNs has to propagate [22]. The LSTM and GRU are specifically designed to solve long-term dependency problems, but they are not hardware friendly due to the difficulties in parallelized computation [126]. Therefore, the design of hardware accelerators is a path for future work. An alternative is to modify the entire deep RNN structure to reduce the calculation of backpropagation through time, as introduced in Section V-A.3.

3) Finding Features Through Deep Structures: From 2018, there was a tendency to combine convolutional networks with RNNs (C-RNNs) for physiological applications [11], [12], [41], [43]. Although the purpose of each network in these studies was different, the deep structures were similar: the CNN layers aimed to extract the local features, and the RNN connected the temporal relationship among these features. Shashikumar et al. [11] treated the 1-D ECG signals as 2-D pictures by calculating the wavelet power spectrum. Based on the spectral “image,” they implemented a five-layer CNN, at the top of which was a one-layer bidirectional Elman RNN. Unlike Shashikumar’s work, Tan et al. [12] and Patane et al. [127] used a 1-D CNN as the bottom layer to extract the features of 1-D signals. Patane et al. [127] also proposed a “Siamese architecture” besides the C-RNN to improve accuracy. The structure reported by Xiong et al. [41] was more advanced: they applied the residual block and the batch normalization techniques to cardiac arrhythmia detection with an Elman RNN. These ideas are prevalent in image-related tasks and have been transferred to physiological studies. In the above-mentioned studies, the CNN layers provide short-term local features and are easy to parallelize in computation. In addition, when a convolutional layer is introduced, the pooling technique is also applicable to reduce the signal length or time steps, and the computation is therefore simplified. Studies of hybrid structures have only just started in physiological applications, but they have created new ideas for future studies.

B. Choice of RNN Unit

Section II-A introduced the Elman, LSTM, and GRU units, which are widely used in most physiological studies. The LSTM and GRU are well-known structures that address the problem of “gradient vanishing,” typically associated with the long-term training of Elman RNN. Based on the existing studies shown in Section IV, the best choices for building up the RNN models are LSTM and GRU. However, we should explore more in future studies.

1) LSTM Versus GRU: Choosing between LSTM and GRU might be a hard question for the model designer. The detailed comparison between these two units was presented in [5]. Based on polyphonic music modeling and speech signal modeling, that study reached no concrete conclusion on which of the two units was better. Physiological studies give similar results. Zhang et al. [113] conducted ballistocardiogram-based biometric identification with two types of units and reported that LSTM and GRU were not significantly different in accuracy. Lynn et al. [20] suggested that GRU was slightly better than LSTM with ECG signals. Dong et al. [38] reported that LSTM and GRU achieved similar performance for sleep stage classification. Latif et al. conducted RNN-based abnormal heartbeat detection with phonocardiography and reported that the accuracy difference between GRU and LSTM was smaller than 1% [128].

Although GRU gave similar results to LSTM, it employs fewer parameters and is thus computationally efficient compared with LSTM. Latif et al. [128] also reported that GRU took 35% less run-time than bidirectional LSTM while achieving a comparable result, suggesting that GRU was more suitable for deployment on mobile or wearable devices with limited hardware resources. From the perspective of algorithm deployment, the GRU is more competitive and hardware-friendly than LSTM.

2) Use of Elman RNN: Elman RNN may still be valuable for future model design. Although training an Elman RNN suffers from the “gradient vanishing” problem, we did not see its complete disappearance from the current studies in Section IV. Elman RNN is simple and computationally efficient, and we need to figure out how to address the “gradient vanishing” problem. One way is reducing the length of the input sequence with deep models (such as CNN) at a lower level, as suggested by Shashikumar et al. [11], Zhang et al. [29], and Xiong et al. [41]. The model proposed by Xiong et al. [41] even outperformed other LSTM-based models with the same dataset, as shown in Table I. Mousavi and Afghah [54] also built a Seq2seq model with CNN layers and bidirectional Elman RNN, and they achieved better performance for ECG classification than the LSTM ones.

Compared with the Elman RNN, LSTM and GRU have additive gating components. According to the analysis carried out by Chung et al. [5], these additions effectively create shortcut paths that bypass multiple temporal steps. To train the Elman RNN effectively, an alternative is to create the shortcut paths outside the recurrent loops. Several studies attempted to investigate this idea. In atrial fibrillation detection, Shashikumar et al. [11] added a soft attention layer on top of the Elman RNN for the final output. Another notable study was proposed by Zhu et al. [94]. They designed a dilated Elman RNN structure with skipped time-step connections on each successive layer and reduced the long-term dependency [94]. They also compared the Elman unit with LSTM and GRU and suggested that the Elman unit gave the best performance with a significantly reduced number of parameters. For the “many-to-one” scenario, methods B and C (Section III-A) create the shortcuts outside recurrent units. These structures may help the Elman RNN solve the vanishing gradient problem since the
6998 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023
backpropagation-through-time is not the only way for weight updating.

3) Other Choices: Although LSTM and GRU are widely used RNN units, they are not the only choices for constructing RNN models. Several other unit types, which are modified versions of existing units, have shown promising results.

a) Quasi-recurrent neural networks (Q-RNN, 2016) [129]: Q-RNN is a hybrid structure inspired by LSTM. It combines CNN and LSTM and enables parallel computation across time steps. Q-RNN achieved comparable results with LSTM on language modeling tasks.

b) Simple recurrent units (SRU, 2017) [130]: SRU holds the idea of cell states while only using forget and reset gates. One improvement is replacing the matrix multiplication of cell state with pointwise multiplication, making the unit computation parallelizable. SRU achieved more robust performance than LSTM and Q-RNN but used less computational time on various language processing tasks.

c) Independently recurrent neural network (IndRNN, 2018) [131]: IndRNN is similar to the Elman unit. In (2), the hidden state is updated by matrix multiplication, while IndRNN replaces matrix multiplication with pointwise multiplication (Hadamard product). IndRNN achieved higher accuracy on both image and language tasks.

d) Just another network (JANET, 2018) [132]: JANET only keeps the forget gate of LSTM and removes all other gates. JANET outperformed the LSTM on MNIST and pMNIST databases. JANET was also applied to the MIT-BIH ECG database and achieved 89.4% accuracy under cross-subject classification. This performance is higher than that reported by Hou et al. [45], who also conducted cross-subject classification with the same dataset.

All these recently proposed units have shown promising performance with simplified structures. They have not been broadly investigated in physiological applications, and comparative results are still absent.

C. Subject Effects

Section III-B introduced the experiments that could employ either within-subject or cross-subject strategies for model training. Meanwhile, in Section IV, the comparisons of existing methods are made under the same strategy to avoid the impact of the subject issue. Although this issue exists for all kinds of machine learning designs (e.g., DNN and CNN), it causes obstacles for RNN development.

1) Performance Comparison: Using inconsistent strategies has hindered side-by-side comparisons among the studies. It is hard to design a better model structure if we cannot measure the performance fairly. With the same datasets, performance can change with both the innovative model designs and the strategies used in the experiments. Some studies attempted to compare the performance discrepancy under different strategies with RNN. Tan et al. [12] conducted both a mixed manner and an equivalent way of cross-subject prediction for ECG classification, and the accuracies were 99.85% and 95.76%, respectively. For the same task, Hou et al. [45] also compared beats-based cross validation (mixed manner) and record-based cross validation (cross-subject prediction), and the accuracies were 99.74% and 85.20%, respectively. In the EEG emotion recognition task, Li et al. [62] used both mixed manner and leave-one-subject-out cross validation, and the accuracies were 92.38% and 83.28%, respectively. Thodoroff et al. [69] employed both patient-specific and cross-patient settings for seizure detection, and the sensitivities were 95%–100% and 85%, respectively.

For a fair comparison, the ideal approach is to use a standard protocol for all practitioners in a specific field. Building up universal standard test datasets is challenging and requires collaboration across organizations and disciplines. Fortunately, such efforts are ongoing, such as the dataset provided by the Computing in Cardiology Challenge, in which the testing set was strictly defined for all the participating groups [15], as shown in Table I. Some research groups also spontaneously use the same protocol, such as the protocol proposed by Zheng and Lu [61] for EEG emotion recognition, and leave-one-subject-out cross validation in pain level assessment [109]. With fairly comparable results, we can objectively evaluate the development and have a clear vision for future model design.

Considering the subject effect in computational physiology, deep learning practitioners should clearly describe how they conduct the validation process. Our survey found that it is challenging to compare some performances among published studies since the types of validation were sometimes not clearly stated. Moreover, the comparative analysis should be carefully carried out. Picking up the reported values from other studies is arbitrary because they may not use the same strategy. Fortunately, some studies conducted comparisons by reimplementing the methods proposed in other studies under the same strategy. Based on these investigations, we could see the advantages of the RNN model, as introduced in Section IV.

2) Practicability: Most pilot studies were in the prototype stage, and adopting a within-subject strategy was for proof-of-concept purposes. Although they suggested that the RNN could achieve better results, we also need to consider whether the within-subject study is practically feasible. The models trained under within-subject strategies must be retrained (or fine-tuned) with a new training set for the unseen subjects. This is not typically an issue if the devices have sufficient computational power. The problem is whether human experts are indispensable for labeling new data. If the labels for unseen subjects can be automatically obtained without human experts, personalized models are practically feasible. One example is BG prediction, in which monitoring devices could measure the glucose level continuously. Therefore, it is efficient for the pretrained models to capture the training pairs for a new subject, and the interperson variability issue could be addressed in practice.

If the labels are not easy to acquire and human experts are necessary for the labeling process, off-line fine-tuning for a personalized model is the only choice. For an unseen subject, the model tuning procedures will be constrained by many factors, such as the availability of the human raters. Since within-subject studies have already achieved very high
performance, more studies should focus on the scalability of the personalized models. In the future, we would like to see investigations that practically implement the within-subject model for the unseen subjects at the inference stage, especially when the recordings of unseen subjects were not collected or labeled before training the pretrained model. These studies would inform protocols for data collection, human expert scheduling, inter/intrarater reliability analysis, off-line tuning for the personalized model, and model deployment for the unseen subjects.

3) Implementation of Cross-Subject Prediction: Developing the physiological system for unseen patients is the most natural circumstance. As introduced in Sections IV and V-C1, the accuracies of some studies under within-subject classification tasks have already exceeded 90%, even approaching 100%. However, cross-subject prediction accuracies stayed at 80%–90%. In the future, we should consider the hypothesis that it is possible to predict one person’s status from other persons’ examples via RNN models. The interpatient variability will seriously affect the results when the cross-subject prediction is conducted, but it does not suggest the impossibility of capturing the personalized pattern based on training groups. For example, the state-of-the-art cross-subject performance has reached 99.53% for ECG classification by using the Seq2seq model [54].

We can ask whether human experts (raters) are “personalized” or “user specific” when labeling the data. However, this expert-level performance is only obtainable when the studied cohort is large, i.e., hundreds of subjects or more [31], [91]. Although some datasets were manually annotated and publicly available, such as MIT-BIH of ECG signals [16] and DEAP of EEG signals [58], more data are still needed to improve the model’s cross-subject generalization. Collecting more datasets for training is a solution, but it is challenging due to practical constraints, such as time cost for labeling, monetary expense, ethical review, and privacy concerns.

The individual characteristics impede the model’s generalization across patients, and the samples may be conditionally identically distributed under different subjects. However, rigorous mathematical or numerical analysis is absent, and it is unclear how the intersubject variability impacts the model’s generalization error bound. Meanwhile, we need to investigate whether the input signals contain the class information and represent the individual characteristics. Unsupervised learning methods, such as the autoencoder model, may help us answer those questions by analyzing the latent space. A typical structure involving unsupervised learning was proposed by Dong et al. [93] for BG prediction, in which the K-means method was combined with an RNN model (Clu-RNN). The idea was that the input vectors collected from some subgroups might share similar patterns. Moreover, Li et al. [66] attempted to use the unsupervised learning method first to search the sequence of the features for emotion recognition. This method achieved the best cross-subject accuracy for the SEED dataset compared with other studies.

In physiological applications, each patient could be characterized by external factors, such as gender, age, weight, body mass index, medical history, and personal profile. If the model cannot capture the complete personalized information based on the input signals, we can encode the external factors as an auxiliary input vector and involve them in the model design. Liu et al. [102] adopted such an idea in blood pressure prediction by embedding a contextual information cue (personal profile), including age, gender, body mass index, height, weight, and temperature. In practice, the personal information is easy to obtain and may help to improve the model’s performance for cross-subject prediction.

Data augmentation is a widely used way of improving model generalization. Augmentation is not readily accessible for sequential physiological input since there is no way to augment the inputs from unseen subjects. If the latent factors, some mathematical descriptions of training subjects, follow some distribution, would it be possible to model this distribution and sample from it to get more training data? Generative models are good options. Such ideas have already been successfully implemented in other fields, but more effort is needed in processing physiological sequences.

Another way to obtain more data is multitask learning. As introduced earlier, the emotion recognition, seizure detection, and sleep stage classification tasks adopted EEG (PSG) signals as inputs. Although they were collected from different domains and varied setups, they may share common features. Joint datasets and multitask learning could extract these features. This method increases the number of subjects and achieves more robust features. In addition, some unsupervised learning methods, such as the RNN-based autoencoder, allow involving more EEG datasets without any annotation. Multitask learning may offer us new solutions to improve the cross-subject prediction performance for the task of subdomain and ideas for other big-data-related physiological problems, such as transfer learning.

D. Other Opportunities

1) Ensemble Model: There are many ways of constructing RNN models, as introduced in Section IV. In addition, there are many choices for feature extraction, hyperparameter setup, and the types of RNN units. By including different deep learning architectures, the ensemble model partially addresses the problem of searching for the optimal structure while improving robustness. This technique is prevalent in many fields and has shown encouraging results. In physiological tasks, only a few studies attempted to implement this technique with RNNs, such as the studies carried out by Schwab et al. [22] and Zihlmann et al. [39] for ECG classification.

2) Seq2seq Model: In Section III-A, we described the “many-to-many” scenario, in which both the input and output are sequences, and we also discussed the typical way of model construction. Alternatively, the Seq2seq model is also suitable for the “many-to-many” scenario citesutskever2014sequence. This model is well-developed in natural language processing, but it is not drawing enough attention in the physiological area. Fox et al. [95] borrowed the idea of the Seq2seq
model and designed the PolySeqMO to predict the BG. Mousavi et al. [83] adopted the Seq2seq model to classify the sleep stages. They also proposed a similar structure for ECG classification and reached the best performance (shown in Table I) [54]. Moreover, some advanced techniques accompanied by Seq2seq, such as attention mechanism [133] and transformer-based architectures [134], can also be transferred to physiological applications.

3) Generative Model: As stated in Section V-C3, generative models may help with data augmentation. The generative adversarial network (GAN) has mainly been developed and applied to image or artificial audio generation [135], [136]. Recent studies have already attempted to generate EEG and ECG signals with advanced techniques like Wasserstein GANs with gradient penalty [137], [138]. Besides data augmentation, GANs can also serve other physiological tasks, such as anomaly detection with well-trained discriminators, signal denoising, and signal synthesis/restoration for missing channel(s) of multiple-sensor systems [139], [140]. VAE is another kind of generative model, which offers an alternative manner for describing the distribution of given data in latent space [141]. This model could also generate more physiological sequences, such as the ECG generation reported by Kuznetsov et al. [142]. VAE may inspire new studies in computational physiology, but more investigations are still needed.

VI. CONCLUSION

This review provided a comprehensive overview of existing studies attempting to apply RNNs in the field of human physiology. The RNN is particularly amenable to monitoring and detecting various physiological states in real time due to its capability of processing time-dependent sequential data. Our survey revealed that RNNs have already been widely studied in diverse healthcare applications. Modern neural network techniques and computational power have facilitated addressing health issues.

ACKNOWLEDGMENT

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

REFERENCES

[1] OpenAI Five Benchmark: Results. (Aug. 6, 2018). [Online]. Available: https://blog.openai.com/openai-five-benchmark-results/
[2] A. Esteva et al., “Dermatologist-level classification of skin cancer with deep neural networks,” Nature, vol. 542, no. 7639, p. 115, Feb. 2017.
[3] Z. C. Lipton, J. Berkowitz, and C. Elkan, “A critical review of recurrent neural networks for sequence learning,” 2015, arXiv:1506.00019.
[4] Y. Khalifa, D. Mandic, and E. Sejdić, “A review of hidden Markov models and recurrent neural networks for event detection and localization in biomedical signals,” Inf. Fusion, vol. 69, pp. 52–72, May 2021.
[5] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” 2014, arXiv:1412.3555.
[6] A. Graves, Supervised Sequence Labelling. Berlin, Germany: Springer, 2012, pp. 5–13.
[7] L. He, D. Jiang, L. Yang, E. Pei, P. Wu, and H. Sahli, “Multimodal affective dimension prediction using deep bidirectional long short-term memory recurrent neural networks,” in Proc. 5th Int. Workshop Audio/Visual Emotion Challenge, Oct. 2015, pp. 73–80.
[8] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[9] J. L. Elman, “Finding structure in time,” Cognit. Sci., vol. 14, no. 2, pp. 179–211, Mar. 1990.
[10] R. Pascanu, T. Mikolov, and Y. Bengio, “On the difficulty of training recurrent neural networks,” in Proc. 30th Int. Conf. Mach. Learn., 2013, pp. 1310–1318.
[11] S. P. Shashikumar, A. J. Shah, G. D. Clifford, and S. Nemati, “Detection of paroxysmal atrial fibrillation using attention-based bidirectional recurrent neural networks,” in Proc. 24th ACM Special Interest Group Knowl. Discovery Data Mining (SIGKDD), 2018, pp. 715–723.
[12] J. H. Tan et al., “Application of stacked convolutional and long short-term memory network for accurate identification of CAD ECG signals,” Comput. Biol. Med., vol. 94, pp. 19–26, Mar. 2018.
[13] K. Brady et al., “Multi-modal audio, video and physiological sensor learning for continuous emotion prediction,” in Proc. 6th Int. Workshop Audio/Visual Emotion Challenge, Oct. 2016, pp. 97–104.
[14] H. Phan, F. Andreotti, N. Cooray, O. Y. Chen, and M. De Vos, “SeqSleepNet: End-to-end hierarchical recurrent neural network for sequence-to-sequence automatic sleep staging,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 27, no. 3, pp. 400–410, Mar. 2019.
[15] G. D. Clifford et al., “AF classification from a short single lead ECG recording: The PhysioNet/computing in cardiology challenge 2017,” in Proc. Comput. Cardiol. (CinC). IEEE, 2017, pp. 1–4.
[16] G. B. Moody and R. G. Mark, “The impact of the MIT-BIH arrhythmia database,” IEEE Eng. Med. Biol. Mag., vol. 20, no. 3, pp. 45–50, May 2001.
[17] S. L. Oh, E. Y. K. Ng, R. S. Tan, and U. R. Acharya, “Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats,” Comput. Biol. Med., vol. 102, no. 1, pp. 278–287, Nov. 2018.
[18] Y.-C. Chang, S.-H. Wu, L.-M. Tseng, H.-L. Chao, and C.-H. Ko, “AF detection by exploiting the spectral and temporal characteristics of ECG signals with the LSTM model,” in Proc. Comput. Cardiol. Conf. (CinC), Dec. 2018, pp. 1–4.
[19] S. M. Hofmann, F. Klotzsche, A. Mariola, V. V. Nikulin, A. Villringer, and M. Gaebler, “Decoding subjective emotional arousal during a naturalistic VR experience from EEG using LSTMs,” in Proc. IEEE Int. Conf. Artif. Intell. Virtual Reality (AIVR), Dec. 2018, pp. 128–131.
[20] H. M. Lynn, S. B. Pan, and P. Kim, “A deep bidirectional GRU network model for biometric electrocardiogram classification based on recurrent neural networks,” IEEE Access, vol. 7, pp. 145395–145405, 2019.
[21] A. Supratak, H. Dong, C. Wu, and Y. Guo, “DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 11, pp. 1998–2008, Nov. 2017.
[22] P. Schwab, G. C. Scebba, J. Zhang, M. Delai, and W. Karlen, “Beat by beat: Classifying cardiac arrhythmias with recurrent neural networks,” in Proc. Comput. Cardiol. (CinC), 2017, pp. 1–4.
[23] H. Phan, F. Andreotti, N. Cooray, O. Y. Chen, and M. D. Vos, “Automatic sleep stage classification using single-channel EEG: Learning sequential features with attention-based recurrent neural networks,” in Proc. 40th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Jul. 2018, pp. 1452–1455.
[24] Ö. Yildirim, “A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification,” Comput. Biol. Med., vol. 96, pp. 189–202, Jul. 2018.
[25] O. Yildirim, U. B. Baloglu, R.-S. Tan, E. J. Ciaccio, and U. R. Acharya, “A new approach for arrhythmia classification using deep coded features and LSTM networks,” Comput. Methods Programs Biomed., vol. 176, pp. 121–133, Jul. 2019.
[26] F. Liu, X. Zhou, J. Cao, Z. Wang, H. Wang, and Y. Zhang, “A LSTM and CNN based assemble neural network framework for arrhythmias classification,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 1303–1307.
[27] K. M. Tsiouris, V. C. Pezoulas, M. Zervakis, S. Konitsiotis, D. D. Koutsouris, and D. I. Fotiadis, “A long short-term memory deep learning network for the prediction of epileptic seizures using EEG signals,” Comput. Biol. Med., vol. 99, pp. 24–37, Aug. 2018.
[28] X. Xing, Z. Li, T. Xu, L. Shu, B. Hu, and X. Xu, “SAE+LSTM: A new framework for emotion recognition from multi-channel EEG,” Frontiers Neurorobotics, vol. 13, p. 37, Jun. 2019.
[29] T. Zhang, W. Zheng, Z. Cui, Y. Zong, and Y. Li, “Spatial–temporal recurrent neural network for emotion recognition,” IEEE Trans. Cybern., vol. 49, no. 3, pp. 839–847, Jan. 2018.
[30] X. Li, D. Song, P. Zhang, G. Yu, Y. Hou, and B. Hu, “Emotion recognition from multi-channel EEG data through convolutional recurrent neural network,” in Proc. IEEE Int. Conf. Bioinf. Biomed. (BIBM), Dec. 2016, pp. 352–359.
[31] H. Phan et al., “Towards more accurate automatic sleep staging via deep transfer learning,” 2019, arXiv:1907.13177.
[32] G. Wang et al., “A global and updatable ECG beat classification system based on recurrent neural networks and active learning,” Inf. Sci., vol. 501, pp. 523–542, Oct. 2019.
[33] F. Ringeval et al., “Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data,” Pattern Recognit. Lett., vol. 66, pp. 22–30, Nov. 2015.
[34] P. Su, X. R. Ding, Y. T. Zhang, J. Liu, F. Miao, and N. Zhao, “Long-term blood pressure prediction with deep recurrent neural networks,” in Proc. IEEE Eng. Med. Biol. Soc. Int. Conf. Biomed. Health Informat. (BHI), Mar. 2018, pp. 323–328.
[35] T.-C. Yang and H. Hsieh, “Classification of acoustic physiological signals based on deep learning neural networks with augmented features,” in Proc. Comput. Cardiol. Conf. (CinC), Sep. 2016, pp. 569–572.
[36] A. J. Hussain, P. Fergus, H. Al-Askar, D. Al-Jumeily, and F. Jager, “Dynamic neural network architecture inspired by the immune algorithm to predict preterm deliveries in pregnant women,” Neurocomputing, vol. 151, pp. 963–974, Mar. 2015.
[37] S. Shraddha, S. K. Pandey, U. Pawar, and R. R. Janghel, “Classification of ECG arrhythmia using recurrent neural networks,” Proc. Comput. Sci., vol. 132, pp. 1290–1297, Jan. 2018.
[38] H. Dong, A. Supratak, W. Pan, C. Wu, P. M. Matthews, and Y. Guo, “Mixed neural network approach for temporal sleep stage classification,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 2, pp. 324–333, Feb. 2018.
[39] M. Zihlmann, D. Perekrestenko, and M. Tschannen, “Convolutional recurrent neural networks for electrocardiogram classification,” in Proc. Comput. Cardiol. Conf. (CinC), Sep. 2017, pp. 1–4.
[40] P. Warrick and M. Nabhan Homsi, “Cardiac arrhythmia detection from ECG combining convolutional and long short-term memory networks,” in Proc. Comput. Cardiol. Conf. (CinC), Sep. 2017, pp. 1–4.
[52] V. Maknickas and A. Maknickas, “Atrial fibrillation classification using QRS complex features and LSTM,” in Proc. Comput. Cardiol. Conf. (CinC), Sep. 2017, pp. 1–4.
[53] S. Saadatnejad, M. Oveisi, and M. Hashemi, “LSTM-based ECG classification for continuous monitoring on personal wearable devices,” IEEE J. Biomed. Health Informat., vol. 24, no. 2, pp. 515–523, Feb. 2020.
[54] S. Mousavi and F. Afghah, “Inter- and intra- patient ECG heartbeat classification for arrhythmia detection: A sequence to sequence deep learning approach,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 1308–1312.
[55] U. R. Acharya, H. Fujita, O. S. Lih, M. Adam, J. H. Tan, and C. K. Chua, “Automated detection of coronary artery disease using different durations of ECG segments with convolutional neural network,” Knowl.-Based Syst., vol. 132, pp. 62–71, Sep. 2017.
[56] M. Kachuee, S. Fazeli, and M. Sarrafzadeh, “ECG heartbeat classification: A deep transferable representation,” in Proc. IEEE Int. Conf. Healthcare Informat. (ICHI), Jun. 2018, pp. 443–444.
[57] T. J. Jun, H. M. Nguyen, D. Kang, D. Kim, D. Kim, and Y.-H. Kim, “ECG arrhythmia classification using a 2-D convolutional neural network,” 2018, arXiv:1804.06812.
[58] S. Koelstra et al., “DEAP: A database for emotion analysis; using physiological signals,” IEEE Trans. Affective Comput., vol. 3, no. 1, pp. 18–31, Mar. 2012.
[59] L. Shu et al., “A review of emotion recognition using physiological signals,” Sensors, vol. 18, no. 7, p. 2074, Jun. 2018.
[60] C. D. B. Luft and J. Bhattacharya, “Aroused with heart: Modulation of heartbeat evoked potential by arousal induction and its oscillatory correlates,” Sci. Rep., vol. 5, no. 1, Dec. 2015, Art. no. 15717.
[61] W.-L. Zheng and B.-L. Lu, “Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks,” IEEE Trans. Auton. Mental Develop., vol. 7, no. 3, pp. 162–175, Sep. 2015.
[62] Y. Li, W. Zheng, Z. Cui, T. Zhang, and Y. Zong, “A novel neural network model based on cerebral hemispheric asymmetry for EEG emotion recognition,” in Proc. 27th Int. Joint Conf. Artif. Intell., Jul. 2018, pp. 1561–1567.
[63] Y. Li, W. Zheng, Y. Zong, Z. Cui, T. Zhang, and X. Zhou, “A bi-hemisphere domain adversarial neural network model for EEG emotion recognition,” IEEE Trans. Affective Comput., vol. 12, no. 2, pp. 494–504, Apr. 2021.
[41] Z. Xiong, M. P. Nash, E. Cheng, V. V. Fedorov, M. K. Stiles, and
J. Zhao, “ECG signal classification for the detection of cardiac arrhyth- [64] S. Alhagry, A. A. Fahmy, and R. A. El-Khoribi, “Emotion recognition
mias using a convolutional recurrent neural network,” Physiol. Meas., based on EEG using LSTM recurrent neural network,” Emotion, vol. 8,
vol. 39, no. 9, Sep. 2018, Art. no. 094006. no. 10, pp. 355–358, 2017.
[65] Y. Yang, Q. Wu, M. Qiu, Y. Wang, and X. Chen, “Emotion recogni-
[42] W. Wang, B. Chen, P. Xia, J. Hu, and Y. Peng, “Sensor fusion for
tion from multi-channel EEG through parallel convolutional recurrent
myoelectric control based on deep learning with recurrent convolutional
neural network,” in Proc. Int. Joint Conf. Neural Netw. (IJCNN),
neural networks,” Artif. Organs, vol. 42, no. 9, pp. E272–E282,
Jul. 2018, pp. 1–7.
Sep. 2018.
[66] X. Li et al., “Latent factor decoding of multi-channel EEG for emo-
[43] P. Xia, J. Hu, and Y. Peng, “EMG-based estimation of limb movement
tion recognition through autoencoder-like neural networks,” Frontiers
using deep learning with recurrent convolutional neural networks,”
Neurosci., vol. 14, p. 87, Mar. 2020.
Artif. Organs, vol. 42, no. 5, pp. E67–E77, May 2018.
[67] M. Soleymani, S. Asghari-Esfeden, Y. Fu, and M. Pantic, “Analy-
[44] J. Futoma, S. Hariharan, and K. Heller, “Learning to detect sepsis with
sis of EEG signals and facial expressions for continuous emotion
a multitask Gaussian process RNN classifier,” 2017, arXiv:1706.04152.
detection,” IEEE Trans. Affect. Comput., vol. 7, no. 1, pp. 17–28,
[45] B. Hou, J. Yang, P. Wang, and R. Yan, “LSTM-based auto-encoder Jan./Mar. 2016.
model for ECG arrhythmias classification,” IEEE Trans. Instrum. [68] E. Pippa et al., “Improving classification of epileptic and non-
Meas., vol. 69, no. 4, pp. 1232–1240, Apr. 2020. epileptic EEG events by feature selection,” Neurocomputing, vol. 171,
[46] Z. Li, M. Hayashibe, C. Fattal, and D. Guiraud, “Muscle fatigue pp. 576–585, Jan. 2016.
tracking with evoked EMG via recurrent neural network: Toward [69] P. Thodoroff, J. Pineau, and A. Lim, “Learning robust features using
personalized neuroprosthetics,” IEEE Comput. Intell. Mag., vol. 9, deep learning for automatic seizure detection,” in Proc. 1st Mach.
no. 2, pp. 38–46, May 2014. Learn. Healthcare Conf., 2016, pp. 178–190.
[47] Y. Li, J. Huang, H. Zhou, and N. Zhong, “Human emotion recognition [70] S. Raghu, N. Sriraam, and G. P. Kumar, “Classification of epileptic
with electroencephalographic multidimensional features by hybrid deep seizures using wavelet packet log energy and norm entropies with
neural networks,” Appl. Sci., vol. 7, no. 10, p. 1060, Oct. 2017. recurrent Elman neural network classifier,” Cogn. Neurodyn., vol. 11,
[48] H. Daoud and M. A. Bayoumi, “Efficient epileptic seizure prediction no. 1, pp. 51–66, 2017.
based on deep learning,” IEEE Trans. Biomed. Circuits Syst., vol. 13, [71] D. Ahmedt-Aristizabal, C. Fookes, K. Nguyen, and S. Sridharan, “Deep
no. 5, pp. 804–813, Oct. 2019. classification of epileptic signals,” in Proc. 40th Annu. Int. Conf. IEEE
[49] Y. Dong, R. Wen, K. Zhang, and L. Zhang, “A novel RNN-based Eng. Med. Biol. Soc. (EMBC), Jul. 2018, pp. 332–335.
blood glucose prediction approach using population and individual [72] C. Huang, W. Chen, and G. Cao, “Automatic epileptic seizure detection
characteristics,” in Proc. IEEE 7th Int. Conf. Bioinf. Comput. Biol. via attention-based CNN-BiRNN,” in Proc. IEEE Int. Conf. Bioinf.
(ICBCB), Mar. 2019, pp. 145–149. Biomed. (BIBM), Nov. 2019, pp. 660–663.
[50] S. Chauhan, L. Vig, and S. Ahmad, “ECG anomaly class identification [73] M. U. Abbasi, A. Rashad, A. Basalamah, and M. Tariq, “Detection of
using LSTM and error profile modeling,” Comput. Biol. Med., vol. 109, epilepsy seizures in neo-natal EEG using LSTM architecture,” IEEE
pp. 14–21, Jun. 2019. Access, vol. 7, pp. 179074–179085, 2019.
[51] C. Zhang, G. Wang, J. Zhao, P. Gao, J. Lin, and H. Yang, “Patient- [74] R. Hussein, H. Palangi, R. K. Ward, and Z. J. Wang, “Optimized deep
specific ECG classification based on recurrent neural networks and neural network architecture for robust detection of epileptic seizures
clustering technique,” in Proc. 13th IASTED Int. Conf. Biomed. Eng., using EEG signals,” Clin. Neurophysiol., vol. 130, no. 1, pp. 25–37,
2017, pp. 63–67. 2019.
7002 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023
[75] C.-Y. Chiang, N.-F. Chang, T.-C. Chen, H.-H. Chen, and L.-G. Chen, “Seizure prediction based on classification of EEG synchronization patterns with on-line retraining and post-processing scheme,” in Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., Aug. 2011, pp. 7564–7569.
[76] R. G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, and C. E. Elger, “Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 64, no. 6, 2001, Art. no. 061907.
[77] U. R. Acharya, S. L. Oh, Y. Hagiwara, J. H. Tan, and H. Adeli, “Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals,” Comput. Biol. Med., vol. 100, pp. 270–278, Sep. 2018.
[78] D. Lu and J. Triesch, “Residual deep convolutional neural network for EEG signal classification in epilepsy,” 2019, arXiv:1903.08100.
[79] K. Wulff, S. Gatti, J. G. Wettstein, and R. G. Foster, “Sleep and circadian rhythm disruption in psychiatric and neurodegenerative disease,” Nature Rev. Neurosci., vol. 11, no. 8, pp. 589–599, Aug. 2010.
[80] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Proc. 27th Int. Conf. Neural Inf. Process. Syst., vol. 2, Dec. 2014, pp. 3104–3112.
[81] N. Michielli, U. R. Acharya, and F. Molinari, “Cascaded LSTM recurrent neural network for automated sleep stage classification using single-channel EEG signals,” Comput. Biol. Med., vol. 106, pp. 71–81, Mar. 2019.
[82] H. Phan, O. Y. Chen, P. Koch, A. Mertins, and M. De Vos, “Fusion of end-to-end deep learning models for sequence-to-sequence sleep staging,” in Proc. 41st Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Jul. 2019, pp. 1829–1833.
[83] S. Mousavi, F. Afghah, and U. R. Acharya, “SleepEEGNet: Automated sleep stage scoring with sequence to sequence deep learning approach,” PLoS ONE, vol. 14, no. 5, May 2019, Art. no. e0216456.
[84] C. O’Reilly, N. Gosselin, J. Carrier, and T. Nielsen, “Montreal archive of sleep studies: An open-access resource for instrument benchmarking and exploratory research,” J. Sleep Res., vol. 23, no. 6, pp. 628–635, Jun. 2014.
[85] A. Sterr et al., “Sleep EEG derived from behind-the-ear electrodes (cEEGrid) compared to standard polysomnography: A proof of concept study,” Frontiers Human Neurosci., vol. 12, p. 452, Nov. 2018.
[86] R. S. Rosenberg and S. Van Hout, “The American academy of sleep medicine inter-scorer reliability program: Sleep stage scoring,” J. Clin. Sleep Med., vol. 9, no. 1, pp. 81–87, Jan. 2013.
[87] O. Tsinalis, P. M. Matthews, Y. Guo, and S. Zafeiriou, “Automatic sleep stage scoring with single-channel EEG using convolutional neural networks,” 2016, arXiv:1610.01683.
[88] O. Tsinalis, P. M. Matthews, and Y. Guo, “Automatic sleep stage scoring using time-frequency analysis and stacked sparse autoencoders,” Ann. Biomed. Eng., vol. 44, no. 5, pp. 1587–1597, 2016.
[89] H. Phan, F. Andreotti, N. Cooray, O. Y. Chen, and M. De Vos, “Joint classification and prediction CNN framework for automatic sleep stage classification,” IEEE Trans. Biomed. Eng., vol. 66, no. 5, pp. 1285–1296, May 2019.
[90] N. D. Mendola, T. Chen, Q. Gu, M. Eberhardt, and S. Saydah, “Prevalence of total, diagnosed, and undiagnosed diabetes among adults: United States, 2013–2016,” in NCHS Data Brief. Hyattsville, MD, USA: U.S. National Center for Health Statistics, 2018, pp. 1–8.
[91] W. Gu et al., “SugarMate: Non-intrusive blood glucose monitoring with smartphones,” Proc. ACM Interact., Mobile, Wearable Ubiquitous Technol., vol. 1, no. 3, pp. 1–27, Sep. 2017.
[92] M. He, W. Gu, Y. Kong, L. Zhang, C. J. Spanos, and K. M. Mosalam, “CausalBG: Causal recurrent neural network for the blood glucose inference with IoT platform,” IEEE Internet Things J., vol. 7, no. 1, pp. 598–610, Jan. 2020.
[93] Y. Dong, R. Wen, Z. Li, K. Zhang, and L. Zhang, “Clu-RNN: A new RNN based approach to diabetic blood glucose prediction,” in Proc. IEEE 7th Int. Conf. Bioinf. Comput. Biol. (ICBCB), Mar. 2019, pp. 50–55.
[94] T. Zhu, K. Li, J. Chen, P. Herrero, and P. Georgiou, “Dilated recurrent neural networks for glucose forecasting in type 1 diabetes,” J. Healthcare Inform. Res., vol. 4, pp. 308–324, Apr. 2020.
[95] I. Fox, L. Ang, M. Jaiswal, R. Pop-Busui, and J. Wiens, “Deep multi-output forecasting: Learning to accurately predict blood glucose trajectories,” in Proc. 24th Int. Conf. Knowl. Discovery Data Mining, 2018, pp. 1387–1395.
[96] C. Marling and R. C. Bunescu, “The OhioT1DM dataset for blood glucose level prediction,” in Knowledge Discovery Healthcare Data Co-Located 27th Int. Joint Conf. Artif. Intell., 2018, pp. 60–63.
[97] P. Pesl et al., “An advanced bolus calculator for type 1 diabetes: System architecture and usability results,” IEEE J. Biomed. Health Inform., vol. 20, no. 1, pp. 11–17, Jan. 2016.
[98] R. R. Singh, S. Conjeti, and R. Banerjee, “A comparative evaluation of neural network classifiers for stress level analysis of automotive drivers using physiological signals,” Biomed. Signal Process. Control, vol. 8, no. 6, pp. 740–754, Nov. 2013.
[99] P. A. Mastorocostas and J. B. Theocharis, “A stable learning algorithm for block-diagonal recurrent neural networks: Application to the analysis of lung sounds,” IEEE Trans. Syst., Man, B (Cybernetics), vol. 36, no. 2, pp. 242–254, Apr. 2006.
[100] M. Cheng, W. J. Sori, F. Jiang, A. Khan, and S. Liu, “Recurrent neural network based classification of ECG signal features for obstruction of sleep apnea detection,” in Proc. IEEE Int. Conf. Comput. Sci. Eng. (CSE) IEEE Int. Conf. Embedded Ubiquitous Comput. (EUC), Jul. 2017, pp. 199–202.
[101] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2016, pp. 770–778.
[102] J. Liu, Y. Wu, Z. Yuan, and X. Sun, “Blood pressure prediction with multi-cue based RBF and LSTM model,” in Proc. 9th Int. Conf. Inf. Technol. Med. Educ. (ITME), 2018, pp. 72–76.
[103] A. Bahrami Rad, M. Zabihi, Z. Zhao, M. Gabbouj, A. K. Katsaggelos, and S. Särkkä, “Automated polysomnography analysis for detection of non-apneic and non-hypopneic arousals using feature engineering and a bidirectional LSTM network,” 2019, arXiv:1909.02971.
[104] Y.-A. Choi et al., “Deep learning-based stroke disease prediction system using real-time bio signals,” Sensors, vol. 21, no. 13, p. 4269, Jun. 2021.
[105] S. Fawaz, K. S. Sim, and S. C. Tan, “Encoding rich frequencies for classification of stroke patients EEG signals,” IEEE Access, vol. 8, pp. 135811–135820, 2020.
[106] W. Sansiagi, E. C. Djamal, D. Djajasasmita, and A. Wulandari, “Post-Stroke identification of EEG signals using recurrent neural networks and long short-term memory,” Int. J. Adv. Intell. Inform., vol. 7, no. 2, pp. 137–150, 2021.
[107] C. Wang, T. A. Olugbade, A. Mathur, A. C. De C. Williams, N. D. Lane, and N. Bianchi-Berthouze, “Recurrent network based automatic detection of chronic pain protective behavior using MoCap and sEMG data,” in Proc. 23rd Int. Symp. Wearable Comput., Sep. 2019, pp. 225–230.
[108] Y. Li, S. Ghosh, J. Joshi, and S. Oviatt, “LSTM-DNN based approach for pain intensity and protective behaviour prediction,” in Proc. 15th IEEE Int. Conf. Autom. Face Gesture Recognit., Nov. 2020, pp. 819–823.
[109] X. Yuan and M. Mahmoud, “ALANet: Autoencoder-LSTM for pain and protective behaviour detection,” in Proc. 15th IEEE Int. Conf. Autom. Face Gesture Recognit., Nov. 2020, pp. 824–828.
[110] R. F. Rojas, J. Romero, J. Lopez-Aparicio, and K.-L. Ou, “Pain assessment based on fNIRS using Bi-LSTM RNNs,” in Proc. 10th Int. IEEE/EMBS Conf. Neural Eng. (NER), May 2021, pp. 399–402.
[111] A. Bengoetxea et al., “Physiological modules for generating discrete and rhythmic movements: Action identification by a dynamic recurrent neural network,” Frontiers Comput. Neurosci., vol. 8, p. 100, Sep. 2014.
[112] R. Salloum and C.-C. J. Kuo, “ECG-based biometrics using recurrent neural networks,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Mar. 2017, pp. 2062–2066.
[113] X. Zhang, Y. Zhang, L. Zhang, H. Wang, and J. Tang, “Ballistocardiogram based person identification and authentication using recurrent neural networks,” in Proc. 11th Int. Congr. Image Signal Process., BioMed. Eng. Inform. (CISP-BMEI), Oct. 2018, pp. 1–5.
[114] S. Mao, Z. Zhang, Y. Khalifa, C. Donohue, J. L. Coyle, and E. Sejdić, “Neck sensor-supported hyoid bone movement tracking during swallowing,” Roy. Soc. Open Sci., vol. 6, no. 7, Jul. 2019, Art. no. 181982.
[115] S. Mao, A. Sabry, Y. Khalifa, J. L. Coyle, and E. Sejdić, “Estimation of laryngeal closure duration during swallowing without invasive X-rays,” Future Gener. Comput. Syst., vol. 115, pp. 610–618, Feb. 2021.
[116] Y. Khalifa, C. Donohue, J. L. Coyle, and E. Sejdić, “Upper esophageal sphincter opening segmentation with convolutional recurrent neural networks in high resolution cervical auscultation,” IEEE J. Biomed. Health Informat., vol. 25, no. 2, pp. 493–503, Feb. 2021.
[117] Y. Khalifa, C. Donohue, J. L. Coyle, and E. Sejdić, “On the robustness of high-resolution cervical auscultation-based detection of upper esophageal sphincter opening duration in diverse populations,” Proc. SPIE, vol. 11730, Apr. 2021, Art. no. 117300M.
[118] A. Zhao, L. Qi, J. Dong, and H. Yu, “Dual channel LSTM based multi-feature extraction in gait for diagnosis of neurodegenerative diseases,” Knowl. Based Syst., vol. 145, pp. 91–97, Apr. 2018.
[119] T. Zhen, L. Yan, and P. Yuan, “Walking gait phase detection based on acceleration signals using LSTM-DNN algorithm,” Algorithms, vol. 12, no. 12, p. 253, Nov. 2019.
[120] J. Gao, P. Gu, Q. Ren, J. Zhang, and X. Song, “Abnormal gait recognition algorithm based on LSTM-CNN fusion network,” IEEE Access, vol. 7, pp. 163180–163190, 2019.
[121] S. Tortora, S. Ghidoni, C. Chisari, S. Micera, and F. Artoni, “Deep learning-based BCI for gait decoding from EEG with LSTM recurrent neural network,” J. Neural Eng., vol. 17, no. 4, Jul. 2020, Art. no. 046011.
[122] T. Gandhi, B. K. Panigrahi, and S. Anand, “A comparative study of wavelet families for EEG signal classification,” Neurocomputing, vol. 74, no. 17, pp. 3051–3057, 2011.
[123] Y. Xu, K. Hong, J. Tsujii, and E. I.-C. Chang, “Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries,” J. Amer. Med. Inform. Assoc., vol. 19, no. 5, pp. 824–832, Sep. 2012.
[124] S. Chauhan and L. Vig, “Anomaly detection in ECG time signals via deep long short-term memory networks,” in Proc. IEEE Int. Conf. Data Sci. Adv. Analytics (DSAA), Oct. 2015, pp. 1–7.
[125] Y. Qiu, F. Xiao, and H. Shen, “Elimination of power line interference from ECG signals using recurrent neural networks,” in Proc. 39th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Jul. 2017, pp. 2296–2299.
[126] M. F. Stollenga, W. Byeon, M. Liwicki, and J. Schmidhuber, “Parallel multi-dimensional LSTM, with application to fast biomedical volumetric image segmentation,” in Proc. 28th Int. Conf. Neural Inf. Process. Syst., vol. 2, 2015, pp. 2998–3006.
[127] A. Patane and M. Kwiatkowska, “Calibrating the classifier: Siamese neural network architecture for end-to-end arousal recognition from ECG,” in Proc. Int. Conf. Mach. Learn., Optim., Data Sci. Cham, Switzerland: Springer, 2018, pp. 1–13.
[128] S. Latif, M. Usman, R. Rana, and J. Qadir, “Phonocardiographic sensing using deep learning for abnormal heartbeat detection,” IEEE Sensors J., vol. 18, no. 22, pp. 9393–9400, Sep. 2018.
[129] J. Bradbury, S. Merity, C. Xiong, and R. Socher, “Quasi-recurrent neural networks,” 2016, arXiv:1611.01576.
[130] T. Lei, Y. Zhang, S. I. Wang, H. Dai, and Y. Artzi, “Simple recurrent units for highly parallelizable recurrence,” 2017, arXiv:1709.02755.
[131] S. Li, W. Li, C. Cook, C. Zhu, and Y. Gao, “Independently recurrent neural network (IndRNN): Building a longer and deeper RNN,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 5457–5466.
[132] J. van der Westhuizen and J. Lasenby, “The unreasonable effectiveness of the forget gate,” 2018, arXiv:1804.04849.
[133] A. Vaswani et al., “Attention is all you need,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., 2017, pp. 6000–6010.
[134] Z. Wang, Y. Ma, Z. Liu, and J. Tang, “R-transformer: Recurrent neural network enhanced transformer,” 2019, arXiv:1907.05572.
[135] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” 2015, arXiv:1511.06434.
[136] C. Donahue, J. McAuley, and M. Puckette, “Adversarial audio synthesis,” 2018, arXiv:1802.04208.
[137] T. Golany and K. Radinsky, “PGANS: Personalized generative adversarial networks for ECG synthesis to improve patient-specific deep ECG classification,” in Proc. AAAI Conf. Artif. Intell., vol. 33, 2019, pp. 557–564.
[138] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, “Improved training of Wasserstein GANs,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., Dec. 2017, pp. 5769–5779.
[139] P. Singh and G. Pradhan, “A new ECG denoising framework using generative adversarial network,” IEEE/ACM Trans. Comput. Biol. Bioinf., vol. 18, no. 2, pp. 759–764, Mar. 2021.
[140] D. Rajan and J. J. Thiagarajan, “A generative modeling approach to limited channel ECG classification,” in Proc. 40th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., Jul. 2018, pp. 2571–2574.
[141] D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” 2013, arXiv:1312.6114.
[142] V. V. Kuznetsov, V. A. Moskalenko, and N. Yu. Zolotykh, “Electrocardiogram generation and feature extraction using a variational autoencoder,” 2020, arXiv:2002.00254.

Shitong Mao (Graduate Student Member, IEEE) received the B.Sc. and M.Sc. degrees from Harbin Institute of Technology, Harbin, China, in 2008 and 2010, respectively. He is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA, USA.
His current research interests include pattern recognition, machine learning, biomedical signal processing, computer vision, electrical devices development, and their applications in healthcare systems.

Ervin Sejdić (Senior Member, IEEE) received the B.E.Sc. and Ph.D. degrees in electrical engineering from the University of Western Ontario, London, ON, Canada, in 2002 and 2008, respectively.
From 2008 to 2010, he was a Post-Doctoral Fellow with the University of Toronto, Toronto, ON, Canada, with a cross-appointment with the Bloorview Kids Rehab, Toronto, Canada’s largest children’s rehabilitation teaching hospital. From 2010 to 2011, he was a Research Fellow with the Harvard Medical School, Boston, MA, USA, with a cross-appointment with the Beth Israel Deaconess Medical Center, Boston. In 2011, he joined the Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA, as a tenure-track Assistant Professor, where he was promoted to a tenured Associate Professor in 2017. He holds secondary appointments with the Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, where he is also with the Department of Biomedical Informatics, School of Medicine, and the Intelligent Systems Program, School of Computing and Information. He joined the University of Toronto, Toronto, and North York General Hospital, Toronto, in 2021. His current research interests include biomedical signal processing, gait analyses, swallowing difficulties, advanced information systems in medicine, rehabilitation engineering, assistive technologies, and anticipatory medical devices.
Dr. Sejdić was a recipient of many awards. He was a recipient of two prestigious awards from the Natural Sciences and Engineering Research Council of Canada as a Graduate Student. In 2010, he was a recipient of the Melvin First Young Investigator’s Award from the Institute for Aging Research at Hebrew Senior Life, Boston, MA, USA. In 2016, President Obama named him as a recipient of the Presidential Early Career Award for Scientists and Engineers, the highest honor bestowed by the U.S. Government on science and engineering professionals in the early stages of their independent research careers. In 2017, he was a recipient of the National Science Foundation CAREER Award, which is the National Science Foundation’s most prestigious award in support of career-development activities of those scholars who most effectively integrate research and education within the context of the mission of their organization.