Real-Time EEG-Based Driver Drowsiness Detection Based On Convolutional Neural Network With Gumbel-Softmax Trick
IEEE SENSORS JOURNAL, VOL. 25, NO. 1, 1 JANUARY 2025
1558-1748 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: MIT-World Peace University. Downloaded on February 19,2025 at 18:12:03 UTC from IEEE Xplore. Restrictions apply.
FENG et al.: REAL-TIME EEG-BASED DRIVER DROWSINESS DETECTION BASED ON CNN 1861
drowsiness by detecting facial landmarks, measuring eye closure duration, monitoring yawning frequency, and estimating head pose [7]. However, these methods are highly sensitive to variations in external conditions, such as occlusion, illumination, and clothing effects, and generally produce unreliable identification results [8]. In contrast, physiological signal-based approaches assess the driver’s alertness level and detect drowsy states by analyzing physiological signals such as electrooculography (EOG), electrocardiography (ECG), and electromyography (EMG). Although these detection methodologies have made notable strides, they are still hampered by limitations, including sensitivity to neural activity and challenges in wearable comfort and practicality. In comparison, electroencephalography (EEG) can more effectively measure the brain dynamics associated with drowsiness and reflect functional and physiological changes in the central nervous system [9]. Therefore, EEG-based driver drowsiness detection has gradually attracted the attention of researchers.

Efforts to model EEG-based drowsiness recognition fall into two main categories: machine learning and deep learning. Initially, driver drowsiness recognition was accomplished by extracting handcrafted features from EEG signals in the time–frequency or nonlinear domains, which were fed into conventional machine learning models. However, these approaches typically require substantial domain expertise and yield suboptimal performance. In contrast, deep learning, with its advanced architecture, allows end-to-end learning of task-relevant high-level feature representations directly from raw EEG data, thus achieving superior performance. Convolutional neural networks (CNNs), as one of the prominent deep models, exhibit enhanced efficiency in modeling EEG signals for classification tasks [10]. Despite the remarkable performance of CNN-based architectures in driver drowsiness recognition, they tend to underperform in cross-subject classification tasks due to significant intersubject variability in EEG signals [11]. In addition, EEG recordings are usually multichannel, yet not all EEG channels correspond to the brain regions activated by driver drowsiness [12]. Therefore, inputting all channels increases the model complexity and computational burden, which is not conducive to real-time driver drowsiness detection.

In response to these limitations, we propose a novel deep learning framework based on the CNN and Gumbel-Softmax trick. First, we employ Gumbel-Softmax discrete sampling to learn, in an end-to-end manner, a selection of channels from multichannel EEG data that is relevant to the drowsiness classification task. This approach effectively reduces data dimensionality and enhances computational efficiency. Subsequently, a separable CNN is utilized to capture crucial feature information from the time–frequency domain. Finally, a fully connected (FC) layer and a softmax layer are applied to complete the drowsiness classification. The main contributions of the article are summarized as follows.

1) We utilize a separable CNN model to capture the spatiotemporal correlations in EEG signals, in conjunction with the Gumbel-Softmax trick to optimize the channel selection and network parameters, achieving an end-to-end approach for driver drowsiness recognition based on EEG signals.
2) To prevent the selection of duplicate channels, we impose penalties on the row sums of weight matrices in the selection neurons, which enhances model efficiency and reduces computational costs.
3) The performance of our network architecture on driver drowsiness recognition is better than that of eight other recent works, and, in particular, achieves an average accuracy of 80.84% and an F1 score of 79.65% for cross-subject drowsiness classification, demonstrating the effectiveness and broad applicability of our methodology.
4) Faced with the scarcity of work simulating EEG-based real-time drowsiness detection, we leverage the high accuracy and efficiency of the proposed model to develop a graphical user interface (GUI) for simulating real-time driver drowsiness detection, providing a valuable application for intelligent driver drowsiness detection in the transportation industry.

The remainder of this article is organized as follows. Section II reviews research related to EEG-based driver drowsiness detection and Gumbel-Softmax methods in neural networks. Section III describes the public dataset and the approach used in the study, followed by the results and discussion of the validated methods in Sections IV and V, respectively. Finally, we conclude the paper in Section VI.

II. RELATED WORKS

This section reviews recent advances in EEG-based driver drowsiness recognition and introduces the principles and applications of the Gumbel-Softmax trick to neural networks.

A. EEG-Based Driver Drowsiness Detection

With the rapid iteration of neural network technologies, EEG-based drowsiness recognition methods have evolved from shallow machine learning to deep architectures, which provides a robust foundation for developing efficient and accurate driver fatigue detection systems. In general, shallow models achieve reasonable prediction ability with minimal complexity and a modest requirement for training data, and rely on extracting features according to prior knowledge or expert-informed approaches [13]. In contrast, deep learning models incorporate a learned representation of the data and can find intrinsic feature expressions from the training data.

The recognition of drowsiness by shallow machine learning models first requires transforming the raw EEG signal into a set of feature vectors based on traditional handcrafted feature extraction approaches and then feeding them into a classifier or regression model for detection. Along these lines, Ogino and Mitsukura [14] compared the features extracted from EEG by three methods, namely power spectral density (PSD), autoregressive (AR), and multiscale entropy (MSE), and adopted step-wise linear discriminant analysis (SWLDA) to sieve out informative PSD features, which were then fed into a support vector machine (SVM) with a radial basis function (RBF)
1862 IEEE SENSORS JOURNAL, VOL. 25, NO. 1, 1 JANUARY 2025
kernel to detect drowsiness, achieving a classification accuracy of 72.7% through tenfold cross-validation evaluation. Shahbakhti et al. [15] extracted blink-related features from low-channel frontal EEG data using a variational mode method and compared the synergistic effect of blinking and EEG features before and after filtering for driver drowsiness detection based on the SVM classifier, with a significant improvement in average accuracy of 6.9 percentage points. In another study, Chen et al. [16] mined nonlinear features, including sample entropy (SaEn), approximate entropy (ApEn), and Renyi entropy (RenEn), from the EEG subbands and fused eyelid motion information to recognize drowsiness using an extreme learning machine (ELM), achieving high detection accuracies. Given the high variability between individuals, specific EEG features and shallow models make it difficult to achieve outstanding cross-subject drowsiness detection performance.

The emergence of deep learning models has led to a substantial leap forward in the performance and accuracy of myriad classification tasks, thereby garnering substantial scholarly interest. For instance, Gao et al. [5] introduced core blocks to extract temporal dependencies and dense layers to fuse spatial information from the EEG and proposed an EEG-based spatiotemporal CNN (ESTCNN) to detect driver drowsiness with a classification accuracy of 97.37%. To assess the performance of drowsiness recognition across subjects, Cui et al. [17] designed an interpretable CNN model, achieving an average accuracy of 78.35% on 11 participants under leave-one-out cross-subject evaluation, and explained that the network identifies biologically significant features from EEG signals. In addition, Chen et al. [18] proposed a self-attention channel-connectivity capsule network (SACC-CapsNet) for EEG-based driver drowsiness detection to study the critical temporal information and significant channels, with comparative assessments validating its superiority over prevailing methodologies. Despite these successes, deep learning architectures grapple with an array of challenges, such as more parameters, heavy computation, and insufficient interpretability, and are also affected by extraneous EEG channels [19]. It is therefore of great practical significance to use an EEG channel selection strategy for network parameter optimization to construct an efficient real-time driver drowsiness detection system.

B. Gumbel-Softmax in Neural Networks

Categorical variables are innately suited for embodying discrete structures. Nonetheless, due to the inability to backpropagate through samples, stochastic neural networks rarely use categorical latent variables, since training models with such variables is a formidable task [20]. The introduction of Gumbel-Softmax addressed this problem by replacing nondifferentiable samples from a discrete distribution with a differentiable approximation.

The concrete distribution has a pivotal characteristic: it can smoothly transition into a categorical distribution, and it is a probability distribution that can assign a probability to N different classes [21]. There are three distinct parameterizations of the same distribution, namely the normalized probability p, the nonnormalized probability θ, and the nonnormalized logarithmic probability log θ. The probability $p_i$ for each class i ∈ D = {1, 2, . . . , N} can be expressed as

$$p_i = \frac{\theta_i}{\sum_{j=1}^{N} \theta_j} = \frac{\exp(\alpha_i / T)}{\sum_{j=1}^{N} \exp(\alpha_j / T)} = \frac{\exp(\alpha_i / T)}{Z_D(\theta)} \tag{1}$$

where T is the temperature parameter and $Z_D(\theta)$ denotes the partition function that normalizes the distribution. Frequently, T is set to 1, making α a nonnormalized logarithmic probability, and (1) is known as the softmax function. Assuming z is a class variable, the Gumbel-Softmax trick can efficiently draw the sample z from a categorical distribution with probabilities $p_i$

$$z = \mathrm{one\_hot}\left\{\arg\max_{i}\left[g_i + \log(p_i)\right]\right\} \tag{2}$$

where $g_i$ is a sample drawn from the Gumbel(0, 1) distribution, and it can be generated from a uniform distribution by the inverse of the Gumbel distribution, denoted as

$$g_i = -\log(-\log(u_i)) \tag{3}$$

where $u_i$ represents a value sampled from the Uniform(0, 1) distribution. The discrete process is implemented by sampling $g_i$ from a fixed distribution, removing the nondifferentiable sampling step, and facilitating a differentiable backpropagation process within the neural network [22].

Capitalizing on these strengths, the Gumbel-Softmax trick was introduced to the redetection module using the VGG19 network in the visual tracking domain for accurate sampling, producing a smaller number of output target candidate boxes, thereby reducing the computational effort of the algorithm and improving the long-term tracking performance [23]. In the training of a mixture of variational autoencoder (MVAE) networks, Ye and Bors [24] proposed a component selection approach that generates dropout masking parameters from a differentiable categorical Gumbel-Softmax distribution by controlling the number of variational autoencoder (VAE) components, ensuring end-to-end backpropagation throughout the network’s training. Confronting the discontinuity of neural architecture search (NAS) algorithms in the sampling of candidates, Pang et al. [22] adopted the Gumbel-Softmax scheme, converted the discrete distribution logits into continuous probabilities that both sum to unity and are amenable to differentiation, and proposed the Gumbel-Softmax-based NAS (GS-NAS) to automate the architectural design of the hierarchical functional brain network (FBN) decomposition for the deep belief network (DBN). So far, Gumbel-Softmax approaches have become increasingly prevalent within neural networks for addressing differentiable optimization problems.

III. MATERIALS AND METHODS

A. Dataset Description

To study driver drowsiness detection in the driving environment, we used an openly available sustained-attention driving task dataset released in 2019 by Cao et al. [25]. The dataset encompasses 62 EEG recordings of 27 students or staff members, aged between 22 and 28 years, from the National Chiao Tung University. Volunteers took part in a 90-min sustained-attention driving experiment multiple times on the same or different days.
Fig. 2. Proposed deep learning network architecture based on the CNN and the Gumbel-Softmax trick. $x_n$ denotes the feature vector derived from EEG channel n. $z_k$ represents the output of the kth selection neuron. The green arrows indicate the input with the highest probability of selection for each neuron.
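The selection stage in Fig. 2 can be illustrated with a small sketch in which each of K selection neurons forms a relaxed one-hot weight vector over the N input channels and outputs the weighted sum of the channel signals. This follows the standard Gumbel-Softmax relaxation of [20] and [33]; since the authors' exact layer definition is not reproduced in this excerpt, the weight layout, the function names, and the toy data below are assumptions.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + g) / T), g ~ Gumbel(0, 1)."""
    noisy = []
    for a in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        noisy.append((a - math.log(-math.log(u))) / temperature)
    shift = max(noisy)  # numerical stability
    exps = [math.exp(v - shift) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_channels(eeg, logits_per_neuron, temperature, rng):
    """Map N channels to K outputs via z_k = sum_n w_nk * x_n.

    eeg: list of N channels, each a list of samples (x_n).
    logits_per_neuron: K lists of N learnable logits (hypothetical layout).
    """
    n_samples = len(eeg[0])
    outputs, weights = [], []
    for logits in logits_per_neuron:
        w = gumbel_softmax(logits, temperature, rng)
        z = [sum(w[n] * eeg[n][t] for n in range(len(eeg)))
             for t in range(n_samples)]
        outputs.append(z)
        weights.append(w)
    return outputs, weights


rng = random.Random(0)
eeg = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]  # N = 3 channels
logits = [[4.0, 0.0, 0.0], [0.0, 0.0, 4.0]]                # K = 2 neurons
selected, weights = select_channels(eeg, logits, temperature=0.1, rng=rng)
```

At a low temperature each weight vector is close to one-hot, so every output approximates a single input channel; at a high temperature the outputs are smoother mixtures, which keeps gradients informative early in training.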
follows:
$$\Gamma(P) = \delta \sum_{n=1}^{N} \mathrm{ReLU}\left(\sum_{k=1}^{K} p_{nk} - \theta\right) \tag{6}$$
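The regularizer in (6) can be checked on a toy example. The sketch below assumes that the entry in row n and column k of P is the probability that selection neuron k picks channel n, so that a row sum above the threshold θ signals a channel chosen by more than one neuron. The function name and the two example matrices are hypothetical; δ = 0.1 and θ = 1.1 are values quoted in the text.

```python
def duplicate_channel_penalty(probs, delta, theta):
    """Eq. (6): delta * sum_n ReLU(sum_k p[n][k] - theta).

    probs: N x K nested list; probs[n][k] is the probability that
    selection neuron k picks channel n.
    """
    penalty = 0.0
    for row in probs:
        penalty += max(0.0, sum(row) - theta)  # ReLU of the row-sum excess
    return delta * penalty


# Both neurons concentrate on channel 0: its row sum (1.8) exceeds theta.
duplicated = [[0.90, 0.90], [0.05, 0.05], [0.05, 0.05]]
# The neurons prefer different channels: no row sum exceeds theta.
distinct = [[0.90, 0.05], [0.05, 0.90], [0.05, 0.05]]

penalty_duplicated = duplicate_channel_penalty(duplicated, delta=0.1, theta=1.1)
penalty_distinct = duplicate_channel_penalty(distinct, delta=0.1, theta=1.1)
```

Only the duplicated configuration is penalized, which is the behavior needed to discourage several selection neurons from converging on the same channel.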
Fig. 4. Average performance for 11 subjects at training epochs from 1 to 200.

Fig. 5. Accuracy, precision, recall, and F1-score for each subject calculated using the nested LOSOCV method.

TABLE II
MEAN AND STANDARD DEVIATION OF ASSESSMENT INDICATORS FOR ALL SUBJECTS

of 0.1 for the regularization weight δ, then set the temperature parameter T to 10 and decayed it to 0.1 as the epochs increased, decayed the regularization threshold θ from 3 to 1.1 [33], and finally adjusted the number of selection neurons to obtain the best recognition result.
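The text above gives only the endpoints of the two schedules (T from 10 to 0.1 and θ from 3 to 1.1), not their functional form. The sketch below assumes a geometric (exponential) decay over the training epochs as one plausible choice; the helper name and the epoch count of 80 used in the example are illustrative assumptions.

```python
def geometric_decay(start, end, epoch, total_epochs):
    """Interpolate geometrically from start (first epoch) to end (last epoch)."""
    if total_epochs <= 1:
        return end
    frac = epoch / (total_epochs - 1)
    return start * (end / start) ** frac


total_epochs = 80  # illustrative value, not prescribed by the schedule itself
schedule = [
    (
        geometric_decay(10.0, 0.1, epoch, total_epochs),  # temperature T
        geometric_decay(3.0, 1.1, epoch, total_epochs),   # threshold theta
    )
    for epoch in range(total_epochs)
]
```

A high initial temperature keeps the selection weights soft so that all channels receive gradient, while the final low temperature pushes each selection neuron toward a single channel. A linear schedule would satisfy the same endpoints if preferred.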
IV. RESULTS

A. Overall Performance of Drowsiness Recognition

To evaluate model performance, we adopt four standard metrics, namely accuracy (Acc), precision (Pre), recall (Rec), and F1-score (F1), to comprehensively assess the classification performance of the network.

Considering the influence of factors such as gender, age, and physical condition among different subjects, the models trained in the inner loop of the nested LOSOCV approach exhibit variations. To select the optimal model, we employed a strategy similar to a voting mechanism. Specifically, we analyzed the model validation outcomes from the inner loop and conducted a vote on the four evaluation metrics (Acc, Pre, Rec, and F1) across the validation results of different subjects. For each metric, the model corresponding to the maximum value received one additional vote, and the model amassing the most votes was considered optimal. In the case of a tie in the vote count, the model with the highest F1 was judged optimal. Since the F1 takes both Pre and Rec into account, it is a comprehensive performance index for evaluating classification algorithms [46].

Through preliminary training and validation of our proposed model, we observed that the model exhibited excellent performance in the driver drowsiness classification task when configured with eight selection neurons. To optimize the training of the model, we focused on the selection of the number of epochs. Fig. 4 shows the trend of the average performance of all subjects over training epochs from 1 to 200 with eight selection neurons. The graph demonstrates that as the number of epochs increases, the average performance of the model exhibits an overall upward trend until reaching an optimal epoch count. It is worth noting that when the number of training epochs is around 80, the four performance evaluation metrics reach their peak values, all exceeding 80%. This indicates that the overall performance of the model is favorable at this number of training epochs. Subsequently, as the number of training epochs ranges from 100 to 200, the overall performance evaluation curve fluctuates up and down.

In this study, we determined the optimal number of epochs to be 80 by analyzing the results in Fig. 4. In addition, we evaluated the classification performance of driver drowsiness for each participant using the LOSOCV method, with the selection neuron count fixed at 8, as depicted in Fig. 5. During each test stage, we calculated the average Acc, Pre, Rec, and F1. As can be observed from Fig. 5, the precision for each subject was higher than the recall, indicating the better performance of our proposed model in identifying drowsiness states. In practical driving scenarios, we would prefer higher precision values to minimize spurious classification of drowsiness as alertness [47]. However, due to individual differences among the subjects, the model’s predictive ability significantly decreased for subjects indexed as 02, 07, and 11, which is similar to the prior work [48]. In addition, Table II presents the average classification performance and standard deviations of all participants to analyze the differences in the results. Our proposed model achieved an Acc of 80.84% and an F1 of 79.65%, with corresponding standard deviations of 8.22% and 8.39%, respectively. This indicates that our model attains satisfactory performance in cross-subject driver drowsiness detection, thereby validating its robustness and practical utility.

B. Comparison With Existing Works

To highlight the superiority of our proposed model for EEG-based cross-subject drowsiness detection within the
driving context, we compared it with other recent advanced baselines, which are briefly described below.

1) CAGNN [49]: It uses a self-attention mechanism-based connection-aware graph neural network and introduces a squeeze-and-excitation block to capture important features, generating task-relevant connected networks via end-to-end training.
2) Tree-FL+CNN-GAP [50]: A fusion model based on tree-based federated learning and CNNs.
3) Conformer [51]: It utilizes 1-D temporal and spatial convolutional layers to learn low-level local features and a self-attention module to extract global correlations in local temporal features.
4) TSANet [52]: A temporal–spectral fused and attention-based deep neural network model.
5) SedRVFL [53]: A spectral-ensemble deep random vector functional link network that focuses on feature learning in the frequency domain.
6) TFAC-Net [31]: It first employs the continuous wavelet transform to generate the corresponding spectral–temporal representation and then adopts the temporal-frequential attention mechanism to reveal critical feature regions.
7) FGloWD-edRVFL [54]: It utilizes the CNN to extract features integrated into a deep random vector functional link network.
8) TFormer [55]: A time–frequency transformer that automatically learns global time–frequency patterns from raw EEG data.

TABLE III
COMPARISON OF THE EEG-BASED CROSS-SUBJECT CLASSIFICATION ACCURACIES (%) BETWEEN THE PROPOSED MODEL AND RECENT EXISTING WORK ON THE SUSTAINED-ATTENTION DRIVING TASK DATASET

In Table III, we can observe that the proposed approach achieved superior performance in EEG-based driver drowsiness detection, attributed to the learnable EEG channel selection via the Gumbel-Softmax methodology and the effective spatiotemporal feature extraction by the separable CNN. Compared with the methods based on deep learning, our model improves the recognition accuracy by an average of 4.61% and a maximum of 10.34%. This indicates that our framework can effectively combine the channel selection network and separable CNN architecture to mine the potential information of EEG channels and extract the crucial features hidden in the spatial and temporal domains of multichannel EEG signals. Compared with the state-of-the-art (SOTA) TFormer model introduced by Li et al. [55], our proposed network achieves competitive performance, with an average accuracy improvement of 1.16%. This further attests to the capability of our framework to learn more discriminative and task-relevant feature representations for driver drowsiness detection through an end-to-end approach. Despite the reduction in available EEG channel information caused by the discrete sampling inherent in the Gumbel-Softmax trick, the combination of efficient, embedded channel selection and the classification network has achieved superior or comparable classification performance, indicating its potential practical value.

In the comparison with eight methodologies, our model emerged as the foremost in driver drowsiness assessment, demonstrating a significant advantage over other methods. The outcome underscores the efficacy of our framework in leveraging the Gumbel-Softmax technique to select effective EEG channels, combined with a separable CNN that can robustly capture information from both temporal and spatial dimensions.

C. Simulation for Real-Time Detection

According to our survey of relevant studies, few works simulate real-time EEG-based drowsiness detection due to the complexity of EEG data. Therefore, to verify the feasibility of the proposed model in practical applications, and considering that the artifacts and blink contamination of the raw EEG signals require manual correction [56], we employ preprocessed, openly accessible EEG signals [27] to engineer a GUI that emulates real-time driver drowsiness detection. The overall GUI consists of three parts: an administrator login, a user login and registration, and the core drowsiness detection functionality. Considering the relative importance of these three interfaces, we place the presentation of the first two interfaces in the Appendix, and the drowsiness detection interface is shown in Fig. 6.
TABLE IV
ABLATION STUDY FOR CROSS-SUBJECT CLASSIFICATION PERFORMANCE ON THE SUSTAINED-ATTENTION DRIVING TASK DATASET
REFERENCES

[1] T. Stewart, “Overview of motor vehicle traffic crashes in 2021,” Nat. Highway Traffic Saf. Admin., Washington, DC, USA, Tech. Rep. DOT HS 813 435, Apr. 2023.
[2] A. Chowdhury, R. Shankaran, M. Kavakli, and M. M. Haque, “Sensor applications and physiological features in drivers’ drowsiness detection: A review,” IEEE Sensors J., vol. 18, no. 8, pp. 3055–3067, Apr. 2018.
[3] X. Li, S. Lai, and X. Qian, “DBCFace: Towards pure convolutional neural network face detection,” IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 4, pp. 1792–1804, Apr. 2022.
[4] B. Mandal, L. Li, G. S. Wang, and J. Lin, “Towards detection of bus driver fatigue based on robust visual analysis of eye state,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 3, pp. 545–557, Mar. 2017.
[5] Z. Gao et al., “EEG-based spatio-temporal convolutional neural network for driver fatigue evaluation,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 9, pp. 2755–2763, Sep. 2019.
[6] M. Shen, B. Zou, X. Li, Y. Zheng, L. Li, and L. Zhang, “Multi-source signal alignment and efficient multi-dimensional feature classification in the application of EEG-based subject-independent drowsiness detection,” Biomed. Signal Process. Control, vol. 70, Sep. 2021, Art. no. 103023.
[7] Y. Gu, Y. Jiang, T. Wang, P. Qian, and X. Gu, “EEG-based driver mental fatigue recognition in COVID-19 scenario using a semi-supervised multi-view embedding learning model,” IEEE Trans. Intell. Transp. Syst., vol. 25, no. 1, pp. 859–868, Jan. 2024.
[8] J. R. Paulo, G. Pires, and U. J. Nunes, “Cross-subject zero calibration driver’s drowsiness detection: Exploring spatiotemporal image encoding of EEG signals for convolutional neural network classification,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 29, pp. 905–915, 2021.
[9] X. Zhang et al., “Fatigue detection with covariance manifolds of electroencephalography in transportation industry,” IEEE Trans. Ind. Informat., vol. 17, no. 5, pp. 3497–3507, May 2021.
[10] Y. Yang, Z. Gao, Y. Li, and H. Wang, “A CNN identified by reinforcement learning-based optimization framework for EEG-based state evaluation,” J. Neural Eng., vol. 18, no. 4, May 2021, Art. no. 046059.
[11] X. Gu et al., “EEG-based brain-computer interfaces (BCIs): A survey of recent studies on signal sensing technologies and computational intelligence approaches and their applications,” IEEE/ACM Trans. Comput. Biol. Bioinf., vol. 18, no. 5, pp. 1645–1666, Sep. 2021.
[12] A. Ahmadi, H. Bazregarzadeh, and K. Kazemi, “Automated detection of driver fatigue from electroencephalography through wavelet-based connectivity,” Biocybern. Biomed. Eng., vol. 41, no. 1, pp. 316–332, Jan. 2021.
[13] G. Sikander and S. Anwar, “Driver fatigue detection systems: A review,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 6, pp. 2339–2352, Jun. 2019.
[14] M. Ogino and Y. Mitsukura, “Portable drowsiness detection through use of a prefrontal single-channel electroencephalogram,” Sensors, vol. 18, no. 12, p. 4477, Dec. 2018.
[15] M. Shahbakhti et al., “Simultaneous eye blink characterization and elimination from low-channel prefrontal EEG signals enhances driver drowsiness detection,” IEEE J. Biomed. Health Informat., vol. 26, no. 3, pp. 1001–1012, Mar. 2022.
[16] L.-L. Chen, Y. Zhao, J. Zhang, and J.-Z. Zou, “Automatic detection of alertness/drowsiness from physiological signals using wavelet-based nonlinear features and machine learning,” Expert Syst. Appl., vol. 42, no. 21, pp. 7344–7355, Nov. 2015.
[17] J. Cui, Z. Lan, O. Sourina, and W. Müller-Wittig, “EEG-based cross-subject driver drowsiness recognition with an interpretable convolutional neural network,” IEEE Trans. Neural Netw. Learn. Syst., vol. 34, no. 10, pp. 7921–7933, Oct. 2022.
[18] C. Chen, Z. Ji, Y. Sun, A. Bezerianos, N. Thakor, and H. Wang, “Self-attentive channel-connectivity capsule network for EEG-based driving fatigue detection,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 31, pp. 3152–3162, 2023.
[19] Y. Zhang, R. Guo, Y. Peng, W. Kong, F. Nie, and B.-L. Lu, “An auto-weighting incremental random vector functional link network for EEG-based driving fatigue detection,” IEEE Trans. Instrum. Meas., vol. 71, pp. 1–14, 2022.
[20] E. Jang, S. Gu, and B. Poole, “Categorical reparameterization with Gumbel–Softmax,” in Proc. Int. Conf. Learn. Represent., vol. 1611, Nov. 2016, p. 1144.
[21] I. A. M. Huijben, W. Kool, M. B. Paulus, and R. J. G. van Sloun, “A review of the Gumbel-max trick and its extensions for discrete stochasticity in machine learning,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 2, pp. 1353–1371, Feb. 2023.
[22] T. Pang, S. Zhao, J. Han, S. Zhang, L. Guo, and T. Liu, “Gumbel–Softmax based neural architecture search for hierarchical brain networks decomposition,” Med. Image Anal., vol. 82, Nov. 2022, Art. no. 102570.
[23] Z. Hou, J. Ma, W. Yu, Z. Yang, S. Ma, and J. Fan, “Multi-template global re-detection based on Gumbel–Softmax in long-term visual tracking,” Appl. Intell., vol. 53, no. 18, pp. 20874–20890, Sep. 2023.
[24] F. Ye and A. G. Bors, “Deep mixture generative autoencoders,” IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 10, pp. 5789–5803, Oct. 2022.
[25] Z. Cao, C.-H. Chuang, J.-K. King, and C.-T. Lin, “Multi-channel EEG recordings during a sustained-attention driving task,” Sci. Data, vol. 6, no. 1, p. 19, Apr. 2019.
[26] R. Langner, M. B. Steinborn, A. Chatterjee, W. Sturm, and K. Willmes, “Mental fatigue and temporal preparation in simple reaction-time performance,” Acta Psycholog., vol. 133, no. 1, pp. 64–72, Jan. 2010.
[27] Z. Cao, C. H. Chuang, J. K. King, and C. T. Lin, “Multi-channel EEG recordings during a sustained-attention driving task (pre-processed dataset),” Sci. Data, vol. 6, no. 1, p. 19, Apr. 2019, doi: 10.6084/m9.figshare.7666055.v3.
[28] C.-S. Wei, Y.-T. Wang, C.-T. Lin, and T.-P. Jung, “Toward drowsiness detection using non-hair-bearing EEG-based brain-computer interfaces,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 2, pp. 400–406, Feb. 2018.
[29] C.-S. Wei, Y.-T. Wang, C.-T. Lin, and T.-P. Jung, “Toward non-hair-bearing brain-computer interfaces for neurocognitive lapse detection,” in Proc. 37th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Aug. 2015, pp. 6638–6641.
[30] J. Cui et al., “A compact and interpretable convolutional neural network for cross-subject driver drowsiness detection from single-channel EEG,” Methods, vol. 202, pp. 173–184, Jun. 2022.
[31] P. Gong, P. Wang, Y. Zhou, X. Wen, and D. Zhang, “TFAC-Net: A temporal-frequential attentional convolutional network for driver drowsiness recognition with single-channel EEG,” IEEE Trans. Intell. Transp. Syst., vol. 25, no. 7, pp. 7004–7016, Jul. 2024.
[32] A. Abid, M. Balin, and J. Zou, “Concrete autoencoders for differentiable feature selection and reconstruction,” in Proc. Int. Conf. Mach. Learn., vol. 97, Jun. 2019, pp. 444–453.
[33] T. Strypsteen and A. Bertrand, “End-to-end learnable EEG channel selection for deep neural networks with Gumbel–Softmax,” J. Neural Eng., vol. 18, no. 4, Aug. 2021, Art. no. 0460a9.
[34] C. J. Maddison, A. Mnih, and Y. W. Teh, “The concrete distribution: A continuous relaxation of discrete random variables,” in Proc. Int. Conf. Learn. Represent., Nov. 2016, p. 712.
[35] E. J. Gumbel, “Statistics of extremes,” Annu. Rev. Stat. Appl., vol. 2, pp. 203–235, Apr. 2015.
[36] H. Sun, J. Ren, H. Zhao, P. Yuen, and J. Tschannerl, “Novel Gumbel–Softmax trick enabled concrete autoencoder with entropy constraints for unsupervised hyperspectral band selection,” IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 5506413.
[37] R. Flamary, N. Jrad, R. Phlypo, M. Congedo, and A. Rakotomamonjy, “Mixed-norm regularization for brain decoding,” Comput. Math. Methods Med., vol. 2014, pp. 1–13, Apr. 2014.
[38] Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou, “A survey of convolutional neural networks: Analysis, applications, and prospects,” IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 12, pp. 6999–7019, Dec. 2022.
[39] E. Fernandez-Blanco, D. Rivero, and A. Pazos, “EEG signal processing with separable convolutional neural network for automatic scoring of sleeping stage,” Neurocomputing, vol. 410, pp. 220–228, Oct. 2020.
[40] H. Qi, T. Xu, G. Wang, Y. Cheng, and C. Chen, “MYOLOv3-tiny: A new convolutional neural network architecture for real-time detection of track fasteners,” Comput. Ind., vol. 123, Dec. 2020, Art. no. 103303.
[41] I. M. Elfadel, “On the stability of analog ReLU networks,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 40, no. 11, pp. 2426–2430, Nov. 2021.
[42] S. Ioffe and C. Szegedy, “Normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. Mach. Learn. Res., Jun. 2015, pp. 448–456.
[43] G. C. Cawley and N. L. Talbot, “On over-fitting in model selection and subsequent selection bias in performance evaluation,” J. Mach. Learn. Res., vol. 11, pp. 2079–2107, Jul. 2010.
[44] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Represent., vol. 1412, Dec. 2014, p. 6980.
[45] L. Prechelt, “Automatic early stopping using cross validation: Quantifying the criteria,” Neural Netw., vol. 11, no. 4, pp. 761–767, Jun. 1998.
Xiaoping Wang (Senior Member, IEEE) received the B.S. and M.S. degrees in automation from Chongqing University, Chongqing, China, in 1997 and 2000, respectively, and the Ph.D. degree in systems engineering from Huazhong University of Science and Technology, Wuhan, China, in 2003.
Since 2011, she has been a Professor with the School of Artificial Intelligence and Automation, Huazhong University of Science and Technology. Her current research interests include memristors and their applications to memory storage, modeling, and simulation.

Jialan Xie received the M.S. and Ph.D. degrees from the College of Electronic and Information Engineering, Southwest University, Chongqing, China, in 2018 and 2024, respectively.
Her current research interests include affective computing, virtual reality, and machine learning.

Wanqing Liu received the M.S. degree from the College of Electronic and Information Engineering, Southwest University, Chongqing, China, in 2024.
Her main research interests include affective computing and EEG-based emotion recognition.

Yinghao Qiao received the M.S. degree from the College of Electronic and Information Engineering, Southwest University, Chongqing, China, in 2024.
Her main research interests include deep learning and EEG-based emotion recognition.