A Spatiotemporal Feature Extraction Technique Using Superlet CNN
India
7 Department of Electrical Power Engineering, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
8 Department of Electrical Engineering, Graphic Era (Deemed to be University), Dehradun 248002, India
9 Artificial Intelligence for Islamic Civilization and Sustainability, Universiti Sultan Zainal Abidin (UniSZA), Kuala Nerus, Kuala Terengganu, Terengganu 21300,
Malaysia
10 Operation Research and Management Sciences, Universiti Sultan Zainal Abidin (UniSZA), Kuala Nerus, Kuala Terengganu, Terengganu 21300, Malaysia
Corresponding authors: Vinay Kumar Jadoun ([email protected]) and Hasmat Malik ([email protected])
ABSTRACT In the realm of Brain-Computer Interface (BCI) research, the precise decoding of motor
imagery electroencephalogram (MI-EEG) signals is pivotal for the realization of systems that can be
seamlessly integrated into practical applications, enhancing the autonomy of individuals with mobility
impairments. This study presents an enhanced method for the precise recognition of MI tasks using EEG data,
to facilitate more intuitive interactions between individuals with mobility challenges and their environment.
The core challenge addressed herein is the development of robust algorithms that enable the accurate
identification of MI tasks, thereby empowering individuals with mobility impairments to control devices
and interfaces through cognitive commands. Although there are many different methods for analyzing
MI-EEG signals, research into deep learning and transfer learning approaches for MI-EEG analysis remains
scarce. This research leverages the superlet transform (SLT) to transform EEG signals into a two-dimensional
(2-D) high-resolution spectral representation. This 2-D representation of segmented MI-EEG signals is
then processed through an adapted pretrained residual network, which classifies the MI-EEG signals. The
effectiveness of the suggested technique is evident as the achieved classification accuracy is 99.9% for
binary tasks and 96.4% for multi-class tasks, representing a significant advancement over existing methods.
Through an extensive comparison with existing algorithms, assessed across a variety of performance metrics, the present study demonstrates the exceptional ability of the proposed approach to accurately classify the different MI categories from EEG signals, which constitutes a significant contribution to the field of BCI research.
INDEX TERMS Motor imagery (MI), deep neural network (DNN), superlet transform (SLT),
brain–computer interface (BCI).
I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Joewono Widjaja.
2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 13, 2025
N. Sharma et al.: Spatiotemporal Feature Extraction Technique Using Superlet-CNN Fusion
Electroencephalography (EEG) signals are a gateway to the brain's electrical activity, enabling the capture of the
dynamical oscillations at the center of neuronal interactions across various stages of consciousness, mental processes and stimuli response [1]. As a result, it has become a valuable tool in both medical diagnostics and cognitive neuroscience. MI-EEG, which forms the crux of neuroscientific research, is devoted to the analysis of brain activity generated during the mental simulation of movements, from which the neural bases of motor planning can be deciphered. It creates the basis for future therapeutic practices and brain-computer interface (BCI) technologies [2]. BCIs represent a new technological frontier in bridging human mental states with external apparatus, enabling immediate communication and regulation by decoding neural signals and hence paving the way for new modes of interaction without the need for body movement. There are various techniques to record brain activity that range from invasive and semi-invasive to non-invasive approaches. Of all non-invasive methods, such as electroencephalography (EEG), magnetoencephalography (MEG), positron emission tomography (PET), functional magnetic resonance imaging (fMRI) and optical imaging, EEG is the outstanding choice. This preference is due to its non-invasiveness, portability for users, higher temporal resolution and lower cost [3]. BCIs can leverage MI-EEG signals in diverse ways to enhance control, holding significant promise in fields that necessitate decoding user thoughts for imagined actions, including gaming, neuro-prosthetics, and neurorehabilitation. For instance, utilizing EEG recordings of left- and right-hand motor imagery can enable the movement of a target, offering a novel communication pathway to compensate for lost motor function. This technology holds the potential to provide amyotrophic lateral sclerosis patients with a clear binary response to queries, thereby enhancing their overall quality of life [4].
Within the realm of scholarly literature, researchers have extensively explored signal processing techniques, encompassing methodologies grounded in time, frequency, and time-frequency analysis. Notably, only a limited subset of these approaches has exhibited the requisite robustness to merit in-depth consideration for further research. The commonly used feature extraction algorithms include the wavelet transform (WT) [5], [6], [7], wavelet packet decomposition (WPD) [8], [9], common spatial patterns (CSP) [10], [11], empirical mode decomposition (EMD) [12], [13], the empirical wavelet transform (EWT) [14], [15], the Fourier decomposition method (FDM) [16], [17], and so on. Various differential evolution algorithms such as ant colony optimization (ACO) and artificial bee colony (ABC) were proposed in [18] for optimum feature selection. Traditionally extracted features often rely on manual design, necessitating extensive expertise. Therefore, the automatic identification of significant features from EEG signals is of considerable importance. Deep learning effectively handles non-linear and non-stationary data, automatically deriving useful features from the raw data. In recent years, some deep learning methods [19], [20], [21], [22] have been employed for the classification of EEG signals, where data is converted into a time-frequency (TF) representation such as short-time Fourier transforms (STFTs) or continuous wavelet transforms (CWTs). The cutting-edge transformer technique has also been employed by numerous researchers in the deep learning field [23], [24], [25], [26], [27], [28]. Several authors have utilized spatiotemporal features for various classification tasks. However, the data available for deep learning techniques is limited, necessitating more data to effectively train the models. This constraint guides us to employ transfer learning techniques to overcome it. Accordingly, some methods combine traditional feature extraction with deep learning [29], [30]. In this approach, researchers leverage transfer learning by converting one-dimensional (1-D) data into two-dimensional (2-D) data, which is then trained and evaluated using pre-trained networks. In the proposed methodology, raw MI-EEG data is converted into 2-D data using the superlet transform and then classified using pretrained residual CNN models for binary and multiclass classification. Contributions of the study:
• The proposed method distinguishes itself by utilizing the superlet transform to generate distinct and informative features for motor imagery tasks. These features, when fed into a CNN architecture, demonstrably increase classification accuracy.
• To thoroughly assess the effectiveness of our algorithm, we implemented an ablation study using three pre-trained residual networks. This systematic analysis reveals the contributions of each component and conclusively demonstrates the algorithm's superior performance.
• Detailed evaluation on two different datasets reveals superior generalization capabilities compared to SoTA benchmarks, highlighting the robustness and adaptability of our proposed network.
• We introduce a pioneering method for motor imagery recognition in BCI by utilizing the superlet transform, offering a unique high-resolution approach.
Following the introduction, the article progresses through a structured analysis: Section II details the utilized datasets, Section III provides a detailed description of the methodology used, Section IV delivers the results and analysis, and Section V summarizes the key insights and discusses future scope.

II. EXPERIMENTAL SETUP
The computational experiments were performed on a high-performance workstation featuring an Intel Core i7 10th-generation 16-core processor running at 3.80 GHz, with a Linux 22.04 operating system, 128 GB of RAM, and an NVIDIA Quadro RTX 5000 GPU with 16 GB of memory. MATLAB 2022b was used for all experiments.
Research on MI-EEG signals leverages numerous datasets, each varying in subject count, electrode number, trial duration, total trials, sampling frequency, and MI task types. Among these, the most utilized datasets by researchers are BCI competition IVa and BCI IV 2a. This article utilized dataset IV-A from the BCI Competition-III, which is a
publicly available dataset of EEG signals for brain-computer interface research [31]. The dataset contains EEG data from five healthy subjects who performed right-hand (RH) and right-foot (RF) motor imagery (MI) tasks. EEG data were acquired from 118 out of 128 electrodes according to the international 10/20 system. Visual cues were displayed for 3.5 seconds to indicate the MI task. The unified dataset of 280 trials was split into training, validation, and test sets for individual subjects. The sampling frequency of the signal is 100 Hz. The timing diagram of the dataset is shown in Figure 1.

FIGURE 1. Timing diagram for binary class dataset.

The other adopted dataset for classifying multiclass motor imagery (MI) is BCI IV 2a [32]. This study employed a dataset comprising 22 EEG channels and 3 EOG channels to observe the brain activity of 9 individuals as they imagined performing four distinct tasks: left hand (LH), right hand (RH), tongue (T), and feet (F). Data gathering took place at a 250 Hz rate. The motor imagery task was conducted within a time frame of 2 to 6 seconds. In this study, segmentation was carried out at the 4-second mark, and a total of 22 channels were employed. The timing diagram of the dataset is depicted in Figure 2.

FIGURE 2. Timing diagram for multi class dataset.

In the study, the dataset was divided into three subsets to facilitate training, validation, and testing of the ResNet models. Sixty percent of the input data was allocated for training, ensuring that the model had sufficient data to learn meaningful patterns. Twenty percent of the data was used for validation, allowing the model's performance to be monitored during training, while the remaining 20% was reserved for testing to assess the model's generalization capability on unseen data. Segmentation of the EEG signals was tailored to the type of dataset being used. For the binary classification dataset, a segmentation window of 3.5 seconds was applied, while for the multiclass classification dataset, the window was set to 4 seconds. This segmentation ensured that the input data was divided into manageable portions that reflect meaningful temporal structures in the signals. The learning rate for fine-tuning the deep CNN model, particularly ResNet, was set to 1e-5, striking a balance between convergence speed and model performance. The network was trained for a maximum of 50 epochs, with each epoch consisting of 1,027 iterations. In total, 51,350 iterations were performed, which allowed the model ample opportunity to adjust its parameters and minimize the loss function effectively.

III. METHODOLOGY
The methodological framework of our study is outlined here. Subsection III-A provides a foundational understanding of the time-frequency (TF) representation technique employed for feature extraction and signal characterization. Subsection III-B delves into the specifics of the deep learning architecture, unveiling its network topology, activation functions, and optimization algorithms. Here, a CNN-based framework is proposed for classification of RH, LH, F, and T movements. The MI-EEG signal is segmented and transformed into a 2-D TF spectrogram using SLT. Then, these spectrograms are applied to different residual pretrained networks. Based on the features extracted by the pretrained network, the classifier identifies the label of the class.

The superlet transform offers a significant advantage in EEG signal analysis by enhancing time-frequency resolution through adaptive and variable bandwidths, enabling the capture of subtle oscillatory patterns and temporal variations critical for EEG classification. Similarly, the ResNet model, with its residual connections, addresses the vanishing gradient problem, allowing deeper architectures to efficiently learn complex features inherent in EEG signals. This combination improves overall efficiency by leveraging precise feature extraction from the superlet transform and the robust learning capabilities of ResNet, resulting in superior classification performance and faster convergence. Figure 3 shows the step-by-step approach used in the article.

A. SUPERLET TRANSFORM
STFT and CWT are established techniques for analysing the TF characteristics of signals. However, both approaches involve a trade-off between time and frequency resolution. The superlet transform (SLT) overcomes this limitation by employing a set of wavelets, offering improved TF resolution and reduced ''leakage'' compared to a single wavelet [33]. Table 1 summarizes the key advantages and disadvantages of each method, highlighting their suitability for different signal characteristics, resolution requirements, and desired representations. SLT provides a new spectral estimation which provides the information on high frequency components
of signal [34]. SLT intrinsically enhances signal clarity by suppressing noise through the integration of responses from multiple wavelets. This noise suppression occurs naturally as a result of the combined wavelet responses. SLT offers superior adaptability and enhanced time-frequency resolution, but has a relatively high computational complexity compared to traditional methods such as STFT and CWT. SLT's multi-scale nature, which allows for better signal analysis at various frequencies, requires multiple convolutions and scales to be computed simultaneously. This increased processing demands more computational resources and time, particularly when working with large datasets or real-time applications. The added computational overhead is a key consideration in practical deployment, and efforts to optimize SLT implementations for more efficient computation will be important for its wider adoption in resource-constrained and real-time environments. In our research, we employ the Morlet wavelet as the foundational mother wavelet. The superlet transform of a signal g(t) can be mathematically represented as the geometric mean of the responses from the individual wavelets, which is defined as follows:

A_{SL_{f,k}} = \sqrt[k]{\prod_{i=1}^{k} \frac{\sqrt{2}}{s} \int_{-\infty}^{\infty} g(\tau)\, \psi^{*}_{f,c_i}\!\left(\frac{\tau - t}{s}\right) d\tau}   (1)

The integration of multiple wavelets enhances the TF representation in the SL. The parameters t and s are the shifting and scaling parameters, respectively. An SL constitutes a finite group of wavelets sharing the same central frequency f and extending across multiple bandwidths, with c_1, c_2, ..., c_k indicating the number of cycles in the individual wavelets:

SL_{f,k} = \{\psi_{f,c} \mid c = c_1, c_2, c_3, \ldots, c_k\}   (2)

where k represents the order of the SL, and \psi_{c_i}(t) is the i-th order wavelet, given by:

\psi_{c_i}(t) = \frac{\sigma f}{c_i \sqrt{2\pi}}\, e^{-\frac{1}{2}\left[\frac{\sigma f t}{c_i}\right]^{2}}\, e^{j 2\pi f t}   (3)

where \sigma is the standard deviation in the k-th order SL.

B. CNN MODELS
CNN models consist of feature extraction layers, such as convolutional layers, activation layers, batch normalization layers, and pooling layers, that learn to extract features and patterns from the input data, and classification layers, such as fully connected layers, softmax layers, and a classification layer, that give the final decision on the input data [35]. In this study, we leverage the concept of transfer learning by utilizing the pre-trained ResNet architecture for the task of motion recognition. The ingenious design of ResNet lies in its utilization of residual connections, which significantly enhances the learning capability of deep neural networks by focusing on residual functions rather than direct mappings.
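As an aside to Section III-A, Eqs. (1)-(3) can be made concrete with a minimal numeric sketch. This is our own illustration, not the authors' MATLAB implementation: the function names, the unit-l1 wavelet normalisation, and the fixed scale s = 1 are assumptions.

```python
import numpy as np

def morlet(f, c, fs, sigma=1.0):
    # c-cycle Morlet wavelet at centre frequency f (cf. Eq. (3))
    dur = c / f                           # support grows with the cycle count
    t = np.arange(-3 * dur, 3 * dur, 1.0 / fs)
    envelope = np.exp(-0.5 * (sigma * f * t / c) ** 2)
    return envelope * np.exp(2j * np.pi * f * t)

def superlet_response(x, f, k, fs):
    # geometric mean of the k wavelet responses, c = 1..k (cf. Eqs. (1)-(2))
    responses = []
    for c in range(1, k + 1):
        w = morlet(f, c, fs)
        w = w / np.abs(w).sum()           # unit-l1 normalisation (assumption)
        r = np.abs(np.convolve(x, np.conj(w), mode="same"))
        responses.append(r)
    stacked = np.stack(responses)
    return np.exp(np.log(stacked + 1e-12).mean(axis=0))

fs = 100                                  # sampling rate of the binary dataset (Hz)
t = np.arange(0, 2, 1 / fs)
x = np.sin(2 * np.pi * 10 * t)            # a pure 10 Hz oscillation
on_target = superlet_response(x, 10, 3, fs)[50:150].mean()
off_target = superlet_response(x, 30, 3, fs)[50:150].mean()
```

A 10 Hz test tone produces a far stronger geometric-mean response at the matching centre frequency than at 30 Hz, which is the sharpening effect the multi-order superlet provides over a single wavelet.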
This strategic approach effectively mitigates the issue of gradient vanishing or exploding, thus preserving the integrity of the learning process. The ResNet framework is distinguished by its incorporation of MaxPool and average pool layers. Specifically, ResNet 101, a deep convolutional neural network featuring 101 layers, exemplifies the advanced iteration of ResNet. It employs residual blocks to master residual functions, thereby overcoming learning degradation due to gradient issues. Structurally, ResNet 101 is organized into 33 residual blocks divided into four distinct sets, characterized by varying filter counts and repetition frequencies. The sequence begins with a set of 64 filters repeated thrice, progresses to 128 filters repeated four times, advances to 256 filters repeated 23 times, and concludes with 512 filters repeated thrice. After every convolutional layer, there is a batch normalization layer and a ReLU activation function, guaranteeing effective and nonlinear processing. The architecture initiates with a 64-filter convolutional layer with a 7 × 7 kernel size, culminating in a fully connected layer equipped with 1000 nodes and a softmax function for classification. Capable of categorizing images into 1000 different objects, ResNet 101 demonstrates remarkable accuracy across a plethora of benchmark datasets and various computer vision challenges. Figure 4 represents the layered structure of ResNet 101.

C. EVALUATION METRICS
System performance can be assessed using a variety of performance metrics, such as accuracy, sensitivity, specificity, etc. A true positive (TP) correctly determines the presence of a condition or characteristic. A false positive (FP) claims erroneously that a condition or characteristic exists. A true negative (TN) accurately identifies the absence of a condition. A false negative (FN) claims erroneously that a condition or characteristic does not exist [36], [37]. In Fig. 5, the confusion matrix for the binary class is depicted.

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}   (4)

Sensitivity = \frac{TP}{TP + FN}   (5)

Specificity = \frac{TN}{TN + FP}   (6)

Precision = \frac{TP}{TP + FP}   (7)

F-measure = \frac{2TP}{FN + FP + 2TP}   (8)

IV. RESULTS AND DISCUSSION
In this section, the performance of three residual pretrained networks is evaluated. Notably, ResNet 101 achieved the highest classification accuracy, reaching 99.9% for the binary dataset and 96.4% for the multiclass dataset. Throughout our experiments, the ADAM optimizer with a learning rate of 0.00001 was consistently employed. It is noteworthy that, for the purpose of this study, all samples related to foot and right-hand motion were amalgamated across all five subjects. The data were segmented into 3.5-second intervals for binary classification, while for the multiclass dataset, samples were combined and then segmented into 4-second intervals. The time-frequency representation (TFR) superlet transform was subsequently applied to the segmented data. Subsequently, the dataset was divided into training, validation, and test sets in an 80:20 ratio. The 2-D representation resulting from these processes served as input for the various ResNet architectures.

The ResNet18 architecture demonstrated an accuracy of 82.8% in recognizing hand and foot motion, as illustrated in Figure 6 (a). The corresponding confusion matrix for the test dataset revealed that out of 1,595 hand and foot samples, 289 foot and 259 hand samples were misclassified, resulting in the overall test accuracy of 82.8%. In the case of the ResNet50 architecture, Figure 6 (b), an impressive accuracy of 99.4% was achieved for hand and foot motion recognition. The confusion matrix indicated only 10 foot and 9 hand samples were misclassified out of the 1,595 samples in the test dataset. Furthermore, the performance of the ResNet101 architecture, Figure 6 (c), reached an outstanding accuracy of 99.9%. In this case, all 1,595 foot samples were correctly identified, and out of the 1,595 hand samples, 1,592 were accurately classified as hand samples, with only 3 samples misclassified as foot samples.

The ResNet18 model achieved a notable accuracy rate of 95.3% in identifying MI-EEG signals, as depicted in Figure 7 (a). The analysis of the test dataset's confusion matrix shows accurate classification of 6017 LH, 6429 RH, 6317 F, and 5736 T samples, culminating in an overall test accuracy of 95.3%.
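As a quick numeric check of Eqs. (4)-(8), the sketch below (our own illustration, not the authors' code) evaluates the scores for the ResNet-101 binary confusion matrix reported above, taking hand as the positive class: 1,592 of 1,595 hand samples correct, and all 1,595 foot samples correct.

```python
def binary_metrics(tp, tn, fp, fn):
    # Eqs. (4)-(8) computed from the four confusion-matrix cells
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision":   tp / (tp + fp),
        "f_measure":   2 * tp / (fn + fp + 2 * tp),
    }

# hand = positive: 3 hand samples misclassified as foot, no false positives
scores = binary_metrics(tp=1592, tn=1595, fp=0, fn=3)
print(round(scores["accuracy"], 3))   # -> 0.999, matching the reported 99.9%
```

With zero false positives the precision is exactly 1.0, while sensitivity (1592/1595) is what pulls the accuracy fractionally below 100%.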
FIGURE 5. Confusion matrix.
TABLE 3. Detection summary of ResNet-101 using SLT and deep learning model for multi-class dataset.
FIGURE 6. Confusion matrix of pretrained ResNet model for binary class dataset.
FIGURE 7. Confusion matrix of pretrained ResNet model for multi class dataset.
FIGURE 9. Training plot for binary class data for ResNet 101.
FIGURE 10. Training plot for multiclass data for ResNet 101.
TABLE 4. Comparative analysis of the proposed method and current leading-edge techniques for binary classification datasets.
TABLE 5. Comparative analysis of the proposed method and current leading-edge techniques for multiclass classification datasets.
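The data preparation used throughout the experiments (3.5 s windows for the binary dataset, 4 s for the multiclass dataset, and a 60/20/20 train/validation/test division, Section II) can be sketched as follows. The helper name and the shuffling seed are our own illustration, not the published pipeline:

```python
import numpy as np

def segment_and_split(trials, fs, win_s, seed=0):
    # crop each trial to the task window, e.g. 3.5 s at 100 Hz -> 350 samples
    n = int(win_s * fs)
    segments = np.stack([trial[:n] for trial in trials])
    # shuffled 60/20/20 split into train / validation / test
    idx = np.random.default_rng(seed).permutation(len(segments))
    a, b = int(0.6 * len(segments)), int(0.8 * len(segments))
    return segments[idx[:a]], segments[idx[a:b]], segments[idx[b:]]

# ten dummy single-channel trials of 4 s recorded at 100 Hz
trials = [np.zeros(400) for _ in range(10)]
train, val, test = segment_and_split(trials, fs=100, win_s=3.5)
print(train.shape, val.shape, test.shape)   # -> (6, 350) (2, 350) (2, 350)
```

Each cropped segment would then be passed through the superlet transform to obtain the 2-D TF image fed to the pretrained ResNets.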
classification accuracies of 82.9% and 96.4% respectively; SLT outperforms both methods by attaining an accuracy of 99.9% and 96.4%.

For binary class data, the training process for ResNet 101 is represented by Figure 9. Training was conducted over 50 epochs with a total of 5,100 iterations, averaging 102 iterations per epoch. The training duration was 153 minutes and 19 seconds with a constant learning rate of 1e-5. By the end of the training, the model achieved an impressive validation accuracy of 99.92%, indicating strong performance on the validation set. The accuracy curve shows a steady increase from around 50% to almost 100%, with the model converging around iteration 3,500. Similarly, the loss decreased consistently from approximately 0.7 to just above 0.1. Validation occurred every 100 iterations, and both accuracy and loss curves demonstrate stable and effective model training across the entire process.

For multiclass data, the training process for ResNet 101 is represented by Figure 10. Training was conducted over 50 epochs, with a total of 51,350 iterations, averaging 1,027 iterations per epoch. The training duration was 2,476 minutes with a constant learning rate of 1e-5. The validation accuracy achieved was 96.40%, indicating that the model performed well on unseen data. The accuracy steadily improved throughout the training from around 20% to 96%, while the loss decreased from approximately 1.6 to below 0.2. Validation was performed every 100 iterations, and both the accuracy and loss curves show a consistent convergence over time, confirming the model's stability and effectiveness.

Tables 4 and 5 compare the effectiveness of the suggested model with leading-edge methods across binary and multiclass datasets. Kevric and Subasi employed WPD on a constrained sample set, securing a 92.8% accuracy [38]. Sadiq et al. explored 2-D modelling with EWT, achieving a 95.3% classification accuracy [39]. Utilizing CSP for feature extraction, Rashid et al. reported a 93.6% accuracy [40]. A strategy involving Euclidean alignment (EA) with either an LDA or LR classifier by Xiong and Wei yielded an 85.6% accuracy [13]. Analysing signals with CSP, Xu et al. and Sakhavi et al. reported accuracies of 76.8% and 74%, respectively [41], [42]. Chaudhary and Agrawal's application of the wavelet transform resulted in an 85.6% accuracy [21], while Riyad et al., utilizing EEGNet, achieved 74% [43]. The proposed methodology achieved notably higher accuracy when compared to leading-edge techniques, as clearly demonstrated in Tables 4 and 5.

V. CONCLUSION
In our research, we introduced the application of the superlet transform (SLT) for analysing motor imagery EEG data in time-frequency space and assessed the efficacy of three distinct residual CNN models for both binary and multiclass classification tasks. Utilizing transfer learning techniques on a pretrained network, we evaluated model performance through accuracy, sensitivity, F-1 score, and precision metrics, derived from confusion matrices on test datasets. Our comparative analysis spanned model depth, layer count, parameter volume, approaches for TF signal representation, along with training and evaluation durations. Findings revealed that SLT-enhanced feature extraction notably boosts classification outcomes over current leading methods, with the residual CNN architectures showing superior accuracy rates. Specifically, ResNet 101 stood out, delivering an exceptional 99.9% accuracy for binary
classifications and 96.4% for the multiclass dataset. A key consideration for achieving such high accuracy involved the optimal selection of the order and time-spread parameters for the SLT process.

By incorporating time-frequency (TF) representation, researchers can achieve a deeper and more precise understanding of the intricate, dynamic changes in brain activity across both temporal and spectral domains. This approach facilitates the identification of subtle patterns linked to distinct motor imagery tasks, enhancing analytical accuracy. Moreover, the integration of pretrained networks through transfer learning accelerates the analytical process by leveraging extensive knowledge from large-scale datasets, overcoming the constraints of traditional methods. This powerful combination has the potential to drive significant advancements in brain-computer interface (BCI) technologies, enabling more intuitive and adaptive control systems for assistive devices, thus greatly enhancing the quality of life for individuals with mobility challenges. Additionally, it paves the way for innovations in neurorehabilitation, cognitive neuroscience, and neurofeedback applications, fostering broader progress in the understanding and application of neural mechanisms.

ACKNOWLEDGMENT
The authors would like to express sincere gratitude to Intelligent Prognostic Private Limited, Delhi, India for funding this research work. The authors extend their appreciation to SGT University, India, Bennett University, India, Netaji Subhas University of Technology, India, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India, and Universiti Sultan Zainal Abidin (UniSZA), Malaysia for providing research facilities.

REFERENCES
[1] E. A. Mohamed, M. Z. Yusoff, A. S. Malik, M. R. Bahloul, D. M. Adam, and I. K. Adam, ''Comparison of EEG signal decomposition methods in classification of motor-imagery BCI,'' Multimedia Tools Appl., vol. 77, no. 16, pp. 21305–21327, Aug. 2018.
[2] R. Janapati, V. Dalal, and R. Sengupta, ''Advances in modern EEG-BCI signal processing: A review,'' Mater. Today, Proc., vol. 80, pp. 2563–2566, Jan. 2023.
[3] N. Sharma, M. Sharma, A. Singhal, R. Vyas, H. Malik, A. Afthanorhan, and M. A. Hossaini, ''Recent trends in EEG-based motor imagery signal analysis and recognition: A comprehensive review,'' IEEE Access, vol. 11, pp. 80518–80542, 2023.
[4] A. Kawala-Sterniuk, M. Pelc, R. Martinek, and G. M. Wójcik, ''Editorial: Currents in biomedical signals processing—Methods and applications,'' Frontiers Neurosci., vol. 16, Jul. 2022, Art. no. 989400.
[5] N. Bajaj, J. R. Carrión, F. Bellotti, R. Berta, and A. De Gloria, ''Automatic and tunable algorithm for EEG artifact removal using wavelet decomposition with applications in predictive modeling during auditory tasks,'' Biomed. Signal Process. Control, vol. 55, Jan. 2020, Art. no. 101624.
[6] H. Göksu, ''BCI oriented EEG analysis using log energy entropy of wavelet packets,'' Biomed. Signal Process. Control, vol. 44, pp. 101–109, Jul. 2018.
[7] V. Gupta, T. Priya, A. K. Yadav, R. B. Pachori, and U. R. Acharya, ''Automated detection of focal EEG signals using features extracted from flexible analytic wavelet transform,'' Pattern Recognit. Lett., vol. 94, pp. 180–188, Jul. 2017.
[8] C. Uyulan, ''Development of LSTM&CNN based hybrid deep learning model to classify motor imagery tasks,'' Commun. Math. Biol. Neurosci., vol. 2021, pp. 1–26, Jan. 2021, doi: 10.28919/cmbn/5265.
[9] A. Echtioui, A. Mlaouah, W. Zouch, M. Ghorbel, C. Mhiri, and H. Hamam, ''A novel convolutional neural network classification approach of motor-imagery EEG recording based on deep learning,'' Appl. Sci., vol. 11, no. 21, p. 9948, Oct. 2021.
[10] M. Dai, D. Zheng, S. Liu, and P. Zhang, ''Transfer kernel common spatial patterns for motor imagery brain–computer interface classification,'' Comput. Math. Methods Med., vol. 2018, no. 1, pp. 1–9, 2018.
[11] K. Darvish Ghanbar, T. Yousefi Rezaii, A. Farzamnia, and I. Saad, ''Correlation-based common spatial pattern (CCSP): A novel extension of CSP for classification of motor imagery signal,'' PLoS ONE, vol. 16, no. 3, Mar. 2021, Art. no. e0248511.
[12] T. Mwata-Velu, J. Ruiz-Pinales, J. G. Avina-Cervantes, J. J. Gonzalez-Barbosa, and J. L. Contreras-Hernandez, ''Empirical mode decomposition and a bidirectional LSTM architecture used to decode individual finger MI-EEG signals,'' J. Adv. Appl. Comput. Math., vol. 9, pp. 32–48, May 2022.
[13] W. Xiong and Q. Wei, ''Reducing calibration time in motor imagery-based BCIs by data alignment and empirical mode decomposition,'' PLoS ONE, vol. 17, no. 2, Feb. 2022, Art. no. e0263641.
[14] J. Gilles, ''Empirical wavelet transform,'' IEEE Trans. Signal Process., vol. 61, no. 16, pp. 3999–4010, Aug. 2013.
[15] A. Bhattacharyya, L. Singh, and R. B. Pachori, ''Fourier–Bessel series expansion based empirical wavelet transform for analysis of non-stationary signals,'' Digit. Signal Process., vol. 78, pp. 185–196, Jul. 2018.
[16] P. Singh, S. D. Joshi, R. K. Patney, and K. Saha, ''The Fourier decomposition method for nonlinear and non-stationary time series analysis,'' Proc. Roy. Soc. A, Math., Phys. Eng. Sci., vol. 473, no. 2199, Mar. 2017, Art. no. 20160871.
[17] N. Sharma, M. Sharma, A. Singhal, R. Vyas, H. Malik, M. A. Hossaini, and A. Afthanorhan, ''An efficient approach for recognition of motor imagery EEG signals using the Fourier decomposition method,'' IEEE Access, vol. 11, pp. 122782–122791, 2023.
[18] M. Z. Baig, N. Aslam, H. P. H. Shum, and L. Zhang, ''Differential evolution algorithm as a tool for optimal feature subset selection in motor imagery EEG,'' Expert Syst. Appl., vol. 90, pp. 184–195, Dec. 2017.
[19] P. Kant, S. H. Laskar, J. Hazarika, and R. Mahamune, ''CWT based transfer learning for motor imagery classification for brain computer interfaces,'' J. Neurosci. Methods, vol. 345, Nov. 2020, Art. no. 108886.
[20] R. Zhang, Q. Zong, L. Dou, X. Zhao, Y. Tang, and Z. Li, ''Hybrid deep neural network using transfer learning for EEG motor imagery decoding,'' Biomed. Signal Process. Control, vol. 63, Jan. 2021, Art. no. 102144.
[21] P. Chaudhary and R. Agrawal, ''Non-dyadic wavelet decomposition for sensory-motor imagery EEG classification,'' Brain-Comput. Interfaces, vol. 7, nos. 1–2, pp. 11–21, Apr. 2020.
[22] Y. Zhang, Y. Liu, W. Kang, and R. Tao, ''VSS-Net: Visual semantic self-mining network for video summarization,'' IEEE Trans. Circuits Syst. Video Technol., vol. 34, no. 4, pp. 2775–2788, Apr. 2024.
[23] Y. Zhang, C. Wu, W. Guo, T. Zhang, and W. Li, ''CFANet: Efficient detection of UAV image based on cross-layer feature aggregation,'' IEEE Trans. Geosci. Remote Sens., vol. 61, 2023, Art. no. 5608911.
[24] J. Xie, J. Zhang, J. Sun, Z. Ma, L. Qin, G. Li, H. Zhou, and Y. Zhan, ''A transformer-based approach combining deep learning network and spatial–temporal information for raw EEG classification,'' IEEE Trans. Neural Syst. Rehabil. Eng., vol. 30, pp. 2126–2136, 2022.
[25] Y. Zhang, Y. Liu, and C. Wu, ''Attention-guided multi-granularity fusion model for video summarization,'' Expert Syst. Appl., vol. 249, Sep. 2024, Art. no. 123568.
[26] N. Sharma, A. Upadhyay, M. Sharma, and A. Singhal, ''Deep temporal networks for EEG-based motor imagery recognition,'' Sci. Rep., vol. 13, no. 1, p. 18813, Nov. 2023.
[27] Y. Ma, Y. Song, and F. Gao, ''A novel hybrid CNN-transformer model for EEG motor imagery classification,'' in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2022, pp. 1–8.
[28] Y. Zhang, T. Zhang, C. Wu, and R. Tao, ''Multi-scale spatiotemporal feature fusion network for video saliency prediction,'' IEEE Trans. Multimedia, vol. 26, pp. 4183–4193, 2024.
[29] N. Mammone, C. Ieracitano, and F. C. Morabito, ''A deep CNN approach to decode motor preparation of upper limbs from time–frequency maps of EEG signals at source level,'' Neural Netw., vol. 124, pp. 357–372, Apr. 2020.
[30] H. Li, M. Ding, R. Zhang, and C. Xiu, ''Motor imagery EEG classification algorithm based on CNN-LSTM feature fusion network,'' Biomed. Signal Process. Control, vol. 72, Feb. 2022, Art. no. 103342.
[31] B. Blankertz, K.-R. Müller, D. J. Krusienski, G. Schalk, J. R. Wolpaw, A. Schlögl, G. Pfurtscheller, J. R. Millán, M. Schröder, and N. Birbaumer,
[37] P. M. Tripathi, A. Kumar, R. Komaragiri, and M. Kumar, ''A review on computational methods for denoising and detecting ECG signals to detect cardiovascular diseases,'' Arch. Comput. Methods Eng., vol. 29, no. 3, pp. 1875–1914, Oct. 2021.
[38] J. Kevric and A. Subasi, ''Comparison of signal decomposition methods
‘‘The BCI competition III: Validating alternative approaches to actual in classification of EEG signals for motor-imagery BCI system,’’ Biomed.
BCI problems,’’ IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no. 2, Signal Process. Control, vol. 31, pp. 398–406, Jan. 2017.
pp. 153–159, Jun. 2006. [39] M. T. Sadiq, X. Yu, Z. Yuan, and M. Z. Aziz, ‘‘Motor imagery BCI
[32] M. Tangermann, K.-R. Müller, A. Aertsen, N. Birbaumer, C. Braun, classification based on novel two-dimensional modelling in empirical
C. Brunner, R. Leeb, C. Mehring, K. J. Miller, G. R. Müller-Putz, wavelet transform,’’ Electron. Lett., vol. 56, no. 25, pp. 1367–1369,
G. Nolte, G. Pfurtscheller, H. Preissl, G. Schalk, A. Schlögl, C. Vidaurre, Dec. 2020.
S. Waldert, and B. Blankertz, ‘‘Review of the BCI competition IV,’’ [40] M. Rashid, B. S. Bari, M. J. Hasan, M. A. M. Razman, R. M. Musa,
Frontiers Neurosci., vol. 6, pp. 6–55, Jul. 2012. A. F. A. Nasir, and A. P. P. A. Majeed, ‘‘The classification of motor imagery
[33] V. V. Moca, H. Bârzan, A. Nagy-Dăbâcan, and R. C. Mureşan, ‘‘Time- response: An accuracy enhancement through the ensemble of random
frequency super-resolution with superlets,’’ Nature Commun., vol. 12, subspace k-NN,’’ PeerJ Comput. Sci., vol. 7, Mar. 2021, Art. no. e374.
no. 1, p. 337, Jan. 2021. [41] S. Xu, L. Zhu, W. Kong, Y. Peng, H. Hu, and J. Cao, ‘‘A novel classification
[34] P. M. Tripathi, A. Kumar, M. Kumar, and R. Komaragiri, ‘‘Multilevel method for EEG-based motor imagery with narrow band spatial filters and
classification and detection of cardiac arrhythmias with high-resolution deep convolutional neural network,’’ Cognit. Neurodyn., vol. 16, no. 2,
superlet transform and deep convolution neural network,’’ IEEE Trans. pp. 379–389, Apr. 2022.
Instrum. Meas., vol. 71, pp. 1–13, 2022. [42] S. Sakhavi, C. Guan, and S. Yan, ‘‘Learning temporal information for
[35] S. Chaudhary, S. Taran, V. Bajaj, and A. Sengur, ‘‘Convolutional neural brain–computer interface using convolutional neural networks,’’ IEEE
network based approach towards motor imagery tasks EEG signals Trans. Neural Netw. Learn. Syst., vol. 29, no. 11, pp. 5619–5629,
classification,’’ IEEE Sensors J., vol. 19, no. 12, pp. 4494–4500, Jun. 2019. Nov. 2018.
[36] T. N. Alotaiby, S. A. Alshebeili, T. Alshawi, I. Ahmad, and [43] M. Riyad, M. Khalil, and A. Adib, ‘‘MI-EEGNET: A novel convolutional
F. E. A. El-Samie, ‘‘EEG seizure detection and prediction algorithms: neural network for motor imagery classification,’’ J. Neurosci. Methods,
A survey,’’ EURASIP J. Adv. Signal Process., vol. 2014, no. 1, p. 183, vol. 353, Apr. 2021, Art. no. 109037.
Dec. 2014.