Cross-Task Cognitive Load Classification With Identity Mapping-Based Distributed CNN and Attention-Based RNN Using Gabor Decomposed Data Images
Trupti Taori, Shankar Gupta, Sandesh Bhagat, Suhas Gajre & Ramchandra Manthalkar
Department of Electronics and Telecommunication, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, India
ABSTRACT

The cognitive workload is a key to developing a logical and conscious thinking system. Maintaining an optimum workload improves the performance of an individual. The individuals’ psycho-social factors are responsible for creating significant variability in the performance of a task, which poses a significant challenge in developing a consistent model for the classification of cross-task cognitive workload using the physiological signal, Electroencephalogram (EEG). The primary focus of the proposed work is to develop a robust classification model, CARNN, by employing a concatenated deep structure of distributed branches of convolutional neural networks with residual blocks through identity mappings, and a recurrent neural network with an attention mechanism. The EEG data is divided into overlapping segments of milliseconds duration. The segmented EEG data is converted into images using Gabor decomposition with two spatial frequency scales and four orientations and supplied as input to CARNN. The images are formed by interlacing the respective left and right electrode data to capture the data variations effectively. Efficient feature aggregation, with learning of spatial and temporal domain discriminative features through Gabor decomposed data images, improves the training of CARNN. CARNN achieves outstanding performance over traditional classifiers (support vector machine, k-nearest neighbor (KNN), ensemble subspace KNN) and the pre-trained networks (AlexNet, ResNet18/50, VGG16/19, and Inception-v3). The proposed method results in 94.2%, 92.5%, 95.9%, 92.8%, and 94.3% classification accuracy, specificity, sensitivity, precision, and F1-score, respectively. Two visual task levels apart in their complexity are used for cross-task classification of cognitive workload. The proposed method is validated on raw EEG data of 44 participants.

KEYWORDS
Attention mechanism; Cognitive load; Convolutional neural network (CNN); Cross-task; Electroencephalogram; Long short-term memory; Residual block

© 2022 IETE
coding, handwritten number identification, face recognition, vehicle detection, and EEG signal processing [11–13]. The proposed work exploits this concept and hypothesizes that two-dimensional (2D) Gabor decomposed data images of EEG data can be used to efficiently extract crucial features for cross-task cognitive load classification using deep network structures.

The contributions of the proposed method are:

(i) A cascaded deep structure of distributed CNNs with residual blocks and an RNN with an attention mechanism is proposed for efficient feature extraction from the spatial and temporal domains.
(ii) Utilization of the spatial frequency scales and orientations of 2D Gabor filters for image formation from raw EEG data, feeding these images as input to distributed CNNs + attention-based RNN (CARNN). Instead of manipulating the convolutions with Gabor filters in the initial network layers, the emphasis is on giving Gabor processed input to the deep CNN, which aids efficient feature extraction from multiple spatial frequency scales and orientations of the Gabor decomposed data images and enhances the training of the network. To the best of our knowledge, this is the first implementation where EEG data images after 2D Gabor filtering are used to classify cross-task cognitive workload using deep network structures.
(iii) Enhancing the training of the CNN with identity mapping connections. The RNN (LSTM) structure learns temporal features by relating the output features of the distributed CNNs in the temporal domain. An attention network employed with the RNN structure gives more prominence to time steps covering important discriminating activity in the EEG.
(iv) A unified structure is trained end to end, where the total number of residual blocks, the type of identity mapping for the CNN structure, and the total number of RNN layers are decided empirically.

Two different visual tasks employing recognition and counting operations are carried out at multiple difficulty levels for cross-task binary classification. The proposed study utilizes EEG data of small duration (1 s) from the onset of the stimulus, which is further divided into multiple overlap segments for cross-task analysis.

The remainder of this paper is structured as follows. Section 2 presents the related literature work. Section 3 introduces the dataset for the two visual tasks and the EEG data acquisition process. Section 4 presents the methodology used throughout the experimentation process. Section 5 provides the implementation details. Section 6 presents the results to evaluate the performance of the proposed method. Section 7 presents the discussion. Finally, Section 8 draws the conclusion.

2. LITERATURE WORK

Cognitive workload estimation is crucial in healthcare, learning, and support. For example, it is very effective in diagnosing various brain disorders, designing assistance systems for BCI applications, and classifying the cognitive workload levels of operators in safety and surveillance systems [14–16]. Two approaches, statistical and deep learning, are widely used with physiological signals (EEG) for cross-problem cognitive workload estimation. Very limited studies are available on cross-task to date; most of the existing literature is reviewed in this section.

2.1 Statistical Approach

With the invention of different machine learning techniques, it has become relatively simple to discern various characteristics of a complex EEG signal [17,18]. The existing studies follow learning models with statistical methods (logistic regression, Naive Bayes, SVM), which use hand-crafted features to classify cognitive load levels. In [19], the power of multiple frequency bands from segmented EEG data is utilized for cross-task classification with an artificial neural network using a multi-layer perceptron, resulting in a low classification accuracy of 44.8%. In [17], event-related synchronization and de-synchronization are utilized with the support vector machine (SVM) classifier; the cross-task classification accuracy is 44.8%. In another parallel work [18], the power spectral density (PSD) of segmented data from six frequency bands is utilized as a feature. The performance of the SVM classifier and regression model is near chance level for cross-task. Further, recursive feature elimination-based feature ranking and feature-subset selection are used, where the SVM regression model outperforms the SVM classifier.

In [3], a study is carried out where the MW estimator is trained on a simple task and tested on a complex task. An SVM regression model shows improved performance (correlation coefficient = 0.740 ± 0.147) using the feature subset of PSD from seven frequency bands selected with recursive feature elimination. In [4], models of behavior are developed to identify common cross-participant and cross-task EEG features. Sequential forward floating selection is used to identify the independent components (ICs) of the tasks. Further, linear regression
is used on the power spectra of the ICs, and a continuous estimate is generated for cross-subject cognitive load recognition. The findings suggest that the complex behavioral dynamics are estimated with concurrent measures from EEG using a common finite feature set. A new avenue of mapping the EEG source space related to various frequency bands in the functional network is studied in [5]. The classification feature subset is selected based on a sequential feature selection algorithm, which selects the topmost common features for cross-task, and is classified with 87% accuracy using an SVM classifier. In [20], the work explored a domain adaptation method for cross-task cognitive workload recognition. The working memory and mental arithmetic tasks can individually be viewed as domains. PSD and coherence features from the brain connectivity network are extracted from five different frequency bands. Among SVM and KNN, SVM produces the higher binary classification accuracy (around 70%) for transfer joint matching adaptation. Another parallel work [21] uses domain adaptation techniques based on transfer component analysis; it achieves 30.0% classification accuracy for a 4-class cross-dataset problem using sparse encoded representations of the decomposed wavelets. In [22], task-independent mental workload is discriminated across two working memory tasks. EEG spectral features (PSD) and functional connectivity features (phase lag index) are fused and classified with an accuracy of 91% using SVM and linear discriminant analysis (LDA) classifiers. In [23], the study compares cross-task mental workload consistency with the PSD of ongoing EEG and task-irrelevant auditory ERPs using the verbal N-back and the multi-attribute task battery. The output of the discriminant analysis shows a statistical difference between both features used, and the area under the curve is near to one. In [24], the work elaborates on studying realistic instructional materials for the optimum workload. The work uses SVM for cross-task classification of different levels of WML utilizing spectral features from multiple frequency bands, and achieves around 80% accuracy. In [25], the work assesses cross-task mental workload during anomaly detection in perceived visual stimuli of images and video. The study utilizes multimodal signals like electrocardiogram, electrooculogram, skin response, etc. LDA is used for feature reduction, where a subset of features with the largest Fisher separation value is considered. The SVM classifier results in 53.83% classification accuracy.

It is argued that hand-crafted features lack in the extraction of crucial features from nonlinear EEG data [26]; therefore, deep networks are employed as classifiers and for automatic feature extraction.

2.2 Deep Learning Approach

Recently, deep learning methods have shown immense improvement in various biomedical field applications [8,27]. Cross-problem estimation has also been investigated with the utilization of deep structures to reach an acceptable performance level. In [28], CNN methods supporting single and double models are utilized along with novel fusion strategies of different networks. The CNN model is fed with spectral maps (derived with the fast Fourier transform) as 2D images, which hold the EEG signal’s spectral, spatial, and temporal information.

In [29], a cross-participant model is explored through task-generic EEG features. A multi-layer perceptron neural network (MLPNN) is utilized to extract spectral features from five frequency bands. A temporal convolutional network (TCN) with an autoencoder is employed to analyze time-domain-based features from raw EEG data. MLPNN performs best, with 64% accuracy. In [30], the differences in the EEG distributions of different subjects are alleviated through the implementation of an online pre-alignment scheme (OPS) to balance the impact of cross-dataset variability for motor imagery datasets. OPS employs a recentering step prior to the training of a deep model with the usage of the Riemannian mean covariance. OPS also improves generalization ability across datasets and results in around 75% cross-subject classification accuracy for various datasets. In [31], the cross-participant cognitive state in a non-stimulus-locked task environment is estimated. PSD values are normalized through the creation of a cumulative distribution function at each time step for every electrode/frequency combination and are used to create the input for a CNN. It is concluded that a novel convolutional-recurrent model using multi-path subnetworks and bi-directional, residual recurrent layers results in improved predictive accuracy (86.8%) and decreased variance.

In [32], a structure of deep recurrent and 3D-CNN is combined (R3DCNN) for automatic learning of features and performs cross-task binary classification. This existing work uses the Morlet wavelet transformation to map the frequency and time dimensions and uses EEG cubes formed from topographic maps as input images for the 3DCNN. The R3DCNN achieves an average accuracy of 88.9%, which is a significant increase compared to the state-of-the-art methods. In our previous work [33], temporal dynamics are modeled by grouping successive time segments of the signal to form variable-length frames, and hand-crafted features from the statistical (band power),
morphological (curve length), and nonlinear (approximate entropy) domains for these frames from four frequency bands are used. A deep RNN employing LSTM results in 92.8% binary cross-task classification accuracy.

In recent years, different existing works have explored multiple ways of improving the performance of deep models: by feeding a pre-processed EEG signal as the feature set [29], by transforming the pre-processed signal into 2D images [32], by applying initialization techniques [34] at the beginning layer of the model, and by improving the structure of the deep network [31]. The pre-trained, CNN-based deep models like AlexNet [35], VGG16, VGG19 [36], ResNet18, ResNet50 [37], and Inception-v3 [38] are also efficient through transfer learning in feature extraction [8,9]. In [39], a steerable filter structure is incorporated into the deep CNN to form a new model, Gabor-CNN, where the basic convolutional operation is manipulated based on Gabor filters. The work demonstrates that Gabor-CNN is capable of learning features efficiently for object recognition and has fewer learnable parameters, which improves its training. In [10], the 1D Gabor function is used to obtain features from EEG data, which are fed to a ridge regression classifier. Further, the obtained feature set is fused with the visual feature set (obtained with a deep CNN) and is used to train a combined visual-EEG classifier, which boosts visual recognition. In the literature, Gabor filters are also used efficiently in various analyses of EEG, such as sleep disorders and seizures [40,41].

The visual stimulus designed to induce cognitive workload is displayed on the monitor screen.

3.2 Experimental Tasks

Two visual tasks, utilizing basic trigonometric shapes and an abstract shape of basic red, green, and blue colors for their identification and counting, are executed in the proposed work to induce cognitive workload. The tasks will from now on be referred to as “shape” and “color”, respectively, throughout this paper. As in our previously published work, the proposed work considers only level-1 and level-4 (the simple and complex levels, respectively) for cross-task analysis out of the 4 designed levels [33,42,43]. The visual stimulus is depicted in Figure 1.

The single-trial timeline is depicted in Figure 2, consisting of a 7 s visual stimulus, followed by a 2 s blank period and a maximum 5 s response time. In the response time, a participant must provide the count of any one asked trigonometric shape and any one color of balloon, which they memorize. Every individual level consists of 10 trials, with a 30 s break between successive levels to diminish the effect of fatigue if experienced by participants. Permutation of the trigonometric shapes and colors among trials in the respective task avoids adaptation. Each experimental task lasts for approximately 15 min. A NASA Task Load Index score related to the workload level experienced during task performance was filled in by 30 subjects who participated in a similar kind of dummy experimentation. It validates the correctness of the designed experimentation
Figure 3: (A) The proposed 2D image generation method: the segmented EEG data is convolved with 8 Gabor filters, the resultants of each single spatial frequency scale with four orientations are cascaded, and these two cascades are then clubbed vertically to form the 2D image. (B) The proposed concatenated structure of distributed CNNs and RNN: CARNN
tasks. More details on the experimentation tasks can be found in [33,42].

4. METHODOLOGY

The various processes carried out during experimentation for cross-task classification are discussed in this section. The complete process flow is depicted in Figure 3.

4.1 Data Normalization

Before data normalization, the electrode sequence is rearranged in the captured data, where electrode pairs from the left and right hemispheres are arranged side by side. The newly arranged sequence is FP1, FP2, AF3, AF4, F3, F4, F7, F8, FZ, FC1, FC2, FC5, FC6, T7, T8, C3, C4, CZ, CP1, CP2, CP5, CP6, P3, P4, P7, P8, PZ, PO3, PO4, O1, O2, OZ. The significance of this step is presented in the discussion section. Due to the variability of the data among the multiple levels of an individual, the data must be normalized. Z-score normalization is used in the proposed work. Normalization of the data contributes to effective learning by deep networks [44].

4.2 Time Segmentation

Time segmentation is a technique for extracting data and identifying components of small time segments from a long EEG time series [45]. The proposed work employs a windowing technique where the normalized time series of 1 s (from the onset of the stimulus period) is divided into 200 ms segments with an overlap period of 100 ms.

4.3 Input Image Formation for CARNN

The resemblance of 2D Gabor filters to the receptive fields of neurons in the V1 area of the visual cortex motivated the research community, and Gabor filters have sparked renewed interest in a variety of computer vision applications. In classification tasks, the utilization of filters based on the target data characteristics improves the performance of the deep model and requires less training data [34].

Thus, the images formed with Gabor filtered data are an initialization, which helps the CNN to extract more discriminative features. Gabor filters offer optimal localization in the spatial as well as the frequency domain. They are also found to be less vulnerable to noise, rotation, scaling, and a small range of translation [46]. In the spatial domain, the complex form of a 2D Gabor filter is expressed as

g(x, y, λ, θ, ψ, σ, γ) = exp(−(x′² + γ²y′²)/(2σ²)) · exp(i(2πx′/λ + ψ))   (1)

where

x′ = x cos θ + y sin θ  and  y′ = −x sin θ + y cos θ   (2)

In Equation (1), λ scales the frequency of the sinusoidal modulation, θ is the orientation, representing the angle of the major axis, ψ is the phase offset, σ scales the falloff of the Gaussian envelope, and γ specifies the ratio of x to y, called the spatial aspect ratio. This complex function can be handled easily by breaking it down into its real and imaginary parts, referred to as the even and odd functions:

f_R(x, y, λ, θ, ψ, σ, γ) = exp(−(x′² + γ²y′²)/(2σ²)) cos(2πx′/λ + ψ)   (3)
f_I(x, y, λ, θ, ψ, σ, γ) = exp(−(x′² + γ²y′²)/(2σ²)) sin(2πx′/λ + ψ)   (4)
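As a concrete illustration of Sections 4.1–4.3, the sketch below z-score normalizes a (channels × samples) EEG array, splits it into 200 ms windows with 100 ms overlap, builds the eight Gabor kernels of Equations (1)–(4) (two spatial frequency scales, four orientations), and tiles the filtered responses into one image per window in the manner of Figure 3 (A). It is a minimal NumPy/SciPy sketch: the sampling rate, kernel size, λ, σ, γ, and ψ values are assumed placeholders, not the authors’ settings, and the resize to the 224 × 224 network input is left to a later stage.

```python
import numpy as np
from scipy.signal import convolve2d

FS = 1000                      # assumed sampling rate in Hz (not stated in this excerpt)
SEG_MS, HOP_MS = 200, 100      # 200 ms windows with 100 ms overlap (Section 4.2)

def gabor_kernel(lam, theta, sigma=2.0, gamma=0.5, psi=0.0, size=15, part="real"):
    """2D Gabor filter of Equations (1)-(4); `part` selects the even (cos) or odd (sin) component."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = x * np.cos(theta) + y * np.sin(theta)        # Equation (2)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xp**2 + gamma**2 * yp**2) / (2.0 * sigma**2))
    carrier = np.cos if part == "real" else np.sin    # Equation (3) or (4)
    return envelope * carrier(2.0 * np.pi * xp / lam + psi)

def zscore(eeg):
    """Z-score normalization per electrode (Section 4.1); eeg has shape (channels, samples)."""
    return (eeg - eeg.mean(axis=1, keepdims=True)) / (eeg.std(axis=1, keepdims=True) + 1e-8)

def segment(eeg, fs=FS, seg_ms=SEG_MS, hop_ms=HOP_MS):
    """Split a 1 s (channels, samples) array into overlapping windows (nine for 200/100 ms)."""
    win, hop = int(fs * seg_ms / 1000), int(fs * hop_ms / 1000)
    return [eeg[:, s:s + win] for s in range(0, eeg.shape[1] - win + 1, hop)]

def gabor_image(window, scales=(4.0, 8.0), orientations=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Convolve one window with the 8 real Gabor kernels and tile the responses as in Figure 3 (A):
    four orientations cascaded horizontally per scale, the two scales stacked vertically."""
    rows = []
    for lam in scales:                                # two spatial frequency scales
        responses = [convolve2d(window, gabor_kernel(lam, th), mode="same")
                     for th in orientations]          # four orientations
        rows.append(np.hstack(responses))
    return np.vstack(rows)

# Channels are assumed to be already reordered into left/right electrode pairs (Section 4.1).
raw = np.random.randn(32, FS)                         # placeholder for 1 s of 32-channel EEG
images = [gabor_image(w) for w in segment(zscore(raw))]
print(len(images), images[0].shape)                   # 9 Gabor decomposed data images per trial
```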
4.4.1 Proposed CNN with Residual Identity Mapping Connections

In the proposed work, nine distributed CNN structures (the 1 s data is divided into nine overlap segments) are used, as depicted in Figure 3. The individual distributed CNN branch structure is depicted in Figure 4, where Block 1–Block 5 are connected. The CNN structure is made of multiple layers, such as a convolutional layer, which is generally followed by normalization and an activation function, as depicted in Figure 5 (A), and finally a fully connected layer is used. The CNN learns spatial and spectral features from the non-stationary EEG data. Distributed CNNs can apply the same transformation to a list of inputs. The CNN structure is created by stacking convolution blocks and residual blocks. The residual block helps in optimal gradient flow through the presence of identity skip connections, preventing the loss of information flow through the various network layers. The normal residual block shown in Figure 5 (B) functions as follows:

y_l = h(x_l) + F(x_l, W_l)   (5)

where x_l and y_l are the input and output of the l-th residual unit, F(·) is the residual function, and h(x_l) = x_l is the identity mapping. This identity mapping provides a direct path for signal propagation from any one residual unit to any other residual unit in the network during the forward and backward passes. The pre-activation residual connection, Res Block, is depicted in Figure 5 (C), where batch normalization (BN) and the rectified linear unit (ReLU) are connected before the convolutional layer [47]. The identity skip connection with a 1 × 1 convolutional layer, depicted in Figure 5 (D), is mainly used to extract the information stored in the spatial dimension of the Gabor decomposed data images. The BN layer and ReLU activation function are used to accelerate convergence, which improves network performance. The Flatten layer at the end of the CNN structure is used to convert the output of the last block (Block 5) into a feature vector. The input and output feature sizes for each block are mentioned in Figure 6; the last parameter of the output represents the total number of filters used for the convolutional layers of that block. Despite the ability of the CNN to capture crucial spatial features from the data through the convolution operation, it cannot map a complex time relationship in logical sequences.
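For reference, a minimal TensorFlow/Keras sketch of the convolution block and the three residual variants discussed above is given here: R1 adds the input to the block output as in Equation (5) with an identity h(·), R2 places BN and ReLU before each convolution (pre-activation), and R3 carries a 1 × 1 convolution on the skip path. The filter counts, pooling, and branch depth are placeholders rather than the authors’ exact Block 1–Block 5 configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    """'Conv Block' of Figure 5 (A): convolution followed by normalization and activation."""
    x = layers.Conv2D(filters, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def res_block_r1(x, filters):
    """R1, normal residual connection: y_l = h(x_l) + F(x_l, W_l) with identity h (Equation (5))."""
    f = conv_block(x, filters)
    f = layers.Conv2D(filters, 3, padding="same")(f)
    f = layers.BatchNormalization()(f)
    return layers.ReLU()(layers.Add()([x, f]))

def res_block_r2(x, filters):
    """R2, pre-activation 'Res Block' of Figure 5 (C): BN and ReLU come before each convolution [47]."""
    f = layers.ReLU()(layers.BatchNormalization()(x))
    f = layers.Conv2D(filters, 3, padding="same")(f)
    f = layers.ReLU()(layers.BatchNormalization()(f))
    f = layers.Conv2D(filters, 3, padding="same")(f)
    return layers.Add()([x, f])

def res_block_r3(x, filters):
    """R3 of Figure 5 (D): the identity skip is replaced by a 1 x 1 convolution."""
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    f = conv_block(x, filters)
    f = layers.Conv2D(filters, 3, padding="same")(f)
    f = layers.BatchNormalization()(f)
    return layers.ReLU()(layers.Add()([shortcut, f]))

def cnn_branch(input_shape=(224, 224, 1), widths=(16, 32, 64, 64, 128)):
    """One distributed CNN branch: a Block 1-Block 5 style stack ending in a Flatten layer."""
    inp = layers.Input(shape=input_shape)
    x = conv_block(inp, widths[0])
    for w in widths[1:]:
        x = layers.MaxPooling2D()(x)
        x = conv_block(x, w)            # widen with a Conv Block
        x = res_block_r2(x, w)          # then refine with a residual block (R1/R3 are drop-ins)
    return tf.keras.Model(inp, layers.Flatten()(x), name="cnn_branch")
```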
Figure 5: (A) Structure of the convolutional block “Conv Block”. (B) R1: normal residual connection. (C) R2: “Res Block”, pre-activation residual connection. (D) R3: residual connection with 1 × 1 convolution in the identity skip path
Figure 6: The proposed layered deep structure with the input and output of each Block. The kernel size for all the convolution layers is 3 × 3 for Conv Block and Res Block, except for the 1 × 1 convolution in the shortcut path of Figure 5 (D). The number of channels in the output represents the total number of filters used for the respective convolution layers

i_t and C̃_t are then combined as given below, to decide the new input for the cell state:

c_t = f_t ∗ c_{t−1} + i_t ∗ C̃_t   (13)

The cell state serves as the memory of the LSTM. After finalizing the cell state, the output gate is used to place a selected amount of the cell state into the output.
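Only the cell-state update (13) survives in this excerpt; the remaining gate equations are the standard LSTM ones. For illustration, the sketch below wires the temporal half of CARNN in TensorFlow/Keras: the nine per-segment feature vectors produced by the distributed CNN branches pass through an LSTM, and a softmax-normalized temporal attention (a common additive formulation assumed here, in the spirit of Figure 8, rather than the paper’s exact scoring) pools the time steps before classification. The sequence length, feature dimension, and unit counts are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers

N_SEGMENTS = 9        # nine 200 ms overlap segments per 1 s trial
FEAT_DIM = 512        # placeholder length of each flattened CNN feature vector

def temporal_attention(h):
    """Score every LSTM time step, softmax-normalize the scores, and return the weighted sum,
    so that time steps holding discriminative activity receive more prominence."""
    scores = layers.Dense(1, activation="tanh")(h)            # (batch, T, 1)
    weights = layers.Softmax(axis=1)(scores)                  # attention over the T time steps
    return layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])

def temporal_head(n_classes=2, units=128):
    """LSTM + attention + classifier over the sequence of distributed-CNN feature vectors."""
    seq = layers.Input(shape=(N_SEGMENTS, FEAT_DIM))
    h = layers.LSTM(units, return_sequences=True)(seq)        # the LSTM cell applies (13) at every step
    context = temporal_attention(h)
    out = layers.Dense(n_classes, activation="softmax")(context)
    return tf.keras.Model(seq, out, name="temporal_head")

temporal_head().summary()
```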
Figure 8: Temporal attention mechanism with LSTM

5. EXPERIMENTATION

This section discusses the implementation details and provides the various metrics used for the performance evaluation of the proposed method.
5.2 Evaluation Metrics

To build an effective learning model, the evaluation step is very important. The performance evaluation metrics used for the cross-task classification model are as follows:

Accuracy: Represents the total number of correct predictions by the model.
Sensitivity: Represents the proportion of actual “one class” data that is correctly identified.
Specificity: Represents the proportion of actual “another class” data that is correctly identified.
Precision: Represents the proportion of correctly identified one-class data.
F1-Score: Represents the harmonic mean of precision and sensitivity.

Accuracy = (TP + TN) / (TP + TN + FP + FN)   (19)

Sensitivity (Recall) = TP / (TP + FN)   (20)

Specificity = TN / (TN + FP)   (21)

Precision (Positive Predictive Value) = TP / (TP + FP)   (22)

F1-Score = 2 ∗ (PPV ∗ Sensitivity) / (PPV + Sensitivity)   (23)

where TP (true positive), TN (true negative), FP (false positive), and FN (false negative) are the confusion matrix parameters.

6. RESULTS

In the proposed work, the raw EEG data are normalized and segmented with a window size of 200 ms and an overlap period of 100 ms. These segmented data are decomposed using the Gabor filters and converted into images considering only the real part of the complex function discussed under the image formation subsection. Then CARNN is employed to automatically extract features from the spatial and temporal domains and to classify the binary load levels of the cross-task. The behavioral response for subjective performance can be referred to from our published work using the same dataset for cross-task classification [33]. The accuracy of giving the correct answer is higher for the simple level, and participants take less time to respond, while for the complex level, the accuracy is lower, and participants take more time to respond. This fact is also supported by the NASA TLX scores provided by the participants, indicating that they really do find the complex level tougher than the simple level. Further, it is justified with the paired t-test that the easy level and difficult level of both tasks are discriminable (p < 0.05 is significant), and that both tasks are of a similar kind (p > 0.05 is not significant).

6.1 Classification of Within-Task Cognitive Load

In the proposed method, only 1 s of EEG data from the onset of the visual stimulus is considered for all the analyses. A 1 s duration is sufficient for the perception and identification of objects through memory activity; thus, load levels can be identified. The within-task performance evaluation for the shape and color tasks is carried out independently for various durations of time segments, 50, 100, 200, and 250 ms, as mentioned in Table 1, with 50% overlap duration. The number of distributed CNNs is equal to the total number of segments of the prescribed duration accommodated in the 1 s data. For within-task classification, 80% of the task data is used for training and validation. The remaining 20% of the data from the same task is used for testing. This evaluation is carried out for images formed with only the real component of the Gabor function. The images formed with the imaginary component of the Gabor function are not considered further for any of the analyses. The highest within-task classification accuracy is 96.9% and 94.9% for the shape and color tasks, respectively, for a segment duration of 200 ms (with 100 ms overlap). Thus, the 200 ms segment duration is utilized to evaluate the cross-task performance. With a 200 ms segment duration, there is a total of 3960 (44 subjects × 10 questions in each level × nine overlap segments in 1 s data) images for the easy level and 3960 images for the difficult level for both tasks independently.

6.2 Classification of Cross-Task Cognitive Load

Cross-task classification is performed in the following two steps: in the first step, the model is trained on the shape task and tested on the color task; in the second step, the model is trained on the color task and tested on the shape task (Table 3).
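A minimal sketch of this two-step protocol together with metrics (19)–(23), assuming hypothetical arrays X_shape, y_shape, X_color, y_color of image sequences and binary load labels, and a build_model() constructor standing in for a compiled CARNN:

```python
import numpy as np

def metrics(y_true, y_pred):
    """Equations (19)-(23) from the confusion matrix counts, with class 1 treated as 'positive'."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    acc = (tp + tn) / (tp + tn + fp + fn)                 # (19)
    sen = tp / (tp + fn)                                  # (20)
    spe = tn / (tn + fp)                                  # (21)
    ppv = tp / (tp + fp)                                  # (22)
    f1 = 2 * ppv * sen / (ppv + sen)                      # (23)
    return dict(accuracy=acc, sensitivity=sen, specificity=spe, precision=ppv, f1=f1)

def cross_task(build_model, X_shape, y_shape, X_color, y_color, epochs=10):
    """Step 1: train on shape, test on color. Step 2: train on color, test on shape. Average both."""
    results = []
    for (X_tr, y_tr), (X_te, y_te) in [((X_shape, y_shape), (X_color, y_color)),
                                       ((X_color, y_color), (X_shape, y_shape))]:
        model = build_model()                             # assumed to return a compiled Keras model
        model.fit(X_tr, y_tr, epochs=epochs, validation_split=0.2, verbose=0)
        y_pred = np.argmax(model.predict(X_te, verbose=0), axis=1)   # softmax output -> class index
        results.append(metrics(y_te, y_pred))
    return {k: float(np.mean([r[k] for r in results])) for k in results[0]}
```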
Table 1: Within-task classification accuracy for various segment durations (with 50% overlap of the mentioned duration) for images formed with only the real component of the Gabor function. Acc., Accuracy; Sen., Sensitivity; Spe., Specificity (bold face values indicate the highest value of the evaluation metrics, obtained for the 200 ms segment duration)
50 ms 100 ms 200 ms 250 ms
Acc. Spe. Sen. Acc. Spe. Sen. Acc. Spe. Sen. Acc. Spe. Sen.
50 ms 100 ms 200 ms 250 ms
Acc. Spe. Sen. Acc. Spe. Sen. Acc. Spe. Sen. Acc. Spe. Sen.
Shape 93.9 92.7 95.1 94.6 93.3 95.9 96.9 96.5 97.3 94.8 94.0 95.6
Color 92.1 94.5 89.7 92.8 94.7 90.9 94.9 92.8 97.0 93.2 95.0 91.4
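For reference, the segment counts underlying Table 1 and the image totals quoted above follow directly from the window and overlap lengths; a quick check (assuming this is how the segments are counted):

```python
def n_segments(total_ms=1000, window_ms=200, overlap_ms=100):
    """Number of overlapping windows that fit in the 1 s epoch."""
    hop = window_ms - overlap_ms
    return (total_ms - window_ms) // hop + 1

for w in (50, 100, 200, 250):
    print(w, "ms ->", n_segments(window_ms=w, overlap_ms=w // 2), "segments per trial")

# 200 ms windows with 100 ms overlap give 9 segments, hence
print(44 * 10 * n_segments())   # 3960 images per load level per task
```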
Figure 9: Model convergence for all five validation folds for color and shape training. The model converges in around ten epochs, with 100% accuracy
Figure 11: Cross-task classification accuracy comparison with traditional classifiers

Table 2: Parameters and input image sizes for the pre-trained networks and the proposed network
Model            Approximate Parameters   Input Image Size
AlexNet          62 M                     227 × 227 × 3
VGG16            138 M                    224 × 224 × 3
VGG19            143 M                    224 × 224 × 3
ResNet18         11.5 M                   224 × 224 × 3
ResNet50         26 M                     224 × 224 × 3
Inception-v3     23.8 M                   299 × 299 × 3
Proposed CARNN   2.9 M                    224 × 224 × 1

block and overfitting of the network. The performance of a very deep network on EEG data is not very promising; as the network goes deeper, the performance degrades [49]. ResNet50 shows degraded performance because of its greater depth despite the presence of residual blocks. ResNet18 shows good performance, but even there the degradation in performance might be due to depth. This depth issue is also verified empirically in the proposed work, which is discussed in the subsequent section. Inception-v3 is a wider model with multiple filters of different sizes at the same level. The performance of this model is good compared to the previously mentioned pre-trained networks, but Inception-v3 is computationally very expensive and requires around 23.8 M parameters, whereas the proposed model requires 2.9 M parameters. The computed values of all the metrics are mentioned in Table 3 for all the models, along with the average of both steps.

The cross-task challenge for cognitive workload classification has been explored by different existing studies. This part presents the comparison of the proposed model with existing studies from quantitative and qualitative standpoints, as the studies differ in their experimental tasks and implementation details. Some of the quantifiable attributes for cross-task research are discussed in Table 4. In [19], ANNs are trained on short-duration EEG data where participants engaged in task performance with no prior exposure to that task; the observed results could not even meet the general probability limit. In [17], as well, the obtained cross-task results are not significant over the chance level for working memory tasks using an SVM classifier. The recursive feature elimination algorithm is employed in [18] to identify a subset of features for cross-task; the classification accuracy using n-back tasks deteriorates. The cross-task problem is dealt with using a functional connectivity network in [5], which utilizes common discriminative features from the tasks, and the SVM classifier provides a better classification accuracy of 87%. From this work it can be deduced that the methods used for feature selection play an important role along with a good classifier, which makes generalization of the model difficult. Deep learning architectures, with their great success in object recognition, have recently been exploited for mental workload classification. In [29], MLPNN along with a TCN autoencoder extracts frequency- and time-domain-based EEG features. The work concludes that a vigilance decrement can be classified using EEG for an unseen individual and an unseen task.
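For context, pre-trained baselines such as those in Tables 2 and 3 are typically adapted through transfer learning (cf. [8,9]). A minimal Keras sketch for one of them is shown below, assuming ImageNet weights, a frozen backbone, and the single-channel Gabor images replicated to three channels to match the expected 224 × 224 × 3 input; input preprocessing and fine-tuning details are omitted, so this is an illustrative baseline setup rather than the authors’ exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def resnet50_baseline(n_classes=2):
    """ResNet50 backbone (224 x 224 x 3, Table 2) with a new two-class head for the load levels."""
    base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                          input_shape=(224, 224, 3), pooling="avg")
    base.trainable = False                              # freeze the backbone; fine-tune later if needed
    gray = layers.Input(shape=(224, 224, 1))            # Gabor decomposed data image
    rgb = layers.Concatenate()([gray, gray, gray])      # replicate the single channel to three
    out = layers.Dense(n_classes, activation="softmax")(base(rgb))
    return tf.keras.Model(gray, out, name="resnet50_baseline")
```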
Table 3: Comparison of proposed model with other deep structures for cross-task classification with 200 ms segment duration
(100 ms overlap). Acc., Accuracy; Sen., Sensitivity; Spe., Specificity; Pre., Precision; F1Sco., F1-Score
                Train: Shape, Test: Color            Train: Color, Test: Shape            Average
Model           Acc.  Sen.  Spe.  Pre.  F1Sco.       Acc.  Sen.  Spe.  Pre.  F1Sco.       Acc.  Sen.  Spe.  Pre.  F1Sco.
AlexNet 67.5 60.0 75.0 70.6 64.9 65.0 70.0 60.0 63.6 66.7 66.3 65.0 67.5 67.1 65.8
VGG19 72.0 77.0 67.0 70.0 73.3 74.0 72.0 76.0 75.0 73.5 73.0 74.5 71.5 72.5 73.4
VGG16 75.0 78.0 72.0 73.6 75.7 71.0 66.3 75.7 73.3 67.5 73.0 72.2 73.9 73.3 71.6
ResNet18 92.1 93.1 91.1 91.2 92.1 89.0 85.0 93.0 92.4 88.5 90.6 89.1 92.1 91.8 90.3
ResNet50 85.2 88.4 81.8 83.0 84.4 81.0 76.0 86.0 84.4 80.0 83.1 82.2 83.9 83.7 82.2
Inception-v3 92.1 93.2 91.0 91.2 92.1 88.0 90.0 86.0 86.5 88.2 90.1 91.6 88.5 88.9 90.2
CRNN 92.0 92.0 92.0 92.0 92.0 91.6 91.1 92.1 91.9 91.5 91.8 91.6 92.5 92.0 91.8
CARNN 94.6 93.0 93.2 96.2 94.7 93.8 92.0 92.3 95.6 93.9 94.2 92.5 92.8 95.9 94.3
In [32], the robust EEG features are learned from the topographic maps and event-related spectral perturbation with a 3DCNN, followed by temporal feature learning through an RNN. The work reported a cross-task classification accuracy of 88.9%, which is higher than that obtained with traditional methods. In our previous work [33], temporal modeling is carried out on the same dataset as a pre-processing stage. The proposed work instead forms Gabor decomposed images from the pre-processed data, which are beneficial for automatic spatial feature extraction from the distributed branch of CNN networks. Thus, instead of modulating the filters of the network with the Gabor function, Gabor decomposed data images are used as input, as some of the CNN layer kernels are similar to Gabor filters. The RNN learns temporal dependencies in the time series data, and the attention mechanism enhances the weights of the time instances holding crucial information. The result obtained for cross-task classification is promising, with an obtained accuracy of 94.2%. The high performance of the proposed method can be attributed to the Gabor filtered data images having spatial frequency information with various orientations, and to the capability of CARNN to learn the spatial and temporal features for the classification of cognitive workload efficiently.

Table 4: Comparison of the proposed work with existing studies for cross-task analysis. Par., Participant; CV, Cross-validation. (Columns: Method; Par.; Signal Duration; Impl. Details; Acc.; Sen.; Spe.)

7. DISCUSSION
being 79.8%, with around double the number of parameters for the network. Further, the attention mechanism is added to Model 1, referred to as Model 2, and it results in improved accuracy. The residual skip identity connection has recently proved its importance in various vision research studies. So, three different types of residual blocks, namely R1, R2, and R3, as depicted in Figure 5 (B), (C), and (D), respectively, are tested for the EEG dataset. The CNN structure is then extended by incorporating R1, R2, and R3 individually. The classification accuracy results for these three types of residual blocks are mentioned in Table 6, referred to as Model 3, Model 4, and Model 5.

8. CONCLUSION

The proposed method explores cognitive workload classification for cross-task using the CARNN model. Distributed CNNs are used to learn the spatial features of the data, which benefits from the usage of Gabor filtered data images in the beginning layers of the network. The automatic learning of features from the temporal domain by the RNN proves to be very effective when supported by an attention mechanism, which contributes to the improvement of the performance of the model. A small duration (1 s) of EEG data from the onset of the stimulus is sufficient to identify discriminative features for cognitive load classification. The proposed method generalizes the cross-task classification framework with the usage of EEG data without any noise removal, Gabor featured images, and the deep CARNN structure. A classification accuracy of 94.2% is obtained for cross-task, where the model trained on one task can efficiently tackle another task for classification. An extensive study of the Gabor filter with multiple scales and more orientations can be carried out in the future to gain better insight into multilevel cognitive workload estimation and the optimization of electrodes for real-time processing.

REFERENCES

1. J. Sweller, “Cognitive load theory,” Psychol. Learn. Motiv., Vol. 55, pp. 37–76, 2011. Elsevier. DOI:10.1016/B978-0-12-387691-1.00002-8

2. P. Antonenko, F. Paas, R. Grabner, and T. Van Gog, “Using electroencephalography to measure cognitive load,” Educ. Psychol. Rev., Vol. 22, no. 4, pp. 425–438, 2010. DOI:10.1007/s10648-010-9130-y

3. Y. Ke, et al., “An EEG-based mental workload estimator trained on working memory task can work well under simulated multi-attribute task,” Front. Hum. Neurosci., Vol. 8, no. 703, pp. 1–10, 2014. DOI:10.3389/fnhum.2014.00703
4. J. Touryan, B. J. Lance, S. E. Kerick, A. J. Ries, and K. McDowell, “Common EEG features for behavioral estimation in disparate, real-world tasks,” Biol. Psychol., Vol. 114, pp. 93–107, 2016. DOI:10.1016/j.biopsycho.2015.12.009

5. G. N. Dimitrakopoulos, I. Kakkos, Z. Dai, J. Lim, J. J. de Souza, A. Bezerianos, and Y. Sun, “Task-independent mental workload classification based upon common multiband EEG cortical connectivity,” IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 25, no. 11, pp. 1940–1949, 2017. DOI:10.1109/TNSRE.2017.2701002

6. R. G. Hefron, and B. J. Borghetti, “A new feature for cross-day psychophysiological workload estimation,” in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016, pp. 785–790. IEEE.

7. J. Zhang, Y. Wang, and S. Li, “Cross-subject mental workload classification using kernel spectral regression and transfer learning techniques,” Cogn. Technol. Work, Vol. 19, no. 4, pp. 587–605, 2017. DOI:10.1007/s10111-017-0425-3

8. R. Sharma, R. B. Pachori, and P. Sircar, “Automated emotion recognition based on higher order statistics and deep learning algorithm,” Biomed. Signal. Process. Control., Vol. 58, pp. 101867, 2020. DOI:10.1016/j.bspc.2020.101867

9. A. Singhal, R. Shukla, P. K. Kankar, S. Dubey, S. Singh, and R. B. Pachori, “Comparing the capabilities of transfer learning models to detect skin lesion in humans,” Proceedings of the Institution of Mechanical Engineers, Part H: Journal of Engineering in Medicine, Vol. 234, no. 10, pp. 1083–1093, 2020. DOI:10.1177/0954411920939829

10. N. Cudlenco, N. Popescu, and M. Leordeanu, “Reading into the mind’s eye: boosting automatic visual recognition with EEG signals,” Neurocomputing, Vol. 386, pp. 281–292, 2020. DOI:10.1016/j.neucom.2019.12.076

11. M. Abadi, M. Khoudeir, and S. Marchand, “Gabor filter-based texture features to archaeological ceramic materials characterization,” in International Conference on Image and Signal Processing, 2012, pp. 333–342. Springer.

12. Y. Hamamoto, S. Uchimura, M. Watanabe, T. Yasuda, Y. Mitani, and S. Tomita, “A Gabor filter-based method for recognizing handwritten numerals,” Pattern Recognit., Vol. 31, no. 4, pp. 395–400, 1998. DOI:10.1016/S0031-3203(97)00057-5

13. J. Arrospide, and L. Salgado, “Log-Gabor filters for image-based vehicle verification,” IEEE Trans. Image Process., Vol. 22, no. 6, pp. 2286–2295, 2013. DOI:10.1109/TIP.2013.2249080

14. M. Jiménez-Guarneros, and P. Gomez-Gil, “Custom domain adaptation: a new method for cross-subject, EEG-based cognitive load recognition,” IEEE Signal Process. Lett., Vol. 27, pp. 750–754, 2020. DOI:10.1109/LSP.2020.2989663

15. Ç. İ. Acı, M. Kaya, and Y. Mishchenko, “Distinguishing mental attention states of humans via an EEG-based passive BCI using machine learning methods,” Expert. Syst. Appl., Vol. 134, pp. 153–166, 2019. DOI:10.1016/j.eswa.2019.05.057

16. S. Simpraga, R. Alvarez-Jimenez, H. D. Mansvelder, J. M. Van Gerven, G. J. Groeneveld, S.-S. Poil, and K. Linkenkaer-Hansen, “EEG machine learning for accurate detection of cholinergic intervention and Alzheimer’s disease,” Sci. Rep., Vol. 7, no. 1, pp. 1–11, 2017. DOI:10.1038/s41598-017-06165-4

17. C. Walter, S. Schmidt, W. Rosenstiel, P. Gerjets, and M. Bogdan, “Using cross-task classification for classifying workload levels in complex learning tasks,” in 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013, pp. 876–881. IEEE.

18. Y. Ke, et al., “Towards an effective cross-task mental workload recognition model using electroencephalography based on feature selection and support vector machine regression,” Int. J. Psychophysiol., Vol. 98, no. 2, pp. 157–166, 2015. DOI:10.1016/j.ijpsycho.2015.10.004

19. C. L. Baldwin, and B. N. Penaranda, “Adaptive training using an artificial neural network and EEG metrics for within- and cross-task workload classification,” NeuroImage, Vol. 59, no. 1, pp. 48–56, 2012. DOI:10.1016/j.neuroimage.2011.07.047

20. Y. Zhou, et al., “Cross-task cognitive workload recognition based on EEG and domain adaptation,” IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 30, pp. 50–60, 2022. DOI:10.1109/TNSRE.2022.3140456

21. W. L. Lim, O. Sourina, and L. Wang, “Cross dataset workload classification using encoded wavelet decomposition features,” in 2018 International Conference on Cyberworlds (CW), 2018. IEEE.

22. I. Kakkos, et al., “EEG fingerprints of task-independent mental workload discrimination,” IEEE J. Biomed. Health Inform., Vol. 25, no. 10, pp. 3824–3833, 2021. DOI:10.1109/JBHI.2021.3085131

23. Y. Ke, et al., “Cross-task consistency of EEG-based mental workload indicators: comparisons between power spectral density and task-irrelevant auditory event-related potentials,” Front. Neurosci., Vol. 15, pp. 1–14, 2021. DOI:10.3389/fnins.2021.703139

24. P. Gerjets, C. Walter, W. Rosenstiel, M. Bogdan, and T. O. Zander, “Cognitive state monitoring and the design of adaptive instruction in digital environments: lessons learned from cognitive workload assessment using a passive brain-computer interface approach,” Front. Neurosci., Vol. 8, pp. 385, 2014. DOI:10.3389/fnins.2014.00385

25. G. Zhao, Y.-J. Liu, and Y. Shi, “Real-time assessment of the cross-task mental workload using physiological measures during anomaly detection,” IEEE Transactions on Human-Machine Systems, Vol. 48, no. 2, pp. 149–160, 2018. DOI:10.1109/THMS.2018.2803025
26. P. Zhang, X. Wang, J. Chen, W. You, and W. Zhang, “Spectral and temporal feature learning with two-stream neural networks for mental workload assessment,” IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 27, no. 6, pp. 1149–1159, 2019. DOI:10.1109/TNSRE.2019.2913400

27. S. Madhavan, R. K. Tripathy, and R. B. Pachori, “Time-frequency domain deep convolutional neural network for the classification of focal and non-focal EEG signals,” IEEE Sensors J., Vol. 20, no. 6, pp. 3078–3086, 2020. DOI:10.1109/JSEN.2019.2956072

28. Z. Jiao, X. Gao, Y. Wang, J. Li, and H. Xu, “Deep convolutional neural networks for mental load classification based on EEG data,” Pattern Recognit., Vol. 76, pp. 582–595, 2018. DOI:10.1016/j.patcog.2017.12.002

29. A. Kamrud, B. Borghetti, C. S. Kabban, and M. Miller, “Generalized deep learning EEG models for cross-participant and cross-task detection of the vigilance decrement in sustained attention tasks,” Sensors, Vol. 21, no. 16, pp. 1–24, 2021, Article no. 5617. DOI:10.3390/s21165617

30. L. Xu, M. Xu, Y. Ke, X. An, S. Liu, and D. Ming, “Cross-dataset variability problem in EEG decoding with deep learning,” Front. Hum. Neurosci., Vol. 14, pp. 103, 2020. DOI:10.3389/fnhum.2020.00103

31. R. Hefron, B. Borghetti, C. Schubert Kabban, J. Christensen, and J. Estepp, “Cross-participant EEG-based assessment of cognitive workload using multi-path convolutional recurrent neural networks,” Sensors, Vol. 18, no. 5, pp. 1339, 2018. DOI:10.3390/s18051339

32. P. Zhang, X. Wang, W. Zhang, and J. Chen, “Learning spatial–spectral–temporal EEG features with recurrent 3D convolutional neural networks for cross-task mental workload assessment,” IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 27, no. 1, pp. 31–42, 2019. DOI:10.1109/TNSRE.2018.2884641

33. S. S. Gupta, T. J. Taori, M. Y. Ladekar, R. R. Manthalkar, S. S. Gajre, and Y. V. Joshi, “Classification of cross task cognitive workload using deep recurrent network with modelling of temporal dynamics,” Biomed. Signal. Process. Control., Vol. 70, pp. 103070, 2021. DOI:10.1016/j.bspc.2021.103070

34. A. Alekseev, and A. Bobe, “GaborNet: Gabor filters with learnable parameters in deep convolutional neural network,” in 2019 International Conference on Engineering and Telecommunication (EnT), 2019, pp. 1–4. IEEE.

35. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Adv. Neural. Inf. Process. Syst., Vol. 25, pp. 1097–1105, 2012.

36. K. Simonyan, and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.

37. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

38. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.

39. S. Luan, C. Chen, B. Zhang, J. Han, and J. Liu, “Gabor convolutional networks,” IEEE Trans. Image Process., Vol. 27, no. 9, pp. 4357–4366, 2018. DOI:10.1109/TIP.2018.2835143

40. T. Sunil Kumar, and V. Kanhangad, “Gabor filter-based one-dimensional local phase descriptors for obstructive sleep apnea detection using single-lead ECG,” IEEE Sensors Letters, Vol. 2, no. 1, pp. 1–4, 2018.

41. K. Samiee, P. Kovács, and M. Gabbouj, “Epileptic seizure detection in long-term EEG records using sparse rational decomposition and local Gabor binary patterns feature extraction,” Knowl. Based. Syst., Vol. 118, pp. 228–240, 2017. DOI:10.1016/j.knosys.2016.11.023

42. S. S. Gupta, and R. R. Manthalkar, “Classification of visual cognitive workload using analytic wavelet transform,” Biomed. Signal. Process. Control., Vol. 61, pp. 101961, 2020. DOI:10.1016/j.bspc.2020.101961

43. M. Y. Ladekar, S. S. Gupta, Y. V. Joshi, and R. R. Manthalkar, “EEG based visual cognitive workload analysis using multirate IIR filters,” Biomed. Signal. Process. Control., Vol. 68, pp. 102819, 2021. DOI:10.1016/j.bspc.2021.102819

44. J. Sola, and J. Sevilla, “Importance of input data normalization for the application of neural networks to complex industrial problems,” IEEE Trans. Nucl. Sci., Vol. 44, no. 3, pp. 1464–1468, 1997. DOI:10.1109/23.589532

45. Q. Wang, and O. Sourina, “Real-time mental arithmetic task recognition from EEG signals,” IEEE Trans. Neural Syst. Rehabil. Eng., Vol. 21, no. 2, pp. 225–232, 2013. DOI:10.1109/TNSRE.2012.2236576

46. S. Molaei, and M. E. Shiri Ahmad Abadi, “Maintaining filter structure: A Gabor-based convolutional neural network for image analysis,” Appl. Soft. Comput., Vol. 88, pp. 105960, 2020. DOI:10.1016/j.asoc.2019.105960

47. M. Hardt, and T. Ma, “Identity matters in deep learning,” arXiv preprint arXiv:1611.04231, 2016.

48. J. V. Dillon, et al., “TensorFlow distributions,” arXiv preprint arXiv:1711.10604, 2017.

49. A. Craik, Y. He, and J. L. Contreras-Vidal, “Deep learning for electroencephalogram (EEG) classification tasks: a review,” J. Neural Eng., Vol. 16, no. 3, pp. 031001, 2019. DOI:10.1088/1741-2552/ab0ab5