Real time detection of cognitive load using fNIRS: A deep learning approach✩
Subashis Karmakar a , Supreeti Kamilya a , Prasenjit Dey c , Parag K. Guhathakurta a ,
Mamata Dalui a , Tushar Kanti Bera b , Suman Halder b , Chiranjib Koley b ,∗, Tandra Pal a ,
Anupam Basu a
a Department of Computer Science and Engineering, National Institute of Technology Durgapur, West Bengal, India
b Department of Electrical Engineering, National Institute of Technology Durgapur, West Bengal, India
c Department of Computer Science and Engineering, Cooch Behar Government Engineering College, Cooch Behar, West Bengal, India
Keywords:
Functional near infrared spectroscopy (fNIRS)
Oxygenated hemoglobin (HbO)
Deoxygenated hemoglobin (HbR)
Cognitive load
Mental arithmetic (MA)
Baseline task (BL)

Functional near infrared spectroscopy (fNIRS) is a non-invasive tool for monitoring functional brain activation that records changes in oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR) concentrations. fNIRS is well accepted in the cognitive study where the signals are intended to measure cognitive load in the human brain. Concentration changes in HbO and HbR help in classifying the cognitive states of human brain. There are several machine learning classification techniques to distinguish different cognitive states. Some conventional machine learning methods, which are easier to implement, undergo a complex processing phase before training the network and also suffer from low accuracy due to inappropriate data preprocessing. Deep learning based convolutional neural network (CNN) having automatic feature engineering capability plays a very important role in efficiently classifying different cognitive states. The present work uses two open-access datasets on fNIRS signal. The datasets are taken for two cognitive states: mental task (MT) and resting state or baseline task (BL). The concentration changes of HbO and HbR are computed using the modified Beer–Lambert law. The band-pass filter is used to remove additional noise from the signals. Here, topographical brain images are generated from the data of 2 s window with 1 s overlapping for both HbO and HbR. Global normalization is applied to the filtered data for better visualization of the images. The brain images are fed to the proposed CNN model in order to classify them into MT or BL. The accuracy of the classification and the comparative study shows the superiority of the proposed model over two existing models.
✩ This work is supported by Ministry of Electronics & Information Technology (MeitY), Government of India (Sanctioned number: 4(16)/2019-ITEA).
∗ Corresponding author.
https://doi.org/10.1016/j.bspc.2022.104227
Received 28 April 2022; Received in revised form 2 September 2022; Accepted 18 September 2022
Available online 17 October 2022
1746-8094/© 2022 Elsevier Ltd. All rights reserved.
S. Karmakar et al. Biomedical Signal Processing and Control 80 (2023) 104227

1. Introduction

For the last three decades, researchers have been working on various aspects of neuroimaging in order to diagnose diseases or measure cognitive load due to different activities in the human brain. fNIRS is one of the neuroimaging techniques that is used for functional neuroimaging [1,2]. It is a non-invasive measurement of functional brain activation that records changes in HbO and HbR concentrations. The term 'functional' signifies that many images of the brain are taken over the course of time to identify the changes in the activity of the brain. The main advantage of fNIRS compared to other technologies like positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) is its compact measurement, which reduces strain on the subjects [3]. Moreover, fNIRS is broadly adopted in the research world due to its lower sensitivity to motion artifacts, cost-effectiveness, and portability. fNIRS is well accepted in the cognitive study where the signals are intended to measure cognitive load in the human brain during a mental task. The signals can identify visuospatial working memory (WM) associated with brain activation functionality in the brain's prefrontal cortex [4]. The cognitive load of a subject depends not only on the task complexity but also on the subject's expertise for a given task [5]. If a subject gains skill and learns to perform a task automatically, the cognitive load of the subject decreases [6,7]. High cognitive load occurs in a subject while performing a highly complex task without having skill on that task. High cognitive load in a person may have a negative impact on learning ability or task completion as the information storage
capacity in short-term memory, or working memory, is limited. It is important to properly identify the load in a person during a cognitive state. An interesting study is done in [8] where one group of subjects are given mindfulness training through meditation. Mindfulness is a state of mind that signifies consciousness in the current moment. The training results in a significant improvement of the level of cognitive load with respect to another group of subjects. A number of studies have been done on measuring cognitive workload [9–11]. The growing interest among the researchers also leads them to analyze electroencephalogram (EEG) based measurement of cognitive load [5,12,13]. To effectively classify cognitive load, optimal selection of EEG electrodes with the most relevant features plays a significant role [14,15]. Some works on the fNIRS-based study are reported in [16–18], where the workloads during the mental task are evaluated. To analyze the brain network in fNIRS, the authors of [19] have proposed a dynamic weighted graph and calculated the 'small-world' network by improving the spectral performance. Real-time assessment of cognitive workload with the help of EEG or fNIRS is an essential step in diagnosing different disabilities of brain activities such as dyslexia, visual motor deficit, and attention-deficit hyperactivity disorder (ADHD). EEG and fNIRS are two commonly used techniques to diagnose ADHD. H. Shibasaki [20] showed that EEG and fNIRS complement one another in terms of spatial and temporal resolution. The advantage of fNIRS over EEG is its higher spatial resolution; hence, it is more befitting to optimize brain imaging techniques [21]. It is noted that HbO and HbR are negatively correlated [22], i.e., if the concentration of HbO is increased during a cognitive load, the concentration of HbR is decreased. Hence, HbO and HbR concentrations help in classifying the cognitive states of a human being. Paper [23] combines EEG and fNIRS to get better accuracy in the diagnosis. Many works use such brain–computer interface (BCI) systems in motor imagery tasks [24,25]. To classify data obtained from the BCI system, several machine learning approaches, such as linear discriminant analysis (LDA), support vector machine (SVM), k-nearest neighbor (kNN), and decision tree are considered in some studies [26,27]. Some studies show such classifications using deep learning techniques [28–31]. In [28], the authors have shown that deep learning-based models provide remarkably better performance compared to the conventional machine learning algorithms for analyzing cognitive load.

CNN is one of the most spectacular forms of deep learning that is used to solve different image-driven pattern recognition tasks. CNN is also applied in several fields like natural language processing, voice recognition, and BCI. Early detection of mild cognitive impairment (MCI) can control and prevent Alzheimer's disease (AD). The study in [31] shows that deep learning based CNN on fNIRS data can efficiently distinguish MCI patients and healthy subjects, also termed as healthy controls (HCs), while they were performing three mental tasks: N-back, Stroop and verbal fluency (VF). The architecture of the proposed CNN model in [31] contains convolutional, ReLU, maxpooling, dropout (25%), fully connected and softmax layers. The input image size is set to 200 × 200. The number of kernels, the kernel size, the size of the pooling area and the value of the stride are, respectively, 8, 4 × 4, 2 × 2 and 1. Categorical cross-entropy is used as a loss function and Adam optimization is used to select the adaptive learning rate and the parameters during gradient descent. The authors have considered five-fold cross-validation to address the overfitting problem which may arise due to the limited number of samples. The average accuracies for the N-back, Stroop and VFT tasks are, respectively, 89.46%, 87.80% and 90.37%. In [32], it has been shown that CNN and hybrid-CNN (h-CNN) provide better accuracy in case of EEG Motor Imagery (MI) tasks. Recurrent CNN is used in [29] to model cognitive events from EEG data. The authors of [33] have done interesting work on automated classification of mental arithmetic (MA) tasks, i.e., good MA calculation vs. bad MA calculation and before MA calculation vs. after MA calculation, using a recurrent neural network where several entropy features such as approximate entropy, sample entropy, permutation entropy, dispersion entropy and slope entropy from each channel of the EEG are taken as input to the classifier. B. Lakshmi Priya et al. [34] show improved results when Empirical Wavelet Transform (EWT) combined with the wavelet scattering coefficients is fed into a recurrent neural network (RNN) to analyze the brain wave pattern in response to a thought process and visual stimuli-based classifier. Processing raw fNIRS time series data for classification is a challenging task. Converting fNIRS time series data into images improves the accuracy of classification [35]. The authors of [36] use Gramian Angular Summation Field to convert time series data to images and further process it to classify tasks like MA, MI and idle state (IS) using CNN. The CNN model is then compared with sLDA (shrinkage Linear Discriminant Analysis), ANN, and Bi-LSTM (Bidirectional Long Short-Term Memory). sLDA is one of the preferable linear classification techniques in case the number of classes is more than two. Bi-LSTM is the process of making any neural network go forward and backwards in order to preserve both future and past information. Authors of [37] use a composite framework consisting of Grey Wolf Optimizer (GWO) and a deep neural network to estimate mental workload for two different tasks, i.e., no task and simultaneous capacity-based multitasking activity. They have also proposed Long Short-Term Memory (LSTM) and Bi-LSTM, which provide accuracies of 82.57% and 86.33%, respectively. The authors of [38] have performed three types of experimentations: (i) left-hand MI vs right-hand MI, (ii) MA vs BL and (iii) Motion Artifacts. For the first and second experimentations, sLDA provides promising results. In [39], the authors have considered 3 different types of cognitive tasks: (i) 0-back, 2-back and 3-back, (ii) discrimination/selection response and (iii) word generation. They have used sLDA, which shows promising results in classifying cognitive tasks and the rest. In [40], SVM as well as CNN are used on three datasets, available in [39], to classify 0-back, 2-back, 3-back and the rest. The CNN architecture proposed in [40] consists of convolutional, ReLU, maxpooling, dropout, fully connected and softmax layers. The input image size is set to 30 × 112. The number of kernels is 32 and the size of the kernel is 3 × 3. CNN yields 89% classification accuracy, while the SVM achieves only 82% accuracy.

The current work deals with two open-access fNIRS datasets, available in [38] and [39]. For the purpose of experimentation, we have considered fNIRS data for mental arithmetic (MA) and baseline (BL) tasks of 29 healthy subjects from [38] and fNIRS data for word generation (WG) and baseline (BL) tasks of 26 subjects from [39]. The topographical brain images are generated from the fNIRS time series data using BBCI toolbox [41]. We propose a CNN model to classify the topographical brain images into mental task (i.e., MA or WG) and baseline task. The proposed CNN model is compared with two existing CNN models [31,40]. It is observed from the performance analysis that the proposed model provides better accuracy compared to the two existing models.

The rest of the paper is organized as follows. Some preliminaries on fNIRS and CNN are discussed in Section 2. The proposed methodology, including the CNN model, is described in Section 3. The experimental results, including the comparative analysis, are provided in Section 4. Finally, conclusion is drawn in Section 5.

2. Preliminaries

2.1. Functional near infrared spectroscopy

fNIRS is a non-invasive neuroimaging method which relies on optical properties of cerebral blood flow to infer brain activity. fNIRS works on BOLD (blood-oxygen-level-dependent) response by detecting the changes in concentrations of HbO and HbR. Brain tissues absorb near-infrared lights of some particular wavelengths and scatter the other wavelengths. An oscillatory measurement of absorption and scattering of lights helps in non-invasive monitoring and quantifying the concentration changes of HbO and HbR in the cortex.

To measure the changes of HbO and HbR, optical density (say, $A$) is computed by using Beer–Lambert Law (BLL) as shown below.

\[ A = \log_{10}\left(\frac{I_{inc}}{I_{det}}\right) = \varepsilon \times C \times L, \tag{1} \]
where $I_{inc}$ is the incident light intensity, $I_{det}$ is the detected light intensity, $\varepsilon$ is the molar absorption coefficient (from spectral graph), $C$ is the concentration of substance in media and $L$ is the path length.

As the brain tissue does not perfectly transmit the light, we need to modify the BLL, shown above in (1), to account for scattering. The modified Beer–Lambert law (MBLL) [42,43] is given below in (2).

\[ A' = \log_{10}\left(\frac{I_{inc}}{I_{det}}\right) = (\varepsilon \times C \times DPF) + G \tag{2} \]

Here, $G$ denotes the geometry dependent factor (i.e., intensity lost due to scattering) and $DPF$ is the differential path length factor. Adding $G$ in (2) acknowledges that more light is lost than the absorption described in (1) accounts for, owing to scattering, while multiplying by $DPF$ accounts for the longer path travelled by the scattered light.

2.2. Fundamentals of convolutional neural network (CNN)

CNN is one of the most popular deep learning technologies that show promising results. The application deals with image data, speech data, computer vision, natural language processing (NLP), etc. The basic operations of CNN can be divided into two parts: feature extraction and classification. CNN mainly comprises three layers: convolutional, pooling and fully connected [44–46]. A brief description of the layers is presented below.

The convolutional layer is one of the important parts of CNN, which focuses on learning features from the input. Filters or kernels are used for convolution operation over small patches of input and then the results are passed into a non-linear activation function like sigmoid, ReLU or tanh. The main idea of the pooling layer is down-sampling in order to reduce the computational load and the number of parameters. Max pooling and average pooling are two popular pooling methods used in CNN. A fully connected layer takes the output of the previous layers and flattens them into a single vector. Each node in a fully connected layer is directly connected to every node of the next layer.

Four standard metrics, accuracy, recall, precision and F1-score, are used to verify the performance of a CNN model for classifying the cognitive states. The metrics are defined as follows.

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{4} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{5} \]

\[ \text{F1-Score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}} \tag{6} \]

Here, TP, TN, FP and FN, respectively, represent true positive, true negative, false positive and false negative. In our model, TP signifies the total number of images of MT identified correctly, and TN is the total number of images of BL that are correctly identified. FP is the number of BL images wrongly classified as MT, and FN is the number of MT images wrongly classified as BL.

3. Proposed methodology

Detailed block diagram of the proposed methodology is shown in Fig. 1. The fNIRS signal is collected while subjects are performing MT followed by BL. Using MBLL, optical density is evaluated for HbO and HbR from the collected fNIRS signal. Further, signal data is preprocessed by applying filter and baseline correction followed by global normalization. From the normalized data, topographical brain images are generated for 2 s intervals of HbO and HbR. Images are then provided to the proposed CNN model to classify as MT or BL.

BBCI toolbox [41] is used to generate the topographical brain images. Using this toolbox, the signal data are firstly segmented with marker interval. Along with the segmented data and fNIRS channel positions, topographical images of the scalp are generated for all epoch intervals of each class (i.e., MT or BL) based on the previous marker. Here, the topographical brain images are generated for 2 s window with 1 s overlapping from the epoch interval between −10 s to 25 s (total time presented in the dataset includes both MT and BL).

The window overlapping makes it possible to predict mental tasks for every second. Baseline correction is performed by subtracting the average value over the interval from −5 s to −2 s. For each subject, there are 34 images for MT and 34 for BL in the case of HbO. Thus, 34 × 2 = 68 images are generated for HbR as well. So, for 29 subjects in 3 sessions, total number of images = 3 × 68 × 29 = 5916 (2958 HbO images and 2958 HbR images). The total number of images for an individual session is 68 × 29 = 1972 (986 images of HbO and 986 images of HbR). Next, we apply CNN on HbO and HbR images to classify MT and BL.

Proposed CNN Model: In this work, we apply CNN to the topographical images for both HbO and HbR. The aim of applying CNN is to classify the images into two states: MT and BL. The proposed CNN Model is shown in Fig. 2.

Input to the CNN is the RGB images with size 326 × 311 × 3. Preprocessing like rotation and translation helps to prevent the model from overfitting and from memorizing the exact details of the training images. So, the images are augmented by some resizing, translation and rotation before providing the images as input to the convolution layer. The convolution layer convolves the input by moving the filters along the input images vertically and horizontally and computing the dot product of the weights and the input and then adding a bias term. The filter size is gradually decreased to extract in-depth features from the image as well as to reduce the computational cost. If convolution operation is considered as an interpolation from a given pixel to a center pixel, we cannot interpolate to a center pixel by using an even-sized filter. In our model, it is decreased from 11 × 11 to 5 × 5 and then 3 × 3. We have not considered a filter of size 1 × 1 as it does not contain any information about the neighborhood pixels. The number of filters signifies the number of neurons in the convolutional layer that connect to the same region in the input. The abstract features of image data can be efficiently extracted if we increase the number of filters. So, the number of filters is increased from 16 to 48 and then 96. To batch-wise normalize the images, the batch normalization layer is added after
the input where any negative value, if it exists, is set to zero. The output obtained after the ReLU layer acts as the input to the max pooling layer with a pool size of 2 × 2 and stride size of 2. Max pooling layer down samples the input by dividing it into rectangular pooling regions of size 2 × 2 and computes the maximum value in each region. A fully connected layer with output size 2 implies a 2-class classification (MT and BL). A 5-fold cross-validation technique is used to validate the proposed CNN model.

Hyperparameters are the parameters whose values control the learning process and determine the values of model parameters that a model can learn during its training. Some of the hyperparameters used to train the proposed CNN model are shown in Table 1. Stochastic Gradient Descent with Momentum (SGDM) optimizer is used with a momentum of 0.9000, which helps to accelerate the gradient vectors in the right direction, thus leading to faster convergence. The initial learning rate is set to 0.0100 with a learning rate drop factor of 0.1000. L2 regularization is used to overcome the overfitting problem. After experimentations with several numbers of epochs, it is observed that at 50 epochs (50 × 25 = 1250 iterations), the model converges, as shown in Fig. 7(a). The mini-batch size is taken as 64 to train the model.

Table 1
Hyperparameter              Value
Optimizer                   Stochastic Gradient Descent with Momentum (SGDM)
Momentum                    0.9000
Activation function         ReLU
Initial learning rate       0.0100
Learning rate drop factor   0.1000
Regularization method       l2 regularization
Gradient threshold method   l2norm
Maximum epochs              50
Iterations per epoch        25
Mini-batch size             64

4. Experimentation

4.1. Dataset description

Here, we have considered two open-access datasets available in [38,39]. In this experimentation, we have considered them as Dataset I and Dataset II, respectively.

Dataset I [38]: In [38], the dataset is for hybrid BCIs that use EEG and fNIRS. The fNIRS data were collected with a 36-channel (14 sources and 16 detectors) NIRScout device at a 12.5 Hz sampling rate. It comprises three sets of data as follows. In our experimentation, we are interested in the detection of cognitive load during a mental task, so we consider dataset B.

• Dataset A: Three sessions of left and right-hand motor imagery (MI).
• Dataset B: Mental Arithmetic (MA) and Baseline Tasks (BL) (taking a rest without any thought)
• Dataset C: Motion Artifacts (non-physiological, eye blinking, head movement)

For this experimentation, 28 right-handed and 1 left-handed healthy subjects, i.e., a total of 29 subjects, are considered. Among them, the numbers of male and female candidates are respectively 14 and 15, and they do not have any neurological, psychiatric or other brain-related diseases. The average age of the subjects is 28.5, with a standard deviation of 3.7.

Fig. 3 shows a schematic sequence diagram of the experimental model of each subject, as given in [38]. Each subject is kept at rest for 1 min before the experiment starts. After that, the task starts with 2 s of a visual introduction of the task followed by a task period of 10 s and a resting period of 15 to 17 s after completion of the task. The given task has 20 repetitions (as shown in Fig. 3). There is a total of 6 sessions for each subject. Among them, sessions 1, 3, and 5 are for motor imagery (MI) and baseline tasks, and sessions 2, 4 and 6 are for MA and BL. As we are interested only in mental arithmetic and baseline tasks, we have considered the data that were taken during sessions 2, 4, and 6. The detailed description of data acquisition and environmental setup is given in [38].

Dataset II [39]: The dataset available in [39] consists of three types of experimentations, i.e., n-back (dataset A), discrimination/selection response (DSR) (dataset B), and word generation (WG) (dataset C) tasks on 26 healthy participants consisting of 9 males and 17 females. The average age of the subjects is 26.1, with a standard deviation of 3.5. For data acquisition, a 36-channel fNIRS system, consisting of 16 sources and 16 detectors with a sampling rate of 10.4 Hz, was used. Each of these three datasets consists of three sessions for each subject.
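As a quick check of the image counts quoted in Section 3 for Dataset I, the following sketch enumerates 2 s windows with a 1 s step over the −10 s to 25 s epoch and multiplies out the per-session and three-session totals. The helper name `window_starts` and the rule that only windows lying fully inside the epoch are kept are illustrative assumptions; they are not taken from the BBCI toolbox.

```python
def window_starts(epoch_start_s, epoch_end_s, window_s=2, step_s=1):
    """Return the start time (in seconds) of every full window in the epoch.

    Assumption: a window is kept only if it ends at or before epoch_end_s.
    """
    starts = []
    t = epoch_start_s
    while t + window_s <= epoch_end_s:
        starts.append(t)
        t += step_s
    return starts


# Epoch interval and window settings as stated in the text.
starts = window_starts(-10, 25, window_s=2, step_s=1)

images_per_class = len(starts)              # one topographical image per window
images_per_subject = 2 * images_per_class   # two classes: MT and BL
subjects, sessions = 29, 3                  # Dataset I, sessions 2, 4 and 6

per_session = images_per_subject * subjects
total = per_session * sessions

print(images_per_class, images_per_subject, per_session, total)
```

With these settings the first window starts at −10 s and the last at 23 s, which gives the 34 windows per class, 68 images per subject, 1972 images per session and 5916 images over three sessions stated in the text.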
Fig. 3. Schematic sequence diagram of the experimental model for each subject for a particular session for Dataset I.
Fig. 6. Topographical brain images (HbO) of Subject 1 of session 1 for Dataset II.
Fig. 7. (a) Comparison of training and validation accuracy over the number of epochs (b) Loss over the number of epochs.
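The curves in Fig. 7 come from training with SGDM using the settings in Table 1. As a rough illustration of how momentum and a learning-rate drop factor act together, the sketch below applies the standard momentum update to a toy one-dimensional quadratic loss. It is not the authors' training code: the function name `sgdm_minimize`, the toy loss, and the drop period of 20 epochs are assumptions made for illustration (the source gives the drop factor but not the drop period), and L2 regularization and gradient thresholding are left out.

```python
def sgdm_minimize(grad, w0, lr=0.01, momentum=0.9, drop_factor=0.1,
                  drop_every=20, epochs=50, iters_per_epoch=25):
    """Minimize a scalar function with SGD plus momentum.

    grad            -- function returning the gradient at w
    drop_every      -- ASSUMED number of epochs between learning-rate drops
    Returns the final parameter and the number of updates performed.
    """
    w, velocity, updates = w0, 0.0, 0
    for epoch in range(epochs):
        # Piecewise-constant schedule: scale lr by drop_factor every drop_every epochs.
        current_lr = lr * drop_factor ** (epoch // drop_every)
        for _ in range(iters_per_epoch):
            # Momentum update: keep a fraction of the previous step, add the new gradient step.
            velocity = momentum * velocity - current_lr * grad(w)
            w += velocity
            updates += 1
    return w, updates


# Toy loss L(w) = (w - 3)^2 with gradient 2 * (w - 3); the minimum is at w = 3.
w_final, n_updates = sgdm_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(n_updates, w_final)
```

With 50 epochs of 25 iterations each, the loop performs 50 × 25 = 1250 updates, and on this toy problem the parameter settles close to the minimizer at 3.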
Table 2
Performance comparison based on four metrics of the proposed CNN model with CNN Model I [31] and CNN Model II [40] on both Dataset I [38] and Dataset II [39].

                            | CNN Model I [31]                       | CNN Model II [40]                      | Proposed CNN Model
Dataset | Session   | Data  | Accuracy | Precision | Recall | F1 score | Accuracy | Precision | Recall | F1 score | Accuracy | Precision | Recall | F1 score
I [38]  | 2         | HbO   | 80.99    | 84.01     | 84.62  | 75.30    | 92.38    | 89.55     | 86.26  | 90.55    | 92.24    | 91.89     | 90.65  | 91.69
I [38]  | 2         | HbR   | 69.58    | 77.06     | 67.48  | 66.85    | 90.35    | 93.35     | 90.04  | 92.39    | 91.02    | 92.48     | 90.70  | 94.61
I [38]  | 4         | HbO   | 69.79    | 65.20     | 66.35  | 66.23    | 69.24    | 68.71     | 72.91  | 73.14    | 85.64    | 86.50     | 86.13  | 86.34
I [38]  | 4         | HbR   | 70.46    | 72.38     | 70.35  | 70.17    | 82.24    | 81.54     | 84.61  | 82.42    | 86.84    | 87.98     | 88.39  | 84.59
I [38]  | 6         | HbO   | 73.59    | 72.67     | 72.12  | 72.18    | 84.43    | 85.02     | 81.07  | 86.42    | 88.58    | 84.78     | 86.48  | 84.61
I [38]  | 6         | HbR   | 70.88    | 68.28     | 70.41  | 68.61    | 83.63    | 93.69     | 83.71  | 86.54    | 90.85    | 90.78     | 92.93  | 93.73
I [38]  | 2 + 4 + 6 | HbO   | 86.97    | 87.56     | 90.1   | 88.43    | 87.94    | 91.54     | 80.94  | 80.96    | 98.08    | 94.54     | 96.71  | 97.27
I [38]  | 2 + 4 + 6 | HbR   | 87.58    | 88.96     | 89.81  | 89.21    | 85.50    | 85.94     | 80.4   | 79.46    | 93.66    | 91.74     | 95.02  | 93.67
II [39] | 1 + 2 + 3 | HbO   | 77.15    | 77.33     | 77.01  | 75.25    | 72.84    | 77.29     | 73.05  | 75.27    | 97.01    | 97.61     | 96.46  | 97.12
II [39] | 1 + 2 + 3 | HbR   | 67.86    | 68.51     | 67.34  | 67.91    | 77.85    | 80.86     | 77.55  | 78.74    | 96.27    | 96.94     | 95.7   | 96.41
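The four metrics reported in Table 2 follow Eqs. (3)–(6). A minimal sketch of how they are obtained from confusion-matrix counts is given below, with MT treated as the positive class. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's experiments; the sketch also assumes that none of the denominators is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision and F1-score from raw counts.

    tp/fn count MT images (positive class), tn/fp count BL images.
    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)           # Eq. (3)
    recall = tp / (tp + fn)                              # Eq. (4)
    precision = tp / (tp + fp)                           # Eq. (5)
    f1 = 2 * (precision * recall) / (precision + recall)  # Eq. (6)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical counts for a 200-image validation fold (illustration only).
m = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}")
```

Note that when metrics are averaged over cross-validation folds, the averaged F1-score need not equal the value obtained by plugging the averaged precision and recall into Eq. (6).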
Fig. 8. Comparison of the proposed CNN model with CNN Model I and CNN Model II on both Dataset I and Dataset II based on classification accuracy.
Fig. 9. Comparison of the proposed CNN model with the CNN Model I and CNN Model II based on classification accuracy.
Fig. 10. Comparison of the proposed CNN model with sLDA on Dataset I and Dataset II based on classification accuracy.
(i.e., high variance in the outputs) or underfitting (i.e., high bias in the outputs).
• A large number of images overcomes the limitation of a small dataset, especially for the deep learning model.
• The generated topographical brain images for both the classes are distinguishable even to the unaided eye. It also helps the CNN model to classify the images accurately.

5. Conclusion

In this work, we have used fNIRS signal data from two open-access datasets to classify cognitive states among two classes: mental arithmetic (MA) or word generation (WG) and resting state or baseline task (BL). Topographical brain images are generated from the data for HbO as well as HbR. The proposed CNN model is used to classify them into MT (MA/WG) or BL. The results show the superiority of the proposed CNN model compared to two state-of-the-art CNN models for both datasets. However, we think that the duration of 15–17 s for the rest period is not enough for the signal to reach baseline, which may cause the presence of some cognitive load also in BL. The presence of noise in the data of some subjects also affects the classification accuracy. If the signal data of those subjects can be discarded, then the classification results may be improved. Moreover, due to the highly complex nature of the brain, it is quite difficult to properly differentiate brain states as either MT or BL. In the future, a fuzzy classifier can be used to deal with the imprecise nature of the cognitive states of the brain by classifying the MT as high, medium or low.

CRediT authorship contribution statement

Subashis Karmakar: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing. Supreeti Kamilya: Conception and design of study, Analysis and/or interpretation of data, Writing – original draft. Prasenjit Dey: Conception and design of study. Parag K. Guhathakurta: Writing – original draft. Mamata Dalui: Writing – original draft. Tushar Kanti Bera: Writing – original draft. Suman Halder: Writing – original draft. Chiranjib Koley: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – review & editing. Tandra Pal: Conception and design of study, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing. Anupam Basu: Conception and design of study.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Tandra Pal reports financial support was provided by Ministry of Electronics & Information Technology (MeitY), Government of India.

Data availability

The open-access datasets used in this paper are available in [38] and [39].

References

[1] A.C. Ehlis, S. Schneider, T. Dresler, A.J. Fallgatter, Application of functional near-infrared spectroscopy in psychiatry, NeuroImage 85 (2014) 478–488.
[4] J.S. Witmer, E.A. Aeschlimann, A.J. Metz, S.J. Troche, T.H. Rammsayer, Functional near-infrared spectroscopy recordings of visuospatial working memory processes. part II: A replication study in children on sensitivity and mental-ability-induced differences in functional activation, Brain Sci. 8 (8) (2018) 152.
[5] N. Friedman, T. Fekete, K. Gal, O. Shriki, EEG-based prediction of cognitive load in intelligence tests, Front. Human Neurosci. 13 (2019) 191.
[6] G. Borghini, P. Aricò, G. Di Flumeri, G. Cartocci, A. Colosimo, S. Bonelli, A. Golfetti, J.P. Imbert, G. Granger, R. Benhacene, et al., EEG-based cognitive control behaviour assessment: an ecological study with professional air traffic controllers, Sci. Rep. 7 (1) (2017) 1–16.
[7] G.D. Logan, Skill and automaticity: Relations, implications, and future directions, Can. J. Psychol./Revue Can. Psychol. 39 (2) (1985) 367.
[8] S.S. Gupta, R.R. Manthalkar, S.S. Gajre, Mindfulness intervention for improving cognitive abilities using EEG signal, Biomed. Signal Process. Control 70 (2021) 103072.
[9] F.A. Fishburn, M.E. Norr, A.V. Medvedev, C.J. Vaidya, Sensitivity of fNIRS to cognitive state and load, Front. Human Neurosci. 8 (2014) 76.
[10] J. Sweller, Cognitive load during problem solving: Effects on learning, Cogn. Sci. 12 (2) (1988) 257–285.
[11] J. Sweller, Cognitive load theory, Psychol. Learn. Motiv. 55 (2011) 37–76.
[12] M.Y. Ladekar, S.S. Gupta, Y.V. Joshi, R.R. Manthalkar, EEG based visual cognitive workload analysis using multirate iir filters, Biomed. Signal Process. Control 68 (2021) 102819.
[13] I. Zyma, S. Tukaev, I. Seleznov, K. Kiyono, A. Popov, M. Chernykh, O. Shpenkov, Electroencephalograms during mental arithmetic task performance, Data 4 (1) (2019) 14.
[14] R. Lahiri, P. Rakshit, A. Konar, Evolutionary perspective for optimal selection of EEG electrodes and features, Biomed. Signal Process. Control 36 (2017) 113–137.
[15] A.A. Rai, M.K. Ahirwal, Electroencephalogram-based cognitive load classification during mental arithmetic task, Edge Anal. (2022) 479–487.
[16] Y. Hoshi, B.H. Tsou, V.A. Billock, M. Tanosaki, Y. Iguchi, M. Shimada, T. Shinba, Y. Yamada, I. Oda, Spatiotemporal characteristics of hemodynamic changes in the human lateral prefrontal cortex during working memory tasks, Neuroimage 20 (3) (2003) 1493–1504.
[17] T. Li, Q. Luo, H. Gong, Gender-specific hemodynamics in prefrontal cortex during a verbal working memory task by near-infrared spectroscopy, Behav. Brain Res. 209 (1) (2010) 148–153.
[18] J.S. Witmer, E.A. Aeschlimann, A.J. Metz, S.J. Troche, T.H. Rammsayer, The validity of functional near-infrared spectroscopy recordings of visuospatial working memory processes in humans, Brain Sci. 8 (4) (2018) 62.
[19] Y. Wang, X. Zhao, W. Zhou, C. Chen, W. Chen, Dynamic weighted small-world graphical network establishment for fNIRS time-varying brain function analysis, Biomed. Signal Process. Control 69 (2021) 102902.
[20] H. Shibasaki, Human brain mapping: hemodynamic response and electrophysiology, Clin. Neurophysiol. 119 (4) (2008) 731–743.
[21] S. Dong, J. Jeong, Onset classification in hemodynamic signals measured during three working memory tasks using wireless functional near-infrared spectroscopy, IEEE J. Sel. Top. Quantum Electron. 25 (1) (2018) 1–11.
[22] X. Cui, S. Bray, A.L. Reiss, Functional near infrared spectroscopy (fNIRS) signal improvement based on negative correlation between oxygenated and deoxygenated hemoglobin dynamics, Neuroimage 49 (4) (2010) 3039–3046.
[23] A. Güven, M. Altınkaynak, N. Dolu, M. Izzetoğlu, F. Pektaş, S. Özmen, E. Demirci, T. Batbat, Combining functional near-infrared spectroscopy and EEG measurements for the diagnosis of attention-deficit hyperactivity disorder, Neural Comput. Appl. (2019) 1–14.
[24] E.A. Mousavi, J.J. Maller, P.B. Fitzgerald, B.J. Lithgow, Wavelet common spatial pattern in asynchronous offline brain computer interfaces, Biomed. Signal Process. Control 6 (2) (2011) 121–128.
[25] Y.R. Tabar, U. Halici, A novel deep learning approach for classification of EEG motor imagery signals, J. Neural Eng. 14 (1) (2016) 16003.
[26] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273–297.
[27] A. Kübler, A. Furdea, S. Halder, E.M. Hammer, F. Nijboer, B. Kotchoubey, A brain–computer interface controlled auditory event-related potential (p300) spelling system for locked-in patients, Ann. New York Acad. Sci. 1157 (1) (2009) 90–100.
[28] A. Saha, V. Minz, S. Bonela, S. Sreeja, R. Chowdhury, D. Samanta, Classification of EEG signals for cognitive load estimation using deep learning architectures, in: International Conference on Intelligent Human Computer Interaction, 2018, pp. 59–68.
[29] P. Bashivan, I. Rish, M. Yeasin, N. Codella, Learning representations from EEG with deep recurrent-convolutional neural networks, in: 4th International
[2] M. Ferrari, I. Giannini, G. Sideri, E. Zanette, Continuous non invasive monitoring Conference on Learning Representation (Poster), 2016.
of human brain by near infrared spectroscopy, in: Oxygen Transport to Tissue [30] H. Yang, S. Sakhavi, K.K. Ang, C. Guan, On the use of convolutional neural
VII, 1985, pp. 873–882. networks and augmented csp features for multi-class motor imagery of EEG
[3] S. Tak, J.C. Ye, Statistical analysis of fnirs data: a comprehensive review, signals classification, in: 37th Annual International Conference of the IEEE
Neuroimage 85 (2014) 72–91. Engineering in Medicine and Biology Society, EMBC, 2015, pp. 2620–2623.
[31] D. Yang, R. Huang, S.H. Yoo, M.J. Shin, J.A. Yoon, Y.I. Shin, K.S. Hong, Detection of mild cognitive impairment using convolutional neural network: Temporal-feature maps of functional near-infrared spectroscopy, Front. Aging Neurosci. 12 (2020) 141.
[32] A. Al-Saegh, S.A. Dawwd, J.M. Abdul-Jabbar, Deep learning for motor imagery EEG-based classification: A review, Biomed. Signal Process. Control 63 (2021) 102172.
[33] A. Varshney, S.K. Ghosh, S. Padhy, R.K. Tripathy, U.R. Acharya, Automated classification of mental arithmetic tasks using recurrent neural network and entropy features obtained from multi-channel EEG signals, Electronics 10 (9) (2021) 1079.
[34] S. Jayalakshmy, J.K. Pragatheeswaran, D. Saraswathi, N. Poonguzhali, et al., Scattering convolutional network based predictive model for cognitive activity of brain using empirical wavelet decomposition, Biomed. Signal Process. Control 66 (2021) 102501.
[35] Z. Wang, T. Oates, Imaging time-series to improve classification and imputation, in: Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.
[36] S.D. Wickramaratne, M.S. Mahmud, A deep learning based ternary task classification system using Gramian angular summation field in fNIRS neuroimaging data, in: 2020 IEEE International Conference on E-Health Networking, Application & Services, HEALTHCOM, 2021, pp. 1–4.
[37] D.D. Chakladar, S. Dey, P.P. Roy, D.P. Dogra, EEG-based mental workload estimation using deep BLSTM-LSTM network and evolutionary algorithm, Biomed. Signal Process. Control 60 (2020) 101989.
[38] J. Shin, A. von Lühmann, B. Blankertz, D.W. Kim, J. Jeong, H.J. Hwang, K.R. Müller, Open access dataset for EEG+NIRS single-trial classification, IEEE Trans. Neural Syst. Rehabil. Eng. 25 (10) (2016) 1735–1745.
[39] J. Shin, A. von Lühmann, D.W. Kim, J. Mehnert, H.J. Hwang, K.R. Müller, Simultaneous acquisition of EEG and NIRS during cognitive tasks for an open access dataset, Sci. Data 5 (1) (2018) 1–16.
[40] M. Saadati, J. Nelson, H. Ayaz, Convolutional neural network for hybrid fNIRS-EEG mental workload classification, in: International Conference on Applied Human Factors and Ergonomics, 2019, pp. 221–232.
[41] B. Blankertz, L. Acqualagna, S. Dähne, S. Haufe, M. Schultze-Kraft, I. Sturm, M. Ušćumlić, M.A. Wenzel, G. Curio, K.R. Müller, The Berlin brain-computer interface: progress beyond communication and control, Front. Neurosci. 10 (2016) 530.
[42] D.T. Delpy, M. Cope, P. van der Zee, S. Arridge, S. Wray, J. Wyatt, Estimation of optical pathlength through tissue from direct time of flight measurement, Phys. Med. Biol. 33 (12) (1988) 1433–1442.
[43] W.B. Baker, A.B. Parthasarathy, D.R. Busch, R.C. Mesquita, J.H. Greenberg, A.G. Yodh, Modified Beer-Lambert law for blood flow, Biomed. Opt. Express 5 (11) (2014) 4053–4075.
[44] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86 (11) (1998) 2278–2324.
[45] S. Albawi, T.A. Mohammed, S. Al Zawi, Understanding of a convolutional neural network, in: International Conference on Engineering and Technology, ICET, 2017, pp. 1–6.
[46] Y. Li, Z. Hao, H. Lei, Survey of convolutional neural network, J. Comput. Appl. 36 (9) (2016) 2508–2515.
[47] H.-D. Nguyen, S.-H. Yoo, M.R. Bhutta, K.-S. Hong, Adaptive filtering of physiological noises in fNIRS data, Biomed. Eng. Online 17 (1) (2018) 1–23.
[48] M.D. Pfeifer, F. Scholkmann, R. Labruyère, Signal processing in functional near-infrared spectroscopy (fNIRS): Methodological differences lead to different statistical results, Front. Human Neurosci. 11 (2018) 641.
[49] M.H. Beale, M.T. Hagan, H.B. Demuth, Matlab Deep Learning Toolbox, The MathWorks, Inc., 2021, pp. 1–111.