6 authors, including:
Ileas Pramanik
City University of Hong Kong
Tarik Bin Shams1 , Md. Sakir Hossain1† , Md. Firoz Mahmud1 , Md. Shahariar Tehjib1 ,
Zahid Hossain1 , and Md. Ileas Pramanik2 , Non-members
Waveform | Range of Frequencies and Characteristics | State of Occurrence | State of Abnormality | Location
Delta | 0.5–4 Hz; highest amplitude [7] | Wakefulness, serious brain disorder, deep sleep [7] | Lesion, tumor, severe damage from stroke | Posteriorly in children and frontal in adults
Theta | 4–8 Hz; large amplitude | Deep meditation, emotional stress (disappointment or frustration), drowsiness | Brain lesion and head injuries | Frontal region
Alpha | 8–13 Hz; 30–50 𝜇V [7] | Wakeful and resting state, dominant when eyes closed [8] | Head injuries, attention problems, and depression | Posterior and central regions
Beta | 13–30 Hz; low in amplitude | Alertness, drug-taking, strongly engaged in problem-solving, focused on the outside world | Sleep disorder and lack of attention in problem-solving | Mostly frontal
Gamma | >30 Hz; small in amplitude | Cognitive and motor functions, long-term memory | Epilepsy, Alzheimer’s disease, schizophrenia, hallucinations | Somatosensory cortex
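As a rough illustration of how the band boundaries in the table above might be used when band-wise spectral power is extracted as a feature, the following Python sketch maps a frequency to its band and sums power per band. The half-open interval convention at the band edges, the names `EEG_BANDS`, `band_of`, and `band_powers`, and the (frequency, power) input format are all assumptions made for this example; none of them comes from the surveyed works.

```python
from collections import defaultdict

# Band edges in Hz, taken from the table above. Treating each band as the
# half-open interval [low, high) is an assumption made for this sketch.
EEG_BANDS = [
    ("delta", 0.5, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 13.0),
    ("beta", 13.0, 30.0),
    ("gamma", 30.0, float("inf")),
]


def band_of(freq_hz):
    """Return the band name for a frequency, or None if it is below 0.5 Hz."""
    for name, low, high in EEG_BANDS:
        if low <= freq_hz < high:
            return name
    return None


def band_powers(spectrum):
    """Sum power per band from an iterable of (frequency_hz, power) pairs."""
    totals = defaultdict(float)
    for freq_hz, power in spectrum:
        name = band_of(freq_hz)
        if name is not None:  # components below 0.5 Hz are ignored
            totals[name] += power
    return dict(totals)


# Hypothetical spectrum: (frequency in Hz, power in arbitrary units).
example_spectrum = [(2.0, 1.0), (10.0, 2.0), (11.0, 0.5), (40.0, 3.0)]
example_features = band_powers(example_spectrum)
```

In practice the spectrum would come from a Fourier or wavelet analysis of a recorded EEG channel; the hand-written pairs here only show the bookkeeping.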
to train a classifier. After the registered signal has been analyzed and classified with the help of a model, the user is identified through recognition or validated by authentication, depending on the form of the classifier. However, the classifier may need to be retrained several times if the precision does not reach a certain threshold. Retraining may also be required in the case of any change in the dataset. Historically, classifiers such as the support vector machine (SVM) and the 𝐾-nearest neighbors (KNN) were used for this purpose, but in recent years these have often been replaced by deep learning methods. A judgment is made on the basis of the classifier output. Finally, estimation is used to evaluate the performance of the authentication mechanism.

Recent research has identified biological signals, including brain waves, as a feasible means of reliable authentication. Since the brain's response to a stimulus has been shown to be specific to the individual, the EEG may be used to discriminate between individuals. The EEG authentication approach based on visual stimulation shows high verification precision and can be used with brain communication methods for further development.

Research into biometric authentication has made significant progress during the last few decades, with several papers published on the topic (see Table 2). In [10], the author reviewed the different features commonly used in biometric authentication, including face, gait, iris, voice, signature, lips, fingerprints, veins, DNA, and so on. Although various organs for capturing data have been clearly surveyed, the engineering and analysis techniques used to capture and analyze data from organs have not. A behavioral-based approach to authentication is considered in [11]. The benefit of this approach is that the data for user authentication can be collected anonymously without the subject noticing. An example is biometric authentication using keystrokes [11–15]. Rather than reviewing only biometric authentication based on a single feature, a comprehensive review of authentication based on several kinds of biometric data is presented in [16]. The features include iris, hand geometry, face, fingerprints, and so on. However, the difference between [10] and [16] is that instead of describing the organs used to extract biometric data, the main emphasis in [16] is on the processing of extracted data and the security and privacy of the data. Various machine learning algorithms have been exploited in evaluating the data extracted from various organs to authenticate an individual. A survey on EEG-based user authentication is presented in [16], in which a handful of papers are reviewed. However, the application of various machine learning algorithms is not explicitly discussed. All the papers reviewed briefly describe each aspect of biometric authentication. In contrast, the objective of this paper is to cover all works that use machine learning algorithms in classifying EEG data to authenticate a user. Furthermore, rather than merely touching on each paper, here we provide an elaborate review of each work to enable the reader to get a clear picture.
228 ECTI TRANSACTIONS ON ELECTRICAL ENGINEERING, ELECTRONICS, AND COMMUNICATIONS VOL.20, NO.2 JUNE 2022
A comparative analysis of the different state-of-the-art EEG-based authentication techniques is also presented.

2. MACHINE LEARNING CLASSIFICATION ALGORITHMS

This section presents a review of the various machine learning algorithms widely applied in user authentication. However, only supervised classification algorithms are considered here. A classification algorithm can be defined as supervised if it learns from a dataset in which each instance is accompanied by its class label. Once trained, the algorithm can predict the class of an instance whose class label is unknown. The supervised learning algorithms appearing frequently in this paper are briefly described in the following subsections.

2.1 𝑲-Nearest Neighbor

In this technique, an instance without any class label is compared with all instances of the dataset, and the 𝐾 closest instances are selected. Finally, a majority voting technique is applied to the labels of the 𝐾 closest instances to identify the label of the candidate instance [18].

2.2 Decision Tree

This is a tree-like structure in which the root represents a condition, the branches represent the outcomes of the condition, and each leaf represents the class of the instance. The path from the root to a leaf is known as the set of decision rules [19].

Table 3: Works based on task type.
Activities During EEG Recording | References
Real | [23], [32], [24], [43], [33], [25], [26], [34], [35], [37], [38], [40], [41], [44], [28], [45], [29], [42], [30]
Motor imagery | [50], [51], [46], [52], [47], [48]
Real and motor imagery | [60], [56], [57], [58]
Rest state | [61]

a large dataset. However, it requires a long training time [21].

2.5 Support Vector Machine

In this machine learning algorithm, the data points are divided by hyperplanes. Data points on different sides of a hyperplane belong to different classes. If there are two input features in the dataset, the hyperplane will be a line. If there are three features, the hyperplane will be a two-dimensional plane. Things become more complex with a higher number of input features. A data point close to the hyperplane is called a support vector, which influences the orientation and position of the hyperplane [22]. The benefits of the support vector machine include low computational overhead and low generalization error. However, one of the challenging tasks of this algorithm is to tune the hyperparameters. It is usually used for binary classification [18].
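The 𝐾-nearest neighbor procedure described in Section 2.1 (compare the unlabeled instance with every labeled instance, keep the 𝐾 closest, and take a majority vote) can be sketched in a few lines of Python. This is a minimal illustration only: the Euclidean distance, the function name `knn_predict`, and the toy two-feature dataset are assumptions made for the example and are not taken from [18] or from any surveyed work.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training instances.

    Euclidean distance is an assumption of this sketch; the text does not
    specify a distance measure.
    """
    # Distance from the query to every labeled instance, closest first.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Majority vote over the labels of the k closest instances.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature instances for two enrolled users.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["user_a", "user_a", "user_a", "user_b", "user_b", "user_b"]

label_near_a = knn_predict(train_x, train_y, (0.5, 0.5), k=3)
label_near_b = knn_predict(train_x, train_y, (5.5, 5.5), k=3)
```

Because every prediction scans the whole training set, the cost of KNN is paid at test time rather than at training time.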
3.1.1 Traditional machine learning algorithm with real activity

An authentication technique was proposed in [23] to investigate the impact of a large pool of people on the accuracy of the process. Emulating the various authentication situations involved four different steps: EEG signal collection, pre-processing, feature extraction, and authentication. The raw EEG signal was gathered by placing “EASYCAP” devices containing six midline electrodes on 32 participants (11 females, aged 18–25; average age 19.12). The signals were sampled at 500 Hz. Since the raw EEG signals contain outliers, the noise first needs to be minimized by ensemble averaging. After pre-processing, various techniques were used to extract the features of the EEG signal. These can be categorized in three ways: time-frequency-domain extraction, time-domain extraction, and frequency-domain feature extraction. The time-domain features consist of mean, median, and variance. The Fourier transform was used to analyze the frequency spectrum of the EEG signals to obtain the frequency-domain properties. The wavelet transform covers both the time and frequency domains. For EEG pattern classification, feedforward, backpropagation, multi-layer perceptron neural networks were used. With 25 neurons in the hidden layer, the accuracy varied from 5.75 to 10.68%. Accuracy varied between 28.71 and 36.27% in the 32 submodels, and the maximum accuracy with 40 neurons was 36.27%. The side-by-side approach, in which a large dataset is divided into a number of smaller datasets and submodels are built using the small datasets, increased the efficiency of authentication in all subjects. The highest average accuracy, obtained with 45 neurons in the hidden layer, was 94.04%. However, when 45 neurons are used in the hidden layer, the complexity of the system increases and the training phase becomes more time-consuming.

Table 4: Number of works per classifier.
Machine Learning Algorithms | References
SVM | [50], [24], [56], [26], [46], [58], [44], [47], [28], [45], [29], [30]
Random forest | [23], [58], [29]
Bayesian network | [56]
Naive Bayes | [61], [25], [58]
KNN | [57], [43], [58], [29]
LDA | [57], [29]
ENN | [43]
CART | [29]
XGBoost | [29]
FRNN | [42]
MLP | [50], [32], [60], [43], [33], [44], [45]
CNN | [51], [34], [35], [52], [37], [38], [48]
LSTM | [51], [38], [40], [41]
RNN | [38]

In [24], person authentication was investigated using EEG signals acquired from cost-effective portable devices. The EEG signal was first pre-processed to eliminate artifacts and noise with a bandpass finite impulse response (FIR) filter. The noise-minimized EEG signals were then split randomly into five parts. For EEG feature extraction, two multi-scale strategies were used: a multi-scale shape descriptor (MSD) and multi-scale wavelet packet statistics (MWPS). In the wavelet-based approach, the EEG segments were broken down into four levels by wavelet packet decomposition. For statistical characteristic extraction, sub-bands of the brain signals corresponding to the delta, theta, alpha, beta, and gamma frequency bands were used. In the other procedure, a set of binary patterns was extracted at each sampling point of the EEG signal. Because the data were limited, three segments of each EEG signal were used to train error-correcting output code (ECOC) multi-class SVM classifiers, with the remaining two segments used to test the learned model. Using these features, a supervised ECOC model was eventually trained with an SVM classifier to classify individuals from the EEG testing signals. The proposed procedure achieved a true positive rate of 94.44% in a preliminary trial on nine EEG records from nine participants. This was a comprehensive study on cost-effective portable devices with minimal electrodes.

In [25], the authors first estimate the separate components from five EEG brain regions by obtaining a dominant independent component (DIC) for each region. Independent component analysis (ICA) is a popular technique for blind source separation. A multivariate signal can be divided into additive sub-components, assuming that the source signals are non-Gaussian and statistically independent of one another. The EEG data consist of electrical potential recordings at a number of different places on the scalp, and it is commonly accepted that these scalp recordings are simply linear mixtures of unknown underlying neural source activity. The auto-regressive (AR) scalar coefficients are then calculated as a feature set for each DIC. A model of the DIC is provided using nonparametric auto-regression. With a univariate AR model, the current value of a time series can be estimated from the previous measurements of the same time series. In the EEG-based person authentication method, Naive Bayes models are implemented for decision-making due to their flexibility and efficiency in real-world applications. Since the order of the AR model influences the feature extraction process and hence the overall efficiency of the authentication method, error rates are found at multiple orders. These error rates are obtained at optimum thresholds.

The brain signals generated during various imaginary tasks are used for authentication in [26]. While the EEG signals were being captured, the subjects performed four mental imagery tasks consisting of baseline measurement, referential limb movement, counting,
and rotation. Three sets of characteristics were extracted from each electrode: sixth-order AR coefficients, spectral density, and total power in five frequency bands. Six AR coefficients were used, as proposed for EEG classification in [27]. Specifically, biologically significant bands and spectral powers were used to capture brain activity. The extracted features include AR coefficients, power spectral density (PSD), spectral power, interhemispheric power difference, and interhemispheric channel linear complexity. These feature sets were integrated into a vector, which was then used for classification by a linear SVM with cross-validation. Four separate mental tasks were performed by each subject, with the signals captured at 128 Hz. To obtain the six AR coefficients, a sixth-order AR model was applied to the filtered data. Feature vector classification was carried out using a one-vs-all linear SVM, with the aim of reducing the false accept rate (FAR) and false reject rate (FRR). For each block, the FAR and FRR values were computed and then summed over the 15 folds. Finally, the half total error rate (HTER) was obtained as the average of the FAR and FRR values.

In [28], the impact of various feature extractors on authentication accuracy is investigated. Four different entropies were considered: fuzzy, approximate, sample, and spectral. A total of 16 subjects were seated on armless chairs in front of display systems. The subjects were shown a random sequence of self-images and non-self-images and asked to identify their own images. The EEG data were collected using a 32-electrode EEG headset. After applying feature extractors and feature selection based on Fisher distance, the number of features selected varied from 21 to 22 depending on the entropies used. The EEG of the forehead area was more orderly when the self-images were shown to the subjects. All entropies were higher for the self-image compared to the non-self-image. An SVM classifier with linear, polynomial, sigmoid, and radial basis function kernels was used to identify a subject. The fuzzy entropy was found to provide the highest classification accuracy of 90.7%.

To mitigate the impact of EEG variability over time and with the cognitive state of a person, a multimodal authentication system combining EEG and keystroke statistics is proposed in [29]. The EEG and keystroke statistics were collected when the subject typed a specific password, namely “qu-ELEC371”. A total of 45 features were extracted from the keystrokes. In contrast, the EEG contained 206 features, consisting of 27 frequency-domain, 11 statistical, and three time-domain features in each channel. From a total of 251 features, 88 were selected based on correlation coefficients and random forest classifiers. A variety of classifiers were used, such as random forest, linear discriminant analysis (LDA), linear SVM (LSVM), quadrature-enhanced SVM (QSVM), KNN, classification and regression tree (CART), and XGBoost. Although feature selection reduced the computational overhead, it lowered the authentication accuracy rather than increasing it. Up to 99.8% authentication accuracy was found when multimodal authentication was used, while 99.5% and 95.8% accuracy were achieved when keystrokes and EEG, respectively, were considered individually. The highest accuracy was observed with the XGBoost classifier. In practice, each user has a different password, thereby increasing the dissimilarity in keystroke statistics. This may adversely affect the authentication accuracy.

An investigation was carried out in [30] to find EEG channels for use in person authentication. The 26 subjects in the study were asked to look at a letter on the screen and find it in the group of letters shown by the P300-speller system. If a subject finds the letter, positive feedback is given; otherwise, negative feedback is given. To capture the EEG generated by the activity, a 56-electrode EEG headset was used. For each channel, the first two intrinsic mode functions (IMFs) were separated using empirical mode decomposition (EMD). Thereafter, four features, namely instantaneous and Teager energy distribution, and Higuchi and Petrosian fractal dimension, were extracted from each IMF, resulting in 448 features. A subset of the best features was selected using the forward-addition or backward-elimination method [31] and an SVM classifier. The positive feedback-related response was 4% more accurate than the negative feedback-related response. The accuracy in the male-only case was 1% higher than in the female-only case. The number of channels varied for the male-only and female-only cases, with nine and eight channels required, respectively. However, the set of channels differed significantly between male and female. The reduction in the number of channels had little impact on the accuracy of subject authentication.

3.1.2 Artificial neural networks with real activity

Eye activity was used in [32] for an individual biometric framework. Two separate datasets were used to construct the biometric system: eyes open (EO) and eyes closed (EC). The EEG signals were classified using two distinct classifiers: SVM and random forest. A feature selection process was then used to reduce the number of features and find the optimal feature dimension. Details were provided of the signal processing technique used to prepare the raw signals and of the features extracted prior to classification. SVM training was performed on the EEG data in the EO and EC scenarios independently. In the first scenario, the subjects had their eyes open; in the second, their eyes were closed or resting. In this case, only the variation of the statistical features from the EEG signals in combination with the gamma band was considered. The feature vector consisted of three features: mean (M), root mean square (RMS), and standard deviation (SD) of the signals. The EO scenario achieved maximum accuracy with the random forest classifier.

In [33], the authors propose a technique for utilizing visual evoked potential (VEP) signals obtained when an image is shown to the subjects during the recording of the EEG. Exploiting the spectral
EEG-BASED BIOMETRIC AUTHENTICATION USING MACHINE LEARNING: A COMPREHENSIVE SURVEY 231
power ratio of the VEP gamma band, the backpropagation neural network was used to authenticate a person. The VEP signals were collected from subjects when shown a single image. Each channel of the VEP signals was filtered using a zero-phase Butterworth bandpass digital filter. A multi-layer perceptron neural network with a single hidden layer trained by the backpropagation algorithm was used to classify the VEP spectral power ratio. The trained network provided an average accuracy of 99.06%, indicating that the VEP signals convey hereditary, individual-specific information, can be classified accurately into their respective categories, and are appropriate for designing biometric systems. However, the preparation of VEPs can take longer than other biometrics such as fingerprints.

Systematic research is carried out in [34] to investigate the possibility of merging ConvNet and steady-state visual-evoked potentials (SSVEP) in a user authentication system. The low-frequency elements of the SSVEP were used as biometric patterns. The discriminating capacities were tested through various parameter settings to refine the CNN model. The effect of the EEG data duration on authentication efficiency was also studied. The authors then integrated the low-frequency element of the SSVEPs with 40 target stimuli for further study. As a result, each subject had 240 SSVEP epochs (trials) with 600 epochs between the first and second sessions. The proposed system achieved 97% accuracy on a cross-day basis for user authentication among eight persons, demonstrating the feasibility of EEG-based biometrics with CNN-based brain decoding. This paper proposes and develops an EEG-based user authentication framework, using SSVEP brain responses and CNN-based decoding.

As can be observed from the foregoing, CNN is widely used in user authentication. In [35], the authors propose a CNN method for EEG-based person authentication. They review the CNN's performance on datasets involving one driving fatigue experiment with 100 subjects. Raw EEG data were used as the input to the CNN, thus decreasing the need for feature engineering. The EEG data produced by conducting a BCI experiment, referred to as the BCIT experiment Baseline Driving (XB Driving) [36], were used in this paper. For target prediction, deep learning (DL) models, specifically CNN approaches, were studied in RSVP tasks. The CNN architecture comprised dense convolutional layers followed by fully connected layers. Each convolutional layer applied many kernels to the input data (vectorized EEG). These filters were designed to collect a range of local spatial characteristics. The authors suggest that few cases exist of the use of biometric identification based on EEG. The CNN model was found to be very fast and capable of providing 97% accuracy in the case of 100 subjects. The accuracy of the authentication was demonstrated to be much higher than that of randomly selected epochs (90%).

In [37], the authors investigate the performance of CNN for EEG-based person authentication with a larger number of subjects. CNN has recently been used as a new tool for automatic biometric feature extraction and classification. A total of 33 healthy subjects participated in the experiment in this paper, with the P300 speller being introduced [37]. The results show that the CNN-based biometric achieves 99.9% accuracy for eight classes, 99.3% for 10 classes, and 99.3% for 13 classes. The authors fine-tuned the network structure for the 10-class and 13-class classifications. The convergence speed of the network was also improved, while the time to achieve the best classification model was reduced by changing the CNN structure. This also ensured that the accuracy and loss functions do not change significantly.

Rather than considering the brain signals produced by imaginary and non-imaginary movements of different organs, emotions are used for individual identification using EEG in [38]. Human emotions are commonly inferred from facial expressions. EEG can also be used to characterize feelings, since human reactions are related to cortical functions. Emotion-related signals include contextual temporal dependencies. Research on person authentication under different affective states allows a better understanding of how those states affect the identification of a person. This research mainly focused on the DEAP EEG dataset [39], a dataset of EEG, video, and physiological signals for emotion analysis. A deep neural network is proposed in this study using a combination of CNNs and recurrent neural networks (RNNs). Two kinds of RNN are combined with the CNN: long short-term memory (CNN-LSTM) and the gated recurrent unit (CNN-GRU). Both CNN-LSTM and CNN-GRU have been found to achieve person authentication accuracy of 99.9–100%. However, the CNN-GRU can achieve faster convergence.

For more effective differentiation between people, the authors in [40] merge the SSVEP and event-related potential (ERP) features and apply LSTM networks to analyze the data extracted from the EEG signal. The proposed technique was subdivided into three steps. The raw EEG data were obtained from 20 people with a 7.5 Hz square SSVEP stimulus, using the image set proposed by Snodgrass and Vanderwart as targeted and non-targeted ERP stimulation. The raw data were then filtered using a passband notch filter, and the eye blinking artifacts were eliminated. Deep learning was then used to construct the individual EEG authentication model. Using LSTM with the SSVEP and ERP features, this system was able to achieve a verification accuracy of 91.44%. The method can be used by a broad range of users with ordinary eyes and brains. The successful results obtained in this paper would not only encourage the automatic authentication technique to become more robust and impersonation-tolerant but
provide a building block to expand related studies.

An EEG-based user authentication technique applying multiple machine learning algorithms is proposed in [41]. A total of 105 subjects were seated before a display screen and instructed to either tighten or relax the fists of both hands. They were also asked to imagine these activities. There were three sessions, each consisting of seven trials. The 21 trials were then converted into 105 trials using the sliding window technique. Three different sets of electrodes were used: 8, 16, and 64. To reduce complexity, EMD was exploited to select the first four IMFs, since these contain the most information. Two-stage feature selection was then performed. Firstly, 18 features were extracted using the following entropies: log, approximate, sample, and Shannon. Subsequently, three different machine learning techniques, namely a neural network, visual geometry group (VGG), and principal component analysis (PCA), were independently used to select two features for each channel. The resulting features are input into the SVM classifiers. The highest accuracy (95.64%) is achieved by the SVM classifier with PCA feature selection and 64 electrodes. The VGG neural network gives the worst performance. The computational complexity is comparatively high in this method because it involves several processing steps such as feature extraction, PCA, a neural network, a VGG neural network, and SVM.

The authentication systems discussed here require retraining when new data are added to the dataset, which is quite common in practical application. Furthermore, significant memory space is required to store the datasets. To overcome this limitation, an incremental fuzzy-rough nearest neighbor (IncFRNN) based classifier is proposed in [42], combining the concepts of fuzzy set theory and rough set theory. The objective of this paper is to properly handle the uncertainty involved in EEG. Data were collected from 37 subjects by asking them to identify their own password in the display system. A total of 405 features were extracted using wavelet packet decomposition, mean of amplitude, cross-correlation, coherence, and Hjorth parameter. The IncFRNN technique outperformed the incremental KNN in terms of area under the receiver operating characteristic (ROC) curve in both unlimited and predefined instances in the dataset. However, the IncFRNN was found to be less accurate than the incremental KNN.

3.1.3 Mixed machine learning algorithm with real activity

In [43], a new framework for establishing VEP-based biometrics is proposed. A total of 3560 VEP signals were obtained from 102 subjects. There were at least 10 and at most 50 eye-blink-free VEP signals from each subject. Three different experiments were carried out on features created by the selection metrics and the enhanced features of the proposed technique. Two classifiers were utilized: the Elman neural network (ENN) and KNN. KNN was selected for comparison since it does not require any training. Training was conducted utilizing nine feature vectors, and testing was carried out utilizing the remainder of the set. The ENN classification performance tended to be somewhat better than that of KNN, and this held for all the feature extraction techniques used. In terms of computational overhead, ENN appears to be much more algorithmically complex than KNN and requires a lengthy training process. Conversely, KNN requires no explicit training. However, a significant drawback of KNN is its high computational complexity in the testing phase, since classifying a test VEP feature vector requires its distance to all the VEP feature vectors used for training to be determined. The examination in this study [43] revealed the capability of dominant frequency powers in gamma band VEPs as biometrics.

A quite interesting investigation is carried out in [44] concerning the impact of time variance on the authentication accuracy. The contribution can be divided into two parts. In the first part, the impact of different feature extraction techniques was investigated, such as discrete Fourier transform (DFT), zero-crossing rate (ZCR), and Hjorth. It was found that DFT provided the highest classification accuracy, possibly because a total of 180 features were extracted from DFT, with only four and 12 from ZCR and Hjorth, respectively. The impact of the number of electrodes on subject classification accuracy was also investigated. Higher accuracy was observed with more electrodes. In addition, as a classifier the DNN outperforms SVM. The tasks considered included relaxation and listening to music. Relaxation provided the highest accuracy. In the second part, 10 subjects were considered. EEG data were collected on two occasions from each subject for three different tasks with a time gap of 1–2 weeks. An astounding result was revealed: the authentication accuracy dropped dramatically. The highest accuracy of 52.72% was obtained by SVM, while the DNN provided 47.64%. The EEG data were found to be inconsistent over time.

Most of the papers described so far use EEG produced either by actual movement or the motor movement imagery of body parts. In addition, a significant number of papers consider visual stimuli [28]. A different approach is considered in [45], where the EEG produced by displaying a shape is used to authenticate subjects. Twenty subjects participated in the experiment. They were told to identify the existence of a black circle above a red plus sign. A varying degree of contrast was used in the circle: 0% (indicating the absence of a circle), 5%, 10%, and 100%. All subjects failed to perceive the presence of a circle when 5% contrast was used. Four features were extracted based on the power spectra of the 𝛼, low 𝛽, high 𝛽, and 𝛾 bands. A headset containing four electrodes was used to capture the EEG. The electrodes were positioned at O1, O2, P7, and P8, covering the occipital region that generates the responses to visual stimulation. Of the SVM and the neural network, the equal error rate (EER) for SVM was 11.2%, obtained from the O2
electrode, while the neural network ensemble consisting of six neural networks provided an EER of 8.1% for the EEG obtained from E1. The EER from a single neural network was comparatively higher. The lower EER of the ensemble could be due to its ability to overcome the impact of the random weights in the first iteration of each individual neural network. The training complexity of the ensemble neural network would be very high since this system involves multiple neural networks. Moreover, generating the EEG with this type of stimulus may not always be feasible in practice.

3.2 Authentication Using Motor Imagery Activity

This section presents the authentication techniques used when the EEG data are collected from motor imagery movement. As in Section 3.1, this section is divided into three groups: (i) authentication using traditional machine learning techniques; (ii) artificial neural networks; and (iii) authentication using multiple machine learning techniques from groups (i) and (ii).

3.2.1 Traditional machine learning algorithm with imagery activity

In [46], the authors suggest that EEG can be used for authentication in multi-level security systems where users are asked to provide EEG authentication signals by executing motor imagery tasks. These activities can be single or mixed, based on the level of security required. The EEG-based authentication method has two phases: enrollment and verification. During the enrollment process, a person is asked to perform certain tasks, such as imagining moving their hand, foot, finger, or tongue, while the EEG signals are recorded. For authentication purposes, the imagery activities themselves are often part of the credentials and should not be accessible to any third party. After data selection, the EEG signals of each activity belonging to the user are pre-processed, the features are extracted, and these are subsequently used to train the model for that individual, which is stored securely in the database. Security networks can have different levels, based on the regions and services of EEG-based authentication, which are then adjusted according to the number of tasks assigned. AR models may be applied to a single-channel EEG signal. The PSD of a signal is a positive real function of frequency associated with a stationary stochastic process. The linear AR parameters and PSD elements of these signals are derived as features. SVM uses the C-support vector classification (C-SVC) algorithm to find the optimum hyperplane. The SVM system is used to build individual EEG models. Experiments are carried out using five-fold cross-validation training. EEG-based authentication has been shown to provide all the benefits of password-based authentication.

Multimodal authentication using SVM is proposed in [47]. The EEG and eye tracking data were taken from [48] and [49], respectively. The datasets were combined by considering the similarity of data to produce a fused dataset of hypothetical subjects. On the one hand, the EEG data for the fists of the left and right hands were collected from the motor movement imagery task. On the other hand, the eye tracking data were collected for the jumping dot stimulus task. Multimodal classification was found to provide much better performance compared to each single mode (either EEG or eye tracking). The FAR of the multimodal authentication was less than half the FAR obtained from baseline EEG authentication. This improvement might be due to the increased number of features. The fused dataset has greater data diversity, which may result in a better FAR. The computational overhead is slightly lower since the SVM requires less computational overhead compared to neural network or KNN-based solutions.

3.2.2 Artificial neural networks with imagery activity

Many similar tasks are concurrently used for machine learning, all using a single-task classifier design and subsequent recognition. The benefit of this mechanism is that it incorporates data from extra activities. A complex problem is split into smaller problems by the traditional single-task learning (STL) mechanism in pattern recognition and machine learning. Each of the smaller tasks is trained separately before they are combined. Alternatively, multi-task learning (MTL) trains a variety of different tasks in parallel and may use latent domain-specific knowledge in additional tasks. Inspired by this, MTL is employed in [50] for user authentication purposes. In general, additional MTL tasks function as an inductive bias, leading a learner to choose the hypotheses that better describe the main task and the additional tasks simultaneously. EEG-based biometrics apply extra yet similar tasks to the existing neural network. The hidden layer representation shared by the main task and the extra tasks incorporates the inductive bias of the additional tasks, potentially benefiting the learning of the main task. Since a neural network with two layers of appropriate units can approximate every bounded continuous function, a neural network with one hidden layer was used for the experiments in [50]. EEG signals for imagining left and right index finger gestures were used independently for STL, yielding two accuracy results. In MTL, both signals were used for training to enable a comparison. The output of the learning network was tested using each type of signal. Multiple related tasks were performed in parallel, with the generated outputs representing the results of the main and extra tasks, performed by one feedforward neural network. This paper compared the accuracy between STL and MTL in a small dataset of nine subjects. However, the real accuracy may be different when the number of subjects increases.

In [51], a new approach is introduced for the classification of EEG signals using a combination of CNN and LSTM. To improve the performance, one dimensional (1D) convolutional LSTM neural networks were used for CNN and LSTM. All the EEG biometrics of users
234 ECTI TRANSACTIONS ON ELECTRICAL ENGINEERING, ELECTRONICS, AND COMMUNICATIONS VOL.20, NO.2 JUNE 2022
were processed in a 1D convolutional LSTM neural network, trained in the enrollment phase. The recorded EEG signal was pre-processed for normalization, either in the enrollment phase or during authentication, and segmented into 1-second normalized signal recordings before being sent to the 1D convolutional LSTM. The proposed network consisted of 10 layers with multiple convolutional layers, LSTM layers, and fully connected layers. For each training iteration, EEG segments were organized, shuffled, and randomly selected in batches. Each batch consisted of 80 sets of EEG samples, based on the number of channels. Training was configured to stop after at most 1000 epochs, or earlier once the training or validation loss was minimized. Since the performance of the proposed LSTM authentication system with 1D convolution depends heavily on the parameter settings, a trade-off is made to balance and improve the system’s performance, cost, and effectiveness.

In order to address the reliability problem of the ongoing EEG biometrics, an investigation is carried out in [52] to provide more accurate and easier to use EEG biometric systems. A computational approach based on functional connectivity (FC) and CNN is suggested to extract identity-bearing information from ongoing EEGs and thereby make EEG biometrics robust against varying human states. The proposed approach combines a functional connectivity prediction module with a CNN-based deep learning module that learns discriminatory patterns from the FC maps predicted by the first module. The signal processing workflow starts with bandpass filtering (0.5–42 Hz) and denoising, followed by the feature extraction and classification performed by the two major modules. The authors compare their proposed approach with the common features and methods of EEG biometrics (those using ongoing EEG signals). Three features were chosen: AR model coefficients, power spectral density (PSD), and fuzzy entropy. Each offers details of individual distinctiveness, and the features were combined to further improve performance. All three features were computed from the EEG of each single channel and are therefore univariate features; their combination is referred to as “united features.” CNN was used as the classifier. CNN provides a higher correct recognition rate (CRR) than traditional classifiers such as shallow neural networks and SVMs within a reasonable computation time. The suggested approach was evaluated on two datasets, namely the PhysioNet dataset [53] and a self-collected database. The former included the EEG signals obtained from 109 subjects during motor imagery tasks.

The EEG varied significantly, depending on the mood of the subject. Furthermore, there was a temporal variation in the signal for the same mood. For this reason, the impact of temporal signal variation on person authentication based on the EEG signal is investigated in [48]. EEG data were collected from 50 subjects in response to their imagery movements of hands and legs. The data were collected using the Galileo BE Light amplifier system in two sessions with a one-week gap. Despite 19 electrodes being employed to capture the signals, two channels were excluded since the imagery EEG signals were generated in the parietal and central regions of the brain. Only the α band was considered due to it being the most widely used in motor imagery [55]. No feature selection and extraction algorithms were included since CNN itself can extract features from the different layers of raw data. Importantly, in this paper, the CNN model was created using EEG signals captured in session 1, while the testing of the model was conducted using the signal captured in session 2. Consequently, this paper emphasizes the effectiveness of EEG-based person authentication in a real-world situation. Up to 93.5% accuracy in identifying persons was found to be achievable when employing CNN.

3.3 Authentication Using Combined Real and Motor Imagery Activities

In this section, the authentication methods using both real and imagery activities during EEG recording are presented. As in Sections 3.1 and 3.2, the authentication methods are divided according to the machine learning algorithms.

3.3.1 Traditional machine learning algorithm with combined activities

A single-channel-based authentication technique is proposed in [56], consisting of three steps: pre-processing, feature extraction, and classification. Features were extracted using DFT, discrete wavelet transform (DWT), AR modeling, and entropy. In this research, a dataset of five behavioral tasks was used to analyze seven participants (325 examples). The neural network, Bayesian classifier, and SVM were applied to these features. Channel optimization can produce better performance by reducing the number of EEG channels and identifying the optimal electrode placement for various mental tasks. Choosing a suitable channel with high accuracy for various tasks is a significant part of authentication systems. Various classifiers such as SVM, Naive Bayes, and neural networks are used in [56] to identify the best-performing classifier. In this study, the highest accuracy was obtained for the single channel case by the neural network. The classification accuracy varied from 97 to 98% for the neural network classifier using a single channel validation framework.

The feasibility of implementing imaginary and non-imaginary tasks for user authentication is investigated in [57]. The participants were expected to execute non-imaginary tasks (left or right hand movements) and imaginary tasks (either left or right hand movements had to be imagined). The time allowed for each task was one minute, with one minute of rest between tasks. A bandpass filter was used to extract the 𝛼 and 𝛽 waves during pre-processing. The signal was segmented and the power spectral density calculated by the Welch and Burg methods. The statistical characteristics derived from the PSD were used as classifier inputs (mean, median,
mode and variance, standard deviation, and minimum and maximum). For classification, KNN and LDA were applied. The Welch procedure provided the highest accuracy of 98% for 𝛽 waves from channel C4 with the KNN classifier. The imaginary tasks provided 98.03% accuracy, which was higher than the 94.95% obtained for the non-imaginary tasks. Thus, the imaginary tasks were found to be more suitable for authentication.

Instead of performing user authentication based on a single imagery activity as in [41], a combination of both actual and imagery actions is used in [58] to classify subjects. The experimental setup was similar to [41]. The subjects were seated on chairs with arm rests in front of a display system that gave them visual instructions for performing a particular task. Two tasks, namely hand lift and fist tightening, were performed with each hand in both actual action mode and imaginary action mode. The data were collected from 10 subjects using 32 channels on an EEG headset. However, only data from four channels were used for further processing due to the specific region of the brain being of interest. Prior to classification, the desired range of frequencies (0.1 Hz to 70 Hz) was extracted since this frequency range contained useful information. Wavelet transform was then used for noise removal. Only one feature, namely PSD, was extracted for each channel using Burg’s method [59]. Finally, the data on actual and imagery actions were fed into a classifier (e.g., random forest, KNN, SVM, and Naive Bayes) separately for each subject. The highest accuracy was achieved by the random forest with imagery tasks. All eight tasks (two tasks for each hand in the case of actual action and imagery movement) were combined for multimodal analysis. In this case, the average accuracy was 98.28%. The computational burden in the case of random forest was comparatively less than that for the solutions using neural networks for classification.

3.3.2 Artificial neural networks with combined activities

Here, the EEG-based authentication techniques using neural networks for combined real and imagery activities are described. In [60], the impact of eyes-open and eyes-closed scenarios on person authentication based on EEG signals is investigated. The signals for the subjects with eyes-closed and eyes-open states were recorded. The features were extracted utilizing the WPD and classified with neural networks. The EEG signals were gathered from 10 male participants while resting with their eyes open and eyes closed in five different sessions spread over about 14 days. Two channels recorded the EEG signals as the subjects sat for a moment with their eyes closed. The WPD was chosen since it can provide information in both the time and frequency domains. The EEG segments were arbitrarily partitioned into training and testing sets with 90% and 10% of the data used for training and testing, respectively. The extracted features were then fed into the input layer of the neural network. Two hidden layers were utilized with 100 nodes in every layer. The eyes open and eyes closed features had no significant impact on the rate of correct personal identification. In this paper, only 10% of the dataset was used for testing the network. However, just 10% of the dataset may not represent the characteristics of the whole dataset. For this reason, the effectiveness of the technique proposed in [60] needs further investigation.

3.4 Traditional Machine Learning with no Activity

While most EEG-based authentication systems focus on increasing the level of accuracy, the usability and timeliness of the systems in practical scenarios are largely overlooked. Furthermore, all the above-mentioned authentication techniques use EEG recording during certain activities. Thus, a new system of authentication is proposed in [61] which is easy to implement in practical scenarios with low computational complexity. In this technique, the EEG signals are collected from users in a state of relaxation and sent to the respective user’s smartphone. Considering the low processing power of smartphones and limited battery power, the authentication task is offloaded to a fog computing server located nearby. The Naive Bayes classifier was used in the fog server. Fast Fourier transform (FFT) was used for feature extraction only from the 𝛼 band (which provides the best performance) of the EEG signal. The authentication accuracies were 81 and 95% when the EEG was recorded for five seconds and 10 seconds, respectively.

From the above discussion, it is evident that considerable diversity exists in pre-processing EEG data prior to classification. Thus, Table 5 provides a comparison of the authentication techniques based on the type of pre-processing carried out. A detailed comparison of different machine learning algorithms for biometric authentication is presented in Table 6.

4. FUTURE RESEARCH DIRECTIONS

It is important to note that this is the first thorough investigation into the progress of machine learning assisted biometric authentication using brain waves. In contrast to other biometric methods, a relatively small number of experiments on the topic have been reported. Therefore, the scope for future research is considerable. Potential future research directions include:
• In [50], a multi-tasking-based biometric authentication technique is proposed involving a neural network with one hidden layer for classification. However, with the advancement of research on neural networks, we have entered the new domain of deep learning, leaving behind the notion of a shallow neural network. Deep learning appears to outperform the shallow neural network in all respects. Therefore, it would be interesting to perform multi-tasking-based biometric authentication using deep learning rather than the shallow neural network. There is also the potential to achieve better accuracy.
Ref. | No. of Subjects | No. of EEG Channels | No. of Extracted Features | Feature Extraction Methods | EEG Bands | Type of Tasks
[50] | 5 | 59 | 8 | Common feature patterns [63] | 𝛼, 𝛽 | Reading an unconnected list of words
[23] | 32 | 6 | 15 | Mean, standard deviation and entropy; wavelet packet decomposition (WPD) coefficient | All | Eyes open and closed
[32] | 109 | 64 | 192 | Mean, standard deviation, root mean square error | 𝛾 | Left hand movement and eyes open
[60] | 10 | 8 | 168 | Mean, standard deviation and entropy; wavelet packet decomposition (WPD) coefficient | All | Eyes
[24] | 5 | 9 | 168 | Multi-scale shape descriptor (MSD), multi-scale wavelet packet statistics (MWPS), multi-scale wavelet packet energy statistics (MWPES) | All (MSD, MWPS); 𝛼, 𝛽 (MWPES) | Eyes close
[56] | 7 | 6 | 85 | DWT, log energy entropy, sample entropy, auto-regressive coefficients | All | Rest, math, visual counting, geometric figure rotation
[61] | 10 | 1 | | FFT | 𝛼 | Resting state
[57] | 20 | 19 | 7 | Statistical features of PSD | All | Eyes
[51] | 109 | 4, 16, 32, 64 | 10240 | | 𝛾 | Motor movement for opening and closing fists and moving feet
[43] | 10 | 61 | 61 | Multiple signal classification (MUSIC) | 𝛾 | Image visualization
[33] | 20 | 61 | 61 | Spectral power ratio | All | Images of different objects
[25] | 7 | 17 | 5 | ICA | 𝛾 | Motion related activity in virtual environment
[26] | 5 | 14 | 1358 | AR, FFT, interhemispheric power difference, interhemispheric channel linear complexity | All | Different activities such as resting with eyes closed, limb movement, geometric figure rotation
[46] | 9 | 5 | 99 | FFT, AR | All | Motor imagery single and combined left and right hands
[34] | 8 | 9 | | | 𝜃 | Low frequency SSVEP from retina
[35] | 100 | 256 | 1408 | AR, FFT | All | In virtual reality environment, keeping in lane when driving a car
[52] | 109, 59 | 64, 46 | | AR, fuzzy entropy, PSD | 𝛼, 𝛽 | Motor imagery due to fists or feet movement
[37] | 33 | 20 | | | All | Visualization of different English characters
[38] | 32 | 5, 32 | 32, 496 | PSD (Welch method), spectral coherence | 𝛼, 𝛽, 𝛾, 𝜃 | To score subjective rating by watching a music video
[40] | 20 | 7 | 2637 | | 𝛿, 𝜃 | Image visualization, and facing flash light
[41] | 105 | 8, 16, 64 | 16, 32, 128 | EMD | All | Fist tightening or relaxation
[58] | 10 | 4 | 4 | PSD (Burg method) | 𝛼, 𝛽, 𝛾 | Actual movement and imagery movement (each hand lift, fist tightening)
[44] | 20, 10 | 4 | 196 | DFT, ZCR, Hjorth | All | Relax mode, listening to music, and counting numbers such as 2, 4, 8, …
[47] | 37 | 64 | 64 | DFT, ZCR, Hjorth | All | Left and right hand motor movement imagery
[28] | 16 | 32 | 30 | Fisher distance | All | Viewing self and non-self images
[45] | 20 | 4 | 4 | Wavelet transform | 𝛼, 𝛽, 𝛾 | Viewing circles with varying (0 to 100%) contrasts
[29] | 10 | 5 | 251 | Statistical, time domain, frequency domain | All | Typing specific password
[42] | 37 | 8 | 414 | Mutual information, cross-correlation, coherence, and Hjorth parameter | 𝛼, 𝛽 | Viewing own and others’ passwords
[30] | 26 | 56 | 448 | EMD | All | Viewing a letter followed by finding that letter in a group of letters
[48] | 40 | 17 | | | 𝛼 | Motor imagery movement of hands and legs
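Two of the feature extraction methods listed in the table above, ZCR and the Hjorth parameters (rows [44], [47], and [42]), are simple time-domain measures. The following is a minimal sketch rather than the implementation used in any of the surveyed papers. It assumes that ZCR denotes the zero-crossing rate; the function names `zero_crossing_rate` and `hjorth_parameters` and the synthetic 10 Hz test signal are illustrative assumptions.

```python
import math


def variance(x):
    """Population variance of a sequence of floats."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def diff(x):
    """First-order difference x[n+1] - x[n]."""
    return [b - a for a, b in zip(x, x[1:])]


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) for one channel segment."""
    dx = diff(x)
    ddx = diff(dx)
    activity = variance(x)                         # signal power
    mobility = math.sqrt(variance(dx) / activity)  # proxy for mean frequency
    # Complexity compares the mobility of the derivative with that of the signal.
    complexity = math.sqrt(variance(ddx) / variance(dx)) / mobility
    return activity, mobility, complexity


# Synthetic stand-in (assumption) for a 1-second single-channel segment:
# a 10 Hz sine sampled at 256 Hz, i.e. an alpha-band-like oscillation.
fs = 256
segment = [math.sin(2 * math.pi * 10 * n / fs) for n in range(fs)]

zcr = zero_crossing_rate(segment)
activity, mobility, complexity = hjorth_parameters(segment)
print(zcr, activity, mobility, complexity)
```

For a pure sine the complexity is close to 1, its minimum; noisier or more irregular segments give larger values. In a pipeline like those surveyed, such values would be computed per channel and concatenated into the feature vector passed to the classifier.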
Ref. | Objective | Machine Learning Techniques | Accuracy | Computational Complexity | Dataset
[50] | EEG-based user identification and authentication | Feedforward backpropagation, multi-layer neural network | 94.04% | Low complexity, high training time | [64]
[23] | Biometric user identification from eye activity | SVM, Random Forest | SVM: EO-97.64%, EC-96.02%; RF: EO-98.16%, EC-97.30% | Moderate | Experimental
[32] | Multiple related tasks are performed simultaneously | Two-layer neural network | Left-95.60%, Right-94.81% | Low complexity prediction, high training time | PhysioNet [53]
[60] | Human identification by EEG signal with four channels or less | Neural network | Eyes open 78%, Eyes closed 81% | Low complexity, high training time | Experimental
[24] | Human identification using EEG signals | SVM | 94.44% | Moderate | Experimental
[56] | Effect of electrode placement on authentication accuracy in the case of different mental states | Bayesian network, SVM | 95% | Low | [65]
[61] | Mobile phone assisted EEG authentication | Naive Bayes | 95% | Low | Experimental
[57] | Authentication using imagery and non-imagery tasks | KNN, LDA | 94.95% | High complexity prediction, no training | Experimental
[51] | Authentication based on 1D convolutional LSTM | CNN, LSTM | 99.58% | Low complexity, high training time | PhysioNet [53]
[43] | VEP-based biometric | KNN, ENN | KNN-96.13%, ENN-98.12% | High computational overhead in prediction | Experimental
[33] | Evoked brain signals to identify individuals | Back-propagation neural network | 99.06% | Low complexity, high training time | Experimental
[25] | Independent component analysis-based authentication | Naive Bayes | | Low | Experimental
[26] | EEG-based authentication with low cost | Linear SVM | 100% | Low | Experimental
[46] | EEG authentication with multi-level security | SVM | Equal error rate: 0.002 to 0.007 | Moderate | [66]
[34] | Extraction of low frequency SSVEP for user authentication | CNN | 97% | Low complexity, high training time | [67]
[35] | EEG-based identification from driving fatigue experiment | CNN | 97% | Low complexity, high training time | [36]
[52] | Stability of EEG biometrics across diverse human states | CNN | 99.94% | Low complexity, high training time | PhysioNet [53]
[37] | Investigating the effects on subjects in experiments to assess CNN’s identification performance | CNN | 99.9% | Low complexity, high training time | Experimental
[38] | Person identification on affective EEG (e.g. the subject is in different states during the experiment) | CNN, RNN, CNN-LSTM | up to 97.97% | Complex | [39]
[40] | To combine SSVEP and ERP | LSTM | up to 91.44% | Long training time | Experimental
[41] | Finding appropriate feature extraction and selection algorithms for improving authentication using the SVM classifier | LSTM | 91.44% | High complexity in training | Experimental
[58] | Classifying subjects using different classifiers with multimodal input to the classifier | Random forest, KNN, SVM, Naive Bayes | 98.28% | Comparatively less complex during the training phase, but more complex in the testing phase | Experimental
[44] | To investigate the effect of time variance on authentication accuracy | SVM, DNN | Time invariant test: SVM 93.12%, DNN 97.0%; Time variant test: SVM 51.72%, DNN 47.64% | For DNN, training complexity is very high, while testing complexity is extremely low | Experimental
[47] | To combine eye tracking and EEG for user authentication | SVM | 98.28% | FAR: 23.6% (baseline EEG: 42.1%) | [48, 49]
[28] | To analyze the impact of different entropies as feature extractors for classifying subjects | SVM with linear, polynomial, radial basis and sigmoid kernels | 90.7% with linear SVM and fuzzy entropy | Moderate | Experimental
[45] | To investigate the impact of the invisible visual stimuli on user authentication | SVM with linear, polynomial and radial basis kernels, neural networks | EER-SVM: 11.2%, EER-NN: 8.1% | Low for SVM, very long training time for multiple NNs | Experimental
[29] | To authenticate a subject based on EEG and keystroke statistics | LSVM, QSVM, KNN, CART, XGBoost, Random forest, LDA | XGBoost: 99.8% | High | Experimental
[42] | To investigate the use of IncFRNN to adapt to changes in dataset size | FRNN | 95.1% | Very high | Experimental
[30] | To develop a low-density EEG headset for person authentication | SVM | 89% with five channels (mixed gender), 95% with nine channels (male only) | Moderate | Experimental
[48] | To assess the impact of the EEG recording session on the authentication accuracy | CNN | 99.3% | Moderate | Experimental
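Several rows of the table above report a false acceptance rate (FAR) or an equal error rate (EER) rather than accuracy (e.g., [46], [47], and [45]). As a minimal sketch of how these quantities relate, the following computes FAR, the false rejection rate (FRR), and an approximate EER from match scores by sweeping a decision threshold. The score lists are synthetic and the function names are illustrative assumptions; none of this is taken from the surveyed papers.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR when every score >= threshold is accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # impostors accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine users rejected
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate EER: try each observed score as the threshold and keep
    the one where |FAR - FRR| is smallest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


# Synthetic classifier scores (assumption): higher means "more likely genuine".
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.60, 0.55, 0.95, 0.72, 0.88, 0.40]
impostor_scores = [0.10, 0.25, 0.33, 0.47, 0.52, 0.18, 0.61, 0.29, 0.05, 0.38]

threshold, eer = equal_error_rate(genuine_scores, impostor_scores)
far, frr = far_frr(genuine_scores, impostor_scores, threshold)
print(threshold, eer, far, frr)
```

Raising the threshold lowers FAR at the cost of a higher FRR, so the EER summarizes the operating point at which the two error types are balanced. This is why an EER and an accuracy figure in the same table are not directly comparable.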
• Similarly, deep learning can be applied in all authentication techniques which use shallow neural networks. The conventional CNN can be replaced by the deep CNN to investigate the corresponding classification accuracy.
• From a review of existing literature, only one machine learning technique seems to be used for classification in a particular paper. The algorithm can be any one of CNN, SVM, KNN, and so on. However, no classifier can provide the best solution to all problems [62]. Thus, a possible research direction could involve the identification of the best classification algorithm for a given biometric authentication scenario.
• The effect of a drug can change the electrical activities of the brain. This can also change the response of the brain with respect to different real and imagery activities. For this reason, an investigation is required into each brainwave-based authentication technique to assess the impact of drugs on the accuracy of machine learning-based authentication.
• Electrical activities in the brain change according to age. Thus, what is the impact of age on the accuracy of biometric authentication? Does an authentication algorithm achieve the same level of accuracy in older people as for children? All works discussed in this paper consider subjects within a very short age range. However, the age range is likely to be broad in practical applications. For this reason, a thorough investigation is required to assess the authentication accuracy in such a scenario.
• The combination of multiple types of activity is found to improve authentication accuracy. In addition, the use of other statistics, such as keystrokes as well as EEG, improves accuracy. However, the amount of effort made in this direction remains very limited. Thus, a more holistic approach could be considered involving the exploitation of facial expression, EEG, keystrokes, and other information in the authentication process.

5. CONCLUSION

EEG is a person-dependent signal, and the authentication methods presented in this paper indicate that the identification accuracy of a person using EEG signals is very promising. This paper gives an outline of machine learning assisted EEG-based biometric authentication methods. The investigation process and outcomes found in each paper have been thoroughly discussed. In addition, various papers have been compared using different criteria such as objectives, classifiers, feature extraction methods, number of channels, computational overheads, type of tasks, and so on. The findings of this investigation reveal that the following machine learning classification algorithms are widely used in brainwave-based biometric authentication: CNN, MLP, SVM, KNN, RNN, LSTM, FRNN, BLSTM. Of these, CNN and SVM are the most commonly used algorithms and provide comparatively high classification accuracy. Furthermore, there are a limited number of publicly available datasets for this type of research, and hence, researchers prefer to generate their own data experimentally. This paper offers in-depth knowledge of state-of-the-art biometric authentication techniques and opens a path for future research.

ACKNOWLEDGMENTS

This work is supported by Research Grants from Begum Rokeya University, Rangpur, Bangladesh (Reference: BRUR/Reg./2020–21/669(11)).

REFERENCES

[1] “Authentication Definition,” TechTerms.com. https://techterms.com/definition/authentication (accessed Jan. 4, 2021).
[2] “What is ‘Authentication’,” The Economic Times. https://economictimes.indiatimes.com/definition/authentication (accessed Jan. 4, 2021).
[3] L. Norton et al., “Electroencephalographic recordings during withdrawal of life-sustaining therapy until 30 minutes after declaration of death,” Canadian Journal of Neurological Sciences / Journal Canadien des Sciences Neurologiques, vol. 44, no. 2, pp. 139–145, Oct. 2016.
[4] “EEG (Electroencephalogram),” KidsHealth. https://kidshealth.org/en/parents/eeg.html (accessed Jan. 7, 2021).
[5] Britannica, The Editors of Encyclopaedia. “electroencephalography,” Encyclopedia Britannica, Oct. 31, 2017. https://www.britannica.com/science/electroencephalography (accessed Jan. 4, 2021).
[6] J. W. C. Medithe and U. R. Nelakuditi, “Study of normal and abnormal EEG,” in 2016 3rd International Conference on Advanced Computing and Communication Systems (ICACCS), 2016.
[7] S. Siuly, Y. Li, and Y. Zhang, EEG Signal Analysis and Classification: Techniques and Applications. Cham, Switzerland: Springer, 2016, ch. 1, pp. 11–14.
[8] A. S. Malik and H. U. Amin, Designing EEG Experiments for Studying the Brain: Design Code and Example Datasets. London, UK: Academic Press, 2017, ch. 1, p. 4.
[9] M. L. Ali, J. V. Monaco, C. C. Tappert, and M. Qiu, “Keystroke biometric systems for user authentication,” Journal of Signal Processing Systems, vol. 86, pp. 175–190, Mar. 2016.
[10] O. S. Adeoye, “A survey of emerging biometric technologies,” International Journal of Computer Applications, vol. 9, no. 10, pp. 1–5, Nov. 2010.
[11] S. P. Banerjee and D. Woodard, “Biometric authentication and identification using keystroke dynamics: A survey,” Journal of Pattern Recognition Research, vol. 7, no. 1, pp. 116–139, 2012.
[12] H. Crawford, “Keystroke dynamics: Characteristics and opportunities,” in 2010 Eighth International Conference on Privacy, Security and Trust, 2010, pp. 205–212.
[13] M. Karnan, M. Akila, and N. Krishnaraj, “Biometric thentication,” in 2011 5th International IEEE/EMBS
personal authentication using keystroke dynamics: Conference on Neural Engineering, 2011, pp. 442–
A review,” Applied Soft Computing, vol. 11, no. 2, pp. 445.
1565–1573, Mar. 2011. [27] A. Lecko and Y. J. Sim, “Coefficient problems in
[14] R. Napier, W. Laverty, D. Mahar, R. Henderson, the subclasses of close-to-star functions,” Results in
M. Hiron, and M. Wagner, “Keyboard user verifica- Mathematics, vol. 74, May 2019, Art. no. 104.
tion: toward an accurate, efficient, and ecologically [28] Z. Mu, J. Hu, J. Min, and J. Yin, “Comparison of
valid algorithm,” International Journal of Human- different entropies as features for person authenti-
Computer Studies, vol. 43, no. 2, pp. 213–222, Aug. cation based on EEG signals,” IET Biometrics, vol. 6,
1995. no. 6, pp. 409–417, Apr. 2017.
[15] D. Shanmugapriya and G. Padmavathi, “A survey [29] A. Rahman et al., “Multimodal EEG and keystroke
of biometric keystroke dynamics: Approaches, dynamics based biometric system using machine
security and challenges,” International Journal of learning algorithms,” IEEE Access, vol. 9, pp. 94 625–
Computer Science and Information Security, vol. 5, 94 643, 2021.
no. 1, pp. 115–119, Sep. 2009. [30] L. A. Moctezuma and M. Molinas, “Event-related
[16] Z. Rui and Z. Yan, “A survey on biometric authen- potential from EEG for a two-step identity authen-
tication: Toward secure and privacy-preserving tication system,” in 2019 IEEE 17th International
identification,” IEEE Access, vol. 7, pp. 5994–6009, Conference on Industrial Informatics (INDIN), 2019.
2019. [31] L. Moctezuma and M. Molinas, “Subject identifica-
[17] N. Ortiz, R. D. Hernandez, R. Jimenez, tion from low-density EEG-recordings of resting-
M. Mauledeoux, and O. Aviles, “Survey of states: A study of feature extraction and classifica-
biometric pattern recognition via machine learning tion,” in Advances in Information and Communica-
techniques,” Contemporary Engineering Sciences, tion, K. Arai and R. Bhatia, Eds. Cham, Switzerland:
vol. 11, no. 34, pp. 1677–1694, 2018. Springer, 2020, pp. 830–846.
[18] P. Harington, Machine Learning in Action. New [32] B. Kaur and D. Singh, “Neuro signals: A future
York, USA: Manning Publication, 2012. biomertic approach towards user identification,” in
[19] S. Shalev-Shwartz and S. Ben-David, Understanding 7th International Conference on Cloud Computing,
Machine Learning: From Theory to Applications. New Data Science & Engineering – Confluence, 2017, pp.
York, USA: Cambridge University Press, 2014, pp. 112–117.
124–126. [33] R. Palaniappan, “Method of identifying individuals
[20] T. M. Oshiro, P. S. Perez, and J. A. Baranauskas, using VEP signals and neural network,” IEE Proceed-
“How Many Trees in a Random Forest?,” in Machine ings - Science, Measurement and Technology, vol. 151,
Learning and Data Mining in Pattern Recognition, P. no. 1, pp. 16–20, Jan. 2004.
Perner, Ed. Berlin, Germany: Springer, 2012, pp. [34] T. Yu, C.-S. Wei, K.-J. Chiang, M. Nakanishi, and
154–168. T.-P. Jung, “EEG-based user authentication using a
[21] J. Brownlee, Deep Learning for Computer Vision: convolutional neural network,” in 2019 9th Interna-
Image Classification, Object Detection, and Face tional IEEE/EMBS Conference on Neural Engineering
Recognition in Python. Machine Learning Mastery, (NER), 2019, pp. 1011–1014.
2019. [35] Z. Mao, W. X. Yao, and Y. Huang, “EEG-based
[22] M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foun- biometric identification with deep learning,” in
dations of Machine Learning, 2nd ed. Cambridge, 8th International IEEE/EMBS Conference on Neural
UK: The MIT Press, 2018, pp. 79–83. Engineering (NER), 2017, pp. 609–612.
[23] Q. Gui, Z. Jin, and W. Xu, “Exploring EEG-based [36] J. Touryan, G. Apker, B. J. Lance, S. E. Kerick, A. J.
biometrics for user identification and authentica- Ries, and K. McDowell, “Estimating endogenous
tion,” in 2014 IEEE Signal Processing in Medicine and changes in task performance from EEG,” Frontiers
Biology Symposium (SPMB), 2014. in Neuroscience, vol. 8, Jun. 2014, Art. no. 155.
[24] M. K. Bashar, I. Chiaki, and H. Yoshida, “Human [37] Y. Di, X. An, S. Liu, F. He, and D. Ming, “Using con-
identification from brain EEG signals using ad- volutional neural networks for identification based
vanced machine learning method EEG-based bio- on EEG signals,” in 10th International Conference on
metrics,” in 2016 IEEE EMBS Conference on Biomed- Intelligent Human-Machine Systems and Cybernetics
ical Engineering and Sciences (IECBES), 2016, pp. (IHMSC), 2018, pp. 119–122.
475–479. [38] T. Wilaiprasitporn, A. Ditthapron, K. Matcha-
[25] C. He and J. Wang, “An independent component parn, T. Tongbuasirilai, N. Banluesombatkul, and
analysis (ICA) based approach for EEG person E. Chuangsuwanich, “Affective EEG-based person
authentication,” in 2009 3rd International Conference identification using the deep learning approach,”
on Bioinformatics and Biomedical Engineering, 2009. IEEE Transactions on Cognitive and Developmental
[26] C. Ashby, A. Bhatia, F. Tenore, and J. Vogelstein, Systems, vol. 12, no. 3, pp. 486–496, Sep. 2020.
“Low-cost electroencephalogram (EEG) based au- [39] S. Koelstra, C. Muhl, M. Soleymani, J.-S. Lee,
240 ECTI TRANSACTIONS ON ELECTRICAL ENGINEERING, ELECTRONICS, AND COMMUNICATIONS VOL.20, NO.2 JUNE 2022
A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and short-term memory neural networks,” Expert Sys-
I. Patras, “DEAP: A database for emotion analysis tems with Applications, vol. 125, pp. 259–267, Jul.
using physiological signals,” IEEE Transactions on 2019.
Affective Computing, vol. 3, no. 1, pp. 18–31, Jan. [52] M. Wang, J. Hu, and H. Abbass, “Stable EEG
2012. biometrics using convolutional neural networks
[40] S. Puengdang, S. Tuarob, T. Sattabongkot, and and functional connectivity,” Australian Journal of
B. Sakboonyarat, “EEG-based person authentication Intelligent Information Processing Systems, vol. 15,
method using deep learning with visual stimula- no. 3, pp. 19–26, 2019.
tion,” in 11th International Conference on Knowledge [53] G. Schalk, D. McFarland, T. Hinterberger, N. Bir-
and Smart Technology (KST), 2019, pp. 6–10. baumer, and J. Wolpaw, “BCI2000: A general-
[41] U. Barayeu, N. Horlava, A. Libert, and M. V. purpose brain-computer interface (BCI) system,”
Hulle, “Robust single-trial EEG-based authentica- IEEE Transactions on Biomedical Engineering, vol. 51,
tion achieved with a 2-stage classifier,” Biosensors, no. 6, pp. 1034–1043, Jun. 2004.
vol. 10, no. 9, Sep. 2020, Art. no. 124. [54] B. J. Edelman, B. Baxter, and B. He, “EEG source
[42] S.-H. Liew, Y.-H. Choo, Y. F. Low, and Z. I. M. Yusoh, imaging enhances the decoding of complex right-
“EEG-based biometric authentication modelling us- hand motor imagery tasks,” IEEE Transactions on
ing incremental fuzzy-rough nearest neighbour Biomedical Engineering, vol. 63, no. 1, pp. 4–14, Jan.
technique,” IET Biometrics, vol. 7, no. 2, pp. 145–152, 2016.
Mar. 2018. [55] M. Zeynali and H. Seyedarabi, “EEG-based single-
[43] R. Palaniappan and D. P. Mandic, “Biometrics channel authentication systems with optimum elec-
from brain electrical activity: A machine learning trode placement for different mental activities,”
approach,” IEEE Transactions on Pattern Analysis and Biomedical Journal, vol. 42, no. 4, pp. 261–267, Aug.
Machine Intelligence, vol. 29, no. 4, pp. 738–742, Apr. 2019.
2007. [56] T. Z. Chin, A. Saidatul, and Z. Ibrahim, “Explor-
[44] F. P. Sjamsudin, “EEG-based authentication with ing EEG based authentication for imaginary and
machine learning,” M.S. thesis, Department of Com- non-imaginary tasks using power spectral density
puter Science and Communications Engineering, method,” IOP Conference Series: Materials Science
Waseda University, Tokyo, Japan, 2017. and Engineering, vol. 557, 2019, Art. no. 012031.
[45] T. Miyake, N. Kinjo, and I. Nakanishi, “Wavelet [57] A. Valsaraj, I. Madala, N. Garg, M. Patil, and
transform and machine learning-based biometric V. Baths, “Motor imagery based multimodal biomet-
authentication using EEG evoked by invisible visual ric user authentication system using EEG,” in 2020
stimuli,” in IEEE Region 10 Conference (TENCON), International Conference on Cyberworlds (CW), 2020.
2020. [58] T. Thorvaldsen, “A comparison of the least squares
[46] T. Pham, W. Ma, D. Tran, P. Nguyen, and D. Phung, method and the Burg method for autoregressive
“EEG-based user authentication in multilevel secu- spectral analysis,” IEEE Transactions on Antennas
rity systems,” in Advanced Data Mining and Applica- and Propagation, vol. 29, no. 4, pp. 675–679, Jul. 1981.
tions, H. Motoda, Z. Wu, L. Cao, O. Zaiane, M. Yao, [59] M. K. Abdullah, K. S. Subari, J. L. C. Loong, and
and W. Wang, Eds. Berlin, Germany: Springer, 2013, N. N. Ahmad, “Analysis of the EEG signal for a
pp. 513–523. practical biometric system,” International Journal of
[47] V. Krishna, Y. Ding, A. Xu, and T. Höllerer, Biomedical and Biological Engineering, vol. 4, no. 8,
“Multimodal biometric authentication for VR/AR pp. 364–368, 2010.
using EEG and eye tracking,” in 2019 International [60] J. Sohankar, K. Sadeghi, A. Banerjee, and S. K.
Conference on Multimodal Interaction, 2019. Gupta, “E-Bias: A pervasive EEG-based identifica-
[48] R. Das, E. Maiorana, and P. Campisi, “Motor tion and authentication system,” in Proceedings of
imagery for EEG biometrics using convolutional the 11th ACM Symposium on QoS and Security for
neural network,” in IEEE International Conference Wireless and Mobile Networks, 2015, pp. 165–172.
on Acoustics, Speech and Signal Processing (ICASSP), [61] D. H. Wolpert, “The lack of a priori distinctions
2018, pp. 2062–2066. between learning algorithms,” Neural Computation,
[49] P. Kasprowski, O. V. Komogortsev, and A. Karpov, vol. 8, no. 7, pp. 1341–1390, Oct. 1996.
“First eye movement verification and identification [62] J. Müller-Gerking, G. Pfurtscheller, and H. Flyvb-
competition at BTAS 2012,” in 2012 IEEE Fifth jerg, “Designing optimal spatial filters for single-
International Conference on Biometrics: Theory, Ap- trial EEG classification in a movement task,” Clinical
plications and Systems (BTAS), 2012. Neurophysiology, vol. 110, no. 5, pp. 787–798, May
[50] S. Sun, “Multitask learning for EEG-based biomet- 1999.
rics,” in 2008 19th International Conference on Pattern [63] P. Sajda, A. Gerson, K.-R. Muller, B. Blankertz, and
Recognition, 2008. L. Parra, “A data analysis competition to evaluate
[51] Y. Sun, F. P.-W. Lo, and B. Lo, “EEG-based user machine learning algorithms for use in brain-
identification system using 1d-convolutional long computer interfaces,” IEEE Transactions on Neural
Tarik Bin Shams received his B.Sc. degree in computer science and
software engineering from American International University-Bangladesh
(AIUB). Currently, he is pursuing an M.Sc. in Computer Science and
Engineering at Brac University, Bangladesh. Since 2021, he has worked
as a Software Developer at Deepchain Labs Ltd, Bangladesh. As a
Software Developer, he contributes to developing many local and global
projects. In addition to his current duties, he has conducted research
on machine learning techniques and integrated them into several projects.

Md. Firoz Mahmud received his B.Sc. in Computer Science and Engineering
from American International University-Bangladesh (AIUB). Between
Aug. 2021 and Feb. 2022, he worked as a web developer at Prajukti 71,
Bangladesh. Furthermore, Mr. Mahmud worked as a Teaching Intern at AIUB
from Feb. 2020 to July 2020. His research interests include machine
learning and bio-signal processing.

Md. Shahariar Tehjib received a B.Sc. degree in computer science and
engineering from American International University-Bangladesh (AIUB),
Dhaka, Bangladesh. He completed his secondary school certificate at
Fulbari G.M pilot high school, Fulbari, Dinajpur, Bangladesh, and the
higher secondary certificate at Fulbari Govt. College, Fulbari,
Dinajpur, Bangladesh. Currently, he is working as a web developer. In
addition, he works on artificial intelligence, machine learning, and
cyber security systems.

Zahid Hossain received a B.Sc. in Computer Science and Software
Engineering from American International University-Bangladesh (AIUB),
Bangladesh. In 2021, he worked as a business analyst for an online
medicine shop. He also completed a Digital Marketing course.