
204 IEEE REVIEWS IN BIOMEDICAL ENGINEERING, VOL. 14, 2021

A Review on Machine Learning for EEG Signal Processing in Bioengineering
Mohammad-Parsa Hosseini , Senior Member, IEEE, Amin Hosseini , Member, IEEE,
and Kiarash Ahi , Member, IEEE

(Methodological Review)

Abstract—Electroencephalography (EEG) has been a staple method for identifying certain health conditions in patients since its discovery. Due to the many different types of classifiers available to use, the analysis methods are also equally numerous. In this review, we will be examining specifically machine learning methods that have been developed for EEG analysis with bioengineering applications. We reviewed literature from 1988 to 2018 to capture previous and current classification methods for EEG in multiple applications. From this information, we are able to determine the overall effectiveness of each machine learning method as well as its key characteristics. We have found that all the primary methods used in machine learning have been applied in some form in EEG classification. This ranges from Naive-Bayes to Decision Tree/Random Forest, to Support Vector Machine (SVM). Supervised learning methods, including SVM and KNN, are on average of higher accuracy than their unsupervised counterparts. While each of the methods individually is limited in its accuracy in its respective applications, there is hope that the combination of methods, when implemented properly, has a higher overall classification accuracy. This paper provides a comprehensive overview of machine learning applications used in EEG analysis. It also gives an overview of each of the methods and the general applications to which each is best suited.

Index Terms—Machine learning, EEG, survey, medical applications, signal processing, signal analysis.

Manuscript received April 10, 2019; revised August 9, 2019; accepted September 29, 2019. Date of publication January 28, 2020; date of current version January 22, 2021. (Corresponding author: Mohammad-Parsa Hosseini.)
Mohammad-Parsa Hosseini is with the Bioengineering Department, Santa Clara University, Santa Clara, CA 95053 USA and also with the AI Research, Silicon Valley, CA USA (e-mail: [email protected]).
Amin Hosseini is with the Electrical and Computer Engineering Department, Azad University, Central Tehran Branch, Tehran, Iran (e-mail: [email protected]).
Kiarash Ahi is with the University of Connecticut, Storrs, CT 06269 USA (e-mail: [email protected]).
Digital Object Identifier 10.1109/RBME.2020.2969915
1937-3333 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

ELECTROENCEPHALOGRAPHY (EEG) is a method of testing electrical signals in the brain. It is often applied as a technique for data analysis such as time and frequency series analysis. The brain's neurons contain ionic currents, which create voltage fluctuations that EEG can measure. This electrical activity is spontaneous and recorded over a period of time from many scalp electrodes to form an EEG signal [22]. Traditionally, EEG signals are taken on the surface of the scalp, but there also exist iEEG signals, which are taken inside the brain. In this paper, we will be focusing primarily on conventional scalp EEG signals.

Conventionally, EEG recordings may be obtained by connecting electrodes to the scalp with the use of a conductive gel. A differential amplifier is then used to amplify each active electrode compared to the reference before the signal is sent through an anti-aliasing filter. Finally, this filtered signal is converted with an analog-to-digital converter.

Clinically, EEG signals are used primarily to diagnose and treat various brain disorders such as epilepsy, tremor, concussions, strokes, and sleep disorders. More recent applications of EEG include using machine learning as a method of analysis. In particular, there is much research on epileptic seizure detection and sleep disorder research in combination with machine learning. Additionally, there is also a growing interest in studying EEG signals for gaming, to control and manipulate objects using brainwaves by monitoring brain activity during tasks [36].

EEG waveforms vary based on the band, which denotes the frequency range. The delta band is the slowest wave with the highest amplitude, having a frequency range below 4 Hz. For adults, it is located frontally, while for children it is located posteriorly. The theta band is between 4 to 7 Hz and is most common in young children, while signifying drowsiness or arousal in adults. This band tends to spike due to an active inhibition of a movement or response. The alpha band is between 8 to 14 Hz, and it is correlated to eye muscle movements. It is located on both sides of the head's posterior regions. The beta band is above 14 Hz and is correlated with general motor behavior. It is located on both sides of the head's frontal regions [44].

Some of the advantages of using EEG compared to other techniques to study brain function are low costs, tolerance to motion from subjects, and no radiation exposure risks. Some of the disadvantages of using EEG include low spatial resolution and poor signal-to-noise ratio.

II. MACHINE LEARNING METHODS FOR EEG

A. Overview

Machine learning is the use of a set of mathematical models and algorithms to gradually improve the performance of a

Authorized licensed use limited to: CZECH TECHNICAL UNIVERSITY. Downloaded on July 15,2022 at 16:45:08 UTC from IEEE Xplore. Restrictions apply.

Fig. 1. Machine learning applications on EEG have been developed based on supervised and unsupervised learning in the literature. Supervised learning is categorized into classification and regression, which produce discrete and continuous outputs, respectively. Unsupervised learning is categorized into clustering and dimensionality reduction, which produce discrete and continuous outputs, respectively.

Fig. 2. The overall steps for EEG analysis by machine learning include preprocessing, feature extraction, feature selection, model training, and model testing.

singular task. It takes training data sets as input to use as a guide for making estimates without being specifically programmed to. The tasks vary widely in this space and can be categorized into two main groups: supervised and unsupervised learning. Unsupervised learning is the case when the algorithm builds a pattern of recognition from a data set containing only inputs with no set outputs. Supervised learning has a subsection, semi-supervised learning. They are identical in the sense that they both learn from data sets with given inputs and known outputs, with the exception that in semi-supervised learning parts of the data set are missing. Supervised learning is primarily used in applications of classification and regression, while unsupervised learning lends itself to feature learning and the inverse, dimensionality reduction. This paper will discuss some of the most popular machine learning methods and categorize them based on the type of learning, with some practical applications in EEG.

EEG signals can be used as indicators of harder-to-detect medical conditions with the assistance of machine learning methods. In Fig. 1 the applications of machine learning on EEG signals are shown based on supervised and unsupervised learning. Supervised learning develops a predictive model using both input and desired output data and is categorized into classification and regression, which produce discrete and continuous outputs, respectively. Unsupervised learning develops a predictive model using just input data and is categorized into clustering and dimensionality reduction, which produce discrete and continuous outputs, respectively.

Fig. 2 describes the general flow of how machine learning is implemented to get the desired classification of the data sets. The first step is signal acquisition; this is essentially the raw data, unedited. Pre-processing involves the removal of noise and other outliers in the data set. Feature extraction determines the spectrum of the data point groupings and what features they correspond to. Feature selection is the isolation of the desired classifiers that the machine learning method will be testing for in the following training. Machine learning training involves the use of training data sets, whether with or without known outputs, to refine the classification method. Lastly, the testing phase is the processing of true test data sets and comparing the overall accuracy of the desired feature.

B. Regression

Regression modeling is a popular tool in statistics because it is a simple way to create a functional relationship between variables. Various types of regression include: univariate and multivariate for quantitative response variables; simple and multiple for predictor variables; linear for linearly transformable data; nonlinear for nonlinearly transformable data; analysis of variance for qualitative variable predictors; analysis of covariance for the combination of qualitative and quantitative variable predictors; and logistic for qualitative response variables [84].

Legendre and Gauss first applied regression using the Method of Least Squares. This method makes approximations by summing the squares of each equation residual to best fit the data, and it is applied in Linear Regression, as shown in the equation below.

$$y_i = B_0 + B_1 x_i + e_i, \quad i = 1, \dots, n \qquad (1)$$
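As an illustration of the least-squares fit in (1), the short sketch below estimates the coefficients in closed form; the data points are hypothetical, and `b0`, `b1` stand in for B_0 and B_1.

```python
import numpy as np

# Hypothetical data roughly following y = 2 + 3x + noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 11.0, 14.1])

# Least squares: minimize the sum of squared residuals e_i.
# Closed form for simple linear regression:
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)
print(round(b0, 2), round(b1, 2))  # estimated intercept and slope
```

The residuals `e_i` are what the method of least squares drives toward zero in aggregate; for multiple or nonlinear regression the same idea is applied with more parameters or an iterative minimizer.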


TABLE I
REGRESSION MODELS APPLIED FOR EEG ANALYSIS

Linear Regression is one of the most common regression techniques. In this model, the parameters are specified in the form of a linear combination, while each independent variable is not necessarily linear. Multiple linear regression is similar, except that there are several independent variables rather than just one. When the parameters are not linear, nonlinear regression must be used. This also uses a sum of squares technique, though it uses an iterative procedure to minimize the function.

C. SVM

SVM is a subcategory of supervised learning used for analyzing data for classification and regression analysis. The purpose is to map points in space such that the examples of the target categories are divided by the largest possible margin. This allows SVM to have a generally lower generalization error as a classifier [39]. The objective is to find a hyperplane or set of hyperplanes in an N-dimensional space. Support vectors are the data points that are closest to a given hyperplane. They maximize the margin of the classifier by changing the position and orientation of the hyperplane. The margin boundaries of a linear classifier satisfy

$$\vec{w} \cdot \vec{x} - b = \pm 1 \qquad (2)$$

Additionally, within this space, it is also possible that the points are not linearly separable due to the position of the data. SVM is capable of applying generated kernel functions, more commonly known as the "kernel trick," to the data set to remedy this issue. This trick involves the transformation of the existing algorithm from a lower-dimensional data set to a higher one. The amount of information remains the same, but in this higher-dimensional space it is possible to create a linear classifier. Kernels are applied to the points, which then help determine the best-fit hyperplane for the newly transformed feature space. With enough kernel functions, it is possible to get precise separation; the only major concern is overfitting [110]. Fig. 3 depicts a sample of data separation in both 2D and 3D.

The linear SVM classifier with a hard margin solves the dual problem

$$W(\alpha) = -\sum_{i=1}^{l} \alpha_i + \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} y_i y_j \alpha_i \alpha_j \, \vec{x}_i \cdot \vec{x}_j \qquad (3)$$

minimizing W subject to

$$\sum_{i=1}^{l} y_i \alpha_i = 0, \qquad 0 \le \alpha_i \le C \qquad (4)$$

D. KNN (K-Nearest Neighbours)

KNN is one of the supervised machine learning algorithms. In supervised learning, the relationship between the input and output is already established for the training data set, i.e., for a given input the output is already known. Supervised learning is categorized into regression and classification, and KNN can be used for both. The input for both classification and regression is the same, but the output differs respectively. Example input-output pairs are used for predicting the output for untrained data. KNN classifies the input based on the classification of its K neighbors. To find the nearest neighbors, the Euclidean distance or Mahalanobis distance is calculated from the input to all known data points. After the distances are calculated, the K nearest neighbors are selected, and the input is classified based on similarities between the input and its K neighbors. The selection of K is based on the size of the data set: the square root of the size of the data set is taken, and if the result is an even number then 1 is added or subtracted from it. The result is then established as K for that data set. K is selected to be an odd number to avoid bias in the prediction of the input.


Fig. 3. Higher-dimension kernel separation. The kernel trick involves the transformation of the existing algorithm from a lower-dimensional data set to a higher-dimensional one.
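As an illustration of the kernel-trick idea in Fig. 3, the sketch below uses an explicit, hypothetical feature map phi(x, y) = (x, y, x^2 + y^2) to lift a circularly arranged data set into 3-D, where it becomes linearly separable; a real SVM applies a kernel implicitly rather than lifting points by hand.

```python
import numpy as np

# Two classes that are not linearly separable in 2-D:
# an inner circle (class -1) and an outer ring (class +1)
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 50)
inner = np.c_[0.5 * np.cos(angles), 0.5 * np.sin(angles)]
outer = np.c_[2.0 * np.cos(angles), 2.0 * np.sin(angles)]

def lift(points):
    """Explicit feature map phi(x, y) = (x, y, x^2 + y^2)."""
    return np.c_[points, (points ** 2).sum(axis=1)]

# In the lifted 3-D space, the plane z = 1 separates the classes linearly
z_inner = lift(inner)[:, 2]   # all points at radius 0.5 -> z = 0.25
z_outer = lift(outer)[:, 2]   # all points at radius 2.0 -> z = 4.0
print(z_inner.max() < 1.0 < z_outer.min())  # True
```

The information content of the points is unchanged by the lift; only the representation changes, which is exactly what makes a linear hyperplane sufficient in the higher-dimensional space.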

TABLE II
SUPPORT VECTOR MACHINE APPLICATIONS WITH EEG

E. ANN

Neural networks, commonly called artificial neural networks (ANN) in the computing world, are mathematical models very similar to the structure of the neural networks seen in a human brain. To understand how the model works, researchers have put forth several theories and examples showing the interaction between different layers of the neural networks to convert the given input into the desired output.

Imagine you are at a bar, looking at the menu to order a nice beer. Your favorite is IPA, and as soon as you see that on the list, you order it. What happened in your brain is that you provided multiple inputs for beer choice to your brain's neural network; the IPA choice had a preferable weight, being your favorite beer; the brain made a decision and gave you the output. This is a basic example of how neural networks operate. The architecture of the model shows the decision-making process, which involves much deeper layers of interaction that lie between the input and the output layer.

For ANN, the classification technique can be brought about by the summation of the input-weight products and a bias,

$$\sum_{i=1}^{n} (w_i x_i) + \text{bias} \qquad (5)$$

followed by an activation layer,

$$\text{Output} = f(x) = \begin{cases} 1 & \text{if } \sum wx + b \ge 0 \\ 0 & \text{if } \sum wx + b < 0 \end{cases} \qquad (6)$$

Each application varies and has to have a specific approach (long-term or short-term EEG segment analysis, real-time or time-delayed processing, single- or multiple-channel EEG analysis), which can be easily targeted and synthesized using ANN. Once the EEG signals are converted to waveforms in user-friendly GUIs, the classification of these signals happens with ANN, with the selection of a particular type of network for a specific use case: feedforward backpropagation, radial basis function, or recurrent neural networks. It is important to see


TABLE III
ARTIFICIAL NEURAL NETWORKS APPLICATION FOR EEG ANALYSIS
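Equations (5) and (6) together describe a single perceptron unit. A minimal sketch, with hypothetical weights and bias chosen so the unit implements a logical AND of two binary inputs:

```python
import numpy as np

def perceptron_output(w, x, b):
    """Threshold activation of eq. (6) applied to the
    weighted sum of eq. (5)."""
    return 1 if np.dot(w, x) + b >= 0 else 0

# Hypothetical weights and bias: the unit fires only when both inputs are 1
w, b = np.array([1.0, 1.0]), -1.5
outputs = [perceptron_output(w, np.array(x), b)
           for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # -> [0, 0, 0, 1]
```

A full ANN stacks many such units in layers and learns the weights by backpropagation rather than setting them by hand, as discussed for the network types below.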

Fig. 4. Feedforward Neural Network. There are two directions for information flow: forward propagation and backpropagation. Forward propagation is used at prediction time, while backpropagation is used for adjusting the weights to minimize loss.

Fig. 5. A radial basis function network is an ANN which uses radial basis functions as activation functions. A linear combination of radial basis functions of the inputs and the parameters of neurons is used for the output of the network. These structures have many applications such as time series prediction, classification, and function approximation.

how the different types of ANN operate and the architecture which facilitates that operation.
1) Feedforward Neural Networks: This is a type of network where data flows in only one direction, starting from the input nodes, passing through the hidden nodes and arriving at the output nodes. This network ensures no loop or cycle formation, making the information flow in a specific direction only. Fig. 4 shows the architecture of the feedforward network mechanism.
2) Radial Basis Function: In the field of artificial neural networks and mathematical modeling, RBF is a type of ANN which makes use of radial basis functions (real-valued functions whose value is determined by the distance of the input from a center). The network determines the output by a linear combination of RBFs of the inputs and the parameters given for the neurons. As shown in Fig. 5, the structure operates by summing the radial basis functions of the points, with their centers and widths, weighted by the associated weights, to get the final output.
A typical RBF is a Gaussian distribution which, in the case of a scalar input, is given by

$$h(x) = \exp\left(\frac{-(x - c)^2}{r^2}\right) \qquad (7)$$

where c is the center and r is the radius parameter. A Gaussian RBF decreases as the distance from the center increases.
A multiquadric RBF with a scalar input can be written as

$$h(x) = \frac{\sqrt{r^2 + (x - c)^2}}{r} \qquad (8)$$

In this case the multiquadric RBF increases as the distance from the center increases.
3) Recurrent Neural Networks: As the name suggests, RNN is a type of artificial neural network which has connections between different nodes, with a specific assigned direction for output flow to a specific node. Here, the flow of data can form loops and cycles to feed the data back to a specific node as intended. This technique is illustrated


TABLE IV
NAIVE BAYES APPLICATIONS WITH EEG

TABLE V
REVIEW ON DECISION TREE AND RANDOM FOREST
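As a minimal sketch of the tree-based classification reviewed in Table V, the toy binary tree below asks two yes/no questions about an item's features; the feature and label names are purely hypothetical illustrations.

```python
# A hypothetical binary decision tree: internal nodes test a boolean
# feature, leaves carry a class label.
tree = {
    "feature": "spike_rate_high",
    "yes": {"feature": "amplitude_high",
            "yes": {"label": "seizure"},
            "no": {"label": "artifact"}},
    "no": {"label": "normal"},
}

def classify(node, features):
    """Walk from the root down to a leaf, branching on each
    node's feature, then return the leaf's class label."""
    while "label" not in node:
        node = node["yes"] if features[node["feature"]] else node["no"]
    return node["label"]

print(classify(tree, {"spike_rate_high": True, "amplitude_high": True}))
print(classify(tree, {"spike_rate_high": False, "amplitude_high": True}))
```

A Random Forest, discussed below, builds many such trees on random sub-samples of the data and combines their answers by majority vote.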

in Fig. 6, which shows the backpropagation of information from one layer to another and to a specifically intended node.
To understand the working of an RNN it is important to define the transition from a previous state to a new state. Let x_t be the input vector, h_t be the new state, and h_{t-1} be the previous state. The RNN is a function of the input vector and the previous state, which lands us at the new state h_t. We can represent a simple vanilla version of the RNN by obtaining the weight function f_W and implementing it to find the output function y_t. This can be represented as follows:

$$h_t = f_W(h_{t-1}, x_t) \qquad (9)$$

$$h_t = \tanh(W_{hh} \cdot h_{t-1} + W_{xh} \cdot x_t) \qquad (10)$$

By applying the hyperbolic tangent function to the sum of the dot product of the associated weights with the previous state and the dot product of the associated weights with the input, we obtain the value of the new state. The final output function is

$$y_t = W_{hy} \cdot h_t \qquad (11)$$

F. Naive Bayes

The Naive Bayes classifier is a popular text categorization method that applies Bayes' theorem to separate data based on simple trained features. Essentially, the model assigns labels as feature vectors within a finite set. While simple in nature, with adequate pre-processing it can match more advanced methods such as the SVM discussed above. The one disadvantage of the naive Bayes


TABLE VI
A REVIEW ON ENSEMBLE LEARNING STATE OF THE ART
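The majority-voting rule at the heart of the ensemble methods surveyed in Table VI can be sketched as follows; the three base learners below are deliberately weak, hypothetical threshold rules on a scalar input.

```python
from collections import Counter

def ensemble_predict(base_learners, x):
    """Combine base learners by majority vote, the simplest
    ensemble combination rule."""
    votes = [learner(x) for learner in base_learners]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical base learners; the last one errs more often,
# but the vote of the other two outweighs it.
learners = [
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.9),
]
print(ensemble_predict(learners, 0.7))  # two of three vote 1 -> 1
```

Diversity among the base learners is what makes the ensemble stronger than any single member: errors that are not shared get outvoted.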

TABLE VII
FUZZY LOGIC FOR EEG ANALYSIS

method is that it considers all of the feature vectors as independent from one another, regardless of any real correlation [45]. The main advantage is that it only needs a small number of training data sets to begin correctly estimating the parameters necessary for classification.
Several models can be implemented for the Bayes method, the most common of which is the probabilistic model. In this model, the features are represented by vectors and the model assigns probabilities to a given outcome or case. The probabilistic Naive Bayes model is

$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)} \qquad (12)$$

Event models can be separated into two main classes, Gaussian Naive Bayes and multinomial Naive Bayes. In a data set with continuous values, a good assumption is that the data follow a Gaussian distribution; under this assumption, the Bayes method assigns probabilities based on the curve. The Gaussian Naive Bayes model is

$$P(x = v \mid C_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}}\, e^{-\frac{(v - \mu_k)^2}{2\sigma_k^2}} \qquad (13)$$

A multinomial event model represents the frequencies of specific events spawned from multinomials, often as a histogram. A potential concern is when a feature does not occur in the data set at all: this causes the product of all the estimates to be zero. It can be corrected with a pseudocount to smooth out any outliers in the data set [91].
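The vanilla RNN recurrence of equations (9)-(11), illustrated in Fig. 6, can be sketched with hypothetical, randomly initialized weight matrices; the dimensions and sequence below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 3, 4, 2

# Hypothetical weight matrices of the vanilla RNN in eqs. (9)-(11)
W_hh = rng.normal(size=(n_hid, n_hid)) * 0.1
W_xh = rng.normal(size=(n_hid, n_in)) * 0.1
W_hy = rng.normal(size=(n_out, n_hid)) * 0.1

def rnn_step(h_prev, x_t):
    """One recurrence: h_t = tanh(W_hh h_{t-1} + W_xh x_t), y_t = W_hy h_t."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)
    return h_t, W_hy @ h_t

# Run a short hypothetical sequence; the hidden state carries
# information from earlier inputs forward in time.
h = np.zeros(n_hid)
for x_t in [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]:
    h, y = rnn_step(h, x_t)
print(h.shape, y.shape)  # (4,) (2,)
```

The loop is the "cycle" that distinguishes an RNN from a feedforward network: the same weights are reused at every time step, with the hidden state feeding back in.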
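Equations (12) and (13) combine into a one-feature Gaussian Naive Bayes sketch. The class priors, means, and variances below are hypothetical, and the shared evidence P(x) cancels in the normalization.

```python
import math

def gaussian_pdf(v, mu, sigma2):
    """Class-conditional likelihood of eq. (13)."""
    return math.exp(-(v - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def gnb_posteriors(v, classes):
    """Posteriors via eq. (12); P(x) is the same for every class,
    so normalizing the scores removes it."""
    scores = {c: prior * gaussian_pdf(v, mu, s2)
              for c, (prior, mu, s2) in classes.items()}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Hypothetical one-feature model: each class stores (prior, mean, variance)
classes = {"low": (0.5, 0.0, 1.0), "high": (0.5, 4.0, 1.0)}
post = gnb_posteriors(3.5, classes)
print(max(post, key=post.get))  # the observation 3.5 is far closer to mean 4.0
```

With several independent features, the per-feature likelihoods are simply multiplied, which is where the "naive" independence assumption, and the zero-probability problem cured by pseudocounts, enters.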


Fig. 8. Random Forest is an ensemble learning method which is used mostly for classification and regression. It operates by creating a multitude of decision trees on various sub-samples of the dataset and uses majority voting or averaging for finding the output. This model improves the accuracy of prediction and can control over-fitting.

G. Decision Tree and Random Forest

Decision trees use questions about the features of an item to classify data. Each question can be represented as a node, in which there is a child node for each answer to that question. This creates a hierarchy, in other words, a tree. The most basic tree is a binary one, in which each question results in a yes or no answer: there is a yes and a no child node for each parent node question. Data is sorted through the tree by starting at the top-most node, also known as the root, and maneuvering its way down to a leaf, a node that has no children. The path taken depends on the data's features. Once the data reaches a leaf, it can be classified under the class associated with that particular leaf [64].

The advantages of decision trees are that they are simplistic and can be easily combined with other techniques for decision making. The disadvantages of decision trees are that they are somewhat unstable as well as inaccurate, especially with varying level sizes, which cause biases towards larger levels.

In the study of machine learning and its different classification and distribution methods, we come across the Random Forest technique, which can be used for both data classification and regression operations. As the name suggests, Random Forest operates by producing a multitude of decision trees, trained by performing a bagging operation to combine multiple decision trees or models to arrive at a more stable and accurate data prediction. Random Forest adds additional randomness to the way the data is structured; i.e., instead of finding the most important feature from the given set, it operates by finding the best feature among a random subset of the features. This results in a more diverse and better model.

In Random Forest the solutions from all the trees are combined, and classification happens through majority voting, where the most suitable classification is chosen. However, if the trees are found to be unstable, where minor changes in the data set can change the whole decision tree, we might end up with a wrong classification.

H. Ensemble Learning

Ensemble learning is a supervised learning algorithm. As the name suggests, ensemble learning combines many different algorithms to make a model that gives better predictive performance. The general idea is to improve the overall performance by combining the decisions received from multiple models. It is based on the concept of diversity: more diverse models are considered for obtaining results for the same problem in comparison to single models. This gives a set of hypotheses which can be combined to gain better performance. The single models are called base learners; when combined, they are called an ensemble. The ensemble is mostly better than the base learners from which it is made. Ensemble learning can be used in the fields of medicine, fraud detection, banking, malware and intrusion detection, face and emotion recognition, etc.

I. Fuzzy Logic

Almost every household machine or piece of equipment (like an air conditioner, washing machine, etc.) operates on the concept of Fuzzy Logic. This logic is fed to a control system, usually called the fuzzy system control, where each component is designed to function and alter another physical operating system to achieve the desired functionality. To understand how a fuzzy system works, it is necessary to analyze the system requirements and the intent for using a fuzzy system [20]. To make a system a knowledge-based functioning element with the capacity to apply human cognitive processes, such as reasoning and thinking, it has to have a stable component that can provide output from the perspective of the degree of truth for a given set of input variables. Fig. 9 shows the breakdown of a typical fuzzy system. For a fuzzy system to work effectively, the following components need to be assured of performance:
1) Fuzzy Sets: A fuzzy set is considered to correspond with a membership function, which is defined in a fuzzy space where the variables are set. The role of a membership function is to provide a degree of membership for any element within the well-defined fuzzy sets. The membership function assigns these elements a numerical value between 0 and 1, where 0 implies the corresponding element is not an element of the fuzzy set and 1 means the corresponding element is an element of the fuzzy set.
2) Fuzzy Rules: The way a fuzzy logic is intended to function is defined by a set of applied fuzzy rules, which determine the output specified by the IF-THEN


Fig. 9. Example of a Fuzzy System. For a fuzzy system to work effectively, the following features and components need to be assured of performance: 1. Fuzzy Sets, 2. Fuzzy Rules, 3. Fuzzy Logic Inference, 4. Fuzzy Score.
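The fuzzy components listed in Fig. 9 can be sketched end to end on a hypothetical temperature-to-fan-speed controller: triangular membership functions fuzzify the input, two IF-THEN rules fire with different strengths, and a weighted average defuzzifies the result into a crisp score. All membership functions and rule outputs below are illustrative assumptions.

```python
# Fuzzification: triangular membership functions for two overlapping
# fuzzy sets, "cold" and "hot", over a temperature input.
def mu_cold(t):
    return max(0.0, min(1.0, (25 - t) / 10))

def mu_hot(t):
    return max(0.0, min(1.0, (t - 15) / 10))

def fan_speed(t):
    """Two fuzzy rules:  IF temp is cold THEN speed is slow (20);
                         IF temp is hot  THEN speed is fast (80).
    Defuzzification: weighted average of the rule outputs by
    their firing strengths (the crisp fuzzy score)."""
    w_cold, w_hot = mu_cold(t), mu_hot(t)
    return (w_cold * 20 + w_hot * 80) / (w_cold + w_hot)

print(fan_speed(12))  # only the "cold" rule fires -> 20.0
print(fan_speed(20))  # both rules fire equally   -> 50.0
```

The key contrast with crisp logic is the middle case: at 20 degrees the input is partially a member of both sets, so the output blends the two rules instead of switching abruptly.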

Fig. 10. General K-means classification. K-means works by using an algorithm to locate a partition that minimizes the error between a cluster's empirical mean and the points within it. Using these K clusters, K-means tries to minimize the summation of the squared errors.

rules. The IF-THEN rules create a conditional statement that consists of fuzzy logic. For example, the IF-THEN form assumes that X and Y are intended terms, evaluated by the terms of fuzzy sets with the ranges being U and V. This divides the statement into two parts, namely antecedent and consequent. If the antecedent is a preceding statement which specifies the terms X and U, then the consequent statement should conclude with Y and V. Combined, these make a rule which states: if X is U, then Y is V. These rules are based on natural language and model representation, built on the given fuzzy sets and logic.
3) Fuzzy Logic Inference or Fuzzy Inference System (FIS): Once the set of fuzzy rules and membership functions has been defined, the FIS is implemented for process simulation and control, depending on the type of data or knowledge provided. The FIS usually operates in three stages. In the first stage, the numerical input variables provided to the system are mapped to a degree of compatibility with the respective fuzzy sets; this is called the fuzzification process, and it allows the system to express the input and output in fuzzy-readable linguistic terms. In the second stage, the system processes the rules according to the strengths of each input variable. In the third stage, the resulting fuzzy values are converted back to numerical values by the process of defuzzification. This process thereby maps the fuzzy-domain output back to the crisp domain, which makes the output clear.
4) Fuzzy Score: The output from the FIS is a fuzzy score for each of the individual input scores generated by the system. The FIS calculates the fuzzy score by taking into consideration all the defined fuzzy constraints and membership functions. The score depends on the type of rules applied and the type of input variables; every input variable is assigned a score by the FIS based on the fuzzy rule criteria.
As the main application of machine learning is found to be in pattern recognition of EEG signals, fuzzy logic can be used to determine the correct recognition rate of EEG classifications at different stages. However, a combination of fuzzy logic with neural networks, often called a neuro-fuzzy system, is frequently adopted, where the system can apply the fuzzy parameters (like fuzzy sets and fuzzy rules) and combine them with neural network


approximation techniques for extensive analysis. The neuro-fuzzy system [85] has proven highly beneficial for medical condition diagnostics, density and regression estimation, pattern recognition, and data analytics.

J. Linear Discriminant Analysis

For a given data set with a wide selection of random variables, it is necessary to perform dimensionality reduction, cutting the number of parameters down to a few principal variables and thereby reducing the dimensional space of the data set. Two techniques dominate: principal component analysis (PCA) and linear discriminant analysis (LDA). Both have similar functionalities and applications, but LDA can handle situations where the within-class frequencies need not be equal, and its standout feature is that it offers a high ratio of, and significant separation between, the between-class variance and the within-class variance. The main difference between them is that PCA is more applicable to the classification of features, while LDA is applicable to data classification.

LDA is the most common technique used for dimensionality reduction. Its main criteria are to offer good separability between different classes and to avoid overfitting. It significantly reduces computational cost and provides better classification by projecting the given feature space of n-dimensional samples onto a precise, smaller feature subspace. In a typical PCA analysis, the location, shape, and structure of the data set change completely; LDA, in contrast, maintains the location and shape of the data set when it is transformed into a smaller space, by defining a set of vectors in the transformed space that distinguish and separate the classes. The LDA transformation follows one of two approaches:
1) Class-independent transformation: This approach mainly focuses on increasing the ratio of overall variance to within-class variance, and uses only one criterion to optimize the transformation of the data set. It transforms all data points irrespective of their class, so each class is observed to be separate from all other classes.
2) Class-dependent transformation: Here, the main objective is to increase the ratio of between-class variance to within-class variance, to offer a sufficient range of separability for classification.
For the analysis of EEG signals and brain-computer interfaces, advanced methods are needed to separate and segregate data sets with multiple variables in an effective manner. A received EEG signal may be distorted by noise and may have to be separated effectively to achieve accurate results. For this purpose, dimensionality reduction is implemented to reduce the data set and separate the unwanted signal frequencies from those of interest.

K. K-Means

K-means is an unsupervised learning method used for the clustering problem. It works by using an algorithm to locate a partition that minimizes the error between a cluster's empirical mean and the points within it. Using these K clusters, K-means tries to minimize the summation of the squared errors [57].
There are two commonly used methods for initialization: Forgy and Random Partition. With the Forgy method, K observations are chosen randomly from the data set and used as the initial means. With the Random Partition method, each observation is first assigned to a random cluster; the initial means are then computed as the centers of these clusters.
Among the advantages of K-means are its easy implementation and high computational speed when K is relatively small. Its disadvantages include the strong influence of initial conditions on the final output, sensitivity to scaling, and dependence of the final result on the order of the data.

L. Reinforcement Learning

The biggest problem in modern-day brain-computer interface (BCI) systems is that the performance of a subject in controlling a BCI can, and will, decrease significantly over time; the demands of controlling the BCI increase while the motivation behind doing so remains quite low. To address this problem, we must enable continuous feedback from the subject and feed it to a reinforcement learning (RL) agent trained to support the subject in finding an accurate solution. The idea is to use the RL agent to control the actions of the given task; as the process proceeds, the supporting impact of the agent is decreased and the subject takes over the control mechanism. Once the subject has taken over control, the criteria are to maintain the subject in that state and to measure performance through a reward system that assigns points according to how well the subject controls the task without any agent present. The main objective of the reinforcement agent is to interact with the subject under uncertain conditions and to maximize the numerical long-term reward, essentially taking the subject from one state to another. For example, in every state St there exists an agent that can take an action At to reach a new state St+1. The agent gains the capacity to learn and interact in different states by increasing its numerical long-term reward. This is shown in Fig. 11.

Fig. 11. Operation of Reinforcement Learning. Software agents must take suitable actions in an environment to maximize reward in a particular situation.

One of the advantages of the RL model is that it maintains a balance between exploration and exploitation, which supervised algorithms cannot do. For EEG analysis applications, the RL model has shown constant progress toward controlling brain-computer interface systems, maintaining an equal balance between state transitions and reward mechanisms for optimum functioning.

M. Combination of Methods

A combination of methods involves the use of two or more machine learning algorithms to take advantage of the unique characteristics that each method possesses. This allows the multimodal algorithm to extract additional desired features [49]. The significance of multimodal integration is that it allows high-resolution classification using primarily already existing methods [21], [46]; additionally, this resolution will generally be higher than that of the individual methods separately [34]. However, multimodal extraction is not without limitations. Due to the increased complexity of the algorithm, it may be difficult to determine the true accuracy, as it is not directly comparable to existing methods. An example of this application in EEG is the diagnosis of multiple sclerosis patients. In that paper [103], the T-test [40], [42], [45] and Bhattacharyya were used for feature extraction as part of the preprocessing, followed by a combination of KNN and SVM as the primary classification algorithm, resulting in a total accuracy of 93%. While other sections above have dedicated tables of reviewed literature, we want to draw attention to multimodal analysis here, as some of the literature above already demonstrates the application of a combination of methods [37].

III. CONCLUSION

As epileptic seizure detection is a complicated biomedical problem, it has generated substantial interest in machine learning processes as a solution [78]. Most recent literature surveys on EEG signal analysis have proposed multiple learning models and different artificial neural network algorithms, such as radial basis functions, recurrent neural networks, and vector quantization, to interpret epileptic seizure patterns in a given set of EEG signals. The problem is also being targeted with other models such as support vector machines (SVM), adaptive neuro-fuzzy inference systems (ANFIS), adaptive learning, and time-frequency analysis.

Reviewing the published papers on EEG analysis for epilepsy, the following points stand out. Dimensionality reduction and selection have been identified as an interesting topic for EEG analysis using machine learning methods [11]. Wavelet transform and autoregressive methods have also played a pivotal role in machine learning for EEG [6]. Subasi used wavelet feature extraction for epileptic seizure detection with an adaptive neuro-fuzzy inference system in [101]. The effect of de-noising, such as multiscale PCA, in EEG analysis is shown in [62]. Data preparation methods such as PCA, ICA, and LDA can be used to increase classification accuracy [30]. Ensemble methods and combined classifiers have shown good performance in EEG analysis [61].

The incorporation of deep learning models in neuroimaging and electrodiagnostic analytics has allowed large amounts of data to be correlated from multiple modalities [41]. These models have been shown to perform better and faster than current state-of-the-art analysis techniques in both supervised and unsupervised learning tasks. Recent advancements in, and advantages of, using deep learning in EEG analysis can provide more accurate and faster analysis of large amounts of data. Hosseini et al. [35], [39], [47] proposed a cloud-based method for EEG analysis. In [48], convolutional neural networks (CNNs) were developed for EEG analysis. In [43], optimization modules consisting of PCA, ICA, and DSA analysis were developed for CNN and stacked auto-encoder deep learning structures in EEG analysis.

The coefficients of the wavelet transform and of the numerical autoregressive model are used to recognize changes and behaviors in EEG signals. These coefficients are taken as inputs and combined with different machine learning algorithms, such as multilayer neural networks, K-means, support vector machines, K-nearest neighbors, and naive Bayes, to break the EEG signal into machine-recognizable components and to extract and determine the power points responsible for triggering seizures.

As there are multiple machine learning techniques for analyzing a given set of EEG signals, the best-suited technique for a given application must be evaluated. Each model has a specific use case depending on the type of application and the subject data set. For our topic of study, we were concerned with the analysis of waveforms to determine an output. Here we consider how the different ML models can be used for this intended use case.

K-NN classifiers can be used for both regression and classification of data; for our purpose, they can be used to identify and classify acquired EEG signals and to find the output point nearest to the desired classification line for possible detection of abnormality. An ANN, on the other hand, can segregate the physical shape of the EEG waveform and divide it into segments. Each segment is given a specific weight value by analyzing the waveform, and the output is determined by subjecting the final equation to a bias; the chosen bias brings the output into the desired expected range. As more data are involved, the number of interactions in the hidden layer increases. So, depending on the type of problem and the amount of data being considered, a suitable selection has to be made when choosing an appropriate process.
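To make this model-selection discussion concrete, the sketch below trains a K-NN classifier and an RBF-kernel SVM on synthetic EEG-like feature vectors and combines them by majority voting, in the spirit of the KNN+SVM combination reviewed above [103]. It is a minimal illustration only: the synthetic data, feature dimensionality, and hyperparameters are assumptions rather than values from the reviewed studies, and scikit-learn is assumed to be available.

```python
# Illustrative sketch: combining K-NN and SVM classifiers on synthetic
# "EEG feature" vectors (stand-ins for, e.g., per-band wavelet energies).
# All data and hyperparameters here are placeholders, not the settings
# used in the studies reviewed above.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two synthetic classes of 8-dimensional feature vectors
# ("normal" vs "seizure" epochs, separated in feature space).
n = 200
normal = rng.normal(loc=0.0, scale=1.0, size=(n, 8))
seizure = rng.normal(loc=1.5, scale=1.0, size=(n, 8))
X = np.vstack([normal, seizure])
y = np.array([0] * n + [1] * n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Standardize features (both K-NN and SVM are scale-sensitive),
# then combine the two classifiers by majority (hard) voting.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
combo = VotingClassifier(estimators=[("knn", knn), ("svm", svm)],
                         voting="hard")
combo.fit(X_train, y_train)

print(f"held-out accuracy: {combo.score(X_test, y_test):.2f}")
```

Adding further estimators, or replacing hard voting with probability-averaged soft voting (which requires classifiers exposing predict_proba, e.g., SVC(probability=True)), is a small change, which is one reason ensemble wrappers are a convenient way to prototype combinations of methods.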


ACKNOWLEDGMENT [20] M. Cosenza-Andraus, C. Nunes-Cosenza, R. Gomes-Nunes, C. Fantezia-


Andraus, and S. Alves-Leon, “Video-electroencephalography prolonged
The authors would like to thank students at Santa Clara monitoring in patients with ambulatory diagnosis of medically refractory
University, Thi-Vu Huynh, Pradnya Patel, Elissa Yang, and temporal lobe epilepsy: Application of fuzzy logic’s model,” Revista de
neurologia, vol. 43, no. 1, pp. 7–14, 2006.
Haygreev Venkatesh for their contributions to this work. [21] S. Dähne, F. Bießmann, F. C. Meinecke, J. Mehnert, S. Fazli, and K.-R.
Mtüller, “Multimodal integration of electrophysiological and hemody-
namic signals,” in Proc. IEEE Int. Winter Workshop Brain-Comput.
REFERENCES Interface, 2014, pp. 1–4.
[22] A. Delorme and S. Makeig, “EEGLAB: An open source toolbox for
[1] A. Aarabi, R. Fazel-Rezai, and Y. Aghakhani, “Seizure detection in analysis of single-trial EEG dynamics including independent component
intracranial EEG using a fuzzy inference system,” in Proc. IEEE Annu. analysis,” J. Neurosci. Methods, vol. 134, no. 1, pp. 9–21, 2004.
Int. Conf. Eng. Medicine Biol. Soc., 2009, pp. 1860–1863. [23] J. A. Dian, S. Colic, Y. Chinvarun, P. L. Carlen, and B. L. Bardakjian,
[2] H. Abbasi, L. Bennet, A. J. Gunn, and C. P. Unsworth, “Identifying stereo- “Identification of brain regions of interest for epilepsy surgery planning
typic evolving micro-scale seizures (SEMS) in the hypoxic-ischemic using support vector machines,” in Proc. IEEE 37th Annu. Int. Conf. Eng.
EEG of the pre-term fetal sheep with a wavelet type-II fuzzy classifier,” Medicine Biol. Soc., 2015, pp. 6590–6593.
in Proc. IEEE 38th Annu. Int. Conf. Eng. Medicine Biol. Soc., 2016, [24] Q. Dong, B. Hu, J. Zhang, X. Li, and M. Ratcliffe, “A study on visual
pp. 973–976. attention modeling—A linear regression method based on EEG,” in Proc.
[3] H. Aghajani, M. Garbey, and A. Omurtag, “Measuring mental workload IEEE Int. Joint Conf. Neural Netw., 2013, pp. 1–6.
with EEG+ fNIRS,” Frontiers Human Neurosci., vol. 11, 2017, Art. [25] C. Dora and P. K. Biswal, “Robust ECG artifact removal from EEG using
no. 359. continuous wavelet transformation and linear regression,” in Proc. IEEE
[4] A. Ahani, H. Wahbeh, H. Nezamfar, M. Miller, D. Erdogmus, and Int. Conf. Signal Process. Commun., 2016, pp. 1–5.
B. Oken, “Quantitative change of EEG and respiration signals during [26] F. D. V. Fallani, G. Vecchiato, J. Toppi, L. Astolfi, and F. Babiloni,
mindfulness meditation,” J. Neuroeng. Rehabil., vol. 11, no. 1, p. 87, “Subject identification through standard EEG signals during resting
2014. states,” in Proc. IEEE Annu. Int. Conf. IEEE Eng. Medicine Biol. Soc.,
[5] O. Al Zoubi et al., “Predicting age from brain EEG signals–a machine 2011, pp. 2331–2333.
learning approach,” Frontiers Aging Neurosci., vol. 10, p. 184, 2018. [27] J. Fan et al., “A step towards EEG-based brain computer interface for
[6] E. Alickovic, J. Kevric, and A. Subasi, “Performance evaluation of empir- autism intervention,” in Proc. IEEE Eng. Medicine Biol. Soc. Annu. Conf.,
ical mode decomposition, discrete wavelet transform, and wavelet packed vol. 2015, NIH Public Access, 2015, Art. no. 3767.
decomposition for automated epileptic seizure detection and prediction,” [28] V. Gao, F. Turek, and M. Vitaterna, “Multiple classifier systems for auto-
Biomed. Signal Process. Control, vol. 39, pp. 94–102, 2018. matic sleep scoring in mice,” J. Neurosci. Methods, vol. 264, pp. 33–39,
[7] H. U. Amin, W. Mumtaz, A. R. Subhani, M. N. M. Saad, and A. S. Malik, 2016.
“Classification of EEG signals based on pattern recognition approach,” [29] M. Günay and T. Ensari, “EEG signal analysis of patients with epilepsy
Frontiers Comput. Neurosci., vol. 11, p. 103, 2017. disorder using machine learning techniques,” in Proc. IEEE Comput. Sci.,
[8] M. N. Anastasiadou, M. Christodoulakis, E. S. Papathanasiou, S. S. Biomed. Engineerings’ Meeting Electric Electron., 2018, pp. 1–4.
Papacostas, and G. D. Mitsis, “Unsupervised detection and removal of [30] L. Guo, D. Rivero, and A. Pazos, “Epileptic seizure detection using
muscle artifacts from scalp EEG recordings using canonical correlation multiwavelet transform based approximate entropy and artificial neu-
analysis, wavelets and random forests,” Clin. Neurophysiol., vol. 128, ral networks,” J. Neurosci. Methods, vol. 193, no. 1, pp. 156–163,
no. 9, pp. 1755–1769, 2017. 2010.
[9] A. Antoniades et al., “Deep neural architectures for mapping scalp to [31] M. I. Gursoy and A. Subast, “A comparison of PCA, ICA and LDA
intracranial EEG,” Int. J. Neural Syst., 2018, Art. no. 1850009. in EEG signal classification using SVM,” in Proc. IEEE 16th Signal
[10] V. Asanza et al., “Clustering of EEG occipital signals using k-means,” in Process., Commun. Appl. Conf., 2008, pp. 1–4.
Proc. IEEE Ecuador Tech. Chapters Meeting, 2016, pp. 1–5. [32] C. R. Hamilton, S. Shahryari, and K. M. Rasheed, “Eye state prediction
[11] N. Beganovic, J. Kevric, and D. Jokic, “Identification of diagnostic- from EEG data using boosted rotational forests,” in Proc. IEEE 14th Int.
related features applicable to EEG signal analysis,” in Proc. Annu. Conf. Conf. Mach. Learn. Appl., 2015, pp. 429–432.
PHM Soc., 2018, vol. 10. [33] R. Harikumar, T. Vijayakumar, and M. Sreejith, “Performance analysis
[12] M. Bentlemsan, E.-T. Zemouri, D. Bouchaffra, B. Yahya-Zoubir, and of SVD and k-means clustering for optimization of fuzzy outputs in
K. Ferroudji, “Random forest and filter bank common spatial patterns for classification of epilepsy risk level from EEG signals,” in Proc. IEEE
EEG-based motor imagery classification,” in Proc. IEEE 5th Int. Conf. 9th Int. Conf. Elect. Eng./Electron., Comput., Telecommun. Inf. Technol.,
Intell. Syst., Modelling Simul., 2014, pp. 235–238. 2012, pp. 1–4.
[13] N. Bigdely-Shamlo, A. Vankov, R. R. Ramirez, and S. Makeig, “Brain [34] M. P. Hosseini, “Developing a cloud based platform as a service to
activity-based image classification from rapid serial visual presentation,” improve public health of epileptic patients in urban places,” presented
IEEE Trans. Neural Syst. Rehabil. Eng., vol. 16, no. 5, pp. 432–441, at Reimagining Health in Cities: New Directions in Urban Health Re-
Oct. 2008. search, Drexel University School of Public Health, Philadelphia, USA.
[14] S. Biswal, Z. Nip, V. M. Junior, M. T. Bianchi, E. S. Rosenthal, and Sep. 2015.
M. B. Westover, “Automated information extraction from free-text EEG [35] M. P. Hosseini, “Brain-computer interface for analyzing epileptic
reports,” in Proc. IEEE 37th Annu. Int. Conf. Eng. Medicine Biol. Soc., big data,” Ph.D. thesis, Rutgers Univ.-School Graduate Studies, New
2015, pp. 6804–6807. Brunswick, NJ, USA, 2018.
[15] P. A. Bizopoulos, D. G. Tsalikakis, A. T. Tzallas, D. D. Koutsouris, [36] M.-P. Hosseini, A. Hajisami, and D. Pompili, “Real-time epileptic seizure
and D. I. Fotiadis, “EEG epileptic seizure detection using k-means detection from EEG signals via random subspace ensemble learning,” in
clustering and marginal spectrum based on ensemble empirical mode Proc. IEEE Int. Conf. Autonomic Comput., 2016, pp. 209–218.
decomposition,” in Proc. IEEE 13th Int. Conf. Bioinform. Bioeng., 2013, [37] M.-P. Hosseini, A. Lau, K. Elisevich, and H. Soltanian-Zadeh, “Mul-
pp. 1–4. timodal analysis in biomedicine,” in Big Data in Multimodal Medical
[16] S. Bose, V. Rama, and C. R. Rao, “EEG signal analysis for seizure Imaging. London, U.K.: Chapman and Hall/CRC, 2019, pp. 193–203.
detection using discrete wavelet transform and random forest,” in Proc. [38] M.-P. Hosseini, S. Lu, K. Kamaraj, A. Slowikowski, and H. C. Venkatesh,
IEEE Int. Conf. Comput. Appl., 2017, pp. 369–378. “Deep learning architectures,” in Deep Learning: Concepts and Archi-
[17] W. Chen et al., “Epileptic EEG visualization and sonification based on tectures, Berlin, Germany: Springer, 2020, pp. 1–24.
linear discriminate analysis,” in Proc. IEEE 37th Annu. Int. Conf. Eng. [39] M.-P. Hosseini, M. R. Nazem-Zadeh, F. Mahmoudi, H. Ying, and
Medicine Biol. Soc., 2015, pp. 4466–4469. H. Soltanian-Zadeh, “Support vector machine with nonlinear-kernel opti-
[18] A. M. Chiarelli, P. Croce, A. Merla, and F. Zappasodi, “Deep learning mization for lateralization of epileptogenic hippocampus in MR images,”
for hybrid EEG-FNIRS brain–computer interface: Application to motor in Proc. IEEE 36th Annu. Int. Conf. Eng. Medicine Biol. Soc., 2014,
imagery classification,” J. Neural Eng., vol. 15, no. 3, 2018, Art. no. pp. 1047–1050.
036028. [40] M.-P. Hosseini, M.-R. Nazem-Zadeh, D. Pompili, K. Jafari-Khouzani,
[19] E. Combrisson and K. Jerbi, “Exceeding chance level by chance: The K. Elisevich, and H. Soltanian-Zadeh, “Comparative performance evalu-
caveat of theoretical chance levels in brain signal classification and sta- ation of automated segmentation methods of hippocampus from magnetic
tistical assessment of decoding accuracy,” J. Neurosci. Methods, vol. 250, resonance images of temporal lobe epilepsy patients,” Med. Phys., vol. 43,
pp. 126–136, 2015. no. 1, pp. 538–553, 2016.

Authorized licensed use limited to: CZECH TECHNICAL UNIVERSITY. Downloaded on July 15,2022 at 16:45:08 UTC from IEEE Xplore. Restrictions apply.
HOSSEINI et al.: REVIEW ON MACHINE LEARNING FOR EEG SIGNAL PROCESSING IN BIOENGINEERING 217

[41] M.-P. Hosseini, M.-R. Nazem-Zadeh, D. Pompili, K. Jafari-Khouzani, [65] J. S. Kirar and R. Agrawal, “Relevant feature selection from a combina-
K. Elisevich, and H. Soltanian-Zadeh, “Automatic and manual segmen- tion of spectral-temporal and spatial features for classification of motor
tation of hippocampus in epileptic patients MRI,” arXiv:1610.07557, imagery EEG,” J. Med. Syst., vol. 42, no. 5, p. 78, 2018.
2016. [66] J. Laton et al., “Single-subject classification of schizophrenia patients
[42] M.-P. Hosseini, M. R. Nazem-Zadeh, D. Pompili, and H. Soltanian- based on a combination of oddball and mismatch evoked potential
Zadeh, “Statistical validation of automatic methods for hippocampus paradigms,” J. Neurological Sci., vol. 347, no. 1/2, pp. 262–267,
segmentation in MR images of epileptic patients,” in Proc. IEEE 36th 2014.
Annu. Int. Conf. Eng. Medicine Biol. Soc., 2014, pp. 4707–4710. [67] J. Le Douget, A. Fouad, M. M. Filali, J. Pyrzowski, and M. Le Van
[43] M.-P. Hosseini, D. Pompili, K. Elisevich, and H. Soltanian-Zadeh, “Op- Quyen, “Surface and intracranial EEG spike detection based on discrete
timized deep learning for EEG big data and seizure prediction BCI via wavelet decomposition and random forest classification,” in Proc. IEEE
Internet of Things,” IEEE Trans. Big Data, vol. 3, no. 4, pp. 392–404, 39th Annu. Int. Conf. IEEE Eng. Medicine Biol. Soc., 2017, pp. 475–478.
Dec. 2017. [68] Y.-H. Lee et al., “A cross-sectional evaluation of meditation experience
[44] M.-P. Hosseini, D. Pompili, K. Elisevich, and H. Soltanian-Zadeh, “Ran- on electroencephalography data by artificial neural network and support
dom ensemble learning for EEG classification,” Artif. Intell. Medicine, vector machine classifiers,” Medicine, vol. 96, no. 16, 2017.
vol. 84, pp. 146–158, 2018. [69] P. Li, C. Karmakar, J. Yearwood, S. Venkatesh, M. Palaniswami, and
[45] M. P. Hosseini, H. Soltanian-Zadeh, and S. Akhlaghpoor, “Computer- C. Liu, “Detection of epileptic seizure based on entropy analysis of short-
aided diagnosis system for the evaluation of chronic obstructive pul- term EEG,” PLOS ONE, vol. 13, no. 3, 2018, Art. no. e0193691.
monary disease on CT images,” Tehran University Med. J., vol. 68, [70] X. Li et al., “An ocular artefacts correction method for discriminative
no. 12, 2011. EEG analysis based on logistic regression,” in Proc. IEEE 23rd Eur.
[46] M. P. Hosseini, H. Soltanian-Zadeh, and S. Akhlaghpoor, “Three cuts Signal Process. Conf., 2015, pp. 2731–2735.
method for identification of COPD,” Acta Medica Iranica, vol. 51, no. 11, [71] Y.-H. Liu, S. Huang, and Y.-D. Huang, “Motor imagery EEG classifi-
pp. 771–778, 2013. cation for patients with amyotrophic lateral sclerosis using fractal di-
[47] M.-P. Hosseini, H. Soltanian-Zadeh, K. Elisevich, and D. Pompili, mension and fisher’s criterion-based channel selection,” Sensors, vol. 17,
“Cloud-based deep learning of big EEG data for epileptic seizure pre- no. 7, 2017, Art. no. 1557.
diction,” in Proc. IEEE Global Conf. Signal Inf. Process. (GlobalSIP), [72] M. Manjusha and R. Harikumar, “Performance analysis of KNN classifier
2016, pp. 1151–1155. and k-means clustering for robust classification of epilepsy from EEG
[48] M.-P. Hosseini, T. X. Tran, D. Pompili, K. Elisevich, and H. Soltanian- signals,” in Proc. IEEE Int. Conf. Wireless Commun., Signal Process.
Zadeh, “Deep learning with edge computing for localization of epilepto- Netw., 2016, pp. 2412–2416.
genicity using multimodal RS-FMRI and EEG big data,” in Proc. IEEE [73] T. Meyer, J. Peters, T. O. Zander, B. Schölkopf, and M. Grosse-Wentrup,
Int. Conf. Autonomic Comput., 2017, pp. 83–92. “Predicting motor learning performance from electroencephalographic
[49] M. P. Hosseini, T. X. Tran, D. Pompili, K. Elisevich, and H. Soltanian- data,” J. Neuroeng. Rehabil., vol. 11, no. 1, p. 24, 2014.
Zadeh, “ Multimodal data analysis of epileptic EEG and RS-fMRI via [74] M. Mirsadeghi, H. Behnam, R. Shalbaf, and H. J. Moghadam, “Charac-
deep learning and edge computin,” Artif. Intell. Medicine, vol. 104, 2020, terizing awake and anesthetized states using a dimensionality reduction
Art no. 101813. method,” J. Med. Syst., vol. 40, no. 1, p. 13, 2016.
[50] A. E. Hramov et al., “Classifying the perceptual interpretations of a [75] W. Mumtaz, S. S. A. Ali, M. A. M. Yasin, and A. S. Malik, “A ma-
bistable image using EEG and artificial neural networks,” Frontiers chine learning framework involving EEG-based functional connectivity
Neurosci., vol. 11, p. 674, 2017. to diagnose major depressive disorder (MDD),” Med. Biological Eng.
[51] W.-Y. Hsu, “Assembling a multi-feature EEG classifier for left–right Comput., vol. 56, no. 2, pp. 233–246, 2018.
motor imagery data using wavelet-based fuzzy approximate entropy [76] W. Mumtaz et al., “An EEG-based functional connectivity measure for
for improved accuracy,” Int. J. Neural Syst., vol. 25, no. 08, 2015, automatic detection of alcohol use disorder,” Artif. Intell. Medicine,
Art. no. 1550037. vol. 84, pp. 79–89, 2018.
[52] J. Hu and J. Min, “Automated detection of driver fatigue based on EEG [77] M. Murakami, S. Nakatani, N. Araki, Y. Konishi, and K. Mabuchi,
signals using gradient boosting decision tree model,” Cogn. Neurodyn., “Motion discrimination from EEG using logistic regression and schmitt-
vol. 12, no. 4, pp. 431–440, Aug. 2018. trigger-type threshold,” in Proc. IEEE Int. Conf. Syst., Man, Cybern.,
[53] J. Hu and Z. Mu, “EEG authentication system based on auto-regression 2015, pp. 2338–2342.
coefficients,” in Proc. IEEE 10th Int. Conf. Intell. Syst. Control, 2016, [78] M.-R. Nazem-Zadeh et al., “Lateralization of temporal lobe epilepsy
pp. 1–5. by imaging-based response-driven multinomial multivariate models,”
[54] A. Ishfaque, A. J. Awan, N. Rashid, and J. Iqbal, “Evaluation of ANN, in Proc. IEEE 36th Annu. Int. Conf. Eng. Medicine Biol. Soc., 2014,
LDA and decision trees for EEG based brain computer interface,” in Proc. pp. 5595–5598.
IEEE 9th Int. Conf. Emerging Technol., 2013, pp. 1–6. [79] E. Neto, F. Biessmann, H. Aurlien, H. Nordby, and T. Eichele, “Regular-
[55] I. Iturrate, L. Montesano, and J. Minguez, “Robot reinforcement learning ized linear discriminant analysis of EEG features in dementia patients,”
using EEG-based reward signals,” in Proc. IEEE Int. Conf. Robot. Autom., Frontiers Aging Neurosci., vol. 8, p. 273, 2016.
number EPFL-CONF-205134, pp. 4822–4829, 2010. [80] A. Onishi and K. Natsume, “Multi-class ERP-based BCI data analysis
[56] A. Jain, B. Abbas, O. Farooq, and S. K. Garg, “Fatigue detection and using a discriminant space self-organizing map,” in Proc. IEEE 36th
estimation using auto-regression analysis in EEG,” in Proc. IEEE Int. Annu. Int. Conf. Eng. Medicine Biol. Soc., 2014, pp. 26–29.
Conf. Adv. Comput., Commun. Informat., 2016, pp. 1092–1095. [81] M. S. Özerdem and H. Polat, “Emotion recognition based on EEG features
[57] A. K. Jain, “Data clustering: 50 years beyond k-means,” Pattern Recognit. in movie clips with channel selection,” Brain Informat., vol. 4, no. 4,
Lett., vol. 31, no. 8, pp. 651–666, 2010. p. 241, 2017.
[58] A. K. Jaiswal and H. Banka, “Epileptic seizure detection in EEG signal [82] A. Page, S. P. T. Oates, and T. Mohsenin, “An ultra low power feature
with GModPCA and support vector machine,” Bio-Med. Mater. Eng., extraction and classification system for wearable seizure detection,”
vol. 28, no. 2, pp. 141–157, 2017. in Proc. IEEE 37th Annu. Int. Conf. Eng. Medicine Biol. Soc., 2015,
[59] L. Jakaite, V. Schetinin, C. Maple, and J. Schult, “Bayesian decision pp. 7111–7114.
trees for EEG assessment of newborn brain maturity,” in Proc. IEEE UK [83] S. K. Prabhakar and H. Rajaguru, “PCA and k-means clustering for
Workshop Comput. Intell., 2010, pp. 1–6. classification of epilepsy risk levels from EEG signals? A comparitive
[60] A. Jalilifard, E. B. Pizzolato, and M. K. Islam, “Emotion classification study between them,” in Proc. IEEE Int. Conf. Intell. Informat. Biomed.
using single-channel scalp-EEG recording,” in Proc. IEEE 38th Annu. Sci., 2015, pp. 83–86.
Int. Conf. Eng. Medicine Biol. Soc., 2016, pp. 845–849. [84] S. Puntanen, “Regression analysis by example, by samprit chatterjee, Ali
[61] S. Jukić and J. Kevrić, “Majority vote of ensemble machine learning S. Hadi,” Int. Statistical Rev., vol. 81, no. 2, pp. 308–308, 2013.
methods for real-time epilepsy prediction applied on EEG pediatric data,” [85] A. F. Rabbi, L. Azinfar, and R. Fazel-Rezai, “Seizure prediction using
TEM J., vol. 7, no. 2, p. 313, 2018. adaptive neuro-fuzzy inference system,” in Proc. IEEE 35th Annu. Int.
[62] J. Kevric and A. Subasi, “The effect of multiscale PCA de-noising in Conf. Eng. Medicine Biol. Soc., 2013, pp. 2100–2103.
epileptic seizure detection,” J. Med. Syst., vol. 38, no. 10, p. 131, 2014. [86] A. F. Rabbi and R. Fazel-Rezai, “A fuzzy logic system for seizure onset
[63] J.-H. Kim, F. Bießmann, and S.-W. Lee, “Reconstruction of hand move- detection in intracranial EEG,” Comput. Intell. Neurosci., vol. 2012, p. 1,
ments from EEG signals based on non-linear regression,” in Proc. IEEE 2012.
Int. Winter Workshop Brain-Comput. Interface, 2014, pp. 1–3. [87] K. Rai, V. Bajaj, and A. Kumar, “Novel feature for identification of focal
[64] C. Kingsford and S. L. Salzberg, “What are decision trees? Nature EEG signals with k-means and fuzzy c-means algorithms,” in Proc. IEEE
Biotechnol., vol. 26, no. 9, p. 1011, 2008. Int. Conf. Digit. Signal Process., 2015, pp. 412–416.

Authorized licensed use limited to: CZECH TECHNICAL UNIVERSITY. Downloaded on July 15,2022 at 16:45:08 UTC from IEEE Xplore. Restrictions apply.
218 IEEE REVIEWS IN BIOMEDICAL ENGINEERING, VOL. 14, 2021

Mohammad-Parsa Hosseini (Senior Member, IEEE) received the B.Sc. degree in electrical and electronic engineering in 2006, the M.Sc. degree in biomedical engineering in 2008, the M.Sc. degree in electrical and communication engineering in 2010, and the Ph.D. degree in electrical and computer engineering, with research in computer science, from Rutgers University, New Brunswick, NJ, USA, in 2018. He also completed graduate study in electrical engineering at Wayne State University, Detroit, MI, USA, in 2013. He collaborates with the Medical Image Analysis Laboratory, Henry Ford Health System, and with the Clinical Neuroscience Department, Spectrum Health, Grand Rapids, MI, USA. He has been a Senior Data Scientist and Machine Learning Researcher in Silicon Valley, CA, USA, since 2017, and an Adjunct Lecturer and Faculty Member with several universities since 2009; he is currently with Santa Clara University. His current research interests include machine learning, deep learning, and signal and image processing. He has served on the scientific committees and review boards of several national and international conferences and journals.

Amin Hosseini (Member, IEEE) is with the Department of Electrical and Computer Engineering, with a minor in computer science, Azad University, Central Branch, Tehran, Iran. His research interests include digital signal and image processing, machine learning, artificial intelligence, and biomedical engineering. He is a member of the IEEE Signal Processing Society and the IEEE Machine Learning Society.

Kiarash Ahi (Member, IEEE) received the M.Sc. degree in electrical and information engineering from the Leibniz University of Hannover, Germany, in 2012, and the Ph.D. degree in electrical and computer engineering from the University of Connecticut, USA, in 2017. His M.Sc. work focused on smart grids, renewable energy systems, and power electronics; his Ph.D. studies spanned semiconductor technology, optics, machine learning and natural computation, compressive sensing, and terahertz signal and image processing. He is currently a Senior Researcher and Lead Product Development Engineer in the advanced semiconductor and software industry, where he investigates how artificial intelligence can enhance the accuracy and efficiency of semiconductor device manufacturing toward driving Moore's law beyond the 7-nm technology node. He architects automated systems empowered by machine learning and image processing and leads multinational R&D teams. His scientific and research interests include digital image and signal processing, optics and photolithography, MEMS and semiconductor devices, machine learning and artificial intelligence, hardware security, bioengineering, wearable technologies, embedded systems, human-computer interaction, terahertz technology, and intelligent software development.
