Idp 3
By
Anita Hasan
17301221
Fahim Abrar
21341028
Eshaan Tanzim Sabur
16101255
Iftehaj Muntasir
17301223
Sumaia Sadia Nafisha
17201030
Declaration
It is hereby declared that:
1. The thesis submitted is my/our own original work while completing a degree at Brac University.
2. The thesis does not contain material which has been accepted, or submitted, for any other degree or diploma at a university or other institution.
Approval
The thesis/project titled “Emotion Analysis using Machine Learning Model and
Deep Learning Model on DEAP Dataset” submitted by
Examining Committee:
Supervisor:
(Member)
Moin Mostakim
Lecturer
Department of CSE
BRAC University
Co-Supervisor:
(Member)
Head of Department:
(Chair)
Abstract
Emotion has a significant influence on how we think and interact with others. It serves as a link between how we feel and how we act, and it can influence our life decisions. Since the patterns of emotions and their reflections vary from person to person, their investigation must be based on approaches that are effective across a wide range of populations. To extract features and enhance accuracy, emotion recognition using brain waves (EEG signals) requires efficient signal processing techniques. Research on human-machine interaction technologies has been ongoing for a long time, and in recent years researchers have had great success in automatically recognizing emotion from brain signals. In our research, several emotional states were classified and tested on EEG signals collected from a well-known publicly available dataset, the DEAP dataset, using SVM (Support Vector Machine), KNN (K-Nearest Neighbour), and an advanced neural network model, an RNN (Recurrent Neural Network) built with LSTM (Long Short-Term Memory) units. The main purpose of this study is to apply improved methods to raise emotion recognition performance using brain signals. Moreover, emotions can change with time, so the changes in emotion over time are also examined in our research.
Dedication
This thesis is in honor of our parents, who have always inspired us to learn. They gave us the strength to grow and develop ourselves. We would not have been able to complete our studies without their invaluable support. They have always encouraged us in all our endeavors and have constantly loved us unconditionally. We would also like to dedicate this thesis to our friends who pushed us to do better, as well as the participants who assisted us.
Acknowledgement
First and foremost, our praises and appreciation go to the Almighty for His mercies during our thesis work, which enabled us to complete it successfully. Secondly, we would like to express our gratitude to our cherished family members, to whom we will be eternally grateful for their love, care and support. We would like to express our heartfelt gratitude to our supervisor, Moin Mostakim, for all of his assistance and persistent guidance. We also appreciate the assistance of our co-supervisor, Dr. Mohammad Zavid Parvez. They trusted us to do this research alongside them; without their direction and support, none of this would be possible. Finally, we would like to express our gratitude to all of the faculty members and staff of the CSE Department. They have helped us to grow and develop ourselves by providing a great learning and teaching atmosphere.
Table of Contents
Declaration i
Approval ii
Ethics Statement iv
Abstract iv
Dedication v
Acknowledgment vi
List of Figures ix
List of Tables x
Nomenclature xi
1 Introduction 1
1.1 An Overview of Emotion Recognition and Its Approaches . . . . . . 1
1.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Research Aims and Objectives . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Thesis Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Literature Review 4
2.1 Variety of Analyses with EEG Signals . . . . . . . . . . . . . . . . . . 4
2.2 DEAP Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3 Background Study 8
3.1 FFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2 RNN and LSTM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3 DWT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4 Proposed Methodology 11
4.1 Materials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4.2 Data Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4.3 Welch’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.3.1 Topographical Mapping: . . . . . . . . . . . . . . . . . . . . . 18
4.4 Cross-Validation and Splitting Dataset . . . . . . . . . . . . . . . . . 18
4.5 Confusion matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.6 Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.7 Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.8 K-Nearest Neighbour . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
6 Conclusion and Future Work 45
6.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.2 Future Work Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Bibliography 50
List of Figures
List of Tables
Nomenclature
The following list describes several symbols and abbreviations that will be used later
within the body of the document.
EEG Electroencephalogram
ML Machine Learning
SD Standard Deviation
Chapter 1
Introduction
emotion. The advancement of a reliable human emotion recognition system using EEG signals could help people regulate their emotions and open up new possibilities in fields like education, entertainment, and security, and it might aid people suffering from alexithymia or other psychiatric diseases.
1.3 Research Aims and Objectives

• Explore the DEAP dataset and its components to gain a broader idea of the dataset and of how emotional states work

• Utilize both machine learning and deep learning models to extract features from the dataset and to train and test on it

• Compare all important machine learning classifiers and metrics to find the best outcome for the prediction.
1.4 Thesis Orientation

• In Chapter 2, similar work in this field is discussed in depth, as well as existing methodologies used by researchers. We discuss the results of different types of related research and the algorithms, techniques and models used to reach those conclusions.

• The proposed model and methodologies we use in this paper are described in detail in Chapter 4. The chapter also covers the train-test splitting ratio and dataset validation, as well as a detailed description of the machine learning and deep learning models we use in our research.

• The test results and related discussions are presented in Chapter 5. The chapter includes a comparison of various measurement metric approaches. The implementation of the machine learning algorithms and the neural network model (RNN) is shown there.
Chapter 2
Literature Review
There are two parts to this chapter. The first section will discuss various types of
studies that focus on various analyses involving EEG data, as well as their conclu-
sions and significance. The second section goes through the previous studies and
how they performed on the DEAP Dataset.
2.1 Variety of Analyses with EEG Signals

accurate detection and identification of environmental factors when subjects had better or poorer sleep cycles by Pouya Bashivan [13] and Stanislas Chambon [28]. In this study, Convolutional Neural Networks were used to extract time-invariant features, and bidirectional Long Short-Term Memory was used to automatically predict sleep transition stages; this has led to various sleep improvement mobile applications that help people develop different exercises to get the best sleep. The researchers made use of data from EEG epochs. They employed a two-step training procedure as well as two publicly available sleep datasets to conduct their research. First, the model was trained to learn filters for extracting time-invariant features from raw single-channel EEG epochs. Second, utilizing sequence residual learning, the researchers incorporated the sleep stage transition rules into the features recovered from a sequence of EEG epochs. In their article, Thomas J. et al. [33] point out that CNNs require images as raw data, so EEG signals must be transformed into 2D projection map images. Many other articles have used different classification algorithms with differing degrees of success, ranging from 50 to 85 percent. Belakhdar et al. [17] obtained a maximum accuracy of 86.5 percent when detecting drowsiness using the MIT-BIH Polysomnographic dataset in their study. Jin et al. [2], while analyzing emotions, reported promising results, claiming that combining FFT, PCA, and SVM yielded results that were about 90 percent accurate. Consequently, it is the feature extraction stage, rather than the complexity of the classification algorithm used, that determines the accuracy of a model, and well-designed categorization systems can offer consistent precision and recall. In the article [12], the authors also explored several techniques and feature types from the time domain, the frequency domain, statistical features and time-frequency features with an emotion recognition system. Physiological data were utilized in another study [8] to assess stress: skin behavior was observed for 5 days for 18 individuals using wrist sensors, together with their mobile phone activity such as calls, SMS and location. In the research of Yishu [46], the analysis suggested that individual emotions be recognized by integrating six statistical characteristics of the EEG signal from the temporal domain as EEG features. A procedure to remove noisy and superfluous channels using PCA and the ReliefF algorithm was carried out; an SVM was then built and used with DEAP to recognize emotions. The results from the research achieved an overall precision rating of 81.87 percent. Google recently completed a project titled FaceNet [16], in which they trained a deep convolutional network on the Labeled Faces in the Wild (LFW) dataset of facial photos and obtained 99.63 percent accuracy. Xing et al. [39] developed a stacked autoencoder (SAE) to decompose EEG data and classify them using an LSTM model. The observed valence accuracy rate was 81.1 percent, while the observed arousal accuracy rate was 74.38 percent.
2.2 DEAP Dataset
The DEAP dataset was utilized by the following authors to analyze emotional states. The impact of the frequency bands and the number of channels on the accuracy of identifying emotional states from brain signals is explained by Hongpei et al. [31]. The K-nearest neighbour classifier is employed. In the gamma frequency range, accuracy of around 95 percent is attained, and accuracy rose as the number of channels grew through 10, 18, and 32. Zamanian et al. [34] retrieved Gabor and IMF along with time-domain features and attained an accuracy of 93 percent for 3 and 7 channels using multiclass SVM as a classifier. Chao et al. [29] investigated a deep learning architecture, reaching 75.92 percent accuracy for arousal and 76.83 percent for valence states. Using SVM, Liu et al. [32] categorized the data using time and frequency domain characteristics and attained 70.3 percent and 72.6 percent accuracy, respectively. Mohammadi et al. [24] classified arousal and valence using the entropy and energy of each frequency band and reached an accuracy of 84.05 percent for arousal and 86.75 percent for valence. Xian et al. [22] utilized MCF with statistical, frequency, and nonlinear dynamic characteristics to predict valence and arousal with 83.78 percent and 80.72 percent accuracy, respectively. Maria et al. [23] explored power characteristics based on Russell's circumplex model and applied them to an SVM, with results of 88.4 percent for valence and 74 percent for arousal. Singh et al. [26] utilized an SVM classifier to divide emotions into four quadrants based on ERP and latency data. The accuracy rate for single trials ranged from 62.5 percent to 83.3 percent, while for multi-subject trials they achieved a categorization rate of 55 percent for 24 subjects. Ang et al. [21] developed a classification method using wavelet transform and time-frequency characteristics with an ANN. For the joyful feeling, the classification rate was 81.8 percent for the mean and 72.7 percent for the standard deviation; the performance of frequency-domain characteristics for sad emotions was 72.7 percent. Krishna and colleagues [27] utilized the tunable-Q wavelet transform to obtain sub-bands of EEG data recorded from 24 electrodes while subjects watched 30 video clips, achieving a classification accuracy of 84.79 percent. According to the survey, researchers have created several ML algorithms that use various characteristics and have an accuracy of 74 to 90 percent. In one suggested approach [40], a machine learning algorithm is created to categorize emotional states into four groups utilizing the channel fusion method, using data available from the DEAP and SEED-IV datasets independently. The authors of [45] conducted a comparative study of domain adaptation strategies on two emotional EEG datasets, as well as a preliminary investigation of cross-dataset emotion recognition; they presented an emotion identification system for EEG data based on the AdaBoost ensemble learning algorithm. You-Yun Lee and Shulan Hsieh [11] conducted research in which they utilized video snippets to categorize emotions as positive, neutral, or negative [22]. Following that, the researchers examined brain connectivity as a result of watching videos with different emotional categorizations. Three indices are used to assess brain connectivity: correlation, coherence, and the phase synchronization index. Correlation refers to the degree of the link between two brain locations. Coherence describes how closely two brain locations operate together at a given frequency. Finally, the phase synchronization index describes the similarity of two signals' phases. It was discovered that, in the theta and alpha bands, negative emotion showed stronger connections with the occipital site than neutral or positive emotion. Positive emotion exhibited stronger temporal connections than neutral emotion, particularly in the right hemisphere of the brain. Negative emotions had stronger coherence in the theta, alpha, and beta bands than positive emotions, with the biggest difference in the right parietal and occipital areas, as shown in Figure 3. At each frequency range, positive emotion was more synchronized than negative emotion, especially in the frontal area. Negative states exhibited greater correlation and coherence than positive states overall, particularly in the occipital and temporal areas. This study found a strong relationship between various EEG signals for films with variable perceived emotional output. Yimin Hou and Shuaiqi Chen [37] investigated the connection between EEG signals and emotional states evoked by music in another study. It was discovered that when an individual listened to emotion-evoking music, there was more energy at various frequencies in the frontal area. While listening to cheerful music, theta and alpha energy levels were notably high in the occipital area, while beta energy levels were greater around the forehead. Listening to sorrowful music increased the activity of alpha signals. Alhagry et al. [20] developed a deep learning technique for identifying emotions from raw EEG data that used long short-term memory (LSTM) neural networks to learn features from EEG signals and then classified these characteristics as low/high arousal, valence, and liking. The DEAP dataset was used to evaluate the technique. The method's average accuracy was 85.45 percent for arousal and 85.65 percent for valence.

Therefore, we present an emotion recognition approach for EEG data based on the collective learning method in this work. To begin, we will attempt to summarize the EEG data and then use FFT and DWT to extract features from the preprocessed EEG signals. We will attempt to analyze the FFT and DWT findings using various machine learning models. We will also try to apply an RNN with LSTM to particular EEG channels and various band waves, with the goal of training a model with higher accuracy for learning, testing, and training.
Chapter 3
Background Study
3.1 FFT
The discrete Fourier transform (DFT) converts a finite sequence of samples into an equivalent frequency-domain representation. The Fast Fourier Transform (FFT) is a mathematical procedure that computes the DFT of a sequence efficiently; it produces results identical to evaluating the DFT definition directly, but at a much faster rate. Alternatively, think of the discrete Fourier transform as a transformation that decomposes a waveform's cycle into sine components. The FFT is applied in a variety of signal processing applications, including audio, image and video processing, and it can be used to solve various types of equations or to graphically depict different kinds of frequency activity. Fourier analysis is a signal processing technique used to convert a digital signal (x) of length (N) from the time domain to the frequency domain (X) and vice versa, and it can be used to investigate both continuous and discrete time signals. The FFT is widely utilized when estimating the Power Spectral Density (PSD) of an EEG signal. The PSD describes how the power (spectral energy) of a signal is distributed over frequency. It can be computed directly on the signal using the FFT, or indirectly by transforming the estimated autocorrelation sequence.
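As a brief illustration, the following minimal sketch (using NumPy/SciPy on a synthetic signal, not DEAP data) estimates the PSD with an FFT-based periodogram and recovers the dominant frequency of a noisy 10 Hz sine:

```python
import numpy as np
from scipy.signal import periodogram

fs = 128.0                    # sampling rate in Hz (DEAP EEG is downsampled to 128 Hz)
t = np.arange(0, 4, 1 / fs)   # 4 seconds of synthetic signal
# a 10 Hz (alpha-range) sine plus noise as a stand-in for an EEG channel
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)

# periodogram() computes the FFT internally and returns the PSD estimate
freqs, psd = periodogram(x, fs=fs)
print(freqs[np.argmax(psd)])  # expected to peak near 10 Hz
```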
3.2 RNN and LSTM

Recurrent neural networks (RNNs) may be incredibly precise in forecasting what will happen next because of their internal memory, which allows them to retain key input details. They are popular because they are good at handling sequential data such as time series and speech. Recurrent neural networks have the advantage over other algorithms in that they can gain a deeper understanding of a sequence and its context. A short-term memory is common in RNNs; when combined with an LSTM, they have a long-term memory as well (more on that later). An example is a good way to illustrate the notion of memory in a recurrent neural network: feed the word "neuron" into a feed-forward neural network one character at a time. By the time it gets to the letter "r", a feed-forward network has already forgotten about the letters "n", "e", and "u", making it almost impossible to anticipate the next character. A recurrent neural network, on the other hand, can recall those characters because of its internal storage: its output is sent back into the network through a feedback loop. In a nutshell, recurrent neural networks enrich the present by incorporating memories from the recent past; an RNN is fed the present state as well as the recent past. Because the data sequence provides important information about what will happen next, an RNN can do jobs that other algorithms are unable to complete [48].
Long short-term memory (LSTM) networks are an extension of recurrent neural networks that effectively expands their memory, which makes them well suited to learning from experiences separated by long time lags. The layers of an RNN, which is then sometimes referred to as an LSTM network, are built using LSTM units. By assigning "weights" to data, LSTMs let an RNN either assimilate new information, forget it, or give it enough importance to alter the result. With the help of LSTMs, RNNs can remember inputs over long periods, because LSTMs store information in a memory comparable to that of a computer: the LSTM can read, write, and delete information from its memory. This memory can be thought of as a gated cell, where "gated" means that the cell decides whether to store or erase data (i.e., whether to open the gates) based on the importance it assigns to the data. Importance is allocated through weights, which the algorithm also learns; this basically means that it learns over time which data is critical and which is not [48]. Typical RNNs are trained via back-propagation through time (BPTT), which introduces the vanishing/exploding gradient problem, so learning long sequences can be problematic. To address this, the plain RNN cell is replaced by a gated cell, such as an LSTM cell. Figure 3.1 depicts the architecture of an LSTM cell [51]. Its gates determine which data must be stored in memory and which must not, and the added memory allows an LSTM cell to keep track of previous activity. The cell state is crucial for LSTMs: the LSTM may modify the state of the cell by removing or adding information using three gates. The first gate is a forget gate, which uses a sigmoid layer to decide which information to erase from the cell state [20].
The second gate is an input gate, with a sigmoid layer that determines which values should be updated and a tanh layer that creates a vector of candidate updated values. Finally, the output of the current state is calculated using the updated cell state and a sigmoid layer that determines which parts of the cell state will be output, where σ is the sigmoid activation function that squashes numbers into the range (0,1) and tanh is the hyperbolic tangent activation function that squashes numbers into the range (-1,1) [51].
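For reference, the three-gate update described above can be written in the standard LSTM notation (σ is the sigmoid function, ⊙ denotes element-wise multiplication, and [h_{t-1}, x_t] is the concatenation of the previous hidden state and the current input):

```latex
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) && \text{candidate values} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update} \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state / output}
\end{aligned}
```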
3.3 DWT
The Discrete Wavelet Transform (DWT) is a versatile signal processing technique
that may be used in a wide range of applications [7]. DWT can denoise and extract
characteristics from a wide range of signals, including physiological (EEG, EMG,
EOG, ECG, BVP, and others), voice, vibration, acoustic, and biological data. A
discrete wavelet transform (DWT) splits a signal into several sets, each of which is
a time series of coefficients reflecting the temporal development of the signal in the
relevant frequency band [4].
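A minimal sketch of such a decomposition (assuming the PyWavelets library; the db4 wavelet and the decomposition level here are illustrative choices, not taken from this thesis):

```python
import numpy as np
import pywt

fs = 128.0
x = np.random.randn(int(8 * fs))   # placeholder for one 8-second EEG channel

# wavedec() returns [cA_n, cD_n, ..., cD_1]: one approximation band plus
# one detail (coefficient) band per decomposition level
coeffs = pywt.wavedec(x, wavelet="db4", level=4)
for name, c in zip(["A4", "D4", "D3", "D2", "D1"], coeffs):
    print(name, c.shape, np.mean(c ** 2))   # e.g. band energy as a simple feature
```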
Chapter 4
Proposed Methodology
4.1 Materials
For our research, we have chosen the DEAP [49] dataset. The DEAP dataset for emotion classification is freely available on the internet. A number of physiological signals found in the DEAP dataset can be utilized to determine emotions. It includes ratings of four main types of states: valence, arousal, dominance, and liking. Because various sample rates and different types of tests were used in data gathering, the DEAP dataset is an amalgamation of many different data types. EEG data was gathered from 32 participants, comprising 16 men and 16 women, over 32 channels. The EEG signals were collected by playing 40 different one-minute music videos and recording the results. After viewing each video, participants were asked to rate it on a scale of one to nine. The total number of video ratings was therefore 1280: the number of videos (40) multiplied by the number of volunteers (32). Following that, the signals were downsampled from 512 Hz to 128 Hz and denoised using bandpass and lowpass frequency filters. The 512 Hz EEG signals were acquired from the following 32 sensor positions (according to the international 10-20 positioning system): Fp1, AF3, F3, F7, FC5, FC1, C3, T7, CP5, CP1, P3, P7, PO3, O1, Oz, Pz, Fp2, AF4, Fz, F4, F8, FC6, FC2, Cz, T8, CP2, P4, P8, PO4, and O2 (Figure 4.1).
Figure 4.1: Sensor position on head to collect data
The international 10-20 system specifies the placement of the electrodes on the human skull used to record electroencephalogram (EEG) data. The values "10" and "20" indicate that the distance between adjacent electrodes is 10 percent or 20 percent of the front-to-rear distance of the skull, or of its right-to-left distance, respectively. A frontal face video was also recorded for 22 of the participants. Several signals, including EEG, electromyograms, breathing region, plethysmographs, temperature, and so on, were gathered as 40-channel data during each subject's 40 trials, with each channel representing a different signal. EEG data is stored in 32 of the 40 available channels, which is a significant amount for research.
4.2 Data Visualization
Emotions can be classified into various categories, the most prominent being negative emotions (such as anger and anxiety, of which disgust, shame, fear, and sadness are instances) and positive emotions (such as affection and amusement, of which happiness, joy, pleasure, pride, and relief are examples). Arousal-valence space is an alternative that uses continuous values to define emotions: arousal relates to the intensity of an emotion, i.e., the power of the related emotional state, whereas valence refers to the degree to which an emotion is positive or negative. For this research, we focused on valence and arousal. There were 1240 trials in this study from which valence and arousal ratings were obtained. We gathered information from ratings on a scale of one to nine, as defined by the dataset. To begin with, we plotted the 40 data rows of the first participant to visualize how the valence and arousal ratings were distributed. The summary statistics over all trials are:
                              Valence     Arousal
count                         1240        1240
mean                          5.252435    5.144210
sd                            2.136497    2.031844
min                           1.00        1.00
25% (first quartile, Q1)      3.80        3.68
50% (median, Q2)              5.04        5.165
75% (third quartile, Q3)      7.05        6.94
max                           9.00        9.00
We extracted valence and arousal ratings from the dataset. The combination of valence and arousal can be converted into emotional states: High Arousal Positive Valence (Excited, Happy), Low Arousal Positive Valence (Calm, Relaxed), High Arousal Negative Valence (Angry, Nervous) and Low Arousal Negative Valence (Sad, Bored). We have analyzed the changes in emotional state along with the number of trials for each group by following Russell's circumplex model. The question might arise of how to classify the dataset; Russell's circumplex model can help classify the DEAP dataset. Instead of Russell's scale of real numbers 0-10, the DEAP dataset employs self-assessment manikins (SAMs) [1], and 1-5 and 5-9 were chosen as the scales based on self-evaluation ratings [9], [15], [35]. In that scheme the label is "positive" if the rating is greater than or equal to 5, and "negative" if it is less than 5. We utilized a different way to determine "positive" and "negative" values. Valence and arousal were each rated on a scale of 1 to 9. It is not a good idea, in our opinion, to categorize the dataset using a mean value, because a particular "number" may express something different for different users. As a result, we used median values to discriminate between "positive" and "negative" values.
Figure 4.2: Data rows Plotting of first participant
We looked to see whether each trial had a positive or negative valence, as well as a positive or negative arousal level. In general, values larger than the median are regarded as "positive", while those less than the median are regarded as "negative".
As a result, four labels have been created: high arousal low valence (HALV), low
arousal high valence (LAHV), high arousal high valence (HAHV), and low arousal
low valence (LALV).
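The median-split labeling described above can be sketched as follows (assuming ratings are held in a pandas DataFrame; the column names and toy values are illustrative):

```python
import pandas as pd

ratings = pd.DataFrame({"valence": [7.1, 2.3, 8.0, 4.2],
                        "arousal": [6.5, 3.1, 2.9, 7.8]})  # toy values

# values above the median count as "positive", below as "negative"
v_pos = ratings["valence"] > ratings["valence"].median()
a_pos = ratings["arousal"] > ratings["arousal"].median()

labels = pd.Series("LALV", index=ratings.index)   # low arousal, low valence
labels[a_pos & v_pos] = "HAHV"
labels[a_pos & ~v_pos] = "HALV"
labels[~a_pos & v_pos] = "LAHV"
print(pd.concat([ratings, labels.rename("label")], axis=1))
```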
A box plot is a graphical representation of numerical data groupings through their quartiles, used in descriptive statistics. Lines extending from the boxes (whiskers) indicate variability outside the upper and lower quartiles. A boxplot is a standardized data visualization approach that uses a five-number summary: minimum, first quartile, median, third quartile, and maximum.
No.   Proposed Labels              Emotional States
1.    High Arousal Low Valence     Angry, Nervous
2.    High Arousal High Valence    Happy, Excited
3.    Low Arousal High Valence     Calm, Relaxed
4.    Low Arousal Low Valence      Sad, Bored
The mean value is shown by the blue line in the boxplot; it marks the location with the greatest concentration of values. In the boxplot, we can observe four groups of means in terms of valence: HAHV (mean = 7.22), LAHV (mean = 6.57), HALV (mean = 3.08), and LALV (mean = 3.58). In terms of arousal, HAHV has a mean of 6.86, LAHV has a mean of 3.8, HALV has a mean of 6.8, and LALV has a mean of 3.11.
One-hot encoding is a technique for transforming categorical data into a format that machine learning algorithms can use to enhance prediction accuracy. To encode category information, it employs a one-hot numeric array; each level of the categorical variable is compared to a preset reference level [50]. A single variable with n observations and d distinct values is transformed into d binary variables with n observations each [25]. The experiment uses 1 as positive and 0 as negative to study the dataset in our research [6]. For our research, we have transformed the trial description onto a scale from 0 to 1; the transformed values are shown in the tables.
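A minimal sketch of one-hot encoding the four labels (assuming scikit-learn 1.2 or later, where the parameter is sparse_output rather than sparse; the label values follow the quadrants above):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([["HAHV"], ["LALV"], ["HALV"], ["HAHV"], ["LAHV"]])

# each of the d = 4 label values becomes its own binary column
encoder = OneHotEncoder(sparse_output=False)
onehot = encoder.fit_transform(labels)
print(encoder.categories_)   # the learned category levels
print(onehot)                # rows of 0s with a single 1 per row
```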
Figure 4.4: Box Plot on Channels
In the DEAP dataset, there are 40 channels, of which 32 are EEG channels; breathing region, GSR, temperature and plethysmograph channels are the rest. From all of the channels, 40 x 8064 data points are collected per trial. Extracting features from EEG data can be done in a variety of ways. Feature extraction requires periodogram and power spectral density (PSD) calculations and the combination of band waves of various frequencies. Periodogram and PSD calculations can also be done in a variety of ways; the PSD can be estimated using the periodogram [36], which is determined by the modulus squared of the signal's Fourier transform and constitutes a frequency decomposition:
S(f) = \frac{\Delta t}{N} \left| \sum_{n=0}^{N-1} x_n e^{-2\pi i k n / N} \right|^2    (4.1)

where S(f) is the PSD of x_n, \Delta t is the spacing between samples, x_n is the input sequence, and N is the number of elements in the input sequence.
P_i(f) = \frac{1}{MU} \left| \sum_{n=0}^{M-1} x_i(n)\, w(n)\, e^{-j 2\pi f n} \right|^2    (4.2)

P_{\mathrm{welch}}(f) = \frac{1}{L} \sum_{i=0}^{L-1} P_i(f)    (4.3)
The power spectral density (PSD) shows how a signal's power is distributed in the frequency domain. Among PSD estimators, Welch's method and the multitaper approach have demonstrated the best results [30]. The input signal x[n], n = 0, 1, 2, ..., N-1 is divided into a number of overlapping segments [43]. Let M be the length of each segment, with n = 0, 1, 2, ..., M-1.
x_i(n) = x\!\left[i \cdot \frac{M}{2} + n\right]    (4.4)

where n = 0, \ldots, M-1 and i = 0, 1, 2, \ldots, L-1. Each segment is given a smooth window w(n); in most cases, the Hamming window is employed. The Hamming window formula for each segment is:

w(n) = 0.54 - 0.46 \cos\!\left(\frac{2\pi n}{M}\right)    (4.5)
Here,

U = \frac{1}{M} \sum_{n=0}^{M-1} w^2(n)    (4.6)

denotes the mean power of the window w(n), so that

MU = \sum_{n=0}^{M-1} w^2(n)    (4.7)
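A sketch of band-power extraction with SciPy's implementation of Welch's method (the Hamming window and 50 percent segment overlap follow the formulas above; the band edges are the usual theta/alpha/beta/gamma conventions and the signal is synthetic, not DEAP data):

```python
import numpy as np
from scipy.signal import welch

fs = 128.0
x = np.random.randn(int(60 * fs))    # placeholder for one 60-second EEG trial

# Hamming-windowed, 50%-overlapping segments of length M = 256,
# averaged into one PSD estimate, as in equations (4.2)-(4.7)
freqs, psd = welch(x, fs=fs, window="hamming", nperseg=256, noverlap=128)

bands = {"theta": (4, 8), "alpha": (8, 12), "beta": (12, 30), "gamma": (30, 45)}
for name, (lo, hi) in bands.items():
    mask = (freqs >= lo) & (freqs < hi)
    power = np.trapz(psd[mask], freqs[mask])   # integrate the PSD over the band
    print(f"{name}: {power:.4f}")
```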
4.3.1 Topographical Mapping:
The use of topographical mapping to show EEG data is quite useful. Voltage activity will be examined in our study. The black dots correspond to the approximate physical placements of each electrode on the scalp, and the map lets us see changes in the data at a single time point or across multiple time points [44]. This is a particularly powerful visualization method.
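One possible sketch of such a map (assuming the MNE-Python library is available; the plotted values are placeholders, and the channel list follows Section 4.1):

```python
import numpy as np
import mne

# DEAP electrode names from the international 10-20 layout (Section 4.1)
ch_names = ["Fp1", "AF3", "F3", "F7", "FC5", "FC1", "C3", "T7",
            "CP5", "CP1", "P3", "P7", "PO3", "O1", "Oz", "Pz",
            "Fp2", "AF4", "Fz", "F4", "F8", "FC6", "FC2", "Cz",
            "T8", "CP2", "P4", "P8", "PO4", "O2"]
info = mne.create_info(ch_names, sfreq=128.0, ch_types="eeg")
info.set_montage(mne.channels.make_standard_montage("standard_1020"))

values = np.random.randn(len(ch_names))  # one value per electrode, e.g. voltage
mne.viz.plot_topomap(values, info)       # black dots mark electrode positions
```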
\text{Accuracy} = \frac{\text{Correct Predictions}}{\text{Total Predictions}}    (4.8)

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}    (4.9)

\text{Precision} = \frac{TP}{TP + FP}    (4.10)

\text{Recall} = \frac{TP}{TP + FN}    (4.11)
Figure 4.5: Distribution table of a confusion matrix
F = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}    (4.12)
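These metrics can be computed directly, for example with scikit-learn (the labels below are toy values, not results from this thesis):

```python
from sklearn.metrics import confusion_matrix, classification_report

y_true = [1, 0, 1, 1, 0, 1, 0, 1]   # toy ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]   # toy predictions

# for binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(classification_report(y_true, y_pred))  # precision, recall, F1 per class
```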
The weighted K-NN approach is built by introducing a neighbor weight that varies exponentially with the neighbor's squared Euclidean distance. Since the value of k in our study is 5, the classifier will look for the 5 nearest neighbors of a data point, which may provide the best classification result. The distance equation is:
\text{Distance}(x, y) = \sqrt{\sum_i (x_i - y_i)^2}    (4.15)
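A minimal sketch of the k = 5 classifier (assuming scikit-learn; weights="distance" is one way to realize the distance-based neighbor weighting mentioned above, and the data here is synthetic):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 24))     # placeholder band-power feature vectors
y = rng.integers(0, 2, size=200)   # placeholder binary valence/arousal labels

# k = 5 as in this study; closer neighbours get a larger say in the vote
knn = KNeighborsClassifier(n_neighbors=5, weights="distance", metric="euclidean")
knn.fit(X[:150], y[:150])
print(knn.score(X[150:], y[150:])) # held-out accuracy
```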
Chapter 5
Using Welch's method, we get the following results. These figures show the relationship between power spectral density and the band waves' frequencies. From the periodogram figures, we can see the peak points of the signal: for the theta band, the peak point is somewhere between 5-7 Hz; for the alpha band, somewhere between 10-12 Hz; and for the beta band, somewhere between 16-19 Hz. For the gamma band, the peak cannot easily be detected by eye. The data and information are collected from the EEG signal to observe the relationship among the signals, the power spectral density and the band waves.
Figure 5.1: Theta band on Welch’s Periodogram
Figure 5.3: Beta band on Welch's Periodogram
Figure 5.5: Power Spectral Density across the channels
Figure 5.6: The change in voltage with respect to time with EEG signal
5.2.1 Theta Band
Figure 5.10: Voltage topographical map (alpha band)
5.2.4 Gamma Band
5.3 Analysis of Machine Learning Model Using FFT

After working on Welch's feature extraction method and topographical mapping, we calculated the mean accuracy, the standard deviation and the time taken to get the results from data visualization using FFT. For our research, we have processed new datasets with 6 EEG regions and four band power values: theta, alpha, beta and gamma. We have divided the EEG regions into left (Fp1, AF3, F7, FC5, T7), right (Fp2, AF4, F8, FC6, T8), frontal (F3, FC1, Fz, F4, FC2), parietal (P3, P7, Pz, P4, P8), occipital (O1, Oz, O2, PO3, PO4) and central (CP5, CP1, Cz, C4, C3, CP6, CP8) sensor positions. The research calculates the mean, standard deviation, min, first quartile, median, third quartile and max values of the 1240 trials for the six regions based on the four band power values. For this research, we used SVM and K-NN classifiers; the SVM classifier used a linear kernel.
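A sketch of the comparison under these settings (assuming scikit-learn; the feature matrix is a synthetic stand-in for the 6 regions x 4 bands features, so the scores it prints are not the thesis results):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(1240, 24))    # 1240 trials x (6 regions x 4 bands) features
y = rng.integers(0, 2, size=1240)  # binary arousal (or valence) labels

for name, clf in [("SVM", SVC(kernel="linear")),
                  ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.4f}, std {scores.std():.4f}")
```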
5.3.1 Arousal-Accuracy
                    SVM                     KNN
Mean of accuracy    58.52%                  62.32%
STD                 0.033537092407755495    0.01809672118602032
Time in minutes     2.8121                  0.1343

Table 5.2: Accuracy of band power values on Arousal using SVM and KNN classifiers
                    SVM                     KNN
Mean of F1          58.35%                  63.28%
STD                 0.05551135016067932     0.019482042202837713
Time in minutes     2.76777                 0.1281

Table 5.3: F1-score of band power values on Arousal using SVM and KNN classifiers
5.3.3 Valence-Accuracy
                    SVM                     KNN
Mean of accuracy    56.79%                  56.92%
STD                 0.03921589438580304     0.04104184725930891
Time in minutes     3.3126                  0.1471

Table 5.4: Accuracy of band power values on Valence using SVM and KNN classifiers
                    SVM                     KNN
Mean of F1          65.61%                  60.48%
STD                 0.0437543076694958      0.04882980750657068
Time in minutes     3.1484                  0.1253

Table 5.5: F1-score of band power values on Valence using SVM and KNN classifiers
Moreover, the research compares the accuracy rate and F1-score of the valence label based on the top EEG regions, the top bands, the top EEG region per band, and the band with the highest score per EEG region, using the K-NN classifier.
Table 5.6: Valence accuracy results based on EEG regions and EEG bands
Table 5.7: Valence F1-score results based on EEG regions and EEG bands
To observe the confusion matrix, we have worked on the top combinations for valence. In the first experiment, we plotted the confusion matrix of valence with respect to the “theta” band and “central” EEG regions using the KNN algorithm.
Figure 5.15: Confusion matrix of valence with respect to “theta” band and “central”
EEG regions using KNN algorithm
Table 5.8: TP, TN, FP, FN Distribution of valence with respect to “theta” band
and “central” EEG regions using KNN
                 Precision   Recall   F1-Score   Support
0                0.55        0.54     0.54       69
1                0.70        0.71     0.70       104
Accuracy                              0.64       173
Macro Avg        0.63        0.62     0.62       173
Weighted Avg     0.64        0.64     0.64       173
Table 5.9: Distribution of different metrics on valence with respect to “theta” band
and “central” EEG regions using KNN
We have also observed the distribution of the 173 test samples for valence with respect to the “beta” band and “left” EEG regions using the KNN algorithm.
Figure 5.16: Confusion matrix of valence with respect to “beta” band and “left”
EEG regions using KNN algorithm
N = 173    Positive    Negative
True       35          63
False      34          41

Table 5.10: TP, TN, FP, FN distribution of valence with respect to the “beta” band and “left” EEG regions using KNN
Table 5.11: Distribution of different metrics on valence with respect to “beta” band
and “left” EEG regions using KNN
We have also observed the distribution of the 173 test samples for valence with respect to the “gamma” band and “right” EEG regions using the KNN algorithm.
Table 5.12: Distribution of valence with respect to “gamma” band and “right” EEG
regions using KNN
Figure 5.17: Confusion matrix of valence with respect to “gamma” band and “right”
EEG regions using KNN algorithm
5.4 Implementation with RNN and FFT

For this research, during the FFT processing we employed metadata for the purpose of doing a meta-vector analysis. The raw data was split into 2-second slices, with a 0.125-second interval between consecutive slices. A two-second FFT of channel j was carried out over the different frequencies in sequence. A total of 14 channels, matching the Emotiv EPOC+ placement, were carefully selected: channels [1, 2, 3, 4, 6, 11, 13, 17, 19, 20, 21, 25, 29, 31]. The number of bands is 5, with band edges [4, 8, 12, 16, 25, 45], and the band power is averaged over 2 seconds. The window size was 256 with a step size of 16, with each update occurring once every 0.125 seconds. The sampling rate was set to 128 Hz. The FFT was then performed on all of the subjects using these settings in order to obtain the required output.
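A sketch of this windowing scheme (pure NumPy; the band_powers helper and the placeholder trial array are illustrative, not the exact code used in this thesis):

```python
import numpy as np

fs = 128                 # sampling rate (Hz)
win, step = 256, 16      # 2-second window, 0.125-second step
band_edges = [4, 8, 12, 16, 25, 45]                    # 5 bands, as listed above
channels = [1, 2, 3, 4, 6, 11, 13, 17, 19, 20, 21, 25, 29, 31]

def band_powers(segment):
    """Average FFT power of one 2-second window in each of the 5 bands."""
    freqs = np.fft.rfftfreq(win, d=1 / fs)
    power = np.abs(np.fft.rfft(segment)) ** 2
    return [power[(freqs >= lo) & (freqs < hi)].mean()
            for lo, hi in zip(band_edges[:-1], band_edges[1:])]

eeg = np.random.randn(40, 8064)                        # placeholder: one DEAP trial
meta = [np.concatenate([band_powers(eeg[ch, s:s + win]) for ch in channels])
        for s in range(0, eeg.shape[1] - win + 1, step)]
print(np.array(meta).shape)   # (number of windows, 14 channels x 5 bands)
```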
Figure 5.19: Parameter Information of LSTM

Neural networks and other forms of artificial intelligence require a starting collection of data, referred to as a training dataset, that serves as a foundation for subsequent application and use. This dataset serves as the foundation for the program's developing information library. Before the model can interpret and learn from the training data, it must be appropriately labeled. The lowest value in the data is 200 and the greatest value is above 2000, which means that trying to plot the raw values will result in a lot of irrelevant plots, making the analysis tough. The objective of machine learning is to create a plot and then optimize it further in order to obtain a pattern, and if there are significant differences between the plotted points, it will
be unable to optimize the data. As a result, in order to fix this issue, the values have been reduced, a step commonly known as scaling. The values of the data are not lost as a result of scaling; instead, the data is optimized to the point where there is little difference between the plotted points. The technique used is called StandardScaler: StandardScaler transforms the data into a distribution with a mean of zero and a standard deviation of one. When dealing with multivariate data, this is done feature by feature (in other words, independently for each column of the data): each value in the dataset has the mean subtracted from it and is then divided by the standard deviation of the dataset. Categorical data refers to information that has a finite number of possible values. All machine learning models are mathematical models that require numbers to operate on; this is one of the key reasons for pre-processing categorical data prior to feeding it to machine learning models. In our scenario, we could not apply regression because we are attempting to classify our data, so we transformed our labels to categorical form in order to undertake classification. After that, we divided the data set into two parts: a training data set and a testing data set. Training was carried out on 75% of the data, and testing on 25% of the data. A total of 456,768 samples were used in training and 152,256 in testing.
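The preprocessing steps just described can be sketched as follows (assuming scikit-learn and TensorFlow/Keras; the array shapes are placeholders chosen only so that the 75/25 split reproduces the totals quoted above, and the number of classes is illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical

X = np.random.uniform(200, 2000, size=(609024, 70))  # placeholder meta vectors
y = np.random.randint(0, 10, size=609024)             # placeholder class labels

# StandardScaler: per column, subtract the mean and divide by the std
X = StandardScaler().fit_transform(X)
y = to_categorical(y)                                 # integers -> one-hot rows

# 75% training / 25% testing, as used in this study
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
print(X_train.shape[0], X_test.shape[0])              # 456768 and 152256
```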
The RNN has been kept sequential. The first LSTM layer of the sequential model has a size of 512; the second, 256; the third and fourth, 128 and 64; and the final LSTM layer of the sequential model has a size of 10. Since we are conducting classification, where the output needs to be 0 or 1, sigmoid has been used for the last part; the other activation function used is ReLU. The rectified linear activation function, abbreviated ReLU, is a piecewise linear function that outputs the input directly if it is positive and zero otherwise. Batch normalization was used: batch normalization is a method for training very deep neural networks in which the inputs to a layer are standardized for each mini-batch. This stabilizes the learning process and significantly reduces the number of training epochs required to train deep networks. By randomly dropping out nodes while training, a single model can be utilized to simulate having a huge variety of distinct network designs [2]. This is referred to as dropout, and it is an extremely computationally
efficient and amazingly successful regularization technique for reducing overfitting and improving generalization error in all types of deep neural networks. In our case, the dropout rates began at 30%, increased to 50%, then 30%, 30%, 30%, and finally 20%. Previously, we worked with three-dimensional data; however, before the dense layer we converted to a one-dimensional representation in order to make a prediction. RMSprop was used as the optimizer, with a learning rate of 0.001, a rho value of 0.9, and an epsilon value of 1e-08. RMSprop calculates the gradient by dividing it by the root of the moving (discounted) average of the squares of the gradients. This application of RMSprop uses conventional momentum rather than Nesterov momentum; additionally, the centered version estimates the variance from a moving average of the gradients.
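A speculative reconstruction of the described network in Keras follows (the layer sizes, optimizer settings and loss are taken from the text above, but the exact placement of dropout, batch normalization and the dense layers, as well as the input and output shapes, are assumptions):

```python
from tensorflow.keras.layers import LSTM, Dense, Dropout, BatchNormalization
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import RMSprop

n_steps, n_features, n_out = 1, 70, 4   # illustrative input/output shapes

model = Sequential([
    # LSTM sizes 512 -> 256 -> 128 -> 64 -> 10, as described in the text
    LSTM(512, return_sequences=True, input_shape=(n_steps, n_features)),
    Dropout(0.3), BatchNormalization(),
    LSTM(256, return_sequences=True), Dropout(0.5), BatchNormalization(),
    LSTM(128, return_sequences=True), Dropout(0.3), BatchNormalization(),
    LSTM(64, return_sequences=True), Dropout(0.3), BatchNormalization(),
    LSTM(10), Dropout(0.3), BatchNormalization(),
    Dense(32, activation="relu"), Dropout(0.2),     # assumed dense stage
    Dense(n_out, activation="sigmoid"),             # sigmoid output layer
])
model.compile(optimizer=RMSprop(learning_rate=0.001, rho=0.9, epsilon=1e-08),
              loss="mse", metrics=["accuracy"])     # MSE loss, as in the text
model.summary()
```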
As we can see, accuracy increases very gradually in this case, and the learning rate plays a major part. If we increased the learning rate, accuracy would also increase rapidly, but once the optimum is reached the process would reverse, with accuracy decreasing at a faster rate. That is why the learning rate has been kept low: when one zero is removed, the accuracy decreases significantly. As our loss function, we utilized the mean squared error. The mean squared error (MSE) loss function is the most basic and most widely used loss function, and it is typically taught in introductory machine learning courses. To calculate the MSE, take the difference between the model's predictions and the ground truth, square it, and then average it across the whole dataset. The MSE can never be negative, since the errors are always squared. Because of the squaring, the MSE is excellent for ensuring that the trained model does not contain outlier predictions with significant mistakes, as it places greater emphasis on such predictions. We tried our best to reduce the loss and increase the accuracy rate. We saved the model and kept track of it every 50 epochs. In the first picture, we can see that over the first 50 epochs the training loss fell from 0.1588 to 0.06851 and the validation loss to 0.06005, while the training accuracy rate increased from 9.61 percent to 45.784 percent and the validation accuracy increased to 53.420 percent. For the second 50 epochs, the training loss reduced to 0.06283 and the validation loss to 0.05223, while the training accuracy increased to 51.661 percent and the validation accuracy to 60.339 percent. For the third 50 epochs, the training loss reduced to 0.05992 and the validation loss to 0.04787, while the training accuracy increased to 54.492 percent and the validation accuracy to 64.413 percent. After 200 epochs the ratio started to change at a very slow rate. We ran 1000 epochs and got a training accuracy rate of 69.21%, while the validation accuracy rate was 78.28%.
Figure 5.20: Epoch Vs Loss
Figure 5.22: Epoch Vs Loss (From 51-100 Epochs)
Figure 5.24: Epoch Vs Loss (From 101-150 Epochs)
5.5 Implementation with DWT
We explored results from the DWT transformation with standard (STD) and min-max scaling. Using SVM with a sigmoid kernel and K-NN, we get the following results:
5.5.1 Arousal-Accuracy
                    SVM                     KNN
Mean of accuracy    52.21%                  52.73%
STD                 0.15954184532165527     0.003194570541381836

                    SVM                     KNN
Mean of accuracy    52.21%                  52.73%
STD                 0.16068744659423828     0.0840451717376709
5.5.3 Valence-Accuracy
                    SVM                     KNN
Mean of accuracy    51.17%                  54.55%
STD                 0.1488192081451416      0.0040628910064697266

Table 5.16: Accuracy results for positive valence using SVM and K-NN
                    SVM                     KNN
Mean of accuracy    51.17%                  54.55%
STD                 0.14807367324829102     0.002973318099975586

Table 5.17: F1 score results for positive valence using SVM and K-NN
Using the K-NN classifier, we compared the accuracy rate, precision, recall and F1-score of the valence and arousal labels.
Table 5.18: Distribution of different metrics on valence using KNN with DWT
In the experiment, we show the distribution of the 179 test samples in the confusion matrix for different metrics with the KNN classifier.
                 Precision   Recall   F1-Score   Support
0                0.57        0.61     0.59       89
1                0.58        0.54     0.56       90
Accuracy                              0.58       179
Macro Avg        0.58        0.58     0.58       179
Weighted Avg     0.58        0.58     0.58       179

Table 5.20: Distribution of different metrics on arousal using KNN with DWT
Chapter 6
Conclusion and Future Work
6.1 Conclusion
To summarize, in this research we describe the EEG-based emotion recognition challenge, as well as existing and proposed solutions to this problem. Emotion detection through the use of EEG waves is a relatively new and exciting area of study and analysis. With this study, we hope to acquire more meaningful information for emotion recognition from a variety of features and to combine it in a useful way for future research. To identify and evaluate numerous emotional states using EEG signals acquired from the DEAP dataset, SVM (Support Vector Machine), KNN (K-Nearest Neighbour), and an RNN (Recurrent Neural Network) trained with LSTM (Long Short-Term Memory) units are used. According to the findings, the suggested method is a very promising option for emotion recognition, owing to its remarkable ability to learn features from raw data in a short period of time. When compared to typical feature extraction approaches, it produces higher average accuracy over a larger number of people.
Bibliography
[12] M. Wyczesany and T. Ligeza, “Towards a constructionist approach to emotions: Verification of the three-dimensional model of affect with eeg-independent component analysis,” Experimental brain research, vol. 233, Nov. 2014. doi: 10.1007/s00221-014-4149-9.
[13] P. Bashivan, I. Rish, M. Yeasin, and N. Codella, “Learning representations
from eeg with deep recurrent-convolutional neural networks,” Nov. 2015.
[14] M. N. Fakhruzzaman, E. Riksakomara, and H. Suryotrisongko, “Eeg wave identification in human brain with emotiv epoc for motor imagery,” Procedia Computer Science, vol. 72, pp. 269–276, 2015, The Third Information Systems International Conference 2015, issn: 1877-0509. doi: https://doi.org/10.1016/j.procs.2015.12.140. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1877050915036017.
[15] X. Li, P. Zhang, D. Song, G. Yu, Y. Hou, and B. Hu, “Eeg based emotion
identification using unsupervised deep feature learning,” 2015.
[16] F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 815–823. doi: 10.1109/CVPR.2015.7298682.
[17] I. Belakhdar, W. Kaaniche, R. Djmel, and B. Ouni, “A comparison between
ann and svm classifier for drowsiness detection based on single eeg channel,”
2016 2nd International Conference on Advanced Technologies for Signal and
Image Processing (ATSIP), pp. 443–446, 2016.
[18] W. Liu, W.-L. Zheng, and B.-L. Lu, “Emotion recognition using multimodal deep learning,” vol. 9948, Oct. 2016, isbn: 978-3-319-46671-2. doi: 10.1007/978-3-319-46672-9_58.
[19] V. Vanitha and P. Krishnan, “Real time stress detection system based on eeg
signals,” vol. 2016, S271–S275, Jan. 2016.
[20] S. Alhagry, A. Aly, and R. El-Khoribi, “Emotion recognition based on eeg using lstm recurrent neural network,” International Journal of Advanced Computer Science and Applications, vol. 8, Oct. 2017. doi: 10.14569/IJACSA.2017.081046.
[21] A. Ang and Y. Yeong, “Emotion classification from eeg signals using time-frequency-dwt features and ann,” Journal of Computer and Communications, vol. 05, pp. 75–79, Jan. 2017. doi: 10.4236/jcc.2017.53009.
[22] X. Li, J.-Z. Yan, and J.-H. Chen, “Channel division based multiple classifiers fusion for emotion recognition using eeg signals,” ITM Web of Conferences, vol. 11, p. 07006, Jan. 2017. doi: 10.1051/itmconf/20171107006.
[23] M. Menezes, A. Samara, L. Galway, A. Sant’Anna, A. Verikas, F. Alonso-Fernandez, H. Wang, and R. Bond, “Towards emotion recognition for virtual environments: An evaluation of eeg features on benchmark dataset,” Personal and Ubiquitous Computing, vol. 21, Dec. 2017. doi: 10.1007/s00779-017-1072-7.
[24] Z. Mohammadi, J. Frounchi, and M. Amiri, “Wavelet-based emotion recognition system using eeg signal,” Neural Computing and Applications, vol. 28, Aug. 2017. doi: 10.1007/s00521-015-2149-8.
[25] K. Potdar, T. Pardawala, and C. Pai, “A comparative study of categorical variable encoding techniques for neural network classifiers,” International Journal of Computer Applications, vol. 175, pp. 7–9, Oct. 2017. doi: 10.5120/ijca2017915495.
[26] M. I. Singh and M. Singh, “Development of a real time emotion classifier based on evoked eeg,” Biocybernetics and Biomedical Engineering, vol. 37, no. 3, pp. 498–509, 2017, issn: 0208-5216. doi: https://doi.org/10.1016/j.bbe.2017.05.004. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0208521616303035.
[27] V. Bajaj, A. Krishna, a. sri aravapalli, K. Priyanka, and S. Taran, “Emotion classification using eeg signals based on tunable-q wavelet transform,” IET Science, Measurement Technology, vol. 13, Dec. 2018. doi: 10.1049/iet-smt.2018.5237.
[28] S. Chambon, V. Thorey, P. J. Arnal, E. Mignot, and A. Gramfort, “A deep learning architecture to detect events in EEG signals during sleep,” in MLSP 2018 - IEEE International Workshop on Machine Learning for Signal Processing, Aalborg, Denmark, Sep. 2018. [Online]. Available: https://hal.archives-ouvertes.fr/hal-01917529.
[29] H. Chao, H. Zhi, D. Liang, and Y. Liu, “Recognition of emotions using multichannel eeg data and dbn-gc-based ensemble deep learning framework,” Computational Intelligence and Neuroscience, vol. 2018, pp. 1–11, Dec. 2018. doi: 10.1155/2018/9750904.
[30] M. Ghofrani Jahromi, H. Parsaei, A. Zamani, and D. W. Stashuk, “Cross comparison of motor unit potential features used in emg signal decomposition,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 26, no. 5, pp. 1017–1025, 2018. doi: 10.1109/TNSRE.2018.2817498.
[31] M. Li, H. Xu, X. Liu, and S. Lu, “Emotion recognition from multichannel eeg
signals using k-nearest neighbor classification,” Technology and Health Care,
vol. 26, pp. 1–11, Apr. 2018. doi: 10.3233/THC-174836.
[32] J. Liu, H. Meng, M. Li, F. Zhang, R. Qin, and A. Nandi, “Emotion detection from eeg recordings based on supervised and unsupervised dimension reduction,” Concurrency and Computation: Practice and Experience, vol. 30, e4446, Mar. 2018. doi: 10.1002/cpe.4446.
[33] J. Thomas, L. Comoretto, J. Jin, J. Dauwels, S. Cash, and M. Westover, “Eeg
classification via convolutional neural network-based interictal epileptiform
event detection,” 2018 40th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC), pp. 3148–3151, 2018.
[34] H. Zamanian and H. Farsi, “A new feature extraction method to improve emotion detection using eeg signals,” ELCVIA Electronic Letters on Computer Vision and Image Analysis, vol. 17, p. 29, Nov. 2018. doi: 10.5565/rev/elcvia.1045.
[35] M. A. Asghar, M. J. Khan, Fawad, Y. Amin, M. Rizwan, M. Rahman, S. Badnava, S. S. Mirjavadi, and S. S. Mirjavadi, “Eeg-based multi-modal emotion recognition using bag of deep features: An optimal feature selection approach,” Sensors (Basel, Switzerland), vol. 19, no. 23, Nov. 2019, issn: 1424-8220. doi: 10.3390/s19235218. [Online]. Available: https://europepmc.org/articles/PMC6928944.
[36] D. Barahona-Pereira, “Evaluation of feature extraction techniques for an internet of things electroencephalogram,” 2019.
[37] Y. Hou and S. Chen, “Distinguishing different emotions evoked by music via electroencephalographic signals,” Computational Intelligence and Neuroscience, vol. 2019, pp. 1–18, Mar. 2019. doi: 10.1155/2019/3191903.
[38] W. Ng, A. Saidatul, Y. Chong, and Z. Ibrahim, “Psd-based features extraction for eeg signal during typing task,” IOP Conference Series: Materials Science and Engineering, vol. 557, p. 012032, Jun. 2019. doi: 10.1088/1757-899X/557/1/012032.
[39] X. Xing, Z. Li, T. Xu, L. Shu, B. Hu, and X. Xu, “Sae+lstm: A new framework
for emotion recognition from multi-channel eeg,” Frontiers in Neurorobotics,
vol. 13, p. 37, 2019, issn: 1662-5218. doi: 10.3389/fnbot.2019.00037. [Online].
Available: https://www.frontiersin.org/article/10.3389/fnbot.2019.00037.
[40] Y. Çimtay and E. Ekmekcioglu, “Investigating the use of pretrained convolutional neural network on cross-subject and cross-dataset eeg emotion recognition,” Sensors, vol. 20, Apr. 2020. doi: 10.3390/s20072034.
[41] S. A. Hussain and A. S. A. A. Balushi, “A real time face emotion classification and recognition using deep learning model,” Journal of Physics: Conference Series, vol. 1432, p. 012087, Jan. 2020. doi: 10.1088/1742-6596/1432/1/012087. [Online]. Available: https://doi.org/10.1088/1742-6596/1432/1/012087.
[42] T. Kusumaningrum, A. Faqih, and B. Kusumoputro, “Emotion recognition based on deap database using eeg time-frequency features and machine learning methods,” Journal of Physics: Conference Series, vol. 1501, p. 012020, Mar. 2020. doi: 10.1088/1742-6596/1501/1/012020.
[43] Q. Xiong, X. Zhang, W.-F. Wang, and Y. Gu, “A parallel algorithm framework for feature extraction of eeg signals on mpi,” Computational and Mathematical Methods in Medicine, vol. 2020, pp. 1–10, May 2020. doi: 10.1155/2020/9812019.
[44] A. Aydin, H. Öğmen, and H. Kafaligonul, “Neural correlates of metacontrast masking across different contrast polarities,” Brain Structure and Function, Mar. 2021. doi: 10.1007/s00429-021-02260-5.
[45] Y. Chen, R. Chang, and J. Guo, “Emotion recognition of eeg signals based on the ensemble learning method: Adaboost,” Mathematical Problems in Engineering, vol. 2021, pp. 1–12, Jan. 2021. doi: 10.1155/2021/8896062.
[46] Y. Liu and G. Fu, “Emotion recognition by deeply learned multi-channel textual and EEG features,” Future Gener. Comput. Syst., vol. 119, pp. 1–6, 2021. doi: 10.1016/j.future.2021.01.010. [Online]. Available: https://doi.org/10.1016/j.future.2021.01.010.
[47] S. D. Rama Chaudhary Ram Avtar Jaswal, “Emotion recognition based on eeg using deap dataset,” European Journal of Molecular & Clinical Medicine, vol. 8, no. 3, pp. 3509–3517, 2021, issn: 2515-8260.
[48] N. Donges, “A guide to rnn: Understanding recurrent neural networks and lstm networks,” [Online]. Available: https://builtin.com/data-science/recurrent-neural-networks-and-lstm. (accessed: 24.09.2021).
[49] S. Koelstra, “Deap dataset: A dataset for emotion analysis using eeg, physiological and video signals,” [Online]. Available: https://www.eecs.qmul.ac.uk/mmv/datasets/deap/. (accessed: 12.07.2021).
[50] “Sklearn.preprocessing.onehotencoder — scikit-learn 0.21.3 documentation,” [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html. (accessed: 29.07.2019).
[51] “What is the architecture behind the keras lstm cell?” [Online]. Available: https://stackoverflow.com/questions/50488427/what-is-the-architecture-behind-the-keras-lstm-cell. (accessed: 01.05.2017).