
Expert Systems With Applications 47 (2016) 35–41


Improving BCI-based emotion recognition by combining EEG feature selection and kernel classifiers

John Atkinson a,∗, Daniel Campos b

a Department of Computer Sciences, Faculty of Engineering, Universidad de Concepcion, Concepcion, Chile
b Artificial Intelligence Laboratory, Department of Computer Sciences, Universidad de Concepcion, Chile

This research was partially supported by FONDECYT, Chile, under Grant number 1130035: "An Evolutionary Computation Approach to Natural-Language Chunking for Biological Text Mining Applications."
∗ Corresponding author. Tel.: +56 412204305. E-mail addresses: [email protected] (J. Atkinson), [email protected] (D. Campos).

Keywords: Emotion recognition; Brain–Computer Interfaces; EEG; Feature selection; Emotion classification

Abstract: Current emotion recognition computational techniques have been successful in associating emotional changes with EEG signals, so that emotions can be identified and classified from EEG signals if appropriate stimuli are applied. However, automatic recognition is usually restricted to a small number of emotion classes, mainly due to signal features and noise, EEG constraints, and subject-dependent issues. In order to address these issues, in this paper a novel feature-based emotion recognition model is proposed for EEG-based Brain–Computer Interfaces. Unlike other approaches, our method explores a wider set of emotion types and incorporates additional features which are relevant for signal pre-processing and recognition classification tasks, based on a dimensional model of emotions: Valence and Arousal. It aims to improve the accuracy of the emotion classification task by combining mutual-information-based feature selection methods and kernel classifiers. Experiments with our approach, which combines efficient feature selection methods and efficient kernel-based classifiers, on standard EEG datasets show the promise of the approach when compared with state-of-the-art computational methods.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Emotions play a critical role in rational decision-making, perception, human interaction, and human intelligence. Hence emotions are a fundamental component of being human, as they motivate action and add meaning and richness to virtually all human experience. Traditionally, in Human–Computer Interaction (HCI), users must discard their emotional selves to work efficiently and rationally with computers (Sourina, Wang, Liu, & Nguyen, 2011; Wright, 2010).

Interfacing directly with the human brain is made possible through the use of sensors that can monitor some of the physical processes occurring within the brain that correspond with certain forms of thought. Researchers have used these technologies to build Brain–Computer Interfaces (BCIs), communication systems that do not depend on the brain's normal output pathways of peripheral nerves and muscles (Calvo & D'Mello, 2010). Instead, users explicitly manipulate their brain activity, which can then be used to control computers or communication devices.

State-of-the-art emotion recognition computational techniques have been successful in associating emotional changes with EEG signals, so that emotions can be identified and classified from EEG signals if appropriate stimuli are applied. However, automatic recognition is usually restricted to a small number of emotion classes, mainly due to signal features and noise, EEG constraints and subject-dependent issues.

Accordingly, in this research a novel feature-based emotion recognition model is proposed for EEG-based BCIs. Unlike other approaches, our research explores a wider set of emotion types, claiming that combining a mutual-information-based feature selection method (i.e., minimum-Redundancy-Maximum-Relevance) and kernel classifiers may improve the accuracy of the emotion classification task.

This work is organized as follows: Section 2 describes the fundamentals and state-of-the-art emotion recognition techniques, Section 3 proposes a novel feature-based model for EEG emotion recognition, Section 4 discusses the main experiments conducted and the results for different model settings, and finally, Section 5 highlights the main conclusions of the research and some further work.

2. Emotions recognition

Research on human emotional states via physiological signals involves recording and statistical analysis of signals from the central and

parietal cortex. A popular physiological signal that is widely adopted for human emotion assessment is the EEG. Unlike other physiological signals, EEG is a non-invasive technique with good temporal and acceptable spatial resolution. Thus, EEG might play a major role in detecting an emotion directly from the brain at higher spatial and temporal resolution (Yisi, Sourina, & Minh, 2010).

A major problem with recognizing emotions is that people have different subjective emotional experiences as responses to the same stimuli (Wright, 2010; Yisi et al., 2010). Accordingly, emotions can be classified into two taxonomy models:

(1) Discrete model: it is based on evolutionary features (Calvo & D'Mello, 2010) that include basic emotions (happiness, sadness, fear, disgust, anger, surprise), and mixed emotions such as Motivational (thirst, hunger, pain, mood), Self-awareness (shame, disgrace, guilt), etc.
(2) Dimensional model: it is expressed in terms of two emotion-provoking dimensions: Valence (from disgust to pleasure) and Arousal (from calm to excitement) (Yisi et al., 2010).

Emotion recognition enables systems to get non-verbal information from human subjects so as to put events in context based on the underlying captured emotions. Humans are capable of recognizing emotions either from speech (voice tone and discourse) with an accuracy around 60%, or from facial expressions and body movements with an accuracy of 78–90%. However, the recognition task is strongly dependent on the context and requires facial expressions to be deliberately performed, or even in a very exaggerated manner, which is far away from the natural way a user interacts with intelligent interfaces.

Other kinds of techniques use audio signals, obtaining classification accuracies close to 60–90% (Calvo & D'Mello, 2010), whereas some other methods use non-linguistic vocalizations (i.e., laughs, tears, screams, etc.) to recognize complex emotional states such as anxiety, sexual interest, or boredom. Bi-modal methods also combine audio inputs and facial expressions based on the assumption that a human emotion can trigger multiple behavioral and physiological responses whenever he/she experiences this emotion.

Nevertheless, most of these methods require humans to express their emotional (mind) states in a deliberate and exaggerated manner, so that emotions cannot be spontaneously expressed. On the other hand, extracting information from facial expressions requires monitoring a subject by using one or several cameras, whereas for audio-based approaches, emotions are very hard to recognize whenever a subject does not speak or produce any sounds (Giakoumis, Tzovaras, Moustakas, & Hassapis, 2011; Sourina et al., 2011).

A popular and effective non-invasive technique to measure changes in brain activity is electroencephalography (EEG), which transforms brain activity into images of variations of electrical potential by using small low-cost devices (AlMejrad, 2010). There are several approaches for EEG-based emotion recognition, which are usually based on four main tasks (Calvo & D'Mello, 2010):

(1) Signal preprocessing: an EEG device can directly get signals from the brain. However, there are some noise sources that are not neurologically produced, known as artifacts (i.e., blinking, muscular effects, vascular effects, etc.), so digital signal processing techniques must be applied to represent signals using frequencies and harmonic functions (Petrantonakis & Hadjileontiadis, 2010; Yisi et al., 2010).

(2) Feature extraction: EEG signals are highly dimensional, so computational processing becomes very complex. Hence different features must be extracted in order to simplify the further emotion classification task and so create input Feature Vectors (FV). Typical methods include statistical metrics of the signal's first difference (i.e., median, standard deviation, kurtosis, symmetry, etc.), spectral density (i.e., EEG signals within specific frequency bands) (Zhang, Yang, & Huang, 2008), Logarithmic Band Power (Log BP) (i.e., the power of a band within the signal based on its oscillatory processes) (Brunner, Vidaurre, Billinger, & Neuper, 2011), Hjorth parameters (i.e., EEG signals described by activity, mobility and complexity) (Zhang et al., 2008), wavelet transform (i.e., decomposition of the EEG signal) (Petrantonakis & Hadjileontiadis, 2010), and fractal dimension (i.e., complexity of the fundamental patterns hidden in a signal) (Zhang et al., 2008).

(3) Feature selection: one little-used technique of feature selection for emotion recognition combines a metaheuristic method known as Genetic Algorithms (GA) and a Support Vector Machine (SVM). This GA-SVM approach heuristically searches for the best sets of features, initially represented as chromosomes of features which evolve as the GA goes on, so that these can then be provided as an input to an SVM classifier (Wang et al., 2011). A major drawback of this method is the time spent to converge toward good results and the redundancy of the selected features assessed in each iteration of the GA.

In order to deal with this issue, another EEG feature selection technique, known as minimum-Redundancy-Maximum-Relevance (mRMR), selects the features that correlate the strongest with a classification variable while reducing information redundancy. mRMR selects features that are mutually different from each other while still having a high correlation with the classification variable (Polat & Cataltepe, 2012), reducing redundancy between bad and good features using Mutual Information (MI) methods, so that a subset of features that best represents the dataset can be obtained.

(4) Emotions classification: once the FVs are extracted from the previous task, emotions must be classified according to previously identified classes of emotions. Despite the large number of features used by these methods, no feature selection is usually carried out. There are plenty of state-of-the-art classifiers for automatic emotion identification. For example, Nearest Neighbor classifiers used features such as FFT and Wavelets to recognize 4 types of emotions (i.e., joy, sad, angry, relaxed), achieving accuracies ranging from 54% to 67%. On the other hand, statistical methods such as Quadratic Discriminant Analysis (QDA) used several statistical features for negative and positive arousal levels with an average accuracy of 63% (Koelstra et al., 2012; Petrantonakis & Hadjileontiadis, 2010; Wu et al., 2010; Yisi et al., 2010).

3. An adaptive BCI-based emotions recognition model

In this work, a novel approach that combines minimum-Redundancy-Maximum-Relevance (mRMR) based feature selection tasks and kernel classifiers for emotions recognition is proposed. The method takes EEG signals received from BCI devices and incorporates relevant features in order to detect several kinds of emotional states by using state-of-the-art classifiers. The main contribution of this research is that, unlike other automatic emotion recognition methods, our approach:

(1) Incorporates a feature selection task into the classification task.
(2) Uses multi-label classifiers to simultaneously recognize a wider range of emotion types based on a dimensional model.

The overall model is composed of three tasks: signal preprocessing, feature extraction and selection, and emotions classification (see Fig. 1).

Fig. 1. Steps in our emotions recognition approach.

3.1. EEG signal preprocessing

In order to train the emotions classifier, a set of previously emotion-labeled EEG data, extracted from subjects self-assessing their emotional states, was taken. It included Arousal and Valence dimensions that were triggered by external stimuli. Since EEG brain signals contain much noise, the following basic preprocessing steps were performed (a brief code sketch follows the list):

• Resolution reduction: it optimizes the memory used by reducing a signal's resolution. Since useful data for emotion recognition are found under 40 Hz (Yisi et al., 2010), the resolution can be reduced to 128 Hz while preserving the original signal's information.
• Electrooculography removal: electrooculography (EOG) measures the corneo-retinal standing potential that exists between the front and the back of the human eye. In order to remove the noise produced by this kind of eye movement, a method for removing EOG artifacts in the EEG called Automatic Removal of Ocular Artifacts is applied.
• Band filter: it filters EEG signals by generating bands that are useful for emotion recognition (e.g., 4 Hz–45 Hz).
3.2. Feature extraction and selection

EEG signals are highly dimensional data which may contain many useless features. In order to reduce dimensionality, a large set of relevant features is extracted to create easy-to-process FVs for each stimulus. These include statistical features (S), band power (BP) for different frequencies, Hjorth parameters (HP), and fractal dimension (FD) for each channel. Statistical features include the median, standard deviation, kurtosis coefficient, etc. Furthermore, the frequency bands for each EEG channel correspond to theta (4–8 Hz), low alpha (8–10 Hz), alpha (8–12 Hz), beta (12–30 Hz) and gamma (30–45 Hz).
In order to select a relevant set of features from the previously extracted candidate features, so that further classification can be more accurate, the minimum-Redundancy-Maximum-Relevance (mRMR) method was used (Wu et al., 2010; Yisi et al., 2010). It selects the features that correlate the strongest with the classification variable, reducing information redundancy between bad and good features using Mutual Information (MI) methods, so that the best set of features can be selected. It is based on two underlying conditions: minimum redundancy and maximum relevance. Let S be a set of features; the minimum redundancy condition is defined as:

$$\min W_I, \quad W_I = \frac{1}{|S|^2}\sum_{f_i, f_j \in S} I(f_i, f_j) \tag{1}$$

where $I(f_i, f_j)$ is the MI between features $f_i$ and $f_j$, and $|S| = n$ is the number of features in the set. The discriminant power of each feature regarding the emotion classes is then measured as the MI between features and classes. Since $I(C, f_i)$ expresses the relevance of feature $f_i$ for a class $C$, the maximum relevance condition can be seen as:

$$\max V_I, \quad V_I = \frac{1}{|S|}\sum_{f_i \in S} I(C, f_i) \tag{2}$$

Thus, the finally obtained sets must satisfy the optimization conditions of Eqs. (1) and (2) simultaneously, combined into a single function, where the first and second combined conditions are named MID ($\max(V_I - W_I)$) and MIQ ($\max(V_I / W_I)$), respectively. In addition, each feature was converted into a discrete value by using the transformation function of Eq. (3), where $\mu$ is the median of a subject's feature values, $\sigma$ is the standard deviation of values for the same feature, and a common value of $\alpha = 0.5$ is used:

$$f(x) = \begin{cases} 1 & \text{if } x \ge \mu + \frac{\sigma}{2} \\ 0 & \text{if } \mu - \frac{\sigma}{2} \le x < \mu + \frac{\sigma}{2} \\ -1 & \text{if } x < \mu - \frac{\sigma}{2} \end{cases} \tag{3}$$
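A greedy implementation of the MID variant can be sketched as follows: features are first discretized with Eq. (3) and then added one at a time so as to maximize relevance $I(C, f)$ minus average redundancy against the already selected set. This is a simplified sketch, assuming scikit-learn's mutual_info_score as the MI estimator and a NumPy feature matrix; it is not the authors' code.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def discretize(X, alpha=0.5):
    """Eq. (3): map each feature to {-1, 0, 1} around its median mu +/- sigma/2."""
    mu, sigma = np.median(X, axis=0), np.std(X, axis=0)
    D = np.zeros_like(X, dtype=int)
    D[X >= mu + alpha * sigma] = 1
    D[X < mu - alpha * sigma] = -1
    return D

def mrmr_mid(X, y, k):
    """Greedy MID selection: maximize I(C, f) minus mean redundancy I(f, s)."""
    D = discretize(X)
    n = D.shape[1]
    relevance = np.array([mutual_info_score(y, D[:, j]) for j in range(n)])
    selected = [int(np.argmax(relevance))]  # start from the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in set(range(n)) - set(selected):
            redundancy = np.mean([mutual_info_score(D[:, j], D[:, s])
                                  for s in selected])
            score = relevance[j] - redundancy  # the MID criterion
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

# Example: 1280 samples, 60 candidate features, 3 classes for one dimension.
X, y = np.random.randn(1280, 60), np.random.randint(3, size=1280)
idx = mrmr_mid(X, y, k=35)  # e.g., 35 features for the Arousal dimension
```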
3.3. Emotions classification

In order to recognize different emotion classes, a multi-class Support Vector Machine (SVM) was trained on a set $E = \{(x_i, y_i)\}_{i=1}^{N}$, where $N$ is the number of samples built from previously selected features, $x_i$ is composed of an FV, and $y_i$ is the dimension class of $x_i$ (i.e., Arousal and Valence). The classifier builds and trains $k(k-1)/2$ SVMs, where $k$ is the number of classes for $y_i$. For our approach, three classes were considered for each dimension. Each of the three SVMs uses RBF kernels, and the overall classification is then carried out by using a One-versus-One voting mechanism in which the finally assigned class label is the one having the highest accuracy among the voting SVMs. Classes produced for each dimension are divided according to the range of values for each dimension, [1, 9], into three sets: [1, 3.66], [3.66, 6.33] and [6.33, 9], based on Eq. (4), where $r(i)$ indicates the point at which the $i$-th division of the range of values is created, $\max v - \min v$ is the difference between the maximum and minimum value for each dimension, and $k$ is the number of sets to be created:

$$r(i) = i \cdot \frac{\max v - \min v}{k} + \min v \tag{4}$$

Previously trained multi-class SVMs are then used to classify Arousal and Valence dimension classes for unseen FVs extracted from different EEG signals of the same subject, as our model is subject-dependent.
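As a worked instance of Eq. (4), with ratings in [1, 9] and k = 3 the cut points are r(1) = 1 + 8/3 ≈ 3.66 and r(2) = 1 + 16/3 ≈ 6.33, which yields the three sets above. A minimal sketch of the binning and of a One-versus-One RBF classifier follows, using scikit-learn (whose SVC uses one-versus-one voting for multi-class problems); the γ value matches Section 4, but the rest of the setup is an assumption for illustration.

```python
import numpy as np
from sklearn.svm import SVC

def to_class(rating, vmin=1.0, vmax=9.0, k=3):
    """Bin a self-assessment rating into one of k classes via Eq. (4)."""
    edges = [i * (vmax - vmin) / k + vmin for i in range(1, k)]  # r(1), r(2)
    return int(np.digitize(rating, edges))  # 0, 1 or 2

# One classifier per dimension (Arousal, Valence), trained on selected FVs.
X = np.random.randn(1280, 35)                 # mRMR-selected features
y = np.array([to_class(r) for r in np.random.uniform(1, 9, 1280)])
clf = SVC(kernel="rbf", gamma=0.05,           # RBF kernel with gamma = 0.05
          decision_function_shape="ovo")      # one-versus-one decision values
clf.fit(X, y)
label = clf.predict(X[:1])                    # class of an unseen FV
```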
4. Experiments

In order to assess the accuracy of our emotion classification approach for the different dimensions, a computational prototype was built and run on the DEAP dataset. Different experiments were conducted in order to tune the different parameters of the finally implemented model. In addition, comparisons with other state-of-the-art methods were also performed and discussed.

Accuracy for the different classifier configurations was measured as the proportion of correctly classified signals versus the total number of signals. In the case of the SVM classifier, performance depends on the type of kernel function used (Sourina et al., 2011). In addition, the mRMR feature selection method was used for tuning and training purposes (Wang et al., 2011).

All the experiments used the standard DEAP (Dataset for Emotion Analysis using EEG, Physiological and Video Signals)¹ dataset (Koelstra et al., 2012), which contains a set of EEG physiological signals and video records of 40 stimuli tests for 32 human subjects, i.e., 1280 stimuli tests, each associated with its corresponding Arousal and Valence dimensions. Furthermore, training and testing of the models were conducted by using m-fold cross-validation (with best results obtained for m = 8); a sketch of this evaluation protocol is given below.

¹ http://www.eecs.qmul.ac.uk/mmv/datasets/deap
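The snippet below illustrates this protocol with 8-fold cross-validation over per-trial feature vectors using scikit-learn. The synthetic data shapes mirror DEAP's 40 trials per subject, but this is only a sketch of the evaluation setup, not the authors' scripts.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# One subject: 40 stimulus trials, each reduced to its selected feature vector.
X = np.random.randn(40, 35)   # e.g., 35 mRMR-selected Arousal features
y = np.arange(40) % 3         # 3 classes for the dimension (balanced here)

# m-fold cross-validation with m = 8: accuracy = correct / total per fold.
scores = cross_val_score(SVC(kernel="rbf", gamma=0.05), X, y,
                         cv=8, scoring="accuracy")
print(scores.mean(), scores.std())
```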

4.1. Data acquisition

Stimuli tests were conducted by using musical video records from DEAP, as they are more suitable than other stimuli to evoke emotional reactions. The stimuli testing procedure was carried out as follows:

• Each subject watched a musical video while his/her EEG physiological signals and facial expressions were recorded.
• Each subject indicated his/her emotional state according to the dimensions Arousal and Valence.

While the DEAP data contain EEG signals extracted from a 32-channel BCI device, the experiments only used information from 14 relevant channels.

4.2. Parameters tuning

In order to tune our model, mRMR was compared against another state-of-the-art feature selection method, GA-SVM. At the same time, three kernel configurations were tested for the SVM by using different kernel functions and degrees (RBF with γ = 0.2, RBF with γ = 0.05, and polynomial with degree = 5).

Feature selection methods for different sets of candidate features were tested by their ability to find the best features for the same SVM classifiers. Overall results for mRMR can be seen in Figs. 2 and 3, for the Arousal and Valence dimensions, respectively. Those results were obtained for SVMs using RBF kernels with γ = 0.05. Note that the average accuracy slightly drops as a larger set of features is selected. The best classification accuracy was obtained for a set containing 35 and 29 features for Arousal and Valence, respectively (dotted lines), whereas the worst accuracy for this configuration was obtained for set sizes greater than 173 for both dimensions.

Fig. 2. Average classification accuracy for dimension Arousal and mRMR.
Fig. 3. Average classification accuracy for dimension Valence and mRMR.

On the other hand, GA-SVM uses a GA to find an optimal subset of features, so the quality of the evolved solutions is strongly dependent on how genetic operators modify the initial hypotheses as the GA goes on: mutation (probability Pm), crossover (type and probability Pc), and parent selection (i.e., tournament, roulette). Table 1 shows the different configurations for this task.

Table 1
Parameters setting for GA-SVM-based feature selection.

Configuration | Pm  | Pc   | Selection  | Crossover
C1            | 1.0 | 0.80 | Roulette   | One point
C2            | 0.8 | 0.66 | Roulette   | One point
C3            | 1.0 | 0.80 | Tournament | One point
C4            | 0.8 | 0.66 | Tournament | One point
C5            | 1.0 | 0.80 | Roulette   | Two points
C6            | 0.8 | 0.66 | Roulette   | Two points
C7            | 1.0 | 0.80 | Tournament | Two points
C8            | 0.8 | 0.66 | Tournament | Two points

Preliminary runs showed that the GA usually converges toward good solutions between 80 and 100 generations. Fitness evaluation of candidate solutions in the GA uses the classification accuracy (acc) of the SVM, computed as in Eq. (5), where $w_a = 1$ (weight of acc) and $w_f = 0$ (weight of the number and cost of the features), so that fitness = accuracy; $n$ is the number of features, $c_i$ is the cost of extracting the $i$-th feature, and $x_i$ represents the absence/presence of a selected feature (0 or 1).

$$\mathit{fitness} = w_a \cdot acc + w_f \cdot \left(\sum_{i=1}^{n} c_i \cdot x_i\right)^{-1} \tag{5}$$
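The fitness of Eq. (5) can be sketched as below; with wa = 1 and wf = 0 it reduces to the SVM's cross-validated accuracy, and the cost term only matters for a non-zero wf. The boolean-mask chromosome encoding and helper names are our assumptions about a typical GA-SVM setup (Wang et al., 2011), not the exact implementation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(mask, X, y, costs, w_a=1.0, w_f=0.0):
    """Eq. (5): w_a * accuracy + w_f * (sum_i c_i * x_i)^-1 for a feature mask."""
    if not mask.any():
        return 0.0                        # empty chromosomes are unfit
    acc = cross_val_score(SVC(kernel="rbf", gamma=0.05),
                          X[:, mask], y, cv=8).mean()
    cost = float(np.dot(costs, mask))     # total cost of the selected features
    return w_a * acc + (w_f / cost if w_f else 0.0)

# A chromosome is a boolean mask over the candidate features.
X, y = np.random.randn(40, 200), np.arange(40) % 3
mask = np.random.rand(200) < 0.5
print(fitness(mask, X, y, costs=np.ones(200)))
```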
Classification results using GA-SVM feature selection for different population sizes, with an RBF kernel (γ = 0.05) and for both dimensions, can be seen in Figs. 4 and 5, where Ci represents the i-th configuration of the GA (Table 1). Unlike with mRMR, there is no clear trend in the evolution of candidate solutions. However, as the size of the GA population increases, the average accuracy increases too. Hence the best results were obtained for population sizes of 90 and 100 for Arousal and Valence, respectively. However, no significant differences are observed between the two dimensions. Thus, the best setting results are obtained using configurations C2, C4 and C6.

Fig. 4. Average accuracy for classifying Arousal dimension using GA-SVM.
Fig. 5. Average accuracy for classifying Valence dimension using GA-SVM.



Finally, Table 2 shows the best setting results for each dimension and feature selection method, along with the number of selected features. The SVM classifier using an RBF kernel with γ = 0.05 produces the highest accuracy, and the performance of mRMR is better than that of GA-SVM for the selected features. Note also that mRMR generates feature sets that are smaller than those of GA-SVM, which makes it more suitable for real-time applications, as it requires less work to extract features while achieving good classification accuracy.

Table 2
Best results for setting parameters for different feature selection methods.

Method | Dimension | Accuracy (%) | Std. dev (%) | No. of features
mRMR   | Arousal   | 60.72        | 9.08         | 35
mRMR   | Valence   | 62.39        | 9.90         | 20
GA-SVM | Arousal   | 56.69        | 9.34         | 95
GA-SVM | Valence   | 53.46        | 9.05         | 94

4.3. Overall evaluation

A final overall experiment compared our approach against some state-of-the-art methods. To this end, the best previously tuned configurations were used: mRMR-based feature selection, 35 features for dimension Arousal and 20 features for dimension Valence, and an RBF kernel with γ = 0.05 for the SVM classifier. The model was then trained using 40 stimuli tests for each of the 31 subjects of the dataset.

Experimental results are shown in Fig. 6, indicating a median of 60.7% and 62.33% for dimensions Arousal and Valence, respectively (i.e., a std. dev. of 9 is close to the median for both dimensions).

Fig. 6. Classification accuracy per dimension for each subject.

The graphic of Fig. 7 shows the classification accuracy for each dimension and subject. In addition, the lower row of each figure shows the number of subjects for whom certain features were selected, where darker points represent a larger number of subjects. This suggests there is no relationship between the features selected for one or another subject. Nevertheless, the best selected features for both dimensions correspond to the statistical measures extracted from each channel (left-hand side).

Fig. 7. Classification accuracy of the proposed model for each subject.

Classification accuracy of our model was also compared against other approaches, indicating very promising results when dealing with combinations of methods and different classes of emotions, as
seen in Table 3, even for recognition using two classes per dimension (high/low). Experimental results for different approaches to EEG emotion recognition show that our model's performance is similar to the others, but for a higher number of classes.

Table 3
Comparing our recognition approach and some state-of-the-art methods.

Method                     | No. of classes per dimension | Accuracy (Arousal) (%) | Accuracy (Valence) (%)
Our model                  | 2                            | 73.06                  | 73.14
Our model                  | 3                            | 60.7                   | 62.33
Our model                  | 5                            | 46.69                  | 45.32
Spectral density and SVM   | 3                            | 96.5                   | –
Band power and Naive Bayes | 2                            | 62                     | 67.6

The table includes results for classifying 2 and 5 classes per dimension, indicating promising results for fair parameter settings. It suggests that the results for two classes per dimension are better than those of other approaches using the same DEAP dataset. Note also that the table indicates that our model is capable of recognizing a wider range of emotion classes based on the dimensions (Arousal and Valence), with no need to use additional emotions classifiers. Hence our technique can recognize several emotion types simultaneously by using a single multi-class kernel classifier. Note that spectral density with SVM does well at recognizing emotions within the Arousal dimension, but it has not been assessed for more than 5 classes per dimension as in our case.

5. Conclusions

In this paper, an EEG feature-based emotion recognition method was proposed. Unlike other approaches, it uses the mRMR feature selection method as a signal preprocessing step so as to improve the predictive accuracy of an SVM emotion classifier based on a two-dimensional emotions model (i.e., Valence and Arousal). In addition, compared with state-of-the-art emotion recognition methods, our approach deals with a higher number of emotion classes (i.e., 8) on a standard DEAP dataset, which makes the problem more realistic but, at the same time, makes the training task more demanding.

Accordingly, one of the contributions of this research is that it incorporates a statistical feature selection task into the classification task. Furthermore, our approach, which combines feature selection and kernel classifiers, uses multi-label classifiers to simultaneously recognize a wider set of emotion classes based on a dimensional emotion model.

In order to assess the effectiveness of our kernel-based classifier, several preliminary experiments were conducted so as to produce the best parameter settings. This included tuning the feature selection methods, the emotion classifier, the signal preprocessing tasks, etc. Preliminary experiments showed that our mRMR-based feature selection method outperformed the most popular feature selection strategy (GA-SVM) for both dimensions (Arousal and Valence) when classifying emotions (accuracy of 60.72% and 62.4% versus 57% and 53.4%). In addition, for both dimensions, our method reduced the number of relevant features by almost 63% while obtaining a higher accuracy. Classification accuracy of our model was then compared against other competitive current approaches to emotion recognition: SVM-based spectral density and Bayes-based Band Power (BP). An important issue with these two techniques is that either the EEG signal they analyse must be within very specific frequency bands (i.e., spectral density), or the power of the frequency band within a signal is strongly dependent on its oscillatory processes. Hence they might not be very effective when attempting to classify a wider set of emotion classes (i.e., a higher number of classes per dimension).

Overall results showed that our method outperformed those of the state-of-the-art for the same number of classes per dimension (i.e., 73% versus 62%). In addition, our approach was capable of classifying a higher number of classes per dimension where no other state-of-the-art method did so for Valence (i.e., 62.33% versus no known accuracy for spectral density with 3 classes per dimension). Note that spectral density does well at recognizing emotions within the Arousal dimension, but it has not been assessed for more than 5 classes per dimension as in our research. Thus, our mRMR-based emotion classification approach outperformed other state-of-the-art methods. Furthermore, the method is promising when considering a higher number of classes per dimension (i.e., 3 and 5), which had not been proven in the literature. This also showed that our method recognizes a higher number of emotion classes without using additional emotions classifiers.

Accordingly, combining feature selection methods (mRMR) and SVM classifiers using RBF kernels yields significant improvements in accuracy. In other words, our method requires less work to classify, based on a smaller set of selected features, while achieving higher accuracy than other techniques.

5.1. Further research

There are some open issues which may be addressed so as to produce more accurate results and robust emotion recognition methods, including:

• Parameters design and analysis: running time was a constraint when assessing different configurations of our model, as analyzing EEG signals became a very demanding task. As a consequence, experiments were designed only for a small set of settings. Hence more exhaustive setup evaluations may be required on the model's parameters to evaluate the extent to which different frequency bands, numbers of subjects, EEG oscillations, etc. affect the effectiveness of the approach.
• Specific-purpose training dataset creation: while there are some training corpora for emotion recognition purposes, such as DEAP (Koelstra et al., 2012), large-scale and well-balanced datasets are required so as to avoid bias and overfitting of the classification task.
• Higher number of classes per dimension: while recognizing a higher number of emotion classes (i.e., 3 or 5) was a main purpose of the proposed method, it might not be enough to deploy real-world EEG-based emotion recognition applications (e.g., videogames, brain-controlled wheelchairs, etc.), as these must adjust to a significantly larger set of emotional states. Hence further experiments may be needed to modify the model so that it can effectively recognize more than 3 levels per dimension.

References

AlMejrad, A. (2010). Human emotions detection using brain wave signals: a challenging. European Journal of Scientific Research, 44, 640–659.
Brunner, C., Vidaurre, C., Billinger, M., & Neuper, C. (2011). A comparison of univariate, vector, bilinear autoregressive, and band power features for brain–computer interfaces. Medical & Biological Engineering & Computing, 49(11), 1337–1346.
Calvo, R., & D'Mello, S. (2010). Affect detection: an interdisciplinary review of models, methods, and their applications. IEEE Transactions on Affective Computing, 1(1), 18–37.
Giakoumis, D., Tzovaras, D., Moustakas, K., & Hassapis, G. (2011). Automatic recognition of boredom in video games using novel biosignal moment-based features. IEEE Transactions on Affective Computing, 2, 119–133. http://doi.ieeecomputersociety.org/10.1109/T-AFFC.2011.4.
Koelstra, S., Muhl, C., Soleymani, M., Lee, J., Yazdani, A., & Ebrahimi, T. (2012). DEAP: a database for emotion analysis using physiological signals. IEEE Transactions on Affective Computing, 3(1), 18–31.
Petrantonakis, P., & Hadjileontiadis, L. (2010). Emotion recognition from brain signals using hybrid adaptive filtering and higher order crossings analysis. IEEE Transactions on Affective Computing, 1(2), 81–97.

Polat, D., & Cataltepe, Z. (2012). Feature selection and classification on brain computer interface (BCI) data. In Proceedings of the signal processing and communications applications conference (SIU) (pp. 1–4). IEEE.
Sourina, O., Wang, Q., Liu, Y., & Nguyen, M. (2011). A real-time fractal-based brain state recognition from EEG and its applications. In F. Babiloni, A. L. N. Fred, J. Filipe, & H. Gamboa (Eds.), Biosignals (pp. 82–90). SciTePress.
Wang, L., Xu, G., Wang, J., Yang, S., Guo, L., & Yan, W. (2011). GA-SVM based feature selection and parameters optimization for BCI research. In Proceedings of the seventh international conference on natural computation (ICNC): vol. 1 (pp. 580–583).
Wright, F. (2010). Emotional instant messaging with the Epoc headset (Master's thesis). University of Maryland, Baltimore County.
Wu, D., Courtney, C., Lance, B., Narayanan, S., Dawson, M., Kelvin, S., & Parsons, T. (2010). Optimal arousal identification and classification for affective computing using physiological signals: virtual reality Stroop task. IEEE Transactions on Affective Computing, 1(2), 109–118.
Yisi, L., Sourina, O., & Minh, N. (2010). Real-time EEG-based human emotion recognition and visualization. In Proceedings of the international conference on cyberworlds (CW) (pp. 262–269).
Zhang, A., Yang, B., & Huang, L. (2008). Feature extraction of EEG signals using power spectral entropy. In Proceedings of the international conference on biomedical engineering and informatics (BMEI): vol. 2 (pp. 435–439). IEEE.
