Data Pre-Processing of Electroencephalography
Abstract:
I. Introduction:
Communication and social interaction are among the most important principles of human
civilization. Human beings share their emotions, expectations and thoughts with one another
through communication, which can be established through speech, gesture or writing.
A person suffering from locked-in syndrome cannot communicate, interact or express
emotions and feelings to other people, yet remains well aware of what is happening around
them. Basic communication abilities can be provided by an emerging technology that
interacts with the human brain, called the “Brain Computer Interface (BCI)”.
A Brain-Computer Interface (BCI), or Brain-Machine Interface (BMI), is a communication
system that enables the human brain to communicate with machines without any physical
contact, using signals generated by the electrical activity of the brain and recorded as the
Electroencephalogram (EEG). Figure 1 shows a typical block diagram that illustrates the
different stages of EEG signal processing for BCI. BCI aims at controlling various human
assistive devices through EEG. BCIs are becoming more popular among researchers for
controlling devices with electrical signals generated by the brain, especially for providing
assistance to disabled people. A BCI method can be divided broadly into five segments: the
first is measuring brain activity, the second is pre-processing, the third is feature extraction,
the fourth is classification and the last is the control interface. After signal acquisition, the
signals need to be pre-processed because the EEG signal contains movement artefacts.
Pre-processing mainly focuses on removing such artefacts and noise and on improving the
quality of the signal without losing information from the original EEG signal.
The greatest challenges in EEG-based BCI applications are the low Signal to Noise Ratio
(SNR) and the variety of noise sources. The EEG signal contains unwanted noise,
interference and movement artefacts. There are two main sources of artefacts: external or
environmental sources and physiological sources. External sources include AC power lines,
lighting and electronic equipment. Noise from physiological sources arises from movement
of the subject, eye movement and other bioelectric potentials. Removing these noises is a
non-trivial process, carried out by pre-processing techniques. The first attempts attenuated
the noise with simple low-pass, high-pass and band-pass filters, but these techniques are
applicable only when the frequency bands of the signal and of the noise do not overlap. This
chapter focuses on several techniques applied to BCI at the pre-processing stage and also
discusses current trends in BCI. The most common pre-processing techniques are
Common Spatial Pattern (CSP), Principal Component Analysis (PCA), Common Average
Referencing (CAR), Surface Laplacian, Adaptive Filtering, Independent Component Analysis
(ICA) and digital filtering. These pre-processing techniques are discussed in this chapter.
The pre-processing algorithm plays a vital role in a BCI system, and feature extraction is a
key issue. Some typical methods used for pre-processing of EEG signals are discussed as
follows.
Common spatial pattern (CSP) is the most popular technique applied to motor-imagery (MI)
feature extraction for classification in brain computer interface (BCI) applications. The
effectiveness of CSP depends heavily on the selection of the filter band.
Yu Zhang, et al. [1] proposed a sparse filter band common spatial pattern (SFBCSP)
for enhancing the spatial patterns. From raw EEG data, SFBCSP estimates the CSP features
on multiple signals that are filtered at a set of overlapping bands. The filter bands that result
in significant CSP features are then selected in a supervised way by using sparse regression.
A support vector machine (SVM) is applied to the selected features for motor imagery
classification. In SFBCSP, the regularization parameter plays an important role in the
feature selection process. The SFBCSP method produced higher classification accuracy
than the competing methods.
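For orientation, the basic two-class CSP computation that these variants build on can be sketched as below. This is the textbook formulation (whitening of the composite covariance followed by an eigendecomposition), not the SFBCSP method of [1]; the function names, the synthetic data and the choice of one filter pair are assumptions made for the example.

```python
import numpy as np


def class_covariance(trials):
    """Average trace-normalised spatial covariance.

    trials has shape (n_trials, n_channels, n_samples).
    """
    covs = []
    for x in trials:
        c = x @ x.T
        covs.append(c / np.trace(c))
    return np.mean(covs, axis=0)


def fit_csp(trials_a, trials_b, n_pairs=1):
    """Return 2 * n_pairs spatial filters (one per row)."""
    c_a = class_covariance(trials_a)
    c_b = class_covariance(trials_b)
    # Whiten the composite covariance c_a + c_b.
    evals, evecs = np.linalg.eigh(c_a + c_b)
    whitener = np.diag(evals ** -0.5) @ evecs.T
    # Eigenvectors of the whitened class-A covariance diagonalise both classes.
    lam, b = np.linalg.eigh(whitener @ c_a @ whitener.T)
    order = np.argsort(lam)[::-1]
    filters = b[:, order].T @ whitener
    # Keep the filters with the largest and the smallest class-A variance share.
    keep = np.r_[0:n_pairs, -n_pairs:0]
    return filters[keep]


def csp_features(filters, trial):
    """Log of the normalised variance of each spatially filtered signal."""
    var = np.var(filters @ trial, axis=1)
    return np.log(var / var.sum())


# Synthetic data: four channels mixing four sources; class A has a strong
# first source, class B a strong second source (assumed, illustration only).
rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 4))


def make_trials(scales, n_trials=30, n_samples=500):
    sources = rng.normal(size=(n_trials, 4, n_samples))
    sources = sources * np.asarray(scales, dtype=float)[None, :, None]
    return mixing @ sources


trials_a = make_trials([3, 1, 1, 1])
trials_b = make_trials([1, 3, 1, 1])
filters = fit_csp(trials_a, trials_b)
feats_a = np.array([csp_features(filters, x) for x in trials_a])
feats_b = np.array([csp_features(filters, x) for x in trials_b])
```

The resulting feature vectors would then be passed to a classifier such as an SVM; in a filter-bank variant the same computation is repeated on each band-pass-filtered copy of the data.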
Jyoti Singh Kirar & R. K. Agrawal [2] proposed a method that obtains features from
sub-bands of various sizes within a particular frequency band using CSP. The major
difficulty in using CSP is tuning the BCI device for every person, because the rhythmic
patterns of the α and β waves vary from person to person. The presence of artefacts and
noise and the non-stationary nature of EEG data may further decrease the performance of
CSP. Most recent research has focused on finding spatial patterns from a filter bank of
non-overlapping fixed-size sub-bands. Sub-band CSP (SBCSP) has been used to evaluate
motor imagery data using different fixed-size sub-bands. Filter bank CSP (FBCSP) applies
CSP to EEG filtered in different frequency bands and selects among the resulting features
using a maximal mutual information criterion. The Combined Variable Sized Common
Spatial Patterns (CVSCSP) method discovers sub-bands of various sizes. The efficiency of
the proposed method is evaluated in terms of classification error. CVSCSP obtains higher
classification performance than the other two techniques. The parameters chosen for
analysing the classification performance are bandwidth and granularity.
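To make the idea of a filter bank of fixed-size sub-bands concrete, the sketch below enumerates candidate bands. It is only an illustration of the band layout: the 4-40 Hz range, the widths and the helper name make_subbands are assumptions, and it does not reproduce the band-selection procedure of [2].

```python
def make_subbands(low_hz, high_hz, width_hz, step_hz):
    """List (start, stop) sub-bands of a fixed width inside [low_hz, high_hz].

    step_hz == width_hz gives non-overlapping bands;
    step_hz < width_hz gives overlapping bands.
    """
    bands = []
    start = low_hz
    while start + width_hz <= high_hz:
        bands.append((start, start + width_hz))
        start += step_hz
    return bands


non_overlapping = make_subbands(4, 40, width_hz=4, step_hz=4)
overlapping = make_subbands(4, 40, width_hz=4, step_hz=2)
```

Each band would be used to band-pass filter the EEG before CSP features are computed on it; varying width_hz is one simple way to obtain sub-bands of different sizes.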
Fabien Lotte [3] presented a tutorial on signal processing techniques that can be used to
identify mental states from EEG signals in BCI. Recognizing the user’s mental state from
EEG signals is not an easy task, since such signals are noisy, non-stationary, complex and
of high dimensionality. Therefore, mental state recognition from EEG signals requires
specific signal processing and machine learning tools. The tutorial stresses the importance of
the feature extraction and classification components. There are 3 main sources of
information that can be used to design an EEG-based BCI: 1) spectral information, typically
used through band power features, 2) temporal information, obtained by pre-processing the
EEG amplitude, and 3) spatial information, which can be exploited by using channel
selection and spatial filtering (e.g., Common Spatial Pattern).
Kottaimalai R, et al. [4] proposed a technique for analysing data and identifying patterns
in EEG signals using PCA. Principal Component Analysis (PCA) makes data compression
possible by projecting higher-dimensional data onto a lower-dimensional space. There are
two steps in using Principal Component Analysis with a Neural Network: (i) elimination of
redundant data in the dataset and (ii) training of a Neural Network on the resultant data.
The accuracy of mental task classification has been improved by several kinds of
pre-processing. Compared with the Neural Network model alone, Principal Component
Analysis with a Neural Network increases the probability of correct classification. In
future, the same dataset could be classified using other soft computing techniques such as
fuzzy logic and ANFIS for better results.
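The dimensionality-reduction step of PCA can be sketched with a singular value decomposition. This is a generic illustration on synthetic data, not the pipeline of [4]: the eight "channels", the two latent sources and the function name pca_reduce are assumptions, and the neural network stage is omitted.

```python
import numpy as np


def pca_reduce(x, n_components):
    """Project x (n_samples, n_features) onto its leading principal components.

    Returns the scores, the component vectors (one per row) and the fraction
    of variance explained by each retained component.
    """
    centred = x - x.mean(axis=0)
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    components = vt[:n_components]
    return centred @ components.T, components, explained[:n_components]


# Synthetic example: 8 correlated features driven by 2 latent sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
loadings = rng.normal(size=(2, 8))
x = latent @ loadings + 0.05 * rng.normal(size=(1000, 8))

scores, components, ratio = pca_reduce(x, n_components=2)
```

Because the eight features are almost entirely determined by two sources, two components retain nearly all of the variance; the reduced scores are what would be passed on to a classifier.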
Hongbin Yu, et al. [5] proposed nonparametric CSP (NCSP) algorithms which do not
explicitly rely on the assumption that the underlying classes follow a Gaussian distribution.
To solve the proposed NCSP algorithm and its extension, nonparametric multi-class CSP
(NMCSP), they established a new efficient algorithm based on matrix deflation. Vigilance
detection has been an important area in the field of brain-computer interface (BCI) research.
Vigilance, or sustained attention, is essential for people engaged in tasks that demand
attention over long periods, such as monotonous monitoring and driving. EEG signals are
strongly correlated with human vigilance because they reveal changes in the condition of
the brain, which makes it possible to predict human alertness or activity based on EEG. In
the proposed work, a new Common Spatial Pattern algorithm called two-class
nonparametric common spatial pattern (NCSP) is used to extract features from EEG signals
for estimating the vigilance of a person, in order to avoid accidents and problems while
driving.
Hardik Meisheri, et al. [6] proposed a Common Spatial Pattern algorithm for multiclass
EEG classification with a pre-processing step that improves the generalization of the CSP
covariance matrices by removing trials that are noisy or affected by artefacts. The results
were presented on the publicly available BCI Competition IV dataset 2a and compared with
current state-of-the-art algorithms for multiclass classification. They show that the
proposed method outperforms these algorithms in four-class classification, improving the
mean accuracy by 8-13%. The increase is significant for subjects who earlier had very low
accuracies.
Fiorenzo Artoni, et al. [7] examined the common practice of applying dimension reduction
to EEG data using Principal Component Analysis (PCA) before ICA decomposition. The
dipolarity and stability of the components arising from brain and known non-brain
processes are affected by PCA rank reduction. PCA rank reduction also increased
uncertainty in the equivalent dipole positions and spectra of the independent component
(IC) brain effective sources across subjects. PCA rank reduction should therefore be
avoided, or at least tested on each EEG dataset, before it is applied as a pre-processing
procedure. In various research fields, Principal Component Analysis (PCA) has commonly
been used to reduce the dimensionality of the original sensor space and simplify subsequent
analyses. PCA can be used to efficiently remove redundancy, making the data suitable for
standard ‘complete’ ICA decomposition. The pre-processing pipeline might vary slightly
depending on the amount of data available and the type of subject task. The amount of data
available means the length of the data and the number of channels for pre-processing.
Robert J. Barry and Frances M. De Blasio [8] proposed an EEG-ERP method. In this
method, Principal Component Analysis is used to decompose the Event-Related Potential
(ERP) into mathematical components, and these mathematical entities are largely accepted
as useful representations of the electrophysiological components that make up the ERP.
ERPs from scalp-recorded EEG yield huge amounts of data as sampling rates and the
number of electrodes increase. The proposed method demonstrates the feasibility of using
PCA to increase the objectivity and efficiency of data extraction from both the EEG
spectrum and ERP waveforms, and shows the brain dynamics reflected in the correlations
between EEG and ERP. This application may lead to better insight into the brain dynamics
underlying cognitive functioning in these tasks.
Deepa R, Shanmugam A and Sivasenapathi B [9] proposed a method involving
Revised Principal Component Analysis (RPCA), multipliers and Support Vector Machine
(SVM) classifiers with two distinct features. These methods are compared to investigate the
behavior of the electrical activity of the brain during visual attention. In this work, EEG
classification is performed with the proposed RPCA method, which is useful in predicting
brain activity. By analysing the activity of the brain signal in the open or closed condition,
this approach provides better frequency behavior. The relation between visual attention and
brain activity is one of the most challenging problems in current biomedical engineering.
The feature extraction and dimensionality reduction of the brain signals are analysed using
RPCA.
Independent Component Analysis (ICA) assumes that the reading at each electrode is a
linear combination of independent physiological activities and, in the context of EEG,
seeks to estimate the underlying sources related to the various physiological processes. The
following points describe the ICA principles and the assumptions about the nature and
properties of the EEG signal that are made before ICA is employed to analyze EEG signals:
a) Linear representation: according to ICA, the observations are assumed to be a linear
mixture of several different sources. In the case of EEG data, the conductive volume of
the head linearly mixes the potential fields of the brain.
b) Instantaneous mixing model: ICA ignores any temporal delays that may occur during
the mixing of the sources. In EEG recording there is a delay between the generation of
a signal at its source and its arrival at the measuring sites (i.e. electrode locations).
However, relative to the time scale of the signal and the sample rate, the EEG
electrodes are quite close together and the propagation speed is very high. As a
result, it is reasonable to assume that the source signals are detected at the same
instant as they are produced, which is known as instantaneous mixing.
ICA further assumes that the sources to be separated have non-Gaussian distributions.
Because the number of neurons is so high, the mean of all EEG signal generators tends to
have a Gaussian distribution. Neurons can generate oscillatory potentials and act as
separate oscillators, and the electrodes on the scalp detect these potentials once they have
been mixed together. The EEG signal includes a combination of brain and non-brain
sources. The non-brain components in EEG have non-Gaussian distributions; some
non-brain components have super-Gaussian distributions (e.g. blink and ECG sources),
while the 50 Hz power line signal is the major sub-Gaussian source in EEG. Owing to their
spiky activation patterns and non-Gaussian distributions, ICA can also identify the sources
involved in stimulus responses and epileptic spikes. As a result, this assumption is
reasonable for distinguishing artefactual sources from the mixed EEG data. Finding the
unmixing matrix W is an optimization problem that maximises the non-Gaussianity
(minimum Gaussian characteristics) of the separated sources S, resulting in sources that are
maximally independent. To estimate the optimum unmixing matrix W, many ICA methods
have been devised, each of which uses a different metric to measure the non-Gaussianity of
the ICs.
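One commonly used measure of non-Gaussianity is the excess kurtosis, which is zero for a Gaussian signal, positive for super-Gaussian signals and negative for sub-Gaussian ones. The sketch below illustrates this on synthetic stand-ins: a Laplace-distributed signal is assumed here as a proxy for a spiky artefact source and a pure 50 Hz sinusoid as a proxy for power-line interference; neither is real EEG.

```python
import numpy as np


def excess_kurtosis(x):
    """Fourth standardised moment minus 3 (zero for a Gaussian signal)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0


rng = np.random.default_rng(0)
n = 100_000
gaussian = rng.normal(size=n)             # reference Gaussian signal
spiky = rng.laplace(size=n)               # super-Gaussian stand-in
t = np.arange(n) / 1000.0                 # assumed 1 kHz sampling
mains = np.sin(2 * np.pi * 50 * t)        # sub-Gaussian 50 Hz stand-in

k_gaussian = excess_kurtosis(gaussian)
k_spiky = excess_kurtosis(spiky)
k_mains = excess_kurtosis(mains)
```

Kurtosis is only one of several possible contrast functions; different ICA algorithms use different measures, and kurtosis in particular is sensitive to outliers.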
The estimated sources may be related to brain or artefact processes. ICA separates the
sources but does not label them. ICA-estimated sources are usually labelled manually by
visual inspection. The artefact-free EEG can then be reconstructed by removing the
sources that have been labelled as artefactual.
ICA Algorithms
Various ICA algorithms use different criteria to optimize the non-Gaussianity of the ICs,
as mentioned in the preceding section. Nonetheless, before estimating the unmixing matrix
W, two preliminary stages of centring and whitening (or sphering) are generally performed
to simplify the ICA procedure. Centring subtracts the mean of the measured signal X from
the signal itself, so that the centred signal has zero mean. Whitening is a basic and
conventional method for reducing the complexity of the ICA problem: it linearly
transforms the data coordinates so that the components are uncorrelated and have unit
variance. After whitening, all that ICA has to do is “rotate” this representation back to the
original source axes. As a result, one might claim that whitening solves half of the ICA
problem.
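The two preliminary stages can be sketched as follows. This is a minimal illustration using an eigendecomposition of the covariance matrix (symmetric, or ZCA, whitening is assumed here; other whitening transforms are equally valid), applied to a synthetic three-channel mixture; the function name is hypothetical.

```python
import numpy as np


def centre_and_whiten(x):
    """Centre and whiten x of shape (n_channels, n_samples).

    Returns the whitened data and the whitening matrix.
    """
    # Centring: remove the mean of each channel.
    centred = x - x.mean(axis=1, keepdims=True)
    # Whitening: decorrelate the channels and scale them to unit variance.
    cov = centred @ centred.T / centred.shape[1]
    evals, evecs = np.linalg.eigh(cov)
    whitening = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return whitening @ centred, whitening


# Synthetic mixture of three non-Gaussian sources with a constant offset.
rng = np.random.default_rng(0)
s = rng.laplace(size=(3, 5000))
a = rng.normal(size=(3, 3))
x = a @ s + 10.0

z, whitening = centre_and_whiten(x)
cov_z = z @ z.T / z.shape[1]   # close to the identity matrix
```

After this step the remaining unknown is an orthogonal matrix, which is why the ICA algorithm only has to find a rotation.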
ICA can reveal the information related to brain function that is buried in the observed EEG
mixture. In this way, EEG can be split into artefact and non-artefact sources. The artefact
sources can then be identified and removed from the data, either manually or automatically.
By back-projecting the remaining sources, the cleaned, artefact-free EEG can be
reconstructed. Another application of ICA to EEG is feature extraction, in which the
ICA-estimated sources are used as a new representation of the data from which significant
features can be extracted from the time series and power spectra. ICA has been used to
extract features for Alzheimer's disease, epileptic seizures and epileptic spike detection.
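The back-projection step can be sketched as below. To keep the example self-contained, the unmixing matrix is taken to be the exact inverse of a known synthetic mixing matrix; in practice it would be estimated by an ICA algorithm, and the index of the artefactual component would come from manual or automatic labelling. All signals and names here are assumptions for illustration.

```python
import numpy as np


def remove_components(x, unmixing, artefact_idx):
    """Zero the selected sources and back-project to channel space.

    x has shape (n_channels, n_samples) and is assumed to be centred.
    """
    sources = unmixing @ x                 # estimated source activations
    mixing = np.linalg.pinv(unmixing)      # back-projection matrix
    sources[artefact_idx, :] = 0.0         # discard artefactual sources
    return mixing @ sources


# Synthetic sources: a 10 Hz rhythm, blink-like pulses and background noise.
fs = 250.0
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(1)
brain = np.sin(2 * np.pi * 10 * t)
blink = np.zeros_like(t)
blink[100:130] = 5.0
blink[350:380] = 5.0
noise = 0.1 * rng.normal(size=t.size)
sources = np.vstack([brain, blink, noise])

mixing = rng.normal(size=(3, 3))           # assumed known for this sketch
recorded = mixing @ sources
unmixing = np.linalg.inv(mixing)           # stands in for an ICA estimate

cleaned = remove_components(recorded, unmixing, artefact_idx=[1])
```

With an estimated rather than exact unmixing matrix the separation is imperfect, so some artefact residue usually remains and some brain activity may be removed along with the rejected component.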
EEG Applications
EEG has a good temporal resolution of milliseconds but a poor spatial resolution of a few
centimeters, depending on the number of sensors. EEG has been widely used in clinical
and scientific studies because it gives a non-invasive measure of brain activity with high
temporal resolution.
In recent years, EEG waves have been widely investigated and analysed as a means of
analyzing brain activity as well as of monitoring brain stimulation. EEG signals are made
up of different oscillations, called rhythms. EEG can be used to detect brain diseases by
identifying certain rhythms or characteristics. In healthy subjects, the state of wakefulness
or sleep is linked to brain activity in certain frequency bands. These frequency ranges are
named the delta (δ), theta (θ), alpha (α), beta (β) and gamma (γ) bands. The δ-band
(0.5-4 Hz) is linked to deep sleep, whereas the θ-band (4-7.5 Hz) emerges during the
transition from wakefulness to drowsiness and is linked to arousal level. The α-band
(7.5-13 Hz), which is mostly visible in the occipital region, indicates a relaxed state of
awareness without attention; the β-band (13-26 Hz) is a waking rhythm linked with focus
and concentration; and the γ-band (above 26 Hz) has been associated with some mental
disorders.
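The band definitions above can be turned into relative band powers with a simple periodogram. This is a minimal sketch on a synthetic signal: the band edges follow the text, while the 45 Hz upper limit of the γ-band, the sampling rate and the function name are assumptions, and a smoothed estimator such as Welch's method would normally be preferred on real recordings.

```python
import numpy as np

# Band edges in Hz as given in the text; the gamma upper limit is assumed.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 7.5),
    "alpha": (7.5, 13.0),
    "beta": (13.0, 26.0),
    "gamma": (26.0, 45.0),
}


def relative_band_powers(signal, fs):
    """Share of the 0.5-45 Hz power that falls in each band (periodogram)."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    powers = {
        name: spectrum[(freqs >= lo) & (freqs < hi)].sum()
        for name, (lo, hi) in BANDS.items()
    }
    total = sum(powers.values())
    return {name: p / total for name, p in powers.items()}


# Synthetic signal dominated by a 10 Hz (alpha) rhythm with weaker 20 Hz activity.
fs = 250.0
t = np.arange(0, 4.0, 1.0 / fs)
signal = np.sin(2 * np.pi * 10 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)

powers = relative_band_powers(signal, fs)
dominant = max(powers, key=powers.get)
```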
Conclusion:
EEG pre-processing is essential for removing artefacts from the EEG signal. In the above
review, several techniques have been discussed. Common spatial pattern (CSP) is the most
popular technique applied to motor-imagery (MI) feature extraction for classification in
brain computer interface (BCI) applications, and its effectiveness depends heavily on the
filter band selection. Principal Component Analysis (PCA) is one of the oldest and most
widely used techniques for pre-processing of EEG data. Its idea is simple: reduce the
dimensionality of a dataset while preserving as much variability, or statistical information,
as possible. The review indicates that ICA is a powerful tool when the biomedical analysis
involves many channels, which is the case for the electroencephalogram. In this case, the
important information can be obtained after applying Independent Component Analysis.
References:
1. Mamunur Rashid, Norizam Sulaiman, Anwar P. P. Abdul Majeed, Rabiu Muazu
Musa, Ahmad Fakhri Ab. Nasir, Bifta Sama Bari and Sabira Khatun, 2020, “Current
Status, Challenges, and Possible Solutions of EEG-Based Brain-Computer Interface:
A Comprehensive Review”, Vol 14, pp. 1-35
4. Fabien Lotte, 2014, “A Tutorial on EEG Signal Processing Techniques for Mental
State Recognition in Brain-Computer Interfaces”, pp. 8
5. Kottaimalai R, et al., 2013, “EEG Signal Classification using Principal Component
Analysis with Neural Network in Brain Computer Interface Applications”, IEEE
International Conference on Emerging Trends in Computing, Communication and
Nanotechnology (ICECCN 2013), pp. 227-231
6. Hongbin Yu, Hongtao Lu, Shuihua Wang, Kaijian Xia, Yizhang Jiang and
Pengjiang Qian, 2016, “A General Common Spatial Patterns for EEG Analysis with
Applications to Vigilance Detection”, IEEE Transactions and Journals, vol 4, pp. 1-13