Robust Artifactual Independent Component Classification For BCI Practitioners
(http://iopscience.iop.org/1741-2552/11/3/035013)
Abstract
Objective. EEG artifacts of non-neural origin can be separated from neural signals by
independent component analysis (ICA). It is unclear (1) how robustly recently proposed
artifact classifiers transfer to novel users, novel paradigms or changed electrode setups, and (2)
how artifact cleaning by a machine learning classifier impacts the performance of
brain–computer interfaces (BCIs). Approach. Addressing (1), the robustness of different
strategies with respect to the transfer between paradigms and electrode setups of a recently
proposed classifier is investigated on offline data from 35 users and 3 EEG paradigms, which
contain 6303 expert-labeled components from two ICA and preprocessing variants. Addressing
(2), the effect of artifact removal on single-trial BCI classification is estimated on BCI trials
from 101 users and 3 paradigms. Main results. We show that (1) the proposed artifact classifier
generalizes to completely different EEG paradigms. To obtain similar results under massively
reduced electrode setups, a proposed novel strategy improves artifact classification. Addressing
(2), ICA artifact cleaning has little influence on average BCI performance when analyzed by
state-of-the-art BCI methods. When slow motor-related features are exploited, performance
varies strongly between individuals, as artifacts may obstruct relevant neural activity or are
inadvertently used for BCI control. Significance. Robustness of the proposed strategies can be
reproduced by EEG practitioners as the method is made available as an EEGLAB plug-in.
Keywords: EEG, artifact removal, independent component analysis (ICA), blind source
separation (BSS), brain–computer interface (BCI)
(Some figures may appear in colour only in the online journal)
task-locked artifacts, as the decoding of a user’s intent by a BCI system should not rely on task-related non-neural signals. This requirement is most important when conducting research with healthy study participants on a novel paradigm or analysis method which should be transferable to severely motor-impaired patients, because they may not be physically capable of producing those artifacts [2–4]. Understandably, the role of artifacts is thus scrutinized during peer-reviewed publication processes.
The exclusive use of brain signals in BCI must typically be dropped when it comes to practical tests with end-users in need, as hybrid BCI approaches [5, 6] provide a richer and more reliable control than pure BCIs. Additionally, interest in novel types of studies is growing amongst EEG researchers. Such studies include users (inter-)acting in space [7–9], such as in collaborative and social paradigms (for a review see [10]), the interaction between users and machines [11] and the non-medical use of BCI methods [12, 13].
From an EEG practitioner’s point of view, a fully automatic algorithmic solution for the treatment of artifacts is desirable. It would put him or her in control of artifacts and enable him or her to either remove them or check their influence. Ideally, this would be realized by a global classifier which could be trained once and then reliably separates multiple types of artifactual components from neural components. The classifier should work robustly across data from different users and across domains. The latter includes changing experimental paradigms and tasks, different preprocessing methods and varying EEG electrode setups. It should do so without any need of re-training, and it should not require separate artifact recordings before it can be applied to novel scenarios.
1.1. State-of-the-art IC artifact classification
For an extensive review of artifact reduction techniques in the context of BCI-systems, we refer the reader to [14]. In our work, we concentrate on a class of popular artifact rejection approaches, which decompose the original EEG into independent source components (ICs) using independent component analysis (ICA). This method exploits the assumption that artifactual signal components and neural activity are generated independently. Artifactual ICs are hand-selected and then discarded. The remaining neural components are used to reconstruct the EEG [15, 16].
While assumptions for the application of ICA methods are only approximately met in practice (no systematic co-activation of artifactual and neural activity, linear mixture of independent components (ICs), stationarity of the sources and the mixture, prior knowledge about the number of components), their application usually leads to a good, albeit not perfect separation for common artifacts such as blinks, eye movements or scalp muscles [17–20]. ICA has successfully been applied to the removal of cochlear implant artifacts [21]. However, gait-related artifacts are reported to remain in most of the ICs in EEG recorded during mobile activities [9, 22]. Because a thorough analysis of the achievable separation performance is out of the scope of this paper, we refer the reader to [17, 23, 24] on the question of which ICA variants are well-suited for artifact rejection. Instead, we focus on practical tools which avoid the time-consuming hand-rating process of ICs by classifying ICs with the help of machine learning methods into artifactual and non-artifactual components. Most approaches concentrate on eye artifacts [25–31], but automatic classification has also been successful for heart-beat artifacts [28, 31], generic discontinuities [29], muscle artifacts [31–34] and even very specialized artifacts such as cochlear implants [21]. As most of these methods have a supervised basis, to some degree they reflect the specific conditions of the training set. The EEG practitioner is now faced with the question of how well supervised methods generalize to his or her data acquired under novel experimental conditions with different preprocessing.
Unsupervised methods successfully circumvent this problem, for example by reverting to automatic thresholding strategies [29]. However, these methods are often limited to the use of one or two features and detect only certain types of artifacts. It is unclear how to extend them to more complex artifacts with a varying physiological fingerprint, such as muscle artifacts. For supervised or template-based approaches, first studies suggest that generalization to novel paradigms is possible [28, 30, 31, 34]; however, efforts have concentrated on eye artifacts [28, 30].
1.2. Robustness under novel paradigms and electrode setups
In this paper, we take a step forward by analyzing the generalization ability of a state-of-the-art supervised IC classification algorithm which we have recently proposed [34]. It is not restricted to the classification of eye or muscle artifacts, but is equally well suited to detect other artifacts such as loose electrodes. By comparing three strategies, we investigate this multi-artifact classifier with respect to new electrode setups and paradigms. We ask the following questions: How does a change of the electrode setup impact the IC classification performance? Is it necessary to hand-label components of the new data set and retrain the classifier based on those? How strong is the deterioration of IC classification performance without re-training? We investigate these questions for three data sets of 6303 labeled ICs from 35 participants in 3 experimental studies: a reaction time (RT) task embedded in a simulated-driving task, an auditory event-related potential study (ERP-BCI) and a study analyzing continuous EEG data (CNT) of subjects instructed to listen to short stories.
1.3. Effect on BCI performance
After having demonstrated the robustness properties of the IC classification, we are interested in the effects of automatic ICA artifact cleaning on the classification of EEG trials in BCI systems. As a first proof-of-concept, Halder et al [33] applied artifact cleaning to data from three participants who performed motor imagery. Depending on whether artifacts were systematically co-activated with the task or not, opposite effects of artifact cleaning on BCI classification performance were demonstrated. To the best of our knowledge, only small
J. Neural Eng. 11 (2014) 035013 I Winkler et al
data sets of one or two participants have been analyzed since then [35, 36].
To fill this gap, we extend our analysis from [34] by investigating the overall effect of ICA artifact cleaning on BCI performance to data of 101 participants and 3 BCI paradigms: auditory event-related potentials, event-related (de-)synchronization and slow motor-related potentials due to motor imagery tasks.
1.4. Software for the EEG practitioner
Last but not least, we make our IC classification software available as an EEGLAB plug-in ‘MARA’ (Multiple Artifact Rejection Algorithm). EEGLAB [37] is a popular, Matlab-based open-source tool used by a growing community of EEG researchers. As existing ICA-based plug-ins primarily focus on the detection of eye artifacts [27–29], we hope this will deliver a substantial contribution to the community by assisting EEG practitioners with the rejection of multiple types of artifacts.
2. Methods and materials
2.1. Processing chain for ICA artifact rejection
The typical process chain for artifact rejection with ICA consists of the following steps: first, a rough pre-cleaning of the data by channel rejection and trial rejection based on variance criteria may be performed. Second, a dimensionality reduction may help to avoid an unnatural splitting of (neural) sources. Unfortunately, the optimal number of components to extract remains unknown and has to be determined either by visual inspection or by a heuristic, such as retaining 99% of the explained variance or a fixed number of components. Third, ICA methods decompose the observed EEG data x into unknown source components s assumed to be mutually independent and following the generative linear model x = A · s. Finally, artifactual source components are identified, which allows the EEG signals to be reconstructed without them.
In manual classification of ICs, expert ratings are based on a component’s time series, its power spectrum and spatial pattern (given by the respective column of A). Unfortunately, ICA frequently results in mixed components containing aspects of both neural and artifactual activity which cannot be rated unambiguously [38]. Consequently, such mixed components tend to be either retained or rejected depending on the specific application. The subjective nature of such expert decisions is reflected by the fact that experts disagree with each other as well as with themselves over time [39]. Nevertheless, the reliability of component classification is often not reported, and if it is, researchers use one of many metrics of inter-rater reliability statistics which are difficult to compare directly (e.g. Krippendorff’s alpha in [20], inter-class correlation coefficient in [40], degree of association phi in [28], mean-squared error (MSE) or average agreement in [34, 39]).
Automatic classification of ICs based on machine learning methods offers a well-described algorithm which rates consistently over time. However, this algorithm, too, is of subjective nature in the sense that it is optimized to predict labels similar to those labeling strategies applied by human raters. The performance of the algorithm thus crucially depends on the quality of the training set and its labels. For all our IC data sets, experts were instructed to identify components which are predominantly driven by artifacts.
In this paper, automatic IC classification is realized by a linear pre-trained classifier. It is based on the following six features which were determined in a feature selection procedure described in [34]. One feature aims to detect outliers in the time series of an IC, three features are extracted from the spectrum, and two features extract information from the scalp pattern of an IC, the latter depending directly on the electrode layout.
(i) Current density norm. ICA itself does not provide information about the locations of the sources s. However, ICA patterns can be interpreted as EEG potentials for which the location of the sources can be estimated. We considered 2142 locations arranged in a 1 cm spaced 3D-grid, formulated the forward problem according to [41–43] and sought the source distribution with minimal l2-norm (i.e. the ‘simplest’ solution) [44, 45]. Since this source distribution can model cerebral sources only, it is natural that artifactual signals originating outside the brain can only be modeled by rather complicated sources. Those are characterized by a large l2-norm, which we use as a feature.
(ii) Range within pattern. The logarithm of the difference between the minimal and the maximal activation in a pattern.
(iii) Mean local skewness. The mean absolute local skewness of time intervals of 15 s duration. This feature aims to detect outliers in the time series.
(iv) λ and fit error. These two features describe the deviation of a component’s spectrum from a prototypical 1/f curve and its shape. The parameters k_1, λ, k_2 > 0 of the curve
$$ f \mapsto \frac{k_1}{f^{\lambda}} - k_2 \qquad (1) $$
are determined by six points of the log spectrum: (1) the log power at 2 Hz, (2) the log power at 3 Hz, (3) the point of the local minimum in the band 5–13 Hz, (4) the point 1 Hz below the third point of support, (5) the point of the local minimum in the band 33–39 Hz, and (6) the point 1 Hz below the fifth point of support. Finally, the logarithm of λ and of the MSE of the approximation of f to the real spectrum in the 8–15 Hz range are used as features for the classifier.
(v) 8–13 Hz. The average log band power of the α band (8–13 Hz).
2.2. Data sets and experimental paradigms
Data sets of four experimental EEG paradigms (named RT, CNT, MI-BCI, ERP-BCI) were available for this study. For three of them, RT, CNT and ERP-BCI, expert-labeled ICs
(artifacts versus neural sources) were available. Two data sets (MI-BCI, ERP-BCI) stem from BCI experiments. As the trial-wise BCI tasks are known, the estimated single-trial BCI-classification performance provides a metric for the influence of a preceding artifact treatment.
RT. For this data set, labeled ICs were available. In a simulated-driving study, participants performed a forced-choice left or right key press RT task upon two auditory stimuli in an oddball paradigm [34]. EEG data was recorded from 121 approx. equidistant sensors and high-noise channels were rejected based on a variance criterion. We selected 43 runs of 10 min duration from eight participants that had 104 electrodes in common. Prior to the IC computation via TDSEP [46], a 2 Hz high-pass filter was applied, and dimensionality was reduced to 30 PCA components. Two experts hand-labeled the resulting 30 ICs per run into artifactual and neural components (1290 labeled ICs altogether).
Of these, 840 ICs (28 runs from 5 participants) were used to train a linear classifier C_RT to discriminate artifactual from neural components. Another 450 ICs (15 runs from the 3 remaining subjects) were available for estimating the generalization performance of C_RT. The training set contained 52% of artifactual ICs, the test set contained 59%.
CNT. For this data set, labeled ICs were available. Nine participants continuously listened to audio–visual stories during short runs of an average duration of 3.77 min [40]. The resulting 71 recordings contained 62 EEG channels plus one EOG channel. The recording of each run was appended with a short eyes-closed and eyes-open recording and high-pass filtered at 0.16 Hz. No dimensionality reduction was applied before ICs were estimated by FastICA [47] on the full set of electrodes. This decomposition yielded 63 × 71 = 4473 components, which were hand-rated by three experts into 47% artifactual and 53% neural source components.
ERP-BCI. For this data set, labeled ICs as well as labeled BCI-trials were available. In a spatial auditory BCI study which made use of auditory event-related potentials, participants underwent a calibration run of approx. 30 min duration and an online spelling run [48]. In the online run, subjects were asked to write a sentence while auditory and visual feedback was provided. EEG was recorded from 61 electrodes while the participants listened to a rapid sequence of 6 auditory stimuli and were instructed to silently count the number of appearances of a rare target tone.
For the classification of artifacts, data of 18 participants was analyzed. Their EEG signals were band-pass filtered between 0.1 and 40 Hz and the dimensionality was reduced to 30 PCA channels. Subsequently 30 ICs were computed per run using TDSEP. The resulting 540 source components were hand-labeled into 72% artifactual and 31% neural source components.
To assess the influence of artifact correction on the BCI classification performance, data of the 21 BCI novices participating in the first session of the auditory ERP speller study of Schreuder et al [48] was re-analyzed. Their calibration measurement is used to train a shrinkage-regularized linear classifier based on spatio-temporal ERP features [48, 49]. BCI performance evaluations are based on the re-analyzed online data of these participants.
MI-BCI. For this data set, labeled BCI-trials were available, but no labeled ICs. This data set was recorded with 119 EEG channels from 80 healthy BCI novices, who first performed motor imagery tasks (left hand, right hand and both feet) in a calibration run (i.e. without feedback). Every 8 s, the requested BCI task of the current trial was indicated by a visual cue. A CSP-based BCI-classifier (see below) was trained on the labeled calibration trials using the pair of classes which provided best discrimination. During the three online runs of 100 trials each, participants controlled an application which provided continuous visual feedback in the form of a horizontally moving cursor [50].
Motor imagery data can be exploited by two different types of EEG features.
(i) CSP-MI-BCI: the most common strategy makes use of oscillatory features which describe event-related (de)-synchronization (ERD/ERS) in the alpha and beta bands of the EEG. After enhancing the SNR of these effects by individual data-driven spatial filters, which are derived by the common spatial patterns (CSP) analysis [51], CSP-features can be classified by a shrinkage-regularized linear classifier.
(ii) LRP-MI-BCI: the second strategy is based on slow motor-related potentials (e.g. the lateralized readiness potential (LRP)). Different classes of imagined movements are distinguished with an ERP-type analysis [49, 52]: EEG is band-pass filtered between 4 and 8 Hz, before a small number of class-discriminative intervals is determined on the calibration data. The average activity per interval and channel is used as features for a binary shrinkage-regularized linear classifier.
While the original online runs were performed with the CSP-MI-BCI classifier, without artifact rejection, the offline re-analysis makes use of both types of features in order to assess the influence of a preceding artifact removal.
2.3. Robustness under novel paradigms and electrode setups
For the classification of artifactual IC components, three classification strategies (fixed, adapted and study-specific) were compared on the ERP-BCI and the CNT data set. Figure 1 visualizes the strategies. In the fixed scenario, classifier C_RT is trained once on features of labeled ICs of the RT data set, and then applied to ICs of any other data set. Neither hand-labeling of novel ICs nor re-calculation of features or any re-training of the classifier is necessary in this simplest scenario. While hand-labeling of novel ICs is also avoided in the adapted strategy, a channel adaptation on the RT data is performed by cutting the training patterns to the specific electrode layout of the test data set. Features then need to be re-calculated based on the reduced patterns and a
Figure 1. Schematic plot of the three transfer strategies fixed, adapted and study-specific. Expensive hand-labeling steps of ICs are marked
with red arrows, cheap channel reduction and classifier training steps in green and black. Note that any self-application of classifiers in the
study-specific strategy was performed exclusively in a leave-one-subject-out validation scenario.
re-training yields the adapted classifier C_RT−A. All steps can be performed automatically and do not require user input. The third strategy, study-specific, requires the effort of experts every time a novel study is performed. The ICs of at least some subjects need to be hand-labeled, before a study-specific classifier (e.g. C_CNT or C_ERP) can be trained and applied to novel subjects. Its performance was evaluated by leave-one-subject-out cross-validation.
To explore the robustness of the artifact classifier against reduced EEG channel sets, we compared the fixed IC-classifier C_RT with the adapted IC-classifier C_RT−A on the RT and ERP-BCI test data sets with reduced setups (varying from 16 to 104 and from 16 to 61 EEG channels, respectively). All electrode setups were approximately equidistant and covered the whole scalp.
2.4. Effect on BCI performance
This offline re-analysis of three BCI paradigms described in section 2.2 compares standard BCI performance with and without a preceding ICA artifact cleaning. In both cases, artifactual channel and trial rejection based on a variance criterion was performed prior to BCI training. Training of the BCI-classifiers is based on the calibration runs only, and BCI performance tests are performed with the online runs of the participants.
ICA artifact cleaning is included in a manner that allows for real-time BCI applications. Prior to TDSEP, we estimated via cross-validation on the calibration data whether a PCA pre-processing to 99% explained variance would be useful. This was the case only for the LRP-MI paradigm. IC components were then derived by TDSEP and classified with the adapted classifier C_RT−A on the calibration data. The BCI is set up on the remaining ICs. On the online runs, un-mixing and component rejection are performed according to the de-mixing determined on the calibration data. The BCI classifier is applied to features extracted from the remaining components of the online runs.
3. Results
3.1. Robustness under novel electrode setups
Figure 2 shows the classification error for the fixed classifier C_RT and the adapted classifier C_RT−A for different channel setups on both the RT and the ERP-BCI test sets. On the RT test data with the full 104 channel setup, a classifier using all six features achieves an MSE of only 9.3%, which slightly outperforms the use of only the four pattern-independent features (12.4% MSE). While C_RT generalizes robustly over the range of 104 to 48 electrodes in the RT test sets, its error increases up to 31.8% for the smallest set of 16 electrodes. On the ERP-BCI data set, the use of only the four pattern-independent features already outperforms the fixed classifier C_RT on the full 61 electrode setup. The classification error of C_RT then rises to 50% on the smallest set of 16 electrodes. In both the RT and the ERP-BCI data set, the drop in overall performance is due to the poor performance of both pattern-based features, whose errors exceed 50%.
[Figure 2 plots: test error (%) versus number of electrodes. (a) RT data set, 16 to 104 electrodes; (b) ERP-BCI data set, 16 to 61 electrodes.]
Figure 2. Mean classification error ± standard error estimated on (a) the RT and (b) the ERP-BCI test sets for different channel setups. The
left plot shows the results for a fixed classifier, the right plot for a classifier adapted to each channel setup.
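The quantity plotted in figure 2, a mean classification error with its standard error, can be computed as in the following sketch. It assumes hard 0/1 predictions, in which case the mean-squared error between labels and predictions equals the fraction of misclassified ICs, and it assumes that the standard error is taken across test subjects; the source states neither detail explicitly. All labels below are hypothetical toy values.

```python
import numpy as np

def misclassification_rate(labels, predictions):
    """Fraction of ICs whose predicted label differs from the expert label."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    return float(np.mean(labels != predictions))

def mean_and_standard_error(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sem = values.std(ddof=1) / np.sqrt(values.size)
    return float(mean), float(sem)

# Hypothetical labels (1 = artifact, 0 = neural) for three test subjects,
# given as (expert labels, classifier predictions).
subjects = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # 1 of 4 wrong
    ([0, 0, 1, 1], [0, 0, 1, 1]),  # none wrong
    ([1, 1, 0, 0], [0, 1, 0, 1]),  # 2 of 4 wrong
]
errors = [misclassification_rate(y, p) for y, p in subjects]
mean_error, sem_error = mean_and_standard_error(errors)
```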
For the adapted strategy (i.e. re-training the classifier on the patterns cut to the specific electrode setup), the error of the pattern features (range within pattern and current density norm) was much less pronounced in both data sets. The overall error of C_RT−A for 16 electrodes remained at 11.3% on the RT data set (compared with 9.3% on 104 channels) and at 15.9% for the ERP-BCI data set (compared with 13.3% on 61 channels). In both data sets, we slightly gain from using the pattern features. On the reduced electrode setup, the classifier weight of the range within pattern feature dropped, while the weight for current density norm remained stable.
3.2. Robustness under novel paradigms
The results for the three proposed classification strategies on the three labeled IC data sets are summarized in table 1. The adapted classifier C_RT−A (trained on the RT data set cut to the specific electrode montage of the ERP-BCI or CNT data set) achieves an error of 13.3% on the ERP-BCI data and an error of 14.0% on the CNT data set.
The classification performance can be improved by a re-training on labeled data from the same study, but the effect is small. We observe an error of 9.3% on the RT data set, an error of 9.6% on the ERP-BCI data set and an error of 13.1% on the CNT data set. This improved performance is due to two effects: first, adjusting feature thresholds for the specific study may improve the performance of each feature. For example, a re-training of the 8–13 Hz feature of the CNT data set decreased its error from 33.3% to 18.0%. Second, feature weights adjust such that more discriminative features obtain a higher weight. Interestingly, after re-training both C_ERP and C_CNT primarily use one of the two pattern features: C_ERP focuses mostly on the current density norm feature, while C_CNT is strongly based on the range within pattern feature.
3.3. Effect on BCI performance
The upper plots of figure 3 show scatter plots of BCI performance with and without preceding ICA artifact cleaning for the three analyzed BCI paradigms. For ERP-BCI, BCI performance decreased slightly from 69.4% to 68.3% (t(20) = −2.43, p = 0.03, d = 0.21). On average, 44 components were retained and 16 artifactual components were removed. There was no significant change in overall MI-CSP performance
Figure 3. Upper plots: effect of artifact correction for three BCI paradigms. Dots over the diagonal indicate participants, whose data
improved in classification performance (in per cent correct trials), dots below indicate participants whose performance decreased by the
correction. Changes are strongest for the paradigm MI-LRP, which is most sensitive to eye artifacts. For this paradigm, participants (A) and
(B) are highlighted, which undergo relatively strong changes. Lower plots: effect of artifact cleaning for participants (A) and (B). Top row:
average activity of selected channels for left trials (blue) and right trials (green). The four upper scalp plots indicate the spatial distribution
of average activity (in μV) for one or two time intervals (in columns) and for left and right trials (upper and lower scalp plots). Lowest scalp
plots indicate the spatial distribution of class-discriminative information (as signed r^2 values) per interval. For participant A, a dominating
eye artifact could be removed, which led to an increase in the SNR and in classification performance. For participant B, very little
class-discriminant signal remained after artifact cleaning.
Table 1. Feature weight vectors w and test errors (MSE) for three data sets (RT, ERP-BCI and CNT) and three classification strategies (fixed
classifier C_RT, adapted classifier C_RT−A and study-specific classifiers C_ERP, C_CNT). Test errors are reported for the 6 single features and for the
combined classification. The fixed classifier is trained on the RT train data set. The adapted classifier is trained on the RT train data set cut to
the specific electrode montage. The study-specific classifiers are trained on data from the same study and evaluated with
leave-one-subject-out CV.
Columns: Current density norm, Range within pattern, Local skewness, λ, 8–13 Hz, FitError, Combined.
(t(79) = −0.50, p = 0.62, d = 0.04) which remained constant at ≈72% after the removal of on average 18 artifactual components (69 components were kept). In both BCI systems, the effect per subject was small.
The strongest changes were observed for the MI-LRP paradigm, which is most prone to eye artifacts due to the focus on low-frequency signal components. Note that as feedback was provided with a moving cursor, eye activity
Figure 4. Screen shot of the MARA plug-in applied to EEGLAB sample data.
may be correlated with the two classes. On average, nine components were retained and ten artifactual components were removed. While the mean BCI accuracy remained constant at ≈60% (t(79) = 0.23, p = 0.82, d = 0.03), the performance of each participant varied considerably. The lower plots of figure 3 highlight the effect of the artifact rejection for two example participants. Without artifact rejection, both participants mainly use eye artifacts for BCI control (frontal class-discriminative activation). The effect of artifact removal can be twofold. For participant A, eye artifacts obstruct the underlying neural activity, and the system’s accuracy improved upon artifact cleaning from 66.3% to 73.6% due to an improved signal-to-noise level. In participant B, very little class-discriminant activity remained after the eye activity was removed. BCI classification dropped considerably from 91.3% to 64.0%.
4. Discussion
To summarize, we have analyzed the robustness properties of our recently proposed artifact classification method and proposed a strategy to handle a wide range of electrode setups. The proposed adapted strategy fully automates the time-consuming rating of artifactual ICs and reliably identified multiple types of artifacts from 35 participants and 3 EEG paradigms.
IC classification performance of three strategies was evaluated against expert ratings. We showed that our simplest automatic fixed strategy (train the classifier once, then apply to other setups) exhibits sensitivity to drastically reduced electrode setups. As a solution, we proposed the adapted strategy which recomputes the training features based on the specific electrode montage of the test sets. Using this relatively inexpensive strategy (no hand-labeling is involved), artifact classification generalizes well even on very reduced electrode setups.
For comparison, a re-training of the classifier using labor-intensively gained hand-labeled ICs from every new study was analyzed (strategy study-specific). While avoiding some generalization issues in theory, it is prohibitively expensive in most practical situations and only achieved a performance gain of a few per cent compared with the adapted strategy.
We therefore recommend the adapted strategy for artifact classification. It generalized robustly even to completely novel EEG paradigms, with its IC classification performance (13.3% MSE on auditory ERP data and 14.0% MSE on auditory listening data) staying on a similar level as inter-expert disagreements (often above 10% [34, 39]). This classification error is remarkably low given that the studies have been recorded with half the number of electrodes, used different ICA methods and contained different proportions of artifactual components.
We provide the ready-to-use artifact classifier to the community as an open-source EEGLAB plug-in called MARA (multiple artifact rejection algorithm). MARA automatically adapts to novel channel setups and its output is designed to support the experimenter in his or her decisions: a semi-automatic mode allows for visual inspection of components and for changing the classifier’s proposed ratings. Figure 4 shows an example screen shot of the visual inspection menu. The plug-in is published under the
General Public License (GPL) and can be downloaded from www.user.tu-berlin.de/irene.winkler/artifacts/.

BCI practitioners may find the application of MARA on BCI data sets of particular interest. We used the adapted strategy to analyze how ICA artifact cleaning impacts on single-trial BCI performance of three different BCI paradigms. In all three paradigms, we were able to remove artifactual activity while maintaining the average BCI performance.

On the single-subject level, the effect of artifact cleaning depends on whether artifacts mask the relevant neural activity or serve as a control signal for BCI. While artifact cleaning had little influence on an auditory ERP speller and on oscillatory motor imagery data analyzed with CSP, we observed strong effects for a paradigm known to be heavily affected by eye artifacts, the use of slow motor-related potentials. Here our analysis suggests that artifact removal by MARA or similar tools may drastically improve the safety and reliability of results, as they guarantee that rejected artifacts are not utilized mistakenly to control the BCI system.

Acknowledgments

We would like to thank Stefan Haufe for providing the code for the Current Density Norm feature, Claudia Sannelli and Stefan Haufe for their help with recording and preparing the RT data set, Anna Kuhlen for providing the manual labels of the CNT dataset, the authors of [50] for providing the motor imagery data set, Martijn Schreuder for his help with recording and preparing the auditory ERP-BCI data set, and Klaus-Robert Müller, Daniel Bartz and Andrew Dowding for helpful comments on the manuscript. Last but not least, we would like to thank our reviewers for their valuable comments. This work is supported by the European ICT Programme (Project FP7-224631 TOBI), by the German Federal Ministry for Education and Research (BMBF) (grant 01GQ0850), by the Federal State of Berlin, and by the BrainLinks-BrainTools Cluster of Excellence (DFG, grant number EXC 1086). This paper only reflects the authors’ views and funding agencies are not liable for any use that may be made of the information contained herein.

References

[1] Tangermann M et al 2012 Review of the BCI competition IV Front. Neurosci. 6 55
[2] Kübler A, Nijboer F, Mellinger J, Vaughan T M, Pawelzik H, Schalk G, McFarland D J, Birbaumer N and Wolpaw J R 2005 Patients with ALS can use sensorimotor rhythms to operate a brain–computer interface Neurology 64 1775–7
[3] Hill N J et al 2006 Classifying EEG and ECoG signals without subject training for fast BCI implementation: comparison of nonparalyzed and completely paralyzed subjects IEEE Trans. Neural Syst. Rehabil. Eng. 14 183–6
[4] Conradi J, Blankertz B, Tangermann M, Kunzmann V and Curio G 2009 Brain–computer interfacing in tetraplegic patients with high spinal cord injury Int. J. Bioelectromagn. 11 65–8
[5] Millán J del R et al 2010 Combining brain–computer interfaces and assistive technologies: state-of-the-art and challenges Front. Neurosci. 4 161
[6] Pfurtscheller G, Allison B Z, Brunner C, Bauernfeind G, Solis-Escalante T, Scherer R, Zander T O, Mueller-Putz G, Neuper C and Birbaumer N 2010 The hybrid BCI Front. Neurosci. 4 42
[7] Gramann K, Gwin J T, Ferris D P, Oie K, Jung T-P, Lin C-T, Liao L-D and Makeig S 2011 Cognition in action: imaging brain/body dynamics in mobile humans Rev. Neurosci. 22 593–608
[8] Debener S, Minow F, Emkes R, Gandras K and de Vos M 2012 How about taking a low-cost, small, and wireless EEG for a walk? Psychophysiology 49 1617–21
[9] Castermans T, Duvinage M, Petieau M, Hoellinger T, De Saedeleer C, Seetharaman K, Bengoetxea A, Cheron G and Dutoit T 2011 Optimizing the performances of a P300-based brain–computer-interface in ambulatory conditions IEEE J. Emerg. Sel. Top. Circuits Syst. 4 566–77
[10] Hari R and Kujala M V 2009 Brain basis of human social interaction: from concepts to brain imaging Physiol. Rev. 89 453–79
[11] Tangermann M, Krauledat M, Grzeska K, Sagebaum M, Vidaurre C, Blankertz B and Müller K-R 2009 Playing pinball with non-invasive BCI Advances in Neural Information Processing Systems 21, December 8–11, 2008 ed D Koller, D Schuurmans, Y Bengio and L Bottou (Vancouver, BC: MIT Press) pp 1641–8
[12] Blankertz B et al 2010 The Berlin brain–computer interface: non-medical uses of BCI technology Front. Neurosci. 4 198
[13] van Erp J, Lotte F and Tangermann M 2012 Brain–computer interfaces: beyond medical applications Computer 45 26–34
[14] Fatourechi M, Bashashati A, Ward R K and Birch G E 2007 EMG and EOG artifacts in brain computer interface systems: a survey Clin. Neurophysiol. 118 480–94
[15] Makeig S, Bell A J, Jung T-P and Sejnowski T J 1996 Independent component analysis of electroencephalographic data Advances in Neural Information Processing Systems vol 8 ed D S Touretzky, M C Mozer and M E Hasselmo (Cambridge, MA: MIT Press) pp 145–51
[16] Jung T-P, Makeig S, Humphries C, Lee T-W, Mckeown M J, Iragui V and Sejnowski T J 2000 Removing electroencephalographic artifacts by blind source separation Psychophysiology 37 163–78
[17] Fitzgibbon S P, Powers D M W, Pope K J and Richard Clark C 2007 Removal of EEG noise and artifact using blind source separation Clin. Neurophysiol. 24 232–43
[18] Romero S, Mañanas M A and Barbanoj M J 2008 A comparative study of automatic techniques for ocular artifact reduction in spontaneous EEG signals based on clinical target variables: a simulation case Comput. Biol. Med. 38 348–60
[19] Crespo-Garcia M, Atienza M and Cantero J L 2008 Muscle artifact removal from human sleep EEG by using independent component analysis Ann. Biomed. Eng. 36 467–75
[20] McMenamin B W, Shackman A J, Maxwell J S, Bachhuber D R W, Koppenhaver A M, Greischar L L and Davidson R J 2010 Validation of ICA-based myogenic artifact correction for scalp and source-localized EEG NeuroImage 49 2416–32
[21] Campos Viola F, De Vos M, Hine J, Sandmann P, Bleeck S, Eyles J and Debener S 2012 Semi-automatic attenuation of cochlear implant artifacts for the evaluation of late auditory evoked potentials Hear. Res. 284 6–15
[22] Gwin J T, Gramann K, Makeig S and Ferris D P 2010 Removal of movement artifact from high-density EEG recorded during walking and running J. Neurophysiol. 103 3526–34
[23] Meinecke F, Ziehe A, Kawanabe M and Müller K-R 2002 Resampling approach to estimate the stability of one- or multidimensional independent components IEEE Trans. Biomed. Eng. 49 1514–25
[24] Choi S, Cichocki A, Park H-M and Lee S-Y 2005 Blind source separation and independent component analysis: a review Neural Inform. Process. Lett. Rev. 6 1–57
[25] Romero S, Mañanas M A, Riba J, Morte A, Giménez S, Clos S and Barbanoj M J 2004 Evaluation of an automatic ocular filtering method for awake spontaneous EEG signals based on independent component analysis EMBS: 26th Annu. Int. Conf. Engineering in Medicine and Biology Society pp 925–8
[26] Shoker L, Sanei S and Chambers J 2005 Artifact removal from electroencephalograms using a hybrid BSS-SVM algorithm IEEE Signal Process. Lett. 12 721–4
[27] Gómez-Herrero G, De Clercq W, Anwar H, Kara O, Egiazarian K, Van Huffel S and Van Paesschen W 2006 Automatic removal of ocular artifacts in the EEG without an EOG reference channel NORSIG: Proc. 7th Nordic Signal Processing Symp. pp 130–3
[28] Campos Viola F, Thorne J, Edmonds B, Schneider T, Eichele T and Debener S 2009 Semi-automatic identification of independent components representing EEG artifact Clin. Neurophysiol. 120 868–77
[29] Mognon A, Jovicich J, Bruzzone L and Buiatti M 2010 ADJUST: an automatic EEG artifact detector based on the joint use of spatial and temporal features Psychophysiology 48 229–40
[30] Bigdely-Shamlo N, Kreutz-Delgado K, Kothe C and Makeig S 2013 Eyecatch: data-mining over half a million EEG independent components to construct a fully-automated eye-component detector Proc. 35th Annu. Int. Conf. IEEE Engineering in Medicine and Biology Society pp 5845–8
[31] Frøhlich L, Andersen T S and Mørup M 2013 Classification of independent components of EEG into multiple artifact classes BaCI Conf. Abstract: Conf. on Basic and Clinical Multimodal Imaging
[32] LeVan P, Urrestarazu E and Gotman J 2006 A system for automatic artifact removal in ictal scalp EEG based on independent component analysis and Bayesian classification Clin. Neurophysiol. 117 912–27
[33] Halder S, Bensch M, Mellinger J, Bogdan M, Kübler A, Birbaumer N and Rosenstiel W 2007 Online artifact removal for brain–computer interfaces using support vector machines and blind source separation Comput. Intell. Neurosci. 7 1–10
[34] Winkler I, Haufe S and Tangermann M 2011 Automatic classification of artifactual ICA-components for artifact removal in EEG signals Behav. Brain Funct. 7 30
[35] Asadi Ghanbari A, Kousarrize M R N, Teshnehlab M and Aliyari M 2009 An evolutionary artifact rejection method for brain computer interface using ICA Int. J. Electr. Comput. Sci. 9 48–53
[36] Bartels G, Shi L-C and Lu B-L 2010 Automatic artifact removal from EEG – a mixed approach based on double blind source separation and support vector machine EMBC’10: Annu. Int. Conf. IEEE Engineering in Medicine and Biology Society pp 5383–6
[37] Delorme A and Makeig S 2004 EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis J. Neurosci. Methods 134 9–21
[38] Shackman A J, McMenamin B W, Slagter H A, Maxwell J S, Greischar L L and Davidson R J 2009 Electromyogenic artifacts and electroencephalographic inferences Brain Topography 22 7–12
[39] Klekowicz H, Malinowska U, Piotrowska A J, Wolynczyk-Gmaj D, Niemcewicz S and Durka P 2009 On the robust parametric detection of EEG artifacts in polysomnographic recordings Neuroinformatics 7 147–60
[40] Kuhlen A K, Allefeld C and Haynes J-D 2012 Content-specific coordination of listeners’ to speakers’ EEG during communication Front. Human Neurosci. 6 266
[41] Fonov V, Evans A C, McKinstry R C, Almli C R and Louis Collins D 2009 Unbiased nonlinear average age-appropriate brain templates from birth to adulthood NeuroImage 47 (Suppl. 1) S102
[42] Fonov V, Evans A C, Botteron K, Almli C R, McKinstry R C and Louis Collins D 2011 Unbiased average age-appropriate atlases for pediatric studies NeuroImage 54 313–27
[43] Nolte G and Dassios G 2005 Analytic expansion of the EEG lead field for realistic volume conductors Phys. Med. Biol. 50 3807–23
[44] Hämäläinen M S and Ilmoniemi R J 1994 Interpreting magnetic fields of the brain: minimum-norm estimates Med. Biol. Eng. Comput. 32 35–42
[45] Haufe S, Nikulin V V, Ziehe A, Müller K-R and Nolte G 2008 Combining sparsity and rotational invariance in EEG/MEG source reconstruction NeuroImage 42 726–38
[46] Ziehe A, Müller K-R, Nolte G, Mackert B-M and Curio G 2000 Artifact reduction in magnetoneurography based on time-delayed second-order correlations IEEE Trans. Biomed. Eng. 47 75–87
[47] Hyvärinen A and Oja E 1997 A fast fixed-point algorithm for independent component analysis Neural Comput. 9 1483–92
[48] Schreuder M, Rost T and Tangermann M 2011 Listen, you are writing! Speeding up online spelling with a dynamic auditory BCI Front. Neurosci. 5 112
[49] Blankertz B, Lemm S, Treder M S, Haufe S and Müller K-R 2011 Single-trial analysis and classification of ERP components – a tutorial NeuroImage 56 814–25
[50] Blankertz B, Sannelli C, Halder S, Hammer E M, Kübler A, Müller K-R, Curio G and Dickhaus T 2010 Neurophysiological predictor of SMR-based BCI performance NeuroImage 51 1303–9
[51] Blankertz B, Tomioka R, Lemm S, Kawanabe M and Müller K-R 2008 Optimizing spatial filters for robust EEG single-trial analysis IEEE Signal Proc. Mag. 25 41–56
[52] Krauledat M, Dornhege G, Blankertz B, Losch F, Curio G and Müller K-R 2004 Improving speed and accuracy of brain–computer interfaces using readiness potential features IEMBS’04: 26th Annu. Int. Conf. IEEE Engineering in Medicine and Biology Society vol 2 pp 4511–5