
BCI challenge: Error potential detection with cross-subject generalisation
Duncan Barrack
Nottingham, UK
[email protected]
March 10, 2015

1 Summary
In this document I outline my solution, which was awarded first prize, to the brain-computer interface (BCI)
challenge hosted on Kaggle [1]. The competition used data collected in the study conducted by Perrin et al.
[6], and the aim was to form a statistical model to predict errors in the spelling task of the P-300 speller [3].
During a P-300 trial users are presented with letters and numbers and are tasked with spelling words. Using
electroencephalogram (EEG) data, the speller attempts to determine which letter the user is thinking of and
presents this letter to the user. If the presented letter matches the one the user was thinking of then the
feedback is regarded as positive, otherwise the feedback is negative. The challenge is to predict, based on
the user’s response to the feedback event, whether the feedback was positive or negative.

The EEG (56 channels) and electrooculography (EOG, 1 channel) data of sixteen subjects who took part in
340 P-300 trials are provided as a training set. Ten other subjects make up the test set, for which it is not known
whether the feedback was positive or negative. The efficacy of the predictive models submitted by competition
entrants is measured against this test set. Two subjects make up the score used for the public leader-board
and eight for the private leader-board. It is the latter which is used for the final standings.

For the solution presented in this work, I focused on the 1.3 s-long EEG and EOG signals after feedback,
which contain the event-related potentials (ERPs) associated with the user’s response to the feedback event. I
engineered several features from the ERP signals, including the means of the EEG values for each channel in
windows of various lengths and lags, features based on template matching, as well as meta features such as trial
time-stamp, trial session number, etc. To predict the probability of a positive feedback event in the test set I
took the arithmetic mean of the posterior probabilities of two regularised support vector machines which were
trained on different feature sets using linear kernels. This approach scored an area under the receiver operating
characteristic curve of 0.76921 on the private leader-board.

2 Feature Selection / Extraction


Raw EEG and EOG signals were bandpass filtered between 1 and 20 Hz using a fifth order Butterworth filter
[5]. The features I used were predominantly based on material I came across in the academic P-300 speller
literature and are described below.
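
The filtering itself was carried out in Matlab (Section 5), but the operation is simple to reproduce. Below is a
minimal Python sketch using scipy.signal (which is not among the listed dependencies); the function name and
the 200 Hz sampling rate are assumptions for illustration and should be checked against the dataset documentation.

import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_1_20hz(signals, fs=200.0, order=5):
    # Fifth order Butterworth bandpass between 1 and 20 Hz, applied
    # channel-wise with zero-phase filtering. `signals` has shape
    # (n_samples, n_channels); fs = 200 Hz is an assumed sampling rate.
    nyq = 0.5 * fs
    b, a = butter(order, [1.0 / nyq, 20.0 / nyq], btype="band")
    return filtfilt(b, a, signals, axis=0)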

2.1 Feature set A - Meta features


I used the trial number and trial session number as features, as well as whether the trial was long or short. The
motivation behind this was that as a user progressed through the trials they would begin to fatigue and make
more mistakes; session and trial number could act as proxies for this. Furthermore, trials were spread over
five sessions, and session five differed from the other sessions in that, if the P-300 speller incorrectly predicted
the feedback letter, it would present the user with the next most probable letter. It was possible to infer
whether this had happened by looking at the times between feedback events (longer times tended to correspond
with ‘retrials’, where the initial prediction of the P-300 speller was incorrect). I used this as a feature.
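
As an illustration, a minimal numpy sketch of the retrial flag is given below; the write-up does not state the
cut-off separating ordinary gaps from retrial gaps, so the threshold is a hypothetical parameter.

import numpy as np

def retrial_flags(feedback_times, threshold):
    # feedback_times: chronologically ordered time stamps (in seconds)
    # of the feedback events in a session 5 recording. Events preceded
    # by a gap longer than the hypothetical `threshold` are flagged as
    # likely retrials; the first event has no preceding gap and is
    # never flagged.
    gaps = np.diff(feedback_times, prepend=feedback_times[0])
    return gaps > threshold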

2.2 Feature set B - Mean EEG values in windows of different lengths and lags
Based on a video lecture given by Müller, Blankertz et al. [2], I took the mean of the EEG and EOG recordings
after the feedback event over windows of a number of different lengths (50 ms to 650 ms at increments of 50 ms)
and lags (0 ms to 1250 ms at increments of 50 ms), keeping only windows lying entirely within the 1.3 s epoch.
This was done for each channel separately, giving 260 window/lag combinations per channel and a total of
14820 features across the 57 channels.
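
A minimal numpy sketch of these features is shown below, assuming each post-feedback epoch is an array of
shape (n_channels, n_samples) sampled at an assumed 200 Hz; the function name is illustrative.

import numpy as np

def windowed_means(epoch, fs=200):
    # Mean amplitude per channel for every (lag, length) window that
    # fits inside the 1.3 s epoch. Lengths run from 50 ms to 650 ms and
    # lags from 0 ms to 1250 ms, both in 50 ms steps, giving 260
    # combinations per channel.
    n_samples = epoch.shape[1]
    feats = []
    for lag_ms in range(0, 1300, 50):
        for len_ms in range(50, 700, 50):
            start = lag_ms * fs // 1000
            stop = (lag_ms + len_ms) * fs // 1000
            if stop <= n_samples:
                feats.append(epoch[:, start:stop].mean(axis=1))
    return np.concatenate(feats)  # 260 * n_channels values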

2.3 Feature set C - Template matching based features


Inspired by Smulders et al. [7], I averaged all ERP signals labelled as positive feedback in the training set to create
a ‘positive feedback template’, and similarly created a ‘negative feedback template’. This was done separately
for each channel. I then used the correlation coefficient, maximum cross-correlation coefficient at lags of up to
200 ms, covariance, maximum cross-covariance at lags of up to 200 ms and Euclidean distance between the
templates and test ERP signals as features. Additionally, I used the difference between the correlation coefficients
obtained using the positive and negative feedback templates (and similarly for the maximum cross-correlation
coefficients), the ratio of the covariance obtained using the positive feedback template to that obtained using the
negative template (and similarly for the cross-covariance), as well as the ratio between the Euclidean distance
obtained using the positive template and that obtained using the negative feedback template. This
gave a total of 855 additional features (15 per channel across the 57 channels).
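
The sketch below illustrates a representative subset of these features per channel (correlation, Euclidean
distance, and their difference and ratio); the cross-correlation and cross-covariance features are omitted for
brevity, and all names are illustrative.

import numpy as np

def template_features(epoch, pos_tpl, neg_tpl):
    # epoch, pos_tpl, neg_tpl: arrays of shape (n_channels, n_samples).
    # The templates are channel-wise averages of the positive- and
    # negative-feedback training epochs.
    feats = []
    for ch in range(epoch.shape[0]):
        x = epoch[ch]
        r_pos = np.corrcoef(x, pos_tpl[ch])[0, 1]
        r_neg = np.corrcoef(x, neg_tpl[ch])[0, 1]
        d_pos = np.linalg.norm(x - pos_tpl[ch])
        d_neg = np.linalg.norm(x - neg_tpl[ch])
        feats += [r_pos, r_neg, r_pos - r_neg, d_pos, d_neg, d_pos / d_neg]
    return np.array(feats)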

3 Modeling Techniques and Training


3.1 Cross validation
I used 4-fold ‘subject-wise’ cross validation (CV) and calculated the area under the receiver operating
characteristic curve (AUC) [4] for the four subjects in the test fold. I repeated my CV procedure five times with
different splits and took the average of the 20 AUC scores produced (5 repetitions × 4 folds). For the template-based
features (feature set C) I ensured that only the data within the training folds was used to form the templates,
so as not to bias the results.
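
A sketch of the subject-wise CV loop is given below, written against the modern scikit-learn API (GroupKFold
keeps all of a subject’s trials in one fold); the original solution used scikit-learn 0.14.1, whose grouping utilities
differ, so this is illustrative rather than the code actually run.

import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold
from sklearn.svm import SVC

def subject_wise_auc(X, y, subjects, n_folds=4):
    # X: trials x features, y: feedback labels, subjects: subject id
    # per trial. GroupKFold guarantees that each test fold contains
    # only subjects unseen during training. Template features (set C)
    # would need rebuilding inside each training fold to avoid leakage.
    scores = []
    for tr, te in GroupKFold(n_splits=n_folds).split(X, y, groups=subjects):
        model = SVC(kernel="linear", probability=True).fit(X[tr], y[tr])
        scores.append(roc_auc_score(y[te], model.predict_proba(X[te])[:, 1]))
    return np.mean(scores)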

3.2 Model selection


I tried a number of different machine learning algorithms, including gradient tree boosting, random forests,
support vector machines (SVMs) and logistic regression with elastic net regularisation [4], using different sets of
features. The two models which gave the highest CV scores were an SVM with a linear kernel using feature sets
A and B, and an SVM with a linear kernel using feature sets A and C. For these models L2 regularisation was used,
with regularisation parameters set via CV. Furthermore, rather than use all of feature set B, a sub-set of 7410
features, accounting for half of the total generated, was used. This sub-set was determined using a logistic regression
model with ridge regression (penalty term set via CV) [4] and taking the 7410 features which had the highest
coefficient values. Although models using the sub-set had similar CV scores to models using all of the features,
the variance in the AUC CV scores was greatly reduced when the sub-set was used.
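
The sub-setting step could look like the sketch below; whether the ranking used signed or absolute coefficient
values is not stated in the write-up, so absolute values are assumed here, and C stands in for the CV-chosen penalty.

import numpy as np
from sklearn.linear_model import LogisticRegression

def top_features_by_ridge(X_b, y, n_keep=7410, C=1.0):
    # Fit an L2 (ridge) penalised logistic regression on feature set B
    # and keep the n_keep features with the largest coefficients
    # (absolute values are an assumption). In practice C would be set
    # via cross validation rather than fixed.
    clf = LogisticRegression(penalty="l2", C=C, max_iter=1000).fit(X_b, y)
    return np.argsort(np.abs(clf.coef_[0]))[::-1][:n_keep]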

3.3 Model averaging


My final model was formed by taking a weighted average of the posterior probabilities produced by the best
two linear SVM models described in Section 3.2. The weights were set via CV (see Figure 1).
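
The averaging step amounts to a one-line blend; the sketch below also shows how the weight could be chosen
by sweeping candidate values against held-out CV predictions, with p_ab, p_ac and y_cv standing in for the
two models’ out-of-fold probabilities and the true labels.

import numpy as np
from sklearn.metrics import roc_auc_score

def blend(p_ab, p_ac, w):
    # Weighted average of the positive-class probabilities from the
    # feature sets A+B model and the feature sets A+C model; the CV
    # sweep in Figure 1 selected w = 0.46 for the A+B model.
    return w * p_ab + (1.0 - w) * p_ac

def pick_weight(p_ab, p_ac, y_cv, n_grid=101):
    # Choose the weight maximising AUC on the CV predictions.
    grid = np.linspace(0.0, 1.0, n_grid)
    return max(grid, key=lambda w: roc_auc_score(y_cv, blend(p_ab, p_ac, w)))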

4 Code Description
The code used is available at https://github.com/duncan-barrack/kaggle_BCI_challenge. The repository
contains seven programs, which are described below.

• generate_meta_features.m - generates all of the features in set A (Section 2.1) except the features based
on session 5 retrials.

• get_sess5_retrial_features.m - generates the remaining features in set A.

• get_ave_amplitude_features.m - produces the features in set B (Section 2.2).

• get_template_features1.m - gives all of the features in set C (Section 2.3) except those based on cross-
correlation and cross-covariance.

• get_template_features2.m - generates the remaining features in set C.

• train_model.py - trains the two SVM models described in Section 3.2 using the training data.

• predict.py - combines the predictions of the SVM models by taking a weighted average and produces a
Submission.csv file.

5 Dependencies
To generate the features I used Matlab R2014b. To fit the model and make predictions, Python 2.7.6 was used
with scikit-learn 0.14.1, numpy 1.8.2 and pandas 0.13.1.

6 How To Generate the Solution


After setting the paths in the SETTINGS.json file, generate the features by running the following Matlab
scripts in order: generate_meta_features.m, get_sess5_retrial_features.m, get_ave_amplitude_features.m,
get_template_features1.m and get_template_features2.m. To train the model run train_model.py and,
finally, to make predictions on the test set run predict.py.

7 Figures

[Figure 1 appears here: a plot of AUC CV score (y-axis, 0.60 to 0.80) against the relative weight of the SVM
model using feature sets A and B (x-axis, 0.0 to 1.0).]

Figure 1: Plot showing the CV scores when the posterior probabilities from a linear SVM model which uses
feature sets A and B and a linear SVM model which uses feature sets A and C (both described in Section 3.2)
are averaged using different weights. Here the maximum CV score ≈ 0.75 is obtained by applying a relative
weight of 0.46 to the results of the SVM model which uses feature sets A and B. This weighting was used to
predict the labels of the test set.

References
[1] BCI Challenge @ NER 2015. https://www.kaggle.com/c/inria-bci-challenge. Accessed: January
2015.
[2] Machine Learning and Signal Processing Tools for BCI. http://videolectures.net/bbci09_blankertz_
muller_mlasp/. Accessed: January 2015.
[3] Lawrence Ashley Farwell and Emanuel Donchin. Talking off the top of your head: toward a mental prosthesis
utilizing event-related brain potentials. Electroencephalography and Clinical Neurophysiology, 70(6):510–523,
1988.

[4] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The elements of statistical learning, volume 2.
Springer, 2009.
[5] B. P. Lathi. Signal processing and linear systems. Oxford University Press, USA, 1998.
[6] Margaux Perrin, Emmanuel Maby, Sébastien Daligault, Olivier Bertrand, and Jérémie Mattout. Objective
and subjective evaluation of online error correction during P300-based spelling. Advances in Human-
Computer Interaction, 2012:4, 2012.

[7] Fren TY Smulders, JL Kenemans, and A Kok. A comparison of different methods for estimating single-trial
P300 latencies. Electroencephalography and Clinical Neurophysiology/Evoked Potentials Section, 92(2):107–
114, 1994.
