Spike sorting based on multi-class support vector machine with superposition resolution
DOI 10.1007/s11517-007-0248-0
ORIGINAL ARTICLE
Received: 9 October 2006 / Accepted: 20 August 2007 / Published online: 15 September 2007
International Federation for Medical and Biological Engineering 2007
W. Ding · J. Yuan (✉)
Department of Automation, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, People's Republic of China
e-mail: [email protected]

W. Ding
e-mail: [email protected]

Abstract A new spike sorting method based on the support vector machine (SVM) is proposed to resolve the superposition problem. Spike superposition is generally resolved by template matching. Previous template matching methods separate the spikes through linear classifiers, whose classification performance is severely degraded by the background noise in spike trains. Nonlinear classifiers with high generalization ability are required for this task. A multi-class SVM classifier, which consists of several binary SVM classifiers, is therefore applied to separate the spikes. Each binary SVM classifier corresponds to one spike class and is used to identify single and superposition spikes. The superposition spikes are decomposed through template extraction. Experimental results on simulated and real data demonstrate the utility of the proposed method.

Keywords Spike sorting · Template matching · Multi-class · Support vector machine · Extracellular recording

1 Introduction

The extracellular recording of neural spikes has become an indispensable technique in modern neuroscience research. Neural activity recordings often contain spikes from several neurons adjacent to one electrode. The spikes must be classified before further analysis is carried out. Methods for spike sorting have been studied extensively over the past three decades [13, 14]. Template matching is one category of spike sorting methods. It can be implemented in real time with superposition resolution. Conventionally, template matching methods identify spikes by measuring the statistical distance between a detected spike and a given template. If the distance is smaller than a predetermined threshold, the detected spike is assigned to the corresponding class. Superposition is resolved by iterative subtraction of all possible template combinations from the unidentified waveforms [2, 12]. Satisfactory sorting results can be achieved when the background noise is Gaussian or the signal-to-noise ratio (SNR) is high. In practice, the background noise can be more complicated [7, 15]. As an alternative, neural network classifiers have been applied to template matching [1, 4, 10]; they resolve the superposition problem by introducing overlapping spike waveforms into the training data. Nevertheless, neural networks are prone to local optima during training, so the quality and consistency of the converged solution are not guaranteed. Recently, Takahashi et al. [17–19] introduced independent component analysis (ICA) into spike sorting. It can resolve spike superposition for tetrode data. This method requires that the recordings of neighboring electrodes contain the same spikes from different neurons, which is not always satisfied for extracellular recordings with a multi-electrode array. Zhang et al. [22] proposed an automatic overlap decomposition method using the χ² test, which is only suitable for Gaussian noise. Wang et al. [21] tried to resolve the superposition problem in the frequency domain. Unfortunately, the spikes generated by real neurons often occupy frequency bands similar to those of the background noise.
140 Med Bio Eng Comput (2008) 46:139–145
The support vector machine (SVM) has been introduced to bioinformatics and spike sorting because of its excellent generalization ability [3, 6, 11, 16, 20]. However, the spike sorting method using SVM [20] could neither identify multi-class spikes nor decompose overlapping spikes. In this study, a new method based on multi-class SVM is proposed to classify neural spikes and resolve the superposition problem. The multi-class SVM is able to identify neural spikes accurately even when the background noise is non-Gaussian or when the SNR is very low. The detected spikes that are not assigned to any class by the multi-class SVM are treated as superposition spikes. They are further decomposed by template extraction in the SVM classification process.

2 Methods

2.1 Multi-class support vector machine

The existing multi-class SVM classifiers include "one-against-all", "one-against-one" and Directed Acyclic Graph SVM [9]. In tests on various neural activity recordings acquired with a multi-electrode array, the "one-against-all" method shows better classification performance and is therefore used as the classifier. "One-against-all" combines several binary SVM classifiers. For data with k classes, k binary SVM classifiers need to be constructed. The ith binary SVM is trained with the examples in the ith class as positive examples and the examples in the other classes as negative examples. Thus, given l training data $(x_1, y_1), \ldots, (x_l, y_l)$, where $x_j \in R^n$, $j = 1, \ldots, l$ and $y_j \in \{1, \ldots, k\}$ is the class of $x_j$, the ith SVM solves the following problem:

$$
\begin{aligned}
\min_{\omega^i,\, b^i,\, \xi^i} \quad & \frac{1}{2} (\omega^i)^T \omega^i + C \sum_{j=1}^{l} \xi_j^i \\
& (\omega^i)^T \phi(x_j) + b^i \ge 1 - \xi_j^i, \quad \text{if } y_j = i, \\
& (\omega^i)^T \phi(x_j) + b^i \le -1 + \xi_j^i, \quad \text{if } y_j \ne i, \\
& \xi_j^i \ge 0, \quad j = 1, \ldots, l,
\end{aligned}
\tag{1}
$$

where the training data $x_j$ are mapped to a higher dimensional space by the function $\phi$. The penalty term $C \sum_{j=1}^{l} \xi_j^i$ is used to reduce training errors, with C as the penalty parameter.

After Eq. (1) is solved, k decision functions are obtained:

$$
\begin{aligned}
& (\omega^1)^T \phi(x) + b^1 \\
& \qquad \vdots \\
& (\omega^k)^T \phi(x) + b^k
\end{aligned}
\tag{2}
$$

The input data x is in the class which has the largest value of the decision function:

$$
\text{class of } x \equiv \arg\max_{i = 1, \ldots, k} \left( (\omega^i)^T \phi(x) + b^i \right)
$$

There are k l-variable quadratic programming problems to be solved. The Gaussian RBF kernel is adopted in all of the binary SVM classifiers.

2.2 Classification and superposition resolution

The detected spikes are analyzed by the fuzzy C-means clustering algorithm (FCM) [23] to obtain training data and templates. The spikes with high membership degrees to each cluster are used as training data. The cluster centers are used as the templates. The training data contain approximately a tenth of all the detected spike waveforms for each class. For k clusters, k binary SVM classifiers need to be designed. The ith SVM classifier is trained with positive examples from the ith cluster and negative examples from the other clusters. Every binary SVM classifier corresponds to one template.

In order to distinguish the superposition spikes from the others, the training data with negative labels should contain some ambiguous spike waveforms. These spike waveforms represent the superposition and severely distorted spikes. In the classification process, the superposition spikes are identified by thresholding the output of the multi-class SVM. Let ε be the minimum output of the multi-class SVM over all the training data with positive labels. By experiment, we found that spikes are outliers or superposition spikes if the SVM outputs are less than 0.75ε. Therefore, 0.75ε is set as the predetermined threshold.

Once the SVM classifiers have been trained, the detected spikes are classified by them. If the SVM outputs for a spike are less than 0.75ε, the spike is regarded as a superposition or severely distorted spike. Such spikes are then decomposed through template extraction in the classification process.

The process of template extraction is as follows. Let N be the total number of templates, M the number of sample points of the spike waveforms, ds(i) the distorted spike waveform and template(i − m, j) the template j with shift m. The waveform extracted by the templates is:

$$
x(i) = ds(i) - \mathrm{template}(i - m, j) \tag{3}
$$

where i = 1, …, M, j = 1, …, N, m = −M + 1, …, M − 1 and template(i − m, j) = 0 if i − m ≤ 0 or i − m > M.

If the output of the kth binary SVM classifier for the waveform x(i) is larger than the predetermined threshold, the distorted spike is regarded as the superposition of class k and class j at the mth sample point. Otherwise the process of template extraction continues, and the input for the SVM classifier is:
$$
x(i) = ds(i) - \mathrm{template}(i - m, j) - \mathrm{template}(i - m', j') \tag{4}
$$

where j ≠ j'.

If none of the outputs is larger than the predetermined threshold at the end of the whole extraction, the waveform is regarded as a severely distorted spike.

2.3 Data collection and spike detection

The real data were acquired with a multi-electrode array (MEA, Multi Channel System MCS GmbH, Germany) on isolated, newly hatched chicken retinas [5]. Multi-unit photoresponses were recorded from all 60 electrodes simultaneously and amplified with a 60-channel amplifier (single-ended amplifiers, bandwidth 10 Hz to 3.4 kHz, amplification 1,200). The recording was digitized by a commercial multiplexed data acquisition system (MCRack), which sampled the incoming data at a rate of 20 kHz.

The spikes were detected when the amplitude of the recorded signal exceeded a positive or negative threshold. The threshold was determined by the standard deviation of the pure neuronal noise trace.

3 Results

To verify the classification performance, the proposed method is applied to simulated and real spike data.

3.1 Application to simulated data

The simulated spike trains are constructed by embedding four templates at various locations in noise traces. The templates are the average waveforms of spikes that have been completely classified in our experiments. The noise is taken from a segment of real recording with the spikes removed. All the spikes are represented as waveforms of 40 sample points (see Fig. 1).

We compare the performance of the proposed method with the methods of Bankman [2], Lewicki [12], Chandra [4] and Zhang [22] under various SNRs. The noise has the same statistical characteristics as the real noise from extracellular recordings, and only its variance differs. The SNR is defined as the peak-to-peak value of the spikes with minimum amplitude divided by the root mean square value of the noise.

Figure 2 shows the correct classification rate under different SNRs. For each of the five methods, the parameters that give the better classification performance are selected. The spike train contains about 2,000 single spikes in Fig. 2a. More than 1,500 overlapping spikes of two classes are classified in Fig. 2b. Figure 2c shows the classification results for 1,100 overlapping spikes of three classes. These spike waveforms overlap at different sample points.

The correct classification rate of the proposed method is 7% higher than that of the other four methods in the three situations. A linear classifier is used in Bankman's method. Lewicki's method and Zhang's method improve the classification performance by using the statistical information of the background noise. When the distribution of the background noise is non-Gaussian or the spike clusters overlap, the performance of these three methods is not satisfactory. Chandra's method overcomes these problems by adopting a neural network classifier and obtains better classification results for single spikes. This method is mainly used to classify spike waveforms in real time. It separates overlapping spikes by introducing overlapping spike waveforms into the training data. As the training data belonging to different classes can overlap in Chandra's method, the correct classification rate for superposition spikes becomes very low (Fig. 2b, c). The proposed method improves the classification performance by using a nonlinear classifier and resolves the superposition problem through template extraction, and thus has better classification performance.
The classification results of the proposed method for the overlapping of two spikes are better than those for three spikes, as shown in Fig. 2. When four or more spike waveforms overlap, the background noise becomes very complicated. None of the five methods can accurately decompose the overlapping spikes, even when the SNR is very high.

3.2 Application to real data

The proposed method is applied to real data recorded from an isolated, newly hatched chicken retina. A total of 1,685 spikes are detected by setting the threshold (21 μV) at 3.1 standard deviations of the background noise, estimated from the idle periods of the recording (Fig. 3). After analyzing these spikes by FCM, three clusters are identified, which means that three ganglion cells contribute to the recording. Three templates are then reconstructed (Fig. 3b) and the appropriate training data are selected.

The detected spikes are decomposed by the multi-class SVM. Accordingly, 353 spikes are classified as class I, 672 as class II and 415 as class III; 89 spikes are overlaps of classes I and II, 67 of classes I and III, 72 of classes II and III, and 17 of all three classes. As the number of spikes in class II is the largest and the number of spikes in class I is the smallest, the firing rates of neurons I and III are lower than that of neuron II. Therefore, the overlapping of classes I and III is less common than the other overlaps. However, the arrival times of spikes generated by neuron I are closer to the spike arrival times of neuron II. Although the firing rate of neuron III is higher than that of neuron I, the overlapping of classes I and II is more common than that of classes II and III.

The representative waveforms and auto-correlation histograms for the separated spikes are shown in Fig. 4. The clear refractory period in each auto-correlation histogram indicates that the spikes belonging to one class are generated by a single neuron. Figure 5 shows the cross-correlation histograms between all possible pairs of neurons for the real data. The flat cross-correlation histograms indicate that the spikes of different classes are not produced by the same bursting neuron.

Figure 6 shows examples of the superposition resolution. The overlapping spikes are detected by the multi-class SVM. They are then decomposed by template extraction in the SVM classification process.

In order to analyze the distribution of the background noise in the real data, the Mahalanobis distances from the spike waveforms to their mean are calculated [8, 15]. These spike waveforms have been well separated and are generated by one unit, as shown in Fig. 4. The quantile–quantile plot in Fig. 7 compares the empirical distributions with χ²-distributions. Under the assumption of normally distributed background noise, the squared Mahalanobis distance follows a χ²-distribution. However, the empirical distribution and the χ²-distribution show significant discrepancies, so the real background noise is not exactly Gaussian. Suitable methods are therefore required to identify the spikes under non-Gaussian background noise.
Fig. 5 Cross-correlation histograms of the real data for unit I and II (a), for unit I and III (b), and for unit II and III (c)
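A cross-correlation histogram of the kind shown in Fig. 5 can be computed as sketched below. This is an illustrative sketch, not the procedure used for the figure: the function name `cross_correlogram`, the use of integer sample indices as spike times, and the lag window and bin width are all assumptions made here.

```python
from typing import List, Sequence


def cross_correlogram(ref_times: Sequence[int],
                      target_times: Sequence[int],
                      max_lag: int,
                      bin_width: int) -> List[int]:
    """Histogram of lags (target minus reference) over all spike pairs.

    Spike times, max_lag and bin_width are integer sample counts.
    Bin b covers lags in [-max_lag + b * bin_width, -max_lag + (b + 1) * bin_width).
    """
    n_bins = -(-2 * max_lag // bin_width)  # ceiling division
    counts = [0] * n_bins
    for t_ref in ref_times:
        for t in target_times:
            lag = t - t_ref
            if -max_lag <= lag < max_lag:
                counts[(lag + max_lag) // bin_width] += 1
    return counts
```

Passing the same spike train as both arguments gives an auto-correlation histogram; in that case the zero-lag bin also counts each spike paired with itself and is usually ignored when looking for the refractory period.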
noises are introduced in separating overlapping spikes, which will impact the superposition resolution. The choice of appropriate training data and templates is therefore critical to ensure the successful application of the proposed method.

Acknowledgments This study is supported by the National Natural Science Foundation of China (Grant No. 60574038) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20060248015).

References

1. Argoud FI, De Azevedo FM, Neto JM, Grillo E (2006) SADE3: an effective system for automated detection of epileptiform events in long-term EEG based on context information. Med Biol Eng Comput 44(6):459–470
2. Bankman IN, Johnson KO, Schneider W (1993) Optimal detection, classification, and superposition resolution in neural waveform recordings. IEEE Trans Biomed Eng 40(8):836–841
3. Boostani R, Graimann B, Moradi MH, Pfurtscheller G (2007) A comparison approach toward finding the best feature and classifier in cue-based BCI. Med Biol Eng Comput 45:403–412
4. Chandra R, Optican LM (1997) Detection, classification, and superposition resolution of action potentials in multiunit single-channel recordings by an on-line real-time neural network. IEEE Trans Biomed Eng 44:403–412
5. Chen AH, Zhou Y, Gong HQ, Liang PJ (2003) Chicken retinal ganglion cells response characteristics: multi-channel electrode recording study. Sci China Ser C 33:82–88
6. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines. Cambridge University Press, Cambridge
7. Fee MS, Mitra PP, Kleinfeld D (1996) Variability of extracellular spike waveforms of cortical neurons. J Neurophysiol 76:3823–3833
8. Harris KD, Henze DA, Csicsvari J, Hirase H (2000) Accuracy of tetrode spike separation as determined by simultaneous intracellular and extracellular measurements. J Neurophysiol 84:401–414
9. Hsu CW, Lin CJ (2002) A comparison on methods for multi-class support vector machines. IEEE Trans Neural Netw 13:415–425
10. Kim KH, Kim SJ (2000) Neural spike sorting under nearly 0-dB signal-to-noise ratio using nonlinear energy operator and artificial neural-network classifier. IEEE Trans Biomed Eng 47:1406–1411
11. Kim KH, Kim SS, Kim SJ (2006) Improvement of spike train decoder under spike detection and classification errors using support vector machine. Med Biol Eng Comput 44:124–130
12. Lewicki MS (1994) Bayesian modeling and classification of neural signals. Neural Comput 6:1005–1030
13. Lewicki MS (1998) A review of methods for spike sorting: the detection and classification of neural action potentials. Netw Comput Neural Syst 9:53–78
14. Schmidt EM (1984) Computer separation of multi-unit neuroelectric data: a review. J Neurosci Methods 12:95–111
15. Shoham S, Fellows MR, Normann R (2003) A robust, automatic spike sorting using mixtures of multivariate t-distributions. J Neurosci Methods 127:111–122
16. Sun S, Zhang C (2006) Adaptive feature extraction for EEG signal classification. Med Biol Eng Comput 44:931–935
17. Takahashi S, Anzai Y, Sakurai Y (2003) Automatic sorting for multi-neuronal activity recorded with tetrodes in the presence of overlapping spikes. J Neurophysiol 89:2245–2258
18. Takahashi S, Sakurai Y (2005) Real-time and automatic sorting of multi-neuronal activity for sub-millisecond interactions in vivo. Neuroscience 134:301–315
19. Takahashi S, Sakurai Y, Tsukada M, Anzai Y (2002) Classification of neuronal activities from tetrode recordings using independent component analysis. Neurocomputing 49:289–298
20. Vogelstein RJ, Murari K, Thakur PH, Cauwenberghs G, Chakrabartty S, Diehl C (2004) Spike sorting with support vector machines. In: Proceedings of 26th annual international conference on IEEE engineering in medicine and biology society
21. Wang GL, Zhou Y, Chen AH, Zhang PM, Liang PJ (2006) A robust method for spike sorting with automatic overlap decomposition. IEEE Trans Biomed Eng 53:1195–1198
22. Zhang PM, Wu JY, Zhou Y, Liang PJ, Yuan JQ (2004) Spike sorting based on automatic template reconstruction with a partial solution to the overlapping problem. J Neurosci Methods 135:55–65
23. Zouridakis G, Tam DC (2000) Identification of reliable spike templates in multi-unit extracellular recording using fuzzy clustering. Comput Methods Programs Biomed 61:91–98