On Sequence Kernels For SVM Classification of Sets of Vectors
Jérôme Louradour
Kernels for sequence applied to Speaker Verification with SVM Jérôme Louradour 1/29
Introduction
Classical approach
[Diagram: after the front-end, a GMM is trained for the target speaker and a UBM ("background" model) is trained on impostor data. A test sequence is scored by the classifier, and decision making outputs "target speaker" or "impostor".]
+ Theoretical power
+ Core algorithm well mastered
+ Good performance for binary classification
Sequence kernels
Outline
1 Sequence kernels
Sequence kernels
Principle
Basics on kernels
Similarity measure
Mercer property: symmetric, positive definite
⇒ $k(x, y) = \phi(x)^\top \phi(y)$
$\phi$: expansion into a feature space $\mathbb{R}^D$ (dimension $D \le +\infty$)
Example for vectors: Gaussian kernel
$k(x, y) = \exp\left( -\frac{\|x - y\|^2}{2\rho^2} \right)$
[Figure: successive panels labeled small $\rho$, high $\rho$, and $\rho = \sqrt{d}\,\sigma$]
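As a rough illustration of the Gaussian kernel above, the following sketch evaluates it on a small batch of vectors with NumPy. The function name `gaussian_kernel` and the toy data are assumptions made for this example; they do not come from the slides.

```python
import numpy as np

def gaussian_kernel(A, B, rho):
    """Gram matrix of k(x, y) = exp(-||x - y||^2 / (2 rho^2)).

    A: (n, d) array, B: (m, d) array, rho: kernel width (> 0).
    """
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    sq = np.maximum(sq, 0.0)  # clip tiny negative values caused by rounding
    return np.exp(-sq / (2.0 * rho ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # toy vectors, hypothetical data

K_small = gaussian_kernel(X, X, rho=0.1)    # small rho: nearly diagonal Gram matrix
K_high = gaussian_kernel(X, X, rho=100.0)   # high rho: nearly constant Gram matrix
```

With a small width, each vector is similar only to itself; with a large width, all vectors look alike. An intermediate value is therefore needed for the kernel to be informative.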
[Figure: GMM with 2 Gaussians]
[Diagram: learning maps the sequences X, Y to the densities $p_X$, $p_Y$]
Sequences of vectors:
$X = \{ x_t \mid t = 1 \cdots T_X \}$
$Y = \{ y_{t'} \mid t' = 1 \cdots T_Y \}$
From a kernel between vectors, $k(x, y) = \phi(x)^\top \phi(y)$, to a kernel between sets of vectors, $\kappa(X, Y)$
Mercer kernel in terms of $k(x_t, y_{t'})$, $k(x_t, x_{t'})$, $k(y_t, y_{t'})$, ...
Simple example:
$\kappa(X, Y) = \frac{1}{T_X T_Y} \sum_{t=1}^{T_X} \sum_{t'=1}^{T_Y} k(x_t, y_{t'})$
complexity $O(T^2)$ for each kernel computation
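The simple example above can be sketched in a few lines of NumPy. The Gaussian base kernel, the helper names `gaussian_gram` and `mean_sequence_kernel`, and the random sequences are assumptions chosen for illustration only.

```python
import numpy as np

def gaussian_gram(A, B, rho):
    """Pairwise Gaussian kernel values between rows of A and rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * rho ** 2))

def mean_sequence_kernel(X, Y, rho):
    """kappa(X, Y) = 1/(T_X T_Y) * sum_t sum_t' k(x_t, y_t').

    X: (T_X, d) sequence, Y: (T_Y, d) sequence. The full T_X x T_Y
    table of vector-kernel values is built, hence the O(T^2) cost.
    """
    return gaussian_gram(X, Y, rho).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))         # hypothetical sequence, T_X = 50
Y = rng.normal(size=(80, 4)) + 0.5   # hypothetical sequence, T_Y = 80

kappa_xy = mean_sequence_kernel(X, Y, rho=2.0)
kappa_yx = mean_sequence_kernel(Y, X, rho=2.0)
```

Every kernel evaluation touches all pairs of frames, which is what makes this form costly for long sequences.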
$\kappa(X, Y) = \left( \frac{1}{T_X} \sum_t \phi_q(x_t) \right)^\top S_B^{-1} \left( \frac{1}{T_Y} \sum_{t'} \phi_q(y_{t'}) \right)$
Normalizing with $S_B$:
- GLDS kernel ∼ train on X (discriminant model) & test on Y
- Same amplitude for each feature
Explicit map:
+ High efficiency for testing (linear SVM model)
– An expansion $\phi_q$ of high or infinite dimension cannot be used (in practice, max degree q = 3)
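To make the explicit-map computation concrete, here is a minimal sketch with a monomial expansion of degree q = 3. It assumes that $\phi_q$ collects all monomials up to degree q and that $S_B$ is the second-moment matrix of the expanded background vectors; the slides state neither, and the names `poly_expand` and `glds_kernel` are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_expand(X, q):
    """Monomials of degree <= q of the columns of X; returns a (T, D) array."""
    T, d = X.shape
    cols = [np.ones(T)]  # degree-0 term
    for deg in range(1, q + 1):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.stack(cols, axis=1)

def glds_kernel(X, Y, S_B, q):
    """(mean phi_q(X))^T S_B^{-1} (mean phi_q(Y)) with an explicit map."""
    phi_X = poly_expand(X, q).mean(axis=0)
    phi_Y = poly_expand(Y, q).mean(axis=0)
    return phi_X @ np.linalg.solve(S_B, phi_Y)

rng = np.random.default_rng(0)
q = 3
B = rng.normal(size=(500, 3))        # hypothetical background vectors
X = rng.normal(size=(60, 3))
Y = rng.normal(size=(40, 3)) + 0.2

# Assumption: S_B is the second-moment matrix of the background expansions.
Phi_B = poly_expand(B, q)
S_B = Phi_B.T @ Phi_B / len(B)

kappa = glds_kernel(X, Y, S_B, q)

# "Train on X, test on Y": the model reduces to a single weight vector w.
w = np.linalg.solve(S_B, poly_expand(X, q).mean(axis=0))
score = w @ poly_expand(Y, q).mean(axis=0)
```

For 3-dimensional inputs and q = 3 the expansion already has 20 components, and the size grows combinatorially with the input dimension and the degree. This is why the explicit map is limited to low degrees.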
A novel sequence kernel
Outline
1 Sequence kernels
A novel sequence kernel
Definition
Objective:
- avoid computing $\phi$
- rewrite using the Mercer kernel $k = \phi^\top \phi$
FSMS kernels (Feature Space Mahalanobis Sequence kernels)
$\kappa(X, Y) = \left( \frac{1}{T_X} \sum_t \phi(x_t) \right)^\top \left( \Sigma_B + \varepsilon I \right)^{-1} \left( \frac{1}{T_Y} \sum_{t'} \phi(y_{t'}) \right)$
$\Sigma_B$: covariance matrix of $\phi$
SVMs are invariant to translations in the feature space
⇒ same as FSNS with centering of $\phi$
kernel ∼ KL divergence between Gaussians in the feature space
Dual Form
Proposition
$\kappa(X, Y) = \left( \frac{1}{T_X} \sum_t \phi(x_t) \right)^\top \left( S_B + \varepsilon I \right)^{-1} \left( \frac{1}{T_Y} \sum_{t'} \phi(y_{t'}) \right)$
$= \left( \frac{1}{T_X} \sum_t \psi_B(x_t) \right)^\top \left( \frac{1}{N} K^2 + \varepsilon K \right)^{-1} \left( \frac{1}{T_Y} \sum_{t'} \psi_B(y_{t'}) \right)$
With centering of $\phi$ ($\Sigma_B$ in place of $S_B$):
$= \left( \frac{1}{T_X} \sum_t \psi_B(x_t) \right)^\top \left( \frac{1}{N} K \Pi K + \varepsilon K \right)^{-1} \left( \frac{1}{T_Y} \sum_{t'} \psi_B(y_{t'}) \right)$
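The equality between the primal and dual forms can be checked numerically on a toy problem where $\phi$ is small enough to write out. The sketch below assumes $S_B = \frac{1}{N} \sum_i \phi(b_i) \phi(b_i)^\top$ and $\psi_B(x) = (k(b_1, x), \ldots, k(b_N, x))^\top$, and uses a hypothetical degree-2 map on 2-D data. In this setup the mean expansions lie in the span of the background expansions (N > D), which is the case where the two forms coincide. Because K is then rank-deficient, a pseudo-inverse stands in for the inverse.

```python
import numpy as np

def phi(X):
    """Explicit degree-2 map for 2-D inputs: [1, x1, x2, x1^2, x1*x2, x2^2]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.stack([np.ones(len(X)), x1, x2, x1 ** 2, x1 * x2, x2 ** 2], axis=1)

def k(A, C):
    """Vector kernel k(a, c) = phi(a)^T phi(c), returned as a Gram matrix."""
    return phi(A) @ phi(C).T

rng = np.random.default_rng(0)
B = rng.normal(size=(40, 2))         # hypothetical background set, N = 40
X = rng.normal(size=(30, 2))
Y = rng.normal(size=(25, 2)) + 0.5
N, eps = len(B), 0.1

# Primal form: needs phi explicitly.
Phi_B = phi(B)
S_B = Phi_B.T @ Phi_B / N            # assumed definition of S_B
phi_X, phi_Y = phi(X).mean(axis=0), phi(Y).mean(axis=0)
primal = phi_X @ np.linalg.solve(S_B + eps * np.eye(S_B.shape[0]), phi_Y)

# Dual form: only kernel evaluations against the background set.
K = k(B, B)
psi_X = k(B, X).mean(axis=1)         # (1/T_X) sum_t psi_B(x_t)
psi_Y = k(B, Y).mean(axis=1)
A = K @ K / N + eps * K
# K has rank 6 < N here, so A is singular: use a pseudo-inverse.
dual = psi_X @ np.linalg.pinv(A, rcond=1e-10) @ psi_Y
```

The dual computation never calls `phi` directly, so the same code path applies to kernels whose feature space is infinite-dimensional, such as the Gaussian kernel.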
Computational complexity
Dot product of normalized expansions
$\kappa(X, Y) = \phi(X)^\top M_B\, \phi(Y) = \langle U\phi(X), U\phi(Y) \rangle$ (1)
$\kappa(X, Y) = \psi_B(X)^\top R_B\, \psi_B(Y) = \langle V\psi_B(X), V\psi_B(Y) \rangle$ (2)
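Form (2) means that each sequence can be mapped once to a normalized vector, after which the kernel is an ordinary dot product. A minimal sketch, assuming $R_B = (\frac{1}{N} K^2 + \varepsilon K)^{-1}$ with a Gaussian base kernel and taking V from a Cholesky factorization (one possible choice of factor, not necessarily the one used by the author):

```python
import numpy as np

def gaussian_gram(A, B, rho):
    """Pairwise Gaussian kernel values between rows of A and rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * rho ** 2))

rng = np.random.default_rng(0)
B = rng.normal(size=(30, 5))         # hypothetical background set
X = rng.normal(size=(40, 5))
Y = rng.normal(size=(35, 5)) + 0.3
N, eps, rho = len(B), 0.1, 1.0

K = gaussian_gram(B, B, rho)
A = K @ K / N + eps * K              # assumed: R_B = A^{-1}
A = (A + A.T) / 2                    # remove rounding asymmetry
C = np.linalg.cholesky(A)            # A = C C^T, hence R_B = V^T V with V = C^{-1}

def psi(S):
    """psi_B(S): mean over the sequence of the empirical kernel map."""
    return gaussian_gram(B, S, rho).mean(axis=1)

def normalized_map(S):
    """V psi_B(S), computed once per sequence."""
    return np.linalg.solve(C, psi(S))

u_X, u_Y = normalized_map(X), normalized_map(Y)
kappa_linear = u_X @ u_Y                              # form (2) as a dot product
kappa_direct = psi(X) @ np.linalg.solve(A, psi(Y))    # psi^T R_B psi
```

Once the sequences are mapped, a linear SVM can be trained on the vectors `u_X`, `u_Y`, and so on. The remaining cost is dominated by the size N of the empirical map.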
Kernel Approximation
Goal
1 Reduce the size of the empirical map $\psi$
2 Keep a maximum of information
1 Selection of a sub-population of background vectors:
codebook $C = \{ b_{p_1} \cdots b_{p_i} \cdots b_{p_m} \} \subset B$ (size m)
index $I = \{ p_1 \cdots p_i \cdots p_m \} \subset \{ 1 \cdots N \}$
2 Low-rank approximation of the Gram matrix:
$K \approx L_I = K(:, I)\, K(I, I)^{-1} K(:, I)^\top$ (rank m)
$\min_I \operatorname{tr}(K - L_I) \equiv \min_C \sum_i \| \phi(b_i) - \phi_C(b_i) \|^2$
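One common way to pick the index set I greedily is a pivoted incomplete Cholesky factorization, which at each step selects the background vector with the largest remaining approximation error. The sketch below is a generic version of that heuristic, written under the assumption of a Gaussian base kernel on random data. It is not claimed to be the exact selection procedure behind the slides, and the function name is hypothetical.

```python
import numpy as np

def pivoted_incomplete_cholesky(K, m):
    """Greedy rank-m factor G (N, m) with K ~= G G^T, plus the pivot index I."""
    N = K.shape[0]
    G = np.zeros((N, m))
    d = np.diag(K).astype(float).copy()   # diagonal of the residual K - G G^T
    I = []
    for j in range(m):
        p = int(np.argmax(d))             # vector with the largest residual
        I.append(p)
        G[:, j] = (K[:, p] - G[:, :j] @ G[p, :j]) / np.sqrt(d[p])
        d -= G[:, j] ** 2                 # update the residual diagonal
    return G, I

rng = np.random.default_rng(0)
B = rng.normal(size=(200, 3))             # hypothetical background vectors
sq = ((B[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-sq / 2.0)                     # Gaussian Gram matrix with rho = 1

G, I = pivoted_incomplete_cholesky(K, m=15)

# The factor reproduces the low-rank form K(:, I) K(I, I)^{-1} K(:, I)^T.
L_I = K[:, I] @ np.linalg.solve(K[np.ix_(I, I)], K[:, I].T)
residual = np.trace(K - L_I)              # the criterion tr(K - L_I)
```

The residual trace shrinks as m grows, so m trades approximation quality against the size of the reduced map.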
[Figure: background data and codebook]
Proposition
ICD (incomplete Cholesky decomposition): $K \to K(:, I)\, K(I, I)^{-1} K(:, I)^\top$
Expansion of size $m \ll N$:
$\psi_C(X) = \frac{1}{T_X} \sum_t \begin{pmatrix} k(b_{p_1}, x_t) \\ \vdots \\ k(b_{p_m}, x_t) \end{pmatrix}$
$R_{B \times C} = \left( \frac{1}{N} K(:, I)^\top \Pi\, K(:, I) + \varepsilon K(I, I) \right)^{-1}$
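The reduced kernel can be sketched as follows. Two assumptions are made here that the slides do not spell out: $\Pi$ is taken to be the centering matrix $I - \frac{1}{N} 1 1^\top$, and the normalization matrix is written as $K(:, I)^\top \Pi K(:, I)$ so that it is m × m and matches the size of $\psi_C$. The codebook choice (every third background vector) is purely illustrative.

```python
import numpy as np

def gaussian_gram(A, B, rho):
    """Pairwise Gaussian kernel values between rows of A and rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * rho ** 2))

def approx_fsms_kernel(X, Y, B, I, eps, rho):
    """psi_C(X)^T R_{BxC} psi_C(Y) with a codebook C = B[I] of size m."""
    N = len(B)
    C = B[I]                               # codebook vectors b_{p_1} ... b_{p_m}
    K_BI = gaussian_gram(B, C, rho)        # K(:, I), shape (N, m)
    K_II = gaussian_gram(C, C, rho)        # K(I, I), shape (m, m)
    K_c = K_BI - K_BI.mean(axis=0)         # Pi K(:, I), assuming Pi centers columns
    M = K_c.T @ K_c / N + eps * K_II       # inverse of R_{BxC}
    psi_X = gaussian_gram(C, X, rho).mean(axis=1)   # psi_C(X), size m
    psi_Y = gaussian_gram(C, Y, rho).mean(axis=1)
    return psi_X @ np.linalg.solve(M, psi_Y)

rng = np.random.default_rng(0)
B = rng.normal(size=(30, 5))               # hypothetical background set
X = rng.normal(size=(40, 5))
Y = rng.normal(size=(35, 5)) + 0.3
N, eps, rho = len(B), 0.1, 1.0

I = np.arange(0, N, 3)                     # hypothetical codebook index, m = 10
kappa_approx = approx_fsms_kernel(X, Y, B, I, eps, rho)
```

When I covers all N background vectors, the function reduces to the full centered dual form. Only m kernel evaluations per frame are needed instead of N.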
Experimental evaluation on a speaker verification task
Outline
1 Sequence kernels
Experimental evaluation on a speaker verification task
Data
Speech corpus: NIST Speaker Recognition Evaluation
Development :
Background corpus (1/2)
Validation corpus (2/2)
- Hyper-parameters of kernels
- Parameter C (SVM learning)
- Decision threshold
Evaluation :
∼ 18000 tests, 400 target speakers
$\mathrm{DCF} = \tau_{FR}\, P_{\mathrm{target}}\, \mathrm{FR}\% + \tau_{FA}\, P_{\mathrm{imp}}\, \mathrm{FA}\%$
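The detection cost can be computed from raw scores as in the sketch below. The cost weights (10 and 1), the target-speaker prior (0.01), the relation $P_{\mathrm{imp}} = 1 - P_{\mathrm{target}}$ and the toy scores are assumptions chosen for illustration, since the slides do not give numerical values. Error rates are expressed as fractions rather than percentages.

```python
import numpy as np

def error_rates(target_scores, impostor_scores, threshold):
    """False rejection and false acceptance rates, as fractions in [0, 1]."""
    fr = float(np.mean(np.asarray(target_scores) < threshold))
    fa = float(np.mean(np.asarray(impostor_scores) >= threshold))
    return fr, fa

def dcf(fr, fa, tau_fr, tau_fa, p_target):
    """DCF = tau_FR * P_target * FR + tau_FA * P_imp * FA, with P_imp = 1 - P_target."""
    return tau_fr * p_target * fr + tau_fa * (1.0 - p_target) * fa

# Toy scores (hypothetical): higher means "more likely the target speaker".
target_scores = [2.0, 1.5, 0.2, 3.0]
impostor_scores = [-1.0, 0.5, 0.1, -2.0, 1.8]

fr, fa = error_rates(target_scores, impostor_scores, threshold=1.0)
cost = dcf(fr, fa, tau_fr=10.0, tau_fa=1.0, p_target=0.01)
```

Here one of four target trials is rejected and one of five impostor trials is accepted, giving 10 × 0.01 × 0.25 + 1 × 0.99 × 0.2 = 0.223. The decision threshold is the quantity tuned on the validation corpus.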
Front-end processing
                   SVM                       GMM
acoustic vectors   MFCC + ∆MFCC + ∆logE      LFCC + ∆LFCC + ∆logE
silence removal    unsupervised clustering of the energy
normalization      feature warping           centering and scaling
System                            EER     DCF
Probability product kernel SVM    13.92   52.1
GLDS kernel SVM                   12.54   48.8
Fisher kernel SVM                 11.90   44.0
FSNS kernel SVM                   11.91   41.6
UBM-GMM (ref)                     12.06   40.6
GMM supervectors SVM              10.40   37.7
Evaluation: Fusion
System                            EER     DCF
(1) FSNS kernel SVM               11.91   41.6
(2) UBM-GMM                       12.06   40.6
(3) GMM supervectors SVM          10.40   37.7
(2+3) fusion                      no improvement
(1+2) fusion                      9.71    37.0
(1+3) fusion                      10.28   36.1
Kernel between pairs of sequences for speaker verification
Outline
1 Sequence kernels
Kernel between pairs of sequences for speaker verification
Principle
[Diagram: a pair-of-sequences model is trained on pairs labeled +1 (same speaker, =) or -1 (different speakers, ≠). At test time, a pair of sequences goes through the front-end and the model answers "same speaker?" with +1 or -1.]
1 Map pairs of GMMs into a suitable vector space
2 Use a vector kernel
[Figure: pairs of dissimilar GMMs and pairs of similar GMMs]
Evaluation
System                            EER     DCF
Probability product kernel SVM    13.92   52.1
GLDS kernel SVM                   12.54   48.8
Pair-of-Sequences kernel SVM      11.58   46.0
Fisher kernel SVM                 11.90   44.0
FSNS kernel SVM                   11.91   41.6
UBM-GMM (ref)                     12.06   40.6
GMM supervectors SVM              10.40   37.7
Experiments
Evaluation
System                            EER     DCF
Pair-of-Sequences kernel SVM      11.58   46.0
FSNS kernel SVM                   11.91   41.6
Fusion                            10.93   40.5
Conclusion
Perspectives