
Machine Learning

(機器學習)
Lecture 10: Support Vector Machine (1)
Hsuan-Tien Lin (林軒田)
[email protected]

Department of Computer Science & Information Engineering
National Taiwan University
(國立台灣大學資訊工程系)

Hsuan-Tien Lin (NTU CSIE) Machine Learning 0/42


Support Vector Machine (1)

Roadmap
1 When Can Machines Learn?
2 Why Can Machines Learn?
3 How Can Machines Learn?
4 How Can Machines Learn Better?
5 Embedding Numerous Features: Kernel Models

Lecture 10: Support Vector Machine (1)


Large-Margin Separating Hyperplane
Standard Large-Margin Problem
Support Vector Machine
Motivation of Dual SVM
Lagrange Dual SVM
Solving Dual SVM
Messages behind Dual SVM

Hsuan-Tien Lin (NTU CSIE) Machine Learning 1/42


Support Vector Machine (1) Large-Margin Separating Hyperplane

Linear Classification Revisited

PLA/pocket: h(x) = sign(s)

[figure: perceptron combining inputs x_0, x_1, x_2, ..., x_d into score s and output h(x);
 annotations: (linear separable), plausible err = 0/1, (small flipping noise), minimize specially]

linear (hyperplane) classifiers: h(x) = sign(w^T x)
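For concreteness, a minimal sketch (NumPy assumed; the weights and points below are made up) of what such a linear classifier computes:

```python
import numpy as np

def linear_classify(w, X):
    """h(x) = sign(w^T x) for each row x of X, returning +1/-1."""
    scores = X @ w                       # s = w^T x for every example
    return np.where(scores >= 0, 1, -1)  # sign(s), mapping 0 to +1

# hypothetical toy data; x_0 = 1 is the added bias coordinate
X = np.array([[1.0, 2.0, 3.0],
              [1.0, -1.0, 0.5]])
w = np.array([0.1, 1.0, -0.5])
print(linear_classify(w, X))             # [ 1 -1]
```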
Hsuan-Tien Lin (NTU CSIE) Machine Learning 2/42
Support Vector Machine (1) Large-Margin Separating Hyperplane

Which Line Is Best?

• PLA? depends on randomness


• VC bound? whichever you like!

E_out(w) ≤ E_in(w) + Ω(H), where E_in(w) = 0 and d_VC = d + 1

You? rightmost one, possibly :-)

Hsuan-Tien Lin (NTU CSIE) Machine Learning 3/42


Support Vector Machine (1) Large-Margin Separating Hyperplane

Why Rightmost Hyperplane?

informal argument
if (Gaussian-like) noise on future x ≈ xn :
x_n further from hyperplane ⇐⇒ tolerates more noise ⇐⇒ more robust to overfitting
distance to closest x_n ⇐⇒ amount of noise tolerance ⇐⇒ robustness of hyperplane

rightmost one: more robust


because of larger distance to closest xn

Hsuan-Tien Lin (NTU CSIE) Machine Learning 4/42


Support Vector Machine (1) Large-Margin Separating Hyperplane

Fat Hyperplane

• robust separating hyperplane: fat


—far from both sides of examples
• robustness ≡ fatness: distance to closest xn

goal: find fattest separating hyperplane


Hsuan-Tien Lin (NTU CSIE) Machine Learning 5/42
Support Vector Machine (1) Large-Margin Separating Hyperplane

Large-Margin Separating Hyperplane

max_w   fatness(w)
subject to   w classifies every (x_n, y_n) correctly
where   fatness(w) = min_{n=1,...,N} distance(x_n, w)

• fatness: formally called margin
• correctness: y_n = sign(w^T x_n), i.e. every y_n w^T x_n > 0

equivalently:

max_w   margin(w)
subject to   every y_n w^T x_n > 0
where   margin(w) = min_{n=1,...,N} distance(x_n, w)

goal: find largest-margin separating hyperplane
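A tiny sketch (NumPy assumed) of the correctness condition above, checking y_n w^T x_n > 0 for every example:

```python
import numpy as np

def separates(w, X, y):
    """True iff w classifies every (x_n, y_n) correctly: y_n * (w^T x_n) > 0 for all n."""
    return bool(np.all(y * (X @ w) > 0))
```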
Hsuan-Tien Lin (NTU CSIE) Machine Learning 6/42
Support Vector Machine (1) Large-Margin Separating Hyperplane

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 7/42


Support Vector Machine (1) Standard Large-Margin Problem

Distance to Hyperplane: Preliminary

max_w   margin(w)
subject to   every y_n w^T x_n > 0
where   margin(w) = min_{n=1,...,N} distance(x_n, w)

‘shorten’ x and w:
distance needs w_0 and (w_1, . . . , w_d) treated differently (to be derived)

b = w_0;  drop the x_0 = 1 coordinate;
w = (w_1, . . . , w_d)^T;  x = (x_1, . . . , x_d)^T

for this part: h(x) = sign(w^T x + b)


Hsuan-Tien Lin (NTU CSIE) Machine Learning 8/42
Support Vector Machine (1) Standard Large-Margin Problem

Distance to Hyperplane
want: distance(x, b, w), with hyperplane w^T x′ + b = 0

[figure: point x above the hyperplane, with x′, x′′ on the hyperplane, normal vector w, and dist(x, h)]

consider x′, x′′ on the hyperplane:
1  w^T x′ = −b,  w^T x′′ = −b
2  w ⊥ hyperplane:  w^T (x′′ − x′) = 0, since (x′′ − x′) is a vector on the hyperplane
3  distance = length of the projection of (x − x′) onto the hyperplane normal w

distance(x, b, w) = | (w^T / ‖w‖) (x − x′) | = |w^T x + b| / ‖w‖
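A minimal sketch of this distance formula (NumPy assumed; the hyperplane below is a made-up example):

```python
import numpy as np

def distance(x, b, w):
    """Distance from point x to the hyperplane w^T x' + b = 0."""
    return abs(w @ x + b) / np.linalg.norm(w)

# hypothetical hyperplane x_1 - x_2 - 1 = 0, i.e. w = (1, -1), b = -1
print(distance(np.array([3.0, 0.0]), b=-1.0, w=np.array([1.0, -1.0])))  # 2/sqrt(2) ~ 1.414
```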

Hsuan-Tien Lin (NTU CSIE) Machine Learning 9/42


Support Vector Machine (1) Standard Large-Margin Problem

Distance to Separating Hyperplane


distance(x, b, w) = |w^T x + b| / ‖w‖

• separating hyperplane: for every n, y_n (w^T x_n + b) > 0
• distance to separating hyperplane:
  distance(x_n, b, w) = y_n (w^T x_n + b) / ‖w‖

max_{b,w}   margin(b, w)
subject to   every y_n (w^T x_n + b) > 0
where   margin(b, w) = min_{n=1,...,N} y_n (w^T x_n + b) / ‖w‖
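The margin of a separating (b, w) is then just the smallest of these signed distances; a short sketch (NumPy assumed):

```python
import numpy as np

def margin(b, w, X, y):
    """margin(b, w) = min_n y_n (w^T x_n + b) / ||w||; positive only if (b, w) separates."""
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)
```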

Hsuan-Tien Lin (NTU CSIE) Machine Learning 10/42


Support Vector Machine (1) Standard Large-Margin Problem

Margin of Special Separating Hyperplane


max_{b,w}   margin(b, w)
subject to   every y_n (w^T x_n + b) > 0
where   margin(b, w) = min_{n=1,...,N} y_n (w^T x_n + b) / ‖w‖

• w^T x + b = 0 same as 3w^T x + 3b = 0: scaling does not matter
• special scaling: only consider separating (b, w) such that
  min_{n=1,...,N} y_n (w^T x_n + b) = 1  =⇒  margin(b, w) = 1/‖w‖

max_{b,w}   1/‖w‖
subject to   every y_n (w^T x_n + b) > 0
             min_{n=1,...,N} y_n (w^T x_n + b) = 1

Hsuan-Tien Lin (NTU CSIE) Machine Learning 11/42


Support Vector Machine (1) Standard Large-Margin Problem

Standard Large-Margin Hyperplane Problem


max_{b,w}  1/‖w‖   subject to   min_{n=1,...,N} y_n (w^T x_n + b) = 1

necessary constraints: y_n (w^T x_n + b) ≥ 1 for all n
original constraint: min_{n=1,...,N} y_n (w^T x_n + b) = 1

want: optimal (b, w) here (inside the relaxed constraint set)
if optimal (b, w) were outside, e.g. y_n (w^T x_n + b) > 1.126 for all n,
—can scale (b, w) to the “more optimal” (b/1.126, w/1.126) (contradiction!)

final change: max =⇒ min, remove the √ (use w^T w instead of ‖w‖), add 1/2

min_{b,w}  (1/2) w^T w
subject to  y_n (w^T x_n + b) ≥ 1 for all n
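A small sketch (NumPy assumed) of the special-scaling argument: rescale any separating (b, w) so that min_n y_n(w^T x_n + b) = 1, and the margin becomes exactly 1/‖w‖:

```python
import numpy as np

def special_scaling(b, w, X, y):
    """Rescale a separating (b, w) so that min_n y_n (w^T x_n + b) = 1."""
    scores = y * (X @ w + b)
    assert np.all(scores > 0), "not a separating hyperplane"
    c = scores.min()                     # min_n y_n (w^T x_n + b)
    b_s, w_s = b / c, w / c              # same hyperplane: scaling does not matter
    marg = np.min(y * (X @ w_s + b_s)) / np.linalg.norm(w_s)
    assert np.isclose(marg, 1.0 / np.linalg.norm(w_s))
    return b_s, w_s
```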


Hsuan-Tien Lin (NTU CSIE) Machine Learning 12/42
Support Vector Machine (1) Standard Large-Margin Problem

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 13/42


Support Vector Machine (1) Support Vector Machine

Support Vector Machine (SVM)

[figure: toy data and the fattest separating hyperplane in the (x_1, x_2) plane, with margin 0.707]

optimal solution: (w_1 = 1, w_2 = −1, b = −1)

margin(b, w) = 1/‖w‖ = 1/√2 ≈ 0.707
• examples on the boundary: ‘locate’ the fattest hyperplane;
  other examples: not needed
• call each boundary example a support vector (candidate)

support vector machine (SVM):
learn the fattest hyperplane
(with the help of support vectors)
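A quick numeric check of the quoted solution (the slide's data points are not reproduced here, so only the margin and a made-up test point are verified):

```python
import numpy as np

w = np.array([1.0, -1.0])                     # w_1 = 1, w_2 = -1
b = -1.0
print(1.0 / np.linalg.norm(w))                # 0.7071... = 1/sqrt(2), the margin
print(np.sign(w @ np.array([3.0, 0.0]) + b))  # classify a hypothetical point: 1.0
```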

Hsuan-Tien Lin (NTU CSIE) Machine Learning 14/42


Support Vector Machine (1) Support Vector Machine

Solving General SVM


min_{b,w}  (1/2) w^T w
subject to  y_n (w^T x_n + b) ≥ 1 for all n

• not easy manually, of course :-)


• gradient descent? not easy with constraints
• luckily:
• (convex) quadratic objective function of (b, w)
• linear constraints of (b, w)
—quadratic programming

quadratic programming (QP):


‘easy’ optimization problem

Hsuan-Tien Lin (NTU CSIE) Machine Learning 15/42


Support Vector Machine (1) Support Vector Machine

Quadratic Programming

optimal (b, w) = ?                               optimal u ← QP(Q, p, A, c)

min_{b,w}  (1/2) w^T w                           min_u  (1/2) u^T Q u + p^T u
subject to  y_n (w^T x_n + b) ≥ 1,               subject to  a_m^T u ≥ c_m,
            for n = 1, 2, . . . , N                          for m = 1, 2, . . . , M

objective function:  u = [b; w];  Q = [[0, 0_d^T]; [0_d, I_d]];  p = 0_{d+1}
constraints:  a_n^T = y_n [1, x_n^T];  c_n = 1;  M = N

SVM with general QP solver:


easy if you’ve read the manual :-)
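A sketch of reading that manual, assuming the cvxopt package as the general QP solver (its solvers.qp minimizes (1/2) u^T P u + q^T u subject to G u ≤ h, so the ‘≥ 1’ constraints are negated; the tiny ridge on the b entry and the toy data are assumptions for illustration):

```python
import numpy as np
from cvxopt import matrix, solvers

def hard_margin_svm_primal(X, y):
    """Linear hard-margin SVM primal as a QP over u = (b, w)."""
    N, d = X.shape
    P = np.zeros((d + 1, d + 1))
    P[1:, 1:] = np.eye(d)                  # Q = diag(0, I_d): no penalty on b
    P[0, 0] = 1e-8                         # tiny ridge on b for solver stability (assumption)
    q = np.zeros(d + 1)                    # p = 0_{d+1}
    # slide constraint y_n [1 x_n^T] u >= 1  ->  cvxopt form  -y_n [1 x_n^T] u <= -1
    G = -(y[:, None] * np.c_[np.ones(N), X])
    h = -np.ones(N)
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h))
    u = np.array(sol["x"]).ravel()
    return u[0], u[1:]                     # b, w

# hypothetical linearly separable toy data
X = np.array([[0.0, 0.0], [2.0, 2.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
b, w = hard_margin_svm_primal(X, y)
```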
Hsuan-Tien Lin (NTU CSIE) Machine Learning 16/42
Support Vector Machine (1) Support Vector Machine

SVM with QP Solver


Linear Hard-Margin SVM Algorithm
1  Q = [[0, 0_d^T]; [0_d, I_d]];  p = 0_{d+1};  a_n^T = y_n [1, x_n^T];  c_n = 1
2  [b; w] ← QP(Q, p, A, c)
3  return b & w as g_SVM

• hard-margin: nothing violates the ‘fat boundary’
• linear: x_n

want non-linear?
zn = Φ(xn )—remember? :-)
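For instance, a sketch of one possible Φ (an illustrative 2nd-order polynomial transform, not the lecture's specific choice), to be applied before calling the same QP routine sketched above:

```python
import numpy as np

def phi(X):
    """An illustrative 2nd-order polynomial transform of 2-d inputs."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.c_[x1, x2, x1 * x2, x1 ** 2, x2 ** 2]

# z_n = phi(x_n); feeding (phi(X), y) to the primal QP above yields a boundary
# that is non-linear in the original x-space.
```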

Hsuan-Tien Lin (NTU CSIE) Machine Learning 17/42


Support Vector Machine (1) Support Vector Machine

Why Large-Margin Hyperplane?

min_{b,w}  (1/2) w^T w
subject to  y_n (w^T z_n + b) ≥ 1 for all n

                 minimize    constraint
regularization   E_in        w^T w ≤ C
SVM              w^T w       E_in = 0 [and more]

SVM (large-margin hyperplane):


‘weight-decay regularization’ within Ein = 0

Hsuan-Tien Lin (NTU CSIE) Machine Learning 18/42


Support Vector Machine (1) Support Vector Machine

Large-Margin Restricts Dichotomies


consider ‘large-margin algorithm’ Aρ :
either returns g with margin(g) ≥ ρ (if exists), or 0 otherwise

A_0: like PLA =⇒ shatter ‘general’ 3 inputs

A_{1.126}: more strict than SVM =⇒ cannot shatter any 3 inputs

fewer dichotomies =⇒ smaller ‘VC dim.’ =⇒ better generalization

Hsuan-Tien Lin (NTU CSIE) Machine Learning 19/42


Support Vector Machine (1) Support Vector Machine

VC Dimension of Large-Margin Algorithm


fewer dichotomies =⇒ smaller ‘VC dim.’
considers dVC (Aρ ) [data-dependent, need more than VC]
instead of dVC (H) [data-independent, covered by VC]

generally, when X in radius-R hyperball:


d_VC(A_ρ) ≤ min(R²/ρ², d) + 1 ≤ d + 1 = d_VC(perceptrons)
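For example (numbers made up purely for illustration): with R = 1 and ρ = 0.5, R²/ρ² = 4, so d_VC(A_ρ) ≤ 5 no matter how large d is.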

Hsuan-Tien Lin (NTU CSIE) Machine Learning 20/42


Support Vector Machine (1) Support Vector Machine

Benefits of Large-Margin Hyperplanes


             large-margin hyperplanes | hyperplanes | hyperplanes + feature transform Φ
#            even fewer               | not many    | many
boundary     simple                   | simple      | sophisticated

• not many good, for dVC and generalization


• sophisticated good, for possibly better Ein

a new possibility: non-linear SVM


             large-margin hyperplanes + numerous feature transform Φ
#            not many
boundary     sophisticated

Hsuan-Tien Lin (NTU CSIE) Machine Learning 21/42


Support Vector Machine (1) Support Vector Machine

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 22/42


Support Vector Machine (1) Motivation of Dual SVM

Non-Linear Support Vector Machine Revisited


Non-Linear Hard-Margin SVM

min_{b,w}  (1/2) w^T w
s.t.  y_n (w^T z_n + b) ≥ 1, with z_n = Φ(x_n),
      for n = 1, 2, . . . , N

1  Q = [[0, 0_d̃^T]; [0_d̃, I_d̃]];  p = 0_{d̃+1};  a_n^T = y_n [1, z_n^T];  c_n = 1
2  [b; w] ← QP(Q, p, A, c)
3  return b ∈ R & w ∈ R^d̃ with g_SVM(x) = sign(w^T Φ(x) + b)

• demanded: not many (large-margin), but sophisticated boundary (feature transform)
• QP with d̃ + 1 variables and N constraints
  —challenging if d̃ large, or infinite?! :-)

goal: SVM without dependence on d̃


Hsuan-Tien Lin (NTU CSIE) Machine Learning 23/42
Support Vector Machine (1) Motivation of Dual SVM

Todo: SVM ‘without’ d̃

Original SVM: (convex) QP of d̃ + 1 variables, N constraints
‘Equivalent’ SVM: (convex) QP of N variables, N + 1 constraints

Warning: Heavy Math!!!!!!


• introduce some necessary math without rigor to help understand
SVM deeper
• ‘claim’ some results if details unnecessary
—like how we ‘claimed’ Hoeffding

‘Equivalent’ SVM: based on some


dual problem of Original SVM

Hsuan-Tien Lin (NTU CSIE) Machine Learning 24/42


Support Vector Machine (1) Motivation of Dual SVM
Key Tool: Lagrange Multipliers
Regularization by Constrained-Minimizing E_in  ⇔  Regularization by Minimizing E_aug

min_w E_in(w) s.t. w^T w ≤ C      ⇔      min_w E_aug(w) = E_in(w) + (λ/N) w^T w

• C equivalent to some λ ≥ 0 by checking optimality condition
  ∇E_in(w) + (2λ/N) w = 0

• regularization: view λ as given parameter instead of C, and


solve ‘easily’
• dual SVM: view λ’s as unknown given the constraints, and solve
them as variables instead

how many λ’s as variables?


N—one per constraint
Hsuan-Tien Lin (NTU CSIE) Machine Learning 25/42
Support Vector Machine (1) Motivation of Dual SVM

Starting Point: Constrained to ‘Unconstrained’


Lagrange Function

with Lagrange multipliers α_n (the λ_n of regularization, renamed),

original SVM:  min_{b,w} (1/2) w^T w   s.t.  y_n (w^T z_n + b) ≥ 1, for n = 1, 2, . . . , N

L(b, w, α) = (1/2) w^T w + Σ_{n=1}^N α_n (1 − y_n (w^T z_n + b))
             [objective]   [constraints]

Claim
SVM ≡ min_{b,w} ( max_{all α_n ≥ 0} L(b, w, α) ) = min_{b,w} ( ∞ if violate; (1/2) w^T w if feasible )

• any ‘violating’ (b, w): max_{all α_n ≥ 0} □ + Σ_n α_n (some positive) → ∞
• any ‘feasible’ (b, w): max_{all α_n ≥ 0} □ + Σ_n α_n (all non-positive) = □

constraints now hidden in max
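A small numeric sketch of that claim (NumPy and toy values assumed): maximizing L over α_n ≥ 0 blows up when some constraint is violated and returns (1/2) w^T w when all constraints hold.

```python
import numpy as np

def inner_max_over_alpha(b, w, Z, y, big_alpha=1e6):
    """Approximate max over alpha_n >= 0 of L(b, w, alpha); big_alpha stands in for infinity."""
    slack = 1.0 - y * (Z @ w + b)                  # the terms 1 - y_n (w^T z_n + b)
    alpha = np.where(slack > 0, big_alpha, 0.0)    # best alpha_n: huge if violated, else 0
    return 0.5 * w @ w + alpha @ slack

Z = np.array([[2.0, 0.0], [0.0, 2.0]])
y = np.array([1.0, -1.0])
print(inner_max_over_alpha(b=-1.0, w=np.array([1.0, -1.0]), Z=Z, y=y))  # feasible: 1.0
print(inner_max_over_alpha(b=0.0, w=np.array([0.1, 0.1]), Z=Z, y=y))    # violating: huge
```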


Hsuan-Tien Lin (NTU CSIE) Machine Learning 26/42
Support Vector Machine (1) Motivation of Dual SVM

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 27/42


Support Vector Machine (1) Lagrange Dual SVM

Strong Duality of Quadratic Programming


   
min_{b,w} ( max_{all α_n ≥ 0} L(b, w, α) )   =   max_{all α_n ≥ 0} ( min_{b,w} L(b, w, α) )
[equiv. to original (primal) SVM]                [Lagrange dual]

• ‘=’: strong duality, true for QP if


• convex primal
• feasible primal (true if Φ-separable)
• linear constraints
—called constraint qualification

exists primal-dual optimal


solution (b, w, α) for both sides

Hsuan-Tien Lin (NTU CSIE) Machine Learning 28/42


Support Vector Machine (1) Lagrange Dual SVM

Solving Lagrange Dual: Simplifications (1/2)


 
 
max_{all α_n ≥ 0} ( min_{b,w}  (1/2) w^T w + Σ_{n=1}^N α_n (1 − y_n (w^T z_n + b)) )
(the inner expression is L(b, w, α))

• inner problem ‘unconstrained’, at optimal:
  ∂L(b, w, α)/∂b = 0 = −Σ_{n=1}^N α_n y_n
• no loss of optimality if solving with constraint Σ_{n=1}^N α_n y_n = 0

but wait, b can be removed:

max_{all α_n ≥ 0, Σ y_n α_n = 0} ( min_{b,w}  (1/2) w^T w + Σ_{n=1}^N α_n (1 − y_n w^T z_n) − (Σ_{n=1}^N y_n α_n) · b )
(the crossed-out last term vanishes under the new constraint)

Hsuan-Tien Lin (NTU CSIE) Machine Learning 29/42


Support Vector Machine (1) Lagrange Dual SVM

Solving Lagrange Dual: Simplifications (2/2)


max_{all α_n ≥ 0, Σ y_n α_n = 0} ( min_{b,w}  (1/2) w^T w + Σ_{n=1}^N α_n (1 − y_n w^T z_n) )

• inner problem ‘unconstrained’, at optimal:
  ∂L(b, w, α)/∂w_i = 0 = w_i − Σ_{n=1}^N α_n y_n z_{n,i}
• no loss of optimality if solving with constraint w = Σ_{n=1}^N α_n y_n z_n

but wait!

     max_{all α_n ≥ 0, Σ y_n α_n = 0, w = Σ α_n y_n z_n} ( min_{b,w}  (1/2) w^T w + Σ_{n=1}^N α_n − w^T w )
⇐⇒   max_{all α_n ≥ 0, Σ y_n α_n = 0, w = Σ α_n y_n z_n}  −(1/2) ‖Σ_{n=1}^N α_n y_n z_n‖² + Σ_{n=1}^N α_n

Hsuan-Tien Lin (NTU CSIE) Machine Learning 30/42


Support Vector Machine (1) Lagrange Dual SVM
KKT Optimality Conditions
max_{all α_n ≥ 0, Σ y_n α_n = 0, w = Σ α_n y_n z_n}  −(1/2) ‖Σ_{n=1}^N α_n y_n z_n‖² + Σ_{n=1}^N α_n

if primal-dual optimal (b, w, α),

• primal feasible: y_n (w^T z_n + b) ≥ 1
• dual feasible: α_n ≥ 0
• dual-inner optimal: Σ y_n α_n = 0;  w = Σ α_n y_n z_n
• primal-inner optimal (at optimal all ‘Lagrange terms’ disappear):
  α_n (1 − y_n (w^T z_n + b)) = 0

—called Karush-Kuhn-Tucker (KKT) conditions, necessary for optimality [& sufficient here]

will use KKT to ‘solve’ (b, w) from optimal α
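A sketch (NumPy assumed) of checking these four conditions numerically for a candidate primal-dual pair:

```python
import numpy as np

def kkt_satisfied(b, w, alpha, Z, y, tol=1e-6):
    """Check the four KKT conditions for (b, w, alpha) on data Z (N x d~), y in {-1, +1}."""
    margin = y * (Z @ w + b)
    primal_feasible = np.all(margin >= 1 - tol)
    dual_feasible = np.all(alpha >= -tol)
    dual_inner = abs(alpha @ y) <= tol and np.allclose(w, (alpha * y) @ Z, atol=tol)
    comp_slack = np.all(np.abs(alpha * (1 - margin)) <= tol)
    return bool(primal_feasible and dual_feasible and dual_inner and comp_slack)
```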


Hsuan-Tien Lin (NTU CSIE) Machine Learning 31/42
Support Vector Machine (1) Lagrange Dual SVM

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 32/42


Support Vector Machine (1) Solving Dual SVM

Dual Formulation of Support Vector Machine


max_{all α_n ≥ 0, Σ y_n α_n = 0, w = Σ α_n y_n z_n}  −(1/2) ‖Σ_{n=1}^N α_n y_n z_n‖² + Σ_{n=1}^N α_n

standard hard-margin SVM dual

min_α  (1/2) Σ_{n=1}^N Σ_{m=1}^N α_n α_m y_n y_m z_n^T z_m − Σ_{n=1}^N α_n
subject to  Σ_{n=1}^N y_n α_n = 0;
            α_n ≥ 0, for n = 1, 2, . . . , N

(convex) QP of N variables & N + 1 constraints, as promised

how to solve? yeah, we know QP! :-)
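The quadratic term of that QP is the N × N matrix with entries q_{n,m} = y_n y_m z_n^T z_m; a one-line sketch (NumPy assumed):

```python
import numpy as np

def dual_qp_terms(Z, y):
    """Q_D with q_{n,m} = y_n y_m z_n^T z_m, and p = -1_N."""
    return np.outer(y, y) * (Z @ Z.T), -np.ones(len(y))
```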


Hsuan-Tien Lin (NTU CSIE) Machine Learning 33/42
Support Vector Machine (1) Solving Dual SVM

Dual SVM with QP Solver

optimal α = ?

    min_α  (1/2) Σ_{n=1}^N Σ_{m=1}^N α_n α_m y_n y_m z_n^T z_m − Σ_{n=1}^N α_n
    subject to  Σ_{n=1}^N y_n α_n = 0;  α_n ≥ 0, for n = 1, 2, . . . , N

optimal α ← QP(Q, p, A, c)

    min_α  (1/2) α^T Q α + p^T α
    subject to  a_i^T α ≥ c_i, for i = 1, 2, . . .

mapping:
• q_{n,m} = y_n y_m z_n^T z_m
• p = −1_N
• a_≥ = y, a_≤ = −y;  a_n^T = n-th unit direction
• c_≥ = 0, c_≤ = 0;  c_n = 0

note: many solvers treat equality (a_≥, a_≤) & bound (a_n) constraints specially for numerical stability
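A sketch of solving this dual with cvxopt (an assumed solver choice; its qp takes Gx ≤ h for inequalities and Ax = b for equalities, so α ≥ 0 is written as −α ≤ 0, and the tiny ridge on Q_D is a numerical-stability assumption):

```python
import numpy as np
from cvxopt import matrix, solvers

def hard_margin_svm_dual(Z, y):
    """Solve min (1/2) a^T Q_D a - 1^T a  s.t.  y^T a = 0, a >= 0."""
    N = len(y)
    Q_D = np.outer(y, y) * (Z @ Z.T) + 1e-10 * np.eye(N)   # q_{n,m} plus a tiny ridge
    p = -np.ones(N)
    G, h = -np.eye(N), np.zeros(N)                         # alpha >= 0  as  -alpha <= 0
    A, c = y.reshape(1, -1).astype(float), np.zeros(1)     # y^T alpha = 0
    sol = solvers.qp(matrix(Q_D), matrix(p), matrix(G), matrix(h), matrix(A), matrix(c))
    return np.array(sol["x"]).ravel()
```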
Hsuan-Tien Lin (NTU CSIE) Machine Learning 34/42
Support Vector Machine (1) Solving Dual SVM

Dual SVM with Special QP Solver


optimal α ← QP( QD , p, A, c)

min_α  (1/2) α^T Q_D α + p^T α
subject to  special equality and bound constraints

• q_{n,m} = y_n y_m z_n^T z_m, often non-zero
• if N = 30, 000, dense QD (N by N symmetric) takes > 3G RAM
• need special solver for
• not storing whole QD
• utilizing special constraints properly
to scale up to large N

usually better to use special solver in practice

Hsuan-Tien Lin (NTU CSIE) Machine Learning 35/42


Support Vector Machine (1) Solving Dual SVM
Optimal (b, w)
KKT conditions
if primal-dual optimal (b, w, α),
• primal feasible: yn (wT zn + b) ≥ 1
• dual feasible: αn ≥ 0
• dual-inner optimal: Σ y_n α_n = 0;  w = Σ α_n y_n z_n
• primal-inner optimal (at optimal all ‘Lagrange terms’ disappear):

αn (1 − yn (wT zn + b)) = 0 (complementary slackness)

• optimal α =⇒ optimal w? easy above!


• optimal α =⇒ optimal b? a range from primal feasibility, plus an equality from comp. slackness: if some α_n > 0, then b = y_n − w^T z_n

comp. slackness:
αn > 0 ⇒ on fat boundary (SV!)
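Put together as a sketch (NumPy assumed): recover w from all of α, and b from any support vector, exactly as above.

```python
import numpy as np

def recover_b_w(alpha, Z, y, tol=1e-8):
    """w = sum_n alpha_n y_n z_n; b = y_s - w^T z_s for any SV s (alpha_s > 0)."""
    w = (alpha * y) @ Z
    sv = np.flatnonzero(alpha > tol)      # indices on the fat boundary
    b = y[sv[0]] - Z[sv[0]] @ w           # any SV gives the same b
    return b, w, sv
```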
Hsuan-Tien Lin (NTU CSIE) Machine Learning 36/42
Support Vector Machine (1) Solving Dual SVM

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 37/42


Support Vector Machine (1) Messages behind Dual SVM

Support Vectors Revisited


• on boundary: ‘locates’ fattest hyperplane; others: not needed
• examples with α_n > 0: on boundary
• call α_n > 0 examples (z_n, y_n) support vectors (no longer just candidates)

[figure: the toy example again in the (x_1, x_2) plane, with margin 0.707]

• SV (positive α_n) ⊆ SV candidates (on boundary)
• only SV needed to compute w: w = Σ_{n=1}^N α_n y_n z_n = Σ_{SV} α_n y_n z_n
• only SV needed to compute b: b = y_n − w^T z_n with any SV (z_n, y_n)

SVM: learn fattest hyperplane


by identifying support vectors
with dual optimal solution
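A last sketch (NumPy assumed): g_SVM evaluated through the support vectors only.

```python
import numpy as np

def g_svm(z_new, alpha, Z, y, b, tol=1e-8):
    """Predict sign(w^T z_new + b) using only the support vectors (alpha_n > 0)."""
    sv = alpha > tol
    w_sv = (alpha[sv] * y[sv]) @ Z[sv]    # w from SVs only
    return np.sign(w_sv @ z_new + b)
```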
Hsuan-Tien Lin (NTU CSIE) Machine Learning 38/42
Support Vector Machine (1) Messages behind Dual SVM

Summary: Two Forms of Hard-Margin SVM

Primal Hard-Margin SVM
    min_{b,w}  (1/2) w^T w
    sub. to  y_n (w^T z_n + b) ≥ 1, for n = 1, 2, . . . , N
• d̃ + 1 variables, N constraints
  —suitable when d̃ + 1 small
• physical meaning: locate specially-scaled (b, w)

Dual Hard-Margin SVM
    min_α  (1/2) α^T Q_D α − 1^T α
    s.t.  y^T α = 0;  α_n ≥ 0 for n = 1, . . . , N
• N variables, N + 1 simple constraints
  —suitable when N small
• physical meaning: locate SVs (z_n, y_n) & their α_n

both eventually result in optimal (b, w) for fattest hyperplane


gSVM (x) = sign(wT Φ(x) + b)
Hsuan-Tien Lin (NTU CSIE) Machine Learning 39/42
Support Vector Machine (1) Messages behind Dual SVM

Are We Done Yet?


goal: SVM without dependence on d̃

min_α  (1/2) α^T Q_D α − 1^T α
subject to  y^T α = 0;
            α_n ≥ 0, for n = 1, 2, . . . , N

• N variables, N + 1 constraints: no dependence on d̃?
• q_{n,m} = y_n y_m z_n^T z_m: inner product in R^d̃
  —O(d̃) via naïve computation!

no dependence only if
avoiding naïve computation (next lecture :-))

Hsuan-Tien Lin (NTU CSIE) Machine Learning 40/42


Support Vector Machine (1) Messages behind Dual SVM

Questions?

Hsuan-Tien Lin (NTU CSIE) Machine Learning 41/42


Support Vector Machine (1) Messages behind Dual SVM

Summary

Embedding Numerous Features: Kernel Models
Lecture 10: Support Vector Machine (1)
  Large-Margin Separating Hyperplane: intuitively more robust against noise
  Standard Large-Margin Problem: minimize ‘length of w’ at special separating scale
  Support Vector Machine: ‘easy’ via quadratic programming
  Motivation of Dual SVM: want to remove dependence on d̃
  Lagrange Dual SVM: KKT conditions link primal/dual
  Solving Dual SVM: another QP, better solved with special solver
  Messages behind Dual SVM: SVs represent fattest hyperplane

Hsuan-Tien Lin (NTU CSIE) Machine Learning 42/42
