SVM Handout

This document discusses support vector machines (SVMs), a supervised machine learning method used for classification and regression. It provides equations and formulas for solving SVM problems, including equations for the decision boundary, positive and negative gutters, margin width, and common SVM kernels like linear, polynomial, radial basis function, and sigmoid kernels. It also gives an example of determining SVM parameters through inspection of a 2D graph with positive and negative examples.

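Support Vector Machines

In SVMs we are trying to find a decision boundary that maximizes the "margin" or the "width of the road" separating the positive from the negative training data points.

To find this we minimize:

$\frac{1}{2}\|\vec{w}\|^2$ subject to $y_i(\vec{w}\cdot\vec{x}_i + b) \ge 1$ for every training point $i$

The resulting Lagrangian is:

$L = \frac{1}{2}\|\vec{w}\|^2 - \sum_i \alpha_i \left[ y_i(\vec{w}\cdot\vec{x}_i + b) - 1 \right]$

Solving the Lagrangian optimization problem gives w, b, and the alphas, the parameters that determine a unique maximal margin (road) solution. On the maximum margin "road", the +ve, -ve points that lie on the "gutter" lines are called support vectors. The decision boundary lies at the middle of the road. The definition of the "road" depends only on the support vectors, so changing (adding, deleting) non-support vector points will not change the solution. Note, the widest "road" is a 2D concept. In 3D we want the widest region bounded by two planes; in even higher dimensions, a region bounded by two hyperplanes.

Solving for the alphas using the Lagrangian is complicated. In practice, Quadratic Programming solvers are used. A popular SVM QP solver is Platt's SMO (Sequential Minimal Optimization) algorithm. For the SVM problems in quizzes, the alphas and w/b can generally be solved by using the constraint equations and inspection.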

Useful Equations for solving SVM questions

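A. Equations derived from optimizing the Lagrangian:

1. Partial of the Lagrangian with respect to b:

$\frac{\partial L}{\partial b} = 0 \Rightarrow \sum_i \alpha_i y_i = 0$

Note that $y_i = +1$ for +ve points and $y_i = -1$ for -ve points.
The sum of (alpha times y) over all support vectors is 0.

2. Partial of the Lagrangian with respect to w:

$\frac{\partial L}{\partial \vec{w}} = 0 \Rightarrow \vec{w} = \sum_i \alpha_i y_i \vec{x}_i$

This form is for a linear kernel. With a kernel transform $\phi$, each $\vec{x}_i$ is replaced by $\phi(\vec{x}_i)$.
Only support vectors have $\alpha_i > 0$, so both sums effectively run over the support vectors.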
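The two Lagrangian conditions can be checked numerically. The following is a minimal sketch in plain Python, assuming a toy data set that is not from the handout (a +ve point at (1, 0) and a -ve point at (-1, 0), each with alpha = 0.5); the helper names `alpha_y_sum` and `weight_vector` are illustrative only.

```python
def alpha_y_sum(alphas, ys):
    """Equation 1: sum_i alpha_i * y_i (should be 0 at the optimum)."""
    return sum(a * y for a, y in zip(alphas, ys))


def weight_vector(alphas, ys, xs):
    """Equation 2 (linear kernel): w = sum_i alpha_i * y_i * x_i."""
    dim = len(xs[0])
    return [sum(a * y * x[d] for a, y, x in zip(alphas, ys, xs)) for d in range(dim)]


# Toy example (an assumption for illustration, not from the handout).
xs = [(1.0, 0.0), (-1.0, 0.0)]
ys = [+1, -1]
alphas = [0.5, 0.5]

print(alpha_y_sum(alphas, ys))        # 0.0
print(weight_vector(alphas, ys, xs))  # [1.0, 0.0]
```

Both alphas are equal here because the single +ve and single -ve support vector must balance each other in Equation 1.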

B. Equations from the boundaries and constraints:


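3. The Decision boundary:

$h(\vec{x}) = \sum_i \alpha_i y_i K(\vec{x}_i, \vec{x}) + b \ge 0$
General form, for any kernel. The sum runs over the support vectors only.

$h(\vec{x}) = \vec{w}\cdot\vec{x} + b \ge 0$
For use when the Kernel is linear.

4. Positive gutter:

$\sum_i \alpha_i y_i K(\vec{x}_i, \vec{x}) + b = +1$
General form, for any kernel.

$\vec{w}\cdot\vec{x} + b = +1$
For use when the Kernel is linear.

5. Negative gutter:

$\sum_i \alpha_i y_i K(\vec{x}_i, \vec{x}) + b = -1$, or $\vec{w}\cdot\vec{x} + b = -1$ when the Kernel is linear.

6. The width of the margin (or road):

$m = \frac{2}{\|\vec{w}\|}$

where $\|\vec{w}\| = \sqrt{\sum_i w_i^2}$.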

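Alternate formula for the two support vector case:

$m = \|\vec{x}_+ - \vec{x}_-\|$, the distance between the +ve and the -ve support vector.

This equation is useful when solving SVM problems in 1D or 2D, where the width of the road can be visually determined.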
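The boundary, gutter, and margin equations for the linear case can be sketched in a few lines of Python. The parameters below (w = [1, 0], b = 0) are an assumption chosen for illustration, and the function names are hypothetical.

```python
import math


def decision_value(w, b, x):
    """h(x) = w . x + b for a linear kernel."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_width(w):
    """Width of the road: m = 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Toy parameters (an assumption for illustration).
w, b = [1.0, 0.0], 0.0
print(decision_value(w, b, (1.0, 0.0)))   # 1.0  -> on the positive gutter
print(decision_value(w, b, (-1.0, 0.0)))  # -1.0 -> on the negative gutter
print(margin_width(w))                    # 2.0
```

A point is classified as +ve when its decision value is at least 0; values of exactly +1 or -1 mark the gutters.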

Common SVM Kernels:

Linear Kernel

$K(\vec{u}, \vec{v}) = \vec{u}\cdot\vec{v}$

In document classification, feature vectors are composed of binary word features: I(word=foo) outputs 1 if the word "foo" appears in the document, 0 if it does not.

Each document is represented as a vocabulary-length feature vector. The support vectors found are generally particularly salient documents (documents best at discriminating the topics being classified).

Decomposable Kernels

$K(\vec{u}, \vec{v}) = \phi(\vec{u})\cdot\phi(\vec{v})$

Idea: Define a transform $\phi$ that maps input vectors into a different (usually higher) dimensional space where the data is (more easily) linearly separable.

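Polynomial Kernel

$K(\vec{u}, \vec{v}) = (\vec{u}\cdot\vec{v} + b)^n$, n > 1

Example: Quadratic Kernel: $K(\vec{u}, \vec{v}) = (\vec{u}\cdot\vec{v} + b)^2$

In 2D the resulting decision boundary can look parabolic, linear or hyperbolic depending on which terms in the expansion dominate.
Here is an expansion of the quadratic kernel, with u = [x, y] and v = [v_1, v_2]:

$(x v_1 + y v_2 + b)^2 = v_1^2 x^2 + v_2^2 y^2 + 2 v_1 v_2 xy + 2 b v_1 x + 2 b v_2 y + b^2$

HW: Try this Kernel using Professor Winston's demo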

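Radial Basis Function (RBF) or Gaussian Kernel

$K(\vec{u}, \vec{v}) = \exp\left(-\frac{\|\vec{u} - \vec{v}\|^2}{2\sigma^2}\right)$

In 2D the generated decision boundaries resemble contour circles around clusters of +ve and -ve points. Support vectors are generally +ve or -ve points that are closest to the opposing cluster. The contour space drawn results from the sum of the support vector Gaussians.

Will fit almost any data. May exhibit overfitting when used improperly. HW: Try this Kernel using Professor Winston's demo

Similar to KNN but with all points having a vote; the weight of each vote is determined by a Gaussian. Points farther away get less of a vote than points nearby.

When σ is large you get flatter Gaussians. When σ is small you get sharper Gaussians. (Hence when using a small σ, the contour density will appear closer / denser around the support vector points.)

Here is the Kernel in 2D expanded out, with u = [x, y] and v = [v_1, v_2]:

$K(\vec{u}, \vec{v}) = \exp\left(-\frac{(x - v_1)^2 + (y - v_2)^2}{2\sigma^2}\right)$

As a point gets closer to a support vector, the kernel approaches exp(0) = 1. As a point moves far away from a support vector, it approaches exp(-infinity) = 0.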

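Sigmoidal (tanh) Kernel

$K(\vec{u}, \vec{v}) = \tanh(k\,\vec{u}\cdot\vec{v} + b)$

Similar to the sigmoid function. Allows for combinations of linear decision boundaries.

Properties of tanh:
Range from -1 to +1.
tanh(x) => +1 when x >> 0
tanh(x) => -1 when x << 0

The resulting decision boundaries are logical combinations of linear boundaries. Not too different from second layer neurons in Neural Nets.

Like RBF, may exhibit overfitting when improperly used.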


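Linear combination of Kernels

Idea: Kernel functions are closed under addition and scaling (by a positive number).

Scaling: $a K(\vec{u}, \vec{v})$ for a > 0

Linear combination: $a K_1(\vec{u}, \vec{v}) + b K_2(\vec{u}, \vec{v})$ for a, b > 0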
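The kernels in this section can be written as small Python functions. This is a minimal sketch that assumes common textbook parametrizations; the parameter names and defaults (`b`, `n`, `sigma`, `k`) are assumptions for illustration, not values prescribed by the handout.

```python
import math


def dot(u, v):
    """Standard dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, b=1.0, n=2):
    """(u . v + b)^n; n = 2 gives the quadratic kernel."""
    return (dot(u, v) + b) ** n


def rbf_kernel(u, v, sigma=1.0):
    """Gaussian of the squared distance between u and v."""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def tanh_kernel(u, v, k=1.0, b=0.0):
    """Sigmoidal kernel; output always lies between -1 and +1."""
    return math.tanh(k * dot(u, v) + b)


u, v = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(u, v))      # 11.0
print(polynomial_kernel(u, v))  # 144.0
print(rbf_kernel(u, u))         # 1.0
```

Because kernels are closed under positive scaling and addition, an expression such as `2 * linear_kernel(u, v) + 3 * rbf_kernel(u, v)` is again a valid kernel value.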

Method 1 of Solving SVM parameters by inspection:


This is a step-by-step solution to Problem 2.A from 2006 quiz 4:
We are given the following graph with + and - points on the x-y axis;
a +ve point $\vec{x}_1$ at (0, 0) and a -ve point $\vec{x}_2$ at (4, 4).

Can an SVM separate this? i.e. is it linearly separable? Heck Yeah! using the line above.


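Part 2A: Provide a decision boundary:
We can find the decision boundary by graphical inspection.
1. The decision boundary lies on the line: $y = -x + 4$
2. We have a +ve support vector at (0, 0) with line equation $y = -x$
3. We have a -ve support vector at (4, 4) with line equation $y = -x + 8$
Given the equation of the decision boundary, we next massage the algebra to get the decision boundary to conform with the desired form, namely: $w_1 x + w_2 y + b \ge 0$

1. $y < -x + 4$ (< because +ve is below the line)
2. $x + y - 4 < 0$
3. $-x - y + 4 > 0$ (multiplied by -1)
4. $(-1)x + (-1)y + 4 > 0$ (writing the coefficients explicitly)
Now we can read the solution from the equation coefficients:
$w_1 = -1$, $w_2 = -1$, $b = 4$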

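Next, using our formula for the width of the road, we check that these weights give a road width of: $2/\|\vec{w}\| = 2/\sqrt{2} = \sqrt{2}$.

WAIT! This is clearly not the width of the "widest" road/margin.


We remember that any multiple c (c>0) of the boundary equation is still the same decision boundary. So all equations of the form:

$-cx - cy + 4c \ge 0$

still describe this decision boundary. So here is a more general solution:
$w_1 = -c$, $w_2 = -c$, $b = 4c$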
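Using The Width of the Road Constraint
Graphically we see that the widest width margin should be: $4\sqrt{2}$ (the distance between (0, 0) and (4, 4)).
The solution weight vector and intercept can be solved by solving for c constrained by the width-of-the-road.
Length of $\vec{w}$ in terms of c: $\|\vec{w}\| = \sqrt{c^2 + c^2} = c\sqrt{2}$

Now plugging all this into the margin width equation and solving for c, we get:

$\frac{2}{\|\vec{w}\|} = 4\sqrt{2}$ => $\frac{2}{c\sqrt{2}} = 4\sqrt{2}$ => $c = \frac{2}{4\sqrt{2}\sqrt{2}}$ => $c = \frac{1}{4}$

This means the true weight vector and intercept for the SVM solution should be:

$\vec{w} = [-\frac{1}{4}, -\frac{1}{4}]$ and $b = 1$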

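Next we solve for the alphas, using the w vector and equation 1.

Plugging the support vector values $\vec{x}_1 = (0, 0)$ and $\vec{x}_2 = (4, 4)$ into $\vec{w} = \sum_i \alpha_i y_i \vec{x}_i$:

$[-\frac{1}{4}, -\frac{1}{4}] = \alpha_1 (+1) [0, 0] + \alpha_2 (-1) [4, 4]$

We get two identical equations: $-4\alpha_2 = -\frac{1}{4}$, so $\alpha_2 = \frac{1}{16}$.

Using Equation 1, we can solve for the other alpha: $\alpha_1 - \alpha_2 = 0$, so $\alpha_1 = \alpha_2 = \frac{1}{16}$.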
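The solution for this two-point example can be verified mechanically. The sketch below assumes the candidate values alpha_1 = alpha_2 = 1/16 and b = 1 for the +ve point at (0, 0) and the -ve point at (4, 4), rebuilds w from Equation 2, and checks the gutters and the road width; variable names are illustrative.

```python
import math
from fractions import Fraction

# Support vectors from the worked example: +ve at (0, 0), -ve at (4, 4).
xs = [(0, 0), (4, 4)]
ys = [+1, -1]
alphas = [Fraction(1, 16), Fraction(1, 16)]  # candidate solution being checked
b = 1

# Equation 2: w = sum_i alpha_i * y_i * x_i
w = [sum(a * y * x[d] for a, y, x in zip(alphas, ys, xs)) for d in range(2)]


def h(x):
    """Decision value w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


print(w)                     # [Fraction(-1, 4), Fraction(-1, 4)]
print(h((0, 0)), h((4, 4)))  # 1 -1
width = 2 / math.sqrt(sum(float(wi) ** 2 for wi in w))
print(math.isclose(width, 4 * math.sqrt(2)))  # True
```

The +ve point lands exactly on the +1 gutter and the -ve point on the -1 gutter, which is what makes both of them support vectors.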
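Part 2B: Does the boundary change if a +ve point $\vec{x}_3$ is added at (-1, -1)?
No. The support vectors are still $\vec{x}_1$ and $\vec{x}_2$. The decision boundary stays the same.
Part 2C: What if point $\vec{x}_2$ (-ve) is moved to coordinate (k, k)?
How will alpha change: increase, decrease, or stay the same? When k = 2? and k = 8?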
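Answer: Go back to how we solved for the alphas:

$\vec{w} = \alpha_1 y_1 \vec{x}_1 + \alpha_2 y_2 \vec{x}_2$

Plugging in $\vec{x}_1 = (0, 0)$ and $\vec{x}_2 = (k, k)$:

$\vec{w} = -\alpha_2 [k, k]$

Solving for alpha:

$\alpha_2 = \frac{\|\vec{w}\|}{k\sqrt{2}}$

Using the fact that $\|\vec{w}\| = \frac{2}{m}$, and that the width-of-road/margin is $m = k\sqrt{2}$,
we can express alpha in terms of the margin m:

$\alpha_2 = \frac{2}{m \cdot k\sqrt{2}} = \frac{2}{m^2} = \frac{1}{k^2}$

Answer:
When k changes from 4 to 2, the margin (road width) is halved and k is also halved, so alpha increases by a factor of 4.
When k changes from 4 to 8, the margin is doubled and k is also doubled, so alpha decreases by a factor of 4.
Though we do not provide a proof here, alpha in general changes inversely with the margin width m.
Wider road -> lower alpha. Narrower road -> higher alpha.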
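The dependence of alpha on k can be tabulated directly. This is a small sketch, assuming the same two-point setup (a +ve point at (0, 0), a -ve point at (k, k)) and a hypothetical helper name `alpha_for_k`; it derives alpha from w rather than from a memorized formula.

```python
from fractions import Fraction


def alpha_for_k(k):
    """Alpha for a +ve point at (0, 0) and a -ve point at (k, k).

    The boundary is x + y = k. Scaling it so the gutters sit at +1 and -1
    gives w = [-1/k, -1/k] and b = 1. Equation 2 then reads
    w = -alpha * [k, k], so alpha = -w[0] / k.
    """
    k = Fraction(k)
    w = (-1 / k, -1 / k)
    return -w[0] / k


for k in (2, 4, 8):
    print(k, alpha_for_k(k))
# 2 1/4
# 4 1/16
# 8 1/64
```

Halving k multiplies alpha by 4 and doubling k divides it by 4, matching the answer above.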

Method 2: Solving for alpha, b, and w without visual inspection (by computing Kernels and solving Constraint equations)
Example from the 2005 Final Exam.
In this problem you are told that you have the following points:

-ve points: A at (0, 0), B at (1, 1)
+ve points: C at (2, 0)

and that these points lie on the gutter in the SVM max-margin solution.

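Step 1. Compute the kernel function values, which in this case, with a linear kernel, are standard dot products.
K(A, A) = 0*0+0*0 = 0 K(A, B) = 0*1+0*1 = 0 K(A, C) = 0*2+0*0 = 0
K(B, A) = 1*0+1*0 = 0 K(B, B) = 1*1+1*1 = 2 K(B, C) = 1*2+1*0 = 2
K(C, A) = 2*0+0*0 = 0 K(C, B) = 2*1+0*1 = 2 K(C, C) = 2*2+0*0 = 4

Step 2: Write the set of equations, using the SVM constraints:

Constraint 1: $\sum_i \alpha_i y_i = 0$
Constraint 2: $\sum_i \alpha_i y_i K(\vec{x}_i, \vec{x}) + b = +1$ (positive gutter)
Constraint 3: $\sum_i \alpha_i y_i K(\vec{x}_i, \vec{x}) + b = -1$ (negative gutter)

This will yield 4 equations. Each row lists the coefficients of the unknowns and the right-hand side.

Eq.    alpha_A coefficient      alpha_B coefficient      alpha_C coefficient      b     =
C1     y_A = -1                 y_B = -1                 y_C = +1                 0     0
C3.A   y_A K(A,A) = -1*0 = 0    y_B K(B,A) = -1*0 = 0    y_C K(C,A) = +1*0 = 0    +1    -1
C3.B   y_A K(A,B) = -1*0 = 0    y_B K(B,B) = -1*2 = -2   y_C K(C,B) = +1*2 = 2    +1    -1
C2.C   y_A K(A,C) = -1*0 = 0    y_B K(B,C) = -1*2 = -2   y_C K(C,C) = +1*4 = 4    +1    +1

For clarity here are the four equations:
C1: $-\alpha_A - \alpha_B + \alpha_C = 0$

C3.A: $b = -1$

C3.B: $-2\alpha_B + 2\alpha_C + b = -1$

C2.C: $-2\alpha_B + 4\alpha_C + b = +1$

Step 3: Use your favorite method of solving linear equations to solve for the 4 unknowns.
Answer: $\alpha_A = 0$, $\alpha_B = 1$, $\alpha_C = 1$, $b = -1$, which gives $\vec{w} = \sum_i \alpha_i y_i \vec{x}_i = -[1, 1] + [2, 0] = [1, -1]$.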

This is a more general way to solve SVM parameters, without the help of geometry. This method can be applied to problems
where the "margin" width or boundary equation cannot be derived by inspection (e.g. > 2D).
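The kernel-and-constraints method lends itself to automation. The following is a minimal sketch in plain Python for the three points of this example (A and B are -ve, C is +ve, all assumed to lie on the gutters): it builds the four linear equations from the dot-product kernel and solves them with exact fractions. The `solve` helper is a generic Gauss-Jordan routine written for this illustration, not part of any library.

```python
from fractions import Fraction

points = {"A": (0, 0), "B": (1, 1), "C": (2, 0)}
labels = {"A": -1, "B": -1, "C": +1}
names = ["A", "B", "C"]


def kernel(u, v):
    """Linear kernel: the standard dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Unknowns: alpha_A, alpha_B, alpha_C, b. Each row is [coefficients..., rhs].
rows = [[labels[n] for n in names] + [0, 0]]  # sum_i alpha_i y_i = 0
for p in names:                               # one gutter equation per point
    coeffs = [labels[n] * kernel(points[n], points[p]) for n in names]
    rows.append(coeffs + [1, labels[p]])


def solve(aug):
    """Gauss-Jordan elimination on an augmented matrix, using exact fractions."""
    m = [[Fraction(v) for v in row] for row in aug]
    n = len(m)
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * c for a, c in zip(m[r], m[col])]
    return [row[-1] for row in m]


alpha_a, alpha_b, alpha_c, b = solve(rows)
print(alpha_a, alpha_b, alpha_c, b)  # 0 1 1 -1
```

Point A comes out with alpha = 0 even though it lies on the gutter, so only B and C carry weight in w = -[1, 1] + [2, 0] = [1, -1].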

NOTE: We used the gutter constraints as equalities above because we are told that the given points lie on the "gutter".
More realistically, if we were given more points, and not all points lay on the gutters, then we would be solving a system of
inequalities (because the gutter equations are really constraints on >= 1 or <= -1).

In the quadratic programming solvers used to solve SVMs, we are in fact doing just that: we are minimizing a target function
by subjecting it to a system of linear inequality constraints.

Example of SVMs with a Non-Linear Kernel


From Part 2E of 2006 Q4. You are given the graph below and the following kernel:

and you are asked to solve for the equation of the decision boundary.

Step 1: First, decompose the kernel into a dot product of $\phi$ functions: $K(\vec{u}, \vec{v}) = \phi(\vec{u})\cdot\phi(\vec{v})$


Answer:

Step 2: Compute where the original points map in the new space using the transform. (We are going from 2D to 1D.)


Positive points are at:

Negative points are at:

Step 3: Plot the points in the new space; this appears as a line from 0 to 8,
with positive points at 0, 2, 4 and negative points at 6, 8.

The support vectors are the nearest opposing points (at values of 4 and 6), and the boundary lies between them.
Hence the decision boundary (maximum margin) should be: $\phi(\vec{u}) < 5$
The < is due to the positive points being less than 5.

Expanding the determined decision boundary in terms of the components of u, we get:

Square both sides:

In standard form:

This is a circle.
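The transform-then-threshold procedure can be illustrated in code. The sketch below is built entirely on assumptions made for illustration: a hypothetical norm-based transform phi(u) = sqrt(2) * ||u|| (equivalently the kernel K(u, v) = 2 ||u|| ||v||) and hypothetical points on the diagonal, chosen only because they reproduce the mapped values 0, 2, 4 and 6, 8 described above.

```python
import math


def phi(u):
    """Hypothetical transform (an assumption): phi(u) = sqrt(2) * ||u||."""
    return math.sqrt(2.0) * math.hypot(u[0], u[1])


def kernel(u, v):
    """K(u, v) = phi(u) * phi(v) = 2 * ||u|| * ||v|| under the assumption above."""
    return phi(u) * phi(v)


# Hypothetical points chosen so that they map to 0, 2, 4 (+ve) and 6, 8 (-ve).
positives = [(0, 0), (1, 1), (2, 2)]
negatives = [(3, 3), (4, 4)]

pos_vals = [phi(p) for p in positives]
neg_vals = [phi(p) for p in negatives]
threshold = (max(pos_vals) + min(neg_vals)) / 2.0  # midpoint between 4 and 6

print([round(v, 6) for v in pos_vals + neg_vals])  # [0.0, 2.0, 4.0, 6.0, 8.0]
print(round(threshold, 6))                          # 5.0
# phi(u) < threshold  <=>  x**2 + y**2 < threshold**2 / 2, a circle about the origin.
```

Under these assumptions the 1D threshold turns back into a circular boundary in the original 2D space, which is why a non-linear kernel can separate data that a straight line cannot.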
An Abstract Lesson on Support Vector Behavior

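Suppose we have the above set of points. Let's solve the SVM parameters by inspection.
1. Boundary equation:

2. Read off w and b and multiply by c (c>0):

3. Now apply the width of the road/margin constraint:

Plugging in the length of w, and solving for c:

4. Now we have the SVM solution w and b:

5. Next, solve for the alphas using the Lagrangian equations:

$\vec{w} = \sum_i \alpha_i y_i \vec{x}_i$ and $\sum_i \alpha_i y_i = 0$

a) From expanding the first equation, we get:

which leads to the equations:

b) From expanding the second equation, we get:

c) Putting the equations from a) and b) together, we can solve for the alphas.

We see that the alphas of the +ve support vectors are balanced according to a ratio of distances. If the two distances are equal, the two alphas are equal.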

Observation A:
Q: Suppose we moved point A to the origin at (0, 0). What happens to the alphas?
A: This configuration basically puts all of the +ve support on point A, and point C's alpha goes to 0.

Conceptually, A becomes the sole primary support vector because point A is directly across from point B. Point A takes up all the share of the "pressure" in holding up the margin; point C, though still on the gutter, effectively becomes a non-support vector. So this implies that points on the gutter may not always serve the role of being a support vector.

Observation B:
Q: Suppose we changed k, by moving point B up/down the y-axis. What happens to the alphas?
A: All the alphas change in the same direction, inversely with k.

If k decreases, the road narrows, the alphas increase. Analogy: the supports need to apply more "pressure" to push the
margin tighter.

If k increases, the road widens, the alphas decrease. Analogy: a wider road needs less "pressure" on the supports to hold it in
place.
