
SVM

Linear Classifiers

[Figure: 2-D dataset in which one marker denotes +1 and the other denotes -1, repeated over several slides, each showing a different candidate separating line]

How would you classify this data?
Linear Classifiers

Any of these would be fine..

..but which is best?


Classifier Margin

Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.
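To make this definition concrete, here is a minimal Python sketch (the points, w, and b below are made up for illustration) that computes a boundary's margin as the smallest point-to-boundary distance, using the standard formula |w · x + b| / ||w||:

```python
import numpy as np

# Made-up 2-D points and a candidate boundary w . x + b = 0
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
w = np.array([1.0, 1.0])
b = -0.5

# Each point's distance to the boundary is |w . x + b| / ||w||.
# The margin is the smallest such distance: how far the boundary
# could be widened before hitting a datapoint.
distances = np.abs(X @ w + b) / np.linalg.norm(w)
print(distances.min())  # margin of this particular boundary (about 2.47)
```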
Maximum Margin

The maximum margin linear classifier is the linear classifier with the maximum margin.

This is the simplest kind of SVM (called an LSVM, for Linear SVM).
Maximum Margin

The maximum margin linear classifier is the linear classifier with the maximum margin. This is the simplest kind of SVM (called an LSVM, for Linear SVM).

Support Vectors are those datapoints that the margin pushes up against.
Specifying a line and margin

[Figure: Plus-Plane, Classifier Boundary, and Minus-Plane, with a "Predict Class = +1" zone above the Plus-Plane and a "Predict Class = -1" zone below the Minus-Plane]

• How do we represent this mathematically?
• …in m input dimensions?
Specifying a line and margin

[Figure: the three parallel lines w · x + b = +1 (Plus-Plane), w · x + b = 0 (Classifier Boundary), and w · x + b = -1 (Minus-Plane)]

• Plus-plane = { x : w · x + b = +1 }
• Minus-plane = { x : w · x + b = -1 }

Classify as +1 if w · x + b ≥ +1, and as -1 if w · x + b ≤ -1.
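As a minimal sketch of this rule in Python (w and b below are made up; a trained SVM would supply real values), note that the ±1 thresholds are the training-time constraints, while at prediction time the sign of w · x + b decides the label:

```python
import numpy as np

def classify(x, w, b):
    """Predict +1 or -1 from the sign of w . x + b."""
    return 1 if np.dot(w, x) + b >= 0 else -1

# Hypothetical parameters, for illustration only
w = np.array([2.0, -1.0])
b = -0.5

print(classify(np.array([1.0, 0.0]), w, b))   # w . x + b = 1.5  -> +1
print(classify(np.array([-1.0, 1.0]), w, b))  # w . x + b = -3.5 -> -1
```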
Computing the margin width

M = Margin Width. How do we compute M in terms of w and b?

• Plus-plane = { x : w · x + b = +1 }
• Minus-plane = { x : w · x + b = -1 }
• The vector w is perpendicular to the Plus-Plane.
Computing the margin width

M = Margin Width. How do we compute M in terms of w and b?

• Plus-plane = { x : w · x + b = +1 }
• Minus-plane = { x : w · x + b = -1 }
• The vector w is perpendicular to the Plus-Plane.
• Let x⁻ be any point on the minus-plane.
• Let x⁺ be the closest plus-plane point to x⁻.
• Claim: x⁺ = x⁻ + λw for some value of λ. Why? The line from x⁻ to x⁺ is perpendicular to the planes, so to get from x⁻ to x⁺ you travel some distance in direction w.
Computing the margin width

What we know:
• w · x⁺ + b = +1
• w · x⁻ + b = -1
• x⁺ = x⁻ + λw
• |x⁺ - x⁻| = M

It's now easy to get M in terms of w and b.
Computing the margin width

Substituting x⁺ = x⁻ + λw into w · x⁺ + b = +1:

w · (x⁻ + λw) + b = 1
=> w · x⁻ + b + λ w · w = 1
=> -1 + λ w · w = 1
=> λ = 2 / (w · w)
Computing the margin width

M = |x⁺ - x⁻| = |λw| = λ |w| = λ √(w · w)

  = (2 / (w · w)) · √(w · w) = 2 / √(w · w)

So M = 2 / ||w||.
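A quick numeric sanity check of the derivation (w and b below are made up): pick any x⁻ on the minus-plane, step by λ = 2 / (w · w) along w, and confirm you land on the plus-plane at distance 2 / ||w||:

```python
import numpy as np

w = np.array([3.0, 4.0])   # w . w = 25, ||w|| = 5
b = -2.0

# A point on the minus-plane (solves w . x + b = -1 with x2 = 0)
x_minus = np.array([(-1.0 - b) / w[0], 0.0])

lam = 2.0 / np.dot(w, w)        # lambda = 2 / (w . w)
x_plus = x_minus + lam * w      # x+ = x- + lambda * w

print(np.dot(w, x_plus) + b)             # 1.0: x+ lies on the plus-plane
print(np.linalg.norm(x_plus - x_minus))  # 0.4
print(2.0 / np.linalg.norm(w))           # 0.4 = 2 / ||w||, matching M
```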
Learning the Maximum Margin Classifier

M = Margin Width = 2 / √(w · w) = 2 / ||w||

Given a guess of w and b we can:
• Compute whether all data points are in the correct half-planes
• Compute the width of the margin

So now we just need to write a program to search the space of w's and b's to find the widest margin that matches all the datapoints, as in the sketch below.
Learning the Maximum Margin Classifier

M = 2 / √(w · w) = 2 / ||w||

Given a guess of w, b we can:
• Compute whether all data points are in the correct half-planes
• Compute the margin width

Assume R datapoints, each (x_k, y_k) where y_k = ±1.

• What should our quadratic optimization criterion be? Minimize ||w||².
• How many constraints will we have, and what should they be?
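The deck stops at these questions; the standard hard-margin answers (stated here as a supplement) are: minimize ½ w · w (equivalent to minimizing ||w||²), subject to R constraints of the form y_k (w · x_k + b) ≥ 1, one per datapoint. A minimal sketch using scipy.optimize.minimize with SLSQP on the same toy data as above:

```python
import numpy as np
from scipy.optimize import minimize

# Same made-up toy dataset as in the random-search sketch
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])

def objective(v):
    w = v[:2]                      # variables packed as [w1, w2, b]
    return 0.5 * np.dot(w, w)      # quadratic criterion: minimize 1/2 ||w||^2

def constraints(v):
    w, b = v[:2], v[2]
    return y * (X @ w + b) - 1.0   # y_k (w . x_k + b) - 1 >= 0, one per point

res = minimize(objective, x0=np.zeros(3), method="SLSQP",
               constraints=[{"type": "ineq", "fun": constraints}])
w, b = res.x[:2], res.x[2]
print(w, b, 2.0 / np.linalg.norm(w))   # maximum-margin parameters and width M
```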
