
Machine Learning

SVM
Mustansar Ali
Slides adapted from Andrew Moore

Copyright © 2001, 2003, Andrew W. Moore


Support Vector Machines

Andrew W. Moore
Professor
School of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~awm
[email protected]
412-268-7599

Note to other teachers and users of these slides: Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully received.

Copyright © 2001, 2003, Andrew W. Moore. Nov 23rd, 2001



Linear Classifiers

x → f(x, w, b) → y_est

f(x, w, b) = sign(w . x - b)

[Figure: a 2-D dataset with two classes of points; one marker denotes +1, the other denotes -1.]

How would you classify this data?
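The slides contain no code, but the rule above is easy to state concretely. A minimal Python sketch (the weight vector w, offset b, and test points are hypothetical, not from the slides):

```python
import numpy as np

def linear_classify(x, w, b):
    """f(x, w, b) = sign(w . x - b): returns +1 or -1.
    (Points exactly on the boundary are assigned +1 here; the slides
    leave that case unspecified.)"""
    return 1 if np.dot(w, x) - b >= 0 else -1

# Hypothetical 2-D example: boundary x1 + x2 = 1.
w = np.array([1.0, 1.0])
b = 1.0
print(linear_classify(np.array([2.0, 2.0]), w, b))    # 1
print(linear_classify(np.array([-1.0, -1.0]), w, b))  # -1
```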






Linear Classifiers

f(x, w, b) = sign(w . x - b)

[Figure: the same data with several candidate separating lines drawn.]

Any of these would be fine..
..but which is best?



Classifier Margin

f(x, w, b) = sign(w . x - b)

Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.



Maximum Margin

f(x, w, b) = sign(w . x - b)

The maximum margin linear classifier is the linear classifier with the, um, maximum margin.

This is the simplest kind of SVM, called an LSVM (Linear SVM).
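As a concrete sketch (not part of the original slides), a maximum margin linear classifier can be fit with scikit-learn's SVC using a linear kernel; the toy data is hypothetical, and a very large C approximates the hard margin:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable toy data.
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],    # class +1
              [0.0, 0.5], [0.5, 0.0], [-0.5, 0.5]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6)  # large C ~ hard (maximum) margin
clf.fit(X, y)

# Note: scikit-learn's decision function is w . x + b, so the sign of
# b differs from the slides' sign(w . x - b) convention.
w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, "b =", b)
```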

Maximum Margin

f(x, w, b) = sign(w . x - b)

The maximum margin linear classifier is the linear classifier with the, um, maximum margin. This is the simplest kind of SVM, called an LSVM (Linear SVM).

Support Vectors are those datapoints that the margin pushes up against.
Why Maximum Margin?

f(x, w, b) = sign(w . x - b)

The maximum margin linear classifier is the linear classifier with the, um, maximum margin. Support Vectors are those datapoints that the margin pushes up against.
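Continuing the hypothetical scikit-learn sketch, the fitted model exposes exactly these datapoints:

```python
import numpy as np
from sklearn.svm import SVC

# Same hypothetical toy data as the earlier sketch.
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],
              [0.0, 0.5], [0.5, 0.0], [-0.5, 0.5]])
y = np.array([1, 1, 1, -1, -1, -1])
clf = SVC(kernel="linear", C=1e6).fit(X, y)

print(clf.support_vectors_)  # the datapoints the margin pushes up against
print(clf.support_)          # their indices into X
```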
Specifying a line and margin

[Figure: three parallel lines: the Plus-Plane, the Classifier Boundary, and the Minus-Plane. Above the Plus-Plane is the "Predict Class = +1" zone; below the Minus-Plane is the "Predict Class = -1" zone.]

• How do we represent this mathematically?
• …in m input dimensions?


Specifying a line and margin

[Figure: the same diagram, with the lines labelled wx + b = +1 (Plus-Plane), wx + b = 0 (Classifier Boundary), and wx + b = -1 (Minus-Plane).]

• Plus-plane = { x : w . x + b = +1 }
• Minus-plane = { x : w . x + b = -1 }

Classify as..
  +1 if w . x + b >= 1
  -1 if w . x + b <= -1
  Universe explodes if -1 < w . x + b < 1
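A literal Python transcription of this classification rule (a sketch, not from the slides); inside the margin it raises an error rather than exploding the universe:

```python
import numpy as np

def margin_classify(x, w, b):
    """Classify by the plus-plane/minus-plane rule."""
    s = np.dot(w, x) + b
    if s >= 1:
        return +1
    if s <= -1:
        return -1
    raise ValueError("-1 < w . x + b < 1: inside the margin, no prediction")
```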
Computing the margin width

M = Margin Width

How do we compute M in terms of w and b?

• Plus-plane = { x : w . x + b = +1 }
• Minus-plane = { x : w . x + b = -1 }

Claim: The vector w is perpendicular to the Plus Plane. Why?


Computing the margin width

• Plus-plane = { x : w . x + b = +1 }
• Minus-plane = { x : w . x + b = -1 }

Claim: The vector w is perpendicular to the Plus Plane. Why?
Let u and v be two vectors on the Plus Plane. What is w . (u - v)?
(Since w . u + b = 1 and w . v + b = 1, subtracting gives w . (u - v) = 0.)
And so of course the vector w is also perpendicular to the Minus Plane.
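A quick numeric check of the claim, with a hypothetical w and b: any two points u and v on the plus-plane satisfy w . u + b = w . v + b = 1, so w . (u - v) = 0:

```python
import numpy as np

w = np.array([3.0, 4.0])  # hypothetical weight vector
b = -2.0                  # hypothetical offset

def on_plus_plane(x1):
    """Construct a point on the plus-plane w . x + b = 1 by picking
    the first coordinate freely and solving for the second."""
    return np.array([x1, (1 - b - w[0] * x1) / w[1]])

u, v = on_plus_plane(0.0), on_plus_plane(5.0)
print(np.dot(w, u - v))  # 0.0: w is perpendicular to the plus-plane
```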
Computing the margin width

• Plus-plane = { x : w . x + b = +1 }
• Minus-plane = { x : w . x + b = -1 }
• The vector w is perpendicular to the Plus Plane
• Let x- be any point on the minus plane (any location in R^m: not necessarily a datapoint)
• Let x+ be the closest plus-plane-point to x-.


Computing the margin width

• Plus-plane = { x : w . x + b = +1 }
• Minus-plane = { x : w . x + b = -1 }
• The vector w is perpendicular to the Plus Plane
• Let x- be any point on the minus plane
• Let x+ be the closest plus-plane-point to x-.
• Claim: x+ = x- + λw for some value of λ. Why?


Computing the margin width

The line from x- to x+ is perpendicular to the planes. So to get from x- to x+, travel some distance in direction w.

• Plus-plane = { x : w . x + b = +1 }
• Minus-plane = { x : w . x + b = -1 }
• The vector w is perpendicular to the Plus Plane
• Let x- be any point on the minus plane
• Let x+ be the closest plus-plane-point to x-.
• Claim: x+ = x- + λw for some value of λ. Why?


Computing the margin width

What we know:
• w . x+ + b = +1
• w . x- + b = -1
• x+ = x- + λw
• |x+ - x-| = M

It's now easy to get M in terms of w and b.
Computing the margin width

What we know:
• w . x+ + b = +1
• w . x- + b = -1
• x+ = x- + λw
• |x+ - x-| = M

w . (x- + λw) + b = 1
=> w . x- + b + λ w . w = 1
=> -1 + λ w . w = 1
=> λ = 2 / (w . w)
Computing the margin width

M = Margin Width = 2 / sqrt(w . w)

What we know:
• w . x+ + b = +1
• w . x- + b = -1
• x+ = x- + λw
• |x+ - x-| = M
• λ = 2 / (w . w)

M = |x+ - x-| = |λw| = λ |w| = λ sqrt(w . w)
  = (2 / (w . w)) sqrt(w . w) = 2 / sqrt(w . w)
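Tying the derivation back to the earlier hypothetical sketch: with a fitted linear SVM, the margin width is 2 / sqrt(w . w):

```python
import numpy as np
from sklearn.svm import SVC

# Same hypothetical toy data as before.
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],
              [0.0, 0.5], [0.5, 0.0], [-0.5, 0.5]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
w = clf.coef_[0]

M = 2.0 / np.sqrt(np.dot(w, w))  # M = 2 / sqrt(w . w), as derived above
print("margin width M =", M)
```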
