Lecture 14


Support Vector Machines

(Informal: Version 0)
Introduction



Linear Classifier
Classifier:
If f(x1, x2) < 0, assign Class 1;
If f(x1, x2) > 0, assign Class 2;
where the decision boundary is f(x1, x2) = w1x1 + w2x2 + b = 0.
[Figure: the line f(x1, x2) = 0 in the (x1, x2) plane, with Class 1 on one side and Class 2 on the other.]


Perceptron
• Perceptron is the name given to this kind of linear classifier.
• If there exists a Perceptron that correctly classifies all training examples, then we say that the training set is linearly separable.
• Different Perceptron learning techniques are available.
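One classic Perceptron learning rule (a sketch of the standard algorithm, not taken from the slides; the data and names are made up) nudges the weights toward every misclassified example until all training examples are classified correctly:

import numpy as np

def perceptron_train(X, y, epochs=100, lr=1.0):
    # Labels are assumed to be +1 / -1.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi
                b += lr * yi
                updated = True
        if not updated:                          # converged: every example is correct
            break
    return w, b

X = np.array([[1.0, 1.0], [2.0, 0.5], [4.0, 4.0], [5.0, 3.5]])
y = np.array([-1, -1, 1, 1])
w, b = perceptron_train(X, y)
print(w, b)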



Perceptron – Let us begin with linearly separable data
• For linearly separable data, many Perceptrons are possible that correctly classify the training set.
• All of them do equally well on the training set, so which one is good on the unseen test set?
[Figure: several separating lines, each of which correctly separates Class 1 from Class 2.]



Hard Linear SVM
• The best Perceptron for linearly separable data is called the "hard linear SVM".
• For each linear function we can define its margin.
• The linear function which has the maximum margin is the best one.
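A hedged sketch, assuming scikit-learn is available (not part of the slides): the hard-margin fit can be approximated with a linear SVC and a very large penalty C, after which the fitted weights and the points that determine the margin (the support vectors, introduced a few slides later) can be read off. The data here is made up.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters (Class 1 -> label -1, Class 2 -> label +1).
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [5.0, 5.0], [6.0, 5.5], [5.5, 6.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C leaves almost no room for margin violations,
# so the fit behaves like the hard-margin SVM.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

print("w =", clf.coef_[0], "b =", clf.intercept_[0])
print("support vectors:\n", clf.support_vectors_)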



[Figure: two different separating hyperplanes for the same Class 1 / Class 2 data, each shown with its margin.]
Maximizing the Margin
IDEA: Select the separating hyperplane that maximizes the margin!
[Figure: candidate separating hyperplanes in the (Var1, Var2) plane, each shown with its margin width.]
Support Vectors
[Figure: the maximum-margin hyperplane in the (Var1, Var2) plane with its margin width; the training points lying on the margin are the support vectors.]
What if the data is not linearly separable?
But solving a non-linear problem is mathematically more difficult.
[Figure: two-class data in the (Var1, Var2) plane that no straight line can separate.]
Kernel Mapping



An example
[Figure: data labelled y = -1 and y = +1 that is not linearly separable in the Input Space becomes linearly separable after mapping to the Feature Space.]


The Trick !!
• There is no need to do this mapping explicitly.
• For some mappings, the dot product in the feature space can be expressed as a function in the Input space.
• φ(X1) · φ(X2) = k(X1, X2)
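A minimal sketch (not from the slides) of this identity for one concrete kernel: in two dimensions, the degree-2 polynomial kernel k(X1, X2) = (X1 · X2)^2 corresponds to the explicit map φ(x) = (x1^2, √2·x1·x2, x2^2), so the kernel value equals the feature-space dot product without ever computing φ.

import numpy as np

def phi(x):
    # Explicit feature map for the degree-2 polynomial kernel (no constant term).
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def k(x, z):
    # The same quantity computed directly in the input space.
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
print(np.dot(phi(x), phi(z)))   # ~16: dot product in the feature space
print(k(x, z))                  # 16.0: kernel evaluated in the input space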



• So, if the solution is going to involve only dot products, then it can be obtained using the kernel trick (of course, an appropriate kernel function has to be chosen).

• The problem is that, with powerful kernels like the "Gaussian kernel", it is possible to learn a non-linear classifier which does extremely well on the training set.


Discriminant functions: non-linear

This classifier makes zero mistakes on the training set.



Other important issues …
• This classifier does very well as far as the training data is concerned.
• But this does not guarantee that the classifier works well on a data element which is not in the training set (that is, on unseen data).
• This is overfitting the classifier to the training data.
• Maybe we are also fitting the noise (there might be mistakes made while taking the measurements).
• The ability "to perform well on unseen test patterns too" is called the generalization ability of the classifier.



Generalization ability
• This has been discussed extensively.
• It is argued that the simpler classifier will have better generalization ability (e.g., Occam's razor: between two solutions, if everything else is the same, choose the simpler one).
• How to quantify this?
• (Training error + a measure of complexity) should be taken into account while designing the classifier (a standard formulation is sketched after this list).
• Support vector machines have been shown to have better generalization ability.
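A standard way of making "training error + a measure of complexity" precise is the regularized-risk form below (this particular formula is not written on the slide):

% Regularized risk: training error plus a complexity penalty.
% L is a loss function, Omega(f) measures the complexity of f, and lambda >= 0 trades them off.
\min_{f}\; \frac{1}{n}\sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) \;+\; \lambda\,\Omega(f)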



Discriminant functions …

This one has some training error, but it is a relatively simple classifier.



Overfitting and underfitting

[Figure: three fits to the same data, illustrating underfitting, a good fit, and overfitting.]



Soft SVM
• Allow some mistakes on the training set!
• But only in order to achieve a better margin.
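A hedged sketch, assuming scikit-learn (not part of the slides): in SVC the penalty C controls exactly this trade-off, with a small C tolerating more training mistakes in exchange for a wider margin and a very large C behaving like the hard SVM. The data is made up (two overlapping clusters, so perfect separation is impossible).

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.2, size=(40, 2)),
               rng.normal(2.0, 1.2, size=(40, 2))])
y = np.array([-1] * 40 + [1] * 40)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:<6} training accuracy={clf.score(X, y):.2f} "
          f"support vectors={len(clf.support_vectors_)}")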



