Lecture - 12
Support Vector Machine - II
Hello, welcome to the NPTEL Online Certification course on Deep Learning. You remember
in the previous class we started our discussion on the Support Vector Machine. So, in today’s
lecture, we will continue with the same discussion.
So, in the previous class we gave a brief introduction to what the Support Vector Machine is, and today we are going to talk about the design approach of a Support Vector Machine.
(Refer Slide Time: 00:59)
So, we have seen that in our case we will again assume a two-class problem.
So, we have the feature vectors given from two classes, $\omega_1$ and $\omega_2$, and we assume that all the training vectors are given as labeled pairs, in the sense that the i-th training vector $X_i$ will be given as a pair $(X_i, y_i)$, where $y_i$ indicates the label. So, if the training vector $X_i$ is taken from class $\omega_1$, then we will set the label $y_i$ to be +1, else -1.
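As a minimal sketch of this setup in Python (the feature values here are made-up assumptions, used only to fix the notation for later snippets):

```python
import numpy as np

# Hypothetical labeled training set: each row of X is a training vector
# X_i, and y_i = +1 if X_i comes from class omega_1, -1 if from omega_2.
X = np.array([[2.0, 3.0],   # class omega_1
              [3.0, 3.0],   # class omega_1
              [0.0, 1.0],   # class omega_2
              [1.0, 0.0]])  # class omega_2
y = np.array([+1, +1, -1, -1])
```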
If this is the separating plane between the feature vectors belonging to the two classes, then for every training vector $X$ belonging to class $\omega_1$ this condition must be satisfied: $a^T X + b > 0$.
In the same manner, if I take a feature vector $X$ from class $\omega_2$, where this feature vector $X$ falls on the negative side of the linear boundary, the condition $a^T X + b < 0$ must be satisfied.
(Refer Slide Time: 03:19)
$$y_i(a^T X_i + b) > 0$$
if $X_i$ is correctly classified by the separating plane $a^T X + b = 0$, and this will be less than 0 if $X_i$ is misclassified by the separating plane.
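A quick check of this sign condition in code, continuing the snippet above (the plane parameters a and b below are assumptions chosen only for illustration):

```python
# Candidate separating plane a^T X + b = 0 (illustrative values).
a = np.array([1.0, 1.0])
b = -3.0

# y_i * (a^T X_i + b) is positive exactly when X_i is correctly classified.
margins = y * (X @ a + b)
print(margins > 0)   # [ True  True  True  True ] for this toy data
```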
So, for different values of a and b, I can obtain different separating planes, and maybe many of those separating planes will satisfy the same condition, that is,
$$y_i(a^T X_i + b) > 0$$
(Refer Slide Time: 04:55)
Now, for different values of the vector a and for different values of the bias b, I get different such planes, but each such plane will have a different margin, or a different confidence level of classification. So, what is that?
Now, given this, you find that if I take this particular separating plane, it gives me a margin which is given by this; the distance between these two planes gives me the margin, or the confidence level given by this particular classifier. Similarly, if I take another separating plane, say this one, here again you find that the margin is given by this much. So obviously, the margin given in this option is less than the margin given in the previous option.
To continue further, if I take this separating plane, then again the margin is given by this. So, out of so many options, which one should be preferred? That is exactly what the Support Vector Machine does. The Support Vector Machine tries to get a separating plane which maximizes the margin, and for that, the separating plane should be at a maximal distance from the vectors belonging to both the classes. That means it should try to maximize the distance of the separating plane from the vectors belonging to class $\omega_1$, and it should also try to maximize the distance from the vectors belonging to class $\omega_2$, right.
So, I should try to obtain that particular separating plane which maximizes this margin, and for classification my rule is:
$$y_i(a^T X_i + b) > 0$$
This is for classification, but as I am talking about the margin, I want that, for correct and reliable classification, the distance of every $X_i$ from the separating plane must be more than a certain threshold. As we said earlier, a measure of that distance is given by $a^T X_i + b$.
So, if $a^T X_i + b = 0$, that means $X_i$ falls on the separating plane, in which case the distance of $X_i$ from the separating plane is 0. For any non-zero value, if $X_i$ is taken from class 1, then I must have $a^T X_i + b$ greater than a certain threshold, say d, and if $X_i$ is taken from class 2, then $a^T X_i + b$ should be less than $-d$, and this should be true for all the training samples, whether they are taken from class 1 or from class 2.
So, if $X_i$ is taken from class 1, then $a^T X_i + b > d$ should be satisfied, and if the training sample $X_i$ is taken from class 2, then $a^T X_i + b < -d$ should be satisfied. By combining the two, I have a uniform criterion, that is, $y_i(a^T X_i + b) > d$, irrespective of whichever class this training sample $X_i$ has been obtained from. What I can do is, I can always normalize this expression by d.
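Written out, this normalization is just a division by the threshold d, after which the rescaled a and b absorb the factor:
$$y_i\,(a^T X_i + b) \geq d \;\Longrightarrow\; y_i\!\left(\frac{a^T}{d} X_i + \frac{b}{d}\right) \geq 1,$$
so, renaming $a/d$ as $a$ and $b/d$ as $b$, the design condition becomes $y_i(a^T X_i + b) \geq 1$.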
So, while designing, I can have the condition that $y_i(a^T X_i + b)$ should be greater than or equal to 1, and I will use this approach while designing the classifier, or while choosing the separating plane. But for classification, once I fix what a and b should be after designing the separating plane using the training vectors, then for any unknown X my classification rule will be: $a^T X + b > 0$ indicates that X belongs to class 1, else to class 2.
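As a small sketch of this decision rule, reusing the illustrative a and b from the earlier snippet:

```python
def classify(x_new, a, b):
    """Assign class 1 if a^T x + b > 0, else class 2."""
    return 1 if float(a @ x_new + b) > 0 else 2

print(classify(np.array([2.5, 2.5]), a, b))  # -> 1 (positive side)
print(classify(np.array([0.5, 0.5]), a, b))  # -> 2 (negative side)
```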
(Refer Slide Time: 11:24)
So, right now our aim is that I should choose the separating plane $a^T X + b = 0$ which satisfies the condition $y_i(a^T X_i + b) \geq 1$; that is after normalization. So, how can I do that?
So, what I am saying is that I should take that particular separating plane which maximizes this margin. So, how can I obtain this margin and how can I maximize it? For that, let us take one vector on this margin plane, say $X^+$, and I will take another vector on the opposite margin plane, say $X^-$. So, $X^+$ is taken within the class 1 region and $X^-$ is taken within the class 2 region.
So, $(X^+ - X^-)$ is a vector drawn from $X^-$ to $X^+$, and once I have this vector, then from here you find that I can obtain the margin as the dot product of the vector $(X^+ - X^-)$ with the unit vector in the direction of a, right.
(Refer Slide Time: 13:24)
The dot product of the vector $(X^+ - X^-)$ with the unit vector in the direction of a, which is nothing but orthogonal to the separating plane, gives the margin, and the unit vector in this direction is given by $a/\|a\|$. Since $X^+$ and $X^-$ lie on the two margin planes, $a^T X^+ + b = 1$ and $a^T X^- + b = -1$, so $a^T(X^+ - X^-) = 2$. So, the margin that you get is:
$$\frac{a^T}{\|a\|}(X^+ - X^-) = \frac{2}{\|a\|}$$
So, as we said earlier, I aim to choose that particular separating plane which maximizes the margin, and the margin comes out to be $2/\|a\|$.
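A numeric check of this $2/\|a\|$ formula on the toy data, again with the illustrative a and b assumed earlier (a real SVM would produce its own):

```python
# Rescale (a, b) so that the closest training points satisfy
# y_i (a^T X_i + b) = 1; the width of the margin is then 2 / ||a||.
scale = np.min(y * (X @ a + b))      # smallest value of y_i (a^T X_i + b)
a_n, b_n = a / scale, b / scale
print(2.0 / np.linalg.norm(a_n))     # margin width, here 2*sqrt(2) ≈ 2.83
```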
So, I should choose that particular a which maximizes this, and here you find that, obviously, as $\|a\|$ comes in the denominator, I could make this term indefinitely large by making $\|a\|$ smaller and smaller; but that is not the solution, because the a and b that I choose must also satisfy the requirement that $y_i(a^T X_i + b) \geq 1$. So, I have to minimize $\|a\|$ subject to the constraint that $y_i(a^T X_i + b) \geq 1$. So, it becomes a constrained optimization problem and, as you know, to solve a constrained optimization problem we have to make use of a Lagrangian.
So, here what I have to do is, I have to form a Lagrangian using this particular constraint.
(Refer Slide Time: 17:44)
The Lagrangian collects one such constraint term, with a multiplier $\alpha_i \geq 0$, for each of the training vectors which are given for designing the Support Vector Machine, in the same manner:
$$L(a, b, \alpha) = \frac{1}{2}\|a\|^2 - \sum_i \alpha_i \left[\, y_i (a^T X_i + b) - 1 \,\right]$$
(Refer Slide Time: 22:03)
So, now let us see what Lagrangian we had. Setting the derivatives of $L(a, b, \alpha)$ with respect to a and b to zero gives $a = \sum_i \alpha_i y_i X_i$ and $\sum_i \alpha_i y_i = 0$, and substituting back, we had the Lagrangian in its dual form:
$$L(\alpha) = \sum_i \alpha_i - \frac{1}{2}\sum_i \sum_j \alpha_i \alpha_j y_i y_j X_i^T X_j$$
So, now you can make use of any optimization tool to optimize $L(\alpha)$ with respect to the $\alpha$'s, and the set of such $\alpha$'s which maximizes this $L(\alpha)$ can give you my solution vector a.
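As a sketch of this optimization step in Python, continuing the earlier snippets (SciPy is assumed to be available; a dedicated QP solver would normally be preferred for larger problems):

```python
from scipy.optimize import minimize

# G_ij = y_i y_j X_i . X_j, the Gram matrix of the dual objective.
G = (y[:, None] * X) @ (y[:, None] * X).T

def neg_dual(alpha):
    # Negated dual objective, since scipy minimizes rather than maximizes.
    return 0.5 * alpha @ G @ alpha - alpha.sum()

constraint = {"type": "eq", "fun": lambda alpha: alpha @ y}  # sum_i alpha_i y_i = 0
res = minimize(neg_dual, np.zeros(len(y)),
               bounds=[(0.0, None)] * len(y), constraints=constraint)
alpha = res.x

# Recover the solution vector a and the bias b from the multipliers.
a_star = (alpha * y) @ X                   # a = sum_i alpha_i y_i X_i
sv = alpha > 1e-6                          # indices of the support vectors
b_star = np.mean(y[sv] - X[sv] @ a_star)   # from y_i (a^T X_i + b) = 1
```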
And once you have the solution vector a, you get your separating plane, and this is the separating plane which maximizes the margin; in other words, this separating plane will give you a robust linear classifier.
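In practice, a library implementation is typically used instead of hand-rolling the optimization; for instance, with scikit-learn (assuming it is installed), a linear SVM on the same toy data would be:

```python
from sklearn.svm import SVC

clf = SVC(kernel="linear", C=1e6)   # a large C approximates the hard margin
clf.fit(X, y)
print(clf.coef_, clf.intercept_)    # the learned a and b
```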
So, today what we have done is, we have tried to find a linear boundary between the feature vectors taken from two different classes, 1 and 2, and using the Support Vector Machine we have tried to find one such linear separating plane, lying between the two margin planes, in such a manner that this separator maximizes the margin between the vectors belonging to class 1 and the vectors belonging to class 2.
So far, whatever we have discussed, whether it is a linear discriminator or a Support Vector Machine, we have considered only a two-class problem. So, next we will generalize this and try to find out how we can extend similar concepts to multi-class problems. With this, I stop here today.
Thank you.