
NPTEL

NPTEL ONLINE CERTIFICATION COURSE

Introduction to Machine Learning

Lecture 30

Prof. Balaraman Ravindran


Computer Science and Engineering
Indian Institute of Technology Madras

SVM Kernels

So if you remember, I asked you to note the fact that I am using an inner product there, right, $x_i^T x$ as the inner product of two vectors, and the way I wrote the dual also, I had only inner products in there. So in fact, if I want to evaluate the dual, I need to know only the inner products of pairs of vectors. Likewise, if I want to finally evaluate and use the classifier that I learn, I still only need to find inner products, right.

(Refer Slide Time: 01:24)

So if I can come up with a way of efficiently computing these inner products, I can do something interesting. So what is that? What do we normally do to make linear classifiers more powerful? Basis transformations. So I can just take my x and replace it with some function h(x) that gives me a larger basis. It could be as simple as replacing it with the square: I take x and replace it with x², and then I will get a larger basis. Now it turns out that after a fair amount of math I get a dual that looks like this. So that is the inner product notation, and if I can compute the inner product, I can solve the same kind of optimization problem, but I can do this in some other transformed space.
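For reference (the exact expression is on the slide), this is the standard soft-margin dual, with the inner product $x_i^T x_j$ replaced by the inner product of the transformed points:

$$\max_{\alpha}\; \sum_{i=1}^{N}\alpha_i \;-\; \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j\, y_i y_j\,\langle h(x_i),\, h(x_j)\rangle \quad \text{subject to} \quad 0 \le \alpha_i \le C,\;\; \sum_{i=1}^{N}\alpha_i y_i = 0.$$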
(Refer Slide Time: 02:11)

So likewise our f(x) is going to be like this. So essentially what I need to know is the inner product of h(x) and h(x') for whatever pair x and x' that I would like to consider. During training these are pairs of training points, right, while when I am actually using the classifier, one of them is a support point and the other is the input data point that I am looking at. At any point I just take these pairs of data points and I need to compute the inner product.
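For reference, in standard form the classifier evaluated in the transformed space is

$$f(x) = \sum_{i=1}^{N}\alpha_i y_i\,\langle h(x_i),\, h(x)\rangle + \beta_0,$$

where only the support points (those with $\alpha_i > 0$) contribute to the sum.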
(Refer Slide Time: 03:13)

So I am going to call this some function which is a kind of distance function or a similarity measure between h(x) and h(x'). Such similarity measures are also called kernels. So we have been hearing about kernels in the context of support vector machines; if you have been trying to use libsvm or any of the other tools for some projects over the summer, you have heard of kernels. Kernels are nothing but similarity functions. The nice thing about the kernels that we use is that they actually operate on x and x'. They operate on x and x', but they are computing the inner product of h(x) and h(x'). Did you see that? They are going to work with x and x', but they will be computing the inner product of h(x) and h(x').
(Refer Slide Time: 04:27)

So I will give you an example. The kernel function k should be symmetric and positive semi-definite (in some cases positive definite). People remember what positive definite is, right? $x^T A x > 0$ if it is positive definite and $x^T A x \ge 0$ if it is positive semi-definite. Essentially we want the quadratic forms to be positive.

We do not want to take $x^T A x$ and suddenly find it is negative. In fact, you remember I told you that $x^T A x$ is the kind of quadratic form we are optimizing, and things will get messed up big time in the computation if the quadratic form becomes negative; then we will have problems getting the whole optimization to go through. Okay, so that is the mechanistic reason for wanting it to be positive semi-definite. There is a much more fundamental reason for it, for which I have not developed the math or the intuition here for you to understand, so it has to come in a later course. Hopefully in the kernel methods course, if you are taking it, you will figure out why that is needed.
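A minimal sketch of what this requirement means in practice, assuming arbitrary sample points and the degree-2 polynomial kernel that comes up shortly: for any finite set of points, the Gram matrix $K_{ij} = k(x_i, x_j)$ should be symmetric with no negative eigenvalues.

```python
import numpy as np

def poly2_kernel(x, z):
    # Degree-2 polynomial kernel (discussed below): (1 + <x, z>)^2
    return (1.0 + np.dot(x, z)) ** 2

# Arbitrary sample points in R^2, for illustration only
X = np.array([[2.0, 3.0], [4.0, 5.0], [0.0, 1.0], [1.0, -2.0]])

# Gram matrix K_ij = k(x_i, x_j)
n = len(X)
K = np.array([[poly2_kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

# A valid kernel gives a symmetric, positive semi-definite Gram matrix
print("symmetric:", np.allclose(K, K.T))
print("smallest eigenvalue:", np.linalg.eigvalsh(K).min())  # >= 0 up to round-off
```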
So there are many choices which you can use for the kernels.
(Refer Slide Time: 07:26)

So there is something called the polynomial kernel, which is essentially $(1 + \langle x, x'\rangle)^d$. So d is a parameter you can set: d of two, three, four; you can even have d of one, which is essentially whatever we have solved so far. The next one is called the Gaussian kernel or the RBF kernel, where the similarity is given by $\exp(-\gamma\,\|x - x'\|^2)$, which is essentially the Gaussian without your normalizing factor. That is why it is called the RBF kernel; if you want to call it the Gaussian kernel you actually have to make it a Gaussian, otherwise call it the RBF kernel.

And then this is called the neural network kernel, or sometimes the sigmoidal kernel. This is just the hyperbolic tangent $\tanh(\kappa_1\,\langle x, x'\rangle + \kappa_2)$, with some arbitrary constants $\kappa_1$ and $\kappa_2$ which are parameters that you choose, and $\langle x, x'\rangle$ is the inner product. So these are some of the popular kernels which can be used for any generic data, but then, depending on the kind of data that you are looking at and where the data comes from, people do develop specialized kernels; for example, for string data people have come up with a lot of kernels.
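As a minimal sketch, the three generic kernels just listed can be written directly as functions of x and x'; each takes two points from the original space and returns a similarity value (the default parameter values below are arbitrary, not recommendations from the lecture):

```python
import numpy as np

def polynomial_kernel(x, z, d=2):
    # (1 + <x, z>)^d
    return (1.0 + np.dot(x, z)) ** d

def rbf_kernel(x, z, gamma=1.0):
    # exp(-gamma * ||x - z||^2): the Gaussian without the normalizing factor
    return np.exp(-gamma * np.sum((x - z) ** 2))

def sigmoid_kernel(x, z, k1=1.0, k2=0.0):
    # tanh(k1 * <x, z> + k2): the "neural network" / sigmoidal kernel
    return np.tanh(k1 * np.dot(x, z) + k2)
```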

When you want to compare strings, how do I look at similarity between strings? The nice thing about whatever we have done so far is that you can apply it not just to data that comes from $R^p$. You have been assuming so far that your x comes from some p-dimensional real space, but as long as you can define a proper kernel, you can apply this max-margin classification that we have done to any kind of data; it does not have to come from a real-valued space.

That is not true of many of the other things you have looked at; all of those inherently depend on the fact that the data is real valued. Because of this nice property, what is called the kernel trick, you can do all of these nice things: as long as you can define an appropriate kernel, you can actually apply this to any kind of data. So that is one very powerful idea.
(Refer Slide Time: 09:28)

So just to convince you, let us look at the polynomial kernel of degree two operating on vectors of two dimensions. There are two 2's here: the degree d is two and the dimension p is also two, but they need not necessarily be the same; I could have had a much larger dimension, but it was easy for me to write something small. So this is what we get.
(Refer Slide Time: 10:33)

So for p = 2, $k(x, x') = (1 + \langle x, x'\rangle)^2 = (1 + x_1 x_1' + x_2 x_2')^2 = 1 + 2x_1 x_1' + 2x_2 x_2' + x_1^2 x_1'^2 + x_2^2 x_2'^2 + 2x_1 x_1' x_2 x_2'$. I have just squared it; now if you think of what h must be, we get the following.


(Refer Slide Time: 11:18)

So what is this function h? It is essentially the quadratic basis expansion. So I have two features, $x_1$ and $x_2$; remember that x is $(x_1, x_2)$. So this is essentially the quadratic expansion: the first coordinate is 1, the second coordinate is $\sqrt{2}\,x_1$, the third coordinate is $\sqrt{2}\,x_2$, the fourth coordinate is $x_1^2$, the fifth coordinate is $x_2^2$, and the sixth coordinate is $\sqrt{2}\,x_1 x_2$. It is all the quadratic basis expansion. Now if I make this operate on x and x' and take the inner product, what will be the terms? $1,\ 2x_1 x_1',\ 2x_2 x_2',\ x_1^2 x_1'^2,\ x_2^2 x_2'^2,\ 2x_1 x_1' x_2 x_2'$, which is exactly what we have here, right. So the nice thing about it is that I can essentially compute the inner product of x and x' first, add 1 and square it, and numerically what I will end up with is the same as what I would have ended up with if I had done the basis expansion and then taken the inner product.
(Refer Slide Time: 13:05)

If I had just taken whatever are the original vectors, let us say I have some (2, 3) and (4, 5), then instead of doing this basis expansion and then computing the inner product, I can just take the inner product right away: $(1 + 2\cdot 4 + 3\cdot 5)^2 = 24^2 = 576$, and that is essentially what the answer would be. Well, for degree 2 it might not seem like a great saving, but what about a degree-15 polynomial? I am essentially doing similar amounts of computation, except that I have to raise something to the power of 15.
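A minimal sketch, just to verify numerically that the kernel trick and the explicit basis expansion agree; the vectors (2, 3) and (4, 5) are the ones from the example above, and the function names are mine:

```python
import math
import numpy as np

def poly2_kernel(x, z):
    # Degree-2 polynomial kernel computed directly in R^2
    return (1.0 + np.dot(x, z)) ** 2

def h(x):
    # Explicit quadratic basis expansion into R^6
    x1, x2 = x
    return np.array([1.0,
                     math.sqrt(2) * x1,
                     math.sqrt(2) * x2,
                     x1 ** 2,
                     x2 ** 2,
                     math.sqrt(2) * x1 * x2])

x, z = np.array([2.0, 3.0]), np.array([4.0, 5.0])
print(poly2_kernel(x, z))   # 576.0, computed in R^2
print(np.dot(h(x), h(z)))   # 576.0, computed in R^6
```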
That is basis expansion; if you thought something else about basis expansion, please correct it, this is basis expansion. So I take the original data, and as I said, you could have new components like sin x or cos x, it does not matter; you could think of a variety of different ways of expanding the basis. In this case I am just doing the quadratic basis expansion.
So given whatever we have done so far, this whole idea of kernels arrives rather straightforwardly. What I cannot write down for you is the basis expansion for the RBF kernel: it turns out that the computation it is doing is actually in an infinite-dimensional vector space. Here the computation is in a six-dimensional space. I took data points from a two-dimensional space and did, in effect, a computation in a six-dimensional space, and I gave you back the answer, but all the time I was doing computation only in the two-dimensional space: I only took the inner product of the two vectors, added 1 and squared it, so I am essentially doing computations only in $R^2$, while the actual number I am returning to you is the result of a computation done in $R^6$. That is why it is called the kernel trick.

Likewise, for the RBF kernel, I will do something in whatever is the original dimensional space you give me, but the resulting computation has an interpretation in some infinite-dimensional vector space; in that case it is not even easy to write it down. That is why RBF kernels are powerful: they work on a variety of data. But they are not all-powerful, so you have to be careful about that. So that is essentially all there is to support vector machines; we have now covered kernels for support vector machines as well.

So, I don't know, those of you who have used libsvm or one such tool will know that for RBF kernels you would have to tune two parameters. One is C, which we already saw; that is essentially how much penalty you are giving to the slack. The other one you will tune is γ, essentially this term here; it is some kind of width parameter for your Gaussian, it controls how wide your Gaussian is. So those are the two parameters you tune. For polynomial kernels you have d and you have your C, and for sigmoidal kernels you have the constants $\kappa_1$ and $\kappa_2$ and you have C. This form of defining a support vector machine is called C-SVM, okay.
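As an illustration of the tuning parameters just mentioned, here is a minimal sketch using scikit-learn's SVC (a C-SVM implementation built on libsvm); the data and parameter values are arbitrary placeholders, not recommendations from the lecture:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: two Gaussian blobs in R^2 (arbitrary, for illustration only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# RBF kernel: tune C (penalty) and gamma (width of the Gaussian)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X, y)

# Polynomial kernel: tune C and the degree d
poly_svm = SVC(kernel="poly", degree=3, C=1.0).fit(X, y)

# Sigmoidal kernel: tanh(gamma * <x, x'> + coef0), i.e. the two constants
sig_svm = SVC(kernel="sigmoid", gamma=0.1, coef0=0.0, C=1.0).fit(X, y)

print("support vectors per class (RBF):", rbf_svm.n_support_)
```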

There are other constraints that you can impose on it, not just the penalty on the ξ's (the slack variables); you can impose a penalty on the number of support vectors you consider. Suppose I run on the data and it comes back and says, okay, everything is a support vector; that is not something interesting. How can everything be a support vector? Can all the data points be equidistant from the separating hyperplane? Not if you are considering a linear kernel, but when I am considering RBF kernels, the separating hyperplane can be very, very complex.

In which case you might end up with a lot of support vectors. Typically, if you have not thought too much about it and you are setting some very high values for C and trying to run this thing, you will end up with something like sixty percent of your data being support vectors. So instead of trying to fix that empirically, second-guessing why you got so many support vectors and trying different C, different γ, and so on and so forth, you can use something called the nu-SVM (not "new" but ν, nu), which lets you put a bound on the fraction of support vectors you are going to get. You can say, do the best you can, but do not give me more than 30 support vectors, something to that effect.
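A minimal sketch of the ν-SVM idea using scikit-learn's NuSVC; note that in this implementation ν is specified as a fraction of the training set, acting as an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors. The data below is an arbitrary placeholder:

```python
import numpy as np
from sklearn.svm import NuSVC, SVC

# Toy data: two blobs in R^2 (arbitrary)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# C-SVM with a carelessly large C can keep many support vectors
c_svm = SVC(kernel="rbf", C=1e6, gamma=5.0).fit(X, y)

# nu-SVM: control the support-vector budget through nu instead of C
nu_svm = NuSVC(kernel="rbf", nu=0.1, gamma=5.0).fit(X, y)

print("C-SVM support vectors: ", c_svm.n_support_.sum())
print("nu-SVM support vectors:", nu_svm.n_support_.sum())
```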

IIT Madras Production

Funded by
Department of Higher Education
Ministry of Human Resource Development
Government of India

www.nptel.ac.in

Copyrights Reserved
