
10. Kernel Methods | Project 2: Digit recognition ... https://courses.edx.org/courses/course-v1:MITx+...


10. Kernel Methods


As you can see, implementing a direct mapping to the high-dimensional features is a lot of work (imagine using an even
higher-dimensional feature mapping). This is where the kernel trick becomes useful.

Recall the kernel perceptron algorithm we learned in the lecture. The weights θ can be represented by a linear combination of
features:

θ = ∑_{i=1}^{n} α^{(i)} y^{(i)} ϕ(x^{(i)})

In the softmax regression formulation, we can also apply this representation of the weights:

θ_j = ∑_{i=1}^{n} α_j^{(i)} y^{(i)} ϕ(x^{(i)}).

The softmax probabilities then become

h(x) = (1 / ∑_{j=1}^{k} e^{[θ_j·ϕ(x)/τ] − c}) · [ e^{[θ_1·ϕ(x)/τ] − c}, e^{[θ_2·ϕ(x)/τ] − c}, …, e^{[θ_k·ϕ(x)/τ] − c} ]^T

and, substituting the representation of θ_j,

h(x) = (1 / ∑_{j=1}^{k} e^{[∑_{i=1}^{n} α_j^{(i)} y^{(i)} ϕ(x^{(i)})·ϕ(x)/τ] − c}) · [ e^{[∑_{i=1}^{n} α_1^{(i)} y^{(i)} ϕ(x^{(i)})·ϕ(x)/τ] − c}, …, e^{[∑_{i=1}^{n} α_k^{(i)} y^{(i)} ϕ(x^{(i)})·ϕ(x)/τ] − c} ]^T.

Notice that we never need the mapping ϕ(x) itself, only the inner product between two mapped features, ϕ(x^{(i)})·ϕ(x),
where x^{(i)} is a point in the training set and x is the new data point for which we want to compute the probability. If we can
create a kernel function K(x, y) = ϕ(x)·ϕ(y) for any two points x and y, we can then kernelize our softmax regression
algorithm.
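To see the trick concretely, take the homogeneous quadratic map ϕ(x) = (x₁², x₂², √2·x₁x₂) in two dimensions (a simplified example for illustration, not the course's mapping): its inner products can be computed directly in the original space as (x·y)², with no explicit mapping.

```python
import numpy as np

# Explicit quadratic feature map for a 2-D point: phi(x) = (x1^2, x2^2, sqrt(2) x1 x2)
def phi(x):
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

lhs = phi(x) @ phi(y)   # inner product in the mapped space
rhs = (x @ y) ** 2      # kernel evaluated in the original space
print(np.isclose(lhs, rhs))  # True: K(x, y) = phi(x) . phi(y)
```

The kernel evaluation touches only the 2-dimensional inputs, which is why the trick scales to very high-dimensional (even infinite-dimensional) feature maps.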

1 of 6 2020-03-25, 7:33 p.m.


You will be working in the files part1/main.py and part1/kernel.py in this problem.

Implementing Polynomial Kernel


In the last section, we explicitly created a cubic feature mapping. Now, suppose we want to map d-dimensional features into a
polynomial space; for degree 2, the mapping is

ϕ(x) = ⟨x_d², …, x_1², √2·x_d x_{d−1}, …, √2·x_d x_1, √2·x_{d−1} x_{d−2}, …, √2·x_{d−1} x_1, …, √2·x_2 x_1, √(2c)·x_d, …, √(2c)·x_1, c⟩.

Write a function polynomial_kernel that takes in two matrices X and Y and computes the polynomial kernel K(x, y) for
every pair of rows x in X and y in Y.

Available Functions: You have access to the NumPy python library as np


Correct

def polynomial_kernel(X, Y, c, p):
    """
    Compute the polynomial kernel between two matrices X and Y::
        K(x, y) = (<x, y> + c)^p
    for each pair of rows x in X and y in Y.

    Args:
        X - (n, d) NumPy array (n datapoints each with d features)
        Y - (m, d) NumPy array (m datapoints each with d features)
        c - a coefficient to trade off high-order and low-order terms (scalar)
        p - the degree of the polynomial kernel

    Returns:
        kernel_matrix - (n, m) NumPy array containing the kernel matrix
    """
    # Pairwise inner products for every (row of X, row of Y), shifted by c,
    # then raised to the degree p
    K = (X @ Y.T + c) ** p
    return K
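A quick way to sanity-check the vectorized kernel is to compare entries against the per-pair formula; the snippet below re-defines polynomial_kernel so it runs on its own (the shapes and random data are just for illustration):

```python
import numpy as np

def polynomial_kernel(X, Y, c, p):
    # Vectorized polynomial kernel: K[i, j] = (<X[i], Y[j]> + c)^p
    return (X @ Y.T + c) ** p

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
Y = rng.standard_normal((5, 3))
K = polynomial_kernel(X, Y, c=1.0, p=2)

# Entry (i, j) should equal (<x_i, y_j> + c)^p computed one pair at a time
i, j = 2, 4
assert np.isclose(K[i, j], (X[i] @ Y[j] + 1.0) ** 2)
print(K.shape)  # (4, 5)
```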



Gaussian RBF Kernel


Another commonly used kernel is the Gaussian RBF kernel. Similarly, write a function rbf_kernel that takes in two matrices
X and Y and computes the RBF kernel K(x, y) for every pair of rows x in X and y in Y.

Available Functions: You have access to the NumPy python library as np


Correct


def rbf_kernel(X, Y, gamma):
    """
    Compute the Gaussian RBF kernel between two matrices X and Y::
        K(x, y) = exp(-gamma ||x-y||^2)
    for each pair of rows x in X and y in Y.

    Args:
        X - (n, d) NumPy array (n datapoints each with d features)
        Y - (m, d) NumPy array (m datapoints each with d features)
        gamma - the gamma parameter of the Gaussian function (scalar)

    Returns:
        kernel_matrix - (n, m) NumPy array containing the kernel matrix
    """
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2<x, y>, computed for all pairs at once
    sq_norms_X = np.sum(X ** 2, axis=1)[:, np.newaxis]  # shape (n, 1)
    sq_norms_Y = np.sum(Y ** 2, axis=1)[np.newaxis, :]  # shape (1, m)
    sq_dists = sq_norms_X + sq_norms_Y - 2 * (X @ Y.T)  # shape (n, m)
    return np.exp(-gamma * sq_dists)
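As a quick sanity check, the kernel matrix of a dataset with itself should have ones on the diagonal (since K(x, x) = e⁰ = 1) and be symmetric. The snippet below re-derives the kernel from the squared-distance identity so it is self-contained; the data and gamma value are arbitrary:

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2<x, y> for all pairs, then exponentiate
    sq_dists = (np.sum(X ** 2, axis=1)[:, np.newaxis]
                + np.sum(Y ** 2, axis=1)[np.newaxis, :]
                - 2 * (X @ Y.T))
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 2))
K = rbf_kernel(X, X, gamma=0.5)

assert np.allclose(np.diag(K), 1.0)  # K(x, x) = exp(0) = 1
assert np.allclose(K, K.T)           # a kernel matrix on one dataset is symmetric
```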


Now, try implementing softmax regression using kernelized features. You will have to rewrite the softmax_regression
function in softmax.py, as well as the auxiliary functions compute_cost_function, compute_probabilities, and
run_gradient_descent_iteration.
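Concretely, the rewrite replaces every θ_j·ϕ(x) with a dual sum over training points, which a precomputed kernel matrix supplies. One possible shape for a kernelized compute_probabilities is sketched below; the signature, argument names, and the choice to learn the dual weights α directly are assumptions for illustration, not the course's reference solution:

```python
import numpy as np

def compute_probabilities_kernel(K, alpha, temp):
    """Softmax probabilities in the dual (kernelized) parameterization.

    K     - (n_train, m) kernel matrix, K[i, t] = kernel(x_train_i, x_t)
    alpha - (k, n_train) dual weights, one row per class (assumed names)
    temp  - temperature parameter tau
    Returns a (k, m) matrix of class probabilities for the m query points.
    """
    logits = (alpha @ K) / temp        # theta_j . phi(x) = sum_i alpha_ji K(x_i, x)
    logits -= np.max(logits, axis=0)   # the constant c: subtract max for stability
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits, axis=0)

# Toy shapes: 6 training points, 4 query points, 3 classes
rng = np.random.default_rng(0)
K = rng.standard_normal((6, 4))
alpha = rng.standard_normal((3, 6))
P = compute_probabilities_kernel(K, alpha, temp=1.0)
print(P.shape)  # (3, 4)
```

Each column of P sums to 1, and the cost and gradient-descent functions can be rewritten against α and K in the same way.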

How does the test error change?

In this project, you have become familiar with the MNIST dataset for digit recognition, a popular task in computer vision.

You have implemented a linear regression, which turned out to be inadequate for this task. You have also learned how to use
scikit-learn's SVM for binary classification and multiclass classification.

Then, you have implemented your own softmax regression using gradient descent.

Finally, you experimented with different hyperparameters, different labels, and different features, including kernelized
features.

In the next project, you will apply neural networks to this task.

