Kernel Methods
Recall the kernel perceptron algorithm we learned in the lecture. The weights θ can be represented by a linear combination of
features:
$$\theta = \sum_{i=1}^{n} \alpha^{(i)} y^{(i)} \phi\left(x^{(i)}\right)$$
In the softmax regression formulation, we can also apply this representation of the weights:

$$\theta_j = \sum_{i=1}^{n} \alpha_j^{(i)} y^{(i)} \phi\left(x^{(i)}\right).$$
The softmax hypothesis

$$h(x) = \frac{1}{\sum_{j=1}^{k} e^{\left[\theta_j \cdot \phi(x)/\tau\right] - c}} \begin{bmatrix} e^{\left[\theta_1 \cdot \phi(x)/\tau\right] - c} \\ e^{\left[\theta_2 \cdot \phi(x)/\tau\right] - c} \\ \vdots \\ e^{\left[\theta_k \cdot \phi(x)/\tau\right] - c} \end{bmatrix}$$

then becomes

$$h(x) = \frac{1}{\sum_{j=1}^{k} e^{\left[\sum_{i=1}^{n} \alpha_j^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c}} \begin{bmatrix} e^{\left[\sum_{i=1}^{n} \alpha_1^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c} \\ e^{\left[\sum_{i=1}^{n} \alpha_2^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c} \\ \vdots \\ e^{\left[\sum_{i=1}^{n} \alpha_k^{(i)} y^{(i)} \phi(x^{(i)}) \cdot \phi(x)/\tau\right] - c} \end{bmatrix}$$
Notice that we never need the explicit mapping ϕ(x) itself, only the inner product between two mapped feature vectors, ϕ(x^(i)) ⋅ ϕ(x), where x^(i) is a point in the training set and x is the new data point for which we want to compute the probability. If we can find a kernel function K(x, y) = ϕ(x) ⋅ ϕ(y) for any two points x and y, we can kernelize our softmax regression algorithm.
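Written out, the j-th component of the kernelized hypothesis then depends on the data only through kernel evaluations (this is simply the equation above with K(x^(i), x) substituted for ϕ(x^(i)) ⋅ ϕ(x)):

$$h(x)_j = \frac{e^{\left[\sum_{i=1}^{n} \alpha_j^{(i)} y^{(i)} K(x^{(i)}, x)/\tau\right] - c}}{\sum_{l=1}^{k} e^{\left[\sum_{i=1}^{n} \alpha_l^{(i)} y^{(i)} K(x^{(i)}, x)/\tau\right] - c}}$$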
For example, the degree-2 polynomial kernel corresponds to the explicit feature map

$$\phi(x) = \left\langle x_d^2, \ldots, x_1^2, \sqrt{2}\, x_d x_{d-1}, \ldots, \sqrt{2}\, x_d x_1, \sqrt{2}\, x_{d-1} x_{d-2}, \ldots, \sqrt{2}\, x_{d-1} x_1, \ldots, \sqrt{2}\, x_2 x_1, \sqrt{2c}\, x_d, \ldots, \sqrt{2c}\, x_1, c \right\rangle$$
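As a quick sanity check (not part of the project code; the helper phi below is hypothetical and only mirrors the mapping above), the inner product of two such feature vectors equals the degree-2 polynomial kernel (x ⋅ y + c)^2:

import numpy as np

def phi(x, c):
    # Explicit degree-2 feature map from above (hypothetical helper, for checking only)
    d = len(x)
    feats = [x[i] ** 2 for i in range(d)]                       # squared terms x_i^2
    feats += [np.sqrt(2) * x[i] * x[j]
              for i in range(d) for j in range(i)]              # cross terms sqrt(2) x_i x_j
    feats += [np.sqrt(2 * c) * x[i] for i in range(d)]          # linear terms sqrt(2c) x_i
    feats.append(c)                                             # constant term
    return np.array(feats)

rng = np.random.default_rng(0)
x, y, c = rng.normal(size=5), rng.normal(size=5), 1.0
assert np.isclose(phi(x, c) @ phi(y, c), (x @ y + c) ** 2)      # phi(x).phi(y) == (x.y + c)^2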
Write a function polynomial_kernel that takes in two matrices X and Y and computes the polynomial kernel K(x, y) for every pair of rows x in X and y in Y.
def polynomial_kernel(X, Y, c, p):
    """
    Compute the polynomial kernel between each pair of rows x in X and y in Y:
        K(x, y) = (<x, y> + c)^p

    Args:
        X - (n, d) NumPy array (n datapoints each with d features)
        Y - (m, d) NumPy array (m datapoints each with d features)
        c - a coefficient to trade off high-order and low-order terms (scalar)
        p - the degree of the polynomial kernel

    Returns:
        kernel_matrix - (n, m) NumPy array containing the kernel matrix
    """
    # All pairwise dot products at once, then shift by c and raise to the p-th power in place
    K = X @ Y.T
    K += c
    K **= p
    return K
Test results: CORRECT
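For illustration only (hypothetical inputs, not part of the graded code), each entry of the returned matrix is the kernel value for one pair of rows:

import numpy as np

X = np.array([[1., 2.], [3., 4.]])
Y = np.array([[0., 1.], [1., 1.], [2., 2.]])
K = polynomial_kernel(X, Y, c=1, p=2)     # shape (2, 3)
# e.g. K[0, 1] = (1*1 + 2*1 + 1)^2 = 16
print(K)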
Next, write a function rbf_kernel that takes in two matrices X and Y and computes the Gaussian RBF kernel K(x, y) = exp(−gamma ⋅ ‖x − y‖²) for every pair of rows x in X and y in Y.
import numpy as np

def rbf_kernel(X, Y, gamma):
    """
    Compute the Gaussian RBF kernel between each pair of rows x in X and y in Y:
        K(x, y) = exp(-gamma * ||x - y||^2)

    Args:
        X - (n, d) NumPy array (n datapoints each with d features)
        Y - (m, d) NumPy array (m datapoints each with d features)
        gamma - the gamma parameter of the Gaussian function (scalar)

    Returns:
        kernel_matrix - (n, m) NumPy array containing the kernel matrix
    """
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, evaluated for all row pairs via broadcasting
    XTX = np.sum(X ** 2, axis=1).reshape(-1, 1)                     # (n, 1) squared norms of X's rows
    YTY = np.sum(Y ** 2, axis=1).reshape(1, -1)                     # (1, m) squared norms of Y's rows
    K = np.asarray(XTX + YTY - 2 * (X @ Y.T), dtype=np.float64)     # (n, m) squared distances
    K *= -gamma
    return np.exp(K, K)                                             # exponentiate in place
Test results: CORRECT
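Again for illustration only (hypothetical inputs):

import numpy as np

X = np.array([[0., 0.], [1., 0.]])
Y = np.array([[0., 1.]])
K = rbf_kernel(X, Y, gamma=0.5)           # shape (2, 1)
# squared distances are 1 and 2, so K ≈ [[exp(-0.5)], [exp(-1.0)]]
print(K)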
Now, try implementing the softmax regression using kernelized features. You will have to rewrite the softmax_regression function in softmax.py, as well as the auxiliary functions compute_cost_function, compute_probabilities, and run_gradient_descent_iteration.
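One possible shape for a kernelized compute_probabilities is sketched below. This is not the official solution: it assumes the dual weights α_j^(i) y^(i) have been collapsed into a single (k, n) matrix alpha, that the kernel matrix between the training points and the points being classified has been precomputed with one of the kernel functions above, and the argument names (kernel_matrix, alpha, temp_parameter) are illustrative.

import numpy as np

def compute_probabilities_kernel(kernel_matrix, alpha, temp_parameter):
    # kernel_matrix: (n_train, n) array, entry [i, t] = K(x^(i), x_t)
    # alpha:         (k, n_train) array of dual weights (alpha_j^(i) y^(i) folded together)
    # temp_parameter: the temperature parameter tau (scalar)
    # theta_j . phi(x_t) = sum_i alpha[j, i] K(x^(i), x_t), so one matrix product gives all scores
    scores = (alpha @ kernel_matrix) / temp_parameter
    scores -= np.max(scores, axis=0)                  # subtract c = max score per column for stability
    exp_scores = np.exp(scores)
    return exp_scores / np.sum(exp_scores, axis=0)    # (k, n): column t holds P(y_t = j | x_t)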
You have implemented a linear regression which turned out to be inadequate for this task. You have also learned how to use scikit-learn's SVM for binary classification and multiclass classification.
Then, you have implemented your own softmax regression using gradient descent.
Finally, you experimented with different hyperparameters, different labels and different features, including kernelized features.
In the next project, you will apply neural networks to this task.