XII. Support Vector Machines
You submitted this quiz on Tue 7 Apr 2015 11:03 AM CEST. You got a score of
5.00 out of 5.00.
Question 1
Suppose you have trained an SVM classifier with a Gaussian kernel, and it learned the following
decision boundary on the training set:
When you measure the SVM's performance on a cross validation set, it does poorly. Should you
try increasing or decreasing C ? Increasing or decreasing σ 2 ?
It would be reasonable to try decreasing C. It would also be reasonable to try decreasing σ².
It would be reasonable to try increasing C. It would also be reasonable to try increasing σ².
Total 1.00 / 1.00
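The tradeoff behind this question can be made concrete with the course's objective, C · (data cost) + (regularization cost). The sketch below uses hypothetical cost numbers (not from the quiz) for two candidate hypotheses: a "simple" one with small weights that violates one margin, and a "complex" one with large weights that fits every training example.

```python
# Hypothetical objective values for two candidate hypotheses, to illustrate
# how C trades data fit against regularization in the SVM objective.
simple = {"data_cost": 1.0, "reg_cost": 0.5}    # small weights, one margin violation
complex_ = {"data_cost": 0.0, "reg_cost": 10.0}  # large weights, fits everything

def objective(h, C):
    # SVM-style objective: C * (sum of costs over examples) + regularization
    return C * h["data_cost"] + h["reg_cost"]

# Large C: the data term dominates, so the complex (overfit-prone) hypothesis wins.
assert objective(complex_, C=100) < objective(simple, C=100)
# Small C: regularization dominates, so the simpler hypothesis wins.
assert objective(simple, C=0.01) < objective(complex_, C=0.01)
```

This is why decreasing C pushes the SVM toward a simpler, higher-bias hypothesis, which is the right direction when the learned boundary overfits the training set.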
Question 2
The formula for the Gaussian kernel is given by similarity(x, l^(1)) = exp(−||x − l^(1)||² / (2σ²)). The figure below shows a plot of f_1 = similarity(x, l^(1)) when σ² = 1.
Which of the following is a plot of f 1 when σ 2 = 0.25 ?
Total 1.00 / 1.00
Question 3
The SVM solves min_θ C ∑_{i=1}^{m} [ y^(i) cost_1(θ^T x^(i)) + (1 − y^(i)) cost_0(θ^T x^(i)) ] + ∑_{j=1}^{n} θ_j², where the functions cost_0(z) and cost_1(z) look like this:
The first term (the sum multiplied by C) will be zero if two of the following four conditions hold true. Which are the two conditions that would guarantee that this term equals zero?
For every example with y^(i) = 1, we have that θ^T x^(i) ≥ 1. (0.25)
For examples with y^(i) = 1, only the cost_1(θ^T x^(i)) term is present. As you can see in the graph, this will be zero for all inputs greater than or equal to 1.

For every example with y^(i) = 0, we have that θ^T x^(i) ≤ −1. (0.25)
For examples with y^(i) = 0, only the cost_0(θ^T x^(i)) term is present. As you can see in the graph, this will be zero for all inputs less than or equal to −1.
Total 1.00 / 1.00
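The zero regions of the two cost functions can be sketched directly. The hinge-style forms below are an assumption on my part (the course's plots are piecewise linear with a slightly different slope), but the regions where each cost vanishes match the graphs:

```python
def cost1(z):
    """Cost used for y = 1 examples; zero whenever z = theta^T x >= 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Cost used for y = 0 examples; zero whenever z = theta^T x <= -1."""
    return max(0.0, 1.0 + z)

# Condition for y = 1 examples: theta^T x >= 1 makes the cost vanish.
assert cost1(1.0) == 0.0 and cost1(2.5) == 0.0 and cost1(0.5) > 0.0
# Condition for y = 0 examples: theta^T x <= -1 makes the cost vanish.
assert cost0(-1.0) == 0.0 and cost0(-3.0) == 0.0 and cost0(-0.5) > 0.0
```

So when every y = 1 example has θ^T x^(i) ≥ 1 and every y = 0 example has θ^T x^(i) ≤ −1, every summand, and hence the whole first term, is zero.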
Question 4
Suppose you have a dataset with n = 10 features and m = 5000 examples. After training your
logistic regression classifier with gradient descent, you find that it has underfit the training set and
does not achieve the desired performance on the training or cross validation sets. Which of the
following might be promising steps to take? Check all that apply.
Use an SVM with a Gaussian kernel. (0.25)
By using a Gaussian kernel, your model will have greater complexity and can avoid underfitting the data.

Create / add new polynomial features. (0.25)
When you add more features, you increase the variance of your model, reducing the chances of underfitting.

Reduce the number of examples in the training set. (0.25)
While you can improve accuracy on the training set by removing examples, doing so results in a worse model that will not generalize as well.
Total 1.00 / 1.00
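Adding polynomial features is easy to sketch. The helper below is a hypothetical illustration (not from the course) of one way to raise model complexity by appending elementwise powers of each feature:

```python
import numpy as np

def add_polynomial_features(X, degree=2):
    """Append elementwise powers x^2 ... x^degree of each feature column.
    A hypothetical helper illustrating one way to increase model capacity."""
    X = np.asarray(X, float)
    cols = [X] + [X ** d for d in range(2, degree + 1)]
    return np.hstack(cols)

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
X_poly = add_polynomial_features(X, degree=3)
assert X_poly.shape == (2, 6)  # 2 original features + their squares + cubes
```

With n = 10 features and m = 5000 examples, this kind of feature expansion (or a Gaussian-kernel SVM) gives the hypothesis enough flexibility to fit the training set better.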
Question 5
Which of the following statements are true? Check all that apply.
Suppose you have 2D input examples (i.e., x^(i) ∈ ℝ²). The decision boundary of the SVM (with the linear kernel) is a straight line. (0.25)
The SVM without any kernel (i.e., the linear kernel) predicts output based only on θ^T x, so it gives a linear / straight-line decision boundary, just as logistic regression does.

Suppose you are using SVMs to do multi-class classification and would like to use the one-vs-all approach. If you have K different classes, you will train K − 1 different SVMs. (0.25)
The one-vs-all method requires that we have a separate classifier for every class, so you will train K different SVMs.

If you are training multi-class SVMs with the one-vs-all method, it is not possible to use a kernel. (0.25)
Each SVM you train in the one-vs-all method is a standard SVM, so you are free to use a kernel.
Total 1.00 / 1.00
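The one-vs-all wiring (K classifiers, one per class) can be sketched with any binary trainer plugged in. The toy "trainer" below just stores the centroid of the positive class, standing in for a real SVM; all names here are hypothetical:

```python
import numpy as np

def train_one_vs_all(X, y, num_classes, train_binary):
    """Train one binary classifier per class (K classifiers, not K - 1).
    Class k's classifier is trained on labels (y == k)."""
    return [train_binary(X, (y == k).astype(float)) for k in range(num_classes)]

def predict_one_vs_all(models, x, score):
    """Predict the class whose classifier is most confident on x."""
    return int(np.argmax([score(m, x) for m in models]))

# Toy binary "trainer": remember the centroid of the positive examples
# (a stand-in for fitting a real SVM, just to show the one-vs-all wiring).
def train_binary(X, y01):
    return X[y01 == 1].mean(axis=0)

def score(centroid, x):
    return -np.linalg.norm(x - centroid)  # closer centroid = higher confidence

X = np.array([[0.0, 0.0], [0.1, 0.0],
              [5.0, 5.0], [5.1, 5.0],
              [0.0, 9.0], [0.1, 9.0]])
y = np.array([0, 0, 1, 1, 2, 2])

models = train_one_vs_all(X, y, num_classes=3, train_binary=train_binary)
assert len(models) == 3  # K classifiers, one per class
assert predict_one_vs_all(models, np.array([5.0, 4.9]), score) == 1
```

Swapping `train_binary` for a kernelized SVM trainer changes nothing structural, which is exactly why kernels remain usable under one-vs-all.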