Week 4
Prof. B. Ravindran, IIT Madras
1. (1 Mark) In the context of the perceptron learning algorithm, what does the expression f(x)/∥f′(x)∥ represent?
Soln. B
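The option text is not reproduced in this copy. For reference: when f(x) = β·x + β0 is a linear discriminant, f′(x) = β, so f(x)/∥f′(x)∥ is the signed distance of x from the decision boundary f(x) = 0. A quick numerical sketch with made-up numbers (not from the assignment):

```python
import math

# Linear discriminant f(x) = beta . x + beta0 (illustrative numbers)
beta = (3.0, 4.0)
beta0 = -5.0

def f(x):
    return beta[0] * x[0] + beta[1] * x[1] + beta0

grad_norm = math.hypot(*beta)   # ||f'(x)|| = ||beta|| for a linear f

x = (2.0, 1.0)
dist = f(x) / grad_norm         # signed distance from the hyperplane f = 0

# Dropping a perpendicular of length |dist| from x should land on the hyperplane
foot = (x[0] - dist * beta[0] / grad_norm,
        x[1] - dist * beta[1] / grad_norm)
print(dist, f(foot))  # 1.0 and (up to rounding) 0.0
```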
2. (1 Mark) Why do we normalize by ∥β∥ (the magnitude of the weight vector) in the SVM
objective function?
(a) To ensure the margin is independent of the scale of β
(b) To minimize the computational complexity of the algorithm
(c) To prevent overfitting
(d) To ensure the bias term is always positive
Soln. A
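Option (a) can be checked numerically: scaling β and β0 by any positive constant rescales the raw score f(x), but leaves f(x)/∥β∥ unchanged, so the margin is well defined only after normalizing. A minimal sketch with made-up numbers:

```python
import math

def normalized_margin(beta, beta0, x, y):
    """Signed margin y * (beta . x + beta0) / ||beta||."""
    dot = sum(b * xi for b, xi in zip(beta, x))
    return y * (dot + beta0) / math.hypot(*beta)

beta, beta0 = (3.0, 4.0), -5.0
x, y = (2.0, 1.0), 1

m1 = normalized_margin(beta, beta0, x, y)
# Rescale the whole discriminant by 10: the unnormalized score grows tenfold,
# but the normalized margin is unchanged
m2 = normalized_margin(tuple(10 * b for b in beta), 10 * beta0, x, y)
print(m1, m2)  # both 1.0
```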
3. (1 Mark) Which of the following is NOT one of the KKT conditions for optimization problems
with inequality constraints?
(a) Stationarity: ∇f (x∗) + Σ_{i=1}^{m} λi ∇gi (x∗) + Σ_{j=1}^{p} νj ∇hj (x∗) = 0
(b) Primal feasibility: gi (x∗ ) ≤ 0 for all i, and hj (x∗ ) = 0 for all j
(c) Dual feasibility: λi ≥ 0 for all i
(d) Convexity: The objective function f (x) must be convex
Soln. D
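The genuine KKT conditions in options (a)–(c), together with complementary slackness (λi gi(x∗) = 0, also part of the standard set), can be verified on a tiny illustrative problem: min x² subject to 1 − x ≤ 0, whose optimum is x∗ = 1 with multiplier λ = 2. This problem is not from the assignment; it is only a sketch:

```python
# Illustrative check: min f(x) = x^2  s.t.  g(x) = 1 - x <= 0
# Candidate optimum x* = 1 with multiplier lam = 2.
x_star, lam = 1.0, 2.0

grad_f = 2 * x_star          # f'(x*) = 2x
grad_g = -1.0                # g'(x*) = -1
g_val = 1 - x_star           # g(x*)

stationarity = grad_f + lam * grad_g   # should be 0 (stationarity)
primal_ok = g_val <= 0                 # primal feasibility
dual_ok = lam >= 0                     # dual feasibility
comp_slack = lam * g_val               # should be 0 (complementary slackness)
print(stationarity, primal_ok, dual_ok, comp_slack)
```

Note the objective here happens to be convex, but as option (d) says, convexity is not itself a KKT condition.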
4. (1 Mark) Consider the following dataset:

x    y
-1   1
0    -1
2    1

State true or false: The dataset becomes linearly separable after using basis expansion with the following basis function ϕ(x) = (1, x³)ᵀ.
(a) True
(b) False
Soln. B
After applying basis expansion, x′1 = (1, −1)ᵀ, x′2 = (1, 0)ᵀ, and x′3 = (1, 8)ᵀ. Despite the basis expansion, the data remains linearly inseparable: the class −1 point (x³ = 0) still lies between the two class 1 points (x³ = −1 and x³ = 8).
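This conclusion can be brute-force checked. After ϕ(x) = (1, x³)ᵀ the first coordinate is constant, so it only contributes a bias term, and separability reduces to finding a threshold on the x³ axis. A small sketch:

```python
xs = [-1, 0, 2]
ys = [1, -1, 1]
z = [x ** 3 for x in xs]   # transformed coordinate: [-1, 0, 8]

def separable_1d(z, ys):
    """Brute-force: is there a threshold (and orientation) separating the classes?"""
    cands = sorted(z)
    thresholds = ([cands[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(cands, cands[1:])]
                  + [cands[-1] + 1.0])
    return any(all(sign * (zi - t) * yi > 0 for zi, yi in zip(z, ys))
               for t in thresholds for sign in (1, -1))

print(separable_1d(z, ys))  # False: the class -1 point (0) lies between -1 and 8
```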
5. (1 Mark) Suppose we apply a polynomial basis expansion of degree p to a d-dimensional input. How many features (monomials of total degree at most p) does the expanded representation contain?
(a) p × d
(b) (p + 1) × d
(c) C(p + d, d)
(d) p^d
Soln. C
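The count in option (c) is the number of monomials of total degree at most p in d variables, which equals the binomial coefficient C(p + d, d). A brute-force enumeration agrees with the closed form:

```python
from itertools import combinations_with_replacement
from math import comb

def n_monomials(p, d):
    """Count monomials of total degree <= p in d variables by enumeration."""
    # Monomials of degree exactly k are size-k multisets of the d variables
    return sum(1 for k in range(p + 1)
               for _ in combinations_with_replacement(range(d), k))

for p, d in [(2, 2), (3, 4), (5, 3)]:
    print((p, d), n_monomials(p, d), comb(p + d, d))  # counts agree
```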
6. (1 Mark) State True or False: For any given linearly separable data, for any initialization,
both SVM and Perceptron will converge to the same solution.
(a) True
(b) False
Soln. B
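One way to see why the answer is False: the perceptron stops at the first hyperplane that separates the data, so even the order in which points are presented changes its solution, whereas the SVM's max-margin hyperplane is unique. A toy sketch with a hand-rolled perceptron on made-up data:

```python
def train_perceptron(points, max_epochs=100):
    """Classic perceptron with augmented weight vector (w1, w2, bias)."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        updated = False
        for (x1, x2), y in points:
            if y * (w[0] * x1 + w[1] * x2 + w[2]) <= 0:
                w = [w[0] + y * x1, w[1] + y * x2, w[2] + y]
                updated = True
        if not updated:
            break
    return w

data = [((0.0, 1.0), 1), ((0.0, -1.0), -1), ((3.0, 1.5), 1)]
w_a = train_perceptron(data)
w_b = train_perceptron(list(reversed(data)))
print(w_a, w_b)  # two different separating hyperplanes
```

Both returned weight vectors separate the data, but they are different hyperplanes; an SVM would return the single max-margin solution regardless of presentation order or initialization.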
For Q7,8: Kindly download the modified version of Iris dataset from this link.
Available at: (https://goo.gl/vchhsd)
The dataset contains 150 points; each input point has 4 features and belongs to one of three classes. Use the first 100 points as training data and the remaining 50 as test data. In the following questions, report accuracy on the test dataset, rounded to two decimal places. (Note: do not change the order of the data points.)
7. (2 marks) Train a linear perceptron classifier on the modified Iris dataset. We recommend using sklearn. Use only the first two features for your model, and report the best classification accuracy across the l1 and l2 penalty terms.
Sol. (d)
The following code gives the desired result (assuming X and Y hold the dataset, loaded in the given order):

from sklearn.linear_model import Perceptron
clf = Perceptron(penalty="l1").fit(X[0:100, 0:2], Y[0:100])
clf.score(X[100:, 0:2], Y[100:])
clf = Perceptron(penalty="l2").fit(X[0:100, 0:2], Y[0:100])
clf.score(X[100:, 0:2], Y[100:])
8. (2 marks) Train an SVM classifier on the modified Iris dataset. We recommend using sklearn. Use only the first three features. We encourage you to explore the impact of varying different hyperparameters of the model; specifically, try different kernels and their associated hyperparameters. As part of the assignment, train models with the following hyperparameters: RBF kernel, gamma = 0.5, one-vs-rest classifier, no feature normalization. Try C = 0.01, 1, 10, and report the best classification accuracy over this set.
(a) 0.98
(b) 0.88
(c) 0.99
(d) 0.92
Sol. (a)
The following code gives the desired result (shown for C = 1.0; assuming X and Y hold the dataset, loaded in the given order):

from sklearn import svm
clf = svm.SVC(C=1.0, kernel='rbf', decision_function_shape='ovr', gamma=0.5).fit(X[0:100, 0:3], Y[0:100])
clf.score(X[100:, 0:3], Y[100:])