
Introduction to Machine Learning

Week 4
Prof. B. Ravindran, IIT Madras

1. (1 Mark) In the context of the perceptron learning algorithm, what does the expression f(x)/∥f′(x)∥ represent?

(a) The gradient of the hyperplane
(b) The signed distance to the hyperplane
(c) The normal vector to the hyperplane
(d) The misclassification error

Soln. B
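For a linear discriminant f(x) = βᵀx + β₀, the gradient is f′(x) = β, so f(x)/∥f′(x)∥ = (βᵀx + β₀)/∥β∥, the signed distance from x to the hyperplane. A minimal numpy sketch (the weights and point below are made-up illustrative values):

import numpy as np

beta, beta0 = np.array([2.0, -1.0]), 0.5   # hypothetical hyperplane parameters
x = np.array([1.0, 1.0])
f_x = beta @ x + beta0                     # f(x)
print(f_x / np.linalg.norm(beta))          # signed distance; ||f'(x)|| = ||beta||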

2. (1 Mark) Why do we normalize by ∥β∥ (the magnitude of the weight vector) in the SVM
objective function?
(a) To ensure the margin is independent of the scale of β
(b) To minimize the computational complexity of the algorithm
(c) To prevent overfitting
(d) To ensure the bias term is always positive
Soln. A
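Scaling (β, β₀) by any positive constant rescales f(x) and ∥β∥ by the same factor, so the signed distance (βᵀx + β₀)/∥β∥ is unchanged. A quick numerical check (hyperplane and point are illustrative values):

import numpy as np

beta, beta0 = np.array([3.0, 4.0]), 2.0
x = np.array([1.0, 2.0])
d1 = (beta @ x + beta0) / np.linalg.norm(beta)           # 2.6
d2 = (10*beta @ x + 10*beta0) / np.linalg.norm(10*beta)  # still 2.6
print(d1, d2)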

3. (1 Mark) Which of the following is NOT one of the KKT conditions for optimization problems
with inequality constraints?
(a) Stationarity: ∇f(x∗) + Σᵢ λᵢ ∇gᵢ(x∗) + Σⱼ νⱼ ∇hⱼ(x∗) = 0 (sums over i = 1, …, m and j = 1, …, p)
(b) Primal feasibility: gi (x∗ ) ≤ 0 for all i, and hj (x∗ ) = 0 for all j
(c) Dual feasibility: λi ≥ 0 for all i
(d) Convexity: The objective function f (x) must be convex
Soln. D
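Convexity is a property of the optimization problem, not a KKT condition (the fourth genuine condition, complementary slackness λᵢ gᵢ(x∗) = 0, is not listed among the options). As a toy illustration, consider min f(x) = x² subject to g(x) = 1 − x ≤ 0:
Stationarity: 2x∗ − λ = 0
Primal feasibility: 1 − x∗ ≤ 0
Dual feasibility: λ ≥ 0
Complementary slackness: λ(1 − x∗) = 0
Taking λ = 0 would force x∗ = 0, violating primal feasibility, so the constraint is active: x∗ = 1 and λ = 2. All conditions hold without invoking convexity anywhere.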

4. (1 Mark) Consider the 1 dimensional dataset:

x y
-1 1
0 -1
2 1

(Note: x is the feature and y is the output)

State true or false: The dataset becomes linearly separable after using basis expansion with the following basis function: ϕ(x) = (1, x³)ᵀ

(a) True
(b) False

Soln. B
After applying the basis expansion, x′₁ = (1, −1)ᵀ, x′₂ = (1, 0)ᵀ, and x′₃ = (1, 8)ᵀ. The first coordinate is constant for all points, so separability depends only on the x³ coordinate, where the positive points (−1 and 8) lie on either side of the negative point (0). Despite the basis expansion, the dataset remains linearly inseparable.
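A short numpy sketch of the expansion makes this visible:

import numpy as np

X = np.array([-1.0, 0.0, 2.0])
y = np.array([1, -1, 1])
phi = np.stack([np.ones_like(X), X**3], axis=1)  # phi(x) = (1, x^3)
print(phi)  # [[ 1. -1.], [ 1.  0.], [ 1.  8.]]
# Along x^3 the positives (-1 and 8) straddle the negative (0),
# so no threshold separates the classes.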

5. (1 Mark) Consider a polynomial kernel of degree d operating on p-dimensional input vectors. What is the dimension of the feature space induced by this kernel?

(a) p × d
(b) (p + 1) × d
(c) C(p + d, d), the binomial coefficient "p + d choose d"
(d) pᵈ

Soln. C
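The feature space of a degree-d polynomial kernel contains all monomials of total degree at most d in the p input variables, and there are C(p + d, d) of them. A one-liner to evaluate it (the values of p and d are chosen arbitrarily):

from math import comb

p, d = 4, 2            # e.g., 4 features with a quadratic kernel
print(comb(p + d, d))  # 15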

6. (1 Mark) State True or False: For any given linearly separable data, for any initialization,
both SVM and Perceptron will converge to the same solution.

(a) True
(b) False

Soln. B
The perceptron stops at the first separating hyperplane it reaches, which depends on the initialization and the order in which points are presented, while the SVM always returns the unique maximum-margin hyperplane; in general the two solutions differ.
For Q7,8: Kindly download the modified version of Iris dataset from this link.
Available at: https://goo.gl/vchhsd
The dataset contains 150 points, and each input point has 4 features and belongs to one among
three classes. Use the first 100 points as the training data and the remaining 50 as test data.
In the following questions, report accuracy on the test dataset; you may round the accuracy to two decimal places. (Note: do not change the order of the data points.)
7. (2 marks) Train a linear perceptron classifier on the modified iris dataset. We recommend using sklearn. Use only the first two features for your model and report the best classification accuracy for the l1 and l2 penalty terms.

(a) 0.91, 0.64
(b) 0.88, 0.71
(c) 0.71, 0.65
(d) 0.78, 0.64

Sol. (d)
The following code gives the desired result (X and Y are the features and labels of the modified iris dataset, loaded in the given order):

from sklearn.linear_model import Perceptron

clf = Perceptron(penalty="l1").fit(X[0:100, 0:2], Y[0:100])
print(clf.score(X[100:, 0:2], Y[100:]))  # accuracy with l1 penalty
clf = Perceptron(penalty="l2").fit(X[0:100, 0:2], Y[0:100])
print(clf.score(X[100:, 0:2], Y[100:]))  # accuracy with l2 penalty

8. (2 marks) Train an SVM classifier on the modified iris dataset. We recommend using sklearn. Use only the first three features. We encourage you to explore the impact of varying different hyperparameters of the model; in particular, try different kernels and their associated hyperparameters. As part of the assignment, train models with the following set of hyperparameters: RBF kernel, gamma = 0.5, one-vs-rest classifier, no feature normalization. Try C = 0.01, 1, 10. For the above set of hyperparameters, report the best classification accuracy.
(a) 0.98
(b) 0.88
(c) 0.99
(d) 0.92
Sol. (a)
The following code gives the desired result; among C = 0.01, 1, 10, the best accuracy (0.98) is obtained with C = 1.0 (X and Y as in the previous question):

from sklearn import svm

clf = svm.SVC(C=1.0, kernel='rbf', gamma=0.5,
              decision_function_shape='ovr').fit(X[0:100, 0:3], Y[0:100])
print(clf.score(X[100:, 0:3], Y[100:]))
