
BITS F464

Machine Learning
Aditya Challa Quiz 1

Instructions

• This is a take-home quiz.

• The total marks for the quiz will be scaled to 10.

• You are required to submit your answers via quanta (local). The form to submit
your answers will be available on quanta shortly.

• The last date to submit your answers is 11:59 PM, 30th September 2024.

Exercise 1
1. One way to select a subset of features is naive (exhaustive) subset selection; forward and
backward selection are two popular approximations that reduce its computational complexity.
Another way is to use regularization schemes such as L1 and L2 regularization.
A dataset is provided to you in the files data_problem1.csv and labels_problem1.csv.
Using this data, determine which of the following features would be in the best subset
(features are indexed 0, 1, · · · , 199). Assume that you know exactly 100 features are relevant.

A 121
B 129
C 34
D 173
E 20
F 158

[3 Marks]
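A minimal sketch of the regularization route, assuming the two CSV files load directly into a 200-column feature matrix and a label vector; LassoCV and the top-100 cutoff are illustrative choices, not prescribed by the quiz:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV

# Load the provided data; shapes assumed to be (n_samples, 200) and (n_samples,).
X = pd.read_csv("data_problem1.csv").values
y = pd.read_csv("labels_problem1.csv").values.ravel()

# L1 regularization drives irrelevant coefficients to exactly zero;
# cross-validation picks the regularization strength.
lasso = LassoCV(cv=5).fit(X, y)

# Rank features by absolute coefficient and keep the top 100,
# since the problem states exactly 100 features are relevant.
top100 = np.argsort(np.abs(lasso.coef_))[-100:]
print(sorted(top100))  # check which of the listed option indices appear
```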


Exercise 2
2. We discussed increasing the complexity of a model. Two students propose the following
ways to increase the complexity.
Assume that the data is in 2 dimensions and we are concerned with a classification problem.
Student A uses a random matrix to project the data to a higher dimension and then uses a
linear classifier. This implicitly increases the number of features, hence the number of
parameters in the model, and hence the complexity of the model.
Student B instead generates a lot of random features and concatenates them with the 2
original features, then uses a linear classifier. This also implicitly increases the number of
features, hence the number of parameters in the model, and hence the complexity of the model.
Which of the following statements are true? The comparison is with respect to the original
model with 2 features.
Remark: The words “reduces” and “increases” are used to mean ≤ and ≥ respectively, i.e.,
equality is allowed in both cases.

(A) Training error reduces with Student A’s approach.
(B) Training error increases with Student A’s approach.
(C) Test error reduces with Student A’s approach.
(D) Test error increases with Student A’s approach.
(E) Training error reduces with Student B’s approach.
(F) Training error increases with Student B’s approach.
(G) Test error reduces with Student B’s approach.
(H) Test error increases with Student B’s approach.

[8 Marks]
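For concreteness, a minimal sketch of the two constructions on a stand-in 2-D dataset; make_classification, the 50-dimensional target, and the logistic-regression classifier are illustrative assumptions, not part of the quiz:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Student A: project the 2-D data to a higher dimension with a random matrix.
P = rng.standard_normal((2, 50))
X_a = X @ P  # every new feature is a linear combination of the 2 originals

# Student B: concatenate the original features with pure random noise features.
X_b = np.hstack([X, rng.standard_normal((X.shape[0], 48))])

# Fit the same linear classifier on both constructions and compare.
for name, X_new in [("A", X_a), ("B", X_b)]:
    clf = LogisticRegression(max_iter=1000).fit(X_new, y)
    print(name, "training accuracy:", clf.score(X_new, y))
```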


Exercise 3
3. Consider the following kernel construction. Given the set of data points $\{x_i\}$, construct a
complete graph on all the points, where the edge weight between $x_i$ and $x_j$ is $\exp(-\|x_i - x_j\|^2/\sigma^2)$.
Then define the kernel as

$$K(x, x') = \max_{\pi \in \Pi(x, x')} \; \min_{(x_i, x_j) \in \pi} \exp\left(-\|x_i - x_j\|^2 / \sigma^2\right)$$

where $\Pi(x, x')$ is the set of all paths between $x$ and $x'$, and $(x_i, x_j) \in \pi$ ranges over the edges of the path $\pi$.
Also set $K(x, x) := 1$.
Answer the following questions:

(a) State TRUE/FALSE. The kernel is symmetric, i.e., K(x, x′) = K(x′, x).
(b) State TRUE/FALSE. The kernel is positive definite.
(c) State TRUE/FALSE. The boundary obtained using this kernel changes under the
transformation x → x + c for any vector c.
(d) State TRUE/FALSE. The boundary obtained using this kernel does not change
under the transformation x → Ax + b for matrix A and vector b.
(e) State TRUE/FALSE. The boundary obtained using this kernel does not change
under the transformation x → Ax + b for matrix A which is positive definite and
vector b.

Definitions/Hints:

1. If A is symmetric with positive entries and $A_{ij} \geq \min_k \{A_{ik}, A_{kj}\}$, then A is
positive definite. See https://www.math.kent.edu/~varga/pub/paper_199.pdf.
2. A complete graph with n vertices has n(n − 1)/2 edges, and every point is con-
nected to every other point.

[5 × 2 = 10 Marks]
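A minimal sketch of one way to evaluate this kernel: the max-min path value over a complete graph can be computed with a Floyd-Warshall-style widest-path recursion. The helper below and its name are illustrative, not course-provided code:

```python
import numpy as np
from scipy.spatial.distance import cdist

def max_min_kernel(X, sigma=1.0):
    """Gram matrix of the max-min path kernel over the complete graph on X."""
    # Edge weights of the complete graph: exp(-||xi - xj||^2 / sigma^2).
    W = np.exp(-cdist(X, X, "sqeuclidean") / sigma**2)
    K = W.copy()
    n = len(X)
    # Widest-path recursion: allowing vertex k as an intermediate stop can
    # only raise the bottleneck (minimum edge) weight of the best i-j path.
    for k in range(n):
        K = np.maximum(K, np.minimum(K[:, [k]], K[[k], :]))
    np.fill_diagonal(K, 1.0)  # the problem sets K(x, x) := 1
    return K
```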


Exercise 4
4. Implement the above kernel using the scikit-learn support vector machine library (which
allows user-defined kernels). Answer the following questions based on the results. You can use
the code in kernel_svm.ipynb as a starting point.
Remark 1: You are expected to tweak the hyperparameter (σ) to get the best results and
answer the following questions based on those results.
Remark 2: There is an important subtlety in the way the kernel is defined. Note that
K(x, x′) actually depends on the entire dataset! Hence, the implementation should be
adapted accordingly.
Which of the following statements are true?

(A) The best test accuracy on the make_moons dataset is 0 (assume a very large
number of samples; use n_samples = 1000 and noise = 0.01).
(B) The best test accuracy on the make_moons dataset is 1 (assume a very large
number of samples; use n_samples = 1000 and noise = 0.01).
(C) The best test accuracy on the make_circles dataset is 1 (assume a very large
number of samples; use n_samples = 1000 and noise = 0.01).
(D) The best test accuracy on the make_circles dataset is 0 (assume a very large
number of samples).
(E) Consider the make_blobs dataset: the kernel works well compared to a linear
kernel when n_samples is large and n_features is small.
(F) Consider the make_blobs dataset: the kernel works well compared to a linear
kernel when n_samples is small and n_features is large.

[12 Marks]
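A minimal sketch of wiring such a kernel into scikit-learn via kernel="precomputed", reusing the max_min_kernel helper sketched above (an assumption, not the course notebook). Because K(x, x′) depends on the entire dataset (Remark 2), the Gram matrix is built on all points first and then sliced into train/train and test/train blocks:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=1000, noise=0.01, random_state=0)
idx_train, idx_test = train_test_split(np.arange(len(X)), random_state=0)

# The kernel depends on the whole dataset, so compute the full Gram matrix
# once, then slice: SVC needs (n_train, n_train) to fit and (n_test, n_train)
# to predict.
K = max_min_kernel(X, sigma=0.5)  # helper from the sketch above (assumed)
K_train = K[np.ix_(idx_train, idx_train)]
K_test = K[np.ix_(idx_test, idx_train)]

clf = SVC(kernel="precomputed").fit(K_train, y[idx_train])
print("test accuracy:", clf.score(K_test, y[idx_test]))
```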


Exercise 5
5. Suppose we have domain knowledge that the ground-truth function is a finite sum of the
form $f(t) = \sum_{i=1}^{p} a_i \cos(2\pi \omega_i t)$, where each $a_i \in \mathbb{R}$ is fixed but unknown, and we know that each
$\omega_i$ belongs to the set of integers between 1 and K, i.e., $\{1, 2, \cdots, K\}$. Sampling from this
function gives the observations $\{(t_i, y_i = f(t_i))\}_{i=1}^{n}$. Assume $t_i$ is uniformly sampled from $[-1, 1]$.
Using the domain knowledge, let us construct the hypothesis class $\mathcal{H}_c$ (indexed by c) as
follows: $\mathcal{H}_c = \{h(t) = \sum_{i=1}^{c} a_i \cos(2\pi \omega_i t)\}$, where each $a_i$ is some real number and $\omega_i \in \{1, 2, \cdots, K\}$.
One way to look at this is as a regression problem. Since K is known, we can generate K
features by transforming $t_i$ to $(\cos(2\pi t_i), \cos(2\pi \cdot 2 t_i), \cdots, \cos(2\pi K t_i))$. We now have K
features, need to identify the values of the $a_i$, and can use the linear regression model to do so.
Which of the following statements are true?

(A) Since there are K unknowns $\{a_i\}$, we need at least K distinct (w.r.t. t) data
points to identify the values of $a_i$.
(B) Since there are K unknowns $\{a_i\}$ and there is no irreducible error, K distinct
(w.r.t. t) data points are sufficient to identify the values of $a_i$.

Note that there is no irreducible error in the ground-truth function; that is, if t is fixed, the
output is also fixed.
Assume that we are sampling t uniformly at random from the continuous distribution
U [−1, 1]. Then which of the following statements are true?

(C) Irrespective of the sample size n, the variance of our estimates is 0.
(D) If the sample size is n ≥ K, then the variance of our estimates is 0.
(E) If the sample size is n ≥ 4K + 4, then the variance of our estimates is 0.

Hint: You are required to use the bootstrap method to estimate the variance of your
estimates, playing with the parameters K, p, and n to see how the variance changes.
Use a log scale on the y-axis for better visualization. Use large values of a (on the order of
$10^9$) to see the effect of n on the variance more clearly. One question which arises while
experimenting is: when can you say the variance is 0? Due to precision errors in the
computation, this can be a tricky question. Take equally spaced samples in [−1, 1] and plot
the variance of your estimates as a function of n. Since it is the same sample each time, we
expect the variance to be 0, so you can check whether the variance you get is approximately 0 or not.

[10 Marks]
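A minimal sketch of the suggested bootstrap experiment; the values of K, p, n, and the $10^9$-scale amplitudes are chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
K, p, n, n_boot = 5, 2, 50, 200

# Ground truth: f(t) = sum_i a_i cos(2*pi*omega_i*t), with frequencies drawn
# from {1, ..., K} and large amplitudes as the hint suggests.
omega = rng.choice(np.arange(1, K + 1), size=p, replace=False)
a = rng.uniform(1e9, 2e9, size=p)

def features(t):
    # The K cosine features: cos(2*pi*1*t), ..., cos(2*pi*K*t).
    return np.cos(2 * np.pi * np.outer(t, np.arange(1, K + 1)))

t = rng.uniform(-1, 1, size=n)
y = features(t)[:, omega - 1] @ a  # noiseless observations

# Bootstrap: refit least squares on resampled data, track the coefficient spread.
coefs = []
for _ in range(n_boot):
    idx = rng.integers(0, n, size=n)
    coef, *_ = np.linalg.lstsq(features(t[idx]), y[idx], rcond=None)
    coefs.append(coef)
print("max coefficient variance:", np.var(coefs, axis=0).max())
```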
