
Quiz 1

Course Title: Machine Learning


Date of Examination: 10.09.2020

1. In your experiment, you wish to strongly penalize outliers. Consider the formula for mean
squared error, which is often used as a loss function. Is it a good idea to use “mean cubed
error” (consider power 3 instead of 2) as a loss function instead? (Yes/No).

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \tag{1}$$

Ans: No, this is not a good idea. Although the cubic function is also differentiable and could, in theory, be plugged into our model, it would cause the following problems:

• The cubic function can return negative values as well, which may cancel out positive values when summing and lead to a spuriously low loss.
• Using a higher even power would also not be a good idea. Due to the higher degree, it may cause the model to overfit more (relative to the squared function), because even small deviations incur a large penalty. This is essentially a case of penalizing too strongly for it to do any good.
• Implementation constraints: because we are cubing the errors, we will reach the point of numerical overflow much sooner than with squaring.
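A quick numerical sketch (hypothetical values, assuming NumPy) illustrates the sign-cancellation problem with a cubic loss:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([3.0, 0.0, 5.0, 2.0])   # residuals: -2, +2, -2, +2

residuals = y_true - y_pred

mse = np.mean(residuals ** 2)   # (4 + 4 + 4 + 4) / 4 = 4.0
mce = np.mean(residuals ** 3)   # (-8 + 8 - 8 + 8) / 4 = 0.0

print(mse, mce)   # 4.0 0.0 -- the cubic "loss" reports no error at all
```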

2. While training your model, you observe that the loss on the testing set is constantly lower than
the training set. What could be the possible justification for this?

(a) The model is not training well as the testing set loss should always be higher than training
set error.
(b) The model is training properly as the testing set loss should always be lower than training
set error.
(c) The model may be training properly. We cannot comment with certainty about whether
the training or testing set loss should be lower.
(d) The model is training properly. We should ignore the test set loss as it provides us no
useful information. We only need to ensure that the training set loss keeps decreasing.

Ans: C. We may have an unfortunate train-test split in which the test set happens to be easier, so the model makes better predictions on the testing set. However, the relative trends should still remain the same (both training and testing losses decrease until the stopping point; after that, training loss keeps decreasing while testing loss increases).


3. Consider we are trying to learn three different classifiers A, B, C on a given data set such that A
has high training as well as test accuracies, B has high training accuracy but low test accuracy,
while C has low training as well as test accuracies. Which one of the following statements is
correct? Justify your answer.

(a) A is overfitting whereas B is underfitting.
(b) A is overfitting whereas C is underfitting.
(c) B is overfitting whereas C is underfitting.
(d) B is underfitting whereas C is overfitting.
(e) None of the above

Ans: C. Overfitting refers to a model fitting the patterns in the training data so closely that they do not generalize to unseen data. Underfitting, on the other hand, refers to a model failing to capture even the desirable patterns in the data (due to limitations on its representational power). In the case above, A shows the expected trend, as its performance on unseen test data is high. B performs well on the training data but does not generalize, indicating a case of overfitting. C does not perform well even on the training data, suggesting the model is incapable of capturing the patterns in the data, a case of underfitting.

4. You are trying to minimize a convex objective function using gradient descent and the algorithm
has not converged even after completing 10,000 iterations. What might be the possible reasons
for this?

(a) The learning rate is too high
(b) The learning rate is too low
(c) The model may be stuck in a local minimum
(d) None of the above

Ans: A, B. The learning rate is an important hyperparameter that determines how fast and how stably the gradient descent algorithm converges to the global minimum. A very low learning rate means a very small step size and may require a significant amount of time for the algorithm to converge. A very high learning rate means a large step size; the algorithm may overshoot the minimum and never converge due to instability. Therefore, both A and B are correct. A convex function has no local minima distinct from the global minimum, so C is incorrect.
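A minimal sketch of both failure modes on the convex function f(θ) = θ² (illustrative values, plain Python):

```python
def gradient_descent(lr, n_iters=10000, theta0=5.0):
    """Minimize the convex function f(theta) = theta**2, whose gradient is 2*theta."""
    theta = theta0
    for _ in range(n_iters):
        theta -= lr * 2 * theta   # standard update: theta <- theta - lr * grad
    return theta

print(gradient_descent(lr=0.1))               # ~0.0: converges quickly
print(gradient_descent(lr=1e-6))              # ~4.9: barely moved after 10,000 steps
print(gradient_descent(lr=1.1, n_iters=50))   # ~45,500: each step overshoots; diverges
```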

5. In linear regression we define residual as the difference of actual value and predicted value
(ytrue − ypred ). You have trained a linear regression model and plotted the residual and the
predicted values on a plane and observed that there is a relationship between them. What can
you say about the trained model?

(a) The model is trained properly
(b) The model is overfitted
(c) The model has failed to capture the relationship between input vector and output completely
(d) Can't say anything


Ans: C. Since we can observe a relationship between the residuals and the predicted values, the model has failed to capture that part of the input-output relationship and is not trained properly. For a properly trained model, there will be no systematic relationship between the residuals and the predicted outputs.
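As an illustration, a small synthetic sketch (assuming NumPy): we fit a straight line to data generated with a quadratic term, then test the residuals for leftover structure. Note that OLS residuals are uncorrelated with the fitted values by construction, so the telltale "relationship" appears as curvature rather than as linear correlation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 2 * x + 0.5 * x**2 + rng.normal(0, 1, x.size)   # true relation is quadratic

b1, b0 = np.polyfit(x, y, deg=1)   # fit a (misspecified) straight line
residuals = y - (b1 * x + b0)

# Regress the residuals on a quadratic in x: a clearly nonzero leading
# coefficient (~0.5 here) reveals the x**2 term the line failed to capture.
print(np.polyfit(x, residuals, deg=2)[0])
```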
6. Consider the following statements regarding linear regression and select the correct ones :
(a) You will always find the optimum solution with the normal equation method with arbi-
trary input feature matrix.
(b) You will always find the optimum solution with the normal equation method with input
feature matrix which does not contain columns which perfectly correlate with each other.
(c) You will always find the optimum solution with gradient descent method as the loss
surface is convex.
(d) You will find the optimum solution with gradient descent with appropriate hyperparam-
eters.
Ans: B and D are correct. A is incorrect because the matrix X^T X may not be invertible (the feature matrix may contain linearly dependent columns) when calculating the normal solution. C is incorrect because gradient descent is only guaranteed to converge with appropriate hyperparameters, even though the loss surface is convex: for instance, the learning rate might be too high for the model to converge.
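A small sketch (assuming NumPy, with a made-up feature matrix) showing the normal equation failing under perfect correlation while a least-squares solver still returns a minimizer:

```python
import numpy as np

# Design matrix: bias column, a feature, and an exact copy of that feature
X = np.array([[1.0, 2.0, 2.0],
              [1.0, 3.0, 3.0],
              [1.0, 5.0, 5.0]])
y = np.array([4.0, 6.0, 10.0])

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))   # 2 < 3: X^T X is singular

try:
    theta = np.linalg.inv(XtX) @ X.T @ y   # normal equation
except np.linalg.LinAlgError as e:
    print("normal equation fails:", e)     # Singular matrix

# A least-squares solver still finds *a* minimizer, since the convex loss
# surface has a flat valley of equally optimal solutions.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)
```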
7. Let f(x) be the sigmoid function. What is the derivative of f(x) in terms of f(x)?
(a) f(x) · (1 − f(x))
(b) f(x) · (f(x) − 1)
(c) f(x) / (1 − f(x))
(d) 1 / (1 − f(x))
Ans: A. f(x) · (1 − f(x)) is the derivative of f(x).
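For reference, a short derivation of this identity:

$$f(x) = \frac{1}{1+e^{-x}}, \qquad f'(x) = \frac{e^{-x}}{(1+e^{-x})^{2}} = \frac{1}{1+e^{-x}} \cdot \frac{e^{-x}}{1+e^{-x}} = f(x)\,\bigl(1-f(x)\bigr),$$

using the fact that e^{−x}/(1 + e^{−x}) = 1 − 1/(1 + e^{−x}) = 1 − f(x).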
8. Suppose you have been given a fair coin and you want to find out the odds of getting heads.
Which of the following option is true for such a case?
(a) 0
(b) 0.5
(c) 1
(d) None of the above
Ans: C. Odds are defined as the ratio of the probability of success to the probability of failure. For a fair coin, the probability of success is 1/2 and the probability of failure is 1/2, so the odds are (1/2)/(1/2) = 1.
9. You have a model which predicts the marks of students on a test. The marks are given on a scale of 0-10, in increments of 0.5. You have been given a set of features for this. You will perform (a) ———— regression on this. A possible metric to compute the performance of this model is (b) ————.
Ans: (a) Logistic regression, (b) Accuracy. Since the grading buckets are discretized in intervals of 0.5 marks, we must perform logistic regression, as the task is to predict one out of k possible classes (grading buckets between 0 and 10 at intervals of 0.5). You cannot use linear regression, as it predicts a continuous value and will not constrain the result to be a multiple of 0.5.


10. Which loss function, MSE or MAE, is harder to optimize? State the reason.
Ans: MAE is harder to optimize. MAE has no closed-form solution because the absolute value makes it a piecewise function that is not differentiable at zero. For this reason, MAE is computationally more expensive: we cannot solve for it directly in terms of matrix algebra and must rely on iterative approximations.
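The difficulty is visible in the per-example gradients:

$$\frac{\partial}{\partial \hat{y}}\,(y-\hat{y})^2 = -2\,(y-\hat{y}), \qquad \frac{\partial}{\partial \hat{y}}\,|y-\hat{y}| = -\operatorname{sign}(y-\hat{y}).$$

The squared-error gradient is smooth and shrinks as predictions approach the targets, while the absolute-error gradient has constant magnitude and is undefined at the point where the prediction equals the target, which is why iterative methods (e.g., subgradient descent) are needed for MAE.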
11. You have a dataset which has x and y as input variables and z as the output variable. After plotting the data, you realize that the output values form a cone around the positive z-axis. What transformation can you apply to the input variables so that you can train a linear regression model on the dataset?
(a) (x, y) → (x + y)
(b) (x, y) → (x² + y²)
(c) (x, y) → (y, x)
(d) (x, y) → (x)
Ans: B. As the output forms a cone around the positive z-axis, the output shows a linear relationship with the input coordinates' distance from the origin in the x-y plane, r = √(x² + y²). Option B maps the inputs to the squared distance r², the only option that depends on this radial distance, so we can fit a linear regression model on the transformed feature.
12. Which loss function, L1 or L2, penalizes outliers more, and why?
Ans: In L2, the difference between the predicted and expected value is squared, so when the difference is large (as is the case with outliers) the error is relatively much greater than under L1 loss. Therefore L2 penalizes outliers more.
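A quick numerical illustration (made-up residuals, assuming NumPy):

```python
import numpy as np

residuals = np.array([1.0, -1.0, 2.0, 10.0])   # the 10.0 is an outlier

l1 = np.abs(residuals)   # [ 1.  1.  2. 10.]  -> total 14.0
l2 = residuals ** 2      # [ 1.  1.  4. 100.] -> total 106.0

print(l1[-1] / l1.sum())   # ~0.71: the outlier's share of the L1 loss
print(l2[-1] / l2.sum())   # ~0.94: the outlier's share of the L2 loss
```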
13. You have learnt about the gradient descent update equation (see below; the standard update rule is reproduced here, as the original formula was lost in extraction). You want to train your model for 100 iterations. What are the possible issue(s) that may arise due to an incorrect choice of the learning rate (α)?

$$\theta_j \leftarrow \theta_j - \alpha\,\frac{\partial J(\theta)}{\partial \theta_j}$$

(a) If the learning rate is too high, the updates will not move towards the intended minima.
Instead, it’ll move in the opposite direction.
(b) If the learning rate is too high, the gradient descent will arrive at the minima in very few
epochs (say 10 iterations). In subsequent epochs, it will move out of the minima.
(c) If the learning rate is too low, the updates will not move towards the intended minima.
Instead, it’ll move in the opposite direction.
(d) If the learning rate is too low, then the parameters may not converge to the optimal values
in 100 iterations even if the updates are moving in the correct direction.
Ans: D. Convergence will be slow if the learning rate is too low; it may require more iterations than the 100 available. A and C are incorrect because the update to the parameters always points in the direction of decreasing loss. B is incorrect because once the minimum is reached, the gradient becomes 0; hence there will be no update at all for the remaining iterations.


14. Imagine a classification problem with highly imbalanced data. The majority class occurs 99%
of the times in the training data. A model trained on this dataset yields 99% accuracy on the
test data. Which of the following can be said in such a case? (Select all that apply)

(a) Accuracy is not a good metric for problems having highly imbalanced data.
(b) Accuracy is a good metric for problems with highly imbalanced data.
(c) Precision, Recall are good metrics for problems with highly imbalanced data.
(d) Precision, Recall are not good metrics for problems with highly imbalanced data

Ans: A, C. In a highly imbalanced dataset, accuracy is not a good metric, since an accuracy of 99% might simply mean that the model always predicts the majority class. To measure the class-wise performance of the classifier, other metrics such as precision and recall should be used.
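A minimal sketch (assuming scikit-learn is available) of a majority-class predictor on a 99:1 dataset:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# 990 negatives, 10 positives; the "model" always predicts the majority class
y_true = np.array([0] * 990 + [1] * 10)
y_pred = np.zeros(1000, dtype=int)

print(accuracy_score(y_true, y_pred))                     # 0.99 -- looks great
print(precision_score(y_true, y_pred, zero_division=0))   # 0.0 -- no positive ever found
print(recall_score(y_true, y_pred))                       # 0.0 -- every positive missed
```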

15. Consider that you have a linear regression model with two model parameters, θ0 and θ1. Further, consider that after n iterations with m training inputs, the cost function J(θ0, θ1) = 0. Then select all that apply:

(a) All prediction outputs for the training inputs will lie perfectly on a straight line which also coincides with the true outputs for the training inputs.
(b) For the true output matrix Y ∈ R^(1×m), if Y is a null matrix, then θ0 and θ1 can be equal to zero.
(c) For the true output matrix Y ∈ R^(1×m), if Y is a null matrix, then θ0 and θ1 can NOT be equal to zero.
(d) J(θ0, θ1) can NOT be equal to 0.

Ans: A, B. If the cost function is zero and we have only two parameters, then our predicted output = θ0 + θ1·x, which necessarily lies on a straight line. If that straight line coincides with the true outputs, the predicted and true outputs are identical, hence the cost function J = 0. If Y is a null matrix, then all true outputs are zero; if θ0 and θ1 are zero, then all predicted values are zero too, again giving J = 0.

16. Which of the following is true?

(a) The validation set is used to estimate the generalization error of the final model, once all hyperparameters have been chosen.
(b) Model parameters are the parameters in the model that must be determined using the training data set.
(c) The test set is used to estimate the generalization error of each hyperparameter setting.
(d) Hyperparameters are adjustable parameters that must be tuned in order to obtain a model with optimal performance.

Ans: B, D. A and C are incorrect because they swap the roles of the validation and test sets: the validation set is used to estimate the generalization error of each hyperparameter setting, while the test set estimates the generalization error of the final model.


17. Five randomly chosen land areas were sold for the prices mentioned in the table below. A real
estate dealer wants to predict the house price for any given land area based on these samples.
Come up with a linear regression equation that best predicts the house prices.

Ans: This numerical is similar to one posted for Tutorial 3. There are alternate approaches as well (the best would be the normal equation method, if you had access to scientific calculators).

The regression equation is y_pred = b0 + b1·x. We solve for b0 and b1:

$$b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} = \frac{125}{350} \approx 0.357$$

$$b_0 = \bar{y} - b_1\,\bar{x} = 80 - 0.357 \times 85 = 49.655$$

Final answer: y_pred = 49.655 + 0.357·x. Or, w = 0.357, b = 49.655 (as given in the question).
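A small sketch that reproduces the arithmetic (the original data table is not shown above, so only the quoted summary statistics are used):

```python
# Summary statistics quoted in the solution (the raw table is not reproduced)
sum_dev_xy = 125.0    # sum of (x_i - x_mean) * (y_i - y_mean)
sum_dev_xx = 350.0    # sum of (x_i - x_mean)**2
x_mean, y_mean = 85.0, 80.0

b1 = sum_dev_xy / sum_dev_xx   # 0.35714... (the solution rounds this to 0.357)
b0 = y_mean - b1 * x_mean      # 49.643 at full precision; 49.655 if b1 is
                               # rounded to 0.357 first, as in the solution

print(f"y_pred = {b0:.3f} + {b1:.3f} * x")
```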
