DL - Assignment 9 Solution
Deep Learning
Assignment- Week 9
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10 Total marks: 10 × 1 = 10
______________________________________________________________________________
QUESTION 1:
For the following Figure A and Figure B of the loss landscape, choose the correct statement.
[Figure A and Figure B: loss plots]
a. Figure A has a small learning rate, Figure B has a high learning rate
b. Figure A has a high learning rate, Figure B has a small learning rate
c. Figure A and Figure B have different loss functions
d. None of the above
Correct Answer: a
Detailed Solution:
Figure A has a small learning rate, which is evident from the slow convergence before reaching the
optimal valley point. Figure B shows highly fluctuating weight updates and therefore has a high
learning rate. (Figures taken from the book Dive into Deep Learning.)
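As a rough illustration of the two regimes (the toy function and the learning rates below are chosen only for this sketch and are not taken from the figures), the loop runs gradient descent on f(x) = x²: the small learning rate converges slowly and monotonically, while the large one produces fluctuating iterates.

import numpy as np

def gradient_descent(lr, steps=10, x0=5.0):
    # Gradient descent on f(x) = x^2, whose gradient is 2x; returns the iterates.
    x = x0
    path = [x]
    for _ in range(steps):
        x = x - lr * 2 * x          # update: x <- x - lr * f'(x)
        path.append(x)
    return np.array(path)

print(gradient_descent(lr=0.05))    # small learning rate: slow, monotone approach to 0 (Figure A regime)
print(gradient_descent(lr=0.95))    # large learning rate: iterates keep flipping sign (Figure B regime)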
____________________________________________________________________________
QUESTION 2:
Which of the following problems is primarily solved by the residual connection in ResNet?
Correct Answer: a
Detailed Solution:
Residual (skip) connections add an identity path around a block, so gradients can flow directly to
earlier layers. This primarily addresses the vanishing-gradient/degradation problem that makes very
deep plain networks hard to train.
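A minimal NumPy sketch of the idea (the two-layer transform F and all sizes below are made up for illustration): the block outputs F(x) + x, so an identity path always exists for the signal, and during backpropagation the gradient can likewise flow straight through the skip connection.

import numpy as np

def residual_block(x, W1, W2):
    # Sketch of a residual block: y = F(x) + x, where F is a small two-layer transform.
    h = np.maximum(0, x @ W1)       # ReLU(x W1)
    fx = h @ W2                     # F(x)
    return fx + x                   # the skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W1 = rng.normal(size=(8, 8)) * 0.01   # near-zero weights, so F(x) is almost zero
W2 = rng.normal(size=(8, 8)) * 0.01
y = residual_block(x, W1, W2)
print(np.allclose(y, x, atol=1e-2))   # True: the block falls back to the identity mapping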
____________________________________________________________________________
QUESTION 3:
The following is the equation of the update vector for the momentum optimizer. Which of the
following is true for γ?
V_t = γ V_{t−1} + η ∇_θ J(θ)
a. 𝛾 is the momentum term which indicates how much acceleration you want
b. 𝛾 is the step size
c. 𝛾 is the first order moment
d. 𝛾 is the second order moment
Correct Answer: a
Detailed Solution:
A fraction of the update vector from the past time step is added to the current update vector.
γ is that fraction; it indicates how much acceleration you want, and its value lies between 0 and 1.
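A minimal NumPy sketch of this update rule (the quadratic objective and the values of γ and η are illustrative only):

import numpy as np

def momentum_step(theta, v, grad, gamma=0.9, eta=0.1):
    # One momentum update: v_t = gamma * v_{t-1} + eta * grad, then theta <- theta - v_t.
    v = gamma * v + eta * grad
    return theta - v, v

theta = np.array([5.0])
v = np.zeros_like(theta)
for _ in range(5):
    grad = 2 * theta                 # gradient of J(theta) = theta^2
    theta, v = momentum_step(theta, v, grad)
    print(theta, v)                  # the velocity builds up while the gradient keeps the same sign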
____________________________________________________________________________
QUESTION 4:
Choose the correct option
Statement 1: Stochastic gradient descent is less prone to getting stuck in local minima because
of inherent noise due to minibatch sampling.
Statement 2: Large learning rates with an annealing schedule can be used with a higher mini-batch
size.
a. Statement 1 is True, Statement 2 is True
b. Statement 1 is False, Statement 2 is True
c. Statement 1 is True, Statement 2 is False
d. Statement 1 is False, Statement 2 is False
Correct Answer: a
Detailed Solution:
Stochastic gradient descent does not consider the whole dataset for an update and thus has noisier
updates; because of this noise, the updates are likely to escape shallow local minima. With a higher
mini-batch size, the noise in SGD goes down, so a larger learning rate combined with an annealing
schedule can be used.
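A small sketch of why this holds (toy linear-regression data, all values illustrative): the spread of mini-batch gradients around the full-batch gradient shrinks as the batch size grows, which is what makes a larger, annealed learning rate safe.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=10_000)
w = np.zeros(5)

def batch_grad(Xb, yb, w):
    # Mean-squared-error gradient on a batch.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

full = batch_grad(X, y, w)
for bs in (16, 256, 4096):
    devs = []
    for _ in range(200):
        idx = rng.choice(len(X), size=bs, replace=False)
        devs.append(np.linalg.norm(batch_grad(X[idx], y[idx], w) - full))
    print(bs, np.mean(devs))         # the deviation shrinks roughly like 1 / sqrt(batch size)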
____________________________________________________________________________
QUESTION 5:
Which of the following is the simplest optimizer, in terms of computational requirements, to deal
with oscillations and saddle points?
Correct Answer: b
Detailed Solution:
Mini-batch gradient descent makes a parameter update after seeing just a subset of examples, so
the direction of the update has some variance and the path taken toward convergence will
"oscillate". Using momentum reduces these oscillations and helps move past saddle points, where
gradients vanish, at very little extra computational cost (only one extra velocity vector is stored).
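A toy demonstration of the oscillation damping (the quadratic objective and all constants are chosen only for illustration): on an ill-conditioned bowl, the steep coordinate keeps oscillating under the plain update but is damped once momentum is added.

import numpy as np

def steep_axis_amplitude(gamma, lr=0.02, steps=50):
    # Minimize f(x, y) = x^2 + 50*y^2 with (momentum) gradient descent; gamma = 0 is the plain update.
    # Returns the largest |y| over the last 10 steps, i.e. how much the steep coordinate still oscillates.
    theta = np.array([5.0, 1.0])
    v = np.zeros(2)
    ys = []
    for _ in range(steps):
        grad = np.array([2.0 * theta[0], 100.0 * theta[1]])
        v = gamma * v + lr * grad
        theta = theta - v
        ys.append(abs(theta[1]))
    return max(ys[-10:])

print(steep_axis_amplitude(gamma=0.0))   # plain update: the steep axis keeps oscillating at full amplitude
print(steep_axis_amplitude(gamma=0.9))   # with momentum: the oscillation is strongly damped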
____________________________________________________________________________
QUESTION 6:
Given the following three figures A, B and C, choose the correct option:
[Figure A, Figure B and Figure C: see figures from Dive into Deep Learning]
Correct Answer: b
Detailed Solution:
RMSProp/AdaGrad show less oscillation along the steep slopes of the contour lines. A low value of
momentum makes the optimizer converge with a high degree of oscillation, whereas a high value of
momentum dampens the oscillation in high-gradient regions. (Figures taken from the book Dive
into Deep Learning.)
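A rough sketch of the adaptive-step behaviour (toy quadratic, constants illustrative; an RMSProp-style update is used here as one representative of the AdaGrad family): dividing each coordinate's gradient by a running root-mean-square of its past gradients shrinks the effective step along the steep direction, so the trajectory oscillates far less.

import numpy as np

def descend(adaptive, steps=15, lr=0.1, rho=0.9, eps=1e-8):
    # Minimize f(x, y) = x^2 + 50*y^2; if adaptive, use an RMSProp-style per-parameter step size.
    theta = np.array([5.0, 1.0])
    s = np.zeros(2)
    for _ in range(steps):
        grad = np.array([2.0 * theta[0], 100.0 * theta[1]])
        if adaptive:
            s = rho * s + (1 - rho) * grad ** 2              # running average of squared gradients
            theta = theta - lr * grad / (np.sqrt(s) + eps)   # the steep coordinate gets a smaller effective step
        else:
            theta = theta - lr * grad
    return theta

print(descend(adaptive=False))   # plain gradient descent at this learning rate: the steep (y) coordinate blows up
print(descend(adaptive=True))    # adaptive scaling: both coordinates move steadily towards (0, 0)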
______________________________________________________________________________
QUESTION 7:
For a function f(θ0,θ1), if θ0 and θ1 are initialized at a global minimum, then what should be the
values of θ0 and θ1 after a single iteration of gradient descent?
Correct Answer: b
Detailed Solution:
At a minimum (global or local), the gradient is zero, so a gradient descent step does not change the
parameters.
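A one-line check (using the toy function f(θ0, θ1) = θ0² + θ1², whose global minimum is the origin; the learning rate is arbitrary):

import numpy as np

theta = np.array([0.0, 0.0])      # initialized at the global minimum of f = theta0^2 + theta1^2
grad = 2 * theta                  # the gradient is zero at the minimum
theta_next = theta - 0.1 * grad   # one gradient descent step with eta = 0.1
print(theta_next)                 # [0. 0.] -> the parameters do not change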
______________________________________________________________________________
QUESTION 8:
What can be one of the practical problems of exploding gradient?
a. Too large update of weight values leading to unstable network
b. Too small update of weight values inhibiting the network to learn
c. Too large update of weight values leading to faster convergence
Correct Answer: a
Detailed Solution:
Exploding gradients are a problem where large error gradients accumulate and result in very
large updates to the neural network's weights during training. This makes the model unstable and
unable to learn from the training data.
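A small numerical illustration with made-up values: at the same learning rate, an exploded gradient moves the weight far outside any reasonable range, while clipping the gradient norm (a common practical remedy) keeps the update bounded.

import numpy as np

eta = 0.01
w = np.array([0.5])
normal_grad = np.array([2.0])
exploded_grad = np.array([2.0e6])        # a gradient blown up by repeated multiplication through deep layers

print(w - eta * normal_grad)             # [0.48]     -> a sensible, small update
print(w - eta * exploded_grad)           # [-19999.5] -> the weight jumps to an extreme value; training destabilizes

clipped = exploded_grad * min(1.0, 5.0 / np.linalg.norm(exploded_grad))   # clip the gradient norm to 5
print(w - eta * clipped)                 # [0.45]     -> the update stays bounded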
____________________________________________________________________________
QUESTION 9:
Two versions of SGD are implemented as follows:
SGD1: samples data points in the same order in every epoch while constructing minibatches
SGD2: samples data points in a random order in every epoch to construct minibatches
Correct Answer: b
Detailed Solution:
The stochasticity of gradient descent adds noise, which makes it less likely to be attracted
towards local minima. The deterministic variant is more likely to get trapped, as it follows the
same sequence of gradient updates in every epoch.
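A minimal sketch of the two sampling schemes (the helper name and sizes are hypothetical, NumPy only): SGD1 walks the data in a fixed order every epoch, whereas SGD2 reshuffles the indices at the start of each epoch, which is what injects fresh stochasticity into the gradient sequence.

import numpy as np

def minibatch_indices(n, batch_size, shuffle, rng):
    # Yield the mini-batch index arrays for one epoch.
    order = rng.permutation(n) if shuffle else np.arange(n)
    for start in range(0, n, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
n, batch_size = 8, 4

for epoch in range(2):   # SGD1: identical batches every epoch (a deterministic gradient sequence)
    print("SGD1 epoch", epoch, [b.tolist() for b in minibatch_indices(n, batch_size, False, rng)])

for epoch in range(2):   # SGD2: batches drawn from a fresh permutation each epoch (noisier updates)
    print("SGD2 epoch", epoch, [b.tolist() for b in minibatch_indices(n, batch_size, True, rng)])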
______________________________________________________________________________
QUESTION 10:
Choose the correct statement with regard to GoogLeNet.
a. Multiple auxiliary classifiers are used at different depth levels to avoid the vanishing
gradient problem
b. Bottleneck layers reduce the number of learnable weights
c. The inception module captures information of the image at varying resolutions
d. All of the above
Correct Answer: d
Detailed Solution:
Auxiliary classifiers attached at intermediate depths inject additional gradient signal and help
mitigate the vanishing gradient problem; 1×1 bottleneck layers reduce the number of channels and
hence the number of learnable weights; and the inception module applies filters of several sizes in
parallel, capturing information at varying resolutions. Hence all three statements are correct.
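A quick parameter count (the channel numbers are made up for illustration) of the bottleneck idea in statement (b): a 1×1 convolution first reduces the number of channels, so the following 3×3 convolution needs far fewer learnable weights.

in_ch, mid_ch, out_ch = 256, 64, 256             # assumed channel counts, bias terms ignored
direct = 3 * 3 * in_ch * out_ch                  # a 3x3 conv applied directly: 589,824 weights
bottleneck = 1 * 1 * in_ch * mid_ch + 3 * 3 * mid_ch * out_ch   # 1x1 reduction then 3x3: 163,840 weights
print(direct, bottleneck)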
____________________________________________________________________________
************END*******