Assignment_12_2022
Deep Learning
Assignment- Week 12
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10; Total marks: 10 × 1 = 10
______________________________________________________________________________
QUESTION 1:
While training a Variational Auto-encoder (VAE), it is assumed that 𝑃(𝑧|𝑥) ∼ 𝑁(0, 𝐼), i.e., given
an input sample, the encoder is forced to map its latent code to 𝑁(0, 𝐼). After training is
over, we want to use the VAE as a generative model. What is the best choice of
distribution from which to sample a latent vector to generate a novel example?
a. 𝑁(0, 𝐼): Normal distribution with zero mean and identity covariance
b. 𝑁(1, 𝐼): Normal distribution with mean = 1 and identity covariance
c. Uniform distribution between [-1, 1]
d. 𝑁(−1, 𝐼): Normal distribution with mean = -1 and identity covariance
Correct Answer: a
Detailed Solution:
Since during training we forced the latent code to follow 𝑁(0, 𝐼), the decoder has learnt to map
latent codes from that distribution only. If, during sampling, we provide vectors from any
other distribution, the decoder is unlikely to have encountered such vectors during training,
leading to unrealistic reconstructions. So, we should sample vectors from 𝑁(0, 𝐼) to use the
pre-trained VAE as a generative model.
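The sampling step can be sketched with NumPy. This is a minimal sketch: the `decoder` call at the end is hypothetical and stands in for a trained VAE decoder, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 16

# Sample a batch of latent vectors from N(0, I), the same prior the
# encoder's codes were matched to during training.
z = rng.standard_normal((1000, latent_dim))

# Empirically the batch has near-zero mean and near-identity covariance.
print(np.abs(z.mean(axis=0)).max())                    # close to 0
print(np.abs(np.cov(z.T) - np.eye(latent_dim)).max())  # close to 0

# x_new = decoder(z)  # hypothetical: a trained decoder maps each z to a novel sample
```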
______________________________________________________________________________
QUESTION 2:
When the GAN game has converged to its Nash equilibrium (when the Discriminator can do no
better than random guessing at distinguishing fake samples from real samples), what probability
(of belonging to the real class) does the Discriminator assign to a fake generated sample?
a. 1
b. 0.5
c. 0
d. 0.25
Correct Answer: b
NPTEL Online Certification Courses
Indian Institute of Technology Kharagpur
Detailed Solution:
Nash equilibrium is reached when the generated distribution, 𝑝𝑔 (𝑥) equals the original data
distribution, 𝑝𝑑𝑎𝑡𝑎 (𝑥), which leads to 𝐷(𝑥) = 0.5 for all 𝑥.
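This can be checked numerically from the standard result that, for a fixed generator, the optimal discriminator is 𝐷*(𝑥) = 𝑝𝑑𝑎𝑡𝑎(𝑥) / (𝑝𝑑𝑎𝑡𝑎(𝑥) + 𝑝𝑔(𝑥)). A small sketch:

```python
import numpy as np

# For a fixed generator, the optimal discriminator is
#   D*(x) = p_data(x) / (p_data(x) + p_g(x)).
# At Nash equilibrium p_g = p_data, so D*(x) = 0.5 for all x.
def optimal_discriminator(p_data, p_g):
    return p_data / (p_data + p_g)

x = np.linspace(-3, 3, 101)
p = np.exp(-0.5 * x**2)   # an unnormalized density is fine for the ratio

d_star = optimal_discriminator(p, p)   # p_g == p_data at equilibrium
print(d_star.min(), d_star.max())      # both 0.5
```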
______________________________________________________________________________
QUESTION 3:
Why is the re-parameterization trick used in a VAE?
Correct Answer: b
Detailed Solution:
We cannot sample in a differentiable manner from within the computational graph of a
neural network. The re-parameterization trick moves the sampling operation outside the
main computational graph, which allows us to perform regular gradient-descent optimization.
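The trick itself is a one-line change: instead of sampling z ∼ 𝑁(μ, σ²) directly, sample noise ε ∼ 𝑁(0, 𝐼) outside the graph and compute z deterministically. A NumPy sketch of the forward computation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Re-parameterization: z = mu + sigma * eps, with eps ~ N(0, I) sampled
# outside the graph. z is then a differentiable function of mu and sigma
# (dz/dmu = 1, dz/dsigma = eps), so gradients can reach the encoder.
mu, sigma = 2.0, 0.5
eps = rng.standard_normal(100_000)
z = mu + sigma * eps

# z still follows N(mu, sigma^2), as the empirical statistics confirm.
print(z.mean())   # ~ 2.0
print(z.std())    # ~ 0.5
```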
______________________________________________________________________________
QUESTION 4:
Which one of the following graphical models fully represents a Variational Auto-encoder (VAE)
realization?
Correct Answer: a
Detailed Explanation:
For practical realization of VAE, we have an encoder 𝑄(∙) which receives an input signal, 𝑥 and
generates a latent code, 𝑧. This part of the network can be denoted by 𝑄(𝑧|𝑥) and directed from
𝑥 to 𝑧. Next, we have a decoder section which takes the encoded z vector to reconstruct the input
signal, 𝑥. This part of the network is represented by 𝑃(𝑥|𝑧) and should be directed from 𝑧 to 𝑥.
______________________________________________________________________________
QUESTION 5:
Which one of the following computational graphs correctly depicts the re-parameterization trick
deployed in a practical Variational Auto-encoder (VAE) implementation? Circular nodes
represent random nodes in the models and quadrilateral nodes represent deterministic
nodes.
a b c d
Correct Answer: a
Detailed Solution:
With the re-parameterization trick, the only random component in the network is the node ∊,
which is sampled from 𝑁(0, 𝐼). The nodes μ and σ are deterministic. Since ∊ is sampled
outside the computational graph, the resulting z vector is a deterministic function of the
given μ, σ and ∊; if z were not deterministic, we could not back-propagate gradients
through it. In the computation graph, the forward arrows therefore point from μ, σ and ∊
towards z for computing the z vector.
______________________________________________________________________________
QUESTION 6:
For the following min-max game, at which state of (x, y) do we achieve the Nash equilibrium
(the state where change of one variable does not alter the state of the other variable)?
a. x = 0, y = -1
b. x = 0, y = 0
c. x = 0, y = 1
d. x = ∞ (infinity), y = 0
Correct Answer: b
Detailed Solution:
The Nash equilibrium is x = y = 0. It is the only state in which neither player can improve
the outcome by unilaterally changing their own variable, so no opponent's action alters the
game's outcome.
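The game's objective is not reproduced in this extract. Purely as an illustration, assume the canonical bilinear game V(x, y) = x·y, minimized over x and maximized over y, whose unique Nash equilibrium is (0, 0), consistent with the stated answer. A quick numerical check that unilateral deviations do not help either player:

```python
import numpy as np

# Hypothetical payoff: the x-player MINIMIZES V, the y-player MAXIMIZES V.
def V(x, y):
    return x * y

# At (0, 0), changing one variable alone leaves the payoff unchanged,
# so neither player has an incentive to deviate.
deviations = np.linspace(-1, 1, 21)
print(all(V(d, 0.0) == V(0.0, 0.0) for d in deviations))  # x-player cannot improve
print(all(V(0.0, d) == V(0.0, 0.0) for d in deviations))  # y-player cannot improve
```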
______________________________________________________________________________
QUESTION 7:
Which of the following losses can be used to optimize the Generator's objective (while training a
Generative Adversarial Network) by MINIMIZING with a gradient-descent optimizer? Consider the
cross-entropy loss,
and D(G(z)) = probability of belonging to real class as output by the Discriminator for a given
generated sample G(z).
a. CE(1, D(G(z)))
b. CE(1, -D(G(z)))
c. CE(1, 1 - D(G(z)))
d. CE(1, 1 / D(G(z)))
Correct Answer: a
Detailed Solution:
Except for option (a), none of the objective functions is minimized at D(G(z)) = 1, which is
the Generator's goal, i.e. to force the Discriminator to output probability 1 for a generated
sample. The loss in option (a) is the only choice that keeps decreasing as D(G(z))
increases. Note also that D(G(z)) ∈ [0, 1] is required.
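The monotonicity argument can be verified numerically. A minimal sketch, where `ce_target_one` is binary cross-entropy specialized to a target of 1:

```python
import numpy as np

# Binary cross-entropy with target 1 reduces to CE(1, p) = -log(p).
def ce_target_one(p):
    return -np.log(p)

d = np.linspace(0.05, 0.95, 50)   # candidate values of D(G(z))

loss_a = ce_target_one(d)         # CE(1, D(G(z))) = -log D(G(z))

# The loss strictly decreases as D(G(z)) grows, so minimizing it pushes
# D(G(z)) towards 1 - exactly the Generator's goal.
print(bool(np.all(np.diff(loss_a) < 0)))
```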
______________________________________________________________________________
QUESTION 8:
While training a Generative Adversarial Network, which of the following losses CANNOT be
used to optimize the Discriminator's objective (while only sampling from the distribution of
generated samples) by MAXIMIZING with a gradient ASCENT optimizer? Consider the cross-entropy
loss,
and D(G(z)) = probability of belonging to real class as output by the Discriminator for a given
generated sample, G(z) from a noise vector, z.
a. CE(1, D(G(z)))
b. -CE(1, D(G(z)))
c. CE(1, 1 + D(G(z)))
d. -CE(1, 1 - D(G(z)))
Correct Answer: b
Detailed Solution:
During optimization of the discriminator, when we sample from the fake/generated
distribution, we want D(G(z)) = 0. Since we use gradient ASCENT optimization, the
objective function should increase as we approach D(G(z)) = 0 and decrease as D(G(z))
increases. Apart from option (b), all other options satisfy these conditions.
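Each candidate can be written out with CE(1, p) = −log(p) and checked for the required monotonicity; a small sketch:

```python
import numpy as np

# With target 1, cross-entropy reduces to CE(1, p) = -log(p).
d = np.linspace(0.05, 0.95, 50)   # D(G(z)); the discriminator wants d -> 0,
                                  # so an ASCENT objective must DECREASE in d.
opt_a = -np.log(d)       # CE(1, D(G(z)))
opt_b = np.log(d)        # -CE(1, D(G(z)))
opt_c = -np.log(1 + d)   # CE(1, 1 + D(G(z)))
opt_d = np.log(1 - d)    # -CE(1, 1 - D(G(z)))

for name, v in [("a", opt_a), ("b", opt_b), ("c", opt_c), ("d", opt_d)]:
    print(name, "decreasing in D(G(z)):", bool(np.all(np.diff(v) < 0)))
# Only option b is increasing in D(G(z)), so maximizing it drives
# D(G(z)) towards 1 instead of 0.
```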
______________________________________________________________________________
QUESTION 9:
For training VAE, we want to predict an unknown distribution of latent code given an observed
sample, i.e., P(z|x), but we approximate it with some distribution Q(z|x) which we can control
by varying some known parameters. Which of the following loss functions is used as a loss to
minimize?
a. − ∑𝑧 𝑄(𝑧|𝑥) log [𝑃(𝑥, 𝑧) / 𝑄(𝑧|𝑥)]
b. − ∑𝑥 𝑄(𝑧|𝑥) log [𝑃(𝑥, 𝑧) / 𝑄(𝑧|𝑥)]
c. ∑𝑧 𝑃(𝑧|𝑥) log [𝑃(𝑥, 𝑧) / 𝑄(𝑧|𝑥)]
d. None of the above
Correct Answer: a
Detailed Solution:
Since we are trying to approximate P(z|x) with Q(z|x), we will try to minimize the KL
divergence KL(Q(z|x) || P(z|x)), which is equivalent to maximizing the well-known
variational lower bound ∑𝑧 𝑄(𝑧|𝑥) log [𝑃(𝑥, 𝑧) / 𝑄(𝑧|𝑥)].
So, we will minimize − ∑𝑧 𝑄(𝑧|𝑥) log [𝑃(𝑥, 𝑧) / 𝑄(𝑧|𝑥)]. See the lecture videos for detailed derivations.
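The underlying identity, log P(x) = ELBO + KL(Q(z|x) || P(z|x)), can be verified on a toy discrete example; the particular numbers below are arbitrary:

```python
import numpy as np

# Toy check of  log P(x) = ELBO + KL(Q(z|x) || P(z|x))
# for one fixed observation x and a discrete latent z with 3 states.
p_xz = np.array([0.10, 0.25, 0.05])       # joint P(x, z) at this x
p_x = p_xz.sum()                          # marginal P(x)
p_z_given_x = p_xz / p_x                  # true posterior P(z|x)

q = np.array([0.5, 0.3, 0.2])             # some approximate Q(z|x)

elbo = np.sum(q * np.log(p_xz / q))       # sum_z Q(z|x) log [P(x,z)/Q(z|x)]
kl = np.sum(q * np.log(q / p_z_given_x))  # KL(Q || P(z|x)) >= 0

# Since log P(x) is fixed, minimizing -ELBO is the same as shrinking the KL.
print(np.isclose(np.log(p_x), elbo + kl))
```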
______________________________________________________________________________
QUESTION 10:
The figure above shows latent-vector subtraction of two concepts, “face with glasses” and
“glasses”. What is expected from the resultant vector?
a. glasses
b. face without glasses
c. face with 2 glasses
d. None of the above
Correct Answer: b
Detailed Solution:
It is expected that the VAE latent space supports vector arithmetic. The resultant vector is
the subtraction of the two concept vectors, which yields a vector representing a face
without glasses.
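Under a linearity assumption on the latent space, this arithmetic can be sketched with made-up vectors; in a real VAE, `z_face` and `z_glasses` would come from encoding images, whereas here they are hypothetical:

```python
import numpy as np

# Hypothetical latent directions standing in for encoder outputs.
rng = np.random.default_rng(0)
z_face = rng.standard_normal(8)      # "face without glasses" concept
z_glasses = rng.standard_normal(8)   # "glasses" concept

# Additive-concept assumption: a face WITH glasses is the sum of the two.
z_face_with_glasses = z_face + z_glasses

# Subtracting "glasses" recovers the bare face direction.
result = z_face_with_glasses - z_glasses
print(np.allclose(result, z_face))
```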
______________________________________________________________________________
************END*******