Assignment-5-Solution
Assignment-5
QUESTION 1: [1 mark]
Which of the following is a disadvantage of Recurrent Neural Networks (RNNs)?
Correct Answer: c
QUESTION 2: [1 mark]
Correct Answer: b
Solution: Please refer to lecture slides.
_________________________________________________________________________
QUESTION 3: [1 mark]
Correct Answer: c
Solution: In an LSTM, the cell state stores long-term information.
_________________________________________________________________________
QUESTION 4: [1 mark]
In training an RNN, what technique is used to calculate gradients over multiple timesteps?
a. Backpropagation through Time (BPTT)
b. Stochastic Gradient Descent (SGD)
c. Dropout Regularization
d. Layer Normalization
Correct Answer: a
Solution: Please refer to lecture slides.
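As an illustration of the idea, here is a minimal sketch of BPTT for a toy RNN. It assumes scalar weights, the update h_t = tanh(w*h_(t-1) + u*x_t) with h_0 = 0, and a squared-error loss on the final hidden state. The function names and the loss are assumptions made for this example, not taken from the lecture slides.

```python
import math

def forward(w, u, xs):
    """Run the scalar RNN h_t = tanh(w*h_{t-1} + u*x_t), starting from h_0 = 0."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs  # hs[t] is the hidden state after t inputs

def loss(w, u, xs, target):
    """Assumed loss: squared error on the final hidden state only."""
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2

def bptt(w, u, xs, target):
    """Accumulate dL/dw and dL/du by walking backwards through the timesteps."""
    hs = forward(w, u, xs)
    dw = du = 0.0
    dh = hs[-1] - target                 # dL/dh_T
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)     # back through tanh at step t
        dw += da * hs[t - 1]             # the shared weight w is reused at every step
        du += da * xs[t - 1]             # the shared weight u is reused at every step
        dh = da * w                      # pass the gradient on to h_{t-1}
    return dw, du

dw, du = bptt(0.8, 0.3, [0.5, -1.0, 0.25], 0.1)
print(dw, du)
```

Because the same weights are used at every timestep, their gradients are the sum of the per-step contributions, which is what the backward loop accumulates.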
_________________________________________________________________________
QUESTION 5: [2 marks]
a. 210
b. 190
c. 90
d. 42
Correct Answer: d
Solution:
Input to hidden weights: 3×4=12
Hidden to hidden weights: 4×4=16
Hidden to output weights: 4×2=8
Bias terms: 4(hidden) + 2(output) = 6
Total: 12+16+8+6=42
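The count above can be reproduced with a short sketch. It assumes a single-layer vanilla RNN with 3 input units, 4 hidden units, and 2 output units, which are the sizes implied by the products in the solution. The function name is hypothetical.

```python
def rnn_parameter_count(n_input, n_hidden, n_output, bias=True):
    """Count trainable parameters of a single-layer vanilla RNN."""
    input_to_hidden = n_input * n_hidden
    hidden_to_hidden = n_hidden * n_hidden
    hidden_to_output = n_hidden * n_output
    # One bias per hidden unit and one per output unit.
    biases = (n_hidden + n_output) if bias else 0
    return input_to_hidden + hidden_to_hidden + hidden_to_output + biases

print(rnn_parameter_count(3, 4, 2))  # 42
```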
_________________________________________________________________________
QUESTION 6: [1 mark]
What is the time complexity for processing a sequence of length 'N' by an RNN, if the input
embedding dimension, hidden state dimension, and output vector dimension are all 'd'?
a. O(N)
b. O(N²d)
c. O(Nd)
d. O(Nd²)
Correct Answer: d
Solution: The time complexity of processing a sequence of length N by an RNN depends on
the computational cost of updating the hidden state at each time step.
At each time step, the RNN updates its hidden state h_t using the previous hidden state h_(t-1)
and the current input x_t. This update typically involves matrix multiplications of the form
h_t = tanh(W_hh h_(t-1) + W_xh x_t + b). Since all dimensions are d, each weight matrix is
d×d, and multiplying it by a d-dimensional vector costs O(d²).
Since these computations occur at every time step, the total complexity for a sequence of
length N is: O(N * d²)
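A rough operation count makes the scaling concrete. The sketch below assumes a vanilla RNN in which each step performs three matrix-vector products (input to hidden, hidden to hidden, hidden to output), each with a d×d matrix. The function name and the choice to count only multiply-adds are assumptions made for illustration.

```python
def rnn_multiply_adds(seq_len, d):
    """Approximate multiply-add count for a vanilla RNN forward pass."""
    # Each d x d matrix times a d-vector costs d * d multiply-adds.
    per_step = 3 * d * d
    return seq_len * per_step

print(rnn_multiply_adds(10, 8))   # 1920
print(rnn_multiply_adds(20, 8))   # doubling N doubles the cost
print(rnn_multiply_adds(10, 16))  # doubling d quadruples the cost
```

The count grows linearly in N and quadratically in d, consistent with O(N * d²).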
_________________________________________________________________________
QUESTION 7: [1 mark]
Correct Answer: a
Solution: Seq2Seq models accept variable-length input sequences, but the encoder
compresses each sequence into a fixed-size vector representation.
_________________________________________________________________________
QUESTION 8: [2 marks]
Given the following encoder and decoder hidden states, compute the attention scores. (Use
dot product as the scoring function)
a. 0.00235,0.04731,0.9503
b. 0.0737,0.287,0.6393
c. 0.9503,0.0137,0.036
d. 0.6393,0.0737,0.287
Correct Answer: a
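Solution:
Each score is the dot product of an encoder hidden state with the decoder hidden state:
e1 = 1*0.5 + 2*1 = 0.5 + 2 = 2.5
e2 = 3*0.5 + 4*1 = 1.5 + 4 = 5.5
e3 = 5*0.5 + 6*1 = 2.5 + 6 = 8.5
Applying softmax to (2.5, 5.5, 8.5) gives the normalized attention scores:
a1 = exp(2.5) / (exp(2.5) + exp(5.5) + exp(8.5)) ≈ 0.00236
a2 = exp(5.5) / (exp(2.5) + exp(5.5) + exp(8.5)) ≈ 0.04731
a3 = exp(8.5) / (exp(2.5) + exp(5.5) + exp(8.5)) ≈ 0.9503
These values match option a up to rounding.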
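The dot-product scores and their softmax normalization can be checked with a short sketch. The encoder states [1, 2], [3, 4], [5, 6] and the decoder state [0.5, 1] are inferred from the products in the solution, since the question's figure is not reproduced here. The helper names are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Values inferred from the worked solution (assumption).
encoder_states = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
decoder_state = [0.5, 1.0]

scores = [dot(h, decoder_state) for h in encoder_states]
weights = softmax(scores)

print(scores)                           # [2.5, 5.5, 8.5]
print([round(w, 5) for w in weights])
```

The largest weight falls on the third encoder state, which has the highest dot product with the decoder state.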
_________________________________________________________________________