
DEEP LEARNING WEEK 11

1. We construct an RNN for the sentiment classification of text, where a text can have positive or negative sentiment. Suppose the dimension of the one-hot encoded words is R^(100×1) and the dimension of the state vector si is R^(50×1). What is the total number of parameters in the network? (Do not include biases.) (NAT)
Answer: Range (7599.5, 7601.5)
Solution: The number of weight parameters in the network is 100×50 (input to si) + 50×50 (si to si+1) + 50×2 (si to the output classes positive and negative). So the total number of parameters in the network is 5000 + 2500 + 100 = 7600.
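As a quick check, the count can be reproduced with a short Python snippet; the variable names below are illustrative and not part of the original question.

```python
# Parameter count for the sentiment-classification RNN (biases excluded).
input_dim = 100    # one-hot word vectors in R^(100x1)
state_dim = 50     # state vectors si in R^(50x1)
num_classes = 2    # positive / negative

input_to_state = input_dim * state_dim      # U: 100 x 50 = 5000
state_to_state = state_dim * state_dim      # W: 50 x 50  = 2500
state_to_output = state_dim * num_classes   # V: 50 x 2   = 100

total = input_to_state + state_to_state + state_to_output
print(total)  # 7600
```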
2. Arrange the following sequence in the order they are performed by LSTM at time step t.
[Selectively read, Selectively write, Selectively forget]

a)Selectively read, Selectively write, Selectively forget


b)Selectively write, Selectively read, Selectively forget
c)Selectively read, Selectively forget, Selectively write
d)Selectively forget, Selectively write, Selectively read

Answer: c)
Solution: At time step t we first selectively read the new candidate state computed from ht−1 and xt (via the input gate), then selectively forget part of the previous cell state st−1 (via the forget gate); together these produce the new state st. Finally we selectively write to create ht from st (via the output gate), which is used at time step t+1.
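To make the ordering concrete, here is a minimal NumPy sketch of one time step, assuming the gate values i_t, f_t, o_t and the candidate state s_tilde are already given (they would normally be computed from learned weights); the names are illustrative, and tanh is used as the output nonlinearity.

```python
import numpy as np

def lstm_step(s_prev, s_tilde, i_t, f_t, o_t):
    """One LSTM time step in the read -> forget -> write order."""
    read = i_t * s_tilde        # selectively read the candidate state
    keep = f_t * s_prev         # selectively forget part of s_{t-1}
    s_t = keep + read           # new cell state s_t
    h_t = o_t * np.tanh(s_t)    # selectively write: output state h_t
    return s_t, h_t

# toy usage with hand-picked gate values
s_prev = np.array([0.5, -0.2])
s_tilde = np.array([0.9, 0.1])
i_t, f_t, o_t = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
print(lstm_step(s_prev, s_tilde, i_t, f_t, o_t))
```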
3. What are the problems in the RNN architecture? (MSQ)

a)Morphing of information stored at each time step.


b)Exploding and Vanishing gradient problem.
c) Errors caused at time step tn cannot easily be related to time steps far in the past
d)All of the above

Answer: d)
Solution: Information stored in the network gets morphed at every time step due to new input. Exploding and vanishing gradient problems are caused by the long dependency chains in RNNs; for the same reason, an error observed at time step tn is hard to attribute to time steps far in the past.
4. We are given an RNN where the max eigenvalue λ of the weight matrix is 0.9. The activation function used in the RNN is logistic. What can we say about ∇ = ||∂s20/∂s1||?
a)Value of ∇ is close to 0.
b)Value of ∇ is very high.
c)Value of ∇ is 3.5.
d)Insufficient information to say anything.

Answer: a)
Solution: The derivative of the logistic function is always less than 1/4. Hence the per-step gradient ∇ = ||∂st/∂st−1|| ≤ γ ∗ λ < 1, where γ bounds the activation derivative and λ is the max eigenvalue of the weight matrix. Backpropagating through the chain of states si gives ||∂s20/∂s1|| ≤ (γ ∗ λ)^19, which is very close to 0.
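The bound can be checked numerically; a minimal sketch assuming γ = 1/4 (the maximum derivative of the logistic function) and λ = 0.9 as given in the question.

```python
gamma = 0.25                  # max derivative of the logistic (sigmoid) function
lam = 0.9                     # max eigenvalue of the weight matrix
bound = (gamma * lam) ** 19   # 19 factors in the chain from s_1 to s_20
print(bound)                  # ~4.9e-13, effectively zero
```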
5. What is the objective (loss) function in an RNN?

a)Cross Entropy
b)Sum of cross-entropy
c)Squared error
d)Accuracy

Answer: b)
Solution: RNNs are used for sequential tasks. At each time step we have a predicted output and an actual output, and the loss between the two is measured by cross-entropy. The total loss of the RNN is the sum of the cross-entropy losses across all such time steps in the network.
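A minimal sketch of this loss on a toy sequence, assuming y_pred[t] is the predicted class distribution and y_true[t] is the correct class index at step t (names are illustrative).

```python
import numpy as np

def rnn_loss(y_pred, y_true):
    """Sum of per-time-step cross-entropy losses."""
    return sum(-np.log(y_pred[t][y_true[t]]) for t in range(len(y_true)))

# toy example: 3 time steps, 2 classes
y_pred = [np.array([0.7, 0.3]), np.array([0.2, 0.8]), np.array([0.9, 0.1])]
y_true = [0, 1, 0]
print(rnn_loss(y_pred, y_true))  # -(log 0.7 + log 0.8 + log 0.9)
```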
6. Which of the following is a limitation of traditional feedforward neural networks in handling
sequential data? (MSQ)
a) They can only process fixed-length input sequences
b) They can handle variable-length input sequences
c) They can’t model temporal dependencies between sequential data
d) All of These
Answer: a), c)
Solution: Traditional feedforward neural networks are limited in their ability to handle
sequential data because they can only process fixed-length input sequences. In contrast,
recurrent neural networks (RNNs) can handle variable-length input sequences and model the
temporal dependencies between sequential data.
7. Which of the following techniques can be used to address the exploding gradient problem in
RNNs?
a) Gradient clipping
b) Dropout
c) L1 regularization
d) L2 regularization
Answer: a) Gradient clipping
Solution: Gradient clipping is a technique used to address the exploding gradient problem
in RNNs. It involves capping the magnitude of the gradients during backpropagation, which
helps prevent them from becoming too large and destabilizing the network.
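A minimal sketch of norm-based gradient clipping as it is commonly implemented; the threshold value here is illustrative.

```python
import numpy as np

def clip_gradient(grad, threshold=5.0):
    """Rescale the gradient if its norm exceeds the threshold."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

# usage: a large gradient gets scaled down to norm 5
print(clip_gradient(np.array([30.0, 40.0])))  # [3. 4.]
```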
8. Which of the following is a formula for computing the output of an LSTM cell?
a) ot = σ(Wo [ht−1 , xt ] + bo )
b) ft = σ(Wf [ht−1 , xt ] + bf )
c) ct = ft ∗ ct−1 + it ∗ gt
d) ht = ot ∗ tanh(ct )
Answer: d)
Solution: The formula for computing the output of an LSTM cell is ht = ot ∗ tanh(ct ) where
ot is the output gate, ct is the cell state, and ht is the output at time t.
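For context, the four formulas above fit together as a single LSTM step. Below is a minimal NumPy sketch, assuming concatenated [ht−1, xt] inputs as in the options; the weight and bias arguments are placeholders for learned parameters, not values from the question.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x_t, h_prev, c_prev, Wf, Wi, Wg, Wo, bf, bi, bg, bo):
    z = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t]
    f_t = sigmoid(Wf @ z + bf)          # forget gate        (option b)
    i_t = sigmoid(Wi @ z + bi)          # input gate
    g_t = np.tanh(Wg @ z + bg)          # candidate cell state
    o_t = sigmoid(Wo @ z + bo)          # output gate        (option a)
    c_t = f_t * c_prev + i_t * g_t      # cell state update  (option c)
    h_t = o_t * np.tanh(c_t)            # cell output        (option d)
    return h_t, c_t
```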

9. What is the purpose of the reset gate in a GRU network?

A) To decide how much of the previous hidden state to forget
B) To decide how much of the current input to add to the cell state
C) To decide how much of the previous hidden state to keep for the current time step
D) None of These
Answer: A), C)
Solution: The reset gate controls how much of the previous hidden state is forgotten or kept when computing the candidate hidden state for the current time step, so both A) and C) describe its purpose.
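A minimal NumPy sketch of where the reset gate acts, assuming one common form of the standard GRU formulation; the weight and bias arguments are placeholders for learned parameters, not values from the question.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, Wr, Wz, Wh, br, bz, bh):
    z_in = np.concatenate([h_prev, x_t])
    r_t = sigmoid(Wr @ z_in + br)   # reset gate
    z_t = sigmoid(Wz @ z_in + bz)   # update gate
    # the reset gate decides how much of h_{t-1} is kept when forming the candidate
    h_tilde = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]) + bh)
    h_t = (1 - z_t) * h_prev + z_t * h_tilde   # new hidden state
    return h_t
```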
10. Which of the following is true about LSTM and GRU networks?
A) LSTM networks have more gates than GRU networks
B) GRU networks have more gates than LSTM networks
C) LSTM and GRU networks have the same number of gates
D) Both LSTM and GRU networks have no gates
Answer: A) LSTM networks have more gates than GRU networks
Explanation: LSTM networks have three gates (input, output, and forget gates), while
GRU networks have two gates (reset and update gates). Therefore, LSTM networks have
more gates than GRU networks.
