Deep Learning
Recurrent Neural Networks (RNNs) are a class of neural networks designed to handle
sequential data and time-series problems. Unlike feedforward networks, RNNs have
internal memory that enables them to retain information from previous inputs and utilize
it in processing subsequent inputs. This memory makes them well-suited for tasks where
the current input is dependent on prior inputs, such as natural language processing,
speech recognition, and time-series prediction.
The core of an RNN is a loop that allows information to be passed from one step of the
network to the next. This is achieved by maintaining a hidden state (or internal state) at
each time step. The hidden state acts as memory, holding information about previous time
steps.
In a simple RNN:
- The input at time step \( t \) is denoted by \( x_t \).
- The hidden state at time \( t \) is denoted by \( h_t \), which is updated based on the
current input \( x_t \) and the hidden state from the previous time step \( h_{t-1} \).
- The output at time \( t \) is \( o_t \), which can be based on the hidden state or both the
hidden state and the input.
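One common way to write these updates (the notation here is illustrative; texts differ in the exact symbols) is

\[
h_t = \tanh\!\left(W_{xh} x_t + W_{hh} h_{t-1} + b_h\right), \qquad
o_t = W_{ho} h_t + b_o,
\]

where \( W_{xh} \), \( W_{hh} \), and \( W_{ho} \) are learned weight matrices and \( b_h \), \( b_o \) are bias vectors.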
The network is "recurrent" because the hidden state at time \( t \) depends on the hidden
state from the previous time step \( t-1 \), creating a feedback loop. This feedback
mechanism enables the network to process sequences of data.
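The recurrence can be made concrete with a short forward-pass sketch. The following is a minimal NumPy illustration of the update above; all names, sizes, and the choice of tanh are illustrative assumptions, and training is omitted entirely.

```python
# A minimal sketch of the RNN recurrence in NumPy (illustrative, not a
# reference implementation): one hidden-state update per time step.
import numpy as np

input_size, hidden_size, output_size, seq_len = 8, 16, 4, 10
rng = np.random.default_rng(0)

# Randomly initialized parameters (no training in this sketch).
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_ho = rng.normal(scale=0.1, size=(output_size, hidden_size))
b_h = np.zeros(hidden_size)
b_o = np.zeros(output_size)

xs = rng.normal(size=(seq_len, input_size))  # a dummy input sequence
h = np.zeros(hidden_size)                    # initial hidden state

outputs = []
for x_t in xs:
    # The feedback loop: h depends on the current input and the previous h.
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    outputs.append(W_ho @ h + b_o)

print(np.array(outputs).shape)  # (10, 4): one output vector per time step
```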
One of the major issues with training RNNs is the vanishing and exploding gradient problem. During backpropagation through time (BPTT), gradients are propagated backward across many time steps and can either shrink to near-zero values or grow exponentially. The vanishing gradient issue hampers the network's ability to learn long-term dependencies, while the exploding gradient problem makes training unstable due to large weight updates.
The vanishing gradient problem arises from the repeated multiplication of the gradient by
small values (due to the chain rule), especially when using non-linear activation functions
like tanh or sigmoid. Exploding gradients, on the other hand, occur when gradients
become too large, causing erratic updates to the weights during training.
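To see where this exponential behavior comes from, note that under the update \( h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h) \) sketched above, the chain rule gives

\[
\frac{\partial h_t}{\partial h_k} \;=\; \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}}
\;=\; \prod_{i=k+1}^{t} \operatorname{diag}\!\big(\tanh'(a_i)\big)\, W_{hh},
\]

where \( a_i \) is the pre-activation at step \( i \). When the norms of these factors are consistently below one, the product shrinks exponentially in \( t - k \) (vanishing gradients); when they are consistently above one, it grows exponentially (exploding gradients).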
#### Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU)
LSTMs and GRUs are extensions of standard RNNs and are widely used because they
overcome the limitations of vanilla RNNs in learning long-term dependencies.
- **LSTM**: An LSTM introduces additional gates (input gate, forget gate, and output gate)
to control the flow of information through the network. It also maintains a cell state that
can retain information over long periods. The cell state is updated based on how much
information should be "forgotten" from the past and how much new information should
be added. This gating mechanism allows LSTMs to capture long-range dependencies
effectively.
- **GRU**: A GRU simplifies the LSTM by combining the forget and input gates into a
single update gate, and it merges the cell state and hidden state into one. Despite having
fewer gates, GRUs have been shown to perform similarly to LSTMs in many applications.
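Both cells are available as drop-in recurrent layers in common frameworks. The following is a minimal sketch using PyTorch's `nn.LSTM` and `nn.GRU`; the batch and layer sizes are illustrative.

```python
# A minimal sketch (assumes PyTorch is installed); sizes are illustrative.
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 10, 8, 16
x = torch.randn(batch, seq_len, input_size)  # a dummy batch of sequences

# LSTM: maintains both a hidden state h and a cell state c at each time step.
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
out_lstm, (h_n, c_n) = lstm(x)   # out_lstm: (batch, seq_len, hidden_size)

# GRU: a single hidden state; forget and input gates are merged into an update gate.
gru = nn.GRU(input_size, hidden_size, batch_first=True)
out_gru, h_n_gru = gru(x)        # out_gru: (batch, seq_len, hidden_size)

print(out_lstm.shape, out_gru.shape)  # torch.Size([4, 10, 16]) for both
```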
RNNs, LSTMs, and GRUs are used in various applications that require sequential data
processing:
- **Natural Language Processing (NLP)**: RNNs are used in language models, machine
translation, and sentiment analysis. They are capable of processing variable-length text
sequences and capturing contextual relationships between words.
- **Speech Recognition**: RNNs can model sequences of audio frames to recognize
spoken words, as they can capture the temporal dependencies in speech.
- **Time-Series Forecasting**: RNNs are used for stock-price prediction, weather forecasting, and other applications that rely on historical data to make future predictions.
- **Image Captioning**: When combined with Convolutional Neural Networks (CNNs),
RNNs can be used to generate descriptions for images, where the CNN extracts features
and the RNN generates a sequence of words to describe the image.
A Bidirectional RNN (BRNN) consists of two RNNs: one processes the input sequence in the
forward direction (left to right), and the other processes the sequence in the backward
direction (right to left). By doing this, the BRNN can capture dependencies from both past
and future context, improving performance on tasks where context in both directions is
important, such as in NLP.
The output of a BRNN is computed by concatenating the hidden states from the forward
and backward passes at each time step.
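A minimal sketch of a bidirectional recurrent layer, again assuming PyTorch (an LSTM cell is used here purely for illustration); the output feature dimension doubles because the forward and backward hidden states are concatenated at each step.

```python
# A minimal bidirectional RNN sketch in PyTorch (illustrative sizes).
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 10, 8, 16
x = torch.randn(batch, seq_len, input_size)

# bidirectional=True runs one RNN left-to-right and another right-to-left;
# their hidden states are concatenated at each time step.
brnn = nn.LSTM(input_size, hidden_size, batch_first=True, bidirectional=True)
out, _ = brnn(x)
print(out.shape)  # torch.Size([4, 10, 32]) -- 2 * hidden_size from the concatenation
```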
Although RNNs can handle sequential data, they may struggle with very long sequences
due to the difficulty in retaining important information over many time steps. The
attention mechanism addresses this issue by allowing the network to focus on specific
parts of the input sequence when making predictions. This has been particularly effective
in machine translation and other sequence-to-sequence tasks.
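As a rough illustration of the idea (not of any particular paper's formulation), a basic dot-product attention step over a sequence of encoder hidden states could be sketched as follows; all names and shapes are illustrative assumptions.

```python
# A minimal dot-product attention sketch in PyTorch (illustrative shapes).
import torch
import torch.nn.functional as F

batch, src_len, hidden_size = 4, 10, 16
encoder_states = torch.randn(batch, src_len, hidden_size)  # one vector per input step
decoder_state = torch.randn(batch, hidden_size)            # current decoder hidden state

# Score each encoder state against the decoder state, normalize with softmax,
# and take the weighted sum: the "context" the decoder attends to at this step.
scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)  # (batch, src_len)
weights = F.softmax(scores, dim=1)                                         # attention weights
context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)       # (batch, hidden_size)
print(context.shape)  # torch.Size([4, 16])
```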
### Conclusion
RNNs have made significant contributions to sequence modeling tasks, but their
limitations, such as vanishing gradients and difficulty in handling long-term dependencies,
have led to the development of more advanced architectures like LSTMs, GRUs, and
attention mechanisms. These models continue to be essential tools in various domains
where sequence data plays a critical role.