Module 06
Common applications of RNNs include:
• Speech recognition
• Music generation
Below is how you can convert a Feed-Forward Neural Network into a Recurrent Neural
Network:
The nodes in the different layers of the feedforward network are compressed to form a single layer of the recurrent neural network. A, B, and C are the parameters of the network. The input layer X processes the initial input and passes it to the middle layer A. The middle layer consists of multiple hidden layers, each with its own activation functions, weights, and biases. Because these parameters are shared across time steps, the network does not create a separate hidden layer for every step; it creates one layer and loops over it.
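To make this concrete, here is a minimal sketch of a vanilla RNN in NumPy. The sizes and the randomly initialized weights are illustrative assumptions; the point is that the same input-to-hidden, hidden-to-hidden, and hidden-to-output weights are reused at every time step, i.e. one hidden layer looped over time.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 8, 16, 4

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (recurrent)
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden -> output
b_h = np.zeros(hidden_size)
b_y = np.zeros(output_size)

def rnn_forward(xs):
    """xs: list of input vectors, one per time step."""
    h = np.zeros(hidden_size)          # initial hidden state
    outputs = []
    for x in xs:                       # the single recurrent layer, unrolled over time
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        outputs.append(W_hy @ h + b_y)
    return outputs, h

seq = [rng.normal(size=input_size) for _ in range(5)]
ys, h_final = rnn_forward(seq)
print(len(ys), ys[0].shape)            # 5 outputs, each of shape (4,)
```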
Instead of using traditional backpropagation, recurrent neural networks use the backpropagation through time (BPTT) algorithm to determine the gradients. In backpropagation, the model adjusts its parameters by propagating errors from the output layer back to the input layer. BPTT sums the error at each time step, because the RNN shares the same parameters across every time step.
Backpropagation Through Time (BPTT)
BPTT is a version of backpropagation used to train RNNs. The idea is to unroll the network
over time and apply standard backpropagation.
Steps (sketched in the code example below):
1. Unroll the network across all time steps
2. Run the forward pass over the whole sequence and sum the loss at each step
3. Backpropagate the error from the last time step back to the first
4. Update parameters
Challenges:
• Vanishing and exploding gradients over long sequences
Solutions:
• Gradient clipping
• Gated architectures such as LSTM
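The following toy sketch shows these steps with PyTorch, whose autograd performs the unroll-and-backpropagate part automatically when the loss summed over all time steps is differentiated. The layer sizes, dummy data, and the `head` output layer are illustrative assumptions, not part of the original notes.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 4)
params = list(rnn.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)

x = torch.randn(2, 10, 8)              # (batch, time steps, features) - dummy data
targets = torch.randn(2, 10, 4)

# 1. Unroll the network over the 10 time steps (forward pass)
outputs, _ = rnn(x)
preds = head(outputs)

# 2-3. Sum the error over every time step, then backpropagate through time
loss = nn.functional.mse_loss(preds, targets)
loss.backward()

# Mitigate exploding gradients: clip the gradient norm before the update
torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)

# 4. Update the shared parameters
optimizer.step()
optimizer.zero_grad()
```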
Feedforward networks have a single input and a single output, while recurrent neural networks are flexible because the lengths of the inputs and outputs can vary. This flexibility allows RNNs to be used for music generation, sentiment classification, and machine translation.
There are four types of RNN based on the different lengths of inputs and outputs (a code sketch of these configurations follows the list).
• One-to-one has a single input and a single output, like a standard feedforward network.
• One-to-many has a single input and multiple outputs. This is used for generating image captions.
• Many-to-one takes multiple inputs and produces a single output, as in sentiment classification.
• Many-to-many takes multiple inputs and multiple outputs. The most common application is machine translation.
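The sketch below illustrates how the same recurrent layer supports different input/output lengths. The sizes and the `sentiment_head` and `tag_head` layers are hypothetical, chosen only to show the shapes.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(1, 12, 8)                       # one sequence of 12 input steps

# Many-to-one (e.g. sentiment classification): read the whole sequence,
# keep only the final hidden state and map it to a single label.
sentiment_head = nn.Linear(16, 2)
_, h_final = rnn(x)
sentiment_logits = sentiment_head(h_final[-1])  # one output for the whole sequence

# Many-to-many (e.g. tagging or translation with aligned lengths):
# emit an output at every time step.
tag_head = nn.Linear(16, 5)
outputs, _ = rnn(x)
tags = tag_head(outputs)                        # 12 outputs, one per input step

# One-to-many (e.g. image captioning) would start from a single input vector
# and feed each generated step back in as the next input.
print(sentiment_logits.shape, tags.shape)       # torch.Size([1, 2]) torch.Size([1, 12, 5])
```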
Key Differences Between CNN and RNN
• CNNs are suited to spatial data such as images, while RNNs are suited to time series and sequential data.
• During training, CNNs use standard backpropagation, whereas RNNs use backpropagation through time to compute the gradients.
• RNNs place no restriction on the length of inputs and outputs, but CNNs work with fixed-size inputs and outputs.
• CNNs are feedforward networks, while RNNs use recurrent loops to handle sequential data.
• CNNs are mainly used for image and video processing, while RNNs are primarily used for speech and text analysis.
1. Unfolded RNN
An unfolded RNN displays the time steps of the RNN as separate layers, which helps in understanding temporal dependencies and the flow of training.
2. Encoder-Decoder Architecture
An encoder RNN compresses the input sequence into a context vector, and a decoder RNN generates the output sequence from that vector. This architecture is widely used in machine translation.
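A minimal encoder-decoder sketch follows; it is an assumed toy setup, not a full translation model. The encoder summarizes the source sequence into its final hidden state, and the decoder is initialized from that state to produce the target sequence. The sizes, the `out_head` layer, and the vocabulary size of 20 are hypothetical.

```python
import torch
import torch.nn as nn

encoder = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
decoder = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
out_head = nn.Linear(16, 20)            # hypothetical target vocabulary of size 20

src = torch.randn(1, 7, 8)              # source sequence: 7 steps
tgt_in = torch.randn(1, 9, 8)           # decoder inputs: 9 steps (teacher forcing)

_, context = encoder(src)               # context vector = final encoder hidden state
dec_out, _ = decoder(tgt_in, context)   # decoder starts from the encoder's context
logits = out_head(dec_out)              # one prediction per target step
print(logits.shape)                     # torch.Size([1, 9, 20])
```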
The Long Short-Term Memory (LSTM) network is an advanced type of RNN designed to prevent both vanishing (decaying) and exploding gradient problems. Like an RNN, an LSTM is built as a chain of repeating modules, but the structure of each module is different. In a standard RNN, the repeating module has a very simple structure, such as a single tanh layer. In an LSTM, the repeating module instead contains four interacting layers that communicate with each other. This four-layered structure helps the LSTM retain long-term memory, and it is used in several sequential problems including machine translation, speech synthesis, speech recognition, and handwriting recognition.
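The sketch below shows one LSTM cell step under the standard gate equations. The four interacting layers are the forget gate, input gate, candidate (tanh) layer, and output gate; the weight shapes and random data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One time step. W maps the concatenated [h_prev, x] to all four gates."""
    z = W @ np.concatenate([h_prev, x]) + b
    f, i, g, o = np.split(z, 4)
    f = sigmoid(f)              # forget gate: what to erase from the cell state
    i = sigmoid(i)              # input gate: what new information to write
    g = np.tanh(g)              # candidate values to write
    o = sigmoid(o)              # output gate: what part of the cell to expose
    c = f * c_prev + i * g      # cell state (long-term memory)
    h = o * np.tanh(c)          # hidden state (short-term output)
    return h, c

hidden, inputs = 16, 8
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * hidden, hidden + inputs))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x in [rng.normal(size=inputs) for _ in range(5)]:
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)         # (16,) (16,)
```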
3. Bidirectional RNN (BiRNN)
A bidirectional RNN processes the sequence in both the forward and backward directions, so the output at each time step can use both past and future context.
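A brief sketch, assuming illustrative sizes: setting `bidirectional=True` in PyTorch runs one RNN forward and one backward over the sequence and concatenates their hidden states at each step.

```python
import torch
import torch.nn as nn

birnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True, bidirectional=True)
x = torch.randn(1, 10, 8)       # one sequence of 10 steps
outputs, h_n = birnn(x)
print(outputs.shape)            # torch.Size([1, 10, 32]) - forward + backward states per step
print(h_n.shape)                # torch.Size([2, 1, 16]) - final state of each direction
```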
Problem: Exploding gradients during training.
Cause: Gradients are multiplied through many time steps, so values greater than 1 grow exponentially as the error is propagated back.
Solution:
• Gradient clipping (sketched below)
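A minimal sketch of norm-based gradient clipping, assuming the common rule: if the total gradient norm exceeds a threshold, rescale every gradient so the norm equals the threshold. The helper name and the example values are hypothetical.

```python
import numpy as np

def clip_gradients(grads, max_norm=1.0):
    """Rescale a list of gradient arrays so their total L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# Example: an "exploded" gradient of norm 50 is scaled back to norm 1.0
grads = [np.array([30.0, 40.0])]
clipped = clip_gradients(grads, max_norm=1.0)
print(np.linalg.norm(clipped[0]))   # 1.0
```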
Domain Applications