Recurrent Neural Networks
• Recurrent Neural Networks (RNNs) are a type of artificial neural network designed to process sequences of data. They work especially well for sequential tasks such as time-series forecasting, speech, natural language, and other sequence-based applications.
• An RNN works on the principle of saving the output of a particular layer and feeding it back to the input in order to predict that layer's output at the next time step.
• Below is how you can convert a Feed-Forward Neural Network into a Recurrent Neural Network:
Recurrent Neural Networks (Cont…)
The nodes in the different layers of the neural network are compressed to form a single layer of the recurrent neural network. A, B, and C are the parameters of the network.
Recurrent Neural Networks (Cont…)
Here, “x” is the input layer, “h” is the hidden layer, and “y” is the output layer. A, B, and C are the network parameters used to improve the output of the model. At any given time t, the hidden state is computed from the current input x(t) and the previous hidden state h(t-1), so the network's state at each step is fed back in to improve the next output, as sketched below.
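A minimal NumPy sketch of this forward pass, assuming the common formulation h(t) = tanh(A·x(t) + B·h(t-1)) and y(t) = C·h(t); the function name and the sizes used here are illustrative, not part of the slides:

```python
import numpy as np

def rnn_forward(xs, A, B, C):
    """Minimal vanilla RNN forward pass.
    xs : list of input vectors x(1..T)
    A  : input-to-hidden weights, B : hidden-to-hidden weights, C : hidden-to-output weights
    """
    h = np.zeros(B.shape[0])              # initial hidden state h(0)
    outputs = []
    for x in xs:
        h = np.tanh(A @ x + B @ h)        # h(t) depends on x(t) and h(t-1)
        outputs.append(C @ h)             # y(t) is read off the current hidden state
    return outputs, h

# Illustrative sizes: 4-dimensional inputs, 8 hidden units, 3 output units.
rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((8, 4)), rng.standard_normal((8, 8)), rng.standard_normal((3, 8))
ys, h_final = rnn_forward([rng.standard_normal(4) for _ in range(5)], A, B, C)
```

The key design point is that the same weights A, B, and C are reused at every time step; only the hidden state h changes as the sequence is consumed.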
Types of Recurrent Neural Networks (Cont…)
One to Many RNN
This type of neural network has a single input and multiple outputs. An example of this is image captioning, where one image produces a sequence of words.
Types of Recurrent Neural Networks (Cont…)
Many to Many RNN
This RNN takes a sequence of inputs and generates a sequence of outputs. Machine translation is one example.
Two Issues of Standard RNNs
1. Vanishing Gradient Problem
2. Exploding Gradient Problem
Vanishing Gradient Problem
While training a neural network, if the gradient shrinks exponentially as it is propagated back through the time steps, this is called a Vanishing Gradient. The weight updates for earlier time steps become negligibly small, so the network learns almost nothing from distant context.
Impact:
The RNN forgets long-term context and only focuses on recent inputs.
Two Issues of Standard RNNs (Cont…)
1. Vanishing Gradient Problem
2. Exploding Gradient Problem
Exploding Gradient Problem
While training a neural network, if the gradient tends to grow exponentially instead of decaying, this is called an Exploding Gradient. This problem arises when large error gradients accumulate, resulting in very large updates to the model weights during training.
Long training times, poor performance, and low accuracy are the major consequences of gradient problems.
• Sometimes, the gradients become extremely large during training.
• This makes the model’s weight updates go out of control, causing erroneous outputs.
• The model repeatedly multiplies large numbers, making the gradients grow bigger and bigger.
Impact:
Training becomes unstable, and the model fails to learn properly.
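A quick numerical sketch (illustrative, not from the slides) of why both problems arise: during backpropagation through time, the gradient is repeatedly multiplied by the recurrent weight matrix, so its norm either shrinks toward zero (vanishing) or blows up (exploding) depending on how large that matrix is.

```python
import numpy as np

def gradient_norm_over_time(scale, steps=50, size=8, seed=0):
    """Repeatedly multiply a gradient vector by a recurrent weight matrix
    (a rough stand-in for backpropagation through time) and track its norm."""
    rng = np.random.default_rng(seed)
    W = scale * rng.standard_normal((size, size)) / np.sqrt(size)
    grad = rng.standard_normal(size)
    norms = []
    for _ in range(steps):
        grad = W.T @ grad              # one step of backprop through the recurrence
        norms.append(np.linalg.norm(grad))
    return norms

vanishing = gradient_norm_over_time(scale=0.5)   # norms shrink toward 0
exploding = gradient_norm_over_time(scale=1.5)   # norms grow without bound
print(f"after 50 steps: vanishing ~ {vanishing[-1]:.2e}, exploding ~ {exploding[-1]:.2e}")
```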
Why LSTMs?
Standard RNNs struggle to learn long-term dependencies because their gradients can either vanish (become too small)
or explode (become too large) during backpropagation. This makes them ineffective for tasks where context over long
sequences is important. LSTMs overcome this limitation through their unique architecture that allows them to
remember information for longer periods.
Long Short-Term Memory (LSTM)
Structure of LSTM
Cell State (Ct):
The cell state acts as the memory of the LSTM. It carries information across time steps and can be modified by
different gates. This is what allows LSTMs to maintain long-term dependencies.
Hidden State (ht):
The hidden state is used for the output at each time step and is influenced by the cell state.
Gates:
Gates are neural network layers that control the flow of information through the cell state.
They use the sigmoid activation, σ(x) = 1 / (1 + e^(-x)), or the tanh activation, tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)).
The gates include:
Forget Gate: Decides what information from the cell state should be discarded.
Input Gate: Decides what new information should be added to the cell state.
Output Gate: Decides what part of the cell state should be output as the hidden state.
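The gate equations above can be made concrete with a minimal NumPy sketch of a single LSTM step, assuming the standard formulation; the weight names and sizes here are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, Wf, Wi, Wo, Wc, bf, bi, bo, bc):
    """One LSTM time step. Each W* maps the concatenated [h_prev, x] to the hidden size."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(Wf @ z + bf)          # forget gate: what to discard from the cell state
    i = sigmoid(Wi @ z + bi)          # input gate: what new information to add
    c_tilde = np.tanh(Wc @ z + bc)    # candidate cell state
    c = f * c_prev + i * c_tilde      # updated cell state Ct (the long-term memory)
    o = sigmoid(Wo @ z + bo)          # output gate: what part of the cell state to expose
    h = o * np.tanh(c)                # hidden state ht used for the output
    return h, c

# Illustrative sizes: 4-dimensional input, 8 hidden units.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((8, 12)) for _ in range(4)]
bs = [np.zeros(8) for _ in range(4)]
h, c = lstm_step(rng.standard_normal(4), np.zeros(8), np.zeros(8), *Ws, *bs)
```

Note how the cell state c is updated only through elementwise gating, which is what lets information flow across many time steps without vanishing.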
Gated Recurrent Unit (GRU) Networks
• GRU is another type of RNN that is designed to address the vanishing gradient problem.
• It has two gates: the reset gate and the update gate.
• The reset gate determines how much of the previous state should be forgotten, while the update gate determines
how much of the new state should be remembered.
• This allows the GRU network to selectively update its internal state based on the input sequence.
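The reset and update gates described above can be sketched the same way for a single GRU step, again assuming the standard formulation with illustrative weight names:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU time step. Each W* maps the concatenated [h_prev, x] to the hidden size."""
    z_in = np.concatenate([h_prev, x])
    z = sigmoid(Wz @ z_in + bz)        # update gate: how much of the new state to keep
    r = sigmoid(Wr @ z_in + br)        # reset gate: how much of the past state to forget
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]) + bh)  # candidate state
    h = (1 - z) * h_prev + z * h_tilde # blend previous state and candidate state
    return h
```

Unlike the LSTM, there is no separate cell state: the hidden state itself carries the memory, which is why the GRU needs one gate fewer.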
Compare GRU vs LSTM
Here is a comparison of Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) networks:

Structure
• GRU: Simpler structure with two gates (update and reset gate).
• LSTM: More complex structure with three gates (input, forget, and output gate).

Parameters
• GRU: Fewer parameters (3 weight matrices: update gate, reset gate, and candidate hidden state).
• LSTM: More parameters (4 weight matrices: input gate, forget gate, output gate, and candidate cell state).

Space Complexity
• GRU: In most cases, GRU tends to use fewer memory resources due to its simpler structure and fewer parameters, so it is better suited for large datasets or sequences.
• LSTM: LSTM has a more complex structure and a larger number of parameters, so it might require more memory resources and could be less effective for large datasets or sequences.

Performance
• GRU: Generally performs similarly to LSTM on many tasks, but in some cases GRU has been shown to outperform LSTM and vice versa. It is better to try both and see which works better for your dataset and task.
• LSTM: Generally performs well on many tasks but is more computationally expensive and requires more memory resources. LSTM has advantages over GRU in natural language understanding and machine translation tasks.
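As a rough illustration of the parameter difference, here is a small Python sketch using the standard counts (each gate or candidate has one weight matrix over the concatenated [hidden, input] vector plus a bias; exact counts vary by implementation):

```python
def rnn_gate_params(input_size, hidden_size, num_gates):
    # Each gate/candidate has a weight matrix over [h, x] plus a bias vector.
    return num_gates * (hidden_size * (hidden_size + input_size) + hidden_size)

input_size, hidden_size = 128, 256
print("GRU  parameters:", rnn_gate_params(input_size, hidden_size, num_gates=3))  # update, reset, candidate
print("LSTM parameters:", rnn_gate_params(input_size, hidden_size, num_gates=4))  # input, forget, output, candidate
```

With these illustrative sizes, the LSTM has roughly a third more parameters than the GRU for the same hidden width.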
Thank You