LSTM PPT
●
Short-term dependency: "The clouds are in the ?"
●
Long-term dependency: "I grew up in Kerala. ... I speak fluent ?"
LSTM - Overview
●
Special type of RNN capable of
learning long-term dependencies.
●
Extension of the basic Vanilla
RNN to overcome the
exploding/vanishing gradient
problem.
●
Uses two paths to predict values – a
long-term and a short-term memory path.
●
However, the computation is
complex when compared to
Vanilla RNN.
The repeating modules – RNN vs LSTM
●
Repeating module in RNN (figure)
●
Repeating module in LSTM (figure)
RNN – Components
●
Input layer
●
Hidden Layer
●
Activation Function
●
Output layer
●
Recurrent Connection
RNN - Architecture
●
Diagram: Input Layer → Hidden Layer (with recurrent connection) → Output Layer
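To make these components concrete, here is a minimal sketch of one vanilla RNN step in NumPy. All names, shapes, and weight values are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # One vanilla RNN step: the new hidden state mixes the current
    # input with the previous hidden state (the recurrent connection).
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(3, 4)) * 0.1  # input-to-hidden weights (4-dim input)
W_hh = rng.normal(size=(3, 3)) * 0.1  # hidden-to-hidden (recurrent) weights
b_h = np.zeros(3)

h = np.zeros(3)                       # initial hidden state
for x in rng.normal(size=(5, 4)):     # unroll over a 5-step toy sequence
    h = rnn_step(x, h, W_xh, W_hh, b_h)
print(h)
```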
LSTM - Overview
●
Uses the sigmoid activation function (outputs between 0 and 1).
●
Uses the tanh (hyperbolic tangent) activation function (outputs between -1 and 1).
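A quick numerical check of the two squashing functions (toy values only):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(sigmoid(z))  # into (0, 1):  [~0.00005, 0.269, 0.5, 0.731, ~0.99995]
print(np.tanh(z))  # into (-1, 1): [~-1.0, -0.762, 0.0, 0.762, ~1.0]
```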
Controlling the Gradients
●
When the gradients used to update the weights during
backpropagation shrink toward zero, it becomes hard
for the network to learn because the weights hardly change,
which slows down or stops training altogether.
●
LSTM leverages gating mechanisms to control the flow of
information and gradients.
●
This helps prevent the vanishing gradient problem and allows
the network to learn and retain information over longer
sequences.
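A toy calculation (not the full backpropagation derivation) of why a multiplicative factor near 1 matters over long sequences:

```python
# Backpropagation through T steps multiplies the gradient by a
# per-step factor. Well below 1, the product vanishes; a gate that
# learns to stay near 1 keeps a usable gradient signal alive.
T = 100
print(0.5 ** T)   # ~7.9e-31: gradient effectively gone
print(0.99 ** T)  # ~0.37: still usable after 100 steps
```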
LSTM Cell
●
Cell State – No weights or biases.
●
Information is added or removed
from the cell state using gates.
●
Gates – Composed of a
sigmoid function followed by
a pointwise multiplication.
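The gate pattern itself, sketched with toy numbers (a sigmoid output multiplied pointwise into a signal):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

signal = np.array([2.0, -3.0, 0.5])
gate = sigmoid(np.array([10.0, 0.0, -10.0]))  # ~[1.0, 0.5, 0.0]
print(gate * signal)                          # ~[2.0, -1.5, 0.0]: pass, halve, block
```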
●
In LSTMs, the cell state acts as a long-term memory, carrying information through the
sequence, while the hidden state is the output of the LSTM cell at a given time step and is
passed to the next cell.
●
Cell State (C_t):
– Represents the long-term memory of the model.
– Allows information to flow unchanged across the cell, providing a direct path for gradients during
backpropagation.
– Stores information about past inputs, enabling the model to learn long-term dependencies.
●
Hidden State (h_t):
– The output of the LSTM cell at a given time step.
– Contributes to the final output and is passed to the next cell in the sequence.
– Is a representation of the previous inputs, retaining information from one time step to another.
– Can be thought of as the "working memory" that carries information from immediately previous events.
LSTM Cell
●
The sigmoid layer outputs numbers
between zero and one, describing how
much of each component should be let
through.
●
A value of zero means “let nothing
through,” while a value of one means “let
everything through!”
LSTM Gates - The Forget Gate
●
Decides what information we are going to
throw away from the cell state.
●
Looks at h_{t-1} and x_t and outputs a number
between 0 and 1 for each entry C_{t-1} of
the cell state.
●
A value of zero means “completely forget this,”
while a value of one means “completely
keep this.”
Forget Gate – Computation
●
f_t = σ(W_f · [h_{t-1}, x_t] + b_f), where W_f is the weight
matrix applied to h_{t-1} and x_t, and b_f is the bias.
●
Suppose we are trying to predict the next word based on all the
previous ones. In such a problem, the cell state might include
the gender of the present subject, so that the correct pronouns
can be used. When we see a new subject, we want to forget
the gender of the old subject.
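A minimal sketch of the forget gate in NumPy, assuming the concatenated [h_{t-1}, x_t] formulation above (dimensions and weights are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    # f_t = sigmoid(W_f . [h_{t-1}, x_t] + b_f): one value in (0, 1)
    # per cell-state entry (0 = forget completely, 1 = keep completely).
    return sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)

rng = np.random.default_rng(0)
hidden, inputs = 3, 4
W_f = rng.normal(size=(hidden, hidden + inputs)) * 0.1
b_f = np.zeros(hidden)

f_t = forget_gate(np.zeros(hidden), rng.normal(size=inputs), W_f, b_f)
C_prev = np.array([2.0, -1.0, 0.5])
print(f_t * C_prev)  # cell state entries scaled by how much we keep
```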
Determining the information to store in the cell state
●
Has two parts.
●
First, a sigmoid layer called the “input gate layer” decides
which values we’ll update.
●
Next, a tanh layer creates a vector of new candidate
values, C̃_t, that could be added to the state. In the next
step, we’ll combine these two to create an update to the
state (sketched below).
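The two parts as code, continuing the same sketch (W_i, b_i, W_c, b_c are illustrative names for the input-gate and candidate-layer parameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gate_and_candidate(h_prev, x_t, W_i, b_i, W_c, b_c):
    hx = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W_i @ hx + b_i)      # input gate: which entries to update
    c_tilde = np.tanh(W_c @ hx + b_c)  # candidate values in (-1, 1)
    return i_t, c_tilde
```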
Determining the information to store in the cell state –
The input gate
●
This could be adding the gender of the new subject to the
cell state, to replace the old one we are forgetting.
Updating the cell state
●
Update using the forget gate and the input gate:
C_t = f_t * C_{t-1} + i_t * C̃_t (element-wise).
●
For the language model, this is where the old subject’s
gender is actually dropped and the new information added.
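In code, the update is one line (f_t, i_t, c_tilde as in the sketches above):

```python
def update_cell_state(C_prev, f_t, i_t, c_tilde):
    # Forget part of the old state, then add the candidate values,
    # scaled by the input gate (all products element-wise).
    return f_t * C_prev + i_t * c_tilde
```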
Deciding what to output – The output gate
●
Determines the next hidden state (short-term memory).
●
The output h_t (next hidden state) is based on the updated cell state (long-term
memory) and the previous hidden state (short-term memory).
●
The tanh layer squashes the cell state to between -1 and 1, and the sigmoid
gate selects which portions of it reach the hidden state.
Deciding what to output – The output gate
●
Sigmoid function determines how much of the information
represented by the cell state should be outputted.
●
For the language model example, since it just saw a subject, it might
want to output information relevant to a verb, in case that’s what is
coming next.
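The output gate as code, completing the per-gate sketches (W_o, b_o are illustrative parameter names):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_gate(h_prev, x_t, C_t, W_o, b_o):
    o_t = sigmoid(W_o @ np.concatenate([h_prev, x_t]) + b_o)
    return o_t * np.tanh(C_t)  # h_t: a gated, squashed view of the cell state
```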
LSTM - Details
●
Cell state that represents Long-Term memory.
●
Can be modified by a multiplication and an addition.
●
Does not contain any weights or biases.
●
Information can flow across unrolled units without causing the
gradient to vanish or explode.
LSTM Details
●
Hidden state that represents short-term memories.
●
They are connected to weights and hence can be modified
through these weights.
●
Long and short-term memories interact to make predictions.
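Putting the four pieces together, here is a minimal full LSTM step in NumPy; all dimensions, parameter names, and random weights are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    # One full LSTM step; W and b hold one matrix/vector per gate.
    hx = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ hx + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ hx + b["i"])      # input gate
    c_tilde = np.tanh(W["c"] @ hx + b["c"])  # candidate values
    o_t = sigmoid(W["o"] @ hx + b["o"])      # output gate
    C_t = f_t * C_prev + i_t * c_tilde       # long-term memory update
    h_t = o_t * np.tanh(C_t)                 # short-term memory / output
    return h_t, C_t

rng = np.random.default_rng(0)
hidden, inputs = 3, 4
W = {k: rng.normal(size=(hidden, hidden + inputs)) * 0.1 for k in "fico"}
b = {k: np.zeros(hidden) for k in "fico"}

h, C = np.zeros(hidden), np.zeros(hidden)
for x in rng.normal(size=(5, inputs)):  # unroll over a 5-step toy sequence
    h, C = lstm_step(x, h, C, W, b)
print(h, C)
```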
Interaction between Long and Short Term Memory
●
Assume that the long-term
memory is 2 and the short-term
memory is 1.
●
The input reduces the effect
of the Long-term memory by
a small factor.
Interaction between Long and Short Term Memory
●
Assume that the long-term
memory is 2 and the short-term
memory is -10.
●
The input reduces the effect of the
Long-term memory to zero.
●
Since the sigmoid function outputs
a number between 0 and 1, its output
determines what percentage of the
long-term memory is
remembered.
●
This is the first stage in LSTM.
●
This part is called the forget gate.
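The arithmetic behind both examples, assuming for simplicity that the short-term memory feeds the sigmoid directly (weights and biases omitted):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(2 * sigmoid(1.0))    # ~1.46: long-term memory reduced by a small factor
print(2 * sigmoid(-10.0))  # ~0.00009: long-term memory effectively erased
```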