Long Short-Term Memory (LSTM)
Sir Asif Ahsan
LSTMs
• Central idea: a memory cell (also called a memory block) that can maintain its
state over time, consisting of an explicit memory (the cell state vector)
and gating units that regulate the flow of information into and out of the
memory.
Cell State (conveyor belt)
• Represents the memory of the LSTM
• Undergoes changes via forgetting of old memory (forget
gate) and addition of new memory (input gate)
GATES
• A gate is a sigmoid neural-net layer followed by a pointwise
multiplication operator.
• Gates control the flow of information to/from the
memory.
• Gates are controlled by a concatenation of the output
from the previous time step and the current input and
optionally the cell state vector.
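A minimal NumPy sketch of this gating idea; the sizes (4 hidden units, 3 input features) and weights are illustrative assumptions, not values from the slides. The gate is a sigmoid layer over the concatenation [h_{t−1}, x_t], and its output multiplies a value pointwise:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden, inputs = 4, 3                                    # illustrative sizes
rng = np.random.default_rng(0)
W_gate = rng.standard_normal((hidden, hidden + inputs))  # gate weights
b_gate = np.zeros(hidden)                                # gate bias

h_prev = rng.standard_normal(hidden)   # output from the previous time step
x_t = rng.standard_normal(inputs)      # current input

# Sigmoid layer over the concatenation [h_{t-1}, x_t] ...
gate = sigmoid(W_gate @ np.concatenate([h_prev, x_t]) + b_gate)

# ... followed by pointwise multiplication with whatever the gate regulates.
value = rng.standard_normal(hidden)
regulated = gate * value               # each component scaled by a factor in (0, 1)
```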
Forget gate
• Controls what information to throw away from memory
Input gate
• The input gate decides how much of the information in the current input
should be written into the cell state.
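In the standard formulation (as in the Colah post listed under Reading), the input gate and the candidate memory it scales are:

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C)

so i_t ⊙ C̃_t (pointwise product) is the new information actually written into the cell state.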
Forget gate
• The forget gate uses a sigmoid activation function to produce
an output between 0 and 1 for each piece of information in the
cell state:
• A value of 0 indicates "completely forget this information."
• A value of 1 indicates "completely retain this information."
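In the same notation, the forget gate is:

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)

with each component of f_t between 0 and 1, applied pointwise to the previous cell state C_{t−1}.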
Memory Update
• The memory update in an LSTM (Long Short-Term Memory) network
involves modifying the cell state based on a combination of new information
and retained past information. This process is a key aspect of how LSTMs
manage long-term dependencies in sequential data.
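Concretely, the update combines the two gates (standard formulation):

C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t

The forget gate scales how much of the old memory is kept, and the input gate scales how much of the new candidate information is added.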
LSTM complete
“Asif eats biryani almost every
day, it shouldn’t be hard to
guess that his favorite cuisine is
Pakistani."
"Asif eats biryani almost every
day, it shouldn’t be hard to guess
that his favorite cuisine is
Pakistani. His brother Ahmed,
however, is a lover of pasta and
cheese, which means Ahmed’s
favorite cuisine is Italian."

Cell state: biryani
"Asif eats biryani almost every
day, it shouldn’t be hard to guess
that his favorite cuisine is
Pakistani. His brother Ahmed,
however, is a lover of pasta and
cheese, which means Ahmed’s
favorite cuisine is Italian."

Cell state: (the forget gate drops "biryani" as the subject switches to Ahmed)
"Asif eats biryani almost every
day, it shouldn’t be hard to guess
that his favorite cuisine is
Pakistani. His brother Ahmed,
however, is a lover of pasta and
cheese, which means Ahmed’s
favorite cuisine is Italian."

Cell state: pasta

[Slide diagram: the old memory of "biryani" is forgotten (forget gate), a new memory of "pasta" is added (input gate), and the output points to Italian cuisine.]
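Putting the pieces together, here is a minimal NumPy sketch of one complete LSTM step (forget gate, input gate and candidate, memory update, output gate). The weight layout and sizes are illustrative assumptions, not values from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W maps [h_{t-1}, x_t] to the four gate pre-activations."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    f, i, o, g = np.split(z, 4)
    f = sigmoid(f)               # forget gate: what to throw away from memory
    i = sigmoid(i)               # input gate: how much new information to write
    o = sigmoid(o)               # output gate: what part of the memory to expose
    g = np.tanh(g)               # candidate values for the new memory
    c_t = f * c_prev + i * g     # memory update: forget old info, add new info
    h_t = o * np.tanh(c_t)       # hidden state / output at this time step
    return h_t, c_t

# Illustrative usage: 3 input features, 4 hidden units, a toy sequence of 5 steps.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_hid + n_in)) * 0.1
b = np.zeros(4 * n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.standard_normal((5, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
```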
LSTMs
• It’s still possible for LSTMs to suffer from
vanishing/exploding gradients, but it’s far less likely
than with vanilla RNNs:
• If an RNN wishes to preserve information over long contexts, it must
delicately find a recurrent weight matrix 𝑊 that isn’t too large
or small.
• LSTMs, however, have three separate mechanisms that adjust the
flow of information (e.g., if the forget gate is saturated at 1, so that
forgetting is switched off, all of the old memory is preserved).
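A one-line sketch of why the cell-state path helps: from C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t, the derivative of C_t with respect to C_{t−1} along the memory path is just f_t (elementwise). The backward signal is therefore scaled by the forget gate at each step rather than repeatedly multiplied by the same recurrent matrix 𝑊, and when the forget gate sits near 1 the gradient passes through almost unchanged.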
LSTM
• Advantages:
• Handles long-term dependencies
• Mitigates vanishing gradients
• Excels in sequential tasks like NLP and time-series.
• Issues:
• Computationally expensive.
• Prone to overfitting.
• Outperformed by newer models like Transformers in some
cases.
Reading:
• https://colah.github.io/posts/2015-08-Understanding-LSTMs/

• In addition to the original authors, a lot of people contributed to the
modern LSTM. A non-comprehensive list is: Felix Gers, Fred Cummins,
Santiago Fernandez, Justin Bayer, Daan Wierstra, Julian Togelius,
Faustino Gomez, Matteo Gagliolo, and Alex Graves.
