What Is LSTM - Long Short Term Memory - GeeksforGeeks
Unlike traditional RNNs, which pass a single hidden state through
time, LSTMs introduce a memory cell that holds information over
extended periods, addressing the challenge of learning long-term
dependencies.
https://www.geeksforgeeks.org/deep-learning-introduction-to-long-short-term-memory/ 1/14
3/8/25, 10:21 AM What is LSTM - Long Short Term Memory? - GeeksforGeeks
LSTM Architecture
The LSTM architecture centers on a memory cell that is controlled by
three gates: the input gate, the forget gate and the output gate. These
gates decide what information to add to, remove from and output from
the memory cell.
Working of LSTM
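The LSTM architecture has a chain structure built from repeating memory
blocks called cells. Each cell contains four neural network layers: the
three gates and the layer that proposes candidate values for the cell state.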
Forget Gate
The forget gate removes information that is no longer useful from the
cell state. Two inputs, x_t (the input at the current time step) and
h_{t-1} (the previous hidden state), are fed to the gate and multiplied
by weight matrices, and a bias is added. The result is passed through a
sigmoid activation function, which gives an output between 0 and 1 for
each element of the cell state. An output close to 0 means that piece of
information is forgotten; an output close to 1 means it is retained for
future use.
f_t = σ(W_f ⋅ [h_{t-1}, x_t] + b_f)
where:
W_f represents the weight matrix associated with the forget gate.
[h_{t-1}, x_t] denotes the concatenation of the previous hidden state
and the current input.
b_f is the bias associated with the forget gate.
σ is the sigmoid activation function.
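
The forget-gate equation above can be traced by hand on a tiny example. The following is a minimal sketch in plain Python, assuming two hidden units and one input feature; the weights, bias, and state values are made up for illustration and do not come from a trained model.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(W_f, b_f, h_prev, x_t):
    """Compute f_t = sigmoid(W_f . [h_prev, x_t] + b_f), one value per cell unit."""
    concat = h_prev + x_t  # list concatenation stands in for [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(W_f, b_f)
    ]

# Hypothetical toy values (assumptions, not taken from the article).
W_f = [[0.5, -0.3, 0.8],
       [-4.0, 0.0, -4.0]]
b_f = [1.0, -2.0]
h_prev = [0.2, -0.1]
x_t = [1.0]
C_prev = [2.0, -3.0]

f_t = forget_gate(W_f, b_f, h_prev, x_t)

# Element-wise scaling of the old cell state: gate values near 1 keep it,
# gate values near 0 erase it.
kept = [f * c for f, c in zip(f_t, C_prev)]
print(f_t, kept)
```

With these values the first unit's gate is close to 1, so its cell state is mostly kept, while the second unit's gate is close to 0, so its cell state is almost entirely erased.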
Input gate
The input gate adds useful information to the cell state. First, a
sigmoid function applied to the inputs h_{t-1} and x_t regulates the
information, filtering the values to be remembered in the same way as
the forget gate. Then, the tanh function is used to create a vector of
candidate values, each between -1 and +1, from h_{t-1} and x_t.
Finally, the candidate vector and the regulated values are multiplied
element-wise to obtain the useful information. The equation for
the input gate is:
i_t = σ(W_i ⋅ [h_{t-1}, x_t] + b_i)
where W_i and b_i are the weight matrix and bias associated with the
input gate.
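
The three steps described for the input gate can be sketched as follows. This is an illustrative example only: the names W_c, b_c and c_tilde for the tanh candidate layer are assumptions (the text does not name them), the numbers are invented, and the final cell-state update C_t = f_t * C_{t-1} + i_t * c_tilde is the standard LSTM formulation rather than an equation shown above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W . v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def input_gate_update(W_i, b_i, W_c, b_c, h_prev, x_t):
    """Return (i_t, c_tilde, i_t * c_tilde) for one time step."""
    concat = h_prev + x_t                                        # [h_{t-1}, x_t]
    i_t = [sigmoid(z) for z in affine(W_i, b_i, concat)]         # values in (0, 1)
    c_tilde = [math.tanh(z) for z in affine(W_c, b_c, concat)]   # values in (-1, 1)
    write = [i * c for i, c in zip(i_t, c_tilde)]                # element-wise product
    return i_t, c_tilde, write

# Hypothetical toy values: 2 hidden units, 1 input feature.
W_i = [[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]]
b_i = [0.0, 0.0]
W_c = [[1.0, 0.0, 1.0], [0.0, 0.0, -2.0]]
b_c = [0.0, 0.0]
h_prev = [0.5, -0.5]
x_t = [1.0]

i_t, c_tilde, write = input_gate_update(W_i, b_i, W_c, b_c, h_prev, x_t)

# Standard cell-state update, using an assumed forget-gate output f_t.
f_t = [1.0, 0.0]
C_prev = [2.0, -3.0]
C_t = [f * c + w for f, c, w in zip(f_t, C_prev, write)]
print(i_t, c_tilde, C_t)
```

Here the first unit keeps its old cell state and adds the new contribution, while the second unit forgets its old state and holds only what the input gate wrote.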
Output gate
The output gate extracts useful information from the current cell state
to be presented as output. First, a vector is generated by applying the
tanh function to the cell state. Then, a sigmoid function applied to the
inputs h_{t-1} and x_t regulates the information, filtering the values
to be output. Finally, the values of the vector and the regulated values
are multiplied element-wise to produce the output, which is also passed
to the next time step as the hidden state. The equation for the output
gate is:
o_t = σ(W_o ⋅ [h_{t-1}, x_t] + b_o)
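
As a rough illustration of the output gate, the sketch below computes o_t and then the hidden state using the standard relation h_t = o_t * tanh(C_t). That relation is described in the prose but not written as an equation in this article, and all numeric values are assumptions chosen for readability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def output_gate(W_o, b_o, h_prev, x_t, C_t):
    """Return (o_t, h_t) where h_t = o_t * tanh(C_t), element-wise."""
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    o_t = [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(W_o, b_o)
    ]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]
    return o_t, h_t

# Hypothetical toy values: 2 hidden units, 1 input feature.
W_o = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
b_o = [0.0, 0.0]
h_prev = [0.5, -0.5]
x_t = [1.0]
C_t = [2.5, -0.5]

o_t, h_t = output_gate(W_o, b_o, h_prev, x_t, C_t)
print(o_t, h_t)
```

Because tanh bounds the cell state to (-1, 1) and the gate lies in (0, 1), every component of h_t stays strictly between -1 and 1, even when the cell state itself grows larger.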
Bidirectional LSTMs (BiLSTMs) are made up of two LSTM networks: one
that processes the input sequence in the forward direction and one that
processes the input sequence in the backward direction.
The outputs of the two LSTM networks are then combined to
produce the final output.
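
The two-direction idea can be sketched without a full LSTM. In the example below, toy_step is a deliberately simple stand-in for an LSTM cell (a running sum, not a real LSTM), and pairing the forward and backward outputs at each time step is only one possible way to combine them; the function names are hypothetical.

```python
def run_sequence(step, xs, state):
    """Apply a recurrent step function over xs and collect the per-step outputs."""
    outputs = []
    for x in xs:
        state, out = step(state, x)
        outputs.append(out)
    return outputs

def bidirectional(step_fwd, step_bwd, xs, init_state):
    """Run one pass left-to-right and one right-to-left, then pair the outputs."""
    fwd = run_sequence(step_fwd, xs, init_state)
    # Process the reversed sequence, then flip the outputs back so that
    # index t in both lists refers to the same input position.
    bwd = run_sequence(step_bwd, xs[::-1], init_state)[::-1]
    return list(zip(fwd, bwd))

def toy_step(state, x):
    """Stand-in for an LSTM cell: a running sum. NOT a real LSTM."""
    new_state = state + x
    return new_state, new_state

combined = bidirectional(toy_step, toy_step, [1, 2, 3], 0)
print(combined)  # [(1, 6), (3, 5), (6, 3)]
```

At each position the first value summarizes everything up to and including that input, and the second summarizes everything from that input to the end, which is the extra context a backward pass provides.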
Applications of LSTM
Some well-known applications of LSTMs include:
LSTM vs RNN
Feature                             LSTM    RNN
Long-term dependency learning       Yes     Limited
Ability to learn sequential data    Yes     Yes
LSTMs use a cell state to store information about past inputs. This
cell state is updated at each step of the network, and the network
uses it to make predictions about the current input. The cell state
is updated using a series of gates that control how much
information is allowed to flow into and out of the cell.
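
To make the step-by-step update of the cell state concrete, here is a minimal sketch of a single-unit LSTM run over a short sequence. It follows the standard LSTM equations; the parameter layout (a dict of (w_h, w_x, bias) triples) and all numeric values are assumptions picked so that the forget gate stays close to 1.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a single-unit LSTM; p maps a gate name to (w_h, w_x, bias)."""
    def pre(name):
        w_h, w_x, b = p[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))          # forget gate
    i_t = sigmoid(pre("i"))          # input gate
    o_t = sigmoid(pre("o"))          # output gate
    c_tilde = math.tanh(pre("c"))    # candidate value
    c_t = f_t * c_prev + i_t * c_tilde
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t

# Hypothetical parameters: the forget gate is biased towards "keep".
params = {
    "f": (0.0, 0.0, 4.0),
    "i": (0.0, 2.0, -1.0),
    "o": (0.0, 0.0, 0.0),
    "c": (0.0, 1.0, 0.0),
}

h, c = 0.0, 0.0
history = []
for x in [1.0, 0.0, 0.0, 0.0]:   # one informative input followed by zeros
    h, c = lstm_step(x, h, c, params)
    history.append(c)
print(history)
```

The cell state written at the first step decays only slightly over the following steps, because the forget gate stays near 1 and the zero inputs produce a zero candidate value.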
No, LSTMs and CNNs serve different purposes. LSTMs are for
sequential data; CNNs are for spatial data.