
1

Presentation on

LSTM: Long Short-Term Memory


Course Code: BME-5105

Presented by:
Bushra Hossain
MSc, 1st Year, 1st Semester
Dept. of Biomedical Engineering
Islamic University, Bangladesh.

Assigned by:
Md. Khairul Islam
Chairman
Dept. of Biomedical Engineering
Islamic University, Bangladesh.
2
Contents

❑ Introduction
❑ Sequence Modeling
❑ Features of LSTM
❑ Vanishing Gradient
❑ Long Short-Term Memory
❑ Difference between RNN and LSTM
❑ Working principle of LSTM
❑ Uses of LSTM
❑ Advantages & Disadvantages of LSTM
3
Introduction

⮚ Long Short-Term Memory (LSTM) is a type of recurrent
neural network (RNN) that uses gates to store and process
information over multiple time steps.
⮚ LSTMs are a powerful tool in deep learning and artificial
intelligence, and are used in many applications.
⮚ LSTMs are able to process sequential data, such as text,
time series, and speech, because they use gates to
selectively retain or discard information.
⮚ LSTM incorporates feedback connections, allowing it to
process entire sequences of data, not just individual data
points.
⮚ LSTMs are able to overcome some of the problems that
traditional RNNs face, such as the vanishing gradient
problem.
4
Sequence Modeling

⮚ Sequence modeling is the task of predicting the next word or character: it computes
the probability of each word that could occur next in a given sequence.
⮚ The model outputs the word or character with the highest probability.
⮚ Unlike a feed-forward ANN, a sequence model's current output depends not only on the
current input but also on previous outputs.
What is sequential data?
Sequential data comes in several forms, such as time series, speech, text, financial data,
audio, and video.
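The next-word/next-character idea above can be sketched with a toy bigram model: count which character follows which in a corpus, then predict the most probable successor. The corpus string and function name here are illustrative choices, not from the slides.

```python
from collections import Counter, defaultdict

# Toy next-character model: count bigrams in a small corpus, then
# predict the most probable next character for a given one.
corpus = "hello world"
bigrams = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    bigrams[a][b] += 1

def predict_next(ch):
    """Return the most probable character to follow `ch`."""
    counts = bigrams[ch]
    total = sum(counts.values())
    # Probability of each candidate next character
    probs = {c: n / total for c, n in counts.items()}
    return max(probs, key=probs.get)

print(predict_next("h"))  # 'h' is only ever followed by 'e' in this corpus
```

A real sequence model (an RNN or LSTM) replaces these raw counts with a learned, context-dependent probability distribution.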
5
Features of LSTM

1. Memory Cells: LSTM networks have special memory cells that allow them to
remember information over long sequences. They keep important information
while discarding irrelevant details.
2. Gates: LSTMs use three types of gates—forget gate, input gate, and output gate—
to control the flow of information:
⮚ Forget Gate: Decides what information to discard from the cell state.
⮚ Input Gate: Decides what new information to add to the cell state.
⮚ Output Gate: Controls what part of the cell state to output to the next
hidden state.
3. Cell State: This is the central piece of memory that carries information across time
steps, allowing the network to remember context over long sequences.
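The three gates and the cell-state update listed above can be written out as a minimal numpy sketch. The weight matrices here are random toy values and the dimensions (`n_in`, `n_hid`) are arbitrary choices for illustration, not values from the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3                      # toy sizes
x_t = rng.normal(size=n_in)             # current input
h_prev = np.zeros(n_hid)                # previous hidden state
c_prev = np.zeros(n_hid)                # previous cell state

# One weight matrix per gate, acting on the concatenation [h_prev; x_t]
z = np.concatenate([h_prev, x_t])
W_f, W_i, W_o, W_c = (rng.normal(size=(n_hid, n_hid + n_in)) for _ in range(4))

f_t = sigmoid(W_f @ z)                  # forget gate: what to discard from c_prev
i_t = sigmoid(W_i @ z)                  # input gate: what new information to admit
c_tilde = np.tanh(W_c @ z)              # candidate values for the cell state
c_t = f_t * c_prev + i_t * c_tilde      # cell state update
o_t = sigmoid(W_o @ z)                  # output gate: what part of the cell to expose
h_t = o_t * np.tanh(c_t)                # new hidden state

print(h_t.shape)
```

Because every gate output lies in (0, 1), each acts as a soft switch: values near 1 let information through, values near 0 block it.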
6
Vanishing Gradient

⮚ The vanishing gradient problem is a phenomenon that
occurs during the training of deep neural networks
when the gradients used to update the network become
very small, or "vanish".
⮚ Vanishing gradients arise when these values become
too small, causing the model to stop learning or to
require much more processing time to produce a result.
⮚ This problem has been tackled in recent times with the
introduction of the concept of LSTM.
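The effect can be shown numerically: backpropagating through many time steps of a plain RNN multiplies the gradient by roughly the same factor at each step, so a factor below 1 shrinks it exponentially. The factor 0.5 below is an arbitrary stand-in chosen for illustration.

```python
# Sketch: a per-step gradient factor below 1, compounded over many
# time steps, drives the overall gradient toward zero ("vanishing").
grad = 1.0
factor = 0.5          # stand-in for the per-step gradient magnitude
for t in range(50):
    grad *= factor
print(grad)           # about 8.9e-16: effectively zero after 50 steps
```

The LSTM's additive cell-state update (c_t = f_t * c_prev + i_t * c_tilde) gives gradients a path that is not repeatedly squashed this way, which is why LSTMs mitigate the problem.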
7
Structure of LSTM

⮚ The structure of an LSTM network consists
of a series of LSTM cells, each of which has
a set of gates (input, output, and forget gates)
that control the flow of information into and
out of the cell.
⮚ The gates are used to selectively forget or
retain information from the previous time
steps, allowing the LSTM to maintain
long-term dependencies in the input data.
8
Structure of LSTM

⮚ The LSTM cell also has a memory cell that
stores information from previous time steps and
uses it to influence the output of the cell at the
current time step.
⮚ The output of each LSTM cell is passed to the
next cell in the network, allowing the LSTM to
process and analyze sequential data over
multiple time steps.
9
Difference between RNN and LSTM
10
Working principle of LSTM

⮚ Forget Gate receives the previous hidden state and current
input, applying a sigmoid function to output a value between 0
and 1 for each element in the cell state.
⮚ Input Gate generates a vector of potential new values using a
tanh layer, deciding which values to add to the cell state.
⮚ Cell State Update combines the information retained from the
forget gate and new information from the input gate to update
the cell state.
⮚ Output Gate uses a sigmoid function to decide what information
from the cell state is passed as the hidden state.

This combination of gates and cell-state updates enables the LSTM
to retain, forget, or add information over time, making it effective
for sequence-data tasks like language modeling or time-series
forecasting.

Block Diagram of LSTM
11
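The four steps above (forget gate, input gate, cell-state update, output gate) can be combined into one cell function and unrolled over a sequence, with the hidden and cell states carried from step to step. Shapes, the small weight scale, and the sequence length are illustrative assumptions, not values from the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step. W stacks the four gate weight matrices
    row-wise (forget, input, output, candidate); b stacks the biases."""
    n = h_prev.size
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0*n:1*n])        # forget gate
    i = sigmoid(z[1*n:2*n])        # input gate
    o = sigmoid(z[2*n:3*n])        # output gate
    g = np.tanh(z[3*n:4*n])        # candidate cell values
    c = f * c_prev + i * g         # cell-state update
    h = o * np.tanh(c)             # hidden state via output gate
    return h, c

rng = np.random.default_rng(1)
n_in, n_hid, T = 2, 3, 5           # toy dimensions and sequence length
W = rng.normal(scale=0.1, size=(4 * n_hid, n_hid + n_in))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(T):                 # unroll over the sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h.shape, c.shape)
```

Passing `h` and `c` forward at every step is what lets information from early inputs influence outputs many steps later.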
Working principle of LSTM
(Slides 12-14: step-by-step gate diagrams.)
Uses of LSTM

✔ Robot Control
✔ Human Action Recognition
✔ Time Series Prediction
✔ Speech Recognition
✔ Rhythm Learning
✔ Music Composition
✔ Handwriting Recognition
✔ End To End Translation
✔ Grammar Learning
✔ Microsoft
• End To End Speech Translation
✔ Google
• Speech Recognition On The Smartphone
• Smart Assistant Allo
15
Advantages & Disadvantages of LSTM

Advantages:
Longer-Term Memory:
LSTMs can capture longer dependencies, unlike traditional
RNNs.
Controlled Memory Flow:
The gating mechanisms allow selective memory updates.

Disadvantages:
Computationally Intensive:
LSTMs require more processing power and memory.
Training Complexity:
Longer training times due to the sequential nature and complexity of the
architecture.
16