
Deep Learning Application for Communication Engineering
Subject Code: ECE7419

By

Dr. RAM SEWAK SINGH


Associate Professor

Electronics and Communication Engineering Department


School of Electrical Engineering and Computing
Adama Science and Technology University,
Ethiopia, P.O. Box: 1888
Chapter 5: Sequence Models

Lecture Overview

• Problems with ML, ANN, and CNN

• Recurrent Neural Networks (RNN) for sequences

• Backpropagation Through Time

• Vanishing and Exploding Gradients and Remedies

• RNNs using Long Short-Term Memory (LSTM)
ANN
Recurrent Neural Network

A recurrent neural network (RNN) is a deep learning structure that uses past
information to improve the performance of the network on current and future
inputs. What makes an RNN unique is that the network contains a hidden state and
loops.

The looping structure allows the network to store past information in the hidden
state and operate on sequences.

A recurrent neural network (RNN) is a network architecture for deep learning that makes predictions on time-series or sequential data.
RNNs are particularly effective for working with sequential data that varies in length, and for solving problems such as natural signal classification, language processing, and video analysis.

What is a sequence, really?

• Data inside a sequence are not i.i.d. (independently, identically distributed).

• The next “word” depends on the previous “words”, ideally on all of them.

• We need context, and we need memory!

• How do we model context and memory?
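As a rough illustration of the answer, here is a minimal NumPy sketch of an RNN that keeps context in a hidden state. The weight names (Wxh, Whh, Why) and the layer sizes are illustrative assumptions, not taken from the slides.

    import numpy as np

    # Illustrative sizes; assumptions for this sketch only
    input_size, hidden_size, output_size = 8, 16, 8

    rng = np.random.default_rng(0)
    Wxh = rng.standard_normal((hidden_size, input_size)) * 0.01   # input -> hidden
    Whh = rng.standard_normal((hidden_size, hidden_size)) * 0.01  # hidden -> hidden (the loop / memory)
    Why = rng.standard_normal((output_size, hidden_size)) * 0.01  # hidden -> output

    def rnn_forward(inputs):
        """Run the RNN over a sequence; inputs is a list of (input_size, 1) vectors."""
        h = np.zeros((hidden_size, 1))            # hidden state: the network's memory
        outputs = []
        for x in inputs:
            # the new hidden state mixes the current input with the previous state,
            # so information from earlier steps can influence later predictions
            h = np.tanh(Wxh @ x + Whh @ h)
            outputs.append(Why @ h)
        return outputs, h

    sequence = [rng.standard_normal((input_size, 1)) for _ in range(5)]
    outputs, final_state = rnn_forward(sequence)
    print(len(outputs), final_state.shape)        # 5 outputs, (16, 1) final hidden state

The same weights are reused at every time step, which is what lets the looping structure operate on sequences of any length.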


Mathematical Model

Example: Predict Sequence of Characters
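The worked example itself appears in the slide figures. As a hedged companion, the setup for predicting the next character typically looks like the toy sketch below; the word "hello" and the one-hot encoding are illustrative assumptions.

    import numpy as np

    # Hypothetical toy corpus; the slides' actual example is shown in the figures
    text = "hello"
    chars = sorted(set(text))                       # vocabulary: ['e', 'h', 'l', 'o']
    char_to_ix = {c: i for i, c in enumerate(chars)}

    def one_hot(c):
        """Encode a character as a one-hot column vector."""
        v = np.zeros((len(chars), 1))
        v[char_to_ix[c]] = 1.0
        return v

    # Each character is paired with the character that follows it:
    # 'h' -> 'e', 'e' -> 'l', 'l' -> 'l', 'l' -> 'o'
    training_pairs = [(one_hot(a), char_to_ix[b]) for a, b in zip(text, text[1:])]
    print(len(training_pairs))                      # 4 (input, next-character) pairs

An RNN like the one sketched earlier would then be trained so that, at each step, its output scores the correct next character highest.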

Problems with RNNs

Long Short-Term Memory (LSTM) Networks

Long Short-Term Memory networks, usually just called “LSTMs”, are a special kind of RNN, capable of learning long-term dependencies.

• LSTMs are explicitly designed to avoid the long-term dependency problem. Remembering information for long periods of time is practically their default behavior, not something they struggle to learn!

• All recurrent neural networks have the form of a chain of repeating modules of neural network. In standard RNNs, this repeating module will have a very simple structure, such as a single tanh layer.

LSTMs also have this chain-like structure, but the repeating module has a different structure. Instead of having a single neural network layer, there are four, interacting in a very special way.

The repeating module of an LSTM contains: (1) a memory cell, (2) a forget gate, (3) an input gate, and (4) an output gate.

In the above diagram, each line carries an entire vector, from the output of one node to the inputs of others.

The pink circles represent pointwise operations, like vector addition, while the
yellow boxes are learned neural network layers.

Lines merging denote concatenation, while a line forking denotes its content being copied and the copies going to different locations.

The Core Idea Behind LSTMs

The key to LSTMs is the cell state, the horizontal line running through the top of the
diagram. The cell state is kind of like a conveyor belt. It runs straight down the entire
chain, with only some minor linear interactions. It’s very easy for information to just
flow along it unchanged.


The LSTM does have the ability to remove or add information to the cell state,
carefully regulated by structures called gates.

Gates are a way to optionally let information through. They are composed of a sigmoid neural net layer and a pointwise multiplication operation.
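As a small numerical illustration (the values below are made up), the sigmoid layer outputs numbers between 0 and 1, and the pointwise multiplication then decides how much of each component passes through.

    import numpy as np

    cell_state = np.array([0.8, -1.2, 0.5])
    gate = np.array([0.99, 0.02, 0.60])   # sigmoid outputs: ~1 means "let through", ~0 means "block"
    print(gate * cell_state)              # pointwise product: [ 0.792, -0.024, 0.3 ]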

Step-by-Step LSTM Walk-Through
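The step-by-step figures are not reproduced here. As a hedged summary, one LSTM step with the standard gate equations can be sketched as follows; the weight names, the concatenation of h_prev and x, and the layer sizes are assumptions made for illustration.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    hidden_size, input_size = 16, 8                    # illustrative sizes
    rng = np.random.default_rng(1)
    # One learned layer (weight matrix) per gate, acting on the concatenation [h_prev; x]
    Wf, Wi, Wc, Wo = (rng.standard_normal((hidden_size, hidden_size + input_size)) * 0.01
                      for _ in range(4))
    bf, bi, bc, bo = (np.zeros((hidden_size, 1)) for _ in range(4))

    def lstm_step(x, h_prev, c_prev):
        z = np.vstack([h_prev, x])       # concatenation (merging lines in the diagram)
        f = sigmoid(Wf @ z + bf)         # forget gate: what to discard from the cell state
        i = sigmoid(Wi @ z + bi)         # input gate: what new information to store
        c_tilde = np.tanh(Wc @ z + bc)   # candidate values to add to the cell state
        c = f * c_prev + i * c_tilde     # updated cell state (the "conveyor belt")
        o = sigmoid(Wo @ z + bo)         # output gate: which parts of the cell state to expose
        h = o * np.tanh(c)               # new hidden state / output of this step
        return h, c

    x = rng.standard_normal((input_size, 1))
    h0 = np.zeros((hidden_size, 1))
    c0 = np.zeros((hidden_size, 1))
    h1, c1 = lstm_step(x, h0, c0)
    print(h1.shape, c1.shape)            # (16, 1) (16, 1)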

Variants on Long Short-Term Memory

Autoencoders

Autoencoders are a specific type of feedforward neural network where the input is the same as the output. They compress the input into a lower-dimensional code and then reconstruct the output from this representation.

• The code is a compact “summary” or “compression” of the input, also called the latent-space representation.

• An autoencoder consists of 3 components: encoder, code, and decoder. The encoder compresses the input and produces the code; the decoder then reconstructs the input using only this code.
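A minimal sketch of this encoder–code–decoder layout, written with Keras; the layer sizes, the 784-dimensional input (e.g. a flattened 28x28 image), and the variable names are illustrative assumptions rather than anything prescribed by the slides.

    from tensorflow.keras import layers, Model

    input_dim, code_dim = 784, 32                         # assumed sizes for illustration

    inputs = layers.Input(shape=(input_dim,))
    # Encoder: compress the input down to the code (latent-space representation)
    x = layers.Dense(128, activation="relu")(inputs)
    code = layers.Dense(code_dim, activation="relu")(x)
    # Decoder: reconstruct the input using only the code
    x = layers.Dense(128, activation="relu")(code)
    outputs = layers.Dense(input_dim, activation="sigmoid")(x)

    autoencoder = Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.summary()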

Autoencoders are mainly a dimensionality reduction (or compression) algorithm with a couple of important properties:

• Data-specific: Autoencoders are only able to meaningfully compress data similar to what they have been trained on. Since they learn features specific to the given training data, they are different from a standard data compression algorithm like Gzip.

• Lossy: The output of the autoencoder will not be exactly the same as the input; it will be a close but degraded representation. If you want lossless compression, they are not the way to go.

• Unsupervised: To train an autoencoder we don’t need to do anything fancy, just throw the raw input data at it. Autoencoders are considered an unsupervised learning technique since they don’t need explicit labels to train on. But, to be more precise, they are self-supervised because they generate their own labels from the training data.
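In code, this self-supervision simply means the input serves as its own target; continuing the hypothetical Keras model sketched above:

    import numpy as np

    # Hypothetical unlabeled data standing in for the raw inputs
    x_train = np.random.rand(1000, 784).astype("float32")

    # No separate labels: the input itself is the reconstruction target
    autoencoder.fit(x_train, x_train, epochs=10, batch_size=64, validation_split=0.1)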

Architecture

Applications include anomaly detection, text generation, image generation, image denoising, and digital communications.

• Autoencoders will naturally ignore any input noise as the encoder is trained. This feature is ideal for removing noise or detecting anomalies when the inputs and outputs are compared (see Figures 2 and 3).

• The latent representation can also be used to generate synthetic data. For example, you can automatically create realistic-looking handwriting or phrases of text (Figure 4).

• Time-series-based autoencoders can also be used to detect anomalies in signal data. For example, in predictive maintenance, an autoencoder can be trained on normal operating data from an industrial machine (Figure 5).

Figure 5: Training on normal operating data for predictive maintenance.


• The trained autoencoder is then tested on new incoming data. A large variation between the input and the autoencoder’s output indicates abnormal operation, which could require investigation (Figure 6).
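One common way to turn this comparison into a detector is to threshold the reconstruction error. A hedged sketch, continuing the hypothetical autoencoder above; the mean-plus-three-standard-deviations threshold is an assumption for illustration.

    import numpy as np

    def reconstruction_error(model, x):
        """Per-sample mean squared error between the input and its reconstruction."""
        x_hat = model.predict(x, verbose=0)
        return np.mean((x - x_hat) ** 2, axis=1)

    # Calibrate a threshold on normal operating data only
    normal_errors = reconstruction_error(autoencoder, x_train)
    threshold = normal_errors.mean() + 3 * normal_errors.std()

    def is_anomalous(model, x_new):
        """Flag samples whose reconstruction error is unusually large."""
        return reconstruction_error(model, x_new) > threshold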


Thank you for your attention!!
