
Lecture 3 Neural Networks

INF 337 Deep Learning


Data Augmentation

Data augmentation is a technique for artificially increasing the size of a training set by creating modified copies of existing data. It includes making minor changes to the dataset or using deep learning to generate new data points.
Augmented vs. synthetic data

Augmented data: This involves creating modified versions of existing data to increase dataset diversity. For example, in image processing, applying transformations like rotations, flips, or color adjustments to existing images can help models generalize better.

Synthetic data: This refers to artificially generated data, which allows researchers and developers to test and improve algorithms without risking the privacy or security of real-world data.
When should you use data augmentation?

1. To prevent models from overfitting.
2. When the initial training set is too small.
3. To improve model accuracy.
4. To reduce the operational cost of labeling and cleaning the raw dataset.
Limitations of data augmentation

1. The biases in the original dataset persist in the augmented data.
2. Quality assurance for data augmentation is expensive.
3. Research and development are required to build systems for advanced applications. For example, generating high-resolution images using GANs can be challenging.
4. Finding an effective data augmentation approach can be challenging.
Data Augmentation Techniques

Audio Data Augmentation (a minimal sketch of the first two techniques follows the list):

1. Noise injection: add Gaussian or random noise to the audio signal to improve model robustness.
2. Shifting: shift the audio left (fast forward) or right by a random number of seconds.
3. Changing the speed: stretch the time series by a fixed rate.
4. Changing the pitch: randomly change the pitch of the audio.
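The first two techniques can be expressed directly on a raw waveform array. Below is a minimal sketch using NumPy; the waveform, sample rate, and noise level are illustrative assumptions (speed and pitch changes usually rely on a dedicated audio library such as librosa, so they are omitted here).

```python
import numpy as np

def add_noise(waveform, noise_level=0.005):
    """Noise injection: add Gaussian noise scaled by noise_level."""
    return waveform + np.random.randn(len(waveform)) * noise_level

def time_shift(waveform, sample_rate, max_shift_s=0.5):
    """Shifting: move the signal left or right by a random number of samples."""
    max_shift = int(max_shift_s * sample_rate)
    shift = np.random.randint(-max_shift, max_shift + 1)
    return np.roll(waveform, shift)

# Toy usage on a synthetic 1-second 440 Hz tone at 16 kHz (illustrative only).
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
tone = np.sin(2 * np.pi * 440 * t)
augmented = time_shift(add_noise(tone), sr)
```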
Data Augmentation Techniques

Text Data Augmentation (deletion and shuffling are sketched after the list):

1. Word or sentence shuffling: randomly change the position of a word or sentence.
2. Word replacement: replace words with synonyms.
3. Syntax-tree manipulation: paraphrase the sentence using the same words.
4. Random word insertion: insert words at random.
5. Random word deletion: delete words at random.
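Random word deletion and word shuffling can be sketched with nothing but the standard library; the function names and the deletion probability below are illustrative assumptions. Synonym replacement and syntax-tree manipulation would additionally need an NLP resource such as WordNet, so they are not shown.

```python
import random

def random_word_deletion(sentence, p=0.1):
    """Delete each word with probability p (keep at least one word)."""
    words = sentence.split()
    kept = [w for w in words if random.random() > p]
    return " ".join(kept) if kept else random.choice(words)

def word_shuffling(sentence):
    """Randomly change the positions of the words in a sentence."""
    words = sentence.split()
    random.shuffle(words)
    return " ".join(words)

print(random_word_deletion("the quick brown fox jumps over the lazy dog"))
print(word_shuffling("the quick brown fox jumps over the lazy dog"))
```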
Data Augmentation Techniques

Image Augmentation (a composed pipeline is sketched after the list):
1. Geometric transformations: randomly flip, crop, rotate, stretch, and zoom images. Be careful about applying many transformations to the same images, as this can reduce model performance.
2. Color space transformations: randomly change RGB color channels, contrast, and brightness.
3. Kernel filters: randomly change the sharpness or blurring of the image.
4. Random erasing: delete a random part of the image.
5. Mixing images: blend and mix multiple images.
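In practice these image transformations are composed into a single augmentation pipeline. The sketch below assumes torchvision is available; the particular transforms and parameter values are illustrative choices rather than a prescribed recipe.

```python
from torchvision import transforms

# Geometric and color-space transforms on a PIL image, then random erasing on the tensor.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),               # geometric: flip
    transforms.RandomRotation(degrees=15),                # geometric: rotate
    transforms.ColorJitter(brightness=0.2, contrast=0.2), # color space: brightness/contrast
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.5),                      # delete a random patch
])

# Usage, assuming `img` is a PIL image: augmented_tensor = augment(img)
```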
Recurrent Neural Networks (RNNs)

A Recurrent Neural Network is a generalization of a feed-forward neural network that has an internal memory. An RNN is recurrent in nature: it applies the same function to every element of the input sequence, while the output for the current input depends on the computation for the previous inputs.
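The recurrence can be written as h_t = tanh(W_xh · x_t + W_hh · h_{t−1} + b), with the same weights reused at every time step. A minimal NumPy sketch of this step, with illustrative sizes and random weights:

```python
import numpy as np

input_size, hidden_size = 8, 16                     # illustrative sizes
W_xh = np.random.randn(hidden_size, input_size) * 0.01
W_hh = np.random.randn(hidden_size, hidden_size) * 0.01
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One recurrent step: the new hidden state depends on the input and the previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Apply the same function over a sequence, carrying the hidden state (the internal memory).
h = np.zeros(hidden_size)
for x_t in np.random.randn(5, input_size):          # a toy sequence of 5 inputs
    h = rnn_step(x_t, h)
```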
Different types of RNNs
● The core reason that recurrent nets are more exciting is that they allow us to operate over sequences of vectors: sequences in the input, the output, or, in the most general case, both. A few examples make this more concrete:
One-to-one:
This is also called a plain/vanilla neural network. It maps a fixed-size input to a fixed-size output, independent of previous information or outputs.
One-to-Many:
It takes a fixed-size input and produces a sequence of data as output.
Many-to-One:
It takes a sequence of information as input and outputs a fixed-size result.
Many-to-Many:
It takes a sequence of information as input, processes it recurrently, and outputs a sequence of data.
Bidirectional Many-to-Many:

Synced sequence input and output.

Notice that in every case there are no pre-specified constraints on the sequence lengths, because the recurrent transformation (green) is fixed and can be applied as many times as we like. A sketch of the many-to-one and many-to-many cases follows.
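In a framework like PyTorch these patterns differ mainly in which outputs you keep. The sketch below uses made-up sizes: taking only the final hidden state gives a many-to-one model, while keeping the whole output sequence gives a many-to-many model.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)          # batch of 4 sequences, 10 time steps, 8 features each

outputs, h_n = rnn(x)              # outputs: (4, 10, 16), h_n: (1, 4, 16)

many_to_one = outputs[:, -1, :]    # keep only the last time step, e.g. sequence classification
many_to_many = outputs             # keep every time step, e.g. per-step tagging
```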
[Figure: CNN vs RNN comparison]
Backpropagation Through Time:
We typically treat the full sequence (e.g., a word) as one training example, so the total error is just the sum of the errors at each time step (each character). The weights, as we can see, are the same at each time step. Let's summarize the steps for backpropagation through time (a short sketch follows the list):
1. The cross-entropy error is first computed using the predicted output and the actual output.
2. Remember that the network is unrolled for all the time steps.
3. For the unrolled network, the gradient is calculated at each time step with respect to the weight parameters.
4. Since the weights are the same for all time steps, the gradients can be combined across time steps.
5. The weights are then updated for both the recurrent neurons and the dense layers.
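An autograd framework performs these steps automatically once the per-time-step losses are combined. A rough PyTorch sketch of the idea, with an assumed character-level setup and illustrative sizes:

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 30, 64
rnn = nn.RNN(vocab_size, hidden_size, batch_first=True)
head = nn.Linear(hidden_size, vocab_size)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(1, 12, vocab_size)          # one "word" of 12 characters (toy encoding)
targets = torch.randint(0, vocab_size, (1, 12))  # next-character targets

outputs, _ = rnn(inputs)                         # the network is unrolled over all 12 steps
logits = head(outputs)                           # shape (1, 12, vocab_size)

# Total error combines the cross-entropy at every time step (averaged here).
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                  # gradients are accumulated across time steps
```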
Backpropagation Through Time:

While backpropagating through time, two types of issues may arise:

1. Vanishing gradients
2. Exploding gradients
Vanishing Gradients

The contributions from the earlier time steps become insignificant in the gradient descent step, because the gradient shrinks toward zero as it is repeatedly multiplied by factors smaller than one while flowing back through time.
Exploding Gradients

Exploding gradients occur when the algorithm assigns an excessively high importance to the weights, without much reason. Fortunately, this problem can be addressed by truncating or clipping the gradients.
How can you overcome the challenges of vanishing and exploding gradients?

Vanishing gradients can be mitigated with:

1. The ReLU activation function.
2. LSTM or GRU architectures.

Exploding gradients can be mitigated with:
● Truncated BPTT (instead of backpropagating from the last time step all the way back, backpropagate over a shorter window of the most recent time steps).
● Clipping gradients to a threshold (see the sketch after this list).
● RMSProp to adapt the learning rate.
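Gradient clipping is a one-line addition to the training step. A minimal, self-contained sketch in PyTorch; the toy model, placeholder loss, and the threshold of 1.0 are illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.RNN(8, 16, batch_first=True)                 # toy recurrent model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(4, 10, 8)
outputs, _ = model(x)
loss = outputs.pow(2).mean()                            # placeholder loss for illustration

loss.backward()                                         # gradients via backpropagation through time
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clip gradient norm to threshold
optimizer.step()
optimizer.zero_grad()
```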
Advantages of Recurrent Neural Networks

The main advantage of an RNN over an ANN is that an RNN can model sequential data (e.g., time series), so that each sample can be assumed to depend on the previous ones.

Recurrent neural networks are also used together with convolutional layers to extend the effective pixel neighborhood.
Disadvantages of Recurrent Neural Networks

Gradient vanishing and exploding problems.

Training an RNN is a very difficult task.

It cannot process very long sequences when tanh or ReLU is used as the activation function.
Long Short-Term Memory (LSTM)
● Long Short-Term Memory (LSTM) is a recurrent neural network architecture
designed by Sepp Hochreiter and Jürgen Schmidhuber in 1997.
● LSTMs are explicitly designed to avoid the long-term dependency problem.
Remembering information for long periods of time is practically their default
behavior, not something they struggle to learn!
LSTM

The key to LSTMs is the cell state, the horizontal line running through the top of the
diagram. The cell state is kind of like a conveyor belt. It runs straight down the
entire chain, with only some minor linear interactions. It’s very easy for information
to just flow along it unchanged.
LSTM
The LSTM does have the ability to remove or add
information to the cell state, carefully regulated by
structures called gates.

Gates are a way to optionally let information through. They are composed of a sigmoid neural net layer and a pointwise multiplication operation.

The sigmoid layer outputs numbers between zero and one, describing how much of each component should be let through. A value of zero means “let nothing through,” while a value of one means “let everything through!”
LSTM
The first step in our LSTM is to decide what information we’re going to throw away from the cell state. This decision is made by a sigmoid layer called the “forget gate layer.” It looks at h_{t−1} and x_t, and outputs a number between 0 and 1 for each number in the cell state C_{t−1}. A 1 represents “completely keep this,” while a 0 represents “completely get rid of this.”
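In the standard LSTM formulation this forget gate is computed as

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)

where [h_{t−1}, x_t] denotes the concatenation of the previous hidden state and the current input.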
LSTM
The next step is to decide what new information we’re going to store in the cell state. This has two parts. First, a sigmoid layer called the “input gate layer” decides which values we’ll update. Next, a tanh layer creates a vector of new candidate values, C̃_t, that could be added to the state. In the next step, we’ll combine these two to create an update to the state.
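In the standard formulation these two parts are

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C)

and the combined update to the cell state is

C_t = f_t * C_{t−1} + i_t * C̃_t

that is, the old state is scaled by the forget gate, and the new candidate values are added in, scaled by the input gate.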
LSTM
Finally, we need to decide what we’re going to output. This output will be based on
our cell state, but will be a filtered version. First, we run a sigmoid layer which
decides what parts of the cell state we’re going to output. Then, we put the cell
state through tanh (to push the values to be between −1 and 1) and multiply it by
the output of the sigmoid gate, so that we only output the parts we decided to.
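In the standard formulation the output step is

o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
h_t = o_t * tanh(C_t)

Putting the three gates together, one step of an LSTM cell can be sketched in NumPy as follows; the sizes, random weights, and toy input sequence are illustrative assumptions, not a trained network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

input_size, hidden_size = 8, 16                     # illustrative sizes
concat = input_size + hidden_size
W_f, W_i, W_C, W_o = (np.random.randn(hidden_size, concat) * 0.01 for _ in range(4))
b_f, b_i, b_C, b_o = (np.zeros(hidden_size) for _ in range(4))

def lstm_step(x_t, h_prev, C_prev):
    z = np.concatenate([h_prev, x_t])               # [h_{t-1}, x_t]
    f = sigmoid(W_f @ z + b_f)                      # forget gate: what to discard from C_{t-1}
    i = sigmoid(W_i @ z + b_i)                      # input gate: which values to update
    C_tilde = np.tanh(W_C @ z + b_C)                # candidate values
    C = f * C_prev + i * C_tilde                    # new cell state
    o = sigmoid(W_o @ z + b_o)                      # output gate: which parts to expose
    h = o * np.tanh(C)                              # new hidden state (filtered cell state)
    return h, C

h, C = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in np.random.randn(5, input_size):          # a toy sequence of 5 inputs
    h, C = lstm_step(x_t, h, C)
```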
