M3 L4 RNN Regularization
• Activation functions
• Regularization techniques
• Transfer learning
Problem
Consider the unit shown in the figure. Suppose that the weights corresponding to
its three inputs have the following values:
w1 = 2
w2 = −4
w3 = 1
and the activation of the unit is given by a step function.
Calculate the output value y of the unit for each of the following
input patterns:
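The exact threshold of the step function and the specific input patterns are not reproduced in the text above, so the sketch below assumes a unit step at 0 and uses made-up patterns purely to illustrate how the output is computed:

# Sketch of the unit: weighted sum of three inputs passed through a step
# function. The threshold (0 here) and the example input patterns below
# are assumptions for illustration; the originals are not shown in the text.

def step(s, threshold=0.0):
    """Step activation: 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if s >= threshold else 0

def unit_output(x, w=(2, -4, 1)):
    """Output y of the unit for one input pattern x = (x1, x2, x3)."""
    return step(sum(wi * xi for wi, xi in zip(w, x)))

for pattern in [(1, 0, 0), (0, 1, 0), (1, 1, 1)]:  # hypothetical patterns
    print(pattern, "->", unit_output(pattern))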
Problem
Logical operators (i.e. NOT, AND, OR, XOR, etc.) are the building blocks of any computational device. Logical functions
return only two possible values, true or false, based on the truth values of their arguments.
For example, the operator AND returns true only when all of its arguments are true; otherwise (if any argument is
false) it returns false. If we denote true by 1 and false by 0, then the logical function AND can be represented by the
following table:

x1  x2 | x1 AND x2
 0   0 |     0
 0   1 |     0
 1   0 |     0
 1   1 |     1

This function can be implemented by a single unit with two inputs,
if the weights are w1 = 1 and w2 = 1 and the activation is a step function whose threshold lies above 1 and at most 2,
so that the unit fires only when both inputs are 1.
c) The XOR function (exclusive or) returns true only when exactly one of its arguments is true.
Otherwise, it returns false. This can be represented by the following
table:

x1  x2 | x1 XOR x2
 0   0 |     0
 0   1 |     1
 1   0 |     1
 1   1 |     0
Do you think it is possible to implement this function using a single unit? A network of several
units?
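As a rough illustration of the answer, the sketch below implements AND with a single threshold unit and XOR with a small two-layer network. The thresholds used (1.5 and 0.5) are illustrative choices rather than values from the slides, and the hidden-layer construction is just one of several possibilities.

# Sketch: AND as a single threshold unit, XOR as a two-layer network.
# The weights w1 = w2 = 1 follow the problem statement; the thresholds
# below are assumptions chosen so that the units compute AND and XOR.

def step(x):
    return 1 if x > 0 else 0

def and_unit(x1, x2):
    # Single unit: fires only when both inputs are 1 (weighted sum 2 > 1.5).
    return step(1 * x1 + 1 * x2 - 1.5)

def xor_network(x1, x2):
    # XOR is not linearly separable, so a single unit is not enough.
    # h1 detects "at least one input is 1", h2 detects "both inputs are 1";
    # the output fires when h1 is on and h2 is off.
    h1 = step(x1 + x2 - 0.5)
    h2 = step(x1 + x2 - 1.5)
    return step(h1 - h2 - 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), "AND:", and_unit(x1, x2), "XOR:", xor_network(x1, x2))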
Cost function of neural networks
• Used to measure how far the predicted values are from the expected values.
• d denotes the true (target) value.
• Y denotes the neuron's prediction.
• M is the number of output nodes.
• The greater the error of the neural network, the higher the value of the cost function.
• Quadratic cost function (a standard form is sketched below):
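The slide's own formula is not reproduced in the extracted text; a standard quadratic (sum-of-squared-errors) cost over the M output nodes, writing d_i for the target of output node i and y_i for its prediction, is:

% Quadratic cost for one training example; the normalizing constant
% (1/2 here) varies between texts and is an assumption.
E = \frac{1}{2} \sum_{i=1}^{M} (d_i - y_i)^2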
Cost function of neural networks
• Cross-entropy cost function (a standard form is sketched below):
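Again the slide's formula is missing from the extracted text; a standard cross-entropy cost for M sigmoid outputs with targets d_i and predictions y_i is:

% Cross-entropy cost; this particular form (binary targets, natural log)
% is an assumption, since the original equation is not in the text.
E = -\sum_{i=1}^{M} \left[ d_i \ln y_i + (1 - d_i) \ln(1 - y_i) \right]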
Recurrent neural networks
• Given the price of a stock over the last week, predict whether the stock price will go up.
Recurrent neural networks
• The neural architectures discussed so far are inherently designed for multidimensional data in which the attributes are largely
independent of one another.
• However, certain data types such as time-series, text, and biological data contain sequential dependencies
among the attributes.
1. In a time-series data set, the values at successive time stamps are closely related to one another. If one
uses the values at these time stamps as independent features, then key information about the
relationships among successive values is lost.
2. Although text is often processed as a bag of words, one can obtain better semantic insights when the
ordering of the words is used. In such cases, it is important to construct models that take the sequencing
information into account. Text data is the most common use case of recurrent neural networks.
3. Biological data often contains sequences, in which the symbols might correspond to amino acids or one of
the nucleobases that form the building blocks of DNA.
Recurrent neural networks
• A recurrent neural network or RNN is a neural network which maps from an input space of
sequences to an output space of sequences in a stateful way.
• That is, the prediction of output yt depends not only on the input xt, but also on the hidden state
of the system, ht, which gets updated over time, as the sequence is processed.
• Such models can be used for sequence generation, sequence classification, and sequence
translation.
• RNNs are an architecture in which a set of hidden units is replicated at each time step, with
connections between the copies at successive steps.
• Equivalently, an RNN has a single set of input units, hidden units, and output units, with the hidden units feeding into
themselves, so the graph of an RNN may have self-loops.
• The self-loops mean that the values of the hidden units at one time step depend on their
values at the previous time step (a minimal sketch of this update is given below).
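A minimal sketch of this recurrence, assuming the conventional weight names W_xh, W_hh, W_hy and a tanh hidden nonlinearity (none of which are specified on the slide):

import numpy as np

# Minimal RNN forward pass: the hidden state h is updated from the current
# input and from its own previous value (the self-loop), and each output is
# read from the hidden state. Weight names and tanh are assumptions.

def rnn_forward(xs, W_xh, W_hh, W_hy, h0):
    h = h0
    ys = []
    for x in xs:                            # one step per sequence element
        h = np.tanh(W_xh @ x + W_hh @ h)    # h_t depends on x_t and h_{t-1}
        ys.append(W_hy @ h)                 # y_t read out from h_t
    return ys, h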
Delayed Sequence to sequence
Sequence to sequence
Sequence to vector
Vector to sequence
Example of an RNN and its unrolled representation
Each color corresponds to a weight matrix that is replicated at all time steps.
Example of an RNN which sums its inputs over time
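The figure is not reproduced in the text; the sketch below uses the natural weight choice for this behaviour (linear hidden unit, with input-to-hidden, hidden-to-hidden and hidden-to-output weights all equal to 1), which makes the hidden state accumulate the running sum of the inputs.

# Summing RNN sketch: with a linear hidden unit and all weights equal to 1,
# the hidden state is the running sum of the inputs and the output simply
# reads it out. These weight values are assumed, as the figure is missing.

def summing_rnn(inputs):
    h = 0.0
    outputs = []
    for x in inputs:
        h = 1.0 * h + 1.0 * x    # h_t = h_{t-1} + x_t (linear hidden unit)
        outputs.append(1.0 * h)  # output equals the accumulated sum
    return outputs

print(summing_rnn([2, -1, 3]))   # [2.0, 1.0, 4.0] -- running sums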
Example 2 RNN
• RNN which receives two inputs at
each time step, and which
determines which of the two
inputs has a larger sum over time
steps.
• The hidden unit is linear, and the
output unit is logistic.
• The output unit is a logistic unit
with a weight of 5. Recall that a
large weight makes the logistic
very steep, effectively turning it
into a hard threshold at 0.
• The hidden-to-hidden weight is 1,
so by default it remembers its
previous value.
• The input-to-hidden weights are 1
and -1, which means it adds one of
the inputs and subtracts the other
(a sketch of this network follows below).
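A sketch of this network using exactly the weights described above (hidden-to-hidden weight 1, input weights 1 and -1, logistic output with weight 5); remaining details, such as the absence of biases, are assumptions.

import math

# Comparison RNN sketch: the linear hidden unit keeps a running difference
# of the two input streams, and the logistic output with weight 5 acts
# almost like a hard threshold at 0, reporting which sum is larger so far.

def compare_rnn(xs1, xs2):
    h = 0.0
    outputs = []
    for x1, x2 in zip(xs1, xs2):
        h = 1.0 * h + 1.0 * x1 - 1.0 * x2      # running sum of (x1 - x2)
        y = 1.0 / (1.0 + math.exp(-5.0 * h))   # ~1 if stream 1 ahead, ~0 otherwise
        outputs.append(y)
    return outputs

print(compare_rnn([2, 0, 1], [1, 3, 0]))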
Backprop Through Time: unrolled computation graph
Vectorized backprop rules
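The slide's equations are not in the extracted text. A standard vectorized formulation of the backprop-through-time rules, assuming the forward pass z^(t) = W_xh x^(t) + W_hh h^(t-1) + b_h, h^(t) = φ(z^(t)), r^(t) = W_hy h^(t) + b_y, and writing \bar{v} for ∂E/∂v, is:

% Standard BPTT rules under the forward-pass assumptions stated above;
% the notation is ours, not the slide's.
\bar{h}^{(t)} = W_{hy}^{\top}\,\bar{r}^{(t)} + W_{hh}^{\top}\,\bar{z}^{(t+1)}
\qquad
\bar{z}^{(t)} = \bar{h}^{(t)} \odot \phi'\!\left(z^{(t)}\right)
\bar{W}_{xh} = \sum_t \bar{z}^{(t)} x^{(t)\top}
\qquad
\bar{W}_{hh} = \sum_t \bar{z}^{(t+1)} h^{(t)\top}
\qquad
\bar{W}_{hy} = \sum_t \bar{r}^{(t)} h^{(t)\top}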
RNN
Regularization techniques
• Regularization adds a penalty term to the error function E.
• A hyperparameter controls how much regularization is applied; it is called the regularization
parameter or regularization rate, is denoted by λ, and is divided by the size
of the batch used.
• Modified L2-regularized error function (a standard form is sketched below):
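The formula itself is missing from the extracted text; a common form, using the regularization rate λ divided by the batch size m as described above (the extra factor of 2 in the denominator is a frequent convention and an assumption here), is:

% L2-regularized error: original error plus a penalty on the squared weights.
E_{\mathrm{reg}} = E + \frac{\lambda}{2m} \sum_{j} w_j^{2}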
L2 Regularization
• The intuition is that during learning, smaller weights will be preferred, but larger
weights will be accepted if they yield a significant overall decrease in the error.
• This is why the technique is also called 'weight decay': each gradient step shrinks the weights
by a small multiplicative factor (see the update rule sketched below).
• The choice of λ determines how strongly small weights are preferred (when λ is large, the
preference for small weights is strong).
• Regularized error function and the resulting weight update:
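A sketch of the gradient-descent update produced by the regularized error, assuming the L2 penalty λ/(2m) Σ_j w_j² from the previous slide, learning rate η and batch size m; the multiplicative shrinkage factor is what gives 'weight decay' its name:

% Each update multiplies w_j by (1 - ηλ/m), a factor slightly below 1,
% before applying the usual gradient step on the unregularized error E.
w_j \leftarrow w_j - \eta \frac{\partial E_{\mathrm{reg}}}{\partial w_j}
    = \left(1 - \frac{\eta\lambda}{m}\right) w_j - \eta \frac{\partial E}{\partial w_j}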
L1 Regularization
• L1 regularization is superior when much of the data is irrelevant: very noisy data, features that
are not informative, or sparse data (where most features are irrelevant because
they are missing).
• Unlike L2, the L1 penalty tends to drive many weights exactly to zero, which produces
sparse models (a standard form of the penalty is sketched below).
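A common form of the L1 penalty, following the same λ/m convention as the L2 case above (the normalization is an assumption, since the slide's formula is not in the text):

% L1-regularized error: the absolute-value penalty tends to drive many
% weights exactly to zero, which suits data with many irrelevant features.
E_{\mathrm{reg}} = E + \frac{\lambda}{m} \sum_{j} \lvert w_j \rvert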
Architecture of an LSTM cell
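The figure itself is not reproduced in the text; the standard LSTM cell equations, with input gate i_t, forget gate f_t, output gate o_t, candidate cell state \tilde{c}_t, cell state c_t, hidden state h_t, logistic function σ and element-wise product ⊙, are:

% Standard LSTM formulation (weight and bias names are the usual convention).
i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i)
f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f)
o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)
\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
h_t = o_t \odot \tanh(c_t)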