
UNIT–V

Sequence Modeling: Recurrent and Recursive Nets

Unfolding Computational Graphs, Recurrent Neural Networks, Bidirectional RNNs, Deep Recurrent Networks, Recursive Neural Networks, The Challenge of Long-Term Dependencies, Optimization for Long-Term Dependencies, Explicit Memory.

Unfolding Computational Graphs

The basic recurrence of an RNN (Eq. 10.4) is:

h(t) = f(h(t-1), x(t); θ)

It says that the current hidden state h(t) is a function f of the previous hidden state h(t-1) and the current input x(t), where θ denotes the parameters of f. The network typically learns to use h(t) as a kind of lossy summary of the task-relevant aspects of the past sequence of inputs up to time t.

Unfolding maps the compact recurrent graph on the left of the figure below to the unrolled graph on the right (both are computational graphs of an RNN without an output o). The black square indicates an interaction with a delay of one time step, from the state at time t to the state at time t + 1.

Unfolding/parameter sharing is better than using different parameters per position: there are fewer parameters to estimate, and the model generalizes to sequences of various lengths.
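
A minimal NumPy sketch of this shared-parameter recurrence, assuming tanh as the transition function f (all names here are illustrative, not taken from the book):

import numpy as np

def rnn_unroll(x_seq, h0, W, U, b):
    """Unfold the recurrence h(t) = tanh(W h(t-1) + U x(t) + b),
    reusing the same parameters W, U, b at every time step."""
    h = h0
    states = []
    for x_t in x_seq:                         # one iteration per time step
        h = np.tanh(W @ h + U @ x_t + b)      # same shared parameters (theta)
        states.append(h)
    return states

# Toy usage: 5 time steps, 3-dim inputs, 4-dim hidden state.
rng = np.random.default_rng(0)
x_seq = [rng.normal(size=3) for _ in range(5)]
W = rng.normal(size=(4, 4))
U = rng.normal(size=(4, 3))
b = np.zeros(4)
states = rnn_unroll(x_seq, np.zeros(4), W, U, b)
print(len(states), states[-1].shape)          # 5 (4,)

Because the same W, U, b are reused at every step, the same function works for a sequence of any length.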

10.2 Recurrent Neural Network

Variation 1 of RNN (basic form): hidden-to-hidden connections, sequence output, as in Fig 10.3.

The basic equations that define this RNN (see Eq. 10.6 on p. 385 of the book) take the standard form:

a(t) = b + W h(t-1) + U x(t)
h(t) = tanh(a(t))
o(t) = c + V h(t)
ŷ(t) = softmax(o(t))
The total loss for a given sequence of x values paired with a sequence of y values is then just the sum of the losses over all the time steps. For example, if L(t) is the negative log-likelihood of y(t) given x(1), . . . , x(t), then summing over time steps gives the loss for the sequence (Eq. 10.7):

L = Σ_t L(t) = - Σ_t log p_model( y(t) | x(1), . . . , x(t) )

• Forward Pass: the runtime is O(τ) and cannot be reduced by parallelization, because the forward propagation graph is inherently sequential; each time step can only be computed after the previous one.
• Backward Pass: see Section 10.2.2.
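
A hedged NumPy sketch of this O(τ) forward pass and summed loss, following the equations above (function and parameter names are illustrative; y_t is assumed to be an integer class index):

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def rnn_forward_loss(x_seq, y_seq, params):
    """Sequential forward pass; returns the summed negative
    log-likelihood of each y(t) given x(1), ..., x(t)."""
    W, U, V, b, c = params
    h = np.zeros(b.shape[0])
    total_loss = 0.0
    for x_t, y_t in zip(x_seq, y_seq):    # inherently sequential: h(t) needs h(t-1)
        a = b + W @ h + U @ x_t           # a(t) = b + W h(t-1) + U x(t)
        h = np.tanh(a)                    # h(t) = tanh(a(t))
        o = c + V @ h                     # o(t) = c + V h(t)
        y_hat = softmax(o)                # yhat(t) = softmax(o(t))
        total_loss += -np.log(y_hat[y_t]) # L(t) = -log p(y(t) | x(1..t))
    return total_loss

The loop cannot be parallelized across time steps because each h(t) depends on h(t-1).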

Variation 2 of RNN: output-to-hidden connections, sequence output. As shown in Fig 10.4, it produces an output at each time step and has recurrent connections only from the output at one time step to the hidden units at the next time step.

Teacher forcing (Section 10.2.1, p. 385) can be used to train an RNN such as that of Fig 10.4 (above), where only output-to-hidden connections exist, i.e. hidden-to-hidden connections are absent.

In teacher forcing, the model is trained to maximize the conditional probability of the current output y(t) given both the x sequence so far and the previous output y(t-1); that is, the gold-standard output of the previous time step is fed back during training instead of the model's own prediction.
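
A minimal sketch of one teacher-forced step for this kind of architecture (hypothetical names; the recurrence runs from the previous ground-truth output, not from a previous hidden state):

import numpy as np

def teacher_forced_step(x_t, y_prev, params):
    """One training-time step of a Fig 10.4-style RNN: the hidden state is
    conditioned on the previous *ground-truth* output y_prev, so there is
    no hidden-to-hidden connection."""
    W_xh, W_yh, W_ho, b_h, b_o = params
    h_t = np.tanh(W_xh @ x_t + W_yh @ y_prev + b_h)  # output-to-hidden recurrence
    o_t = W_ho @ h_t + b_o                           # logits for y(t)
    return h_t, o_t

# At test time, y_prev would instead be the model's own previous prediction,
# since the gold-standard output is not available.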

Variation 3 of RNN: hidden-to-hidden connections, single output. As in Fig 10.5, recurrent connections between hidden units read an entire sequence and then produce a single output.
What is a Recurrent Neural Network (RNN)?
A Recurrent Neural Network (RNN) is a type of neural network in which the output from the previous step is fed as input to the current step. In traditional neural networks, all the inputs and outputs are independent of each other, but in cases where we need to predict the next word of a sentence, the previous words are required, and hence there is a need to remember them. RNNs solve this issue with the help of a hidden layer. The main and most important feature of an RNN is its hidden state, which remembers some information about the sequence; this state is also referred to as the memory state, since it retains the previous inputs to the network. An RNN uses the same parameters for each input, because it performs the same task on all inputs or hidden layers to produce the output. This reduces the number of parameters, unlike other neural networks.
Recurrent Neural Network

How does an RNN differ from a Feedforward Neural Network?

Artificial neural networks that do not have looping nodes are called feedforward neural networks. Because all information is only passed forward, this kind of neural network is also referred to as a multi-layer neural network. In a feedforward network, information moves unidirectionally from the input layer, through any hidden layers that are present, to the output layer. These networks are appropriate for tasks such as image classification, where input and output are independent. Nevertheless, their inability to retain previous inputs makes them less useful for sequential data analysis.
Bi-directional Recurrent Neural Network
A bidirectional recurrent neural network (BRNN) is a neural network architecture designed to process sequential data. BRNNs process input sequences in both the forward and backward directions so that the network can use information from both past and future context in its predictions. This is the main distinction between BRNNs and conventional recurrent neural networks.

There are four types of RNNs based on the number of inputs and outputs
in the network.
1. One to One
2. One to Many
3. Many to One
4. Many to Many
One to One
This type of RNN behaves the same as any simple neural network; it is also known as a Vanilla Neural Network. In this network, there is only one input and one output.

One to One RNN


One To Many
In this type of RNN, there is one input and many outputs associated with it. One of the most common examples of this network is image captioning, where, given an image, we predict a sentence consisting of multiple words.

One to Many RNN


Many to One
In this type of network, many inputs are fed to the network at several states of the network, generating only one output. This type of network is used in problems like sentiment analysis, where we give multiple words as input and predict only the sentiment of the sentence as output.
Many to One RNN
Many to Many
In this type of neural network, there are multiple inputs and multiple outputs corresponding to a problem. One example of this is language translation, where we provide multiple words from one language as input and predict multiple words in the second language as output.

Many to Many RNN

Bidirectional RNNs
A BRNN has two distinct recurrent hidden layers, one of which
processes the input sequence forward and the other of which processes it
backward. After that, the results from these hidden layers are collected
and input into a prediction-making final layer. Any recurrent neural
network cell, such as Long Short-Term Memory (LSTM) or Gated
Recurrent Unit, can be used to create the recurrent hidden layers.
The BRNN functions similarly to conventional recurrent neural
networks in the forward direction, updating the hidden state depending
on the current input and the prior hidden state at each time step. The
backward hidden layer, on the other hand, analyses the input sequence
in the opposite manner, updating the hidden state based on the current
input and the hidden state of the next time step.
Compared to conventional unidirectional recurrent neural networks, the
accuracy of the BRNN is improved since it can process information in
both directions and account for both past and future contexts. Because
the two hidden layers can complement one another and give the final
prediction layer more data, using two distinct hidden layers also offers a
type of model regularisation.
In order to update the model parameters, the gradients are computed for both the forward and backward passes of the backpropagation-through-time procedure that is typically used to train BRNNs. At inference time, the input sequence is processed by the BRNN in a single forward pass, and predictions are made based on the combined outputs of the two hidden layers.

Bi-directional Recurrent Neural Network

Working of Bidirectional Recurrent Neural Network

1. Inputting a sequence: A sequence of data points, each represented as a vector with the same dimensionality, is fed into the BRNN. Sequences may have different lengths.
2. Dual Processing: The data are processed in both the forward and backward directions. In the forward direction, the hidden state at time step t is determined from the input at that step and the hidden state at step t-1. In the backward direction, the hidden state at step t is calculated from the input at step t and the hidden state at step t+1 (see the sketch after this list).
3. Computing the hidden state: A non-linear activation function on
the weighted sum of the input and previous hidden state is used to
calculate the hidden state at each step. This creates a memory
mechanism that enables the network to remember data from earlier
steps in the process.
4. Determining the output: A non-linear activation function is used to
determine the output at each step from the weighted sum of the
hidden state and a number of output weights. This output has two
options: it can be the final output or input for another layer in the
network.
5. Training: The network is trained through a supervised learning
approach where the goal is to minimize the discrepancy between the
predicted output and the actual output. The network adjusts its
weights in the input-to-hidden and hidden-to-output connections
during training through backpropagation.
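
A hedged NumPy sketch of the dual forward/backward processing described above (all names and dimensions are illustrative; a simple tanh cell stands in for LSTM/GRU):

import numpy as np

def brnn_forward(x_seq, params):
    """Run two independent recurrent passes over the sequence and
    concatenate their hidden states at each time step."""
    Wf, Uf, bf, Wb, Ub, bb = params
    T, H = len(x_seq), Wf.shape[0]

    h_fwd = [None] * T
    h = np.zeros(H)
    for t in range(T):                        # forward direction: step t-1 -> t
        h = np.tanh(Wf @ h + Uf @ x_seq[t] + bf)
        h_fwd[t] = h

    h_bwd = [None] * T
    h = np.zeros(H)
    for t in reversed(range(T)):              # backward direction: step t+1 -> t
        h = np.tanh(Wb @ h + Ub @ x_seq[t] + bb)
        h_bwd[t] = h

    # Combined representation seen by the final prediction layer.
    return [np.concatenate([h_fwd[t], h_bwd[t]]) for t in range(T)]

The concatenated forward and backward states give the output layer access to both past and future context at every position.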

Recursive Neural Network

A branch of machine learning and artificial intelligence (AI) known as


"deep learning" aims to replicate how the human brain analyses
information and learns certain concepts. Deep Learning's foundation is
made up of neural networks. These are intended to precisely identify
underlying patterns in a data collection and are roughly modelled after the
human brain. Deep Learning provides the answer to the problem of
predicting the unpredictable.

A subset of deep neural networks called recursive neural networks (RvNNs) is capable of learning structured and detailed data. By repeatedly applying the same set of weights to structured inputs, an RvNN produces a structured prediction. "Recursive" refers to the network being applied to its own output.

Recursive neural networks are capable of handling hierarchical data because of their deep, tree-like structure. In a tree structure, parent nodes are created by joining child nodes. There is a weight matrix for every child-parent bond, and comparable children share the same weights. To allow for recursive operations and the use of the same weights, the number of children for each node in the tree is fixed. RvNNs are employed when it is necessary to parse a whole sentence.

We sum the products of the weight matrices (W_i) and the children's representations (C_i), and apply the transformation f to obtain the parent node's representation:

h = f( Σ_{i=1}^{c} W_i C_i )
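
A minimal recursive (tree-structured) sketch of this parent computation, assuming f = tanh and a fixed number of children per node (the class and field names are hypothetical):

import numpy as np

class Node:
    def __init__(self, children=None, embedding=None):
        self.children = children or []   # internal node: list of child Nodes
        self.embedding = embedding       # leaf node: fixed input vector

def compose(node, W_list, f=np.tanh):
    """Recursively compute h = f(sum_i W_i C_i) bottom-up over the tree.
    The same weight matrices W_list are reused at every parent node."""
    if not node.children:                # leaf: return its input embedding
        return node.embedding
    child_reps = [compose(c, W_list, f) for c in node.children]
    return f(sum(W @ C for W, C in zip(W_list, child_reps)))

# Toy usage: binary tree over 3-dim leaf embeddings.
rng = np.random.default_rng(0)
W_list = [rng.normal(size=(3, 3)) for _ in range(2)]   # one matrix per child slot
leaf = lambda: Node(embedding=rng.normal(size=3))
tree = Node(children=[Node(children=[leaf(), leaf()]), leaf()])
print(compose(tree, W_list))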

1. What are long-term dependencies?

Long-term dependencies are the situations where the output of an RNN


depends on the input that occurred many time steps ago. For instance,
consider the sentence "The cat, which was very hungry, ate the mouse".
To understand the meaning of this sentence, you need to remember that
the cat is the subject of the verb ate, even though they are separated by a
long clause. This is a long-term dependency, and it can affect the
performance of an RNN that tries to generate or analyze such sentences.

2. Why are long-term dependencies hard to learn?

The main reason why long-term dependencies are hard to learn is that
RNNs suffer from the vanishing or exploding gradient problem. This
means that the gradient, which is the signal that tells the network how to
update its weights, becomes either very small or very large as it
propagates through the network. When the gradient vanishes, the network
cannot learn from the distant inputs, and when it explodes, the network
becomes unstable and produces erratic outputs. This problem is caused by
the repeated multiplication of the same matrix, which represents the
connections between the hidden units, at each time step.
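
A small NumPy illustration of this repeated-multiplication effect (the matrix, the scale values, and the step count are illustrative, and the activation derivative is ignored for simplicity):

import numpy as np

rng = np.random.default_rng(0)
H, T = 8, 50                                   # hidden size, number of time steps

for scale in (0.5, 1.5):                       # small vs. large recurrent weights
    W = scale * np.linalg.qr(rng.normal(size=(H, H)))[0]   # scaled orthogonal matrix
    grad = np.ones(H)                          # stand-in for dL/dh at the last step
    for _ in range(T):                         # backprop multiplies by W^T each step
        grad = W.T @ grad
    print(f"scale={scale}: gradient norm after {T} steps = {np.linalg.norm(grad):.3e}")

# scale < 1 -> the norm shrinks toward 0 (vanishing gradient);
# scale > 1 -> the norm blows up (exploding gradient).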

3. How can you solve the vanishing or exploding gradient problem?

One way to solve the vanishing or exploding gradient problem is to use a


different activation function for the hidden units. The activation function
determines how the units respond to the input and output signals. The
most common activation function for RNNs is the hyperbolic tangent
(tanh), which has a range between -1 and 1. However, this function can
cause the gradient to vanish if the input is too large or too small. A better
alternative is the rectified linear unit (ReLU), which has a range between
0 and infinity. This function can prevent the gradient from vanishing, but
it can also cause it to explode if the input is too large.

Another way to solve the vanishing or exploding gradient problem is to


use a different weight initialization method for the network. The weight
initialization method determines how the network assigns random values
to its weights before training. The most common method for RNNs is the
uniform initialization, which draws the weights from a uniform
distribution between -1 and 1. However, this method can cause the
weights to be too large or too small, which can affect the gradient. A
better alternative is orthogonal initialization, which initializes the weight matrix to be orthogonal, thereby preserving the norm of the gradient.
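
A hedged sketch of these two remedies in a simple NumPy RNN cell (names and scaling choices are illustrative, not prescribed by the text):

import numpy as np

def orthogonal_init(shape, rng):
    """Return an orthogonal matrix via QR decomposition, so repeated
    multiplication neither shrinks nor amplifies vector norms."""
    a = rng.normal(size=shape)
    q, _ = np.linalg.qr(a)
    return q

def relu(x):
    return np.maximum(0.0, x)          # gradient is 1 wherever x > 0, so it does not saturate

rng = np.random.default_rng(0)
H, D = 16, 8
W = orthogonal_init((H, H), rng)       # recurrent (hidden-to-hidden) weights
U = rng.normal(scale=1.0 / np.sqrt(D), size=(H, D))
b = np.zeros(H)

def rnn_step(h, x):
    return relu(W @ h + U @ x + b)     # ReLU activation instead of tanh

h = rnn_step(np.zeros(H), np.ones(D))  # toy usage of one recurrent step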

Explicit memory
Explicit memory is declarative memory because we consciously try to
recall a specific event or piece of information. Things we intentionally try
to recall or remember, such as formulas and dates, are all stored in
explicit memory. We utilize recalled information such as this during
everyday activities such as work or when running errands.

Explicit memory can be classed as either episodic or semantic. Episodic


memory is the memory of one's own personal past, while semantic
memories contain hard facts and concepts such as names.

MRI studies show that during recall of explicit short-term memories, the prefrontal cortex, the most recently evolved addition to the mammalian brain, is activated. Interestingly, there appears to be a separation in function between the left and right sides of the prefrontal cortex, with the right side more involved in spatial working memory and the left in verbal working memory.

The hippocampus, neocortex, and amygdala have been implicated during


the formation and storage of explicit long-term memory. The
hippocampus is found within the brain's temporal lobe and forms and
indexes memories about our own lives for later access.

We know this because Henry Molaison had his hippocampus removed during treatment for epilepsy in 1952 and, following the procedure, was unable to form any new memories of things he had done. He was, however, able to learn new skills and motor tasks, which are examples of implicit memory that do not rely on this region of the brain.
Examples of explicit memory include:

• Recalling phone numbers.
• Completing an exam.
• Remembering items on a list.
• Birth dates.
• Important event dates.
• Names.
• Locations.
• Country names.
