DL 4


Unfolding Computational Graphs

Unfolding computational graphs is a technique used to represent recurrent neural networks (RNNs) as acyclic graphs. This is done by unfolding the RNN's recurrence over a finite time horizon. The resulting graph can then be trained using standard backpropagation techniques.

For example, in a recurrent neural network used for natural language processing, the
computational graph can be unfolded over time to represent the processing of a sentence. Each
word in the sentence corresponds to a time step, and the same set of operations is applied to
each word. The computational graph is replicated for each time step, with the outputs of one
step serving as inputs for the next step.

Unfolding the computational graph allows us to visualize the repeated computations and the
flow of data across different time steps. It helps in understanding how information is processed
and propagated through the model over time. It also enables efficient computation by reusing
the same set of operations and their corresponding parameters for each time step, rather than
recreating them for each iteration.

Advantages of Unfolding Process


The unfolding process introduces two major advantages:
1. Regardless of sequence length, the learned model always has the same input size, because it is specified in terms of a transition from one state to the next rather than in terms of a variable-length history of states.
2. It is possible to use the same transition function f with the same parameters at every time step.

These two factors make it possible to learn a single model f that operates on all time steps and all sequence lengths, rather than needing a separate model g(t) for every possible time step.
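
This parameter sharing is easy to see in code. Below is a minimal sketch in Python/NumPy (the dimensions, weight names, and the tanh transition are illustrative assumptions): a single transition function f is applied at every time step, and unrolling the loop over T steps yields an acyclic chain of T identical blocks.

```python
import numpy as np

# Illustrative dimensions and randomly initialized parameters.
state_size, input_size, T = 4, 3, 5
rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.1, size=(state_size, state_size))
W_x = rng.normal(scale=0.1, size=(state_size, input_size))
b = np.zeros(state_size)

def f(h_prev, x_t):
    """State-transition function: one set of parameters, reused at every step."""
    return np.tanh(W_h @ h_prev + W_x @ x_t + b)

xs = rng.normal(size=(T, input_size))   # an arbitrary input sequence
h = np.zeros(state_size)                # initial state h(0)
states = []
for t in range(T):                      # each iteration is one node of the unfolded graph
    h = f(h, xs[t])                     # same f, same W_h, W_x, b at every time step
    states.append(h)
```

Because every iteration calls the same f with the same parameters, the unrolled loop is exactly the finite acyclic graph that backpropagation through time differentiates.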

Recurrent Neural Networks


Recurrent Neural Networks (RNNs) are a type of neural network where the results of one step
are fed into the next step's computations. Unlike feedforward neural networks, which process
inputs in a single pass without any memory, RNNs have a built-in memory mechanism that
allows them to capture dependencies and patterns in sequential data.

Traditional neural networks treat their inputs and outputs as independent of one another, but tasks such as predicting the next word in a sentence require remembering the previous words.

RNNs were developed to resolve this problem by means of a hidden state. The hidden state, which retains information about the sequence seen so far, is the primary and most significant characteristic of an RNN.
The key feature of RNNs is their ability to maintain and update an internal state, also known as
a hidden state or memory, which is passed from one step to the next. This hidden state serves
as a form of memory that captures information from previous steps and influences the
processing of future steps. It allows RNNs to consider the context and temporal information in
sequential data.

For example, suppose the RNN is trained on the sentence "The quick brown fox jumps over the lazy dog." The RNN learns to predict the next word by using its internal state to store information about the preceding words. When the RNN is presented with the word "quick," its internal state already carries information about the word "The," and this information is used to predict the next word in the sentence, which is "brown."
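
In code, the per-word update this example describes might look like the following minimal sketch (the weight names and the tanh/linear form are assumptions for illustration, not a complete trained model):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One vanilla RNN step: the new hidden state mixes the current input
    with the previous hidden state (the memory), and an output is read off it."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)  # update the memory
    y_t = W_hy @ h_t + b_y                           # e.g. next-word scores
    return h_t, y_t
```

Feeding each returned h_t back in as h_prev for the next word is what lets information about "The" survive long enough to influence the prediction made after "quick."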

RNNs are used for tasks such as translating between languages and generating text.

Working of each Recurrent Unit


A recurrent unit (RU) is a basic building block of a recurrent neural network (RNN). It is a
function that takes as input the current state of the RNN and the current input, and outputs the
next state of the RNN.

There are many different types of RUs, but the most common type is the gated recurrent unit
(GRU). A GRU has two gates: an update gate and a reset gate. The update gate controls how
much of the current state is updated, and the reset gate controls how much of the current state
is forgotten.

The GRU works as follows (a minimal code sketch follows this list):

- The update gate is computed from the current input and the previous state, using a sigmoid activation.
- The reset gate is likewise computed from the current input and the previous state, using a sigmoid activation.
- A candidate state is computed from the current input together with the previous state scaled element-wise by the reset gate, so the reset gate controls how much of the old state is used (or forgotten) when forming the candidate.
- The next state is an interpolation between the previous state and the candidate state, with the mix controlled by the update gate, so the update gate controls how much of the state is updated.
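
Written out in Python/NumPy, one GRU step might look as follows (weight names are hypothetical, biases are omitted for brevity, and the z/(1 - z) convention varies between references):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step with an update gate z and a reset gate r."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)              # reset gate
    h_cand = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate state
    h_t = (1.0 - z) * h_prev + z * h_cand            # interpolate old state and candidate
    return h_t
```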

Long Short-Term Memory networks and Gated Recurrent Unit networks, two key variants of the RNN, were created to address the issues of vanishing and exploding gradients.

Advantages:
- They can learn long-term dependencies.
- They are relatively easy to train.
Disadvantages:
- They can be computationally expensive to train.
- They can be difficult to interpret.
Bidirectional RNN
A Bidirectional Recurrent Neural Network (Bidirectional RNN) is a type of Recurrent Neural
Network (RNN) architecture that processes sequential data in both forward and backward
directions. It combines the information from past and future states to make predictions or extract
features from the input sequence.

In a standard RNN, the hidden state at each time step is computed based on the previous
hidden state and the current input. This allows the RNN to capture dependencies and patterns
in the past context of the sequence. However, the standard RNN may not have access to future
information, which can be valuable for tasks that require understanding the entire sequence.

The Bidirectional RNN addresses this limitation by introducing two separate RNNs: one that
processes the sequence in the forward direction (from the beginning to the end) and another
that processes the sequence in the backward direction (from the end to the beginning). Each
RNN has its own set of parameters.

The forward RNN computes a forward hidden state sequence h_f(t) at each time step t, starting
from the first element of the input sequence. On the other hand, the backward RNN computes a
backward hidden state sequence h_b(t) at each time step t, starting from the last element of the
input sequence.

Once the forward and backward hidden states are obtained, the Bidirectional RNN can combine
the information from both directions. This can be done in different ways depending on the
specific task or objective. One common approach is to concatenate the forward and backward
hidden states at each time step:

h(t) = [h_f(t); h_b(t)]

The concatenated hidden states can then be used for further processing, such as making
predictions, extracting features, or passing them to subsequent layers in a deep neural network.
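
A minimal sketch of this combination (each direction's parameters are passed as a hypothetical (W_x, W_h, b) tuple, and a simple tanh RNN stands in for whatever recurrent cell is actually used):

```python
import numpy as np

def run_rnn(xs, W_x, W_h, b):
    """Run a simple tanh RNN over a sequence and return all hidden states."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in xs:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)                      # shape: (T, hidden)

def bidirectional_rnn(xs, fwd_params, bwd_params):
    """Concatenate forward and backward hidden states: h(t) = [h_f(t); h_b(t)]."""
    h_f = run_rnn(xs, *fwd_params)               # process start -> end
    h_b = run_rnn(xs[::-1], *bwd_params)[::-1]   # process end -> start, re-align in time
    return np.concatenate([h_f, h_b], axis=1)    # shape: (T, 2 * hidden)
```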

The main advantage of Bidirectional RNNs is that they capture information from both past and
future contexts, enabling them to better model dependencies in the input sequence. This can be
particularly beneficial for tasks such as speech recognition, named entity recognition, sentiment
analysis, and machine translation, where understanding the context of the entire sequence is
important.

It's worth noting that the use of Bidirectional RNNs can increase computational complexity and
memory requirements compared to standard RNNs. Additionally, in tasks where future
information is not available, such as online prediction, Bidirectional RNNs may not be suitable.
Encoder-Decoder Sequence-to-Sequence Architectures
Encoder-Decoder sequence-to-sequence architectures are a type of neural network architecture
designed to handle tasks involving sequences, such as machine translation, text summarization,
and speech recognition. This architecture consists of two main components: an encoder and a
decoder.

The encoder processes the input sequence and encodes it into a fixed-length vector
representation called the context vector or latent representation. The decoder then takes this
context vector as input and generates the output sequence step by step.

The encoder-decoder model is composed of three primary building blocks (a minimal code sketch follows the list):


1. Encoder
- The input sequence, such as a sentence in machine translation, is fed into the
encoder one element at a time (e.g., word embeddings or characters).
- At each time step, the encoder computes a hidden state based on the current
input and the previous hidden state.
- The encoder's hidden state captures information from the previous inputs and
contextual information from the input sequence.
- The final hidden state of the encoder summarizes the entire input sequence into
a fixed-length context vector.

2. Hidden Vector / Encoder Vector


- The Encoder vector is a representation of the input sequence captured by the
encoder.
- It contains information about the input sequence's semantic and contextual
meaning, as encoded by the encoder.
- The Encoder vector serves as a condensed representation of the input sequence
and provides relevant information to the decoder.
3. Decoder
- The decoder takes the Encoder vector as an initial input and generates the
output sequence step by step.
- At each time step, the decoder computes a hidden state based on the current
input and the previous hidden state.
- The decoder's hidden state, similar to the encoder, captures information from the
previous inputs and the context vector.
- The decoder then predicts the next element of the output sequence based on the
current hidden state.
- This process is repeated iteratively until the desired output sequence is
generated or a predefined maximum length is reached.
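
Putting the three blocks together, a minimal greedy-decoding sketch might look like this (all names, the embedding table, and the start/end token ids are illustrative assumptions; a trained system would learn these weights and often adds attention on top of the single context vector):

```python
import numpy as np

def encode(xs, W_x, W_h, b):
    """Encoder: fold the whole input sequence into one context vector."""
    h = np.zeros(W_h.shape[0])
    for x_t in xs:                               # one input element per time step
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h                                     # final hidden state = context vector

def decode(context, embed, W_x, W_h, b, W_out, start_id, end_id, max_len=20):
    """Decoder: start from the context vector and emit tokens step by step."""
    h, token, outputs = context, start_id, []
    for _ in range(max_len):                     # stop at a maximum length ...
        x_t = embed[token]                       # feed back the previous prediction
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        token = int(np.argmax(W_out @ h))        # greedy choice of the next token
        if token == end_id:                      # ... or when the end symbol appears
            break
        outputs.append(token)
    return outputs
```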

Applications
- Google’s Machine Translation
- Question answering chatbots
- Speech recognition

Deep Recurrent Network


Deep Recurrent Networks (DRNs) are a class of neural network architectures that combine the
depth of deep neural networks with the sequential modeling capabilities of recurrent neural
networks (RNNs). DRNs are designed to capture complex dependencies in sequential data by
stacking multiple recurrent layers on top of each other.

The key idea behind DRNs is to create deeper representations of sequential data by allowing
information to flow through multiple layers of recurrent units. Each layer in the network receives
input from the previous layer and passes its output to the next layer, enabling the network to
capture hierarchical patterns and long-term dependencies in the data.

Training deep recurrent networks involves backpropagation through time (BPTT), an extension
of the backpropagation algorithm for recurrent networks. BPTT allows gradients to flow through
time steps and layers, enabling the network to learn from the sequential dependencies and
adjust its parameters accordingly.
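
One time step of such a stack can be sketched as follows (the parameter layout, one hypothetical (W_x, W_h, b) tuple and one previous hidden state per layer, is an illustrative assumption):

```python
import numpy as np

def deep_rnn_step(x_t, hidden_states, layer_params):
    """One time step of a stacked (deep) RNN: the output of layer l
    becomes the input of layer l + 1."""
    new_states, layer_input = [], x_t
    for h_prev, (W_x, W_h, b) in zip(hidden_states, layer_params):
        h = np.tanh(W_x @ layer_input + W_h @ h_prev + b)
        new_states.append(h)
        layer_input = h                          # feed this layer's output upward
    return new_states                            # one updated hidden state per layer
```
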
Recursive Neural Networks
Recursive Neural Networks (RecNNs, also written RvNNs) are a type of neural network architecture that operates on structured or hierarchical data, such as parse trees, dependency trees, or other recursive structures. Unlike traditional feedforward or recurrent neural networks, which process fixed-size or sequential inputs, RecNNs build representations for structured data by recursively applying the same set of operations at every node.

Hierarchical data is data that has a tree-like structure, such as the parse of a sentence or paragraph. RvNNs can be used to learn the relationships between the different parts of the data and to make predictions about it.

RvNNs are made up of a set of recursive units. Each recursive unit takes as input the current
node in the tree, and the outputs of its children. The recursive unit then produces an output for
the current node, which is used by its parent. This process continues until the root node of the
tree is reached.
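
For a binary tree, this bottom-up process can be sketched as follows (the node layout and weight names are hypothetical; the point is that the same shared composition is applied at every internal node):

```python
import numpy as np

def compose(node, W_left, W_right, b):
    """Recursively build a vector for each tree node from its children."""
    if node["children"] is None:                 # leaf: e.g. a word embedding
        return node["vector"]
    left = compose(node["children"][0], W_left, W_right, b)
    right = compose(node["children"][1], W_left, W_right, b)
    return np.tanh(W_left @ left + W_right @ right + b)  # shared weights at every level
```

The vector returned at the root summarizes the whole tree and can be passed to a classifier, for example to predict the sentiment of the sentence the tree parses.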

The outputs of the recursive units can be used to make predictions about the data. For example,
in the case of sentences, the outputs of the recursive units can be used to predict the next word
in the sentence.

A common application of recursive neural networks is sentiment analysis of natural language sentences, one of the most important tasks in Natural Language Processing (NLP): identifying the tone and sentiment of the writer in a particular sentence.

Advantages
- Tree structure and reduced network depth are two main advantages.
- RvNNs can manage hierarchical data, which is important for many real-world tasks.
- Because a balanced tree has logarithmic height, RvNNs can learn long-range dependencies in the data with relatively few composition steps.
The Challenge of Long-Term Dependencies
The challenge of long-term dependencies refers to the difficulty that arises when capturing and modeling relationships between distant elements in sequential data. In data such as natural language sentences or time series, long-term dependencies occur when the current element or event depends on elements or events that lie far in the past. However, traditional machine learning models, including simple feedforward neural networks, struggle to capture and learn such dependencies effectively.

The primary reason behind this challenge is the vanishing or exploding gradient problem. During
the training of neural networks, gradients are used to update the network's parameters based on
the error or loss. However, when backpropagating through many time steps or layers, the
gradients can exponentially diminish (vanishing gradients) or grow uncontrollably (exploding
gradients). This issue makes it challenging for the network to propagate information over long
time horizons, hindering its ability to capture long-term dependencies effectively.

Vanishing gradients occur when the gradients become extremely small, causing the weights to
be updated minimally or not at all. Consequently, the network fails to capture relevant
information from distant elements, and the impact of those elements on the current prediction
diminishes rapidly.

On the other hand, exploding gradients occur when the gradients become very large, leading to
unstable updates and difficulties in converging to an optimal solution. This issue can cause
training instability and prevent the network from effectively learning long-term dependencies.
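
The effect is easy to see numerically with a scalar toy recurrence (purely illustrative, not a real network): backpropagating through T steps of h_t = w * h_{t-1} multiplies the gradient by w at every step, so it scales like w ** T.

```python
for w in (0.9, 1.1):                 # recurrent weight just below / just above 1
    grad = 1.0
    for _ in range(100):             # backpropagate through 100 time steps
        grad *= w
    print(f"w = {w}: gradient after 100 steps = {grad:.3e}")
# w = 0.9 -> roughly 2.7e-05 (vanishing); w = 1.1 -> roughly 1.4e+04 (exploding)
```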

The challenge of long-term dependencies is particularly problematic in tasks where understanding the entire context is crucial, such as language modeling, machine translation, or speech recognition. In these tasks, capturing dependencies across long distances is essential for generating accurate predictions or understanding the meaning of the input.

Echo State Network


Echo state networks (ESNs) are a type of recurrent neural network (RNN) specifically designed to learn long-term dependencies. ESNs are particularly well suited to processing time-dependent or sequential data, such as time series analysis, signal processing, and dynamic system modeling.

The key idea behind Echo State Networks is the separation of the network into two components:
a fixed random reservoir and a trainable readout layer. The random reservoir is sparsely
connected and remains fixed throughout the training process. The readout layer, which is
trained using supervised learning, learns to map the reservoir's dynamics to the desired output.

ESNs consist of a reservoir and a readout layer. The reservoir is a recurrent neural network with
a large number of neurons and randomly initialized weights.
The reservoir is used to store information about the input sequence. The readout layer is used
to make predictions about the output sequence.

The main advantage of Echo State Networks lies in their simplicity and efficiency. The fixed and
randomly initialized reservoir eliminates the need to train the recurrent connections, reducing
the computational complexity and training time. Moreover, the separation of the fixed reservoir
and the trainable readout layer allows for efficient training even with limited labeled data.

Why should you use Echo State Networks?


● Echo State Networks do not suffer from the vanishing/exploding gradient problem.
● While traditional neural networks are computationally expensive, ESNs tend to be fast
due to the lack of a backpropagation phase on the reservoir.
● Echo State Networks are effective at handling chaotic time series.
● Before echo state networks were introduced, recurrent neural networks were hardly ever used in practice. This was because of the complexity of adjusting their recurrent connections in the absence of automatic differentiation, together with their susceptibility to vanishing and exploding gradients.

Working
The echo state network makes use of a very sparsely connected hidden layer (usually with about 1% connectivity). The connectivity and weights of the hidden neurons are fixed and assigned at random. The weights of the output neurons can be learned, enabling the network to produce or reproduce specific temporal patterns. The most interesting aspect of this network is that, despite its non-linear behavior, the only weights modified during training are those of the synapses connecting the hidden neurons to the output neurons.
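
A minimal sketch of this recipe (the sizes, the 1% sparsity, the spectral-radius scaling, and the ridge-regression readout in the final comment are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 200, 1                               # hypothetical sizes

# Fixed, sparse, random reservoir -- never trained.
W_res = rng.normal(size=(n_res, n_res))
W_res[rng.random((n_res, n_res)) > 0.01] = 0.0     # keep roughly 1% of the connections
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # spectral radius below 1
W_in = rng.normal(size=(n_res, n_in))

def run_reservoir(inputs):
    """Collect reservoir states for an input sequence of shape (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W_res @ x + W_in @ u)
        states.append(x)
    return np.stack(states)

# Only the linear readout is trained, e.g. by ridge regression on the states:
# W_out = targets.T @ states @ np.linalg.inv(states.T @ states + 1e-6 * np.eye(n_res))
```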

Leaky Units and Other Strategies for Multiple Time Scales


The goal is to deal with long-term dependencies. Strategies that are useful for building both fine and coarse time scales include (a leaky-unit sketch follows this list):
● Leaky units: Leaky units keep a running average of their own past state, using a self-connection with a weight close to one, so that information decays slowly and the unit operates on a coarse time scale.
● Skip connections: Skip connections directly link distant time steps in an RNN. They can help the network learn long-term dependencies by providing a path for information to flow from the distant past to the present.
● Gated units: Gated units have a gating mechanism that allows them to selectively forget information. This is helpful for learning long-term dependencies because it lets the network discard irrelevant information while retaining important information.
● LSTMs and GRUs: LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) networks are specialized RNN architectures designed to learn long-term dependencies. They do this by using gated units to control the flow of information through the network.
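
As referenced above, a leaky unit can be sketched as a running average of its own past state and a fresh candidate (hypothetical names; an alpha close to 1 gives a slow, coarse time scale, an alpha close to 0 a fast, fine one):

```python
import numpy as np

def leaky_unit_step(x_t, h_prev, W_x, W_h, b, alpha=0.95):
    """Leaky integration: keep a fraction alpha of the old state so that
    information decays slowly across many time steps."""
    candidate = np.tanh(W_x @ x_t + W_h @ h_prev + b)
    return alpha * h_prev + (1.0 - alpha) * candidate
```
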
The Long Short-Term Memory (LSTM) and GRU
LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) architecture that
is specifically designed to address the vanishing gradient problem and capture long-term
dependencies in sequential data. LSTMs are widely used in tasks such as natural language
processing, speech recognition, machine translation, and time series analysis.

A general LSTM unit is composed of a cell, an input gate, an output gate, and a forget gate. The
cell remembers values over arbitrary time intervals, and three gates regulate the flow of
information into and out of the cell.

The architecture of an LSTM network includes several key components that enable it to effectively capture and propagate information over long sequences (a minimal step sketch follows the list):
● Cell State: The cell state serves as the memory of the LSTM network. It carries
information across different time steps and allows the network to maintain information
over long distances.
● Input Gate: The input gate controls the flow of information into the cell state. It
determines which parts of the input and the previous hidden state are relevant to update
the cell state at the current time step. The input gate uses a sigmoid activation function
to generate values between 0 and 1, indicating the importance of each element.
● Forget Gate: The forget gate determines which information in the cell state should be
discarded or forgotten. It takes the previous hidden state and the current input as input
and produces a forget gate value between 0 and 1 for each element in the cell state. The
forget gate uses a sigmoid activation function to determine how much information should
be forgotten.
● Output Gate: The output gate controls the flow of information from the cell state to the
output and the next hidden state. It selects which parts of the cell state should be
outputted at the current time step. Similar to the input and forget gates, the output gate
uses a sigmoid activation function to determine the relevance of each element.
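
The interplay of the gates described above can be summarized in a short sketch (the parameter names are hypothetical, and each weight matrix is assumed to act on the concatenation of the previous hidden state and the current input):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: three gates regulate what enters, stays in,
    and leaves the cell state."""
    z = np.concatenate([h_prev, x_t])
    i = sigmoid(p["W_i"] @ z + p["b_i"])        # input gate
    f = sigmoid(p["W_f"] @ z + p["b_f"])        # forget gate
    o = sigmoid(p["W_o"] @ z + p["b_o"])        # output gate
    g = np.tanh(p["W_g"] @ z + p["b_g"])        # candidate cell update
    c_t = f * c_prev + i * g                    # update the cell state (memory)
    h_t = o * np.tanh(c_t)                      # expose part of it as the new hidden state
    return h_t, c_t
```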

Gated Recurrent Unit (GRU) is a variant of recurrent neural network (RNN) architecture that,
like LSTM, addresses the vanishing gradient problem and captures long-term dependencies in
sequential data. GRU simplifies the LSTM architecture by combining the forget and input gates
into a single update gate and merging the cell state and hidden state into a single state vector.

The architecture of a GRU network includes the following components:


● Update Gate (zt): The update gate controls the information flow from the previous
hidden state to the current hidden state. It determines how much of the previous hidden
state should be retained and how much should be updated with the current input. The
update gate uses a sigmoid activation function, and a value of 1 means to keep all the
previous hidden state, while a value of 0 means to discard it completely.
● Reset Gate (rt): The reset gate determines how much of the previous hidden state
should be forgotten and how much new information should be incorporated from the
current input. The reset gate uses a sigmoid activation function to decide which parts of
the previous hidden state should be ignored.

LSTM vs GRU

- An LSTM unit has three gates (input, forget, and output) and maintains a separate cell state alongside the hidden state.
- A GRU has two gates (update and reset) and merges the cell state and hidden state into a single state vector, which makes it a simpler architecture with fewer parameters.
- Both were designed to mitigate the vanishing gradient problem and capture long-term dependencies in sequential data.

Optimization for Long-Term Dependencies


There are two main ways to optimize for long-term dependencies (a gradient-clipping sketch follows the list):
● Gradient clipping: Gradient clipping is a technique that limits the size of the gradients
that are used to update the network's parameters. This can help to prevent the gradients
from exploding or vanishing, which can make it difficult for the network to learn long-term
dependencies.
● Regularization: Regularization is a technique that adds a penalty to the loss function to
prevent the network from overfitting the training data. This can help to improve the
generalization performance of the network, which can make it easier for the network to
learn long-term dependencies.
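
Gradient clipping, mentioned in the list above, can be sketched as a rescaling of the global gradient norm (the threshold of 5.0 is an arbitrary illustrative choice):

```python
import numpy as np

def clip_gradients(grads, max_norm=5.0):
    """Rescale a list of gradient arrays so their combined norm
    does not exceed max_norm, preventing a single exploding update."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        grads = [g * (max_norm / total_norm) for g in grads]
    return grads
```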

Explicit Memory
In traditional RNNs, the hidden state serves as an implicit form of memory that carries
information from previous time steps. However, the hidden state has limited capacity to retain
information over long sequences due to the vanishing gradient problem. As a result, the network
may struggle to capture long-term dependencies effectively.
Explicit memory mechanisms address this limitation by introducing an explicit memory
component that can store and access information over long time spans. This memory can be
accessed at any time step, allowing the network to explicitly retrieve important information from
the past.
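
One common way such a memory is accessed, for example in memory networks and neural Turing machines, is content-based addressing. The following is a minimal sketch; the dot-product similarity and softmax weighting are assumptions used for illustration rather than details given in the text:

```python
import numpy as np

def read_memory(query, memory):
    """Content-based read: compare the query with every stored slot and
    return a weighted average of the slots (memory has shape (slots, dim))."""
    scores = memory @ query                  # similarity of the query to each slot
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax attention over the slots
    return weights @ memory                  # the retrieved vector
```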

The explicit memory mechanisms provide several benefits to RNNs:


● Improved Long-Term Dependencies: By explicitly storing and accessing information
from the past, RNNs with explicit memory can better capture long-term dependencies in
sequential data.
● Enhanced Contextual Information: The ability to retrieve specific past information
allows the network to better contextualize the current input and make more informed
predictions or decisions.
● Increased Capacity: The explicit memory component increases the network's capacity
to store and process information, enabling it to handle longer sequences and more
complex dependencies.

Explicit memory mechanisms have been successfully applied in various tasks, including
machine translation, language modeling, question answering, and image captioning. They
provide a means to overcome the limitations of standard RNNs and capture dependencies over
longer sequences, ultimately improving the performance of the network on tasks involving
long-term information retention.
