Unit-2 Part-2
Recursive Nets
By
Rashmi A.R.
Recurrent Neural Networks (RNNs)
• RNNs are a family of neural networks for processing sequential data.
• Much as a convolutional network is a neural network specialized for
processing a grid of values X, such as an image, a recurrent neural network
is a neural network specialized for processing a sequence of values
x(1), . . . , x(τ).
• An RNN is a type of sequential model specifically designed to work on
sequential data.
• RNNs are widely used in the NLP domain and are well suited to sequence data.
Why use RNN?
• Plain ANNs cannot be used directly on sequential data.
• When all the inputs are text, the neural network will not understand the
text, so we need to vectorize it first.
Problems:
1. Textual data may be of different sizes.
2. Zero padding introduces unnecessary computation (see the sketch below).
3. Prediction will fail when the input size is big.
4. The sequence information is totally disregarded (semantic meaning is not
maintained/retained).
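As a minimal sketch of problems 1 and 2 (assuming toy, made-up token IDs rather than a real vectorizer), the snippet below pads variable-length sentences with zeros so a fixed-size ANN can accept them, wasting computation on meaningless zeros:

```python
import numpy as np

# Toy tokenized sentences (made-up integer token IDs), each a different length.
sentences = [
    [4, 12, 7],       # e.g. "nice to meet"
    [4, 12, 7, 9],    # e.g. "nice to meet you"
    [3],              # e.g. "hi"
]

# A plain ANN needs a fixed input size, so every sentence is
# zero-padded up to the length of the longest one.
max_len = max(len(s) for s in sentences)
padded = np.array([s + [0] * (max_len - len(s)) for s in sentences])

print(padded)
# [[ 4 12  7  0]
#  [ 4 12  7  9]
#  [ 3  0  0  0]]
# The zeros carry no meaning but are still multiplied through the
# network (unnecessary computation), and word order is still not modeled.
```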
Applications:
• Sentiment analysis
• Sentence completion
• Image captioning
Unfolding computational graphs
Unfolding (unrolling) computational graphs is a key concept in Recurrent Neural
Networks (RNNs): the recurrent loop is expanded into a chain of copies of the
same computation, one per time step. This unrolling helps in visualizing and
understanding how the RNN processes sequential data by passing information from
one time step to the next.
Benefits of Unfolding:
• Parameter Sharing: The same parameters are used at each time step,
allowing the model to generalize across different sequence lengths.
• Backpropagation Through Time (BPTT): Once the graph is unfolded,
standard backpropagation can be applied to compute gradients across time
steps. This process is called Backpropagation Through Time (BPTT),
where gradients flow both forward in time (during the forward pass) and
backward in time (during the backward pass).
Visual Representation:
Imagine an RNN as a looped structure, where the hidden state h_t feeds into the
next step. Unfolding breaks this loop and stretches it into a linear chain of
computations.
Why Unfold?
Unfolding is essential because it allows the recurrent model, which operates on
a sequence, to be trained using standard neural network training techniques.
Without unfolding, the recurrence would make it difficult to compute gradients
and update parameters effectively.
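As a minimal sketch of unfolding (assuming the common update h_t = tanh(W_x x_t + W_h h_{t-1} + b); all sizes and values below are illustrative), the loop is the unrolled graph, with the same parameters reused at every time step:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 5          # illustrative sizes

# One shared set of parameters, reused at every time step (parameter sharing).
W_x = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b = np.zeros(hidden_dim)

xs = rng.normal(size=(T, input_dim))        # toy input sequence x(1), ..., x(T)
h = np.zeros(hidden_dim)                    # initial hidden state

# Unfolding: each loop iteration is one copy of the same computation,
# chained to the next through the hidden state h.
hidden_states = []
for t in range(T):
    h = np.tanh(W_x @ xs[t] + W_h @ h + b)
    hidden_states.append(h)

print(np.stack(hidden_states).shape)        # (5, 4): one hidden state per step
```

Once unrolled like this, BPTT is just ordinary backpropagation applied to the resulting chain.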
Recurrent Neural Network (RNN)
• RNNs are a special class of neural networks with memory-like features built in.
• Past inputs are remembered, which is why they work well on sequential data.
Types of RNNs
Popular recurrent neural network architecture variants include:
• Standard RNNs
• Bidirectional recurrent neural networks (BRNNs)
• Long short-term memory (LSTM)
• Gated recurrent units (GRUs)
• Encoder-decoder RNN
Figure: Feedforward Neural Network (FNN)
Figure: Recurrent Neural Network (RNN)
Recurrent Neural Network (RNN) (Cont’d)
2. Long-Term Dependencies:
Difficulty in capturing long-term dependencies: standard RNNs
struggle to capture long-range dependencies due to the vanishing gradient
problem, making it challenging to learn from data where dependencies span
many time steps (illustrated below).
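A toy numeric illustration of the vanishing gradient (the 0.8 per-step factor is an assumption for illustration, standing in for the recurrent weight magnitude times the tanh derivative, which is at most 1): the gradient reaching a step far in the past shrinks exponentially with distance:

```python
# In BPTT, the gradient flowing back through k time steps is multiplied by
# roughly one factor per step (recurrent weight times activation derivative).
# If that factor is below 1, the gradient decays exponentially.
factor = 0.8  # illustrative per-step shrinkage (assumption, not measured)
for steps_back in (1, 10, 50, 100):
    print(steps_back, factor ** steps_back)
# 1   0.8
# 10  ~0.107
# 50  ~1.4e-05
# 100 ~2.0e-10  -> a dependency 100 steps back barely influences learning
```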
Bi-directional Recurrent Neural Network (BRNN)
• A bidirectional recurrent neural network (RNN) is a type of recurrent
neural network (RNN) that processes input sequences in both forward and
backward directions.
• This allows the RNN to capture information from the input sequence that
may be relevant to the output prediction but would be lost in a traditional
RNN that only processes the input sequence in one direction.
• This allows the network to consider information from the past and future
when making predictions rather than just relying on the input data at the
current time step.
• This can be useful for tasks such as language processing, where
understanding the context of a word or phrase can be important for making
accurate predictions.
• In general, bidirectional RNNs can help improve a model's performance on
various sequence-based tasks.
• This means that the network has two separate RNNs:
• One that processes the input sequence from left to right
• Another one that processes the input sequence from right to left.
• Both RNNs are applied, and at every time step each RNN produces its own
output.
• Finally, we concatenate the two outputs.
Equations:
We have 2 RNNs, one per direction. A common formulation (the exact
activations may vary) is:
Forward:  h_t^(f) = tanh(W_f x_t + U_f h_{t-1}^(f) + b_f)
Backward: h_t^(b) = tanh(W_b x_t + U_b h_{t+1}^(b) + b_b)
Output:   y_t = g(V [h_t^(f) ; h_t^(b)] + c), where [ ; ] denotes concatenation.
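A minimal numpy sketch of the BRNN forward pass matching the equations above (sizes, random weights, and the tanh choice are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_h, T = 3, 4, 6                      # illustrative sizes

def rnn_pass(xs, W, U, b):
    """Run a simple tanh RNN over xs and return the hidden state at each step."""
    h, states = np.zeros(d_h), []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        states.append(h)
    return states

def new_params():
    # Fresh weights for each direction (the two RNNs do not share parameters).
    return (rng.normal(size=(d_h, d_in)) * 0.1,
            rng.normal(size=(d_h, d_h)) * 0.1,
            np.zeros(d_h))

xs = [rng.normal(size=d_in) for _ in range(T)]   # toy input sequence

h_fwd = rnn_pass(xs, *new_params())              # left to right
h_bwd = rnn_pass(xs[::-1], *new_params())[::-1]  # right to left, re-aligned

# At every time step, concatenate the two directions' hidden states.
h_cat = [np.concatenate([f, bwd]) for f, bwd in zip(h_fwd, h_bwd)]
print(h_cat[0].shape)                            # (8,) = 2 * d_h
```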
Bi-directional Recurrent Neural Network (BRNN)
Need for Bidirectional Recurrent Neural Networks
• In a standard RNN, the output at a particular time step depends on the
input at that time step as well as the inputs that come before it. However,
in some cases, the output at a particular time step may also depend on the
inputs that come after it. In such cases, Bidirectional RNNs are used to
capture the dependencies in both directions.
• The main need for Bidirectional RNNs arises in sequential data processing
tasks where the context of the data is important. For instance, in natural
language processing, the meaning of a word in a sentence may depend on
the words that come before and after it. Similarly, in speech recognition, the
current sound may depend on the previous and upcoming sounds.
• By processing the input sequence in both directions, Bidirectional RNNs
help to capture these dependencies and improve the accuracy of predictions.
BRNNs improve upon traditional RNNs
Figure: Input sequence → Output sequence
3 challenges:
1. Input: a sentence in some language (e.g., English) → variable length.
2. Output: a sentence in another language (e.g., Hindi) → variable length.
3. No guarantee that 4 English words (e.g., "Nice to meet you") will be
translated into exactly 4 Hindi words (see the sketch after this list).
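These three challenges motivate the encoder-decoder RNN listed earlier among the variants. A minimal hedged sketch (all weights, sizes, and the fixed 3-step stopping rule are illustrative assumptions; real decoders emit a stop token): the encoder compresses a variable-length input into one context vector, and the decoder generates an output of a different length from it:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4                                        # illustrative vector size

def step(x, h, W, U):
    return np.tanh(W @ x + U @ h)

W_enc, U_enc = rng.normal(size=(d, d)) * 0.1, rng.normal(size=(d, d)) * 0.1
W_dec, U_dec = rng.normal(size=(d, d)) * 0.1, rng.normal(size=(d, d)) * 0.1

# Encoder: read a 4-word source sentence ("Nice to meet you", toy vectors)
# into a single fixed-size context vector.
source = [rng.normal(size=d) for _ in range(4)]
h = np.zeros(d)
for x in source:
    h = step(x, h, W_enc, U_enc)
context = h

# Decoder: generate target words one at a time from the context; the number
# of output steps need not equal the input length (here 3, a toy stop rule).
h, y, target = context, np.zeros(d), []
for _ in range(3):
    h = step(y, h, W_dec, U_dec)
    y = h                                    # toy "output word" = hidden state
    target.append(y)

print(len(source), "source words ->", len(target), "target words")
```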