DL Mod4
COMPUTATIONAL GRAPHS
• A computational graph is a directed graph where the nodes
correspond to operations or variables.
• Variables can feed their value into operations, and operations can
feed their output into other operations.
• Computational graphs can be used to represent mathematical
expressions. For deep learning models, this acts as a descriptive
language, providing a functional description of the required
computation.
In general, a computational graph is a directed graph used for
expressing and evaluating mathematical expressions.
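The idea can be sketched as a tiny expression graph. This is an illustrative sketch, not any library's API: the `Node` class and its fields are assumptions made for the example.

```python
# A minimal computational-graph sketch: nodes are either variables
# (holding a value) or operations (holding a function and references
# to their input nodes).

class Node:
    def __init__(self, value=None, op=None, inputs=()):
        self.value = value      # set for variables, computed for operations
        self.op = op            # callable for operation nodes
        self.inputs = inputs    # upstream nodes feeding this operation

    def evaluate(self):
        # Variables return their stored value; operations recursively
        # evaluate their inputs first (a post-order graph traversal).
        if self.op is None:
            return self.value
        return self.op(*(n.evaluate() for n in self.inputs))

# Graph for the expression (x + y) * z with x=2, y=3, z=4
x, y, z = Node(2.0), Node(3.0), Node(4.0)
add = Node(op=lambda a, b: a + b, inputs=(x, y))
mul = Node(op=lambda a, b: a * b, inputs=(add, z))
print(mul.evaluate())  # 20.0
```

Evaluating the root node traverses the graph from the leaves up, which is exactly how a framework evaluates a forward pass.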
RECURRENT NEURAL NETWORK
• A computational graph can compute the training loss of a recurrent
network that maps an input sequence of x values to a corresponding
sequence of output o values.
• A loss L measures how far each o is from the corresponding training target
y.
• When using softmax outputs, we assume o gives the unnormalized log
probabilities.
• The loss L internally computes yˆ = softmax(o) and compares this to the
target y.
• The RNN has input-to-hidden connections parametrized by a weight
matrix U, hidden-to-hidden recurrent connections parametrized by a
weight matrix W, and hidden-to-output connections parametrized by a
weight matrix V.
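The loss computation described above can be sketched in NumPy. The matrix names U, W, V and the softmax/cross-entropy loss follow the text; the dimensions, random data, and tanh activation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, d_out, T = 3, 4, 2, 5              # illustrative sizes (assumed)

U = rng.normal(scale=0.1, size=(d_h, d_in))   # input-to-hidden
W = rng.normal(scale=0.1, size=(d_h, d_h))    # hidden-to-hidden
V = rng.normal(scale=0.1, size=(d_out, d_h))  # hidden-to-output

xs = rng.normal(size=(T, d_in))               # input sequence x
ys = rng.integers(0, d_out, size=T)           # target class y at each step

h = np.zeros(d_h)
loss = 0.0
for t in range(T):
    h = np.tanh(U @ xs[t] + W @ h)            # hidden-state recurrence
    o = V @ h                                 # unnormalized log-probabilities
    p = np.exp(o - o.max()); p /= p.sum()     # y_hat = softmax(o)
    loss += -np.log(p[ys[t]])                 # compare y_hat to target y

print(loss / T)  # average per-step cross-entropy loss
```

Note that the same U, W, V are reused at every time step, which is the parameter sharing discussed below.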
RECURRENT NEURAL NETWORK
• A Recurrent Neural Network (RNN) is a type of neural network where the output from the
previous step is fed as input to the current step.
• In traditional neural networks, all the inputs and outputs are independent of each other,
but when we need to predict the next word of a sentence, the previous words are
required, and hence there is a need to remember them.
• Thus the RNN came into existence, solving this issue with the help of a hidden state.
• The main and most important feature of RNN is its Hidden state, which remembers
some information about a sequence.
• The state is also referred to as Memory State since it remembers the previous input to
the network.
• It uses the same parameters for each input, as it performs the same task on all the inputs
or hidden states to produce the output. This parameter sharing reduces the number of
parameters, unlike other neural networks.
• The Recurrent Neural Network consists of multiple fixed
activation function units, one for each time step.
• Each unit has an internal state which is called the hidden
state of the unit.
• This hidden state signifies the past knowledge that the
network currently holds at a given time step.
• This hidden state is updated at every time step to signify
the change in the knowledge of the network about the past.
• The hidden state is updated using the following recurrence
relation:
h_t = f(h_{t-1}, x_t)
• The formula for calculating the current state (with a tanh
activation, using the weight matrices U and W defined earlier):
h_t = tanh(W h_{t-1} + U x_t)
• The output at time step t is then o_t = V h_t.
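A single hidden-state update can be sketched as below; the sizes and the tanh activation are illustrative assumptions.

```python
import numpy as np

d_in, d_h = 3, 4                             # illustrative sizes (assumed)
rng = np.random.default_rng(1)
U = rng.normal(scale=0.1, size=(d_h, d_in))  # input-to-hidden weights
W = rng.normal(scale=0.1, size=(d_h, d_h))   # hidden-to-hidden weights

def step(h_prev, x):
    """Recurrence: h_t = tanh(W @ h_{t-1} + U @ x_t)."""
    return np.tanh(W @ h_prev + U @ x)

h = np.zeros(d_h)          # initial hidden state
x = rng.normal(size=d_in)  # one input
h = step(h, x)
print(h.shape)  # (4,)
```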
• Training through RNN
1.A single-time step of the input is provided to the network.
2.Then calculate its current state using a set of current input and the
previous state.
3.The current state h_t becomes h_{t-1} for the next time step.
4.One can unroll as many time steps as the problem requires and join the
information from all the previous states.
5.Once all the time steps are completed the final current state is used
to calculate the output.
6.The output is then compared to the actual (target) output and the
error is generated.
7.The error is then back-propagated through the network to update the
weights, and thus the network (RNN) is trained
using Backpropagation Through Time (BPTT).
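The training steps above can be sketched end to end for a tiny many-to-one RNN. This is a minimal illustration, not a production implementation: the sizes, the tanh activation, the scalar squared-error loss, and the learning rate are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_h, T = 2, 3, 4                    # illustrative sizes (assumed)
U = rng.normal(scale=0.5, size=(d_h, d_in))
W = rng.normal(scale=0.5, size=(d_h, d_h))
v = rng.normal(scale=0.5, size=d_h)       # hidden-to-output (scalar output)
xs = rng.normal(size=(T, d_in))
target = 1.0

# Steps 1-5: forward pass, storing every hidden state for the backward pass
hs = [np.zeros(d_h)]
for t in range(T):
    hs.append(np.tanh(U @ xs[t] + W @ hs[-1]))
y = v @ hs[-1]                            # output from the final state

# Step 6: compare to the target and generate the error
loss = 0.5 * (y - target) ** 2

# Step 7: back-propagate the error through every time step (BPTT)
dW = np.zeros_like(W)
dU = np.zeros_like(U)
dh = (y - target) * v                     # dL/dh_T
for t in reversed(range(T)):
    dz = dh * (1 - hs[t + 1] ** 2)        # back through tanh
    dW += np.outer(dz, hs[t])             # same W at every step, so sum
    dU += np.outer(dz, xs[t])
    dh = W.T @ dz                         # pass gradient back to h_{t-1}

lr = 0.1                                  # gradient-descent update
W -= lr * dW
U -= lr * dU
print(round(float(loss), 4))
```

Because the same W and U are used at every step, their gradients accumulate across all time steps, which is what "through time" refers to.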
• Advantages of Recurrent Neural Network
1.An RNN remembers information through time, which makes it
useful in time-series prediction, since predictions can take
previous inputs into account. (The Long Short-Term Memory
variant extends this memory over long sequences.)
2.Recurrent neural networks are even used with
convolutional layers to extend the effective pixel
neighborhood.
• Disadvantages of Recurrent Neural Network
1.Gradient vanishing and exploding problems.
2.Training an RNN is a very difficult task.
3.It cannot process very long sequences when using tanh or ReLU
as the activation function.
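The vanishing/exploding problem can be shown numerically: back-propagating through T steps multiplies the gradient by the transpose of W (times tanh derivatives, which are at most 1) at every step, so its norm scales roughly like ||W||^T. The sizes and weight scales below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d_h, T = 4, 50                        # illustrative sizes (assumed)

def grad_norm_through_time(scale):
    W = rng.normal(scale=scale, size=(d_h, d_h))
    g = np.ones(d_h)                  # gradient arriving at the last step
    for _ in range(T):
        g = W.T @ g                   # repeated multiplication by W^T
                                      # (tanh derivatives <= 1 would only
                                      # shrink it further)
    return np.linalg.norm(g)

print(grad_norm_through_time(0.05))   # small weights: gradient vanishes
print(grad_norm_through_time(2.0))    # large weights: gradient explodes
```

This is why gradient clipping and gated architectures such as LSTMs are used in practice.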
• Applications of Recurrent Neural Network
1.Language Modelling and Generating Text
2.Speech Recognition
3.Machine Translation
4.Image Recognition, Face detection
5.Time series Forecasting
• Types Of RNN
• There are four types of RNNs based on the number of
inputs and outputs in the network.
1.One to One
2.One to Many
3.Many to One
4.Many to Many
• One to One
• This type of RNN behaves like a simple neural network and is
also known as a Vanilla Neural Network. In this network, there
is only one input and one output.
• One To Many
• In this type of RNN, there is one input and many outputs associated with it.
One of the most common examples of this network is image captioning, where
given an image we predict a sentence of multiple words.
• Many to One
• In this type of network, many inputs are fed to the network at several states
of the network, generating only one output. This type of network is used in
problems like sentiment analysis, where we give multiple words as input
and predict only the sentiment of the sentence as output.
• Many to Many
• In this type of neural network, there are multiple inputs and multiple
outputs corresponding to a problem. One example is language
translation: we provide multiple words from one language as input
and predict multiple words from the second language as output.
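The many-to-one and many-to-many patterns above differ only in which hidden states are read out. The sketch below shows this over one sequence; the sizes and random data are illustrative assumptions (one-to-one is just an ordinary feed-forward network, so it is omitted).

```python
import numpy as np

d_in, d_h, d_out = 3, 4, 2                   # illustrative sizes (assumed)
rng = np.random.default_rng(4)
U = rng.normal(scale=0.1, size=(d_h, d_in))
W = rng.normal(scale=0.1, size=(d_h, d_h))
V = rng.normal(scale=0.1, size=(d_out, d_h))

xs = rng.normal(size=(5, d_in))              # a sequence of 5 inputs
h, hs = np.zeros(d_h), []
for x in xs:
    h = np.tanh(U @ x + W @ h)               # same recurrence as before
    hs.append(h)

many_to_one = V @ hs[-1]                     # read only the final state,
                                             # e.g. a sentiment prediction
many_to_many = [V @ h for h in hs]           # read every state, e.g. one
                                             # translated word per step
print(many_to_one.shape, len(many_to_many))  # (2,) 5
```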
• Variation Of Recurrent Neural Network (RNN)
• To overcome problems like the vanishing and exploding gradients,
several new advanced versions of RNNs have been developed, some of
which are:
1.Bidirectional Neural Network (BiNN)
2.Long Short-Term Memory (LSTM)
• Bidirectional Neural Network (BiNN)
• A BiNN is a variation of a Recurrent Neural Network in which
the input information flows in both directions, and the
outputs of both directions are combined to produce the output.
A BiNN is useful in situations when the context of the input is
important, such as NLP tasks and time-series analysis
problems.
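The bidirectional idea can be sketched as two passes over the sequence, one forward and one backward, whose hidden states are combined at each step. The sizes, random weights, and the choice of concatenation as the combination rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
d_in, d_h, T = 3, 4, 6                        # illustrative sizes (assumed)
Uf, Wf = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))  # forward RNN
Ub, Wb = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))  # backward RNN
xs = rng.normal(size=(T, d_in))

def scan(U, W, seq):
    """Run the standard recurrence over seq, returning all hidden states."""
    h, out = np.zeros(d_h), []
    for x in seq:
        h = np.tanh(U @ x + W @ h)
        out.append(h)
    return out

fwd = scan(Uf, Wf, xs)                        # left-to-right pass
bwd = scan(Ub, Wb, xs[::-1])[::-1]            # right-to-left pass, re-aligned
combined = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
print(combined[0].shape)  # (8,): each step sees past and future context
```

At every position the combined state carries information from both earlier and later inputs, which is what makes the architecture useful when the surrounding context matters.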
• Long Short-Term Memory (LSTM)