UNIT5
Sequence Modeling:
The recurrence states that the current hidden state h(t) is a function f of the previous
hidden state h(t-1) and the current input x(t):

h(t) = f(h(t-1), x(t); theta)

where theta denotes the parameters of f. The network typically learns to use h(t) as a kind
of lossy summary of the task-relevant aspects of the past sequence of inputs up to time t.
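A minimal sketch of this recurrence in Python (the tanh nonlinearity and the parameter names W_hh, W_xh and b are illustrative assumptions, not fixed by the text):

    import numpy as np

    def rnn_step(h_prev, x_t, W_hh, W_xh, b):
        # One application of the recurrence h(t) = f(h(t-1), x(t); theta),
        # where f is assumed to be an affine map followed by tanh and
        # theta = (W_hh, W_xh, b).
        return np.tanh(W_hh @ h_prev + W_xh @ x_t + b)

    rng = np.random.default_rng(0)
    W_hh, W_xh, b = rng.normal(size=(4, 4)), rng.normal(size=(4, 3)), np.zeros(4)
    h = np.zeros(4)
    for x_t in rng.normal(size=(5, 3)):      # a sequence of 5 inputs, each of dimension 3
        h = rnn_step(h, x_t, W_hh, W_xh, b)  # h is a lossy summary of x(1), ..., x(t)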
Unfolding maps the graph on the left to the graph on the right in the figure below (both are
computational graphs of an RNN without the output o), where the black square indicates that
an interaction takes place with a delay of one time step, from the state at time t to the
state at time t + 1.
Computing the loss at each time step as the negative log-likelihood of y(t) given
x(1), . . . , x(t), and summing these over all time steps, gives the loss for the whole
sequence as shown in (10.7):

L = sum_t L(t) = - sum_t log p_model( y(t) | x(1), . . . , x(t) )
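A small sketch of this summation (the per-step probabilities and target labels below are toy values, and the class-probability representation is an assumption for illustration):

    import numpy as np

    def sequence_nll(step_probs, targets):
        # Loss (10.7): sum over t of -log p_model(y(t) | x(1), ..., x(t)).
        # step_probs[t] is the model's predicted distribution at time t,
        # already conditioned on x(1), ..., x(t); targets[t] is the observed y(t).
        return sum(-np.log(step_probs[t][targets[t]]) for t in range(len(targets)))

    probs = [np.array([0.9, 0.1]), np.array([0.2, 0.8]), np.array([0.6, 0.4])]
    print(sequence_nll(probs, [0, 1, 0]))  # total loss = sum of three per-step losses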
Artificial neural networks that do not have looping nodes are called feedforward neural
networks. Because information is passed only forward, this kind of neural network is also
referred to as a multi-layer neural network.
In a feedforward neural network, information moves unidirectionally from the input layer,
through any hidden layers that are present, to the output layer. These networks are
appropriate for tasks such as image classification, where each input is handled
independently of the others. Nevertheless, their inability to retain previous inputs makes
them less useful for sequential data analysis.
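For contrast with the recurrent case, a minimal feedforward pass (the ReLU hidden layer and the parameter names are assumptions for illustration); each call is independent, so nothing about earlier inputs is retained:

    import numpy as np

    def feedforward(x, W1, b1, W2, b2):
        # Information flows one way: input -> hidden -> output, with no loops
        # and no state carried over between calls.
        h = np.maximum(0.0, W1 @ x + b1)  # hidden layer (ReLU assumed)
        return W2 @ h + b2                # output layer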
Bi-directional Recurrent Neural Network
A bidirectional recurrent neural network (BRNN) is a neural network architecture designed
to process sequential data. BRNNs process input sequences in both the forward and backward
directions, so that the network can use information from both the past and the future
context in its predictions. This is the main distinction between BRNNs and conventional
recurrent neural networks.
There are four types of RNNs based on the number of inputs and outputs
in the network.
1. One to One
2. One to Many
3. Many to One
4. Many to Many
One to One
This type of RNN behaves the same as a simple neural network and is also known as a Vanilla
Neural Network. In this network, there is only one input and one output.
Bidirectional RNNs
A BRNN has two distinct recurrent hidden layers, one of which processes the input sequence
forward and the other of which processes it backward. The outputs from these hidden layers
are then combined and fed into a final layer that makes the prediction. The recurrent hidden
layers can be built from any recurrent neural network cell, such as Long Short-Term Memory
(LSTM) or Gated Recurrent Unit (GRU) cells.
In the forward direction, the BRNN functions like a conventional recurrent neural network,
updating the hidden state at each time step based on the current input and the previous
hidden state. The backward hidden layer, on the other hand, processes the input sequence in
the reverse order, updating the hidden state based on the current input and the hidden state
of the next time step.
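This mechanism can be sketched as follows (f_cell and b_cell stand for any recurrent cell function taking the previous hidden state and the current input, for example the simple tanh cell above or an LSTM/GRU cell with its parameters bound; the names are illustrative):

    import numpy as np

    def brnn_forward(xs, f_cell, b_cell, h0_f, h0_b):
        # xs = [x(1), ..., x(T)]; returns one combined vector per time step.
        T = len(xs)
        h_fwd, h_bwd = [None] * T, [None] * T

        h = h0_f
        for t in range(T):                 # forward layer: left to right
            h = f_cell(h, xs[t])           # uses the previous hidden state
            h_fwd[t] = h

        h = h0_b
        for t in reversed(range(T)):       # backward layer: right to left
            h = b_cell(h, xs[t])           # uses the hidden state of the next step
            h_bwd[t] = h

        # combine both directions at each step for the final prediction layer
        return [np.concatenate([h_fwd[t], h_bwd[t]]) for t in range(T)]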
Compared to conventional unidirectional recurrent neural networks, the
accuracy of the BRNN is improved since it can process information in
both directions and account for both past and future contexts. Because
the two hidden layers can complement one another and give the final
prediction layer more data, using two distinct hidden layers also offers a
type of model regularisation.
To update the model parameters, gradients are computed for both the forward and backward
passes of backpropagation through time, the technique typically used to train BRNNs. At
inference time, the BRNN processes the input sequence in a single forward pass, and
predictions are made from the combined outputs of the two hidden layers.
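As a usage sketch (assuming PyTorch is available; the sizes, the LSTM cell choice, and the classification head are illustrative assumptions), training such a model with backpropagation through time might look like:

    import torch
    import torch.nn as nn

    # Bidirectional LSTM: outputs from both directions are concatenated,
    # so the prediction layer sees 2 * hidden_size features per step.
    rnn = nn.LSTM(input_size=16, hidden_size=32, batch_first=True, bidirectional=True)
    head = nn.Linear(2 * 32, 5)                 # prediction layer over 5 classes

    x = torch.randn(8, 20, 16)                  # 8 sequences, 20 steps, 16 features
    targets = torch.randint(0, 5, (8, 20))      # toy per-step labels
    outputs, _ = rnn(x)                         # shape (8, 20, 64): both directions per step
    logits = head(outputs)                      # per-step predictions
    loss = nn.functional.cross_entropy(logits.reshape(-1, 5), targets.reshape(-1))
    loss.backward()                             # gradients for both directions via BPTT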
The main reason why long-term dependencies are hard to learn is that
RNNs suffer from the vanishing or exploding gradient problem. This
means that the gradient, which is the signal that tells the network how to
update its weights, becomes either very small or very large as it
propagates through the network. When the gradient vanishes, the network
cannot learn from the distant inputs, and when it explodes, the network
becomes unstable and produces erratic outputs. This problem is caused by
the repeated multiplication of the same matrix, which represents the
connections between the hidden units, at each time step.
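A toy illustration of why this happens (a symmetric random matrix is used here only so that its largest singular value equals its spectral radius and the numbers stay exact; this choice is an assumption for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(32, 32))
    base = (A + A.T) / 2
    base /= np.linalg.norm(base, ord=2)          # rescale so the largest singular value is 1

    for scale in (0.9, 1.1):
        W = scale * base                         # recurrent weight matrix
        jac = np.linalg.matrix_power(W, 50)      # product of 50 identical factors,
                                                 # as in backpropagation over 50 time steps
        print(scale, np.linalg.norm(jac, ord=2)) # ~5e-3 (vanishes) vs ~1.2e2 (explodes)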
Explicit memory
Explicit memory is also called declarative memory because we consciously try to recall a
specific event or piece of information. Things we intentionally try to recall or remember,
such as formulas and dates, are all stored in explicit memory. We use recalled information
like this during everyday activities such as work or running errands.
MRI studies show that during the recall of explicit short-term memories, the prefrontal
cortex, the most recently evolved part of the mammalian brain, is activated. Interestingly,
there appears to be a separation of function between the left and right sides of the
prefrontal cortex, with the right side more involved in spatial working memory and the left
in verbal working memory.