
RECURRENT NEURAL NETWORKS (RNNs)

CHAPTER 1
INTRODUCTION

1.1 WHAT IS A NEURAL NETWORK?

A Neural Network consists of different layers connected to each other, modeled on the structure and function of the human brain. It learns from huge volumes of data and uses complex algorithms to train the network.

Here is an example of how neural networks can identify a dog's breed based on its features.

 The image pixels of two different breeds of dogs are fed to the input layer of the
neural network.
 The image pixels are then processed in the hidden layers for feature extraction.
 The output layer produces the result to identify if it’s a German Shepherd or a
Labrador.
 Such networks do not need to memorize past outputs.

Several types of neural networks can help solve different business problems. Let’s look at a few of them.

 Feed-Forward Neural Network: Used for general Regression and Classification problems.
 Convolutional Neural Network: Used for object detection and image
classification.
 Deep Belief Network: Used in healthcare sectors for cancer detection.
 RNN: Used for speech recognition, voice recognition, time series prediction, and
natural language processing.


1.2 WHAT IS A RECURRENT NEURAL NETWORK (RNN)?


Recurrent neural networks (RNNs) are the state-of-the-art algorithm for sequential data and are used by Apple's Siri and Google's voice search. The RNN is the first algorithm that remembers its input, thanks to an internal memory, which makes it perfectly suited for machine learning problems that involve sequential data.

An RNN works on the principle of saving the output of a particular layer and feeding it back to the input in order to predict the layer's output.

Below is how you can convert a Feed-Forward Neural Network into a Recurrent
Neural Network:

Fig: Simple Recurrent Neural Network


1.3 WHY RECURRENT NEURAL NETWORKS?


RNNs were created because feed-forward neural networks have a few limitations:

 Cannot handle sequential data
 Considers only the current input
 Cannot memorize previous inputs

The solution to these issues is the RNN. An RNN can handle sequential data, accepting the current input as well as previously received inputs. RNNs can memorize previous inputs due to their internal memory.

1.4 RNN VS. FEED-FORWARD NEURAL NETWORKS

RNNs and feed-forward neural networks get their names from the way they channel information.

In a feed-forward neural network, the information only moves in one direction: from the input layer, through the hidden layers, to the output layer. The information moves straight through the network and never touches a node twice.


Feed-forward neural networks have no memory of the input they receive and are
bad at predicting what’s coming next. Because a feed-forward network only considers the
current input, it has no notion of order in time. It simply can’t remember anything about
what happened in the past except its training.

In an RNN, the information cycles through a loop. When it makes a decision, it considers the current input and also what it has learned from the inputs it received previously.

The two images below illustrate the difference in information flow between an RNN and a feed-forward neural network.

A usual RNN has a short-term memory. In combination with an LSTM, it also has a long-term memory (more on that later).

Another good way to illustrate the concept of a recurrent neural network's memory is to explain it with an example:

Imagine you have a normal feed-forward neural network and give it the word
"neuron" as an input and it processes the word character by character. By the time it
reaches the character "r," it has already forgotten about "n," "e" and "u," which makes it
almost impossible for this type of neural network to predict which character would come
next.

A recurrent neural network, however, is able to remember those characters because of its internal memory. It produces output, copies that output and loops it back into the network.


Simply put: recurrent neural networks add the immediate past to the present. Therefore, an RNN has two inputs: the present and the recent past. This is important because the sequence of data contains crucial information about what is coming next, which is why an RNN can do things other algorithms can’t.

A feed-forward neural network assigns, like all other deep learning algorithms, a
weight matrix to its inputs and then produces the output. Note that RNNs apply weights to
the current and also to the previous input. Furthermore, a recurrent neural network will
also tweak the weights for both through gradient descent and backpropagation through
time (BPTT).

How Do Recurrent Neural Networks Work?

In recurrent neural networks, the information cycles through a loop to the middle hidden layer.

The input layer ‘x’ takes in the input to the neural network, processes it, and passes it on to the middle layer.

The middle layer ‘h’ can consist of multiple hidden layers, each with its own activation functions, weights, and biases. If you have a neural network where the parameters of the different hidden layers are not affected by the previous layer, i.e., the neural network has no memory, then a recurrent neural network can be used to introduce that memory.


The Recurrent Neural Network will standardize the different activation functions
and weights and biases so that each hidden layer has the same parameters. Then, instead
of creating multiple hidden layers, it will create one and loop over it as many times as
required.

Formula for calculating the current state:

h_t = f(h_{t-1}, x_t)

where:

h_t -> current state
h_{t-1} -> previous state
x_t -> input state

Formula for applying the activation function (tanh):

h_t = tanh(W_hh * h_{t-1} + W_xh * x_t)

where:

W_hh -> weight at the recurrent neuron
W_xh -> weight at the input neuron

Formula for calculating the output:

y_t = W_hy * h_t

where:

y_t -> output
W_hy -> weight at the output layer
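
The following minimal sketch (illustrative NumPy code, not from the report; the dimensions and weight shapes are assumptions) shows how these formulas combine into a single recurrent step that is looped over a sequence:

import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, W_hy, h0):
    """Run a simple RNN over a sequence, one time step at a time."""
    h = h0
    outputs = []
    for x_t in x_seq:
        # current state: h_t = tanh(W_hh * h_{t-1} + W_xh * x_t)
        h = np.tanh(W_hh @ h + W_xh @ x_t)
        # output: y_t = W_hy * h_t
        outputs.append(W_hy @ h)
    return outputs, h

# Example with assumed sizes: 3 input features, 4 hidden units, 2 outputs
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 3))
W_hh = rng.normal(size=(4, 4))
W_hy = rng.normal(size=(2, 4))
x_seq = [rng.normal(size=3) for _ in range(5)]   # a sequence of 5 time steps
y_seq, h_last = rnn_forward(x_seq, W_xh, W_hh, W_hy, h0=np.zeros(4))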


 Training through RNN


1. A single time step of the input is provided to the network.
2. Its current state is then calculated from the current input and the previous state.
3. The current state ht becomes ht-1 for the next time step.
4. One can go through as many time steps as the problem requires and join the information from all the previous states.
5. Once all the time steps are completed, the final current state is used to calculate the output.
6. The output is then compared to the actual output, i.e., the target output, and the error is generated.
7. The error is then back-propagated through the network to update the weights, and hence the network (RNN) is trained. A minimal training sketch follows below.
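
As a hedged illustration of these seven steps (a sketch using PyTorch's built-in RNN; the data, sizes, and hyperparameters are placeholders, not from the report):

import torch
import torch.nn as nn

# A small recurrent model: an RNN layer followed by a linear readout layer
rnn = nn.RNN(input_size=3, hidden_size=8, batch_first=True)
readout = nn.Linear(8, 1)
optimizer = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(16, 10, 3)   # 16 sequences, 10 time steps, 3 features (dummy data)
y = torch.randn(16, 1)       # one target per sequence (dummy targets)

for epoch in range(100):
    out, h_last = rnn(x)              # steps 1-5: run through all time steps
    y_pred = readout(out[:, -1, :])   # use the final state to calculate the output
    loss = loss_fn(y_pred, y)         # step 6: compare to the target output
    optimizer.zero_grad()
    loss.backward()                   # step 7: back-propagate the error (BPTT)
    optimizer.step()                  # update the weights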


CHAPTER 2
APPLICATIONS OF RECURRENT NEURAL NETWORKS
2.1 IMAGE CAPTIONING
RNNs are used to caption an image by analyzing the activities present.

2.2 TIME SERIES PREDICTION


Any time series problem, like predicting the prices of stocks in a particular month,
can be solved using an RNN.

2.3 NATURAL LANGUAGE PROCESSING


Text mining and Sentiment analysis can be carried out using an RNN for Natural
Language Processing (NLP).


2.4 MACHINE TRANSLATION

Given an input in one language, RNNs can be used to translate the input into
different languages as output.

2.5 TYPES OF RECURRENT NEURAL NETWORKS


There are four types of Recurrent Neural Networks:

1. One to One
2. One to Many
3. Many to One
4. Many to Many


 One to One RNN

This type of neural network is known as the Vanilla Neural Network. It's used for general machine learning problems that have a single input and a single output.


 One to Many RNN

This type of neural network has a single input and multiple outputs. An example of this is image captioning.


 Many to One RNN

This RNN takes a sequence of inputs and generates a single output. Sentiment
analysis is a good example of this kind of network where a given sentence can be
classified as expressing positive or negative sentiments.


 Many to Many RNN

This RNN takes a sequence of inputs and generates a sequence of outputs. Machine translation is one example; a short sketch contrasting these four configurations follows.
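
A minimal sketch (illustrative PyTorch code; the layer sizes are assumptions, not from the report) showing how the four configurations differ only in how many input and output time steps are used:

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

# Many to Many: a sequence in, one output per time step (e.g., machine translation)
x = torch.randn(1, 5, 8)        # (batch, time steps, features)
out, h = rnn(x)                 # out has shape (1, 5, 16): one vector per step

# Many to One: keep only the last step's output (e.g., sentiment analysis)
last = out[:, -1, :]            # shape (1, 16)

# One to Many: feed one input, then loop the output back in (e.g., image captioning)
to_input = nn.Linear(16, 8)     # maps a hidden output back to the input size
step = torch.randn(1, 1, 8)
h = None
for _ in range(5):
    step_out, h = rnn(step, h)  # generate the next step from the previous state
    step = to_input(step_out)   # feed the output back in as the next input

# One to One: no recurrence is needed; a plain (vanilla) network suffices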


CHAPTER 3
ISSUES OF STANDARD RNNS

3.1. VANISHING GRADIENT PROBLEM

Recurrent Neural Networks enable you to model time-dependent and sequential data problems, such as stock market prediction, machine translation, and text generation. You will find, however, that RNNs are hard to train because of the gradient problem.

RNNs suffer from the problem of vanishing gradients. The gradients carry
information used in the RNN, and when the gradient becomes too small, the parameter
updates become insignificant. This makes the learning of long data sequences difficult.


3.2. EXPLODING GRADIENT PROBLEM


While training a neural network, if the slope tends to grow exponentially instead
of decaying, this is called an Exploding Gradient. This problem arises when large error
gradients accumulate, resulting in very large updates to the neural network model weights
during the training process.

Long training time, poor performance, and bad accuracy are the major issues caused by gradient problems.
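
A small numerical sketch (illustrative NumPy code; the weight scales are assumptions) of why gradients vanish or explode: back-propagating through many time steps multiplies the gradient by the recurrent Jacobian again and again, so its norm tends to shrink with small weights and grow with large ones.

import numpy as np

def gradient_norm_after(steps, weight_scale, hidden=8, seed=0):
    """Back-propagate a gradient through `steps` tanh-RNN steps and return its norm."""
    rng = np.random.default_rng(seed)
    W_hh = weight_scale * rng.normal(size=(hidden, hidden))
    grad = np.ones(hidden)
    for _ in range(steps):
        h = np.tanh(rng.normal(size=hidden))    # a stand-in hidden state
        grad = W_hh.T @ (grad * (1 - h ** 2))   # one step backwards through tanh
    return np.linalg.norm(grad)

for steps in (5, 20, 50):
    print(steps,
          gradient_norm_after(steps, weight_scale=0.1),   # tends toward 0: vanishing
          gradient_norm_after(steps, weight_scale=2.0))   # tends to blow up: exploding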

3.3 GRADIENT PROBLEM SOLUTIONS

Now, let’s discuss the most popular and efficient way to deal with gradient problems, i.e., Long Short-Term Memory networks (LSTMs).

 First, let’s understand Long-Term Dependencies.

Suppose you want to predict the last word in the text: “The clouds are in the ______.”

The most obvious answer to this is the “sky.” We do not need any further context
to predict the last word in the above sentence.

Consider this sentence: “I have been staying in Spain for the last 10 years…I can
speak fluent ______.”


The word you predict will depend on the previous few words in context. Here, you
need the context of Spain to predict the last word in the text, and the most suitable answer
to this sentence is “Spanish.” The gap between the relevant information and the point
where it's needed may have become very large. LSTMs help you solve this problem.

 Backpropagation Through Time:

Backpropagation through time is when we apply the backpropagation algorithm to a recurrent neural network that has time series data as its input.

In a typical RNN, one input is fed into the network at a time, and a single output is obtained. But in backpropagation through time, you use the current as well as the previous inputs. This is called a timestep, and one timestep will consist of many time series data points entering the RNN simultaneously.

Once the neural network has trained on a timestep and given you an output, that output is used to calculate and accumulate the errors. After this, the network is rolled back up, and the weights are recalculated and updated with the errors in mind.


 Long Short-Term Memory Networks:


LSTMs are a special kind of RNN, capable of learning long-term dependencies. Remembering information for long periods is their default behavior.

All RNNs have the form of a chain of repeating modules of a neural network. In standard RNNs, this repeating module has a very simple structure, such as a single tanh layer.

Fig: Long Short-Term Memory Networks

LSTMs also have a chain-like structure, but the repeating module has a slightly different structure. Instead of a single neural network layer, there are four interacting layers communicating with each other.


3.4 WORKING OF LSTMS IN AN RNN

LSTMs work in a three-step process.

Step 1: Decide How Much Past Data It Should Remember


The first step in the LSTM is to decide which information should be omitted from the cell
in that particular time step. The sigmoid function determines this. It looks at the previous
state (ht-1) along with the current input xt and computes the function.


Consider the following two sentences:

Let the output of h(t-1) be “Alice is good in Physics. John, on the other hand, is good at
Chemistry.”

Let the current input at x(t) be “John plays football well. He told me yesterday over the
phone that he had served as the captain of his college football team.”

The forget gate realizes there might be a change in context after encountering the first full
stop. It compares with the current input sentence at x(t). The next sentence talks about
John, so the information on Alice is deleted. The position of the subject is vacated and
assigned to John.

Step 2: Decide How Much This Unit Adds to the Current State
In the second layer, there are two parts. One is the sigmoid function, and the other is the tanh function. The sigmoid function decides which values to let through (0 or 1). The tanh function gives weightage to the values that are passed, deciding their level of importance (-1 to 1).

With the current input at x(t), the input gate analyzes the important information —
John plays football, and the fact that he was the captain of his college team is important.

“He told me yesterday over the phone” is less important; hence it's forgotten. This process
of adding some new information can be done via the input gate.

Step 3: Decide What Part of the Current Cell State Makes It to the Output
The third step is to decide what the output will be. First, we run a sigmoid layer, which
decides what parts of the cell state make it to the output. Then, we put the cell state
through tanh to push the values to be between -1 and 1 and multiply it by the output of the
sigmoid gate.


Let’s consider this example to predict the next word in the sentence: “John played
tremendously well against the opponent and won for his team. For his contributions,
brave ____ was awarded player of the match.”

There could be many choices for the empty space. The current input brave is an adjective,
and adjectives describe a noun. So, “John” could be the best output after brave.
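
A minimal sketch of a single LSTM cell (illustrative NumPy code using the standard gate equations; the weight names and sizes are assumptions, not from the report) that mirrors the three steps above:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b hold parameters for the f, i, g, o parts."""
    # Step 1: the forget gate decides how much of the past cell state to keep
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])
    # Step 2: the input gate (sigmoid) and candidate values (tanh) decide what to add
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])
    c = f * c_prev + i * g                      # updated cell state
    # Step 3: the output gate decides what part of the cell state reaches the output
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])
    h = o * np.tanh(c)                          # new hidden state / output
    return h, c

# Example with assumed sizes: 3 input features, 4 hidden units
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(4, 3)) for k in "figo"}
U = {k: rng.normal(size=(4, 4)) for k in "figo"}
b = {k: np.zeros(4) for k in "figo"}
h, c = lstm_step(rng.normal(size=3), np.zeros(4), np.zeros(4), W, U, b)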



CHAPTER 4
ADVANTAGES & DISADVANTAGES

4.1 ADVANTAGES OF RNN’S

 The principal advantage of an RNN over an ANN is that an RNN can model a sequence of records (i.e., a time series) so that each sample can be assumed to be dependent on the previous ones.
 Recurrent neural networks are even used with convolutional layers to extend the effective pixel neighborhood.


4.2 DISADVANTAGES OF RNN’S

 Gradient exploding and vanishing problems.
 Training an RNN is a very difficult task.
 It cannot process very long sequences if tanh or ReLU is used as the activation function.


CHAPTER 5
CONCLUSION
 Recurrent Neural Networks stand at the foundation of the modern-day marvels of artificial intelligence. They provide solid foundations for artificial intelligence applications to be more efficient, more flexible in their accessibility, and, most importantly, more convenient to use.
 Moreover, the results of work on recurrent neural networks show the real value of data in this day and age. They show how much can be extracted from records and what this information can create in return. And that is highly inspiring.



