
What is a Recurrent Neural Network

https://machinelearningmastery.com/an-introduction-to-recurrent-neural-networks-and-the-math-that-powers-them/

A recurrent neural network (RNN) is a special type of artificial neural network adapted to work with time series data or data that involves sequences. Ordinary feedforward neural networks are only meant for data points that are independent of each other. However, if we have data in a sequence such that one data point depends upon the previous data point, we need to modify the neural network to incorporate the dependencies between these data points. RNNs have a concept of ‘memory’ that helps them store the states or information of previous inputs to generate the next output of the sequence.

Unfolding A Recurrent Neural Network

Figure: Recurrent neural network. Compressed representation (top), unfolded network (bottom).

A simple RNN has a feedback loop as shown in the first diagram of the above figure. The feedback loop shown in the gray rectangle can be unrolled for 3 time steps to produce the second network of the above figure. Of course, you can vary the architecture so that the network unrolls for k time steps. In the figure, x_t denotes the input, h_t the hidden state, and y_t the output at time step t.

Hence, in the feedforward pass of an RNN, the network computes the values of the hidden units and the output after k time steps. The weights associated with the network are shared temporally. Each recurrent layer has two sets of weights: one for the input and one for the hidden unit. The last feedforward layer, which computes the final output for the kth time step, is just like an ordinary layer of a traditional feedforward network.

The Activation Function

We can use any activation function we like in the recurrent neural network. Common choices are the sigmoid, tanh, and ReLU functions.

Training A Recurrent Neural Network

The backpropagation algorithm of an artificial neural network is modified to include the unfolding in time to train the weights of the network. This algorithm is based on computing the gradient vector and is called backpropagation through time, or the BPTT algorithm for short. The pseudo-code for training is given below. The value of k can be selected by the user for training. In the pseudo-code below, p_t is the target value at time step t:

1. Repeat until the stopping criterion is met:
   1. Set all h to zero.
   2. Repeat for t = 0 to n-k:
      1. Forward propagate the network over the unfolded network for k time steps to compute all h and y.
      2. Compute the error as e = y_{t+k} - p_{t+k}.
      3. Backpropagate the error across the unfolded network and update the weights.
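To make the loop above concrete, here is a minimal NumPy sketch of truncated BPTT for a vanilla RNN with tanh hidden units and a linear output; the layer sizes, squared-error loss, and learning rate are illustrative assumptions rather than anything prescribed by the pseudo-code.

import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, k, lr = 1, 8, 3, 0.01           # input size, hidden size, unroll length, learning rate
Wx = rng.normal(0, 0.1, (d_h, d_in))       # input-to-hidden weights (shared across time)
Wh = rng.normal(0, 0.1, (d_h, d_h))        # hidden-to-hidden weights (shared across time)
Wy = rng.normal(0, 0.1, (1, d_h))          # hidden-to-output weights

def train_step(xs, p):
    """xs: list of k input column vectors, p: target for the last time step."""
    hs = [np.zeros((d_h, 1))]              # h_0 = 0
    for x in xs:                           # forward pass over the unfolded network
        hs.append(np.tanh(Wx @ x + Wh @ hs[-1]))
    y = Wy @ hs[-1]
    e = y - p                              # e = y_{t+k} - p_{t+k}

    # backward pass: push the error back through the k unrolled time steps
    dWx, dWh, dWy = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(Wy)
    dWy += e @ hs[-1].T
    dh = Wy.T @ e
    for t in reversed(range(k)):
        dz = dh * (1 - hs[t + 1] ** 2)     # derivative of tanh
        dWx += dz @ xs[t].T
        dWh += dz @ hs[t].T
        dh = Wh.T @ dz                     # propagate the error to the previous time step
    for W, dW in ((Wx, dWx), (Wh, dWh), (Wy, dWy)):
        W -= lr * dW                       # gradient-descent update of the shared weights
    return 0.5 * e.item() ** 2

# one illustrative update on a toy sequence of k = 3 inputs
xs = [np.array([[0.1]]), np.array([[0.2]]), np.array([[0.3]])]
print(train_step(xs, np.array([[0.4]])))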

Types of RNNs

There are different types of recurrent neural networks with varying


architectures. Some examples are:

One To One

Here there is a single (x_t, y_t) pair. Traditional neural networks employ a one-to-one architecture.

One To Many
In one-to-many networks, a single input at x_t can produce multiple outputs, e.g., (y_t0, y_t1, y_t2). Music generation is an example area where one-to-many networks are employed.

Many To One

In this case many inputs from different time steps produce a single output. For example, (x_t, x_t+1, x_t+2) can produce a single output y_t. Such networks are employed in sentiment analysis or emotion detection, where the class label depends upon a sequence of words.

Many To Many
There are many possibilities for many to many. An example is shown above, where two inputs produce three outputs. Many-to-many networks are applied in machine translation, e.g., English to French or vice versa translation systems. A short Keras sketch of these input/output patterns is given below.
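As a rough illustration of how these patterns map onto code, the sketch below builds a many-to-one and a many-to-many model in Keras; the vocabulary size, layer widths, and output dimensions are arbitrary assumptions, not part of the text above.

from tensorflow import keras
from tensorflow.keras import layers

# Many to one: e.g. sentiment classification of a whole sequence.
many_to_one = keras.Sequential([
    layers.Embedding(input_dim=1000, output_dim=64),
    layers.SimpleRNN(32),                          # returns only the last hidden state
    layers.Dense(1, activation="sigmoid"),         # single label for the whole sequence
])

# Many to many: e.g. producing one output per input time step.
many_to_many = keras.Sequential([
    layers.Embedding(input_dim=1000, output_dim=64),
    layers.SimpleRNN(32, return_sequences=True),   # returns one hidden state per time step
    layers.Dense(10, activation="softmax"),        # per-step predictions
])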

Advantages and Shortcomings Of RNNs

RNNs have various advantages such as:

● Ability to handle sequence data.


● Ability to handle inputs of varying lengths.
● Ability to store or ‘memorize’ historical information.

The disadvantages are:

● The computation can be very slow.


● The network does not take into account future inputs to make
decisions.
● Vanishing gradient problem, where the gradients used to compute the weight update may get very close to zero, preventing the network from learning new weights. The deeper the network, the more pronounced this problem is.

Different RNN Architectures


There are different variations of RNNs that are being applied practically in
machine learning problems:

Bidirectional recurrent neural networks (BRNN)

In BRNN, inputs from future time steps are used to improve the accuracy of
the network. It is like having knowledge of the first and last words of a
sentence to predict the middle words.

Gated Recurrent Units (GRU)

These networks are designed to handle the vanishing gradient problem.


They have a reset and update gate. These gates determine which
information is to be retained for future predictions.

Long Short Term Memory (LSTM)

LSTMs were also designed to address the vanishing gradient problem in RNNs. LSTMs use three gates, called the input, output and forget gates. Similar to GRU, these gates determine which information to retain.
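As a rough sketch of how these variants appear in practice, the Keras snippet below stacks a bidirectional GRU layer and an LSTM layer; the vocabulary size, layer widths, and output layer are illustrative assumptions, not a recommended architecture.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Embedding(input_dim=1000, output_dim=64),
    layers.Bidirectional(layers.GRU(32, return_sequences=True)),  # BRNN built from GRU cells
    layers.LSTM(32),                                              # LSTM layer with input, output and forget gates
    layers.Dense(1, activation="sigmoid"),
])
model.summary()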

Further Reading

This section provides more resources on the topic if you are looking to go
deeper.

Books

● Deep Learning Essentials, by Wei Di, Anurag Bhardwaj and Jianing Wei.
● Deep Learning, by Ian Goodfellow, Yoshua Bengio and Aaron Courville.
Introduction to Recurrent Neural Networks (RNNs)

RNNs are a powerful and robust type of neural network, and they are among the most promising algorithms in use because they are the only ones with an internal memory.

Like many other deep learning algorithms, recurrent neural networks


are relatively old. They were initially created in the 1980s, but only in
recent years have we seen their true potential. An increase in
computational power along with the massive amounts of data that we
now have to work with, and the invention of long short-term memory
(LSTM) in the 1990s, has really brought RNNs to the foreground.

Because of their internal memory, RNNs can remember important


things about the input they received, which allows them to be very
precise in predicting what’s coming next. This is why they’re the
preferred algorithm for sequential data like time series, speech, text,
financial data, audio, video, weather and much more. Recurrent neural
networks can form a much deeper understanding of a sequence and its
context compared to other algorithms.

WHAT ARE RECURRENT NEURAL NETWORKS (RNNS)?

Recurrent neural networks (RNNs) are a class of neural networks


that are helpful in modeling sequence data. Derived from
feedforward networks, RNNs exhibit similar behavior to how
human brains function. Simply put: recurrent neural networks
produce predictive results in sequential data that other algorithms
can’t.

But when do you need to use an RNN?


“Whenever there is a sequence of data and that temporal dynamics that
connects the data is more important than the spatial content of each
individual frame.” – Lex Fridman (MIT)

Since RNNs are being used in the software behind Siri and Google
Translate, recurrent neural networks show up a lot in everyday life.

How Do Recurrent Neural Networks Work?

To understand RNNs properly, you’ll need a working knowledge of


“normal” feed-forward neural networks and sequential data.

Sequential data is basically just ordered data in which related things


follow each other. Examples are financial data or the DNA sequence.
The most popular type of sequential data is perhaps time series data,
which is just a series of data points that are listed in time order.

RECURRENT VS. FEED-FORWARD NEURAL NETWORKS


RNNs and feed-forward neural networks get their names from the way
they channel information.

In a feed-forward neural network, the information only moves in one


direction — from the input layer, through the hidden layers, to the
output layer. The information moves straight through the network.

Feed-forward neural networks have no memory of the input they


receive and are bad at predicting what’s coming next. Because a feed-
forward network only considers the current input, it has no notion of
order in time. It simply can’t remember anything about what happened
in the past except its training.

In a RNN the information cycles through a loop. When it makes a


decision, it considers the current input and also what it has learned
from the inputs it received previously.

The two images below illustrate the difference in information flow


between a RNN and a feed-forward neural network.
A usual RNN has a short-term memory. In combination with an LSTM, it also has a long-term memory (more on that later).

Another good way to illustrate the concept of a recurrent neural


network’s memory is to explain it with an example: Imagine you have a
normal feed-forward neural network and give it the word “neuron” as
an input and it processes the word character by character. By the time
it reaches the character “r,” it has already forgotten about “n,” “e” and
“u,” which makes it almost impossible for this type of neural network to
predict which character would come next.

A recurrent neural network, however, is able to remember those


characters because of its internal memory. It produces output, copies
that output and loops it back into the network.

Simply put: Recurrent neural networks add the immediate past to the
present.
Therefore, a RNN has two inputs: the present and the recent past. This
is important because the sequence of data contains crucial information
about what is coming next, which is why a RNN can do things other
algorithms can’t.

A feed-forward neural network, like all other deep learning algorithms, assigns a weight matrix to its inputs and then produces the output. Note that RNNs apply weights to the current input and also to the previous hidden state (the recent past). Furthermore, a recurrent neural network adjusts these weights through both gradient descent and backpropagation through time.
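A minimal sketch of that update rule, assuming a tanh activation and hypothetical weight matrices W_x (for the current input) and W_h (for the previous hidden state):

import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: combine the current input with the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# toy dimensions, purely illustrative
rng = np.random.default_rng(1)
W_x, W_h, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):       # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_x, W_h, b)     # the same weights are reused at every step
print(h)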

Types of Recurrent Neural Networks

TYPES OF RECURRENT NEURAL NETWORKS (RNNS)

● One to One
● One to Many
● Many to One
● Many to Many

Also note that while feed-forward neural networks map one input to
one output, RNNs can map one to many, many to many (translation)
and many to one (classifying a voice).
RNN and Backpropagation Through Time

To understand the concept of backpropagation through time (BPTT)


you’ll need to understand the concepts of forward and backpropagation
first. We could spend an entire article discussing these concepts, so I
will attempt to provide as simple a definition as possible.

WHAT IS BACKPROPAGATION?

Backpropagation (BP or backprop, for short) is known as a workhorse algorithm in machine learning. Backpropagation is used for calculating the gradient of an error function with respect to a neural network’s weights. The algorithm works its way backwards through the various layers, computing the partial derivatives of the error with respect to the weights. These gradients are then used to adjust the weights and decrease the error during training.
In neural networks, you basically do forward-propagation to get the
output of your model and check if this output is correct or incorrect, to
get the error. Backpropagation is nothing but going backwards through
your neural network to find the partial derivatives of the error with
respect to the weights, which enables you to subtract this value from
the weights.

Those derivatives are then used by gradient descent, an algorithm that


can iteratively minimize a given function. Then it adjusts the weights up
or down, depending on which decreases the error. That is exactly how
a neural network learns during the training process.

So, with backpropagation you basically try to tweak the weights of your
model while training.
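In code, that tweak is just the gradient-descent update w := w - learning_rate * dE/dw; a tiny sketch with made-up numbers:

learning_rate = 0.1
w = 0.8                   # current value of one weight
dE_dw = 0.25              # partial derivative of the error with respect to that weight (from backprop)
w = w - learning_rate * dE_dw
print(w)                  # 0.775: the weight moves in the direction that reduces the error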

The image below illustrates the concept of forward propagation and


backpropagation in a feed-forward neural network:
BPTT is basically just a fancy buzzword for doing backpropagation on
an unrolled recurrent neural network. Unrolling is a visualization and
conceptual tool, which helps you understand what’s going on within the
network. Most of the time when implementing a recurrent neural
network in the common programming frameworks, backpropagation is
automatically taken care of, but you need to understand how it works
to troubleshoot problems that may arise during the development
process.

You can view a RNN as a sequence of neural networks that you train
one after another with backpropagation.

The image below illustrates an unrolled RNN. On the left, the RNN is
unrolled after the equal sign. Note there is no cycle after the equal sign
since the different time steps are visualized and information is passed
from one time step to the next. This illustration also shows why a RNN
can be seen as a sequence of neural networks.

An unrolled version of RNN

If you do BPTT, the conceptualization of unrolling is required since the


error of a given time step depends on the previous time step.

Within BPTT the error is back propagated from the last to the first
timestep, while unrolling all the timesteps. This allows calculating the
error for each timestep, which allows updating the weights. Note that
BPTT can be computationally expensive when you have a high number
of timesteps.

Two Issues of Standard RNNs

There are two major obstacles RNNs have had to deal with, but to
understand them, you first need to know what a gradient is.

A gradient is a partial derivative with respect to its inputs. If you don’t


know what that means, just think of it like this: a gradient measures
how much the output of a function changes if you change the inputs a
little bit.
You can also think of a gradient as the slope of a function. The higher
the gradient, the steeper the slope and the faster a model can learn. But
if the slope is zero, the model stops learning. A gradient simply
measures the change in all weights with regard to the change in error.

EXPLODING GRADIENTS

Exploding gradients are when the algorithm, without much reason,


assigns a stupidly high importance to the weights. Fortunately, this
problem can be easily solved by truncating or squashing the gradients.

VANISHING GRADIENTS

Vanishing gradients occur when the values of a gradient are too small
and the model stops learning or takes way too long as a result. This was
a major problem in the 1990s and much harder to solve than the
exploding gradients. Fortunately, it was solved through the concept of
LSTM by Sepp Hochreiter and Juergen Schmidhuber.

RNN and Long Short-Term Memory (LSTM)

Long short-term memory networks (LSTMs) are an extension for


recurrent neural networks, which basically extends the memory.
Therefore, it is well suited to learn from important experiences that
have very long time lags in between.

WHAT IS LONG SHORT-TERM MEMORY (LSTM)?

Long short-term memory (LSTM) networks are an extension of


RNN that extend the memory. LSTM are used as the building
blocks for the layers of a RNN. LSTMs assign data “weights” which
helps RNNs to either let new information in, forget information or
give it importance enough to impact the output.
The units of an LSTM are used as building units for the layers of a RNN,
often called an LSTM network.

LSTMs enable RNNs to remember inputs over a long period of time.


This is because LSTMs contain information in a memory, much like the
memory of a computer. The LSTM can read, write and delete
information from its memory.

This memory can be seen as a gated cell, with gated meaning the cell
decides whether or not to store or delete information (i.e., if it opens
the gates or not), based on the importance it assigns to the information.
The assigning of importance happens through weights, which are also
learned by the algorithm. This simply means that it learns over time
what information is important and what is not.

In a long short-term memory cell you have three gates: input, forget
and output gate. These gates determine whether or not to let new
input in (input gate), delete the information because it isn’t important
(forget gate), or let it impact the output at the current timestep (output
gate). Below is an illustration of a RNN with its three gates:
The gates in an LSTM are analog, in the form of sigmoids, meaning their values range from zero to one. The fact that they are analog (and therefore differentiable) is what enables backpropagation to work through them.
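A minimal sketch of those three gates for a single LSTM cell, with hypothetical per-gate weight matrices; real implementations such as Keras fuse these into one matrix multiply, so this is only meant to mirror the description above.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b are dicts of per-gate parameters (illustrative layout)."""
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # input gate: let new information in
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # forget gate: drop unimportant memory
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # output gate: expose memory to the output
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])   # candidate values for the memory cell
    c = f * c_prev + i * g                                  # updated memory cell
    h = o * np.tanh(c)                                      # new hidden state
    return h, c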

The problematic issue of vanishing gradients is solved through LSTM


because it keeps the gradients steep enough, which keeps the training
relatively short and the accuracy high.

Summary

https://builtin.com/data-science/recurrent-neural-networks-and-lstm
Now that you have a proper understanding of how a recurrent neural
network works, you can decide if it is the right algorithm to use for a
given machine learning problem.
A Brief Overview of Recurrent Neural Networks (RNN)

https://www.analyticsvidhya.com/blog/2022/03/a-brief-overview-of-recurrent-neural-networks-rnn/

Debasish Kalita. Published on March 11, 2022; last modified on March 24, 2022.


This article was published as a part of the Data Science Blogathon.

Apple’s Siri and Google’s voice search both use Recurrent Neural Networks

(RNNs), which are the state-of-the-art method for sequential data. It’s the first

algorithm with an internal memory that remembers its input, making it perfect

for problems involving sequential data in machine learning. It’s one of the

algorithms responsible for the incredible advances in deep learning over the

last few years. In this article, we’ll go over the fundamentals of recurrent neural

networks, as well as the most pressing difficulties and how to address them.

Introduction on Recurrent Neural Networks

A Deep Learning approach for modelling sequential data is Recurrent Neural

Networks (RNN). RNNs were the standard suggestion for working with

sequential data before the advent of attention models. Specific parameters for

each element of the sequence may be required by a deep feedforward model.

It may also be unable to generalize to variable-length sequences.



Recurrent Neural Networks use the same weights for each element of the

sequence, decreasing the number of parameters and allowing the model to

generalize to sequences of varying lengths. RNNs also generalize to structured data other than sequential data, such as geographical or graphical data, because of their design.

Recurrent neural networks, like many other deep learning techniques, are

relatively old. They were first developed in the 1980s, but we didn’t appreciate

their full potential until lately. The advent of long short-term memory (LSTM) in

the 1990s, combined with an increase in computational power and the vast

amounts of data that we now have to deal with, has really pushed RNNs to the

forefront.

What is a Recurrent Neural Network (RNN)?


Neural networks imitate the function of the human brain in the fields of AI,

machine learning, and deep learning, allowing computer programs to

recognize patterns and solve common issues.

RNNs are a type of neural network that can be used to model sequence data.

RNNs, which are formed from feedforward networks, are similar to human

brains in their behaviour. Simply said, recurrent neural networks can anticipate

sequential data in a way that other algorithms can’t.


All of the inputs and outputs in standard neural networks are independent of one another. However, in some circumstances, such as when predicting the next word of a phrase, the prior words are necessary, and so the previous words must be remembered. As a result, the RNN was created, which uses a hidden layer to overcome the problem. The most important component of an RNN is the hidden state, which remembers specific information about a sequence.

RNNs have a memory that stores all information about the calculations. They use the same parameters for each input because they perform the same task on every input or hidden layer to produce the output.

The Architecture of a Traditional RNN

RNNs are a type of neural network that has hidden states and allows past

outputs to be used as inputs. They usually go like this:


RNN architecture can vary depending on the problem you’re trying to solve, from those with a single input and output to those with many (with variations in between).

Below are some examples of RNN architectures that can help you better

understand this.

● One To One: There is only one input-output pair here. A one-to-one architecture is used in traditional neural networks.
● One To Many: A single input in a one-to-many network might result in numerous outputs. One-to-many networks are used in the production of music, for example.
● Many To One: In this scenario, a single output is produced by combining many inputs from distinct time steps. Sentiment analysis and emotion identification use such networks, in which the class label is determined by a sequence of words.
● Many To Many: For many to many, there are numerous options: for example, two inputs can yield three outputs. Machine translation systems, such as English to French or vice versa translation systems, use many-to-many networks.
How does Recurrent Neural Networks work?

The information in recurrent neural networks cycles through a loop to the

middle hidden layer.


The input layer x receives and processes the neural network’s input before

passing it on to the middle layer.

Multiple hidden layers can be found in the middle layer h, each with its own activation functions, weights, and biases. In an ordinary neural network, the parameters of the different hidden layers are independent of one another and the network has no memory of previous inputs.

A recurrent neural network standardizes the activation functions, weights, and biases so that each hidden layer has the same characteristics. Rather than constructing numerous hidden layers, it creates only one and loops over it as many times as necessary.

Common Activation Functions

A neuron’s activation function dictates whether it should be turned on or off.

Nonlinear functions usually transform a neuron’s output to a number between 0

and 1 or -1 and 1.


The following are some of the most commonly utilized functions:

● Sigmoid: expressed by the formula g(z) = 1/(1 + e^-z).
● Tanh: expressed by the formula g(z) = (e^z - e^-z)/(e^z + e^-z).
● ReLU: expressed by the formula g(z) = max(0, z).
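A quick NumPy sketch of those three functions, just to make the formulas concrete (np.tanh and the ready-made framework activations would normally be used instead):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))                               # squashes values into (0, 1)

def tanh(z):
    return (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))    # squashes values into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)                                     # zero for negative inputs, identity otherwise

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))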

Recurrent Neural Network Vs Feedforward Neural Network


A feed-forward neural network has only one route of information flow: from the

input layer to the output layer, passing through the hidden layers. The data

flows across the network in a straight route, never going through the same

node twice.

The information flow between an RNN and a feed-forward neural network is

depicted in the two figures below.

Feed-forward neural networks are poor at predicting what will happen next because they have no memory of the information they receive. Because it simply analyses the current input, a feed-forward network has no notion of temporal order. Apart from its training, it has no memory of what transpired in the past.

The information in an RNN cycles through a loop. Before making a judgment, the network evaluates the current input as well as what it has learned from past inputs. A recurrent neural network, in other words, can recall thanks to its internal memory: it produces output, copies it, and then returns it to the network.

Backpropagation Through Time (BPTT)

When we apply a Backpropagation algorithm to a Recurrent Neural Network

with time series data as its input, we call it backpropagation through time.

A single input is sent into the network at a time in a normal RNN, and a single

output is obtained. Backpropagation, on the other hand, uses both the current

and prior inputs as input. This is referred to as a timestep, and one timestep

will consist of multiple time series data points entering the RNN at the same

time.
Once the network has trained on a time set and given you an output, that output is used to calculate and collect the errors. The network is then rolled back up, and the weights are recalculated and adjusted to account for the errors.

Two issues of Standard RNNs

There are two key challenges that RNNs have had to overcome, but in order to

comprehend them, one must first grasp what a gradient is.


With regard to its inputs, a gradient is a partial derivative. If you’re not sure what that implies, consider this: a gradient quantifies how much the output of a function varies when the inputs are changed slightly.


A function’s slope is also known as its gradient. The higher the gradient, the steeper the slope and the faster a model can learn. The model, on the other hand, will stop learning if the slope is zero. A gradient is used to measure the change in all weights in relation to the change in error.

● Exploding Gradients: Exploding gradients occur when the algorithm gives


the weights an absurdly high priority for no apparent reason. Fortunately,
truncating or squashing the gradients is a simple solution to this problem.
● Vanishing Gradients: Vanishing gradients occur when the gradient
values are too small, causing the model to stop learning or take far too
long. This was a big issue in the 1990s, and it was far more difficult to
address than the exploding gradients. Fortunately, Sepp Hochreiter and
Juergen Schmidhuber’s LSTM concept solved the problem.

RNN Applications

Recurrent Neural Networks are used to tackle a variety of problems involving

sequence data. There are many different types of sequence data, but the

following are the most common: Audio, Text, Video, Biological sequences.

Using RNN models and sequence datasets, you may tackle a variety of

problems, including :

● Speech recognition
● Generation of music
● Automated Translations
● Analysis of video action
● Sequence study of the genome and DNA

Basic Python Implementation (RNN with Keras)


Import the required libraries

import numpy as np
import tensorflow as tf
from tensorflow import keras

from tensorflow.keras import layers

Here’s a simple Sequential model that processes integer sequences, embeds each integer into a 64-

dimensional vector, and then uses an LSTM layer to handle the sequence of vectors.

model = keras.Sequential()
model.add(layers.Embedding(input_dim=1000, output_dim=64))
model.add(layers.LSTM(128))
model.add(layers.Dense(10))
model.summary()

Output:

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 64) 64000
_________________________________________________________________
lstm (LSTM) (None, 128) 98816
_________________________________________________________________
dense (Dense) (None, 10) 1290
=================================================================
Total params: 164,106
Trainable params: 164,106
Non-trainable params: 0
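To actually train this model, you would compile it with a loss and optimizer and call fit; the optimizer, loss, and dummy data shapes below are illustrative assumptions and not part of the original example.

import numpy as np

# dummy integer sequences and labels, purely for illustration
x_train = np.random.randint(0, 1000, size=(256, 20))   # 256 sequences of length 20
y_train = np.random.randint(0, 10, size=(256,))        # 10 target classes

model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),  # Dense(10) outputs logits
    metrics=["accuracy"],
)
model.fit(x_train, y_train, batch_size=32, epochs=1)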
Conclusion
● Recurrent Neural Networks are a versatile tool that can be used in a
variety of situations. They’re employed in a variety of methods for
language modeling and text generators. They’re also employed in voice
recognition.
● This type of neural network is used to create labels for images that aren’t
tagged when paired with Convolutional Neural Networks. It’s incredible
how well this combination works.
● However, there is one flaw with recurrent neural networks. They have
trouble learning long-range dependencies, which means they don’t
comprehend relationships between data that are separated by several
steps.
● When anticipating words, for example, we may require more context than simply one prior word. The difficulty of learning such long-range dependencies stems from the vanishing gradient problem, and it is addressed using a special type of Recurrent Neural Network called Long Short-Term Memory (LSTM) networks, a larger topic that will be discussed in future articles.

Difference between ANN, CNN and RNN

Artificial Neural Network (ANN):

An Artificial Neural Network (ANN) is a group of multiple perceptrons or neurons at each layer. An ANN is also known as a feed-forward neural network because inputs are processed only in the forward direction.

This type of neural network is one of the simplest variants of neural networks. It passes information in one direction, through various input nodes, until it reaches the output node. The network may or may not have hidden node layers, making its functioning more interpretable.

Advantages:

● Storing information on the entire network.

● Ability to work with incomplete knowledge.

● Having fault tolerance.

● Having a distributed memory.

Disadvantages:

● Hardware dependence.

● Unexplained behavior of the network.

● Determination of proper network structure.

Convolutional Neural Network (CNN):


Convolutional neural networks (CNN) are one of the most popular

models used today. This neural network computational model uses a

variation of multilayer perceptrons and contains one or more

convolutional layers that can be either entirely connected or pooled.

These convolutional layers create feature maps that record a region of

image which is ultimately broken into rectangles and sent out for

nonlinear processing.

Advantages:

● Very High accuracy in image recognition problems.

● Automatically detects the important features without any

human supervision.

● Weight sharing.

Disadvantages:

● CNNs do not encode the position and orientation of objects.

● Lack of ability to be spatially invariant to the input data.

● Lots of training data is required.

Recurrent Neural Network (RNN):


Recurrent neural networks (RNN) are more complex. They save the output of processing nodes and feed the result back into the model (they do not pass the information in one direction only). This is how the model is said to learn to predict the outcome of a layer. Each node in the RNN model acts as a memory cell, continuing the computation and implementation of operations. If the network’s prediction is incorrect, then the system self-learns and continues working towards the correct prediction during backpropagation.

Advantages:

● An RNN remembers every piece of information through time. It is useful in time series prediction because of its ability to remember previous inputs as well. This is called Long Short-Term Memory.
● Recurrent neural networks are even used with convolutional layers to extend the effective pixel neighborhood.

Disadvantages:

● Gradient vanishing and exploding problems.

● Training an RNN is a very difficult task.


● It cannot process very long sequences when using tanh or ReLU as the activation function.

Summary of all three networks in a single table:

● Type of data: ANN handles tabular and text data; CNN handles image data; RNN handles sequence data.
● Parameter sharing: ANN, no; CNN, yes; RNN, yes.
● Fixed-length input: ANN, yes; CNN, yes; RNN, no.
● Recurrent connections: ANN, no; CNN, no; RNN, yes.
● Vanishing and exploding gradients: ANN, yes; CNN, yes; RNN, yes.
● Spatial relationship: ANN, no; CNN, yes; RNN, no.
● Performance: ANN is considered less powerful than CNN and RNN; CNN is considered more powerful than ANN and RNN; RNN offers less feature compatibility compared to CNN.
● Applications: ANN is used for facial recognition, text digitization and natural language processing; CNN for facial recognition and computer vision; RNN for text-to-speech conversion.
● Main advantages: ANN has fault tolerance and the ability to work with incomplete knowledge; CNN offers high accuracy in image recognition problems and weight sharing; RNN remembers every piece of information and suits time series prediction.
● Disadvantages: ANN has hardware dependence and unexplained behavior of the network; CNN needs large training data and does not encode the position and orientation of objects; RNN suffers from vanishing and exploding gradients.

https://www.geeksforgeeks.org/difference-between-ann-cnn-and-rnn/?ref=rp
