2111CS010077 Deep Learning

Deep learning is a subset of machine learning that mimics human brain functions through algorithms to make decisions based on data. It includes various architectures such as Convolutional Neural Networks (CNNs) for image processing, Long Short Term Memory Networks (LSTMs) for time series predictions, and Generative Adversarial Networks (GANs) for data generation. Each architecture has unique features and applications, contributing to advancements in fields like image recognition, natural language processing, and data visualization.

Deep learning can be defined as a method of machine learning and artificial intelligence that is intended to imitate humans and their actions, based on certain human brain functions, in order to make effective decisions. It is a very important element of data science that channels its modeling through data-driven techniques under predictive modeling and statistics. To drive such a human-like ability to adapt, learn, and function accordingly, there have to be some strong driving forces, which we popularly call algorithms.

Deep learning algorithms are designed to run through several layers of neural networks, which are essentially decision-making networks pre-trained to serve a task. Each layer learns a simple representation of the data and passes it on to the next layer. Most traditional machine learning, by contrast, works fairly well only on datasets with up to a few hundred features or columns. Whether the data set is structured or unstructured, such methods tend to fail on raw, high-dimensional inputs: a simple RGB image of dimension 800x1000 already carries 800 x 1000 x 3 = 2.4 million values, which is unfeasible for a traditional machine learning algorithm to handle at that depth. This is where deep learning comes in.

1. Convolutional Neural Networks (CNNs):

CNNs, popularly known as ConvNets, consist of several layers and are used mainly for image processing and object detection. The first CNN was developed in 1998 by Yann LeCun and was called LeNet; back then, it was built to recognize digits and zip-code characters. CNNs are widely used for identifying satellite images, medical image processing, time-series forecasting, and anomaly detection.

CNN architecture:
A Convolutional Neural Network consists of multiple layers, such as the input layer, convolutional layer, pooling layer, and fully connected layers.
Simple CNN architecture

The Convolutional layer applies filters to the input image to extract features, the Pooling layer
downsamples the image to reduce computation, and the fully connected layer makes the final
prediction. The network learns the optimal filters through backpropagation and gradient descent.
Layers used to build ConvNets
A complete Convolutional Neural Network architecture is also known as a covnet. A covnet is a sequence of layers, and every layer transforms one volume of activations into another through a differentiable function.
Types of layers:
Let's take an example by running a covnet on an image of dimension 32 x 32 x 3.
 Input Layers: This is the layer in which we give input to our model. In a CNN, the input is generally an image or a sequence of images. This layer holds the raw input image with width 32, height 32, and depth 3.
 Convolutional Layers: This layer is used to extract features from the input dataset. It applies a set of learnable filters, known as kernels, to the input images. The filters/kernels are small matrices, usually of shape 2×2, 3×3, or 5×5. Each kernel slides over the input image data and computes the dot product between the kernel weights and the corresponding input image patch. The outputs of this layer are referred to as feature maps. If we use a total of 12 filters for this layer, we get an output volume of dimension 32 x 32 x 12.
 Activation Layer: By adding an activation function to the output of the preceding layer, activation layers add nonlinearity to the network. The function is applied element-wise to the output of the convolution layer. Some common activation functions are ReLU (max(0, x)), Tanh, and Leaky ReLU. The volume remains unchanged, so the output volume still has dimensions 32 x 32 x 12.
 Pooling Layer: This layer is periodically inserted into the covnet. Its main function is to reduce the size of the volume, which makes computation faster, reduces memory use, and also helps prevent overfitting. Two common types of pooling layers are max pooling and average pooling. If we use max pooling with 2 x 2 filters and stride 2, the resultant volume will be of dimension 16 x 16 x 12.

Image source: cs231n.stanford.edu

 Flattening: After the convolution and pooling layers, the resulting feature maps are flattened into a one-dimensional vector so they can be passed into a fully connected layer for classification or regression.
 Fully Connected Layers: This layer takes the input from the previous layer and computes the final classification or regression output.
Image source: cs231n.stanford.edu

 Output Layer: The output from the fully connected layers is fed into a logistic function such as sigmoid or softmax for classification tasks, which converts the raw score of each class into a probability. A minimal code sketch of such a network follows this list.
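To make the layer walkthrough above concrete, here is a minimal sketch of the 32 x 32 x 3 covnet example in PyTorch (an assumed framework choice; the kernel size, padding, and the 10 output classes are illustrative, while the 12 filters and the 2 x 2 max pool follow the dimensions used above):

```python
import torch
import torch.nn as nn

# Minimal covnet for a 32 x 32 x 3 input, following the dimensions used above.
class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(in_channels=3, out_channels=12,
                              kernel_size=3, padding=1)    # 32 x 32 x 3 -> 32 x 32 x 12
        self.act = nn.ReLU()                               # element-wise activation, shape unchanged
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # 32 x 32 x 12 -> 16 x 16 x 12
        self.fc = nn.Linear(16 * 16 * 12, num_classes)     # fully connected layer for the final prediction

    def forward(self, x):
        x = self.pool(self.act(self.conv(x)))
        x = torch.flatten(x, start_dim=1)                  # flattening step
        return self.fc(x)                                  # raw class scores; softmax is applied by the loss

model = SimpleCNN()
dummy = torch.randn(1, 3, 32, 32)         # one RGB image of size 32 x 32
print(model(dummy).shape)                 # torch.Size([1, 10])
```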

2. Long Short Term Memory Networks (LSTMs):

LSTMs are Recurrent Neural Networks (RNNs) that are designed to learn long-term dependencies. They can memorize and recall past data over long periods, and this is their default behavior. Because they retain information over time, LSTMs are widely used in time-series prediction, where a memory of previous inputs matters. Their chain-like structure consists of four interacting layers that communicate with each other in different ways. Besides time-series prediction, they can be used to build speech recognizers, support drug development in pharmaceuticals, and compose music loops.

LSTMs work in a sequence of steps. First, they forget irrelevant details retained from the previous state. Next, they selectively update certain cell-state values, and finally they output selected parts of the cell state. Below is a diagram of their operation.
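In addition to such a diagram, a rough code sketch can help. The example below assumes PyTorch and uses made-up dimensions (hidden size, window length, batch size); it applies an LSTM to one-step-ahead time-series prediction, the kind of use case mentioned above:

```python
import torch
import torch.nn as nn

# Hypothetical one-step-ahead forecaster: reads a window of past values,
# carries a cell state across timesteps, and predicts the next value.
class LSTMForecaster(nn.Module):
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                 # x: (batch, seq_len, 1)
        out, (h, c) = self.lstm(x)        # h and c carry the retained memory
        return self.head(out[:, -1, :])   # use the last hidden state to predict the next value

model = LSTMForecaster()
window = torch.randn(8, 20, 1)            # batch of 8 sequences, 20 past timesteps each
print(model(window).shape)                # torch.Size([8, 1])
```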

3. Recurrent Neural Network (RNN):


A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step. In traditional neural networks, all the inputs and outputs are independent of each other. However, in cases where we need to predict the next word of a sentence, the previous words are required, and hence there is a need to remember them. Thus RNNs came into existence, solving this issue with the help of a hidden layer. The main and most important feature of an RNN is its hidden state, which remembers some information about a sequence. This state is also referred to as the memory state, since it remembers previous inputs to the network. The RNN uses the same parameters for each input, as it performs the same task on all the inputs or hidden layers to produce the output. This reduces the number of parameters, unlike other neural networks.

Recurrent Neuron and RNN Unfolding


The fundamental processing unit in a Recurrent Neural Network (RNN) is a Recurrent Unit, which is
not explicitly called a “Recurrent Neuron.” This unit has the unique ability to maintain a hidden state,
allowing the network to capture sequential dependencies by remembering previous inputs while
processing. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) versions improve
the RNN’s ability to handle long-term dependencies.

Recurrent Neuron

RNN Unfolding
Types Of RNN
There are four types of RNNs based on the number of inputs and outputs in the network.
1. One to One
2. One to Many
3. Many to One
4. Many to Many
One to One
This type of RNN behaves the same as any simple neural network; it is also known as a Vanilla Neural Network. In this network, there is only one input and one output.
One to Many
In this type of RNN, there is one input and many outputs associated with it. One of the most common examples of this network is image captioning, where, given an image, we predict a sentence consisting of multiple words.
Many to One
In this type of network, many inputs are fed to the network at several states of the network, generating only one output. This type of network is used in problems like sentiment analysis, where we give multiple words as input and predict only the sentiment of the sentence as output.
Many to Many
In this type of neural network, there are multiple inputs and multiple outputs corresponding to a problem. One example of this is language translation, where we provide multiple words from one language as input and predict multiple words from the second language as output.
Recurrent Neural Network Architecture
RNNs have the same input and output architecture as any other deep neural architecture. However, differences arise in the way information flows from input to output. Unlike deep neural networks, where each dense layer has its own weight matrices, in an RNN the weights are shared across the whole network. The network calculates a hidden state h_t for every input x_t using the following formulas:
h_t = σ(U x_t + W h_(t-1) + b)
y_t = O(V h_t + c)
Hence
y_t = f(x_t, h_(t-1), W, U, V, b, c)
Here S is the state matrix whose element s_i is the state of the network at timestep i.
The parameters of the network are W, U, V, b, and c, which are shared across all timesteps.

Recurrent Neural Architecture
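The recurrence above can also be written out directly. The following NumPy sketch uses small, arbitrary dimensions and takes the output function O to be the identity; it computes h_t and y_t for each timestep with the shared parameters U, W, V, b, and c:

```python
import numpy as np

# Toy dimensions: 3-dimensional inputs, 4-dimensional hidden state, 2-dimensional outputs.
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 3))   # input-to-hidden weights
W = rng.normal(size=(4, 4))   # hidden-to-hidden weights (shared across timesteps)
V = rng.normal(size=(2, 4))   # hidden-to-output weights
b = np.zeros(4)
c = np.zeros(2)

def sigma(z):                  # the element-wise nonlinearity (tanh here)
    return np.tanh(z)

xs = rng.normal(size=(5, 3))   # a sequence of 5 inputs
h = np.zeros(4)                # initial hidden state h_0
for t, x_t in enumerate(xs):
    h = sigma(U @ x_t + W @ h + b)   # h_t = sigma(U x_t + W h_(t-1) + b)
    y = V @ h + c                    # y_t = V h_t + c (output function taken as identity)
    print(t, y)
```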

4. Generative Adversarial Networks (GANs):

GANs are deep learning algorithms that generate new instances of data resembling the training data. A GAN usually consists of two components, namely a generator that learns to produce fake data and a discriminator that learns to tell this fake data apart from real data. Over time, GANs have gained immense usage: they are frequently used to sharpen astronomical images and to simulate gravitational lensing for dark-matter research. They are also used in video games to upscale 2D textures by recreating them in higher resolutions such as 4K, as well as in creating realistic cartoon characters, rendering human faces, and 3D object rendering.

GANs work by pitting generated (fake) data against real data. During training, the generator produces various kinds of fake data, while the discriminator quickly learns to recognize it as false. The discriminator's results are then sent back to the generator for updating. Consider the image below to visualize this functioning.
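Alongside that visualization, the adversarial loop can be sketched in code. The toy example below (PyTorch assumed) stands in a 1-D Gaussian for the real training data; the network sizes, learning rates, and step count are arbitrary:

```python
import torch
import torch.nn as nn

# Toy GAN: the "real" data is 1-D samples from N(4, 1); all sizes are illustrative only.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator: noise -> fake sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator: sample -> real/fake
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) + 4.0           # samples of real data
    noise = torch.randn(64, 8)
    fake = G(noise)

    # Discriminator update: learn to label real data as 1 and generated data as 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: try to make the discriminator label fakes as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(5, 8)).detach().squeeze())  # samples should drift toward the real mean (~4)
```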

5. Radial Basis Function Networks (RBFNs):

RBFNs are a specific type of neural network that follows a feed-forward approach and uses radial basis functions as activation functions. They consist of three layers, namely the input layer, hidden layer, and output layer, and are mostly used for time-series prediction, regression, and classification.

RBFNs perform these tasks by measuring the similarity of new inputs to examples from the training data set. An input vector feeds the data into the input layer, and results are produced by comparing it with previously seen data. The input layer has neurons that are sensitive to this data, and the nodes in the layer classify the class of the data. The neurons of the hidden layer work in close integration with the input layer: the hidden layer contains Gaussian transfer functions whose outputs are inversely proportional to the distance of the input from the neuron's center. The output layer forms linear combinations of these radial-basis activations to generate the final output. Consider the image below to understand the process thoroughly.
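As a bare-bones illustration of that process, the NumPy sketch below uses hand-picked Gaussian centers and a shared width; in practice the centers are learned (for example by clustering the training data) and the output weights are fitted to the training set:

```python
import numpy as np

# Hand-picked centers and a shared width, purely for illustration.
centers = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])   # one center per hidden neuron
width = 0.5

def rbf_hidden(x):
    # Gaussian transfer function: large when x is near a center, small far away.
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2 * width ** 2))

W_out = np.array([[1.0, -1.0, 0.5]])        # linear output layer (1 output, 3 hidden units)

def rbfn(x):
    return W_out @ rbf_hidden(x)            # linear combination of the radial activations

print(rbfn(np.array([0.1, 0.0])))           # dominated by the first center
print(rbfn(np.array([1.9, 0.1])))           # dominated by the third center
```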

6. Multilayer Perceptrons (MLPs):

MLPs are the foundation of deep learning technology. They belong to a class of feed-forward neural networks composed of several layers of perceptrons, each with its own activation function. MLPs have fully connected input and output layers, with one or more hidden layers in between. MLPs are mostly used to build image and speech recognition systems and some types of translation software.

MLPs work by feeding the data into the input layer. The neurons in each layer are connected to the next so that signals pass in one direction only. The weights applied to the input data live on the connections between the input layer and the hidden layer. MLPs use activation functions to determine which nodes fire; these activation functions include tanh, sigmoid, and ReLU. MLPs are trained to learn what correlations the layers must capture to achieve the desired output from the given data set. See the image below to understand better.
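Alongside such an image, a minimal sketch helps: an MLP is simply a stack of fully connected layers with nonlinear activations between them. The example below assumes PyTorch, and the layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

# Tiny MLP: input layer -> hidden layers -> output layer, with ReLU activations.
mlp = nn.Sequential(
    nn.Linear(20, 64),   # input layer to first hidden layer (weights live on these connections)
    nn.ReLU(),           # decides which hidden nodes "fire"
    nn.Linear(64, 64),   # second hidden layer
    nn.ReLU(),
    nn.Linear(64, 3),    # output layer, e.g. scores for 3 classes
)

x = torch.randn(5, 20)   # a batch of 5 examples with 20 features each
print(mlp(x).shape)      # torch.Size([5, 3])
```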

7. Self Organizing Maps (SOMs):

SOMs were invented by Teuvo Kohonen to achieve data visualization, helping us understand the dimensions of data through artificial, self-organizing neural networks. They aim to visualize data that humans cannot easily visualize on their own. Such data is generally high-dimensional, so visualizing it this way reduces the need for human involvement and, of course, reduces error.

SOMs visualize data by first initializing the weights of the different nodes and then choosing random vectors from the given training data. For each training vector, they examine every node to find whose weights are most similar; the winning node is called the Best Matching Unit (BMU). The neighbourhood around each winning node then shrinks over time as training proceeds: the closer a node is to the BMU, the more its weights are pulled toward the sample vector. Many iterations are performed to ensure that no node close to the BMU is missed. One classic example is organizing the RGB color combinations that we use in our daily tasks. Consider the image below to understand how they function.
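A stripped-down version of one SOM training step (finding the BMU and pulling nearby nodes toward the sample) is sketched below in NumPy, using the RGB color example mentioned above; the grid size, learning rate, and neighbourhood radius are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w = 10, 10
weights = rng.random((grid_h, grid_w, 3))          # each node holds an RGB weight vector

def train_step(sample, lr=0.1, radius=2.0):
    # 1. Find the Best Matching Unit: the node whose weights are closest to the sample.
    dists = np.linalg.norm(weights - sample, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)

    # 2. Pull the BMU and its neighbours toward the sample, more strongly the closer they are.
    ys, xs = np.indices((grid_h, grid_w))
    grid_dist2 = (ys - bmu[0]) ** 2 + (xs - bmu[1]) ** 2
    influence = np.exp(-grid_dist2 / (2 * radius ** 2))
    weights[:] = weights + lr * influence[..., None] * (sample - weights)

for _ in range(5000):                               # lr and radius would normally decay over time
    train_step(rng.random(3))                       # random RGB training vectors
```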
8. Deep Belief Networks (DBNs):

DBNs are generative models consisting of multiple layers of latent, stochastic variables. The latent variables are called hidden units and typically have binary values. DBNs can be viewed as stacked Boltzmann Machines, because Restricted Boltzmann Machine (RBM) layers are stacked on top of each other, with each layer communicating with the previous and the following layer. DBNs are used in applications like video and image recognition as well as capturing motion in objects.

DBNs are trained with greedy algorithms: the most common way DBNs function is a layer-by-layer approach in which each layer learns its weights in turn. To generate data, DBNs run steps of Gibbs sampling on the top two hidden layers and then draw a sample from the visible units using ancestral sampling through the rest of the model. To infer the values of the latent variables in every layer, a single bottom-up pass is used.
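One hedged way to sketch this greedy, layer-by-layer idea is to stack two RBMs using scikit-learn's BernoulliRBM, training each layer on the hidden representation produced by the layer below it; the layer sizes and the random binary data here are placeholders:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Placeholder binary data standing in for a real training set.
X = (np.random.default_rng(0).random((500, 64)) > 0.5).astype(float)

# Greedy layer-wise pretraining: train the first RBM on the data,
# then train the second RBM on the first layer's hidden activations.
rbm1 = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)
h1 = rbm1.fit_transform(X)

rbm2 = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20, random_state=0)
h2 = rbm2.fit_transform(h1)

print(h1.shape, h2.shape)   # (500, 32) (500, 16) -- each layer is a higher-level representation
```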

9. Restricted Boltzmann Machines (RBMs):

RBMs were developed by Geoffrey Hinton and are stochastic neural networks that learn the probability distribution of a given input set. The algorithm is mainly used for dimensionality reduction, regression and classification, and topic modeling, and RBMs are considered the building blocks of DBNs. RBMs consist of two layers, namely the visible layer and the hidden layer. Every visible unit is connected to every hidden unit, and both layers have bias units connected to the nodes that generate the output. Usually, RBMs have two phases, namely the forward pass and the backward pass.

RBMs work by accepting inputs and encoding them into numbers in the forward pass, taking into account the weight on every input. In the backward pass, these encoded values are translated back, using the same weights, into reconstructed inputs. The reconstructions are pushed to the visible layer, where activation is applied and an output is generated that can be compared against the original input. To understand this process, consider the image below.
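The forward and backward passes described above can be written out directly. The NumPy sketch below uses made-up sizes and randomly initialized, untrained weights, and shows only a single encode-and-reconstruct cycle rather than full training:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))    # weights between visible and hidden units
b_v = np.zeros(n_visible)                                # visible bias
b_h = np.zeros(n_hidden)                                 # hidden bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

v = rng.integers(0, 2, size=n_visible).astype(float)     # a binary visible vector (the input)

# Forward pass: encode the input into hidden-unit probabilities.
p_h = sigmoid(v @ W + b_h)
h = (rng.random(n_hidden) < p_h).astype(float)           # sample binary hidden states

# Backward pass: reconstruct the visible layer from the hidden states.
p_v = sigmoid(h @ W.T + b_v)
print("input         ", v)
print("reconstruction", np.round(p_v, 2))                # compared against the input during training
```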
10. Artificial Neural Networks:

Artificial Neural Networks contain artificial neurons, which are called units. These units are arranged in a series of layers that together constitute the whole Artificial Neural Network in a system. A layer can have only a dozen units or millions of units, depending on how complex the neural network needs to be to learn the hidden patterns in the dataset. Commonly, an Artificial Neural Network has an input layer, an output layer, and hidden layers. The input layer receives data from the outside world which the neural network needs to analyze or learn about. This data then passes through one or more hidden layers that transform the input into data that is valuable for the output layer. Finally, the output layer provides an output in the form of the Artificial Neural Network's response to the input data provided.

In the majority of neural networks, units are interconnected from one layer to another. Each of these
connections has weights that determine the influence of one unit on another unit. As the data
transfers from one unit to another, the neural network learns more and more about the data which
eventually results in an output from the output layer.

The structures and operations of human neurons serve as the basis for artificial neural networks, which are also known as neural networks or neural nets. The input layer of an artificial neural network is the first layer; it receives input from external sources and passes it to the hidden layer, which is the second layer. In the hidden layer, each neuron receives input from the previous layer's neurons, computes the weighted sum, and sends it to the neurons in the next layer. These connections are weighted, meaning that the effect of each input from the previous layer is scaled up or down by its own weight, and these weights are adjusted during the training process to improve model performance.
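To make the weighted-sum step concrete, here is a tiny sketch of what a single hidden neuron computes; the input values, weights, and bias are purely illustrative:

```python
import math

# One hidden neuron: weighted sum of the previous layer's outputs, plus a bias,
# passed through an activation function (sigmoid here).
inputs  = [0.5, -1.2, 3.0]        # outputs of the previous layer
weights = [0.8,  0.1, -0.4]       # one weight per incoming connection (adjusted during training)
bias    = 0.2

weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
activation = 1 / (1 + math.exp(-weighted_sum))
print(weighted_sum, activation)
```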
Artificial neurons vs Biological neurons
The concept of artificial neural networks comes from biological neurons found in animal brains, so they share many similarities in both structure and function.
 Structure: The structure of artificial neural networks is inspired by biological neurons. A
biological neuron has a cell body or soma to process the impulses, dendrites to receive them,
and an axon that transfers them to other neurons. The input nodes of artificial neural networks
receive input signals, the hidden layer nodes compute these input signals, and the output layer
nodes compute the final output by processing the hidden layer’s results using activation
functions.
 Synapses: Synapses are the links between biological neurons that enable the transmission of impulses from dendrites to the cell body. In artificial neurons, synapses correspond to the weights that join the nodes of one layer to the nodes of the next layer. The strength of a link is determined by its weight value.
 Learning: In biological neurons, learning happens in the cell body, or soma, whose nucleus helps to process the impulses. An action potential is produced and travels through the axon if the impulses are powerful enough to reach the threshold. This becomes possible through synaptic plasticity, the ability of synapses to become stronger or weaker over time in reaction to changes in their activity. In artificial neural networks, backpropagation is the technique used for learning; it adjusts the weights between nodes according to the error, or difference, between predicted and actual outcomes.
 Activation: In biological neurons, activation is the firing rate of the neuron, which happens when the impulses are strong enough to reach the threshold. In artificial neural networks, a mathematical function known as an activation function maps the input to the output and performs the activation.

Types of Artificial Neural Networks:

 Feedforward Neural Network: The feedforward neural network is one of the most basic
artificial neural networks. In this ANN, the data or the input provided travels in a single
direction. It enters into the ANN through the input layer and exits through the output layer
while hidden layers may or may not exist. So the feedforward neural network has a front-
propagated wave only and usually does not have backpropagation.
 Convolutional Neural Network: A Convolutional neural network has some similarities to the
feed-forward neural network, where the connections between units have weights that determine
the influence of one unit on another unit. But a CNN has one or more than one convolutional
layer that uses a convolution operation on the input and then passes the result obtained in the
form of output to the next layer. CNNs have applications in speech and image processing and are particularly useful in computer vision.
 Modular Neural Network: A Modular Neural Network contains a collection of different
neural networks that work independently towards obtaining the output with no interaction
between them. Each of the different neural networks performs a different sub-task by obtaining
unique inputs compared to other networks. The advantage of this modular neural network is that
it breaks down a large and complex computational process into smaller components, thus
decreasing its complexity while still obtaining the required output.
 Radial basis function Neural Network: Radial basis functions are functions that consider the distance of a point with respect to a center. RBF networks have two layers: in the first, the input is mapped onto all the radial basis functions in the hidden layer, and then the output layer computes the output in the next step. Radial basis function nets are normally used to model data that represents an underlying trend or function.
 Recurrent Neural Network: The Recurrent Neural Network saves the output of a layer and
feeds this output back to the input to better predict the outcome of the layer. The first layer in
the RNN is quite similar to the feed-forward neural network and the recurrent neural network
starts once the output of the first layer is computed. After this layer, each unit will remember
some information from the previous step so that it can act as a memory cell in performing
computations.
