DL Unit 1
DEPARTMENT OF CSE-ME
Syllabus: Basic concept of neurons – Perceptron algorithm – Feed forward and Back
propagation networks.
Deep learning is a branch of machine learning, which is itself a subset of artificial
intelligence. Neural networks imitate the human brain, and so does deep learning. In
deep learning, nothing is programmed explicitly. It is a class of machine learning that
uses numerous nonlinear processing units to perform feature extraction and
transformation. The output of each layer is taken as input by the next layer.
Deep learning is implemented with the help of neural networks, whose design is
motivated by the biological neuron, i.e., a brain cell.
Deep learning is a collection of statistical machine learning techniques for learning feature
hierarchies, based on artificial neural networks.
In short, deep learning is implemented with deep networks, which are neural networks
with multiple hidden layers.
Example of Deep Learning
In this example (recognizing a face in an image), we provide the raw image data to the
input layer. The input layer then determines patterns of local contrast, meaning it
differentiates on the basis of colors, luminosity, etc. The 1st hidden layer then
determines the face features, i.e., it fixates on the eyes, nose, lips, etc., and maps those
features onto the correct face template. The 2nd hidden layer determines the correct
face, after which the result is sent to the output layer. Likewise, more hidden layers can
be added to solve more complex problems, for example, finding a particular kind of face,
such as one with a light complexion. As the number of hidden layers increases, more
complex problems can be solved.
PANIMALAR ENGINEERING COLLEGE
ARCHITECTURES
A Deep Belief Network (DBN) is trained layer by layer:
Using the Contrastive Divergence algorithm, a layer of features is learned from the
visible units.
Next, the previously trained features are treated as visible units, from which the next
layer of features is learned.
Finally, once the last hidden layer has been learned, the whole DBN is trained.
Applications:
Data Compression, Pattern Recognition, Computer Vision, Sonar Target Recognition, Speech
Recognition, Handwritten Characters Recognition
5. Autoencoders
An autoencoder neural network is another kind of unsupervised machine learning algorithm.
Here the number of hidden cells is smaller than the number of input cells, while the number
of input cells equals the number of output cells. An autoencoder (AE) is trained to produce
an output similar to its input, which forces it to find common patterns and generalize the
data. Autoencoders are mainly used to obtain a smaller representation of the input, from
which the original data can be reconstructed. The algorithm is comparatively simple, as it
only requires the output to be identical to the input.
Encoder: Converts the input data to a lower-dimensional representation.
Decoder: Reconstructs the original data from the compressed representation.
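The encoder/decoder split can be sketched in a few lines of Python. This is only an illustration of the shapes involved (4 inputs, 2 hidden cells, 4 outputs): the weight matrices below are hypothetical, hand-picked values, not trained ones, and the helper names `encode`, `decode` and `matvec` are assumptions made for this sketch.

```python
# Hypothetical, hand-picked weights used only to show the data flow;
# a real autoencoder would learn these values during training.
ENCODER_WEIGHTS = [
    [0.5, 0.5, 0.0, 0.0],  # hidden cell 1 averages inputs 1 and 2
    [0.0, 0.0, 0.5, 0.5],  # hidden cell 2 averages inputs 3 and 4
]
DECODER_WEIGHTS = [
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
]


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def encode(x):
    """Encoder: map the 4-value input to a 2-value code."""
    return matvec(ENCODER_WEIGHTS, x)


def decode(code):
    """Decoder: map the 2-value code back to 4 values."""
    return matvec(DECODER_WEIGHTS, code)


x = [2.0, 2.0, 6.0, 6.0]
code = encode(x)               # smaller representation of the input
reconstruction = decode(code)  # same size as the input
print(code, reconstruction)
```

The input was chosen so that two numbers are enough to describe it; for inputs without such a pattern, the reconstruction would only be approximate.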
Applications:
Classification, Clustering, Feature Compression, Deep learning applications
Self-Driving Cars
A self-driving car captures images of its surroundings and processes a huge amount of
data to decide which action to take: turn left, turn right, or stop. Acting on these
decisions can help reduce the accidents that happen every year.
Limitations
It learns only through observations.
It suffers from bias issues.
Advantages
It lessens the need for feature engineering.
It eliminates unnecessary costs.
It easily identifies defects that are difficult to detect.
It delivers best-in-class performance on many problems.
Disadvantages
It requires an ample amount of data.
It is quite expensive to train.
It does not have strong theoretical groundwork.
Neurons in a deep learning model work as follows:
They receive one or more input signals. These input signals can come from either the raw
data set or from neurons positioned at a previous layer of the neural net.
They perform some calculations.
They send some output signals to neurons deeper in the neural net through a synapse.
Here is a diagram of the functionality of a neuron in a deep learning neural net:
As you can see, neurons in a deep learning model are capable of having synapses that
connect to more than one neuron in the preceding layer. Each synapse has an associated
weight, which impacts the preceding neuron’s importance in the overall neural network.
Weights are a very important topic in the field of deep learning because adjusting a model’s
weights is the primary way through which deep learning models are trained. You’ll see this in
practice later on when we build our first neural networks from scratch.
Once a neuron receives its inputs from the neurons in the preceding layer of the model, it
multiplies each signal by its corresponding weight, adds up the products, and passes the
result on to an activation function.
The activation function calculates the output value for the neuron. This output value is then
passed on to the next layer of the neural network through another synapse.
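The computation performed by a single neuron can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `neuron_output`, the example inputs and weights, and the choice of a sigmoid activation are all assumptions made for the sketch.

```python
import math


def sigmoid(z):
    """One possible activation function (an assumption for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, activation):
    """Weighted sum of the incoming signals, passed through an activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)


# Hypothetical signals from three neurons in the preceding layer,
# each arriving through a synapse with its own weight.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.5]

output = neuron_output(inputs, weights, sigmoid)
print(output)  # value passed on to the next layer
```

Here the weighted sum is 0.5 - 2.0 + 1.5 = 0, so the sigmoid returns 0.5.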
This serves as a broad overview of deep learning neurons. Do not worry if it was a lot to take
in – we’ll learn much more about neurons in the rest of this tutorial. For now, it’s sufficient
for you to have a high-level understanding of how they are structured in a deep learning
model.
In this section, you will learn to understand the importance and functionality of activation
functions in deep learning.
A relationship is linear if a change in the first variable corresponds to a constant change in the
second variable. A non-linear relationship means that a change in the first variable doesn't
necessarily correspond to a constant change in the second. The two variables may still
influence each other, but the effect does not follow a constant rate.
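The difference can be checked numerically. In the sketch below, the two functions `linear` and `nonlinear` are hypothetical examples chosen only for illustration: equal steps in x give a constant change in the first and a varying change in the second.

```python
def linear(x):
    return 2 * x + 1  # hypothetical linear relationship


def nonlinear(x):
    return x * x      # hypothetical non-linear relationship


def step_changes(f, xs):
    """Change in f between consecutive x values."""
    return [f(b) - f(a) for a, b in zip(xs, xs[1:])]


xs = [0, 1, 2, 3, 4]
linear_changes = step_changes(linear, xs)        # [2, 2, 2, 2]
nonlinear_changes = step_changes(nonlinear, xs)  # [1, 3, 5, 7]
print(linear_changes, nonlinear_changes)
```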
As a quick visual example, introducing non-linearity lets us better capture the patterns in
the data.
Each value ranges between 0 and 1, and the sum of all values is 1, so the outputs can be
used to model probability distributions.
It is used only in the output layer rather than throughout the network.
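The two properties above are those of the softmax function, which is assumed here to be the activation being described. A minimal sketch, with hypothetical input scores:

```python
import math


def softmax(scores):
    """Turn raw scores into values in (0, 1) that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of a 3-class output layer.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The largest score receives the largest probability, and the ordering of the scores is preserved.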
various binary classifiers. This algorithm enables neurons to learn and to process the elements
of the training set one at a time. This section, "Perceptron in Machine Learning," covers the
perceptron and its basic functions, starting with a basic introduction.
The perceptron is a machine learning algorithm for supervised learning of binary classification
tasks. It is also understood as an artificial neuron or neural network unit that performs certain
computations to detect features or business intelligence in the input data.
The perceptron model is also regarded as one of the simplest types of artificial neural network.
It is a supervised learning algorithm for binary classifiers and can be considered a
single-layer neural network with four main parameters: input values, weights and bias, net sum,
and an activation function.
In machine learning, a binary classifier is a function that decides whether an input, represented
as a vector of numbers, belongs to a specific class.
The perceptron is a linear classifier. In simple words, it is a classification algorithm that makes
its predictions using a linear predictor function that combines a weight vector with the feature
vector.
Frank Rosenblatt invented the perceptron model as a binary classifier, which contains three main
components. These are as follows:
o Input Nodes:
This is the primary component of the perceptron, which accepts the initial data into the system for
further processing. Each input node contains a real numerical value.
o Weight and Bias:
The weight parameter represents the strength of the connection between units and is another
important parameter of the perceptron. The weight is directly proportional to the influence of the
associated input neuron on the output. The bias can be considered as the intercept in a linear
equation.
o Activation Function:
This is the final important component, which helps determine whether the neuron will fire or
not. The activation function can be considered primarily as a step function. Types of activation
function include:
o Sign function
o Step function, and
o Sigmoid function
The data scientist chooses the activation function based on the problem statement and the desired
form of the output. The choice of activation function (e.g., sign, step, or sigmoid) may also depend
on whether the learning process is slow or suffers from vanishing or exploding gradients.
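The three activation functions listed above can be written as short Python functions. This is a sketch under stated assumptions: the behaviour exactly at zero differs between textbooks, so the conventions used here (sign returns +1 at zero, step returns 0 at zero) are choices made for illustration.

```python
import math


def sign(z):
    """Sign function: output is -1 or +1 (+1 at zero, by assumption)."""
    return 1 if z >= 0 else -1


def step(z):
    """Step function: output is 0 or 1 (fires only when z > 0)."""
    return 1 if z > 0 else 0


def sigmoid(z):
    """Sigmoid function: smooth output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-2.0, 0.5, 3.0):
    print(z, sign(z), step(z), sigmoid(z))
```

Sign and step give hard binary decisions, while the sigmoid gives a continuous value that changes smoothly with its input.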
In machine learning, the perceptron is considered a single-layer neural network with four
main parameters: input values (input nodes), weights and bias, net sum, and an activation
function. The perceptron model begins by multiplying each input value by its weight, then
adds these products together to create the weighted sum. This weighted sum is then passed to the
activation function 'f' to obtain the desired output. For the perceptron, this activation function
is typically a step function.
This step function, or activation function, plays a vital role in ensuring that the output is mapped
to the required range, such as (0, 1) or (-1, 1). Note that the weight of an input indicates the
strength of that input's influence on the node. Similarly, the bias value makes it possible to shift
the activation function curve, i.e., to move the point at which the neuron activates.
Step-1
First, multiply all input values by their corresponding weights and add the products to
determine the weighted sum. Mathematically, the weighted sum is:
∑wi*xi = w1*x1 + w2*x2 + ... + wn*xn
Then add a special term called the bias 'b' to this weighted sum to improve the model's performance:
∑wi*xi + b
Step-2
In the second step, an activation function is applied to the above weighted sum, which
gives an output either in binary form or as a continuous value:
Y = f(∑wi*xi + b)
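The two steps can be sketched directly in Python. The weights and bias below are hypothetical values picked by hand so that the perceptron behaves like an AND gate; the function names are likewise assumptions made for this illustration.

```python
def step(z):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def perceptron_output(inputs, weights, bias, activation=step):
    # Step-1: weighted sum of the inputs plus the bias.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step-2: apply the activation function f.
    return activation(weighted_sum)


# Hand-picked (not learned) parameters that realize a logical AND.
weights = [1.0, 1.0]
bias = -1.5

and_table = {
    (x1, x2): perceptron_output([x1, x2], weights, bias)
    for x1 in (0, 1)
    for x2 in (0, 1)
}
print(and_table)
```

Only the input (1, 1) gives a positive weighted sum (1 + 1 - 1.5 = 0.5), so only that input produces an output of 1.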
Based on the number of layers, perceptron models are divided into two types: the single-layer
perceptron model and the multi-layer perceptron model.
The single-layer perceptron is one of the simplest types of artificial neural network (ANN). It
consists of a feed-forward network and includes a threshold transfer function inside the model. The
main objective of the single-layer perceptron model is to classify linearly separable objects with
binary outcomes.
A single-layer perceptron has no previously recorded data to draw on, so it begins with
randomly allocated values for the weight parameters. It then sums up all the weighted inputs. If
this sum is more than a pre-determined threshold value, the model is activated and shows the
output value as +1.
If the output is the same as the desired value, the performance of the model is considered
satisfactory and the weights are not changed. However, when the output differs from the desired
value for some inputs, the weights must be adjusted in order to produce the desired output and
minimize errors.
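The weight adjustment described above is commonly done with the classic perceptron learning rule, which the text does not spell out; the sketch below assumes that rule (each weight is changed by learning rate x error x input). It also starts from zero weights rather than random ones so that the run is reproducible, and it uses 0/1 outputs; these are choices made for the illustration.

```python
def step(z):
    return 1 if z > 0 else 0


def predict(weights, bias, x):
    """Output of the perceptron for one input vector."""
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, learning_rate=1, max_epochs=20):
    """Adjust weights only when the output differs from the target."""
    weights = [0] * len(samples[0][0])  # fixed start (assumption); text uses random
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:  # every output matches its target: stop
            break
    return weights, bias


# Linearly separable training data: the AND gate.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print(weights, bias)
```

Because the AND data is linearly separable, the loop reaches an epoch with no mistakes and the weights stop changing from then on.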
A multi-layer perceptron model has the same basic structure as a single-layer perceptron model
but adds one or more hidden layers.
The multi-layer perceptron is trained with the backpropagation algorithm, which executes in
two stages as follows:
o Forward Stage: In the forward stage, activations are computed starting from the input layer
and ending at the output layer.
o Backward Stage: In the backward stage, weight and bias values are modified as per the
model's requirement. The error between the actual output and the desired output is
propagated backward, starting at the output layer and ending at the input layer.
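The two stages can be sketched on the smallest possible network: one input, one hidden neuron and one output neuron, both with sigmoid activations. Everything specific here is an assumption made for illustration: the squared-error loss, the starting parameters, the learning rate, and the function names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward stage: input layer -> hidden neuron -> output neuron."""
    w1, b1, w2, b2 = params
    hidden = sigmoid(w1 * x + b1)
    output = sigmoid(w2 * hidden + b2)
    return hidden, output


def loss(params, x, target):
    """Squared error between actual and desired output."""
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2


def backward(params, x, target):
    """Backward stage: error flows from the output layer toward the input."""
    w1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    delta_out = (output - target) * output * (1.0 - output)
    delta_hidden = delta_out * w2 * hidden * (1.0 - hidden)
    # Gradients in the same order as params: w1, b1, w2, b2.
    return [delta_hidden * x, delta_hidden, delta_out * hidden, delta_out]


def update(params, grads, learning_rate):
    """Modify weights and biases against the gradient."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


params = [0.5, 0.0, -0.5, 0.0]  # hypothetical starting weights and biases
x, target = 1.0, 1.0

loss_before = loss(params, x, target)
params = update(params, backward(params, x, target), learning_rate=0.5)
loss_after = loss(params, x, target)
print(loss_before, loss_after)
```

After one forward and backward pass the error on this training example is smaller than before; repeating the two stages over many examples is what training amounts to.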
A multi-layered perceptron model can be considered as an artificial neural network with several
layers in which, unlike in a single-layer perceptron model, the activation function does not have to
be a simple threshold. Instead, the activation function can be sigmoid, TanH, ReLU, etc.
A multi-layer perceptron model has greater processing power and can process linear and non-linear
patterns. Further, it can also implement logic gates such as AND, OR, XOR, NAND, NOT, XNOR,
NOR.
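As an illustration of the XOR case, the sketch below wires a two-layer network by hand: one hidden neuron acts as OR, the other as AND, and the output neuron fires when OR is on but AND is off. The weights and thresholds are hypothetical, hand-picked values rather than learned ones.

```python
def step(z):
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two hidden neurons plus one output neuron, all with step activation."""
    h_or = step(x1 + x2 - 0.5)        # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)       # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)   # fires for "OR but not AND"


xor_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
print(xor_table)
```

The hidden layer is what makes this possible: it turns the inputs into features (OR, AND) on which a single threshold is enough.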
Perceptron Function
The perceptron function 'f(x)' is obtained by multiplying the input 'x' by the learned
weight coefficient 'w' and adding the bias 'b':
f(x) = 1, if w.x + b > 0
f(x) = 0, otherwise
Characteristics of Perceptron
The perceptron model has the following characteristics.
o The output of a perceptron can only be a binary number (0 or 1) due to the hard limit transfer
function.
o Perceptron can only be used to classify linearly separable sets of input vectors. If the input
vectors are not linearly separable, a single perceptron cannot classify them all correctly.
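The linear-separability limit can be illustrated with a small search. The sketch below tries every combination of two weights and a bias on a coarse grid (an arbitrary choice for this example) and checks whether any of them classifies all four points correctly. It is an illustration, not a proof: for XOR no such combination exists anywhere, not just on this grid.

```python
from itertools import product


def step(z):
    return 1 if z > 0 else 0


def separable_on_grid(samples, grid):
    """True if some (w1, w2, b) on the grid classifies every sample."""
    for w1, w2, b in product(grid, repeat=3):
        if all(step(w1 * x1 + w2 * x2 + b) == target
               for (x1, x2), target in samples):
            return True
    return False


GRID = [i / 2 for i in range(-8, 9)]  # -4.0, -3.5, ..., 4.0

AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_found = separable_on_grid(AND_SAMPLES, GRID)
xor_found = separable_on_grid(XOR_SAMPLES, GRID)
print(and_found, xor_found)
```

A perceptron exists for AND (for example w1 = 1, w2 = 1, b = -1.5), while the search finds none for XOR.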