DL Part 3
Dendrites in a biological neural network correspond to inputs in an artificial neural network, the cell nucleus corresponds to nodes, synapses correspond to weights, and the axon corresponds to the output.
Relationship between a biological neural network and an artificial neural network:
Biological Neural Network      Artificial Neural Network
Dendrites                      Inputs
Cell nucleus                   Nodes
Synapse                        Weights
Axon                           Output
An Artificial Neural Network is a model in the field of Artificial Intelligence that attempts to mimic the network of neurons that makes up the human brain, so that computers can understand things and make decisions in a human-like manner. The artificial neural network is designed by programming computers to behave simply like interconnected brain cells.
There are roughly 100 billion (about 10¹¹) neurons in the human brain, and each neuron is connected to somewhere between 1,000 and 100,000 other neurons. In the human brain, data is stored in a distributed manner, and we can retrieve more than one piece of this data from memory in parallel when necessary. We can therefore say that the human brain is made up of incredibly powerful parallel processors.
We can understand the artificial neural network by comparison with a digital logic gate that takes inputs and gives an output. An "OR" gate takes two inputs: if one or both inputs are "On", the output is "On"; if both inputs are "Off", the output is "Off". Here the output depends only on the inputs and the relationship never changes. Our brain does not perform the same task: the relationship between outputs and inputs keeps changing, because the neurons in our brain are "learning."
The artificial neural network takes the inputs, computes their weighted sum, and adds a bias. This computation is represented in the form of a transfer function.
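As a small illustration (not from the original notes), the following Python sketch computes the weighted sum plus a bias and passes it through a step transfer function; the weight and bias values are arbitrary and chosen so the unit behaves like an OR gate:

def transfer(net):
    # Step transfer function: fire (1) if the net input is positive, else 0
    return 1 if net > 0 else 0

def neuron_output(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias term
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return transfer(net)

# Example: OR-like behaviour with hand-picked weights and bias
print(neuron_output([0, 0], [1.0, 1.0], -0.5))  # prints 0
print(neuron_output([0, 1], [1.0, 1.0], -0.5))  # prints 1
print(neuron_output([1, 1], [1.0, 1.0], -0.5))  # prints 1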
Deep Learning: Deep learning is a subset of machine learning that uses neural networks with
many layers (deep neural networks) to automatically extract features and representations from
data. It has been highly successful in tasks like image recognition and natural language
processing.
Biological Neural Network: In living organisms, the brain is the control unit of the neural
network, and it has different subunits that take care of vision, senses, movement, and hearing.
The brain is connected to the rest of the body's sensors and actuators by a dense network of nerves. There are approximately 10¹¹ neurons in the brain, and these are the building blocks of the complete central nervous system of the living body.
The neuron is the fundamental building block of neural networks. In biological systems, a neuron is a cell just like any other cell of the body: it has a DNA code and is generated in the same way as other cells. Though the DNA may differ between organisms, the function of the neuron is similar in all of them. A neuron comprises three major parts: the cell body (also called the soma), the dendrites, and the axon. The dendrites are fibers branched in different directions and are connected to many cells in that cluster.
Dendrites receive signals from surrounding neurons, and the axon transmits the signal to other neurons. At the terminal of the axon, contact with a dendrite is made through a synapse. The axon is a long fiber that transports the output signal as electrical impulses along its length. Each neuron has one axon. Axons pass impulses from one neuron to another like a domino effect.
The Perceptron model is one of the simplest types of artificial neural network. It is a supervised learning algorithm for binary classifiers. We can consider it a single-layer neural network with four main parameters: input values, weights and bias, net sum, and an activation function.
Frank Rosenblatt invented the perceptron model as a binary classifier, which contains three main components. These are as follows:
Input Nodes or Input Layer: This is the primary component of the perceptron, which accepts the initial data into the system for further processing. Each input node contains a real numerical value.
Weight and Bias: The weight parameter represents the strength of the connection between units and is another important component of the perceptron. A weight is directly proportional to the influence of the associated input neuron in deciding the output. The bias can be considered the intercept term of a linear equation.
Activation Function: This is the final and most important component, which determines whether the neuron will fire or not. In the basic perceptron, the activation function is primarily a step function.
Types of Activation functions (a short code sketch follows this list):
o Sign function
o Step function, and
o Sigmoid function
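A minimal Python sketch of these three activation functions (illustrative only; conventions such as the output of the sign function at exactly zero vary between texts):

import math

def sign_fn(net):
    # Sign function: +1 for positive net input, -1 otherwise
    return 1 if net > 0 else -1

def step_fn(net):
    # Step function: 1 if the net input is positive, 0 otherwise
    return 1 if net > 0 else 0

def sigmoid_fn(net):
    # Sigmoid function: smooth output between 0 and 1
    return 1.0 / (1.0 + math.exp(-net))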
Types of Perceptron Models: Based on the layers, Perceptron models are divided into two
types. These are as follows:
o Single-layer Perceptron Model
o Multi-layer Perceptron model
Input value or One input layer: The input layer of the perceptron is made of artificial input
neurons and takes the initial data into the system for further processing.
Weights and Bias:
Weight: It represents the strength of the connection between units. If the weight from node 1 to node 2 is larger, then neuron 1 has a greater influence on neuron 2.
Bias: It is the same as the intercept added in a linear equation. It is an additional parameter whose task is to shift the output, together with the weighted sum of the inputs, passed to the next neuron.
Net sum: It calculates the total weighted sum of the inputs.
Activation Function: Whether a neuron is activated or not is determined by the activation function, which is applied to the weighted sum of the inputs plus the bias to produce the result.
Algorithm
1. Create a perceptron with (n + 1) input neurons x0, x1, . . . xn, where x0 = 1 is the bias
input. Let O be the output neuron.
2. Initialize w = (w0, w1, . . ., wn) to random weights.
3. Iterate through the input patterns xj of the training set using the current weights, i.e., compute the weighted sum of inputs netj = Σ (xi wi), i = 0, 1, ..., n, for each input pattern j.
4. Compute the output yj using the step function:
   yj = 1 if netj > 0
   yj = 0 if netj ≤ 0
5. Compare the computed output yj with the target output for each input pattern j. If all the input patterns have been classified correctly, output the weights and exit.
6. Otherwise, update the weights as given below: if the computed output yj is 1 but should have been 0, set wi = wi - αxi, i = 0, 1, 2, ..., n; if the computed output yj is 0 but should have been 1, set wi = wi + αxi, i = 0, 1, 2, ..., n. Here α is the learning parameter and is a constant.
7. Go to step 3.
The input values are presented to the perceptron, and if the predicted output is the same as the
desired output, then the performance is considered satisfactory and no changes to the weights
are made. However, if the output does not match the desired output, then the weights need to
be changed to reduce the error.
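The steps above can be summarized in a compact Python sketch (a fuller interactive version appears later in this section); the function name, the epoch cap, and the use of a learned bias weight w[0] are illustrative assumptions, not part of the original notes:

import random

def train_perceptron(patterns, targets, alpha=0.2, max_epochs=100):
    # patterns: list of input vectors; targets: desired 0/1 outputs
    n = len(patterns[0])
    # The bias input x0 = 1 is handled by an extra weight w[0]
    w = [random.random() for _ in range(n + 1)]
    for _ in range(max_epochs):
        all_correct = True
        for x, t in zip(patterns, targets):
            xi = [1] + list(x)                      # prepend the bias input
            net = sum(wi * v for wi, v in zip(w, xi))
            y = 1 if net > 0 else 0                 # step activation
            if y != t:
                all_correct = False
                # Move the weights in the direction that reduces the error
                for i in range(n + 1):
                    w[i] += alpha * (t - y) * xi[i]
        if all_correct:                             # stop when all patterns fit
            break
    return w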
Because the SLP is a linear classifier, if the cases are not linearly separable the learning process will never reach a point where all the cases are classified properly. The most famous example of the perceptron's inability to solve problems with linearly non-separable cases is the XOR problem.
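To see why, a small brute-force check (purely illustrative, not part of the original notes) can scan a grid of weight and bias values and confirm that no single linear threshold unit reproduces XOR:

xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def solves_xor(w1, w2, b):
    # A single threshold unit: output 1 if w1*x1 + w2*x2 + b > 0, else 0
    return all((1 if w1 * x1 + w2 * x2 + b > 0 else 0) == t
               for (x1, x2), t in xor_data)

grid = [i / 10 for i in range(-20, 21)]   # candidate values from -2.0 to 2.0
found = any(solves_xor(w1, w2, b)
            for w1 in grid for w2 in grid for b in grid)
print("Linear threshold solution for XOR found:", found)   # prints False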
Error Calculation: e = (yt - yc)
Problem-1: Solve the following problem using the Single-Layer Perceptron (SLP) algorithm for the logical XOR operation, where the learning rate α = 0.2.
     Input          Output
     X1    X2       Yt
     0     0        0
     0     1        1
     1     0        1
     1     1        0
Solution:
Iteration-1:
Input: x1 = 0 and x2 = 0
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*0 + 0.3*0
=0
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
yc = 1 if Net > 0
yc = 0 if Net ≤ 0
Since Net = 0, yc = 0. Here yt = 0 and yc = 0, where yt is the target output and yc is the computed output. Since both values are the same, the neuron has computed the output correctly, so this iteration is finished.
Iteration-2:
Input: x1 = 0 and x2 = 1
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*0 + 0.3*1
= 0.3
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
yc = 1 if Net > 0
yc = 0 if Net ≤ 0
Since Net = 0.3, yc = 1. Here yt = 1 and yc = 1. Since both values are the same, the neuron has computed the output correctly, so this iteration is finished.
Iteration-3:
Input: x1 = 1 and x2 = 0
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*1 + 0.3*0
= 0.3
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
yc = 1 if Net > 0
yc = 0 if Net ≤ 0
Since Net = 0.3, yc = 1. Here yt = 1 and yc = 1. Since both values are the same, the neuron has computed the output correctly, so this iteration is finished.
Iteration-4:
Input: x1 = 1 and x2 = 1
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*1 + 0.3*1
= 0.6
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
yc = 1 if Net > 0
yc = 0 if Net ≤ 0
Since Net = 0.6, yc = 1. Here yt = 0 and yc = 1. Since the values are not the same, the weights need to be updated and this iteration must be calculated again.
2. Initialize Weights: Start by initializing the weights to small random values. For simplicity, let's initialize the weights as (0.3, 0.3).
3. Define the Activation Function: In this case, we'll use a step function as the activation function. The step function returns 1 if the input is greater than 0 and returns 0 otherwise.
4. Training: Now, we'll iterate through the training examples and adjust the weights and bias based on the error E between the predicted output and the actual output (a worked update for Iteration-4 is shown after this list):
1. w1 = w1 + (learning_rate * E * input1)
2. w2 = w2 + (learning_rate * E * input2)
5. Repeat: Continue this training process for multiple epochs until the error becomes
sufficiently small or converges to a solution.
6. Final Weights: Once the training is complete, you will have the final weights and
bias values, which will allow you to make predictions for new input data.
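As an illustration (the notes above stop before showing the actual numbers), applying this update rule to Iteration-4 with α = 0.2:
E = yt - yc = 0 - 1 = -1
w1(new) = w1 + α*E*x1 = 0.3 + 0.2*(-1)*1 = 0.1
w2(new) = w2 + α*E*x2 = 0.3 + 0.2*(-1)*1 = 0.1
Recomputing with the new weights gives Net = 0.1*1 + 0.1*1 = 0.2 > 0, so yc = 1, which still differs from yt = 0; repeating the update gives w1 = w2 = -0.1 and Net = -0.2 ≤ 0, so yc = 0 = yt and this pattern is now classified. However, the patterns (0, 1) and (1, 0) are then misclassified with these weights, which is exactly the XOR difficulty described earlier: the single-layer perceptron cannot fit all four XOR patterns at once.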
Applications of Artificial Neural Network (ANN): Artificial Neural Networks (ANNs) have
a broad range of applications across various domains due to their ability to model complex,
non-linear relationships in data. Some common applications of ANNs include:
Image and Video Processing:
Image Classification: ANNs are used to classify objects within images, such as in
facial recognition, object detection, and character recognition.
Image Generation: They can generate realistic images, including deep dream images
and style transfer.
Video Analysis: ANNs are employed in video surveillance, tracking objects in videos,
and identifying anomalies.
Natural Language Processing (NLP):
Machine Translation: ANNs are used for language translation, such as Google
Translate.
Sentiment Analysis: They determine the sentiment of text data for applications in
social media monitoring and customer feedback analysis.
Text Generation: ANNs can generate human-like text, which is useful for chatbots and
content creation.
Speech Recognition: ANNs are critical in converting spoken language into text for voice
assistants and transcription services.
Recommendation Systems: ANNs power recommendation engines, as seen in content
recommendations on streaming platforms like Netflix or product recommendations on e-
commerce websites.
Autonomous Vehicles: Object Detection and Tracking: ANNs are used in self-driving cars to
detect and track objects on the road.
Path Planning: They assist in determining safe and efficient routes for autonomous vehicles.
Healthcare:
Medical Image Analysis: ANNs analyze medical images like X-rays and MRIs to
assist in diagnosis and treatment planning.
Disease Prediction: They are used to predict the risk of various diseases based on
patient data.
Finance:
Credit Scoring: ANNs help assess credit risk by analyzing an individual's credit
history.
Algorithmic Trading: They are employed in financial markets for algorithmic trading
and risk management.
Environmental Modeling: ANNs can model and predict climate patterns, aiding in weather
forecasting and climate change analysis.
Social Media and Content Generation: They generate text, images, and other content, used
in chatbots, creative content generation, and recommendation engines.
Fraud Detection: ANNs are used for fraud detection, such as credit card fraud detection and
cybersecurity.
Quality Control: In manufacturing, ANNs are used for quality control and defect detection in
production processes.
Energy Efficiency: ANNs can optimize energy consumption in buildings and industrial
processes by predicting energy demand and optimizing systems.
Customer Relationship Management (CRM): ANNs help businesses in customer
segmentation, churn prediction, and customer behavior analysis.
Biological and Medical Research: They are used for simulating biological systems and
predicting the behavior of biological molecules.
Drug Discovery: ANNs are used in drug discovery to predict the properties of molecules and
identify potential drug candidates.
Agriculture: They assist in crop disease detection, yield prediction, and pest control.
SLP of Artificial Neural Network (ANN) Development Using Python:
import random
# Data Generation: random binary input patterns, random binary targets,
# and random initial weights
m = int(input("Total Data in Row: "))
n = int(input("Total Data in col: "))
fd = []   # input patterns
y = []    # target outputs
w = []    # weights
for i in range(m):
    pd = []
    for j in range(n):
        gd = random.randint(0, 1)
        pd.append(gd)
    fd.append(pd)
    cd = random.randint(0, 1)
    y.append(cd)
for j in range(n):
    wd = random.random()
    w.append(wd)
print("Input Data: ", fd)
print("Target Data: ", y)
print("Weight Data: ", w)
# Algorithm
alph = 0.2
net = 0
for i in range(m):
    # Forward pass: weighted sum of the current input pattern
    for j in range(n):
        print("Input Val:", fd[i][j])
        net = net + (w[j] * fd[i][j])
    yc = 1 if net > 0 else 0                  # step activation
    if y[i] == yc:
        print("Success in FT!!! and Value: ", yc)
        net = 0
    else:
        # Keep adjusting the weights for this pattern until it is classified.
        # Note: with no bias input, an all-zero pattern with target 1 can
        # never be classified, so the loop would not terminate in that case.
        wn = list(w)
        print("WN-", wn)
        while y[i] != yc:
            print("Failure!!! and Value: ", yc)
            net = 0                           # reset before recomputing the sum
            for j in range(n):
                wn[j] = wn[j] + alph * (y[i] - yc) * fd[i][j]
                net = net + (wn[j] * fd[i][j])
            yc = 1 if net > 0 else 0
            if y[i] == yc:
                print("Success!!! and Value: ", yc)
                net = 0
        w = wn                                # carry the updated weights forward
The Multilayer Perceptron was developed to tackle this limitation: the single-layer perceptron's inability to handle linearly non-separable problems. It is a neural network where the mapping between inputs and outputs is non-linear.
A Multilayer Perceptron has input and output layers, and one or more hidden layers with many neurons stacked together. While in the basic Perceptron the neuron must use an activation function that imposes a hard threshold (a step function), neurons in a Multilayer Perceptron can use arbitrary non-linear activation functions such as the rectified linear unit (ReLU) or the sigmoid.
Figure 2.7: Multi-Layer Perceptron (MLP)
Input Layer: The first layer of an MLP is called the input layer, and it consists of
neurons that receive the input data features. Each neuron in the input layer represents a
feature or attribute of the data.
Hidden Layers: Between the input and output layers, there can be one or more hidden
layers. These layers contain artificial neurons that perform computations and
transformations on the input data. The term "hidden" comes from the fact that these
layers are not directly exposed to the external environment but exist to process
information.
Output Layer: The last layer of the MLP is the output layer, which provides the
network's final prediction or output. The number of neurons in this layer depends on
the type of problem (e.g., binary classification, multi-class classification, regression).
Weight and Bias: Each connection between neurons has an associated weight, which
determines the strength of the connection, and a bias term, which allows the neuron to
shift its decision boundary.
Training an MLP involves iteratively adjusting the weights and biases to minimize a
loss function, which measures the difference between the network's predictions and the
actual target values in the training data. Gradient-based optimization algorithms, such
as stochastic gradient descent (SGD), are commonly used for this purpose.
MLPs are versatile and can be applied to a wide range of machine learning tasks, including
image classification, natural language processing, speech recognition, and regression
problems. However, they may not be the best choice for all tasks, especially when dealing with
structured data like images or sequences. The architecture and hyperparameters of an MLP,
including the number of layers, the number of neurons in each layer, and the choice of
activation functions, should be tuned to match the specific characteristics of the problem at
hand.
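As a hedged illustration (assuming scikit-learn is available; the layer size, activation, and solver below are arbitrary choices, not prescribed by these notes), a small MLP can learn the XOR mapping that the single-layer perceptron could not:

from sklearn.neural_network import MLPClassifier

# XOR inputs and targets (the problem a single-layer perceptron cannot solve)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One small hidden layer is enough to make the mapping non-linear
clf = MLPClassifier(hidden_layer_sizes=(4,), activation='logistic',
                    solver='lbfgs', random_state=1, max_iter=2000)
clf.fit(X, y)
print(clf.predict(X))   # typically [0 1 1 0] once training converges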
Image recognition: MLPs can be trained to recognize patterns in images and classify
them into different categories. This is useful in applications such as facial recognition,
object detection, and image segmentation.
Natural Language Processing (NLP): MLPs can be used to understand and generate
human language. This is useful in applications such as text-to-speech, machine
translation, and sentiment analysis.
Predictive Modelling: It can be used to make predictions based on past data. This is
useful in applications such as stock market prediction, weather forecasting, and fraud
detection.
Architecture:
Input Layer: The input layer consists of neurons that take the feature values of
your data as input. Each neuron corresponds to a feature in your dataset.
Hidden Layers: One or more hidden layers with multiple neurons in each layer.
These hidden layers are responsible for learning complex patterns in the data.
Output Layer: The output layer produces the final output or prediction. The
number of neurons in the output layer depends on the problem type (e.g., binary
classification, multi-class classification, regression).
Activation Functions: Each neuron in the hidden and output layers typically uses an
activation function to introduce non-linearity into the model. Common activation
functions include ReLU (Rectified Linear Unit), Sigmoid, and Tanh.
Weights and Biases: Each connection between neurons has an associated weight, which the network adjusts during training to learn the optimal values. Each neuron also has a bias term.
Forward Propagation: The input data is fed into the input layer. Signals are
propagated through the network in a feedforward manner. The output of each neuron
in a layer is a weighted sum of the inputs passed through the activation function.
This process continues through the hidden layers to the output layer, producing the final
prediction.
Loss Function: A loss function measures the difference between the predicted output and the actual target values. Common loss functions include Mean Squared Error (MSE) for regression and Cross-Entropy for classification (both are sketched in code after this list).
Training: Training involves iteratively feeding the training data through the network,
computing the loss, and updating the weights and biases to minimize the loss.
Training typically includes multiple epochs (iterations) over the training data until the
model converges.
Hyperparameter Tuning: Tuning the architecture of the MLP (e.g., the number of
hidden layers, the number of neurons in each layer) and hyperparameters (e.g., learning
rate) is often necessary to achieve the best performance on a specific task.
Inference: Once trained, the MLP can be used for making predictions on new, unseen
data.
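A minimal sketch of the two loss functions mentioned above, assuming NumPy arrays of predictions and targets (illustrative only, not part of the original notes):

import numpy as np

def mse_loss(y_true, y_pred):
    # Mean Squared Error, typically used for regression
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def cross_entropy_loss(y_true, y_pred, eps=1e-12):
    # Binary cross-entropy, typically used for classification
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    y_true = np.asarray(y_true, dtype=float)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Example: predictions close to the targets give a small loss
print(mse_loss([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.2]))
print(cross_entropy_loss([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.2]))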
Multi-layer perceptrons are a fundamental building block of deep learning and have been extended into more complex architectures like Convolutional Neural Networks (CNNs) for image analysis and Recurrent Neural Networks (RNNs) for sequential data analysis. They are highly versatile and can be applied to a wide range of machine learning tasks.
MLP Learning Procedure: The MLP learning procedure works in three steps: Forward Pass, Error Calculation, and Backward Pass.
(Cycle: Forward Pass → Error Calculation → Backward Pass)
Forward Pass: Starting with the input layer, propagate the data forward to the output layer; this step is the forward propagation. Initialize the inputs (X1, X2, X3, ..., Xn) and random weights (W11, W12, ..., Wij) for the input layer, then calculate the weighted sum of inputs:
Sum = Σ (Wi Xi), i = 1, ..., n
After completing the calculation of the sum in each iteration, calculate the output value using the following equation:
Yi = 1 / (1 + e^(-Sum))
where Yi becomes the input for the corresponding hidden layer. This process continues until the final output Yc has been calculated.
Error Calculation: Based on the output, calculate the error (the difference between
the Predicted and Target Output). The error needs to be minimized.
Error= Target Output (Yt)-Calculated Output (Yc)
Backward Pass: If the calculated output and the target output do not match, the weights need to be updated. Find the derivative of the error with respect to each weight in the network and update the model. The formulas for this are:
Wij(new) = Wij(old) + ΔWij
ΔWij = η δj Yi, where η is the learning rate, δj is the error term of unit j, and Yi is the output of the previous layer. For the final output unit, δj = Yj (1 - Yj) (Yt - Yj). (A code sketch of these three steps follows.)
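A minimal Python sketch of these three steps for a tiny 2-2-1 network (illustrative; the worked example below uses a deeper 2-2-2-1 network, the weights here are made up, and only the output-layer update is shown, following the δj formula above, since the notes do not state the hidden-layer delta):

import math

def sigmoid(s):
    # Yi = 1 / (1 + e^(-Sum))
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_hidden, w_out):
    # Forward pass: each unit computes Sum = sum(Wi * Xi) and then the sigmoid
    h = [sigmoid(sum(w * xi for w, xi in zip(w_row, x))) for w_row in w_hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
    return h, y

def train_step(x, y_target, w_hidden, w_out, eta=1.0):
    # One forward pass, error calculation, and the output-layer backward pass
    h, y = forward(x, w_hidden, w_out)
    error = y_target - y                        # Error = Yt - Yc
    delta_out = y * (1 - y) * (y_target - y)    # delta_j = Yj(1 - Yj)(Yt - Yj)
    for i in range(len(w_out)):
        w_out[i] += eta * delta_out * h[i]      # Wij(new) = Wij(old) + eta*delta_j*Yi
    return error

# Hypothetical 2-2-1 network with made-up initial weights
x = [0.35, 0.90]
w_hidden = [[0.10, 0.20], [0.30, 0.10]]         # one weight row per hidden unit
w_out = [0.25, 0.15]
print("Error before update:", train_step(x, 0.5, w_hidden, w_out))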
Solutions:
Network (reconstructed from the figure): inputs X1 = 0.35 and X2 = 0.90 feed hidden layer-1 nodes H3 and H4 (through weights W13, W23, W14, W24); H3 and H4 feed hidden layer-2 nodes H5 and H6 (through weights W35, W45, W36, W46); H5 and H6 feed the output node H7, which produces Y7 (through weights W57, W67).
Step-1:
W14= 0.04 W24= 0.05 W36= 0.1 W46= 0.03 W67= 0.01
Target output: As per the above diagram, the target output should be achieved at Y7 =0.5
based on inputs from hidden layer-2 (H5=Y5 and H6=Y6). Again H5=Y5 and H6=Y6 will be
calculated based on inputs from hidden layer-1 (H3=Y3 and H4=Y4). Finally, H3=Y3 and
H4=Y4 will be calculated based on inputs from input layer X1=.35 and X2=.90.
Hidden Layer Inputs: It is assumed that the output of each hidden layer is passed on as the input of the next hidden layer. Each hidden-layer output is denoted Yi.
Forward Pass: The activation function is Yi = 1 / (1 + e^(-Sum)), and the sum is Sum = Σ (Wi Xi), i = 1, ..., n.
Weight Update:
ΔW67 = η δ7 Y6 = 1 * (-0.0025) * 0.52 = -0.001
ΔW57 = η δ7 Y5 = 1 * (-0.0025) * 0.51 = -0.001
ΔW46 = η δ6 Y4 = 1 * 0 * 0.51 = 0
Error Calculation:
Error= Target Output (Yt)-Calculated Output (Yc)
Error = 0.50 - 0.50
Error = 0.00
The error value is 0, so the target value has been reached, which completes the process for the MLP.
Repeat the forward pass with Updated weights:
y3:
sum = x1 * w13 + x2 * w23 = (0.35 * 0.01) + (0.9 * 0.07) = 0.07
y3 = 1 / (1 + e^(-sum)) = 1 / (1 + e^(-0.07)) = 0.51
y4:
sum = x1 * w14 + x2 * w24 = (0.35 * 0.04) + (0.9 * 0.05) = 0.06
y4 = 1 / (1 + e^(-sum)) = 1 / (1 + e^(-0.06)) = 0.51
y5:
sum = y3 * w35 + y4 * w45 = (0.09 * 0.51) + (0.02 * 0.51) = 0.06
y5 = 1 / (1 + e^(-sum)) = 1 / (1 + e^(-0.06)) = 0.51
y6:
sum = y3 * w36 + y4 * w46 = (0.1 * 0.51) + (0.03 * 0.51) = 0.07
y6 = 1 / (1 + e^(-sum)) = 1 / (1 + e^(-0.07)) = 0.52
y7:
sum = y5 * w57 + y6 * w67 = (0.03 * 0.52) + (0.01 * 0.52) = 0.01947
y7 = 1 / (1 + e^(-sum)) = 1 / (1 + e^(-0.01947)) = 0.50
• Error:
Targeted output – Calculated output = 0.50 – 0.50 = 0
The error value is 0, so the target value has been reached, which completes the process for the MLP.
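As a quick check (not part of the original solution), the repeated forward pass above can be reproduced in Python; differences of about ±0.01 against the hand-calculated values come from rounding of intermediate results:

import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

x1, x2 = 0.35, 0.90
# Weights as used in the repeated forward pass above
w13, w23 = 0.01, 0.07
w14, w24 = 0.04, 0.05
w35, w45 = 0.09, 0.02
w36, w46 = 0.10, 0.03
w57, w67 = 0.03, 0.01

y3 = sigmoid(x1 * w13 + x2 * w23)
y4 = sigmoid(x1 * w14 + x2 * w24)
y5 = sigmoid(y3 * w35 + y4 * w45)
y6 = sigmoid(y3 * w36 + y4 * w46)
y7 = sigmoid(y5 * w57 + y6 * w67)
print([round(v, 2) for v in (y3, y4, y5, y6, y7)])
# Compare with the hand-calculated y3..y7 above; small differences come from rounding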