DL Part 3

Chapter 1: Basic Concepts of Deep Learning

Artificial Intelligence: AI, or Artificial Intelligence, refers to the development of computer


systems or software that can perform tasks that typically require human intelligence. These
tasks include reasoning, problem-solving, learning, understanding natural language,
recognizing patterns, and making decisions. AI systems are designed to mimic human cognitive
functions and can be broadly categorized into two types:
Narrow AI (Weak AI): Narrow AI is designed for specific tasks or domains. It excels
at performing a well-defined set of functions but lacks the general intelligence and
versatility of human intelligence. Examples of narrow AI include voice assistants like
Siri and Alexa, recommendation systems used by streaming services, and image
recognition software.
General AI (Strong AI): General AI, often referred to as artificial general intelligence
(AGI), is the theoretical concept of AI systems that possess human-like intelligence and
can perform any intellectual task that a human can do. AGI would have the ability to
transfer knowledge and skills across different domains and adapt to new tasks
independently. AGI is still a goal for future research and development, and it does not
currently exist.
AI technologies rely on various techniques and approaches, including:
Natural Language Processing (NLP): NLP focuses on enabling computers to
understand, interpret, and generate human language. It is used in applications such as
chatbots, language translation, sentiment analysis, and text summarization.

Figure 1: Natural Language Processing


Computer Vision: Computer vision involves enabling machines to interpret and
understand visual information from the world, including images and videos.
Applications include facial recognition, object detection, and autonomous vehicles.
Figure 2: Computer Vision
Robotics: AI plays a significant role in the development of robots and autonomous
systems capable of interacting with their environment and performing tasks
autonomously. This includes industrial robots, drones, and self-driving cars.

Figure 3: Working Principle of Robot


Expert Systems: Expert systems are AI programs that mimic the decision-making
abilities of a human expert in a specific domain. They use knowledge bases and rules
to provide expert-level advice or recommendations.

Figure 4: Expert System


Machine Learning (ML): A subset of AI, machine learning involves training
algorithms on data to recognize patterns and make predictions or decisions. Types of
machine learning include supervised learning, unsupervised learning, and
reinforcement learning.

Figure 5: Machine Learning


Artificial Neural Network: The term "Artificial Neural Network" is derived from Biological
neural networks that develop the structure of a human brain. Similar to the human brain that
has neurons interconnected to one another, artificial neural networks also have neurons that are
interconnected to one another in various layers of the networks. These neurons are known as
nodes.
Each node computes a weighted total of its inputs, which is passed to an activation function to produce the output. The activation function decides whether a node should fire or not, and only the nodes that fire pass their signal on towards the output layer. Different activation functions are available and can be chosen depending on the sort of task we are performing.
Figure 6: Biological Neurons
The figure above illustrates the typical structure of a biological neural network. A typical artificial neural network looks like the following figure.

Figure 6: Artificial Neurons

Dendrites from Biological Neural Network represent inputs in Artificial Neural Networks, cell
nucleus represents Nodes, synapse represents Weights, and Axon represents Output.
Relationship between Biological neural network and artificial neural network:

Biological Neural Network      Artificial Neural Network
Dendrites                      Inputs
Cell nucleus                   Nodes
Synapse                        Weights
Axon                           Output
An Artificial Neural Network is an attempt, in the field of artificial intelligence, to mimic the network of neurons that makes up the human brain, so that computers can understand things and make decisions in a human-like manner. The artificial neural network is designed by programming computers to behave simply like interconnected brain cells.

There are around 100 billion neurons in the human brain, and each neuron is connected to somewhere between 1,000 and 100,000 other neurons. In the human brain, data is stored in a distributed manner, and we can extract more than one piece of this data in parallel from our memory when necessary. We can say that the human brain is made up of incredibly powerful parallel processors.

We can understand the artificial neural network with an example. Consider a digital logic gate that takes inputs and gives an output, such as an "OR" gate with two inputs: if one or both inputs are "On," the output is "On"; if both inputs are "Off," the output is "Off." Here the output depends only on the input. Our brain does not perform the same task: the relationship between outputs and inputs keeps changing, because the neurons in our brain are "learning."

The Architecture of an Artificial Neural Network: To understand the architecture of an artificial neural network, we first have to understand what a neural network consists of. A neural network consists of a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network. An artificial neural network primarily consists of three layers:

Figure 7: Architecture of an Artificial Neural Network


Input Layer: As the name suggests, it accepts inputs in several different formats provided by
the programmer.
Hidden Layer: The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.
Output Layer: The input goes through a series of transformations using the hidden layer,
which finally results in output that is conveyed using this layer.

The artificial neural network takes the inputs, computes the weighted sum of the inputs, and adds a bias. This computation is represented in the form of a transfer function.
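In symbols, with inputs $x_i$, weights $w_i$, bias $b$, and transfer (activation) function $f$, the output of a node is

$y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)$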

Deep Learning: Deep learning is a subset of machine learning that uses neural networks with
many layers (deep neural networks) to automatically extract features and representations from
data. It has been highly successful in tasks like image recognition and natural language
processing.

Figure 5: AI, Machine Learning and Deep Learning


Deep learning is a collection of statistical machine learning techniques for learning feature hierarchies, built on artificial neural networks.

Figure 6: Deep Learning


Chapter 2: Artificial Neural Network

Biological Neural Network: In living organisms, the brain is the control unit of the neural network, and it has different subunits that take care of vision, senses, movement, and hearing. The brain is connected through a dense network of nerves to the rest of the body's sensors and actuators. There are approximately 10¹¹ neurons in the brain, and these are the building blocks of the complete central nervous system of the living body.
The neuron is the fundamental building block of neural networks. In biological systems, a neuron is a cell just like any other cell of the body: it carries a DNA code and is generated in the same way as other cells. Although the DNA differs from organism to organism, the function of the neuron is similar in all of them. A neuron comprises three major parts: the cell body (also called the soma), the dendrites, and the axon. The dendrites are fibers branched in different directions and are connected to many cells in that cluster.
Dendrites receive the signals from surrounding neurons, and the axon transmits the signal to
the other neurons. At the ending terminal of the axon, the contact with the dendrite is made
through a synapse. Axon is a long fiber that transports the output signal as electric impulses
along its length. Each neuron has one axon. Axons pass impulses from one neuron to another
like a domino effect.

Figure 2.1: Biological Neuron


At age 20, the average weight of the male brain is approximately 1400 g, and by the age of 65
brain weight is approximately 1300 g. Brain weight for females follows a similar trend,
although the total weight is 100–150 g less than that of males.
Our body is composed of billions of nerve cells. Some nerve cells are as short as about 0.1 millimeters, while others can be as long as 1 meter.
Artificial Neural Network (ANN): The term "Artificial Neural Network" is derived from
Biological neural networks that develop the structure of a human brain. Similar to the human
brain that has neurons interconnected to one another, artificial neural networks also have
neurons that are interconnected to one another in various layers of the networks. These neurons
are known as nodes.

Figure 2.2: Representation of Biological vs. Artificial Neurons


An Artificial Neural Network (ANN) is a computational model inspired by the structure and
functioning of biological neural networks, such as the human brain. ANNs are used in machine
learning and artificial intelligence to solve a wide range of tasks, including pattern recognition,
classification, regression, and more. Here are some key features and components of artificial
neural networks:
Neurons (Nodes): Neurons in an ANN are the fundamental units of computation. They receive
input signals, perform mathematical operations on these inputs, and produce an output signal.
Neurons are also referred to as nodes or units.
Layers: ANNs are organized into layers. Typically, there are three main types of layers:
Input Layer: This layer receives the initial data or features.
Hidden Layers: These intermediate layers process the data through a series of
weighted connections and apply activation functions to produce complex
representations of the input.
Output Layer: This layer produces the final result or prediction.

Figure 2.3: Artificial Neural Networks (ANN)


Weights and Connections: The connections between neurons in different layers have
associated weights. These weights determine the strength of the connections. During the
training process, these weights are adjusted to make the network produce accurate output.
Activation Functions: Each neuron in a neural network typically has an activation function,
which determines its output based on the weighted sum of its inputs. Common activation
functions include Sigmoid, ReLU (Rectified Linear Unit), and Tanh.
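For reference, these common activation functions are defined as

$\text{Sigmoid}(x) = \frac{1}{1+e^{-x}}, \qquad \text{ReLU}(x) = \max(0, x), \qquad \tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$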
Feedforward and Backpropagation: In the feedforward process, data is passed through the
network from the input layer to the output layer. During training, backpropagation is used to
update the weights based on the error between the predicted output and the actual target values.
Training: Training an ANN involves using a dataset with known inputs and desired outputs to
adjust the network's weights and biases. Optimization techniques like gradient descent and
algorithms like backpropagation are commonly used to update these parameters and minimize
the prediction error.
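For example, plain gradient descent nudges each weight in the direction that reduces the loss $L$:

$w \leftarrow w - \eta \, \frac{\partial L}{\partial w}$

where $\eta$ is the learning rate.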
Artificial Neuron: An artificial neuron is a mathematical function inspired by biological
neurons in the human brain. In artificial intelligence (AI), particularly in the context of neural
networks, neurons are the basic unit of computation. Every artificial neuron is composed of at
least one input, a weight, a bias, and an activation function. Most neurons have more than one
input.
Each input comes with its own weight, which is adjusted during training to reflect the input’s
importance. In addition to the weighted inputs, each neuron has a bias. The bias can be viewed as an extra input with a fixed value (typically 1) and its own learnable weight; it allows the neuron to produce outputs other than zero even when all the inputs are zero. Here is an example of how an artificial neuron works (a small code sketch follows the list below):
o The neuron receives multiple input values.
o Each of these inputs is multiplied by its corresponding weight.
o All the weighted inputs are added up, and the bias is added to this sum.
o The result is passed through the activation function to produce the neuron’s output. The
activation function introduces non-linearity to the model, enabling it to learn from
errors and make adjustments.
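A minimal sketch of the steps listed above, assuming a sigmoid activation function (the function and variable names here are illustrative, not taken from the text):

# Sketch of a single artificial neuron: weighted sum plus bias, then activation.
import math

def neuron(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias   # weighted sum + bias
    return 1.0 / (1.0 + math.exp(-total))                        # sigmoid activation

print(neuron([0.35, 0.9], [0.01, 0.07], 0.0))   # prints roughly 0.52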

Figure 2.4: Artificial Neuron (AN)


Perceptron Algorithm: The Perceptron is a machine learning algorithm for the supervised learning of binary classification tasks. A Perceptron can also be understood as an artificial neuron, or neural network unit, that detects patterns in input data, for example in business intelligence applications.

The Perceptron model is also regarded as one of the simplest types of artificial neural networks. It is a supervised learning algorithm for binary classifiers, and it can be considered a single-layer neural network with four main parameters: input values, weights and bias, net sum, and an activation function.
Frank Rosenblatt invented the perceptron model as a binary classifier, which contains three main components. These are as follows:
Input Nodes or Input Layer: This is the primary component of Perceptron which
accepts the initial data into the system for further processing. Each input node contains
a real numerical value.
Weight and Bias: The weight parameter represents the strength of the connection between units and is another important Perceptron component. The larger the weight, the greater the influence of the associated input neuron in deciding the output. The bias can be thought of as the intercept term in a linear equation.
Activation Function: These are the final and important components that help to
determine whether the neuron will fire or not. Activation Function can be considered
primarily as a step function.
Types of Activation functions:
o Sign function
o Step function, and
o Sigmoid function

Figure 2.5: Activation Function


Step Function: $Y = \begin{cases} 1 & \text{if } Sum > 0 \\ 0 & \text{if } Sum \le 0 \end{cases}$

Sign Function: $Y = \begin{cases} +1 & \text{if } Sum > 0 \\ -1 & \text{if } Sum \le 0 \end{cases}$

Sigmoid Function: $Y = \dfrac{1}{1 + e^{-Sum}}$
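A small sketch of these three functions in Python (the function names are illustrative, not from the text):

# Sketch of the three activation functions listed above.
import math

def step(s):                  # 1 if Sum > 0, else 0
    return 1 if s > 0 else 0

def sign(s):                  # +1 if Sum > 0, else -1
    return 1 if s > 0 else -1

def sigmoid(s):               # 1 / (1 + e^(-Sum))
    return 1.0 / (1.0 + math.exp(-s))

print(step(0.3), sign(-0.2), round(sigmoid(0.0665), 2))   # 1 -1 0.52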

Types of Perceptron Models: Based on the layers, Perceptron models are divided into two
types. These are as follows:
o Single-layer Perceptron Model
o Multi-layer Perceptron model

Single-Layer Perceptron (SLP): A single layer perceptron (SLP) is a feed-forward network


based on a threshold transfer function. The SLP is the simplest type of artificial neural network and can only classify linearly separable cases with a binary target (1, 0).

Figure 2.6: Single-Layer Perceptron (SLP)


The inputs are x1 and x2, and the output is produced by a step function.

Input value or One input layer: The input layer of the perceptron is made of artificial input
neurons and takes the initial data into the system for further processing.
Weights and Bias:
Weight: It represents the strength of the connection between units. If the weight from node 1 to node 2 is larger, then neuron 1 has a greater influence on neuron 2.
Bias: It is the same as the intercept added in a linear equation. It is an additional parameter whose task is to shift the output, together with the weighted sum of the inputs, passed to the next neuron.
Net sum: It calculates the total weighted sum of the inputs.
Activation Function: Whether a neuron is activated or not is determined by the activation function. The activation function takes the weighted sum, adds the bias to it, and produces the result.
Algorithm
1. Create a perceptron with (n + 1) input neurons x0, x1, . . . xn, where x0 = 1 is the bias
input. Let O be the output neuron.
2. Initialize w = (w0, w1, . . ., wn) to random weights.
3. Iterate through the input patterns xj of the training set using the current weights, i.e., compute the weighted sum of inputs $net_j = \sum_{i=0}^{n} x_i w_i$ for each input pattern j.
4. Compute the output yj using the step function

$y_j = \begin{cases} 1 & \text{if } net_j > 0 \\ 0 & \text{if } net_j \le 0 \end{cases}$

5. Compare the computed output yj with the target output for each input pattern j. If all the input patterns have been classified correctly, output the weights and exit.
6. Otherwise, update the weights as follows: if the computed output yj is 1 but should have been 0, set wi = wi − αxi, i = 0, 1, 2, . . ., n; if the computed output yj is 0 but should have been 1, set wi = wi + αxi, i = 0, 1, 2, . . ., n. Here α is the learning parameter and is a constant.
7. Go to step 3.

Perceptron Weight Adjustment

wnew = wold + α (yt − yc) xi

Here α is the learning rate, yc is the computed output, yt is the target output, and x is the input data.
SLP sums all the weighted inputs and if the sum is above the threshold (some predetermined
value), SLP is said to be activated (output=1).

The input values are presented to the perceptron, and if the predicted output is the same as the
desired output, then the performance is considered satisfactory and no changes to the weights
are made. However, if the output does not match the desired output, then the weights need to
be changed to reduce the error.
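A one-line sketch of this update rule in Python (the variable names are illustrative; it applies wnew = wold + α(yt − yc)x element-wise to a single pattern):

# Perceptron update rule for one input pattern x with target yt and computed output yc.
def update_weights(w, x, yt, yc, alpha=0.2):
    return [wi + alpha * (yt - yc) * xi for wi, xi in zip(w, x)]

print(update_weights([0.3, 0.3], [1, 1], yt=0, yc=1))   # ≈ [0.1, 0.1], as in Problem-1 below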
Because the SLP is a linear classifier, if the cases are not linearly separable the learning process will never reach a point where all the cases are classified properly. The most famous example of the perceptron's inability to handle linearly non-separable cases is the XOR problem.
Error Calculation: e = (yt - yc)

Problem-1: Solve the following logical XOR operation using the Single-Layer Perceptron (SLP) algorithm, with learning rate α = 0.2.

Input          Output
X1    X2       Yt
0     0        0
0     1        1
1     0        1
1     1        0
Solution:
Iteration-1:
Input: x1 = 0 and x2 = 0
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*0 + 0.3*0
=0
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
$Y_c = \begin{cases} 1 & \text{if } net_i > 0 \\ 0 & \text{if } net_i \le 0 \end{cases}$
Since Net = 0, yc = 0. Here yt = 0 and yc = 0, where yt is the target output and yc is the computed output. Both values are the same, so the neuron has computed the correct output and this iteration is finished.
Iteration-2:
Input: x1 = 0 and x2 = 1
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*0 + 0.3*1
= 0.3
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
$Y_c = \begin{cases} 1 & \text{if } net_i > 0 \\ 0 & \text{if } net_i \le 0 \end{cases}$
Since Net = 0.3, yc = 1. Here yt = 1 and yc = 1. Both values are the same, so the neuron has computed the correct output and this iteration is finished.
Iteration-3:
Input: x1 = 1 and x2 = 0
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*1 + 0.3*0
= 0.3
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
$Y_c = \begin{cases} 1 & \text{if } net_i > 0 \\ 0 & \text{if } net_i \le 0 \end{cases}$
Since Net = 0.3, yc = 1. Here yt = 1 and yc = 1. Both values are the same, so the neuron has computed the correct output and this iteration is finished.
Iteration-4:
Input: x1 = 1 and x2 = 1
Random Weight: w1 = 0.3 w2 = 0.3
Net= ∑ wixi
= w1x1 + w2x2
= 0.3*1 + 0.3*1
= 0.6
Apply Step Activation Function to calculate the computed output yc. The step activation
function is:
$Y_c = \begin{cases} 1 & \text{if } net_i > 0 \\ 0 & \text{if } net_i \le 0 \end{cases}$
Since Net = 0.6, yc = 1. Here yt = 0 and yc = 1. The values do not match, so the weights must be updated and this iteration calculated again.

Now update the weights:

Wi(new) = Wi(old) + α(yt − yc)xi     [Here α is the learning rate, α = 0.2]
W1(new) = W1(old) + α(yt − yc)x1 = 0.3 + 0.2(0 − 1)(1) = 0.3 − 0.2 = 0.1
W2(new) = W2(old) + α(yt − yc)x2 = 0.3 + 0.2(0 − 1)(1) = 0.3 − 0.2 = 0.1
Iteration-4.1:
Input: x1 = 1 and x2 = 1
Updated Weight: w1 = 0.1 w2 = 0.1
Net = ∑ wixi
= w1x1 + w2x2
= 0.1*1 + 0.1*1
= 0.2
Apply the step activation function to calculate the computed output yc:
$Y_c = \begin{cases} 1 & \text{if } net_i > 0 \\ 0 & \text{if } net_i \le 0 \end{cases}$
Since Net = 0.2, yc = 1. Here yt = 0 and yc = 1. The values do not match, so the weights must be updated and this iteration calculated again.
Now update the weights:
Wi(new) = Wi(old) + α(yt − yc)xi     [Here α is the learning rate, α = 0.2]
W1(new) = W1(old) + α(yt − yc)x1 = 0.1 + 0.2(0 − 1)(1) = 0.1 − 0.2 = −0.1
W2(new) = W2(old) + α(yt − yc)x2 = 0.1 + 0.2(0 − 1)(1) = 0.1 − 0.2 = −0.1
Iteration-4.2:
Input: x1 = 1 and x2 = 1
Updated Weight: w1 = −0.1 w2 = −0.1
Net = ∑ wixi
= w1x1 + w2x2
= (−0.1)*1 + (−0.1)*1
= −0.2
Apply the step activation function to calculate the computed output yc:
$Y_c = \begin{cases} 1 & \text{if } net_i > 0 \\ 0 & \text{if } net_i \le 0 \end{cases}$
Since Net = −0.2 ≤ 0, yc = 0. Here yt = 0 and yc = 0. Both values are the same, so the neuron has computed the correct output and this iteration is finished.
After the third attempt at the last pattern, each training pattern has produced the target output at the time it was presented:

X1    X2    Yt    Yc (attempt 1)    Yc (attempt 2)    Yc (attempt 3)
0     0     0     0                 0                 0
0     1     1     1                 1                 1
1     0     1     1                 1                 1
1     1     0     1                 1                 0

Let us summarize the above process:


1. A single-layer perceptron takes the following training examples:

• Input (0, 0) should yield an output of 0.

• Input (0, 1) should yield an output of 1.

• Input (1, 0) should yield an output of 1.


• Input (1, 1) should yield an output of 0.

2. Initialize Weights: Start by initializing the weights to small random values. For
simplicity, let's initialize the weights as (0.3, 0.3).

a. Weight 1 (w1) = 0.3

b. Weight 2 (w2) = 0.3

3. Define the Activation Function: In this case, we'll use a step function as the activation function. The step function returns 1 if the input is greater than 0 and returns 0 otherwise.

4. Training: Now, we'll iterate through the training examples and adjust the weights and
bias based on the error between the predicted output and the actual output.

a. Input (0, 0):

i. Predicted output (yc) = fstep( (.3 * 0) + (.3 * 0)) = fstep( 0 ) = 0

ii. Actual output (target) = 0

iii. Error (E) = target - Predicted output (yc) = 0 - 0 = 0

b. Input (0, 1):

i. Predicted output (yc) = fstep( (.3 * 0) + (.3 * 1)) = fstep( 0.3) = 1

ii. Actual output (target) = 1

iii. Error (E) = target - yc = 1 - 1 = 0

c. Input (1, 0):

i. Predicted output (yc) = fstep( (.3 * 1) + (.3 * 0)) = fstep( 0.3 ) = 1

ii. Actual output (target) = 1

iii. Error (E) = target - yc = 1 - 1 = 0

d. Input (1, 1):

i. Predicted output (yc) = fstep( (.3 * 1) + (.3 * 1)) = fstep( 0.6 ) = 1

ii. Actual output (target) = 0

iii. Error (E) = target - (yc) = 0 – 1 = -1

iv. Update weights and bias:

1. w1 = w1 + (learning_rate * E * input1)
2. w2 = w2 + (learning_rate * E * input2)

5. Repeat: Continue this training process for multiple epochs until the error becomes
sufficiently small or converges to a solution.

6. Final Weights: Once the training is complete, you will have the final weights and
bias values, which will allow you to make predictions for new input data.

Applications of Artificial Neural Network (ANN): Artificial Neural Networks (ANNs) have
a broad range of applications across various domains due to their ability to model complex,
non-linear relationships in data. Some common applications of ANNs include:
Image and Video Processing:
Image Classification: ANNs are used to classify objects within images, such as in
facial recognition, object detection, and character recognition.
Image Generation: They can generate realistic images, including deep dream images
and style transfer.
Video Analysis: ANNs are employed in video surveillance, tracking objects in videos,
and identifying anomalies.
Natural Language Processing (NLP):
Machine Translation: ANNs are used for language translation, such as Google
Translate.
Sentiment Analysis: They determine the sentiment of text data for applications in
social media monitoring and customer feedback analysis.
Text Generation: ANNs can generate human-like text, which is useful for chatbots and
content creation.
Speech Recognition: ANNs are critical in converting spoken language into text for voice
assistants and transcription services.
Recommendation Systems: ANNs power recommendation engines, as seen in content
recommendations on streaming platforms like Netflix or product recommendations on e-
commerce websites.
Autonomous Vehicles: Object Detection and Tracking: ANNs are used in self-driving cars to
detect and track objects on the road.
Path Planning: They assist in determining safe and efficient routes for autonomous vehicles.
Healthcare:
Medical Image Analysis: ANNs analyze medical images like X-rays and MRIs to
assist in diagnosis and treatment planning.
Disease Prediction: They are used to predict the risk of various diseases based on
patient data.
Finance:
Credit Scoring: ANNs help assess credit risk by analyzing an individual's credit
history.
Algorithmic Trading: They are employed in financial markets for algorithmic trading
and risk management.
Environmental Modeling: ANNs can model and predict climate patterns, aiding in weather
forecasting and climate change analysis.
Social Media and Content Generation: They generate text, images, and other content, used
in chatbots, creative content generation, and recommendation engines.
Fraud Detection: ANNs are used for fraud detection, such as credit card fraud detection and
cybersecurity.
Quality Control: In manufacturing, ANNs are used for quality control and defect detection in
production processes.
Energy Efficiency: ANNs can optimize energy consumption in buildings and industrial
processes by predicting energy demand and optimizing systems.
Customer Relationship Management (CRM): ANNs help businesses in customer
segmentation, churn prediction, and customer behavior analysis.
Biological and Medical Research: They are used for simulating biological systems and
predicting the behavior of biological molecules.
Drug Discovery: ANNs are used in drug discovery to predict the properties of molecules and
identify potential drug candidates.
Agriculture: They assist in crop disease detection, yield prediction, and pest control.
SLP of Artificial Neural Network (ANN) Development Using Python:

import random

# Data Generation: build m random binary input patterns of length n,
# random binary targets y, and random initial weights w.
m = int(input("Total Data in Row: "))
n = int(input("Total Data in col: "))
fd = []   # input patterns
y = []    # target outputs
w = []    # weights
for i in range(m):
    pd = []
    for j in range(n):
        gd = random.randint(0, 1)
        pd.append(gd)
    fd.append(pd)
    cd = random.randint(0, 1)
    y.append(cd)
for j in range(n):
    wd = random.random()
    w.append(wd)
print("Input Data: ", fd)
print("Target Data: ", y)
print("Weight Data: ", w)

# Algorithm: for each pattern, compute the net sum, apply the step
# function, and adjust the weights until the pattern is classified.
alph = 0.2    # learning rate
net = 0       # weighted sum for the current pattern
for i in range(m):
    for j in range(n):
        print("Input Val:", fd[i][j])
        net = net + (w[j] * fd[i][j])
    if net > 0:
        yc = 1
    else:
        yc = 0
    if y[i] == yc:
        print("Success in FT!!! and Value: ", yc)
        net = 0
    else:
        wn = []
        for j in range(n):
            wn.append(w[j])
        print("WN-", wn)
        while y[i] != yc:
            print("Failure!!! and Value: ", yc)
            net = 0
            for j in range(n):
                wn[j] = wn[j] + alph * (y[i] - yc) * fd[i][j]
                net = net + (wn[j] * fd[i][j])
            if net > 0:
                yc = 1
            else:
                yc = 0
            if y[i] == yc:
                print("Success!!! and Value: ", yc)
                net = 0
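Note that this script is only a minimal illustration of the SLP update rule: each randomly generated pattern is handled in isolation, the corrected weights wn are not written back into w for the following patterns, and a generated pattern whose inputs are all zero but whose target is 1 cannot be satisfied by the inner loop. For real data, the updates would be repeated over all patterns for several epochs, as described in the algorithm above.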

Multi-Layer Perceptron (MLP): A Multi-Layer Perceptron (MLP) is a type of artificial


neural network (ANN) designed for supervised machine learning tasks, particularly for
classification and regression. It is one of the simplest and most commonly used neural network
architectures. An MLP consists of multiple layers of interconnected artificial neurons
(perceptrons) arranged in a feedforward fashion.

The Multilayer Perceptron was developed to tackle the limitation of the single-layer perceptron, which can only learn linearly separable problems. It is a neural network where the mapping between inputs and output is non-linear.

A Multilayer Perceptron has input and output layers, and one or more hidden layers with many neurons stacked together. And while in the Perceptron the neuron must use an activation function that imposes a threshold, such as the step function, neurons in a Multilayer Perceptron can use any arbitrary activation function, such as the sigmoid or the rectified linear unit (ReLU).
Figure 2.7: Multi-Layer Perceptron (MLP)

Key Features of MLP:

Figure 2.8: Multi-Layer Perceptron (MLP)

Input Layer: The first layer of an MLP is called the input layer, and it consists of
neurons that receive the input data features. Each neuron in the input layer represents a
feature or attribute of the data.

Hidden Layers: Between the input and output layers, there can be one or more hidden
layers. These layers contain artificial neurons that perform computations and
transformations on the input data. The term "hidden" comes from the fact that these
layers are not directly exposed to the external environment but exist to process
information.

Output Layer: The last layer of the MLP is the output layer, which provides the
network's final prediction or output. The number of neurons in this layer depends on
the type of problem (e.g., binary classification, multi-class classification, regression).

Neurons (Perceptrons): Each neuron in an MLP is a simple processing unit that


performs a weighted sum of its inputs, applies an activation function, and produces an
output. The activation function introduces non-linearity into the model, enabling it to
learn complex relationships in the data.

Weight and Bias: Each connection between neurons has an associated weight, which
determines the strength of the connection, and a bias term, which allows the neuron to
shift its decision boundary.

Activation Functions: Common activation functions used in MLPs include the


sigmoid, hyperbolic tangent (tanh), and rectified linear unit (ReLU). These functions
introduce non-linearity, which is essential for the network's ability to model complex
patterns.

Training an MLP involves iteratively adjusting the weights and biases to minimize a
loss function, which measures the difference between the network's predictions and the
actual target values in the training data. Gradient-based optimization algorithms, such
as stochastic gradient descent (SGD), are commonly used for this purpose.

MLPs are versatile and can be applied to a wide range of machine learning tasks, including
image classification, natural language processing, speech recognition, and regression
problems. However, they may not be the best choice for all tasks, especially when dealing with
structured data like images or sequences. The architecture and hyperparameters of an MLP,
including the number of layers, the number of neurons in each layer, and the choice of
activation functions, should be tuned to match the specific characteristics of the problem at
hand.
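As a concrete illustration (a sketch assuming scikit-learn is installed; the layer size, activation, and solver below are illustrative choices, not prescribed by the text), a small MLP can learn the XOR function that the single-layer perceptron cannot:

# Fitting XOR with scikit-learn's MLPClassifier; hyperparameters may need tuning to converge.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # XOR inputs
y = [0, 1, 1, 0]                       # XOR targets

mlp = MLPClassifier(hidden_layer_sizes=(4,), activation='tanh',
                    solver='lbfgs', random_state=1, max_iter=1000)
mlp.fit(X, y)
print(mlp.predict(X))   # expected [0 1 1 0] once the model has converged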

Applications of Multi-layer Perceptron: Multi-layer perceptrons have been used in a wide


variety of applications. Some of the most common applications of MLPs include:

Image recognition: MLPs can be trained to recognize patterns in images and classify
them into different categories. This is useful in applications such as facial recognition,
object detection, and image segmentation.

Natural Language Processing (NLP): MLPs can be used to understand and generate
human language. This is useful in applications such as text-to-speech, machine
translation, and sentiment analysis.

Predictive Modelling: It can be used to make predictions based on past data. This is
useful in applications such as stock market prediction, weather forecasting, and fraud
detection.

Medical Diagnosis: Can be used to diagnose diseases or interpret medical images by


recognizing patterns in the data.

Recommender Systems: MLPs can be used to analyze a user's preferences and


behaviour to recommend products or content.

Algorithm of Multi-Layer Perceptron (MLP): A Multi-Layer Perceptron (MLP) is a type of


artificial neural network that consists of multiple layers of interconnected neurons or artificial
neurons, which are also known as nodes or units. It is a feedforward neural network, meaning
that information flows in one direction, from the input layer through one or more hidden layers
to the output layer. MLPs are widely used in machine learning and deep learning for various
tasks, including classification, regression, and pattern recognition. Here's an overview of the
algorithm and key components of a Multi-Layer Perceptron:

Architecture:
Input Layer: The input layer consists of neurons that take the feature values of
your data as input. Each neuron corresponds to a feature in your dataset.

Hidden Layers: One or more hidden layers with multiple neurons in each layer.
These hidden layers are responsible for learning complex patterns in the data.

Output Layer: The output layer produces the final output or prediction. The
number of neurons in the output layer depends on the problem type (e.g., binary
classification, multi-class classification, regression).

Activation Functions: Each neuron in the hidden and output layers typically uses an
activation function to introduce non-linearity into the model. Common activation
functions include ReLU (Rectified Linear Unit), Sigmoid, and Tanh.

Weights and Biases: Each connection between neurons has an associated weight, which the network adjusts during training to learn the optimal values. Each neuron also has a bias term.

Forward Propagation: The input data is fed into the input layer. Signals are
propagated through the network in a feedforward manner. The output of each neuron
in a layer is a weighted sum of the inputs passed through the activation function.
This process continues through the hidden layers to the output layer, producing the final
prediction.

Loss Function: A loss function measures the difference between the predicted output
and the actual target values. Common loss functions include Mean Squared Error
(MSE) for regression and Cross-Entropy for classification.
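For example, for $n$ training examples with targets $y_i$ and predictions $\hat{y}_i$, the mean squared error is

$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$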

Backpropagation: To update the weights and biases, backpropagation is used. It


calculates the gradient of the loss function with respect to the network's parameters
(weights and biases) and adjusts them using optimization algorithms like stochastic
gradient descent (SGD).

Training: Training involves iteratively feeding the training data through the network,
computing the loss, and updating the weights and biases to minimize the loss.
Training typically includes multiple epochs (iterations) over the training data until the
model converges.

Regularization: Regularization techniques like L1 or L2 regularization, dropout, and


early stopping can be applied to prevent overfitting.

Hyperparameter Tuning: Tuning the architecture of the MLP (e.g., the number of
hidden layers, the number of neurons in each layer) and hyperparameters (e.g., learning
rate) is often necessary to achieve the best performance on a specific task.
Inference: Once trained, the MLP can be used for making predictions on new, unseen
data.

Multi-Layer Perceptrons are a fundamental building block of deep learning and have
been extended into more complex architectures like Convolutional Neural Networks
(CNNs) for image analysis and Recurrent Neural Networks (RNNs) for sequential data
analysis. They are highly versatile and can be applied to a wide range of machine
learning tasks.

MLP Learning Procedure: The MLP learning procedure works in three steps: Forward Pass, Error Calculation, and Backward Pass:

Forward Pass → Error Calculation → Backward Pass

Forward Pass: Starting with the input layer, propagate the data forward to the output layer; this step is the forward propagation. Initialize the inputs (X1, X2, X3, ..., Xn) and random weights (W11, W12, ..., Wij) for the input layer, and calculate the weighted sum of inputs:
$Sum = \sum_{i=1}^{n} W_i X_i$
After calculating the sum in each iteration, calculate the output value using the following equation:
$Y_i = \frac{1}{1 + e^{-Sum}}$
where Yi becomes the input to the corresponding hidden layer. This process continues until the final output Yc has been calculated.

Error Calculation: Based on the output, calculate the error (the difference between
the Predicted and Target Output). The error needs to be minimized.
Error= Target Output (Yt)-Calculated Output (Yc)
Backward Pass: If the calculated output and the target output do not match, the weights need to be updated. Find the derivative of the error with respect to each weight in the network and update the model. The formula for this is $W_{ij}(new) = \Delta W_{ij} + W_{ij}(old)$, where $\Delta W_{ij} = \eta \, \delta_j \, Y_i$. Here η is the learning rate, δj is the error term, and Yi is the output of the previous layer. Again, $\delta_j = Y_j (1 - Y_j)(Y_t - Y_j)$ if Yj is the final output unit, or $\delta_j = Y_j (1 - Y_j) \sum_k \delta_k W_{jk}$ if Yj is the output of a hidden layer, where δk are the error terms of the following layer's outputs.


Repeat the three steps above over multiple epochs until the target output is reached.
Problem 1: Suppose an MLP has one input layer and two hidden layers. The input layer consists of two inputs, x1 = 0.35 and x2 = 0.9, with target output yt = 0.5. Each hidden layer consists of two nodes.

Solution:

Figure 2.9: Diagram of the Multi-Layer Perceptron (MLP). Inputs X1 = 0.35 and X2 = 0.90 feed hidden nodes H3 and H4 (weights W13, W14, W23, W24), which feed hidden nodes H5 and H6 (weights W35, W36, W45, W46), which feed the output node Y7 (weights W57, W67).

Step-1:

Input: X1 = 0.35 and X2 = 0.90 and Target Output: Yt=.5

Initial Weight Distribution:


W13= 0.01 W23= 0.07 W35= 0.09 W45= 0.02 W57= 0.03

W14= 0.04 W24= 0.05 W36= 0.1 W46= 0.03 W67= 0.01

Target output: As per the above diagram, the target output should be achieved at Y7 =0.5
based on inputs from hidden layer-2 (H5=Y5 and H6=Y6). Again H5=Y5 and H6=Y6 will be
calculated based on inputs from hidden layer-1 (H3=Y3 and H4=Y4). Finally, H3=Y3 and
H4=Y4 will be calculated based on inputs from input layer X1=.35 and X2=.90.

Hidden Layer Inputs: It has been assumed that output of each hidden layer is passed through as input
of next hidden layer. Each hidden layer output has been defined as Yi.
Forward Pass: The activation function is $Y_i = \frac{1}{1+e^{-Sum}}$, and the sum is $Sum = \sum_{i=1}^{n} W_i X_i$.

Calculation of the output Y3:
Sum = X1*W13 + X2*W23 = (0.35*0.01) + (0.9*0.07) = 0.0665
$Y_3 = \frac{1}{1+e^{-Sum}} = \frac{1}{1+e^{-0.0665}} = 0.51$

Calculation of the output Y4:
Sum = X1*W14 + X2*W24 = (0.35*0.04) + (0.9*0.05) = 0.059
$Y_4 = \frac{1}{1+e^{-Sum}} = \frac{1}{1+e^{-0.059}} = 0.51$

Calculation of the output Y5:
Sum = Y3*W35 + Y4*W45 = (0.51*0.09) + (0.51*0.02) = 0.0561
$Y_5 = \frac{1}{1+e^{-Sum}} = \frac{1}{1+e^{-0.0561}} = 0.51$

Calculation of the output Y6:
Sum = Y3*W36 + Y4*W46 = (0.51*0.1) + (0.51*0.03) = 0.0663
$Y_6 = \frac{1}{1+e^{-Sum}} = \frac{1}{1+e^{-0.0663}} = 0.52$

Calculation of the output Y7:
Sum = Y5*W57 + Y6*W67 = (0.51*0.03) + (0.52*0.01) = 0.0205
$Y_7 = \frac{1}{1+e^{-Sum}} = \frac{1}{1+e^{-0.0205}} = 0.51$

Step-2: Error Calculation:

Error = Target Output (Yt) − Calculated Output (Yc)
Error = 0.50 − 0.51 = −0.01

Backward Pass: The error is −0.01, meaning there is a 0.01 difference between the target and the calculated output, so all the weights must be updated through backpropagation: $W_{ij}(new) = \Delta W_{ij} + W_{ij}(old)$, where $\Delta W_{ij} = \eta \, \delta_j \, Y_i$. Here η is the learning rate, δj is the error term, and Yi is the output of the previous layer. Again, $\delta_j = Y_j(1-Y_j)(Y_t - Y_j)$ if Yj is the final output unit, or $\delta_j = Y_j(1-Y_j)\sum_k \delta_k W_{jk}$ if Yj is the output of a hidden layer, where δk are the error terms of the following layer's outputs. Here we take the learning rate η = 1.

Error Calculation for each Output and Hidden Layer:

Error for Y7: Since Y7 is the final output, we use δ7 = Y7 (1 − Y7)(Yt − Y7):
δ7 = 0.51 * (1 − 0.51) * (0.5 − 0.51) = −0.0025

Error for Y6: Since Y6 is a hidden-layer output connected to the output Y7 (K = 7, J = 6), we use δj = Yj(1 − Yj) Σ δk Wjk:
δ6 = Y6 * (1 − Y6) * δ7 * W67 = 0.52 * (1 − 0.52) * (−0.0025) * 0.01 ≈ 0

Error for Y5: Since Y5 is a hidden-layer output connected to the output Y7 (K = 7, J = 5), we use δj = Yj(1 − Yj) Σ δk Wjk:
δ5 = Y5 * (1 − Y5) * δ7 * W57 = 0.51 * (1 − 0.51) * (−0.0025) * 0.03 ≈ −0.00002

Error for Y4: Since Y4 is a hidden-layer output connected to the outputs of hidden layer 2 (H5 and H6; K = 5, 6 and J = 4), we use δj = Yj(1 − Yj) Σ δk Wjk:
δ4 = Y4 * (1 − Y4) * (δ5 * W45 + δ6 * W46) = 0.51 * (1 − 0.51) * {(−0.00002) * 0.02 + (0) * 0.03} ≈ 0

Error for Y3: Since Y3 is a hidden-layer output connected to the outputs of hidden layer 2 (H5 and H6; K = 5, 6 and J = 3), we use δj = Yj(1 − Yj) Σ δk Wjk:
δ3 = Y3 * (1 − Y3) * (δ5 * W35 + δ6 * W36) = 0.51 * (1 − 0.51) * {(−0.00002) * 0.09 + (0) * 0.1} ≈ 0

Weight Update: Using $\Delta W_{ij} = \eta \, \delta_j \, Y_i$ and $W_{ij}(new) = \Delta W_{ij} + W_{ij}(old)$:

ΔW67 = η δ7 Y6 = 1 * (−0.0025) * 0.52 ≈ −0.001;      W67(new) = −0.001 + 0.01 = 0.009
ΔW57 = η δ7 Y5 = 1 * (−0.0025) * 0.51 ≈ −0.001;      W57(new) = −0.001 + 0.03 = 0.029
ΔW46 = η δ6 Y4 = 1 * 0 * 0.51 = 0;                   W46(new) = 0 + 0.03 = 0.03
ΔW45 = η δ5 Y4 = 1 * (−0.00002) * 0.51 ≈ −0.00001;   W45(new) = −0.00001 + 0.02 ≈ 0.02
ΔW36 = η δ6 Y3 = 1 * 0 * 0.51 = 0;                   W36(new) = 0 + 0.1 = 0.1
ΔW35 = η δ5 Y3 = 1 * (−0.00002) * 0.51 ≈ −0.00001;   W35(new) = −0.00001 + 0.09 ≈ 0.09
ΔW24 = η δ4 X2 = 1 * 0 * 0.9 = 0;                    W24(new) = 0 + 0.05 = 0.05
ΔW23 = η δ3 X2 = 1 * 0 * 0.9 = 0;                    W23(new) = 0 + 0.07 = 0.07
ΔW14 = η δ4 X1 = 1 * 0 * 0.35 = 0;                   W14(new) = 0 + 0.04 = 0.04
ΔW13 = η δ3 X1 = 1 * 0 * 0.35 = 0;                   W13(new) = 0 + 0.01 = 0.01

Updated Weights:
W13 = 0.01    W14 = 0.04
W23 = 0.07    W24 = 0.05
W35 = 0.09    W36 = 0.1
W45 = 0.02    W46 = 0.03
W57 = 0.029   W67 = 0.009

Repeat the forward pass with the updated weights:

Calculation of the output Y3:
Sum = X1*W13 + X2*W23 = (0.35*0.01) + (0.9*0.07) = 0.0665
$Y_3 = \frac{1}{1+e^{-0.0665}} = 0.51$

Calculation of the output Y4:
Sum = X1*W14 + X2*W24 = (0.35*0.04) + (0.9*0.05) = 0.059
$Y_4 = \frac{1}{1+e^{-0.059}} = 0.51$

Calculation of the output Y5:
Sum = Y3*W35 + Y4*W45 = (0.51*0.09) + (0.51*0.02) = 0.0561
$Y_5 = \frac{1}{1+e^{-0.0561}} = 0.51$

Calculation of the output Y6:
Sum = Y3*W36 + Y4*W46 = (0.51*0.1) + (0.51*0.03) = 0.0663
$Y_6 = \frac{1}{1+e^{-0.0663}} = 0.52$

Calculation of the output Y7:
Sum = Y5*W57 + Y6*W67 = (0.51*0.029) + (0.52*0.009) = 0.01947
$Y_7 = \frac{1}{1+e^{-0.01947}} = 0.50$

Error Calculation:
Error = Target Output (Yt) − Calculated Output (Yc) = 0.50 − 0.50 = 0.00

The error value is now 0, so we have reached the target output, which completes the MLP process.
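As a final sketch (not part of the original text), the following Python script reproduces the forward pass, error terms, and weight updates of this worked example, assuming the same sigmoid activation and learning rate η = 1; because the hand calculation rounds intermediate values to two decimals, the printed numbers differ slightly from those above:

# Sketch of the 2-2-2-1 MLP from the worked example: forward pass, deltas, weight updates.
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

x1, x2, yt, eta = 0.35, 0.9, 0.5, 1.0
w = {"13": 0.01, "14": 0.04, "23": 0.07, "24": 0.05,
     "35": 0.09, "36": 0.1,  "45": 0.02, "46": 0.03,
     "57": 0.03, "67": 0.01}

for epoch in range(2):                      # two passes, as in the example
    # Forward pass through the two hidden layers to the output Y7
    y3 = sigmoid(x1 * w["13"] + x2 * w["23"])
    y4 = sigmoid(x1 * w["14"] + x2 * w["24"])
    y5 = sigmoid(y3 * w["35"] + y4 * w["45"])
    y6 = sigmoid(y3 * w["36"] + y4 * w["46"])
    y7 = sigmoid(y5 * w["57"] + y6 * w["67"])
    print(f"pass {epoch + 1}: y7 = {y7:.4f}, error = {yt - y7:.4f}")

    # Backward pass: error terms for the output unit and each hidden unit
    d7 = y7 * (1 - y7) * (yt - y7)
    d6 = y6 * (1 - y6) * d7 * w["67"]
    d5 = y5 * (1 - y5) * d7 * w["57"]
    d4 = y4 * (1 - y4) * (d5 * w["45"] + d6 * w["46"])
    d3 = y3 * (1 - y3) * (d5 * w["35"] + d6 * w["36"])

    # Weight updates: W_new = eta * delta_j * Y_i + W_old
    w["57"] += eta * d7 * y5;  w["67"] += eta * d7 * y6
    w["35"] += eta * d5 * y3;  w["45"] += eta * d5 * y4
    w["36"] += eta * d6 * y3;  w["46"] += eta * d6 * y4
    w["13"] += eta * d3 * x1;  w["23"] += eta * d3 * x2
    w["14"] += eta * d4 * x1;  w["24"] += eta * d4 * x2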