CO2 - ANN Structure and Fundamentals - P1
INSTRUCTIONAL OBJECTIVES
LEARNING OUTCOMES
Networks
Reticular Theory
Joseph von Gerlach proposed that the nervous system is a single continuous network, as opposed to a network of many discrete cells (1871-1873).
Staining Technique
Camillo Golgi discovered a chemical reaction that allowed him to examine nervous
tissue in much greater detail than ever before. He was a proponent of Reticular
theory (1871-1873).
Neuron Doctrine
Santiago Ramón y Cajal used Golgi’s technique to study the nervous system and
proposed that it is actually made up of discrete individual cells forming a network
(as opposed to a single continuous network) (1888-1891).
Nobel Prize
Both Golgi (reticular theory) and Cajal (neuron doctrine) were jointly awarded the 1906 Nobel Prize in Physiology or Medicine, which resulted in lasting conflicting ideas and controversy between the two scientists.
Computers vs. Neural Networks
(Comparison table: "Standard" Computers vs. Neural Networks)
USE CASES OF ANN LEARNING
(Figure: a McCulloch-Pitts unit with binary inputs x1, x2, x3 and output y ∈ {0, 1}, shown alongside the AND and OR functions it realises by thresholding the sum of its inputs.)
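As an aside (not from the slides), a minimal Python sketch of how such a unit computes AND and OR by thresholding the sum of its binary inputs; the function name and the threshold values are illustrative assumptions:

def mcculloch_pitts(inputs, threshold):
    """Return 1 if the sum of binary inputs meets the threshold, else 0."""
    return 1 if sum(inputs) >= threshold else 0

# AND over two inputs: fire only when both inputs are 1 (threshold = 2)
print(mcculloch_pitts([1, 1], threshold=2))  # 1
print(mcculloch_pitts([1, 0], threshold=2))  # 0

# OR over two inputs: fire when at least one input is 1 (threshold = 1)
print(mcculloch_pitts([0, 1], threshold=1))  # 1
print(mcculloch_pitts([0, 0], threshold=1))  # 0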
Perceptron Learning Algorithm
Our goal is to find a weight vector w that perfectly classifies the positive and negative inputs in our data. The learning algorithm is sketched below.
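A minimal Python sketch of the standard perceptron learning rule; the NumPy-based function, the {+1, -1} label convention and the bias handling are assumptions for illustration, not taken from the slide:

import numpy as np

def perceptron_train(X, y, max_epochs=100):
    """X: (n_samples, n_features) inputs; y: labels in {+1, -1}.
    Returns a weight vector w (bias folded in as an extra constant feature)."""
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # append a constant bias input of 1
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:   # misclassified (or on the boundary)
                w += yi * xi              # move the boundary toward classifying xi correctly
                errors += 1
        if errors == 0:                   # converged: every point is classified correctly
            break
    return w

A point is misclassified when y * (w . x) <= 0, and the update w <- w + y * x nudges the decision boundary toward classifying that point correctly.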
This basic architecture is classically also known as the "Perceptron" (not to be confused with the Perceptron "algorithm", which learns a linear classification model). It cannot, however, learn nonlinear functions or nonlinear decision boundaries. Source: NPTEL IIT
Limitations of Classic Non-Linear Models
Non-linear models: kNN, kernel methods, generative classification, decision trees, etc.
All have their own disadvantages:
kNN and kernel methods are expensive at prediction time.
Kernel-based and generative models restrict the decision boundary to a particular class of functions, e.g. quadratic polynomials or Gaussian functions.
Decision trees require optimization over many somewhat arbitrary hyperparameters to give good results, and are (somewhat) expensive to generate predictions from.
This is not a deal-breaker: the most common competitor to deep learning on large datasets tends to be some decision-tree derivative.
In general, non-linear ML models are COMPLICATED beasts.
Source: NPTEL IIT
Multi-layer Perceptron (MLP)
An MLP consists of an input layer, an output layer, and one or more hidden layers
(Figure: an MLP with an Input Layer of D = 3 visible units, a Hidden Layer of K = 2 hidden units, and an Output Layer with a scalar-valued output. The hidden-layer units/nodes act as new features, connected by learnable weights. The model can be thought of as a combination of the predictions of two simpler models.)
• Feedforward Neural Networks are artificial neural networks where the connections between units do not form a cycle.
• Feedforward neural networks were the first type of artificial neural network devised.
• They are called feedforward because information only travels forward in the network (no loops): first through the input nodes, then through the hidden nodes (if present), and finally through the output nodes.
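A minimal forward-pass sketch in Python matching the sizes in the figure (D = 3 inputs, K = 2 hidden units, scalar output); the sigmoid activation and the random weights are assumptions for illustration:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 3))   # hidden-layer weights (K x D)
b1 = np.zeros(2)
W2 = rng.normal(size=(1, 2))   # output-layer weights (1 x K)
b2 = np.zeros(1)

def forward(x):
    h = sigmoid(W1 @ x + b1)       # hidden units act as new, learned features
    return sigmoid(W2 @ h + b2)    # scalar output

print(forward(np.array([0.5, -1.0, 2.0])))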
BUILDING BLOCKS OF NEURAL NETWORK
Neurons
The building blocks for neural networks are artificial
neurons.
ACTIVATION FUNCTIONS
TRAINING OF MLP
Initialization
• Initialize weights and bias of the network
Forward propagation-step
• Starting with the input layer, propagate data forward to the output layer.
• Based on the output, calculate the error (the difference between the predicted and the known outcome).
• The error needs to be minimized.
Back propagation -step
• Backpropagate the error.
• Find its derivative with respect to each weight in the network, and update the model.
Repeat the forward-propagation and backpropagation steps above over multiple epochs to learn the ideal weights; a minimal end-to-end sketch follows.
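Putting the four steps together, a small Python sketch of training an MLP with squared-error loss and gradient descent; the 3-2-1 architecture, sigmoid activations, toy data and learning rate are illustrative assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))                    # 8 examples, D = 3 features
t = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy targets in {0, 1}

W1, b1 = rng.normal(size=(3, 2)), np.zeros(2)  # initialization step
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)
eta = 0.5

for epoch in range(1000):                      # repeat over multiple epochs
    # forward propagation
    h = sigmoid(X @ W1 + b1)                   # hidden activations
    y = sigmoid(h @ W2 + b2)                   # predictions
    err = y - t                                # error to be minimized (squared-error loss)

    # backpropagation: derivatives of the loss w.r.t. each weight
    d_out = err * y * (1 - y)                  # delta at the output layer
    d_hid = (d_out @ W2.T) * h * (1 - h)       # delta at the hidden layer

    # gradient-descent weight updates
    W2 -= eta * h.T @ d_out
    b2 -= eta * d_out.sum(axis=0)
    W1 -= eta * X.T @ d_hid
    b1 -= eta * d_hid.sum(axis=0)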
Single Hidden Layer, more succinctly: we combine the pre-activation and post-activation and show only the value computed by each hidden node, and we directly show the final output.
Different layers may use different non-linear activations. Output layer may have none.
ReLU and Leaky ReLU are among the most popular activations. Leaky ReLU helps fix the "dead neuron" problem of ReLU when the pre-activation a is a negative number.
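A quick Python illustration of the two activations; the leak coefficient alpha = 0.01 is a common default, assumed here rather than taken from the slide:

import numpy as np

def relu(a):
    return np.maximum(0.0, a)             # h = max(0, a)

def leaky_relu(a, alpha=0.01):
    return np.where(a > 0, a, alpha * a)  # small slope for a < 0 keeps neurons from "dying"

a = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(a))        # [0.   0.   0.   1.5]
print(leaky_relu(a))  # [-0.02  -0.005  0.     1.5]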
A nonlinear classification problem
SOLVE THE PROBLEM USING SIGMOID
ACTIVATION FUNCTION – CALCULATE OUTPUTS
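The exercise's own numbers are not reproduced here, so the Python sketch below only illustrates the mechanics of a sigmoid unit with made-up inputs and weights:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values (assumptions, not from the slide):
# inputs x1 = 0.05, x2 = 0.10, weights w1 = 0.4, w2 = 0.6, bias b = 0.3
net = 0.4 * 0.05 + 0.6 * 0.10 + 0.3   # weighted sum = 0.38
out = sigmoid(net)                    # ≈ 0.594
print(net, out)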
TOPICS TO BE COVERED
• Travel back from the output layer to the hidden layer to adjust
the weights such that the error is decreased.
• Keep repeating the process until the desired output is achieved
A Step by Step BACK PROPAGATION Example
The goal of backpropagation is to optimize the weights so that the neural network can learn how to
correctly map arbitrary inputs to outputs.
We will work with a single training example: given inputs 0.05 and 0.10, we want the neural network to output 0.01 and 0.99.
BACK PROPAGATION WORKING METHOD
For example, the target output for o1 is 0.01 but the neural network outputs 0.75136507, therefore its error is:

E_o1 = (1/2)(target_o1 − out_o1)^2 = (1/2)(0.01 − 0.75136507)^2 ≈ 0.2748111

Repeating this process for o2 (remembering that the target is 0.99) we get E_o2. The total error for the neural network is the sum of these errors:

E_total = E_o1 + E_o2
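A small Python check of this error computation; out_o2 is an illustrative placeholder because only out_o1 = 0.75136507 is quoted above:

target_o1, out_o1 = 0.01, 0.75136507
target_o2, out_o2 = 0.99, 0.77          # out_o2 is an assumed value, not from the slide

E_o1 = 0.5 * (target_o1 - out_o1) ** 2  # ≈ 0.2748111
E_o2 = 0.5 * (target_o2 - out_o2) ** 2
E_total = E_o1 + E_o2                   # total error = sum of the per-output errors
print(E_o1, E_o2, E_total)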
The Backwards Pass
Our goal with backpropagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and for the network as a whole.

Output Layer

Consider a single output-layer weight w: we want to know how much a change in w affects the total error, i.e. ∂E_total/∂w. By the chain rule,

∂E_total/∂w = (∂E_total/∂out_o1) · (∂out_o1/∂net_o1) · (∂net_o1/∂w)

When we take the partial derivative of the total error with respect to out_o1, the E_o2 term becomes zero because out_o1 does not affect it, which means we are taking the derivative of a constant, which is zero.
Next, how much does the output of o1 change with respect to its total net input? The partial derivative of the logistic function is the output multiplied by one minus the output:

∂out_o1/∂net_o1 = out_o1 (1 − out_o1)

Finally, how much does the total net input of o1 change with respect to w? Because the net input is a weighted sum, this partial derivative is simply the activation of the unit that w connects from.

Putting it all together:

∂E_total/∂w = −(target_o1 − out_o1) · out_o1 (1 − out_o1) · out_h, where out_h is the activation of the unit that w connects from.

You'll often see this calculation combined in the form of the delta rule:
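A Python rendering of the same chain-rule product; out_o1 is the value quoted earlier, while the hidden-unit activation h is an assumed placeholder:

target_o1, out_o1 = 0.01, 0.75136507
h = 0.59                              # activation of the unit feeding this weight (assumed)

dE_dout = out_o1 - target_o1          # ∂E_total/∂out_o1 = −(target − out)
dout_dnet = out_o1 * (1 - out_o1)     # derivative of the logistic function
dnet_dw = h                           # the net input is linear in the weight
dE_dw = dE_dout * dout_dnet * dnet_dw # chain rule: the "delta rule" quantity
print(dE_dw)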
The Delta rule in machine learning and neural network environments is a specific type of
backpropagation that helps to refine connectionist ML/AI networks, making connections
between inputs and outputs with layers of artificial neurons.
Another way to explain the Delta rule is that it uses an error function to perform gradient
descent learning.
Apply the Delta rule
To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we'll set to 0.5):

w_new = w_old − eta · ∂E_total/∂w

We can repeat this process to obtain the new values of the remaining output-layer weights.
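A one-line illustration of the update in Python; w_old and the gradient value are placeholders, and only eta = 0.5 comes from the text:

eta = 0.5
w_old = 0.40
dE_dw = 0.0822                # gradient from the chain-rule step above (assumed value)
w_new = w_old - eta * dE_dw   # subtract the gradient scaled by the learning rate
print(w_new)                  # 0.3589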
GOING BACKWARDS BACK-PROPAGATION OF ERROR
• Computing the errors at the output is no more difficult than it was for the Perceptron, but working out
what to do with those errors is more difficult. The method that we are going to look at is called back-
propagation of error, which makes it clear that the errors are sent backwards through the network.
• There are actually just three things that you need to know:
• Chain Rule
• Gradient Descent
• Delta Rule
BACK-PROPAGATION OF ERROR
The weights of the network are trained so that the error goes downhill until it reaches a local minimum, just like a ball
rolling under gravity.
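A one-dimensional Python sketch of this "rolling downhill" picture; the error function E(w) = (w − 3)^2 and the step size are assumptions chosen only to show the descent:

def dE_dw(w):
    return 2 * (w - 3)        # gradient of E(w) = (w - 3)^2, minimum at w = 3

w, eta = 0.0, 0.1
for step in range(50):
    w -= eta * dE_dw(w)       # move against the gradient, i.e. downhill
print(w)                      # close to 3.0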
ACTIVATION FUNCTION
• So now we’ve got a new form of error computation and a new activation function that decides whether or not a neuron should fire.
We can differentiate it, so that when we change the weights, we do it in the direction that is downhill for the error, which means
that we know we are improving the error function of the network.
• As far as an algorithm goes, we’ve fed our inputs forward through the network and worked out which nodes are firing.
• Now, at the output, we’ve computed the errors as the sum-squared difference between the outputs and the targets.
• What do we want to do next? We compute the gradient of these errors and use it to decide how much to update each weight in the network. We will do that first for the nodes connected to the output layer; after we have updated those, we will work backwards through the network until we get back to the inputs again. There are just two problems:
• for the output neurons, we don’t know the inputs.
• for the hidden neurons, we don’t know the targets; for extra hidden layers, we know neither the inputs nor the targets, but even
this won’t matter for the algorithm we derive.
THE ADVANTAGES OF USING A BACKPROPAGATION ALGORITHM ARE AS
FOLLOWS:
• It does not have any parameters to tune except for the number of inputs.
• It is highly adaptable and efficient and does not require any prior knowledge
about the network.
• It is a standard process that usually works well.
• It is user-friendly, fast and easy to program.
• Users do not need to learn any special functions.
THE DISADVANTAGES OF USING A BACKPROPAGATION ALGORITHM ARE AS
FOLLOWS:
WEB REFERENCES
[1] https://www.guru99.com/backpropogation-neural-network.html
[2] https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
[3] http://neuralnetworksanddeeplearning.com/chap2.html
Conclusion
Artificial Neural Networks are an imitation of biological neural networks, but much simpler ones.
Computing has a lot to gain from neural networks: their ability to learn by example makes them very flexible and powerful; furthermore, there is no need to devise an algorithm in order to perform a specific task.
REFERENCES
Craig Heller and David Sadava, Life: The Science of Biology, fifth edition, Sinauer Associates, Inc., USA, 1998.
Nicolas Galoppo von Borries, Introduction to Artificial Neural Networks.
Tom M. Mitchell, Machine Learning, WCB McGraw-Hill, Boston, 1997.
Q. How does each neuron work in ANNs? What is backpropagation?
A neuron receives input from many other neurons; changes its internal state (activation) based on the current input; and sends one output signal to many other neurons, possibly including its input neurons (in which case the ANN is a recurrent network).
THANK YOU