Machine Learning
Unit-2
Backpropagation Algorithm
Dr. Himanshu Sharma,
Associate Professor & Head (R&D)
NIET, Gr. Noida
Outline of the Presentation
• Introduction
• Perceptron
• Back propagation algorithm
• Three main Phases of Back propagation Algo:
• Feed Forward
• Calculate Errors
• Adjust Weights
Introduction:
• Backpropagation, short for "backward propagation of errors", is a common method of training artificial neural networks (ANNs).
• The method calculates the gradient of a loss function with respect to all the weights (w) in the network.
• The gradient is fed to an optimization method (typically gradient descent), which in turn uses it to update the weights in an attempt to minimize the loss function.
Historical Background:
• Donald Hebb (1949) gave the hypothesis in his book "The Organization of Behavior":
• "Neural pathways are strengthened every time they are used."
• Early attempts to implement artificial neural networks: McCulloch (a neuroscientist) and Pitts (a logician)
• Based on simple neurons (the McCulloch–Pitts neuron model)
• Based on logical functions only
Perceptron:
• It is a step function applied to a linear combination of real-valued inputs: if the combination is above a threshold it outputs 1, otherwise it outputs –1.
• A perceptron can only learn examples that are linearly separable.
[Figure: a perceptron unit: inputs x1, x2, …, xn with weights w1, w2, …, wn, plus a bias input x0 = 1 with weight w0, feed a summation Σ whose thresholded output is 1 or –1.] [3]
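The perceptron described above can be sketched as a short function (the weights in the example are hand-chosen for illustration, not learned):

```python
# A minimal sketch of a perceptron unit, assuming a +/-1 step activation
# and a bias weight w0 attached to a fixed input x0 = 1.

def perceptron(x, w, w0):
    """Return 1 if the weighted sum exceeds the threshold, else -1."""
    s = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s > 0 else -1

# Example: a linearly separable function (logical AND on {-1, 1} inputs),
# with weights chosen by hand for illustration.
print(perceptron([1, 1], [0.5, 0.5], -0.5))   # -> 1
print(perceptron([-1, 1], [0.5, 0.5], -0.5))  # -> -1
```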
Back propagation Algorithm:
Delta Rule for weight update:
• The delta rule is used for calculating the gradient step that updates the weights.
• We minimize the following error (E):
  E = ½ Σi (ti – oi)²
  where ti = target output of the ith node, oi = actual output of the ith node.
• For a new input training data example X = (x1, x2, …, xn), update each weight wi according to this rule:
  wi = wi + Δwi, where Δwi = –η · ∂E/∂wi and η = learning rate of the neurons.
• Since ∂E/∂wi = –Σi (ti – oi) xi, the update becomes:
  Δwi = η Σi (ti – oi) xi
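The delta rule above can be sketched as a batch update for a single linear output node (the function name, data, and learning rate below are illustrative assumptions):

```python
# A minimal sketch of the delta rule for one linear node with output
# o = w . x, minimizing E = 1/2 * sum (t - o)^2 over the training data.

def delta_rule_update(w, X, t, eta=0.1):
    """One batch update: w_i <- w_i + eta * sum_d (t_d - o_d) * x_di."""
    n = len(w)
    dw = [0.0] * n
    for x, target in zip(X, t):
        o = sum(wi * xi for wi, xi in zip(w, x))   # actual output
        for i in range(n):
            dw[i] += eta * (target - o) * x[i]     # delta-rule step
    return [wi + dwi for wi, dwi in zip(w, dw)]

# Example: learning w so that o ~ 2*x for 1-D inputs.
w = [0.0]
for _ in range(50):
    w = delta_rule_update(w, X=[[1.0], [2.0]], t=[2.0, 4.0], eta=0.05)
print(w)  # w[0] approaches 2.0
```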
Phases of Back Propagation Algorithm:
• Phase 1: Feed forward the ANN
• Phase 2: Back propagation of errors
• Phase 3: Update of weights
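The three phases can be sketched for a tiny 2-2-1 sigmoid network; the network size, initial weights, target, and learning rate below are illustrative assumptions, not taken from the slides:

```python
import math

# A minimal sketch of one backpropagation iteration for a 2-2-1 network
# with sigmoid units (all sizes and values here are illustrative).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(x, t, W1, W2, eta=0.5):
    # Phase 1: feed forward the ANN
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(len(x))))
         for j in range(len(W1))]
    o = sigmoid(sum(W2[j] * h[j] for j in range(len(h))))
    # Phase 2: back propagate errors (delta terms)
    delta_o = (t - o) * o * (1 - o)
    delta_h = [delta_o * W2[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    # Phase 3: update weights
    W2 = [W2[j] + eta * delta_o * h[j] for j in range(len(h))]
    W1 = [[W1[j][i] + eta * delta_h[j] * x[i] for i in range(len(x))]
          for j in range(len(W1))]
    return W1, W2, o

W1 = [[0.1, 0.2], [0.3, -0.1]]   # hidden-layer weights (illustrative)
W2 = [0.2, -0.3]                 # output-layer weights (illustrative)
for _ in range(200):
    W1, W2, o = backprop_step([1.0, 0.0], 1.0, W1, W2)
print(round(o, 2))  # the output moves toward the target 1.0
```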
Back propagation Algorithm (Small Version)
[2]
What is gradient descent Rule?
• Backpropagation calculates the gradient (the sensitivity to small changes) of the network's error with respect to the network's modifiable weights.
• This gradient is then used in a gradient descent algorithm to find weights that minimize the error.
[Figure: a bowl-shaped error surface E(W) plotted over the weights w1 and w2; gradient descent follows the slope toward the minimum.] [3]
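Gradient descent on such an error surface can be sketched with a simple quadratic stand-in, E(w1, w2) = w1² + w2² (the surface and learning rate are illustrative assumptions):

```python
# A minimal sketch of gradient descent on E(w1, w2) = w1^2 + w2^2,
# whose gradient is (2*w1, 2*w2) and whose minimum is at (0, 0).

def gradient_descent(w, eta=0.1, steps=100):
    for _ in range(steps):
        grad = [2 * wi for wi in w]                      # dE/dw_i
        w = [wi - eta * gi for wi, gi in zip(w, grad)]   # descend the slope
    return w

w = gradient_descent([3.0, -4.0])
print(w)  # both weights approach the minimum at (0, 0)
```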
Learning Rates
• Different learning rates affect the performance of a neural network
significantly.
• Optimal Learning Rate:
• Leads to the error minimum in one learning step.
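The effect of the learning rate can be sketched on the 1-D error E(w) = w², where the gradient is 2w; for this surface the update w ← w − η·2w reaches the minimum in exactly one step when η = 0.5 (the surface and rates below are illustrative assumptions):

```python
# A minimal sketch of learning-rate effects on E(w) = w^2 (gradient 2w).

def one_step(w, eta):
    """One gradient descent step: w <- w - eta * dE/dw."""
    return w - eta * 2 * w

for eta in (0.1, 0.5, 0.9):
    w = one_step(1.0, eta)
    print(eta, w)  # 0.1 -> 0.8 (slow), 0.5 -> 0.0 (optimal), 0.9 -> -0.8 (overshoot)
```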
Limitations of the back propagation algorithm
• It is not guaranteed to find the global minimum of the error function; it may get trapped in a local minimum.
• Improvements:
• Add momentum.
• Use stochastic gradient descent.
• Train multiple networks with different initial values for the weights.
• Back propagation learning does not require normalization of input vectors; however, normalization can improve performance.
• Standardize all features prior to training.
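Standardizing a feature (zero mean, unit variance) before training can be sketched as follows (the helper name and sample data are illustrative assumptions):

```python
# A minimal sketch of feature standardization: subtract the mean and
# divide by the (population) standard deviation.

def standardize(column):
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n
    std = var ** 0.5 or 1.0   # guard against a zero-variance feature
    return [(x - mean) / std for x in column]

out = standardize([2.0, 4.0, 6.0])
print(out)  # the result has zero mean and unit variance
```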
References
• Artificial Neural Networks, https://en.wikipedia.org/wiki/Artificial_neural_network
• Backpropagation Algorithm, https://en.wikipedia.org/wiki/Backpropagation
• GeeksforGeeks, Backpropagation in Data Mining, https://www.geeksforgeeks.org/backpropagation-in-data-mining/
Thank You