ML Unit 2
UNIT – II
Multi-layer Perceptron – Going Forwards – Going Backwards: Back
Propagation Error – Multi-layer Perceptron in Practice – Examples of
using the MLP – Overview – Deriving Back-Propagation – Radial Basis
Functions and Splines – Concepts – RBF Network – Curse of
Dimensionality – Interpolations and Basis Functions – Support Vector
Machines
Multi-layer Perceptron :
The multilayer perceptron is an artificial neural network structure and is a nonparametric estimator that can be used
for classification and regression. We discuss the back propagation algorithm to train a multilayer perceptron for a
variety of applications.
We have pretty much decided that the learning in the neural network happens in the weights, so to perform more
computation it seems sensible to add more weights. There are two things that we can do: add some backwards
connections, so that the output neurons connect to the inputs again, or add more neurons.
The first approach leads to recurrent networks. These have been studied, but are not that commonly used. We will
instead consider the second approach, which leads to the Multi-layer Perceptron.
We can check that a prepared network can solve the two-dimensional XOR problem, something that we have seen is
not possible for a linear model like the Perceptron.
A suitable network is shown in Figure 4.2. To check that it gives the correct answers, all that is required is to put in
each input and work through the network, treating it as two different Perceptrons: first compute the activations of
the neurons in the middle layer (labelled C and D in Figure 4.2), and then use those activations as the inputs to the
single neuron at the output. The following slides work through each input pattern in this way.
XOR written as a sum of products: y = x1·x̄2 + x̄1·x2 (two AND-like terms combined by OR).
Step activation function: f(y_in) = 1 if y_in ≥ 0, otherwise 0.
Multi-layer Perceptron with XOR — first hidden neuron Z1

X1 X2 Z1
0  0  0
0  1  0
1  0  1
1  1  0

Let the initial weights be w11 = 1 and w21 = 1, with threshold = 1 and learning rate n = 1.5 (output = 1 if Z_in ≥ threshold, else 0).

(0,0): Z_in = Σ w_ij·x_i = 1·0 + 1·0 = 0 < 1, so out = 0 — correct, no update.
(0,1): Z_in = 1·0 + 1·1 = 1 ≥ 1, so out = 1 but the target is 0, so the update rule w_ij = w_ij + n·(t − o)·x_i gives
w11 = 1 + 1.5·(0 − 1)·0 = 1,  w21 = 1 + 1.5·(0 − 1)·1 = −0.5.
Multi-layer Perceptron with XOR — first function Z1 = x1·x̄2

X1 X2 Z1
0  0  0
0  1  0
1  0  1
1  1  0

With the updated weights w11 = 1 and w21 = −0.5 (threshold = 1), Z1 is now computed correctly for every pattern, e.g.
(0,0): Z_in = Σ w_ij·x_i = 1·0 + (−0.5)·0 = 0 < 1, so out = 0, matching the target;
(1,0): Z_in = 1·1 + (−0.5)·0 = 1 ≥ 1, so out = 1, matching the target.
No further updates of w11 and w21 are needed.
Multi-layer Perceptron with XOR — second function Z2 = x̄1·x2

X1 X2 Z2
0  0  0
0  1  1
1  0  0
1  1  0

Starting again from w12 = 1 and w22 = 1 (threshold = 1, learning rate n = 1.5), the pattern (1,0) gives out = 1 but target = 0, so the update rule w_ij = w_ij + n·(t − o)·x_i gives
w12 = 1 + 1.5·(0 − 1)·1 = −0.5,  w22 = 1 + 1.5·(0 − 1)·0 = 1.
With the updated weights w12 = −0.5 and w22 = 1, all four patterns for Z2 are classified correctly. Z1 and Z2 can also be trained directly in code, as the sketch below shows.
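The same training procedure can be written compactly. Below is a minimal Python sketch (the function and variable names are illustrative, not from the notes) of the update rule w_ij = w_ij + n·(t − o)·x_i applied to the Z1 truth table; Z2 is trained in exactly the same way with its own truth table.

def train_perceptron(patterns, w, threshold=1.0, n=1.5, epochs=10):
    # patterns: list of ((x1, x2), target); w: initial weights [w1, w2]
    for _ in range(epochs):
        errors = 0
        for x, t in patterns:
            z_in = sum(wi * xi for wi, xi in zip(w, x))
            o = 1 if z_in >= threshold else 0            # step activation with threshold
            if o != t:
                w = [wi + n * (t - o) * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:                                  # stop once every pattern is correct
            break
    return w

# Z1 = x1 AND (NOT x2): target is 1 only for the input (1, 0)
z1_patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 0)]
print(train_perceptron(z1_patterns, w=[1.0, 1.0]))       # -> [1.0, -0.5], as derived above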
• Y = Z1 OR Z2
• => y_in = Z1·v1 + Z2·v2
• => v1 = 1; v2 = 1, with threshold = 1 and learning rate = 1.5

X1 X2 Z1 Z2 Y
0  0  0  0  0
0  1  0  1  1
1  0  1  0  1
1  1  0  0  0

✓ (0,0): y_in = 1·0 + 1·0 = 0 < 1, so out = 0 (correct)
✓ (0,1): y_in = 1·0 + 1·1 = 1 ≥ 1, so out = 1 (correct)
The patterns (1,0) and (1,1) check out in the same way, so the two-layer network computes XOR; a complete check in code follows below.
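Putting the three trained neurons together, the following minimal Python sketch (names are illustrative) checks the hand-built two-layer network on all four inputs and reproduces the XOR truth table.

def step(z, threshold=1.0):
    return 1 if z >= threshold else 0

def xor_mlp(x1, x2):
    z1 = step(1.0 * x1 + (-0.5) * x2)    # hidden neuron Z1 = x1 AND (NOT x2)
    z2 = step(-0.5 * x1 + 1.0 * x2)      # hidden neuron Z2 = (NOT x1) AND x2
    return step(1.0 * z1 + 1.0 * z2)     # output neuron Y = Z1 OR Z2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_mlp(x1, x2))   # prints 0 0 0, 0 1 1, 1 0 1, 1 1 0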
•In machine learning, backpropagation is an effective algorithm used to train artificial neural
networks, especially in feed-forward neural networks.
•Backpropagation works by computing the gradient of the cost function with respect to the network's weights; minimising this cost function is what training means. The gradient is computed using the chain rule from calculus, applied layer by layer to navigate through the layers of the neural network.
Advantages of Using the Backpropagation Algorithm in Neural Networks
Backpropagation, a fundamental algorithm in training neural networks, offers several
advantages that make it a preferred choice for many machine learning tasks. Here, we discuss
some key advantages of using the backpropagation algorithm:
1.Ease of Implementation: Backpropagation does not require prior knowledge of neural networks,
making it accessible to beginners. Its straightforward nature simplifies the programming
process, as it primarily involves adjusting weights based on error derivatives.
2.Simplicity and Flexibility: The algorithm’s simplicity allows it to be applied to a wide range of
problems and network architectures. Its flexibility makes it suitable for various scenarios,
from simple feedforward networks to complex recurrent or convolutional neural networks.
3.Efficiency: Backpropagation accelerates the learning process by directly updating weights based
on the calculated error derivatives. This efficiency is particularly advantageous in training
deep neural networks, where learning features of a function can be time-consuming.
4.Generalization: Backpropagation enables neural networks to generalize well to unseen data by
iteratively adjusting weights during training. This generalization ability is crucial for
developing models that can make accurate predictions on new, unseen examples.
5.Scalability: Backpropagation scales well with the size of the dataset and the complexity of the
network. This scalability makes it suitable for large-scale machine learning tasks, where
training data and network size are significant factors.
Working of Backpropagation Algorithm
The backpropagation algorithm works in two passes: a forward pass and a backward pass.

In the forward pass, the input is first fed into the input layer; the raw input values are used directly as the activations of the input-layer neurons. The inputs and their corresponding weights are passed to the hidden layer, which performs the computation on the data it receives. If there are two hidden layers in the network, as in the illustration fig(a), with h1 and h2 the two hidden layers, then the output of h1 is used as the input of h2. The bias is added to the weighted sum before the activation function is applied.

In the hidden layer, the activation function is applied to the weighted sum of inputs at each of its neurons. A commonly used activation function is ReLU, which returns the input if it is positive and zero otherwise. This introduces non-linearity into the model, which enables the network to learn complex relationships in the data. Finally, the weighted outputs from the last hidden layer are fed into the output layer to compute the final prediction; this layer often uses the softmax activation function, which converts the weighted outputs into probabilities for each class. A small sketch of such a forward pass is given below.
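The forward pass just described can be sketched in a few lines of Python. The weights, layer sizes and input values below are illustrative placeholders (they are not taken from the notes); the structure — weighted sum plus bias, ReLU in the hidden layer, softmax at the output — follows the description above.

import math

def relu(z):
    return max(0.0, z)

def softmax(zs):
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def hidden_layer(inputs, weights, bias):
    # weights[j][i] is the weight from input i to hidden neuron j
    return [relu(sum(w * x for w, x in zip(w_j, inputs)) + bias) for w_j in weights]

x = [0.5, -1.2, 3.0]                                          # raw input vector
h = hidden_layer(x, [[0.1, -0.4, 0.2], [0.3, 0.8, -0.5]], bias=0.1)
logits = [sum(w * hj for w, hj in zip(w_k, h)) + 0.2          # output-layer weighted sums
          for w_k in [[1.0, -1.0], [0.5, 0.5]]]
probs = softmax(logits)                                       # class probabilities
print(h, probs)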
Working of Backpropagation Algorithm
•The key calculation during the backward pass is determining the gradients for each weight and bias in
the network. This gradient is responsible for telling us how much each weight/bias should be adjusted
to minimize the error in the next forward pass. The chain rule is used iteratively to calculate this
gradient efficiently.
•In addition to the gradient calculation itself, the activation function plays a crucial role in backpropagation: the gradients are computed with the help of the derivative of the activation function.
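For the logistic (sigmoid) activation used in the worked example that follows, this derivative has a particularly convenient form: it can be written using only the neuron's output from the forward pass. A small Python sketch:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(out):
    # 'out' is the neuron's forward-pass output sigmoid(net)
    return out * (1.0 - out)            # d out / d net = out * (1 - out)

out_o1 = sigmoid(1.1058)                # output of O1 in the worked example, ~0.7513
print(sigmoid_derivative(out_o1))       # ~0.1868, the value used below when updating w5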
Working of Backpropagation Algorithm with an example

Network used in the example (see the figure on the slides): inputs x1 = 0.05 and x2 = 0.10; two hidden neurons h1, h2; two output neurons O1, O2; weights w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30 from the inputs to the hidden layer; weights w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55 from the hidden layer to the outputs; biases b1 = 0.35 and b2 = 0.60; target outputs 0.01 for O1 and 0.99 for O2.

Part 1: forward propagation

Step 1: calculate h1 (in and out), where h1(in) = w1·x1 + w2·x2 + b1.
h1(in) = 0.15·0.05 + 0.20·0.10 + 0.35 = 0.3775
h1(out) = 1 / (1 + e^(−h1(in))) = 1 / (1 + e^(−0.3775)) = 0.593270 ≈ 0.5932
Step 2: calculate h2 (in and out), where h2(in) = w3·x1 + w4·x2 + b1.
h2(in) = 0.25·0.05 + 0.30·0.10 + 0.35 = 0.0125 + 0.03 + 0.35 = 0.3925
h2(out) = 1 / (1 + e^(−0.3925)) = 0.596884 ≈ 0.5968
Step 3: calculate O1 (in and out), where O1(in) = h1(out)·w5 + h2(out)·w6 + b2.
O1(in) = 0.5932·0.40 + 0.5968·0.45 + 0.60 = 1.1058
O1(out) = 1 / (1 + e^(−1.1058)) = 0.751365 ≈ 0.7513
Step 4: calculate O2 (in and out), where O2(in) = h1(out)·w7 + h2(out)·w8 + b2.
O2(in) = 0.5932·0.50 + 0.5968·0.55 + 0.60 = 1.22484
O2(out) = 1 / (1 + e^(−1.22484)) = 0.772928 ≈ 0.7729
Step 5: calculate the total error E_total, where E represents the error.
E_total = Σ ½·(target − output)² = E_O1 + E_O2
E_O1 = ½·(0.01 − 0.751365)² = ½·(−0.741365)² = 0.274811
E_O2 = ½·(0.99 − 0.772928)² = ½·(0.217072)² = 0.023560
E_total = 0.274811 + 0.023560 = 0.298371 (approx. 0.2984)
Part 2: backward propagation of the error

(a) Output layer to hidden layer: update w5, w6, w7, w8.
Update rule: W5* = W5 − n·(∂E_total/∂W5), where ∂ denotes a partial derivative and n is the learning rate, here 0.6.

By the chain rule,
∂E_total/∂W5 = (∂E_total/∂out O1) · (∂out O1/∂net O1) · (∂net O1/∂W5)
∂E_total/∂out O1 = (out O1 − target O1) = 0.751365 − 0.01 = 0.741365
∂out O1/∂net O1 = out O1·(1 − out O1) = 0.751365·(1 − 0.751365) = 0.186815602
∂net O1/∂W5 = out h1 = 0.593270
∂E_total/∂W5 = 0.741365 · 0.186815602 · 0.593270 = 0.08216704

W5* = W5 − n·(∂E_total/∂W5) = 0.4 − (0.6 · 0.08216704) = 0.350699776

The weights w6, w7 and w8 are updated in exactly the same way, using the corresponding output neuron and hidden-layer output.
(b) Hidden layer to input layer: update w1, w2, w3, w4.
Update rule: W1* = W1 − n·(∂E_total/∂W1).

By the chain rule,
∂E_total/∂W1 = (∂E_total/∂out h1) · (∂out h1/∂net h1) · (∂net h1/∂W1)

Because h1 feeds both output neurons,
∂E_total/∂out h1 = ∂E_O1/∂out h1 + ∂E_O2/∂out h1, with
∂E_O1/∂out h1 = (∂E_O1/∂net O1) · (∂net O1/∂out h1)
∂E_O2/∂out h1 = (∂E_O2/∂net O2) · (∂net O2/∂out h1)

Contribution from O2:
∂E_O2/∂out O2 = (out O2 − target O2) = 0.772928 − 0.99 = −0.217072
∂out O2/∂net O2 = out O2·(1 − out O2) = 0.772928·(1 − 0.772928) = 0.175510
∂E_O2/∂net O2 = (∂E_O2/∂out O2) · (∂out O2/∂net O2) = −0.217072 · 0.175510 = −0.038098
∂net O2/∂out h1 = w7 = 0.50 (the weight on the link from h1 to O2)
∂E_O2/∂out h1 = −0.038098 · 0.50 = −0.019049

Contribution from O1:
∂E_O1/∂out O1 = (out O1 − target O1) = 0.751365 − 0.01 = 0.741365
∂out O1/∂net O1 = out O1·(1 − out O1) = 0.751365·(1 − 0.751365) = 0.186816
∂E_O1/∂net O1 = 0.741365 · 0.186816 = 0.138499
∂net O1/∂out h1 = w5 = 0.40 (the weight on the link from h1 to O1)
∂E_O1/∂out h1 = 0.138499 · 0.40 = 0.055399

Adding the two contributions:
∂E_total/∂out h1 = 0.055399 + (−0.019049) = 0.036350

The remaining two factors:
∂out h1/∂net h1 = out h1·(1 − out h1) = 0.593270·(1 − 0.593270) = 0.241301
∂net h1/∂W1 = ∂(w1·x1 + w2·x2 + b1)/∂w1 = x1 = 0.05

∂E_total/∂W1 = 0.036350 · 0.241301 · 0.05 = 0.0004386

W1* = W1 − n·(∂E_total/∂W1) = 0.15 − (0.6 · 0.0004386) = 0.15 − 0.0002631 = 0.1497369

The weights w2, w3 and w4 are updated in the same way. The code sketch below reproduces this whole worked example numerically.
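The whole worked example can be reproduced with a short Python sketch. It performs one forward pass, computes E_total, and applies the gradient-descent updates for w5 and w1 with learning rate n = 0.6; the variable names are illustrative, and the code is an illustration of the calculation above rather than a general training library.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# network parameters from the worked example
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30        # input  -> hidden
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55        # hidden -> output
b1, b2 = 0.35, 0.60
t_o1, t_o2 = 0.01, 0.99                         # target outputs
n = 0.6                                         # learning rate

# forward pass
out_h1 = sigmoid(w1 * x1 + w2 * x2 + b1)             # ~0.5933
out_h2 = sigmoid(w3 * x1 + w4 * x2 + b1)             # ~0.5969
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)     # ~0.7514
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)     # ~0.7729
e_total = 0.5 * (t_o1 - out_o1) ** 2 + 0.5 * (t_o2 - out_o2) ** 2    # ~0.2984

# backward pass: output-layer weight w5
d_e_d_out_o1 = out_o1 - t_o1                         # ~0.7414
d_out_o1_d_net_o1 = out_o1 * (1 - out_o1)            # ~0.1868
d_e_d_w5 = d_e_d_out_o1 * d_out_o1_d_net_o1 * out_h1     # ~0.0822
w5_new = w5 - n * d_e_d_w5                           # ~0.3507

# backward pass: hidden-layer weight w1 (h1 contributes to both O1 and O2)
d_e_o1_d_out_h1 = d_e_d_out_o1 * d_out_o1_d_net_o1 * w5              # ~0.0554
d_e_o2_d_out_h1 = (out_o2 - t_o2) * out_o2 * (1 - out_o2) * w7       # ~-0.0190
d_e_d_out_h1 = d_e_o1_d_out_h1 + d_e_o2_d_out_h1                     # ~0.0364
d_out_h1_d_net_h1 = out_h1 * (1 - out_h1)                            # ~0.2413
d_e_d_w1 = d_e_d_out_h1 * d_out_h1_d_net_h1 * x1                     # ~0.00044
w1_new = w1 - n * d_e_d_w1                                           # ~0.1497

print(e_total, w5_new, w1_new)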
Radial Basis Functions and Splines – Concepts – RBF Network
A single-layer perceptron can only classify linearly separable data; a multilayer perceptron can also classify non-linearly separable data.
The hidden layer uses a non-linear radial basis function as the activation function. This maps the input into a higher-dimensional space, which is then fed to the output layer so that the problem can be separated linearly.
(Figure: inputs X1 … Xn feed hidden RBF units H1(x) … Hn(x); their outputs are combined through weights W1 … Wm to give the network output Y.)
One example of a radial basis function of the distance between the input x and a centre c, with width r:
H(x) = (r² + (x − c)²) / r
• Step 2: for each node j in the hidden layer, find the centre (c_j) and the variance r, and define the hidden-layer neurons with a Gaussian RBF:
H_j(x) = e^(−(x − c_j)² / r²), where (x − c_j) is computed by applying the Euclidean distance measure between x and c_j.
• Step 3: for each node k in the output layer, compute the linear weighted sum of the outputs of the hidden-layer neurons j:
f_k(x) = Σ (j = 1 … m) W_kj · H_j(x), where W_kj is the weight on the link from hidden-layer neuron j to output-layer neuron k, and H_j(x) is the output of hidden-layer neuron j for the input vector x (a code sketch of this forward computation follows after the backward phase below).
Backward phase:
1. Train the hidden layer using back-propagation.
2. Update the weights between the hidden layer and the output layer.
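A minimal Python sketch of the forward phase just described: each hidden unit computes a Gaussian of the Euclidean distance between the input x and its centre c_j, and the output is a linear weighted sum of those activations. The centres, width r and weights below are illustrative placeholders, not values from the notes.

import math

def gaussian_rbf(x, c, r):
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))   # squared Euclidean distance
    return math.exp(-dist_sq / (r ** 2))

def rbf_forward(x, centres, r, weights):
    h = [gaussian_rbf(x, c, r) for c in centres]            # hidden-layer activations H_j(x)
    return sum(w * hj for w, hj in zip(weights, h))          # linear output f(x)

centres = [(0.0, 0.0), (1.0, 1.0)]      # one prototype centre per hidden neuron
weights = [0.7, -0.3]                   # hidden -> output weights
print(rbf_forward((0.2, 0.1), centres, r=0.5, weights=weights))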
RBF Networks are conceptually similar to K-Nearest Neighbor (k-NN) models, though their implementation is distinct. The fundamental idea is that an item’s
predicted target value is influenced by nearby items with similar predictor variable values. Here’s how RBF Networks operate:
1.Input Vector: The network receives an n-dimensional input vector that needs classification or regression.
2.RBF Neurons: Each neuron in the hidden layer represents a prototype vector from the training set. The network computes the Euclidean distance between the
input vector and each neuron’s center.
3.Activation Function: The Euclidean distance is transformed using a Radial Basis Function (typically a Gaussian function) to compute the neuron’s activation value.
This value decreases exponentially as the distance increases.
4.Output Nodes: Each output node calculates a score based on a weighted sum of the activation values from all RBF neurons. For classification, the
category with the highest score is chosen.
Curse of Dimensionality
The curse of dimensionality in machine learning arises when working with high-dimensional data, leading to increased computational complexity, overfitting, and spurious correlations. Techniques like dimensionality reduction, feature selection, and careful model design are essential for mitigating its effects and improving algorithm performance. Navigating this challenge is crucial for unlocking the potential of high-dimensional datasets and ensuring robust machine-learning solutions.
Polynomial Interpolation
•Polynomial interpolation is a method of estimating values between known data points by fitting a polynomial function to the data. The goal is to find a polynomial that
passes through all the given points.
•This method is useful for approximating functions that may not have a simple analytical form. One common approach to polynomial interpolation is to use the
Lagrange polynomial or Newton’s divided differences method to construct the interpolating polynomial.
Implementation
•The example below demonstrates polynomial interpolation using the interp1d function from SciPy.
•It begins by generating sample data representing points along a sine curve. The interp1d function is then applied with a cubic spline interpolation method to
approximate the curve between the data points.
•Finally, the original data points and the interpolated curve are visualized using matplotlib, showcasing the effectiveness of polynomial interpolation in
approximating the underlying function from sparse data points.
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
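A self-contained sketch of the demonstration described above; the number and spacing of the sample points are illustrative choices.

import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

# sample data: a few points along a sine curve
x = np.linspace(0, 10, 10)
y = np.sin(x)

# cubic spline interpolation between the known points
f_cubic = interp1d(x, y, kind='cubic')

# evaluate the interpolant on a finer grid and plot both
x_fine = np.linspace(0, 10, 200)
plt.plot(x, y, 'o', label='data points')
plt.plot(x_fine, f_cubic(x_fine), '-', label='cubic interpolation')
plt.legend()
plt.show()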
Support Vector Machines :
The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that we can easily put a new data point in the correct category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine. Consider the below diagram in which there are two different categories that are classified using a decision boundary or hyperplane:
Types of SVM
SVM can be of two types:
o Linear SVM: Linear SVM is used for linearly separable data, which means that if a dataset can be classified into two classes by using a single straight line, then such data is termed linearly separable data, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-linear SVM is used for non-linearly separable data, which means that if a dataset cannot be classified by using a straight line, then such data is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.
Linear SVM:
The working of the SVM algorithm can be understood by using an example. Suppose we have a dataset that has two tags (green and blue), and the dataset has two features, x1 and x2. We want a classifier that can classify the pair (x1, x2) of coordinates as either green or blue. Consider the below image:
As this is a 2-d space, we can easily separate these two classes just by using a straight line. But there can be multiple lines that separate these classes. Consider the below image:
The SVM algorithm helps to find the best line or decision boundary; this best boundary or region is called a hyperplane. The SVM algorithm finds the closest points of the lines from both classes. These points are called support vectors. The distance between the support vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin.
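The idea can be tried out with a small Python sketch. The notes do not name a library, so scikit-learn's SVC is used here as an assumption, with an illustrative toy dataset; the fitted model exposes the support vectors that define the maximum-margin hyperplane.

import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0],      # class "blue"  (label 0)
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])     # class "green" (label 1)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1.0)       # linear SVM: maximum-margin straight line
clf.fit(X, y)

print(clf.support_vectors_)             # the extreme points closest to the boundary
print(clf.predict([[3.0, 2.0]]))        # classify a new (x1, x2) point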
So to separate these data points, we need to add one more dimension. For linear data we have used the two dimensions x and y, so for non-linear data we will add a third dimension z. It can be calculated as: z = x² + y².
By adding the third dimension, the sample space becomes three-dimensional, and SVM divides the datasets into classes by finding a separating surface in that space.
Since we are in 3-d space, this surface looks like a plane parallel to the x-axis. If we convert it back to 2-d space by setting z = 1, it becomes a circular boundary around the inner class of data. A small code sketch of this idea follows below.
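This can be checked with a small Python sketch (again using scikit-learn as an assumption, on an illustrative toy dataset): points inside a circle versus points on an outer ring are not separable by a straight line in (x, y), but adding the feature z = x² + y² makes them separable by a plane, and an RBF-kernel SVM handles the original 2-d data directly.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 100)
inner_r = rng.uniform(0.0, 1.0, 100)        # class 0: inside the unit circle
outer_r = rng.uniform(2.0, 3.0, 100)        # class 1: a ring outside it
X = np.vstack([np.c_[inner_r * np.cos(angles), inner_r * np.sin(angles)],
               np.c_[outer_r * np.cos(angles), outer_r * np.sin(angles)]])
y = np.r_[np.zeros(100), np.ones(100)]

# add the third dimension z = x^2 + y^2 and fit a *linear* SVM in 3-d
Z = np.c_[X, (X ** 2).sum(axis=1)]
print(SVC(kernel='linear').fit(Z, y).score(Z, y))   # ~1.0: separable by a plane

# equivalently, a non-linear (RBF-kernel) SVM works directly on the 2-d data
print(SVC(kernel='rbf').fit(X, y).score(X, y))      # ~1.0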