Artificial Neural Network Supervised Learning
Supervised Learning
As the name suggests, supervised learning takes place under the supervision of a teacher: the training data supplies the desired output for every input. During the training of an ANN under supervised learning, the input vector is presented to the network, which produces an output vector. This output vector is compared with the desired/target output vector. An error signal is generated if there is a difference between the actual output and the desired/target output. On the basis of this error signal, the weights are adjusted until the actual output matches the desired output.
Perceptron
Developed by Frank Rosenblatt on the basis of the McCulloch-Pitts model, the perceptron is the basic operational unit of artificial neural networks. It employs a supervised learning rule and is able to classify data into two classes. It consists of the following basic elements −
Links − It has a set of connection links, each of which carries a weight; this includes a bias, which acts like a weight on a connection whose input is always 1.
Adder − It adds the inputs after they are multiplied by their respective weights.
Activation function − It limits the output of the neuron. The most basic activation function is a Heaviside step function, which has two possible outputs: it returns 1 if the input is positive, and 0 for any negative input.
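These three elements can be sketched in a few lines of Python (a minimal illustration of the model just described; the function names and the NumPy usage are our own):

```python
import numpy as np

def heaviside(net):
    # Most basic activation function: 1 for positive input, 0 otherwise
    return 1 if net > 0 else 0

def perceptron_output(inputs, weights, bias):
    # Links carry the weights; the adder sums the weighted inputs plus the bias
    net = np.dot(inputs, weights) + bias
    # The activation function limits the neuron's output to {0, 1}
    return heaviside(net)
```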
Training Algorithm
A perceptron network can be trained for a single output unit as well as for multiple output units.
Training Algorithm for Single Output Unit
Step 1 − Initialize the following to start the training −
Weights
Bias
Learning rate α
For easy calculation and simplicity, the weights and bias may be set equal to 0 and the learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every training vector x.
Step 4 − Activate each input unit as follows −
$$x_i = s_i \quad (i = 1 \text{ to } n)$$
Step 5 − Now obtain the net input with the following relation −
$$y_{in} = b + \sum_{i=1}^{n} x_i\, w_i$$
Here 'b' is the bias and 'n' is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output.
$$f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} > \theta \\ 0 & \text{if } -\theta \le y_{in} \le \theta \\ -1 & \text{if } y_{in} < -\theta \end{cases}$$
Step 7 − Adjust the weight and bias as follows −
Case 1 − if y ≠ t then,
$$w_i(new) = w_i(old) + \alpha\, t\, x_i$$
$$b(new) = b(old) + \alpha\, t$$
Case 2 − if y = t then,
$$w_i(new) = w_i(old)$$
$$b(new) = b(old)$$
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
Step 8 − Test for the stopping condition, which occurs when there is no change in the weights.
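The single-output procedure above can be sketched in Python as follows (our minimal illustration; the threshold θ, the bipolar AND example and all names are our own choices):

```python
import numpy as np

def activation(y_in, theta):
    # Three-level threshold function from Step 6
    if y_in > theta:
        return 1
    if y_in < -theta:
        return -1
    return 0

def train_perceptron(samples, targets, theta=0.2, alpha=1.0, max_epochs=100):
    n = samples.shape[1]
    w, b = np.zeros(n), 0.0                  # Step 1: weights and bias set to 0
    for _ in range(max_epochs):              # Step 2: repeat until stopping condition
        changed = False
        for x, t in zip(samples, targets):   # Steps 3-4: each training pair
            y_in = b + np.dot(x, w)          # Step 5: net input
            y = activation(y_in, theta)      # Step 6: final output
            if y != t:                       # Step 7, Case 1: update on error
                w = w + alpha * t * x
                b = b + alpha * t
                changed = True
        if not changed:                      # Step 8: no weight change -> stop
            break
    return w, b

# Example: learn the AND function on bipolar inputs and targets
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([1, -1, -1, -1])
w, b = train_perceptron(X, T)
```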
Training Algorithm for Multiple Output Units
Step 1 − Initialize the following to start the training −
Weights
Bias
Learning rate α
For easy calculation and simplicity, the weights and bias may be set equal to 0 and the learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every training vector x.
Step 4 − Activate each input unit as follows −
$$x_i = s_i \quad (i = 1 \text{ to } n)$$
Step 5 − Obtain the net input for each output unit with the following relation −
$$y_{inj} = b_j + \sum_{i=1}^{n} x_i\, w_{ij} \quad (j = 1 \text{ to } m)$$
Here 'bj' is the bias on output unit j and 'n' is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output for each output
unit j = 1 to m −
$$f(y_{inj}) = \begin{cases} 1 & \text{if } y_{inj} > \theta \\ 0 & \text{if } -\theta \le y_{inj} \le \theta \\ -1 & \text{if } y_{inj} < -\theta \end{cases}$$
Step 7 − Adjust the weight and bias as follows −
Case 1 − if yj ≠ tj then,
$$w_{ij}(new) = w_{ij}(old) + \alpha\, t_j\, x_i$$
$$b_j(new) = b_j(old) + \alpha\, t_j$$
Case 2 − if yj = tj then,
$$w_{ij}(new) = w_{ij}(old)$$
$$b_j(new) = b_j(old)$$
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
Step 8 − Test for the stopping condition, which occurs when there is no change in the weights.
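The multi-output variant differs only in bookkeeping: one weight column and one bias per output unit. A vectorized sketch (ours, under the same assumptions as above):

```python
import numpy as np

def train_perceptron_multi(samples, targets, theta=0.2, alpha=1.0, max_epochs=100):
    n, m = samples.shape[1], targets.shape[1]
    W, b = np.zeros((n, m)), np.zeros(m)      # Step 1: one column and bias per unit j
    for _ in range(max_epochs):               # Step 2
        changed = False
        for x, t in zip(samples, targets):    # Steps 3-4
            y_in = b + x @ W                  # Step 5: net input of all units at once
            y = np.where(y_in > theta, 1, np.where(y_in < -theta, -1, 0))  # Step 6
            wrong = y != t
            if wrong.any():                   # Step 7, Case 1: update wrong units only
                W[:, wrong] += alpha * np.outer(x, t[wrong])
                b[wrong] += alpha * t[wrong]
                changed = True
        if not changed:                       # Step 8
            break
    return W, b
```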
Adaptive Linear Neuron (Adaline)
Adaline, which stands for Adaptive Linear Neuron, is a network having a single linear unit. It uses the delta rule for training, to minimize the Mean-Squared Error (MSE) between the actual output and the desired/target output.
Architecture
The basic structure of Adaline is similar to the perceptron, with an extra feedback loop through which the actual output is compared with the desired/target output. After comparison, the weights and bias are updated on the basis of the training algorithm.
Training Algorithm
Step 1 − Initialize the following to start the training −
Weights
Bias
Learning rate α
For easy calculation and simplicity, the weights and bias may be set equal to 0 and the learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every bipolar training pair s:t.
Step 4 − Activate each input unit as follows −
$$x_i = s_i \quad (i = 1 \text{ to } n)$$
Step 5 − Obtain the net input with the following relation −
$$y_{in} = b + \sum_{i=1}^{n} x_i\, w_i$$
Here 'b' is the bias and 'n' is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output −
$$f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} \ge 0 \\ -1 & \text{if } y_{in} < 0 \end{cases}$$
Step 7 − Adjust the weight and bias as follows −
Case 1 − if y ≠ t then,
$$w_i(new) = w_i(old) + \alpha\,(t - y_{in})\,x_i$$
$$b(new) = b(old) + \alpha\,(t - y_{in})$$
Case 2 − if y = t then,
$$w_i(new) = w_i(old)$$
$$b(new) = b(old)$$
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
Step 8 − Test for the stopping condition, which occurs when there is no change in the weights or when the highest weight change during training is smaller than the specified tolerance.
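A minimal Python sketch of these steps (ours; we use α = 0.1 rather than 1 so that the delta-rule updates stay small, and stop on the tolerance test of Step 8):

```python
import numpy as np

def train_adaline(samples, targets, alpha=0.1, tol=1e-3, max_epochs=1000):
    n = samples.shape[1]
    w, b = np.zeros(n), 0.0                  # Step 1
    for _ in range(max_epochs):              # Step 2
        max_change = 0.0
        for x, t in zip(samples, targets):   # Steps 3-4: bipolar pairs s:t
            y_in = b + np.dot(x, w)          # Step 5: net input
            y = 1 if y_in >= 0 else -1       # Step 6: bipolar step activation
            if y != t:                       # Step 7, Case 1: delta rule
                change = alpha * (t - y_in)
                w = w + change * x
                b = b + change
                max_change = max(max_change, np.max(np.abs(change * x)))
        if max_change < tol:                 # Step 8: tolerance test
            break
    return w, b
```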
Multiple Adaptive Linear Neuron (Madaline)
Madaline, which stands for Multiple Adaptive Linear Neuron, is a network consisting of many Adalines in parallel with a single output unit. Some important points about Madaline are as follows −
It is just like a multilayer perceptron, where the Adalines act as hidden units between the input and the Madaline layer.
The weights and the bias between the input and Adaline layers, as we see in the Adaline architecture, are adjustable.
The Adaline and Madaline layers have fixed weights and bias of 1.
Architecture
The architecture of Madaline consists of “n” neurons of the input layer, “m” neurons of the
Adaline layer, and 1 neuron of the Madaline layer. The Adaline layer can be considered as
the hidden layer as it is between the input layer and the output layer, i.e. the Madaline
layer.
Training Algorithm
By now we know that only the weights and bias between the input and the Adaline layer
are to be adjusted, and the weights and bias between the Adaline and the Madaline layer
are fixed.
Step 1 − Initialize the following to start the training −
Weights
Bias
Learning rate α
For easy calculation and simplicity, the weights and bias may be set equal to 0 and the learning rate to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every bipolar training pair s:t.
Step 4 − Activate each input unit as follows −
$$x_i = s_i \quad (i = 1 \text{ to } n)$$
Step 5 − Obtain the net input at each unit of the hidden layer, i.e. the Adaline layer, with the following relation −
$$Q_{inj} = b_j + \sum_{i=1}^{n} x_i\, w_{ij} \quad (j = 1 \text{ to } m)$$
Here 'bj' is the bias on hidden unit j and 'n' is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output at the Adaline
and the Madaline layer −
$$f(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ -1 & \text{if } x < 0 \end{cases}$$
Output at the hidden (Adaline) unit
$$Q_j = f(Q_{inj})$$
Final output of the network
$$y = f(y_{in}), \quad \text{i.e. } y_{in} = b_0 + \sum_{j=1}^{m} Q_j\, v_j$$
Step 7 − Calculate the error and adjust the weights as follows −
Case 1 − if y ≠ t and t = 1 then,
$$w_{ij}(new) = w_{ij}(old) + \alpha\,(1 - Q_{inj})\,x_i$$
$$b_j(new) = b_j(old) + \alpha\,(1 - Q_{inj})$$
In this case, the weights would be updated on Qj where the net input is close to 0, because t = 1.
Case 2 − if y ≠ t and t = -1 then,
$$w_{ik}(new) = w_{ik}(old) + \alpha\,(-1 - Q_{ink})\,x_i$$
$$b_k(new) = b_k(old) + \alpha\,(-1 - Q_{ink})$$
In this case, the weights would be updated on Qk where the net input is positive, because t = -1.
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
Case 3 − if y = t then, there would be no change in the weights.
Step 8 − Test for the stopping condition, which occurs when there is no change in the weights or when the highest weight change during training is smaller than the specified tolerance.
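A compact Python sketch of this procedure (our illustration, not code from the tutorial): we initialize the adjustable weights with small random values rather than zeros so that the Adaline units are not identical, and we keep the Adaline-to-Madaline weights and bias fixed at 1, as stated above.

```python
import numpy as np

def bipolar_step(x):
    # Bipolar activation used at both the Adaline and the Madaline layer
    return np.where(x >= 0, 1, -1)

def train_madaline(samples, targets, m=2, alpha=0.5, max_epochs=100):
    n = samples.shape[1]
    rng = np.random.default_rng(0)
    W = rng.uniform(-0.5, 0.5, (n, m))   # adjustable input -> Adaline weights
    b = rng.uniform(-0.5, 0.5, m)        # adjustable Adaline biases
    v, b0 = np.ones(m), 1.0              # fixed Adaline -> Madaline weights and bias
    for _ in range(max_epochs):
        changed = False
        for x, t in zip(samples, targets):
            Q_in = b + x @ W                 # Step 5: Adaline net inputs
            Q = bipolar_step(Q_in)           # Step 6: Adaline outputs
            y = bipolar_step(b0 + Q @ v)     # Step 6: Madaline output
            if y == t:                       # Case 3: no change
                continue
            changed = True
            if t == 1:
                # Case 1: update the unit whose net input is closest to 0
                j = np.argmin(np.abs(Q_in))
                b[j] += alpha * (1 - Q_in[j])
                W[:, j] += alpha * (1 - Q_in[j]) * x
            else:
                # Case 2: update every unit whose net input is positive
                for k in np.flatnonzero(Q_in > 0):
                    b[k] += alpha * (-1 - Q_in[k])
                    W[:, k] += alpha * (-1 - Q_in[k]) * x
        if not changed:                      # Step 8: stop when nothing changed
            break
    return W, b
```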
Back Propagation Neural (BPN) Networks
Back Propagation Neural Network (BPN) is a multilayer neural network consisting of the input layer, at least one hidden layer and the output layer. The error calculated at the output layer, by comparing the target output with the actual output, is propagated back towards the input layer.
Architecture
As shown in the diagram, the architecture of BPN has three interconnected layers with weights on them. The hidden layer as well as the output layer also has a bias, whose input is always 1, on them. As is clear from the diagram, the working of BPN is in two phases: one phase sends the signal from the input layer to the output layer, and the other phase back-propagates the error from the output layer to the input layer.
Training Algorithm
For training, BPN uses the binary sigmoid activation function. The training of BPN has the following three phases −
Phase 1 − Feed forward of the input training pattern
Phase 2 − Back propagation of the error
Phase 3 − Updating of the weights
Step 1 − Initialize the following to start the training −
Weights
Learning rate α
For easy calculation and simplicity, take some small random values.
Step 2 − Continue steps 3-11 while the stopping condition is not true.
Step 3 − Continue steps 4-10 for every training pair.
Phase 1
Step 4 − Each input unit receives the input signal xi and sends it to the hidden units, for all i = 1 to n.
Step 5 − Calculate the net input at the hidden unit using the following relation −
$$Q_{inj} = b_{0j} + \sum_{i=1}^{n} x_i\, v_{ij} \quad (j = 1 \text{ to } p)$$
Here 'b0j' is the bias on the hidden unit, and 'vij' is the weight on the j-th unit of the hidden layer coming from the i-th unit of the input layer.
Now calculate the net output by applying the following activation function −
$$Q_j = f(Q_{inj})$$
Send these output signals of the hidden layer units to the output layer units.
Step 6 − Calculate the net input at the output layer unit using the following relation −
$$y_{ink} = b_{0k} + \sum_{j=1}^{p} Q_j\, w_{jk} \quad (k = 1 \text{ to } m)$$
Here 'b0k' is the bias on the output unit, and 'wjk' is the weight on the k-th unit of the output layer coming from the j-th unit of the hidden layer.
Calculate the net output by applying the following activation function −
$$y_k = f(y_{ink})$$
Phase 2
Step 7 − Compute the error-correcting term, in correspondence with the target pattern received at each output unit, as follows −
$$\delta_k = (t_k - y_k)\, f'(y_{ink})$$
On this basis, update the weight and bias as follows −
$$\Delta w_{jk} = \alpha\, \delta_k\, Q_j$$
$$\Delta b_{0k} = \alpha\, \delta_k$$
Then send $\delta_k$ back to the hidden layer.
Step 8 − Now each hidden unit sums its delta inputs from the output units −
$$\delta_{inj} = \sum_{k=1}^{m} \delta_k\, w_{jk}$$
The error term is then
$$\delta_j = \delta_{inj}\, f'(Q_{inj})$$
On this basis, update the weight and bias as follows −
$$\Delta v_{ij} = \alpha\, \delta_j\, x_i$$
$$\Delta b_{0j} = \alpha\, \delta_j$$
Phase 3
Step 9 − Each output unit (yk, k = 1 to m) updates the weight and bias as follows −
$$w_{jk}(new) = w_{jk}(old) + \Delta w_{jk}$$
$$b_{0k}(new) = b_{0k}(old) + \Delta b_{0k}$$
Step 10 − Each hidden unit (Qj, j = 1 to p) updates the weight and bias as follows −
$$v_{ij}(new) = v_{ij}(old) + \Delta v_{ij}$$
$$b_{0j}(new) = b_{0j}(old) + \Delta b_{0j}$$
Step 11 − Check for the stopping condition, which may be either that the set number of epochs is reached or that the target output matches the actual output.
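Putting the three phases together, a minimal NumPy sketch of steps 1-11 for a single hidden layer might look as follows (our illustration; the layer sizes, the learning rate and the XOR example are arbitrary choices, and f'(x) = f(x)(1 − f(x)) for the binary sigmoid):

```python
import numpy as np

def sigmoid(x):
    # Binary sigmoid activation used by BPN
    return 1.0 / (1.0 + np.exp(-x))

def train_bpn(samples, targets, p=4, alpha=0.5, epochs=5000):
    n, m = samples.shape[1], targets.shape[1]
    rng = np.random.default_rng(0)
    V = rng.uniform(-0.5, 0.5, (n, p))   # Step 1: small random input->hidden weights
    b0j = np.zeros(p)
    W = rng.uniform(-0.5, 0.5, (p, m))   # hidden->output weights
    b0k = np.zeros(m)
    for _ in range(epochs):                          # Steps 2-3
        for x, t in zip(samples, targets):
            # Phase 1: feed forward (Steps 4-6)
            Q = sigmoid(b0j + x @ V)                 # hidden outputs Qj
            y = sigmoid(b0k + Q @ W)                 # final outputs yk
            # Phase 2: back propagation of the error (Steps 7-8)
            delta_k = (t - y) * y * (1 - y)          # (tk - yk) f'(y_ink)
            delta_j = (delta_k @ W.T) * Q * (1 - Q)  # delta_inj f'(Q_inj)
            # Phase 3: weight update (Steps 9-10)
            W += alpha * np.outer(Q, delta_k)
            b0k += alpha * delta_k
            V += alpha * np.outer(x, delta_j)
            b0j += alpha * delta_j
    return V, b0j, W, b0k

# Example: learn XOR, a task the single-layer perceptron above cannot solve
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
V, b0j, W, b0k = train_bpn(X, T)
```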
Generalized Delta Learning Rule
The delta rule works only for the output layer. The generalized delta rule, also called the back-propagation rule, on the other hand, is a way of creating the desired values for the hidden layer.
Mathematical Formulation
For the activation function $y_k = f(y_{ink})$, the net input on the hidden layer as well as on the output layer can be given by
$$y_{ink} = \sum_{j} z_j\, w_{jk}$$
Now the error which has to be minimized is
$$E = \frac{1}{2} \sum_{k} [t_k - y_k]^2$$
By using the chain rule, we have
$$\frac{\partial E}{\partial w_{jk}} = \frac{\partial}{\partial w_{jk}} \left( \frac{1}{2} \sum_{k} [t_k - y_k]^2 \right)$$
$$= \frac{\partial}{\partial w_{jk}} \left( \frac{1}{2} [t_k - f(y_{ink})]^2 \right)$$
$$= -[t_k - y_k]\, \frac{\partial}{\partial w_{jk}} f(y_{ink})$$
$$= -[t_k - y_k]\, f'(y_{ink})\, \frac{\partial}{\partial w_{jk}} (y_{ink})$$
$$= -\delta_k\, z_j, \quad \text{where } \delta_k = [t_k - y_k]\, f'(y_{ink})$$
Similarly, for the weights on the connections to the hidden unit $z_j$,
$$\frac{\partial E}{\partial v_{ij}} = -\sum_{k} \delta_k\, \frac{\partial}{\partial v_{ij}} (y_{ink})$$
Putting in the value of $y_{ink}$, this becomes
$$\frac{\partial E}{\partial v_{ij}} = -\delta_j\, x_i, \quad \text{where } \delta_j = \sum_{k} \delta_k\, w_{jk}\, f'(z_{inj})$$
Weight updating can be done as follows −
For the output unit −
$$\Delta w_{jk} = -\alpha \frac{\partial E}{\partial w_{jk}} = \alpha\, \delta_k\, z_j$$
For the hidden unit −
$$\Delta v_{ij} = -\alpha \frac{\partial E}{\partial v_{ij}} = \alpha\, \delta_j\, x_i$$
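As a quick sanity check on this derivation, the analytic gradients ∂E/∂w_jk = −δ_k z_j and ∂E/∂v_ij = −δ_j x_i can be compared against finite differences (a self-contained sketch; the network shapes and random seed are arbitrary):

```python
import numpy as np

def forward(W, V, x, t):
    # E = 1/2 * sum_k (t_k - y_k)^2 with sigmoid hidden and output units
    z = 1 / (1 + np.exp(-(x @ V)))        # hidden activations z_j
    y = 1 / (1 + np.exp(-(z @ W)))        # output activations y_k
    return 0.5 * np.sum((t - y) ** 2), z, y

rng = np.random.default_rng(1)
x, t = rng.normal(size=3), rng.normal(size=2)
V, W = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))

E, z, y = forward(W, V, x, t)
delta_k = (t - y) * y * (1 - y)            # delta_k = (t_k - y_k) f'(y_ink)
delta_j = (delta_k @ W.T) * z * (1 - z)    # delta_j = sum_k delta_k w_jk f'(z_inj)
grad_W = -np.outer(z, delta_k)             # dE/dw_jk = -delta_k z_j
grad_V = -np.outer(x, delta_j)             # dE/dv_ij = -delta_j x_i

# Finite-difference check on one output-layer weight
eps = 1e-6
W2 = W.copy()
W2[0, 0] += eps
numeric = (forward(W2, V, x, t)[0] - E) / eps
print(np.isclose(numeric, grad_W[0, 0], atol=1e-4))   # expect True
```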