SC_Module2_Notes
●
As the name suggests, supervised learning takes place under the supervision of
a teacher.
●
During the training of ANN under supervised learning, the input vector is
presented to the network, which will produce an output vector.
●
This output vector is compared with the desired/target output vector.
●
An error signal is generated if there is a difference between the actual output
and the desired/target output vector.
●
On the basis of this error signal, the weights would be adjusted until the actual
output is matched with the desired output.
Perceptron networks
●
It is a supervised learning network.
●
It is a feed-forward network.
●
It consists of 3 units:
– a. Sensory unit (input unit)
– b. Associator unit (hidden unit)
– c. Response unit (output unit)
●
Sensory units are connected to associator units with fixed weights having values 1, 0, or -1, which are assigned at random.
●
A binary activation function is used in the sensory and associator units.
●
The activation of the associator unit is a binary step function with a fixed threshold.
●
The output signals sent from the associator unit to the response unit are binary.
●
The response unit has an activation of 1, 0, or -1.
●
The activation function is
f(y_in) =  1, if y_in > θ
           0, if -θ ≤ y_in ≤ θ
          -1, if y_in < -θ
Sensory unit : A 2D matrix of photodetectors which detects the inputs. These inputs are given to the associator unit along with fixed weights.
●
Associator unit : Consists of a set of subcircuits called feature predicates, which detect specific features, i.e., identify the type of feature present. The output is 0 or 1 and is given to the response unit. The weights between the associator and response units are adjustable.
●
Response unit : Contains pattern recognizers.
●
The perceptron learning rule is used for weight updation between the associator and response units.
●
For each training input, the net calculates the output and checks it for error.
●
Error = target output − calculated output.
●
If there is no error, the weights are correct. If there is an error, the weights are adjusted and the process is repeated until the target output is achieved.
●
Based on this, the weights between associator and response units will be adjusted.
●
The weights are adjusted on the basis of the learning rule if an error has occurred for a particular training pattern:
w_i(new) = w_i(old) + α t x_i ,   b(new) = b(old) + α t
Where
– t : target value (+1 or -1)
– α : learning rate
Learning Rule
●
A finite number n of input training vectors, with their associated target values: x(n) and t(n).
●
The output y is obtained by applying the activation function over the net input:
y_in = b + Σ_i x_i w_i ,   y = f(y_in)
●
The weight updation is as follows:
●
If y ≠ t, then
w_i(new) = w_i(old) + α t x_i ,   b(new) = b(old) + α t
●
Else
w_i(new) = w_i(old) ,   b(new) = b(old)
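As a quick illustration, here is a minimal Python sketch of this update rule for a single training pair; the function and variable names are my own, not from the notes:

import numpy as np

def perceptron_step(w, b, x, t, alpha=1.0, theta=0.0):
    # One perceptron learning step for a single training pair (x, t).
    y_in = b + np.dot(x, w)                                  # net input
    y = 1 if y_in > theta else (-1 if y_in < -theta else 0)  # 1 / 0 / -1 activation
    if y != t:                                               # update only on error
        w = w + alpha * t * np.asarray(x, dtype=float)
        b = b + alpha * t
    return w, b, y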
Architecture of a Simple perceptron network
●
There are n input neurons, 1 output neuron and a bias.
●
The perceptron net classifies whether or not an input pattern is a member of a particular class.
Flowchart for training process
Perceptron training algorithm for single output classes
●
The perceptron algorithm can be used with either binary or bipolar input vectors, bipolar targets, a fixed threshold, and a variable bias.
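A hedged Python sketch of this training algorithm, assuming bipolar targets, a fixed threshold θ, and stopping when an epoch produces no weight change (names are illustrative):

import numpy as np

def perceptron_train(samples, targets, alpha=1.0, theta=0.0, max_epochs=100):
    # Train a single-output perceptron until an epoch passes with no update.
    samples = np.asarray(samples, dtype=float)
    w = np.zeros(samples.shape[1])        # initial weights = 0
    b = 0.0                               # initial bias = 0
    for epoch in range(max_epochs):
        changed = False
        for x, t in zip(samples, targets):
            y_in = b + np.dot(x, w)
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != t:                    # error: adjust weights and bias
                w += alpha * t * x
                b += alpha * t
                changed = True
        if not changed:                   # converged
            break
    return w, b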
Perceptron Network testing algorithm
Question 1: Implement the AND function using bipolar inputs and targets.
Question 2: Implement the OR function using binary inputs and bipolar targets with the perceptron training algorithm, up to 3 EPOCHS.
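A usage sketch for these two questions, reusing the perceptron_train function above (truth tables as stated in the questions):

# Question 1: AND with bipolar inputs and targets
and_x = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
and_t = [1, -1, -1, -1]
print(perceptron_train(and_x, and_t, alpha=1.0, theta=0.0))

# Question 2: OR with binary inputs and bipolar targets, limited to 3 EPOCHS
or_x = [(1, 1), (1, 0), (0, 1), (0, 0)]
or_t = [1, 1, 1, -1]
print(perceptron_train(or_x, or_t, alpha=1.0, theta=0.0, max_epochs=3))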
Tutorial:
Find the weights using a perceptron network for the ANDNOT function when all the inputs are presented only once. Use bipolar inputs and targets.
Question 4: Find the weights required to perform the following classification using a perceptron network. The vectors (1,1,1,1) and (-1,1,-1,-1) belong to the class (so have target value 1); the vectors (1,1,1,-1) and (1,-1,-1,1) do not belong to the class (so have target value -1). Assume a learning rate of 1 and initial weights of 0.
●
Ans:
●
The truth table for the given vectors is:

x1   x2   x3   x4    t
 1    1    1    1    1
-1    1   -1   -1    1
 1    1    1   -1   -1
 1   -1   -1    1   -1
●
Let w1 = w2 = w3 = w4 = b = 0, α = 1 and θ = 0.2.
●
The activation function is
f(y_in) =  1, if y_in > 0.2
           0, if -0.2 ≤ y_in ≤ 0.2
          -1, if y_in < -0.2
●
In the 3rd EPOCH, all the calculated outputs become equal to the targets and the network has converged.
●
The network architecture is
●
Final weights:
w1=-2, w2=2, w3=0, w4=2
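This result can be cross-checked with the perceptron_train sketch above; the notes do not state the final bias, so the code prints it as well:

q4_x = [(1, 1, 1, 1), (-1, 1, -1, -1), (1, 1, 1, -1), (1, -1, -1, 1)]
q4_t = [1, 1, -1, -1]
w, b = perceptron_train(q4_x, q4_t, alpha=1.0, theta=0.2)
print(w, b)   # this run converges in the 3rd EPOCH to w = [-2, 2, 0, 2], b = 0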
Adaptive Linear Neuron (Adaline)
●
Units with a linear activation function are called linear units.
●
A network with a single linear unit is called an Adaline.
●
It uses bipolar activation for its input signals and its target output.
●
The weights between the input and the output units are adjustable.
●
Adaline is a net which has only one output unit.
●
It is trained using the delta rule.
Delta rule for single output unit
●
Also known as the Least Mean Square (LMS) rule or the Widrow-Hoff rule.
●
It is found to minimize the mean squared error between the activation and the target value.
●
Widrow Hoff rule vs Perceptron learning rule:
●
1. The perceptron learning rule originates from the Hebbian assumption, while the delta rule is derived from the gradient-descent method.
●
2. The perceptron learning rule stops after a finite number of learning steps, but the gradient-descent approach continues forever, converging only asymptotically to the solution.
●
The delta rule updates the weights so as to minimize the difference between the net input and the target value:
Δw_i = α (t − y_in) x_i ,   Δb = α (t − y_in)
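A minimal Python sketch of Adaline training with the delta rule, assuming one pass over the samples per EPOCH (names are illustrative, not from the notes):

import numpy as np

def adaline_train(samples, targets, alpha=0.1, w_init=0.1, epochs=1):
    # Delta (LMS) rule: weights move toward minimizing (t - y_in)^2.
    samples = np.asarray(samples, dtype=float)
    w = np.full(samples.shape[1], w_init)   # small nonzero initial weights
    b = w_init
    epoch_errors = []
    for _ in range(epochs):
        total_error = 0.0
        for x, t in zip(samples, targets):
            y_in = b + np.dot(x, w)         # linear unit: no thresholding here
            err = t - y_in
            w += alpha * err * x            # Δw_i = α (t - y_in) x_i
            b += alpha * err                # Δb  = α (t - y_in)
            total_error += err ** 2         # squared error of this sample
        epoch_errors.append(total_error)    # total mean square error per EPOCH
    return w, b, epoch_errors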
Adaline model
Question 1: Implement the OR function with bipolar inputs and targets using an Adaline network. Given: acceptable error level = 1.4.
●
The truth table for the OR function with bipolar inputs and targets is:

x1   x2    t
 1    1    1
 1   -1    1
-1    1    1
-1   -1   -1
●
Initially all weights are assumed to be a small value (but not zero), say 0.1.
●
i.e., w1 = w2 = b = 0.1
●
Let the learning rate α = 0.1.
●
Consider the 1st input sample, (x1, x2, t) = (1, 1, 1).
●
Calculate the net input as
y_in = b + x1 w1 + x2 w2 = 0.1 + 1(0.1) + 1(0.1) = 0.3
●
Now compute (t − y_in) = (1 − 0.3) = 0.7
●
Update the weights and bias as
w_i(new) = w_i(old) + α (t − y_in) x_i = 0.1 + 0.1(0.7)(1) = 0.17, so w1 = w2 = 0.17
b(new) = b(old) + α (t − y_in) = 0.1 + 0.1(0.7) = 0.17
●
where α (t − y_in) x_i is the weight change Δw_i and α (t − y_in) is Δb.
●
These calculations are performed for all the input samples and the error is
calculated for each.
●
Summing up the squared errors obtained for all input samples in one EPOCH gives the total mean square error of that EPOCH.
●
The network training is continued until this error is minimized to a very small
value.
●
The network training is done for the OR function using the Adaline network and is tabulated in the table below.
●
The total mean square error after each EPOCH is given as E = Σ (t − y_in)², summed over all input samples.
From the table above, it is noticed that as training goes on, the error value gets
minimized.
●
The network architecture of Adaline n/w for OR function is
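A usage sketch for this question with the adaline_train function above; running enough EPOCHS shows the error falling below the acceptable level 1.4:

or_x = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
or_t = [1, 1, 1, -1]
w, b, errs = adaline_train(or_x, or_t, alpha=0.1, w_init=0.1, epochs=10)
print(w, b)
print(errs)   # the per-EPOCH error should shrink as training goes on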
Question 2: Use an Adaline network to train the ANDNOT function with bipolar inputs and targets. Perform 2 EPOCHS of training.
●
The truth table for the ANDNOT function with bipolar inputs and targets is:

x1   x2    t
 1    1   -1
 1   -1    1
-1    1   -1
-1   -1   -1
●
Initially the weights w1, w2 and the bias b are assumed to have a value, say 0.2. Also let α = 0.2.
●
Consider the 1st input sample (x1,x2,t)=(1,1,-1).
●
Calculate the net input as
y_in = b + x1 w1 + x2 w2 = 0.2 + 1(0.2) + 1(0.2) = 0.6
●
The same steps are carried out for 2 EPOCHS of training and the network performance is noted.
●
The total mean square error at the end of each EPOCH is the summation of the squared errors of all input samples.
●
The network architecture of Adaline n/w for ANDNOT function is
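A corresponding usage sketch with the adaline_train function above:

andnot_x = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
andnot_t = [-1, 1, -1, -1]
w, b, errs = adaline_train(andnot_x, andnot_t, alpha=0.2, w_init=0.2, epochs=2)
print(w, b, errs)   # weights and total error after each of the 2 EPOCHS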
Tutorial
●
Use an Adaline network to train the NOR logic function with bipolar inputs and targets. Perform 2 epochs of training.
Back Propagation Network
●
This learning algorithm is applied to multilayer feed forward networks
consisting of processing elements with continuous differentiable activation
functions.
●
The networks associated with back propagation learning algorithm are called
backpropagation networks(BPNs).
●
The algorithm provides a procedure for changing the weights so as to classify the given input patterns correctly.
●
It uses gradient descent method.
●
This is a method in which the error is propagated back to the hidden units.
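A minimal Python sketch of one backpropagation step for a 2-2-1 network of the kind used in the questions below; the function name and the weight layout are my own conventions:

import numpy as np

def bpn_step(x, t, v, w, alpha=0.25, bipolar=False):
    # v: 3x2 input-to-hidden weights, rows = [from x1, from x2, biases v0j]
    # w: length-3 hidden-to-output weights [w1, w2, bias w0]
    f  = (lambda s: 2 / (1 + np.exp(-s)) - 1) if bipolar else (lambda s: 1 / (1 + np.exp(-s)))
    df = (lambda a: 0.5 * (1 + a) * (1 - a)) if bipolar else (lambda a: a * (1 - a))

    x = np.asarray(x, dtype=float)
    z = f(v[:2].T @ x + v[2])                 # hidden activations z1, z2
    y = f(w[:2] @ z + w[2])                   # output activation

    delta_y = (t - y) * df(y)                 # error term at the output unit
    delta_z = delta_y * w[:2] * df(z)         # error propagated back to hidden units

    w_new = w + alpha * delta_y * np.append(z, 1.0)                 # Δw_j = α δ z_j (bias: 1)
    v_new = v + alpha * np.vstack([np.outer(x, delta_z), delta_z])  # Δv_ij = α δ_j x_i
    return v_new, w_new, y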
Flowchart of BPN training
Tutorial
Question 1: Using BPN, find the new weights for the net shown below. It is presented with the input pattern [0, 1] and the target output is 1. Use learning rate α = 0.25 and the binary sigmoidal activation function.
●
Initial weights to z1 : [v11, v21, v01] = [0.6, -0.1, 0.3]
●
Initial weights to z2 : [v12, v22, v02] = [-0.3, 0.4, 0.5]
●
Initial weights to y : [w1, w2, w0] = [0.4, 0.1, -0.2]
●
Given α = 0.25; the activation function used is the binary sigmoidal function, given by
f(x) = 1 / (1 + e^(-x)) ,  with f ′(x) = f(x) (1 − f(x))
●
Given sample = [0,1] and target = 1
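A usage sketch for this question, reusing the bpn_step function above with the given initial weights entered in that layout:

v = np.array([[0.6, -0.3],      # v11, v12
              [-0.1, 0.4],      # v21, v22
              [0.3, 0.5]])      # v01, v02 (biases)
w = np.array([0.4, 0.1, -0.2])  # w1, w2, w0
v_new, w_new, y = bpn_step([0, 1], 1.0, v, w, alpha=0.25, bipolar=False)
print(np.round(w_new, 4))
print(np.round(v_new, 4))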
Question 2: Find the new weights using BPN for the network shown below. The network is presented with the input pattern [-1, 1] and the target output is +1. Use α = 0.25 and the bipolar sigmoidal activation function.
●
Ans : Initial weights are [v11, v21, v01]=[0.6, 0.1, 0.3], [v12,v22,v02]=[-0.3,0.4,0.5] and [w1,w2,w0] =
[0.4,0.1,-0.2], α=0.25
●
The activation function used is the bipolar sigmoidal activation function,
f(x) = 2 / (1 + e^(-x)) − 1
●
Given input sample [x1,x2]=[-1,1] and target t=1
●
The net inputs and outputs of the hidden and output units are computed by forward propagation, and the error is then propagated back (see the sketch below).
Since the bipolar sigmoidal function is used, f ′(y_in) = (λ/2) (1 + f(y_in)) (1 − f(y_in)).
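The same bpn_step sketch can run this question with bipolar=True, using the given initial weights in the same layout as before:

v = np.array([[0.6, -0.3],
              [0.1, 0.4],
              [0.3, 0.5]])
w = np.array([0.4, 0.1, -0.2])
v_new, w_new, y = bpn_step([-1, 1], 1.0, v, w, alpha=0.25, bipolar=True)
print(round(float(y), 4))       # output before the update, roughly -0.09 with these weights
print(np.round(w_new, 4))
print(np.round(v_new, 4))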