10-Artificial Neural Networks - Perceptron Learning Algorithm-02-08-2024
Dr.S.ALBERT ALEXANDER
SCHOOL OF ELECTRICAL ENGINEERING
[email protected]
Module 2
Artificial Neural Networks
❖ Perceptron Learning Algorithm
2.1 Artificial Neural Networks
❖ A neural network (NN) is a machine learning approach inspired by the
way in which the brain performs a particular learning task:
❖ Knowledge about the learning task is given in the form of
examples
❖ Inter-neuron connection strengths (weights) are used to
store the acquired information (the training examples)
❖ During the learning process the weights are modified in
order to model the particular learning task correctly on the
training examples
Definition
❖ A neural network is a system composed of many simple
processing elements operating in parallel whose function is
determined by network structure, connection strengths and
the processing performed at computing elements or nodes
❖ Artificial NN is an information processing system that has
certain performance characteristics in common with
biological neural networks
[Figure: biological neuron, showing the soma, dendrites, axon, and synapses]
Biological Inspiration
Brain Computation
❖ The human brain contains about 10 billion nerve cells, or
neurons
❖ On average, each neuron is connected to other neurons
through about 10,000 synapses
Biological Neurons
❖ Soma or body cell is a large, round central body in which
almost all the logical functions of the neuron are realized
❖ The axon (output) is a nerve fibre attached to the soma
which can serve as a final output channel of the neuron. An
axon is usually highly branched.
❖ The dendrites (inputs) represent a highly branching tree
of fibres. These long irregularly shaped nerve fibres
(processes) are attached to the soma
❖ Synapses are specialized contacts on a neuron which are
the termination points for the axons from other neurons
History of Neural Networks
❖ 1943 : McCulloch and Pitts – Modeling the Neuron for
parallel distributed processing
❖ 1949 : Hebb network
❖ 1958 : Rosenblatt- Perceptron
❖ 1960 : Adaline
❖ 1969 : Minsky and Papert publish limits on the ability of a
perceptron to generalize
❖ 1972 : Kohonen SOFM
❖ 1986 : Rumelhart, Hinton + Williams present BPN
❖ 1988 : Broomhead & Lowe - RBFN
❖ 1989 : Tsividis - Neural Network on a chip
Fundamentals of Neural Networks
❖ Neural network inspired by biological nervous systems,
such as our brain
❖ Useful for learning real-valued, discrete-valued or vector-
valued functions (LEARNING)
❖ Applied to problems such as interpreting visual scenes, speech recognition, learning robot control strategies, etc.
❖ Works well with noisy, complex sensor data such as inputs
from cameras and microphones (ADAPTATION)
Classification of Neural Networks
❖ Learning methods : Supervised, Unsupervised
❖ Architecture : Feed forward, Recurrent
❖ Output types : Binary, Continuous
❖ Node types : Uniform, Hybrid
❖ Implementations : Software, Hardware
❖ Connection weights : Adjustable, hardwired
❖ Operations: Biologically motivated, psychologically
motivated
Biological vs Artificial Net
A Neuron Model
f(x1, ..., xn) = φ(w0 + w1x1 + ... + wnxn)
❖ f is the function to be learned
❖ (x1, ..., xn) are the inputs
❖ φ is the activation function
❖ z = w0 + w1x1 + ... + wnxn is the weighted sum
[Figure: neuron diagram — the inputs x1, ..., xn feed the weighted sum z, which passes through φ(z) to give the output f(x1, ..., xn)]
A Neuron Model
❖ A neuron has a set of n synapses associated with its inputs, each characterized by a weight
❖ A signal xi at the ith input (i = 1, ..., n) is multiplied (weighted) by the weight wi
❖ The weighted input signals are summed: w1x1 + ... + wnxn
❖ Thus, a linear combination of the input signals is obtained
❖ A "free weight" (or bias) w0, which does not correspond to any input, is added to this linear combination to form the weighted sum: z = w0 + w1x1 + ... + wnxn
❖ A nonlinear activation function φ is applied to z
❖ The value of the activation function, y = φ(z), is the neuron's output
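As a minimal illustration (not part of the original slides; the function and parameter names are chosen here), the following Python sketch computes a neuron's output y = φ(z) from the weighted sum defined above:

```python
import math

def neuron_output(x, w, w0, activation=lambda z: 1.0 / (1.0 + math.exp(-z))):
    """Return y = phi(z), where z = w0 + w1*x1 + ... + wn*xn.

    x  -- list of n input signals
    w  -- list of n synaptic weights
    w0 -- bias ("free weight"), not tied to any input
    activation -- nonlinear activation phi (logistic sigmoid by default)
    """
    z = w0 + sum(wi * xi for wi, xi in zip(w, x))  # weighted sum
    return activation(z)

# Example with two inputs (illustrative values)
print(neuron_output(x=[1.0, 0.5], w=[0.4, -0.2], w0=0.1))
```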
A Neuron Model
[Figure: two equivalent neuron diagrams — the inputs x1, ..., xn, weighted by w1, ..., wn plus the bias w0, are summed to z = Σ wi xi, and the activation φ(z) = f(x1, ..., xn) is the output y]
Activation functions
❖ Linear activation: φ(z) = z
❖ Logistic activation: φ(z) = 1 / (1 + e^(-z))
❖ Threshold activation: φ(z) = sign(z) = 1 if z ≥ 0, −1 if z < 0
❖ Hyperbolic tangent activation: φ(u) = tanh(u) = (1 − e^(-2u)) / (1 + e^(-2u))
[Figure: plots of the four activation functions]
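A short Python sketch of the four activation functions listed above (illustrative only; the function names are mine, and tanh_act is mathematically the same as math.tanh):

```python
import math

def linear(z):
    return z                                    # phi(z) = z

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))           # phi(z) = 1 / (1 + e^-z)

def threshold(z):
    return 1 if z >= 0 else -1                  # phi(z) = sign(z)

def tanh_act(u):
    return (1 - math.exp(-2 * u)) / (1 + math.exp(-2 * u))  # same as math.tanh(u)

for z in (-2.0, 0.0, 2.0):
    print(z, linear(z), logistic(z), threshold(z), tanh_act(z))
```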
Node Structure
❖ The activation function is applied to the weighted sum of the inputs of a neuron to produce the output
❖ The majority of NNs use sigmoid functions:
❖ Smooth, continuous, and monotonically increasing
(derivative is always positive)
❖ Bounded range - but never reaches max or min
❖ Consider “ON” to be slightly less than the max and “OFF”
to be slightly greater than the min
Example-1
SOLUTION:
Example-2
For the network shown, find the output of the neuron Y when
the activation function is a) binary sigmoidal b) bipolar
sigmoidal.
SOLUTION:
yin = (0.8 × −0.2) + (0.3 × 0.3) + (0.2 × 0.8) + (0.6 × 0.5) + (1 × 0.25) = 0.64
a) Binary sigmoidal: Y = f(yin) = 1 / (1 + e^(-0.64)) = 0.6548
b) Bipolar sigmoidal: Y = f(yin) = 2 / (1 + e^(-0.64)) − 1 = 0.3095
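The arithmetic above can be checked with a short Python snippet (illustrative; the products are copied exactly as written in the solution):

```python
import math

# weighted sum: the four input*weight products plus the bias term 1 x 0.25
y_in = (0.8 * -0.2) + (0.3 * 0.3) + (0.2 * 0.8) + (0.6 * 0.5) + (1 * 0.25)

binary_sigmoid = 1.0 / (1.0 + math.exp(-y_in))         # (a) binary sigmoidal
bipolar_sigmoid = 2.0 / (1.0 + math.exp(-y_in)) - 1.0  # (b) bipolar sigmoidal

print(y_in)             # 0.64
print(binary_sigmoid)   # ~0.6548
print(bipolar_sigmoid)  # ~0.3095
```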
Where do weights come from?
❖ The weights in a neural network are the most important
factor in determining its function
❖ Training is the act of presenting the network with some
sample data and modifying the weights to better
approximate the desired function
There are two main types of training:
❖ SUPERVISED: The weights are modified to reduce the
difference between the actual and desired outputs
❖ UNSUPERVISED: The neural network adjusts its own
weights so that similar inputs cause similar outputs
❖ EPOCH: One iteration through the process of providing the
network with an input and updating the network's weights
Characteristics of Neural Networks
❖ Pattern of connection between the neurons (Architecture)
❖ Method of determining the weights (Training/Learning
algorithm)
❖ Activation Function/transfer/Output function
Dimensions of Neural Network
❖ Various types of neurons
❖ Various applications
Neural Network architecture
❖ Three different classes of network architectures
❖ Single-layer feed-forward
❖ Multi-layer feed-forward
❖ Recurrent
[Figure: recurrent network with input, hidden, and output layers, with z^-1 delay elements on the feedback connections]
Hidden layer?
❖ A hidden layer “hides” its desired output
❖ Neurons in the hidden layer cannot be observed through
the input/output behavior of the network.
❖ There is no obvious way to know what the desired output
of the hidden layer should be
❖ Commercial ANNs incorporate three and sometimes four
layers, including one or two hidden layers
❖ Each layer can contain from 10 to 1000 neurons depending
upon the application
Learning
Supervised Learning
❖ Recognizing hand-written digits, pattern recognition, regression
❖ Labeled examples (input, desired output)
Unsupervised Learning
❖ Unlabeled examples (input alone)
❖ Neural Network models: self-organizing maps, Hopfield networks
Unsupervised learning methods
Feedback Nets:
❖ Additive Grossberg (AG)
❖ Shunting Grossberg (SG)
❖ Binary Adaptive Resonance Theory (ART1)
❖ Analog Adaptive Resonance Theory (ART2, ART2a)
❖ Discrete Hopfield (DH)
❖ Continuous Hopfield (CH)
❖ Discrete Bidirectional Associative Memory (BAM)
❖ Temporal Associative Memory (TAM)
❖ Adaptive Bidirectional Associative Memory (ABAM)
❖ Kohonen Self-organizing Map (SOM)
❖ Kohonen Topology-preserving Map (TPM)
Supervised learning methods
Feedback Nets:
❖ Brain-State-in-a-Box (BSB)
❖ Fuzzy Cognitive Map (FCM)
❖ Boltzmann Machine (BM)
❖ Mean Field Annealing (MFT)
❖ Recurrent Cascade Correlation (RCC)
❖ Learning Vector Quantization (LVQ)
Perceptron
❖ The perceptron calculates a weighted sum of inputs and
compares it to a threshold
❖ If the sum is higher than the threshold, the output is set to
1, otherwise to -1
❖ Learning is finding weights wi
[Figure: perceptron with inputs x0 = 1, x1, ..., xn, weights w0, w1, ..., wn, and output o]
o = 1 if Σ (i = 0 to n) wi xi > 0, −1 otherwise
The McCulloch-Pitts model
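A minimal sketch of this output rule in Python (the names are mine, not from the slides); the fixed input x0 = 1 carries the bias weight w0:

```python
def perceptron_output(x, w):
    """x and w include the bias term: x[0] = 1 and w[0] = w0."""
    s = sum(wi * xi for wi, xi in zip(w, x))   # weighted sum over i = 0..n
    return 1 if s > 0 else -1                  # compare against the threshold (0)

# Example: bias input x0 = 1 plus two real inputs (illustrative values)
print(perceptron_output(x=[1, 0.5, -0.3], w=[-0.1, 0.8, 0.4]))  # prints 1
```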
Perceptron
[Figure: perceptron for diagnosis — input units (Cough, Headache) connect through adjustable weights to output units (No disease, Pneumonia, Flu, Meningitis); the learning rule changes the weights to decrease the error = what we got − what we wanted]
How do perceptrons learn?
❖ Uses supervised training
❖ If the output is not correct, the weights are adjusted
according to the formula: Wnew = Wold + α(desired –
output)*input
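As a sketch (assumed naming; `alpha` stands for the learning rate α), this rule can be written in Python as:

```python
def update_weights(w, x, desired, output, alpha):
    """Wnew = Wold + alpha * (desired - output) * input, applied weight by weight."""
    return [wi + alpha * (desired - output) * xi for wi, xi in zip(w, x)]
```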
Solved Example: Logical AND
Training Instance 3: A=1, B=0 and Target = 0
❖ Σ wi xi = 1×1.2 + 0×0.6 = 1.2
❖ Hence the output is 1, but the target is 0, so the weights must be updated
Update the weights using
Wnew = Wold + α(desired − output) × input, with α = 0.5:
w1 = 1.2 + 0.5(0 − 1)×1 = 0.7
w2 = 0.6 + 0.5(0 − 1)×0 = 0.6
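This single update can be verified with a short snippet (illustrative; α = 0.5, desired = 0, and output = 1 as in the example):

```python
alpha, desired, output = 0.5, 0, 1
w = [1.2, 0.6]   # weights before the update
x = [1, 0]       # inputs A = 1, B = 0
w_new = [wi + alpha * (desired - output) * xi for wi, xi in zip(w, x)]
print(w_new)     # [0.7, 0.6]
```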
Solved Example: Logical AND
Training Instance 1: A=0, B=0 and Target = 0
❖ Σ wi xi = 0×0.7 + 0×0.6 = 0
Solved Example: Logical AND
Training Instance 3: A=1, B=0 and Target = 0
❖ Σ wi xi = 1×0.7 + 0×0.6 = 0.7
Solved Example 2: Logical AND
Epoch | Inputs (x1 x2) | Desired output Yd | Initial weights (w1 w2) | Actual output Y | Error e | Final weights (w1 w2)
1 | 0 0 | 0 | 0.3 −0.1 | 0 |  0 | 0.3 −0.1
  | 0 1 | 0 | 0.3 −0.1 | 0 |  0 | 0.3 −0.1
  | 1 0 | 0 | 0.3 −0.1 | 1 | −1 | 0.2 −0.1
  | 1 1 | 1 | 0.2 −0.1 | 0 |  1 | 0.3  0.0
2 | 0 0 | 0 | 0.3  0.0 | 0 |  0 | 0.3  0.0
  | 0 1 | 0 | 0.3  0.0 | 0 |  0 | 0.3  0.0
  | 1 0 | 0 | 0.3  0.0 | 1 | −1 | 0.2  0.0
  | 1 1 | 1 | 0.2  0.0 | 1 |  0 | 0.2  0.0
3 | 0 0 | 0 | 0.2  0.0 | 0 |  0 | 0.2  0.0
  | 0 1 | 0 | 0.2  0.0 | 0 |  0 | 0.2  0.0
  | 1 0 | 0 | 0.2  0.0 | 1 | −1 | 0.1  0.0
  | 1 1 | 1 | 0.1  0.0 | 0 |  1 | 0.2  0.1
4 | 0 0 | 0 | 0.2  0.1 | 0 |  0 | 0.2  0.1
  | 0 1 | 0 | 0.2  0.1 | 0 |  0 | 0.2  0.1
  | 1 0 | 0 | 0.2  0.1 | 1 | −1 | 0.1  0.1
  | 1 1 | 1 | 0.1  0.1 | 1 |  0 | 0.1  0.1
5 | 0 0 | 0 | 0.1  0.1 | 0 |  0 | 0.1  0.1
  | 0 1 | 0 | 0.1  0.1 | 0 |  0 | 0.1  0.1
  | 1 0 | 0 | 0.1  0.1 | 0 |  0 | 0.1  0.1
  | 1 1 | 1 | 0.1  0.1 | 1 |  0 | 0.1  0.1
Threshold: θ = 0.2; learning rate: α = 0.1
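The table can be reproduced with a minimal Python sketch (the naming and structure are mine, not from the slides): a single perceptron with a step activation thresholded at θ = 0.2, learning rate α = 0.1, and initial weights w1 = 0.3, w2 = −0.1, trained on the logical AND patterns:

```python
def train_perceptron_and(theta=0.2, alpha=0.1, w1=0.3, w2=-0.1, epochs=5):
    # training set for logical AND: ((x1, x2), desired output Yd)
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    for epoch in range(1, epochs + 1):
        for (x1, x2), yd in data:
            y = 1 if x1 * w1 + x2 * w2 >= theta else 0   # step activation
            e = yd - y                                   # error
            w1 += alpha * e * x1                         # perceptron learning rule
            w2 += alpha * e * x2
            print(f"epoch {epoch}: x=({x1},{x2}) Yd={yd} Y={y} e={e:+d} "
                  f"w=({w1:.1f},{w2:.1f})")
    return w1, w2

print(train_perceptron_and())   # settles at w1 = 0.1, w2 = 0.1, as in the table
```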