
Chapter 2

Single Layer Feedforward Networks


Perceptrons
• By Rosenblatt (1962)
– For modeling visual perception (retina)
– A feedforward network of three layers of units:
Sensory, Association, and Response
– Learning occurs only on weights from A units to R units
(weights from S units to A units are fixed).
– Each R unit receives inputs from n A units
– For a given training sample s:t, change weights between A
and R only if the computed output y is different from the
target output t (error driven)
[Diagram: S units → A units → R units, with fixed weights wSA from S to A and trainable weights wAR from A to R]
Perceptrons
• A simple perceptron
– Structure:
• Single output node with a threshold function
• n input nodes with weights wi, i = 1, …, n
– To classify input patterns into one of the two classes
(depending on whether output = 0 or 1)
– Example: input patterns: (x1, x2)
• Two groups of input patterns
(0, 0) (0, 1) (1, 0) (-1, -1);
(2.1, 0) (0, -2.5) (1.6, -1.6)
• Can be separated by a line on the (x1, x2) plane x1 - x2 = 2
• Classification by a perceptron with
w1 = 1, w2 = -1, threshold = 2
Perceptrons

[Figure: the two groups of example patterns plotted on the (x1, x2) plane, separated by the line x1 - x2 = 2]

• Implement the threshold by a node x0
  – Constant output 1
  – Weight w0 = -threshold
  – A common practice in NN design
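To make this concrete, here is a minimal Python sketch (illustrative, not from the slides) of the example perceptron above, with the threshold of 2 folded into a bias weight w0 = -2 on the constant input x0 = 1:

# Simple perceptron: output 1 if w0*x0 + w1*x1 + w2*x2 > 0, else 0.
# The threshold 2 becomes the bias weight w0 = -2 on the constant input x0 = 1.
def perceptron(x1, x2, w=(-2.0, 1.0, -1.0)):
    w0, w1, w2 = w
    net = w0 * 1.0 + w1 * x1 + w2 * x2   # x0 = 1 is the constant bias input
    return 1 if net > 0 else 0

# Patterns from the earlier example: the first group maps to 0, the second to 1.
group_a = [(0, 0), (0, 1), (1, 0), (-1, -1)]
group_b = [(2.1, 0), (0, -2.5), (1.6, -1.6)]
for x1, x2 in group_a + group_b:
    print((x1, x2), "->", perceptron(x1, x2))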
Perceptrons
• Linear separability
– A set of (2D) patterns (x1, x2) of two classes is linearly
separable if there exists a line on the (x1, x2) plane
• w0 + w1 x1 + w2 x2 = 0
• Separates all patterns of one class from the other class
– A perceptron can be built with
• 3 inputs x0 = 1, x1, x2 with weights w0, w1, w2
– n dimensional patterns (x1,…, xn)
• Hyperplane w0 + w1 x1 + w2 x2 +…+ wn xn = 0 dividing the
space into two regions
– Can we get the weights from a set of sample patterns?
• If the problem is linearly separable, then YES (by perceptron
learning)
• Examples of linearly separable classes
  - Logical AND function (bipolar patterns)

        x1   x2   output
        -1   -1     -1
        -1    1     -1
         1   -1     -1
         1    1      1

    Decision boundary: w1 = 1, w2 = 1, w0 = -1, i.e. -1 + x1 + x2 = 0
    x: class I (output = 1), o: class II (output = -1)

  - Logical OR function (bipolar patterns)

        x1   x2   output
        -1   -1     -1
        -1    1      1
         1   -1      1
         1    1      1

    Decision boundary: w1 = 1, w2 = 1, w0 = 1, i.e. 1 + x1 + x2 = 0
    x: class I (output = 1), o: class II (output = -1)
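The two boundaries can be verified directly; the short Python check below (the helper name sign_unit is illustrative) applies the bipolar sign output to each pattern:

# Check that the given weights implement bipolar AND and OR.
def sign_unit(w0, w1, w2, x1, x2):
    net = w0 + w1 * x1 + w2 * x2
    return 1 if net > 0 else -1

patterns = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# AND: w0 = -1, w1 = 1, w2 = 1  ->  -1, -1, -1, 1
print([sign_unit(-1, 1, 1, x1, x2) for x1, x2 in patterns])

# OR:  w0 = 1, w1 = 1, w2 = 1   ->  -1, 1, 1, 1
print([sign_unit(1, 1, 1, x1, x2) for x1, x2 in patterns])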
Perceptron Learning
• The network
– Input vector ij (including the threshold input i0,j = 1)
– Weight vector w = (w0, w1, …, wn)
– Net input: net = w · ij = Σk=0..n wk ik,j
– Output: bipolar (-1, 1) using the sign node function:
    output = 1 if w · ij > 0, otherwise output = -1
• Training samples
– Pairs (ij , class(ij)) where class(ij) is the correct classification of ij
• Training:
– Update w so that all sample inputs are correctly classified (if
possible)
– If an input ij is misclassified by the current w
class(ij) · w · ij < 0
change w to w + Δw so that (w + Δw) · ij is closer to class(ij)
Perceptron Learning

• The learning rule: when a sample ij is misclassified, update
    w ← w + Δw,  where Δw = η · class(ij) · ij
  and η > 0 is the learning rate
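A minimal sketch of this rule as a training loop in Python (names are illustrative; it assumes bipolar sign outputs and input vectors that already include the constant threshold component i0 = 1):

# Perceptron learning: update w only on misclassified samples,
# w <- w + eta * class(i_j) * i_j.
def train_perceptron(samples, n_inputs, eta=1.0, max_epochs=100):
    w = [0.0] * n_inputs                      # start weight vector w0
    for _ in range(max_epochs):
        errors = 0
        for i_j, target in samples:           # target = class(i_j) in {-1, +1}
            net = sum(wk * ik for wk, ik in zip(w, i_j))
            y = 1 if net > 0 else -1
            if y != target:                   # error driven: change only if wrong
                w = [wk + eta * target * ik for wk, ik in zip(w, i_j)]
                errors += 1
        if errors == 0:                       # all samples correctly classified
            break
    return w

# Bipolar AND, with the constant threshold input 1 prepended to each pattern.
and_samples = [((1, -1, -1), -1), ((1, -1, 1), -1), ((1, 1, -1), -1), ((1, 1, 1), 1)]
print(train_perceptron(and_samples, n_inputs=3))

Run on these bipolar AND samples, the loop converges to w = (-1, 1, 1), matching the decision boundary given earlier for AND.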


Perceptron Learning
• Justification

    (w + Δw) · ij = (w + η · class(ij) · ij) · ij
                  = w · ij + η · class(ij) · (ij · ij)
  Since ij · ij > 0, the added term η · class(ij) · (ij · ij) is
    > 0 if class(ij) = 1
    < 0 if class(ij) = -1
  ⇒ the new net input moves toward class(ij)


• Perceptron learning convergence theorem
– Informal: any problem that can be represented by a
perceptron can be learned by the learning rule
– Theorem: If there is a weight vector w* such that f(ip · w*) = class(ip)
  for all P training sample patterns {ip, class(ip)}, then for any start
  weight vector w0, the perceptron learning rule will converge to a weight
  vector w such that f(ip · w) = class(ip) for all p.
  (w and w* may not be the same.)
– Proof: reading for grad students (Sec. 2.4)


Perceptron Learning
• Note:
– It is supervised learning (class(ij) is given for every sample input ij)
– Learning occurs only when a sample input is misclassified (error
  driven)
• Termination criteria: learning stops when all samples are correctly
classified
– Assuming the problem is linearly separable
– Assuming the learning rate (η) is sufficiently small
• Choice of learning rate:
– If η is too large: existing weights are overtaken by Δw = η · class(ij) · ij
– If η is too small (≈ 0): very slow to converge
– Common choice: η = 1.
• Non-numeric input:
– Different encoding schemes
ex. Color = (red, blue, green, yellow). (0, 0, 1, 0) encodes “green”
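For the colour example, a small sketch of such a one-of-n ("one-hot") encoding (the helper name is illustrative):

# One-of-n encoding for a non-numeric attribute.
COLORS = ("red", "blue", "green", "yellow")

def encode_color(value):
    return tuple(1 if c == value else 0 for c in COLORS)

print(encode_color("green"))   # (0, 0, 1, 0)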
Perceptron Learning
• Learning quality
– Generalization: can a trained perceptron correctly classify
patterns not included in the training samples?
• Common problem for many NN learning models
– Depends on the quality of the training samples selected
– Also depends to some extent on the learning rate and the initial
  weights (bad choices may make learning too slow to converge to be
  practical)
– How can we know the learning is ok?
• Reserve a few samples for testing
Adaline
• By Widrow and Hoff (~1960)
– Adaptive linear elements for signal processing
– The same architecture as perceptrons
– Learning method: delta rule (another form of error-driven learning),
  also called the Widrow-Hoff learning rule
– Tries to reduce the mean squared error (MSE) between the net input
  and the desired output
Adaline
• Delta rule
– Let ij = (i0,j, i1,j,…, in,j ) be an input vector with desired output dj
– The squared error for sample j:
    E = (dj - netj)^2 = (dj - Σl wl il,j)^2
  Its value is determined by the weights wl
– Modify the weights by a gradient descent approach:
    Δwk = -η ∂E/∂wk
  i.e. change each weight in the direction opposite to ∂E/∂wk:
    Δwk = η (dj - Σl wl il,j) · ik,j = η (dj - netj) · ik,j
Adaline Learning Algorithm
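A minimal Python sketch of the sequential (per-sample) Adaline learning loop implied by the delta rule above; the fixed number of epochs, the learning rate, and the names used are illustrative assumptions:

# Adaline (delta rule / Widrow-Hoff): w_k <- w_k + eta * (d_j - net_j) * i_k,j
# Unlike the perceptron, weights are adjusted on every sample, not only on errors.
def train_adaline(samples, n_inputs, eta=0.1, epochs=50):
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for i_j, d_j in samples:                      # d_j: desired output for i_j
            net = sum(wk * ik for wk, ik in zip(w, i_j))
            err = d_j - net                           # error measured on the net input
            w = [wk + eta * err * ik for wk, ik in zip(w, i_j)]
    return w

def classify(w, i_j):
    # Threshold (sign) output applied after training.
    return 1 if sum(wk * ik for wk, ik in zip(w, i_j)) > 0 else -1

# Bipolar AND with the constant bias input 1 prepended to each pattern.
and_samples = [((1, -1, -1), -1), ((1, -1, 1), -1), ((1, 1, -1), -1), ((1, 1, 1), 1)]
w = train_adaline(and_samples, n_inputs=3)
print(w, [classify(w, i_j) for i_j, _ in and_samples])   # expected classes: -1 -1 -1 1

On the bipolar AND samples the weights settle near (-0.5, 0.5, 0.5), the minimum-MSE solution; the thresholded outputs then classify all four patterns correctly even though the net inputs never equal the targets exactly.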
Adaline Learning
• Delta rule in batch mode
– Based on mean squared error over all P samples
    E = (1/P) Σp=1..P (dp - netp)^2

• E is again a function of w = (w0, w1, …, wn)
• The gradient of E:
    ∂E/∂wk = (2/P) Σp=1..P [(dp - netp) · ∂(dp - netp)/∂wk]
           = -(2/P) Σp=1..P [(dp - netp) · ik,p]
• Therefore (absorbing the constant factor into η)
    Δwk = -η ∂E/∂wk = η Σp=1..P [(dp - netp) · ik,p]
Adaline Learning
• Notes:
– Weights will be changed even if an input is classified
correctly
– E monotonically decreases until the system reaches a state
with (local) minimum E (a small change of any wi will
cause E to increase).
– At a local minimum of E, ∂E/∂wk = 0 for all k, but E is not
  guaranteed to be zero (netj may still differ from dj)
  • This is why Adaline uses a threshold output function rather than a
    linear one for classification
Linear Separability Again
• Examples of linearly inseparable classes
  - Logical XOR (exclusive OR) function (bipolar patterns)

        x1   x2   output
        -1   -1     -1
        -1    1      1
         1   -1      1
         1    1     -1

    x: class I (output = 1), o: class II (output = -1)
No line can separate these two classes, as can be seen from the fact
that the following linear inequality system has no solution:

    w0 - w1 - w2 < 0    (1)
    w0 - w1 + w2 ≥ 0    (2)
    w0 + w1 - w2 ≥ 0    (3)
    w0 + w1 + w2 < 0    (4)

because we have w0 < 0 from (1) + (4), and w0 ≥ 0 from (2) + (3),
which is a contradiction.
Why must hidden units be non-linear?
• Multi-layer net with linear hidden layers is equivalent to a
single layer net
[Figure: a 2-2-1 network — inputs x1, x2 connect to linear hidden units z1, z2
through weights v11, v12, v21, v22, which connect to output Y through weights
w1, w2; output threshold = 0]
– Because z1 and z2 are linear units:
    z1 = a1 * (x1*v11 + x2*v21) + b1
    z2 = a2 * (x1*v12 + x2*v22) + b2
– nety = z1*w1 + z2*w2
       = x1*u1 + x2*u2 + (b1*w1 + b2*w2), where
    u1 = a1*v11*w1 + a2*v12*w2,  u2 = a1*v21*w1 + a2*v22*w2
  nety is still a linear combination of x1 and x2.
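This collapse can be checked numerically; the sketch below picks arbitrary values for the constants (all numbers are illustrative) and confirms that the two-layer linear net and the collapsed single-layer net produce the same nety:

# Two linear hidden units feeding a linear output vs. the collapsed single layer.
a1, b1, a2, b2 = 2.0, 0.5, -1.0, 1.5          # arbitrary linear-unit parameters
v11, v21, v12, v22 = 0.3, -0.7, 1.2, 0.4      # input-to-hidden weights
w1, w2 = 0.6, -0.9                            # hidden-to-output weights

def net_two_layer(x1, x2):
    z1 = a1 * (x1 * v11 + x2 * v21) + b1
    z2 = a2 * (x1 * v12 + x2 * v22) + b2
    return z1 * w1 + z2 * w2

# Collapsed weights u1, u2 and a single bias term.
u1 = a1 * v11 * w1 + a2 * v12 * w2
u2 = a1 * v21 * w1 + a2 * v22 * w2
bias = b1 * w1 + b2 * w2

def net_single_layer(x1, x2):
    return x1 * u1 + x2 * u2 + bias

for x1, x2 in [(-1, -1), (0.5, 2.0), (3.0, -1.5)]:
    print(net_two_layer(x1, x2), net_single_layer(x1, x2))   # same value (up to rounding)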
– XOR can be solved by a more complex network with
hidden units

[Figure: a two-layer threshold network solving XOR — x1 and x2 feed hidden units
z1 and z2 through crossed weights of 2 and -2, and z1, z2 feed the output unit Y
through weights 2 and 2; all units use threshold activations]
    (x1, x2)    (z1, z2)     y
    (-1, -1)    (-1, -1)    -1
    (-1,  1)    (-1,  1)     1
    ( 1, -1)    ( 1, -1)     1
    ( 1,  1)    (-1, -1)    -1
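A Python sketch of this network; the weight pattern (±2) follows the figure, while the threshold values used here (2 for the hidden units, -2 for the output unit) are assumptions chosen to reproduce the table above:

# Two-layer threshold network computing bipolar XOR.
# Weights follow the figure; the thresholds (2 for z1, z2 and -2 for Y)
# are assumed values that reproduce the table above.
def step(net, theta):
    return 1 if net >= theta else -1

def xor_net(x1, x2):
    z1 = step(2 * x1 - 2 * x2, theta=2)
    z2 = step(-2 * x1 + 2 * x2, theta=2)
    y = step(2 * z1 + 2 * z2, theta=-2)
    return (z1, z2), y

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print((x1, x2), *xor_net(x1, x2))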
Summary
• Single layer nets have limited representation power
(linear separability problem)
• Error-driven learning seems a good way to train a net
• Multi-layer nets (with non-linear hidden units) may overcome the
  linear inseparability problem, but learning methods for such nets
  are needed
• Threshold/step output functions hinder the effort to develop
  learning methods for multi-layer nets
