
The Simple Perceptron

Artificial Neural Network


- Information processing architecture loosely modelled on the brain
- Consists of a large number of interconnected processing units (neurons)
- These units work in parallel to accomplish a global task
- Generally used to model relationships between inputs and outputs, or to find patterns in data
Artificial Neural Network
Three types of layers: input, hidden, and output.
Single Processing Unit
Activation Functions
A function which takes the total input to a node and produces the node's output, given some threshold.
Network Structure
Two main network structures:
1. Feed-Forward Network
2. Recurrent Network
Learning Paradigms
Supervised Learning:
Given training data consisting of input/output pairs, find a function which correctly maps inputs to outputs.
Unsupervised Learning:
Given a data set, the network finds patterns and categorizes the data into groups.
Reinforcement Learning:
No data is given in advance; an agent interacts with the environment and evaluates the cost of its actions.
Simple Perceptron
The perceptron is a single-layer feed-forward neural network.
Simple Perceptron
Uses the simplest output function: a hard threshold applied to the weighted sum of the inputs.
Used to classify patterns that are linearly separable.
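For concreteness, here is a minimal sketch (not from the slides) of this output function: a hard threshold on the weighted sum of the inputs plus the bias, with the ±1 coding used later for the desired outputs. The weights and sample points are illustrative.

```python
import numpy as np

def perceptron_output(w: np.ndarray, b: float, x: np.ndarray) -> int:
    """Hard-threshold output: +1 on one side of the separating hyperplane, -1 on the other."""
    return 1 if np.dot(w, x) + b > 0 else -1

# Example: a perceptron with weights (1, 0.5) and zero bias classifies two points.
w, b = np.array([1.0, 0.5]), 0.0
print(perceptron_output(w, b, np.array([1.0, 1.0])))    # +1
print(perceptron_output(w, b, np.array([-2.0, -1.0])))  # -1
```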
Linearly Separable
- The bias is proportional to the offset of the separating plane from the origin
- The weights determine the slope of the line
- The weight vector is perpendicular to the plane
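A small numeric check of this geometry (illustrative values, assuming the boundary is the set of points with w·x + b = 0): the weight vector is orthogonal to any direction lying in the boundary, and the boundary's distance from the origin is |b| / ||w||, so it grows with the bias.

```python
import numpy as np

w = np.array([1.0, 0.5])   # weights: the boundary is the line w·x + b = 0
b = -1.0                   # bias: shifts the boundary away from the origin

# Distance of the boundary from the origin is proportional to |b| (for fixed ||w||).
print(abs(b) / np.linalg.norm(w))   # ~0.894

# Take two points on the boundary and check that their difference is orthogonal to w,
# i.e. the weight vector is perpendicular to the separating line.
p1 = np.array([1.0, (-b - w[0] * 1.0) / w[1]])
p2 = np.array([3.0, (-b - w[0] * 3.0) / w[1]])
print(np.dot(w, p2 - p1))           # ~0
```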
Perceptron Learning Algorithm
We want to train the perceptron to classify inputs correctly.
This is accomplished by adjusting the connecting weights and the bias.
It can only properly handle linearly separable sets.
Perceptron Learning Algorithm
We have a training set: a set of input vectors used to train the perceptron.
During training both the weights wi and the bias θ are modified; for convenience, let w0 = θ and x0 = 1.
Let η, the learning rate, be a small positive number (small steps lessen the possibility of destroying correct classifications).
Initialise the wi to some values.
Perceptron Learning Algorithm
Desired output:
    d(n) = +1 if x(n) ∈ set A
    d(n) = -1 if x(n) ∈ set B
1. Select a random sample from the training set as input
2. If the classification is correct, do nothing
3. If the classification is incorrect, modify the weight vector w using
    wi = wi + η d(n) xi(n)
Repeat this procedure until the entire training set is classified correctly.
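Putting these steps together, the sketch below implements the rule under the stated conventions (bias absorbed as w0 with x0 = 1, labels d = ±1). It cycles through the samples instead of picking them at random, which converges just the same for linearly separable data; the toy data set is made up for illustration.

```python
import numpy as np

def train_perceptron(samples, labels, eta=0.2, max_epochs=100):
    """Perceptron learning rule with the bias absorbed as w0 (x0 = 1)."""
    # Augment every input with x0 = 1 so that w0 plays the role of the bias.
    X = np.hstack([np.ones((len(samples), 1)), np.asarray(samples, dtype=float)])
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x, d in zip(X, labels):
            if d * np.dot(w, x) <= 0:          # misclassified (or on the boundary)
                w += eta * d * x               # wi = wi + eta * d(n) * xi(n)
                errors += 1
        if errors == 0:                        # entire training set classified correctly
            return w
    return w

# Toy linearly separable data: class A (d = +1) vs class B (d = -1).
samples = [(1, 1), (2, 1), (-1, -1.5), (-2, -1)]
labels  = [1, 1, -1, -1]
print(train_perceptron(samples, labels))
```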
Learning Example
Initial values:
    η = 0.2
    w = (0, 1, 0.5)ᵀ
The initial decision boundary:
    0 = w0 + w1 x1 + w2 x2 = 0 + x1 + 0.5 x2
    ⇒ x2 = -2 x1
Learning Example
    η = 0.2, w = (0, 1, 0.5)ᵀ
Sample: x1 = 1, x2 = 1
    wᵀx > 0: correct classification, no action
Learning Example
    η = 0.2, w = (0, 1, 0.5)ᵀ
Sample: x1 = 2, x2 = -2 (class B)
    wᵀx > 0: incorrect classification, so update
    w0 = w0 - 0.2(1)
    w1 = w1 - 0.2(2)
    w2 = w2 - 0.2(-2)
Learning Example
    η = 0.2
After the update: w = (-0.2, 0.6, 0.9)ᵀ
    (w0 = w0 - 0.2(1), w1 = w1 - 0.2(2), w2 = w2 - 0.2(-2), applied to the sample x1 = 2, x2 = -2)
Learning Example
    η = 0.2, w = (-0.2, 0.6, 0.9)ᵀ
Sample: x1 = -1, x2 = -1.5
    wᵀx < 0: correct classification, no action
Learning Example
    η = 0.2, w = (-0.2, 0.6, 0.9)ᵀ
Sample: x1 = -2, x2 = -1
    wᵀx < 0: correct classification, no action
Learning Example
    η = 0.2, w = (-0.2, 0.6, 0.9)ᵀ
Sample: x1 = -2, x2 = 1 (class A)
    wᵀx < 0: incorrect classification, so update
    w0 = w0 + 0.2(1)
    w1 = w1 + 0.2(-2)
    w2 = w2 + 0.2(1)
Learning Example
    η = 0.2
After the update: w = (0, 0.2, 1.1)ᵀ
    (w0 = w0 + 0.2(1), w1 = w1 + 0.2(-2), w2 = w2 + 0.2(1), applied to the sample x1 = -2, x2 = 1)
Learning Example
    η = 0.2, w = (0, 0.2, 1.1)ᵀ
Sample: x1 = 1.5, x2 = -0.5 (class A)
    wᵀx < 0: incorrect classification, so update
    w0 = w0 + 0.2(1)
    w1 = w1 + 0.2(1.5)
    w2 = w2 + 0.2(-0.5)
Learning Example
    η = 0.2
After the update: w = (0.2, 0.5, 1)ᵀ
    (w0 = w0 + 0.2(1), w1 = w1 + 0.2(1.5), w2 = w2 + 0.2(-0.5), applied to the sample x1 = 1.5, x2 = -0.5)
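The short script below re-runs this hand-worked trace numerically, with the same η, the same starting weights and the same sample order as the slides; the class labels are inferred from which updates the slides apply.

```python
import numpy as np

eta = 0.2
w = np.array([0.0, 1.0, 0.5])                  # (w0, w1, w2), starting values from the example

# (x1, x2, desired output d) in the order the samples are visited above.
steps = [
    (1.0,  1.0,  +1),   # correct, no action
    (2.0, -2.0,  -1),   # misclassified -> w becomes (-0.2, 0.6, 0.9)
    (-1.0, -1.5, -1),   # correct, no action
    (-2.0, -1.0, -1),   # correct, no action
    (-2.0,  1.0,  +1),  # misclassified -> w becomes (0, 0.2, 1.1)
    (1.5, -0.5,  +1),   # misclassified -> w becomes (0.2, 0.5, 1.0)
]

for x1, x2, d in steps:
    x = np.array([1.0, x1, x2])                # x0 = 1 carries the bias term
    if d * np.dot(w, x) <= 0:                  # incorrect classification
        w = w + eta * d * x                    # wi = wi + eta * d * xi
    print(w)
```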
Perceptron Convergence Theorem
The theorem states that for any data set which is linearly separable, the perceptron learning rule is guaranteed to find a solution in a finite number of iterations.
Idea behind the proof: find upper and lower bounds on the length of the weight vector to show that the number of iterations is finite.
Perceptron Convergence Theorem
Let's assume that the input variables come from two linearly separable classes C1 and C2.
Let T1 and T2 be the subsets of training vectors which belong to the classes C1 and C2 respectively.
Then T1 ∪ T2 is the complete training set.
Perceptron Convergence Theorem
As we have seen, the learning algorithm's purpose is to find a weight vector w such that
    w · x > 0 for every x ∈ C1 (x is an input vector)
    w · x ≤ 0 for every x ∈ C2
If the kth member of the training set, x(k), is correctly classified by the weight vector w(k) computed at the kth iteration of the algorithm, then we do not adjust the weight vector.
However, if it is incorrectly classified, we use the modifier
    w(k+1) = w(k) + η d(k) x(k)
Perceptron Convergence Theorem
So we get
    w(k+1) = w(k) - η x(k) if w(k) · x(k) > 0 and x(k) ∈ C2
    w(k+1) = w(k) + η x(k) if w(k) · x(k) ≤ 0 and x(k) ∈ C1
We can set η = 1, since any other η > 0 just scales the vectors.
We can also set the initial condition w(0) = 0, as any non-zero value will still converge, only decreasing or increasing the number of iterations.
Perceptron Convergence Theorem
Suppose that w(k) · x(k) < 0 for k = 1, 2, ..., where x(k) ∈ T1, so with an incorrect classification we get
    w(k+1) = w(k) + x(k),  x(k) ∈ C1
By expanding iteratively, we get
    w(k+1) = x(k) + w(k)
           = x(k) + x(k-1) + w(k-1)
           ...
           = x(k) + ... + x(1) + w(0)
Perceptron Convergence Theorem
As we assume linear separability, a solution w* exists for which w* · x(k) > 0 for x(1), ..., x(k) ∈ T1. Multiply both sides by the solution w* to get
    w* · w(k+1) = w* · x(1) + ... + w* · x(k)
These terms are all > 0, hence all ≥ α, where
    α = min w* · x(k)
Thus we get
    w* · w(k+1) ≥ k α
Perceptron Convergence Theorem
Now we make use of the Cauchy-Schwarz inequality, which states that for any two vectors A, B
    ||A||² ||B||² ≥ (A · B)²
Applying this we get
    ||w*||² ||w(k+1)||² ≥ (w* · w(k+1))²
From the previous slide we know w* · w(k+1) ≥ k α.
Thus, it follows that
    ||w(k+1)||² ≥ k² α² / ||w*||²
Perceptron Convergence Theorem
We continue the proof by going down another route.
    w(j+1) = w(j) + x(j)  for j = 1, ..., k with x(j) ∈ T1
We square the Euclidean norm on both sides:
    ||w(j+1)||² = ||w(j) + x(j)||²
                = ||w(j)||² + ||x(j)||² + 2 w(j) · x(j)
Since x(j) was incorrectly classified, w(j) · x(j) < 0, and thus we get
    ||w(j+1)||² ≤ ||w(j)||² + ||x(j)||²
Perceptron Convergence Theorem
Summing both sides for all j:
    ||w(j+1)||² - ||w(j)||² ≤ ||x(j)||²
    ||w(j)||² - ||w(j-1)||² ≤ ||x(j-1)||²
    ...
    ||w(1)||² - ||w(0)||² ≤ ||x(1)||²
We get
    ||w(k+1)||² ≤ Σ (j = 1 to k) ||x(j)||² ≤ k β
where
    β = max ||x(j)||²
Perceptron Convergence Theorem
But now we have a conflict between the
equations, for sufficiently large values of k
2 2
2 2 k
w k 1 k w k 1 2
w
So, we can state that k cannot be larger than
some value kmax for which the two equations are
both satisfied.
2 2 2
k max w
k max = 2
k max = 2
w
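As an illustrative sanity check of this bound (with a made-up separable data set and an assumed solution vector w*, here the final weights from the earlier worked example), the snippet below computes k_max = β ||w*||² / α² and compares it with the number of weight updates the rule actually performs with η = 1 and w(0) = 0.

```python
import numpy as np

# Made-up linearly separable data; class B samples are folded in as d*x so that
# a separating w_star satisfies w_star · (d*x) > 0 for every sample.
X = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, -1.0, -1.5], [1.0, -2.0, -1.0]])
d = np.array([1, 1, -1, -1])
Z = d[:, None] * X                      # "normalised" samples, all on the positive side

w_star = np.array([0.2, 0.5, 1.0])      # assumed solution vector (from the worked example)
alpha = np.min(Z @ w_star)              # alpha = min w* · x(k)
beta = np.max(np.sum(Z**2, axis=1))     # beta = max ||x(k)||^2
k_max = beta * np.dot(w_star, w_star) / alpha**2
print("k_max bound:", k_max)

# Count the updates the perceptron rule (eta = 1, w(0) = 0) actually makes.
w, updates = np.zeros(3), 0
changed = True
while changed:
    changed = False
    for z in Z:
        if np.dot(w, z) <= 0:           # misclassified, so update
            w, updates, changed = w + z, updates + 1, True
print("actual updates:", updates)       # should not exceed k_max
```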
Perceptron Convergence Theorem
Thus it is proved that for k = 1, k, w(0) = 0,
given that a solution vector w* exists, the
perceptron learning rule will terminate after at
most kmax iterations.
The End
