
COMP9444: Neural Networks and Deep Learning
Week 1c. Perceptrons
Alan Blair
School of Computer Science and Engineering
May 28, 2024

Outline

➛ Neurons – Biological and Artificial
➛ Perceptron Learning
➛ Linear Separability
➛ Multi-Layer Networks

Structure of a Typical Neuron

[Figure: anatomy of a typical neuron, showing the cell body, dendrites, axon and synapses]

Biological Neurons

The brain is made up of neurons (nerve cells) which have
➛ a cell body (soma)
➛ dendrites (inputs)
➛ an axon (outputs)
➛ synapses (connections between cells)

Synapses can be excitatory or inhibitory and may change over time.

When the inputs reach some threshold, an action potential (electrical pulse) is sent along the axon to the outputs.
Artificial Neural Networks

(Artificial) Neural Networks are made up of nodes which have
➛ input edges, each with some weight
➛ output edges (with weights)
➛ an activation level (a function of the inputs)

Weights can be positive or negative and may change over time (learning).
The input function is the weighted sum of the activation levels of the inputs.
The activation level is a non-linear transfer function g of this input:

    activation_i = g(s_i) = g( Σj wij xj )

Some nodes are inputs (sensing), some are outputs (action).

McCulloch & Pitts Model of a Single Neuron

[Figure: a single neuron with inputs x1, x2, weights w1, w2 and bias weight w0 = −th, computing the sum s and outputting g(s)]

x1, x2 are inputs
w1, w2 are synaptic weights
th is a threshold
w0 = −th is a bias weight
g is the transfer function

    s = w1 x1 + w2 x2 − th = w1 x1 + w2 x2 + w0
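As a concrete illustration (a minimal sketch, not part of the original slides), the single-neuron computation above can be written in Python, assuming the step transfer function described in the next section; the helper names `g` and `neuron` are my own:

```python
# Minimal sketch of a McCulloch & Pitts style neuron: a weighted sum
# plus a bias weight w0 = -th, passed through a step transfer function.

def g(s):
    """Step transfer function: output 1 when s > 0, otherwise 0."""
    return 1 if s > 0 else 0

def neuron(x1, x2, w1, w2, w0):
    """Compute s = w1*x1 + w2*x2 + w0 and apply the transfer function g."""
    s = w1 * x1 + w2 * x2 + w0
    return g(s)

# Example: with w1 = w2 = 1.0 and w0 = -1.5 (the AND weights given later),
# the neuron fires only when both inputs are 1.
print(neuron(1, 1, 1.0, 1.0, -1.5))  # 1
print(neuron(0, 1, 1.0, 1.0, -1.5))  # 0
```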

Transfer function

Originally, a (discontinuous) step function was used for the transfer function:

    g(s) = 1, if s > 0
           0, if s < 0

Technically, this is called the step function if g(0) = 1 and the Heaviside function if g(0) = 0.5 (but we will use the two terms interchangeably).
(Later, other transfer functions were introduced, which are continuous and smooth.)

Linear Separability

Question: what kind of functions can a perceptron compute?

[Figure: two classes of points in the (x1, x2) plane separated by a straight line]

Answer: linearly separable functions
Linear Separability

Examples of linearly separable functions:

AND   w1 = w2 = 1.0,   w0 = −1.5
OR    w1 = w2 = 1.0,   w0 = −0.5
NOR   w1 = w2 = −1.0,  w0 = 0.5
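These settings can be checked by brute force over all Boolean inputs; this sketch is not part of the slides and reuses the hypothetical `neuron` helper from the earlier example:

```python
# Verify that the weight settings above implement AND, OR and NOR.
settings = {
    "AND": (1.0, 1.0, -1.5),
    "OR":  (1.0, 1.0, -0.5),
    "NOR": (-1.0, -1.0, 0.5),
}
targets = {
    "AND": lambda a, b: int(a and b),
    "OR":  lambda a, b: int(a or b),
    "NOR": lambda a, b: int(not (a or b)),
}

for name, (w1, w2, w0) in settings.items():
    for x1 in (0, 1):
        for x2 in (0, 1):
            assert neuron(x1, x2, w1, w2, w0) == targets[name](x1, x2)
print("AND, OR and NOR all reproduced correctly")
```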

Rosenblatt Perceptron

Q: How can we train it to learn a new function?

Perceptron Learning Rule

Adjust the weights as each input is presented.

Recall: s = w1 x1 + w2 x2 + w0

If g(s) = 0 but should be 1:
    wk ← wk + η xk
    w0 ← w0 + η
    so s ← s + η (1 + Σk xk²)

If g(s) = 1 but should be 0:
    wk ← wk − η xk
    w0 ← w0 − η
    so s ← s − η (1 + Σk xk²)

Otherwise, weights are unchanged. (η > 0 is called the learning rate)

Theorem: This will eventually learn to classify the data correctly, as long as they are linearly separable.
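A direct transcription of this rule into Python (a sketch, not from the slides; the function name `perceptron_update` and its argument order are my own assumptions):

```python
# One presentation of a training example (x1, x2) with target t in {0, 1}.
# Weights only change when the perceptron's output disagrees with the target.

def perceptron_update(x1, x2, t, w1, w2, w0, eta=0.1):
    """Apply the perceptron learning rule once; return the new weights."""
    s = w1 * x1 + w2 * x2 + w0
    y = 1 if s > 0 else 0                  # current output g(s)
    if y == 0 and t == 1:                   # should have fired: add eta * x
        w1, w2, w0 = w1 + eta * x1, w2 + eta * x2, w0 + eta
    elif y == 1 and t == 0:                 # should not have fired: subtract eta * x
        w1, w2, w0 = w1 - eta * x1, w2 - eta * x2, w0 - eta
    return w1, w2, w0
```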
Perceptron Learning Example

[Figure: a perceptron with inputs x1, x2, weights w1, w2 and bias weight w0, summing its inputs and outputting +/− according to whether w1 x1 + w2 x2 + w0 > 0]

Training Step 1

Begin with random weights: w1 = 0.2, w2 = 0.0, w0 = −0.1
Learning rate: η = 0.1

Current decision boundary: 0.2 x1 + 0.0 x2 − 0.1 > 0

The point (1,1) is classified as positive but should be negative, so:
    w1 ← w1 − η x1 = 0.1
    w2 ← w2 − η x2 = −0.1
    w0 ← w0 − η = −0.2

[Figure: the point (1,1) and the decision boundary in the (x1, x2) plane]
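Step 1 can be reproduced with the hypothetical `perceptron_update` sketch from the previous section; the target 0 for the point (1,1) is inferred from the fact that the slide subtracts η xk:

```python
# Reproduce Training Step 1: start from w1 = 0.2, w2 = 0.0, w0 = -0.1,
# present the point (1, 1) with target 0, learning rate 0.1.
w1, w2, w0 = perceptron_update(1, 1, 0, 0.2, 0.0, -0.1, eta=0.1)
print(w1, w2, w0)   # 0.1 -0.1 -0.2 (up to floating-point rounding)
```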

Training Step 2

Current decision boundary: 0.1 x1 − 0.1 x2 − 0.2 > 0

The next point is classified as negative but should be positive, so (by the learning rule):
    w1 ← w1 + η x1 = 0.3
    w2 ← w2 + η x2 = 0.0
    w0 ← w0 + η = −0.1

Training Step 3

Current decision boundary: 0.3 x1 + 0.0 x2 − 0.1 > 0

3rd point correctly classified, so no change.
4th point is classified as positive but should be negative, so:
    w1 ← w1 − η x1 = 0.1
    w2 ← w2 − η x2 = −0.2
    w0 ← w0 − η = −0.2

New decision boundary: 0.1 x1 − 0.2 x2 − 0.2 > 0

[Figures: data points in the (x1, x2) plane, including (2,2), (2,1) and (1.5,0.5), with the decision boundary after each step]
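The rest of the trace can be reproduced the same way. The presented points and targets are not stated explicitly on the slides; the values below ((2,1) with target 1, (1.5,0.5) with target 1, (2,2) with target 0) are inferred from the point labels in the figures and the arithmetic of the updates shown above:

```python
# Steps 2-4, continuing from w1, w2, w0 = 0.1, -0.1, -0.2 after Step 1.
w1, w2, w0 = 0.1, -0.1, -0.2
for (x1, x2), t in [((2, 1), 1), ((1.5, 0.5), 1), ((2, 2), 0)]:
    w1, w2, w0 = perceptron_update(x1, x2, t, w1, w2, w0, eta=0.1)
print(w1, w2, w0)   # approximately 0.1 -0.2 -0.2, matching 0.1 x1 - 0.2 x2 - 0.2 > 0
```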
Final Outcome

Eventually, all the data will be correctly classified (provided it is linearly separable).

[Figure: the final decision boundary separating the two classes of points in the (x1, x2) plane]

Limitations of Perceptrons

Problem: many useful functions are not linearly separable (e.g. XOR)

[Figure: (a) I1 and I2, (b) I1 or I2, (c) I1 xor I2 — AND and OR can each be separated by a single line in the (I1, I2) plane, but XOR cannot]

Possible solution:
x1 XOR x2 can be written as: (x1 AND x2) NOR (x1 NOR x2)
Recall that AND, OR and NOR can be implemented by perceptrons.
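To make this composition concrete, here is a small sketch (not from the slides) that wires together the AND and NOR perceptron weights given earlier, reusing the hypothetical `neuron` helper:

```python
# XOR(x1, x2) = NOR( AND(x1, x2), NOR(x1, x2) ), each gate a single perceptron.
def xor(x1, x2):
    and_out = neuron(x1, x2, 1.0, 1.0, -1.5)          # AND unit
    nor_out = neuron(x1, x2, -1.0, -1.0, 0.5)         # NOR unit
    return neuron(and_out, nor_out, -1.0, -1.0, 0.5)  # output NOR unit

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor(x1, x2))   # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```

This is the same wiring as the two-layer network sketched in the next section.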

Multi-Layer Neural Networks

[Figure: a two-layer network computing XOR — each input feeds an AND unit (weights +1, +1, bias −1.5) and a NOR unit (weights −1, −1, bias +0.5); the output unit is a NOR of these two hidden units (weights −1, −1, bias +0.5)]

Problem: How can we train it to learn a new function? (credit assignment)

Historical Context

In 1969, Minsky and Papert published a book highlighting the limitations of Perceptrons, and lobbied various funding agencies to redirect funding away from neural network research, preferring instead logic-based methods such as expert systems.

It was known as far back as the 1960s that any given logical function could be implemented in a 2-layer neural network with step function activations. But the question of how to learn the weights of a multi-layer neural network based on training examples remained an open problem. The solution, which we describe in the next section, was found in 1976 by Paul Werbos, but did not become widely known until it was rediscovered in 1986 by Rumelhart, Hinton and Williams.
