
Machine learning

Lecture 6

Artificial neural networks


Introduction
• An artificial neural network (ANN) is an information-processing
  system that has certain performance characteristics in common
  with biological neural networks.
• A method of computing based on the interaction of multiple
  connected processing elements.
• ANNs are mathematical models for information processing,
  based on the biological prototypes and mechanisms of human
  brain activity.
• An ANN is composed of interconnected computing units called
  neurons.
• Like humans, ANNs learn from examples.
Biological Neurons
• The human brain contains tens of billions of neurons
• Each neuron is connected to thousands of other neurons
• A neuron is made of:
  – The soma: the body of the neuron
  – Dendrites: filaments that provide input to the neuron
  – The axon: sends an output signal
  – Synapses: connections with other neurons – they release
    certain quantities of chemicals called neurotransmitters
    to other neurons

The biological neuron
• The pulses generated by the neuron travel along the axon
  as an electrical wave.
• Once these pulses reach the synapses at the end of the
  axon, chemical vesicles open up, exciting the other
  neuron.

Neural Network

[Figure: a layered network – Input Layer, Hidden 1, Hidden 2, Output Layer]


Simple Neuron

[Figure: a single neuron – inputs x1, x2, …, xn with weights w1, w2, …, wn
feed a summing junction Σ followed by a transfer function f that produces
the output]
Neural Network
Application

• Pattern recognition can be implemented using a NN.

• The figure shows a T or an H character; the network should
  identify the class (T or H) of each input.
Navigation of a car

• Done by Pomerleau. The network takes inputs from a
  34x36 video image and a 7x36 range finder. Output units
  represent "drive straight", "turn left" or "turn right". After
  about 40 training passes over 1200 road images, the car
  drove around the CMU campus at 5 km/h (using a small
  workstation on the car). This was almost twice the speed
  of any other non-NN algorithm at the time.

Automated driving at 70 mph on
a public highway

[Figure: a 30x32-pixel camera image feeds the input layer; 30x32 weights
go into each of four hidden units; 30 output units encode the steering
direction]
Neuron Model
• A neuron has more than one input x1, x2, …, xm
• Each input is associated with a weight w1, w2, …, wm
• Neurons also have a bias b as an additional input
  component. This representation of the bias is useful
  because bias terms can be interpreted as additional
  weights.
• The net input of the neuron is

  n = w1 x1 + w2 x2 + … + wm xm + b

  i.e.  n = Σi wi xi + b
Neuron output

• The neuron output is

y = f (n)
• f is called the transfer function
Transfer Function

• There are three common transfer functions:

– Hard limit transfer function

– Linear transfer function

– Sigmoid transfer function


Hard Limit Transfer
Function
It returns 0 if the input is less than 0, and 1 if the input
is greater than or equal to 0.
Linear Function

The linear transfer function gives an output equal to the
input:

y = n
Sigmoid Function

• The output of the sigmoid function lies in the range 0 to 1,
  according to the formula

  y = 1 / (1 + e^(-n))
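The three transfer functions can be sketched in Python as follows (an
illustrative sketch, not part of the slides; the function names are made up):

    import math

    def hard_limit(n):
        # 1 if the net input is >= 0, otherwise 0
        return 1 if n >= 0 else 0

    def linear(n):
        # output equals the input
        return n

    def sigmoid(n):
        # squashes the net input into the range (0, 1)
        return 1.0 / (1.0 + math.exp(-n))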
Architecture of ANN
• Feed-Forward Networks
  Allow the signals to travel one way only, from input to
  output.
• Feed-Back Networks
  The signals travel in loops through the network; the output
  is connected back to the input of the network.
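A minimal feed-forward sketch in Python (the layer sizes, weights, and
biases are assumed purely for illustration): each layer computes a weighted
sum plus bias and applies a transfer function, and signals flow only from
input to output.

    import math

    def sigmoid(n):
        return 1.0 / (1.0 + math.exp(-n))

    def layer(inputs, weights, biases):
        # one feed-forward layer: weighted sum + bias, then sigmoid
        return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
                for row, b in zip(weights, biases)]

    # assumed example: 2 inputs -> 2 hidden units -> 1 output
    x = [0.5, -1.0]
    hidden = layer(x, [[0.1, 0.4], [-0.3, 0.2]], [0.0, 0.1])
    output = layer(hidden, [[0.7, -0.6]], [0.2])
    print(output)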
Example

• The input to a single-input neuron is 2.0, its weight is 2.3,
  and the bias is –3.
• What is the output of the neuron if its transfer function is:
– Hard limit

– Linear

– sigmoid
n = Σi wi xi + b

n = 2.3*2.0 + (-3) = 1.6 ; now find y = f(n)

1. Hard limit transfer: if n >= 0 then y = 1; if n < 0 then y = 0.
   Here n = 1.6 >= 0, so y = 1.

2. Linear transfer: y = n, so y = 1.6.

3. Sigmoid transfer: y = 1 / (1 + e^(-n))

   y = 1 / (1 + e^(-1.6)) = 0.832
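A quick check of this example in Python (a sketch, not part of the
original slides):

    import math

    w, x, b = 2.3, 2.0, -3.0
    n = w * x + b                          # net input: 1.6

    hard_limit = 1 if n >= 0 else 0        # 1
    linear = n                             # 1.6
    sigmoid = 1.0 / (1.0 + math.exp(-n))   # ~0.832

    print(n, hard_limit, linear, round(sigmoid, 3))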
Learning Rule

• The learning rule modifies the weights

of the connections.

• The learning process is divided into

Supervised and Unsupervised learning


Perceptron
• It is a network of a single neuron with a hard limit
  transfer function
[Figure: a single neuron – inputs x1, x2, …, xn with weights w1, w2, …, wn
feed a summing junction Σ followed by a hard limit transfer function f
that produces the output]
Perceptron

• The perceptron is first given a randomly chosen weight
  vector.
• The perceptron is given chosen data pairs (input and
  desired output).
• The perceptron learning rule changes the weights
  according to the error in the output.
Perceptron

• The weight-adapting procedure is an iterative method and
  should reduce the error to zero.
• The output of the perceptron is
  y = f(n)
    = f(w1 x1 + w2 x2 + … + wn xn + b)
    = f(Σi wi xi + b)
Perceptron Learning
Rule
W_new = W_old + (t - o) X

Where W_new is the new weight
      W_old is the old value of the weight
      X is the input value
      t is the desired value of the output
      o is the actual value of the output
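A one-step sketch of this update rule in Python (the example values are
assumed; the learning rate is omitted here, as on this slide):

    def perceptron_update(weights, x, t, o):
        # W_new = W_old + (t - o) * X, applied element-wise
        return [w + (t - o) * xi for w, xi in zip(weights, x)]

    # hypothetical example values
    weights = [0.2, -0.4]
    x = [1.0, 0.5]
    t, o = 1, -1                 # desired vs. actual output
    print(perceptron_update(weights, x, t, o))   # -> [2.2, 0.6] (up to rounding)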


Example Perceptron
• Categorisation of 2x2 pixel black & white
images
– Into “bright” and “dark”
• Representation of this rule:
– If it contains 2, 3 or 4 white pixels, it is
“bright”
– If it contains 0 or 1 white pixels, it is “dark”
• Perceptron architecture:
– Four input units, one for each pixel
– One output unit: +1 for "bright", -1 for "dark"
Example Perceptron

• Example calculation: x1=-1, x2=1, x3=1, x4=-1


– S = 0.25*(-1) + 0.25*(1) + 0.25*(1) + 0.25*(-1) = 0
• S = 0 > -0.1 (the threshold), so the output from the ANN is +1
– So the image is categorised as “bright”
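This calculation can be reproduced in a few lines of Python (a sketch;
the 0.25 weights and the -0.1 threshold are the values from the slide):

    weights = [0.25, 0.25, 0.25, 0.25]
    x = [-1, 1, 1, -1]            # the 2x2 image encoded as +/-1 pixel values
    threshold = -0.1

    s = sum(w * xi for w, xi in zip(weights, x))   # 0.0
    output = 1 if s > threshold else -1            # +1, i.e. "bright"
    print(s, output)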
Learning in Perceptrons
• Need to learn
– Both the weights between input and output
units
– And the value for the threshold
• Make calculations easier by
– Thinking of the threshold as a weight from a
special input unit where the output from the
unit is always 1
• Exactly the same result
– But we only have to worry about learning
weights
New Representation
for Perceptrons

[Figure: a special input unit that always produces 1 carries the threshold
as an ordinary weight; the threshold function is redrawn accordingly]
Learning Algorithm
• Weights are set randomly initially
• For each training example E
– Calculate the observed output from the ANN,
o(E)
– If the target output t(E) is different to o(E)
• Then tweak all the weights so that o(E) gets closer
to t(E)
• Tweaking is done by perceptron training rule (next
slide)
• This routine is done for every example E
• Don’t necessarily stop when all examples
used
– Repeat the cycle again (an ‘epoch’)
– Until the ANN produces the correct output
• For all the examples in the training set (or good
enough)
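Putting the algorithm together, a compact training-loop sketch in Python
(illustrative only; the data format, learning rate, and stopping condition
are assumptions, and weights are initialised to 0 here rather than randomly):

    def train_perceptron(examples, n_inputs, eta=0.1, max_epochs=100):
        # examples: list of (inputs, target) pairs with targets in {-1, +1}
        weights = [0.0] * (n_inputs + 1)        # weight 0 is for the always-1 input
        for _ in range(max_epochs):             # one pass over the examples = one epoch
            all_correct = True
            for x, t in examples:
                xs = [1.0] + list(x)            # prepend the special input unit
                s = sum(w * xi for w, xi in zip(weights, xs))
                o = 1 if s > 0 else -1          # observed output o(E)
                if o != t:                      # tweak weights so o(E) moves toward t(E)
                    weights = [w + eta * (t - o) * xi
                               for w, xi in zip(weights, xs)]
                    all_correct = False
            if all_correct:                     # stop once an epoch is error-free
                break
        return weights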
Perceptron Training Rule
• When t(E) is different to o(E)
– Add on Δi to weight wi
– Where Δi = η(t(E)-o(E))xi
– Do this for every weight in the network
• Interpretation:
– (t(E) – o(E)) will be either +2 or –2 (t(E) and o(E) have
  opposite signs, since we only update when they differ)
– So we can think of the addition of Δi as moving the
  weight in a direction
• Which will improve the network's performance with
  respect to E
– Multiplication by xi
• Moves the weight more if the input is bigger
The Learning Rate
• η is called the learning rate
– Usually set to something small (e.g., 0.1)
• To control the movement of the weights
– Not to move too far for one example
– Which may over-compensate for another
example
• If a large movement is actually necessary
for the weights to correctly categorise E
– This will occur over time with multiple epochs
Worked Example

• Return to the “bright” and “dark” example


• Use a learning rate of η = 0.1
• Suppose we have set random weights: w0 = -0.5, w1 = 0.7,
  w2 = -0.2, w3 = 0.1, w4 = 0.9 (used in the calculation below)
Worked Example
• Use this training example, E, to update weights:

• Here, x1 = -1, x2 = 1, x3 = 1, x4 = -1 as before


• Propagate this information through the network:
– S = (-0.5 * 1) + (0.7 * -1) + (-0.2 * +1) + (0.1 * +1) + (0.9 * -1) = -2.2

• Hence the network outputs o(E) = -1


• But this should have been “bright”=+1
– So t(E) = +1
Calculating the Error
Values
• Δ0 = η(t(E)-o(E))x0
= 0.1 * (1 - (-1)) * (1) = 0.1 * (2) = 0.2
• Δ1 = η(t(E)-o(E))x1
= 0.1 * (1 - (-1)) * (-1) = 0.1 * (-2) = -0.2
• Δ2 = η(t(E)-o(E))x2
= 0.1 * (1 - (-1)) * (1) = 0.1 * (2) = 0.2
• Δ3 = η(t(E)-o(E))x3
= 0.1 * (1 - (-1)) * (1) = 0.1 * (2) = 0.2
• Δ4 = η(t(E)-o(E))x4
= 0.1 * (1 - (-1)) * (-1) = 0.1 * (-2) = -0.2
Calculating the New
Weights
• w’0 = -0.5 + Δ0 = -0.5 + 0.2 = -0.3

• w’1 = 0.7 + Δ1 = 0.7 + -0.2 = 0.5

• w’2 = -0.2 + Δ2 = -0.2 + 0.2 = 0

• w’3= 0.1 + Δ3 = 0.1 + 0.2 = 0.3

• w’4 = 0.9 + Δ4 = 0.9 - 0.2 = 0.7


New Look Perceptron

• Calculate for the example, E, again:


– S = (-0.3 * 1) + (0.5 * -1) + (0 * +1) + (0.3 * +1) + (0.7 * -1) = -1.2

• Still gets the wrong categorisation


– But the value is closer to zero (from -2.2 to -1.2)
– In a few epochs time, this example will be correctly
categorised
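The whole worked update can be checked with a short Python snippet
(a sketch; the weights and inputs are those from the slides):

    eta = 0.1
    weights = [-0.5, 0.7, -0.2, 0.1, 0.9]   # w0 is the weight on the always-1 input
    x = [1, -1, 1, 1, -1]                   # x0 = 1 (special unit), then x1..x4
    t = 1                                   # target: "bright"

    s = sum(w * xi for w, xi in zip(weights, x))
    o = 1 if s > 0 else -1                  # o(E) = -1, since s = -2.2
    weights = [w + eta * (t - o) * xi for w, xi in zip(weights, x)]
    print([round(w, 2) for w in weights])   # [-0.3, 0.5, 0.0, 0.3, 0.7]

    s_new = sum(w * xi for w, xi in zip(weights, x))
    print(round(s_new, 2))                  # -1.2: still wrong, but closer to zero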
Boolean Functions
• Take in two inputs (-1 or +1)
• Produce one output (-1 or +1)
• In other contexts, use 0 and 1
• Example: AND function
– Produces +1 only if both inputs are +1
• Example: OR function
– Produces +1 if either input is +1
• Related to the logical connectives
from F.O.L.
Boolean Functions as
Perceptrons

• Problem: XOR boolean function


– Produces +1 only if inputs are different
– Cannot be represented as a perceptron
– Because it is not linearly separable
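As an illustration (not from the slides), a perceptron with hand-picked
weights can compute AND over ±1 inputs, while no choice of weights and
threshold works for XOR; the weight values below are assumed for this
example:

    def perceptron(x1, x2, w1, w2, bias):
        s = w1 * x1 + w2 * x2 + bias
        return 1 if s > 0 else -1

    # AND over {-1, +1}: outputs +1 only when both inputs are +1
    for x1 in (-1, 1):
        for x2 in (-1, 1):
            print(x1, x2, perceptron(x1, x2, w1=1.0, w2=1.0, bias=-1.5))

    # XOR (+1 only when the inputs differ) has no such (w1, w2, bias),
    # because it is not linearly separable.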
Linearly Separable
Boolean Functions

• Linearly separable:
– Can use a straight line to separate the +1 and –1 cases
• Think of the line as representing the threshold
– Angle of line determined by two weights in perceptron
– Y-axis crossing determined by threshold
Linearly Separable
Functions

• Result extends to functions taking many inputs


– And outputting +1 and –1
• Also extends to higher dimensions for outputs
Exercises
• Design a neural network to classify the following patterns:
• X1=[2 2] , t1=0
• X2=[1 -2], t2=1
• X3=[-2 2], t3=0
• X4=[-1 1], t4=1
Start with initial weights w=[0 0] and
bias =0
Exercises
• Four one-dimensional data points belonging to two classes are:
X = [1 -0.5 3 -2]
T = [1 -1 1 -1]
W = -2.5, b= 1.75
