Networks of Artificial Neurons, Single Layer Perceptrons
[Figure: two McCulloch-Pitts neurons, neuron i and neuron j, joined by synapse ij; neuron i receives inputs $I_1, \dots, I_n$, sums them against its threshold, and passes its output $O_i$ along the weighted connection to neuron j]

$$I_{ki} = O_k \cdot w_{ki} \qquad O_i = \mathrm{sgn}\Big(\sum_{k=1}^{n} I_{ki} - \theta_i\Big) \qquad I_{ij} = O_i \cdot w_{ij}$$
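As a concrete illustration, here is a minimal sketch of a single McCulloch-Pitts unit in Python. The function names and the use of NumPy are my own choices, and sgn is taken as the binary step these notes use (output 1 when the net input reaches the threshold, else 0):

```python
import numpy as np

def sgn(x):
    # Binary threshold used in these notes: 1 if x >= 0, else 0.
    return np.where(x >= 0, 1, 0)

def mp_neuron(outputs_k, weights_ki, theta_i):
    # O_i = sgn( sum_k O_k * w_ki - theta_i )
    incoming = np.dot(outputs_k, weights_ki)   # each I_ki = O_k * w_ki
    return sgn(incoming - theta_i)
```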
The Perceptron
We can connect any number of McCulloch-Pitts neurons together in any way
we like.
An arrangement of one input layer of McCulloch-Pitts neurons feeding forward to
one output layer of McCulloch-Pitts neurons is known as a Perceptron.

$$O_j = \mathrm{sgn}\Big(\sum_{i=1}^{n} O_i\, w_{ij} - \theta_j\Big)$$

[Figure: a Perceptron with n input units 1, 2, ..., n fully connected to m output units 1, 2, ..., m]
Already this is a powerful
computational device. Later we shall see variations that make it even more
powerful.
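A whole Perceptron layer is then just a weight matrix applied to the input activations; a minimal sketch (the names `perceptron` and `W` are mine):

```python
import numpy as np

def perceptron(inputs, W, thetas):
    """One layer of M-P neurons: O_j = sgn(sum_i O_i * w_ij - theta_j).

    inputs: activations of the n input units, shape (n,)
    W:      connection weights, shape (n, m) for m output units
    thetas: one threshold per output unit, shape (m,)
    """
    return np.where(inputs @ W - thetas >= 0, 1, 0)
```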
Implementing Logic Gates with M-P Neurons
In each case, we have inputs $I_i$ and an output $O$, and need to determine the weights
and thresholds. It is easy to find solutions by inspection:
NOT:
  I   O
  0   1
  1   0
  (weight $-1$, threshold $-0.5$)

AND:
  I1  I2  O
  0   0   0
  0   1   0
  1   0   0
  1   1   1
  (weights $1, 1$, threshold $1.5$)

OR:
  I1  I2  O
  0   0   0
  0   1   1
  1   0   1
  1   1   1
  (weights $1, 1$, threshold $0.5$)
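These solutions are easy to check mechanically over all input combinations; a small self-contained sketch (the helper name `gate` is mine, the weights and thresholds are those in the tables):

```python
def gate(inputs, weights, theta):
    # O = 1 if sum_i I_i * w_i >= theta, else 0
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= theta else 0

# AND (w = 1, 1, theta = 1.5) and OR (w = 1, 1, theta = 0.5):
for i1 in (0, 1):
    for i2 in (0, 1):
        print(i1, i2,
              "AND:", gate([i1, i2], [1, 1], 1.5),
              "OR:",  gate([i1, i2], [1, 1], 0.5))
# NOT (w = -1, theta = -0.5):
print("NOT 0:", gate([0], [-1], -0.5), " NOT 1:", gate([1], [-1], -0.5))
```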
The Need to Find Weights Analytically
Constructing simple networks by hand is one thing. But what about harder
problems? For example, what about:
[Figure: the OR truth table alongside a two-input network whose two weights and threshold are all marked "?"]

To answer this systematically, consider the AND network. We have two weights $w_1$ and $w_2$ and the threshold $\theta$, and for each training pattern the network must satisfy $O = \mathrm{sgn}(w_1 I_1 + w_2 I_2 - \theta)$. The four training patterns therefore give four inequalities:

  I1  I2  O
  0   0   0    $-\theta < 0$               ⇒  $\theta > 0$
  0   1   0    $w_2 - \theta < 0$          ⇒  $w_2 < \theta$
  1   0   0    $w_1 - \theta < 0$          ⇒  $w_1 < \theta$
  1   1   1    $w_1 + w_2 - \theta \ge 0$  ⇒  $w_1 + w_2 \ge \theta$
It is easy to see that there are an infinite number of solutions (e.g. $w_1 = w_2 = 1$, $\theta = 1.5$, or $w_1 = w_2 = 2$, $\theta = 3$). Similarly, there are an
infinite number of solutions for the NOT and OR networks.
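One quick way to convince yourself is to test a few candidate $(w_1, w_2, \theta)$ settings against all four AND patterns; a throwaway sketch (the particular values tried are arbitrary choices of mine):

```python
def implements_and(w1, w2, theta):
    # True if sgn(w1*I1 + w2*I2 - theta) reproduces AND on all four patterns.
    patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    return all((1 if w1*i1 + w2*i2 - theta >= 0 else 0) == o
               for (i1, i2), o in patterns)

# Many distinct parameter settings all work:
print(implements_and(1, 1, 1.5))      # True
print(implements_and(2, 2, 3.0))      # True
print(implements_and(0.4, 0.4, 0.5))  # True
```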
Limitations of Simple Perceptrons
We can follow the same procedure for the XOR network:
  I1  I2  O
  0   0   0    $-\theta < 0$               ⇒  $\theta > 0$
  0   1   1    $w_2 - \theta \ge 0$        ⇒  $w_2 \ge \theta$
  1   0   1    $w_1 - \theta \ge 0$        ⇒  $w_1 \ge \theta$
  1   1   0    $w_1 + w_2 - \theta < 0$    ⇒  $w_1 + w_2 < \theta$
Clearly, the second and third inequalities are incompatible with the fourth: together
with the first they give $w_1 + w_2 \ge 2\theta > \theta$, so there is in fact no solution. We need
more complex networks, e.g. ones that combine together many simple networks, or
that use different activation/thresholding/transfer functions.
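As a sanity check, here is a brute-force sketch that scans a grid of $(w_1, w_2, \theta)$ values and finds no single-layer solution for XOR. The grid range and resolution are arbitrary choices of mine; the impossibility itself holds for all real-valued weights, as the inequalities show:

```python
import itertools
import numpy as np

def implements(w1, w2, theta, truth_table):
    return all((1 if w1*i1 + w2*i2 - theta >= 0 else 0) == o
               for (i1, i2), o in truth_table)

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
grid = np.linspace(-2, 2, 41)  # weights and thresholds from -2 to 2 in steps of 0.1

solutions = [(w1, w2, t) for w1, w2, t in itertools.product(grid, grid, grid)
             if implements(w1, w2, t, XOR)]
print(len(solutions))  # 0: no grid point implements XOR
```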
It then becomes much more difficult to determine all the weights and thresholds
by hand. Next lecture we shall see how a neural network can learn these
parameters.
First, we need to consider what these more complex networks might involve.
ANN Architectures/Structures/Topologies
Mathematically, ANNs can be represented as weighted directed graphs. For our
purposes, we can simply think in terms of activation flowing between processing
units via one-way connections. Three common ANN architectures are:
Single-layer Feed-Forward NNs: One input layer and one output layer of
processing units. No feedback connections. (For example, a simple Perceptron.)
Multi-layer Feed-Forward NNs: One input layer, one output layer, and one or
more hidden layers of processing units. No feedback connections. The hidden
layers sit in between the input and output layers, and thus are hidden from the
outside world. (For example, a Multi-Layer Perceptron.)
Recurrent NNs: Any network with at least one feedback connection. It may, or
may not, have hidden units. (For example, a Simple Recurrent Network.)
Further interesting variations include: short-cut connections, partial connectivity,
time-delayed connections, Elman networks, Jordan networks, moving windows, …
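To make the three basic architectures above concrete, here is a schematic sketch of their forward passes. All function and parameter names are mine, thresholds are omitted for brevity, and the step activation is just one possible choice:

```python
import numpy as np

def step(x):
    return np.where(x >= 0, 1, 0)

def single_layer(x, W):
    # Single-layer feed-forward NN: input layer -> output layer, no feedback.
    return step(x @ W)

def multi_layer(x, W_hid, W_out):
    # Multi-layer feed-forward NN: a hidden layer sits between input and output.
    return step(step(x @ W_hid) @ W_out)

def recurrent_step(x, h_prev, W_in, W_rec):
    # Recurrent NN: the previous hidden state feeds back into the network.
    return step(x @ W_in + h_prev @ W_rec)
```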
Examples of Network Architectures
[Figures: a Single-Layer Perceptron, a Multi-Layer Perceptron, and a Simple Recurrent Network]
Other Types of Activation/Transfer Function
Sigmoid Functions: These are smooth (differentiable) and monotonically
increasing.
Logistic sigmoid: $f(x) = \dfrac{1}{1 + e^{-x}}$

Hyperbolic tangent: $f(x) = \tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
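Both are one-liners with NumPy; a minimal sketch (function names are mine):

```python
import numpy as np

def logistic(x):
    # Standard sigmoid: smooth, monotonically increasing, output in (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: smooth, monotonically increasing, output in (-1, 1).
    return np.tanh(x)
```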
The Threshold as a Special Kind of Weight
It would simplify the mathematics if we could treat the neuron threshold as if it
were just another connection weight. The crucial thing we need to compute for
each unit $j$ is:

$$\sum_{i=1}^{n} O_i\, w_{ij} - \theta_j = O_1 w_{1j} + O_2 w_{2j} + \dots + O_n w_{nj} - \theta_j$$

It is easy to see that if we define $w_{0j} = -\theta_j$ and $O_0 = 1$, then this becomes

$$\sum_{i=1}^{n} O_i\, w_{ij} - \theta_j = O_0 w_{0j} + O_1 w_{1j} + \dots + O_n w_{nj} = \sum_{i=0}^{n} O_i\, w_{ij}$$

This simplifies the basic Perceptron equation so that:

$$O_j = \mathrm{sgn}\Big(\sum_{i=1}^{n} O_i\, w_{ij} - \theta_j\Big) = \mathrm{sgn}\Big(\sum_{i=0}^{n} O_i\, w_{ij}\Big)$$

We just have to include an extra input unit with activation $O_0 = 1$, and then we
only need to compute "weights", and no explicit thresholds.
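In code, the trick amounts to prepending a constant 1 to the input vector and a row of $-\theta_j$ values to the weight matrix; a minimal sketch, with all names my own:

```python
import numpy as np

def perceptron_no_thresholds(inputs, W_aug):
    """Perceptron with the thresholds folded into the weights.

    inputs: shape (n,); W_aug: shape (n+1, m), whose first row holds -theta_j.
    """
    augmented = np.concatenate(([1.0], inputs))  # extra input unit with O_0 = 1
    return np.where(augmented @ W_aug >= 0, 1, 0)
```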
Example: A Classification Task
A typical neural network application is classification. Consider the simple example
of classifying aeroplanes given their masses and speeds.
How do we construct a neural network that can classify any Bomber and Fighter?
General Procedure for Building Neural Networks
Formulating NN solutions for particular problems is a multi-stage process:
1. Understand and specify your problem in terms of inputs and required outputs,
e.g. for classification the outputs are the classes usually represented as binary
vectors.
2. Take the simplest form of network you think might be able to solve your
problem, e.g. a simple Perceptron.
3. Try to find appropriate connection weights (including neuron thresholds) so
that the network produces the right outputs for each input in its training data.
4. Make sure that the network works on its training data, and test its generalization
by checking its performance on new testing data.
5. If the network doesn't perform well enough, go back to stage 3 and try harder.
6. If the network still doesn't perform well enough, go back to stage 2 and try harder.
7. If the network still doesn't perform well enough, go back to stage 1 and try harder.
8. Problem solved – move on to next problem.
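The weight-finding step (stage 3) is the subject of the next lecture, but the checking in stage 4 is easy to sketch now. Everything here, including the generic `network` signature, is an illustrative assumption of mine:

```python
import numpy as np

def accuracy(network, weights, inputs, targets):
    # Stage 4: fraction of patterns for which the network gives the right output.
    return np.mean(network(inputs, weights) == targets)

# Apply this twice: once on the training data, then on new testing data to
# measure generalization. If either score is too low, return to stage 3 (or 2, or 1).
```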
Building a Neural Network for our Example
For our aeroplane classifier example, our inputs can be direct encodings of the
masses and speeds. Generally we would have one output unit for each class,
with activation 1 for 'yes' and 0 for 'no'. With just two classes here, we can have
just one output unit, with activation 1 for 'Fighter' and 0 for 'Bomber' (or vice
versa). The simplest network to try first is a simple Perceptron. We can further
simplify matters by replacing the threshold by an extra weight as discussed
above. This gives us:

$$Class = \mathrm{sgn}(w_0 + w_1 \cdot Mass + w_2 \cdot Speed)$$

[Figure: a single output unit computing Class, with weights $w_0$, $w_1$, $w_2$ from inputs 1, Mass, Speed]
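A minimal sketch of this classifier in Python. The weight values below are placeholders of mine, not values from the notes; finding good ones is exactly the stage-3 problem:

```python
def classify(mass, speed, w0, w1, w2):
    # Class = sgn(w0 + w1*Mass + w2*Speed): 1 for 'Fighter', 0 for 'Bomber'.
    return 1 if w0 + w1 * mass + w2 * speed >= 0 else 0

# Hypothetical weights under which fast, light planes come out as Fighters.
w0, w1, w2 = -1.0, -0.5, 1.0
print(classify(mass=0.5, speed=3.0, w0=w0, w1=w1, w2=w2))  # 1 -> 'Fighter'
print(classify(mass=3.0, speed=0.5, w0=w0, w1=w1, w2=w2))  # 0 -> 'Bomber'
```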
That's stages 1 and 2 done. Next lecture we begin a systematic look at how to
proceed with stage 3, first for the Perceptron, and then for more complex types
of networks.
Conclusion
We have seen how McCulloch-Pitts neurons can be connected into Perceptrons, how
the basic logic gates can be implemented by choosing weights and thresholds by
inspection, and how the XOR inequalities show that some problems have no
single-layer solution. We then surveyed the common network architectures and
sigmoid activation functions, saw that thresholds can be treated as just another
weight, and outlined a general multi-stage procedure for building neural networks,
illustrated by the aeroplane classification task.