
NN Models & Architecture of NN

CSE-4619
Machine Learning

~ from the lecture of Noureddin Sadawi


McCulloch-Pitts Model
In 1943, the neurophysiologist Warren McCulloch and the logician Walter
Pitts published the first paper describing what we would now call a neural
network. Their "neurons" operated under the following assumptions:
➔ They are binary devices (Vi ∈ {0, 1}).
➔ Each neuron has a fixed threshold, θ.
➔ The neuron receives inputs from excitatory synapses, all having
identical weights.
➔ Inhibitory inputs have an absolute veto power over any excitatory
inputs.
➔ At each time step the neurons are simultaneously (synchronously)
updated by summing the weighted excitatory inputs and setting the
output (Vi) to 1 iff the sum is greater than or equal to the threshold
AND the neuron receives no inhibitory input (see the sketch below).
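As a concrete illustration, here is a minimal sketch of such a neuron in
Python; the function name and argument layout are our own choices, not the
original paper's notation:

def mp_neuron(excitatory, inhibitory, theta):
    """McCulloch-Pitts unit: binary inputs, fixed threshold theta.
    Fires (returns 1) iff the summed excitatory inputs reach the
    threshold AND no inhibitory input is active (absolute veto)."""
    if any(inhibitory):                       # inhibitory veto
        return 0
    return 1 if sum(excitatory) >= theta else 0

# Example: a 2-input AND unit (theta = 2, no inhibitory inputs)
assert mp_neuron([1, 1], [], theta=2) == 1
assert mp_neuron([1, 0], [], theta=2) == 0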
McCulloch-Pitts Model

XOR cannot be represented with a single neuron, but the relationship

XOR = (V1 OR V2) AND NOT (V1 AND V2)

suggests that it can be represented with the network below:

[Figure: a small two-layer network of McCulloch-Pitts units computing XOR
from OR and AND sub-units.]
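As a quick check, here is a sketch of this construction in Python, reusing
the mp_neuron function from the sketch above; the network layout is the
standard OR/AND decomposition just stated:

# XOR = (V1 OR V2) AND NOT (V1 AND V2), assembled from three units
def mp_xor(v1, v2):
    v_or  = mp_neuron([v1, v2], [], theta=1)  # OR unit
    v_and = mp_neuron([v1, v2], [], theta=2)  # AND unit
    # Output unit: the AND result arrives on an inhibitory (veto) synapse
    return mp_neuron([v_or], [v_and], theta=1)

assert [mp_xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]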
The Perceptron vs the McCulloch-Pitts Model
The next major advance was the perceptron, introduced by Frank
Rosenblatt in his 1958 paper. The perceptron differed from the
McCulloch-Pitts neuron in the following ways:
➔ The weights and thresholds were not all identical.
➔ Weights can be positive or negative.
➔ There is no absolute inhibitory synapse.
➔ Although the neurons were still two-state, the output function f(u)
takes values in {-1, 1} rather than {0, 1}. (This is no big deal, as a
suitable change in the threshold lets you transform from one convention
to the other.)
➔ Most importantly, there was a learning rule (sketched below).
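A minimal sketch of that learning rule in Python, assuming the {-1, +1}
output convention with the threshold folded into a bias weight (the function
name and hyperparameters are our own choices):

import numpy as np

def perceptron_train(X, y, eta=0.1, epochs=100):
    """Rosenblatt's learning rule: nudge the weights whenever a training
    pattern is misclassified. X: inputs (n_samples, n_features),
    y: targets in {-1, +1}. A bias weight w[0] replaces the threshold."""
    w = np.zeros(X.shape[1] + 1)
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            pred = 1 if w[0] + xi @ w[1:] >= 0 else -1
            if pred != target:               # update only on mistakes
                w[0]  += eta * target        # bias update
                w[1:] += eta * target * xi
                errors += 1
        if errors == 0:                      # all patterns classified correctly
            break
    return w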
Decision Boundaries for AND and OR

We can now plot the decision boundaries of our logic gates


AND: w1 = 1, w2 = 1, θ = 1.5

I1  I2  out
0   0   0
0   1   0
1   0   0
1   1   1

OR: w1 = 1, w2 = 1, θ = 0.5

I1  I2  out
0   0   0
0   1   1
1   0   1
1   1   1

[Figure: the four input points (0, 0), (0, 1), (1, 0), (1, 1) plotted in
the I1-I2 plane; for each gate a single straight line separates the inputs
with output 0 from those with output 1.]
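These weights can be checked directly; a minimal sketch in Python reproduces
both truth tables:

# One threshold unit with w1 = w2 = 1; only theta differs between the gates
def gate(i1, i2, theta):
    return 1 if 1 * i1 + 1 * i2 >= theta else 0

for i1, i2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(i1, i2, gate(i1, i2, 1.5), gate(i1, i2, 0.5))  # inputs, AND, OR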
Decision Boundary for XOR

The difficulty in dealing with XOR is rather obvious. We need two straight
lines to separate the different outputs/decisions:
XOR

I1  I2  out
0   0   0
0   1   1
1   0   1
1   1   0

[Figure: the four input points in the I1-I2 plane; XOR's 0 and 1 outputs
cannot be separated by any single straight line.]

Solution: either change the transfer function so that it has more than one
decision boundary, or use a more complex network that is able to generate
more complex decision boundaries.
ANN Architectures
Mathematically, ANNs can be represented as weighted directed graphs. The
most common ANN architectures are:

Single-Layer Feed-Forward NNs: One input layer and one output layer of
processing units. No feedback connections (e.g. a Perceptron)

Multi-Layer Feed-Forward NNs: One input layer, one output layer, and one or
more hidden layers of processing units. No feedback connections (e.g. a
Multi-Layer Perceptron)

Recurrent NNs: Any network with at least one feedback connection. It may, or
may not, have hidden units

Further interesting variations include: sparse connections, time-delayed
connections, moving windows, …
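To make the distinction concrete, here is a minimal NumPy sketch of the two
feed-forward variants; the layer sizes, random weights, and activation
choices are arbitrary assumptions for illustration:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)                  # one input pattern with 3 features

# Single-layer feed-forward: input units connect straight to output units
W = rng.normal(size=(2, 3))             # 2 output units, 3 inputs
y_single = np.where(W @ x >= 0, 1, 0)   # threshold outputs, no feedback

# Multi-layer feed-forward: one hidden layer in between, still no feedback
W_hidden = rng.normal(size=(4, 3))      # 4 hidden units
W_output = rng.normal(size=(2, 4))
h = np.tanh(W_hidden @ x)               # hidden activations
y_multi = np.where(W_output @ h >= 0, 1, 0)

# A recurrent network would additionally feed some outputs back as inputs
# on the next time step (not shown here).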

Examples of Network Architectures

[Figure: three example architectures, from left to right: a single-layer
feed-forward network, a multi-layer feed-forward network, and a recurrent
network.]

Example: A Classification Task

A typical neural network application is classification. Consider the simple
example of classifying trucks given their masses and lengths:

Mass  Length  Class
10.0  6       Lorry
20.0  5       Lorry
5.0   4       Van
2.0   5       Van
2.0   5       Van
3.0   6       Lorry
10.0  7       Lorry
15.0  8       Lorry
5.0   9       Lorry
How do we construct a neural network that can correctly classify any lorry
or van?
Cookbook Recipe for Building Neural Networks
Formulating neural network solutions for particular problems is a multi-stage
process:
1. Understand and specify the problem in terms of inputs and required
outputs
2. Take the simplest form of network you think might be able to solve your
problem
3. Try to find the appropriate connection weights (including neuron
thresholds) so that the network produces the right outputs for each input
in its training data
4. Make sure that the network works on its training data and test its
generalization by checking its performance on new testing data
5. If the network doesn’t perform well enough, go back to stage 3 and try
harder
6. If the network still doesn’t perform well enough, go back to stage 2 and
try harder
7. If the network still doesn’t perform well enough, go back to stage 1 and
try harder
8. Problem solved – or not

Building a Neural Network (stages 1 & 2)

For our truck example, our inputs can be direct encodings of the masses and
lengths. Generally we would have one output unit for each class, with
activation 1 for ‘yes’ and 0 for ‘no’. In our two-class example, a single
output unit suffices: activation 1 corresponds to ‘lorry’ and 0 to ‘van’
(or vice versa). The simplest network we should try first is the
single-layer Perceptron. We can further simplify things by replacing the
threshold with an extra weight w0 on a constant input of 1, as we discussed
before. This gives us:

Class = sgn(w0 + w1·Mass + w2·Length)

[Figure: a single-layer Perceptron with a constant input 1 and inputs Mass
and Length connected to the output unit Class by weights w0, w1, and w2.]
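As a sketch, here is this decision function in Python with one hand-picked
set of weights; these particular values (w0 = -14, w1 = 1, w2 = 2) are our
own illustration, not from the lecture, but they do separate the nine
training examples above:

def classify(mass, length, w0=-14.0, w1=1.0, w2=2.0):
    """Class = sgn(w0 + w1*Mass + w2*Length); +1 -> Lorry, -1 -> Van."""
    return 1 if w0 + w1 * mass + w2 * length >= 0 else -1

assert classify(20.0, 5) == 1    # Lorry: -14 + 20 + 10 = 16 >= 0
assert classify(2.0, 5) == -1    # Van:   -14 +  2 + 10 = -2 <  0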

Training the Neural Network (stage 3)
Whether our neural network is a simple Perceptron or a much more
complicated multi-layer network, we need to develop a systematic procedure
for determining appropriate connection weights.

The common procedure is to have the network learn the appropriate weights
from a representative set of training data.

For classification, a simple Perceptron uses decision boundaries (lines or
hyperplanes), which it shifts around until each training pattern is
correctly classified.

The process of “shifting around” in a systematic way is called learning.

The learning process can then be divided into a number of small steps, as
the sketch below illustrates.
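A minimal sketch of this learning process applied to the truck data, reusing
the perceptron_train function from the earlier learning-rule sketch (the
learning rate and epoch budget are arbitrary assumptions):

import numpy as np

# Truck training data: (Mass, Length) -> +1 for Lorry, -1 for Van
X = np.array([[10.0, 6], [20.0, 5], [5.0, 4], [2.0, 5], [2.0, 5],
              [3.0, 6], [10.0, 7], [15.0, 8], [5.0, 9]])
y = np.array([1, 1, -1, -1, -1, 1, 1, 1, 1])

w = perceptron_train(X, y, eta=0.1, epochs=10_000)

# Because this data is linearly separable (e.g. Mass + 2*Length - 14
# splits it), the perceptron rule should eventually classify every
# training pattern correctly; increase epochs if it has not converged.
preds = np.where(w[0] + X @ w[1:] >= 0, 1, -1)
print((preds == y).mean())   # expect 1.0 once training has converged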

