
Fundamentals of Artificial Neural Networks

Fakhri Karray
University of Waterloo

Outline

Introduction
A Brief History

Features of ANNs
Neural Network Topologies
Activation Functions
Learning Paradigms

Fundamentals of ANNs
McCulloch-Pitts Model
Perceptron
Adaline (Adaptive Linear Neuron)

Madaline
Case Study: Binary Classification Using Perceptron

Introduction

Artificial Neural Networks (ANNs) are physical cellular systems which
can acquire, store and utilize experiential knowledge.

ANNs are a set of parallel and distributed computational elements
classified according to their topologies, learning paradigms and the way
information flows within the network.
ANNs are generally characterized by their:
Architecture
Learning paradigm
Activation functions

Typical Representation of a Feedforward ANN

Interconnections Between Neurons

A Brief History

ANNs were originally designed in the early forties for pattern
classification purposes.
⇒ They have evolved considerably since then.

ANNs are now used in almost every discipline of science and technology:

from stock market prediction to the design of space station frames,
from medical diagnosis to data mining and knowledge discovery,
from chaos prediction to the control of nuclear plants.

Features of ANNs

ANNs are classified according to the following:

Architecture    Activation Functions    Learning Paradigms
Feedforward     Binary                  Supervised
Recurrent       Continuous              Unsupervised
                                        Hybrid

Neural Network Topologies

Feedforward Flow of Information

Neural Network Topologies (cont.)

Recurrent Flow of Information

Binary Activation Functions

Step Function

  step(x) = 1 if x > 0, 0 otherwise

Signum Function

  signum(x) = 1 if x > 0; 0 if x = 0; −1 otherwise

[Plots of step(x) and signum(x) over the range −2 ≤ x ≤ 2.]

Differentiable Activation Functions

Differentiable functions

Sigmoid function

  sigmoid(x) = 1 / (1 + e^(−x))

Hyperbolic tangent

  tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))

[Plots of sigmoid(x) and tanh(x) over the range −2 ≤ x ≤ 2.]

Differentiable Activation Functions (cont.)

Differentiable functions

Sigmoid derivative

  sigderiv(x) = e^(−x) / (1 + e^(−x))^2

Linear function

  lin(x) = x

[Plots of sigderiv(x) and lin(x) over the range −2 ≤ x ≤ 2.]
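
As a quick illustration (not part of the original slides), here is a minimal NumPy sketch of the activation functions listed above:

```python
import numpy as np

def step(x):
    """Step function: 1 if x > 0, 0 otherwise."""
    return np.where(x > 0, 1.0, 0.0)

def signum(x):
    """Signum function: 1 if x > 0, 0 if x = 0, -1 otherwise."""
    return np.sign(x)

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Sigmoid derivative: exp(-x) / (1 + exp(-x))^2 = s(x) (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def hyperbolic_tangent(x):
    """Hyperbolic tangent."""
    return np.tanh(x)

def linear(x):
    """Identity (linear) activation."""
    return x
```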

Learning Paradigms

Supervised Learning
Multilayer perceptrons
Radial basis function networks
Modular neural networks
LVQ (learning vector quantization)

Unsupervised Learning
Competitive learning networks
Kohonen self-organizing networks
ART (adaptive resonance theory)

Others
Autoassociative memories (Hopfield networks)

Supervised Learning
Training by example; i.e., a priori known desired output for each input
pattern.

Particularly useful for feedforward networks.

Supervised Learning (cont.)

Training Algorithm

1 Compute the error between the desired and actual outputs

2 Use the error through a learning rule (e.g., gradient descent) to adjust the
network's connection weights

3 Repeat steps 1 and 2 for all input/output patterns to complete one epoch

4 Repeat steps 1 to 3 until the maximum number of epochs is reached or an
acceptable training error is reached

Unsupervised Learning

No a priori known desired output.

In other words, the training data are composed of input patterns only.

The network uses the training patterns to discover emerging collective properties
and organizes the data into clusters.

Unsupervised Learning: Graphical Illustration

Unsupervised Learning (cont.)

Unsupervised Training
1 The training data set is presented at the input layer
2 Output nodes are evaluated through a specific criterion
3 Only weights connected to the winner node are adjusted
4 Repeat steps 1 to 3 until the maximum number of epochs is reached or the
connection weights reach steady state

Rationale
Competitive learning strengthens the connection between the incoming
pattern at the input layer and the winning output node.

The weights connected to each output node can be regarded as the
center of the cluster associated to that node.

Reinforcement Learning

Reinforcement learning mimics the way humans adjust their behavior
when interacting with physical systems (e.g., learning to ride a bike).

The network's connection weights are adjusted according to qualitative,
not quantitative, feedback information resulting from the network's
interaction with the environment or system.

The qualitative feedback signal simply informs the network whether or not
the system reacted "well" to the output generated by the network.

Reinforcement Learning: Graphical Representation

Reinforcement Learning

Reinforcement Training Algorithm

1 Present a training input pattern to the network

2 Qualitatively evaluate the system's reaction to the network's calculated output

If the response is "Good", the corresponding weights that led to that output are
strengthened

If the response is "Bad", the corresponding weights are weakened.

Fundamentals of ANNs

Late 1940's : McCulloch-Pitts Model (by McCulloch and Pitts)

Late 1950's – early 1960's : Perceptron (by Rosenblatt)

Mid 1960's : Adaline (by Widrow)

Mid 1970's : Back Propagation Algorithm - BPL I (by Werbos)

Mid 1980's : BPL II and Multi Layer Perceptron (by Rumelhart and Hinton)

McCulloch-Pitts Model

Overview

First serious attempt to model the computing process of the biological


neuron.

The model is composed of one neuron only.

Limited computing capability.

No learning capability.

McCulloch-Pitts Model: Architecture

McCulloch-Pitts Models (cont.)

Functionality

1 l input signals are presented to the network: x1, x2, ..., xl.

2 l hard-coded weights, w1, w2, ..., wl, and bias θ are applied to compute
the neuron's net sum: Σ_{i=1}^{l} wi xi − θ.

3 A binary activation function f is applied to the neuron's net sum to
calculate the node's output o:

  o = f( Σ_{i=1}^{l} wi xi − θ )

McCulloch-Pitts Models (cont.)

Remarks

It is sometimes simpler and more convenient to introduce a virtual input
x0 = 1 and assign it the corresponding weight w0 = −θ. Then,

  o = f( Σ_{i=0}^{l} wi xi )   with x0 = 1, w0 = −θ

Synaptic weights are not updated due to the lack of a learning
mechanism.
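
A minimal sketch of a McCulloch-Pitts unit (illustrative only; the AND-gate weights below are an assumption, since the model itself has no learning mechanism):

```python
import numpy as np

def mcculloch_pitts(x, w, theta):
    """McCulloch-Pitts neuron: o = step(sum_i w_i * x_i - theta)."""
    net = np.dot(w, x) - theta
    return 1 if net > 0 else 0

# Example: a 2-input unit acting as a logical AND (hand-picked weights).
w = np.array([1.0, 1.0])
theta = 1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, mcculloch_pitts(np.array(x), w, theta))
```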

Perceptron

Overview
Uses supervised learning to adjust its weights in response to a
comparative signal between the network’s actual output and the target
output.

Mainly designed to classify linearly separable patterns.

Definition: Linear Separation

Patterns are linearly separable if there exists a hyperplanar
multidimensional decision boundary that classifies the patterns into two
classes.

Linearly Separable Patterns

Non-Linearly Separable Patterns

Perceptron

Remarks

One neuron (one output)

l input signals: x1 , x2 , . . ., xl

Adjustable weights w1 , w2 , . . ., wl , and bias θ

Binary activation function; i.e., step or hard limiter function

Perceptron: Architecture

Perceptron (cont.)
Perceptron Convergence Theorem
If the training set is linearly separable, there exists a set of weights for which
the training of the Perceptron will converge in a finite time and the training
patterns are correctly classified.
In the two-dimensional case, the theorem translates into finding the line
defined by w1 x1 + w2 x2 − θ = 0, i.e.,

  x2 = −(w1 / w2) x1 + θ / w2,

which adequately classifies the training patterns.

[Figure: a decision boundary in the (x1, x2) plane separating the two
classes A (◦) and B (▽).]

Training Algorithm
1 Initialize the weights and threshold to small random values.
2 Choose an input-output pattern (x^(k), t^(k)) from the training data.
3 Compute the network's actual output o^(k) = f( Σ_{i=1}^{l} wi xi^(k) − θ ).
4 Adjust the weights and bias according to the Perceptron learning rule:
  Δwi = η [t^(k) − o^(k)] xi^(k),  and  Δθ = −η [t^(k) − o^(k)],
  where η ∈ [0, 1] is the Perceptron's learning rate.
  If f is the signum function, this becomes equivalent to:
  Δwi = 2η t^(k) xi^(k) if t^(k) ≠ o^(k), 0 otherwise;
  Δθ  = −2η t^(k)       if t^(k) ≠ o^(k), 0 otherwise.
5 If a whole epoch is complete, then pass to the following step; otherwise go to
Step 2.
6 If the weights (and bias) reached steady state (Δwi ≈ 0) through the whole epoch,
then stop the learning; otherwise go through one more epoch starting from
Step 2.
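
A minimal NumPy sketch of the training rule above (a sketch of these steps, not the original author's code); the bias θ is folded in as an extra input fixed to −1:

```python
import numpy as np

def train_perceptron(X, t, eta=0.5, max_epochs=100):
    """Perceptron learning rule with a hard-limiter activation.

    X: (n_patterns, n_inputs) array; t: targets in {-1, +1}.
    """
    Xb = np.hstack([X, -np.ones((X.shape[0], 1))])   # append bias input (-1)
    w = np.random.uniform(-0.5, 0.5, Xb.shape[1])    # small random weights
    for _ in range(max_epochs):
        changed = False
        for x, target in zip(Xb, t):
            o = 1 if np.dot(w, x) > 0 else -1        # hard-limited output
            if o != target:
                w += eta * (target - o) * x          # perceptron update
                changed = True
        if not changed:                               # steady state reached
            break
    return w
```

For the worked example that follows (patterns T, U, V against X, Y, Z with η = 0.5), this loop should converge to a separating line after only a few updates.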
Example

Problem Statement

Classify the following patterns using η = 0.5:

Class (1) with target value (−1): T = [2, 0]^T, U = [2, 2]^T, V = [1, 3]^T
Class (2) with target value (+1): X = [−1, 0]^T, Y = [−2, 0]^T, Z = [−1, 2]^T

Let the initial weights be w1 = −1, w2 = 1, θ = −1.

Thus, the initial boundary is defined by x2 = x1 − 1.

Example

Solution

T is properly classified, but not U and V.

Hence, training is needed.

Let us start by selecting pattern U:

  sgn(2 × (−1) + 2 × (1) + 1) = +1 ≠ t  ⇒  Δw1 = Δw2 = −1 × (2) = −2,  Δθ = +1

The updated boundary is defined by x2 = −3x1.

All patterns are now properly classified.

Example: Graphical Solution

[Figure: the training patterns in the (x1, x2) plane — T, U, V from Class 1 (◦, target −1)
and X, Y, Z from Class 2 (△, target +1) — together with the original boundary
x2 = x1 − 1 and the updated boundary x2 = −3x1.]

Perceptron (cont.)

Remarks

Single-layer perceptrons suffer from two major shortcomings:

1 They cannot separate linearly non-separable patterns.

2 Lack of generalization: once trained, the network cannot adapt its weights to a new set
of data.

Adaline (Adaptive Linear Neuron)

Overview

More versatile than the Perceptron in terms of generalization.

More powerful in terms of weight adaptation.

An Adaline is composed of a linear combiner, a binary activation function


(hard limiter), and adaptive weights.

Adaline: Graphical Illustration

Adaline (cont.)

Learning in an Adaline

Adaline adjusts its weights according to the least mean squared (LMS)
algorithm (also known as the Widrow-Hoff learning rule) through gradient
descent optimization.

At every iteration, the weights are adjusted by an amount proportional to
the gradient of the cumulative error of the network E(w):

  Δw = −η ∇w E(w)

Adaline (cont.)

Learning in an Adaline (cont.)

The network's cumulative error E(w) is computed over all patterns (x^(k), t^(k)),
k = 1, 2, ..., n. It is the error between the desired response t^(k) and
the linear combiner's output ( Σ_i wi xi^(k) − θ ):

  E(w) = Σ_k [ t^(k) − ( Σ_i wi xi^(k) − θ ) ]^2

Hence, individual weights are updated as:

  Δwi = η ( t^(k) − Σ_i wi xi^(k) ) xi^(k).

Adaline (cont.)

Training Algorithm
1 Initialize the weights and threshold to small random values.
2 Choose an input-output pattern (x^(k), t^(k)) from the training data.
3 Compute the linear combiner's output r^(k) = Σ_i wi xi^(k) − θ.
4 Adjust the weights (and bias) according to the LMS rule as:
  Δwi = η ( t^(k) − Σ_i wi xi^(k) ) xi^(k),  where η ∈ [0, 1] is the learning rate.
5 If a whole epoch is complete, then pass to the following step; otherwise
go to Step 2.
6 If the weights (and bias) reached steady state (Δwi ≈ 0) through the
whole epoch, then stop the learning; otherwise go through one more
epoch starting from Step 2.
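
A minimal sketch of the LMS (Widrow-Hoff) procedure described above (a sketch, not the original author's code):

```python
import numpy as np

def train_adaline(X, t, eta=0.01, max_epochs=200, tol=1e-6):
    """Adaline trained with the LMS (Widrow-Hoff) rule.

    The update uses the linear combiner's output, not the hard-limited one;
    the bias is folded in as an extra input fixed to -1.
    """
    Xb = np.hstack([X, -np.ones((X.shape[0], 1))])
    w = np.random.uniform(-0.5, 0.5, Xb.shape[1])
    for _ in range(max_epochs):
        max_update = 0.0
        for x, target in zip(Xb, t):
            r = np.dot(w, x)                 # linear combiner output
            dw = eta * (target - r) * x      # LMS / Widrow-Hoff update
            w += dw
            max_update = max(max_update, np.max(np.abs(dw)))
        if max_update < tol:                 # weights reached steady state
            break
    return w

def adaline_classify(X, w):
    """Hard limiter applied to the linear combiner for classification."""
    Xb = np.hstack([X, -np.ones((X.shape[0], 1))])
    return np.where(Xb @ w >= 0, 1, -1)
```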

Adaline (cont.)

Advantages of the LMS Algorithm

Easy to implement.

Suitable for generalization, which is a missing feature in the Perceptron.

Madaline
Shortcoming of Adaline

The adaline, while having attractive training capabilities, suffers (similarly
to the perceptron) from the inability to handle patterns belonging to nonlinearly
separable spaces.

Researchers have tried to circumvent this difficulty by setting up cascaded
layers of adaline units.

When first proposed, this seemingly attractive idea did not lead to much
improvement due to the lack of an existing learning algorithm capable of
adequately updating the synaptic weights of a cascade architecture of
perceptrons.

Other researchers were able to solve the nonlinear separability problem
by combining in parallel a number of adaline units, called a madaline.
Madaline: Graphical Representation

Madaline: Example
Solving the XOR logic function by combining in parallel two adaline units
using the AND logic gate.

Graphical Solution

Related Binary Table

x1   x2   o = x1 XOR x2
0    0    +1
0    1    −1
1    0    −1
1    1    +1
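
A minimal sketch of this parallel combination (the specific weights below are an illustrative assumption, not taken from the slides' figure):

```python
def sgn(x):
    """Bipolar hard limiter: +1 if x >= 0, -1 otherwise."""
    return 1 if x >= 0 else -1

def madaline_xor(x1, x2):
    """Two adaline units combined by an AND gate, reproducing the table above.

    Unit 1 fires +1 unless the pattern is (1, 0); unit 2 fires +1 unless it is
    (0, 1). Their AND is therefore +1 exactly when x1 == x2, as in the table.
    """
    a1 = sgn(-1.0 * x1 + 1.0 * x2 + 0.5)        # adaline 1 (hand-picked weights)
    a2 = sgn( 1.0 * x1 - 1.0 * x2 + 0.5)        # adaline 2 (hand-picked weights)
    return 1 if (a1 == 1 and a2 == 1) else -1   # AND combination

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, madaline_xor(x1, x2))
```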

Madaline (cont.)

Remarks

Despite the successful implementation of the adaline and the madaline
units in a number of applications, many researchers conjectured that to
have successful connectionist computational tools, neural models should
involve a topology with a number of cascaded layers.

This set the stage for applying the backpropagation
learning algorithm to neural network models composed of multiple layers
of perceptrons.

Case Study: Binary Classification Using Perceptron

We need to train the network using the following set of input and desired
output training vectors:

  (x^(1) = [1, −2, 0, −1]^T;    t^(1) = −1),
  (x^(2) = [0, 1.5, −0.5, −1]^T; t^(2) = −1),
  (x^(3) = [−1, 1, 0.5, −1]^T;   t^(3) = +1)

Initial weight vector w^(1) = [1, −1, 0, 0.5]^T

Learning rate η = 0.1

Epoch 1

Introducing the first input vector x^(1) to the network

Computing the output of the network:

  o^(1) = sgn(w^(1)T x^(1)) = sgn([1, −1, 0, 0.5][1, −2, 0, −1]^T) = +1 ≠ t^(1)

Updating the weight vector:

  w^(2) = w^(1) + η[t^(1) − o^(1)] x^(1) = w^(1) + 0.1(−2) x^(1) = [0.8, −0.6, 0, 0.7]^T


Introducing the second input vector x^(2) to the network

Computing the output of the network:

  o^(2) = sgn(w^(2)T x^(2)) = sgn([0.8, −0.6, 0, 0.7][0, 1.5, −0.5, −1]^T) = −1 = t^(2)

Updating the weight vector:

  w^(3) = w^(2)


Introducing the third input vector x^(3) to the network

Computing the output of the network:

  o^(3) = sgn(w^(3)T x^(3)) = sgn([0.8, −0.6, 0, 0.7][−1, 1, 0.5, −1]^T) = −1 ≠ t^(3)

Updating the weight vector:

  w^(4) = w^(3) + η[t^(3) − o^(3)] x^(3) = w^(3) + 0.1(2) x^(3) = [0.6, −0.4, 0.1, 0.5]^T

Epoch 2
We reuse the training set (x (1) , t (1) ), (x (2) , t (2) ) and (x (3) , t (3) ) as
(x (4) , t (4) ), (x (5) , t (5) ) and (x (6) , t (6) ), respectively.

Introducing the first input vector x^(4) to the network

Computing the output of the network:

  o^(4) = sgn(w^(4)T x^(4)) = sgn([0.6, −0.4, 0.1, 0.5][1, −2, 0, −1]^T) = +1 ≠ t^(4)

Updating the weight vector:

  w^(5) = w^(4) + η[t^(4) − o^(4)] x^(4) = w^(4) + 0.1(−2) x^(4) = [0.4, 0, 0.1, 0.7]^T


Introducing the second input vector x^(5) to the network

Computing the output of the network:

  o^(5) = sgn(w^(5)T x^(5)) = sgn([0.4, 0, 0.1, 0.7][0, 1.5, −0.5, −1]^T) = −1 = t^(5)

Updating the weight vector:

  w^(6) = w^(5)


Introducing the third input vector x^(6) to the network

Computing the output of the network:

  o^(6) = sgn(w^(6)T x^(6)) = sgn([0.4, 0, 0.1, 0.7][−1, 1, 0.5, −1]^T) = −1 ≠ t^(6)

Updating the weight vector:

  w^(7) = w^(6) + η[t^(6) − o^(6)] x^(6) = w^(6) + 0.1(2) x^(6) = [0.2, 0.2, 0.2, 0.5]^T

Epoch 3

We reuse the training set (x (1) , t (1) ), (x (2) , t (2) ) and (x (3) , t (3) ) as
(x (7) , t (7) ), (x (8) , t (8) ) and (x (9) , t (9) ), respectively.

Introducing the first input vector x^(7) to the network

Computing the output of the network:

  o^(7) = sgn(w^(7)T x^(7)) = sgn([0.2, 0.2, 0.2, 0.5][1, −2, 0, −1]^T) = −1 = t^(7)

Updating the weight vector:

  w^(8) = w^(7)


Introducing the second input vector x^(8) to the network

Computing the output of the network:

  o^(8) = sgn(w^(8)T x^(8)) = sgn([0.2, 0.2, 0.2, 0.5][0, 1.5, −0.5, −1]^T) = −1 = t^(8)

Updating the weight vector:

  w^(9) = w^(8)


Introducing the third input vector x^(9) to the network

Computing the output of the network:

  o^(9) = sgn(w^(9)T x^(9)) = sgn([0.2, 0.2, 0.2, 0.5][−1, 1, 0.5, −1]^T) = −1 ≠ t^(9)

Updating the weight vector:

  w^(10) = w^(9) + η[t^(9) − o^(9)] x^(9) = w^(9) + 0.1(2) x^(9) = [0, 0.4, 0.3, 0.3]^T

Epoch 4

We reuse the training set (x (1) , t (1) ), (x (2) , t (2) ) and (x (3) , t (3) ) as
(x (10) , t (10) ), (x (11) , t (11) ) and (x (12) , t (12) ), respectively.

Introducing the first input vector x^(10) to the network

Computing the output of the network:

  o^(10) = sgn(w^(10)T x^(10)) = sgn([0, 0.4, 0.3, 0.3][1, −2, 0, −1]^T) = −1 = t^(10)

Updating the weight vector:

  w^(11) = w^(10)


Introducing the second input vector x^(11) to the network

Computing the output of the network:

  o^(11) = sgn(w^(11)T x^(11)) = sgn([0, 0.4, 0.3, 0.3][0, 1.5, −0.5, −1]^T) = +1 ≠ t^(11)

Updating the weight vector:

  w^(12) = w^(11) + η[t^(11) − o^(11)] x^(11) = w^(11) + 0.1(−2) x^(11) = [0, 0.1, 0.4, 0.5]^T


Introducing the third input vector x^(12) to the network

Computing the output of the network:

  o^(12) = sgn(w^(12)T x^(12)) = sgn([0, 0.1, 0.4, 0.5][−1, 1, 0.5, −1]^T) = −1 ≠ t^(12)

Updating the weight vector:

  w^(13) = w^(12) + η[t^(12) − o^(12)] x^(12) = w^(12) + 0.1(2) x^(12) = [−0.2, 0.3, 0.5, 0.3]^T

Final Weight Vector

Introducing the input vectors for another epoch results in no change
to the weights, which indicates that w^(13) is the solution for this problem.

Final weight vector: w = [w1, w2, w3, w4]^T = [−0.2, 0.3, 0.5, 0.3]^T.
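
A minimal sketch that replays this case study and checks the final weight vector (a sketch, not the original author's code):

```python
import numpy as np

# Training vectors and targets from the case study above.
X = np.array([[ 1.0, -2.0,  0.0, -1.0],
              [ 0.0,  1.5, -0.5, -1.0],
              [-1.0,  1.0,  0.5, -1.0]])
t = np.array([-1, -1, 1])

w = np.array([1.0, -1.0, 0.0, 0.5])   # initial weight vector w^(1)
eta = 0.1

for epoch in range(10):
    changed = False
    for x, target in zip(X, t):
        o = 1 if np.dot(w, x) > 0 else -1      # sgn of the weighted sum
        if o != target:
            w = w + eta * (target - o) * x     # perceptron update
            changed = True
    if not changed:                             # one clean pass: converged
        break

print(w)   # expected: [-0.2  0.3  0.5  0.3]
```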

Major Classes of Neural Networks


Outline

Multi-Layer Perceptrons (MLPs)

Radial Basis Function Network

Kohonen’s Self-Organizing Network

Hopfield Network

Multi-Layer Perceptrons (MLPs)

Background

The perceptron lacks the important capability of recognizing
patterns belonging to nonlinearly separable spaces.

The madaline is restricted in dealing with complex functional
mappings and multi-class pattern recognition problems.

The multilayer architecture was first proposed in the late sixties.

Background (cont.)

MLP re-emerged as a solid connectionist model to solve a


wide range of complex problems in the mid-eighties.

This occurred following the reformulation of a powerful


learning algorithm commonly called the Back Propagation
Learning (BPL).

It was later implemented to the multilayer perceptron


topology with a great deal of success.

Schematic Representation of MLP Network

Backpropagation Learning Algorithm (BPL)

The backpropagation learning algorithm is based on the
gradient descent technique involving the minimization of
the network's cumulative error:

  E(k) = Σ_{i=1}^{q} [ti(k) − oi(k)]^2

where i denotes the i-th neuron of the output layer, composed of a
total number of q neurons.

It is designed to update the weights in the direction of the
gradient descent of the cumulative error.

Backpropagation Learning Algorithm (cont.)

A Two-Stage Algorithm

1 First, patterns are presented to the network.

2 A feedback signal is then propagated backward with the main
task of updating the weights of the layers' connections
according to the back-propagation learning algorithm.

BPL: Schematic Representation

Schematic Representation of the MLP network illustrating the


notion of error back-propagation

Backpropagation Learning Algorithm (cont.)

Objective Function

Using the sigmoid function as the activation function for all
the neurons of the network, we define Ec as

  Ec = Σ_{k=1}^{n} E(k) = (1/2) Σ_{k=1}^{n} Σ_{i=1}^{q} [ti(k) − oi(k)]^2

Backpropagation Learning Algorithm (cont.)


The formulation of the optimization problem can now be
stated as finding the set of network weights that
minimizes Ec or E(k).

Objective Function: Off-Line Training

  min_w Ec = min_w (1/2) Σ_{k=1}^{n} Σ_{i=1}^{q} [ti(k) − oi(k)]^2

Objective Function: On-Line Training

  min_w E(k) = min_w (1/2) Σ_{i=1}^{q} [ti(k) − oi(k)]^2

BPL: On-Line Training


Objective Function:  min_w E(k) = min_w (1/2) Σ_{i=1}^{q} [ti(k) − oi(k)]^2

Updating Rule for Connection Weights

  Δw^(l) = −η ∂E(k) / ∂w^(l),

where l denotes the l-th layer and η the learning rate parameter;

Δwij^(l): the weight update for the connection linking node
j of layer (l − 1) to node i located at layer l.

BPL: On-Line Training (cont.)

Updating Rule for Connection Weights

oj^(l−1): the output of neuron j at layer l − 1, the one
located just before layer l.

toti^(l): the sum of all signals reaching node i at hidden layer l,
coming from the previous layer l − 1.

Illustration of Interconnection Between Layers of MLP

Interconnection Weights Updating Rules


  Δw^(l) = Δwij^(l) = −η [∂E(k)/∂oi^(l)] [∂oi^(l)/∂toti^(l)] [∂toti^(l)/∂wij^(l)]

For the case where layer (l) is the output layer (L):

  Δwij^(L) = η [ti − oi^(L)] f'(toti^(L)) oj^(L−1),   with  f'(toti^(l)) = ∂f(toti^(l)) / ∂toti^(l)

By denoting δi^(L) = [ti − oi^(L)] f'(toti^(L)) the error
signal of the i-th node of the output layer, the weight update
at layer (L) is as follows:  Δwij^(L) = η δi^(L) oj^(L−1)

In the case where f is the sigmoid function, the error signal
becomes expressed as:

  δi^(L) = (ti − oi^(L)) oi^(L) (1 − oi^(L))

Interconnection Weights Updating Rules (cont.)

Propagating the error backward now, and for the case where
(l) represents a hidden layer (l < L), the expression of Δwij^(l)
becomes given by:  Δwij^(l) = η δi^(l) oj^(l−1),

where  δi^(l) = f'(toti^(l)) Σ_p δp^(l+1) wpi^(l+1),  the sum running over the
nodes p of layer l + 1.

Again, when f is taken as the sigmoid function, δi^(l) becomes
expressed as:  δi^(l) = oi^(l) (1 − oi^(l)) Σ_p δp^(l+1) wpi^(l+1).

Updating Rules: Off-Line Training

The weight update rule:

  Δw^(l) = −η ∂Ec / ∂w^(l).

All previous steps outlined for developing the on-line update
rules are reproduced here, with the exception that E(k)
becomes replaced with Ec.

In both cases, once the network weights have reached
steady state values, the training algorithm is said to converge.

Required Steps for Backpropagation Learning Algorithm

Step 1: Initialize weights and thresholds to small random
values.

Step 2: Choose an input-output pattern (x(k), t(k)) from the training
input-output data set.

Step 3: Propagate the k-th signal forward through the
network and compute the output values for all i neurons at
every layer (l) using  oi^(l)(k) = f( Σ_{p=0}^{n_{l−1}} wip^(l) op^(l−1) ).

Step 4: Compute the total error value E = E(k) + E and the
error signal δi^(L) using the formula  δi^(L) = [ti − oi^(L)] f'(toti^(L)).

Required Steps for BPL (cont.)

Step 5: Update the weights according to
  Δwij^(l) = η δi^(l) oj^(l−1),  for l = L, ..., 1,
using  δi^(L) = [ti − oi^(L)] f'(toti^(L))  and proceeding backward using
  δi^(l) = oi^(l) (1 − oi^(l)) Σ_p δp^(l+1) wpi^(l+1)  for l < L.

Step 6: Repeat the process starting from Step 2 using another
exemplar. Once all exemplars have been used, we reach
what is known as one epoch of training.

Step 7: Check whether the cumulative error E in the output layer
has become less than a predetermined value. If so, we say the
network has been trained. If not, repeat the whole process for
one more epoch.

Momentum

The gradient descent requires by nature infinitesimal


differentiation steps.

For small values of the learning parameter η, this leads most


often to a very slow convergence rate of the algorithm.

Larger learning parameters have been known to lead to


unwanted oscillations in the weight space.

To avoid these issues, the concept of momentum has been


introduced.

Momentum (cont.)

The modified weight update formula, including the momentum term, is
given as:

  Δw^(l)(t + 1) = −η ∂Ec(t)/∂w^(l) + γ Δw^(l)(t).
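
As an illustration of Steps 1–7 above with the momentum term added, here is a minimal NumPy sketch for a single-hidden-layer network with sigmoid units (a sketch under these assumptions, not the original author's implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp(X, t, n_hidden=2, eta=0.2, gamma=0.9, max_epochs=10000, e_max=0.01):
    """On-line backpropagation with momentum for a 1-hidden-layer MLP.

    X: (n_patterns, n_inputs); t: (n_patterns,) targets in (0, 1).
    A bias input fixed to -1 is appended to the input and hidden layers.
    """
    rng = np.random.default_rng(0)
    n_in = X.shape[1]
    W1 = rng.uniform(-0.5, 0.5, (n_hidden, n_in + 1))   # hidden weights (incl. bias)
    W2 = rng.uniform(-0.5, 0.5, (1, n_hidden + 1))      # output weights (incl. bias)
    dW1 = np.zeros_like(W1)
    dW2 = np.zeros_like(W2)
    for _ in range(max_epochs):
        E = 0.0
        for x, target in zip(X, t):
            x1 = np.append(x, -1.0)                      # input + bias
            h = sigmoid(W1 @ x1)                         # hidden outputs
            h1 = np.append(h, -1.0)                      # hidden + bias
            o = sigmoid(W2 @ h1)[0]                      # network output
            E += 0.5 * (target - o) ** 2                 # cumulative error
            delta_o = (target - o) * o * (1.0 - o)       # output error signal
            delta_h = h * (1.0 - h) * (W2[0, :n_hidden] * delta_o)  # hidden error signals
            dW2 = eta * delta_o * h1[None, :] + gamma * dW2         # update with momentum
            dW1 = eta * np.outer(delta_h, x1) + gamma * dW1
            W2 += dW2
            W1 += dW1
        if E < e_max:                                    # acceptable training error
            break
    return W1, W2
```

With all weights initialized to 0.2, η = 0.2 and γ = 0, one pass of this loop should reproduce, up to rounding, the hand calculations of Example 1 that follows.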

Example 1

To illustrate this powerful algorithm, we apply it to the
training of the network shown on the next page.

x: training patterns, and t: output data

  x(1) = (0.3, 0.4), t(1) = 0.88
  x(2) = (0.1, 0.6), t(2) = 0.82
  x(3) = (0.9, 0.4), t(3) = 0.57

Biases: −1

Sigmoid activation function: f(tot) = 1 / (1 + e^(−λ tot)), using λ = 1,
then f'(tot) = f(tot)(1 − f(tot)).

Example 1: Structure of the Network

Example 1: Training Loop (1)

Step (1) Initialization

Initialize the weights to 0.2, set learning rate to η = 0.2 ; set


maximum tolerable error to Emax = 0.01 (i.e. 1% error), set
E = 0 and k = 1.

Step (2) - Apply input pattern


Apply the 1st input pattern to the input layer.
x (1) = (0.3, 0.4), t(1) = 0.88, then,
o0 = x1 = 0.3; o1 = x2 = 0.4; o2 = x3 = −1;


Step (3) - Forward propagation

Propagate the signal forward through the network

o3 = f (w30 o0 + w31 o1 + w32 o2 ) = 0.485

o4 = f (w40 o0 + w41 o1 + w42 o2 ) = 0.485


o5 = −1
o6 = f (w63 o3 + w64 o4 + w65 o5 ) = 0.4985


Step (4) - Output error measure

Compute the error value E


1
E= (t − o6 )2 + E = 0.0728
2

Compute the error signal δ6 of the output layer

δ6 = f ′ (tot6 )(t − o6 )
= o6 (1 − o6 )(t − o6 )
= 0.0945


Step (5) - Error back-propagation


Third layer weight updates:

  Δw63 = η δ6 o3 = 0.0093,    w63(new) = w63(old) + Δw63 = 0.2093
  Δw64 = η δ6 o4 = 0.0093,    w64(new) = w64(old) + Δw64 = 0.2093
  Δw65 = η δ6 o5 = −0.0191,   w65(new) = w65(old) + Δw65 = 0.1809

Second layer error signals:

  δ3 = f3'(tot3) Σ_i wi3 δi = o3 (1 − o3) w63 δ6 = 0.0048
  δ4 = f4'(tot4) Σ_i wi4 δi = o4 (1 − o4) w64 δ6 = 0.0048


Step (5) - Error back-propagation (cont.)


Second layer weight updates:

  Δw30 = η δ3 o0 = 0.00028586,    w30(new) = w30(old) + Δw30 = 0.2003
  Δw31 = η δ3 o1 = 0.00038115,    w31(new) = w31(old) + Δw31 = 0.2004
  Δw32 = η δ3 o2 = −0.00095288,   w32(new) = w32(old) + Δw32 = 0.199
  Δw40 = η δ4 o0 = 0.00028586,    w40(new) = w40(old) + Δw40 = 0.2003
  Δw41 = η δ4 o1 = 0.00038115,    w41(new) = w41(old) + Δw41 = 0.2004
  Δw42 = η δ4 o2 = −0.00095288,   w42(new) = w42(old) + Δw42 = 0.199

Example 1: Training Loop (2)

Step (2) - Apply the 2nd input pattern


x (2) = (0.1, 0.6), t(2) = 0.82, then,
o0 = 0.1; o1 = 0.6; o2 = −1;

Step (3) - Forward propagation


o3 = f (w30 o0 + w31 o1 + w32 o2 ) = 0.4853
o4 = f (w40 o0 + w41 o1 + w42 o2 ) = 0.4853
o5 = −1
o6 = f (w63 o3 + w64 o4 + w65 o5 ) = 0.5055
Step (4) - Output error measure

  E = (1/2)(t − o6)^2 + E = 0.1222
  δ6 = o6 (1 − o6)(t − o6) = 0.0786


Step (5) - Error back-propagation


Third layer weight updates:

  Δw63 = η δ6 o3 = 0.0076,    w63(new) = w63(old) + Δw63 = 0.2169
  Δw64 = η δ6 o4 = 0.0076,    w64(new) = w64(old) + Δw64 = 0.2169
  Δw65 = η δ6 o5 = −0.0157,   w65(new) = w65(old) + Δw65 = 0.1652

Second layer error signals:

  δ3 = f3'(tot3) Σ_i wi3 δi = o3 (1 − o3) w63 δ6 = 0.0041
  δ4 = f4'(tot4) Σ_i wi4 δi = o4 (1 − o4) w64 δ6 = 0.0041


Step (5) - Error back-propagation (cont.)


Second layer weight updates:

  Δw30 = η δ3 o0 = 0.000082169,   w30(new) = w30(old) + Δw30 = 0.2004
  Δw31 = η δ3 o1 = 0.00049302,    w31(new) = w31(old) + Δw31 = 0.2009
  Δw32 = η δ3 o2 = −0.00082169,   w32(new) = w32(old) + Δw32 = 0.1982
  Δw40 = η δ4 o0 = 0.000082169,   w40(new) = w40(old) + Δw40 = 0.2004
  Δw41 = η δ4 o1 = 0.00049302,    w41(new) = w41(old) + Δw41 = 0.2009
  Δw42 = η δ4 o2 = −0.00082169,   w42(new) = w42(old) + Δw42 = 0.1982

Example 1: Training Loop (3)


Step (2) - Apply the 3rd input pattern

  x(3) = (0.9, 0.4), t(3) = 0.57, then,
  o0 = 0.9; o1 = 0.4; o2 = −1;

Step (3) - Forward propagation

  o3 = f(w30 o0 + w31 o1 + w32 o2) = 0.5156
  o4 = f(w40 o0 + w41 o1 + w42 o2) = 0.5156
  o5 = −1
  o6 = f(w63 o3 + w64 o4 + w65 o5) = 0.5146

Step (4) - Output error measure

  E = (1/2)(t − o6)^2 + E = 0.1237
  δ6 = o6 (1 − o6)(t − o6) = 0.0138

Step (5) - Error back-propagation


Third layer weight updates:

  Δw63 = η δ6 o3 = 0.0014,    w63(new) = w63(old) + Δw63 = 0.2183
  Δw64 = η δ6 o4 = 0.0014,    w64(new) = w64(old) + Δw64 = 0.2183
  Δw65 = η δ6 o5 = −0.0028,   w65(new) = w65(old) + Δw65 = 0.1624

Second layer error signals:

  δ3 = f3'(tot3) Σ_i wi3 δi = o3 (1 − o3) w63 δ6 = 0.00074948
  δ4 = f4'(tot4) Σ_i wi4 δi = o4 (1 − o4) w64 δ6 = 0.00074948


Step (5) - Error back-propagation (cont.)


Second layer weight updates:

  Δw30 = η δ3 o0 = 0.00013491,    w30(new) = w30(old) + Δw30 = 0.2005
  Δw31 = η δ3 o1 = 0.000059958,   w31(new) = w31(old) + Δw31 = 0.2009
  Δw32 = η δ3 o2 = −0.0001499,    w32(new) = w32(old) + Δw32 = 0.1981
  Δw40 = η δ4 o0 = 0.00013491,    w40(new) = w40(old) + Δw40 = 0.2005
  Δw41 = η δ4 o1 = 0.000059958,   w41(new) = w41(old) + Δw41 = 0.2009
  Δw42 = η δ4 o2 = −0.0001499,    w42(new) = w42(old) + Δw42 = 0.1981

Example 1: Final Decision

Step (6) - One epoch looping

The training patterns have been cycled one epoch.

Step (7) - Total error checking

E = 0.1237 > Emax = 0.01, which means that we have to
continue with the next epoch by cycling through the training data
again.
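
For readers who want to check the hand calculations above, here is a minimal sketch that replays training loop (1) with all weights initialized to 0.2 and η = 0.2 (a sketch, not the original author's code; printed values should match the slides up to rounding):

```python
import numpy as np

def f(x):                        # sigmoid with lambda = 1
    return 1.0 / (1.0 + np.exp(-x))

eta = 0.2
w3 = np.array([0.2, 0.2, 0.2])   # weights w30, w31, w32 of hidden node 3
w4 = np.array([0.2, 0.2, 0.2])   # weights w40, w41, w42 of hidden node 4
w6 = np.array([0.2, 0.2, 0.2])   # weights w63, w64, w65 of output node 6

x = np.array([0.3, 0.4, -1.0])   # first pattern plus bias input
t = 0.88

o3 = f(w3 @ x); o4 = f(w4 @ x)             # forward pass, hidden layer (about 0.485)
h = np.array([o3, o4, -1.0])               # hidden outputs plus bias
o6 = f(w6 @ h)                             # network output (about 0.4985)

delta6 = o6 * (1 - o6) * (t - o6)          # output error signal
delta3 = o3 * (1 - o3) * w6[0] * delta6    # hidden error signals (old w63, w64)
delta4 = o4 * (1 - o4) * w6[1] * delta6

w6 += eta * delta6 * h                     # third layer update (~0.2093, 0.2093, 0.1809)
w3 += eta * delta3 * x                     # second layer updates (~0.2003, 0.2004, 0.199)
w4 += eta * delta4 * x
print(o3, o6, w6, w3)
```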

Example 2

Effect of Hidden Nodes on Function Approximation

Consider the function f(x) = x sin(x)

Six input/output samples were selected from the range [0, 10]
of the variable x

The first run was made for a network with 3 hidden nodes

Further runs were made for networks with 5 and 20 hidden nodes,
respectively.

Example 2: Different Hidden Nodes

Example 2: Remarks

A higher number of nodes is not always better. It may
overtrain the network.

This happens when the network starts to memorize the
patterns instead of interpolating between them.

A smaller number of nodes was not able to approximate the
function faithfully, since the nonlinearity induced by the
network was not enough to interpolate well in between the
samples.

It seems here that the network with five nodes was able to
interpolate the nonlinear behavior of the curve quite well.

Example 3

Effect of Training Patterns on Function Approximation

Consider the function f(x) = x sin(x)

Assume a network with a fixed number of hidden nodes (taken as five
here), but with a variable number of training patterns

The first run was made with three training samples

Further runs were made with 10 and 20 samples,
respectively.

Example 3: Different Samples

Example 3: Remarks

The first run with three samples was not able to provide a
good match with the original curve.

This can be explained by the fact that three patterns, in
the case of a nonlinear function such as this, are not able to
reproduce the relatively high nonlinearities of the function.

A higher number of training points provided better results.

The best result was obtained for the case of 20 training
patterns. This is due to the fact that a network with five
hidden nodes interpolates extremely well in between close
training patterns.

Applications of MLP

Multilayer perceptrons are currently among the most used
connectionist models.

This stems from the relative ease of training and
implementing them, either in hardware or software form.

Applications

• Signal processing
• Pattern recognition
• Weather forecasting
• Signal compression
• Financial market prediction

Limitations of MLP

Among the well-known problems that may hinder the


generalization or approximation capabilities of MLP is the one
related to the convergence behavior of the connection weights
during the learning stage.

In fact, the gradient descent based algorithm used to update


the network weights may never converge to the global
minima.

This is particularly true in the case of highly nonlinear


behavior of the system being approximated by the network.

Limitations of MLP

Many remedies have been proposed to tackle this issue either


by retraining the network a number of times or by using
optimization techniques such as those based on:

Genetic algorithms,

Simulated annealing.

MLP NN: Case Study

Function Estimation (Regression)


Use a feedforward backpropagation neural network that
contains a single hidden layer.

Each of the hidden nodes has an activation function of the logistic
form.

Investigate the outcome of the neural network for the
following mapping:

  f(x) = exp(−x^2),  x ∈ [0, 2]

Experiment with different numbers of training samples and
hidden layer nodes.


Experiment 1: Vary Number of Hidden Nodes

Uniformly pick six sample points from [0, 2]; use half of them
for training and the rest for testing

Evaluate regression performance while increasing the number of
hidden nodes

Use the sum of regression error over the test samples, i.e.
Σ_{i ∈ test samples} (Output(i) − True output(i)), as the performance
measure

Repeat each test 20 times and compute average results,
compensating for potential local minima
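
A minimal sketch of this experiment using scikit-learn's MLPRegressor (an assumption about tooling; the original study may have used different software):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def run_experiment(n_hidden, n_repeats=20, n_points=6):
    """Average test regression error for a given number of hidden nodes."""
    x = np.linspace(0.0, 2.0, n_points)
    y = np.exp(-x ** 2)                       # target mapping f(x) = exp(-x^2)
    x_train, y_train = x[0::2], y[0::2]       # half of the points for training
    x_test, y_test = x[1::2], y[1::2]         # the rest for testing
    errors = []
    for seed in range(n_repeats):             # repeat to average out local minima
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation='logistic',
                           solver='lbfgs', max_iter=2000, random_state=seed)
        net.fit(x_train.reshape(-1, 1), y_train)
        pred = net.predict(x_test.reshape(-1, 1))
        errors.append(np.sum(pred - y_test))  # sum of regression error, as above
    return np.mean(errors)

for n_hidden in (1, 3, 5, 10):
    print(n_hidden, run_experiment(n_hidden))
```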



Experiment 2: Vary Number of Training Samples

Construct neural network using three hidden nodes

Uniformly pick sample points from [0, 2], increasing their number for each test

Use half of sample data points for training and the rest for
testing

Use the same performance measure as experiment 1, i.e. sum


of regression error

Repeat each test 50 times and compute average results


MLP NN: Case Study


Radial Basis Function Network


Topology

The radial basis function network (RBFN) represents a special category of the feedforward neural network architecture.

Early researchers have developed this connectionist model for


mapping nonlinear behavior of static processes and for
function approximation purposes.

The basic RBFN structure consists of an input layer, a


single hidden layer with radial activation function and an
output layer.


Topology: Graphical Representation


Topology (cont.)

The network structure uses nonlinear transformations in its


hidden layer (typical transfer functions for the hidden units are Gaussian curves).

However, it uses linear transformations between the hidden


and output layers.

The rationale behind this is that input spaces, cast nonlinearly


into high-dimensional domains, are more likely to be linearly
separable than those cast into low-dimensional ones.


Topology (cont.)

Unlike most FF neural networks, the connection weights


between the input layer and the neuron units of the hidden
layer for an RBFN are all equal to unity.

The nonlinear transformations at the hidden layer level have


the main characteristics of being symmetrical.

They also attain their maximum at the function center, and


generate positive values that are rapidly decreasing with the
distance from the center.


Topology (cont.)

As such, they produce radial activation signals that are bounded and localized.

Parameters of each activation function:

The center

The width


Topology (cont.)

For an optimal performance of the network, the hidden layer


nodes should span the training data input space.

Too sparse or too overlapping functions may cause the


degradation of the network performance.


Radial Function or Kernel Function

In general the form taken by an RBF function is given as:

gi(x) = ri( ‖x − vi‖ / σi )

where x is the input vector,

vi is the vector denoting the center of the radial function gi ,

σi is the width parameter.


Famous Radial Functions

The Gaussian kernel function is the most widely used form of


RBF given by:

gi(x) = exp( −‖x − vi‖² / (2σi²) )

The logistic function has also been used as a possible RBF candidate:

gi(x) = 1 / ( 1 + exp( ‖x − vi‖² / σi² ) )
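Written directly in NumPy (a sketch; the function names are ours), the two kernels read:

import numpy as np

def gaussian_rbf(x, v, sigma):
    """Gaussian kernel: g(x) = exp(-||x - v||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(v)) ** 2) / (2.0 * sigma ** 2))

def logistic_rbf(x, v, sigma):
    """Logistic kernel: g(x) = 1 / (1 + exp(||x - v||^2 / sigma^2))."""
    return 1.0 / (1.0 + np.exp(np.sum((np.asarray(x) - np.asarray(v)) ** 2) / sigma ** 2))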


Output of an RBF Network

A typical output of an RBF network having n units in the


hidden layer and r output units is given by:
oj(x) = Σ_{i=1}^{n} wij gi(x),   j = 1, · · · , r

where wij is the connection weight between the i-th receptive


field unit and the j-th output,

gi is the i-th receptive field unit (radial function).
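As a sketch (assuming Gaussian receptive fields and the notation above; names are ours), the forward pass reduces to one weighted sum per output:

import numpy as np

def rbfn_output(x, centers, sigmas, W):
    """o_j(x) = sum_i w_ij g_i(x).  centers: (n, d), sigmas: (n,), W: (n, r)."""
    g = np.exp(-np.sum((centers - np.asarray(x)) ** 2, axis=1) / (2.0 * sigmas ** 2))
    return g @ W   # vector with the r network outputs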


Learning Algorithm

Two-Stage Learning Strategy

At first, an unsupervised clustering algorithm is used to


extract the parameters of the radial basis functions, namely
the width and the centers.

This is followed by the computation of the weights of the


connections between the output nodes and the kernel
functions using a supervised least mean square algorithm.


Learning Algorithm: Hybrid Approach

The standard technique used to train an RBF network is the


hybrid approach.

Hybrid Approach

Step 1: Train the RBF layer to get the adaptation of centers


and scaling parameters using the unsupervised training.

Step 2: Adapt the weights of the output layer using


supervised training algorithm.


Learning Algorithm: Step 1

To determine the centers for the RBF networks, typically


unsupervised training procedures of clustering are used:

K-means method,

”Maximum likelihood estimate” technique,

Self-organizing map method.

This step is very important in the training of RBFN, as the


accurate knowledge of vi and σi has a major impact on the
performance of the network.


Learning Algorithm: Step 2

Once the centers and the widths of radial basis functions are
obtained, the next stage of the training begins.

To update the weights between the hidden layer and the output layer, supervised learning-based techniques are used, such as:

Least-squares method,

Gradient method.

Because the weights exist only between the hidden layer and
the output layer, it is easy to compute the weight matrix for
the RBFN.


Learning Algorithm: Step 2 (cont.)

In the case where the RBFN is used for interpolation


purposes, we can use the inverse or pseudo-inverse method
to calculate the weight matrix.

If we use Gaussian kernel as the radial basis functions and


there are n input data, we have:

G = [{gij }],

where
gij = exp( −‖xi − vj‖² / (2σj²) ),   i, j = 1, · · · , n


Learning Algorithm: Step 2 (cont.)


Now we have:
D = GW
where D is the desired output of the training data.

If G⁻¹ exists, we get:

W = G⁻¹D

In practice, however, G may be ill-conditioned (close to singularity) or may even be a non-square matrix (if the number of radial basis functions is less than the number of training data); W is then expressed as:

W = G⁺D


Learning Algorithm: Step 2 (cont.)

We had:

W = G⁺D,

where G⁺ denotes the pseudo-inverse matrix of G, which can be defined as

G⁺ = (GᵀG)⁻¹Gᵀ

Once the weight matrix has been obtained, all elements of the
RBFN are now determined and the network could operate on
the task it has been designed for.
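A minimal NumPy sketch of this second training stage (the function and variable names are ours); numpy.linalg.pinv computes the Moore-Penrose pseudo-inverse G⁺ used above:

import numpy as np

def rbfn_weights(X, D, centers, sigmas):
    """Solve W = G^+ D with g_ij = exp(-||x_i - v_j||^2 / (2 sigma_j^2)).
    X: (N, d) training inputs, D: (N, r) desired outputs, returns W: (n, r)."""
    sq_dist = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    G = np.exp(-sq_dist / (2.0 * sigmas ** 2))
    return np.linalg.pinv(G) @ D   # pseudo-inverse covers ill-conditioned / non-square G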



Example

Approximation of Function f (x) Using an RBFN

We use here the same function as the one used in the MLP
section, f (x) = x sin(x).

The RBF network is composed here of five radial functions.

Each radial function has its center at a training input data.

Three width parameters are used here: 0.5, 2.1, and 8.5.

The simulation results show that the width of the radial functions plays a major role.
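A sketch of this experiment (the sampling domain and sample locations below are our assumptions; only the five centers-at-training-points setup and the three width values come from the slides):

import numpy as np

def rbfn_fit_predict(x_train, y_train, x_query, sigma):
    """Gaussian RBFN with one unit centered at each training input and a
    common width sigma; the weights come from the pseudo-inverse step above."""
    def design(x):
        return np.exp(-(x[:, None] - x_train[None, :]) ** 2 / (2.0 * sigma ** 2))
    w = np.linalg.pinv(design(x_train)) @ y_train
    return design(x_query) @ w

x_train = np.linspace(0.0, 2.0 * np.pi, 5)      # five radial functions (assumed domain)
x_query = np.linspace(0.0, 2.0 * np.pi, 200)
for sigma in (0.5, 2.1, 8.5):                   # the three widths compared here
    y_hat = rbfn_fit_predict(x_train, x_train * np.sin(x_train), x_query, sigma)
    max_err = np.max(np.abs(y_hat - x_query * np.sin(x_query)))
    print(f"sigma = {sigma}: max interpolation error = {max_err:.3f}")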


Example: Function Approximation with Gaussian Kernels (σ = 0.5)


Example: Function Approximation with Gaussian Kernels (σ = 2.1)


Example: Function Approximation with Gaussian Kernels (σ = 8.5)


Example: Comparison


Example: Remarks

A smaller width value of 0.5 does not seem to provide a good interpolation of the function between the sample data points.

A width value 2.1 provides a better result and the


approximation by RBF is close to the original curve.

This particular width value seems to provide the network with


the adequate interpolation property.

A larger width value of 8.5 seems inadequate for this particular case, given that a lot of information is lost when the ranges of the radial functions extend far beyond the original range of the function.


Advantages/Disadvantages

The unsupervised learning stage of an RBFN is not an easy task.

An RBF network trains faster than an MLP.

Another claimed advantage is that the hidden layer of an RBFN is easier to interpret than the hidden layer of an MLP.

Although the RBF network is quick to train, once trained it is slower to evaluate than an MLP, so where execution speed is a factor an MLP may be more appropriate.


Applications

Known to have universal approximation capabilities, good


local structures and efficient training algorithms, RBFN
have been often used for nonlinear mapping of complex
processes and for solving a wide range of classification
problems.

They have been used as well for control systems, audio and video signal processing, and pattern recognition.


Applications (cont.)

They have also been recently used for chaotic time series
prediction, with particular application to weather and power
load forecasting.

Generally, RBF networks have an undesirably high number of


hidden nodes, but the dimension of the space can be reduced
by careful planning of the network.


Kohonen’s Self-Organizing Network


Topology

The Kohonen’s Self-Organizing Network (KSON) belongs to


the class of unsupervised learning networks.

This means that the network, unlike supervised learning based networks, updates its weighting parameters without the need for performance feedback from a teacher or a network trainer.


Unsupervised Learning


Topology (cont.)

One major feature of this network is that the nodes distribute


themselves across the input space to recognize groups of
similar input vectors.

However, the output nodes compete among themselves to be


fired one at a time in response to a particular input vector.

This process is known as competitive learning.


Topology (cont.)

Two input vectors with similar pattern characteristics excite


two physically close layer nodes.

In other words, the nodes of the KSON can recognize groups


of similar input vectors.

This generates a topographic mapping of the input vectors to


the output layer, which depends primarily on the pattern of
the input vectors and results in dimensionality reduction of the
input space.


A Schematic Representation of a Typical KSOM


Learning

The learning here permits the clustering of input data into a


smaller set of elements having similar characteristics
(features).

It is based on the competitive learning technique also known


as the winner take all strategy.

Presume that the input pattern is given by the vector x.

Assume wij is the weight vector connecting the input elements


to an output node with coordinate provided by indices i and j.


Learning

Nc is defined as the neighborhood around the winning output


candidate.

Its size decreases at every iteration of the algorithm until


convergence occurs.


Steps of Learning Algorithm

Step 1: Initialize all weights to small random values. Set a


value for the initial learning rate α and a value for the
neighborhood Nc .

Step 2: Choose an input pattern x from the input data set.

Step 3: Select the winning unit c (the index of the best


matching output unit) such that the performance index I
given by the Euclidean distance from x to wij is minimized:

I = ‖x − wc‖ = min_{ij} ‖x − wij‖


Steps of Learning Algorithm (cont.)

Step 4: Update the weights according to the global network


updating phase from iteration k to iteration k + 1 as:
wij(k + 1) = wij(k) + α(k)[x − wij(k)]   if (i, j) ∈ Nc(k),
wij(k + 1) = wij(k)                      otherwise.

where α(k) is the adaptive learning rate (strictly positive value


smaller than the unity),
Nc (k) the neighborhood of the unit c at iteration k.


Steps of Learning Algorithm (cont.)

Step 5: The learning rate and the neighborhood are decreased


at every iteration according to an appropriate scheme.

For instance, Kohonen suggested a shrinking function in the


form of α(k) = α(0)(1 − k/T ), with T being the total
number of training cycles and α(0) the starting learning rate
bounded by one.
As for the neighborhood, several researchers suggested an initial region the size of half of the output grid, which shrinks according to an exponentially decaying behavior.

Step 6: The learning scheme continues until a sufficient number of iterations has been reached or until each output reaches a threshold of sensitivity to a portion of the input space.
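The whole procedure can be sketched compactly in NumPy (names and defaults are ours; the geometric learning-rate decay follows the worked example below rather than the linear Kohonen schedule). Started from the initial weights of that example with Nc = 0, it should reproduce the same single-epoch result:

import numpy as np

def train_kson(patterns, n_clusters=3, alpha0=0.3, decay=0.2, epochs=1, W=None, seed=0):
    """Winner-take-all Kohonen learning with neighborhood Nc = 0 (sketch).
    patterns: (p, n) input vectors; W: (n, n_clusters) weight matrix."""
    rng = np.random.default_rng(seed)
    n = patterns.shape[1]
    if W is None:
        W = rng.uniform(0.0, 0.5, size=(n, n_clusters))    # small random weights
    alpha = alpha0
    for _ in range(epochs):
        for x in patterns:
            dists = np.sum((W - x[:, None]) ** 2, axis=0)  # squared Euclidean distances
            c = int(np.argmin(dists))                      # winning unit
            W[:, c] += alpha * (x - W[:, c])               # update the winner only
        alpha *= decay                                     # shrink the learning rate
    return W

P = np.array([[1, 1, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
W0 = np.array([[0.2, 0.4, 0.1], [0.3, 0.2, 0.2], [0.5, 0.3, 0.5], [0.1, 0.1, 0.1]])
print(train_kson(P, W=W0.copy()))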


Example

A Kohonen self-organizing map is used to cluster four vectors


given by:

(1, 1, 1, 0),
(0, 0, 0, 1),
(1, 1, 0, 0),
(0, 0, 1, 1).

The maximum number of clusters to be formed is m = 3.


Example

Suppose the learning rate (geometric decreasing) is given by:

α(0) = 0.3,
α(t + 1) = 0.2α(t).

With only three clusters available and the weights of only one cluster updated at each step (i.e., Nc = 0), find the weight matrix. Use a single epoch of training.


Example: Structure of the Network


Example: Step 1

The initial weight matrix is:

W = [ 0.2  0.4  0.1
      0.3  0.2  0.2
      0.5  0.3  0.5
      0.1  0.1  0.1 ]
Initial radius: Nc = 0

Initial learning rate: α(0) = 0.3


Example: Repeat Steps 2-3 for Pattern 1

Step 2: For the first input vector (1, 1, 1, 0), do steps 3 - 5.

Step 3:
I(1) = (1 − 0.2)² + (1 − 0.3)² + (1 − 0.5)² + (0 − 0.1)² = 1.39
I(2) = (1 − 0.4)² + (1 − 0.2)² + (1 − 0.3)² + (0 − 0.1)² = 1.5
I(3) = (1 − 0.1)² + (1 − 0.2)² + (1 − 0.5)² + (0 − 0.1)² = 1.71

The input vector is closest to output node 1. Thus node 1 is


the winner. The weights for node 1 should be updated.


Example: Repeat Step 4 for Pattern 1

Step 4: weights on the winning unit are updated:

wnew(1) = wold(1) + α(x − wold(1))
        = (0.2, 0.3, 0.5, 0.1) + 0.3(0.8, 0.7, 0.5, 0.9)
        = (0.44, 0.51, 0.65, 0.37)

W = [ 0.44  0.4  0.1
      0.51  0.2  0.2
      0.65  0.3  0.5
      0.37  0.1  0.1 ]


Example: Repeat Steps 2-3 for Pattern 2

Step 2: For the second input vector (0, 0, 0, 1), do steps 3 - 5.

Step 3:

I(1) = (0 − 0.44)² + (0 − 0.51)² + (0 − 0.65)² + (1 − 0.37)² = 1.2731
I(2) = (0 − 0.4)² + (0 − 0.2)² + (0 − 0.3)² + (1 − 0.1)² = 1.1
I(3) = (0 − 0.1)² + (0 − 0.2)² + (0 − 0.5)² + (1 − 0.1)² = 1.11

The input vector is closest to output node 2. Thus node 2 is


the winner. The weights for node 2 should be updated.

Example: Repeat Step 4 for Pattern 2

Step 4: weights on the winning unit are updated:

wnew(2) = wold(2) + α(x − wold(2))
        = (0.4, 0.2, 0.3, 0.1) + 0.3(−0.4, −0.2, −0.3, 0.9)
        = (0.28, 0.14, 0.21, 0.37)

W = [ 0.44  0.28  0.1
      0.51  0.14  0.2
      0.65  0.21  0.5
      0.37  0.37  0.1 ]


Example: Repeat Steps 2-3 for Pattern 3

Step 2: For the third input vector (1, 1, 0, 0), do steps 3 - 5:

Step 3:

I(1) = (1 − 0.44)² + (1 − 0.51)² + (0 − 0.65)² + (0 − 0.37)² = 1.1131
I(2) = (1 − 0.28)² + (1 − 0.14)² + (0 − 0.21)² + (0 − 0.37)² = 1.439
I(3) = (1 − 0.1)² + (1 − 0.2)² + (0 − 0.5)² + (0 − 0.1)² = 1.71

The input vector is closest to output node 1. Thus node 1 is


the winner. The weights for node 1 should be updated.


Example: Repeat Step 4 for Pattern 3

Step 4: weights on the winning unit are updated:

wnew(1) = wold(1) + α(x − wold(1))
        = (0.44, 0.51, 0.65, 0.37) + 0.3(0.56, 0.49, −0.65, −0.37)
        = (0.608, 0.657, 0.455, 0.259)

W = [ 0.608  0.28  0.1
      0.657  0.14  0.2
      0.455  0.21  0.5
      0.259  0.37  0.1 ]


Example: Repeat Steps 2-3 for Pattern 4

Step 2: For the fourth input vector (0, 0, 1, 1), do steps 3 - 5:

Step 3:

I(1) = (0 − 0.608)² + (0 − 0.657)² + (1 − 0.455)² + (1 − 0.259)² = 1.647419
I(2) = (0 − 0.28)² + (0 − 0.14)² + (1 − 0.21)² + (1 − 0.37)² = 1.119
I(3) = (0 − 0.1)² + (0 − 0.2)² + (1 − 0.5)² + (1 − 0.1)² = 1.11

The input vector is closest to output node 3. Thus node 3 is


the winner. The weights for node 3 should be updated.


Example: Repeat Step 4 for Pattern 4

Step 4: weights on the winning unit are updated:

wnew(3) = wold(3) + α(x − wold(3))
        = (0.1, 0.2, 0.5, 0.1) + 0.3(−0.1, −0.2, 0.5, 0.9)
        = (0.07, 0.14, 0.65, 0.37)

W = [ 0.608  0.28  0.07
      0.657  0.14  0.14
      0.455  0.21  0.65
      0.259  0.37  0.37 ]


Example: Step 5

Epoch 1 is complete.

Reduce the learning rate:


α(t + 1) = 0.2α(t) = 0.2(0.3) = 0.06

Repeat from the start for new epochs until ∆wj becomes
steady for all input patterns or the error is within a tolerable
range.


Applications

A Variety of KSONs could be applied to different applications


using the different parameters of the network, which are:

Neighborhood size,
Shape (circular, square, diamond),
Learning rate decaying behavior, and
Dimensionality of the neuron array (1-D, 2-D or n-D).


Applications (cont.)

Given their self-organizing capabilities based on the


competitive learning rule, KSONs have been used extensively
for clustering applications such as

Speech recognition,
Vector coding,
Robotics applications, and
Texture segmentation.


Hopfield Network


Recurrent Topology


Origin

A very special and interesting case of the recurrent topology.

It is the pioneering work of Hopfield in the early 1980’s that


led the way for the design of neural networks with feedback paths and dynamics.

The work of Hopfield is seen by many as the starting point for


the implementation of associative (content addressable)
memory by using a special structure of recurrent neural
networks.


Associative Memory Concept

The associative memory concept is able to recognize newly


presented (noisy or incomplete) patterns using an already
stored ’complete’ version of that pattern.

We say that the new pattern is ‘attracted’ to the stable


pattern already stored in the network memories.

This could be stated as having the network represented by an


energy function that keeps decreasing until the system has
reached stable status.


General Structure of the Hopfield Network

The structure of Hopfield network is made up of a number of


processing units configured in one single layer (besides the input
and the output layers) with symmetrical synaptic connections; i.e.,

wij = wji


General Structure of the Hopfield Network (cont.)


Hopfield Network: Alternative Representations


Network Formulation

In the original work of Hopfield, the output of each unit can


take a binary value (either 0 or 1) or a bipolar value (either -1
or 1).

This value is fed back to all the input units of the network
except to the one corresponding to that output.

Let us suppose here that the state of the network with


dimension n (n neurons) takes bipolar values.


Network Formulation: Activation Function

The activation rule for each neuron is provided by the


following:
oi = sign( Σ_{j=1}^{n} wij oj − θi ) =
    +1 if Σ_{j≠i} wij oj > θi
    −1 if Σ_{j≠i} wij oj < θi

oi : the output of the current processing unit (Hopfield neuron)

θi : threshold value


Network Formulation: Energy Function

An energy function for the network


E = −1/2 Σ_{i≠j} wij oi oj + Σi oi θi

E is so defined as to decrease monotonically with variation of


the output states until a minimum is attained.


Network Formulation: Energy Function (cont.)

This could be readily noticed from the expression relating the


variation of E with respect to the output states variation.
∆E = −∆oi ( Σ_{i≠j} wij oj − θi )

This expression shows that the energy function E of the


network continues to decrease until it settles by reaching a
local minimum.


Transition of Patterns from High Energy Levels to Lower


Energy Levels


Hebbian Learning

The learning algorithm for the Hopfield network is based on


the so called Hebbian learning rule.

This is one of the earliest procedures designed for carrying out


supervised learning.

It is based on the idea that when two units are simultaneously


activated, the increase in their interconnection weight is proportional to the product of their two activities.


Hebbian Learning (cont.)


The Hebbian learning rule also known as the outer product
rule of storage, as applied to a set of q presented patterns
pk (k = 1, ..., q) each with dimension n (n denotes the number
of neuron units in the Hopfield network), is expressed as:
wij = (1/n) Σ_{k=1}^{q} pkj pki   if i ≠ j
wij = 0                           if i = j

The weight matrix W = {wij } could also be expressed in


terms of the outer product of the vector pk as:

W = {wij} = (1/n) Σ_{k=1}^{q} pk pkᵀ − (q/n) I

Learning Algorithm

Step 1 (storage): The first stage is to store the patterns


through establishing the connection weights. Each of the q
fundamental memories presented is a vector of bipolar
elements (+1 or -1).

Step 2 (initialization): The second stage is initialization and


consists in presenting to the network an unknown pattern u
with the same dimension as the fundamental patterns.
Every component of the network outputs at the initial
iteration cycle is set as

o(0) = u



Learning Algorithm (cont.)

Step 3 (retrieval 1): Each one of the components oi of the output vector o is updated from cycle l to cycle l + 1 by:

oi(l + 1) = sgn( Σ_{j=1}^{n} wij oj(l) )

This process is known as asynchronous updating.


The process continues until no more changes are made and
convergence occurs.

Step 4 (retrieval 2): Continue the process for other presented


unknown patterns by starting again from step 2.
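A minimal sketch of these steps (function names are ours; all thresholds are taken as zero, and the 1/n scaling of the weights does not change the sign-based updates):

import numpy as np

def hopfield_store(patterns):
    """Step 1 (storage): W = (1/n) sum_k p_k p_k^T - (q/n) I for bipolar patterns."""
    P = np.asarray(patterns, dtype=float)      # shape (q, n)
    q, n = P.shape
    return (P.T @ P) / n - (q / n) * np.eye(n)

def hopfield_retrieve(W, u, max_cycles=100):
    """Steps 2-3: present the unknown pattern u and update each component
    asynchronously with the sign rule until no more changes occur."""
    o = np.asarray(u, dtype=float).copy()
    for _ in range(max_cycles):
        changed = False
        for i in range(len(o)):
            s = W[i] @ o
            new = o[i] if s == 0 else (1.0 if s > 0 else -1.0)  # keep the state on a tie
            if new != o[i]:
                o[i], changed = new, True
        if not changed:
            break
    return o

For instance, hopfield_retrieve(hopfield_store([[1, 1, 1, -1]]), [-1, -1, 1, -1]) recovers the stored pattern [1, 1, 1, -1], mirroring the state-transition example given later.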


Example

Problem Statement

We need to store a fundamental pattern (memory) given


by the vector O = [1, 1, 1, −1]ᵀ in a four-node binary Hopfield network.

Presume that the threshold parameters are all equal to zero.


Establishing Connection Weights

Weight matrix expression, discarding 1/4 and having q = 1:

W = (1/n) Σ_{k=1}^{q} pk pkᵀ − (q/n) I = p1 p1ᵀ − I

Therefore:

W = p1 p1ᵀ − I = [  0   1   1  −1
                    1   0   1  −1
                    1   1   0  −1
                   −1  −1  −1   0 ]
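The same matrix can be checked in a couple of NumPy lines (illustration only):

import numpy as np

p1 = np.array([1, 1, 1, -1], dtype=float)
W = np.outer(p1, p1) - np.eye(4)   # outer-product rule with the 1/4 factor discarded
print(W)
# [[ 0.  1.  1. -1.]
#  [ 1.  0.  1. -1.]
#  [ 1.  1.  0. -1.]
#  [-1. -1. -1.  0.]]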


Network’ States and Their Code

Total number of states: There are 2n = 24 = 16 different states.

State Code State Code


A 1 1 1 1 I -1 -1 1 1
B 1 1 1 -1 J -1 -1 1 -1
C 1 1 -1 -1 K -1 -1 -1 -1
D 1 1 -1 1 L -1 -1 -1 1
E 1 -1 -1 1 M -1 1 -1 1
F 1 -1 -1 -1 N -1 1 -1 -1
G 1 -1 1 -1 O -1 1 1 -1
H 1 -1 1 1 P -1 1 1 1


Computing Energy Level of State A = [1, 1, 1, 1]

All thresholds are equal to zero: θi = 0, i = 1, 2, 3, 4·


Therefore,
E = −1/2 Σ_{i=1}^{4} Σ_{j=1}^{4} wij oi oj

E = −1/2(w11 o1 o1 + w12 o1 o2 + w13 o1 o3 + w14 o1 o4 +


w21 o2 o1 + w22 o2 o2 + w23 o2 o3 + w24 o2 o4 +
w31 o3 o1 + w32 o3 o2 + w33 o3 o3 + w34 o3 o4 +
w41 o4 o1 + w42 o4 o2 + w43 o4 o3 + w44 o4 o4 )


Computing Energy Level of State A (cont.)

For state A, we have A = [o1 , o2 , o3 , o4 ] = [1, 1, 1, 1]· Thus,

E = −1/2(0 + (1)(1)(1) + (1)(1)(1) + (−1)(1)(1)+


(1)(1)(1) + 0 + (1)(1)(1) + (−1)(1)(1)+
(1)(1)(1) + (1)(1)(1) + 0 + (−1)(1)(1)+
(−1)(1)(1) + (−1)(1)(1) + (−1)(1)(1) + 0)
E = −1/2(0 + 1 + 1 − 1+
1 + 0 + 1 − 1+
1 + 1 + 0 − 1+
− 1 − 1 − 1 + 0)
E = −1/2(6 − 6) = 0
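These energy values are easy to verify numerically (a sketch; thresholds are zero here, and the function name is ours):

import numpy as np

def hopfield_energy(W, o, theta=None):
    """E = -1/2 sum_{i,j} w_ij o_i o_j + sum_i o_i theta_i (with w_ii = 0)."""
    o = np.asarray(o, dtype=float)
    theta = np.zeros_like(o) if theta is None else np.asarray(theta, dtype=float)
    return -0.5 * o @ W @ o + o @ theta

W = np.outer([1, 1, 1, -1], [1, 1, 1, -1]) - np.eye(4)
print(hopfield_energy(W, [1, 1, 1, 1]))    # state A ->  0.0
print(hopfield_energy(W, [1, 1, 1, -1]))   # state B -> -6.0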


Energy Level of All States

Similarly, we can compute the


energy level of the other states.

Two potential attractors: the


original fundamental pattern
[1, 1, 1, −1]T and its
complement [−1, −1, −1, 1]T .


Retrieval Stage

We update the components of each state asynchronously


using equation:

oi = sgn( Σ_{j=1}^{n} wij oj − θi )

Updating the state asynchronously means that for every state


presented we activate one neuron at a time.

All states change from high energy to low energy levels.


State Transition for State J = [−1, −1, 1, −1]T


Transition 1 (o1 )

o1 = sgn( Σ_{j=1}^{4} w1j oj − θ1 ) = sgn(w12 o2 + w13 o3 + w14 o4 − 0)

= sgn((1)(−1) + (1)(1) + (−1)(−1))


= sgn(+1)
= +1

As a result, the first component of the state J changes from


−1 to 1. In other words, the state J transits to the state G at
the end of first transition.
J = [−1, −1, 1, −1]T (2) → G = [1, −1, 1, −1]T (0)

State Transition for State J (cont.)


Transition 2 (o2 )

o2 = sgn( Σ_{j=1}^{4} w2j oj − θ2 ) = sgn(w21 o1 + w23 o3 + w24 o4)

= sgn((1)(1) + (1)(1) + (−1)(−1))


= sgn(+3)
= +1

As a result, the second component of the state G changes


from −1 to 1. In other words, the state G transits to the
state B at the end of the second transition.
G = [1, −1, 1, −1]T (0) → B = [1, 1, 1, −1]T (−6)

State Transition for State J (cont.)


Transition 3 (o3 )
As state B is a fundamental pattern, no more transition will occur.
Let us see!
o3 = sgn( Σ_{j=1}^{4} w3j oj − θ3 ) = sgn(w31 o1 + w32 o2 + w34 o4)

= sgn((1)(1) + (1)(1) + (−1)(−1))


= sgn(+3)
= +1

No transition is observed.
B = [1, 1, 1, −1]T (−6) → B = [1, 1, 1, −1]T (−6)

State Transition for State J (cont.)


Transition 4 (o4 )
Again as state B is a fundamental pattern, no more transition will
occur. Let us see!
o4 = sgn( Σ_{j=1}^{4} w4j oj − θ4 ) = sgn(w41 o1 + w42 o2 + w43 o3)

= sgn((−1)(1) + (−1)(1) + (−1)(1))


= sgn(−3)
= −1

No transition is observed.
B = [1, 1, 1, −1]T (−6) → B = [1, 1, 1, −1]T (−6)

Asynchronous State Transition Table


By repeating the same procedure for the other states,
the asynchronous transition table is easily obtained.


Some Sample Transitions

Fundamental Pattern B = [1, 1, 1, −1]T

There is no change of the energy level and no transition


occurs to any other state.

It is in its stable state because this state has the lowest energy.

State A = [1, 1, 1, 1]T

Only the fourth element o4 is updated asynchronously.

The state transits to O = [1, 1, 1, −1]T , representing the


fundamental pattern with the lowest energy value ”-6”.


Some Sample Transitions (cont.)

Complement of Fundamental Pattern L = [−1, −1, −1, 1]T

Its energy level is the same as B and hence it is another stable


state.

Every complement of a fundamental pattern is a


fundamental pattern itself.

This means that the Hopfield network has the ability to


remember the fundamental memory and its complement.


Some Sample Transitions (cont.)


State D = [1, 1, −1, 1]T
It could transit a few times to end up at state C after being
updated asynchronously.

Update the bit o1 , the state becomes M = [−1, 1, −1, 1]T


with energy 0

Update the bit o2 , the state becomes E = [1, −1, −1, 1]T
with energy 0

Update the bit o3 , the state becomes A = [1, 1, 1, 1]T , the


state A with energy 0

Update the bit o4 , the state becomes C = [1, 1, −1, −1]T


with energy 0

Some Sample Transitions (cont.)

State D: Remarks

From the process we know that state D can transit to four


different states.
This depends on which bit is being updated.
If the state D transits to state A or C , it will continue the
updating and ultimately transits to the fundamental state B,
which has the energy −6, the lowest energy.
If the state D transits to state E or M, it will continue the
updating and ultimately transits to state L, which also has the
lowest energy −6.


Transition of States J and N from High Energy Levels to


Low Energy Levels


State Transition Diagram


Each node is characterized by its vector state and its energy
level.


Applications

Information retrieval, and pattern and speech recognition,

Optimization problems,

Combinatorial optimization problems such as the traveling


salesman problem.


Limitations

Limited stable-state storage capacity of the network,

Hopfield estimated roughly that a network with n processing


units should allow for 0.15n stable states.

Many studies have been carried out recently to increase the capacity of the network without greatly increasing the number of processing units.
