
Neural Networks

Radial Basis Function Networks


Laxmidhar Behera

Department of Electrical Engineering


Indian Institute of Technology, Kanpur


Radial Basis Function Networks


A Radial Basis Function Network (RBFN) consists of 3 layers:
    an input layer
    a hidden layer
    an output layer
The hidden units provide a set of functions that constitute
an arbitrary basis for the input patterns.
    The hidden units are known as radial centers and are
    represented by the vectors c1, c2, ..., ch.
    The transformation from input space to hidden-unit space
    is nonlinear, whereas the transformation from hidden-unit
    space to output space is linear.
    The dimension of each center for a p-input network is p × 1.


Network Architecture
[Figure 1: Radial Basis Function Network. Inputs x1, x2, ..., xp (input
layer) feed a hidden layer of radial basis functions with centers
c1, c2, ..., ch (outputs φ_1, φ_2, ..., φ_h), which connect through
weights w_1, w_2, ..., w_h to the output layer.]


Radial Basis functions


The radial basis functions in the hidden layer produce a
significant non-zero response only when the input falls
within a small localized region of the input space.
Each hidden unit has its own receptive field in the input space.
An input vector x that lies in the receptive field of center
c_j activates c_j, and by proper choice of weights the target
output is obtained. The output is given as

    y = Σ_{j=1}^{h} φ_j w_j ,   φ_j = φ(‖x − c_j‖)

where w_j is the weight of the j-th center and φ is some radial function.
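A minimal NumPy sketch of this forward pass, assuming Gaussian radial
functions (the function and variable names below are illustrative, not
from the lecture):

    import numpy as np

    def rbfn_output(x, centers, weights, sigma=1.0):
        """Forward pass of an RBFN: y = sum_j w_j * phi(||x - c_j||)."""
        # z_j = Euclidean distance from the input to each radial center
        z = np.linalg.norm(centers - x, axis=1)
        # Gaussian radial function phi(z) = exp(-z^2 / (2 sigma^2))
        phi = np.exp(-z**2 / (2 * sigma**2))
        return phi @ weights

    # Example: h = 3 centers in a p = 2 dimensional input space
    centers = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.0]])
    weights = np.array([0.5, -0.2, 1.0])
    print(rbfn_output(np.array([0.9, 1.1]), centers, weights))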


Contd...
Different radial functions are given as follows:
    Gaussian radial function:  φ(z) = e^{−z²/2σ²}
    Thin plate spline:         φ(z) = z² log z
    Quadratic:                 φ(z) = (z² + r²)^{1/2}
    Inverse quadratic:         φ(z) = 1/(z² + r²)^{1/2}
Here z = ‖x − c_j‖.
The most popular radial function is the Gaussian
activation function.
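These four functions can be written directly in NumPy; a sketch
(treating z² log z as 0 at z = 0, its limit, is my own choice):

    import numpy as np

    # Radial functions applied to z = ||x - c_j||.
    # r is the shape parameter of the (inverse) quadratic functions.
    def gaussian(z, sigma=1.0):
        return np.exp(-z**2 / (2 * sigma**2))

    def thin_plate_spline(z):
        # z^2 log z, defined as 0 at z = 0 (the limit of z^2 log z)
        return np.where(z > 0, z**2 * np.log(np.maximum(z, 1e-12)), 0.0)

    def quadratic(z, r=1.0):
        return np.sqrt(z**2 + r**2)

    def inverse_quadratic(z, r=1.0):
        return 1.0 / np.sqrt(z**2 + r**2)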


RBFN vs. Multilayer Network


RBF NET                                  MULTILAYER NET

It has a single hidden layer.            It has multiple hidden layers.

The basic neuron model as well as        The computational nodes of all
the function of the hidden layer is      the layers are similar.
different from that of the output
layer.

The hidden layer is nonlinear but        All the layers are nonlinear.
the output layer is linear.


Contd...
RBF NET                                  MULTILAYER NET

The activation function of a hidden      The activation function computes
unit computes the Euclidean distance     the inner product of the input
between the input vector and the         vector and the weight vector of
center of that unit.                     that unit.

Establishes a local mapping, hence       Constructs global approximations
capable of fast learning.                to the I/O mapping.

Two-fold learning: both the centers      Only the synaptic weights have
(position and spread) and the            to be learned.
weights have to be learned.


Learning in RBFN
Training of an RBFN requires optimal selection of the
parameter vectors c_i and w_i, i = 1, ..., h.
Both layers are optimized using different techniques
and on different time scales.
The following techniques are used to update the weights
and centers of an RBFN:
    Pseudo-Inverse Technique (Off-line)
    Gradient Descent Learning (On-line)
    Hybrid Learning (On-line)


Pseudo-Inverse Technique
This is a least-squares problem. Assume fixed radial
basis functions, e.g. Gaussian functions.
The centers are chosen randomly. The function is
normalized, i.e. for any x, Σ_i φ_i = 1.
The standard deviation (width) σ of the radial function is
determined by an ad hoc choice.
The learning steps are as follows:
1. The width is fixed according to the spread of the centers:

       φ_i = e^{−(h/d²) ‖x − c_i‖²} ,   i = 1, 2, ..., h

   where h: number of centers, d: maximum distance
   between the chosen centers. Thus σ = d/√(2h).

Contd...
2. From Figure 1, Φ = [φ_1, φ_2, ..., φ_h],
   w = [w_1, w_2, ..., w_h]^T, and

       Φ w = y_d ,   where y_d is the desired output.

3. The required weight vector is computed as

       w = (Φ^T Φ)^{−1} Φ^T y_d = Φ† y_d

   where Φ† = (Φ^T Φ)^{−1} Φ^T is the pseudo-inverse of Φ.
   This is possible only when Φ^T Φ is non-singular. If it
   is singular, singular value decomposition is used to
   solve for w.
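A NumPy sketch of steps 1-3 (the names are illustrative); np.linalg.pinv
computes the pseudo-inverse via singular value decomposition, so it also
covers the singular case noted above:

    import numpy as np

    def pseudo_inverse_train(X, y_d, centers):
        """Solve for RBFN weights w with fixed centers (one pass, off-line)."""
        h = len(centers)
        # d: maximum distance between the chosen centers
        d = max(np.linalg.norm(ci - cj) for ci in centers for cj in centers)
        # Width fixed by the spread of the centers: sigma = d / sqrt(2h),
        # which is the same as using exp(-(h/d^2) ||x - c_i||^2) directly
        sigma = d / np.sqrt(2 * h)
        # Phi[k, i] = exp(-(h/d^2) ||x_k - c_i||^2)
        Z = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        Phi = np.exp(-(h / d**2) * Z**2)
        # w = pinv(Phi) y_d; pinv uses SVD, so it works even when
        # Phi^T Phi is singular
        return np.linalg.pinv(Phi) @ y_d, sigma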


Illustration: EX-NOR problem


The truth table and the RBFN architecture are given below:

    x1   x2   y_d
    0    0    1
    1    0    0
    0    1    0
    1    1    1

[Architecture: inputs x1 and x2 feed two radial centers c1 (output φ_1)
and c2 (output φ_2); these connect through weights w_1 and w_2, together
with a bias input of +1, to the output unit.]

The choice of centers is made randomly from the 4 input patterns:

    c1 = [0 0]^T and c2 = [1 1]^T

    φ_1 = φ(‖x − c1‖) = e^{−‖x − c1‖²} ,   x = [x1 x2]^T

Similarly, φ_2 = e^{−‖x − c2‖²}.

Contd...
Output: y = w_1 φ_1 + w_2 φ_2 + θ, where θ is the bias weight.
Applying the 4 training patterns one after another:

    w_1 + w_2 e^{−2} + θ = 1
    w_1 e^{−1} + w_2 e^{−1} + θ = 0
    w_1 e^{−1} + w_2 e^{−1} + θ = 0
    w_1 e^{−2} + w_2 + θ = 1

In matrix form Φ w = y_d :

        | 1       0.1353  1 |         | 1 |         | w_1 |
    Φ = | 0.3679  0.3679  1 | , y_d = | 0 | ,   w = | w_2 |
        | 0.3679  0.3679  1 |         | 0 |         |  θ  |
        | 0.1353  1       1 |         | 1 |

Using w = Φ† y_d, we get w = [2.5031  2.5031  −1.848]^T.

Gradient Descent Learning


One of the most popular approaches to update c and w, is
supervised training by error correcting term which is
achieved by a gradient descent technique. The update rule
for center learning is
E
cij (t + 1) = cij (t) 1
, for i = 1 to p, j = 1 to h
cij
the weight update law is
E
wi (t + 1) = wi (t) 2
wi
P d
1
where the cost function is E = 2 (y y)2


Contd...
The actual response is

    y = Σ_{i=1}^{h} φ_i w_i

and the activation function is taken as

    φ_i = e^{−z_i²/2σ²}

where z_i = ‖x − c_i‖ and σ is the width of the center.
Differentiating E w.r.t. w_i, we get

    ∂E/∂w_i = (∂E/∂y)(∂y/∂w_i) = −(y_d − y) φ_i

Contd...
Differentiating E w.r.t. c_ij, we get

    ∂E/∂c_ij = (∂E/∂y)(∂y/∂φ_i)(∂φ_i/∂z_i)(∂z_i/∂c_ij)
             = −(y_d − y) w_i (∂φ_i/∂z_i)(∂z_i/∂c_ij)

Now, since φ_i = e^{−z_i²/2σ²} and z_i = (Σ_j (x_j − c_ij)²)^{1/2},

    ∂φ_i/∂z_i = −(z_i/σ²) φ_i   and   ∂z_i/∂c_ij = −(x_j − c_ij)/z_i

Contd...
After simplification, the update rule for center learning is:

    c_ij(t + 1) = c_ij(t) + η_1 (y_d − y) w_i (φ_i/σ²) (x_j − c_ij)

The update rule for the linear weights is:

    w_i(t + 1) = w_i(t) + η_2 (y_d − y) φ_i
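A sketch of one on-line training step implementing these two update
rules for a single-output network (the names are illustrative):

    import numpy as np

    def gd_step(x, y_d, centers, w, sigma, eta1=0.3, eta2=0.2):
        """One gradient-descent update of centers and weights (single output)."""
        z = np.linalg.norm(centers - x, axis=1)   # z_i = ||x - c_i||
        phi = np.exp(-z**2 / (2 * sigma**2))      # Gaussian activations
        y = phi @ w                               # network output
        e = y_d - y                               # error term (y_d - y)
        # c_ij <- c_ij + eta1 * e * w_i * (phi_i / sigma^2) * (x_j - c_ij)
        centers += eta1 * e * (w * phi / sigma**2)[:, None] * (x - centers)
        # w_i <- w_i + eta2 * e * phi_i
        w += eta2 * e * phi
        return centers, w, e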


Example: System identification


The same Surge Tank system has been taken for
simulation. The system model is given as

    h(t + 1) = h(t) + T ( −√(2g h(t)) / √(3h(t) + 1) + u(t) / √(3h(t) + 1) )

    t    : discrete time step
    T    : sampling time
    u(t) : input flow, can be positive or negative
    h(t) : liquid level of the tank (output)
    g    : the gravitational acceleration

Data generation
The sampling time is taken as 0.01 sec, and 150 data points have been
generated using the system equation. The nature of the input
u(t) and output h(t) is shown in the following figure.

[Figure: I/O data — input u(t) and output h(t) versus time step (0 to 150).]

Contd...
The system is identified from the input-output data
using a radial basis function network.
Network parameters are given below:

    Number of inputs                 : 2  [u(t), h(t)]
    Number of outputs                : 1  [target: h(t + 1)]
    Units in the hidden layer        : 30
    Number of I/O data               : 150
    Radial basis function            : Gaussian
    Width of the radial function (σ) : 0.707
    Center learning rate (η_1)       : 0.3
    Weight learning rate (η_2)       : 0.2

Result: Identification
After identification, the root mean square error is
found to be < 0.007. The convergence of the mean square error is
shown in the following figure.

[Figure: Convergence plot — mean square error (0 to 0.4) versus
epochs (0 to 1000).]

Model Validation
After identification, the model is validated on a set of
100 input-output data points that is different from the data set
used for training. The result is shown in the following figure,
where the left panel shows the desired and network outputs
and the right panel shows the corresponding input.

[Figure: Left — desired and actual outputs (y_d, y_a) versus time step
(0 to 100); Right — input u(t) versus time step (0 to 100).]

Hybrid Learning
In hybrid learning, the radial basis functions relocate their
centers in a self-organized manner while the weights are
updated using supervised learning.
When a pattern is presented to the RBFN, either a new center
is grown (if the pattern is sufficiently novel) or the parameters
in both layers are updated using gradient descent.
The test of novelty depends on two criteria:
    Is the Euclidean distance between the input pattern
    and the nearest center greater than a threshold δ(t)?
    Is the mean square error at the output greater than the
    desired accuracy?
A new center is allocated when both criteria are satisfied, as
sketched in the code below.
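A sketch of this allocation test; the threshold schedule is left to the
caller, and initializing the new unit's weight to the current error
follows the common resource-allocating-network convention (an
assumption here, not stated on the slides):

    import numpy as np

    def maybe_grow_center(x, error, centers, weights, delta, e_min):
        """Allocate a new center when x is novel AND the error is large."""
        dist = np.min(np.linalg.norm(centers - x, axis=1))  # nearest center
        if dist > delta and abs(error) > e_min:
            # Grow a new unit at the input itself; its weight absorbs the error
            centers = np.vstack([centers, x])
            weights = np.append(weights, error)
            return centers, weights, True
        return centers, weights, False  # otherwise update both layers as usual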


Contd...
Find the center that is closest to x in terms of Euclidean
distance. This particular center is updated as follows:

    c_i(t + 1) = c_i(t) + α (x − c_i(t))

Thus the center moves closer to x.
While the centers are updated using unsupervised learning,
the weights can be updated using the least mean squares
(LMS) or recursive least squares (RLS) algorithm. We
present the RLS algorithm here.
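A sketch of this self-organized update, with α the unsupervised
learning rate:

    import numpy as np

    def update_nearest_center(x, centers, alpha=0.5):
        """Move the center closest to x a fraction alpha towards x."""
        i = np.argmin(np.linalg.norm(centers - x, axis=1))  # nearest center
        centers[i] += alpha * (x - centers[i])  # c_i <- c_i + alpha (x - c_i)
        return centers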


Contd...
The i-th output of the RBFN can be written as

    x_i = Φ^T θ_i ,   i = 1, ..., n

where Φ ∈ R^l is the output vector of the hidden layer and
θ_i ∈ R^l is the connection weight vector from the hidden
units to the i-th output unit. The weight update law is:

    θ_i(t + 1) = θ_i(t) + P(t + 1) Φ(t) [x_i(t + 1) − Φ^T(t) θ_i(t)]
    P(t + 1) = P(t) − P(t) Φ(t) [1 + Φ^T(t) P(t) Φ(t)]^{−1} Φ^T(t) P(t)

where P(t) ∈ R^{l×l}. This algorithm is more accurate
and faster than the LMS algorithm.
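A sketch of one RLS step for a single output unit, following the two
equations above (initializing P to a large multiple of the identity is
a common convention, not from the slides):

    import numpy as np

    def rls_step(theta, P, phi, target):
        """One recursive-least-squares update of the output weights theta."""
        # P(t+1) = P(t) - P phi [1 + phi^T P phi]^{-1} phi^T P
        Pphi = P @ phi
        P_new = P - np.outer(Pphi, Pphi) / (1.0 + phi @ Pphi)
        # theta(t+1) = theta(t) + P(t+1) phi [x(t+1) - phi^T theta(t)]
        theta_new = theta + P_new @ phi * (target - phi @ theta)
        return theta_new, P_new

    # Typical initialization: theta = np.zeros(l), P = 1e4 * np.eye(l)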


Example: System Identification


The same surge tank model is identified with the
same input-output data set. The parameters of the
RBFN are also the same as those of the gradient
descent technique.
Center training is done using unsupervised K-means
clustering with a learning rate of 0.5.
Weight training is done using the gradient descent
method with a learning rate of 0.1.
After identification, the root mean square error is found
to be < 0.007.


Contd...
The mapping of the input data and the centers is shown in the
following figure. The left panel shows how the centers are
initialized, and the right panel shows the spreading of the
centers towards the nearest inputs after learning.

[Figure: Input points and centers before training (left) and after
training (right).]

Comparison of Results
Schemes                     RMS error   No. of iterations
Back-propagation            0.00822     5000
RBFN (Gradient Descent)     0.00825     2000
RBFN (Hybrid Learning)      0.00823     1000

It is seen from the table that an RBFN learns faster than
a simple feedforward network.
