Radial-Basis Function Networks

Radial basis function (RBF) networks have a hidden layer with radial basis activation functions that produce outputs depending on the distance between the input and stored vector centers. Each hidden neuron acts as a local receptor most sensitive to nearby inputs. The hidden outputs are combined linearly in the output node. Parameters like center locations, spreads, and weights are learned from data using algorithms that may select centers randomly and set spreads based on maximum inter-center distances.


Radial-Basis Function Networks (RBF)

• A function is a radial basis function (RBF) if its output depends on (is a non-increasing function of) the distance of the input from a given stored vector.
• RBFs represent local receptors, as illustrated below, where each point is a stored vector used in one RBF.
• In an RBF network, one hidden layer uses neurons with RBF activation functions describing local receptors. One output node is then used to combine linearly the outputs of the hidden neurons.
[Figure: three stored vectors with weights w1, w2, w3 and a query point P.]

The vector P is "interpolated" using the three vectors; each vector gives a contribution that depends on its weight and on its distance from the point P. In the picture we have w1 < w3 < w2.
RBF ARCHITECTURE
[Figure: network with inputs x1, ..., xm, hidden RBF units ϕ1, ..., ϕm1, hidden-to-output weights w1, ..., wm1, and a single output y.]

• One hidden layer with RBF activation functions ϕ1, ..., ϕm1.
• Output layer with a linear activation function:

y = w_1 \varphi_1(\lVert x - t_1 \rVert) + \ldots + w_{m_1} \varphi_{m_1}(\lVert x - t_{m_1} \rVert)

where \lVert x - t \rVert is the distance of x = (x_1, \ldots, x_m) from the vector t.
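To make the output formula concrete, here is a minimal sketch of the forward pass with Gaussian hidden units (not from the original slides; the function and argument names are illustrative assumptions):

```python
import numpy as np

def rbf_forward(x, centers, weights, spread):
    """y = w_1 phi_1(||x - t_1||) + ... + w_m1 phi_m1(||x - t_m1||).

    x       : input vector, shape (m,)
    centers : hidden-unit centers t_i, shape (m1, m)
    weights : hidden-to-output weights w_i, shape (m1,)
    spread  : Gaussian spread sigma, shared by all hidden units (an assumption)
    """
    dists = np.linalg.norm(centers - x, axis=1)       # ||x - t_i|| for every center
    phi = np.exp(-dists**2 / (2.0 * spread**2))       # Gaussian RBF activations
    return phi @ weights                              # linear output layer
```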
HIDDEN NEURON MODEL

• Hidden units use radial basis functions:

\varphi_\sigma(\lVert x - t \rVert): the output depends on the distance of the input x from the center t.

[Figure: a hidden neuron receiving inputs x1, ..., xm and computing ϕσ(||x − t||).]

• t is called the center.
• σ is called the spread.
• Center and spread are parameters.

HIDDEN NEURON MODEL

• A hidden neuron is more sensitive to data points near its center.

• For Gaussian RBFs this sensitivity may be tuned by adjusting the spread σ, where a larger spread implies less sensitivity.

Gaussian RBF φ

[Figure: plot of the Gaussian RBF φ, peaked at its center.]

σ is a measure of how spread out the curve is:

[Figure: a large σ gives a wide, flat curve; a small σ gives a narrow, peaked curve.]

Interpolation with RBF

The interpolation problem:

Given a set of N different points \{ x_i \in \mathbb{R}^m,\ i = 1, \ldots, N \} and a set of N real numbers \{ d_i \in \mathbb{R},\ i = 1, \ldots, N \}, find a function F : \mathbb{R}^m \to \mathbb{R} that satisfies the interpolation condition F(x_i) = d_i.

If F(x) = \sum_{i=1}^{N} w_i \varphi(\lVert x - x_i \rVert), we have:

\begin{bmatrix} \varphi(\lVert x_1 - x_1 \rVert) & \cdots & \varphi(\lVert x_1 - x_N \rVert) \\ \vdots & & \vdots \\ \varphi(\lVert x_N - x_1 \rVert) & \cdots & \varphi(\lVert x_N - x_N \rVert) \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} d_1 \\ \vdots \\ d_N \end{bmatrix} \;\Rightarrow\; \Phi w = d
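As an illustration (not part of the original slides), the interpolation system Φw = d can be solved directly once φ is fixed; the sketch below assumes a Gaussian φ and uses numpy:

```python
import numpy as np

def exact_rbf_interpolation(X, d, sigma=1.0):
    """Solve Phi w = d with one Gaussian RBF centered on every data point.

    X : the N data points x_i, shape (N, m)
    d : the N target values d_i, shape (N,)
    """
    diff = X[:, None, :] - X[None, :, :]                        # pairwise x_j - x_i
    Phi = np.exp(-np.sum(diff**2, axis=2) / (2.0 * sigma**2))   # N-by-N interpolation matrix
    w = np.linalg.solve(Phi, d)   # Phi is nonsingular for distinct points (Micchelli's theorem)
    return w
```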
Types of φ

Micchelli's theorem:
Let \{x_i\}_{i=1}^{N} be a set of distinct points in \mathbb{R}^m. Then the N-by-N interpolation matrix \Phi, whose ji-th element is \varphi_{ji} = \varphi(\lVert x_j - x_i \rVert), is nonsingular.

• Multiquadrics:

\varphi(r) = (r^2 + c^2)^{1/2}, \quad c > 0

• Inverse multiquadrics:

\varphi(r) = \frac{1}{(r^2 + c^2)^{1/2}}, \quad c > 0

where r = \lVert x - t \rVert.

• Gaussian functions (most used):

\varphi(r) = \exp\!\left(-\frac{r^2}{2\sigma^2}\right), \quad \sigma > 0
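For reference, a minimal sketch of the three basis-function types (not from the slides; c and sigma are illustrative default values):

```python
import numpy as np

def multiquadric(r, c=1.0):
    return np.sqrt(r**2 + c**2)                 # phi(r) = (r^2 + c^2)^(1/2), c > 0

def inverse_multiquadric(r, c=1.0):
    return 1.0 / np.sqrt(r**2 + c**2)           # phi(r) = (r^2 + c^2)^(-1/2), c > 0

def gaussian(r, sigma=1.0):
    return np.exp(-r**2 / (2.0 * sigma**2))     # phi(r) = exp(-r^2 / (2 sigma^2)), sigma > 0
```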
RBF network parameters

• What do we have to learn for an RBF NN with a given architecture?
  – The centers of the RBF activation functions
  – The spreads of the Gaussian RBF activation functions
  – The weights from the hidden to the output layer
• Different learning algorithms may be used for learning the RBF network parameters. We describe three possible methods for learning centers, spreads and weights.

Learning Algorithm 1
• Centers: are selected at random
  – centers are chosen randomly from the training set
• Spreads: are chosen by normalization:

\sigma = \frac{d_{\max}}{\sqrt{2\, m_1}}

where d_max is the maximum distance between any 2 centers and m_1 is the number of centers.

• Then the activation function of hidden neuron i becomes:

\varphi_i\!\left(\lVert x - t_i \rVert^2\right) = \exp\!\left(-\frac{m_1}{d_{\max}^2}\, \lVert x - t_i \rVert^2\right)
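A minimal sketch of this normalization step (not from the slides; the helper name is an assumption):

```python
import numpy as np

def normalized_spread(centers):
    """sigma = d_max / sqrt(2 * m1), with d_max the largest inter-center distance."""
    m1 = len(centers)
    diff = centers[:, None, :] - centers[None, :, :]
    d_max = np.max(np.linalg.norm(diff, axis=2))
    return d_max / np.sqrt(2.0 * m1)
```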
Learning Algorithm 1

• Weights: are computed by means of the pseudo-inverse method.
  – For an example (x_i, d_i), consider the output of the network:

y(x_i) = w_1 \varphi_1(\lVert x_i - t_1 \rVert) + \ldots + w_{m_1} \varphi_{m_1}(\lVert x_i - t_{m_1} \rVert)

  – We would like y(x_i) = d_i for each example, that is:

w_1 \varphi_1(\lVert x_i - t_1 \rVert) + \ldots + w_{m_1} \varphi_{m_1}(\lVert x_i - t_{m_1} \rVert) \approx d_i

Learning Algorithm 1

• This can be re-written in matrix form for one example:

\left[\varphi_1(\lVert x_i - t_1 \rVert) \;\ldots\; \varphi_{m_1}(\lVert x_i - t_{m_1} \rVert)\right] [w_1 \ldots w_{m_1}]^T = d_i

and, for all the examples at the same time:

\begin{bmatrix} \varphi_1(\lVert x_1 - t_1 \rVert) & \cdots & \varphi_{m_1}(\lVert x_1 - t_{m_1} \rVert) \\ \vdots & & \vdots \\ \varphi_1(\lVert x_N - t_1 \rVert) & \cdots & \varphi_{m_1}(\lVert x_N - t_{m_1} \rVert) \end{bmatrix} [w_1 \ldots w_{m_1}]^T = [d_1 \ldots d_N]^T

Learning Algorithm 1

Let

\Phi = \begin{bmatrix} \varphi_1(\lVert x_1 - t_1 \rVert) & \cdots & \varphi_{m_1}(\lVert x_1 - t_{m_1} \rVert) \\ \vdots & & \vdots \\ \varphi_1(\lVert x_N - t_1 \rVert) & \cdots & \varphi_{m_1}(\lVert x_N - t_{m_1} \rVert) \end{bmatrix}

then we can write

\Phi \begin{bmatrix} w_1 \\ \vdots \\ w_{m_1} \end{bmatrix} = \begin{bmatrix} d_1 \\ \vdots \\ d_N \end{bmatrix}

If \Phi^{+} is the pseudo-inverse of the matrix \Phi, we obtain the weights using the following formula:

[w_1 \ldots w_{m_1}]^T = \Phi^{+} [d_1 \ldots d_N]^T
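A minimal numpy sketch of this step (not from the slides; the variable names are illustrative):

```python
import numpy as np

def pseudo_inverse_weights(X, d, centers, sigma):
    """Least-squares weights w = Phi^+ d for fixed centers and spread."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # N x m1 distances
    Phi = np.exp(-dists**2 / (2.0 * sigma**2))                           # Gaussian activations
    return np.linalg.pinv(Phi) @ d                                       # w = Phi^+ d
```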
Learning Algorithm 1: summary

1. Choose the centers randomly from the training set.

2. Compute the spread for the RBF functions using the normalization method.

3. Find the weights using the pseudo-inverse method.

A sketch combining the three steps is given below.
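The sketch reuses the normalized_spread and pseudo_inverse_weights helpers sketched above (illustrative names, not from the slides):

```python
import numpy as np

def train_rbf_algorithm1(X, d, m1, rng=np.random.default_rng(0)):
    """Learning Algorithm 1: random centers, normalized spread, pseudo-inverse weights."""
    centers = X[rng.choice(len(X), size=m1, replace=False)]   # 1. random centers from the data
    sigma = normalized_spread(centers)                        # 2. sigma = d_max / sqrt(2 m1)
    weights = pseudo_inverse_weights(X, d, centers, sigma)    # 3. w = Phi^+ d
    return centers, sigma, weights
```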

Learning Algorithm 2: Centers
• Clustering algorithm for finding the centers (a sketch follows the list below):
1. Initialization: t_k(0) random, k = 1, ..., m_1
2. Sampling: draw x from the input space
3. Similarity matching: find the index k(x) of the center closest to x(n):
   k(x) = \arg\min_k \lVert x(n) - t_k(n) \rVert
4. Updating: adjust the centers:
   t_k(n+1) = t_k(n) + \eta \left[ x(n) - t_k(n) \right] \quad \text{if } k = k(x)
   t_k(n+1) = t_k(n) \quad \text{otherwise}
5. Continuation: increment n by 1, go to step 2 and continue until no noticeable changes of the centers occur
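A minimal sketch of this online clustering update (not from the slides; the parameter values are illustrative):

```python
import numpy as np

def cluster_centers(X, m1, eta=0.1, epochs=20, rng=np.random.default_rng(0)):
    """Move the winning center a small step toward each sampled input."""
    centers = X[rng.choice(len(X), size=m1, replace=False)].copy()   # 1. random initialization
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:                         # 2. sample inputs
            k = np.argmin(np.linalg.norm(x - centers, axis=1))       # 3. closest center wins
            centers[k] += eta * (x - centers[k])                     # 4. update the winner only
    return centers
```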
Learning Algorithm 2: summary

• Hybrid learning process:
  • Clustering for finding the centers.
  • Spreads chosen by normalization.
  • LMS algorithm (see Adaline) for finding the weights, as sketched below.
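A minimal sketch of the LMS (delta-rule) weight update with fixed centers and spread (not from the slides; names and rates are illustrative):

```python
import numpy as np

def lms_weights(X, d, centers, sigma, eta=0.05, epochs=50):
    """Stochastic LMS: w <- w + eta * (d_i - y(x_i)) * phi(x_i)."""
    w = np.zeros(len(centers))
    for _ in range(epochs):
        for x, target in zip(X, d):
            phi = np.exp(-np.linalg.norm(x - centers, axis=1)**2 / (2.0 * sigma**2))
            error = target - phi @ w      # output error for this example
            w += eta * error * phi        # delta-rule step on the output weights
    return w
```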

Learning Algorithm 3
• Apply the gradient descent method for finding centers, spreads and weights, by minimizing the (instantaneous) squared error

E = \frac{1}{2} \left( y(x) - d \right)^2

• Update rules (a sketch follows below):

centers: \quad \Delta t_j = -\eta_{t_j} \, \frac{\partial E}{\partial t_j}

spreads: \quad \Delta \sigma_j = -\eta_{\sigma_j} \, \frac{\partial E}{\partial \sigma_j}

weights: \quad \Delta w_{ij} = -\eta_{ij} \, \frac{\partial E}{\partial w_{ij}}
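A minimal sketch of one such gradient step for a Gaussian RBF network (not from the slides; a single learning rate eta is assumed for brevity, whereas the slides allow separate rates for centers, spreads and weights):

```python
import numpy as np

def gradient_step(x, d, centers, sigmas, w, eta=0.01):
    """One gradient-descent step on the instantaneous error E = 0.5 * (y(x) - d)^2."""
    diff = x - centers                         # x - t_j, shape (m1, m)
    sq = np.sum(diff**2, axis=1)               # ||x - t_j||^2
    phi = np.exp(-sq / (2.0 * sigmas**2))      # Gaussian hidden activations
    e = phi @ w - d                            # error y(x) - d

    grad_w = e * phi                                       # dE/dw_j
    grad_t = (e * w * phi / sigmas**2)[:, None] * diff     # dE/dt_j
    grad_s = e * w * phi * sq / sigmas**3                  # dE/dsigma_j
    return centers - eta * grad_t, sigmas - eta * grad_s, w - eta * grad_w
```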
Comparison with multilayer NN

RBF networks are used for regression and for performing complex (non-linear) pattern classification tasks.

Comparison between RBF networks and FFNN:

• Both are examples of non-linear layered feed-forward networks.

• Both are universal approximators.

Comparison with multilayer NN

• Architecture:
  – RBF networks have a single hidden layer.
  – FFNNs may have more than one hidden layer.

• Neuron model:
  – In an RBF network the neuron model of the hidden neurons is different from that of the output nodes.
  – Typically in an FFNN, hidden and output neurons share a common neuron model.
  – The hidden layer of an RBF network is non-linear, while its output layer is linear.
  – Hidden and output layers of an FFNN are usually non-linear.

Comparison with multilayer NN

• Activation functions:
  – The argument of the activation function of each hidden neuron in an RBF NN is the Euclidean distance between the input vector and the center of that unit.
  – The argument of the activation function of each hidden neuron in an FFNN is the inner product of the input vector and the synaptic weight vector of that neuron.

• Approximation:
  – RBF NNs using Gaussian functions construct local approximations to the non-linear I/O mapping.
  – FFNNs construct global approximations to the non-linear I/O mapping.
