Radial Basis Function (RBF) Neural Networks For The Senior Design Project

This document discusses Radial Basis Function (RBF) neural networks. It explains that RBFs use radial basis functions whose output depends on the distance from a center point. They are used to form clusters in pattern recognition. An RBF network has two layers: the first layer transforms the input data into a linearly separable space using radial basis functions, and the second layer then separates the classes. The document provides an example of using RBFs to solve the XOR problem and discusses the k-means clustering algorithm for training the network to find the cluster centers.


Radial Basis Function (RBF) Neural Networks
for the Senior Design Project

10/27/2004
Radial Basis Functions

• Functions φ(r) whose output depends on the distance r from some center point
  – Output is large (≈ 1) for input points near the center (i.e., for small r)
  – Output falls off rapidly (→ 0) as input points move away from the center (i.e., as r increases)
• Used to form "clusters" in pattern recognition (assuming clusters exist in the pattern)

Example: Gaussian

  Φ(r) = exp{ −r² / 2σ² }

[Figure: plot of φ(r) vs. r, peaking at φ = 1 at r = 0 and decaying toward 0 by |r| ≈ 2.]

σ is called the radius of the basis function. (It is a measure of the width of the Gaussian pulse.)
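A minimal sketch of this basis function in Python (NumPy assumed; the σ value is an arbitrary illustrative choice):

```python
import numpy as np

def gaussian_rbf(r, sigma):
    """Gaussian RBF: output near 1 for small r, falling toward 0 as r grows."""
    return np.exp(-r**2 / (2 * sigma**2))

print(gaussian_rbf(np.array([0.0, 0.5, 1.0, 2.0]), sigma=0.5))
# -> approx. [1.0, 0.607, 0.135, 0.0003]
```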
RBF Architecture

• RBF Neural Networks are 2-layer, feed-forward networks.
• The 1st layer (hidden) is not a traditional neural network layer.
• The function of the 1st layer is to transform a non-linearly separable set of input vectors into a linearly separable set.
• The 2nd layer is then a simple feed-forward layer (e.g., of Perceptron or ADALINE type neurons) that draws the hyperplane to separate the classes.
1st-Layer Functionality

• Example: X-OR Problem

  p1  p2  X-OR
   0   0    0
   0   1    1
   1   0    1
   1   1    0

[Figure: the four X-OR points in the (p1, p2) plane are not linearly separable; Layer 1 (2 neurons, outputs φ1(r1) and φ2(r2)) maps them into the (φ1, φ2) plane, where they are.]
1st Layer Architecture
Example: 2 Inputs, 2 Neurons

[Figure: inputs p1, p2 feed two RBF neurons; neuron i computes ri = distance from p to Ci, then Φi(ri) = exp{ −ri² / 2σi² }; the outputs φ1, φ2 feed a Perceptron or ADALINE network (2nd Layer).]

For each neuron in 1st Layer:
1. Compute the distance ri of the input "point" p from the center of the cluster Ci representing that neuron.
2. Find the radial basis function φi as a function of the distance ri between the input and the cluster center.
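A minimal sketch of these two steps in Python (NumPy assumed; `rbf_layer` is an illustrative name, with the centers and σ taken from the X-OR example on the next slide):

```python
import numpy as np

def rbf_layer(p, centers, sigmas):
    """Step 1: distance from p to each center; Step 2: Gaussian of that distance."""
    r = np.linalg.norm(p - centers, axis=1)    # r_i = distance from p to C_i
    return np.exp(-r**2 / (2 * sigmas**2))     # phi_i for each neuron

centers = np.array([[0.0, 0.0], [1.0, 1.0]])   # C1, C2 from the X-OR example
sigmas  = np.full(2, np.sqrt(0.5))             # 2*sigma^2 = 1, as chosen there
print(rbf_layer(np.array([0.0, 1.0]), centers, sigmas))   # ~[0.368, 0.368]
```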
X-OR Example (Mechanics Only)

• Assume 2 neurons in Layer 1, with centers: C1 = (0, 0), C2 = (1, 1)

[Figure: C1 = (0, 0) and C2 = (1, 1) plotted in the (p1, p2) plane.]

• Find r1 for each point p = (p1, p2) (distance to C1)
• Find r2 for each point p = (p1, p2) (distance to C2)
• Set σ: 2σ² = 1 ⇒ σ² = ½ (in this example φ1 = φ2 = φ)
• Calculate φ(r1) and φ(r2) using φ(r) = exp(−r²)

  p1  p2  r1²  r2²  φ(r1)  φ(r2)
   0   0   0    2    1     .135
   0   1   1    1   .368   .368
   1   0   1    1   .368   .368
   1   1   2    0   .135    1
X-OR Example, Continued

• Result:

  p1  p2  r1²  r2²  φ(r1)  φ(r2)
   0   0   0    2    1     .135
   0   1   1    1   .368   .368
   1   0   1    1   .368   .368
   1   1   2    0   .135    1

• Plot resulting points (φ1, φ2)

[Figure: the mapped points in the (φ1, φ2) plane; both X-OR = 1 points coincide at (.368, .368), while the X-OR = 0 points lie at (1, .135) and (.135, 1). The classes are now linearly separable.]
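As a quick check, a short sketch (NumPy assumed) that reproduces the table above using φ(r) = exp(−r²):

```python
import numpy as np

points  = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
centers = np.array([[0, 0], [1, 1]], dtype=float)
for p in points:
    r2 = ((p - centers)**2).sum(axis=1)    # r1^2, r2^2
    print(p, np.exp(-r2).round(3))         # phi(r1), phi(r2), since 2*sigma^2 = 1
```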
Problems with X-OR Example

• The # of neurons in the hidden layer, the centers of the neurons, and the radius (σ) of the RBF's were assumed known
• In most pattern recognition problems, the centers for the neurons must be learned*
• Training of the RBF (i.e., finding the centers of the clusters) can be done via
  – the k-means Clustering Algorithm; or
  – the Kohonen Algorithm
  (both are types of unsupervised learning)
• We will use k-means Clustering, where the number of clusters, k, is set in advance

* Since, in traditional NN's, the weights are the parameters that must be learned, the centers are sometimes called the weights of RBF neurons.
k-Means Clustering Algorithm

1. Randomly choose k points from the input training data to be the initial cluster centers.

2. For each training data point, find the distance to each of the k cluster centers.

3. Choose the closest center for each training data point, and assign that point the corresponding cluster number from 1 to k.

4. For each cluster i (i = 1, …, k), find the average of the training data points assigned to that cluster. Make that average the new center for the cluster.

5. Repeat steps 2 – 4 until the clusters have converged (i.e., until none of the training data points change clusters).
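A minimal sketch of steps 1 – 5 in Python (NumPy assumed; for determinism it takes the first k data points as initial centers, as the practice problem on the next slides does, and it assumes no cluster ever empties out):

```python
import numpy as np

def k_means(data, k):
    """Plain k-means; step numbers match the slide above."""
    centers = data[:k].copy()   # Step 1 (first k points, as in the practice problem)
    while True:
        # Steps 2-3: assign each training point to its nearest center
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: move each center to the average of its assigned points
        new_centers = np.array([data[labels == i].mean(axis=0) for i in range(k)])
        # Step 5: repeat until no point changes cluster (centers stop moving)
        if np.allclose(new_centers, centers):
            return new_centers, labels
        centers = new_centers

data = np.array([[0,5], [0,6], [3,3], [5,0], [0,0], [2,2], [5,5]], dtype=float)
centers, labels = k_means(data, k=5)   # center 3 converges to (10/3, 10/3)
```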
Finding the Radius (σ) for the RBF's

• Usually found with the P-nearest-neighbor algorithm (often with P = 2)

• P-nearest-neighbor algorithm:

  1. For each cluster center, find the P nearest cluster centers.

  2. For each neuron/cluster, set σ = the RMS distance between the cluster center and its P nearest cluster centers.
Practice Problem: k-Means Clustering, k = 5

7 Vectors:

  Vector   p1 coord   p2 coord
    p1        0          5
    p2        0          6
    p3        3          3
    p4        5          0
    p5        0          0
    p6        2          2
    p7        5          5

[Figure: the seven vectors plotted in the (p1, p2) plane, axes 0 – 6.]

Step 1: k = 5 randomly selected cluster centers: C1 = p1, …, C5 = p5
k-Means Clustering, Steps 2 – 4 (2 Iterations)

1st Iteration:

  i   Ci        points assigned   new average
  1   (0, 5)    p1                (0, 5)
  2   (0, 6)    p2                (0, 6)
  3   (3, 3)    p3, p6, p7        (10/3, 10/3)
  4   (5, 0)    p4                (5, 0)
  5   (0, 0)    p5                (0, 0)

2nd Iteration:

  i   Ci            points assigned   new average
  1   (0, 5)        p1                (0, 5)
  2   (0, 6)        p2                (0, 6)
  3   (3.33, 3.33)  p3, p6, p7        (3.33, 3.33)
  4   (5, 0)        p4                (5, 0)
  5   (0, 0)        p5                (0, 0)

No points changed clusters ⇒ Done
Finding the Radius (σ) for Each Neuron/Cluster

  i   Ci            squared distances to C1 … C5     2 nearest   MS dist.   RMS dist.   2σ²
  1   (0, 5)          0      1    13.89   50    25   C2, C3       7.44       2.73       14.89
  2   (0, 6)          1      0    18.22   61    36   C1, C3       9.61       3.10       19.22
  3   (3.33, 3.33)  13.89  18.22   0    13.89 22.22  C1, C4      13.89       3.73       27.78
  4   (5, 0)         50     61    13.89    0    25   C3, C5      19.44       4.41       38.89
  5   (0, 0)         25     36    22.22   25     0   C1, C3      23.61       4.86       47.22

(MS dist. = mean of the two nearest squared distances; RMS dist. = σ; the last column, 2σ², appears in each neuron's exponent.)
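As a check, a short sketch (NumPy assumed; `rbf_radii` is an illustrative name) that reproduces the table's 2σ² column with P = 2:

```python
import numpy as np

def rbf_radii(centers, P=2):
    """sigma for each cluster = RMS distance to its P nearest other centers."""
    d2 = ((centers[:, None, :] - centers[None, :, :])**2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)            # exclude each center itself
    nearest = np.sort(d2, axis=1)[:, :P]    # P smallest squared distances
    return np.sqrt(nearest.mean(axis=1))    # RMS distance

centers = np.array([[0,5], [0,6], [10/3,10/3], [5,0], [0,0]])
sigma = rbf_radii(centers, P=2)
print(2 * sigma**2)   # ~[14.89, 19.22, 27.78, 38.89, 47.22], matching the table
```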
1st Layer of RBF Network for the Practice Problem

[Figure: the input p feeds 5 RBF neurons; neuron i computes ri = d(p, Ci), the distance from p to center Ci; the outputs φ1 … φ5 feed a Perceptron or ADALINE network (2nd Layer).]

  Center              Basis function
  C1 = (0, 5)         Φ1(r1) = exp{ −r1² / 14.9 }
  C2 = (0, 6)         Φ2(r2) = exp{ −r2² / 19.2 }
  C3 = (10/3, 10/3)   Φ3(r3) = exp{ −r3² / 27.8 }
  C4 = (5, 0)         Φ4(r4) = exp{ −r4² / 38.9 }
  C5 = (0, 0)         Φ5(r5) = exp{ −r5² / 47.2 }
Outputs from Layer 1 = Inputs for Layer 2

5-dim. output vectors from Layer 1 (= 5-dim. input vectors for Layer 2):

  i   pi       r1²  r2²   r3²   r4²  r5²   φ1    φ2    φ3    φ4    φ5
  1   (0, 5)    0    1   13.89   50   25   1     0.95  0.61  0.28  0.59
  2   (0, 6)    1    0   18.22   61   36   0.94  1     0.52  0.21  0.47
  3   (3, 3)   13   18    0.22   13   18   0.42  0.39  0.99  0.72  0.68
  4   (5, 0)   50   61   13.89    0   25   0.03  0.04  0.61  1     0.59
  5   (0, 0)   25   36   22.22   25    0   0.19  0.15  0.45  0.53  1
  6   (2, 2)   13   20    3.56   13    8   0.42  0.35  0.88  0.72  0.84
  7   (5, 5)   25   26    5.56   25   50   0.19  0.26  0.82  0.53  0.35
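These rows can be reproduced with a few lines (NumPy assumed), reusing the centers and 2σ² values from the preceding slides:

```python
import numpy as np

centers      = np.array([[0,5], [0,6], [10/3,10/3], [5,0], [0,0]])
two_sigma_sq = np.array([14.89, 19.22, 27.78, 38.89, 47.22])
points       = np.array([[0,5], [0,6], [3,3], [5,0], [0,0], [2,2], [5,5]], dtype=float)

for p in points:
    r2  = ((p - centers)**2).sum(axis=1)   # squared distance to each center
    phi = np.exp(-r2 / two_sigma_sq)       # 5-dim output vector from Layer 1
    print(p, phi.round(2))
```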
Layer 2: ADALINE or Perceptron

• Perceptron
  1. Initialize W, b with 0's
  2. Apply first input vector: output a = hardlim(Wp + b)
  3. Update: Wnew = Wold + (t − a)pᵀ
             bnew = bold + (t − a)
  4. Repeat steps 2 and 3 for the other input vectors, until weights & biases converge
  5. Check: a = hardlim(Wp + b) for all inputs

• ADALINE (ADAptive Linear Neuron)
  1. Initialize W, b with 0's
  2. Apply first input vector: output a = Wp + b
  3. Update: Wnew = Wold + (t − a)pᵀ
             bnew = bold + (t − a)
  4. Repeat steps 2 and 3 for the other input vectors, until weights & biases converge
  5. Find outputs: a = hardlims(Wp + b) for all inputs
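A minimal sketch of the Perceptron branch in Python (NumPy assumed; `train_perceptron` and the 0/1 targets are illustrative choices, and hardlim(n) is taken as 1 for n ≥ 0):

```python
import numpy as np

def train_perceptron(X, t, max_epochs=500):
    """Perceptron rule from the slide: W, b start at 0; update by (t - a)."""
    W, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        changed = False
        for p, target in zip(X, t):
            a = 1.0 if W @ p + b >= 0 else 0.0   # a = hardlim(Wp + b)
            if a != target:
                W += (target - a) * p            # Wnew = Wold + (t - a)p^T
                b += (target - a)                # bnew = bold + (t - a)
                changed = True
        if not changed:                          # weights & biases converged
            break
    return W, b

# Layer-1 outputs of the X-OR example as training data (targets 0, 1, 1, 0)
X = np.array([[1, .135], [.368, .368], [.368, .368], [.135, 1]])
t = np.array([0.0, 1.0, 1.0, 0.0])
W, b = train_perceptron(X, t)
print([1.0 if W @ p + b >= 0 else 0.0 for p in X])   # step 5 check: reproduces X-OR
```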
Layer 2: Perceptron or ADALINE

[Figure: Perceptron as a single multiple-input neuron: input p (5×1), weights W (1×5), bias b (1×1), n = Wp + b (1×1); output a = hardlim(Wp + b).]

[Figure: ADALINE as a single multiple-input neuron with the same dimensions; training transfer function ft = purelin (a = Wp + b), classification transfer function fcl = hardlims.]
