Approach 1: Exact RBF

The document discusses the Exact Radial Basis Function (RBF) approach to function approximation, in which the first-layer weights are set to the training data and the spread is determined from the maximum distance between centers. Micchelli's Theorem guarantees the invertibility of the interpolation matrix for distinct data points, so RBF networks can fit the training data exactly without iterative training. However, using as many receptive fields as data points can lead to overfitting, motivating methods that reduce complexity, such as selecting a subset of the training data or employing unsupervised training techniques like K-means.


Approach 1: Exact RBF
The first-layer weights u are set to the training data: U = Xᵀ. That is, the Gaussians are centered at the training data instances.
The spread is chosen as σ = d_max / √(2N), where d_max is the maximum Euclidean distance between any two centers and N is the number of training data points. Note that H = N in this case.
The output of the kth RBF output neuron is then

    y_k(x) = Σ_{h=1}^{H} w_kh · exp(−‖x − u_h‖² / (2σ²))     (multiple outputs)

    y(x) = Σ_{h=1}^{H} w_h · exp(−‖x − u_h‖² / (2σ²))        (single output)
During training, we want the outputs to be equal to our desired targets.

Without loss of generality, assume that we are approximating a one-dimensional function, and let the unknown true function be f(x). The desired output for each input is then d_i = f(x_i), i = 1, 2, …, N.
Approach 1 (Cont.)

We then have a set of N linear equations, which can be represented in matrix form:

    Φ w = d

where Φ is the N-by-N interpolation matrix with elements φ_ji = exp(−‖x_j − x_i‖² / (2σ²)), w = [w_1, …, w_N]ᵀ is the second-layer weight vector, and d = [d_1, …, d_N]ᵀ is the desired-output vector. If Φ is invertible, the weights follow directly as w = Φ⁻¹ d.

Is this matrix always invertible?

Approach 1 (Cont.)

Micchelli's Theorem (1986)

If {x_i}, i = 1, …, N, is a set of distinct points in d-dimensional space, then the N-by-N interpolation matrix Φ whose elements are obtained from radial basis functions is nonsingular, and hence can be inverted!

Note that the theorem is valid regardless of the value of N, the choice of the RBF (as long as it is an RBF), or what the data points may be, as long as they are distinct!
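A quick numerical illustration (a sketch, not a proof): for randomly drawn distinct points, the Gaussian interpolation matrix comes out full-rank, as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 3))                      # 50 distinct points in 3-D space
dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
sigma = dists.max() / np.sqrt(2 * len(X))          # same spread heuristic as above
Phi = np.exp(-dists**2 / (2 * sigma**2))
print(np.linalg.matrix_rank(Phi))                  # 50: nonsingular, as predicted
```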
Approach 1 (Cont.)

The Gaussian is the most commonly used RBF (why…?).

Note that exp(−‖x − u_h‖² / (2σ²)) → 0 as ‖x − u_h‖ → ∞: Gaussian RBFs are localized functions, unlike the sigmoids used by MLPs.

[Figure: function approximation using Gaussian radial basis functions vs. using sigmoidal basis functions.]
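A tiny numerical sketch of this locality, using the logistic sigmoid for comparison:

```python
import numpy as np

sigma = 1.0
gaussian = lambda r: np.exp(-r**2 / (2 * sigma**2))   # localized: vanishes far from center
sigmoid = lambda r: 1.0 / (1.0 + np.exp(-r))          # global: saturates, never vanishes

for r in (0.0, 2.0, 10.0):
    print(f"r={r:4.1f}  gaussian={gaussian(r):.2e}  sigmoid={sigmoid(r):.2f}")
# The Gaussian decays to ~0 far from its center, while the sigmoid stays near 1.
```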
Exact RBF Properties
Using localized functions typically makes RBF networks more suitable for function approximation problems.
Since the first-layer weights are set to the input patterns, the second-layer weights are obtained by solving linear equations, and the spread is computed from the data, no iterative training is involved!
Guaranteed to correctly classify all training data points!
However, since we are using as many receptive fields as data points, the solution is overdetermined if the underlying physical process does not have as many degrees of freedom 🡺 Overfitting!
The importance of σ: too small a spread will also cause overfitting; too large a spread will fail to characterize rapid changes in the signal.
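A short sketch of the σ trade-off (assumed setup: exact interpolation of sin(x) from 15 samples, with σ passed in by hand rather than computed from d_max):

```python
import numpy as np

def rbf_fit_predict(X, d, X_test, sigma):
    """Exact RBF interpolation with a manually chosen spread sigma."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    w = np.linalg.solve(np.exp(-D**2 / (2 * sigma**2)), d)
    Dt = np.linalg.norm(X_test[:, None, :] - X[None, :, :], axis=-1)
    return np.exp(-Dt**2 / (2 * sigma**2)) @ w

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, np.pi, 15)).reshape(-1, 1)
d = np.sin(X).ravel()
X_test = np.linspace(0, np.pi, 200).reshape(-1, 1)

for sigma in (0.01, 0.3, 5.0):   # too small / reasonable / too large
    err = np.abs(rbf_fit_predict(X, d, X_test, sigma) - np.sin(X_test).ravel()).max()
    print(f"sigma={sigma:4.2f}  max test error={err:.3f}")
# Very small sigma gives spiky interpolation between samples (overfitting);
# very large sigma makes Phi ill-conditioned and cannot track rapid changes.
```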
Too Many Receptive Fields?

In order to reduce the artificial complexity of the RBF, we need to


use fewer number of receptive fields.
How about using a subset of training data, say M < N of them.
These M data points will then constitute M receptive field centers.
How to choose these M points…?
At random 🡺 Approach 2.

Output layer weights are determined as they were in Approach 1, through


solving a set of M linear equations!
Unsupervised training: K-means 🡺 Approach 3
The centers are selected through self organization of clusters, where the
data is more densely populated. Determining M is usually heuristic.
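A minimal sketch of Approaches 2 and 3 under stated assumptions: centers are chosen either as a random subset of the data or by K-means (a plain NumPy Lloyd's-algorithm loop, to stay self-contained), and the output weights are then fit by least squares over all N points rather than by the M-equation variant described above. Function names are illustrative.

```python
import numpy as np

def kmeans_centers(X, M, iters=50, seed=0):
    """Plain Lloyd's algorithm: M cluster centers for Approach 3."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), M, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center, then move centers to cluster means
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=-1), axis=1)
        for m in range(M):
            if np.any(labels == m):
                centers[m] = X[labels == m].mean(axis=0)
    return centers

def reduced_rbf_fit(X, d, centers):
    """Fit output weights by least squares over all N points (N x M system)."""
    D = np.linalg.norm(X[:, None] - centers[None], axis=-1)
    sigma = np.linalg.norm(centers[:, None] - centers[None], axis=-1).max() / np.sqrt(2 * len(centers))
    Phi = np.exp(-D**2 / (2 * sigma**2))                  # N x M design matrix
    w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
    return w, sigma

rng = np.random.default_rng(2)
X = np.sort(rng.uniform(0, np.pi, 100)).reshape(-1, 1)
d = np.sin(X).ravel()

M = 10
centers_rand = X[rng.choice(len(X), M, replace=False)]    # Approach 2: random subset
centers_km = kmeans_centers(X, M)                         # Approach 3: K-means
for name, c in (("random", centers_rand), ("k-means", centers_km)):
    w, sigma = reduced_rbf_fit(X, d, c)
    fit = np.exp(-np.linalg.norm(X[:, None] - c[None], axis=-1)**2 / (2 * sigma**2)) @ w
    print(f"{name:8s} centers: max train error = {np.abs(fit - d).max():.4f}")
```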
