Technical Report 2007-001: Radial Basis Functions Response Surfaces
Abstract
Radial Basis Functions (RBF) are a powerful tool for multivariate scattered data interpolation. Despite their simple formulation, RBF hide a sound theoretical framework. In modeFRONTIER five different radial functions are available. Furthermore, a fully automatic scaling policy is implemented, based on the minimization of the mean leave-one-out error.
1 Introduction
Radial Basis Functions (RBF) are a powerful tool for multivariate scattered data interpolation. Scattered data means that the training points do not need to be sampled on a regular grid: in fact RBF is a proper meshless method. Since RBF are interpolant response surfaces, they pass exactly through the training points.
There exists a vast body of literature on both the theoretical and the computational
features of RBF: for example refer to [1, 2, 3] for a detailed treatment of the subject.
The interpolation problem reads: given the n training points x_i ∈ R^d and the corresponding response values f_i, find a function f such that

f(x_i) = f_i ,   i = 1, . . . , n .   (1)

The RBF interpolant s has the form

s(x) = Σ_{j=1}^{n} c_j φ(‖x − x_j‖ / δ) ,   (2)

where ‖·‖ is the Euclidean norm in the d-dimensional space, and δ a fixed scaling parameter. The radial function (or kernel) φ(r) : [0, +∞) → R is a suitable fixed function chosen out of a given list. So the RBF interpolant s is simply a linear combination of identical spherically symmetric functions, centered at the n different training point sites.
The coefficients c_j represent the free parameters of the RBF model. Their values are obtained by imposing the interpolation equations (1). By defining the symmetric matrix A (termed the collocation matrix of the RBF) as

A_ij = φ(‖x_i − x_j‖ / δ) ,   i, j = 1, . . . , n ,
Tec. Rep. 2007-001 April 2, 2007
Here m represents the degree of the polynomial, and it depends only on the choice of
φ. The polynomial term has the form
p_m(x) = Σ_{j=1}^{q} b_j π_j(x) ∈ P_m^d ,   (9)

where {π_j(x)} is a basis of the linear space P_m^d containing all real-valued polynomials of degree at most m in d variables. The dimension q of this space is equal to

q = binom(m + d, d) = (m + d)! / (m! d!) .   (10)
An example will clarify the form of the polynomial term. With three variables (d = 3), a second order polynomial (m = 2) involves a 10-dimensional polynomial space (q = 10). A suitable basis {π_j(x)} of this space is the following set of monomials:

{1, x1, x2, x3, x1², x1x2, x2², x1x3, x2x3, x3²} .
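As a quick check of equation (10), the monomial basis above can be generated programmatically. A minimal sketch (the function name monomial_exponents is mine, not from the report):

```python
from itertools import product
from math import comb

def monomial_exponents(d, m):
    """All exponent tuples (e1, ..., ed) with total degree e1 + ... + ed <= m."""
    return [e for e in product(range(m + 1), repeat=d) if sum(e) <= m]

# Example from the text: d = 3 variables, polynomial degree m = 2.
basis = monomial_exponents(3, 2)
print(len(basis))        # 10, the dimension q of the space P_m^d
print(comb(2 + 3, 3))    # 10, the closed form q = binom(m + d, d)
```

Each exponent tuple corresponds to one monomial of the basis, e.g. (2, 0, 0) to x1² and (1, 1, 0) to x1x2.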
Now by defining the matrix P as
P_ij = π_j(x_i) ,   i = 1, . . . , n  and  j = 1, . . . , q ,   (11)
the interpolation equations (6) are replaced by
A·c+P·b=f, (12)
and are coupled with q additional equations, the moment conditions:
PT · c = 0 . (13)
In this way the interpolation equations and the moment conditions can be arranged
together to form an augmented system of dimensions n + q:
( A    P )   ( c )     ( f )
( P^T  0 ) · ( b )  =  ( 0 ) .   (14)
Since the radial function is CPD, this augmented linear system is invertible, and there exists a unique solution for the unknown coefficient vectors c and b.
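A numerical sketch of the augmented system (14), using the PS kernel φ(r) = r³ in d = 1 (so m = 1, q = 2, basis {1, x}); the scaling δ = 1, the test function, and all variable names are illustrative choices of mine:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 8)                 # n = 8 training points, d = 1
f = np.sin(2 * np.pi * x)                    # training values

phi = lambda r: r ** 3                       # PS kernel for d odd (delta = 1)
A = phi(np.abs(x[:, None] - x[None, :]))     # collocation matrix A_ij
P = np.stack([np.ones_like(x), x], axis=1)   # polynomial matrix (11), basis {1, x}

n, q = len(x), P.shape[1]
M = np.zeros((n + q, n + q))                 # augmented matrix of system (14)
M[:n, :n], M[:n, n:], M[n:, :n] = A, P, P.T
rhs = np.concatenate([f, np.zeros(q)])       # right-hand side (f, 0)

sol = np.linalg.solve(M, rhs)
c, b = sol[:n], sol[n:]

# The interpolant passes exactly through the training points:
print(np.allclose(A @ c + P @ b, f))         # True
```

This reproduces the natural-cubic-spline case mentioned in section 4; for PD kernels the polynomial block is simply dropped and only A · c = f remains.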
G    φ(r) = exp(−r²)                                    PD
PS   φ(r) = r³          d odd    m = (d + 1)/2          CPD
     φ(r) = r² log(r)   d even   m = d/2

Table 1: Available radial functions φ(r). They are either PD or CPD: in the latter case the degree m of the required polynomial is specified. The symbol (·)^k_+ denotes the truncated power function: (x)^k_+ = x^k for x > 0, and (x)^k_+ = 0 for x ≤ 0.
4 Radial functions
In modeFRONTIER five different radial functions are available: Gaussians (G), Duchon’s
Polyharmonic Splines (PS), Hardy’s MultiQuadrics (MQ), Inverse MultiQuadrics (IMQ),
and Wendland’s Compactly Supported C² (W2). This list covers the state-of-the-art and most widely used radial functions found in the literature.
The analytical expressions of the different functions are shown in table 1, while figure 1 shows their plots. G, IMQ, and W2 are PD; on the contrary, PS and MQ are CPD, and so they require the additional polynomial term.
W2 has different expressions according to the space dimensionality; in any case it cannot be used with more than 5 dimensions.
PS has two expressions, according to whether the space dimensionality is even or odd. PS includes Thin-Plate Splines (for d = 2) and the usual Natural Cubic Splines (for d = 1) as particular cases.
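The two kernels whose analytical expressions survive in table 1 can be coded directly; a minimal sketch (the handling of the removable singularity of r² log r at r = 0 is my addition, and MQ, IMQ, and W2 are omitted since their expressions are given only in the table):

```python
import math

def phi_G(r):
    """Gaussian kernel from table 1: PD in every dimension."""
    return math.exp(-r * r)

def phi_PS(r, d):
    """Polyharmonic spline: r^3 for d odd, r^2 log(r) for d even (CPD)."""
    if d % 2 == 1:
        return r ** 3
    return 0.0 if r == 0.0 else r * r * math.log(r)

print(phi_G(0.0))      # 1.0
print(phi_PS(2.0, 3))  # 8.0  (cubic branch, d odd)
print(phi_PS(1.0, 2))  # 0.0  (r^2 log r vanishes at r = 1)
```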
When the radial function is PD the polynomial term is not needed, and the interpolation equations reduce to

A · c = f ,   (15)
[Figure 1: plots of the radial functions φ(r) — G, PS, MQ, IMQ, W2 — for r ∈ [0, 2].]
6 Leave-one-out error
One method for assessing the goodness of an interpolant response surface is the leave-one-out methodology. In turn, each point belonging to the training set is excluded from
the training procedure. The value predicted at the excluded point by the surface so created is then compared to the known value. The leading idea is that the smaller this difference on average, the better the surface trained on the whole dataset.
A severe drawback of this technique is its huge computational demand: n different surfaces have to be created, each using n − 1 training points, where n is the size of the original training set. Very often this prevents the method from being used at all.
We will show in this section how this is not the case in the RBF framework: in fact there is a convenient way of computing the root-mean-square (rms) leave-one-out error (see [5]).
Consider the linear system of the interpolation equations

A · c = f .   (18)
Note that even the augmented system (14) can be considered to have the same form

Ã · c̃ = f̃ ,   (19)

simply denoting with the symbol ˜ the relevant augmented quantities. Thus the results presented in this section can be regarded as general.
Let c be the solution of the interpolation system written componentwise,

Σ_{j=1}^{n} A_ij c_j = f_i ,   ∀i = 1, . . . , n ,   (21)

and let d^(k) be the solution of

Σ_{j=1}^{n} A_ij d_j^(k) = δ_ik ,   ∀i = 1, . . . , n ,   (22)

where δ_ik denotes the Kronecker delta.
Let c(k) be the solution of the linear system obtained from (21) by removing from
A the k-th row and the k-th column:
Σ_{j=1, j≠k}^{n} A_ij c_j^(k) = f_i ,   ∀i = 1, . . . , k − 1, k + 1, . . . , n .   (24)
The RBF interpolant s^(k)(x) obtained by excluding the k-th point from the training set has the form (compare to equation 2):
s^(k)(x) = Σ_{j=1, j≠k}^{n} c_j^(k) φ(‖x − x_j‖ / δ) ,   (27)
where the coefficients c^(k) are exactly those of equation (24), which indeed represents the interpolation equations for the reduced training set.
Evaluating it at the point x_k excluded from the training, we obtain

s^(k)(x_k) = Σ_{j=1, j≠k}^{n} c_j^(k) φ(‖x_k − x_j‖ / δ) = Σ_{j=1, j≠k}^{n} A_kj c_j^(k) ;   (28)
Figure 2: Plot of the G radial function φ(r/δ) for different values of the shape parameter δ (δ = 0.5, 1.0, 2.0, as reported in the legend).
Its evaluation represents a fringe benefit, since it is readily computable once the linear system of interpolation equations has been solved (which is the most demanding part from the computational point of view). We will see later on how this value is useful for setting the scaling parameter automatically.
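To make the "fringe benefit" concrete: Rippa's result in [5] gives the leave-one-out error at the k-th point as e_k = c_k / (A⁻¹)_kk, so all n errors come from quantities available after solving the full system once. The comparison against the brute-force reduced systems below is an illustrative sketch (Gaussian kernel on a small regular grid, δ = 0.3 chosen arbitrarily):

```python
import numpy as np

g1, g2 = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 3))
X = np.stack([g1.ravel(), g2.ravel()], axis=1)     # n = 12 points in d = 2
f = np.sin(2 * np.pi * X[:, 0]) * X[:, 1]

r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
A = np.exp(-(r / 0.3) ** 2)                        # Gaussian collocation matrix
c = np.linalg.solve(A, f)

# Rippa's shortcut: e_k = c_k / (A^{-1})_{kk}
e_fast = c / np.diag(np.linalg.inv(A))

# Brute force: retrain n times, each time leaving one point out (equation 24).
n = len(f)
e_slow = np.empty(n)
for k in range(n):
    keep = np.arange(n) != k
    ck = np.linalg.solve(A[np.ix_(keep, keep)], f[keep])
    e_slow[k] = f[k] - A[k, keep] @ ck             # f_k - s^(k)(x_k), cf. (28)

print(np.allclose(e_fast, e_slow))                 # True
```

The shortcut replaces n linear solves of size n − 1 with a single inversion of A.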
7 Scaling parameter
The scaling parameter determines the shape of the radial function: figure 2 shows an example. Its value has to be set according to the specific problem one is facing: in general it can be related to the spatial density of the data.
[Figure 3: surface plot of the sin_k2n2 response function f over the unit square (x, y).]
f(x) = (1/2) [ sin(kπx1) − sin(kπx2) ] ,
where the parameter k (the number of half-periods in the sinusoidal functions) is fixed
to k = 2. Figure 3 shows the plot of the response function. We take n = 40 randomly
chosen points as the training set.
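The test setup can be reproduced as follows (the seed is an arbitrary choice of mine; the report does not specify how its 40 points were drawn):

```python
import numpy as np

def f_sin(X, k=2):
    """sin_k2n2 response: f(x) = 0.5 * (sin(k*pi*x1) - sin(k*pi*x2))."""
    return 0.5 * (np.sin(k * np.pi * X[:, 0]) - np.sin(k * np.pi * X[:, 1]))

rng = np.random.default_rng(42)            # arbitrary seed
X_train = rng.uniform(size=(40, 2))        # n = 40 scattered points in [0, 1]^2
y_train = f_sin(X_train)

print(X_train.shape, y_train.shape)        # (40, 2) (40,)
print(bool(np.abs(y_train).max() <= 1.0))  # True: |f| <= 1 by construction
```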
Figure 4: Condition number vs. scaling parameter for RBF-G on sin k2n2 problem.
Figure 5: Training performance (rms error on training data) vs. scaling parameter for RBF-G on
sin k2n2 problem.
Figure 6: Validation performance (rms error on validation data) vs. scaling parameter for RBF-G on
sin k2n2 problem.
Figure 7: rms leave-one-out error vs. scaling parameter for RBF-G on sin k2n2 problem.
[Figure 8: RBF-G response surface trained on the sin_k2n2 problem.]
Figure 6 shows the plot of the validation performance (i.e. the rms error on the validation data) vs. the scaling parameter. A brand new set of 500 points, the validation dataset, has been generated in order to assess the goodness of the trained RBF. Clearly this is the function we would always like to minimize in order to get the optimum value of δ: but obviously in real applications this plot is unknown.
The plot of the rms leave-one-out error vs. the scaling parameter is shown in figure 7. The aspect of the curve resembles very closely that of figure 6: more importantly, the location of the minimum is the same. This fact makes it possible to obtain a practical method for automatically setting the scaling parameter to an optimal value. We will analyze this aspect later on (see section 7.5).
RBF have been benchmarked over many test problems, obtaining similar results. The kinds of RBF other than G have also been tested, studying their characteristic behavior. A common result is that the rms leave-one-out error curve is always similar to that of the validation performance. Here only one test case has been reported, for the sake of illustration.
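The automatic policy suggested by figures 6 and 7 can be sketched as a one-dimensional search over δ; the logarithmic grid, the Gaussian kernel, and the use of Rippa's shortcut [5] are illustrative choices here, not necessarily the actual modeFRONTIER implementation:

```python
import numpy as np

def rms_loo(X, y, delta):
    """rms leave-one-out error of a Gaussian RBF, via e_k = c_k / (A^-1)_kk [5]."""
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = np.exp(-(r / delta) ** 2)
    c = np.linalg.solve(A, y)
    e = c / np.diag(np.linalg.inv(A))
    return float(np.sqrt(np.mean(e ** 2)))

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 2))                       # sin_k2n2-style training set
y = 0.5 * (np.sin(2 * np.pi * X[:, 0]) - np.sin(2 * np.pi * X[:, 1]))

deltas = np.logspace(-1, 0, 12)                     # candidate scaling parameters
errors = [rms_loo(X, y, d) for d in deltas]
best = float(deltas[int(np.argmin(errors))])
print(0.1 <= best <= 1.0)                           # the minimizer lies on the grid
```

A finer search (or a bracketing 1-D minimizer) around the grid minimum would refine δ further.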
[Figure 9: RBF-G response surface trained on the sin_k2n2 problem.]
[Figure 10: RBF-G response surface trained on the sin_k2n2 problem.]
• too high a scaling parameter (e.g. δ = 100.0): ill-conditioned problem. Bad results: too “smooth” a surface, not even able to interpolate the training points. See figure 10.
In order to improve the numerical stability of the problem one should maximize the separation distance: max q.
Therefore, in order to improve both approximation quality and numerical stability one should maximize the ratio q/h. Clearly this objective is achieved for a well distributed, almost uniform, set of training points. But in general, for scattered data, one deals with q ≪ h. In the case of a uniform distribution of data, there is no way of further improving both objectives: there is a trade-off between min h and max q.
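Both quantities can be estimated numerically for a given training set. In this sketch I take q as half the minimal pairwise distance and approximate the fill distance h by the largest distance from a dense grid over the unit square to its nearest training point; these are the standard definitions, which I assume match the report's:

```python
import numpy as np

def separation_distance(X):
    """q: half the minimal distance between two distinct training points."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return 0.5 * float(d.min())

def fill_distance(X, grid=60):
    """h: radius of the largest hole, estimated on a dense grid over [0, 1]^2."""
    g = np.linspace(0.0, 1.0, grid)
    G = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
    d = np.linalg.norm(G[:, None, :] - X[None, :, :], axis=-1)
    return float(d.min(axis=1).max())

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 2))             # scattered training points
q, h = separation_distance(X), fill_distance(X)
print(q < h)                              # True: for scattered data q << h
```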
8 RBF vs. NN
Sometimes in the literature one finds the terminology “Radial Basis Function Networks”: these are particular Neural Networks (NN) which use radial functions as transfer functions. Often RBF Networks are simply the RBF exposed herein, but interpreted from the point of view of NN: in our opinion this can be misleading, since RBF and NN are quite different algorithms for building response surfaces, each with its own characteristics (see e.g. [7] for a review of NN theory).
Here is a list of differences between RBF and NN. Furthermore, some features of RBF Networks are presented which seem hard to fit in the context of NN.
• RBF are interpolants while NN are approximants (i.e. they do not pass exactly through the training points).
• In RBF Networks the number of neurons in the hidden layers is equal to the size
of the training set. In general this is not the case for a NN. Furthermore in RBF
Networks each neuron is strongly coupled with its relevant training point (as
explained in the next point), while in usual NN there is no direct link between
neurons and training points: in general each neuron is able to map different
regions of space.
network). This net input is then transformed by the non-linear transfer function. On the contrary, in RBF Networks each neuron combines the inputs by evaluating the Euclidean distance of the input vector from one training point: so the combination of the input values is non-linear, and instead of free parameters one deals with the coordinates of one given training point. Then the transfer function is a radial function (usually only of G type).
9 Parameters settings
Only a few parameters must be defined by the user for the RBF algorithm in modeFRONTIER:
Training Set : the set of designs used for Radial Basis Functions training. It is possible to choose between All Designs database and Only Marked Designs.
Radial Functions : five different radial functions are available: Gaussians (G),
Duchon’s Polyharmonic Splines (PS), Hardy’s MultiQuadrics (MQ), Inverse Mul-
tiQuadrics (IMQ), and Wendland’s Compactly Supported C 2 (W2).
Scaling Policy : the Automatic selection lets the algorithm choose the proper scaling parameter value by minimizing the rms leave-one-out error. On the contrary, if the User Defined choice is selected, the user has to set the value of the scaling parameter manually.
Scaling Parameter : this field is significant if and only if the User Defined choice is
selected, and defines the value of the scaling parameter δ.
It is always possible to stop the run by clicking on the Stop RSM button. However,
since the interpolation equations are solved by the exact one-step SVD algorithm, a
premature stop will result in no solution.
It is always useful to look at the RSM log during and after the training process. Some information on the points distribution is shown, such as the minimum, maximum, and mean mutual distances. When the automatic scaling policy is enabled, at each step the scaling parameter and the mean leave-one-out error values are shown. Finally, the condition number of the problem is shown. Consider that when variables normalization is enabled, all the other values are normalized accordingly as well.
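As a sketch of what the SVD-based solve and the logged condition number amount to (the kernel, δ, and point layout are my illustrative choices; the report does not detail the actual implementation):

```python
import numpy as np

g1, g2 = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 3))
X = np.stack([g1.ravel(), g2.ravel()], axis=1)   # 12 training points in d = 2
y = np.sin(2 * np.pi * X[:, 0]) * X[:, 1]

r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
A = np.exp(-(r / 0.3) ** 2)                      # Gaussian collocation matrix

U, s, Vt = np.linalg.svd(A)                      # one-step SVD factorization
print(f"condition number: {s[0] / s[-1]:.3e}")   # the value reported in the log

c = Vt.T @ ((U.T @ y) / s)                       # solve A c = y from the SVD
print(np.allclose(A @ c, y))                     # True: exact interpolation
```

The ratio of the largest to the smallest singular value is the 2-norm condition number, which explodes for too large δ (figure 4).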
References
[1] Wendland, Holger, 2004, Scattered Data Approximation, Cambridge University Press.
[2] Iske, Armin, 2004, Multiresolution Methods in Scattered Data Modelling, Springer.
[3] Buhmann, Martin D., 2003, Radial Basis Functions: Theory and Implementations, Cambridge University Press.
[4] Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P., 1992, Numerical Recipes in C. The Art of Scientific Computing, 2nd ed., Cambridge University Press.
[5] Rippa, Shmuel, 1999, An algorithm for selecting a good value for the parameter c in radial basis function interpolation, Adv. in Comp. Math., 11, 193-210.
[6] Schaback, Robert, 1995, Error estimates and condition numbers for radial basis function interpolation, Adv. in Comp. Math., 3, 251-264.
[7] Rigoni, Enrico and Lovison, Alberto, 2006, Neural Networks Response Surfaces, Esteco Technical Report 2006-001.