J. R. Statist. Soc. B (1978), 40, No. 3, pp. 364-372
Improper Priors, Spline Smoothing and the Problem of
Guarding Against Model Errors in Regression
By GRACE WAHBA
University of Wisconsin at Madison
(Received February 1978. Final revision July 1978)
SUMMARY
Spline and generalized spline smoothing is shown to be equivalent to Bayesian
estimation with a partially improper prior. This result supports the idea that spline
smoothing is a natural solution to the regression problem when one is given a set of
regression functions but one also wants to hedge against the possibility that the
true model is not exactly in the span of the given regression functions. A natural
measure of the deviation of the true model from the span of the regression functions
comes out of the spline theory in a natural way. An appropriate value of this measure
can be estimated from the data and used to constrain the estimated model to have
the estimated deviation. Some convergence results and computational tricks are
also discussed.
Keywords: SPLINE SMOOTHING; IMPROPER PRIORS; NONPARAMETRIC REGRESSION; MODEL ERRORS
1. INTRODUCTION
CONSIDER the model
\[
Y(t_i) = g(t_i) + \epsilon_i, \quad i = 1, 2, \ldots, n, \quad t_i \in \mathcal{T}, \qquad (1.1)
\]
where \(\epsilon = (\epsilon_1, \ldots, \epsilon_n)' \sim \mathcal{N}(0, \sigma^2 I_{n \times n})\) and \(g(\cdot)\) is some unknown function defined on an index set \(\mathcal{T}\). When \(\mathcal{T}\) is an interval of the real line, cubic polynomial smoothing splines are well known to provide an aesthetically satisfying method for estimating \(g(\cdot)\) from a realization \(y = (y_1, \ldots, y_n)'\) of \(Y = (Y(t_1), \ldots, Y(t_n))'\). See Rowlands, Liber and Daniel
(1974) for a very useful example. Splines are an appealing alternative to fitting a specified
set of m regression functions, for example polynomials of degree less than m, when one is
uncertain that the true curve g(-) is actually in the span of the specified regression functions.
Kimeldorf and Wahba (1970a, b, 1971) explored certain relationships between Bayesian
estimation and spline smoothing. In this note we provide a somewhat different formulation
and generalization of the result in Kimeldorf and Wahba (1971). Here we prove that poly-
nomial spline (respectively generalized spline) smoothing is equivalent to Bayesian estimation
with a prior on g which is "diffuse" on the coefficients of the polynomials of degree <m
(respectively specified set of m regression functions), and "proper" over an appropriate set
of random variables not including the coefficients of the regression functions. Since Gauss-Markov
estimation is equivalent to Bayesian estimation with a prior diffuse over the coefficients
of the regression functions, this result leads to the conclusion that spline smoothing is a (the ?)
natural extension of Gauss-Markov regression with m specified regression functions. We
claim that spline smoothing is an appropriate solution to the problem arising when one wants
to fit a given set of regression functions to the data but one also wants to "hedge" against
model errors, that is, against the possibility that the true model g is not exactly in the span of
the given set of regression functions. We show that the spline smoothing approach leads to a
natural measure of the deviation of the true g from the span of the regression functions and,
furthermore, a good value of this measure can be estimated from the data. The estimated
value of the measure is then used to control the deviation of the estimated g.
This content downloaded from 128.104.46.206 on Sun, 09 Jun 2019 01:40:55 UTC
All use subject to https://about.jstor.org/terms
From another point of view this measure can be viewed as the "bandwidth parameter"
which controls the "smoothness" of the estimated g, and so in this approach to non-parametric
(or semi-parametric) regression, a good value of the bandwidth parameter can be estimated
from the data.
Smith (1973) introduced uncertainty about a particular form of regression model in a
Bayesian context, though he did not assume a non-parametric form for the regression function.
The present work is philosophically close to that of Blight and Ott (1975), who adopted a
Bayesian approach to estimating g, and part of this work may be considered to be a
generalization of theirs. O'Hagan (1978) also treats the model (1.1) from a Bayesian view-
point, but the details of his approach appear to be somewhat different. The model of Young
(1977) can also be seen, in part, to be a special case of our generalized spline model with m = 1,
although our approach diverges from Young's at the point where he introduces priors on
his "hyperparameters". None of these works provide the feature of estimating the bandwidth
parameter from the data. The present set-up is briefly mentioned in my discussion of O'Hagan's
paper, where it is observed that O'Hagan's experimental design criteria (for the choice of
\(t_1, \ldots, t_n\)) can be formulated in the context of the approach in the present paper.
Other approaches to the estimation of g in the model (1.1) have been made by Priestley
and Chao (1972), Benedetti (1977), Clark (1977) and Stone (1977). Priestley and Chao, and
Benedetti use kernel non-parametric regression to estimate g and provide mean square error
convergence rates. For the polynomial smoothing splines considered here, integrated m.s.e.
convergence rates of the estimated g to the true g, as \(\max_i |t_{i+1} - t_i| \to 0\), have recently been
found by Craven and Wahba (1977) and are quoted in Section 5 for comparison with Priestley
and Chao's, and Benedetti's results.
In Section 4 we make some remarks concerning the efficient computation of generalized
splines.
We note that the method for estimating the "bandwidth" parameter of this paper can also
be used in connection with certain density and log spectral density estimates, see Wahba
(1978a, b).
Other recent related work includes Silverman (1978a), who provides a different approach to
estimating the bandwidth parameter in the density estimation context, and (1978b) a spline
estimate of the log density ratio, and Leonard (1978), who develops density estimates
from a Bayesian point of view.
2. POLYNOMIAL SPLINES AS POSTERIOR MEANS WITH A PARTIALLY IMPROPER PRIOR
ON THE POLYNOMIALS OF DEGREE LESS THAN m
Let \(\mathcal{T} = [0,1]\). Given data \(\{y(t_1), \ldots, y(t_n)\}\), \(0 < t_1 < \ldots < t_n < 1\), the smoothing polynomial
spline of degree \(2m-1\) to the data, call it \(g_{n,\lambda}\), is defined as the solution to the minimization
problem: Find \(g \in W(m) = \{g: g, g', \ldots, g^{(m-1)}\) abs. cont., \(g^{(m)} \in \mathcal{L}_2[0,1]\}\) to minimize
\[
\frac{1}{n}\sum_{j=1}^{n}(g(t_j) - y_j)^2 + \lambda \int_0^1 (g^{(m)}(u))^2\,du, \qquad (2.1)
\]
where \(y_j = y(t_j)\), and \(\lambda\) is to be chosen. If y cannot be interpolated exactly by some polynomial
of degree less than m, then the solution is well known to be a polynomial spline of degree
\(2m-1\) (see Schoenberg, 1964); that is, it is piecewise a polynomial of degree \(2m-1\) in each
interval \([t_i, t_{i+1}]\), \(i = 1, 2, \ldots, n-1\), with the pieces joined so that the resulting function has
\(2m-2\) continuous derivatives. An efficient computational algorithm for the cubic polynomial
smoothing spline (\(m = 2\)) is given by Reinsch (1967) and code is available in the IMSL
library (1977). We show that the spline solution \(g_{n,\lambda}\) to the minimization problem of (2.1)
is a Bayesian estimate for g with a "partially diffuse" prior; the quantity
\(J = \int_0^1 (g_{n,\lambda}^{(m)}(u))^2\,du\) is a natural measure of the deviation of \(g_{n,\lambda}\) from the span of the
polynomials of degree less than m, and furthermore a good value of J can be estimated from the data.
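The tradeoff expressed by (2.1) can be seen in a small discretized analogue: replace the roughness integral by a sum of squared second differences (a Whittaker-type graduation). This is an illustration of the criterion only, not the spline algorithm of Reinsch; the grid size, noise level and penalty values below are arbitrary choices.

```python
import numpy as np

def discrete_smoother(y, lam):
    """Minimizer of (1/n)||g - y||^2 + lam * ||D2 g||^2, a discretized
    stand-in for criterion (2.1) with m = 2 on an equally spaced grid."""
    n = len(y)
    D2 = np.zeros((n - 2, n))               # second-difference operator
    for i in range(n - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Setting the gradient to zero gives (I + n*lam*D2'D2) g = y
    return np.linalg.solve(np.eye(n) + n * lam * D2.T @ D2, y)

n = 50
t = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * t) + 0.1 * np.random.default_rng(0).standard_normal(n)

g_interp = discrete_smoother(y, 1e-12)   # lam -> 0: reproduces the data
g_line = discrete_smoother(y, 1e8)       # lam -> inf: second differences -> 0
```

As \(\lambda \to 0\) the fit interpolates the data; as \(\lambda \to \infty\) it collapses onto the null space of the penalty, here the least squares straight line, in accordance with part (ii) of Theorem 1 below.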
Theorem 1. Let \(g(t)\), \(t \in [0,1]\), have the prior distribution which is the same as the
distribution of the stochastic process \(X_g(t)\), \(t \in [0,1]\),
\[
X_g(t) = \sum_{j=1}^{m} \theta_j \phi_j(t) + b^{\frac{1}{2}} Z(t), \qquad (2.2)
\]
where \(\theta = (\theta_1, \ldots, \theta_m)' \sim \mathcal{N}(0, \xi I_{m \times m})\), \(\phi_j(t) = t^{j-1}/(j-1)!\), \(j = 1, 2, \ldots, m\), and \(Z(t)\)
is the m-fold integrated Wiener process (Shepp, 1966),
\[
Z(t) = \int_0^1 \frac{(t-u)_+^{m-1}}{(m-1)!}\,dW(u). \qquad (2.3)
\]
Then
(i) The polynomial spline \(g_{n,\lambda}(\cdot)\) which is the minimizer of (2.1) satisfies
\[
g_{n,\lambda}(t) = \lim_{\xi \to \infty} E_\xi\{g(t) \mid Y = y\} \qquad (2.4)
\]
with \(\lambda = \sigma^2/nb\), where \(E_\xi\) is expectation over the posterior distribution of g with the prior
(2.2). (\(\xi = \infty\) corresponds to the "diffuse" prior on \(\theta\).)
(ii) Suppose y cannot be interpolated exactly by some polynomial of degree less than m.
Then \(\lim_{\lambda \to \infty} g_{n,\lambda}(\cdot)\) is the polynomial of degree \(m-1\) best fitting the data in a least
squares sense, \(\lim_{\lambda \to 0} g_{n,\lambda}(\cdot)\) is that function in \(W(m)\) which minimizes \(\int_0^1 (g^{(m)}(u))^2\,du\)
subject to the condition that it interpolates y, and \(J(\lambda) = \int_0^1 (g_{n,\lambda}^{(m)}(u))^2\,du\) is a monotone strictly
decreasing function of \(\lambda\).
(iii) Let loss be measured by the mean square prediction error \(R(\lambda)\) given by
\[
R(\lambda) = n^{-1} \sum_{j=1}^{n} (g(t_j) - g_{n,\lambda}(t_j))^2.
\]
Define \(\hat{R}(\lambda)\) by
\[
\hat{R}(\lambda) = n^{-1}\{\|(I - A(\lambda))y\|^2 + \sigma^2 \operatorname{tr} A^2(\lambda) - \sigma^2 \operatorname{tr}(I - A(\lambda))^2\},
\]
where \(A(\lambda)\) is the symmetric non-negative definite matrix satisfying
\[
\mathbf{g}_{n,\lambda} = A(\lambda)y,
\]
where
\[
\mathbf{g}_{n,\lambda} = (g_{n,\lambda}(t_1), \ldots, g_{n,\lambda}(t_n))'.
\]
If \(\mathbf{g} = (g(t_1), \ldots, g(t_n))'\) is viewed as fixed, and expectation taken with respect to \(\epsilon\), then
\[
E\hat{R}(\lambda) = ER(\lambda),
\]
so that an optimum \(\lambda\) for squared error of prediction loss may be estimated from the data by
minimizing \(\hat{R}(\lambda)\).
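The covariance \(Q(s,t) = EZ(s)Z(t)\) of the prior process (2.3) can be checked numerically: integrating the product of the two truncated-power kernels by quadrature should reproduce the known closed forms (for m = 1, \(Z\) is the Wiener process and \(Q(s,t) = \min(s,t)\)). A small sketch; the quadrature grid size and the evaluation points are arbitrary choices.

```python
import numpy as np
from math import factorial

def Q(s, t, m, n_grid=100_000):
    """EZ(s)Z(t) for the m-fold integrated Wiener process of (2.3),
    by midpoint quadrature of the defining integral over [0, 1]."""
    u = (np.arange(n_grid) + 0.5) / n_grid
    def trunc_pow(x, p):
        # (x)_+^p with the convention x_+^0 = 1 for x > 0, 0 otherwise
        return np.where(x > 0, np.maximum(x, 0.0) ** p, 0.0)
    f = trunc_pow(s - u, m - 1) * trunc_pow(t - u, m - 1)
    return f.sum() / n_grid / factorial(m - 1) ** 2

s, t = 0.3, 0.7
q1 = Q(s, t, m=1)                          # should equal min(s, t)
mn = min(s, t)
q2 = Q(s, t, m=2)
q2_closed = s * t * mn - (s + t) * mn**2 / 2 + mn**3 / 3
```

The m = 2 closed form used here follows by expanding \(\int_0^{\min(s,t)}(s-u)(t-u)\,du\) directly.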
Before giving the proof we discuss the meaning of this Theorem. We interpret (i) and (ii)
as saying that estimation with the polynomial spline \(g_{n,\lambda}\) should be viewed as a (the?) natural
extension of Gauss-Markov estimation with polynomial regression functions (i.e. estimation
with \(g_{n,\infty}\)). This is because the Gauss-Markov regression estimate can be obtained as the
posterior mean of g when g has a prior diffuse on the coefficients of the polynomials, while
\(g_{n,\lambda}\), \(\lambda < \infty\), is obtained as the posterior mean of g when g has a diffuse prior on the coefficients of
the polynomials modified by the addition of \(b^{\frac{1}{2}}Z(\cdot)\) to the prior specification, \(b > 0\).
In practice \(\lambda = \sigma^2/nb\) is not generally known, so it is fortunate that \(\lambda\) can be estimated
from the data via (iii). If \(\sigma^2\) is not known, an estimate of \(\lambda\) which minimizes \(ER(\lambda)\) asymptotically
for large n for fixed \(g \in W(m)\) can be obtained by using the method of generalized
cross-validation (GCV) as described in Craven and Wahba (1977).
Proof of Theorem 1. Part (ii) is well known; see Schoenberg (1964), Reinsch (1967, 1971),
Anselone and Laurent (1968), Kimeldorf and Wahba (1970b). To prove (i) we use Lemma 5.1
of Kimeldorf and Wahba (1971), where an explicit formula for \(g_{n,\lambda}\) is given. It is
\[
g_{n,\lambda}(t) = (\phi_1(t), \ldots, \phi_m(t))(T'M^{-1}T)^{-1}T'M^{-1}y
+ (Q_{t_1}(t), \ldots, Q_{t_n}(t))M^{-1}(I - T(T'M^{-1}T)^{-1}T'M^{-1})y, \qquad (2.5)
\]
where T is the \(n \times m\) matrix of rank m with jkth entry \(\phi_k(t_j)\), \(M = n\lambda I_{n \times n} + Q_n\), \(Q_n\) is the
\(n \times n\) matrix with jkth entry \(Q(t_j, t_k)\) and \(Q_{t_i}(t) = Q(t_i, t)\), where
\[
Q(s,t) = \int_0^1 \frac{(s-u)_+^{m-1}(t-u)_+^{m-1}}{((m-1)!)^2}\,du.
\]
(We remark that \(Q(s,t) = EZ(s)Z(t)\).) With the prior of (2.2) it is easily seen that the prior
covariances \(EX_g(t)Y'\) and \(EYY'\) are
\[
EX_g(t)Y' = \xi(\phi_1(t), \ldots, \phi_m(t))T' + b(Q_{t_1}(t), \ldots, Q_{t_n}(t)),
\]
\[
EYY' = \xi TT' + bQ_n + \sigma^2 I.
\]
Setting \(\lambda = \sigma^2/nb\), \(\eta = \xi/b\) and \(M = Q_n + n\lambda I\) gives
\[
E\{X_g(t) \mid Y = y\} = (\phi_1(t), \ldots, \phi_m(t))\,\eta T'(\eta TT' + M)^{-1}y + (Q_{t_1}(t), \ldots, Q_{t_n}(t))(\eta TT' + M)^{-1}y. \qquad (2.6)
\]
By comparing (2.5) and (2.6), it remains only to show that
\[
\lim_{\eta \to \infty} \eta T'(\eta TT' + M)^{-1} = (T'M^{-1}T)^{-1}T'M^{-1} \qquad (2.7)
\]
and
\[
\lim_{\eta \to \infty} (\eta TT' + M)^{-1} = M^{-1}(I - T(T'M^{-1}T)^{-1}T'M^{-1}). \qquad (2.8)
\]
Now, it can be verified that
\[
(\eta TT' + M)^{-1} = M^{-1} - M^{-1}T(T'M^{-1}T)^{-1}\{I + \eta^{-1}(T'M^{-1}T)^{-1}\}^{-1}T'M^{-1}, \qquad (2.9)
\]
and expanding in powers of \(\eta^{-1}\) and letting \(\eta \to \infty\) completes the proof of (2.7) and (2.8).
(iii) appears in Craven and Wahba (1977), but since the proof is immediate we give it here.
We have
\[
ER(\lambda) = E\,n^{-1}\|A(\lambda)y - \mathbf{g}\|^2 = n^{-1}\{\|(I - A(\lambda))\mathbf{g}\|^2 + \sigma^2 \operatorname{tr} A^2(\lambda)\}
\]
and (iii) follows from
\[
E\|(I - A(\lambda))y\|^2 = \|(I - A(\lambda))\mathbf{g}\|^2 + \sigma^2 \operatorname{tr}(I - A(\lambda))^2.
\]
We remark that \(A(\lambda)\) is obtained from (2.5) and is
\[
A(\lambda) = T(T'M^{-1}T)^{-1}T'M^{-1} + Q_n M^{-1}(I - T(T'M^{-1}T)^{-1}T'M^{-1}).
\]
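The limits (2.7)-(2.8) and the stated properties of \(A(\lambda)\) can be confirmed numerically in a small case. The sketch below uses the simplest generalized setting, m = 1 with a constant regression function and \(Q(s,t) = \min(s,t)\); the sizes, \(\lambda\) and \(\eta\) are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

n, lam, eta = 10, 0.1, 1e6
t = np.linspace(0.1, 1.0, n)
T = np.ones((n, 1))                      # m = 1, phi_1(t) = 1
Qn = np.minimum.outer(t, t)              # Q(s,t) = EZ(s)Z(t) for m = 1
M = Qn + n * lam * np.eye(n)
Minv = np.linalg.inv(M)

# Right-hand sides of (2.7) and (2.8)
G = np.linalg.inv(T.T @ Minv @ T)
rhs27 = G @ T.T @ Minv
rhs28 = Minv @ (np.eye(n) - T @ G @ T.T @ Minv)

# Finite-eta quantities appearing in (2.6); eta large mimics the diffuse limit
S = np.linalg.inv(eta * (T @ T.T) + M)
lhs27 = eta * (T.T @ S)
lhs28 = S

# Influence matrix from the closing formula of the proof
A = T @ G @ T.T @ Minv + Qn @ Minv @ (np.eye(n) - T @ G @ T.T @ Minv)
```

Since \(Q_n = M - n\lambda I\), the formula for \(A(\lambda)\) simplifies to \(I - n\lambda(M^{-1} - M^{-1}T(T'M^{-1}T)^{-1}T'M^{-1})\), which makes the symmetry claimed in (iii) apparent.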
Craven and Wahba (1977) and Utreras (1978) both give some aesthetically very pleasing
plots of \(g_{n,\hat{\lambda}}\), where \(\hat{\lambda}\) is chosen by the GCV method, and \(m = 2\). (The GCV method chooses
\(\lambda\) to minimize \(V(\lambda) = \|(I - A(\lambda))y\|^2/[\operatorname{tr}(I - A(\lambda))]^2\).) Both reports demonstrate nicely how
well \(g_{n,\hat{\lambda}}\) recovers g. If \(\sigma^2\) is known accurately, the minimizer of \(\hat{R}(\lambda)\) can be expected to
behave much like \(\hat{\lambda}\). In Craven and Wahba, the algorithm of Reinsch (1967) is used to
compute \(g_{n,\lambda}\). Utreras (1978) gives approximate expressions for the eigenvalues of \(A(\lambda)\) in
the large n, equally spaced data case which can considerably simplify the calculation of \(\hat{R}(\lambda)\)
or \(V(\lambda)\).
We remark that m as well as \(\lambda\) can be estimated from the data by minimizing \(\hat{R}\) (or V)
as a function of both these parameters.
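The two data-based criteria, \(V(\lambda)\) for unknown \(\sigma^2\) and \(\hat{R}(\lambda)\) for known \(\sigma^2\), can be sketched on a grid of \(\lambda\). For simplicity the sketch again uses the m = 1 setting with \(Q(s,t) = \min(s,t)\) and a constant regression function rather than the cubic-spline computation of Reinsch's algorithm; the simulated curve, noise level and grid are all illustrative assumptions.

```python
import numpy as np

def influence_matrix(t, lam):
    """A(lambda) for the m = 1 generalized spline with Q(s,t) = min(s,t)
    and a single regression function phi_1 = 1 (illustrative choice)."""
    n = len(t)
    Qn = np.minimum.outer(t, t)
    M = Qn + n * lam * np.eye(n)
    Minv = np.linalg.inv(M)
    T = np.ones((n, 1))
    P = T @ np.linalg.inv(T.T @ Minv @ T) @ T.T @ Minv
    # Simplification of the closing formula of the proof of Theorem 1
    return np.eye(n) - n * lam * (Minv - Minv @ P)

rng = np.random.default_rng(0)
n, sigma = 40, 0.1
t = np.linspace(0.02, 0.98, n)
g = np.sin(3 * t)
y = g + sigma * rng.standard_normal(n)

lams = np.logspace(-6, 1, 30)
V, Rhat, dof = [], [], []
for lam in lams:
    A = influence_matrix(t, lam)
    r = y - A @ y
    V.append((r @ r) / np.trace(np.eye(n) - A) ** 2)   # GCV criterion
    Rhat.append((r @ r + sigma**2 * np.trace(A @ A)    # unbiased risk
                 - sigma**2 * np.trace((np.eye(n) - A) @ (np.eye(n) - A))) / n)
    dof.append(np.trace(A))                            # "bandwidth" summary

lam_gcv = lams[int(np.argmin(V))]
```

The trace of \(A(\lambda)\) decreases monotonically from n toward m as \(\lambda\) grows, which is one way to see \(\lambda\) acting as the bandwidth parameter of the Introduction.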
Numerical experiments in estimating m as well as A have been performed in connection
with the log spectral density estimates of Wahba (1978b) and it was found that a modest
improvement in mean square error can sometimes be made by estimating m, instead of using
m = 2, the cubic spline case.
3. GENERALIZED SPLINES AS POSTERIOR MEANS WITH A PARTIALLY IMPROPER PRIOR
We now consider the general case where polynomials on [0,1] are replaced by some real-valued
functions \(\{\phi_j\}_{j=1}^m\) defined on some arbitrary index set \(\mathcal{T}\). For example, \(\mathcal{T}\) may be
a square or a sphere. We require only that the \(n \times m\) matrix with jkth entry \(\phi_k(t_j)\) be of rank m.
Families of extensions of Gauss-Markov estimates analogous to the smoothing polynomial
spline will be found. These estimates will be generalized splines.
A very general form of Theorem 1, for these essentially arbitrary \(\mathcal{T}\) and \(\{\phi_j\}\), can be stated
in the context of reproducing kernel Hilbert spaces (r.k.h.s.). We have concluded from the
work of Parzen (1961, 1970) that r.k.h.s. is in fact a natural setting for analysing arbitrary
Gaussian stochastic processes with continuous time parameter. Thus, we beg the reader's
indulgence while we give a definition of a generalized spline as the solution to a minimization
problem in r.k.h.s. Then we proceed to the general form of Theorem 1.
We note (Aronszajn, 1950) that a (real) r.k.h.s. \(\mathcal{H}\) is a Hilbert space of real-valued
functions on \(\mathcal{T}\) with the property that, for each fixed \(t^* \in \mathcal{T}\), the linear functional which maps
\(g \in \mathcal{H}\) to \(g(t^*)\) is a continuous linear functional. Then, by the Riesz representation theorem
(Akhiezer and Glazman, 1961, p. 33), there exists an element, call it \(\delta_{t^*}\), in \(\mathcal{H}\) such that
\(\langle g, \delta_{t^*} \rangle = g(t^*)\), where \(\langle \cdot, \cdot \rangle\) is the inner product in \(\mathcal{H}\). We can associate with \(\mathcal{H}\) the so-called
reproducing kernel (r.k.) \(K(s,t)\), \(s, t \in \mathcal{T}\), defined by \(K(s,t) = \langle \delta_s, \delta_t \rangle\); clearly \(\delta_s(t) = K(s,t)\).
The kernel \(K(s,t)\) is always positive definite (since \(\|\sum a_i \delta_{t_i}\|^2 > 0\)) and so there always exists a
Gaussian stochastic process with K as its covariance. We will denote by \(\mathcal{H}_K\) the r.k.h.s.
with r.k. K, and let the inner product in \(\mathcal{H}_K\) be \(\langle \cdot, \cdot \rangle_K\).
We now let \(\mathcal{H}_K\) be any r.k.h.s. of real-valued functions on \(\mathcal{T}\) which contains the \(\{\phi_j\}\).
(Construction of such \(\mathcal{H}_K\) when the \(\{\phi_j\}\) are any extended Tchebychev system of functions on
[0,1] may be found in Kimeldorf and Wahba (1971).) It is not hard to show that \(\mathcal{H}_K\) has a
representation as the direct sum of span\(\{\phi_j\}\) and \(\mathcal{H}_Q\), the r.k.h.s. with r.k. \(Q(s,t)\), \(s, t \in \mathcal{T}\),
given by (see Wahba, 1973)
\[
Q(s,t) = K(s,t) - \sum_{i,j=1}^{m} \phi_i(s)\,k^{ij}\,\phi_j(t),
\]
where \(k^{ij}\) is the ijth entry of the inverse of the (necessarily strictly positive definite) \(m \times m\) matrix
with ijth entry \(\langle \phi_i, \phi_j \rangle_K\). Let \(P_Q\) be the orthogonal projection operator in \(\mathcal{H}_K\) onto \(\mathcal{H}_Q\).
(That is, \(I - P_Q\) is the orthogonal projection in \(\mathcal{H}_K\) onto span\(\{\phi_j\}\).) The analogue of
\(\int_0^1 (g^{(m)}(u))^2\,du\) is \(\|P_Q g\|_K^2\), and this is, of course, a measure of the deviation of g from span
\(\{\phi_j\}\), being the distance in \(\mathcal{H}_K\) from g to span\(\{\phi_j\}\).
Suppose y is not in the span of the vectors \(\{\tilde{\phi}_j\}_{j=1}^m\), where \(\tilde{\phi}_j = (\phi_j(t_1), \ldots, \phi_j(t_n))'\). Then
(Anselone and Laurent, 1968; Kimeldorf and Wahba, 1971) there is a unique solution, call
it \(g_{n,\lambda}\), to the minimization problem: Find \(g \in \mathcal{H}_K\) to minimize
\[
\frac{1}{n}\sum_{j=1}^{n}(g(t_j) - y_j)^2 + \lambda \|P_Q g\|_K^2. \qquad (3.1)
\]
We shall call any \(g_{n,\lambda}\) obtained as a solution of this minimization problem a generalized
smoothing spline or, consistent with the terminology of Kimeldorf and Wahba (1971), a Tchebycheffian
spline.
Theorem 2. Let \(g(t)\), \(t \in \mathcal{T}\), have the prior distribution which is the same as the distribution
of the stochastic process \(X_g(t)\),
\[
X_g(t) = \sum_{j=1}^{m} \theta_j \phi_j(t) + b^{\frac{1}{2}} Z(t), \quad t \in \mathcal{T}, \qquad (3.2)
\]
where \(\theta = (\theta_1, \ldots, \theta_m)' \sim \mathcal{N}(0, \xi I_{m \times m})\), \(b > 0\) is fixed and \(Z(t)\) is a zero mean Gaussian stochastic
process with \(EZ(s)Z(t) = Q(s,t)\). Then:
(i) The generalized spline \(g_{n,\lambda}\) which is the minimizer of (3.1) has the property
\[
g_{n,\lambda}(t) = \lim_{\xi \to \infty} E_\xi\{g(t) \mid Y = y\},
\]
with \(\lambda = \sigma^2/nb\), where \(E_\xi\) is expectation over the posterior distribution of \(g(t)\) with the prior
(3.2).
(ii) Suppose y is not in the span of the \(\{\tilde{\phi}_j\}\). Then \(\lim_{\lambda \to \infty} g_{n,\lambda}(\cdot)\) is that element in span
\(\{\phi_j(\cdot)\}\) best fitting the data in a least squares sense. If \(Q_n\), the \(n \times n\) matrix with ijth entry
\(Q(t_i, t_j)\), is of full rank, \(\lim_{\lambda \to 0} g_{n,\lambda}(\cdot)\) is that function in \(\mathcal{H}_K\) which minimizes \(\|P_Q g\|_K^2\) subject
to the condition that it interpolate the data, and \(J(\lambda) = \|P_Q g_{n,\lambda}\|_K^2\) is a monotone decreasing
function of \(\lambda\).
(iii) Let
\[
R(\lambda) = n^{-1} \sum_{j=1}^{n} (g(t_j) - g_{n,\lambda}(t_j))^2,
\]
and define \(\hat{R}(\lambda)\) by
\[
\hat{R}(\lambda) = n^{-1}\{\|(I - A(\lambda))y\|^2 + \sigma^2 \operatorname{tr} A^2(\lambda) - \sigma^2 \operatorname{tr}(I - A(\lambda))^2\},
\]
where \(A(\lambda)\) is the symmetric, non-negative definite matrix satisfying
\[
\mathbf{g}_{n,\lambda} = A(\lambda)y.
\]
If \(\mathbf{g}\) is viewed as fixed, then
\[
ER(\lambda) = E\hat{R}(\lambda),
\]
so that an optimum \(\lambda\) for squared error of prediction loss may be estimated from the data by
minimizing \(\hat{R}(\lambda)\).
We remark that the function \(g_{n,\lambda}\) is given by (2.5), with \(Q(s,t) = EZ(s)Z(t)\); similarly,
the matrix \(A(\lambda)\) is as in Section 2.
Proof of Theorem 2. Beginning with Lemma 5.1 of Kimeldorf and Wahba (1971), the proof
parallels directly the proof of Theorem 1, and is omitted.
4. REPRESENTATIONS OF \(g_{n,\lambda}\) FOR EFFICIENT COMPUTING
We believe smoothing splines to be appropriate for solving a wide variety of practical
problems, including smoothing surfaces, once efficient numerical algorithms are
developed. If \(\mathcal{H}_K\) is a space of periodic functions on [0,1] or a tensor product of periodic
spaces on \([0,1] \times \ldots \times [0,1]\), and the \(\{t_i\}\) are equally spaced or the tensor product of equally
spaced points, then computing problems are readily solved. (See Wahba, 1977, for a computed
example.) In general, however, the efficient computation of \(g_{n,\lambda}\) presents challenges
if n is very large, as would usually be the case if \(\mathcal{T}\) is a rectangle in d-space. It will probably
be necessary to choose Q with computational ease as an important consideration.
Equation (2.5) will generally not be the best representation for computing \(g_{n,\lambda}\). We discuss
some other representations for \(g_{n,\lambda}\) chosen with efficient computing in mind. We assume below
that \(Q_n\) is of full rank. Since
\[
T'M^{-1}(I - T(T'M^{-1}T)^{-1}T'M^{-1}) = 0_{m \times n},
\]
it is clear that \(g_{n,\lambda}\) has a representation
\[
g_{n,\lambda} = \sum_{j=1}^{m} \theta_j \phi_j + \sum_{i=1}^{n-m} c_i h_i, \qquad (4.1)
\]
where \(\theta\) and \(c = (c_1, \ldots, c_{n-m})'\) are vectors of constants, and
\[
h_i = \sum_{j=1}^{n} b_{ij} Q_{t_j},
\]
where the \((n-m) \times n\) dimensional matrix B with ijth entry \(b_{ij}\) satisfies \(BT = 0_{(n-m) \times m}\) but
is otherwise arbitrary.
We will demonstrate shortly that c, \(\theta\) and \(A(\lambda)y = \mathbf{g}_{n,\lambda}\) satisfy
\[
(\Sigma_h + n\lambda BB')c = By, \qquad (4.2)
\]
\[
T\theta = y - MB'c \qquad (4.3)
\]
and
\[
\mathbf{g}_{n,\lambda} = A(\lambda)y = y - n\lambda B'c, \qquad (4.4)
\]
where \(\Sigma_h\) is the \((n-m) \times (n-m)\) dimensional matrix with ijth entry \(\langle h_i, h_j \rangle_Q\). The idea is
to choose B so that \(\{h_i\}\), B and \(\Sigma_h\) have convenient properties; one can then
obtain c, \(\theta\), \(\mathbf{g}_{n,\lambda}\) and \(g_{n,\lambda}(\cdot)\) from (4.1) to (4.4) by first solving (4.2) for c. In the
polynomial spline case, by choosing the entries in B corresponding to divided differences,
one can obtain \(\Sigma_h\) and B both banded matrices and an efficient code results (see Reinsch,
1967; Anselone and Laurent, 1968). The span of the \(\{h_i\}\) can be constructed from B-splines,
which are nice hill-like functions (see Curry and Schoenberg, 1966; de Boor, 1972).
Equation (4.2) can be shown to be equivalent to Anselone and Laurent, equations (8.26)
and (9.1). However, we provide a direct proof of (4.2) using (2.5), without the elegant but
lengthy machinery of their work. We must show that
\[
(h_1, \ldots, h_{n-m})(\Sigma_h + n\lambda BB')^{-1}By = (Q_{t_1}, \ldots, Q_{t_n})(M^{-1} - M^{-1}T(T'M^{-1}T)^{-1}T'M^{-1})y. \qquad (4.5)
\]
Now since \(\langle Q_{t_i}, Q_{t_j} \rangle_Q = Q(t_i, t_j)\), we have that \(\Sigma_h = BQ_nB'\) and so the left-hand side of (4.5)
is given by
\[
(Q_{t_1}, \ldots, Q_{t_n})B'(BMB')^{-1}By. \qquad (4.6)
\]
However,
\[
B'(BMB')^{-1}B = M^{-1} - M^{-1}T(T'M^{-1}T)^{-1}T'M^{-1}, \qquad (4.7)
\]
as can be seen by observing that the \(n \times n\) matrix
\[
X = \begin{pmatrix} BM \\ T' \end{pmatrix}
\]
is of full rank and
\[
X\{M^{-1} - M^{-1}T(T'M^{-1}T)^{-1}T'M^{-1}\}X' = X\{B'(BMB')^{-1}B\}X'. \qquad (4.8)
\]
Equations (4.3) and (4.4) follow immediately from (4.2) and (2.5).
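The identity (4.7) and the route through (4.2)-(4.4) are easy to confirm numerically. In the sketch below B is taken as an orthonormal basis of the complement of the column space of T (one valid choice among the "otherwise arbitrary" B with BT = 0), and the kernel, sizes and \(\lambda\) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, lam = 12, 2, 0.05
t = np.linspace(0.05, 0.95, n)
T = np.vander(t, m, increasing=True)       # columns 1, t (rank m)
Qn = np.minimum.outer(t, t)                # an arbitrary full-rank kernel;
                                           # the identities need only BT = 0
M = Qn + n * lam * np.eye(n)
Minv = np.linalg.inv(M)

# B: (n-m) x n with BT = 0, rows orthonormal in the complement of col(T)
U, _, _ = np.linalg.svd(T)
B = U[:, m:].T

# Identity (4.7)
lhs = B.T @ np.linalg.inv(B @ M @ B.T) @ B
rhs = Minv - Minv @ T @ np.linalg.inv(T.T @ Minv @ T) @ T.T @ Minv

# Route (4.2)-(4.4): an (n-m)-dimensional solve, then recover the fit
y = rng.standard_normal(n)
Sigma_h = B @ Qn @ B.T                     # <h_i, h_j>_Q = (B Q_n B')_ij
c = np.linalg.solve(Sigma_h + n * lam * B @ B.T, B @ y)
fit_44 = y - n * lam * B.T @ c             # (4.4)
theta, *_ = np.linalg.lstsq(T, y - M @ B.T @ c, rcond=None)   # (4.3)

# Direct influence matrix of Section 2, in its simplified form
P = T @ np.linalg.inv(T.T @ Minv @ T) @ T.T @ Minv
A = np.eye(n) - n * lam * (Minv - Minv @ P)
```

Note that \(\Sigma_h + n\lambda BB' = B(Q_n + n\lambda I)B' = BMB'\), which is why the left-hand side of (4.2) reduces to the \((n-m) \times (n-m)\) system solved above.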
5. CONVERGENCE PROPERTIES OF \(g_{n,\lambda}\)
In the case of polynomial splines with \(\mathcal{T} = [0,1]\) the mean square error convergence
properties (of \(ER(\lambda)\)) are known from Craven and Wahba (1977) and we give them here for
comparison purposes. We have, from Theorem 1,
\[
ER(\lambda) = E\,n^{-1}\sum_{j=1}^{n}(g(t_j) - g_{n,\lambda}(t_j))^2 = n^{-1}\{\|(I - A(\lambda))\mathbf{g}\|^2 + \sigma^2 \operatorname{tr} A^2(\lambda)\}.
\]
Using Lemmas 4.1 and 4.3 of Craven and Wahba it can be shown (ignoring terms of o(1))
that an upper bound on \(ER(\lambda)\) is given by
\[
ER(\lambda) \le \lambda \int_0^1 (g^{(m)}(u))^2\,du + \frac{c}{n\lambda^{1/2m}},
\]
where
\[
c = \sigma^2 \max_i\,[n(t_{i+1} - t_i)]\,\frac{1}{2m}\int_0^\infty \frac{dx}{(1 + x^{2m})^2}.
\]
This bound is minimized for \(\lambda = \text{const}\cdot n^{-2m/(2m+1)}\) and so
\[
\min_\lambda ER(\lambda) \le O(n^{-2m/(2m+1)}).
\]
We remark on the comparison between this rate and that obtained by Priestley and Chao
(1972) and Benedetti (1977) for kernel type non-parametric regression estimates. They
obtain mean square error at a point convergence rates for their estimate, call it \(\hat{g}\), of the form
\[
E(\hat{g}(t) - g(t))^2 = O(n^{-2m/(2m+1)})
\]
under the assumption that \(g^{(m)}(\cdot)\) is well defined and bounded at t. Their rates and ours are
not directly comparable since we assume \(g \in W(m)\), and compute an estimate of integrated
mean square error. However, as in the case of density estimation (see Wahba, 1975a, 1976),
it appears that the same convergence rates under identical assumptions will obtain if the method
is matched to m and the bandwidth parameter is chosen optimally.
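The scaling of the minimized upper bound can be checked by minimizing \(\lambda J + c/(n\lambda^{1/2m})\) numerically over \(\lambda\): multiplying n by 16 should shrink the minimum by very nearly \(16^{-2m/(2m+1)}\). The values of J, c and n below are placeholders, since only the exponent is being verified.

```python
import numpy as np

def min_bound(n, m=2, J=1.0, c=1.0):
    """Numerically minimize lam*J + c/(n*lam**(1/(2m))) over a log grid,
    i.e. the upper bound on ER(lam) above (J and c are placeholders)."""
    lam = np.logspace(-12, 2, 40001)
    return np.min(lam * J + c / (n * lam ** (1.0 / (2 * m))))

m = 2
ratio = min_bound(16_000, m) / min_bound(1_000, m)
# A 16-fold increase in n should shrink the minimized bound by a factor
# close to 16**(-2m/(2m+1)), i.e. 16**(-4/5) for m = 2.
```

Setting the derivative of the bound to zero gives \(\lambda^* = \{c/(2mnJ)\}^{2m/(2m+1)}\), from which the \(n^{-2m/(2m+1)}\) rate follows in closed form as well.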
ACKNOWLEDGEMENTS
We thank a referee for suggesting the identity (2.9), which considerably shortened the
original proof of Theorem 1.
This research was supported by the U.S. Army under Contract No. DAAG29-77-G-0207.
REFERENCES
AKHIEZER, N. I. and GLAZMAN, I. M. (1961). Theory of Linear Operators in Hilbert Space. Translated
from the Russian by Merlynd Nestel, p. 33. New York: Ungar.
ANSELONE, P. M. and LAURENT, P. J. (1968). A general method for the construction of interpolating or
smoothing spline-functions. Numer. Math., 12, 66-82.
ARONSZAJN, N. (1950). Theory of reproducing kernels. Trans. Amer. Math. Soc., 68, 337-404.
BENEDETTI, J. K. (1977). On the nonparametric estimation of regression functions. J. R. Statist.
Soc. B, 39, 248-253.
BLIGHT, B. J. N. and OTT, C. (1975). A Bayesian approach to model inadequacy for polynomial regression.
Biometrika, 62, 79-88.
CLARK, R. M. (1977). Non-parametric estimation of a smooth regression function. J. R. Statist. Soc. B,
39, 107-113.
CRAVEN, P. and WAHBA, G. (1977). Smoothing noisy data with spline functions: estimating the correct
degree of smoothing by the method of generalized cross-validation. University of Wisconsin, Statistics
Department Technical Report No. 445, October 1977. To appear in Numer. Math.
CURRY, H. B. and SCHOENBERG, I. J. (1966). On Pólya frequency functions IV: the fundamental spline
functions and their limits. J. Analyse Math., 17, 71-107.
DE BOOR, C. (1972). On calculating with B-splines. J. Approximation Theory, 6, 50-62.
HOUSEHOLDER, A. (1964). The Theory of Matrices in Numerical Analysis. New York: Blaisdell.
INTERNATIONAL MATHEMATICAL AND STATISTICAL LIBRARIES, INC., MANUAL (1977). Subroutine ICSSCU.
KIMELDORF, G. and WAHBA, G. (1970a). A correspondence between Bayesian estimation on stochastic
processes and smoothing by splines. Ann. Math. Statist., 41, 495-502.
- (1970b). Spline functions and stochastic processes. Sankhyā A, 32, 173-180.
- (1971). Some results on Tchebycheffian spline functions. J. Math. Anal. and Applic., 33, 82-95.
LEONARD, T. (1978). Density estimation, stochastic processes and prior information (with Discussion).
J. R. Statist. Soc. B, 40, 113-146.
O'HAGAN, A. (1978). Curve fitting and optimal design for prediction (with Discussion). J. R. Statist. Soc.
B, 40, 1-42.
PARZEN, E. (1961). An approach to time series analysis. Ann. Math. Statist., 32, 951-989.
(1970). Statistical inference on time series by RKHS methods, Proceedings of the 12th Biennial
Seminar of the Canadian Mathematical Congress, pp. 1-37.
PRIESTLEY, M. B. and CHAO, M. T. (1972). Non-parametric function fitting. J. R. Statist. Soc. B, 34, 385-392.
REINSCH, C. H. (1967). Smoothing by spline functions. Numer. Math., 10, 177-183.
- (1971). Smoothing by spline functions II. Numer. Math., 16, 451-454.
ROWLANDS, R. E., LIBER, T. and DANIEL, I. M. (1974). Stress analysis of anisotropic laminated plates.
J. Amer. Inst. Aeronautics and Astronautics, 12, 7, 903-908.
SCHOENBERG, I. J. (1964). Spline functions and the problem of graduation. Proc. Nat. Acad. Sci. (USA),
52, 947-950.
SHEPP, L. A. (1966). Radon-Nikodym derivatives of Gaussian measures. Ann. Math. Statist., 37, 321-354.
SILVERMAN, B. W. (1978a). Choosing the window width when estimating a density. Biometrika, 65, 1-11.
- (1978b). Density ratios, empirical likelihood and cot death. Appl. Statist., 27, 26-33.
SMITH, A. F. M. (1973). Bayes estimates in one-way and two-way models. Biometrika, 60, 319-329.
STONE, C. J. (1977). Consistent nonparametric regression. Ann. Statist., 5, 595-645.
UTRERAS, F. (1978). Sur le choix du paramètre d'ajustement dans le lissage par fonctions spline. No. 296,
Séminaire d'Analyse Numérique, Mathématiques Appliquées, Université Scientifique et Médicale de
Grenoble.
WAHBA, G. (1973). A class of approximate solutions to linear operator equations. J. Approxim. Theory,
9, 61-77.
- (1975a). Optimal convergence properties of variable knot, kernel, and orthogonal series methods for
density estimation. Ann. Statist., 3, 15-29.
(1975b). A canonical form for the problem of estimating smooth surfaces. University of Wisconsin-
Madison, Department of Statistics, Technical Report No. 420.
(1976). Histosplines with knots which are order statistics. J. R. Statist. Soc. B, 38, 140-151.
- (1977). Optimal smoothing of density estimates. In Classification and Clustering (J. Van Ryzin,
ed.), pp. 423-458. New York: Academic Press.
- (1978a). Data-based optimal smoothing of orthogonal series density estimates, University of
Wisconsin-Madison, Department of Statistics, Technical Report No. 509, (submitted).
(1978b). Automatic smoothing of the log periodogram. University of Wisconsin-Madison, Depart-
ment of Statistics. Technical Report No. 536 (submitted).
YOUNG, A. S. (1977). A Bayesian approach to prediction using polynomials. Biometrika, 64, 309-317.