HW#4
Local Smoothing. The goal of this homework is to help you better understand the statistical properties
and computational challenges of local smoothing methods such as loess, Nadaraya-Watson (NW) kernel
smoothing, and spline smoothing.
For this purpose, we will compute the empirical bias, empirical variance, and empirical mean square error
(MSE) based on m = 1000 Monte Carlo runs, where in each run we simulate a data set of n = 101
observations from the additive noise model
$$Y_i = f(x_i) + \epsilon_i, \qquad (1)$$
where $\epsilon_1, \ldots, \epsilon_n$ are independent and identically distributed (iid) $N(0, 0.2^2)$ and $f$ is the Mexican hat function
$$f(x) = (1 - x^2)\, e^{-x^2/2}. \qquad (2)$$
This function is known to pose a variety of estimation challenges, and below we explore the difficulties inherent in this function.
(1) Let us first consider the (deterministic fixed) design with equi-distant points in [−2π, 2π].
(a) For each of m = 1000 Monte Carlo runs, simulate or generate a data set of the form $(x_i, Y_i)$ with
$x_i = 2\pi\left(-1 + 2\,\frac{i-1}{n-1}\right)$, where $Y_i$ is from the model in (1). Denote such a data set as $D_j$ at the j-th
Monte Carlo run for $j = 1, \ldots, m = 1000$.
(b) For each data set $D_j$ or each Monte Carlo run, compute the three different kinds of local smoothing
estimates at every point in $D_j$: loess (with span = 0.75), Nadaraya-Watson (NW) kernel
smoothing with Gaussian kernel and bandwidth = 0.2, and spline smoothing with the default
tuning parameter.
(c) At each point $x_i$, for each local smoothing method, based on m = 1000 Monte Carlo runs, compute
the empirical bias, empirical variance, and empirical mean square error (MSE), which are defined
as
$$\widehat{\mathrm{Bias}}\{f(x_i)\} = \bar{f}_m(x_i) - f(x_i), \quad \text{with } \bar{f}_m(x_i) = \frac{1}{m} \sum_{j=1}^{m} \hat{f}^{(j)}(x_i),$$
$$\widehat{\mathrm{Var}}\{f(x_i)\} = \frac{1}{m} \sum_{j=1}^{m} \left( \hat{f}^{(j)}(x_i) - \bar{f}_m(x_i) \right)^2,$$
$$\widehat{\mathrm{MSE}}\{f(x_i)\} = \frac{1}{m} \sum_{j=1}^{m} \left( \hat{f}^{(j)}(x_i) - f(x_i) \right)^2,$$
where $\hat{f}^{(j)}(x_i)$ denotes the estimate at $x_i$ from the j-th Monte Carlo run. Here we use the
true function value $f(x_i)$ in (2) in the definitions of the empirical bias and empirical MSE, which
helps us better understand the performance of the different local smoothing methods when
estimating the true function $f$. Moreover, we purposely use the coefficient $\frac{1}{m}$ (instead of the
standard coefficient $\frac{1}{m-1}$) in the definition of the empirical variance, so that the well-known relation
MSE = Bias$^2$ + Var is applicable to the empirical version.
(d) Plot these quantities against $x_i$ for all three kinds of local smoothing estimators: loess, NW kernel,
and spline smoothing (a sample R sketch for parts (a)-(d) is given after this list).
(e) Provide a thorough analysis of what the plots suggest, e.g., which method is better/worse on bias,
variance, and MSE? Do you think it is a fair comparison between these three methods?
Why or why not?
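To make parts (a)-(d) concrete, below is a minimal R sketch for the equidistant design; this is one possible implementation rather than the required one. The names f.true, fvlp, fvnw, and fvss are illustrative choices (the last three match the appendix code), and the bias/variance/MSE lines implement the definitions in (c) for loess.

n <- 101; m <- 1000
x <- 2*pi*(-1 + 2*((1:n) - 1)/(n - 1))        ## equi-distant design on [-2*pi, 2*pi]
f.true <- (1 - x^2)*exp(-0.5*x^2)             ## Mexican hat function in eq. (2)
fvlp <- fvnw <- fvss <- matrix(0, nrow = n, ncol = m)
set.seed(79)                                  ## any fixed seed, for reproducibility
for (j in 1:m){
  y <- f.true + rnorm(n, sd = 0.2)            ## simulate Y values from model (1)
  fvlp[, j] <- predict(loess(y ~ x, span = 0.75), newdata = x)
  fvnw[, j] <- ksmooth(x, y, kernel = "normal", bandwidth = 0.2, x.points = x)$y
  fvss[, j] <- predict(smooth.spline(x, y), x = x)$y   ## default tuning parameter
}
meanlp <- apply(fvlp, 1, mean)                ## Monte Carlo mean of the loess fits
biaslp <- meanlp - f.true                     ## empirical bias at each x_i
varlp  <- apply((fvlp - meanlp)^2, 1, mean)   ## empirical variance (coefficient 1/m)
mselp  <- apply((fvlp - f.true)^2, 1, mean)   ## empirical MSE (against the true f)
plot(x, biaslp, type = "l", xlab = "x", ylab = "empirical bias (loess)")

The same four lines, with fvnw or fvss in place of fvlp, give the corresponding quantities for the NW kernel and spline smoothing estimators.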
(2) Repeat part (1) with another (deterministic) design that has non-equidistant points in the interval
[−2π, 2π]. The following R code was used to generate the design points $x_i$'s on my laptop, denoted by x2
below (you can keep these $x_i$'s fixed in the m = 1000 Monte Carlo runs):
set.seed(79)
x2 <- round(2*pi*sort(c(0.5, -1 + rbeta(50,2,2), rbeta(50,2,2))), 8)
For those students who use Python or other software, the values of x2 are available in the dataset
“HW04part2.x.csv”.
Here, for simplicity and a reasonable comparison, when estimating and predicting by the local smoothing
methods, please use span = 0.3365 for loess, bandwidth = 0.2 for NW kernel smoothing, and spar =
0.7163 for spline smoothing.
Discuss the statistical challenges and the computational challenges when using these three local smoothing
methods to estimate the Mexican hat function under this non-equidistant design, including the
suitable choices of tuning parameters (one illustrative data-driven approach is sketched below).
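When discussing tuning-parameter choice, one illustrative data-driven option (among many) is leave-one-out cross-validation; the sketch below applies it to the loess span on a single simulated data set. The helper name cv.span and the candidate grid are hypothetical choices made only for this illustration.

cv.span <- function(span, x, y){
  ## leave-one-out CV error for a given loess span;
  ## surface = "direct" lets loess predict at the left-out endpoints
  errs <- sapply(seq_along(x), function(i){
    fit <- loess(yy ~ xx, data = data.frame(xx = x[-i], yy = y[-i]),
                 span = span, control = loess.control(surface = "direct"))
    (y[i] - predict(fit, newdata = data.frame(xx = x[i])))^2
  })
  mean(errs, na.rm = TRUE)
}
y <- (1 - x2^2)*exp(-0.5*x2^2) + rnorm(length(x2), sd = 0.2)
spans <- seq(0.15, 0.75, by = 0.05)           ## hypothetical candidate grid
cv.errs <- sapply(spans, cv.span, x = x2, y = y)
spans[which.min(cv.errs)]                     ## CV-selected span

An analogous grid search over the bandwidth for ksmooth, or the built-in cross-validation option cv = TRUE in smooth.spline, would be natural counterparts for the other two methods.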
Appendix: below is some sample R code that may be useful for this homework, and please note that you
might need to modify/revise this code!
##Generate data, fit the data and store the fitted values
## (this sketch is for part (2); for part (1), use the equidistant x,
##  span = 0.75 for loess, and the default spline tuning parameter)
fvlp <- fvnw <- fvss <- matrix(0, nrow = length(x2), ncol = m);
for (j in 1:m){
  ## within each loop, first simulate the Y values from eq. (1),
  ## where f(x) is the Mexican hat function in eq. (2)
  y <- (1-x2^2) * exp(-0.5 * x2^2) + rnorm(length(x2), sd=0.2);
  ## then compute the three local smoothing estimates at every design point
  fvlp[,j] <- predict(loess(y ~ x2, span = 0.3365), newdata = x2);
  fvnw[,j] <- ksmooth(x2, y, kernel="normal", bandwidth= 0.2, x.points=x2)$y;
  fvss[,j] <- predict(smooth.spline(y ~ x2, spar= 0.7163), x=x2)$y;
}
## Below is the sample R code to compute the mean of the three estimators
meanlp = apply(fvlp,1,mean);
meannw = apply(fvnw,1,mean);
meanss = apply(fvss,1,mean);
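A possible continuation (assuming the variable names from the loop above) overlays the three Monte Carlo means on the true curve in a single plot; f2 is an assumed name introduced here for the true function values at x2.

f2 <- (1 - x2^2)*exp(-0.5*x2^2)               ## true function values at x2
matplot(x2, cbind(meanlp, meannw, meanss), type = "l", lty = 1:3, col = 2:4,
        xlab = "x", ylab = "mean of fitted values")
lines(x2, f2, lwd = 2)                        ## true Mexican hat curve
legend("topright", c("loess", "NW kernel", "smoothing spline", "true f"),
       lty = c(1:3, 1), col = c(2:4, 1), lwd = c(1, 1, 1, 2))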