
Bootstrapping

Bootstrapping [1] is a simple computational approach to assess the statistical performance of estimators. The paper by Politis [2] is an excellent tutorial on this subject. A variety of applications to signal processing are described by Zoubir and Boashash [3]. The bootstrap has also been extensively used in geophysics, biomedical engineering, control, applied mechanics, and regression analysis.

1 Basic Problem
Given Y = (Y1 , Y2, · · · , Yn ) drawn iid from a (partially or even completely unknown) distri-
bution P , estimate some parameter θ of P . The estimator θ̂ is based on a statistic T (Y ).
Then produce some measure of statistical accuracy of the estimator, e.g., its mean, variance,
confidence interval, or even full distribution.
For instance, θ could be a moment of P , or the median of P , etc.
For large sample size n and under regularity conditions, asymptotic theory for maximum-likelihood (ML) estimators tells us that the normalized estimation error √n(θ̂_ML − θ) converges in distribution to N(0, 1/I_θ), where I_θ is the Fisher information. Since P is unknown, the
estimator θ̂ assumed above is generally not the ML estimator. Under regularity conditions,
the estimator θ̂ will often be consistent in probability (like the ML estimator), but won’t be
asymptotically efficient, i.e., its asymptotic variance will exceed 1/I_θ. For instance, assume P has mean θ and finite variance. Then the natural estimator θ̂ = (1/n) T(Y) = (1/n) Σ_{i=1}^n Y_i is consistent but not efficient, unless P is Gaussian with known variance.
In Sections 2 and 3 we consider two approaches to measure the accuracy of the estimator:
one that requires acquiring many new samples, and a modification that doesn’t. The latter
approach is the bootstrapping method.

2 Traditional Monte-Carlo Approach


Assume we draw K independent datasets Y^(k), 1 ≤ k ≤ K, each consisting of n iid samples from P. That is, we acquire a total of Kn iid samples from P. For each dataset Y^(k) we compute an estimate θ̂^(k). For large K, the bias, variance, and cumulative distribution function (cdf) of the estimator are approximately given by
\[ \mathrm{Bias}(\hat{\theta}) \approx \widehat{\mathrm{Bias}}(\hat{\theta}) = \frac{1}{K}\sum_{k=1}^{K} \hat{\theta}^{(k)} - \theta \]
\[ \mathrm{Var}(\hat{\theta}) \approx \widehat{\mathrm{Var}}(\hat{\theta}) = \frac{1}{K-1}\sum_{k=1}^{K} \Bigl( \hat{\theta}^{(k)} - \frac{1}{K}\sum_{k=1}^{K} \hat{\theta}^{(k)} \Bigr)^{2} . \]

(Note that $\widehat{\mathrm{Bias}}(\hat{\theta})$ above is not a valid estimator of the bias, because evaluating it requires knowledge of θ!) The cumulative distribution function (cdf) of θ̂ is given by

Q(x) = P [θ̂ ≤ x]

and may be estimated by the empirical cdf,

\[ \hat{Q}(x) = \frac{1}{K}\sum_{k=1}^{K} \mathbf{1}_{\{\hat{\theta}^{(k)} \le x\}} . \]
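
As a concrete illustration, the following Python sketch carries out this Monte-Carlo procedure; the Gaussian choice of P, the sample-mean estimator, and the values of θ, n, and K are assumptions made only so the simulation runs, not part of the development above.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulation setup (assumptions for illustration only): P is Gaussian with
    # mean theta and unit variance, and the estimator is the sample mean.
    theta, n, K = 2.0, 50, 1000

    # Draw K independent datasets of n iid samples each (Kn samples in total).
    Y = rng.normal(loc=theta, scale=1.0, size=(K, n))

    # Estimate theta on each dataset.
    theta_hat = Y.mean(axis=1)                   # shape (K,)

    # Monte-Carlo estimates of the bias and variance of the estimator.
    bias_hat = theta_hat.mean() - theta          # note: requires knowing theta
    var_hat = theta_hat.var(ddof=1)              # 1/(K-1) normalization

    # Empirical cdf of the estimator: Q_hat(x) = (1/K) * #{k : theta_hat^(k) <= x}.
    def Q_hat(x):
        return np.mean(theta_hat <= x)

    print(bias_hat, var_hat, Q_hat(theta))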

3 Bootstrap
The basic idea of bootstrapping is to use the procedure of Section 2 with the following
modification. Instead of drawing Y^(k) from P, draw it from the empirical distribution

\[ \hat{P}(y) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}_{\{Y_i \le y\}} \]

based on the n samples Y1, · · · , Yn. In other words, draw Yi^(k) iid (with replacement) from the uniform distribution over the observed dataset {Y1, · · · , Yn}. We denote by Y^∗(k), 1 ≤ k ≤ K, the resamples obtained from this procedure and by θ̂^∗(k) the estimate computed from Y^∗(k).
Then we form the following estimators of the bias and variance of θ̂:
\[ \mathrm{Bias}^{*}(\hat{\theta}) = \frac{1}{K}\sum_{k=1}^{K} \hat{\theta}^{*(k)} - \hat{\theta} \]
\[ \mathrm{Var}^{*}(\hat{\theta}) = \frac{1}{K}\sum_{k=1}^{K} \Bigl( \hat{\theta}^{*(k)} - \frac{1}{K}\sum_{k=1}^{K} \hat{\theta}^{*(k)} \Bigr)^{2} . \]

The estimated cdf for the estimator is

\[ \hat{Q}^{*}(x) = \frac{1}{K}\sum_{k=1}^{K} \mathbf{1}_{\{\hat{\theta}^{*(k)} \le x\}} . \]
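
A minimal Python sketch of this resampling scheme, assuming numpy and a generic scalar-valued estimator passed in as a function (the function name bootstrap and the default value of K are illustrative choices):

    import numpy as np

    def bootstrap(Y, estimator, K=1000, seed=0):
        # Bootstrap estimates of the bias, variance, and cdf of estimator(Y).
        # Y is a 1-D array of n observations; estimator maps an array to a scalar.
        rng = np.random.default_rng(seed)
        n = len(Y)
        theta_hat = estimator(Y)                  # estimate from the original sample

        # Draw K resamples of size n with replacement from {Y_1, ..., Y_n}
        # (i.e., iid from the empirical distribution P-hat) and re-estimate.
        idx = rng.integers(0, n, size=(K, n))
        theta_star = np.array([estimator(Y[rows]) for rows in idx])

        bias_star = theta_star.mean() - theta_hat                   # theta_hat plays the role of theta
        var_star = np.mean((theta_star - theta_star.mean()) ** 2)   # 1/K normalization
        cdf_star = lambda x: np.mean(theta_star <= x)
        return bias_star, var_star, cdf_star

    # Example usage with the median estimator of the example below:
    # bias, var, cdf = bootstrap(np.array([5, 2, 2, 9, 7]), np.median)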

Example. Let P be any distribution over the ten digits 0, 1, · · · , 9, and θ the median
of the distribution. The estimator is simply θ̂ = med(Y1 , · · · , Yn ). Say that for n = 5 we
observe Y = (5, 2, 2, 9, 7). We use bootstrapping with K = 2. Say we obtain the resamples
Y ∗(1) = (9, 2, 9, 5, 2) and Y ∗(2) = (2, 7, 7, 5, 9). The corresponding median estimates are
θ̂∗(1) = 5 and θ̂∗(2) = 7. Therefore the bootstrap estimate of the mean of θ̂ is (1/2)(5 + 7) = 6,
and its variance is 1.
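
The arithmetic of this toy example can be checked with a few lines of numpy; the two resamples are hard-coded, so no random resampling is involved:

    import numpy as np

    theta_star = np.array([np.median([9, 2, 9, 5, 2]),   # = 5
                           np.median([2, 7, 7, 5, 9])])  # = 7
    mean_star = theta_star.mean()                         # (5 + 7) / 2 = 6
    var_star = np.mean((theta_star - mean_star) ** 2)     # ((5-6)^2 + (7-6)^2) / 2 = 1
    print(mean_star, var_star)                            # 6.0 1.0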

When does the bootstrap work? For large n, the empirical cdf estimator P̂ converges
in the sup norm to the actual P :
\[ \sup_{y} \bigl| \hat{P}(y) - P(y) \bigr| \xrightarrow{\text{a.s.}} 0 \quad \text{as } n \to \infty . \]

Thus one may heuristically expect that the bootstrap is approximately equivalent to the
procedure of Sec. 2. In fact a little bit more is required, e.g., a sufficient condition is that θ̂
be asymptotically Gaussian.
An example where the bootstrap fails is that of iid uniform Y over the interval [0, θ],
using the estimator θ̂(Y) = max_{1≤i≤n} Yi.
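
A short simulation (with θ, n, and K chosen arbitrarily for illustration) exposes the failure mechanism: the resampled maximum θ̂∗ equals θ̂ whenever the largest observation appears in the resample, which happens with probability 1 − (1 − 1/n)^n ≈ 0.63, whereas θ̂ itself is strictly below θ with probability one, so the bootstrap distribution of the estimation error cannot mimic the true one.

    import numpy as np

    rng = np.random.default_rng(0)
    theta, n, K = 1.0, 100, 2000          # arbitrary values, for illustration only

    Y = rng.uniform(0.0, theta, size=n)
    theta_hat = Y.max()

    # Ordinary bootstrap resampling of the maximum.
    idx = rng.integers(0, n, size=(K, n))
    theta_star = Y[idx].max(axis=1)

    # The true error n*(theta - theta_hat) has a non-degenerate exponential-type
    # limit, but the bootstrap error n*(theta_hat - theta_star) puts a point mass
    # of about 1 - (1 - 1/n)**n ~ 0.63 at zero, so the two do not match.
    print(np.mean(theta_star == theta_hat))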

4 Extensions
The bootstrap principle may also be used in problems where the data Yi , 1 ≤ i ≤ n, are not
iid. A modification of the method of Sec. 3 is needed. The approach is best illustrated with
an example.
Example. Consider the model

\[ Y_i = a + b s_i + W_i , \quad 1 \le i \le n \]

where θ = (a, b) is to be estimated, si is a known signal, and Wi is iid noise with mean zero
and finite variance. Consider the least-squares estimator
\[ (\hat{a}, \hat{b}) = \arg\min_{(a,b)} \sum_{i=1}^{n} (Y_i - a - b s_i)^2 \]

which is consistent under regularity assumptions on the signal. Note this estimator coincides
with the ML estimator if the noise distribution is Gaussian with known variance. Define the
residual errors
Ei = Yi − â − b̂si , 1 ≤ i ≤ n,
which follow approximately the same iid distribution as {Wi } if â, b̂ are accurate estimators
of a, b. Define the centered residuals

\[ \tilde{E}_i = E_i - \frac{1}{n}\sum_{j=1}^{n} E_j , \quad 1 \le i \le n . \]

The bootstrap principle is applied as follows.


For 1 ≤ k ≤ K, do the following:
1. Draw resamples Ei^∗(k) iid uniformly with replacement from {Ẽ1, · · · , Ẽn}.

2. Generate pseudo-data Yi^∗(k) = â + b̂ si + Ei^∗(k) for 1 ≤ i ≤ n.

3. Apply least-squares estimation to obtain
\[ (\hat{a}^{*(k)}, \hat{b}^{*(k)}) = \arg\min_{(a,b)} \sum_{i=1}^{n} \bigl( Y_i^{*(k)} - a - b s_i \bigr)^{2} . \]

Then one may obtain the desired estimates of the bias, variance, etc. of the estimator from
(â∗(k) , b̂∗(k) ), 1 ≤ k ≤ K.
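A Python sketch of these three steps for the linear model above, assuming numpy together with a synthetic known signal si and synthetic data generated only so the example runs end to end; none of these particular choices are prescribed by the procedure itself.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical setup: a known signal s_i and synthetic data, for illustration only.
    n, K = 200, 1000
    s = np.linspace(0.0, 1.0, n)
    a_true, b_true = 1.0, 3.0
    Y = a_true + b_true * s + rng.standard_normal(n)

    # Least-squares fit of (a, b) on the observed data.
    X = np.column_stack([np.ones(n), s])
    a_hat, b_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

    # Centered residuals (the E-tilde_i above).
    E = Y - a_hat - b_hat * s
    E_tilde = E - E.mean()

    # Residual bootstrap: resample residuals, regenerate pseudo-data, refit.
    est_star = np.empty((K, 2))
    for k in range(K):
        E_star = rng.choice(E_tilde, size=n, replace=True)        # step 1
        Y_star = a_hat + b_hat * s + E_star                       # step 2
        est_star[k] = np.linalg.lstsq(X, Y_star, rcond=None)[0]   # step 3

    # Bootstrap estimates of the bias and variance of (a_hat, b_hat).
    bias_star = est_star.mean(axis=0) - np.array([a_hat, b_hat])
    var_star = est_star.var(axis=0)
    print(bias_star, var_star)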
Related Problems. The same idea applies to problems such as estimation of parameters
in a Markov model:
\[ Y_i = F_{\theta}(Y_{i-1}, Y_{i-2}) + W_i , \quad 1 \le i \le n \]
where Wi is iid noise with mean zero and finite variance. The prediction function F is
parameterized by θ, which can be estimated using nonlinear least-squares:
\[ \hat{\theta} = \arg\min_{\theta} \sum_{i=3}^{n} \bigl( Y_i - F_{\theta}(Y_{i-1}, Y_{i-2}) \bigr)^{2} . \]
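
A hedged Python sketch of the same residual-resampling idea for this Markov model, assuming one particular form of Fθ (chosen only to make the example concrete, since F is left unspecified here) and using scipy.optimize.minimize for the nonlinear least squares:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Assumed prediction function (illustrative only; the notes leave F unspecified).
    def F(theta, y1, y2):
        return theta[0] * y1 + theta[1] * np.tanh(y2)

    def fit(Y):
        # Nonlinear least squares over i = 3, ..., n (0-based indices 2, ..., n-1).
        def sse(theta):
            return np.sum((Y[2:] - F(theta, Y[1:-1], Y[:-2])) ** 2)
        return minimize(sse, x0=np.zeros(2)).x

    # Synthetic series from the model, used only so the sketch runs end to end.
    n, K = 300, 200
    theta_true = np.array([0.5, 0.8])
    Y = np.zeros(n)
    for i in range(2, n):
        Y[i] = F(theta_true, Y[i-1], Y[i-2]) + rng.standard_normal()

    theta_hat = fit(Y)

    # Residual bootstrap: resample the centered residuals, regenerate the series
    # recursively from the fitted model, and refit on each pseudo-series.
    resid = Y[2:] - F(theta_hat, Y[1:-1], Y[:-2])
    resid -= resid.mean()
    theta_star = np.empty((K, 2))
    for k in range(K):
        W_star = rng.choice(resid, size=n, replace=True)
        Y_star = Y.copy()                  # keeps the first two values as initial conditions
        for i in range(2, n):
            Y_star[i] = F(theta_hat, Y_star[i-1], Y_star[i-2]) + W_star[i]
        theta_star[k] = fit(Y_star)

    # Bootstrap estimates of the bias and variance of theta_hat.
    print(theta_star.mean(axis=0) - theta_hat, theta_star.var(axis=0))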

The bootstrap has also been used to assess the accuracy of spectrum estimators, hypothesis
tests, etc.

References
[1] B. Efron, “Bootstrap Methods: Another Look at the Jackknife,” Annals of Statistics, pp. 1-26, Jan. 1979.

[2] D. Politis, “Computer-intensive methods in statistical analysis,” IEEE Signal Processing Magazine, pp. 39-55, Jan. 1998.

[3] A. Zoubir and B. Boashash, “The Bootstrap and its Application in Signal Processing,” IEEE Signal Processing Magazine, pp. 56-76, Jan. 1998.
