
Bootstrap Methods

Class Notes
Manuel Arellano
Revised: February 2, 2020

Introduction
The bootstrap is an alternative method of assessing sampling variability. It is a
mechanical procedure that can be applied in a wide variety of situations.

It works in the same way regardless of whether something straightforward or something
more complex is being estimated.

The bootstrap was invented and given its name by Brad Efron in a paper published in
1979 in the Annals of Statistics.

The bootstrap is probably the most widely used methodological innovation in
statistics since Ronald Fisher’s development of the analysis of variance in the 1920s
(Erich Lehmann 2008).

1
The idea of the bootstrap
Let Y_1, ..., Y_N be a random sample from some distribution F and let
θ̂_N = h(Y_1, ..., Y_N) be some statistic of interest. We want to estimate the
distribution of θ̂_N:

Pr_F[θ̂_N ≤ r] = Pr_F[h(Y_1, ..., Y_N) ≤ r]

where the subscript F indicates the distribution of the Y’s.
A simple estimator of Pr_F[θ̂_N ≤ r] is the plug-in estimator. It replaces F by the
empirical cdf F̂_N:

F̂_N(s) = (1/N) ∑_{i=1}^N 1(Y_i ≤ s),

which assigns probability 1/N to each of the observed values y_1, ..., y_N of Y_1, ..., Y_N.
The resulting estimator is then

Pr_{F̂_N}[h(Y_1*, ..., Y_N*) ≤ r]   (1)

where Y_1*, ..., Y_N* denotes a random sample from F̂_N.


The formula (1) for the estimator of the cdf of θ̂_N is easy to write down, but it is
prohibitive to calculate except for small N.
To see this, note that each of the Y_i* is capable of taking on the N values y_1, ..., y_N,
so that the total number of values of h(Y_1*, ..., Y_N*) that has to be considered is N^N.
To calculate (1), one would have to count how many of these N^N values are ≤ r.
2
The idea of the bootstrap (continued)
For example, suppose that Y_i is binary, F is given by Pr(Y = 1) = p, θ̂_N is the
sample mean and N = 3. There are eight possible samples:

(y_1, y_2, y_3)   Pr(y_1, y_2, y_3)   θ̂_N
(0, 0, 0)         (1 − p)^3           0
(1, 0, 0)         p(1 − p)^2          1/3
(0, 1, 0)         p(1 − p)^2          1/3
(0, 0, 1)         p(1 − p)^2          1/3
(1, 1, 0)         p^2(1 − p)          2/3
(1, 0, 1)         p^2(1 − p)          2/3
(0, 1, 1)         p^2(1 − p)          2/3
(1, 1, 1)         p^3                 1

So that Pr_F[θ̂_N ≤ r] is determined by

r     Pr_F[θ̂_N = r]
0     (1 − p)^3
1/3   3p(1 − p)^2
2/3   3p^2(1 − p)
1     p^3

3
The idea of the bootstrap (continued)

Suppose that the observed values y_1, y_2, y_3 are (0, 1, 1), so that the observed value of
θ̂_N is 2/3. Therefore, our estimate of Pr[θ̂_N = r] is given by

r     Pr_{F̂_N}[θ̂_N* = r]
0     (1/3)^3 = 1/27 ≈ .037
1/3   3 · (2/3)(1/3)^2 = 6/27 ≈ .222
2/3   3 · (2/3)^2(1/3) = 12/27 ≈ .444
1     (2/3)^3 = 8/27 ≈ .296

The previous example is so simple that the calculation of Pr_{F̂_N}[θ̂_N* ≤ r] can be done
analytically, but in general this type of calculation is beyond reach.
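The 3^3 = 27-sample enumeration behind this table can be checked directly. A small sketch (not code from the notes; exact arithmetic via `fractions`):

```python
# Exact plug-in distribution of the sample mean for the observed sample (0, 1, 1):
# enumerate all N^N equally likely bootstrap samples and tally the resulting means.
from itertools import product
from fractions import Fraction
from collections import Counter

sample = (0, 1, 1)
N = len(sample)

counts = Counter()
for draw in product(sample, repeat=N):   # all N^N resamples with replacement
    counts[Fraction(sum(draw), N)] += 1

# Each resample has probability 1/N^N under the empirical cdf.
dist = {r: Fraction(c, N ** N) for r, c in sorted(counts.items())}
for r, p in dist.items():
    print(f"theta* = {r}:  {p} = {float(p):.3f}")
```

The printed probabilities 1/27, 2/9, 4/9, 8/27 match the table above.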

4
The idea of the bootstrap (continued)
Estimation by simulation

A standard device for (approximately) evaluating probabilities that are too difficult to
calculate exactly is simulation.
To calculate the probability of an event, one generates a sample from the underlying
distribution and notes the frequency with which the event occurs in the generated
sample.
If the sample is sufficiently large, this frequency will approximate the
original probability with negligible error.
Such an approximation to the probability (1) constitutes the second step of the
bootstrap method.
A number M of samples Y_1*, ..., Y_N* (the “bootstrap” samples) are drawn from F̂_N,
and the frequency with which

h(Y_1*, ..., Y_N*) ≤ r

provides the desired approximation to the estimator (1) (Lehmann 2008).

5
Numerical illustration

To illustrate the method I have generated M = 1000 bootstrap samples of size N = 3
with p = 2/3, which is the value of p that corresponds to the sample distribution
(0, 1, 1). The result is

samples                           r     # samples   bootstrap pdf   Pr_{F̂_N}[θ̂_N* = r]
(0, 0, 0)                         0     37          .037            .037
(1, 0, 0), (0, 1, 0), (0, 0, 1)   1/3   222         .222            .222
(1, 1, 0), (1, 0, 1), (0, 1, 1)   2/3   453         .453            .444
(1, 1, 1)                         1     288         .288            .296

The discrepancy between the last two columns can be made arbitrarily small by
increasing M.

The method we have described, consisting in drawing random samples with
replacement treating the observed sample as the population, is called the
“nonparametric bootstrap”.
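The Monte Carlo step can be sketched as follows. The seed is an illustrative assumption, so the simulated counts will differ from the particular draws reported above, but the frequencies approximate the same exact column:

```python
# Draw M bootstrap samples with replacement from the observed sample (0, 1, 1)
# and tabulate the bootstrap pdf of the sample mean.
import random
from collections import Counter

random.seed(0)                      # illustrative seed, not from the notes
sample = [0, 1, 1]
N, M = len(sample), 1000

counts = Counter()
for _ in range(M):
    draw = random.choices(sample, k=N)   # one bootstrap sample
    counts[sum(draw) / N] += 1

for r in sorted(counts):
    print(f"theta* = {r:.3f}: frequency {counts[r] / M:.3f}")
```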

6
Bootstrap standard errors

The bootstrap procedure is very flexible and applicable to many different situations,
such as estimating the bias and variance of an estimator, calculating confidence
intervals, etc.
As a result of resampling we have available M estimates from the artificial samples:
θ̂_N^(1), ..., θ̂_N^(M). A bootstrap standard error is then obtained as

[ (1/(M − 1)) ∑_{m=1}^M ( θ̂_N^(m) − θ̄_N )^2 ]^{1/2}   (2)

where θ̄_N = ∑_{m=1}^M θ̂_N^(m) / M.

In the previous example, the bootstrap mean is θ̄_N = 0.664, the bootstrap standard
error is 0.271 calculated as in (2) with M = 1000, and the analytical standard error is

[ θ̂_N (1 − θ̂_N) / N ]^{1/2} = 0.272

where θ̂_N = 2/3 and N = 3.

7
Asymptotic properties of bootstrap methods
Using the bootstrap standard error to construct test statistics cannot be shown to
improve on the approximation provided by the usual asymptotic theory, but the good
news is that under general regularity conditions it does have the same asymptotic
justification as conventional asymptotic procedures.
This is good news because bootstrap standard errors are often much easier to obtain
than analytical standard errors.
Refinements for large-sample pivotal statistics
Even better news is the fact that in many cases the bootstrap does improve the
approximation of the distribution of test statistics, in the sense that the bootstrap can
provide an asymptotic refinement compared with the usual asymptotic theory.
The key aspect for achieving such refinements (consisting in having an asymptotic
approximation with errors of a smaller order of magnitude in powers of the sample
size) is that the statistic being bootstrapped is asymptotically pivotal.
An asymptotically pivotal statistic is one whose limiting distribution does not depend
on unknown parameters (like standard normal or chi-square distributions).
This is the case with t-ratios and Wald test statistics, for example.
Note that for a t-ratio to be asymptotically pivotal in a regression with
heteroskedasticity, the robust White form of the t-ratio needs to be used.

8
Asymptotic properties of bootstrap methods (continued)

The upshot of the previous discussion is that replicating the quantity of interest
(mean, median, etc.) is not always the best way to use the bootstrap if improvements
on asymptotic approximations are sought.
In particular, when we wish to calculate a confidence interval, it is better not to
bootstrap the estimate itself but rather to bootstrap the distribution of the t-value.
This is feasible when we have a large-sample estimate of the standard error but are
skeptical about the accuracy of the normal probability approximation.
Such a procedure will provide more accurate confidence intervals than
either the simple bootstrap or the asymptotic standard errors.

However, often we are interested in bootstrap methods because an analytical standard
error is not available or is hard to calculate.
In those cases the motivation for the bootstrap is not necessarily improving on the
asymptotic approximation but rather obtaining a simpler approximation with the same
justification as the conventional asymptotic approximation.

9
An example: the sample mean
To illustrate how the bootstrap works, let us consider the estimation of a sample
mean:

y_i = β + u_i,

where the u_i are iid with zero mean.
The OLS estimate on the original sample is:

β̂ = (1/N) ∑_{i=1}^N y_i = ȳ.

Then, sampling from the original sample is equivalent to selecting N indices with
probability 1/N. Let W be a generic draw from that distribution. We have:

E(W | y_1, ..., y_N) = (1/N) ∑_{i=1}^N y_i,   Var(W | y_1, ..., y_N) = (1/N) ∑_{i=1}^N (y_i − ȳ)^2.

Let β̃ be the sample mean of a bootstrap sample. It follows from the sample mean
theorem that:

E(β̃ | y_1, ..., y_N) = E(W | y_1, ..., y_N) = (1/N) ∑_{i=1}^N y_i,

and

Var(β̃ | y_1, ..., y_N) = Var(W | y_1, ..., y_N) / N = (1/N^2) ∑_{i=1}^N (y_i − ȳ)^2.
10
An example: the sample mean (continued)

Therefore the distribution of β̃, conditional on the original sample, is centered around
the original estimate and consistently estimates the variance of β̂.

This illustrates the bootstrap principle, according to which the relation between β̃ and
β̂ is the same as the relation between β̂ and the true β.
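The two moment results above are averages over the empirical distribution and so can be verified exactly. A minimal sketch with made-up data (the vector y is illustrative):

```python
# Exact conditional moments of a single bootstrap draw W and of the bootstrap
# sample mean, computed over the empirical distribution of a toy sample.
y = [1.0, 2.0, 4.0, 5.0]
N = len(y)
ybar = sum(y) / N

# A draw W takes each value y_i with probability 1/N:
E_W = sum(yi / N for yi in y)                    # = ybar
Var_W = sum((yi - ybar) ** 2 / N for yi in y)    # = (1/N) sum (y_i - ybar)^2

# The bootstrap sample mean of N iid draws from the empirical cdf:
E_mean = E_W                                     # centered at ybar
Var_mean = Var_W / N                             # = (1/N^2) sum (y_i - ybar)^2

print(E_mean, Var_mean)
```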

11
Bootstrap con…dence intervals

To obtain a confidence interval for θ based on (θ̂_N − θ)/σ̂_N, where σ̂_N is a s.e. of
θ̂_N, we need to estimate quantiles of (θ̂_N − θ)/σ̂_N; namely, c_τ for given τ such that

Pr[ (θ̂_N − θ)/σ̂_N ≤ c_τ ] = τ.

Given Pr[ (θ̂_N − θ)/σ̂_N ≤ c_{1−α/2} ] = 1 − α/2 and Pr[ (θ̂_N − θ)/σ̂_N ≤ c_{α/2} ] = α/2,
we can say that Pr[ θ̂_N − σ̂_N c_{1−α/2} ≤ θ ≤ θ̂_N − σ̂_N c_{α/2} ] = 1 − α. Thus, the
equitailed 1 − α limits for θ would be θ̂_N − σ̂_N c_{1−α/2} and θ̂_N − σ̂_N c_{α/2}.
If α = 0.05 and (θ̂_N − θ)/σ̂_N ~ N(0, 1) we have c_{1−α/2} = 1.96 and c_{α/2} = −1.96.
A bootstrap estimate of Pr[ (θ̂_N − θ)/σ̂_N ≤ s ] replaces F by F̂_N everywhere, including
replacing θ = g(F), which is a characteristic of F, by θ̂_N = g(F̂_N). Therefore, the
resulting plug-in estimator is

Pr_{F̂_N}[ (θ̂_N* − θ̂_N)/σ̂_N* ≤ c̃_τ ] ≈ (1/M) ∑_{m=1}^M 1( (θ̂_N^(m) − θ̂_N)/σ̂_N^(m) ≤ c̃_τ ) = τ.

Thus, the bootstrap confidence interval is [ θ̂_N − σ̂_N c̃_{1−α/2}, θ̂_N − σ̂_N c̃_{α/2} ].
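The recipe above (bootstrap the t-value, read off its quantiles, invert) can be sketched for the sample mean. Data, seed, and M are illustrative assumptions:

```python
# Percentile-t bootstrap confidence interval for a mean: bootstrap the
# distribution of (theta* - theta_hat)/se*, take its quantiles c_tau, and form
# [theta_hat - se_hat * c_{1-a/2}, theta_hat - se_hat * c_{a/2}].
import random
import math

random.seed(2)                      # illustrative seed
y = [2.1, 3.4, 1.9, 4.0, 2.8, 3.1, 2.5, 3.7, 2.2, 3.0]   # toy data
N = len(y)

def mean_and_se(sample):
    m = sum(sample) / len(sample)
    v = sum((s - m) ** 2 for s in sample) / (len(sample) - 1)
    return m, math.sqrt(v / len(sample))

theta_hat, se_hat = mean_and_se(y)

M = 2000
tstats = []
for _ in range(M):
    draw = random.choices(y, k=N)
    m, se = mean_and_se(draw)
    tstats.append((m - theta_hat) / se)   # bootstrapped t-value

tstats.sort()
c_lo = tstats[int(0.025 * M)]             # c_{alpha/2}
c_hi = tstats[int(0.975 * M)]             # c_{1-alpha/2}
ci = (theta_hat - se_hat * c_hi, theta_hat - se_hat * c_lo)
print(ci)
```

Note the inversion: the upper quantile of the t-value gives the lower interval limit, and vice versa.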
12
Bootstrap hypothesis testing

The connection between confidence intervals and significance tests can be exploited
to test certain parametric hypotheses.
Suppose we wish to test H_0: θ = θ_0 against the two-sided alternative H_1: θ ≠ θ_0.
For a test at the 5% level we first compute a number κ̂ such that

Pr_{F̂_N}[ |θ̂_N* − θ̂_N| / σ̂_N* > κ̂ ] = .05

Then reject H_0 in favor of H_1 if

|θ̂_N − θ_0| / σ̂_N > κ̂.

This procedure is preferable because the bootstrap distribution of (θ̂_N* − θ̂_N)/σ̂_N* is
a better approximation to the distribution of (θ̂_N − θ_0)/σ̂_N under H_0 than the
bootstrap distribution of θ̂_N* − θ̂_N is to the distribution of θ̂_N − θ_0 under H_0.
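This test can be sketched for the mean (data, seed, and the null value θ_0 are illustrative):

```python
# Bootstrap two-sided test of H0: theta = theta0 via the centered t-statistic:
# bootstrap |theta* - theta_hat|/se* to get the critical value kappa_hat, then
# reject if |theta_hat - theta0|/se_hat exceeds it.
import random
import math

random.seed(3)                      # illustrative seed
y = [2.1, 3.4, 1.9, 4.0, 2.8, 3.1, 2.5, 3.7, 2.2, 3.0]   # toy data
N, M = len(y), 2000

def mean_and_se(sample):
    m = sum(sample) / len(sample)
    v = sum((s - m) ** 2 for s in sample) / (len(sample) - 1)
    return m, math.sqrt(v / len(sample))

theta_hat, se_hat = mean_and_se(y)

abs_t = []
for _ in range(M):
    draw = random.choices(y, k=N)
    m, se = mean_and_se(draw)
    abs_t.append(abs(m - theta_hat) / se)   # centered bootstrap statistic

abs_t.sort()
kappa_hat = abs_t[int(0.95 * M)]            # bootstrap 5% critical value

theta0 = 0.0                                # a null these data clearly reject
reject = abs(theta_hat - theta0) / se_hat > kappa_hat
print(reject)
```

The key point from the slide is visible in the code: the bootstrap statistic is centered at θ̂_N, not at θ_0.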

13
Bootstrap hypothesis testing: linear regression example

In the simple linear regression model y_i = α + βx_i + u_i consider two-sided testing of
H_0: β = 0.
Let the t-ratio be t_β = β̂/σ̂_β.

Reject H_0 if |t_β| > κ̂, where κ̂ is selected such that Pr[ |β̂* − β̂| / σ̂_β* > κ̂ ] = 0.05.

An alternative presentation of the same procedure is as follows. Generate bootstrap
samples under H_0: ỹ_i^(m) = α̂ + û_i^(m), where û_i^(m) = y_i^(m) − α̂ − β̂ x_i^(m). Let β̃^(m) be
the OLS slope of ỹ_i^(m) on x_i^(m).
Note that since ỹ_i^(m) = y_i^(m) − β̂ x_i^(m) we have β̃^(m) = β̂^(m) − β̂ and σ̂_β̃^(m) = σ̂_β^(m).

Therefore, the critical value κ̃ such that Pr[ |β̃| / σ̂_β̃ > κ̃ ] = 0.05 and κ̂ are the same.

14
Bootstrapping dependent samples

We have discussed Efron’s nonparametric bootstrap.
There are other forms of bootstrap in the literature (parametric bootstrap, residual
bootstrap, subsampling).

Time series models

Dealing with time-series models requires taking serial dependence into account.
One way of doing so is to sample by blocks.
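Block sampling can be sketched as follows; this is the moving-block variant, and the block length L and the series are illustrative assumptions:

```python
# Moving-block bootstrap sketch for a time series: resample contiguous blocks
# of length L (rather than single observations) so that serial dependence
# within each block is preserved.
import random

random.seed(4)                      # illustrative seed
series = [0.3, 0.5, 0.4, 0.7, 0.6, 0.9, 0.8, 1.0, 0.7, 0.6, 0.5, 0.4]
T, L = len(series), 3
blocks = [series[i:i + L] for i in range(T - L + 1)]   # overlapping blocks

def block_resample():
    out = []
    while len(out) < T:
        out.extend(random.choice(blocks))   # draw whole blocks with replacement
    return out[:T]

boot_series = block_resample()
print(boot_series)
```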

Stratified clustered samples

The bootstrap can be applied to a stratified clustered sample.

All we have to do is treat the strata separately and resample not the basic
underlying units (the households) but rather the primary sampling units (the clusters).
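Resampling clusters rather than households can be sketched as follows (the cluster layout is hypothetical; strata would simply be resampled separately in the same way):

```python
# Clustered resampling sketch: draw whole clusters (primary sampling units)
# with replacement, keeping each cluster's households together.
import random

random.seed(5)                      # illustrative seed
# hypothetical layout: cluster id -> household observations
clusters = {1: [10, 12], 2: [8, 9, 11], 3: [15], 4: [7, 13]}
ids = list(clusters)

draw_ids = random.choices(ids, k=len(ids))     # resample clusters, not units
boot_sample = [obs for cid in draw_ids for obs in clusters[cid]]
print(draw_ids, boot_sample)
```

Note the bootstrap sample size varies across replications when clusters are of unequal size, which is as it should be under cluster sampling.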

15
Using replicate weights

Taking stratification and clustering sampling features into account, either analytically
or by bootstrap, requires the availability of stratum and cluster indicators.

Generally, statistical offices and survey providers do not make them available for
confidentiality reasons.

To enable the estimation of the sampling distribution of estimators and test statistics
without disclosing stratum or cluster information, an alternative is to provide replicate
weights.

For example, the EFF provides 999 replicate weights. Specifically, the EFF provides
replicate cross-section weights, replicate longitudinal weights, and multiplicity factors
(Bover, 2004).
Multiplicity factors indicate the number of times a given household appears in a
particular bootstrap sample.

The provision of replicate weights is an important development because it facilitates
the use of replication methods, which are simple and of general applicability, while
also safeguarding confidentiality.

16
