
Cheap Subsampling bootstrap confidence intervals for fast and robust inference in biostatistics


Johan Sebastian Ohlendorff¹,*, Anders Munch¹, Kathrine Kold Sørensen², and Thomas Alexander Gerds¹

¹ Section of Biostatistics, University of Copenhagen, Denmark
² Department of Cardiology, Nordsjællands Hospital, Denmark

Abstract
Bootstrapping is often applied to get confidence limits for semiparametric inference
of a target parameter in the presence of nuisance parameters. Bootstrapping with
replacement can be computationally expensive and problematic when cross-validation
is used in the estimation algorithm due to duplicate observations in the bootstrap
samples. We provide a valid, fast, easy-to-implement subsampling bootstrap method
for constructing confidence intervals for asymptotically linear estimators and discuss
its application to semiparametric causal inference. Our method, inspired by the Cheap
Bootstrap (Lam, 2022), leverages the quantiles of a t-distribution and has the desired
coverage with few bootstrap replications. We show that the method is asymptotically
valid if the subsample size is chosen appropriately as a function of the sample size. We
illustrate our method with data from the LEADER trial (Marso et al., 2016), obtaining
confidence intervals for a longitudinal targeted minimum loss-based estimator (van der
Laan and Gruber, 2012). Through a series of empirical experiments, we also explore
the impact of subsample size, sample size, and the number of bootstrap repetitions on
the performance of the confidence interval.
Keywords: bootstrap; causal inference; computational efficiency; subsampling; targeted
learning.

1 Introduction
Epidemiological studies of observational data are often characterized by large sample sizes
and analyzed using statistical algorithms that incorporate machine learning estimators to
estimate the nuisance parameters involved in the estimation of a target parameter of in-
terest (Hernán and Robins, 2016, 2020; van der Laan and Rose, 2018). The bootstrap is
a standard approach in cases where (asymptotic) formulas for standard errors do not exist
or are not implemented. But even if there exists an asymptotic formula for constructing
confidence intervals, one may wish to supplement the analysis with bootstrap confidence
intervals if the validity of the estimator of the formula-based standard error depends on the
correct specification of the nuisance parameter models (Chiu et al., 2023). However, the
computational burden of the standard bootstrap algorithms increases with the sample size.
Methods for constructing bootstrap confidence intervals often use empirical quantiles of a
bootstrapped statistic. Popular choices include the percentile bootstrap and the bootstrap-t confidence interval (Tibshirani and Efron, 1993). It is recommended that these methods be run with a minimum of 1000 bootstrap replications (Efron, 1987). Other methods are based on the standard error estimated from the bootstrap samples. According to Efron (1987), performing between 25 and 100 bootstrap replications is sufficient for stability of standard error-based bootstrap confidence intervals.
When bootstrap samples are drawn with replacement, complications can arise if one
of the subroutines is sensitive to duplicate observations in the data (Bickel et al., 1997).
This is the case, for example, if cross-validation is used to tune hyperparameters of machine
learning algorithms for nuisance parameters. Cross-validation is a means of evaluating the
performance of a model on unseen data by splitting the data into independent training
and test sets. However, if we first apply the bootstrap with replacement and then cross-
validation, the same observation may be present in both the training and test sets (see Figure
1). This violates the independence between the training and test sets in the cross-validation
procedure, which may lead to a biased estimate of the out-of-sample error.

[Figure 1 consists of two panels, "Non-parametric bootstrap" and "Subsampling", each showing a sample being resampled (with or without replacement) and then split into Fold 1 and Fold 2.]

Figure 1: Illustration of the problem with ties in the data when using the bootstrap with replacement and the subsampling bootstrap. First, a bootstrap sample is drawn with and without replacement from the data set. The data set is then split into two folds for the cross-validation procedure. We see that there is one observation present in both folds for the bootstrap sample drawn with replacement. Conversely, the subsample does not have this issue.
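To make the issue concrete, the following R sketch (our illustration; the sample size and fold assignment are arbitrary) draws one bootstrap sample with replacement and one subsample without replacement, splits each into two cross-validation folds, and counts how many original observations end up in both folds.

```r
set.seed(1)
n <- 20
ids <- seq_len(n)                      # original observation indices

## Bootstrap with replacement: the same index can occur several times
boot_ids <- sample(ids, size = n, replace = TRUE)
## Subsampling without replacement: each index occurs at most once
m <- floor(0.632 * n)
sub_ids  <- sample(ids, size = m, replace = FALSE)

## Split a resampled data set into two cross-validation folds at random
split_two_folds <- function(x) {
  fold <- sample(rep(1:2, length.out = length(x)))
  split(x, fold)
}
boot_folds <- split_two_folds(boot_ids)
sub_folds  <- split_two_folds(sub_ids)

## Number of original observations present in both folds
length(intersect(boot_folds[[1]], boot_folds[[2]]))  # typically > 0
length(intersect(sub_folds[[1]], sub_folds[[2]]))    # always 0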

In this article, we propose a Cheap Subsampling bootstrap algorithm that samples with-
out replacement to obtain bootstrap data sets that are smaller than the original data set.
Our algorithm and formula are based on the Cheap Bootstrap confidence interval (Lam,
2022). Note that Lam (2022) also discusses subsampling, but with replacement. Their
approach may have theoretical advantages over subsampling without replacement (Bickel
et al., 1997), but our approach is compatible with cross-validation and other methods that are sensitive to ties.
The consistency of subsampling, which we need for the validity of the Cheap Subsampling
confidence interval, has been derived under the assumption that the asymptotic distribution
of the estimator of interest exists (Politis and Romano, 1994). Consistency has also been
derived under the assumption that the estimator of interest is asymptotically linear (Wu,
1990). In contrast, non-parametric bootstrapping (drawing a bootstrap sample of size n
from the data set with replacement) is known to fail in various theoretical settings (Bickel
et al., 1997). Consistency of subsampling requires that the subsample size is chosen correctly
as a function of the sample size. The results are asymptotic and do not provide a method for
selecting the subsample size in practice. Here we apply the conditions in Wu (1990) to show
the asymptotic validity of the Cheap Subsampling confidence interval for asymptotically
linear estimators (Theorem 1). In addition, we show that the Cheap Subsampling confidence
interval converges to a confidence interval based on a delete-d jackknife variance estimator
(Shao and Wu, 1989) as the number of bootstrap repetitions increases. In the large-sample limit, our confidence interval is valid for any number of bootstrap replications.
We demonstrate the use of our method with an application in causal inference in the
LEADER trial (Marso et al., 2016), which investigates the effects of liraglutide on cardiovas-
cular outcomes in patients with type 2 diabetes. The overall goal is to estimate the causal
effect of staying on a treatment, which we estimate with a longitudinal targeted minimum
loss-based estimator (van der Laan and Gruber, 2012). Bootstrap inference for the longitu-
dinal targeted minimum loss-based estimator is of general interest and in particular useful
in cases where estimates of the standard error are not reliable (Tran et al., 2023; van der
Laan et al., 2023).
The remainder of the article is organized as follows. In Section 2, we introduce the
Cheap Subsampling algorithm and formulate the conditions for the asymptotic validity of
the Cheap Subsampling confidence interval and its connection to an asymptotic confidence
interval based on the delete-d jackknife variance estimator. In Section 3, we apply the
Cheap Subsampling confidence interval to the LEADER trial data. In Section 4, we present
a simulation study to investigate the performance of the Cheap Subsampling confidence
interval.

2 Cheap Subsampling bootstrap


2.1 Notation and framework
We introduce our Cheap Subsampling algorithm in a general framework that includes the
applied settings. Let Dn = (O1 , . . . , On ) with Oi ∈ Rd be a data set of independent and
identically distributed random variables sampled from some unknown probability measure
P ∈ P. Here P denotes a suitably large set of probability measures on Rd . We denote by
Ψ : P → R the statistical functional of interest and by Ψ̂n an estimator of Ψ(P ) based on
$D_n$. The estimator $\hat{\Psi}_n$ is asymptotically linear (Bickel et al., 1993) if
$$\hat{\Psi}_n - \Psi(P) = \frac{1}{n}\sum_{i=1}^{n} \phi_P(O_i) + R_n(P),$$
where $\phi_P : \mathbb{R}^d \to \mathbb{R}$ is a measurable function with $E_P[\phi_P(O)] = 0$ and $0 < E_P[\phi_P(O)^2] < \infty$, and the remainder term fulfills $R_n(P) = o_P(1/\sqrt{n})$ for all $P \in \mathcal{P}$. A subsample $D_m = (O_1^*, \ldots, O_m^*)$ is a diminished data set obtained by drawing $m < n$ observations without replacement from the data set $D_n$. We denote by $\hat{\Psi}_m^*$ the estimate based on the subsample $D_m$.

2.2 Cheap Subsampling confidence interval


We aim to construct confidence intervals for Ψ(P ) based on the estimator Ψ̂n and B ≥ 1
subsamples of size m. By repeating the subsampling procedure independently B ≥ 1 times
we obtain the estimates based on subsamples {Ψ̂∗(m,1) , . . . , Ψ̂∗(m,B) } and define the Cheap
Subsampling confidence interval as
$$I_{(m,n,B)} = \left[\hat{\Psi}_n - t_{B,1-\alpha/2}\sqrt{\frac{m}{n-m}}\,S,\ \ \hat{\Psi}_n + t_{B,1-\alpha/2}\sqrt{\frac{m}{n-m}}\,S\right], \qquad (1)$$
where $t_{B,1-\alpha/2}$ is the $1-\alpha/2$ quantile of a t-distribution with $B$ degrees of freedom and $S = \sqrt{\tfrac{1}{B}\sum_{b=1}^{B}\bigl(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n\bigr)^2}$.
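In code, interval (1) is simple to compute. The sketch below is our own illustration rather than code from the authors' repository: `estimator` stands for any user-supplied function mapping a data set to a scalar estimate, and the sample-mean example at the end is purely for demonstration.

```r
## Cheap Subsampling confidence interval, following equation (1).
## `estimator` is any function mapping a data.frame to a scalar estimate.
cheap_subsampling_ci <- function(data, estimator, m, B = 25, alpha = 0.05) {
  n <- nrow(data)
  psi_hat <- estimator(data)
  psi_sub <- replicate(B, {
    idx <- sample.int(n, size = m, replace = FALSE)  # subsample without replacement
    estimator(data[idx, , drop = FALSE])
  })
  S <- sqrt(mean((psi_sub - psi_hat)^2))
  half_width <- qt(1 - alpha / 2, df = B) * sqrt(m / (n - m)) * S
  c(estimate = psi_hat, lower = psi_hat - half_width, upper = psi_hat + half_width)
}

## Example: 95% interval for a sample mean with m = floor(0.632 * n) and B = 25
set.seed(2)
d <- data.frame(x = rnorm(500, mean = 1))
cheap_subsampling_ci(d, function(dat) mean(dat$x), m = floor(0.632 * 500), B = 25)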
The asymptotic validity of the Cheap Subsampling confidence interval (1) can be shown
under the assumption that Ψ̂n is asymptotically normal and that the subsample size m(n)
is chosen appropriately as a function of the sample size n. To ease the notation, we will just
write m instead of m(n) in what follows. Sufficient conditions for asymptotic validity are
formulated in Theorem 1.
Theorem 1. Let $\hat{\Psi}_n$ be an asymptotically linear estimator of $\Psi(P)$. If the subsample size $m$ fulfills $\sup_{n\in\mathbb{N}} m/n \le c$ for some $0 < c < 1$, then the coverage probability of the Cheap Subsampling confidence interval $I_{(m,n,B)}$ converges to $1-\alpha$, that is,
$$P\bigl(\Psi(P) \in I_{(m,n,B)}\bigr) \to 1 - \alpha,$$
as $m, n \to \infty$ for any $B \ge 1$.
Proof. The proof is given in Appendix A.
Remark 1. Other choices of the subsample size may be of interest such as m/n → 1 and
n − m → ∞ as n → ∞ (Wu, 1990), but would require conditions on the remainder term Rn
that are too restrictive for our purposes.
Next, we state a result that shows that the endpoints of the Cheap Subsampling confi-
dence interval converge to a random limit fully determined by the data Dn as the number of
bootstrap repetitions increases (Theorem 2). Specifically, the theorem has the consequence
that the endpoints of the Cheap Subsampling confidence interval (1) converge to the endpoints of an (asymptotic) confidence interval based on the delete-$(n-m)$ jackknife variance estimator as $B \to \infty$. The delete-$(n-m)$ jackknife variance estimator (Shao and Wu, 1989) is given by
$$\widehat{\mathrm{Var}}_{\mathrm{jack}} = \frac{m}{n-m}\, E_P\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n\bigr].$$
If the condition of Theorem 2 is fulfilled, we have
$$\hat{\Psi}_n \pm t_{B,1-\alpha/2}\sqrt{\frac{m}{n-m}}\,S \ \xrightarrow{P}\ \hat{\Psi}_n \pm q_{1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}_{\mathrm{jack}}}, \quad \text{as } B \to \infty,$$
where $q_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.

Theorem 2. Let $\hat{\Psi}_n$ be any estimator. If $E_P[\hat{\Psi}_n^4] < \infty$, then
$$S^2 = \frac{1}{B}\sum_{b=1}^{B}\bigl(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n\bigr)^2 \ \xrightarrow{P}\ E_P\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n\bigr],$$
as $B \to \infty$ for fixed $m$ and $n$.
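As a small numerical illustration of this limit (our own sketch with the sample mean as estimator, not an example from the paper), the half-width of the Cheap Subsampling interval stabilizes as B grows and approaches the delete-(n − m) jackknife-based half-width:

```r
set.seed(3)
n <- 400; m <- floor(0.632 * n); alpha <- 0.05
x <- rnorm(n); psi_hat <- mean(x)
sub_est <- function() mean(x[sample.int(n, m)])     # estimate on one subsample
half_width <- function(B) {
  S <- sqrt(mean((replicate(B, sub_est()) - psi_hat)^2))
  qt(1 - alpha / 2, df = B) * sqrt(m / (n - m)) * S
}
sapply(c(5, 50, 5000), half_width)                   # stabilizes as B increases
## Large-B limit: delete-(n - m) jackknife-based half-width, with the conditional
## expectation approximated by 10^4 subsamples
qnorm(1 - alpha / 2) * sqrt(m / (n - m) * mean((replicate(1e4, sub_est()) - psi_hat)^2))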

3 Case study: Application to the LEADER data set


For the sole purpose of illustrating our method, we use the data from the LEADER trial
(Marso et al., 2016), where we apply the Cheap Subsampling confidence interval to the
longitudinal targeted minimum loss-based estimator (LTMLE) (van der Laan and Rose,
2018). We use a subset of the data from the LEADER trial, which includes 8652 patients
with type 2 diabetes. The baseline variables selected for analysis were sex, age, use of
thiazide, use of statin, hypertension, BMI, and the number of years since diabetes diagnosis.
The time-varying data were discretized into 8 intervals of 6 months each. In each
interval, we defined time-varying covariates that represent HbA1c, use of thiazolidinediones,
use of sulfonylureas, use of metformin, and use of DPP-4 inhibitors. For each time interval,
the LTMLE algorithm estimates nuisance regression models for the outcome, the propensity
of treatment, and the censoring probability. Nuisance parameter estimation for the LTMLE
was performed using a discrete super learner (van der Laan et al., 2007), including learners
based on penalized regression (Tay et al., 2023) and random forests obtained with the ranger
algorithm (Wright and Ziegler, 2017). The discrete super learner used 2-fold cross-validation
to choose the best-fitting model. Throughout, we apply the Cheap Subsampling algorithm
to obtain confidence intervals for the target parameter: the absolute 4-year risk of all-cause death under sustained adherence to the placebo regimen. We compare our method with asymptotic confidence
intervals that are based on an estimate of the efficient influence function (van der Laan and
Gruber, 2012). We investigate the Monte Carlo error (the effect of the random seed) and the impact of the subsample size and the number of bootstrap repetitions on the Cheap Subsampling confidence interval.
Figure 2 shows the effect of the number of bootstrap repetitions B on the Cheap Sub-
sampling confidence interval. We observe that the Cheap Subsampling confidence interval
is comparable to the asymptotic confidence interval and that a higher number of bootstrap
repetitions results in a more stable confidence interval. It is seen that the Cheap Subsam-
pling confidence intervals stabilize quickly and do not change significantly after 10 bootstrap
replications.
Due to the random nature of the bootstrap, the bootstrap confidence intervals are af-
fected by Monte Carlo error, i.e., they depend on the random seed set by the algorithm.
This means that the confidence intervals will change when repeating the whole bootstrap
procedure with a new random seed. In Figure 3, we illustrate this seed effect on the upper
endpoint of the confidence interval for various subsample sizes and 10 repetitions of the
whole Cheap Subsampling bootstrap algorithm. This shows that B ≥ 25 seems sufficient
for the seed effect to be negligible in our data example. We also see that in this example the
Cheap Subsampling confidence interval depends only weakly on the subsample size with sufficiently
many bootstrap repetitions (B ≥ 25).

[Figure 2: lower and upper confidence limits (y-axis, roughly 5% to 15%) plotted against the number of bootstrap repetitions B (x-axis, 0 to 200).]

Figure 2: Lower and upper endpoints (y-axis) of 95% Cheap Subsampling confidence intervals for the absolute risk of dying within 4 years for the placebo regimen in the LEADER trial using the LTMLE. The x-axis shows the number of bootstrap repetitions B ∈ {1, . . . , 200} for the subsample size m = ⌊0.8 · 8652⌋ = 6850. Additionally, the lower and upper endpoints of the asymptotic confidence interval are shown as black horizontal lines and the point estimate as a dotted line.

4 Simulation study
In this section, we simulate data to investigate the effects of sample size, subsample size, and
the number of bootstrap repetitions on the coverage probability and the width of the Cheap
Subsampling confidence interval. We consider a survival setting with a binary treatment
and a time-to-event outcome and apply the LTMLE algorithm for which we discretize time
into 2 time intervals. In the simulation study, the target parameter is the absolute risk
of an event by the end of the second time interval under sustained treatment. For
details on the data-generating mechanism, the simulation study, and the R code, see the
supplementary material (Appendix C) and https://github.com/jsohlendorff/cheap_subsampling_simulation_study.
[Figure 3: upper confidence limit (y-axis, roughly 5% to 15%) plotted against the subsample proportion η (x-axis: 0.5, 0.632, 0.8, 0.9), with separate points for B ∈ {5, 25, 100, 200}.]

Figure 3: The upper endpoint of the 95% Cheap Subsampling confidence interval based on the LTMLE of the absolute risk of dying within 4 years under the placebo regimen in the LEADER trial. The plot shows the Monte Carlo error (random seed effect) based on 10 runs of the Cheap Subsampling algorithm for each of the subsample sizes m = ⌊η · 8652⌋ with η ∈ {0.5, 0.632, 0.8, 0.9} and numbers of bootstrap repetitions B ∈ {5, 25, 100, 200}.

In our simulation study, we consider sample sizes n ∈ {250, 500, 1000, 2000, 8000} and
vary the subsample size m = ⌊η · n⌋ with η ∈ {0.5, 0.632, 0.8, 0.9} and the number of
bootstrap repetitions B ∈ {1, . . . , 500}. For each scenario, we repeat the whole procedure in
2000 simulated data sets. For the estimation of the nuisance parameters, we use (correctly
specified) logistic regression models.
In each instance, we compute the empirical coverage of the confidence intervals and
the average relative width of the Cheap Subsampling confidence interval for the LTMLE
when compared with the asymptotic confidence interval which is based on an estimate of
the efficient influence function (van der Laan and Gruber, 2012). Additionally, we compare

our Cheap Subsampling confidence interval with the Cheap Bootstrap confidence interval
(Lam, 2022). The results are summarized across the 2000 simulated data sets in Table 1
and Figure 4.
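For intuition on how these two summaries are computed, here is a toy version of the evaluation loop (our own sketch using a sample mean and its textbook asymptotic interval; it is not the paper's simulation study):

```r
## Toy simulation summary: empirical coverage and relative width of the
## Cheap Subsampling interval for a sample mean, compared with the asymptotic
## interval mean(x) +/- z * sd(x)/sqrt(n).
set.seed(4)
n <- 500; m <- floor(0.632 * n); B <- 25; alpha <- 0.05; truth <- 1
one_run <- function() {
  x <- rnorm(n, mean = truth)
  psi_hat <- mean(x)
  S <- sqrt(mean((replicate(B, mean(x[sample.int(n, m)])) - psi_hat)^2))
  hw_cs   <- qt(1 - alpha / 2, df = B) * sqrt(m / (n - m)) * S
  hw_asym <- qnorm(1 - alpha / 2) * sd(x) / sqrt(n)
  c(covered = abs(psi_hat - truth) <= hw_cs, rel_width = hw_cs / hw_asym)
}
res <- replicate(500, one_run())
rowMeans(res)  # empirical coverage and average relative width (x 100 for percent)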

[Figure 4: coverage (y-axis, 94% to 96%) plotted against the number of bootstrap samples B (x-axis, 0 to 500) for three methods: Cheap Bootstrap, Cheap Subsampling, and the asymptotic standard error-based interval.]

Figure 4: Results from the simulation study illustrating the coverage (y-axis) of three 95% confidence intervals for the absolute risk of an event before the end of the second time interval under sustained exposure using the LTMLE for n = 2000. The three confidence intervals are the asymptotic confidence interval, the Cheap Subsampling confidence interval (m = ⌊0.632 · 2000⌋ = 1264), and the Cheap Bootstrap confidence interval (Lam, 2022). The x-axis shows the number of bootstrap repetitions B ∈ {1, . . . , 500}.

Figure 4 shows that the coverage is close to the nominal level for very low numbers
of bootstrap repetitions and fixed subsample size m = ⌊0.632 · 2000⌋ = 1264. This was
guaranteed by Theorem 1 only in large samples. Comparing the Cheap Subsampling confidence interval with the asymptotic confidence interval and the Cheap Bootstrap confidence interval (Lam, 2022), we see that it has similar coverage, albeit slightly worse for very low numbers of bootstrap replications B. Table 1 shows no systematic effects on the coverage of the Cheap Subsampling confidence interval for different subsample sizes when the sample size is large. However, the coverage appears to depend
on the subsample size when the sample size is small.
For the widths in Table 1, we see that the Cheap Subsampling confidence interval is, in
general, slightly wider than the asymptotic confidence intervals, but that increasing B results
in narrower confidence intervals. A possible explanation for the wider Cheap Subsampling
confidence intervals at low B is that the quantiles of the t-distribution are large for low
degrees of freedom but quite comparable to the normal distribution for large degrees of
freedom (B ≥ 25). Similar results were obtained in the case study (Section 3) for small
values of B. Moreover, the width of the Cheap Subsampling confidence interval slightly
decreases with increasing subsample size and sample size when compared to the asymptotic
confidence interval, but this effect is less noticeable than the effect of the number of bootstrap
repetitions.

Table 1: The table shows the coverage and relative widths compared to the asymptotic
confidence interval of the 95% Cheap Subsampling confidence interval for the absolute risk
of an event before the end of the second time interval under sustained exposure using
the LTMLE for different subsample proportions η, sample sizes n, and numbers of bootstrap repetitions B.
                 Coverage (%)                  Relative width (%)
                 Subsample proportion (η)      Subsample proportion (η)
  B     n      50%   63.2%    80%    90%     50%   63.2%    80%    90%
  5     250   94.8    93.2   92.9   93.7   131.0   127.5  127.5  127.2
        500   94.2    93.8   95.0   94.5   126.3   126.0  126.9  125.8
        1000  94.0    95.0   94.2   94.7   125.1   125.8  124.9  125.3
        2000  94.8    95.1   95.0   94.5   125.6   126.2  123.5  124.8
        8000  95.3    95.8   94.5   95.3   125.8   127.1  124.8  124.0
  25    250   93.6    92.5   93.2   92.2   107.9   106.0  106.1  106.3
        500   94.2    93.8   94.3   95.1   105.8   104.9  104.8  104.1
        1000  94.5    95.2   94.2   94.3   104.3   104.8  104.5  104.1
        2000  94.2    95.0   95.3   95.0   104.9   105.2  104.2  104.3
        8000  94.7    95.3   95.3   94.5   104.2   104.8  104.0  103.6
  100   250   93.8    92.7   93.2   92.5   104.8   103.5  103.2  103.0
        500   94.2    93.8   94.9   95.0   102.5   102.2  101.8  101.7
        1000  94.9    94.8   94.6   93.8   101.5   101.5  101.4  101.1
        2000  94.2    94.6   94.8   95.0   101.5   101.3  101.0  101.1
        8000  94.5    95.2   95.0   95.4   101.0   101.3  100.7  101.0
  500   250   93.9    92.8   93.0   92.7   103.9   102.8  102.2  102.1
        500   93.8    93.8   94.7   95.1   101.7   101.3  101.1  101.0
        1000  94.8    94.8   94.5   94.0   100.8   100.7  100.6  100.5
        2000  94.0    94.5   94.8   95.0   100.6   100.5  100.3  100.3
        8000  94.6    94.9   95.0   94.7   100.2   100.3  100.2  100.2

5 Discussion
The Cheap Subsampling confidence interval is a valuable tool for applied research where
computational efficiency is needed. We have shown that it provides asymptotically valid
confidence intervals and investigated the real-world and small-sample performance for a
target parameter in a semiparametric causal inference setting. The Cheap Subsampling
confidence interval is easy to implement and can be applied to any asymptotically linear
estimator. Theoretically, the method can be applied already with very few bootstrap rep-
etitions. But, in our case study, the Monte Carlo error may not be regarded as negligible
for B < 25. This is similar to the suggestion given by Efron (1987). This is likely due to $\frac{mn}{n-m}S^2$ being a Monte Carlo bootstrap estimator of the asymptotic variance.
Our empirical study shows that the coverage of the Cheap Subsampling confidence inter-
val is more sensitive to the subsample sizes in small data sets. In these situations, we need
to choose the subsample size carefully to ensure correct coverage. Politis et al. (1999) and
Bickel and Sakov (2008) provide methods for adaptively selecting the subsample size. For
example, one may want to conduct a Monte Carlo experiment by selecting from a list of sub-
sample sizes m1 , . . . , mK the one that gives the best apparent coverage. The most notable
issue with these approaches is the computational burden. In future work, we will investigate
the possibility of adapting these methods to choosing the subsample size in practice.
With large data sets, such as those found in electronic health records, there is also the
possibility of using the Bag of Little Bootstraps (Kleiner et al., 2012) or the Cheap Bag of
Little Bootstraps (Lam, 2022). The idea behind these methods is to avoid the tuning of the
subsample size but to retain computational feasibility by estimating on smaller data sets.
Another advantage over the Cheap Subsampling confidence interval is that the confidence
intervals based on the Bag of Little Bootstraps are second-order accurate, likely resulting
in narrower confidence intervals. Lam (2022) showed that the Cheap Bootstrap based on
resampling yields confidence intervals that are second-order accurate. In future work, we
will investigate if the Cheap Subsampling bootstrap confidence interval can be made second-
order accurate, e.g., by using interpolation (Bertail, 1997) and extrapolation (Bertail and
Politis, 2001). On the other hand, the Bag of Little Bootstraps samples with replacement, and hence these methods suffer from the problem illustrated in Figure 1.
In our application of the Cheap Subsampling confidence intervals, we chose to study the
finite-sample properties with the TMLE, but there are also other choices given by Tran et al. (2023) and Coyle and van der Laan (2018). Both of these approaches provide valid confidence
intervals for the TMLE that adequately deal with the issue with cross-validation. The
method in Tran et al. (2023) also reduces the computation time for bootstrapping by only
needing to estimate the nuisance parameters once in the entire sample. However, these
approaches are specifically designed for the TMLE and may not be applicable to other
estimators.

5.1 Acknowledgments
The authors would like to thank Novo Nordisk for providing the data from the LEADER
trial.

5.2 Funding
Partially funded by the European Union. Views and opinions expressed are however those
of the author(s) only and do not necessarily reflect those of the European Union or Euro-
pean Health and Digital Executive Agency (HADEA). Neither the European Union nor the
granting authority can be held responsible for them. This work has received funding from
UK Research and Innovation under contract number 101095556.

5.3 Conflicts of interest


None declared.

5.4 Data availability


Data sharing cannot be provided due to the proprietary nature of the data.

References
Bertail, P. (1997). Second-order properties of an extrapolated bootstrap without replace-
ment under weak assumptions. Bernoulli 3 (2), 149 – 179.
Bertail, P. and D. N. Politis (2001). Extrapolation of subsampling distribution estima-
tors: The i.i.d. and strong mixing cases. The Canadian Journal of Statistics / La Revue
Canadienne de Statistique 29 (4), 667–680.
Bickel, P. J., F. Götze, and W. R. van Zwet (1997). Resampling Fewer Than n Observations:
Gains, Losses, and Remedies for Losses. Statistica Sinica 7 (1), 1–31.
Bickel, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993). Efficient and adaptive estimation for semiparametric models, Volume 4. Springer.
Bickel, P. J. and A. Sakov (2008). On the choice of m in the m out of n bootstrap and
confidence bounds for extrema. Statistica Sinica, 967–985.
Chiu, Y.-H., L. Wen, S. McGrath, R. Logan, I. J. Dahabreh, and M. A. Hernán (2023).
Evaluating model specification when using the parametric g-formula in the presence of
censoring. American journal of epidemiology 192 (11), 1887–1895.
Coyle, J. and M. J. van der Laan (2018). Targeted Bootstrap, pp. 523–539. Cham: Springer
International Publishing.
Efron, B. (1987). Better bootstrap confidence intervals. Journal of the American Statistical
Association 82 (397), 171–185.
Hernán, M. A. and J. M. Robins (2016). Using big data to emulate a target trial when a
randomized trial is not available. American journal of epidemiology 183 (8), 758–764.
Hernán, M. A. and J. M. Robins (2020). Causal inference: What if. Boca Raton: Chapman
& Hall/CRC, FL.
Kleiner, A., A. Talwalkar, P. Sarkar, and M. I. Jordan (2012). A scalable bootstrap for
massive data.

Lam, H. (2022, January). A Cheap Bootstrap Method for Fast Inference. https://arxiv.org/abs/2202.00090v1.
Marso, S. P., G. H. Daniels, K. Brown-Frandsen, P. Kristensen, J. F. Mann, M. A. Nauck,
S. E. Nissen, S. Pocock, N. R. Poulter, L. S. Ravn, W. M. Steinberg, M. Stockner,
B. Zinman, R. M. Bergenstal, and J. B. Buse (2016). Liraglutide and cardiovascular
outcomes in type 2 diabetes. New England Journal of Medicine 375 (4), 311–322.
Politis, D. N. and J. P. Romano (1994). Large Sample Confidence Regions Based on Sub-
samples under Minimal Assumptions. The Annals of Statistics 22 (4), 2031 – 2050.
Politis, D. N., J. P. Romano, and M. Wolf (1999). Subsampling. Springer Series in Statistics.
New York, NY: Springer.
Shao, J. and C. F. J. Wu (1989). A General Theory for Jackknife Variance Estimation. The
Annals of Statistics 17 (3), 1176 – 1197.
Tay, J. K., B. Narasimhan, and T. Hastie (2023). Elastic net regularization paths for all
generalized linear models. Journal of Statistical Software 106 (1), 1–31.

Tibshirani, R. J. and B. Efron (1993). An introduction to the bootstrap. Monographs on Statistics and Applied Probability 57 (1).
Tran, L., M. Petersen, J. Schwab, and M. J. van der Laan (2023, January). Robust variance
estimation and inference for causal effect estimation. Journal of Causal Inference 11 (1).

van der Laan, M. J., D. Benkeser, and W. Cai (2023). Efficient estimation of pathwise
differentiable target parameters with the undersmoothed highly adaptive lasso. The In-
ternational Journal of Biostatistics 19 (1), 261–289.
van der Laan, M. J. and S. Gruber (2012, May). Targeted Minimum Loss Based Estimation
of Causal Effects of Multiple Time Point Interventions. The International Journal of
Biostatistics 8 (1).
van der Laan, M. J., E. C. Polley, and A. E. Hubbard (2007). Super learner. Statistical
Applications in Genetics and Molecular Biology 6 (1).
van der Laan, M. J. and S. Rose (2018). Targeted Learning in Data Science: Causal In-
ference for Complex Longitudinal Studies. Springer Series in Statistics. Cham: Springer
International Publishing.
Wright, M. N. and A. Ziegler (2017). ranger: A fast implementation of random forests for
high dimensional data in C++ and R. Journal of Statistical Software 77 (1), 1–17.
Wu, C. F. J. (1990). On the Asymptotic Properties of the Jackknife Histogram. The Annals
of Statistics 18 (3), 1438 – 1452.

6 Appendix A: Proof of Theorem 1
To prove the theorem, we use the notation and framework of Section 2.1. Since the estimator $\hat{\Psi}_n$ is asymptotically linear, Slutsky's theorem and the central limit theorem yield
$$\sqrt{n}(\hat{\Psi}_n - \Psi(P)) \xrightarrow{d} Z \overset{d}{=} N(0, \sigma^2),$$
where $\sigma^2 = E_P[\phi_P(O)^2] > 0$. Theorem 2 (iii) of Wu (1990) gives that
$$\sup_{x\in\mathbb{R}} \left| P\left(\sqrt{\tfrac{mn}{n-m}}\,(\hat{\Psi}^*_m - \hat{\Psi}_n) \le x \,\Big|\, D_n\right) - \Phi_{\sigma^2}(x) \right| \xrightarrow{P} 0,$$
if $\frac{n-m}{n} > \lambda$ for some $\lambda > 0$ for all $n \in \mathbb{N}$. By our assumption on the subsample size, we have $\frac{m}{n} \le c$ for some $0 < c < 1$, and thus $\frac{n-m}{n} = 1 - \frac{m}{n} \ge \lambda := 1 - c > 0$. This implies
$$P\left(\sqrt{\tfrac{mn}{n-m}}\,(\hat{\Psi}^*_m - \hat{\Psi}_n) \le x \,\Big|\, D_n\right) \xrightarrow{P} \Phi_{\sigma^2}(x),$$
for all $x \in \mathbb{R}$ as $m, n \to \infty$, where $\Phi_{\sigma^2}$ is the cumulative distribution function of a normal distribution with mean 0 and variance $\sigma^2$. The remainder of the proof follows along the steps of the proof of Theorem 1 in Lam (2022). Let $k_{(m,n)} = \frac{mn}{n-m}$. We want to show that
$$\Bigl(\sqrt{n}(\hat{\Psi}_n - \Psi(P)),\ \sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,1)} - \hat{\Psi}_n),\ \ldots,\ \sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,B)} - \hat{\Psi}_n)\Bigr) \xrightarrow{d} (Z_0, \ldots, Z_B), \qquad (2)$$
where $Z_0, \ldots, Z_B$ are independent and identically distributed with $Z_b \overset{d}{=} Z$. By conditioning on $D_n$ and using conditional independence of $\sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n)$, $b = 1, \ldots, B$ (bootstrap samples are drawn independently given the data), we have
$$\begin{aligned}
&\left| P\Bigl(\sqrt{n}(\hat{\Psi}_n - \Psi(P)) \le z_0,\ \sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,1)} - \hat{\Psi}_n) \le z_1,\ \ldots,\ \sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,B)} - \hat{\Psi}_n) \le z_B\Bigr) - \prod_{b=0}^{B} \Phi_{\sigma^2}(z_b) \right| \\
&\quad \le E\left[ I\Bigl(\sqrt{n}(\hat{\Psi}_n - \Psi(P)) \le z_0\Bigr) \left| \prod_{b=1}^{B} P\Bigl(\sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n) \le z_b \,\Big|\, D_n\Bigr) - \prod_{b=1}^{B} \Phi_{\sigma^2}(z_b) \right| \right] \qquad (3)\\
&\qquad + \prod_{b=1}^{B} \Phi_{\sigma^2}(z_b) \left| P\Bigl(\sqrt{n}(\hat{\Psi}_n - \Psi(P)) \le z_0\Bigr) - \Phi_{\sigma^2}(z_0) \right|, \qquad (4)
\end{aligned}$$
for any $z = (z_0, \ldots, z_B) \in \mathbb{R}^{B+1}$. Since, by assumption, $\sqrt{n}(\hat{\Psi}_n - \Psi(P)) \xrightarrow{d} Z$, the term in equation (4) converges to zero as $n \to \infty$. Since also $P\bigl(\sqrt{k_{(m,n)}}\,(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n) \le z \mid D_n\bigr) \xrightarrow{P} \Phi_{\sigma^2}(z)$ for $b = 1, \ldots, B$ and all $z \in \mathbb{R}$ as $n \to \infty$, it follows that the integrand of the term (3) converges to zero in probability as $n \to \infty$. Since the integrand in the term (3) is bounded by 1, it follows from dominated convergence that (3) tends to zero. Thus (2) holds. From this result, we deduce that
$$T_{(m,n)} = \frac{\hat{\Psi}_n - \Psi(P)}{\sqrt{\tfrac{m}{n-m}}\,S} = \frac{\sqrt{n}(\hat{\Psi}_n - \Psi(P))}{\sqrt{k_{(m,n)}\,S^2}} = \frac{\dfrac{\sqrt{n}(\hat{\Psi}_n - \Psi(P))}{\sigma}}{\sqrt{\dfrac{1}{B}\sum_{b=1}^{B}\Bigl(\dfrac{\sqrt{k_{(m,n)}}}{\sigma}(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n)\Bigr)^2}} \ \xrightarrow{d}\ \frac{\tilde{Z}_0}{\sqrt{\dfrac{1}{B}\sum_{b=1}^{B}\tilde{Z}_b^2}},$$
where $\tilde{Z}_b = Z_b/\sigma \overset{d}{=} N(0, 1)$. Note that by the independence of the $\tilde{Z}_b$'s and the fact that $\tilde{Z}_b \overset{d}{=} N(0, 1)$, we have that $\tilde{Z}_0 \big/ \sqrt{\tfrac{1}{B}\sum_{b=1}^{B}\tilde{Z}_b^2}$ has a t-distribution with $B$ degrees of freedom. This shows that $T_{(m,n)}$ converges in distribution to a t-distribution with $B$ degrees of freedom, as $n \to \infty$. Thus, we have
$$P\bigl(\Psi(P) \in I_{(m,n,B)}\bigr) = P\bigl(-t_{B,1-\alpha/2} < T_{(m,n)} < t_{B,1-\alpha/2}\bigr) \to 1 - \alpha,$$
as $n \to \infty$.

7 Appendix B: Proof of Theorem 2


To prove the theorem, we shall argue that $E\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^4\bigr] < \infty$. An application of Chebyshev's inequality for conditional expectations then provides the means to show the statement of the theorem.
Since $E[\hat{\Psi}_n^4] < \infty$, the binomial theorem gives
$$E\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^4\bigr] = \sum_{i=0}^{4} \binom{4}{i} E\bigl[(\hat{\Psi}^*_m)^i(-\hat{\Psi}_n)^{4-i}\bigr].$$
Applying Hölder's inequality with $p = 4/i$ and $q = 4/(4-i)$, we have $\bigl|E[\hat{\Psi}_n^{4-i}(\hat{\Psi}^*_m)^i]\bigr| \le (E[\hat{\Psi}_n^4])^{1/q}\,(E[(\hat{\Psi}^*_m)^4])^{1/p} = (E[\hat{\Psi}_n^4])^{1/q}\,(E[\hat{\Psi}_m^4])^{1/p}$, where the latter equality follows from the fact that the subsample has marginally the same distribution as a full sample of $m$ observations. Moreover, since $E[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n] = \operatorname{argmin}_g E\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 - g(D_n)\bigr]^2$, where the minimum is taken over all $D_n$-measurable functions $g$, we have that
$$E\Bigl[\Bigl((\hat{\Psi}^*_m - \hat{\Psi}_n)^2 - E[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n]\Bigr)^2\Bigr] \le E\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^4\bigr] < \infty. \qquad (5)$$
By using that conditionally on $D_n$, the random variables $\hat{\Psi}^*_{(m,1)}, \ldots, \hat{\Psi}^*_{(m,B)}$ are independent, we have by Chebyshev's inequality for conditional expectations, for arbitrary $\varepsilon > 0$, that
$$P\left( \left| \frac{1}{B}\sum_{b=1}^{B}\bigl(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n\bigr)^2 - E[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n] \right| \ge \varepsilon \,\middle|\, D_n \right) \le \frac{1}{B^2\varepsilon^2}\sum_{b=1}^{B} \mathrm{Var}\bigl[(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n)^2 \mid D_n\bigr].$$
Taking the expectation on both sides of the previous display, we have
$$\begin{aligned}
P\left( \left| \frac{1}{B}\sum_{b=1}^{B}\bigl(\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n\bigr)^2 - E[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n] \right| \ge \varepsilon \right)
&\le \frac{1}{B\varepsilon^2}\, E\bigl[\mathrm{Var}[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n]\bigr] && (6)\\
&= \frac{1}{B\varepsilon^2}\, E\Bigl[ E\Bigl[ \bigl((\hat{\Psi}^*_m - \hat{\Psi}_n)^2 - E[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n]\bigr)^2 \,\Big|\, D_n \Bigr] \Bigr] \\
&= \frac{1}{B\varepsilon^2}\, E\Bigl[ \bigl((\hat{\Psi}^*_m - \hat{\Psi}_n)^2 - E[(\hat{\Psi}^*_m - \hat{\Psi}_n)^2 \mid D_n]\bigr)^2 \Bigr] && (7)\\
&\le \frac{1}{B\varepsilon^2}\, E\bigl[(\hat{\Psi}^*_m - \hat{\Psi}_n)^4\bigr], && (8)
\end{aligned}$$
where (6) follows since $\hat{\Psi}^*_{(m,b)} - \hat{\Psi}_n$, $b = 1, \ldots, B$, are identically distributed and expectations are linear, (7) follows from the tower property of conditional expectations, and (8) follows by (5). Taking the limit as $B \to \infty$ concludes the proof.

8 Appendix C: Data-generating mechanism


In this section, we describe the data-generating mechanism for our simulation study (Section
4). See https://github.com/jsohlendorff/cheap_subsampling_simulation_study for
the R code. We denote by Yt the outcome, where Yt = 1 if an event has occurred at time t
and Yt = 0 otherwise. We also denote the treatment by At (1 is treated and 0 is untreated),
the time-varying confounders by Wt (continuous), and the censoring indicators by Ct (1 is
uncensored and 0 is censored). We generate the data at baseline (t = 0) and at two time
points (t = 1 and t = 2) as follows:
$$\begin{aligned}
W_0 &\sim N(0, 1),\\
A_0 &\sim \mathrm{Bern}(\mathrm{expit}(-0.2 + 0.4 W_0)),\\
C_1 &\sim \mathrm{Bern}(\mathrm{expit}(3.5 + W_0)),\\
Y_1 &\sim \begin{cases} \mathrm{Bern}(\mathrm{expit}(-1.4 + 0.1 W_0 - 1.5 A_0)), & \text{if } C_1 = 1\\ \emptyset, & \text{if } C_1 = 0,\end{cases}\\
W_1 &\sim \begin{cases} N(0.5 W_0 + 0.2 A_0,\, 1), & \text{if } C_1 = 1 \text{ and } Y_1 = 0\\ \emptyset, & \text{otherwise},\end{cases}\\
A_1 &\sim \begin{cases} \mathrm{Bern}(\mathrm{expit}(-0.4 W_0 + 0.8 A_0)), & \text{if } C_1 = 1 \text{ and } Y_1 = 0\\ \emptyset, & \text{otherwise},\end{cases}\\
C_2 &\sim \begin{cases} \mathrm{Bern}(\mathrm{expit}(3.5 + W_1)), & \text{if } C_1 = 1 \text{ and } Y_1 = 0\\ 0, & \text{if } C_1 = 0\\ \emptyset, & \text{otherwise},\end{cases}\\
Y_2 &\sim \begin{cases} \mathrm{Bern}(\mathrm{expit}(-1.4 + 0.1 W_1 - 1.5 A_1)), & \text{if } C_2 = 1 \text{ and } Y_1 = 0\\ 1, & \text{if } C_2 = 1 \text{ and } Y_1 = 1\\ \emptyset, & \text{if } C_2 = 0,\end{cases}
\end{aligned}$$
where $\mathrm{expit}(x) = \frac{1}{1+\exp(-x)}$ is the logistic function, $N(\mu, \sigma^2)$ is the normal distribution with mean $\mu$ and variance $\sigma^2$, $\mathrm{Bern}(p)$ is the Bernoulli distribution with probability $p$, and $\emptyset$ is the missingness indicator.
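For reference, a compact R sketch of this data-generating mechanism is given below (our illustration of the display above; the repository linked earlier contains the code actually used). Missing values (∅) are coded as NA, and the function and variable names are ours.

```r
## Simulate one data set from the data-generating mechanism above (sketch).
expit <- function(x) 1 / (1 + exp(-x))
rbern <- function(p) rbinom(length(p), size = 1, prob = p)

simulate_data <- function(n) {
  W0 <- rnorm(n)
  A0 <- rbern(expit(-0.2 + 0.4 * W0))
  C1 <- rbern(expit(3.5 + W0))
  Y1 <- W1 <- A1 <- C2 <- Y2 <- rep(NA_real_, n)
  Y1[C1 == 1] <- rbern(expit(-1.4 + 0.1 * W0 - 1.5 * A0))[C1 == 1]
  risk1 <- which(C1 == 1 & Y1 == 0)            # uncensored and event-free after t = 1
  W1[risk1] <- rnorm(length(risk1), mean = 0.5 * W0[risk1] + 0.2 * A0[risk1])
  A1[risk1] <- rbern(expit(-0.4 * W0[risk1] + 0.8 * A0[risk1]))
  C2[risk1] <- rbern(expit(3.5 + W1[risk1]))
  C2[C1 == 0] <- 0
  obs2 <- which(C2 == 1)                        # uncensored at t = 2
  Y2[obs2] <- ifelse(Y1[obs2] == 1, 1,
                     rbern(expit(-1.4 + 0.1 * W1[obs2] - 1.5 * A1[obs2])))
  data.frame(W0, A0, C1, Y1, W1, A1, C2, Y2)
}

set.seed(5)
head(simulate_data(1000))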

