Identification and estimation of causal effects using non-concurrent controls in platform trials
Abstract
Platform trials are multi-arm designs that simultaneously evaluate multiple treatments for a single disease within the same overall trial structure. Unlike traditional randomized controlled trials, they allow treatment arms to enter and exit the trial at distinct times while maintaining a control arm throughout. This control arm comprises both concurrent controls, where participants are randomized concurrently to either the treatment or control arm, and non-concurrent controls, who enter the trial when the treatment arm under study is unavailable. While flexible, platform trials introduce the challenge of using non-concurrent controls, raising questions about estimating treatment effects. Specifically, which estimands should be targeted? Under what assumptions can these estimands be identified and estimated? Are there any efficiency gains? In this paper, we discuss the identification and estimation assumptions behind common choices of estimand. We conclude that the most robust strategy to increase efficiency without imposing unwarranted assumptions is to target the concurrent average treatment effect (cATE), the ATE among only concurrent units, using a covariate-adjusted doubly robust estimator. Our studies suggest that, for the purpose of obtaining efficiency gains, collecting important prognostic variables is more important than relying on non-concurrent controls. We also discuss the perils of targeting the ATE, which relies on an untestable extrapolation assumption that will often be invalid. We provide simulations illustrating our points and an application to the ACTT platform trial, resulting in a 20% improvement in precision.
Keywords: adaptive trials; causality; doubly robust; efficiency; estimand
1 Introduction
Platform trials are multi-arm designs that simultaneously evaluate multiple treatments for a single disease within the same overall trial structure (Woodcock & LaVange, 2017; Berry et al., 2015; Park et al., 2022). Unlike traditional randomized controlled trials, they allow treatment arms to enter and exit the trial at distinct times while maintaining a control arm throughout. These trials have been instrumental in assessing the efficacy of treatments across various therapeutic areas (Barker et al., 2009; Foltynie et al., 2023; Wells et al., 2012, among others) and gained traction during the COVID-19 pandemic (Hayward et al., 2021; Angus et al., 2020; Kalil et al., 2021, among others). For instance, the Adaptive COVID-19 Treatment Trial (ACTT) (Kalil et al., 2021) was a platform trial that investigated treatments for hospitalized adult patients with COVID-19 pneumonia. ACTT comprised multiple stages, as depicted in Figure 1. In the initial stage (ACTT-1), the efficacy of remdesivir alone versus placebo was evaluated. Subsequently, in the second stage (ACTT-2), placebo was discontinued, and a new treatment, remdesivir plus baricitinib, was introduced while concurrently randomizing participants to remdesivir alone. Here, the remdesivir alone arm served as a shared arm between the ACTT-1 and ACTT-2 stages. The remdesivir alone arm is termed non-concurrent for remdesivir plus baricitinib during ACTT-1 and concurrent during ACTT-2. In this paper, we adhere to the terminology used in current literature (Bofill Roig et al., 2023; Lee & Wason, 2020) and designate the shared arm as control, irrespective of whether it is a placebo arm, an active control, or an experimental treatment. Thus, we consistently use the terms concurrent and non-concurrent controls regardless of the nature of the shared arm.
The central question revolves around the efficient utilization of non-concurrent controls to estimate treatment effects in platform trials. Specifically, what estimands should be targeted to evaluate the causal effect of a treatment versus a shared control? Under what assumptions can these estimands be identified and estimated? Does using non-concurrent controls lead to efficiency gains?
Addressing these questions requires careful consideration of how the timing of entry into the platform trial may introduce bias into the study results, which is referred to as “time drift”, “temporal drift” or “time trend”. Various methods have been proposed to control for it, including test-then-pool approaches (Viele et al., 2014), frequentist and Bayesian regression models (Lee & Wason, 2020; Sridhara et al., 2022; Bofill Roig et al., 2023; Saville et al., 2022), propensity-score-based methods (Yuan et al., 2019; Chen et al., 2020), and other approaches (Han et al., 2017; Collignon et al., 2020; Ibrahim & Chen, 2000; Neuenschwander et al., 2009; Banbeta et al., 2019; Gravestock et al., 2017; Bennett et al., 2021; Hobbs et al., 2011; Normington et al., 2020; Schmidli et al., 2020; Hupf et al., 2021; Jiang et al., 2023).
While these methods provide a statistical way to incorporate non-concurrent controls and control for the “temporal drift” bias, they are “model-first”: they first posit a model for the outcome and then reverse engineer interpretations of the estimated parameters in terms of causal effects. This approach conflicts with the recently advocated estimand framework (FDA, 2021) and the International Council for Harmonisation (ICH) E9(R1) guidance (International Council for Harmonisation, 2017), where the causal target of interest is first identified based solely on scientific discussions, and then the optimal statistical estimation method for that target is deployed. Existing methods for the use of non-concurrent controls lack a formal framework for characterizing causal effects and their identifying conditions, which makes it challenging to interpret the effect estimates from these procedures and to make recommendations regarding clinical practice. These concerns are underscored in recent reviews (Collignon et al., 2022; Koenig et al., 2024) and are discussed in the FDA estimand framework (FDA, 2021) and the ICH E9(R1) guidance (International Council for Harmonisation, 2017).
In this paper, we discuss the use of non-concurrent controls, using the estimand framework to guide our discussion and choices. We propose to target the concurrent average treatment effect (cATE) of a given treatment arm as an estimand of interest in platform trials. Specifically, the cATE is the marginal average difference in outcomes for individuals who receive that treatment compared to those in the shared control group, among the concurrent population. We then provide assumptions for its non-parametric identification, and show that these assumptions are all feasible, in contrast to the assumptions required for identification of the average treatment effect, which are not testable. We develop several estimators for the cATE, including outcome regression, inverse probability weighting, and doubly robust estimators. We show that efficiency gains can be obtained by leveraging non-concurrent controls for estimators based on outcome regression under correct model specification. Interestingly, we also show that there are no asymptotic gains in efficiency when using non-concurrent controls with doubly robust estimators adjusted by time of entry into the trial when treatment availability is a deterministic function of entry time. However, we show that efficiency gains can be obtained when treatment availability is a stochastic function of entry time.
In randomized trials, efficiency gains may come from multiple sources. For instance, one can attempt to gain efficiency by increasing the sample size, as illustrated by the use of non-concurrent controls. Alternatively, precision may be increased through adjustment for prognostic variables (see Colantuoni & Rosenblum, 2015; Benkeser et al., 2021, among others). Prognostic variables are often incorporated through regression models, which can then be mapped into conditional or marginal effect estimates. However, outcome models may lead to biased results under certain types of misspecification. It is therefore important to use doubly robust estimators for covariate adjustment. Doubly robust estimators are consistent when either the treatment assignment or the outcome model is correctly specified, a property we obtain “for free” in platform trials due to randomization. Therefore, a key takeaway when targeting the cATE in platform trials is to use a doubly robust estimator and to prioritize identifying strong prognostic baseline variables rather than relying on non-concurrent controls. When treatment availability is a deterministic function of entry time, non-concurrent controls provide no efficiency benefit for a robust estimator that does not rely on correctly specifying the outcome regression.
Finally, we further highlight the risks of targeting the average treatment effect using data for the entire duration of the trial including the non-concurrent period, due to its dependence on an untestable extrapolation assumption.
2 Notation and setup
For each of study participants, let denote the (random) entry time, after eligibility screening and consent, of a unit into the study, let denote a set of baseline variables, let denote the randomized treatment taking values , where denotes the control arm and denotes the treatments of interest. Let denote an indicator of whether arm was available at time , and define . Let denote a binary or numerical outcome measured at a fixed time after entry . The observed data is , where represents the data for the experimental unit , i.e., . We define with probability one so that at least treatments plus control are available at the start of the trial. We also assume the data are ordered in time of study entry in the sense that . Note that by design.
2.1 A structural causal model and associated DAG
To encapsulate the role of entry time and non-concurrent controls in platform trials, we posit the structural causal model and directed acyclic graph (DAG) (Pearl, 1995) represented in Figure 2, and its interpretation in terms of a non-parametric structural equation model in eq. (1) respectively.
(1) | ||||
We now discuss some important features of Model (1). Model (1) allows all variables to depend, directly or through other variables, on entry time, and therefore appropriately models temporal drifts. It also allows the treatment assignment to depend on the participant's covariates, thus allowing study designs such as stratified randomization (Broglio, 2018). Model (1) also imposes some exclusion restrictions. First, the treatment assignment is not allowed to depend on the entry time other than through treatment availability. In other words, a participant entering the study at a given time can only be assigned to treatments available at that time, but the randomization probability of a treatment that is available for assignment does not vary in time. Second, the outcome for a unit is not allowed to depend on the availability of treatments other than through the treatment actually given to that unit, but it is allowed to depend directly on the unit's entry time. Third, the availability of treatments does not depend on covariates. The last two assumptions are reasonable since the treatments under evaluation do not often depend on trial data. In this paper, we assume that covariates depend on entry time, and not the other way around. This assumption is reasonable because, in many platform trials, entry time does not depend on individual-level data. Furthermore, it can be shown that the results presented in the next sections also hold when entry time depends on covariates. In Model (1), the functions generating entry time, covariates, and outcomes are completely unknown, thus making the model non-parametric, while the treatment assignment function and the treatment availability function are known by design. In addition, the error terms of the entry time, covariate, and outcome equations are unmeasured factors that impact the entry time, covariates, and outcomes, respectively. The error terms of the treatment assignment equation control the randomization probabilities and are known by design.
The random variables represent all factors that determine the availability of treatments for subjects in the trial.
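As a sketch, the dependencies and exclusion restrictions described above are compatible with a structural equation system of the following form. The symbol names are illustrative assumptions: $T$ for entry time, $W$ for baseline variables, $V$ for the vector of treatment availability indicators, $A$ for the assigned treatment, $Y$ for the outcome, and $U_T, U_W, U_V, U_A, U_Y$ for the corresponding exogenous errors.

```latex
\begin{align*}
T &= f_T(U_T), \\
W &= f_W(T, U_W), \\
V &= f_V(T, U_V), \\
A &= f_A(V, W, U_A), \\
Y &= f_Y(A, W, T, U_Y).
\end{align*}
```

In this form, $A$ depends on $T$ only through $V$, $Y$ depends on $V$ only through $A$, and $V$ does not depend on $W$, which mirrors the three exclusion restrictions stated in the text.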
In the following section, under this model and its associated DAG, we define the concurrent average treatment effect as the causal estimand of interest and introduce its identification assumptions, aligning with the estimand framework advocated by the FDA (FDA, 2021).
3 Definition and identification of the concurrent average treatment effect
In this paper, we focus on endpoints measured at fixed time-points post-randomization. Additionally, we consider an intention-to-treat (ITT) analysis. Our results can be easily extended to binary endpoints. We define the concurrent average treatment effect of a treatment against a shared control arm in terms of counterfactual outcomes (Pearl, 2010), that is, the outcomes that would have been observed in a hypothetical world where a given treatment had been assigned. We first define the estimand and then discuss its non-parametric identification.
Definition 1 (Conditional and marginal average treatment effect of treatment arm compared to shared control among concurrent population).
The marginal cATE is the ITT average treatment effect among only concurrent units. The conditional cATE is its conditional version, conditioning on baseline variables and entry time. We now provide assumptions to identify it.
3.1 Non-parametric identification
Non-parametric identification allows us to express the causal target quantity of interest in terms of the distribution of the observed data without relying on assumptions on the functional form of the distributions (Pearl, 1995). In order to discuss non-parametric identification of , we introduce the following assumptions:
A1 Weak A-ignorability.
Assume
.
A2 Consistency.
Assume
.
A3 Positivity of treatment assignment mechanism among concurrent units.
Assume
for all and s.t. .
A4 Positivity of shared arm assignment mechanism among all controls.
Assume
for all and .
A5 Pooling concurrent and non-concurrent controls.
Assume
for all s.t. .
Assumption A1 is untestable because it is a statement about counterfactuals, which are unobservable. It states that, conditional on the variables listed in A1, the counterfactual outcome under a given treatment is independent of the treatment assignment. We expect this to hold by design because of randomization.
Assumption A2 is a standard causal inference assumption. It states that, given the conditioning variables in A2, the distribution of the observed outcome among units assigned to a given arm is the same as that of the corresponding counterfactual outcome, for every arm. This assumption is implied by the structural causal model (1). We expect this also to hold by design.
Assumption A3 states that, once a treatment arm is available in the trial, all covariate profiles have a positive probability of receiving that treatment. Similarly, Assumption A4 states that all covariate profiles have a positive probability of being assigned to the shared control arm. (Note that assumption A4 is redundant given assumption A3 but helps clarify our identification proofs.) These two assumptions hold by design in platform trials.
Assumption A5 states that, once we control for baseline variables and entry time, the conditional expectation of the outcome under control in the pooled concurrent and non-concurrent units (right-hand side of assumption A5) is the same as that among only concurrent units (left-hand side of assumption A5), for all covariate and entry-time values among the concurrent units. In other words, after conditioning on baseline variables and entry time, what is learned using all the pooled data can be used to predict conditional expectations among only concurrent units. In addition, it is straightforward to see that these quantities depend only on observable data. For instance, we know by design that we have data for the left-hand side of assumption A5 for all units for whom the treatment was available and, for the right-hand side, for all units in the shared arm. Therefore assumption A5 can be tested, as discussed in our practical guidelines in Section 8.
Remark 1.
In contrast, when treatment availability is a stochastic function of entry time, an assumption is needed because the error term driving availability is unknown and can differ between the two expectations. In addition, while this remark is always true at the population level and for the true conditional outcome expectations, for given estimators its validity depends on correct model specification. For instance, if the outcome is non-linear in entry time and we fit one linear model within the pooled dataset and another within the concurrent dataset, the two fitted regressions will not be equal, because each captures the projection of the true non-linear expectation onto linear models over a different subset of the range of entry times. This underscores the importance of using non-parametric models for these regressions if data are to be pooled.
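The projection argument in the remark can be illustrated with a small, self-contained sketch. Everything here is a hypothetical choice made for illustration: a noise-free control outcome equal to the square of entry time, an enrolment period of [0, 1], and a concurrent window equal to its second half.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on a single covariate x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical control outcomes that are non-linear in entry time (no noise).
entry_times = [i / 100 for i in range(101)]   # grid on [0, 1]
outcomes = [t ** 2 for t in entry_times]

# Assumed concurrent window: the second half of the enrolment period.
concurrent = [(t, y) for t, y in zip(entry_times, outcomes) if t >= 0.5]

slope_pooled = ols_slope(entry_times, outcomes)
slope_concurrent = ols_slope([t for t, _ in concurrent],
                             [y for _, y in concurrent])
print(slope_pooled, slope_concurrent)
```

The two fitted slopes differ (about 1.0 on the pooled grid and about 1.5 on the concurrent window) even though the underlying outcome function is identical in both periods, so a disagreement between linear fits need not indicate a violation of A5 itself.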
Theorem 1 (Identification of the cATE in platform adaptive trials under Model (1)).
Equivalent expressions based on weighting are provided in the appendix. A comparison between expressions (2) and (3) reveals why researchers have been historically motivated to use non-concurrent controls: they can be useful in estimating the outcome expectation for the controls, therefore potentially reducing the variance of the estimator.
In the next sections, we will show that, when treatment availability is a deterministic function of entry time, efficiency gains only bear out for plug-in estimators based on parametric regressions, which can be biased if the models are misspecified. Doubly robust estimators, which are always consistent by virtue of randomization, will not benefit from these efficiency gains. However, we will also show that efficiency gains can be obtained when treatment availability is a stochastic function of entry time.
3.2 On the identification of the average treatment effect
In this paper, we propose targeting the cATE in platform trials. However, many researchers are familiar with another estimand, the average treatment effect (ATE), defined as the expected difference between treatment and control in the entire trial population (concurrent and non-concurrent units). In formulas,
Definition 2 (Conditional and marginal average treatment effect of treatment arm compared to shared control).
where the conditional ATE conditions on baseline variables and entry time. The ATE is familiar partly because, in standard randomized controlled trials, the common statistical model used to evaluate treatment effects is a linear model that regresses the outcome on the treatment group and baseline covariates. In such a model, the canonical interpretation of the coefficient for the treatment aligns with that of the ATE. For instance, under such a linear model, we can show that,
We now explain why we believe that targeting the ATE in platform trials is dangerous. To do so, we start by introducing an additional assumption needed to identify the ATE in platform trials:
A6 Extrapolation of outcome mechanism among the treated.
Assume
for all .
Assumptions A5 and A6 state that the outcome distributions among controls and among treated, respectively, are exchangeable between patients for whom treatment is available and those for whom it is not, given patients' baseline variables and entry time. Note that assumption A5 and assumption A6 are similar in nature in that they assume exchangeability of the outcome mechanism for the control and treatment arms. However, there is a fundamental difference between these assumptions that makes identification based on A5 more reliable than identification based on A6. Assumption A5 is a testable assumption since it is based on observed data. In addition, as mentioned above, assumption A5 always holds when treatment availability is a deterministic function of entry time, and its validity in practice only depends on correct model specification. Consequently, whether pooling data in a specific regression algorithm is appropriate can be empirically checked, as shown in our practical guidelines.
On the other hand, assumption A6 is an identification assumption based on unobserved data, since it requires assuming that the conditional outcome expectation observed in patients who could hypothetically be randomized to treatment can be used to extrapolate to those who could not. In other words, A6 is an extrapolation assumption, since it assumes that the expected outcome under treatment in times and baseline variables of no treatment availability can be extrapolated from a model fit on times and baseline variables where the treatment was available. Consequently, assumption A6 cannot be empirically checked.
We now state, and show in the appendix, that identification of the ATE depends on the extrapolation assumption A6.
Theorem 2 (Identification of the ATE in platform adaptive trials under Model (1)).
In summary, unlike the cATE, the ATE depends on an extrapolation assumption (A6), which can be risky. Firstly, this assumption cannot be tested. Secondly, it is often unrealistic for novel diseases with a rapidly changing pathology and clinical landscape.
4 Relation to analytical approaches common in the literature
Regression models are often fitted and then used to extrapolate to units for whom the treatment was unavailable, thus targeting the ATE as discussed in Section 3.2. Inferences are then made using the regression coefficient related to the treatment, whether within the frequentist (Lee & Wason, 2020; Bofill Roig, Krotka, Burman, Glimm, Gold & Hees, 2022) or Bayesian framework (Saville et al., 2022; Bofill Roig, König, Meyer & Posch, 2022; Ibrahim & Chen, 2000).
Matching techniques have been proposed to estimate the average treatment effect among the treated. The idea is to balance covariates between concurrent and non-concurrent controls by using, for instance, a matching algorithm based on the propensity score (Yuan et al., 2019).
Bayesian methods have been proposed to include non-concurrent controls. The idea is to learn a prior for the parameter of interest using non-concurrent controls only. Then, this prior is combined with the concurrent control data via Bayes' theorem. Meta-analytic priors (Schmidli et al., 2014) or elastic priors (Jiang et al., 2023) have been proposed. These methods rely on an exchangeability assumption for the control parameters, which relates to A5. Other Bayesian methods have been proposed (Neuenschwander et al., 2009; Bennett et al., 2021; Wei et al., 2024, among others). These methods, however, do not allow for the use of baseline covariates, and it is not clear what estimands they target.
In this paper, to estimate the cATE, we propose estimators based on outcome regression (OR) and inverse probability weighting (IPW), and doubly robust estimators.
5 Estimation of the cATE
To build intuition, we start by introducing outcome regression (OR) and inverse probability weighting (IPW) estimators for the cATE, considering treatment availability as a deterministic function of entry time. Since OR and IPW estimators are not robust to model misspecification, we then propose doubly robust (DR) estimators. To simplify notation, the following sections assume there are only two arms, a treatment and the shared control. Furthermore, we assume that the treatment is only available for patients who entered the trial after some fixed time.
5.1 Estimators based on parametric outcome regression
Based on the identification results presented in Theorem 1, eq. (2), the conditional mean of the outcome among concurrent units can be modelled as
where (oc) stands for only-concurrent. Based on the identification results presented in Theorem 1, eq. (3), the conditional mean of the outcome among all controls can be modelled as
Estimates of the regression coefficients can then be obtained by maximum likelihood estimation, i.e., ordinary least squares, using only the concurrent controls for the only-concurrent model and all concurrent and non-concurrent controls for the pooled model. Given consistent estimators of these coefficients, we then propose
as an outcome regression estimator for the cATE. Under Theorem 1, eq. (3), we propose the alternative outcome regression estimator for the cATE,
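The two outcome regression strategies can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact estimators: it uses one baseline covariate, linear working models in the covariate and entry time, a treatment that becomes available after a hypothetical cut-off of 0.5, and the variable names `y`, `a`, `w`, `t`, `v`, which are introduced here for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for a design matrix X (intercept included)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def or_estimators(y, a, w, t, v):
    """Return (only-concurrent, all-controls) outcome regression estimates.

    y: outcome, a: 0/1 treatment, w: baseline covariate, t: entry time,
    v: 0/1 indicator that the treatment arm was available at entry.
    """
    X = np.column_stack([np.ones_like(w), w, t])
    conc = v == 1
    beta_treat = fit_ols(X[conc & (a == 1)], y[conc & (a == 1)])
    beta_oc = fit_ols(X[conc & (a == 0)], y[conc & (a == 0)])  # concurrent controls only
    beta_ac = fit_ols(X[a == 0], y[a == 0])                    # all controls, pooled
    mu1 = X[conc] @ beta_treat
    # Both estimators average predicted contrasts over concurrent units only.
    est_oc = np.mean(mu1 - X[conc] @ beta_oc)
    est_ac = np.mean(mu1 - X[conc] @ beta_ac)
    return est_oc, est_ac


# Hypothetical noise-free, linear data: both estimators recover the effect of 3.
rng = np.random.default_rng(0)
n = 400
t = rng.uniform(0, 1, n)
w = rng.normal(size=n)
v = (t > 0.5).astype(int)          # assumed cut-off for availability
a = v * rng.integers(0, 2, n)      # randomize only when the arm is available
y = 1 + 2 * w + 0.5 * t + 3 * a
est_oc, est_ac = or_estimators(y, a, w, t, v)
print(est_oc, est_ac)
```

The only difference between the two estimates is which controls are used to fit the control regression; the averaging population is the concurrent one in both cases.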
Large sample properties.
5.2 An estimator based on parametric inverse probability weighting
Following standard procedures, we model the conditional probability of treatment assignment given and among only by using a logistic regression model,
where . We obtain an estimate of by using maximum likelihood estimation, only among the concurrent controls . We then propose,
where , , and for clarity.
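A weighting estimator restricted to concurrent units can be sketched as below. Note the simplifications, which are assumptions of this sketch rather than the procedure described above: the assignment probability `pi` is taken as known by design instead of being estimated with a logistic regression, and the weights are normalized (a Hajek-type form). The function and argument names are hypothetical.

```python
import numpy as np


def ipw_concurrent(y, a, v, pi):
    """Normalized inverse-probability-weighted contrast among units with v == 1.

    pi: probability of assignment to the treatment arm when it is available,
        either a scalar or an array with one value per unit.
    """
    conc = v == 1
    y_c, a_c = y[conc], a[conc]
    pi_c = np.broadcast_to(pi, y.shape)[conc]
    w1 = a_c / pi_c                  # weights for treated concurrent units
    w0 = (1 - a_c) / (1 - pi_c)      # weights for concurrent controls
    return np.sum(w1 * y_c) / np.sum(w1) - np.sum(w0 * y_c) / np.sum(w0)


# Tiny hypothetical dataset: the last unit is a non-concurrent control
# and is ignored by the estimator.
y = np.array([5.0, 5.0, 2.0, 2.0, 9.0])
a = np.array([1, 1, 0, 0, 0])
v = np.array([1, 1, 1, 1, 0])
ipw_estimate = ipw_concurrent(y, a, v, pi=0.5)
print(ipw_estimate)
```

Because only concurrent units enter the weighted means, non-concurrent controls play no role in this estimator.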
Large sample properties.
5.3 Doubly robust estimators
Doubly robust (DR) estimators for average treatment effects provide consistent estimates by combining outcome regression and IPW. To derive DR estimators of the cATE, we follow the standard practice of constructing them based on efficient influence functions (EIFs) (Bickel et al., 1993; Fisher & Kennedy, 2021; Hines et al., 2022; Kennedy, 2022). Influence functions are a core component of classical statistical theory. They aid in constructing estimators with desirable properties such as double robustness, asymptotic normality, and fast rates of convergence. Additionally, they enable the incorporation of machine learning algorithms while preserving valid statistical inferences and providing insights into statistical efficiency, i.e., the best achievable performance for estimating an estimand. We provide efficiency considerations for the proposed estimators in Section 6. The next theorem provides these EIFs.
Theorem 3.
These influence functions suggest the following estimators,
where, , and , can be estimated by using parametric and machine learning methods. Building on the results from the previous sections, we can leverage the linear regression models introduced earlier: , and as outcome models. Similarly, the logistic regression models and introduced previously for and , respectively, can also be employed.
Large sample properties.
Note that if treatment availability is a deterministic function of entry time, the two EIFs (4) and (5) are the same. In addition, the first influence function in Theorem 3 boils down to the standard influence function for the average treatment effect in the concurrent population. Therefore, it inherits the standard analysis of the one-step estimator for average treatment effects as discussed in Kennedy et al. (2021, Section 4.1). A similar analysis can be conducted when treatment availability is a stochastic function of entry time. In summary, if the outcome models are correctly specified, it can be shown that the proposed DR estimators are root-n consistent, asymptotically normal with asymptotically valid 95% confidence intervals available in closed form, and efficient in the local asymptotic minimax sense. If the models are misspecified, the confidence intervals will be conservative (Kennedy, 2022).
Double robustness.
Similarly, if treatment availability is a deterministic function of entry time, the two estimators boil down to the standard DR estimator for the average treatment effect in the concurrent population and therefore inherit the same double robustness property. This means that if either the outcome regression model or the treatment assignment model is correctly specified (in a parametric sense), then the DR estimator is consistent; see Section 4.2 of Kennedy (2022) for details. Recall that our proposed estimators are doubly robust due to randomization, i.e., the treatment assignment mechanism is known by design. We provide empirical results on this property in our simulations in Section 7. In addition, it can be shown that if treatment availability is a stochastic function of entry time, the same double robustness property holds, provided the model for treatment availability is correctly specified.
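The double robustness property can be illustrated with a standard augmented IPW (one-step) estimator applied to concurrent units. This is a sketch under assumptions, not the paper's exact estimator: availability is a deterministic function of entry time (hypothetical cut-off 0.5), the randomization probability `pi` is known, working outcome models are linear, and all names and data-generating values are invented for illustration.

```python
import numpy as np


def aipw_concurrent(y, a, X, v, pi, use_all_controls=False):
    """One-step (AIPW) contrast among concurrent units with known pi.

    X is the design matrix of the working outcome models (intercept included).
    With use_all_controls=True the control regression is fit on pooled controls.
    Returns the point estimate and an influence-function-based standard error.
    """
    conc = v == 1
    treated = conc & (a == 1)
    controls = (a == 0) if use_all_controls else conc & (a == 0)
    b1, *_ = np.linalg.lstsq(X[treated], y[treated], rcond=None)
    b0, *_ = np.linalg.lstsq(X[controls], y[controls], rcond=None)
    m1, m0 = X[conc] @ b1, X[conc] @ b0
    y_c, a_c = y[conc], a[conc]
    phi = m1 - m0 + a_c * (y_c - m1) / pi - (1 - a_c) * (y_c - m0) / (1 - pi)
    return phi.mean(), phi.std(ddof=1) / np.sqrt(phi.size)


# Hypothetical data with a true effect of 1 and a strongly prognostic covariate.
rng = np.random.default_rng(1)
n = 20_000
t = rng.uniform(0, 1, n)
w = rng.normal(size=n)
v = (t > 0.5).astype(int)
a = v * rng.binomial(1, 0.5, n)
y = 1 + 2 * w + t + 1.0 * a + rng.normal(size=n)

X_correct = np.column_stack([np.ones(n), w, t])   # correctly specified outcome model
X_intercept = np.ones((n, 1))                     # misspecified: intercept only
est_ok, se_ok = aipw_concurrent(y, a, X_correct, v, pi=0.5)
est_mis, se_mis = aipw_concurrent(y, a, X_intercept, v, pi=0.5)
print(est_ok, se_ok, est_mis, se_mis)
```

Both estimates should be close to the true effect because the assignment probability is correct in either case, while the version that adjusts for the prognostic covariate has the smaller standard error.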
6 Efficiency considerations
Estimators based on outcome regression.
As shown in the appendix, the influence function of the conditional expectation under control, , depends on two components: the influence function of itself and that of the regression coefficients. Here, depending on the estimand under study. Therefore, a more precise estimation of the regression coefficients (the variance of the estimated regression coefficient is inversely proportional to the sample size), translate to a more precise estimation of and consequently of compared to .
Doubly robust estimators.
As previously discussed, if treatment availability is a deterministic function of entry time, the two EIFs presented in Section 5.3 are the same. In this case, efficiency gains come solely from a better fit of the control outcome regression, which under assumption A5 is the same whether it is defined among concurrent or pooled controls. It then becomes purely a matter of getting this regression right, and these efficiency gains do not show up in the first-order analysis of the estimator. In addition, as mentioned above, efficiency gains can be obtained by leveraging prognostic variables, as discussed in Colantuoni & Rosenblum (2015) and Benkeser et al. (2021). We show some empirical results in our simulations in Section 7. Interestingly, efficiency gains can also be achieved when using doubly robust estimators under Model (1) when treatment availability is a non-deterministic function of entry time, as shown in our simulation setting. Under this scenario, we expect to see efficiency gains also when using an estimator based solely on inverse probability weighting.
7 Simulations
In this section, we evaluate the performance of the proposed estimators with respect to bias squared, variance, mean square error, and coverage of the 95% confidence interval, across levels of the percentage of concurrent controls and model misspecification, when estimating the cATE. We do not compare our proposed estimators with the methods described in Section 4, because it is not clear whether they target the cATE.
Finally, our simulations aim to showcase the theoretical properties previously discussed, rather than evaluating them under complex real-world scenarios.
7.1 Setup
Aims
To evaluate the performance and gains in efficiency of our proposed estimators across levels of (1) percentage of concurrent controls (90% to 10%) and (2) model misspecification (correct outcome and treatment models; and misspecified outcome and correct treatment model) considering a deterministic function of . In addition, we also evaluate efficiency gains by comparing the estimated variance of the outcome regression (Section 5.1) and doubly robust (Section 5.3) estimators that only use concurrent data compared with those that use all data considering both a deterministic and a stochastic function of .
Data-generating mechanisms
We considered generating data from ModelΒ (1). Specifically, we considered a sample size of and for each subject , we simulated the following data:
- Step 1. the entry time and a baseline covariate;
- Step 2. an indicator of whether the treatment was available at the entry time, as a deterministic function of the entry time being less than a threshold describing the level of the percentage of concurrent controls;
- Step 3. a binary treatment, randomized when the treatment is available and set to control otherwise (participants for whom only control is available);
- Step 4. two counterfactual outcomes and the observed outcome. Since we consider a homogeneous treatment effect, the cATE and the ATE coincide.
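The four steps above can be sketched as a data-generating function. The distributions, coefficients, and the 1:1 randomization ratio below are placeholders chosen for illustration and are not the values used in the paper; only the structure (entry time, covariate, deterministic availability, randomization when available, counterfactual and observed outcomes) follows the steps listed.

```python
import numpy as np


def simulate_platform_trial(n, frac_concurrent, effect=1.0, seed=0):
    """Simulate one dataset following Steps 1 to 4 (placeholder distributions)."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 1.0, n)                  # Step 1: entry time (assumed uniform)
    w = rng.normal(0.5 * t, 1.0)                  # Step 1: covariate drifting with entry time
    v = (t < frac_concurrent).astype(int)         # Step 2: deterministic availability
    a = v * rng.binomial(1, 0.5, n)               # Step 3: randomize only when available
    y0 = 1.0 + 2.0 * w + t + rng.normal(size=n)   # Step 4: counterfactual under control
    y1 = y0 + effect                              # Step 4: homogeneous treatment effect
    y = np.where(a == 1, y1, y0)                  # observed outcome
    return {"t": t, "w": w, "v": v, "a": a, "y0": y0, "y1": y1, "y": y}


data = simulate_platform_trial(n=1000, frac_concurrent=0.3)
print(data["v"].mean(), data["a"].mean())
```

With uniform entry times on [0, 1], the threshold in Step 2 equals the expected fraction of concurrent units, which is how the percentage of concurrent controls can be varied across scenarios in this sketch.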
Estimands
The estimand of interest is the cATE.
Methods
For each dataset across levels of percentage of concurrent controls, and misspecification we used the methods summarized in Table 1.
Performance metrics
Bias squared, variance, mean square error (MSE), and coverage of the 95% confidence interval. In addition, we also considered the ratio of the estimated variances.
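These metrics can be computed from the Monte Carlo replicates as in the following sketch. The function name, the argument names, and the use of a normal-approximation interval with critical value 1.96 are assumptions made here for illustration.

```python
import numpy as np


def performance(estimates, std_errors, truth, z=1.96):
    """Monte Carlo performance metrics for a single estimator."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    bias_sq = (est.mean() - truth) ** 2
    variance = est.var()                  # Monte Carlo variance (ddof=0)
    mse = np.mean((est - truth) ** 2)     # equals bias_sq + variance
    covered = (est - z * se <= truth) & (truth <= est + z * se)
    return {"bias_sq": bias_sq, "variance": variance,
            "mse": mse, "coverage": covered.mean()}


# Three hypothetical replicates with true value 2.
metrics = performance([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], truth=2.0)
print(metrics)
```

Using the population form of the variance makes the decomposition of the MSE into bias squared plus variance hold exactly within each scenario.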
Scenarios
We considered levels of percentage of concurrent controls between 10% and 90% in steps of 10%. Misspecified models were set to include only an intercept, not controlling for any covariates or entry time.
Table 1: Methods and acronyms.
Method | Acronym
Outcome regression using only concurrent data (Section 5.1) | OR-oc
Outcome regression using all data (Section 5.1) | OR-ac
Weighting using only concurrent data (Section 5.2) | IPW
Doubly robust using only concurrent data (Section 5.3) | DR-oc
Doubly robust using all data (Section 5.3) | DR-ac
7.2 Results
7.2.1 Bias, variance, MSE, and coverage
Figures 3 and 4 show bias squared, variance, MSE, and coverage of the 95% confidence intervals in estimating the cATE across percentages of concurrent controls when both models are correct and when only the treatment model is correctly specified, respectively. When both the outcome and the treatment models are correct (Figure 3), bias squared is negligible across levels of concurrent controls for all methods. Variance increases with decreasing levels of concurrent controls across all methods, with the variance of OR-ac being smaller than that of OR-oc, suggesting a gain in efficiency (more on this in the next section). Similar behavior can be seen for the MSE. Finally, all methods achieve desirable coverage levels. When the outcome model is misspecified (Figure 4), both estimators based on outcome regression show bias while maintaining a relatively small variance. MSE is consequently dominated by bias. The DR estimators and the IPW estimator maintain negligible levels of bias and relatively small variance.
7.2.2 Efficiency gains
Figure 5 shows the ratio of the estimated standard errors of DR-oc over DR-ac and OR-oc over OR-ac across levels of concurrent controls and misspecification.
The top panels of Figure 5 follow Model (1) where is a deterministic function of . As discussed in Section 6, estimators based on outcome regression that use all controls show a gain in efficiency, whereas the DR estimators do not under correct models.
The bottom panels of Figure 5 follow Model (1) where is not a deterministic function of . Specifically, we generated , where and , and is the threshold discussed in Step 2 above. Under correct models (including the one for ), the entry time is a prognostic variable of and therefore improves efficiency compared with using only concurrent controls (bottom panels of Figure 5; DR-oc/DR-ac). These results suggest that efficiency gains can be obtained when using a doubly robust estimator under Model (1) when is not a deterministic function of and the model for is correct.
Summary of results.
Methods based on outcome regression improve efficiency when using non-concurrent controls. However, they introduce bias when the outcome model is misspecified. In contrast, doubly robust estimators provide consistent estimates with relatively small variance when either the treatment or the outcome model is correctly specified. In addition, doubly robust estimators have the potential to further improve efficiency when is not a deterministic function of under Model (1).
8 Practical considerations
What estimand should we target?
In this paper, we introduce the cATE, whereas it is not clear what estimand the current related literature targets. Given the more stringent and untestable assumptions needed to identify the ATE, we suggest targeting the cATE. Note that the cATE and the ATE coincide under the assumption of a homogeneous treatment effect. While this is true in theory, we would not expect the assumption to hold in practical settings, which leads to different results, as shown in our case study in the next section.
Should we pool concurrent and non-concurrent controls? Testing assumption A5.
To evaluate assumption A5 under Model (1), we propose using the method introduced by Luedtke et al. (2019). Specifically, following the notation of the original paper, we suggest setting and , where is the observed data, and and are elements of the space of univariate bounded real-valued measurable functions defined on the support of the distribution (Luedtke et al., 2019). We show an application of this testing procedure in our case study. Note that if is a deterministic function of , as in our case study, this is a test on the statistical models, i.e., of whether the projections on the linear model differ, and not a test on the true expectations. In other words, it is a test for model misspecification, and it can therefore guide analysts in choosing whether or not to use non-concurrent controls.
Should we leverage prognostic baseline variables for additional precision?
Recent literature suggests that incorporating baseline prognostic variables can improve the precision of estimates (Colantuoni & Rosenblum, 2015). We propose following this approach by appropriately controlling for these variables in the analysis. This may explain the increased precision observed with the DR-oc and OR-oc estimators compared to the naive estimator in our case study (presented in the next section), despite their being computed within the concurrent population only.
What estimator should we use?
Under Model (1) and assuming is a deterministic function of , to estimate the cATE, we recommend using DR-oc, the doubly robust estimator using only concurrent data (Section 5.3). We recommend this estimator because: 1) it has the same efficiency as the DR estimator that uses non-concurrent controls; 2) it does not require any additional assumptions, thus better aligning with the FDA recommendations (FDA, 2021, Section A.5.1); 3) it accommodates covariates that, if prognostic, can be leveraged to improve efficiency (Colantuoni & Rosenblum, 2015); and 4) it is doubly robust, meaning that it is consistent when either the treatment assignment model or the outcome model is correctly specified, a property we obtain "for free" in platform trials due to randomization. If instead is a stochastic function of and an analyst chooses to leverage non-concurrent controls, then the DR-ac estimator is recommended.
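To make the recommendation concrete, the following Python sketch implements a generic augmented inverse probability weighted (AIPW) estimator restricted to concurrent units, with a standard error based on the empirical variance of the estimated influence function. It is an illustration under stated assumptions and not a transcription of the estimator in Section 5.3: the variable names (`y`, `a`, `X`, `c`), the linear outcome models, the use of the observed treated fraction as the randomization probability, and the simulated data are all hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares coefficients for a linear model with intercept."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def ols_predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

def aipw_concurrent(y, a, X, c):
    """AIPW estimate of the treatment effect among concurrent units (c == 1)."""
    keep = c == 1
    y, a, X = y[keep], a[keep], X[keep]
    pi = a.mean()                                   # assumed constant randomization probability
    m1 = ols_predict(ols_fit(X[a == 1], y[a == 1]), X)   # outcome model, treated arm
    m0 = ols_predict(ols_fit(X[a == 0], y[a == 0]), X)   # outcome model, shared control arm
    phi = m1 - m0 + a / pi * (y - m1) - (1 - a) / (1 - pi) * (y - m0)
    return phi.mean(), phi.std(ddof=1) / np.sqrt(len(phi))

# Hypothetical data: the arm opens at entry time 0.5, with 1:1 randomization afterwards.
rng = np.random.default_rng(2024)
n = 4000
entry = rng.uniform(size=n)                         # normalized entry time
c = (entry > 0.5).astype(int)                       # concurrent indicator
x = rng.normal(size=(n, 2))                         # prognostic baseline covariates
a = np.where(c == 1, rng.binomial(1, 0.5, size=n), 0)
y = 1.0 * a + 2.0 * x[:, 0] - x[:, 1] + 0.5 * entry + rng.normal(size=n)

est, se = aipw_concurrent(y, a, np.column_stack([x, entry]), c)

# Naive difference in means among concurrent units, for comparison.
y1, y0 = y[(c == 1) & (a == 1)], y[(c == 1) & (a == 0)]
naive = y1.mean() - y0.mean()
naive_se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
print(f"AIPW: {est:.3f} (SE {se:.3f}); naive: {naive:.3f} (SE {naive_se:.3f})")
```

In this simulated example the gain in precision over the naive estimator comes entirely from adjusting for prognostic covariates, since no non-concurrent controls are used.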
Sample size calculation for a prospective trial with non-concurrent controls.
Our theoretical and methodological results suggest an efficiency gain when including non-concurrent controls with estimators based on regression models. While these results are promising, we suggest conducting a standard sample size calculation as if the non-concurrent control data were not available. At the analysis stage, precision can then be improved by using non-concurrent control data as previously described, with the caveat that the outcome model must be correctly specified.
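As an illustration of such a standard calculation, the sketch below uses the usual normal-approximation formula for a two-arm comparison of means with equal allocation, counting only the treatment arm and its concurrent controls. The function name `n_per_arm` and the numerical inputs are hypothetical; other endpoints or allocation ratios would require a different formula.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided test of a mean difference `delta`
    with common outcome standard deviation `sigma` (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

# Hypothetical inputs: a standardized effect of 0.5 with 80% power at the 5% level.
print(n_per_arm(delta=0.5, sigma=1.0))  # 63 per arm
```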
Multiple comparisons.
Our proposed methods enable the use of standard type I error control procedures, such as Bonferroni or Benjamini-Hochberg corrections, due to the validity of 95% confidence intervals, test statistics, and p-values (demonstrated in previous sections). This allows for straightforward application of these corrections in platform trials with, for instance, pre-planned interim analyses, and multiple primary endpoints.
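For reference, both corrections can be applied directly to the Wald p-values. The sketch below computes Bonferroni-adjusted p-values, which control the family-wise error rate, and Benjamini-Hochberg-adjusted p-values, which control the false discovery rate. The input p-values are hypothetical.

```python
import numpy as np

def bonferroni(pvals):
    """Bonferroni-adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    return np.minimum(p * p.size, 1.0)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

pvals = [0.01, 0.04, 0.03, 0.20]   # hypothetical p-values from four comparisons
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```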
Summary of practical guidelines.
Target the cATE as the estimand of interest. Use DR-oc to obtain consistent estimates of the cATE. To improve efficiency, focus on prognostic baseline covariates rather than relying on non-concurrent controls. If leveraging non-concurrent controls is of interest, use DR-ac. Finally, conduct sample size calculations without considering non-concurrent controls and apply standard multiple comparison adjustments.
9 The Adaptive COVID-19 Treatment Platform Trial
In this section, we apply our proposed estimators to estimate the cATE using data from the Adaptive COVID-19 Treatment Trial (ACTT) (Kalil et al., 2021). This was a platform trial that investigated treatments for hospitalized adult patients with COVID-19 pneumonia. The trial comprised multiple stages, as illustrated in Figure 1. The initial stage, ACTT-1, assessed the effectiveness of remdesivir alone compared to placebo. Subsequently, in the second stage (ACTT-2), the placebo was phased out, and a novel treatment, combining remdesivir with baricitinib, was introduced. Participants were then randomized to receive either remdesivir alone or the combination therapy of remdesivir and baricitinib. Data were accessed using the NIAID Clinical Trials Data Repository (https://data.niaid.nih.gov/). We have a Data User Agreement in place for its use.
Study population and endpoint of interest.
We considered the combined participants of ACTT-1 and ACTT-2 as our study population. We followed the inclusion and exclusion criteria of the original study. The final study population comprised 1,379 participants, 541 from ACTT-1 and 1,033 from ACTT-2. We considered the time to recovery in days as our endpoint of interest.
Treatments under study and targeted causal estimand.
We considered two treatment arms: remdesivir alone (which served as the shared control arm) and remdesivir plus baricitinib. Our target estimand was the cATE of remdesivir plus baricitinib compared with the remdesivir alone shared arm.
Baseline covariates.
We considered the following baseline covariates: age, sex assigned at birth (female, male), race (White, Black, Asian, Other: American Indian or Alaska Native, Native Hawaiian or Other Pacific Islander, Multiple), ethnicity (Hispanic or Latino, Not Hispanic or Latino), BMI, geographic region of study site (Asia, Europe, North America), disease severity stratum (mild, severe), duration of symptoms, and having any of these comorbidities: hypertension, coronary artery disease, congestive heart disease, chronic oxygen requirement, chronic respiratory disease, chronic liver disease, chronic kidney disease, diabetes type I, diabetes type II, obesity, cancer, immune deficiency, and asthma, in addition to the entry time, which we normalized to be between 0 and 1.
Models setup.
We computed OR-oc, OR-ac, OR-ad, IPW, DR-oc and DR-ac (Table 1) by using linear and logistic regression models. We computed the naive estimator by taking the average difference in the endpoint between the two arms among only concurrent participants. Variances were obtained by using the sandwich estimator (for naive, OR-oc, OR-ac, IPW) and by taking the variance of the efficient influence function (for DR-oc and DR-ac). Wald 95% confidence intervals and Wald tests were constructed.
Results.
Table 2 shows the point estimates of the cATE, standard errors, 95% confidence intervals, and p-values. The naive estimate was -1.33 with a standard error of 0.58. This suggests that baricitinib plus remdesivir was superior to remdesivir alone in reducing recovery time, as in the original trial (Kalil et al., 2021). OR-oc, IPW, DR-oc and DR-ac improved precision while maintaining a similar point estimate. OR-ac improved precision the most (around 28% improvement compared with the naive estimator); however, it resulted in a different point estimate, -0.75, which led to a non-significant result. This suggests that the outcome model used to obtain OR-ac might be misspecified and therefore that assumption A5 does not hold. Using an omnibus test as described in our practical guidelines in Section 8, we obtained a p-value using both the variance and the eigenvalue approach (Luedtke et al., 2019), which supports rejecting assumption A5 and therefore suggests not using non-concurrent controls. In contrast, doubly robust estimators improved precision while maintaining a point estimate similar to that of the naive estimator. We believe the improved precision observed in OR-oc, IPW, and DR-oc compared to the naive estimator stems from appropriately adjusting for baseline variables, as discussed in Colantuoni & Rosenblum (2015) and in the previous section. Note that a conditional analysis, targeting , using a standard linear regression model regressing the outcome in the full population on treatment arm, entry time, and baseline covariates, led to a non-significant point estimate of -0.46 (standard error equal to 0.52).
Method | Estimate | SE | 95% CI | p-value | Ratio |
---|---|---|---|---|---|
OR-oc | -1.29 | 0.47 | (-2.21;-0.37) | 0.01 | 1.22 |
OR-ac | -0.75 | 0.45 | (-1.63;0.13) | 0.10 | 1.28 |
IPW | -1.28 | 0.47 | (-2.20;-0.36) | 0.01 | 1.22 |
DR-oc | -1.30 | 0.47 | (-2.22;-0.38) | 0.01 | 1.22 |
DR-ac | -1.30 | 0.47 | (-2.22;-0.38) | 0.01 | 1.21 |
naive | -1.33 | 0.58 | (-2.47;-0.19) | 0.02 | 1.00 |
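The Wald intervals and p-values in Table 2 can be recovered from the reported point estimates and standard errors. The short Python check below does this using the rounded values from the table; the helper name `wald_summary` is hypothetical, and small discrepancies in the last digit are possible because the inputs are rounded.

```python
from statistics import NormalDist

def wald_summary(estimate, se, level=0.95):
    """Wald confidence interval and two-sided p-value for a point estimate."""
    z = NormalDist()
    crit = z.inv_cdf(0.5 + level / 2)
    lower, upper = estimate - crit * se, estimate + crit * se
    p_value = 2 * (1 - z.cdf(abs(estimate / se)))
    return lower, upper, p_value

# Rounded point estimates and standard errors as reported in Table 2.
table2 = {
    "OR-oc": (-1.29, 0.47),
    "OR-ac": (-0.75, 0.45),
    "IPW": (-1.28, 0.47),
    "DR-oc": (-1.30, 0.47),
    "DR-ac": (-1.30, 0.47),
    "naive": (-1.33, 0.58),
}
for method, (estimate, se) in table2.items():
    lower, upper, p = wald_summary(estimate, se)
    print(f"{method}: ({lower:.2f}; {upper:.2f}), p = {p:.2f}")
```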
10 Conclusion
In this paper, we introduced identification results and estimation techniques to identify and estimate concurrent average treatment effects in the presence of non-concurrent controls in platform trials. We argue that identifying and estimating the ATE relies on an extrapolation assumption that is both untestable and often too stringent, particularly in the context of platform trials, where multiple, potentially novel treatments or interventions are being evaluated and the outcome mechanism is poorly understood. Therefore, we advocate focusing primarily on the cATE, where assumptions can be tested. By focusing on the cATE rather than the ATE, we also open the door to leveraging non-parametric models based on machine and deep learning techniques for learning outcome and treatment assignment mechanisms under the proposed doubly robust estimators (Kennedy, 2022; Díaz, 2020; Hirshberg & Wager, 2021). Indeed, while these methods can capture complex data relationships, potentially mitigating model misspecification, they may not be suitable for extrapolation. Furthermore, our proposed doubly robust estimator accommodates Bayesian techniques while retaining valid frequentist properties, as demonstrated in Shin & Antonelli (2023) and Antonelli et al. (2022).
A key takeaway of this paper is to target the cATE in platform trials and, under Model (1), to use a doubly robust estimator only among concurrent units, prioritizing the identification of strong prognostic baseline variables rather than relying on non-concurrent controls, especially when is a deterministic function of .
In this paper, we presented results that can be used for continuous and binary endpoints. Estimators can be constructed for time-to-event endpoints under the non-parametric causal model introduced in eq. (1).
Finally, in this paper, we demonstrate results assuming a structural equation model where treatment assignment may depend on baseline covariates; however, similar identification and estimation results can be obtained without baseline covariates.
References
- Angus et al. (2020) Angus, D. C., Derde, L., Al-Beidh, F., Annane, D., Arabi, Y., Beane, A., van Bentum-Puijk, W., Berry, L., Bhimani, Z., Bonten, M. et al. (2020), "Effect of hydrocortisone on mortality and organ support in patients with severe covid-19: the remap-cap covid-19 corticosteroid domain randomized clinical trial", JAMA 324(13), 1317-1329.
- Antonelli et al. (2022) Antonelli, J., Papadogeorgou, G. & Dominici, F. (2022), "Causal inference in high dimensions: a marriage between bayesian modeling and good frequentist properties", Biometrics 78(1), 100-114.
- Banbeta et al. (2019) Banbeta, A., Rosmalen, J., Dejardin, D. & Lesaffre, E. (2019), "Modified power prior with multiple historical trials for binary endpoints", Stat Med. 38. https://doi.org/10.1002/sim.8019
- Barker et al. (2009) Barker, A., Sigman, C., Kelloff, G. J., Hylton, N., Berry, D. A. & Esserman, L. (2009), "I-spy 2: an adaptive breast cancer trial design in the setting of neoadjuvant chemotherapy", Clinical Pharmacology & Therapeutics 86(1), 97-100.
- Benkeser et al. (2021) Benkeser, D., Díaz, I., Luedtke, A., Segal, J., Scharfstein, D. & Rosenblum, M. (2021), "Improving precision and power in randomized trials for covid-19 treatments using covariate adjustment, for binary, ordinal, and time-to-event outcomes", Biometrics 77(4), 1467-1481.
- Bennett et al. (2021) Bennett, M., White, S., Best, N. & Mander, A. (2021), "A novel equivalence probability weighted power prior for using historical control data in an adaptive clinical trial design: A comparison to standard methods", Pharm Stat. 20. https://doi.org/10.1002/pst.2088
- Berry et al. (2015) Berry, S. M., Connor, J. T. & Lewis, R. J. (2015), "The platform trial: an efficient strategy for evaluating multiple treatments", JAMA 313(16), 1619-1620.
- Bickel et al. (1993) Bickel, P. J., Klaassen, C. A., Ritov, Y. & Wellner, J. A. (1993), Efficient and adaptive estimation for semiparametric models, Vol. 4, Springer.
- Bofill Roig et al. (2023) Bofill Roig, M., Burgwinkel, C., Garczarek, U., Koenig, F., Posch, M., Nguyen, Q. & Hees, K. (2023), "On the use of non-concurrent controls in platform trials: a scoping review", Trials 24(1), 1-17.
- Bofill Roig, König, Meyer & Posch (2022) Bofill Roig, M., König, F., Meyer, E. & Posch, M. (2022), "Commentary: Two approaches to analyze platform trials incorporating non-concurrent controls with a common assumption", Clin Trials. 19. https://doi.org/10.1177/17407745221112016
- Bofill Roig, Krotka, Burman, Glimm, Gold & Hees (2022) Bofill Roig, M., Krotka, P., Burman, C. F., Glimm, E., Gold, S. M. & Hees, K. (2022), "On model-based time trend adjustments in platform trials with non-concurrent controls", BMC Med Res Methodol. 22. https://doi.org/10.1186/s12874-022-01683-w
- Boos & Stefanski (2013) Boos, D. D. & Stefanski, L. A. (2013), Essential statistical inference: theory and methods, Vol. 591, Springer.
- Broglio (2018) Broglio, K. (2018), "Randomization in clinical trials: permuted blocks and stratification", JAMA 319(21), 2223-2224.
- Chen et al. (2020) Chen, W. C., Wang, C., Li, H., Lu, N., Tiwari, R. & Xu, Y. (2020), "Propensity score-integrated composite likelihood approach for augmenting the control arm of a randomized controlled trial by incorporating real-world data", J Biopharm Stat. 30. https://doi.org/10.1080/10543406.2020.1730877
- Colantuoni & Rosenblum (2015) Colantuoni, E. & Rosenblum, M. (2015), "Leveraging prognostic baseline variables to gain precision in randomized trials", Statistics in medicine 34(18), 2602-2617.
- Collignon et al. (2022) Collignon, O., Schiel, A., Burman, C.-F., Rufibach, K., Posch, M. & Bretz, F. (2022), "Estimands and complex innovative designs", Clinical Pharmacology & Therapeutics 112(6), 1183-1190.
- Collignon et al. (2020) Collignon, O., Schritz, A., Senn, S. J. & Spezia, R. (2020), "Clustered allocation as a way of understanding historical controls: Components of variation and regulatory considerations", Stat Methods Med Res. 29. https://doi.org/10.1177/0962280219880213
- Díaz (2020) Díaz, I. (2020), "Machine learning in the estimation of causal effects: targeted minimum loss-based estimation and double/debiased machine learning", Biostatistics 21(2), 353-358.
- FDA (2021) FDA (2021), "E9 (r1) statistical principles for clinical trials: addendum: estimands and sensitivity analysis in clinical trials", Guidance for Industry. https://shorturl.at/clJL5
- Fisher & Kennedy (2021) Fisher, A. & Kennedy, E. H. (2021), "Visually communicating and teaching intuition for influence functions", The American Statistician 75(2), 162-172.
- Foltynie et al. (2023) Foltynie, T., Gandhi, S., Gonzalez-Robles, C., Zeissler, M.-L., Mills, G., Barker, R., Carpenter, J., Schrag, A., Schapira, A., Bandmann, O. et al. (2023), "Towards a multi-arm multi-stage platform trial of disease modifying approaches in parkinson's disease", Brain 146(7), 2717-2722.
- Gravestock et al. (2017) Gravestock, I., Held, L. & consortium, C.-N. (2017), "Adaptive power priors with empirical bayes for clinical trials", Pharmaceutical statistics 16(5), 349-360.
- Han et al. (2017) Han, B., Zhan, J., John Zhong, Z., Liu, D. & Lindborg, S. (2017), "Covariate-adjusted borrowing of historical control data in randomized clinical trials", Pharm Stat. 16. https://doi.org/10.1002/pst.1815
- Hayward et al. (2021) Hayward, G., Butler, C. C., Yu, L.-M., Saville, B. R., Berry, N., Dorward, J., Gbinigie, O., Van Hecke, O., Ogburn, E., Swayze, H. et al. (2021), "Platform randomised trial of interventions against covid-19 in older people (principle): protocol for a randomised, controlled, open-label, adaptive platform, trial of community treatment of covid-19 syndromic illness in people at higher risk", BMJ open 11(6), e046799.
- Hines et al. (2022) Hines, O., Dukes, O., Diaz-Ordaz, K. & Vansteelandt, S. (2022), "Demystifying statistical learning based on efficient influence functions", The American Statistician 76(3), 292-304.
- Hirshberg & Wager (2021) Hirshberg, D. A. & Wager, S. (2021), "Augmented minimax linear estimation", The Annals of Statistics 49(6), 3206-3227.
- Hobbs et al. (2011) Hobbs, B. P., Carlin, B. P., Mandrekar, S. J. & Sargent, D. J. (2011), "Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials", Biometrics. 67. https://doi.org/10.1111/j.1541-0420.2011.01564.x
- Hupf et al. (2021) Hupf, B., Bunn, V., Lin, J. & Dong, C. (2021), "Bayesian semiparametric meta-analytic-predictive prior for historical control borrowing in clinical trials", Stat Med. 40. https://doi.org/10.1002/sim.8970
- Ibrahim & Chen (2000) Ibrahim, J. G. & Chen, M. H. (2000), "Power prior distributions for regression models", Stat Sci. 15. https://doi.org/10.1214/ss/1009212673
- International Council for Harmonisation (2017) International Council for Harmonisation (2017), "Ich harmonised guideline e9 (r1): Estimands and sensitivity analysis in clinical trials". https://shorturl.at/arQT6
- Jiang et al. (2023) Jiang, L., Nie, L. & Yuan, Y. (2023), "Elastic priors to dynamically borrow information from historical data in clinical trials", Biometrics 79(1), 49-60.
- Kalil et al. (2021) Kalil, A. C., Patterson, T. F., Mehta, A. K., Tomashek, K. M., Wolfe, C. R., Ghazaryan, V., Marconi, V. C., Ruiz-Palacios, G. M., Hsieh, L., Kline, S. et al. (2021), "Baricitinib plus remdesivir for hospitalized adults with covid-19", New England Journal of Medicine 384(9), 795-807.
- Kennedy (2022) Kennedy, E. H. (2022), "Semiparametric doubly robust targeted double machine learning: a review", arXiv preprint arXiv:2203.06469.
- Kennedy et al. (2021) Kennedy, E. H., Balakrishnan, S. & Wasserman, L. (2021), "Semiparametric counterfactual density estimation", arXiv preprint arXiv:2102.12034.
- Koenig et al. (2024) Koenig, F., Spiertz, C., Millar, D., Rodríguez-Navarro, S., Machín, N., Van Dessel, A., Genescà, J., Pericàs, J. M., Posch, M., Sánchez-Montalva, A. et al. (2024), "Current state-of-the-art and gaps in platform trials: 10 things you should know, insights from eu-pearl", Eclinicalmedicine 67.
- Lee & Wason (2020) Lee, K. M. & Wason, J. (2020), "Including non-concurrent control patients in the analysis of platform trials: is it worth it?", BMC medical research methodology 20(1), 1-12.
- Luedtke et al. (2019) Luedtke, A., Carone, M. & van der Laan, M. J. (2019), "An omnibus non-parametric test of equality in distribution for unknown functions", Journal of the Royal Statistical Society Series B: Statistical Methodology 81(1), 75-99.
- Neuenschwander et al. (2009) Neuenschwander, B., Branson, M. & Spiegelhalter, D. J. (2009), "A note on the power prior", Stat Med. 28. https://doi.org/10.1002/sim.3722
- Normington et al. (2020) Normington, J., Zhu, J., Mattiello, F., Sarkar, S. & Carlin, B. (2020), "An efficient bayesian platform trial design for borrowing adaptively from historical control data in lymphoma", Contemp Clin Trials. 89. https://doi.org/10.1016/j.cct.2019.105890
- Park et al. (2022) Park, J. J., Detry, M. A., Murthy, S., Guyatt, G. & Mills, E. J. (2022), "How to use and interpret the results of a platform trial: users' guide to the medical literature", JAMA 327(1), 67-74.
- Pearl (1995) Pearl, J. (1995), "Causal diagrams for empirical research", Biometrika 82(4), 669-688.
- Pearl (2010) Pearl, J. (2010), "An introduction to causal inference", The International Journal of Biostatistics 6(2), 7.
- Saville et al. (2022) Saville, B. R., Berry, D. A., Berry, N. S., Viele, K. & Berry, S. M. (2022), "The bayesian time machine: accounting for temporal drift in multi-arm platform trials", Clinical Trials 19(5), 490-501.
- Schmidli et al. (2014) Schmidli, H., Gsteiger, S., Roychoudhury, S., O'Hagan, A., Spiegelhalter, D. & Neuenschwander, B. (2014), "Robust meta-analytic-predictive priors in clinical trials with historical control information", Biometrics 70(4), 1023-1032.
- Schmidli et al. (2020) Schmidli, H., Häring, D. A., Thomas, M., Cassidy, A., Weber, S. & Bretz, F. (2020), "Beyond randomized clinical trials: Use of external controls", Clin Pharmacol Ther. 107. https://doi.org/10.1002/cpt.1723
- Shin & Antonelli (2023) Shin, H. & Antonelli, J. (2023), "Improved inference for doubly robust estimators of heterogeneous treatment effects", Biometrics 79(4), 3140-3152.
- Sridhara et al. (2022) Sridhara, R., Marchenko, O., Jiang, Q., Pazdur, R., Posch, M., Berry, S., Theoret, M., Shen, Y. L., Gwise, T., Hess, L. et al. (2022), "Use of nonconcurrent common control in master protocols in oncology trials: report of an american statistical association biopharmaceutical section open forum discussion", Statistics in Biopharmaceutical Research 14(3), 353-357.
- Viele et al. (2014) Viele, K., Berry, S., Neuenschwander, B., Amzal, B., Chen, F. & Enas, N. (2014), "Use of historical control data for assessing treatment effects in clinical trials", Pharm Stat. 13. https://doi.org/10.1002/pst.1589
- Wei et al. (2024) Wei, W., Blaha, O., Esserman, D., Zelterman, D., Kane, M., Liu, R. & Lin, J. (2024), "A bayesian platform trial design with hybrid control based on multisource exchangeability modelling", Statistics in Medicine 43(12), 2439-2451.
- Wells et al. (2012) Wells, A., Fisher, P., Myers, S., Wheatley, J., Patel, T. & Brewin, C. R. (2012), "Metacognitive therapy in treatment-resistant depression: A platform trial", Behaviour research and therapy 50(6), 367-373.
- Woodcock & LaVange (2017) Woodcock, J. & LaVange, L. M. (2017), "Master protocols to study multiple therapies, multiple diseases, or both", New England Journal of Medicine 377(1), 62-70.
- Yuan et al. (2019) Yuan, J., Liu, J., Zhu, R., Lu, Y. & Palm, U. (2019), "Design of randomized controlled confirmatory trials using historical control data to augment sample size for concurrent controls", J Biopharm Stat. 29. https://doi.org/10.1080/10543406.2018.1559853
SUPPLEMENTARY MATERIAL
Identification and estimation of causal effects using non-concurrent controls in platform trials
Michele Santacatterina,
Federico Macchiavelli Giron,
Xinyi Zhang, and Iván Díaz
Division of Biostatistics, Department of Population Health,
New York University School of Medicine,
New York, NY, 10016
Proofs and M-estimation details
Proof of Theorem 1
(6) | ||||
Since we are interested in the effect of on , and in using non-concurrent controls, , we study paths from to and then apply d-separation. We start by studying paths from to .
By applying d-separation, the set conditionally blocks the path from to . This leads to the following assumptions:
A1. Weak A-ignorability.
Let . Assume
.
A2. Consistency.
Assume
.
A3. Positivity of treatment assignment mechanism among concurrent units.
Assume
for all and s.t. .
A4. Positivity of shared arm assignment mechanism among all controls.
Assume
for all and .
A5. Pooling concurrent and non-concurrent controls.
Assume
for all s.t. .
A6. Conditional exchangeability of outcome mechanism among the treated.
Assume
for all .
Identification of concurrent ATE
Recall that
Definition 3 (Conditional and marginal average treatment effect of treatment arm compared to control among concurrent population).
Identification based on the G-formula.
Proof. We start by showing it for treatment . We refer to as for clarity and write (IE) for iterated expectation.
by (IE) | |||
by (A1) | |||
by (A2,A3) | |||
We now show the proof under treatment 0.
Consequently, under (A1)-(A4), is non-parametrically identified as
(7) |
In addition, under (A1)-(A5), is identified as
(8) |
∎
Identification based on weighting.
Proof. We start by showing it for treatment . We refer to as for clarity, and write (IE) for iterated expectation.
by (IE) | |||
by (A1) | |||
by (A2) | |||
by (A3) | |||
by (IE) | |||
Note that if is deterministic, then and therefore
We now show the proof under treatment 0.
Note that if is deterministic, then and therefore
∎
Proof of Theorem 2
Identification of ATE
Recall that
Definition 4 (Conditional and marginal average treatment effect of treatment arm compared to control).
Identification based on the G-formula.
Proof. We show the proof for treatment . We refer to as for clarity.
by (IE) | ||||
by (A1) | ||||
by (A2,A4) |
The proof for treatment can be shown by following the steps for identifying in the section above and then assuming (A6) to be able to marginalize to concurrent and non-concurrent controls combined. Consequently, under (A1-A6), is identified as
(9) |
∎
M-estimation details
Here we provide details on the M-estimation approach for obtaining asymptotic variances of the outcome regression and weighted estimators. Recall that represents the data for experimental unit , i.e., and consider .
Outcome regression
OR-oc. This estimator considers only concurrent controls. Let us define where and are the mean outcomes under treatment and control in the concurrent-only population. We started by considering controls, and the following estimating equations
where and are the score functions for the model of the conditional mean and the marginal mean under control, respectively. We consider the following Jacobian matrix of the estimating equations,
We then constructed the following influence functions
where were obtained by ordinary least squares. We conducted a similar analysis for . Finally, we obtained the variance of as,
where .
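The mechanics of this construction can be illustrated with a simplified, single-arm version. The Python sketch below stacks the least-squares score for a linear outcome model fitted in one arm with the estimating equation for the marginal mean of its predictions, and returns the sandwich standard error of that mean (Boos & Stefanski, 2013). It is an assumption-laden illustration and not our implementation: the function name `or_mean_sandwich`, the linear model, and the simulated data are hypothetical, and the contrast between two arms would require stacking the corresponding equations for the other arm.

```python
import numpy as np

def or_mean_sandwich(y, a, X, arm):
    """Outcome-regression estimate of the mean outcome under `arm`, averaged over
    all supplied units, with a sandwich (M-estimation) standard error."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    in_arm = (a == arm).astype(float)
    # Solve the least-squares score equations using units in the given arm only.
    gram = (design * in_arm[:, None]).T @ design
    beta = np.linalg.solve(gram, (design * in_arm[:, None]).T @ y)
    pred = design @ beta
    mu = pred.mean()
    # Stacked estimating functions: regression score, then marginal mean.
    psi = np.column_stack([design * (in_arm * (y - pred))[:, None], pred - mu])
    p = design.shape[1]
    A = np.zeros((p + 1, p + 1))          # minus the average Jacobian of psi
    A[:p, :p] = gram / n
    A[p, :p] = -design.mean(axis=0)
    A[p, p] = 1.0
    B = psi.T @ psi / n                   # average outer product of psi
    A_inv = np.linalg.inv(A)
    V = A_inv @ B @ A_inv.T
    return mu, np.sqrt(V[p, p] / n)

# Hypothetical data: mean outcome under control equals 2.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)
y = 2.0 + x + 1.5 * a + rng.normal(size=n)
mu0, se0 = or_mean_sandwich(y, a, x, arm=0)
print(f"mean under control: {mu0:.3f} (SE {se0:.3f})")
```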
OR-ac. This estimator considers both concurrent and non-concurrent controls when estimating . Hence, the analysis for OR-ac looks the same as that for OR-oc, only changing the estimating equation for , i.e., , while the conditional mean of the outcome among the treated remains computed within concurrent units only, i.e., . Specifically, we started by considering controls, and the following estimating equations
where and are the score functions for the model of the conditional mean and the marginal mean under control, respectively. For the treated units, we considered and the following estimating equations
where and . Derivation of the Jacobian matrix of the estimating equations is similar to the above.
Parametric inverse probability weighting
IPW. This estimator considers only concurrent controls. Let us define . We started by considering controls, and the following estimating equations
where and , are the score functions for the model of the conditional probability and the marginal mean under control and treatment, respectively, and where , , and . We consider the following Jacobian matrix of the estimating equations,
We then constructed the following influence functions
where were obtained by ordinary least squares. Finally, we obtained the variance of as,
where .
Proof of Theorem 3
We start by introducing some notation. We introduce an operator , where is a probability distribution assumed to lie in some nonparametric model , that maps functionals to their influence function and where is our observed data. Recall the following building blocks:
-
(bb1)
the influence function of is
-
(bb2)
the influence function of is
-
(bb3)
(product rule)
-
(bb4)
(chain rule)
-
(bb5)
-
(bb6)
.
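For readers less familiar with influence function calculus, the product and chain rules named in (bb3) and (bb4) are commonly written as below. The operator symbol $\mathbb{IF}$, the functionals $\psi_1$, $\psi_2$, $\psi$, and the differentiable function $f$ are notation assumed for this sketch and may differ from the symbols used in the building blocks above. The quotient rule in the last line follows by combining the first two.

```latex
\begin{align*}
\mathbb{IF}(\psi_1 \psi_2) &= \mathbb{IF}(\psi_1)\,\psi_2 + \psi_1\,\mathbb{IF}(\psi_2)
  && \text{(product rule)} \\
\mathbb{IF}\{f(\psi)\} &= f'(\psi)\,\mathbb{IF}(\psi)
  && \text{(chain rule)} \\
\mathbb{IF}\!\left(\frac{\psi_1}{\psi_2}\right) &= \frac{\mathbb{IF}(\psi_1)}{\psi_2}
  - \frac{\psi_1}{\psi_2^{2}}\,\mathbb{IF}(\psi_2)
  && \text{(quotient rule, for } \psi_2 \neq 0\text{)}
\end{align*}
```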
Finally, recall that the parameter of interest (under the aforementioned identification assumption) is
while in the nonparametric model that assumes (A5) is
Theorem 3, eq. (4).
We define and pretend that the data is discrete. Recall that under discrete data
We now analyze the influence function of ,
by (bb3) | |||
by (bb1,bb2) | |||
by (bb5) | |||
by (bb6) |
where in the last equality we also used the fact that under , and . Analogously we can compute the influence function of ,
We now compute the influence function of ,
We now consider the influence function of ,
We can now combine to obtain
Theorem 3, eq. (5).
Under assumption (A5), we now target (among controls),
The influence function of is
As shown before, we can then compute the influence function of , and finally of under assumption (A5), leading to,