Sample Size Calculation
RESEARCH ARTICLE
Abstract: In crossover trials, patients receive two or more treatments in a random order in different periods. The sample size determination is often an important step in planning a crossover study. This paper concerns sample size calculations in 2 × 2 crossover trials, with random patient effects and no interaction between the treatment and the patient, under two scenarios, namely the exact and the large sample size approaches. Simulation was carried out for determining the sample size for both scenarios. For varying parameter values, simulation was used for generating samples of the required size and examining whether the significance level and power of the tests are maintained. The results indicate that when the sample size was ≤ 5, neither method maintained error rates and when the sample size was > 5 and < 12 only the exact approach maintained error rates. However, when the sample size is approximately > 12 both methods maintained error rates. In addition it was found that a saving in sample size can be achieved depending on the extent of the correlation between the observations on the same patient. The simulation results indicate that crossover studies should not be conducted when the anticipated sample size is ≤ 5 and when a sample size of > 5 and < 12 is anticipated, the exact method of determining sample size should be used. When larger sample sizes are anticipated either method can be used but the method based on large sample size approximation is simpler.

Keywords: Crossover trial, exact method, large sample method, sample size calculation, simulation.

INTRODUCTION

Crossover trials are clinical trials in which patients are given all the medications to be studied in a random order. According to Grizzle (1965) these studies are generally conducted on patients with chronic diseases to control their symptoms. The data are analyzed according to the original intention to treat.

Ideally, clinical trials should be large enough to reliably detect the smallest possible difference in the primary outcome with treatments that are considered clinically worthwhile. According to Lee et al. (2005), it is not uncommon for studies to be underpowered, failing to detect even large treatment effects because of inadequate sample size. It is considered unethical to recruit patients for a study that does not have a large enough sample size for the trial to deliver meaningful information on the tested intervention. Thus, sample size should be based on scientific considerations. Several approaches are discussed (Pocock, 1983; Julious & Patterson, 2004) for calculating sample size including the power approach and the confidence interval approach. According to previous studies (Chow et al., 2003; Woodward, 1992), these approaches require the specification of several parameters such as between treatment and within treatment variances for the treatments under consideration, the correlation within patients and the reference improvement, which is required to be detected. Chow et al. (2003) explain the two different approaches for 2 × 2 crossover designs but do not give guidelines on when to use specific approaches. In this study simulation is extensively used to examine the problem of setting guidelines.

In this study, two situations are considered in the calculation of sample size of a crossover study as explained in Chow et al. (2003). These are,
(i) The exact approach
(ii) The large sample (approximate) approach.

Further, the study gives guidelines for when to use the exact approach and the large sample approach and examines how much saving in sample size can be achieved when observations on the same patient are correlated.
*
Corresponding author ([email protected])
78 N. M. Siyasinghe & M.R. Sooriyarachchi
Mixed model used: In clinical trials, it is common to assume that the patients respond consistently to treatments. However, the assumption is invalid if the patients vary randomly in their responses to the drug. For this type of situation, a random subject effects model, where the subject effect is considered to be random and the treatment and period effects are considered to be fixed, has to be considered (Brown & Prescott, 2006).

Chow et al. (2003) have explained how to calculate the sample size in a crossover design using either of the two approaches, namely the exact and the approximate. In this paper a similar 2 × 2 crossover design comparing mean responses for two groups is considered.

In the first approach the test statistic is based on the Student's t distribution, whereas in the second approach the test statistic is based on the normal distribution. In the exact approach, the sample size depends on the degrees of freedom. The calculation of sample size is therefore not straightforward. The same calculation can be done without difficulty if the approximate approach, which is based on the normal approximation, is used. Values of the inverse t distribution function need to be determined for calculating the sample sizes for the exact approach. This is done by using the approximation given in Cooke et al. (1982). The criterion used for determining the method to be used for calculating sample size is which method maintains power and significance level.

Let $Y_{ijk}$ be the response observed from the $j$th ($j = 1, 2, \ldots, n$) subject in the $i$th sequence ($i = 1, 2$) under the $k$th treatment ($k = 1, 2$). The following mixed model is used.

$$Y_{ijk} = \mu + t_k + p_i + s_{ijk} + e_{ijk} \qquad (1)$$

where $\mu$ is the overall mean, $t_k$ is the $k$th treatment effect, $p_i$ is the $i$th sequence (period) effect, $s_{ijk}$ is the random effect of the $j$th subject in the $i$th sequence under the $k$th treatment and $e_{ijk}$ is the error term corresponding to the $j$th subject in the $i$th sequence under the $k$th treatment.

Here, treatment effect and period effect are considered as fixed effects and subject effect as random. In this study equal allocation of patients to treatment groups is assumed and no replication is considered. Then define the following notation.

Here it is assumed that there is no treatment by period interaction, since a simple hypothesis test can be used only under this assumption. The subject effects $(s_{ij1}, s_{ij2})$ are assumed to be independent and identically distributed as bivariate normal random variables with mean 0 and covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_{BT}^2 & \rho\,\sigma_{BT}\sigma_{BR} \\ \rho\,\sigma_{BT}\sigma_{BR} & \sigma_{BR}^2 \end{pmatrix}$$

where $\sigma_{BT}^2$ is the variance between patients for the 'treated' group, $\sigma_{BR}^2$ is the variance between patients for the 'reference' group and $\rho$ is the correlation between subjects in the treated and reference groups.

It is assumed that the errors $e_{ij1}$ and $e_{ij2}$ are such that

$$e_{ij1} \sim \text{iid } N(0, \sigma_{WT}^2), \qquad e_{ij2} \sim \text{iid } N(0, \sigma_{WR}^2)$$

where $\sigma_{WT}^2$ is the within patient variation for the treated group and $\sigma_{WR}^2$ is the within patient variation for the reference group.

Consider a group which gets treatment 1 in the first period and treatment 2 in the second period; then the model can be written as follows,

$$Y_{1j1} = \mu_1 + p_1 + s_{1j1} + e_{1j1} \qquad (3)$$

and

$$Y_{2j2} = \mu_2 + p_2 + s_{2j2} + e_{2j2} \qquad (4)$$

Estimation of the mean and variance of the treatment difference: The method of Chow et al. (2003) explains a procedure to measure the treatment difference of a crossover trial and this section discusses that procedure.

Let $\varepsilon$ be the measure of treatment difference, then $\varepsilon = \mu_1 - \mu_2$ (test − reference). Now take $d_{ij} = y_{ij1} - y_{ij2}$. An unbiased estimate for $\varepsilon$ is given by,

$$\hat{\varepsilon} = \frac{1}{2n}\sum_{i=1}^{2}\sum_{j=1}^{n} d_{ij} \quad \text{and} \quad V[\hat{\varepsilon}] = \frac{\sigma_m^2}{2n}$$

where

$$\sigma_m^2 = \sigma_{BT}^2 + \sigma_{BR}^2 - 2\rho\,\sigma_{BT}\sigma_{BR} + \sigma_{WT}^2 + \sigma_{WR}^2 \qquad (2)$$
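The expression for $\sigma_m^2$ can be checked with a short derivation. The sketch below assumes that the subject effects and the errors are mutually independent, which the text does not state explicitly, and uses the fact that fixed effects change only the mean of $d_{ij}$, not its variance.

```latex
\begin{aligned}
\operatorname{Var}(d_{ij})
  &= \operatorname{Var}(s_{ij1} - s_{ij2}) + \operatorname{Var}(e_{ij1}) + \operatorname{Var}(e_{ij2}) \\
  &= \sigma_{BT}^2 + \sigma_{BR}^2 - 2\rho\,\sigma_{BT}\sigma_{BR} + \sigma_{WT}^2 + \sigma_{WR}^2
   = \sigma_m^2, \\
\operatorname{Var}(\hat{\varepsilon})
  &= \frac{1}{(2n)^2}\sum_{i=1}^{2}\sum_{j=1}^{n}\operatorname{Var}(d_{ij})
   = \frac{\sigma_m^2}{2n}.
\end{aligned}
```

In the special case $\sigma_{BT} = \sigma_{BR} = \sigma_B$ and $\sigma_{WT} = \sigma_{WR} = \sigma_W$, this reduces to $\sigma_m^2 = 2\sigma_B^2(1-\rho) + 2\sigma_W^2$. For example, with $\sigma_B = 3$ and $\sigma_W = 0.3$, $\sigma_m^2$ is 18.18 when $\rho = 0$ and 1.98 when $\rho = 0.9$, which is why a high correlation reduces the required sample size.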
The parameters $\sigma_{BT}^2$, $\sigma_{BR}^2$, $\rho$, $\sigma_{WT}^2$ and $\sigma_{WR}^2$, which determine $\sigma_m^2$, are specified for the sample size calculation at the design stage. Thus it is required to find an unbiased estimate for use in the test statistic at the analysis stage. An unbiased estimate for $\sigma_m^2$ can be given by,

$$\hat{\sigma}_m^2 = \frac{1}{2(n-1)}\sum_{i=1}^{2}\sum_{j=1}^{n}\left(d_{ij} - \bar{d}_{i.}\right)^2 \qquad (5)$$

where $\bar{d}_{i.} = \frac{1}{n}\sum_{j=1}^{n} d_{ij}$.

Estimating sample size: The null hypothesis ($H_0$): $\varepsilon = 0$ and the alternative hypothesis ($H_1$): $\varepsilon \neq 0$ are used to test whether the effect of the two treatments is equal or not.

Under $H_0$, the test statistic $\hat{\varepsilon} \big/ \left(\hat{\sigma}_m/\sqrt{2n}\right)$ follows a t distribution with $2n-2$ degrees of freedom (Chow et al., 2003).

The null hypothesis is rejected at the $\alpha$ level of significance if

$$\left|\frac{\hat{\varepsilon}}{\hat{\sigma}_m/\sqrt{2n}}\right| > t_{\alpha/2,\,2n-2} \qquad (6)$$

The above mentioned hypothesis test will satisfy a power of $1-\beta$ if

$$\Pr\left[\left|\frac{\hat{\varepsilon}}{\hat{\sigma}_m/\sqrt{2n}}\right| > t_{\alpha/2,\,2n-2} \;\middle|\; \varepsilon = \varepsilon_R\right] = 1 - \beta \qquad (7)$$

From equations (6) and (7) the corresponding sample size can be obtained by

$$n = \frac{\left[T_{2n-2}^{-1}\left(1-\frac{\alpha}{2}\right) + T_{2n-2}^{-1}(1-\beta)\right]^2 \hat{\sigma}_m^2}{2\varepsilon_R^2} \qquad (8)$$

[Here $T_n^{-1}(a)$ indicates the $a$th ordinate of the t distribution, with $n$ degrees of freedom]

When considering the large sample approach, instead of the t distribution, the standard normal distribution is used. Then the formula for the sample size calculation can be obtained as,

$$n = \frac{\left[Z^{-1}\left(1-\frac{\alpha}{2}\right) + Z^{-1}(1-\beta)\right]^2 \hat{\sigma}_m^2}{2\varepsilon_R^2} \qquad (9)$$

[Here $Z^{-1}(a)$ indicates the $a$th ordinate of the standard normal distribution]

Simulation studies:

(a) Description: In order to satisfy the above mentioned objectives, a simulation study was carried out. For the exact approach, the bisection method was used as the root finding technique for determining the sample size, as described in Press et al. (2002). The simulation study was also used for determining whether the type 1 error and the power are maintained, for both approaches. Sample sizes were determined for varying correlations for both approaches, and thereby the saving in sample size with increasing correlation was studied. Finally, based on the results of the simulation study carried out, guidelines are provided for sample size calculation in crossover trials.

A C programme was written for performing the simulation study. The C language was selected since it is efficient in doing large scale simulations. The first step of the simulation study was to set some practically plausible values for the parameters required (Sooriyarachchi & Whitehead, 1998; Whitehead et al., 2008). Usually crossover trials are associated with a small sample size, due to comparison of treatments being within patient rather than between patient, and variances within patient being usually smaller than the between patient variances. The between patient standard deviation for the treated group ($\sigma_{BT}$) was examined over two values, namely 3 and 4, and the between patient standard deviation for the reference group ($\sigma_{BR}$) was set equal to $\sigma_{BT}$, which is often assumed in crossover trials. Two values were assigned for the within patient standard deviation for the treated group ($\sigma_{WT}$), namely 0.3 and 0.5, and again the within patient standard deviation for the reference group ($\sigma_{WR}$) was set equal to $\sigma_{WT}$.

The within subject correlation coefficient was indicated by the variable $\rho$. The values of $\rho$ examined were 0, 0.3, 0.6 and 0.9, i.e. no correlation at all, some correlation, high correlation and very high correlation, respectively, so that the outcomes for various situations can be seen and compared. Note that although it was not considered in this study it is also possible to consider situations where $\sigma_{BT}$ does not equal $\sigma_{BR}$ and $\sigma_{WT}$ does not equal $\sigma_{WR}$. The reference improvement is indicated by the variable named $\varepsilon_R$. The values of $\varepsilon_R$ that were examined are 1.5, 2 and 3.

For each of these combinations 1000 simulations were carried out under the null and the alternative hypotheses. Under the null hypothesis, the mean difference between treatments ($\varepsilon$) is set to zero and under the alternative hypothesis, the mean difference between treatments ($\varepsilon$) is set to $\varepsilon_R$.

As explained in the introduction, calculation of sample size is not straightforward for the exact case as the sample size is dependent on the degrees of freedom in this case and thus the sample size determination requires solving a nonlinear equation in $n$; hence a root finding technique is needed. The method used here is the bisection method explained in Press et al. (2002).

After obtaining an estimate for the sample size, it was of interest to determine the proportion of rejections under the null and alternative hypotheses out of a thousand simulations to see whether the power and the significance level are maintained. That is, each sample size is simulated 1000 times and the proportion of rejections is obtained. In order to do that, we need to simulate the model explained in the introduction. For that we need to generate the $s_{ijk}$'s and $e_{ijk}$'s.

An estimate for the significance level can be obtained by the proportion of rejections of the null hypothesis when the null hypothesis is true, and an estimate for the power of the test can be obtained by the proportion of rejections of the null hypothesis when the alternative hypothesis is true. If the corresponding proportions are within the probability intervals, it can be concluded that the significance level/power is well maintained by the test. Table 1 gives the proportion of rejections of the null hypothesis under the null and alternative hypotheses for the exact approach and Table 2 is the corresponding table for the large sample approach. The proportions which are out of the confidence limits are highlighted in the tables.

In these tables the values taken by all the nuisance parameters ($\sigma_{BT}^2$, $\sigma_{BR}^2$, $\sigma_{WT}^2$, $\sigma_{WR}^2$, $\rho$) and the reference improvement ($\varepsilon_R$) are given, including whether the simulation was done under the null hypothesis (g = 1) or under the alternative hypothesis (g = 2). Then for each combination, the calculated sample size, the proportion of rejections of the null hypothesis, $\sigma_m^2$ and the mean value of $\hat{\sigma}_m^2$ are given. This is useful in deciding how close the estimate $\hat{\sigma}_m^2$ is to $\sigma_m^2$.

Tables 1 and 2 show that for most of the combinations, the values of $\sigma_m^2$ and the mean of $\hat{\sigma}_m^2$ are close to each other for both situations.
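The two sample size formulas (8) and (9) and the rejection-rate check can be sketched in a few lines of Python. This is an illustration only, not the authors' C programme, and it rests on several assumptions that are not in the paper: α = 0.05 and power 0.80 as defaults, a t quantile obtained by numerical integration instead of the Cooke et al. (1982) approximation, an upward search over n in place of the bisection method, and differences $d_{ij}$ drawn directly from $N(\varepsilon, \sigma_m^2)$ with zero period effects instead of generating subject effects and errors separately. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=400):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3


def t_ppf(p, df):
    """Quantile of the t distribution for p >= 0.5, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sigma_m2(s_bt, s_br, rho, s_wt, s_wr):
    """Variance of a within-subject difference d_ij."""
    return s_bt**2 + s_br**2 - 2 * rho * s_bt * s_br + s_wt**2 + s_wr**2


def n_large(var_m, eps_r, alpha=0.05, beta=0.20):
    """Per-sequence sample size from the normal approximation, equation (9)."""
    z = NormalDist().inv_cdf
    return math.ceil((z(1 - alpha / 2) + z(1 - beta)) ** 2 * var_m / (2 * eps_r**2))


def n_exact(var_m, eps_r, alpha=0.05, beta=0.20, n_max=10_000):
    """Smallest n >= 2 that satisfies equation (8), found by upward search."""
    for n in range(2, n_max + 1):
        df = 2 * n - 2
        need = (t_ppf(1 - alpha / 2, df) + t_ppf(1 - beta, df)) ** 2 * var_m / (2 * eps_r**2)
        if n >= need:
            return n
    raise ValueError("no sample size found below n_max")


def rejection_rate(n, eps, var_m, alpha=0.05, sims=1000, seed=0):
    """Proportion of simulated trials in which H0: eps = 0 is rejected."""
    rng = random.Random(seed)
    crit = t_ppf(1 - alpha / 2, 2 * n - 2)
    sd = math.sqrt(var_m)
    rejections = 0
    for _ in range(sims):
        # d[i][j]: difference for subject j in sequence i (period effects assumed zero)
        d = [[rng.gauss(eps, sd) for _ in range(n)] for _ in range(2)]
        eps_hat = sum(map(sum, d)) / (2 * n)
        means = [sum(row) / n for row in d]
        var_hat = sum((x - m) ** 2 for row, m in zip(d, means) for x in row) / (2 * (n - 1))
        if abs(eps_hat) / math.sqrt(var_hat / (2 * n)) > crit:
            rejections += 1
    return rejections / sims
```

Calling `rejection_rate(n, 0.0, var_m)` estimates the significance level and `rejection_rate(n, eps_r, var_m)` estimates the power for a sample size returned by `n_exact` or `n_large`. The upward search is chosen here only for brevity; for small n it evaluates equation (8) a handful of times, whereas a bisection search would scale better for large n.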
March 2011 Journal of the National Science Foundation of Sri Lanka 39 (1)
Sample size for crossover trials 81
Table 1: Proportion of rejections of the null hypotheses under the null and the alternative hypotheses for the exact method

Table 2: Proportion of rejections of the null hypotheses under the null and the alternative hypotheses for the large sample approximation

In order to illustrate the results more clearly, several graphs have been plotted in addition to the two tables. Figure 1 is drawn to illustrate the variation of sample size with respect to delta (the reference improvement) for different combinations of $\rho$, $\sigma_{BT}$ and $\sigma_{WT}$ for the exact approach.

Figure 1 shows how the values of the nuisance parameters and the reference improvement affect the calculation of sample size for the exact method. It can be seen that when $\rho$ increases, the sample size required rapidly decreases, irrespective of the situation. Here $\rho$ represents the within patient correlation coefficient. The higher this correlation, the higher the gain in sample size. Also it can be observed that the within patient variance $\sigma_{WT}^2$ has less effect than the between patient variance $\sigma_{BT}^2$ on the calculation of sample size because the latter is usually much greater than the former. When the reference improvement increases the sample size becomes smaller, because the difference we want to detect is larger. Also the gain in sample size due to increasing correlation is higher for smaller reference improvement.

Figure 2 is drawn in order to see whether the significance level is maintained by the test for the exact approach. It shows a graph of the proportion of rejections of the null hypothesis, when the null hypothesis is true, versus delta for different combinations of $\rho$, $\sigma_{BT}$ and $\sigma_{WT}$ for the exact approach.

Two coloured lines represent the band within which the proportion should lie, in order to maintain the significance level. The corresponding sample size is shown near the points which are out of the bands. Similar results as shown by Table 1 are illustrated here.

Figure 3 is drawn in order to see how well the power is maintained by the test for the exact approach. It gives a graph of the proportion of rejections of the null hypothesis, when the alternative hypothesis is true, versus delta for different combinations of $\rho$, $\sigma_{BT}$ and $\sigma_{WT}$ for the exact approach.

When considering Figure 3 it can be seen that most of the sample sizes which lie outside the bands are very small numbers except for 12, which is very close to the upper limit. When the sample size is five or less, many points lie outside the band. The reason is the imprecision of the approximation used in calculating the inverse t distribution values when the sample size is less than or equal to five. When $\rho$ is very high (0.9) and the reference improvement is large (3), there is a higher tendency to obtain a small sample size, hence a higher number of points can be observed outside the bands in those combinations.

From Figures 2 and 3, it can be seen that similar results are obtained as per the table for the exact approach.

Figures 4 and 5 are the graphs corresponding to Figures 1 and 2, respectively, for the large sample approach. The conclusions drawn from Figure 4 are the same as those drawn from Figure 1. Figure 5 illustrates similar results as given in Table 2.

Figure 6 is the graph corresponding to Figure 3 for the large sample approach and illustrates similar results as in Table 2. When $\rho$ is very high (0.9) and the reference improvement is large (3), there is a higher tendency to obtain a small sample size.

CONCLUSION

This study deals with the sample size calculation of crossover trials under two situations in the power approach, namely the exact and large sample methods (Chow et al., 2003).

From the results of the simulation study the following guidelines can be given. It was seen that when the sample size is very small (less than or equal to five), neither method maintains error rates, i.e. even the exact method, which was based on the t distribution, failed for very small sample sizes. This is because the approximation used to calculate the inverse of the t distribution in determining the sample size, which is described in Cooke et al. (1982), is not accurate for very small sample sizes. Also it is evident that when the sample size is fairly large in terms of crossover studies (5 < sample size < 12), only the exact approach has maintained error rates. This is because the normal ordinate is an underestimate of the t ordinate for sample sizes within that range. That means within the specified sample limits the exact approach should be used.

When the sample size is large in terms of crossover studies (≥ 12) both methods have maintained error rates, i.e. for sample sizes greater than about eleven, the large sample approach, which is much simpler than the exact approach, can be used for sample size determination instead of the exact approach, which needs numerical methods. A higher reduction in sample size can be achieved when the within patient correlation is high.

A better approximation for the inverse of the t distribution should be found for calculating sample sizes which are believed to be very small. The study is done for 2 × 2 crossover trials, which consider only two treatments and two periods. As an extension one can consider more treatments and periods. Also in this study there are no replications of treatments, i.e. a treatment is given to a set of patients (subjects) only once. One can extend this study to have replications.

An assumption used in this study is that of equality of within and between patient variances for both treated and reference groups (i.e. $\sigma_{BT}^2 = \sigma_{BR}^2$ and $\sigma_{WT}^2 = \sigma_{WR}^2$) and equal allocation of patients to both groups. Thus further investigation can be carried out taking different values for these parameters.
Figure 1: Graph of sample size versus delta for different combinations of ρ, σBT and σWT for the exact approach (vertical axis: sample size; horizontal axis: delta)
Figure 2: Graph of proportion of rejections of the null hypothesis, when the null hypothesis is true, versus delta for different combinations of ρ, σBT and σWT for the exact approach (vertical axis: significance level; horizontal axis: delta)
Figure 3: Graph of proportion of rejections of the null hypothesis, when the alternative hypothesis is true, versus delta for different combinations of ρ, σBT and σWT for the exact approach (vertical axis: power; horizontal axis: delta)
Figure 4: Graph of sample size versus delta for different combinations of ρ, σBT and σWT for the large sample approach (vertical axis: sample size; horizontal axis: delta)
Figure 5: Graph of proportion of rejections of the null hypothesis, when the null hypothesis is true, versus delta for different combinations of ρ, σBT and σWT for the large sample approach (vertical axis: significance level)
Figure 6: Graph of proportion of rejections of the null hypothesis, when the alternative hypothesis is true, versus delta for different combinations of ρ, σBT and σWT for the large sample approach (vertical axis: power)
In the model specified, it was assumed that there is no treatment by period interaction (that is, the effect of the treatment remains consistent over the two periods). If such an interaction is present, a simple t test cannot be used in testing the hypothesis and a modelling approach will have to be used (Jones & Kenward, 2003). This requires a new investigation to be carried out.

References

1. Al-Subaihi A.A. (2004). Simulating correlated multivariate pseudorandom numbers. Journal of Statistical Software 9(i04). http://ideas.repec.org/a/jss/jstsof/09i04.html.
2. Brown H. & Prescott R. (2006). Applied Mixed Models in Medicine. John Wiley & Sons, Inc., Chichester, UK.
3. Chow S.C., Shao J. & Wang H. (2003). Sample Size Calculations in Clinical Research. Chapman and Hall/CRC Press, New York, USA.
4. Cooke D., Craven A.H. & Clarke G.M. (1982). Basic Statistical Computing. Edward Arnold Press, London, UK.
5. Golder E.R. & Settle J.G. (1976). The Box-Muller method for generating pseudo-random normal deviates. Applied Statistics 25(1): 12-20.
6. Grizzle J.E. (1965). The two-period change-over design and its use in clinical trials. Biometrics 21(2): 467-480.
7. Jones B. & Kenward M. (2003). The Analysis of Cross-over Trials. Chapman and Hall/CRC Press, New York, USA.
8. Julious S.A. & Patterson S.D. (2004). Sample sizes for estimation in clinical trials. Pharmaceutical Statistics 3(3): 213-215.
9. Lee C., Lee L.H., Christoper L.W., Chen M. & Benjamin R. (2005). Clinical Trials of Drugs and Biopharmaceuticals. Chapman and Hall/CRC Press, New York, USA.
10. Pocock S.J. (1983). Clinical Trials: A Practical Approach. John Wiley & Sons, Inc., Chichester, UK.
11. Press W.H., Teukolsky S.A., Vetterling W.T. & Flannery B.P. (2002). Numerical Recipes in C++, pp. 357-358. Cambridge University Press, Cambridge, UK.
12. Sooriyarachchi M.R. & Whitehead J. (1998). A method for sequential analysis of survival data with non-proportional hazards. Biometrics 54(3): 1072-1084.
13. Whitehead A., Sooriyarachchi M.R., Whitehead J. & Bolland K. (2008). Incorporating intermediate binary responses into interim analyses of clinical trials: a comparison of four methods. Statistics in Medicine 27(10): 1646-1666.
14. Woodward M. (1992). Formulae for sample size, power and minimum detectable risk in medical studies. The Statistician 41(2): 185-196.