Statistical Rule of Thumb
Steven P. Millard
Probability, Statistics and Information
Seattle, WA
98115-5117
Chapter 2
Sample Size
2.1 The Basic Formula
Introduction
The first question faced by a statistical consultant, and frequently the last,
is, "How many subjects (animals, units) do I need?" This usually results in
exploring the size of the treatment effects the researcher has in mind and the
variability of the observational units. Researchers are usually less interested
in questions of Type I error, Type II error, and one-sided versus two-sided
alternatives. You will not go far astray if you start with the basic sample size
formula for two groups, with a two-sided alternative, normal distribution with
variances homogeneous.
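Rule of Thumb
\[ n = \frac{16}{\Delta^2}, \]
where,
\[ \Delta = \frac{\mu_1 - \mu_2}{\sigma} \]
is the difference between the two group means in units of the standard deviation $\sigma$, and $n$ is the sample size per group.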
Illustration
Derivation
For $\alpha = 0.05$, $\beta = 0.20$ the values of $z_{1-\alpha/2}$ and $z_{1-\beta}$ are 1.96 and 0.84 respectively
and $2(z_{1-\alpha/2} + z_{1-\beta})^2 = 15.68$ which can be rounded up to 16. So a quick rule
of thumb for sample size calculations is:
\[ n = \frac{16}{\Delta^2}. \]
This formula is convenient to memorize. The key is to think in terms of standardized units of $\Delta$. The multiplier can be calculated for other values of Type I
and Type II error. In addition, for a given sample size the detectable difference
can be calculated.
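The multiplier for other error rates, and the detectable difference for a given sample size, can be sketched as follows. This is a minimal illustration that assumes the two-sided, equal-variance normal setting described above; the function names are hypothetical and not part of the original text.

```python
from statistics import NormalDist


def multiplier(alpha=0.05, beta=0.20):
    """Return 2 * (z_{1-alpha/2} + z_{1-beta})**2 for a two-sided test."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    return 2 * (z(1 - alpha / 2) + z(1 - beta)) ** 2


def n_per_group(delta, alpha=0.05, beta=0.20):
    """Sample size per group for a standardized difference delta."""
    return multiplier(alpha, beta) / delta ** 2


def detectable_difference(n, alpha=0.05, beta=0.20):
    """Standardized difference detectable with n subjects per group."""
    return (multiplier(alpha, beta) / n) ** 0.5


# With unrounded quantiles the default multiplier is roughly 15.7;
# the text's 15.68 comes from the rounded values 1.96 and 0.84.
print(multiplier())
print(n_per_group(0.5))
print(detectable_difference(16))
```

Calling `multiplier(0.05, 0.10)` gives the multiplier for 90% power, which is larger than 16, so the rule of thumb should be adjusted when higher power is wanted.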
Introduction
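Rule of Thumb
\[ n = \frac{8(CV)^2}{(PC)^2}\left[1 + (1 - PC)^2\right], \]
where $CV$ is the coefficient of variation and $PC$ is the proportionate change in means to be detected.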
Illustration
For the situation described in the consulting session the sample size becomes,
\[ n = \frac{8(0.30)^2}{(0.20)^2}\left[1 + (1 - 0.20)^2\right] = 29.52 \approx 30 \]
and the researcher will need to aim for about 30 subjects per group. If the
treatment is to be compared with a standard, that is, only one group is needed
then the sample size required will be 15.
Derivation
Since the coefficient of variation is assumed to be constant this implies that the
variances of the two populations are not the same and the variance $\sigma^2$ in the
sample size formula is replaced by the average of the two population variances:
$(\sigma_1^2 + \sigma_2^2)/2$. Replacing $\sigma_i$ by $\mu_i CV$ for $i = 1, 2$ and simplifying the algebra leads
to the equation above.
Sometimes the researcher will not have any idea of the variability inherent in
the system. For biological variables a variability on the order of 35% is not
uncommon and you will be able to begin the discussion by assuming a sample
size formula of:
\[ n \approx \frac{1}{(PC)^2}\left[1 + (1 - PC)^2\right]. \]
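The coefficient-of-variation formula used in the illustration can be checked with a short sketch. The function name and argument names are hypothetical; `cv` is the coefficient of variation and `pc` the proportionate change, both as fractions.

```python
def n_per_group_cv(cv, pc):
    """Sample size per group: 8 * CV^2 / PC^2 * [1 + (1 - PC)^2]."""
    return 8 * cv ** 2 / pc ** 2 * (1 + (1 - pc) ** 2)


# Consulting example from the text: CV = 0.30, PC = 0.20.
print(n_per_group_cv(0.30, 0.20))

# With CV = 0.35 the leading factor 8 * CV^2 is 0.98, close to 1,
# which is why the simplified formula drops it.
print(8 * 0.35 ** 2)
```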
References
Frequently the question is asked to calculate a sample size for a fixed confidence
interval width. We consider two situations: the confidence interval width is $w$ in the original
scale, and it is $w^* = w/\sigma$ in units of the standard deviation.
Rule of Thumb
\[ n = \frac{16\sigma^2}{w^2}, \]
and,
\[ n = \frac{16}{(w^*)^2}. \]
Illustration
If $\sigma = 4$ and the confidence interval width desired is 2 then the required sample
size is 64. In terms of standardized units the value is $w^* = 0.5$, leading to the
same answer.
Derivation
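\[ w = 2 \times 1.96\,\frac{\sigma}{\sqrt{n}}. \]
Solving for $n$ gives $n = (3.92)^2\sigma^2/w^2 \approx 15.4\,\sigma^2/w^2$, which rounds up to $16\sigma^2/w^2$.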
The sample size formula for the confidence interval width is identical to the
formula for sample sizes comparing two groups. Thus you have to memorize
only one formula. If you switch back and forth between these two formulations
in a consulting session you must point out that you are moving from two-sample
to one-sample situations.
This formulation can also be used for setting up a confidence interval on a
difference of two means. You can show that the multiplier changes from 16 to
32. This makes sense because the variance of the difference of two independent
means is twice the variance of each mean.
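The confidence interval version of the rule can be sketched in a few lines. The function name and the `multiplier` parameter are hypothetical; the default of 16 is the one-sample rule, and 32 is the value the text gives for a difference of two means.

```python
def n_for_ci_width(sigma, width, multiplier=16):
    """Sample size for a 95% confidence interval of the given total width."""
    return multiplier * sigma ** 2 / width ** 2


# Illustration from the text: sigma = 4, width = 2.
print(n_for_ci_width(4, 2))

# Same problem in standardized units: w* = w / sigma = 0.5, sigma = 1.
print(n_for_ci_width(1, 0.5))

# Interval on a difference of two means: multiplier 32, n is per group.
print(n_for_ci_width(4, 2, multiplier=32))
```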
A rather elegant result for sample size calculations can be derived in the case
of Poisson variables. It is based on the square root transformation of Poisson
random variables.
Rule of Thumb
\[ n = \frac{4}{(\sqrt{\theta_1} - \sqrt{\theta_2})^2}, \]
where $\theta_1$ and $\theta_2$ are the two Poisson means.
Illustration
Suppose two Poisson distributed populations are to be compared. The hypothesized means are 30
and 36. Then the number of sampling units per group is
required to be $4/(\sqrt{30} - \sqrt{36})^2 = 14.6 \approx 15$ observations per group.
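The Poisson calculation in this illustration can be reproduced with a short sketch. The function name is hypothetical; the arguments are the two hypothesized Poisson means.

```python
import math


def n_per_group_poisson(mean1, mean2):
    """Sample size per group: 4 / (sqrt(mean1) - sqrt(mean2))^2."""
    return 4 / (math.sqrt(mean1) - math.sqrt(mean2)) ** 2


n = n_per_group_poisson(30, 36)
print(n)             # about 14.6
print(math.ceil(n))  # rounded up to whole sampling units
```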
Derivation
Rule of Thumb
Suppose that the background level of radiation is $\theta^*$ and let $\theta_1$ and $\theta_2$ now be
the additional radiation over background. Then, $X_i$ is Poisson($\theta^* + \theta_i$). The
rule-of-thumb sample size formula is:
\[ n = \frac{16\,(\theta^* + (\theta_1 + \theta_2)/2)}{(\theta_1 - \theta_2)^2}. \]
Illustration
Suppose the means of the two populations are 1 and 2 with no background
radiation. Then the sampling effort is n = 24. Now assume a background level
of 1.5. Then the sample sizes per group become 48. Thus the sample size has
doubled with a background radiation halfway between the two means.
Derivation
We did not use the square root transformation, for two reasons. First, the background radiation level is more transparently displayed in the original scale.
Second, if the square root transformation is used then an expansion in terms of
the $\theta$'s produces exactly the formula above. The denominator does not include
the background radiation but the numerator does. Since the sample size is proportional to the numerator, increasing levels of background radiation require
larger sample sizes to detect the same difference in radiation levels. When the
square root transformation formula is used in the first example the sample size
is 23.3, and in the second example, 47.7. These values are virtually identical
to 24 and 48. While the formula is based on the normal approximation to the
Poisson distribution the effect of background radiation is very clear.
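The comparison between the original-scale formula and the square root formula can be sketched as below. The function names are hypothetical; `background` is the background level and `add1`, `add2` are the additional levels over background.

```python
import math


def n_with_background(add1, add2, background=0.0):
    """Original-scale rule: 16 * (background + mean of add1, add2) / (add1 - add2)^2."""
    return 16 * (background + (add1 + add2) / 2) / (add1 - add2) ** 2


def n_sqrt_rule(add1, add2, background=0.0):
    """Square root rule applied to the total Poisson means."""
    m1, m2 = background + add1, background + add2
    return 4 / (math.sqrt(m1) - math.sqrt(m2)) ** 2


for bg in (0.0, 1.5):
    # Prints the two sample sizes side by side for each background level.
    print(bg, n_with_background(1, 2, bg), n_sqrt_rule(1, 2, bg))
```

The two rules agree closely in both cases, and the doubling of the sample size with a background of 1.5 is visible in either one.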
Rule of Thumb
\[ n = \frac{4}{(\pi_1 - \pi_2)^2}. \]
Illustration
For $\pi_1 = 0.5$ and $\pi_2 = 0.7$ the required sample size per group is n = 100.
Derivation
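The basic formula $n = 16/\Delta^2$ is used where,
\[ \Delta = \frac{\pi_1 - \pi_2}{\sigma}. \]
An upper limit on the required sample size is obtained at the maximum values
of $\sigma_i = \sqrt{\pi_i(1 - \pi_i)}$, which occur at $\pi_i = 1/2$ for $i = 1, 2$. For these values $\sigma = 1/2$ and the
sample size formula becomes as above.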
Some care should be taken with this approximation. It is reasonably good for
values of n that come out between 10 and 100. For larger (or smaller) resulting
sample sizes using this approximation, more exact formulae should be used. For
more extreme values use tables of exact values given by Haseman (1978) or use
more exact formulae (see Fisher and van Belle, 1993). Note that the tables by
Haseman are for one-tailed tests of the hypotheses.
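A minimal sketch of the binomial rule follows. The first function is the upper-bound rule from the text. The second is an assumption added for comparison: it uses the average of the two binomial variances for $\sigma^2$ in $n = 16\sigma^2/(\pi_1 - \pi_2)^2$, which the text does not spell out. Both function names are hypothetical.

```python
def n_binomial_upper(p1, p2):
    """Conservative rule of thumb: 4 / (p1 - p2)^2, i.e. sigma = 1/2."""
    return 4 / (p1 - p2) ** 2


def n_binomial_avg_var(p1, p2):
    """Assumed variant: sigma^2 taken as the average binomial variance."""
    sigma2 = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    return 16 * sigma2 / (p1 - p2) ** 2


print(n_binomial_upper(0.5, 0.7))    # about 100, as in the illustration
print(n_binomial_avg_var(0.5, 0.7))  # somewhat smaller than the upper bound
```

Because floating point arithmetic can leave the result a hair above or below a whole number, round with care before taking a ceiling.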
References
Haseman (1978) contains tables for "exact" sample sizes based on the hypergeometric distribution. See also Fisher and van Belle (1993).
In some cases it may be useful to have unequal sample sizes. For example,
in epidemiological studies it may not be possible to get more cases but more
controls are available. Suppose n subjects are required per group but only $n_1$
are available for one of the groups, where we assume that $n_1 < n$. We desire
to know the number of subjects, $kn_1$, required in the second group in order to
obtain the same precision as with n in each group.
Rule of Thumb
\[ k = \frac{n}{2n_1 - n}. \]
Illustration
Suppose that sample size calculations indicate that n = 16 cases and controls
are needed in a case-control study. However, only 12 cases are available. How
many controls will be needed to obtain the same precision? The answer is
$k = 16/(2 \times 12 - 16) = 2$, so that $2 \times 12 = 24$ controls are needed.
Derivation
For two independent samples of size n, the variance of the estimate of difference
(assuming equal variances) is proportional to,
\[ \frac{1}{n} + \frac{1}{n}. \]
Given a sample size $n_1 < n$ available for the first sample and a sample size $kn_1$
for the second sample, then equating the variances for the two designs,
\[ \frac{1}{n} + \frac{1}{n} = \frac{1}{n_1} + \frac{1}{kn_1}, \]
and solving for $k$ gives $k = n/(2n_1 - n)$.
This approach can be generalized to situations where the variances are not
equal. The derivations are simplest when one variance is fixed and the second
variance is considered a multiple of the first variance (analogous to the sample
size calculation).
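The equal-variance case of the unequal sample size rule can be sketched as follows. The function name is hypothetical. Note that the formula only has a finite positive solution when $n_1 > n/2$; the guard in the code reflects that algebraic fact.

```python
def second_group_multiplier(n, n1):
    """Return k such that groups of n1 and k*n1 match the precision of n and n."""
    if 2 * n1 <= n:
        raise ValueError("n1 must exceed n/2 for equal precision to be attainable")
    return n / (2 * n1 - n)


# Case-control example: n = 16 needed per group, only 12 cases available.
k = second_group_multiplier(16, 12)
print(k, k * 12)

# Check that both designs give the same variance factor.
print(1 / 16 + 1 / 16, 1 / 12 + 1 / (k * 12))
```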
Now consider two designs, one with n observations in each group and the
other with n and kn observations in the two groups.
The relative precision of these two designs is,
\[ \frac{SE_k}{SE_1} = \sqrt{\frac{1}{2}\left(1 + \frac{1}{k}\right)}, \]
where $SE_k$ and $SE_1$ are the standard errors of the designs with kn and n
subjects in the second group respectively.
For k = 1 we are back to the usual two-sample situation with equal sample
size. If we make $k = \infty$ the relative precision is $\sqrt{0.5} = 0.71$. Hence, the best
we can do is to decrease the standard error of the difference by 29%. For k = 4
we are already at 0.79 so that from the point of view of precision there is no
reason to go beyond four or five times as many subjects in the second group as in the first
group. This will come close to the maximum possible precision.
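The diminishing return from enlarging the second group can be tabulated with a short sketch. The function name is hypothetical; `k` is the ratio of the second group's size to the first.

```python
import math


def relative_precision(k):
    """SE of the (n, k*n) design relative to the (n, n) design."""
    return math.sqrt(0.5 * (1 + 1 / k))


for k in (1, 2, 4, 5, 10, math.inf):
    # The ratio falls from 1.0 toward its limit sqrt(0.5).
    print(k, round(relative_precision(k), 2))
```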
In some two sample situations the cost per observation is not equal and the
challenge then is to choose the sample sizes in such a way so as to minimize
cost and maximize precision, or minimize the standard error of the difference
(or, equivalently, minimize the variance of the difference). Suppose the cost per
observation in the first sample is $c_1$ and in the second sample is $c_2$. How should
the two sample sizes $n_1$ and $n_2$ be chosen?
Rule of Thumb
\[ \frac{n_2}{n_1} = \sqrt{\frac{c_1}{c_2}}. \]
This is known as the square root rule: pick sample sizes inversely proportional
to the square root of the cost of the observations. If costs are not too different then
equal sample sizes are suggested (because the square root of the ratio will be
closer to 1).
Illustration
Suppose the cost per observation for the first sample is 160 and the cost per
observation for the second sample is 40. Then the rule of thumb states that
you should take twice as many observations in the second group as compared
to the first. To calculate the specific sample sizes, suppose that on an equal
sample basis 16 observations are needed. To get equal precision with $n_1$ and
$2n_1$ we solve the same equation as in the previous section to produce 12 and 24
observations, respectively.
Derivation
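The variance of the difference is proportional to
\[ \frac{1}{n_1} + \frac{1}{n_2}, \]
and is to be minimized subject to the total cost, $c_1 n_1 + c_2 n_2$, being $C$. This is a constrained minimization problem with
solutions:
\[ n_1 = \frac{C}{c_1 + \sqrt{c_1 c_2}}, \]
and
\[ n_2 = \frac{C}{c_2 + \sqrt{c_1 c_2}}. \]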
The argument is similar to that in connection with the unequal sample size rule
of thumb.
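The square root rule and the fixed-budget solutions can be sketched together. The function name and the budget of 2400 are hypothetical values chosen for illustration; the unit costs 160 and 40 come from the text.

```python
import math


def allocate(total_cost, c1, c2):
    """Sample sizes minimizing 1/n1 + 1/n2 subject to c1*n1 + c2*n2 = total_cost."""
    root = math.sqrt(c1 * c2)
    n1 = total_cost / (c1 + root)
    n2 = total_cost / (c2 + root)
    return n1, n2


n1, n2 = allocate(2400, 160, 40)  # 2400 is an assumed budget
print(n1, n2)                     # sample sizes for the two groups
print(n2 / n1)                    # equals sqrt(c1 / c2)
print(160 * n1 + 40 * n2)         # spends the whole budget
```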
The rule of threes can be used to address the following type of question, "I am
told by my physician that I need a serious operation and have been informed
that there has not been a fatal outcome in the twenty operations carried out by
the physician. Does this information give me an estimate of the potential
postoperative mortality?" The answer is "yes!"
Rule of Thumb
Given no observed events in n trials, a 95% upper bound on the rate of occurrence is,
\[ \frac{3}{n}. \]
Illustration
Given no observed events in 20 trials a 95% upper bound on the rate of occurrence is 3/20 = 0.15. Hence, with no fatalities in twenty operations the rate
could still be as high as 0.15.
Derivation
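If the rate of occurrence is $p$, the probability of observing no events in n independent trials is $(1 - p)^n \approx e^{-np}$. Setting this equal to 0.05 gives $np = -\ln(0.05) = 2.996 \approx 3$, so that
\[ p = \frac{3}{n}. \]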
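As a check on the rule of threes, the sketch below compares 3/n with the exact binomial upper bound, obtained by solving $(1 - p)^n = 0.05$ for $p$. The function names are hypothetical, and the exact bound assumes independent trials with a common event probability.

```python
def rule_of_three(n):
    """Approximate 95% upper bound on the event rate after n event-free trials."""
    return 3 / n


def exact_upper_bound(n, confidence=0.95):
    """Solve (1 - p)^n = 1 - confidence for p."""
    return 1 - (1 - confidence) ** (1 / n)


for n in (10, 20, 50, 100):
    # The rule of threes is slightly conservative at every n.
    print(n, rule_of_three(n), exact_upper_bound(n))
```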
References
Fisher, L. and van Belle, G. (1993). Biostatistics: A Methodology for the Health
Sciences. Wiley and Sons, New York, NY.
Hanley, J.A. and Lippman-Hand, A. (1983). If nothing goes wrong, is everything alright? Journal of the American Medical Association, 249: 1743-1745.
Haseman, J.K. (1978). Exact sample sizes for the use with the Fisher-Irwin
test for 2x2 tables. Biometrics, 34: 106-109.
van Belle, G. and Martin, D.C. (1993). Sample size as a function of coefficient
of variation and ratio of means. American Statistician, 47: 165-167.