
7 Sample Size Determination

The document discusses the importance of determining sample size in study design, emphasizing the need for sufficient subjects to achieve statistically significant results. It outlines factors affecting sample size, approaches for determination, and formulas for various types of outcome variables, including proportions and means. Additionally, it addresses concepts like confidence level, power, and design effect, providing examples and exercises for practical application.

Sample size determination

Jimma,2024
Sample size determination

• Determining the sample size for a study is a crucial component of study design.
• The goal is to include enough subjects that statistically significant results can be detected.
• Among the questions a researcher should ask when planning a survey or study is "How large a sample do I need?"
• The answer depends on the aims, nature and scope of the study and on the expected result.
• All of these should be carefully considered at the planning stage.
Sample size determination…

In general, sample size depends on:


• The type of data analysis to be performed
• The desired precision of the estimates one wishes to achieve
• The kind and number of comparisons that will be made
• The number of variables that have to be examined simultaneously
• How heterogeneous the sampled population is
• The objective of the study
Approaches

• We can use two approaches to determine sample size:
1. Rules of thumb for determining the sample size
2. Statistical formula
1. Rules of thumb
1. For small populations (N < 100), there is little point in sampling. Survey the entire population.
2. If the population size is around 500 (give or take 100), 50% should be sampled.
3. If the population size is around 1500, 20% should be sampled.
4. Beyond a certain point (N = 5000), the population size is almost irrelevant and a sample size of 400 may be adequate.
5. Statistician maximalist: at least 500
6. To make generalizations about the entire population, a total sample size of 200-400 is needed (depending on total population and confidence level desired).
2. Statistical formula
There are three possible categories of outcome variables:

1. The variable of interest has only two alternative responses: yes/no, dead/alive, vaccinated/not vaccinated and so on.
2. The outcome variable has multiple, mutually exclusive alternative responses, such as marital status, religion, blood group and so on.
For these two categories of outcome variables, the data are generally expressed as percentages or rates, so we can use percentages to compute the sample size.
3. Continuous response variables such as birth weight, age at first marriage, blood pressure and serum uric acid level, for which numerical measurements are usually made.
– In this case the data are summarized in the form of means and standard deviations or their derivatives.
Sample size and power
• Sample size and power are essential for evaluating the role of chance
• If a study has an inadequate sample size, a real difference may fail to show up as a statistically significant difference
• A true association will be difficult or impossible to distinguish from a chance association because of inadequate power
Confidence Level
 α : The significance level of a test: the
probability of rejecting the null hypothesis
when it is true (or the probability of making a
Type I error). It is usually 5% (0.05)
 Confidence level: The probability that an
estimate of a population parameter is within
certain specified limits of the true value;
(commonly denoted by 1- α , and is usually
95%).
Power and β
• Power: The probability of correctly rejecting the null hypothesis when it is false; commonly denoted by 1-β
• β : The probability of failing to reject the null hypothesis when it is false (or the probability of making a Type II error)
Sample Size required for single Proportions

• The formula requires knowledge of p, the proportion in the population possessing the characteristic of interest.
• Formula:

n = (Zα/2)² × p(1 − p) / w²

• Where p can be obtained from:
– Estimates available from previous studies
– A pilot or preliminary sample
– If neither is available, set p = 0.5, which gives the largest sample size
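The effect of the choice of p can be checked with a short sketch. It assumes the usual single-proportion formula n = Z² p(1 − p) / w² with Z = 1.96 and w = 0.05; the function name is hypothetical and the result is rounded up to the next whole subject.

```python
import math

def sample_size_single_proportion(p, w, z=1.96):
    """Sample size for estimating one proportion: n = z^2 * p * (1 - p) / w^2."""
    n = (z ** 2) * p * (1 - p) / (w ** 2)
    return math.ceil(n)  # always round up to a whole subject

# Compare several guesses for p at the same margin of error
for p in (0.1, 0.3, 0.5, 0.7):
    print(p, sample_size_single_proportion(p, w=0.05))
```

Because p(1 − p) peaks at p = 0.5, that choice is the conservative one when no prior estimate exists.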
Sample Size required for single Proportions

• Zα/2 is the value of Z from the standard normal curve at α/2
• For α = 0.05, Z0.025 = 1.96
• For α = 0.1, Z0.05 = 1.645 (often rounded to 1.65), and so on…
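These table values can be reproduced with Python's standard library instead of being looked up. The sketch below uses `statistics.NormalDist`; the helper names are hypothetical.

```python
from statistics import NormalDist

def z_two_sided(alpha):
    """Z value leaving alpha/2 in each tail of the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_power(power):
    """Z value corresponding to the desired power (1 - beta)."""
    return NormalDist().inv_cdf(power)

print(round(z_two_sided(0.05), 3))  # alpha = 0.05
print(round(z_two_sided(0.10), 3))  # alpha = 0.10
print(round(z_power(0.80), 3))      # 80% power
```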

Margin of error (w)


– The margin of error (w) measures the precision of the estimate
– A small value of w indicates high precision
– It lies in the interval (0%, 5%]
– For p close to 50%, w is assumed to be close to 5%
– For smaller values of p, w is assumed to be lower than 5%
Sample size for single mean
• The formula requires knowledge of σ, the population standard deviation for the variable of interest
• Formula:

n = (Zα/2)² × σ² / w²

Where σ can be obtained from
– Previous studies
– A pilot or preliminary sample
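A minimal sketch of the single-mean calculation, assuming the usual form n = Z² σ² / w² where w is the margin of error in the same units as the measurement. The function name and the input values (σ = 15, w = 2) are hypothetical and chosen only for illustration.

```python
import math

def sample_size_single_mean(sigma, w, z=1.96):
    """Sample size for estimating one mean: n = z^2 * sigma^2 / w^2."""
    n = (z ** 2) * (sigma ** 2) / (w ** 2)
    return math.ceil(n)  # round up to a whole subject

# Hypothetical inputs: SD of 15 units, mean wanted to within 2 units
print(sample_size_single_mean(sigma=15, w=2))
```

Halving w roughly quadruples the required sample, since w enters the formula squared.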
Example 1

• A survey is being planned to determine what proportion of families in a certain area are medically indigent. It is believed that the proportion cannot be greater than 0.35.
• A 95% confidence interval is desired with w = 0.05. What sample size should be selected?
• Solution
– Given z = 1.96, p = 0.35, and w = 0.05
– n = (1.96)² × 0.35 × 0.65 / (0.05)² ≈ 349.6, so about 350 families are needed
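The arithmetic for Example 1 can be checked in a few lines. This assumes the single-proportion formula n = Z² p(1 − p) / w² and rounds up to the next whole family.

```python
import math

# Values given in Example 1
z, p, w = 1.96, 0.35, 0.05

n_exact = z ** 2 * p * (1 - p) / w ** 2  # unrounded result of the formula
n = math.ceil(n_exact)                   # round up to a whole family

print(round(n_exact, 2), n)
```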
Design Effect
• It corrects for the change in variance that the sampling design introduces when subjects are selected in clusters rather than individually.
• The design effect can be calculated after study completion, but it should be accounted for at the design stage.
• The design effect is 1 (i.e., no design effect) when taking a simple random sample.
• The design effect varies under cluster sampling; a design effect of 2 is usually assumed for multistage sampling that includes cluster sampling.
Design Effect
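• Global and cluster variance

Global variance = p(1 − p) / n
Cluster variance = Σ(pi − p)² / [k(k − 1)]
Design effect = cluster variance / global variance

Where
– p = global proportion
– pi = proportion of the ith cluster
– n = number of subjects
– k = number of strata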
Sample size Formula
SRS

n = (Zα/2)² × p(1 − p) / d²

Cluster sampling

n = g × (Zα/2)² × p(1 − p) / d²

Where
– p = expected prevalence
– d = absolute precision
– g = design effect
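A short sketch of how the design effect scales the simple-random-sample size. It assumes the cluster formula is the SRS formula multiplied by g, and it reuses p = 0.35 and d = 0.05 with an assumed g = 2 purely for illustration; the function name is hypothetical.

```python
import math

def sample_size_prevalence(p, d, g=1.0, z=1.96):
    """n = g * z^2 * p * (1 - p) / d^2; g = 1 corresponds to simple random sampling."""
    n = g * (z ** 2) * p * (1 - p) / (d ** 2)
    return math.ceil(n)  # round up after applying the design effect

print(sample_size_prevalence(0.35, 0.05))       # simple random sample
print(sample_size_prevalence(0.35, 0.05, g=2))  # cluster sample, assumed g = 2
```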
Sample size for analytic study
The sample size depends on:
• Desired values for the probabilities α and β
• The proportion of the baseline (controls or non-exposed) population that is EXPOSED (for case-control studies) or DISEASED (for cohort/intervention studies); often based on previous studies or reports
• Magnitude of the expected effect (RR, OR); often based on previous studies or reports
• Minimum effect that the investigator considers worth detecting
• Formula: different formulae apply depending on study design, research question, and type of data
Sample size formula for analytic study using
continuous outcome

Cont…
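The formula on this slide did not survive extraction. As a hedged sketch, the code below uses one commonly taught form for comparing two means with equal group sizes and a common standard deviation, n per group = 2σ²(Z1−α/2 + Z1−β)² / Δ², which is an assumption and may differ from the slide's version. The function name and the inputs (σ = 15, Δ = 5) are hypothetical.

```python
import math

def sample_size_two_means(sigma, delta, z_alpha=1.96, z_beta=0.84):
    """Per-group n for comparing two means (assumed formula, equal groups)."""
    n = 2 * (sigma ** 2) * (z_alpha + z_beta) ** 2 / (delta ** 2)
    return math.ceil(n)  # round up to a whole subject per group

# Hypothetical inputs: common SD of 15, smallest difference worth detecting 5
print(sample_size_two_means(sigma=15, delta=5))
```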

Sample size formula for analytic study using proportion
Formula
– The sample size to be taken at random from each group can be determined by:

n = [p0(1 − p0) + p1(1 − p1)] × (Z1−α/2 + Z1−β)² / (p1 − p0)²

• p0 = proportion of controls or unexposed
• p1 = proportion of cases or exposed
• 1 − α = level of confidence, and Z1−α/2 is the corresponding value of Z from the standard normal distribution
• 1 − β = power, and Z1−β is the corresponding value of Z from the standard normal distribution
Example 2

n = (0.15 × 0.85 + 0.25 × 0.75)(1.96 + 0.84)² / (0.25 − 0.15)² ≈ 247

• Thus, 247 OC users and 247 non-users are needed for the study
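The Example 2 figure can be reproduced in code. The sketch assumes the per-group formula implied by the worked numbers, n = [p0(1 − p0) + p1(1 − p1)](Z1−α/2 + Z1−β)² / (p1 − p0)², with Z values 1.96 and 0.84; the function name is hypothetical.

```python
import math

def sample_size_two_proportions(p0, p1, z_alpha=1.96, z_beta=0.84):
    """Per-group n for comparing two proportions with equal group sizes."""
    variance_term = p0 * (1 - p0) + p1 * (1 - p1)
    n = variance_term * (z_alpha + z_beta) ** 2 / (p1 - p0) ** 2
    return math.ceil(n)  # round up to a whole subject per group

# Proportions used in Example 2
print(sample_size_two_proportions(p0=0.15, p1=0.25))
```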
Example 3
• Case-control study of oral contraceptive (OC) use in relation to risk of MI among women of childbearing age
• Previous studies:
– 15% of women with MI use OCs
– 10% of women without MI use OCs
• OR of MI associated with current OC use = 1.8
• Conventional α = 0.05 (two-sided)
• Conventional β = 0.20 (80% power to detect a difference if one truly exists)
• Assume equal sample sizes (n1 = n2)
• Answer: 409 cases and 409 controls
Unequal Sample Sizes

• The formula changes slightly according to the ratio we want between the cases and controls
• n1 is the sample size for the first group
• r × n1 is the sample size of the second group (where r is pre-specified, giving r times as many controls as cases)
• Very easy to do in Epi Info
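The slide does not show the adjusted formula. One way to sketch it, assuming the unpooled-variance form n1 = [p1(1 − p1) + p0(1 − p0)/r](Z1−α/2 + Z1−β)² / (p1 − p0)², which reduces to the equal-size formula when r = 1, is shown below. The function name is hypothetical, and the output may differ from Epi Info, which may use a different formula.

```python
import math

def sample_size_unequal(p0, p1, r=1.0, z_alpha=1.96, z_beta=0.84):
    """Return (n1, n2) where n2 = r * n1; p1 belongs to the first group."""
    variance_term = p1 * (1 - p1) + p0 * (1 - p0) / r
    n1 = math.ceil(variance_term * (z_alpha + z_beta) ** 2 / (p1 - p0) ** 2)
    n2 = math.ceil(r * n1)  # second group is r times the first
    return n1, n2

print(sample_size_unequal(0.15, 0.25, r=1))  # equal groups
print(sample_size_unequal(0.15, 0.25, r=2))  # two in the second group per one in the first
```

Raising r lowers n1 but increases the total number of subjects.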
Summary
• The sample size obtained from the formula should be adjusted for non-response, loss to follow-up, design effect and so on.
• The finite population correction formula can be used as needed
Finite Population correction

nf = ni / (1 + ni / N)

where
nf = final sample size
ni = sample size from the formula
N = size of the study population
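A small sketch of the correction, assuming the common form nf = ni / (1 + ni / N). The function name and the inputs (ni = 385 from a formula, N = 2000 in the study population) are hypothetical.

```python
import math

def finite_population_correction(ni, N):
    """Final sample size nf = ni / (1 + ni / N), rounded up."""
    return math.ceil(ni / (1 + ni / N))

# Hypothetical inputs: formula gives 385, study population is 2000
print(finite_population_correction(385, 2000))
# With a very large population the correction barely changes the result
print(finite_population_correction(385, 1_000_000))
```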
Sample size using EPI-INFO
• We can also determine sample size using statistical software
• Many software packages can be used
• In health research, EPI-INFO is the one most often used
• The procedure is simple in EPI-INFO 7
• We will use it for sample size determination
Exercise
Calculate the sample size for each of the following questions (α = 0.05, margin of error = 0.05):
1. Birth weight of children in Jimma town (SD = 5)
2. Birth weight of children in model kebele and non-model kebele, previous SD = 15; the difference is (SD = 3). NB: case-control study
3. Mortality by cancer in JUSH (P = 15%)
4. Diarrhea positive in model kebele and non-model kebele, previous P = 31%. NB: case-control study
5. What statistical and non-statistical criteria should be considered during sample size calculation?
