Chapter 4: Sampling Distribution and Sample Size

Chapter 4 discusses the concepts of population, sampling, and sample size determination in research. It outlines the importance of defining populations, the characteristics of samples, and various sampling methods, including probability and non-probability sampling techniques. The chapter also highlights the advantages and disadvantages of sampling, emphasizing the need for representative samples to ensure accurate generalizations about the population.

Chapter 4

Sampling Distribution and Sample Size Determination

Ahmed M. (Assistant Professor of Epidemiology and Biostatistics, PhD Candidate in Epidemiology)

03/26/2025 1
Population
• A group of individuals, objects, or items from among which samples are taken for measurement.
• Decisions must be made concerning the population or individual units (persons, households, etc.) to be investigated.
• The population under consideration should be clearly and explicitly defined in terms of place, time, and other relevant criteria.

Source Population
• This is the general population from which the study subjects are selected.
• Identify on whom to do the research.
• Consider to whom the study results will be inferred or generalized.

Study population
• These are the subjects eligible to be selected for the study, i.e., the people who would be enrolled in the study if they were selected.
• They are identified and selected from the source population using inclusion and/or exclusion criteria for being a study subject.

Sampling
• The population is too large to collect information from all members.
• Instead, we select a sample of individuals, hoping that the sample is representative of the population.
• Sampling: the selection of a number of study units from a defined study population.

Characteristics of sample

Sampling
• It is rarely feasible to collect all the information about a population or to study the characteristics of the entire population (finite or infinite), because of time, cost, and other constraints.
• Thus we need a sample.
• A sample is a finite subset of the individuals in a population; the number of individuals in a sample is called the sample size.

Sampling
• Sampling is the process involving the selection
of a finite number of elements from a given
population of interest, for purposes of inquiry.
• A main concern in sampling:
– Ensure that the sample represents the
population, and
– The findings can be generalized.

Sampling (continued)
• Inferences about the population are based on the information from the sample drawn from that population.
• However, because the characteristics of the population vary, scientific sample designs should be applied to select a representative sample.
• Sampling enables us to estimate the characteristics of a population by directly observing a portion of the population.

[Figure not recovered: population, sample, and the information obtained from the sample.]

Common terms used in sampling
• Population: the collection of all items of interest.
• Sample: a subset of the population.
• Sampling: the method by which we select a sample from the population.
• Reference population (or target population): the population of interest to which the researchers would like to generalize.

Common terms (continued)
• Sampling population: the actual group in which the study is conducted (the sample).
• Study population: the subset of the target population from which a sample will be drawn.
• Study unit: the unit (e.g., a person or household) on which information will be collected, i.e., on which measurement is done.
• Sampling frame: the list of all the sampling units in the source population, from which a random sample is to be drawn.

Common terms (continued)
• Sampling scheme (design): the method of selecting sampling units from the sampling frame, e.g., random or convenience sampling.
Reasons for Sampling
• For an exploratory purpose: to get a general impression of the total population.
• For the purpose of obtaining estimates of certain characteristics of the population.

Example: researchers are interested in factors associated with ART use among HIV/AIDS patients attending certain hospitals in a given region.
• Target population = all ART patients in the region
• Study population = all ART patients in, e.g., 3 hospitals in the region
• Sample = the patients selected from the study population

Advantages of sampling:
• Feasibility: sampling may be the only feasible method of collecting information.
• Reduced cost: sampling reduces demands on resources such as finance, personnel, and material.
• Greater accuracy: sampling may lead to more accurate data collection, because fewer units can be measured more carefully.
• Sampling error: precise allowance can be made for sampling error.
• Greater speed: data can be collected and summarized more quickly.
Disadvantages of sampling:
• There is always a sampling error.
• Sampling may create a feeling of discrimination within the population.
• Sampling may be inadvisable where every unit in the population is legally required to have a record.

Sampling Methods
Two broad divisions:
A. Probability sampling methods
B. Non-probability sampling methods

A. Probability sampling methods
• Involve random selection of a sample.
• Every sampling unit has a known, non-zero probability of selection into the sample.
• Involve the selection of a sample from a population based on chance.

Specifying the sampling method
• Probability sampling
– Every element in the target population or universe [sampling frame] has a known, non-zero probability of being chosen in the sample for the survey being conducted (an equal probability under simple random sampling).
– Scientific, operationally convenient and simple in theory.
– Results may be generalized.
• Non-probability sampling
– The probability that an element in the universe [sampling frame] is chosen in the sample is not known, and some elements may have no chance of being chosen.
– Operationally convenient and simple in theory.
– Results may not be generalized.

• Probability sampling is:
– more complex,
– more time-consuming, and
– usually more costly than non-probability sampling.
→ It is a technique you can use to maximize the external validity (generalizability) of the results of the study.
• However, because study samples are randomly selected and their probability of inclusion can be calculated,
– reliable estimates can be produced, and
– inferences can be made about the population.
• There are several different ways in which a probability sample can be selected.
• The method chosen depends on a number of factors, such as
– the available sampling frame,
– how spread out the population is, and
– how costly it is to survey members of the population.

When to use non-probability sampling
• A group that represents the target population already exists.
• It is difficult or impossible to obtain a list of names for sampling (e.g., homeless people, IV drug users).
• Not all of the cases of interest can be identified ahead of time.
• The population is rare.

Advantages of non-probability sampling
• Can be used when a sampling frame does not exist.
• Quick, inexpensive, and convenient.
• Good for pretests, pilot studies, and in-depth interviews.
• Can be used when precise representativeness is not necessary.

Disadvantages of non-probability sampling
• No random selection (the sample may be unrepresentative).
• Reliability cannot be measured.
• No way to measure the precision of the resulting sample.
• Inappropriate for generalizing findings.

Types of Sampling Methods
• Probability samples: simple random, systematic, stratified, cluster, multistage random sampling
• Non-probability samples: convenience, purposive/judgemental, quota, snowball

Simple Random Sampling
Simple random sampling is a method of probability sampling in which every unit has an equal, known, non-zero chance of being selected.

1. Simple random sampling
Each member of a population has an equal chance of being included in the sample.
• To use the SRS method:
– Make a numbered list of all the units in the population, i.e., a sampling frame (not always mandatory).
– Number each unit from 1 to N (where N is the size of the population).
– Select the required number of units at random.
• The randomness of the sample is ensured by:
– a "lottery" method,
– a table of random numbers, or
– a computer program.

Random number table
• It is a table of random numbers constructed by a process such that:
1. In any position in the table, each of the digits 0 through 9 has a probability of 1/10 of occurring.
2. The occurrence of any digit in one part of the table is independent of the occurrence of any digit in any other part of the table.
• SRS has certain limitations:
– It is difficult to apply if the reference population is dispersed.
– Minority subgroups of interest may not be selected.

Assumptions about the population
• Homogeneity with respect to the variable of interest
• Availability of a sampling frame

Simple random sampling
Example: evaluate the prevalence of tooth decay among 1200 children attending a school.
• List the children attending the school.
• Number the children from 1 to 1200.
• Sample size = 100 children.
• Randomly select 100 numbers between 1 and 1200.
How to randomly select?

How to Use Random Number Tables
1. Assign a unique number to each population element in the sampling frame. Start with serial number 1, or 01, or 001, etc., depending on the number of digits required.
2. Choose a random starting position.
3. Select serial numbers across rows or down columns.
4. Discard numbers that are not assigned to any population element and ignore numbers that have already been selected.
5. Repeat the selection process until the required number of sample elements is selected.

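The same selection can be done by computer instead of a printed table. The sketch below is only an illustration for the tooth decay example (N = 1200, n = 100); the function name `simple_random_sample` and the seed value are assumptions made here, not part of the original slides.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct serial numbers from 1..population_size."""
    rng = random.Random(seed)  # the seed only makes this illustration repeatable
    # random.sample draws without replacement, so no serial number is repeated
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Tooth decay example: select 100 of the 1200 numbered children
selected = simple_random_sample(1200, 100, seed=2025)
print(selected[:10])  # first ten selected serial numbers
```

Drawing without replacement plays the role of step 4 above: a serial number that has already been selected cannot be selected again.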
Generating random numbers using a computer: OpenEpi
[Screenshots not recovered: feeding the information into OpenEpi and the list of numbers generated.]

2. Systematic Random Sampling
• Systematic random sampling is a method of probability sampling in which the defined target population is ordered and the sample is selected according to position using a skip interval.

Steps in Drawing a Systematic Random Sample
1: Obtain a list of units that contains an acceptable frame of the target population.
2: Determine the number of units in the list (N) and the desired sample size (n).
3: Compute the skip interval k.
• The population is listed in a particular order, then every kth unit is selected.
– k is chosen so that N ≈ kn, i.e., k ≈ N/n.
4: Determine a random start point between 1 and k.
5: Beginning at the start point, select every kth unit, i.e., each unit that corresponds to the skip interval.

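The steps above can be sketched in a few lines. This is a minimal illustration, assuming a frame numbered 1 to N; the numbers N = 1200 and n = 100 and the function name `systematic_sample` are chosen here for illustration and do not come from the slides.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Select every k-th unit after a random start between 1 and k."""
    rng = random.Random(seed)
    k = population_size // sample_size   # skip interval, so that N is about k * n
    start = rng.randint(1, k)            # random start point, 1..k inclusive
    return [start + i * k for i in range(sample_size)]

# Illustrative numbers only: N = 1200 listed units, n = 100
units = systematic_sample(1200, 100, seed=1)
print(units[:5])
```

Because the start is at most k, the last selected position is at most k·n, which never exceeds N.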
• Example: a researcher wants to know the prevalence of malnutrition among under-5 children in woreda X.
[Figures not recovered: systematic random sampling.]

3. Stratified random sampling
• It is used when the population is known to be heterogeneous with regard to some factors, and those factors are used for stratification.
• In stratified sampling, the population is divided into homogeneous, mutually exclusive groups called strata.
• A population can be stratified by any variable that is available for all units prior to sampling (e.g., age, sex, province of residence, income).
• A separate sample is taken independently from each stratum.
• Any of the sampling methods mentioned in this section (and others that exist) can be used to sample within each stratum.

Stratified Random Sampling
• Stratified random sampling is a method of probability sampling in which the population is divided into different subgroups and samples are selected from each subgroup.

Steps in Drawing a Stratified Random Sample
1: Divide the target population into homogeneous subgroups or strata.
2: Decide which type of stratified sampling to use.
3: Distribute the total sample size among the strata.
4: Draw random samples from each stratum.
5: Combine the samples from each stratum into a single sample of the target population.

Stratified sampling
Stratified samples can be:
• Proportionate: sample elements are selected from each stratum such that the ratio of the sample elements from each stratum to the total sample size equals the ratio of the population elements within that stratum to the total number of population elements.
• Disproportionate: the sample is disproportionate when the above-mentioned ratios are unequal.

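A proportionate allocation can be computed directly from the stratum sizes. The sketch below is a hedged illustration: the stratum names and sizes are hypothetical, and the largest-remainder rounding rule is one reasonable choice made here, not something prescribed by the slides.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    # exact (possibly fractional) share of the sample for each stratum
    exact = {s: total_sample * size / population for s, size in stratum_sizes.items()}
    alloc = {s: int(share) for s, share in exact.items()}  # round down first
    # give the units lost to rounding to the strata with the largest remainders
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

# Hypothetical strata sizes, for illustration only
strata = {"urban": 6000, "rural": 3000, "pastoralist": 1000}
allocation = proportional_allocation(strata, 200)
print(allocation)  # {'urban': 120, 'rural': 60, 'pastoralist': 20}
```

Each stratum's share of the sample (120/200, 60/200, 20/200) equals its share of the population, which is the definition of a proportionate sample given above.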

Stratified sampling (continued)
• The population consists of L strata; stratum l contains n_l units.
• Take a simple random sample in every stratum.

Why do we need to create strata?
• It can make the sampling strategy more efficient.
• A larger sample is required to get a more accurate estimate if a characteristic varies greatly from one unit to another.
• For example, if every person in a population had the same salary, then a sample of one individual would be enough to get a precise estimate of the average salary.
• Grouping similar units into strata therefore reduces the variability within each stratum.
[Figure not recovered: stratified random sampling.]

4. Cluster sampling
• Sometimes it is too expensive to carry out SRS:
– The population may be large and scattered.
– A complete list of the study population may be unavailable.
– Travel costs can become high if interviewers have to survey people from one end of the country to the other.
• Cluster sampling is the method most widely used to reduce the cost.
• The clusters should be similar to one another, each ideally as varied internally as the population itself; in stratified sampling, by contrast, the strata differ from one another but are internally homogeneous.

Cluster sampling
• Principle
– The whole population is divided into groups, e.g., woreda, kebele, or got.
– A random sample of these groups ("clusters") is taken.
– Within the selected clusters, all units (e.g., households) are included.

Steps in cluster sampling
• Cluster sampling divides the population into groups
or clusters.
• A number of clusters are selected randomly to
represent the total population, and then all units
within selected clusters are included in the sample.
• No units from non-selected clusters are included in
the sample—they are represented by those from
selected clusters.
• This differs from stratified sampling, where some
units are selected from each group.
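The procedure can be sketched as follows. This is an illustration only, assuming the school example with hypothetical data (5 sections of 40 students each); the function name `cluster_sample` is introduced here and is not from the slides.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters at random and include every unit of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random sample of cluster labels
    sample = [unit for label in chosen for unit in clusters[label]]
    return chosen, sample

# Hypothetical school: 5 sections (clusters) of 40 students each
sections = {f"Section {i}": [f"S{i}-{j:02d}" for j in range(1, 41)] for i in range(1, 6)}
chosen, students = cluster_sample(sections, 2, seed=7)
print(chosen, len(students))
```

No student from a non-selected section enters the sample, which is the key difference from stratified sampling.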

Example: Cluster sampling
[Figure not recovered: a population divided into five clusters, Section 1 to Section 5.]

Example
• In a school-based study, we assume that students of the same school are homogeneous, so that any section resembles the others.
• We can randomly select sections and include all students of the selected sections only.
[Figure not recovered: cluster sampling.]

Cluster sampling
• Advantages
– Simple, as a complete list of sampling units within the population is not required.
– Less travel and fewer resources are required.
• Disadvantages
– A potential problem is that members of a cluster are more likely to be alike than members of different clusters (homogeneity within clusters).
– This "dependence" needs to be taken into account in the sample size and in the analysis ("design effect").

Difference Between Cluster and Stratified Sampling
• Stratified sampling: population of L strata, stratum l contains n_l units; take a simple random sample in every stratum.
• Cluster sampling: population of C clusters; take an SRS of clusters and sample every unit in the chosen clusters.

5. Multi-stage sampling
• In a very large and diverse population, sampling may be done in two or more stages.
• It is carried out in phases and usually involves more than one sampling method.
• The design effect should be considered.
E.g., in a study of diarrheal disease in a district: District → Kebele → Village

Multi-stage sampling
• Similar to cluster sampling, except that it involves picking a sample from within each chosen cluster, rather than including all units in the cluster.
• This type of sampling requires at least two stages.
• The primary sampling unit (PSU) is the sampling unit in the first sampling stage.
• The secondary sampling unit (SSU) is the sampling unit in the second sampling stage, etc.

• Woreda: primary sampling unit (PSU)
• Kebele: secondary sampling unit (SSU)
• Sub-kebele: tertiary sampling unit (TSU)
• Household (HH)

• In the first stage, large groups or clusters are identified and selected. These clusters contain more population units than are needed for the final sample.
• In the second stage, population units are picked from within the selected clusters (using any of the possible probability sampling methods) for a final sample.
• If more than two stages are used, the process of choosing population units within clusters continues until there is a final sample.
• With multi-stage sampling, you still have the benefit of a more concentrated sample for cost reduction.
• However, the sample is not as concentrated as in single-stage cluster sampling, and the required sample size is still larger than for a simple random sample.
• Also, you do not need to have a list of all of the units in the population. All you need is a list of clusters and a list of the units in the selected clusters.
• Admittedly, more information is needed in this type of sample than in cluster sampling.
• However, multi-stage sampling still saves a great amount of time and effort by not having to create a list of all the units in a population.

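A two-stage design can be sketched as below. This is a minimal illustration under stated assumptions: the kebele and household labels are hypothetical, simple random sampling is used at both stages, and the function name `two_stage_sample` is introduced here for illustration.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: SRS of clusters (PSUs). Stage 2: SRS of units within each chosen cluster."""
    rng = random.Random(seed)
    psus = rng.sample(sorted(clusters), n_clusters)   # stage 1: select clusters
    sample = {}
    for psu in psus:
        units = clusters[psu]                         # unit list needed only for chosen clusters
        take = min(n_per_cluster, len(units))
        sample[psu] = rng.sample(units, take)         # stage 2: select units within the cluster
    return sample

# Hypothetical frame: 10 kebeles with 200 households each
kebeles = {f"Kebele {i}": [f"HH-{i}-{j}" for j in range(1, 201)] for i in range(1, 11)}
households = two_stage_sample(kebeles, 3, 20, seed=11)
print({k: len(v) for k, v in households.items()})
```

Note that household lists are consulted only for the three selected kebeles, which mirrors the point that a full list of all units in the population is not required.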
B. Non-probability sampling
• In non-probability sampling, every item has an unknown chance of being selected.
• In non-probability sampling, there is an assumption that there is an even distribution of a characteristic of interest within the population.
• This is what makes the researcher believe that any sample would be representative and that, because of that, results will be accurate.

• For probability sampling, randomization is a feature of the selection process, rather than an assumption about the structure of the population.
• In non-probability sampling, since elements are chosen arbitrarily, there is no way to estimate the probability of any one element being included in the sample.
• Also, no assurance is given that each item has a chance of being included, making it impossible either to estimate sampling variability or to identify possible bias.

• Reliability cannot be measured in non-probability sampling; the only way to address data quality is to compare some of the survey results with available information about the population.
• Still, there is no assurance that the estimates will meet an acceptable level of error.
• Researchers are reluctant to use these methods because there is no way to measure the precision of the resulting sample.

• Despite these drawbacks, non-probability sampling methods can be useful when descriptive comments about the sample itself are desired.
• Secondly, they are quick, inexpensive and convenient.
• There are also other circumstances, such as some research settings, in which it is unfeasible or impractical to conduct probability sampling.

• Non-probability sampling procedures are not valid for obtaining a sample that is truly representative of a larger population.
• Almost always, non-probability samples tend to over-select some population elements and under-select others.
• When the probabilities of selection are not known, there is no precise way to adjust for such distortions.

The most common types of non-probability sampling
1. Convenience or haphazard sampling
2. Volunteer sampling
3. Judgment sampling
4. Quota sampling
5. Snowball sampling technique

1. Convenience or haphazard sampling
• Convenience sampling is sometimes referred to as
haphazard or accidental sampling.
• It is not normally representative of the target
population because sample units are only selected if
they can be accessed easily and conveniently.
• The obvious advantage is that the method is easy to
use, but that advantage is greatly offset by the
presence of bias.
• Although useful applications of the technique are
limited, it can deliver accurate results when the
population is homogeneous.

Convenience or haphazard sampling (continued)
• Selection of subjects based on easy availability and accessibility.
• Often used in face-to-face interviews.
• Advantage: very easy to carry out.
• Disadvantages:
– Difficult to draw any meaningful conclusion.
– May not be representative.

2. Volunteer sampling
• As the term implies, this type of sampling occurs when
people volunteer to be involved in the study.
• In psychological experiments or pharmaceutical trials
(drug testing), for example, it would be difficult and
unethical to enlist random participants from the general
public.
• In these instances, the sample is taken from a group of
volunteers.
• Sometimes, the researcher offers payment to attract
respondents.
• Introduces strong self-selection bias.

Volunteer sampling (continued)
• In exchange, the volunteers accept the possibility of a
lengthy, demanding or sometimes unpleasant process.
• Sampling voluntary participants as opposed to the
general population may introduce strong biases.
• Often in opinion polling, only the people who care
strongly enough about the subject tend to respond.
• The silent majority does not typically respond, resulting
in large selection bias.
3. Purposive/Judgemental sampling
• The researchers choose the sample based on who they think would be appropriate for the study.
• Primarily used when there is a limited number of people who have expertise in the area being researched.
• Appropriate when the study subjects are difficult to locate.
• Judgment sampling is subject to the researcher's biases.
• One advantage of judgment sampling is the reduced cost.

[Figure not recovered: judgment sampling.]

4. Quota sampling
• This is one of the most common forms of non-probability sampling.
• Sampling is done until a specific number of units (quotas) for various sub-populations have been selected.
• The main argument against quota sampling is that it does not meet the basic requirement of randomness.
• Some units may have no chance of selection or the chance of selection may be unknown.
• Therefore, the sample may be biased.

• Quota sampling is generally less expensive than random sampling.
• It is also easy to administer, especially considering that the tasks of listing the whole population, randomly selecting the sample and following up on non-respondents can be omitted from the procedure.
• Quota sampling is an effective sampling method when information is urgently required, and it can be conducted without sampling frames.
• In many cases where the population has no suitable frame, quota sampling may be the only appropriate sampling method.
[Figure not recovered: quota sampling.]

5. Snowball sampling
• A technique for selecting a research sample where
existing study subjects recruit future subjects from among
their friends.
• Thus the sample group appears to grow like a rolling
snowball.
• This sampling technique is often used in hidden
populations which are difficult for researchers to access;
example populations would be drug users or commercial
sex workers.
Snowball sampling
• Because sample members are not selected from a sampling frame, snowball samples are subject to numerous biases. For example, people who have many friends are more likely to be recruited into the sample.
• Involves a process of "chain referrals".
• Suitable for locating key informants.
• You start with one or two key informants and ask them if they know persons who know a lot about your topic of interest.
• Used when trying to interview hard-to-reach groups.

Sample Size Determination
• An essential part of planning any study is to decide how many people need to be studied.

• Sample size: the number of study subjects selected to represent a given study population.
• It is important for making inferences based on the findings from the sample.
• It should be sufficient to represent the characteristics of interest of the study population.
• In estimating a certain characteristic of a population, sample size calculations are important to ensure that estimates are obtained with the required precision or confidence.
• The required accuracy of the results determines the size of the sample.

• In studies concerned with detecting an effect (e.g., a difference between two groups), sample size calculations are important to ensure that the study can detect an association if one exists.
• If the sample is too small, then even if large differences are observed, it will be impossible to show that these are due to anything more than sampling variation.

Sample size determination depends on the:
– Objective of the study
– Design of the study (descriptive or analytic)
– Accuracy of the measurements to be made
– Degree of precision required for generalization
– Plan for statistical analysis
– Degree of confidence with which to conclude

• Common question: "How many subjects should I study?"
– Too small a sample = waste of time and resources; results have no practical use.
– Too large a sample = waste of resources; data quality may be compromised.

• When deciding on sample size, there is a trade-off between precision and cost:
– Larger sample size = higher precision = higher cost

• The feasible sample size is also determined
by the availability of resources:
– time
– manpower
– transport
– available facility, and
– money

1. Sample Size: Single Sample
• The aim is to have a large enough sample with which to estimate a population mean or proportion within a narrow interval with high reliability.
• Concerned with the precision of the estimate ("narrowness of the CI"): estimate ± d units.
Sample size for a single sample includes:
A. Sample size for estimating a single population mean
B. Sample size for estimating a single population proportion

A. Sample size for estimating a single population mean
• Aim: estimate µ
• Want: estimate $\bar{x} \pm d$ units, where d = margin of error = absolute precision = half of the width (w) of the CI
Steps:
1. Specify d (or w = 2d)
2. Use known $\sigma^2$ or estimate it using $s^2$
3. Solve $d = z_{\alpha/2}\,\sigma/\sqrt{n}$ for n, where $\sigma/\sqrt{n}$ is the standard error of the estimator of the parameter of interest:
$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{d^{2}}$$
(d is written as e in some textbooks.)

Sample size for single population mean
A population of cancer patients has a standard deviation of survival time of 43.3 months. If one wants to conduct a sample survey on this population, how large a sample is needed so that 95% of the means of samples of that size will be within 6 months of the population mean? The population size is 480 patients.
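One way to work this example is sketched below. The finite population correction used here, n = n0 / (1 + (n0 − 1)/N), is a commonly used form and is an assumption of this sketch, since the slide that states the deck's own correction formula was not recovered; the function name `sample_size_mean` is likewise introduced for illustration.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, d, confidence=0.95, population_size=None):
    """Sample size to estimate a mean within +/- d at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    n0 = (z * sigma / d) ** 2                            # large-population formula
    if population_size is not None:
        # finite population correction (assumed form): n = n0 / (1 + (n0 - 1) / N)
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)                                 # always round up

n_uncorrected = sample_size_mean(sigma=43.3, d=6)
n_corrected = sample_size_mean(sigma=43.3, d=6, population_size=480)
print(n_uncorrected, n_corrected)  # 201 142
```

Without the correction about 201 patients would be needed; because the population has only 480 patients, the corrected requirement drops to about 142.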

Example:
1. Find the minimum sample size needed to estimate the drop in heart rate (µ) for a new study using a higher dose of propranolol than the standard one. We require that the two-sided 95% CI for µ be no wider than 5 beats per minute (so d = 2.5), and the sample SD for change in heart rate equals 10 beats per minute.
$n = (1.96)^2 (10)^2 / (2.5)^2 = 61.5$, rounded up to 62 patients
2. Suppose that for a certain group of cancer patients, we are interested in estimating the mean age at diagnosis. We would like a 95% CI that is 5 years wide. If the population SD is 12 years, how large should our sample be?

• Suppose d = 1.
• Then the sample size increases.
3. A hospital director wishes to estimate the mean weight of babies born in the hospital. How large a sample of birth records should be taken if she/he wants a 95% CI that is 0.5 wide? Assume that a reasonable estimate of σ is 2.
• Ans: 246 birth records.
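The three worked examples can be checked with a short calculation. This is a sketch that applies n = z²σ²/d² with z = 1.96 and rounds up; it assumes that "Suppose d = 1" refers to example 2, which the slides do not state explicitly, and the function and variable names are chosen here for illustration.

```python
import math

def sample_size_mean(sigma, d, z=1.96):
    """n = z^2 * sigma^2 / d^2, rounded up; z = 1.96 corresponds to 95% confidence."""
    return math.ceil((z * sigma / d) ** 2)

heart_rate = sample_size_mean(sigma=10, d=2.5)     # example 1: CI width 5, so d = 2.5
age_at_dx = sample_size_mean(sigma=12, d=2.5)      # example 2: CI width 5, so d = 2.5
age_at_dx_d1 = sample_size_mean(sigma=12, d=1)     # example 2 with d = 1 (assumed reading)
birth_weight = sample_size_mean(sigma=2, d=0.25)   # example 3: CI width 0.5, so d = 0.25
print(heart_rate, age_at_dx, age_at_dx_d1, birth_weight)  # 62 89 554 246
```

Tightening d from 2.5 to 1 in example 2 raises the requirement from 89 to 554, which illustrates why the sample size increases as the margin of error shrinks.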
Sample size for single population proportion
• In a survey of school children to determine the proportion immunized against polio, an investigator specifies that the maximum discrepancy between the sample and population proportions of immunized children should be 0.04 and wishes to be 99% certain that the discrepancy is within this limit. How large a sample is needed if he/she has no previous knowledge of the population proportion?
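A possible worked solution is sketched below, assuming the usual formula n = z²p(1 − p)/d² with p = 0.5 because nothing is known about the proportion. The function name `sample_size_proportion` is introduced here; the z value is taken from the standard normal distribution rather than from a rounded table.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, d, confidence=0.95):
    """n = z^2 * p * (1 - p) / d^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.576 for 99%
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

n_polio = sample_size_proportion(p=0.5, d=0.04, confidence=0.99)
print(n_polio)  # 1037
```

With the rounded table value z = 2.58 the same formula gives 1041, so answers in the range 1037 to 1041 reflect only the rounding of z.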

B. Sample size to estimate a single population proportion
• Aim: estimate p
• Want: estimate $\hat{p} \pm d$ units, where d = Z·SE (a 95% CI of width 2d)
Steps:
1. Specify d (or w = 2d)
2. Use an estimate of p (use p = 0.5 if no information is available)
3. Solve for n
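Step 3 can be made explicit. Assuming the usual normal approximation for the sample proportion (an assumption not stated on the slide), the standard error is $\sqrt{p(1-p)/n}$, so setting d = Z·SE and solving for n gives:

```latex
d = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}
\quad\Longrightarrow\quad
n = \frac{z_{\alpha/2}^{2}\,p(1-p)}{d^{2}}
```

For 95% confidence $z_{\alpha/2} = 1.96$, and the result is rounded up to the next whole number.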

1. Suppose that you are interested in the proportion of infants who were breastfed beyond 18 months of age in a rural area. Suppose that in a similar area, the proportion (p) of breastfed infants was found to be 0.20. What sample size is required to estimate the true proportion within ±3 percentage points with 95% confidence? Let p = 0.20, d = 0.03, α = 5%.

• Suppose there is no prior information about
the proportion (p) who breastfeed
• Assume p=q=0.5 (most conservative)
• Then the required sample size increases
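The breastfeeding example can be worked both ways for comparison. This sketch assumes the formula n = z²p(1 − p)/d² with z = 1.96 and rounding up; the function and variable names are illustrative.

```python
import math

def sample_size_proportion(p, d, z=1.96):
    """n = z^2 * p * (1 - p) / d^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

n_with_prior = sample_size_proportion(p=0.20, d=0.03)   # prior estimate p = 0.20
n_no_prior = sample_size_proportion(p=0.50, d=0.03)     # no prior information, p = q = 0.5
print(n_with_prior, n_no_prior)  # 683 1068
```

Using p = 0.5 maximizes p(1 − p), so it yields the largest, most conservative sample size for a given d.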

• An estimate of p is not always available.
• However, the formula may also be used for sample size calculation based on various assumptions for the values of p (here with d = 0.05 and 95% confidence):
P = 0.1 → $n = (1.96)^2(0.1)(0.9)/(0.05)^2 = 138$
P = 0.2 → $n = (1.96)^2(0.2)(0.8)/(0.05)^2 = 246$
P = 0.3 → $n = (1.96)^2(0.3)(0.7)/(0.05)^2 = 323$
P = 0.5 → $n = (1.96)^2(0.5)(0.5)/(0.05)^2 = 384$
P = 0.7 → $n = (1.96)^2(0.7)(0.3)/(0.05)^2 = 323$
P = 0.8 → $n = (1.96)^2(0.8)(0.2)/(0.05)^2 = 246$
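The table can be reproduced with a short loop. The sketch below uses z = 1.96 and d = 0.05 as in the table; it reports both the value rounded to the nearest integer, which matches the figures above, and the value rounded up, which is the more cautious convention. The names `raw_n` and `table` are introduced here for illustration.

```python
import math

Z, D = 1.96, 0.05   # 95% confidence and d = 0.05, as in the table above

def raw_n(p):
    """Unrounded sample size z^2 * p * (1 - p) / d^2."""
    return Z ** 2 * p * (1 - p) / D ** 2

# For each p: (rounded to nearest, rounded up)
table = {p: (round(raw_n(p)), math.ceil(raw_n(p))) for p in (0.1, 0.2, 0.3, 0.5, 0.7, 0.8)}
for p, (nearest, rounded_up) in table.items():
    print(f"p = {p:.1f}: nearest = {nearest}, rounded up = {rounded_up}")
```

The two conventions differ only at p = 0.1 (138 versus 139) and p = 0.5 (384 versus 385); the values are symmetric around p = 0.5.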

Some Considerations
However, the population variance $\sigma^2$ is unknown most of the time. As a result, it has to be estimated from:
• A pilot or preliminary sample: select a pilot sample and estimate $\sigma^2$ with the sample variance, $s^2$
• Previous or similar studies

• For a fixed absolute precision (d), the required sample size increases as P increases from 0 to 0.5, and then decreases in the same way as P approaches 1.

2. A survey is planned to determine what proportion of medical students have regularly chewed khat. If no estimate of p is available and a pilot sample cannot be drawn, what sample size would be required if 95% confidence is desired and d = 0.04 is to be used?
• Ans: about 600 students ($n = (1.96)^2(0.5)(0.5)/(0.04)^2 = 600.25$, i.e., 601 if rounded up)

Example 2
A nursing graduate student wants to do her thesis work on the title "Assessment of the outcome of pregnancy among women who visited SSHYRH JJ university hospital gynecology and obstetrics ward for the year 2020". What sample size should she take for this study?

Sample size using statistical software
• As an alternative method, we can use the EPI INFO statistical software to calculate the sample size required for the study.
• Let us assume that the target population for the study has size N = 100,000.
• Sample size determination for an epidemiological study design:
• The proportion of the variable of interest is not known, which means no previous study has been done; hence we decide to use 50 percent as an estimate of the prevalence of that variable.
• The steps that we need to follow to get the required sample size using EPI INFO are given below.

Steps to compute sample size
• First make sure you have installed EPI INFO.
• If your computer has the software, go to the Start menu and open it.
• This is the window that you get when you open the software:
[Screenshots not recovered: the EPI INFO start page and the subsequent sample size screens.]
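As a rough cross-check that does not depend on the software, the same scenario (N = 100,000, expected prevalence 50%) can be computed by hand. The sketch below assumes a 5% margin of error and 95% confidence, neither of which is stated on the slides, and applies a commonly used finite population correction; the function name is introduced here for illustration, and the result is not a reproduction of the software's output.

```python
import math

def sample_size_proportion_fpc(p, d, population_size, z=1.96):
    """Proportion sample size with a finite population correction (assumed form)."""
    n0 = z ** 2 * p * (1 - p) / d ** 2             # large-population sample size
    n = n0 / (1 + (n0 - 1) / population_size)      # correction for finite N
    return math.ceil(n)

# Assumed inputs: d = 0.05 and 95% confidence are illustrative choices
n_required = sample_size_proportion_fpc(p=0.5, d=0.05, population_size=100_000)
print(n_required)  # 383
```

With a population this large the correction barely matters: the uncorrected value is about 384, and the corrected one is 383.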
Thank you!
