Sample Size Calculation
Lecture Notes

Lecturer: Winnie Onsongo (PhD)
Department of Statistics and Actuarial Science
University of Ghana
[email protected]

Sample size for estimating Ȳ and YT
The surveyor has to decide the sample size required to estimate the population parameters with a desired level of precision.

The aim is to obtain the size n of a simple random sample subject to the condition that the difference between the sample mean ȳ and the population mean Ȳ satisfies |ȳ − Ȳ| ≤ d with probability at least (1 − α).

Here d is the margin of error.

Consider ȳ as an estimator of Ȳ and suppose that the
sample observations are normally distributed.

Then ȳ ∼ N(E(ȳ), Var(ȳ)), i.e.
\[ \bar{y} \sim N\!\left(\bar{Y},\ \left(1-\frac{n}{N}\right)\frac{s^{2}}{n}\right). \]

We determine n such that P(|ȳ − Ȳ| ≤ d) ≥ 1 − α.

This implies that
\[ P\left(\frac{-d}{\sqrt{\mathrm{Var}(\bar{y})}} \le \frac{\bar{y}-\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} \le \frac{d}{\sqrt{\mathrm{Var}(\bar{y})}}\right) \ge 1-\alpha, \]
or equivalently P(−Z_{α/2} ≤ Z ≤ Z_{α/2}) ≥ 1 − α, where
\[ Z_{\alpha/2} = \frac{d}{\sqrt{\mathrm{Var}(\bar{y})}} = \frac{d}{\sqrt{\left(1-\frac{n}{N}\right)\frac{s^{2}}{n}}}. \]

Consequently,
\[ n = \frac{s^{2}}{\dfrac{d^{2}}{Z_{\alpha/2}^{2}} + \dfrac{s^{2}}{N}} \quad\text{or}\quad n \approx \frac{1}{\dfrac{d^{2}}{s^{2} Z_{\alpha/2}^{2}} + \dfrac{1}{N}}. \]
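
The step from the expression for Z_{α/2} to the formula for n is short algebra that the slides skip. A sketch of it, using only the symbols already defined (d, s², n, N and Z_{α/2}):

```latex
\begin{align*}
Z_{\alpha/2}^{2}
  &= \frac{d^{2}}{\left(1-\frac{n}{N}\right)\frac{s^{2}}{n}}
   = \frac{d^{2}}{\left(\frac{1}{n}-\frac{1}{N}\right)s^{2}} \\[4pt]
\frac{1}{n}-\frac{1}{N}
  &= \frac{d^{2}}{Z_{\alpha/2}^{2}\,s^{2}} \\[4pt]
n &= \frac{1}{\dfrac{d^{2}}{Z_{\alpha/2}^{2}\,s^{2}}+\dfrac{1}{N}}
   = \frac{s^{2}}{\dfrac{d^{2}}{Z_{\alpha/2}^{2}}+\dfrac{s^{2}}{N}}
\end{align*}
```

The last equality multiplies numerator and denominator by s², which shows that the two forms of the formula are the same quantity.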

The weak point of this expression is that it requires an estimate of the population variance, which is not known.

Similarly, for estimating the population total,
\[ n \approx \frac{1}{\dfrac{d^{2}}{N^{2} s^{2} Z_{\alpha/2}^{2}} + \dfrac{1}{N}}. \]
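
The two sample size formulas can be evaluated directly. The following is a minimal Python sketch, not part of the original notes: the function names and the input values (N = 8000, a guessed variance s² = 25, margins d = 1 for the mean and d = 8000 for the total) are hypothetical, and the result is rounded up so that the precision target is still met.

```python
import math
from statistics import NormalDist


def sample_size_mean(d, s2, N, alpha=0.05):
    """Approximate SRSWOR sample size for estimating the mean within margin d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, upper alpha/2 point of N(0, 1)
    n = 1.0 / (d**2 / (s2 * z**2) + 1.0 / N)
    return math.ceil(n)  # round up to a whole number of units


def sample_size_total(d, s2, N, alpha=0.05):
    """Approximate SRSWOR sample size for estimating the total within margin d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n = 1.0 / (d**2 / (N**2 * s2 * z**2) + 1.0 / N)
    return math.ceil(n)


# Hypothetical inputs: s2 is a prior guess at the population variance
n_mean = sample_size_mean(d=1.0, s2=25.0, N=8000)
n_total = sample_size_total(d=8000.0, s2=25.0, N=8000)
print(n_mean, n_total)
```

A margin of d on the mean corresponds to a margin of N·d on the total, so the two calls above ask for the same precision and return the same n.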
Confidence Intervals for Ȳ and YT
Having selected a sample and used the observed sample data to estimate population parameters, it is desirable to assess the quality of our estimators.

This is mostly done by constructing confidence bands/intervals within which we are sufficiently sure that the population parameter lies, thereby placing a bound on the probable error of the estimate.

The procedure of determining confidence intervals utilizes
the data to determine an interval with the property that
the interval has a high probability of containing the true
population value.

Let I represent the confidence interval for the population mean Ȳ.

Choosing some small number α as the allowed error probability, we require P(Ȳ ∈ I) = 1 − α.

The interval I is a random quantity, since its endpoints vary from sample to sample.

The quantity (1 − α) is referred to as the confidence coefficient; common choices of α are 0.01, 0.05 and 0.1.

For instance, α = 0.05 implies that for 95% of the samples that could be drawn, the interval covers the true value of the population parameter.

Construction of the confidence bands is based on a normal approximation for the distribution of the sample estimates under simple random sampling.

The approximate 100(1 − α)% confidence interval for the population total is
\[ \hat{Y}_T \pm t\,\sqrt{N(N-n)\,\frac{s^{2}}{n}}. \]
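
A small Python sketch of the interval for the total may help fix ideas. It rests on two assumptions that the notes do not spell out here: that the estimator is Ŷ_T = N·ȳ, and that t is a tabulated multiplier supplied by the user (the value 2.365 below is the two-sided 5% point of the t distribution with 7 degrees of freedom). The function name and the sample data are hypothetical.

```python
import math
from statistics import mean, variance


def total_confidence_interval(sample, N, t):
    """Approximate confidence interval for the population total under SRSWOR."""
    n = len(sample)
    y_total_hat = N * mean(sample)  # assumed estimator of the total: N * ybar
    s2 = variance(sample)           # sample variance with divisor n - 1
    half_width = t * math.sqrt(N * (N - n) * s2 / n)
    return y_total_hat - half_width, y_total_hat + half_width


# Hypothetical sample of n = 8 values from a population of N = 100 units
sample = [12, 15, 11, 14, 13, 16, 12, 15]
lower, upper = total_confidence_interval(sample, N=100, t=2.365)
print(round(lower, 1), round(upper, 1))
```

The interval is centred on the estimated total, and its width shrinks as n approaches N because of the factor (N − n).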

In cases where the observations y1, y2, ..., yn are not normally distributed, the approximate confidence bands depend on the approximate normal distribution of the sample mean ȳ.

Finite Population and Central Limit Theorem
The same cannot be said about sampling without replacement, because selecting a unit in the first draw eliminates the unit from the selection pool, so it cannot be obtained in subsequent draws.

For a population of size N in the sequence, let Ȳ_N be the population mean and ȳ_N be the sample mean of a random sample selected from the population.

According to the finite population central limit theorem,
\[ \frac{\bar{y}_N - \bar{Y}_N}{\sqrt{\mathrm{Var}(\bar{y}_N)}} \sim N(0, 1) \]
as both n and N − n become larger and larger.

This result also holds with the estimated variance $\widehat{\mathrm{Var}}(\bar{y}_N)$ of a simple random sample of size n.
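
The theorem can be checked informally by simulation. The sketch below is an illustration under stated assumptions, not part of the notes: the population is a hypothetical skewed one (N = 2000 exponential values), samples of n = 100 are drawn without replacement, and the share of standardised sample means falling within ±1.96 is compared with the 95% that a standard normal distribution would give.

```python
import math
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed finite population
N, n, reps = 2000, 100, 2000
population = [random.expovariate(1.0) for _ in range(N)]

Y_bar = mean(population)
S2 = pvariance(population) * N / (N - 1)   # finite-population variance, divisor N - 1
sd_ybar = math.sqrt((1 - n / N) * S2 / n)  # true sd of the sample mean under SRSWOR

# random.sample draws without replacement; standardise each sample mean
z_values = [
    (mean(random.sample(population, n)) - Y_bar) / sd_ybar
    for _ in range(reps)
]

coverage = sum(abs(z) <= 1.96 for z in z_values) / reps
print(round(coverage, 3))
```

If the normal approximation is adequate for these sizes, the printed share should be close to 0.95 even though the population itself is far from normal.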
Stratified Random Sampling
Proportional Allocation of Sample size
Under proportional allocation, the sizes of the samples from the different strata are kept proportional to the sizes of the strata.

Variability and cost are considered the same across strata.

The purpose of sampling is to estimate the population value of some characteristic.

Illustration: Suppose a sample of size n = 30 is to be drawn from a population of size N = 8000.

Suppose the population is divided into three strata of sizes N1 = 4000, N2 = 2400 and N3 = 1600.

Adopting proportional allocation, with stratum weight W_i = N_i / N:
\[ n_1 = n \times W_1 = 30 \times \frac{4000}{8000} = 15, \qquad n_2 = 30 \times \frac{2400}{8000} = 9, \qquad n_3 = 30 \times \frac{1600}{8000} = 6. \]
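
The illustration can be reproduced in a few lines of Python. This is a minimal sketch with a hypothetical function name; it returns the unrounded allocations, so in cases where n × W_i is not a whole number the rounding rule is left to the user.

```python
def proportional_allocation(n, stratum_sizes):
    """Allocate a total sample of size n in proportion to the stratum sizes N_i."""
    N = sum(stratum_sizes)
    return [n * N_i / N for N_i in stratum_sizes]


# Values from the illustration: n = 30, N1 = 4000, N2 = 2400, N3 = 1600
allocation = proportional_allocation(30, [4000, 2400, 1600])
print(allocation)  # [15.0, 9.0, 6.0]
```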
Disproportional Allocation
It may be considered reasonable to take larger samples from the more variable strata and smaller samples from the less variable strata.

We can then account for both differences in stratum size and differences in stratum variability by using a disproportionate sampling design.

Disproportional Allocation, An illustration
We determine the sample size for the different strata using:
\[ n_i = \frac{n N_i s_i}{s_1 N_1 + s_2 N_2 + \cdots + s_h N_h}, \qquad i = 1, 2, \ldots, h. \]
Here n is the total sample size, N_i is the size of stratum i and s_i is its standard deviation.

Solution:

▷ Sample size for the stratum with N1 = 5000:
\[ n_1 = \frac{84(5000)(15)}{5000(15) + 2000(18) + 3000(5)} = 50 \]

▷ Sample size for the stratum with N2 = 2000:
\[ n_2 = \frac{84(2000)(18)}{5000(15) + 2000(18) + 3000(5)} = 24 \]
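
The same arithmetic can be sketched in Python. The inputs below (n = 84, stratum sizes 5000, 2000, 3000 and standard deviations 15, 18, 5) are read off the fractions in the solution; the function name is hypothetical. This rule, with allocation proportional to N_i s_i, is often called Neyman allocation.

```python
def disproportional_allocation(n, stratum_sizes, stratum_sds):
    """Allocate n across strata in proportion to N_i * s_i."""
    weights = [N_i * s_i for N_i, s_i in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Values taken from the worked solution
allocation = disproportional_allocation(84, [5000, 2000, 3000], [15, 18, 5])
print(allocation)  # [50.0, 24.0, 10.0]
```

The first two values match the solution. The third, 10 for the stratum with N3 = 3000, is not shown in the solution but follows from the same formula and brings the total to 84.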
Cost Optimal Disproportionate Sampling Design
In addition to differences in stratum size and differences in stratum variability, we may have differences in stratum sampling cost.

In that case we have a cost optimal disproportionate sampling design.

The formula for determining the sample sizes for the different strata is:
\[ n_i = \frac{n N_i s_i / \sqrt{C_i}}{s_1 N_1 / \sqrt{C_1} + s_2 N_2 / \sqrt{C_2} + \cdots + s_h N_h / \sqrt{C_h}}, \qquad i = 1, 2, \ldots, h, \]
where C_i is the cost of sampling one unit from stratum i.
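
A short Python sketch of the cost optimal rule follows. It assumes that C_i is the per-unit sampling cost in stratum i; the function name and the cost values (4, 9 and 1) are hypothetical, while the stratum sizes and standard deviations are reused from the earlier disproportional illustration.

```python
import math


def cost_optimal_allocation(n, stratum_sizes, stratum_sds, unit_costs):
    """Allocate n across strata in proportion to N_i * s_i / sqrt(C_i)."""
    weights = [
        N_i * s_i / math.sqrt(C_i)
        for N_i, s_i, C_i in zip(stratum_sizes, stratum_sds, unit_costs)
    ]
    total = sum(weights)
    return [n * w / total for w in weights]


sizes, sds = [5000, 2000, 3000], [15, 18, 5]
costs = [4, 9, 1]  # hypothetical per-unit costs for the three strata

allocation = cost_optimal_allocation(84, sizes, sds, costs)
print([round(n_i, 2) for n_i in allocation])

# With equal costs the rule reduces to the allocation proportional to N_i * s_i
equal_cost = cost_optimal_allocation(84, sizes, sds, [1, 1, 1])
print(equal_cost)
```

Compared with the equal-cost case, strata that are expensive to sample receive fewer units and cheap strata receive more, while the allocations still sum to n.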
THANK YOU