Stratified Sampling Notes
Stratified Sampling
In stratified sampling, the population of $N$ units is first divided into $k$ subpopulations,
or strata, of $N_1, N_2, \dots, N_k$ units respectively. The strata are chosen so that the sampling
units within each stratum are as homogeneous as possible and the strata are as heterogeneous from
each other as possible. The strata do not overlap, and together they comprise the whole population,
so that
\[ N_1 + N_2 + \dots + N_k = \sum_{i=1}^{k} N_i = N. \]
Sampling is done independently from each stratum and estimators from each of the strata are
pooled with suitable weights to obtain an estimator of the population mean ȲN . If simple random
sampling (SRS) is used to select a sample from each stratum, the method is referred to as stratified
random sampling.
The sizes of the samples selected from the strata are denoted by $n_1, n_2, \dots, n_k$ respectively.
The total sample size is
\[ n = \sum_{i=1}^{k} n_i. \]
Stratification is used for the following reasons:
1. If data of known precision are required for certain subdivisions of the population, it is
advisable to treat each subdivision as a population in its own right.
2. Stratification creates convenience for organization of field work requiring supervision since it
minimizes travel costs if it is done according to administrative places, e.g. districts in the
case of Lesotho.
3. Different sampling methods may be adopted for selecting a sample from different strata. For
example, people living in institutions such as hotels, hospitals, prisons and cattle posts are
often placed in a different stratum from people living in ordinary homes, so a different sampling
method can be appropriate for each of the two groups.
5. Stratification may produce a gain in precision in the estimates of characteristics of the whole
population.
• If each stratum is homogeneous, in the sense that measurements vary little from one
unit to another, a precise estimate of any stratum mean can be obtained from a small
sample in that stratum.
• The estimates from strata can be combined into a precise estimate for the whole popu-
lation.
6. In case of extreme values in the population, such values may be segregated to form a separate
stratum.
Notation
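Let $Y_{ij}$ be the value of the $j$th unit in the $i$th stratum,
$N$ be the population size,
$N_i$ be the size of the $i$th stratum,
$Y_{N_i}$ be the total of the $i$th stratum,
$\bar{Y}_N$ or $\bar{Y}$ be the population mean,
$w_i = N_i/N$ be the weight of the $i$th stratum (known whenever the stratum sizes are known).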
Suppose there are $k$ strata. The units in the population can then be presented as follows, with
one column per stratum:
Stratum   1              ...   h              ...   k
          $Y_{11}$       ...   $Y_{h1}$       ...   $Y_{k1}$
          ...                  ...                  ...
          $Y_{1j}$       ...   $Y_{hj}$       ...   $Y_{kj}$
          ...                  ...                  ...
          $Y_{1N_1}$     ...   $Y_{hN_h}$     ...   $Y_{kN_k}$
Table 2: Notation for the ith Stratum
Stratum         1                 ...   h                 ...   k
Stratum Size    $N_1$             ...   $N_h$             ...   $N_k$
Stratum Total   $Y_{N_1}$         ...   $Y_{N_h}$         ...   $Y_{N_k}$
Stratum Mean    $\bar{Y}_{N_1}$   ...   $\bar{Y}_{N_h}$   ...   $\bar{Y}_{N_k}$
The mean of the $i$th stratum, $\bar{Y}_{N_i}$, is the mean of all units in the $i$th stratum:
\[ \bar{Y}_{N_i} = \frac{1}{N_i} \sum_{j=1}^{N_i} Y_{ij} \]
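The population mean is given as
\[ \bar{Y}_N = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{N_i} Y_{ij}, \]
which can be presented as
\[ \bar{Y}_N = \sum_{i=1}^{k} \frac{N_i}{N} \bar{Y}_{N_i} = \sum_{i=1}^{k} w_i \bar{Y}_{N_i} \]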
If a sample of size $n_i$ is drawn from the $i$th stratum of size $N_i$ by SRS, the sample
estimator for the $i$th stratum is the sample mean
\[ \bar{y}_{n_i} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}, \]
which is an unbiased estimator of the $i$th stratum mean $\bar{Y}_{N_i}$, i.e. $E(\bar{y}_{n_i}) = \bar{Y}_{N_i}$.
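If in every stratum the sample estimator $\bar{y}_{n_i}$ is an unbiased estimator of the $i$th
stratum mean $\bar{Y}_{N_i}$, then the stratified estimator
\[ \bar{y}_{st} = \sum_{i=1}^{k} w_i \bar{y}_{n_i} \]
is an unbiased estimator of the population mean $\bar{Y}_N$.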
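Proof
\begin{align*}
E(\bar{y}_{st}) &= \sum_{i=1}^{k} w_i E(\bar{y}_{n_i}) \\
&= \sum_{i=1}^{k} w_i \bar{Y}_{N_i} \\
&= \sum_{i=1}^{k} \frac{N_i}{N} \bar{Y}_{N_i} \\
&= \bar{Y}_N
\end{align*}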
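The unbiasedness result can be checked numerically on a tiny population. The following Python sketch is only an illustration under stated assumptions: the three strata, the sample sizes and the function names `stratified_mean` and `expected_stratified_mean` are hypothetical and do not come from the notes. It enumerates every possible stratified simple random sample (each is equally likely) and averages the resulting values of $\bar{y}_{st} = \sum_i w_i \bar{y}_{n_i}$ using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, product


def stratified_mean(samples, stratum_sizes):
    """Return y_st = sum_i w_i * ybar_i with w_i = N_i / N (exact fractions)."""
    N = sum(stratum_sizes)
    return sum(
        Fraction(N_i, N) * Fraction(sum(y), len(y))
        for y, N_i in zip(samples, stratum_sizes)
    )


def expected_stratified_mean(strata, sample_sizes):
    """Average y_st over all possible stratified SRS selections."""
    stratum_sizes = [len(s) for s in strata]
    # All SRS samples (without replacement) of size n_i within each stratum.
    per_stratum = [
        list(combinations(s, n_i)) for s, n_i in zip(strata, sample_sizes)
    ]
    # Strata are sampled independently, so combine the choices freely.
    estimates = [
        stratified_mean(choice, stratum_sizes)
        for choice in product(*per_stratum)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical population: k = 3 strata with N_1 = 3, N_2 = 2, N_3 = 4.
strata = [[2, 4, 6], [10, 12], [20, 22, 24, 26]]
sample_sizes = [2, 1, 2]  # assumed n_1, n_2, n_3

population_mean = Fraction(
    sum(sum(s) for s in strata), sum(len(s) for s in strata)
)
print(expected_stratified_mean(strata, sample_sizes), population_mean)
```

Full enumeration is feasible only for very small populations; it is used here because it gives the expectation exactly rather than through simulation.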
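In the proof, the second equality holds because the sample estimators $\bar{y}_{n_i}$ are unbiased
in the individual strata, i.e. $E(\bar{y}_{n_i}) = \bar{Y}_{N_i}$, and the last equality holds
because the population mean may be written as
\begin{align*}
\bar{Y}_N &= \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{N_i} Y_{ij} \\
&= \sum_{i=1}^{k} \frac{N_i}{N} \bar{Y}_{N_i} \\
&= \sum_{i=1}^{k} w_i \bar{Y}_{N_i}
\end{align*}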
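Since sampling is done independently in each stratum, the covariance terms vanish and the variance
of $\bar{y}_{st}$ is
\begin{align*}
\mathrm{Var}(\bar{y}_{st}) &= \mathrm{Var}\Big[ \sum_{i=1}^{k} w_i \bar{y}_{n_i} \Big] \\
&= \sum_{i=1}^{k} w_i^2 \, \mathrm{Var}(\bar{y}_{n_i})
\end{align*}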
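When the samples are selected from the strata by SRS,
\[ \mathrm{Var}(\bar{y}_{st}) = \sum_{i=1}^{k} w_i^2 \left( \frac{1}{n_i} - \frac{1}{N_i} \right) S_i^2, \]
where
\[ S_i^2 = \frac{1}{N_i - 1} \sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_{N_i})^2 \]
is the variance of the $i$th stratum, and the sample variance $s_i^2$ of the $i$th stratum
satisfies $E(s_i^2) = S_i^2$.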
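In practice the $S_i^2$ are unknown, and since $E(s_i^2) = S_i^2$ a natural estimate of the variance replaces each $S_i^2$ by the sample variance $s_i^2$. The Python sketch below illustrates this under stated assumptions: the function name `estimated_variance_of_stratified_mean` and the data are hypothetical, and every stratum sample is assumed to contain at least two observations so that $s_i^2$ is defined.

```python
from statistics import variance


def estimated_variance_of_stratified_mean(samples, stratum_sizes):
    """Estimate Var(y_st) = sum_i w_i^2 (1/n_i - 1/N_i) S_i^2,
    substituting the sample variance s_i^2 for each unknown S_i^2."""
    N = sum(stratum_sizes)
    total = 0.0
    for y, N_i in zip(samples, stratum_sizes):
        n_i = len(y)        # sample size in this stratum
        w_i = N_i / N       # stratum weight
        s2_i = variance(y)  # sample variance, divisor n_i - 1
        total += w_i ** 2 * (1 / n_i - 1 / N_i) * s2_i
    return total


# Hypothetical samples from three strata of sizes 50, 30 and 20.
samples = [[10, 12, 11], [20, 22], [40, 38, 42, 40]]
stratum_sizes = [50, 30, 20]
print(estimated_variance_of_stratified_mean(samples, stratum_sizes))
```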
In stratified sampling the values of the sample sizes ni in the respective strata are determined
by the sampler.
They may be selected to minimize the variance of the estimator (Var(ȳst )) for a specified cost of
studying the sample or to minimize the cost for a specified value of Var(ȳst ).
Allocation of Total Sample to the ith Stratum
Given the total sample size $n$, we may choose how to allocate it among the $k$ strata. Two types
of allocation can be used, namely proportional allocation and optimum (Neyman) allocation.
Proportional Allocation
Under proportional allocation the size of the sample that is selected from the ith stratum ni is in
the same proportion to the total sample size n as the size of the ith stratum Ni is to the population
size N
i.e. $\frac{n_i}{n} = \frac{N_i}{N}$.
For instance, if a stratum contains 15% of the population elements, the sample from this stratum
will consist of 15% of the total sample size $n$.
In such cases $n_i = \frac{N_i}{N} n = w_i n$, where $w_i = \frac{N_i}{N}$ and $N w_i = N_i$.
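Proportional allocation is a one-line computation. The following Python sketch is a minimal illustration; the function name `proportional_allocation` and the numbers are hypothetical. It returns the exact values $n_i = n N_i / N$, which may be fractional, and how to round them to whole units is left to the sampler.

```python
def proportional_allocation(n, stratum_sizes):
    """Return n_i = n * N_i / N for each stratum (possibly fractional)."""
    N = sum(stratum_sizes)
    return [n * N_i / N for N_i in stratum_sizes]


# Hypothetical population of N = 1000 units in which the first stratum
# holds 15% of the units, so it receives 15% of a sample of n = 40.
allocation = proportional_allocation(40, [150, 850])
print(allocation)
```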
Optimum/Neyman Allocation
In optimum allocation a given total sample size n should be allocated among the k strata such that
the stratified sampling estimator ȳst will have the smallest possible variance.
The problem is to determine n1 , n2 , . . . , nk such that we minimize
\[ \mathrm{Var}(\bar{y}_{st}) = \sum_{i=1}^{k} w_i^2 \left( \frac{1}{n_i} - \frac{1}{N_i} \right) S_i^2 \]
subject to the constraint that the stratum sample sizes add up to the total sample size $n$:
\[ n_1 + n_2 + \dots + n_k = \sum_{i=1}^{k} n_i = n \]
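Optimum allocation estimates the population mean or population total with the lowest variance
for a fixed total sample size $n$. In this case
\[ n_i = \frac{n \, w_i S_i}{\sum_{h=1}^{k} w_h S_h}, \]
and the resulting minimum variance is given by
\[ \mathrm{Var}(\bar{y}_{st})_{opt} = \frac{1}{n} \Big( \sum_{i=1}^{k} w_i S_i \Big)^2 - \frac{1}{N} \sum_{i=1}^{k} w_i S_i^2 \]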
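The two formulas for optimum allocation can be checked against the general variance formula. The Python sketch below is an illustration under stated assumptions: the function names and the numbers are hypothetical, the stratum standard deviations $S_i$ are treated as known (in practice they would have to be estimated), and the allocation is left fractional and is not capped at $N_i$.

```python
def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Return n_i = n * w_i * S_i / sum_h(w_h * S_h) for each stratum."""
    N = sum(stratum_sizes)
    products = [(N_i / N) * S_i for N_i, S_i in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [n * p / total for p in products]


def neyman_variance(n, stratum_sizes, stratum_sds):
    """Return (1/n) (sum w_i S_i)^2 - (1/N) sum w_i S_i^2."""
    N = sum(stratum_sizes)
    weights = [N_i / N for N_i in stratum_sizes]
    first = sum(w * S for w, S in zip(weights, stratum_sds)) ** 2 / n
    second = sum(w * S ** 2 for w, S in zip(weights, stratum_sds)) / N
    return first - second


def stratified_variance(sample_sizes, stratum_sizes, stratum_sds):
    """General formula: sum_i w_i^2 (1/n_i - 1/N_i) S_i^2."""
    N = sum(stratum_sizes)
    return sum(
        (N_i / N) ** 2 * (1 / n_i - 1 / N_i) * S_i ** 2
        for n_i, N_i, S_i in zip(sample_sizes, stratum_sizes, stratum_sds)
    )


# Hypothetical strata: sizes 500, 300, 200 with standard deviations 2, 4, 10.
sizes, sds, n = [500, 300, 200], [2, 4, 10], 100
allocation = neyman_allocation(n, sizes, sds)
print(allocation)
print(neyman_variance(n, sizes, sds), stratified_variance(allocation, sizes, sds))
```

The two printed variances should agree up to floating-point rounding, since substituting the optimum $n_i$ into the general formula gives the minimum-variance expression.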
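The population variance $S^2$ can be split into a within-stratum part and a between-stratum part:
\begin{align*}
S^2 &= \frac{1}{N-1} \sum_{i=1}^{k} \sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_N)^2 \\
&= \frac{1}{N-1} \sum_{i=1}^{k} \sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_{N_i} + \bar{Y}_{N_i} - \bar{Y}_N)^2 \\
&= \frac{1}{N-1} \sum_{i=1}^{k} \sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_{N_i})^2 + \frac{1}{N-1} \sum_{i=1}^{k} N_i (\bar{Y}_{N_i} - \bar{Y}_N)^2,
\end{align*}
where the cross-product term vanishes because $\sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_{N_i}) = 0$ in every stratum.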
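We have
\begin{align*}
(N-1) S^2 &= \sum_{i=1}^{k} \sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_N)^2 \\
&= \sum_{i} \sum_{j} (Y_{ij} - \bar{Y}_{N_i})^2 + \sum_{i} N_i (\bar{Y}_{N_i} - \bar{Y}_N)^2
\end{align*}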
which can be written as
\[ (N-1) S^2 = \sum_{i} (N_i - 1) S_i^2 + \sum_{i} N_i (\bar{Y}_{N_i} - \bar{Y}_N)^2 \]
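The decomposition of $(N-1)S^2$ into within-stratum and between-stratum sums of squares can be verified numerically. The Python sketch below is only an illustration: the population and the function name `sum_of_squares_decomposition` are hypothetical.

```python
from statistics import mean


def sum_of_squares_decomposition(strata):
    """Return (total, within, between) sums of squares for a stratified
    population, where total = (N - 1) S^2, within = sum_i (N_i - 1) S_i^2
    and between = sum_i N_i (Ybar_i - Ybar)^2."""
    all_units = [y for stratum in strata for y in stratum]
    grand_mean = mean(all_units)
    total = sum((y - grand_mean) ** 2 for y in all_units)
    within = sum(
        sum((y - mean(stratum)) ** 2 for y in stratum) for stratum in strata
    )
    between = sum(
        len(stratum) * (mean(stratum) - grand_mean) ** 2 for stratum in strata
    )
    return total, within, between


# Hypothetical population with three strata.
strata = [[2, 4, 6], [10, 12], [20, 22, 24, 26]]
total, within, between = sum_of_squares_decomposition(strata)
print(total, within, between)
```

For this population the between-stratum part dominates, which is the situation in which stratification is most useful.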
Dividing the decomposition of $(N-1)S^2$ by $N$ we get
\[ \frac{N-1}{N} S^2 = \sum_{i} \frac{N_i - 1}{N} S_i^2 + \sum_{i} \frac{N_i}{N} (\bar{Y}_{N_i} - \bar{Y}_N)^2 \]
If the $N_i$ are large (as is likely in practice), then $N_i - 1 \approx N_i$ and $N - 1 \approx N$, so that
\[ S^2 \approx \sum_{i} w_i S_i^2 + \sum_{i} w_i (\bar{Y}_{N_i} - \bar{Y}_N)^2 \]
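Multiplying both sides by $\left( \frac{1}{n} - \frac{1}{N} \right)$ we get
\[ \left( \frac{1}{n} - \frac{1}{N} \right) S^2 \approx \left( \frac{1}{n} - \frac{1}{N} \right) \sum_{i} w_i S_i^2 + \left( \frac{1}{n} - \frac{1}{N} \right) \sum_{i} w_i (\bar{Y}_{N_i} - \bar{Y}_N)^2 \]
The left-hand side is the variance of the SRS estimator $\bar{y}_n$. Under proportional allocation
$n_i = n w_i$ and $N_i = N w_i$, so $w_i^2 \left( \frac{1}{n_i} - \frac{1}{N_i} \right) = w_i \left( \frac{1}{n} - \frac{1}{N} \right)$
and the first term on the right-hand side is $\mathrm{Var}(\bar{y}_{st})$ under proportional
allocation. The second term is a sum of squares and is therefore nonnegative.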
The variance of the SRS estimator $\bar{y}_n$ is therefore at least as large as that of the
stratified estimator based on proportional allocation with the same sample size $n$. The difference
between the two variances may be small, but for large $N_i$ it is always nonnegative.
Proportional stratified sampling is thus at least as efficient as SRS and should be preferred to
SRS wherever it is feasible, i.e. wherever the stratum sizes $N_i$ are known.
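The comparison can be illustrated by evaluating both variance formulas on a small population. The Python sketch below rests on stated assumptions: the population, the sample size and the function names `srs_variance` and `proportional_variance` are hypothetical, and the proportional sample sizes $n_i = n w_i$ are left fractional because only the formulas are being compared.

```python
from statistics import variance


def srs_variance(population, n):
    """Variance of the SRS sample mean: (1/n - 1/N) S^2."""
    N = len(population)
    return (1 / n - 1 / N) * variance(population)


def proportional_variance(strata, n):
    """Variance of y_st under proportional allocation, n_i = n * N_i / N,
    computed from the general formula sum_i w_i^2 (1/n_i - 1/N_i) S_i^2."""
    N = sum(len(stratum) for stratum in strata)
    total = 0.0
    for stratum in strata:
        N_i = len(stratum)
        w_i = N_i / N
        n_i = n * w_i  # fractional sample size, kept for illustration
        total += w_i ** 2 * (1 / n_i - 1 / N_i) * variance(stratum)
    return total


# Hypothetical population: homogeneous strata with very different means.
strata = [[2, 4, 6], [10, 12], [20, 22, 24, 26]]
population = [y for stratum in strata for y in stratum]
n = 3

var_srs = srs_variance(population, n)
var_prop = proportional_variance(strata, n)
print(var_srs, var_prop)
```

Here the strata are internally homogeneous but have very different means, so the gain from stratification is large; with strata whose means are nearly equal the two variances would be close.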