Sampling Theory
MODULE VII
LECTURE - 23
VARYING PROBABILITY SAMPLING
DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
The simple random sampling scheme provides a random sample in which every unit in the population has an equal probability of selection. Under certain circumstances, more efficient estimators are obtained by assigning unequal probabilities of selection to the units in the population. This type of sampling is known as the varying probability sampling scheme.
If y is the variable under study and x is an auxiliary variable related to y, then in the most commonly used varying probability scheme the units are selected with probability proportional to the value of x, called the size. This is termed probability proportional to a given measure of size (pps) sampling. If the sampling units vary considerably in size, SRS does not take into account the possible importance of the larger units in the population. A large unit, i.e., a unit with a large value of y, contributes more to the population total than smaller units, so it is natural to expect that a selection scheme which assigns a higher probability of inclusion to larger units than to smaller units would provide more efficient estimators than those based on an equal probability scheme. This is accomplished through pps sampling.

Note that the size considered is the value of the auxiliary variable x and not the value of the study variable y. For example, in an agricultural survey the yield depends on the area under cultivation: bigger areas are likely to yield more and will contribute more towards the population total, so the value of the area can be considered as the size, i.e., the auxiliary variable. The cultivated area in a previous period can also be taken as the size while estimating the yield of a crop. Similarly, in an industrial survey, the number of workers in a factory can be considered as the measure of size while estimating its output.
Difference between the methods of SRS and the varying probability scheme:
In SRS, the probability of drawing a specified unit at any given draw is the same. In the varying probability scheme, the probability of drawing a specified unit differs from unit to unit.

It may appear that pps sampling gives biased estimators, since the larger units are over-represented and the smaller units are under-represented in the sample. This indeed happens for the sample mean as an estimator of the population mean, where all the units are given equal weight. If, instead of giving equal weights to all the units, the sample observations are suitably weighted at the estimation stage by taking the probabilities of selection into account, unbiased estimators can be obtained.

In pps sampling, there are two possibilities for drawing the sample, i.e., with replacement (WR) and without replacement (WOR). In the WR case, the probability of selection of a unit does not change from draw to draw, and the probability of selecting a specified unit is the same at every draw. PPS WOR is more complex than PPS WR. We consider both cases separately.
PPS sampling with replacement (WR):
First we discuss two methods of drawing a sample with PPS and WR.

1. Cumulative total method:
In the selection of a sample with varying probabilities, the procedure is to associate with each unit a set of consecutive natural numbers, the size of the set being proportional to the desired probability, and then to select those n units whose serial numbers correspond to a set of n numbers drawn at random between 1 and the grand total. If X_1, X_2,…, X_N are positive integers proportional to the probabilities assigned to the N units in the population, then a possible way is to associate with the units their cumulative totals; the units are then selected according to where the chosen random numbers fall among the cumulative totals.
Units   Size    Cumulative size
1       X_1     T_1 = X_1
2       X_2     T_2 = X_1 + X_2
⋮        ⋮       ⋮
i       X_i     T_i = X_1 + X_2 + … + X_i = Σ_{j=1}^{i} X_j
⋮        ⋮       ⋮
N       X_N     T_N = X_1 + X_2 + … + X_N = Σ_{j=1}^{N} X_j

Select a random number R between 1 and T_N by using a random number table, and select the unit i if T_{i−1} < R ≤ T_i, i = 1, 2,…, N; this selects unit i with probability proportional to its size. Repeating the procedure n times gives a sample of size n.
In this case, the probability of selection of the ith unit is

P_i = (T_i − T_{i−1})/T_N = X_i/T_N

⇒ P_i ∝ X_i.
Drawback:
This procedure involves writing down the successive cumulative totals. This is time consuming and tedious if the
number of units in the population is large.
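The cumulative total method is simple to express in code. The following sketch is our own illustration (the function name and the use of Python's standard random module are choices made here, not part of the notes):

import random
from itertools import accumulate

def pps_wr_cumulative(sizes, n, rng=random):
    # Draw a PPS-with-replacement sample of n units by the cumulative
    # total method. `sizes` holds the positive integer sizes X_1..X_N;
    # returns the 0-based indices of the selected units.
    totals = list(accumulate(sizes))          # T_1, T_2, ..., T_N
    T_N = totals[-1]
    sample = []
    for _ in range(n):
        r = rng.randint(1, T_N)               # random number between 1 and T_N
        # select unit i such that T_{i-1} < r <= T_i
        i = next(k for k, t in enumerate(totals) if r <= t)
        sample.append(i)
    return sample

For the worker data used in the example below, pps_wr_cumulative([2, 5, 10, 4, 7, 12, 3, 14, 11, 6], 2) returns two unit indices drawn with probabilities proportional to size.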
Lahiri's method:
Let M = max_{i=1,2,…,N} X_i, i.e., the maximum of the sizes of the N units in the population, or some convenient number greater than this maximum. The selection procedure is:
1. Select a pair of random numbers (i, j) such that 1 ≤ i ≤ N, 1 ≤ j ≤ M.
2. If j ≤ X_i, then the ith unit is selected; otherwise the pair is rejected and another pair of random numbers is chosen.
3. To get a sample of size n, this procedure is repeated till n units are selected.
Now we see how this method ensures that the probabilities of selection of the units are varying and are proportional to size.

Probability of selection of unit i at a trial = P(1 ≤ i ≤ N) · P(j ≤ X_i | i)
= (1/N)(X_i/M) = P_i*, say.

Probability that no unit is selected at a trial = (1/N) Σ_{i=1}^N (1 − X_i/M)
= (1/N)(N − N X̄/M)
= 1 − X̄/M = Q, say,

where X̄ = (1/N) Σ_{i=1}^N X_i is the average size.

Probability that unit i is selected at a given draw (all previous trials at that draw having resulted in the non-selection of any unit)
= P_i* + Q P_i* + Q² P_i* + …
= P_i*/(1 − Q)
= (X_i/(NM))/(X̄/M)
= X_i/(N X̄) ∝ X_i.

Thus the probability of selection of unit i is proportional to the size X_i. So this method generates a pps sample.
Advantages:
1. It does not require writing down all the cumulative totals for each unit.
2. The sizes of all the units need not be known beforehand. We need only a number greater than the maximum size and the sizes of those units which are actually drawn by the choice of the random numbers.

Disadvantages:
It may result in a wastage of time and effort, since pairs of random numbers can be rejected. On average one unit is selected per 1/(1 − Q) = M/X̄ trials, so the number of rejected pairs grows when the average size X̄ is small relative to M.
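Lahiri's method is equally short to sketch in code (again an illustration in our own notation, not code from the notes):

import random

def pps_wr_lahiri(sizes, n, rng=random):
    # Draw a PPS-with-replacement sample of n units by Lahiri's method.
    # M is taken as max(sizes); returns 0-based indices of selected units.
    N, M = len(sizes), max(sizes)
    sample = []
    while len(sample) < n:
        i = rng.randint(1, N)                 # 1 <= i <= N
        j = rng.randint(1, M)                 # 1 <= j <= M
        if j <= sizes[i - 1]:                 # accept unit i
            sample.append(i - 1)
        # otherwise the pair (i, j) is rejected and a new pair is drawn
    return sample

No cumulative totals are needed; the price is the rejected pairs, which occur at each trial with the probability Q derived above.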
Example: Consider the following data set on the number of workers in 10 factories and their outputs. The number of workers is the size X and the output is the study variable Y.

Unit   Workers (X)   Output (Y)   Cumulative size
1      2             …            T_1 = 2
2      5             60           T_2 = 2 + 5 = 7
3      10            12           T_3 = 7 + 10 = 17
4      4             6            T_4 = 17 + 4 = 21
5      7             8            T_5 = 21 + 7 = 28
6      12            13           T_6 = 28 + 12 = 40
7      3             4            T_7 = 40 + 3 = 43
8      14            17           T_8 = 43 + 14 = 57
9      11            13           T_9 = 57 + 11 = 68
10     6             8            T_10 = 68 + 6 = 74
Selection of sample using the cumulative total method:
1. First draw: Draw a random number between 1 and 74.
   - Suppose it is 23.
   - T_4 < 23 ≤ T_5
   - Unit 5 is selected and Y_5 = 8 enters the sample.
2. Second draw: Draw a random number between 1 and 74.
   - Suppose it is 38.
   - T_5 < 38 ≤ T_6
   - Unit 6 is selected and Y_6 = 13 enters the sample.

Selection of sample using Lahiri's method:
Here N = 10 and M = max X_i = 14, so we need to select a pair of random numbers (i, j) such that 1 ≤ i ≤ 10, 1 ≤ j ≤ 14.
The following table layout shows how the sample is obtained by Lahiri's scheme:

Random number (1 ≤ i ≤ 10)   Random number (1 ≤ j ≤ 14)   Observation   Selection of unit
Sampling Theory
MODULE VII
LECTURE - 24
VARYING PROBABILITY SAMPLING
DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
Varying probability scheme with replacement:
Let
Y_i : value of the study variable y for the ith unit of the population, i = 1, 2,…, N,
P_i : probability of selection of the ith unit at any draw, with Σ_{i=1}^N P_i = 1.
Define
z_i = y_i/(N P_i), i = 1, 2,…, N.

Under the varying probability scheme with replacement, for a sample of size n,

z̄ = (1/n) Σ_{i=1}^n z_i

is an unbiased estimator of the population mean Ȳ, the variance of z̄ is

Var(z̄) = σ_z²/n, where σ_z² = Σ_{i=1}^N P_i (Y_i/(N P_i) − Ȳ)²,

and an unbiased estimator of the variance of z̄ is

s_z²/n = 1/(n(n−1)) Σ_{i=1}^n (z_i − z̄)².
Proof:
E(z̄) = (1/n) Σ_{i=1}^n E(y_i/(N P_i))
= (1/n) Σ_{i=1}^n [ (Y_1/(N P_1)) P_1 + (Y_2/(N P_2)) P_2 + … + (Y_N/(N P_N)) P_N ]
= (1/n) Σ_{i=1}^n Ȳ
= Ȳ.
The variance of z̄ is
Var(z̄) = (1/n²) Var( Σ_{i=1}^n z_i )
= (1/n²) Σ_{i=1}^n Var(z_i)   (the z_i's are independent in the WR case)
= (1/n²) Σ_{i=1}^n E[z_i − E(z_i)]²
= (1/n²) Σ_{i=1}^n E(z_i − Ȳ)²
= (1/n²) Σ_{i=1}^n [ (Y_1/(N P_1) − Ȳ)² P_1 + (Y_2/(N P_2) − Ȳ)² P_2 + … + (Y_N/(N P_N) − Ȳ)² P_N ]
= (1/n) Σ_{i=1}^N P_i (Y_i/(N P_i) − Ȳ)²
= σ_z²/n.
To show that s_z²/n is an unbiased estimator of the variance of z̄, consider
(n − 1) E(s_z²) = E[ Σ_{i=1}^n (z_i − z̄)² ]
= E[ Σ_{i=1}^n z_i² − n z̄² ]
= Σ_{i=1}^n E(z_i²) − n E(z̄²)
= Σ_{i=1}^n [ Var(z_i) + {E(z_i)}² ] − n [ Var(z̄) + {E(z̄)}² ]
= Σ_{i=1}^n (σ_z² + Ȳ²) − n (σ_z²/n + Ȳ²)   (using Var(z_i) = Σ_{j=1}^N P_j (Y_j/(N P_j) − Ȳ)² = σ_z²)
= (n − 1) σ_z²,
so E(s_z²) = σ_z², or
E(s_z²/n) = σ_z²/n = Var(z̄)
⇒ V̂ar(z̄) = s_z²/n = 1/(n(n−1)) [ Σ_{i=1}^n (y_i/(N P_i))² − n z̄² ].
Note: If P_i = 1/N, then z̄ = ȳ and
Var(z̄) = (1/n) Σ_{i=1}^N (1/N) (Y_i/(N·(1/N)) − Ȳ)² = (1/(nN)) Σ_{i=1}^N (Y_i − Ȳ)² = σ_y²/n,
which is the same as in the case of SRSWR.
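The unbiasedness of z̄ is easy to check by simulation. The sketch below is illustrative only (it assumes the pps_wr_cumulative function given earlier and uses hypothetical toy data):

import random

def ppswr_mean_estimate(y, P, sample):
    # z-bar = (1/n) * sum of y_i / (N * P_i) over the sampled units
    N, n = len(y), len(sample)
    return sum(y[i] / (N * P[i]) for i in sample) / n

rng = random.Random(1)
X = [2, 5, 10, 4, 7, 12, 3, 14, 11, 6]            # sizes (toy values)
y = [1.5 * x + rng.uniform(-1, 1) for x in X]     # study variable (toy values)
P = [x / sum(X) for x in X]                       # P_i proportional to size
Y_bar = sum(y) / len(y)

estimates = [ppswr_mean_estimate(y, P, pps_wr_cumulative(X, 3, rng))
             for _ in range(20000)]
print(Y_bar, sum(estimates) / len(estimates))     # the two should be close

Multiplying z̄ by N gives the corresponding estimate of the population total discussed next.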
Estimation of population total:
An unbiased estimator of the population total is
Ŷ_tot = (1/n) Σ_{i=1}^n y_i/P_i = N z̄.
Taking expectation, we get
E(Ŷ_tot) = (1/n) Σ_{i=1}^n [ (Y_1/P_1) P_1 + (Y_2/P_2) P_2 + … + (Y_N/P_N) P_N ] = Σ_{i=1}^N Y_i = Y_tot.
The variance of Ŷ_tot is
Var(Ŷ_tot) = N² Var(z̄) = (1/n) Σ_{i=1}^N P_i (Y_i/P_i − Y_tot)²
= (1/n) [ Σ_{i=1}^N Y_i²/P_i − Y_tot² ],
and an estimate of the variance of Ŷ_tot is
V̂ar(Ŷ_tot) = N² s_z²/n.
Varying probability scheme without replacement
In the varying probability scheme without replacement, when the initial probabilities of selection are unequal, the probability of drawing a specified unit of the population at a given draw changes with the draw. Generally, sampling WOR provides a more efficient estimator than sampling WR, but the estimators of the population mean and of their variances are more complicated. So this scheme is not commonly used in practice, especially in large scale sample surveys with small sampling fractions.

Let P_i denote the initial probability of selection of unit U_i at the first draw, i = 1, 2,…, N, with Σ_{i=1}^N P_i = 1, and let P_i(r) denote the probability of selecting U_i at the rth draw, so that P_i(1) = P_i.
Consider
P_i(2) = probability of selection of U_i at the 2nd draw.
The first draw may produce any unit j ≠ i, after which U_i is selected from the remaining units, so
P_i(2) = P_1 · P_i/(1 − P_1) + P_2 · P_i/(1 − P_2) + … + P_{i−1} · P_i/(1 − P_{i−1}) + P_{i+1} · P_i/(1 − P_{i+1}) + … + P_N · P_i/(1 − P_N)
= P_i Σ_{j(≠i)=1}^N P_j/(1 − P_j)
= P_i [ Σ_{j=1}^N P_j/(1 − P_j) − P_i/(1 − P_i) ].

P_i(2) ≠ P_i(1) for all i unless P_i = 1/N.
P_i(2) will, in general, be different for each i = 1, 2,…, N. So E(y_i/P_i) will change with successive draws, and this makes the varying probability scheme WOR more complex. Only y_1/(N P_1) provides an unbiased estimator of Ȳ; in general, y_i/(N P_i) (i ≠ 1) will not provide an unbiased estimator of Ȳ.
Ordered estimates
To overcome the difficulty of a changing expectation at each draw, we associate with each draw a new variate whose expectation equals the population value of the variate under study. Such estimators take into account the order of the draws and are called ordered estimates; the values obtained at the previous draws enter the estimator in a way that preserves the unbiasedness for the population mean.

We consider the ordered estimators proposed by Des Raj, first for the case of two draws and then for the general case.
Des Raj ordered estimator
Case 1: Sample of size 2
Let y_1 and y_2 denote the values of the units U_1 and U_2 drawn at the first and second draws respectively. Note that y_1 and y_2 are not the values of the first two units in the population. Further, let P_1 and P_2 denote the initial probabilities of selection of U_1 and U_2 respectively. Consider

z_1 = y_1/(N P_1),
z_2 = (1/N) [ y_1 + y_2/(P_2/(1 − P_1)) ] = (1/N) [ y_1 + y_2 (1 − P_1)/P_2 ],

and

z̄ = (z_1 + z_2)/2.

Note that P_2/(1 − P_1) is the conditional probability P(U_2 | U_1) of drawing U_2 at the second draw given that U_1 was drawn first.
Estimation of population mean:
We show that E(z̄) = Ȳ. Note that Σ_{i=1}^N P_i = 1.

Consider
E(z_1) = E[ (1/N)(y_1/P_1) ]
= (1/N) [ (Y_1/P_1) P_1 + (Y_2/P_2) P_2 + … + (Y_N/P_N) P_N ]
= Ȳ.

Next,
E(z_2) = (1/N) E[ y_1 + y_2 (1 − P_1)/P_2 ]
= (1/N) { E(y_1) + E[ E( y_2 (1 − P_1)/P_2 | U_1 ) ] }   (using E(Y) = E_X[E_Y(Y | X)]).

Given the first draw U_1, y_2 can take any value Y_j except y_1, with conditional probability P_j/(1 − P_1). So
E[ y_2 (1 − P_1)/P_2 | U_1 ] = Σ_j ( Y_j (1 − P_1)/P_j )( P_j/(1 − P_1) ) = Y_tot − y_1,
where the summation is taken over all the units except the one selected at the first draw. Substituting this in E(z_2), we have
E(z_2) = (1/N) [ E(y_1) + E(Y_tot − y_1) ] = (1/N) E(Y_tot) = Y_tot/N = Ȳ.

Thus
E(z̄) = [ E(z_1) + E(z_2) ]/2 = (Ȳ + Ȳ)/2 = Ȳ.
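A concrete sketch of the two-draw scheme and estimator (our own illustrative code; the second draw is made WOR from the remaining units with probabilities proportional to the initial P_j):

import random

def desraj_two_draws(y, P, rng=random):
    # One WOR sample of size 2 with initial probabilities P, and the
    # Des Raj quantities z1, z2 and zbar estimating the mean.
    N = len(y)
    u1 = rng.choices(range(N), weights=P)[0]           # first draw
    rest = [i for i in range(N) if i != u1]
    w = [P[i] / (1 - P[u1]) for i in rest]             # P_j / (1 - P_1)
    u2 = rng.choices(rest, weights=w)[0]               # second draw
    z1 = y[u1] / (N * P[u1])
    z2 = (y[u1] + y[u2] * (1 - P[u1]) / P[u2]) / N
    return (z1 + z2) / 2, z1, z2

Averaging the returned z̄ over many repetitions approximates Ȳ, in line with the proof above.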
Sampling Theory
MODULE VII
LECTURE - 25
VARYING PROBABILITY SAMPLING
DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
Variance:
For the two-draw case, the variance of z̄ is

Var(z̄) = (1 − (1/2) Σ_{i=1}^N P_i²) (1/(2N²)) Σ_{i=1}^N P_i (Y_i/P_i − Y_tot)² − (1/(4N²)) Σ_{i=1}^N P_i² (Y_i/P_i − Y_tot)².

Proof: Var(z̄) = E(z̄²) − [E(z̄)]² = E(z̄²) − Ȳ², and

z̄ = (1/2) [ y_1/(N P_1) + (1/N)( y_1 + y_2 (1 − P_1)/P_2 ) ]
= (1/(2N)) [ y_1 (1 + P_1)/P_1 + y_2 (1 − P_1)/P_2 ],

where the nature of the first term depends only on the first draw and that of the second term depends upon both the first and second draws. Since the ordered sample (i, j) occurs with probability P_i P_j/(1 − P_i),

E(z̄²) = (1/(4N²)) Σ_{i(≠j)=1}^N [ Y_i (1 + P_i)/P_i + Y_j (1 − P_i)/P_j ]² P_i P_j/(1 − P_i).

Expanding the square and summing over j ≠ i,

E(z̄²) = (1/(4N²)) Σ_{i=1}^N [ Y_i² (1 + P_i)²/P_i + P_i (1 − P_i) ( Σ_{j=1}^N Y_j²/P_j − Y_i²/P_i ) + 2 Y_i (1 + P_i)(Y_tot − Y_i) ]
= (1/(4N²)) [ (2 − Σ_{i=1}^N P_i²) Σ_{i=1}^N Y_i²/P_i − Σ_{i=1}^N Y_i² + 2 Y_tot² + 2 Y_tot Σ_{i=1}^N Y_i P_i ].

Subtracting Ȳ² = Y_tot²/N² and rearranging,

Var(z̄) = (1/(4N²)) [ (2 − Σ_i P_i²)( Σ_i Y_i²/P_i − Y_tot² ) − ( Σ_i Y_i² − 2 Y_tot Σ_i Y_i P_i + Y_tot² Σ_i P_i² ) ]
= (1 − (1/2) Σ_i P_i²) (1/(2N²)) Σ_i P_i (Y_i/P_i − Y_tot)² − (1/(4N²)) Σ_i P_i² (Y_i/P_i − Y_tot)²,

using Σ_i P_i (Y_i/P_i − Y_tot)² = Σ_i Y_i²/P_i − Y_tot² and Σ_i P_i² (Y_i/P_i − Y_tot)² = Σ_i Y_i² − 2 Y_tot Σ_i Y_i P_i + Y_tot² Σ_i P_i².

Equivalently,

Var(z̄) = (1/2) Σ_{i=1}^N P_i (Y_i/(N P_i) − Ȳ)² − (1/(4N²)) [ Σ_{i=1}^N P_i² Σ_{j=1}^N P_j (Y_j/P_j − Y_tot)² + Σ_{i=1}^N P_i² (Y_i/P_i − Y_tot)² ],

where the first term is the variance of the corresponding PPSWR estimator based on two draws (σ_z²/2) and the remaining negative terms show the reduction in variance achieved by sampling WOR.
Estimation of Var(z̄)
Var(z̄) = E(z̄²) − (E(z̄))² = E(z̄²) − Ȳ².
Since E(z_2 | u_1) = (1/N)[ y_1 + (Y_tot − y_1) ] = Ȳ, we have
E(z_1 z_2) = E[ z_1 E(z_2 | u_1) ] = E(z_1 Ȳ) = Ȳ E(z_1) = Ȳ².
Consider
E(z̄² − z_1 z_2) = E(z̄²) − E(z_1 z_2) = E(z̄²) − Ȳ² = Var(z̄)
⇒ V̂ar(z̄) = z̄² − z_1 z_2 is an unbiased estimator of Var(z̄).
Alternative form of the estimate of Var(z̄)
V̂ar(z̄) = z̄² − z_1 z_2
= ( (z_1 + z_2)/2 )² − z_1 z_2
= (z_1 − z_2)²/4
= (1/4) [ y_1/(N P_1) − y_1/N − (y_2/N)(1 − P_1)/P_2 ]²
= (1/(4N²)) [ (1 − P_1) y_1/P_1 − y_2 (1 − P_1)/P_2 ]²
= ( (1 − P_1)²/(4N²) ) ( y_1/P_1 − y_2/P_2 )².
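In code, the unbiased variance estimate for the two-draw case is a one-liner on top of desraj_two_draws above (illustrative):

def desraj_var_estimate(z1, z2):
    # unbiased estimate of Var(zbar) for n = 2: (z1 - z2)^2 / 4
    return (z1 - z2) ** 2 / 4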
Case 2: General case
Let (u_1, u_2,…, u_r,…, u_n) be the units selected, in the order in which they are drawn, in n draws, and let (y_1, y_2,…, y_r,…, y_n) and (P_1, P_2,…, P_r,…, P_n) be the corresponding y-values and initial probabilities of selection.

Further, let
z_1 = y_1/(N P_1)
z_r = (1/N) [ y_1 + y_2 + … + y_{r−1} + (y_r/P_r)(1 − P_1 − … − P_{r−1}) ], r = 2, 3,…, n,

and consider z̄ = (1/n) Σ_{r=1}^n z_r as an estimator of the population mean. We show below that E(z_r) = Ȳ for every r, so that
E(z̄) = (1/n) Σ_{r=1}^n E(z_r) = Ȳ;
thus z̄ is an unbiased estimator of the population mean Ȳ.
To establish the unbiasedness, consider
E(z_r) = E[ E( z_r | y_1, y_2,…, y_{r−1} ) ].
Note that, given the first (r − 1) draws, y_r/P_r takes the value Y_j/P_j for each unit j not selected in the previous draws, with conditional probability P_j/(1 − P_1 − P_2 − … − P_{r−1}). Hence
E[ (y_r/P_r)(1 − P_1 − … − P_{r−1}) | y_1,…, y_{r−1} ] = Σ_j (Y_j/P_j)(1 − P_1 − … − P_{r−1}) · P_j/(1 − P_1 − … − P_{r−1}) = Y_tot − (y_1 + y_2 + … + y_{r−1}),
the summation being over the units not selected in the first (r − 1) draws. Therefore
E( z_r | y_1,…, y_{r−1} ) = (1/N) [ (y_1 + … + y_{r−1}) + Y_tot − (y_1 + … + y_{r−1}) ] = Y_tot/N = Ȳ,
so
E(z_r) = Ȳ, and E(z̄) = (1/n) Σ_{r=1}^n E(z_r) = (1/n) n Ȳ = Ȳ.
The expression for the variance of z̄ in the general case is complex, but its estimate is simple.
Estimate of variance:
Var(z̄) = E(z̄²) − Ȳ².
For r < s,
E(z_r z_s) = E[ z_r E(z_s | y_1, y_2,…, y_{s−1}) ] = E(z_r Ȳ) = Ȳ E(z_r) = Ȳ²,
because, given the first (s − 1) draws, z_r is fixed and contributes nothing further to the conditional expectation; similarly, for s < r,
E(z_r z_s) = E[ z_s E(z_r | y_1, y_2,…, y_{r−1}) ] = E(z_s Ȳ) = Ȳ E(z_s) = Ȳ².
Consider
E[ 1/(n(n−1)) Σ_{r(≠s)=1}^n Σ_{s=1}^n z_r z_s ] = 1/(n(n−1)) Σ_{r(≠s)=1}^n Σ_{s=1}^n E(z_r z_s) = (1/(n(n−1))) n(n−1) Ȳ² = Ȳ².
Substituting this expression for Ȳ² in Var(z̄), we get
Var(z̄) = E(z̄²) − Ȳ² = E[ z̄² − 1/(n(n−1)) Σ_{r(≠s)=1}^n Σ_{s=1}^n z_r z_s ]
⇒ V̂ar(z̄) = z̄² − 1/(n(n−1)) Σ_{r(≠s)=1}^n Σ_{s=1}^n z_r z_s.
Using ( Σ_{r=1}^n z_r )² = Σ_{r=1}^n z_r² + Σ_{r(≠s)=1}^n Σ_{s=1}^n z_r z_s
⇒ Σ_{r(≠s)=1}^n Σ_{s=1}^n z_r z_s = n² z̄² − Σ_{r=1}^n z_r²,
we obtain
V̂ar(z̄) = z̄² − (1/(n(n−1))) ( n² z̄² − Σ_{r=1}^n z_r² )
= 1/(n(n−1)) ( Σ_{r=1}^n z_r² − n z̄² )
= 1/(n(n−1)) Σ_{r=1}^n (z_r − z̄)².
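The general n-draw estimator and its variance estimate can be sketched as follows (our own illustrative code, generalizing desraj_two_draws above; each successive draw is made from the remaining units with probabilities proportional to the initial P_i):

import random

def desraj(y, P, n, rng=random):
    # Des Raj ordered estimator for n WOR draws with initial
    # probabilities P; returns (zbar, var_hat).
    N = len(y)
    remaining = list(range(N))
    drawn, z = [], []
    for r in range(1, n + 1):
        u = rng.choices(remaining, weights=[P[i] for i in remaining])[0]
        remaining.remove(u)
        drawn.append(u)
        head = sum(y[i] for i in drawn[:-1])           # y_1 + ... + y_{r-1}
        p_used = sum(P[i] for i in drawn[:-1])         # P_1 + ... + P_{r-1}
        z.append((head + y[u] * (1 - p_used) / P[u]) / N)
    zbar = sum(z) / n
    var_hat = sum((zr - zbar) ** 2 for zr in z) / (n * (n - 1))
    return zbar, var_hat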
Sampling Theory
MODULE VII
LECTURE - 26
VARYING PROBABILITY SAMPLING
DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
Unordered estimator:
In an ordered estimator, the order in which the units are drawn is taken into account. Corresponding to any ordered estimator, there exists an unordered estimator which does not depend on the order in which the units are drawn and has a variance no larger than that of the ordered estimator.

In the case of sampling WOR from a population of size N, there are (N choose n) unordered samples of size n, and corresponding to any unordered sample of n units there are n! ordered samples. Moreover,

P(unordered sample (u_1, u_2)) = P(ordered sample (u_1, u_2)) + P(ordered sample (u_2, u_1)).
Let z_si, s = 1, 2,…, (N choose n), i = 1, 2,…, n! (= M), be an estimator of a population parameter θ based on the ordered sample s_i, and consider a scheme of selection in which the probability of selecting the ordered sample s_i is p_si. The probability of getting the unordered sample s is the sum of the probabilities of its ordered versions, i.e.,
p_s = Σ_{i=1}^M p_si.

For example, in SRSWOR from a population of size N with units denoted 1, 2,…, N, the samples of size n are n-tuples, and after n draws the sample space consists of N(N − 1)…(N − n + 1) ordered sample points, each equally likely:
p_si^o = P[selection of any ordered sample] = 1/( N(N − 1)…(N − n + 1) ),
p_s^u = P[selection of any unordered sample] = n! P[selection of any ordered sample] = n!/( N(N − 1)…(N − n + 1) );
then
p_s = Σ_{i=1}^{M(=n!)} p_si^o = n!(N − n)!/N! = 1/(N choose n).
Theorem:
If θ̂_0 = z_si, s = 1, 2,…, (N choose n), i = 1, 2,…, M (= n!), and θ̂_u = Σ_{i=1}^M z_si p'_si are the ordered and unordered estimators of θ, then
(i) E(θ̂_u) = E(θ̂_0)
(ii) Var(θ̂_u) ≤ Var(θ̂_0),
where z_si is a function of the s_i-th ordered sample (hence a random variable), p_si is the probability of selection of the s_i-th ordered sample, and p'_si = p_si/p_s.

Proof: The total number of ordered samples is n! (N choose n).
(i) E(θ̂_0) = Σ_s Σ_{i=1}^M z_si p_si.
E(θ̂_u) = Σ_s [ Σ_{i=1}^M z_si p'_si ] p_s
= Σ_s Σ_i z_si (p_si/p_s) p_s
= Σ_s Σ_i z_si p_si
= E(θ̂_0).
(ii) Since θ̂_0 = z_si with probability p_si (i = 1, 2,…, M; s = 1, 2,…, (N choose n)), we have θ̂_0² = z_si² with probability p_si. Similarly, θ̂_u = Σ_{i=1}^M z_si p'_si with probability p_s, so θ̂_u² = ( Σ_{i=1}^M z_si p'_si )² with probability p_s.
Consider
Var(θ̂_0) = E(θ̂_0²) − [E(θ̂_0)]² = Σ_s Σ_i z_si² p_si − [E(θ̂_0)]²
Var(θ̂_u) = E(θ̂_u²) − [E(θ̂_u)]² = Σ_s ( Σ_i z_si p'_si )² p_s − [E(θ̂_0)]².
Then
Var(θ̂_0) − Var(θ̂_u) = Σ_s Σ_i z_si² p_si − Σ_s ( Σ_i z_si p'_si )² p_s.
Using Σ_i p_si = p_s and p_si = p'_si p_s, the second term can be written both as
Σ_s ( Σ_j z_sj p'_sj )² p_s = Σ_s Σ_i ( Σ_j z_sj p'_sj )² p_si and as Σ_s ( Σ_j z_sj p'_sj )² p_s = Σ_s Σ_i z_si ( Σ_j z_sj p'_sj ) p_si,
so
Var(θ̂_0) − Var(θ̂_u) = Σ_s Σ_i [ z_si² + ( Σ_j z_sj p'_sj )² − 2 z_si ( Σ_j z_sj p'_sj ) ] p_si
= Σ_s Σ_i ( z_si − Σ_j z_sj p'_sj )² p_si ≥ 0
⇒ Var(θ̂_0) − Var(θ̂_u) ≥ 0
or Var(θ̂_u) ≤ Var(θ̂_0).
Estimate of Var(θ̂_u)
Since
Var(θ̂_0) − Var(θ̂_u) = Σ_s Σ_i ( z_si − Σ_j z_sj p'_sj )² p_si,
we have
V̂ar(θ̂_u) = V̂ar(θ̂_0) − Σ_s Σ_i ( z_si − Σ_j z_sj p'_sj )² p_si
= Σ_i p'_si V̂ar_si(θ̂_0) − Σ_i p'_si ( z_si − Σ_j z_sj p'_sj )²,
where, in the second line, the sums run over the ordered versions of the observed unordered sample and V̂ar_si(θ̂_0) denotes the estimate of Var(θ̂_0) computed from the s_i-th ordered sample.

Based on this result, we now use the ordered estimators to construct an unordered estimator. It follows from this theorem that the unordered estimator will be more efficient than the corresponding ordered estimators.
Murthy's unordered estimator corresponding to Des Raj's ordered estimator for sample size 2
Suppose y_i and y_j are the values of units u_i and u_j selected in the first and second draws respectively, with varying probability and WOR, in a sample of size 2, and let P_i and P_j be the corresponding initial probabilities of selection. We now have two ordered estimates corresponding to the two ordered samples s_1* = (u_i, u_j) and s_2* = (u_j, u_i):

z(s_1*) = (1/(2N)) [ (1 + P_i) y_i/P_i + (1 − P_i) y_j/P_j ]
= (1/(2N)) [ y_i + y_i/P_i + y_j (1 − P_i)/P_j ]

and

z(s_2*) = (1/(2N)) [ (1 + P_j) y_j/P_j + (1 − P_j) y_i/P_i ]
= (1/(2N)) [ y_j + y_j/P_j + y_i (1 − P_j)/P_i ].
The probabilities corresponding to z(s_1*) and z(s_2*) are
p(s_1*) = P_i P_j/(1 − P_i)
p(s_2*) = P_j P_i/(1 − P_j),
so
p(s) = p(s_1*) + p(s_2*) = P_i P_j (2 − P_i − P_j) / ( (1 − P_i)(1 − P_j) ),
and
p'(s_1*) = p(s_1*)/p(s) = (1 − P_j)/(2 − P_i − P_j)
p'(s_2*) = p(s_2*)/p(s) = (1 − P_i)/(2 − P_i − P_j).
Murthy's unordered estimate z(u) corresponding to the Des Raj ordered estimate is then

z(u) = z(s_1*) p'(s_1*) + z(s_2*) p'(s_2*)
= (1/(2N)) { [ (1 + P_i) y_i/P_i + (1 − P_i) y_j/P_j ] (1 − P_j) + [ (1 + P_j) y_j/P_j + (1 − P_j) y_i/P_i ] (1 − P_i) } / (2 − P_i − P_j)
= (1/(2N)) { (y_i/P_i)(1 − P_j) [ (1 + P_i) + (1 − P_i) ] + (y_j/P_j)(1 − P_i) [ (1 − P_j) + (1 + P_j) ] } / (2 − P_i − P_j)
= [ (1 − P_j) y_i/P_i + (1 − P_i) y_j/P_j ] / [ N (2 − P_i − P_j) ].
Unbiasedness:
E[z(u)] = Σ_{i<j} { [ (1 − P_j) Y_i/P_i + (1 − P_i) Y_j/P_j ] / [ N (2 − P_i − P_j) ] } · [ P_i P_j/(1 − P_i) + P_i P_j/(1 − P_j) ]
= (1/(2N)) Σ_{i≠j} { [ (1 − P_j) Y_i/P_i + (1 − P_i) Y_j/P_j ] / (2 − P_i − P_j) } · P_i P_j (2 − P_i − P_j) / ( (1 − P_i)(1 − P_j) )
= (1/(2N)) Σ_{i≠j} [ (1 − P_j) Y_i/P_i + (1 − P_i) Y_j/P_j ] P_i P_j / ( (1 − P_i)(1 − P_j) )
= (1/(2N)) Σ_{i≠j} [ Y_i P_j/(1 − P_i) + Y_j P_i/(1 − P_j) ].

Using the result Σ_{i≠j=1}^N a_i b_j = Σ_{i=1}^N a_i ( Σ_{j=1}^N b_j − b_i ), we have
E[z(u)] = (1/(2N)) [ Σ_{i=1}^N ( Y_i/(1 − P_i) )( Σ_{j=1}^N P_j − P_i ) + Σ_{j=1}^N ( Y_j/(1 − P_j) )( Σ_{i=1}^N P_i − P_j ) ]
= (1/(2N)) [ Σ_{i=1}^N ( Y_i/(1 − P_i) )(1 − P_i) + Σ_{j=1}^N ( Y_j/(1 − P_j) )(1 − P_j) ]
= (1/(2N)) [ Σ_{i=1}^N Y_i + Σ_{j=1}^N Y_j ]
= (Y_tot + Y_tot)/(2N)
= Ȳ.
Variance: The variance of z(u) can be found as
Var[z(u)] = (1/2) Σ_{i≠j=1}^N [ P_i P_j (1 − P_i − P_j) / ( N² (2 − P_i − P_j) ) ] ( Y_i/P_i − Y_j/P_j )²,
and an unbiased estimator of this variance is
V̂ar[z(u)] = [ (1 − P_i − P_j)(1 − P_i)(1 − P_j) / ( N² (2 − P_i − P_j)² ) ] ( y_i/P_i − y_j/P_j )².
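For n = 2, Murthy's estimator and its variance estimate translate directly into code (illustrative sketch; yi, yj are the two sampled values with initial probabilities Pi, Pj):

def murthy_two_draws(yi, yj, Pi, Pj, N):
    # Murthy's unordered estimate of the mean for a WOR sample of size 2,
    # together with the unbiased variance estimate.
    z_u = ((1 - Pj) * yi / Pi + (1 - Pi) * yj / Pj) / (N * (2 - Pi - Pj))
    v_u = ((1 - Pi - Pj) * (1 - Pi) * (1 - Pj)
           / (N ** 2 * (2 - Pi - Pj) ** 2) * (yi / Pi - yj / Pj) ** 2)
    return z_u, v_u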
Sampling Theory
MODULE VII
LECTURE - 27
VARYING PROBABILITY SAMPLING
DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
Horvitz-Thompson (HT) estimator of population mean
The unordered estimators have limited applicability: they lack simplicity, and the expressions for the estimators and their variances become unmanageable when the sample size is even moderately large. The HT estimator is simpler than these estimators. Let N be the population size and y_i (i = 1, 2,…, N) the values of the characteristic under study, and let a sample of size n be drawn WOR using arbitrary probabilities of selection at each draw. Thus, prior to each succeeding draw, there is defined a new probability distribution for the units available at that draw; the probability distribution at each draw may or may not depend upon the initial probabilities at the first draw.

Let α_i = 1 if the ith unit is included in the sample and α_i = 0 otherwise, let π_i = E(α_i) denote the probability that the ith unit is included in the sample, and define
z_i = n y_i/(N π_i).
The HT estimator of Ȳ is
Ŷ_HT = z̄_n = (1/n) Σ_{i∈sample} z_i = (1/n) Σ_{i=1}^N α_i z_i = (1/N) Σ_{i∈sample} y_i/π_i.
Unbiasedness
E(Ŷ_HT) = (1/n) Σ_{i=1}^N E(α_i z_i)
= (1/n) Σ_{i=1}^N z_i E(α_i)   (the z_i are fixed; only the α_i are random)
= (1/n) Σ_{i=1}^N ( n y_i/(N E(α_i)) ) E(α_i)
= (1/N) Σ_{i=1}^N y_i
= Ȳ.
Variance
Var(Ŷ_HT) = Var(z̄_n) = E(z̄_n²) − [E(z̄_n)]² = E(z̄_n²) − Ȳ².
Consider
E(z̄_n²) = (1/n²) E( Σ_{i=1}^N α_i z_i )²
= (1/n²) E[ Σ_{i=1}^N α_i² z_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N α_i α_j z_i z_j ]
= (1/n²) [ Σ_{i=1}^N z_i² E(α_i²) + Σ_{i(≠j)=1}^N Σ_{j=1}^N z_i z_j E(α_i α_j) ].
If S = {s} is the set of all possible samples and π_i is the probability of inclusion of the ith unit in a sample s, then
E(α_i) = 1·P(y_i ∈ s) + 0·P(y_i ∉ s) = π_i
E(α_i²) = 1²·P(y_i ∈ s) + 0²·P(y_i ∉ s) = π_i.
So
E(α_i) = E(α_i²),
and
E(z̄_n²) = (1/n²) [ Σ_{i=1}^N z_i² π_i + Σ_{i(≠j)=1}^N Σ_{j=1}^N π_ij z_i z_j ],
where π_ij = E(α_i α_j) is the probability of inclusion of both the ith and jth units in the sample. This is called the second order inclusion probability.
Now
Ȳ² = [E(z̄_n)]² = (1/n²) [ E( Σ_{i=1}^N α_i z_i ) ]²
= (1/n²) [ Σ_{i=1}^N z_i² (E(α_i))² + Σ_{i(≠j)=1}^N Σ_{j=1}^N z_i z_j E(α_i) E(α_j) ]
= (1/n²) [ Σ_{i=1}^N z_i² π_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N π_i π_j z_i z_j ].
Thus
Var(Ŷ_HT) = (1/n²) [ Σ_{i=1}^N π_i z_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N π_ij z_i z_j ] − (1/n²) [ Σ_{i=1}^N π_i² z_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N π_i π_j z_i z_j ]
= (1/n²) [ Σ_{i=1}^N π_i (1 − π_i) z_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_ij − π_i π_j) z_i z_j ]
= (1/n²) [ Σ_{i=1}^N π_i (1 − π_i) n² y_i²/(N² π_i²) + Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_ij − π_i π_j) n² y_i y_j/(N² π_i π_j) ]
= (1/N²) [ Σ_{i=1}^N ( (1 − π_i)/π_i ) y_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N ( (π_ij − π_i π_j)/(π_i π_j) ) y_i y_j ].

Estimate of variance
V̂_1 = V̂ar(Ŷ_HT) = (1/N²) [ Σ_{i=1}^n ( (1 − π_i)/π_i² ) y_i² + Σ_{i(≠j)=1}^n Σ_{j=1}^n ( (π_ij − π_i π_j)/π_ij ) ( y_i y_j/(π_i π_j) ) ],
where the sums now run over the units in the sample.
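Given the inclusion probabilities, the HT estimate and the variance estimate V̂_1 are direct to compute. A sketch (our own illustrative code: y_s maps sampled unit labels to their y-values, pi maps unit labels to π_i, and pi2 maps pairs to π_ij):

def ht_mean(y_s, pi, N):
    # Horvitz-Thompson estimate of the population mean
    return sum(y_s[i] / pi[i] for i in y_s) / N

def ht_var_estimate(y_s, pi, pi2, N):
    # Estimate V_1 of Var(Y_HT); the sums run over sampled units only.
    units = list(y_s)
    v = sum((1 - pi[i]) * y_s[i] ** 2 / pi[i] ** 2 for i in units)
    for i in units:
        for j in units:
            if i != j:
                v += ((pi2[i, j] - pi[i] * pi[j]) / pi2[i, j]
                      * y_s[i] * y_s[j] / (pi[i] * pi[j]))
    return v / N ** 2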
Yates and Grundy form of variance
Since there are exactly n values of α_i which are 1 and (N − n) values which are zero,
Σ_{i=1}^N α_i = n and hence Σ_{i=1}^N E(α_i) = n.
Also
E( Σ_{i=1}^N α_i )² = Σ_{i=1}^N E(α_i²) + Σ_{i(≠j)=1}^N Σ_{j=1}^N E(α_i α_j)
E(n²) = Σ_{i=1}^N E(α_i) + Σ_{i(≠j)=1}^N Σ_{j=1}^N E(α_i α_j)   (using E(α_i²) = E(α_i))
n² = n + Σ_{i(≠j)=1}^N Σ_{j=1}^N E(α_i α_j)
⇒ Σ_{i(≠j)=1}^N Σ_{j=1}^N E(α_i α_j) = n(n − 1).
Thus
E(α_i α_j) = P(α_i = 1, α_j = 1) = P(α_i = 1) P(α_j = 1 | α_i = 1) = E(α_i) E(α_j | α_i = 1).
Therefore
Σ_{j(≠i)=1}^N [ E(α_i α_j) − E(α_i) E(α_j) ]
= Σ_{j(≠i)=1}^N E(α_i) E(α_j | α_i = 1) − E(α_i) Σ_{j(≠i)=1}^N E(α_j)
= E(α_i) [ Σ_{j(≠i)=1}^N E(α_j | α_i = 1) − Σ_{j(≠i)=1}^N E(α_j) ]
= E(α_i) [ (n − 1) − (n − E(α_i)) ]
= −E(α_i) [ 1 − E(α_i) ]
= −π_i (1 − π_i).   (1)
Similarly,
Σ_{i(≠j)=1}^N [ E(α_i α_j) − E(α_i) E(α_j) ] = −π_j (1 − π_j).   (2)
Recall
Var(Ŷ_HT) = (1/n²) [ Σ_{i=1}^N π_i (1 − π_i) z_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_ij − π_i π_j) z_i z_j ].
Using (1) and (2) in this expression, we get
Var(Ŷ_HT) = (1/(2n²)) [ Σ_{i=1}^N π_i (1 − π_i) z_i² + Σ_{j=1}^N π_j (1 − π_j) z_j² − 2 Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_i π_j − π_ij) z_i z_j ]
= (1/(2n²)) [ −Σ_{i=1}^N Σ_{j(≠i)=1}^N { E(α_i α_j) − E(α_i) E(α_j) } z_i² − Σ_{j=1}^N Σ_{i(≠j)=1}^N { E(α_i α_j) − E(α_i) E(α_j) } z_j² − 2 Σ_{i(≠j)=1}^N Σ_{j=1}^N { E(α_i) E(α_j) − E(α_i α_j) } z_i z_j ]
= (1/(2n²)) [ Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_i π_j − π_ij) z_i² + Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_i π_j − π_ij) z_j² − 2 Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_i π_j − π_ij) z_i z_j ]
= (1/(2n²)) Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_i π_j − π_ij) ( z_i² + z_j² − 2 z_i z_j )
= (1/(2n²)) Σ_{i(≠j)=1}^N Σ_{j=1}^N (π_i π_j − π_ij) ( z_i − z_j )².
This is the Yates–Grundy form of the variance. The expressions for π_i and π_ij can be written down for any given sample size.
For example, for n = 2, assume that at the second draw the probability of selecting a unit from the units available is proportional to the probability of selecting it at the first draw. Then
E(α_i) = probability of including Y_i in a sample of two = P_i1 + P_i2,
where P_ir is the probability of selecting Y_i at the rth draw (r = 1, 2) and P_i is the initial probability of selecting unit i at the first draw. We had earlier derived that
P_i1 = P_i
P_i2 = P_i [ Σ_{j=1}^N P_j/(1 − P_j) − P_i/(1 − P_i) ].
So
π_i = E(α_i) = P_i [ 1 + Σ_{j=1}^N P_j/(1 − P_j) − P_i/(1 − P_i) ].
Again,
E(α_i α_j) = probability of including both y_i and y_j in a sample of size two
= P_i1 P_j2|i + P_j1 P_i2|j
= P_i · P_j/(1 − P_i) + P_j · P_i/(1 − P_j)
= P_i P_j [ 1/(1 − P_i) + 1/(1 − P_j) ].
Estimate of variance
The Yates–Grundy estimate of the variance is given by
V̂ar(Ŷ_HT) = (1/(2n²)) Σ_{i(≠j)=1}^n Σ_{j=1}^n ( (π_i π_j − π_ij)/π_ij ) ( z_i − z_j )²,
where the sums run over the units in the sample.
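For the n = 2 scheme just described, the inclusion probabilities and the Yates-Grundy estimate can be coded directly (illustrative sketch, our own function names):

def inclusion_probs_n2(P):
    # First and second order inclusion probabilities for n = 2 when the
    # second draw is WOR with probabilities proportional to P.
    N = len(P)
    c = sum(p / (1 - p) for p in P)
    pi = [P[i] * (1 + c - P[i] / (1 - P[i])) for i in range(N)]
    pi2 = {(i, j): P[i] * P[j] * (1 / (1 - P[i]) + 1 / (1 - P[j]))
           for i in range(N) for j in range(N) if i != j}
    return pi, pi2

def yates_grundy_n2(y, P, i, j):
    # Yates-Grundy variance estimate for the HT estimator of the mean,
    # for the sample {i, j} of size n = 2.
    N, n = len(y), 2
    pi, pi2 = inclusion_probs_n2(P)
    zi = n * y[i] / (N * pi[i])
    zj = n * y[j] / (N * pi[j])
    # the double sum over two sampled units counts the pair (i, j) twice
    return (2 / (2 * n ** 2)) * ((pi[i] * pi[j] - pi2[i, j]) / pi2[i, j]
                                 * (zi - zj) ** 2)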
Midzuno system of sampling:
Under this system of selection, the unit at the first draw is selected with unequal probabilities of selection (i.e., pps), and at all the subsequent draws the remaining units are selected by SRSWOR. Then the first order inclusion probability is
E(α_i) = π_i = P(unit i is included in the sample)
= P(unit i is selected at the first draw) + P(unit i is not selected at the first draw but is selected at one of the subsequent (n − 1) draws)
= P_i + (1 − P_i)(n − 1)/(N − 1)
= ( (N − n)/(N − 1) ) P_i + (n − 1)/(N − 1).
Similarly,
E(α_i α_j) = P(both y_i and y_j are in the sample)
= P(y_i is selected at the first draw and y_j is selected at one of the subsequent (n − 1) draws)
+ P(y_j is selected at the first draw and y_i is selected at one of the subsequent (n − 1) draws)
+ P(neither y_i nor y_j is selected at the first draw but both of them are selected during the subsequent (n − 1) draws)
= P_i (n − 1)/(N − 1) + P_j (n − 1)/(N − 1) + (1 − P_i − P_j)(n − 1)(n − 2)/( (N − 1)(N − 2) ),
so
π_ij = ( (n − 1)/(N − 1) ) [ ( (N − n)/(N − 2) ) (P_i + P_j) + (n − 2)/(N − 2) ].
Similarly,
E(α_i α_j α_k) = π_ijk = P(including y_i, y_j and y_k in the sample)
= ( (n − 1)(n − 2)/( (N − 1)(N − 2) ) ) [ ( (N − n)/(N − 3) ) (P_i + P_j + P_k) + (n − 3)/(N − 3) ].
By an extension of this argument, if y_i, y_j,…, y_r are r units of the sample of size n (r < n), the probability of including these r units in the sample is
( (n − 1)(n − 2)…(n − r + 1)/( (N − 1)(N − 2)…(N − r + 1) ) ) [ ( (N − n)/(N − r) ) (P_i + P_j + … + P_r) + (n − r)/(N − r) ].
Similarly, if y_i, y_j,…, y_q are the n units, the probability of including these n units in the sample is
E(α_i α_j … α_q) = π_ij…q = ( (n − 1)(n − 2)…1/( (N − 1)(N − 2)…(N − n + 1) ) ) (P_i + P_j + … + P_q)
= (P_i + P_j + … + P_q) / (N−1 choose n−1).
Thus if the P_i's are proportional to some measure of the size of the units in the population, then the probability of selecting a specified sample is proportional to the total measure of the size of the units included in the sample.
Substituting these π_i, π_ij, π_ijk, etc. in the HT estimator, we can obtain the estimators of the population mean and its variance. In particular, the unbiased estimate of the variance of the HT estimator,
V̂ar(Ŷ_HT) = (1/(2n²)) Σ_{i(≠j)=1}^n Σ_{j=1}^n ( (π_i π_j − π_ij)/π_ij ) (z_i − z_j)²,
reduces to
V̂ar(Ŷ_HT) = ( (N − n)/(2(N − 1)² n²) ) Σ_{i(≠j)=1}^n Σ_{j=1}^n [ (N − n) P_i P_j + ( (n − 1)/(N − 2) )(1 − P_i − P_j) ] (z_i − z_j)²/π_ij.
The main advantage of this method of sampling is that it is possible to compute a set of revised probabilities of
selection such that the inclusion probabilities resulting from the revised probabilities are proportional to the initial
probabilities of selection. It is desirable to do so since the initial probabilities can be chosen proportional to some
measure of size.
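A small sketch of Midzuno's scheme (our own illustrative code; the first unit is drawn pps and the rest by SRSWOR, and the helper returns the closed-form π_i and π_ij derived above):

import random

def midzuno_sample(P, n, rng=random):
    # One Midzuno sample: first draw pps with probabilities P,
    # remaining n - 1 units by SRSWOR from the other units.
    N = len(P)
    first = rng.choices(range(N), weights=P)[0]
    others = [i for i in range(N) if i != first]
    return [first] + rng.sample(others, n - 1)

def midzuno_pi(P, n, i, j=None):
    # Closed-form inclusion probabilities pi_i and pi_ij.
    N = len(P)
    if j is None:
        return (N - n) / (N - 1) * P[i] + (n - 1) / (N - 1)
    return ((n - 1) / (N - 1)
            * ((N - n) / (N - 2) * (P[i] + P[j]) + (n - 2) / (N - 2)))

Averaging inclusion indicators over many calls to midzuno_sample reproduces midzuno_pi empirically.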