Lecture 12
Introduction
Survey samplers have long been interested in methods of improving the precision of estimates
of population parameters at both the selection and the estimation stages. If there is one thing that
distinguishes sampling theory from general statistical theory, it is the degree of emphasis laid
on the use of auxiliary information for improving the precision of estimates.
Auxiliary information was used in study session five for purposes of stratification. In this
study, you will be introduced to some other methods of making use of auxiliary information
to achieve higher precision.
Frequently, the quantity that is to be estimated from a simple random sample is the ratio of
two variables, both of which vary from unit to unit. Ratio estimates are biased, and
corrections must be made when they are used in experimental or survey work.
For example, in a household survey the average number of suits of clothes per adult male, the
average expenditure on cosmetics per adult female, and the average number of hours per
week spent watching TV per child aged 10-15 years may be of interest. Or, the interest may
be in estimating the ratio of income to expenditure, the ratio of farmers’ income to non-farmers’
income, the ratio of male to female enrolment in schools, etc.
In order to estimate the first of these items, we would record for the $i$th household
($i = 1, 2, \dots, n$) the number of adult males $x_i$ who live in that household and the total
number of suits $y_i$ that they own. The population parameter to be estimated is the ratio $R$.
In-Text Question
Ratio estimation is a technique that uses available auxiliary information. True or False
In-Text Answer
True
$$R = \frac{Y}{X} = \frac{\bar{Y}}{\bar{X}},$$
the ratio of the population totals (or means) of the characters $y$ and $x$.
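The corresponding sample estimates of $R$, the population total $Y$ and the population mean $\bar{Y}$ are
$$\hat{R} = \frac{y}{x} = \frac{\bar{y}}{\bar{x}} \tag{6.2i}$$
$$\hat{Y}_R = \frac{y}{x}\,X = \hat{R}X \tag{6.2ii}$$
$$\hat{\bar{Y}}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X} = \hat{R}\bar{X} \tag{6.2iii}$$
respectively, where $y$ and $x$ are the sample totals and $\bar{y}$ and $\bar{x}$ the sample means.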
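As an illustration of the three estimators above, the following Python sketch computes the ratio estimate, the ratio estimate of the total and the ratio estimate of the mean from a small sample. The data values, the function name `ratio_estimates`, and the population figures `X_total` and `N` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ratio_estimates(y, x, X_total, N):
    """Return (R_hat, Y_hat_R, Ybar_hat_R) for paired sample values y and x.

    X_total is the population total of the auxiliary variable x and N is the
    population size; both are assumed to be known from an external source.
    """
    if len(y) != len(x) or len(y) == 0:
        raise ValueError("y and x must be non-empty and of equal length")
    R_hat = sum(y) / sum(x)             # same value as y_bar / x_bar
    Y_hat_R = R_hat * X_total           # ratio estimate of the population total
    Ybar_hat_R = R_hat * X_total / N    # ratio estimate of the population mean
    return R_hat, Y_hat_R, Ybar_hat_R


# Hypothetical household data: x = adult males, y = suits owned.
x = [2, 1, 3, 2, 2]
y = [6, 2, 9, 5, 8]
R_hat, Y_hat_R, Ybar_hat_R = ratio_estimates(y, x, X_total=400, N=200)
print(R_hat, Y_hat_R, Ybar_hat_R)
```

Working from the sample totals rather than the means avoids one division and gives the same ratio, since the common factor $n$ cancels.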
Consequently, for most practical purposes, approximate variance results are equally valid for
comparison of its precision.
In simple random sampling without replacement, if variates $y_i$ and $x_i$ are measured on each
unit of a simple random sample of size $n$, assumed large, the mean square error (MSE) and
variance of $\hat{R} = \bar{y}/\bar{x}$ are each approximately given by
$$\mathrm{MSE}(\hat{R}) \approx V(\hat{R}) \approx \frac{1-f}{n\bar{X}^2}\,\frac{\sum_{i=1}^{N}(y_i - Rx_i)^2}{N-1}$$
where $R = \bar{Y}/\bar{X}$ is the ratio of the population means and $f = n/N$ is the sampling fraction.
Proof:
$$\hat{R} - R = \frac{\bar{y}}{\bar{x}} - R = \frac{\bar{y} - R\bar{x}}{\bar{x}} \tag{6.2.2.0}$$
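If $n$ is large, $\bar{x}$ should not differ greatly from $\bar{X}$. In order to avoid having to work out the
distribution of the ratio of the two random variables $\bar{y} - R\bar{x}$ and $\bar{x}$, we replace $\bar{x}$ by $\bar{X}$
in the denominator as an approximation, which gives
$$\hat{R} - R \approx \frac{\bar{y} - R\bar{x}}{\bar{X}} \tag{6.2.2.1}$$
Taking expectations over all possible simple random samples,
$$E(\hat{R} - R) \approx \frac{E(\bar{y} - R\bar{x})}{\bar{X}} = \frac{\bar{Y} - R\bar{X}}{\bar{X}} = 0 \tag{6.2.2.2}$$
since $R = \bar{Y}/\bar{X}$.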
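Hence
$$\mathrm{MSE}(\hat{R}) = E(\hat{R} - R)^2 \approx \frac{1}{\bar{X}^2}\,E(\bar{y} - R\bar{x})^2 \tag{6.2.2.3}$$
The quantity $\bar{y} - R\bar{x}$ is the sample mean of the variate $d_i = y_i - Rx_i$, whose population mean is
$\bar{D} = \bar{Y} - R\bar{X} = 0$. Applying the usual formula for the variance of the mean of a simple random
sample to the variate $d_i$ therefore gives
$$E(\bar{y} - R\bar{x})^2 = V(\bar{d}) = \frac{1-f}{n}\,\frac{\sum_{i=1}^{N}(y_i - Rx_i)^2}{N-1}$$
and substituting this into (6.2.2.3) yields the stated result.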
Note: The approximate variance of the ratio $\bar{y}/\bar{x}$ is equal to the variance of the sample mean $\bar{y}$
when the variate $y_i$ is replaced by the variate $(y_i - Rx_i)/\bar{X}$.
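The two facts used in the proof, that $\bar{y} - R\bar{x}$ has expectation zero and that its mean square equals $(1-f)/n$ times the population variance of $d_i = y_i - Rx_i$, can be checked numerically by listing every possible sample from a very small population. The sketch below does this in Python; the population values are hypothetical, and with $n = 3$ the sample is far from "large", so the approximate and exact MSE of $\hat{R}$ printed at the end should be expected to agree only roughly.

```python
from itertools import combinations

# Hypothetical population of N = 6 units (values chosen only for illustration).
x_pop = [4.0, 7.0, 5.0, 9.0, 6.0, 8.0]
y_pop = [3.0, 5.0, 2.0, 7.0, 4.0, 6.0]
N, n = len(x_pop), 3
f = n / N

R = sum(y_pop) / sum(x_pop)                       # population ratio
X_bar = sum(x_pop) / N
d_pop = [yi - R * xi for yi, xi in zip(y_pop, x_pop)]
S2_d = sum(d * d for d in d_pop) / (N - 1)        # D_bar = 0, so no centring is needed

# Under simple random sampling without replacement every subset of size n
# is equally likely, so expectations are plain averages over all subsets.
samples = list(combinations(range(N), n))
d_bars = [sum(d_pop[i] for i in s) / n for s in samples]
mean_d_bar = sum(d_bars) / len(samples)           # E(y_bar - R * x_bar)
mse_d_bar = sum(d * d for d in d_bars) / len(samples)
theory_d_bar = (1 - f) / n * S2_d                 # (1 - f) / n * S_d^2

# Large-sample formula for MSE(R_hat) versus the exact value by enumeration.
approx_mse_R = theory_d_bar / X_bar ** 2
exact_mse_R = sum(
    (sum(y_pop[i] for i in s) / sum(x_pop[i] for i in s) - R) ** 2
    for s in samples
) / len(samples)

print("E(d_bar):", mean_d_bar)
print("E(d_bar^2):", mse_d_bar, "theory:", theory_d_bar)
print("MSE(R_hat) approx:", approx_mse_R, "exact:", exact_mse_R)
```

The first two comparisons are exact identities up to floating-point rounding; only the last line involves the large-sample approximation of replacing $\bar{x}$ by $\bar{X}$.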
A sample estimate of variance of the ratio estimator is given by
$$v(\hat{R}) = \frac{1-f}{n\bar{X}^2}\,\frac{\sum_{i=1}^{n}(y_i - \hat{R}x_i)^2}{n-1}$$
$$= \frac{1-f}{n\bar{X}^2(n-1)}\left[\sum_{i=1}^{n} y_i^2 + \hat{R}^2\sum_{i=1}^{n} x_i^2 - 2\hat{R}\sum_{i=1}^{n} x_iy_i\right]$$
$$= \frac{1-f}{n\bar{X}^2}\left(s_y^2 + \hat{R}^2 s_x^2 - 2\hat{R}s_{xy}\right) \tag{6.2.2.5}$$
where $s_y^2$ and $s_x^2$ are the sample variances of $y$ and $x$ respectively, and
$$s_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$$
is the sample covariance of $x$ and $y$.
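The residual form and the variance-covariance form of the estimator are algebraically identical because $\bar{y} = \hat{R}\bar{x}$. The Python sketch below computes both on a small sample so that the agreement can be seen directly. The function name `ratio_variance_two_ways`, the data, and the values of `N` and `X_bar` are hypothetical.

```python
def ratio_variance_two_ways(y, x, N, X_bar):
    """Estimate the variance of R_hat in residual form and in moment form.

    N is the population size and X_bar the population mean of x, both
    assumed known. Returns the pair (v_resid, v_moment).
    """
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("need at least two paired observations")
    f = n / N
    y_bar, x_bar = sum(y) / n, sum(x) / n
    R_hat = y_bar / x_bar
    scale = (1 - f) / (n * X_bar ** 2)

    # Residual form: sum of squared deviations of y_i from R_hat * x_i.
    resid_ss = sum((yi - R_hat * xi) ** 2 for yi, xi in zip(y, x))
    v_resid = scale * resid_ss / (n - 1)

    # Moment form: s_y^2 + R_hat^2 * s_x^2 - 2 * R_hat * s_xy.
    s2_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s2_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    v_moment = scale * (s2_y + R_hat ** 2 * s2_x - 2 * R_hat * s_xy)
    return v_resid, v_moment


# Hypothetical sample of 6 units from a population of 50.
x = [10, 12, 9, 15, 11, 13]
y = [6, 8, 5, 10, 6, 9]
v_resid, v_moment = ratio_variance_two_ways(y, x, N=50, X_bar=11.5)
print(v_resid, v_moment)
```

The moment form is convenient when only summary statistics are available, while the residual form is less prone to cancellation error when the raw data are at hand.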
If the population mean of x is not known, its sample estimate may be used provided that the
sample size is large (n ≥ 30) .
The estimated standard error of $\hat{R}$ is
$$S(\hat{R}) = \frac{\sqrt{1-f}}{\sqrt{n}\,\bar{X}}\sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{R}x_i)^2}{n-1}} \tag{6.2.2.6}$$
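One way to compute $S(\hat{R})$, with $\bar{x}$ used in place of $\bar{X}$, is to express it as
$$S(\hat{R}) = \frac{\sqrt{1-f}}{\sqrt{n}\,\bar{x}}\sqrt{\frac{\sum y_i^2 - 2\hat{R}\sum y_ix_i + \hat{R}^2\sum x_i^2}{n-1}} \tag{6.2.2.7}$$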
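An estimate of the bias of $\hat{R}$ is
$$\hat{B}(\hat{R}) = \frac{1-f}{n\bar{X}^2}\left(\hat{R}s_x^2 - s_{xy}\right) = \frac{1-f}{n(n-1)\bar{X}^2}\left(\hat{R}\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_iy_i\right) \tag{6.2.2.8}$$
where
$$s_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$$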
Example 6.1
A simple random sample (without replacement) of 20 villages was selected from the 88
villages in a given state. Using the sample observations on the area under maize cultivation
y and land area of the village x collected from each of the sample villages in the table
below:
(i) Estimate the ratio of total area under maize cultivation to total land area in the state.
(ii) Estimate the total area under maize by method of ratio estimation.
(iii) Calculate the bias and variance of your estimates in (i) and (ii).
Solution
$N = 88$, $n = 20$, $f = 20/88 = 0.2273$

$\sum_{i=1}^{20} x_i = 16583.77$, $\bar{x} = 829.1885$, $\sum_{i=1}^{20} x_i^2 = 17{,}367{,}948$

$\sum_{i=1}^{20} y_i = 10601.51$, $\bar{y} = 530.0755$, $\sum_{i=1}^{20} y_i^2 = 6{,}794{,}275.214$

$X = 83819.96$; $\bar{X} = X/N = 952.4995$
(i) $\hat{R} = \dfrac{\bar{y}}{\bar{x}} = \dfrac{530.0755}{829.1885} = 0.639270202$

(ii) $\hat{Y}_R = \hat{R}X = 0.639270202 \times 83819.96 = 53{,}583.60$ hectares
(iii) For the estimation of the variance and the bias, we shall use these sample results:
$$s_y^2 = \frac{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}{n-1} = \frac{6794275.214 - 20(530.0755)^2}{19} = 61{,}824.974$$
$$s_x^2 = \frac{17367948 - 20(829.1885)^2}{19} = 190{,}361.928$$
$$s_{xy} = \frac{10468440.11 - 20(530.0755)(829.1885)}{19} = 88{,}304.733$$
$$v(\hat{R}) = \frac{1-f}{n\bar{X}^2}\left(s_y^2 + \hat{R}^2 s_x^2 - 2\hat{R}s_{xy}\right)$$
$$= \frac{1 - 0.2273}{20(952.4995)^2}\left[61824.974 + (0.6393)^2(190361.928) - 2(0.6393)(88304.733)\right] = 0.00114$$
The estimate of the bias of $\hat{R}$ is
$$\hat{B}(\hat{R}) = \frac{1 - 0.2273}{20(952.4995)^2}\left(0.6393 \times 190361.928 - 88304.733\right) = 0.00142$$
Relative to the standard error of $\hat{R}$, the bias is
$$\frac{\hat{B}(\hat{R})}{\sqrt{v(\hat{R})}} = \frac{0.00142}{0.03376} = 0.042$$
or 4.2%.
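The arithmetic of Example 6.1 can be re-run from the quoted summary statistics alone. The Python sketch below is one way to do this; it takes the sums from the solution (the cross-product sum 10468440.11 is the figure that appears in the computation of $s_{xy}$) and keeps full precision for $\hat{R}$ and $f$ instead of the rounded values 0.6393 and 0.2273, so the intermediate figures may differ slightly in the last digits. The variable names are illustrative only.

```python
import math

# Summary statistics quoted in Example 6.1.
N, n = 88, 20
sum_x, sum_y = 16583.77, 10601.51
sum_x2, sum_y2, sum_xy = 17367948.0, 6794275.214, 10468440.11
X_total = 83819.96                      # total land area in the state

f = n / N                               # sampling fraction
x_bar, y_bar = sum_x / n, sum_y / n
X_bar = X_total / N

# (i) ratio estimate and (ii) ratio estimate of the total
R_hat = y_bar / x_bar
Y_hat_R = R_hat * X_total

# (iii) sample variances and covariance from the raw sums
s2_y = (sum_y2 - n * y_bar ** 2) / (n - 1)
s2_x = (sum_x2 - n * x_bar ** 2) / (n - 1)
s_xy = (sum_xy - n * x_bar * y_bar) / (n - 1)

scale = (1 - f) / (n * X_bar ** 2)
v_R = scale * (s2_y + R_hat ** 2 * s2_x - 2 * R_hat * s_xy)   # form (6.2.2.5)
bias_R = scale * (R_hat * s2_x - s_xy)                         # form (6.2.2.8)
se_R = math.sqrt(v_R)

print(f"R_hat = {R_hat:.6f}, Y_hat_R = {Y_hat_R:.2f}")
print(f"v(R_hat) = {v_R:.5f}, bias = {bias_R:.5f}, bias/se = {bias_R / se_R:.3f}")
```

Because the bias is only a few percent of the standard error here, it has little effect on the accuracy of the estimate in this example.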
In-Text Question
Ratio estimators are generally biased. True or False
In-Text Answer
True
The variance of the simple expansion estimate $\hat{Y} = N\bar{y}$ of the population total is
$$V(\hat{Y}) = N^2(1-f)\frac{S_y^2}{n} \tag{6.2.2.9}$$
In this case no use is made of the auxiliary information provided by $x$. If this information is
used to form the ratio estimate $\hat{Y}_R = X\hat{R}$, a first approximation to the mean square error around
$Y$ has been found to be