Lecture 12

Study Session 6 focuses on Ratio and Regression Estimation, emphasizing the use of auxiliary information to enhance the precision of population parameter estimates. It explains the concepts of ratio estimation and its bias, variance, and comparison with simple averages. The session includes practical examples and calculations to illustrate the application of these estimation techniques.

Uploaded by

Jorams Barasa
Copyright: © All Rights Reserved

Study Session 6: Ratio and Regression Estimation

Introduction
Survey samplers have long been interested in methods of improving the precision of estimates
of population parameters, at both the selection and the estimation stages. If there is one thing that
distinguishes sampling theory from general statistical theory, it is the degree of emphasis laid
on the use of auxiliary information for improving the precision of estimates.

Auxiliary information was used in Study Session 5 for purposes of stratification. In this
study session, you will be introduced to some other methods of making use of auxiliary information
to achieve higher precision.

Learning Outcomes for Study Session 6


At the end of this study session, you should be able to:

6.1 Explain Ratio Estimation


6.2 Explain Regression Estimation

6.1 Ratio Estimation


Ratio estimation is a technique that uses available auxiliary information that is related
to the variable of interest. In a sample survey, in addition to estimating means, totals
and proportions, you may wish to estimate the ratio of two characters.

Frequently, the quantity that is to be estimated from a simple random sample is the ratio of
two variables, both of which vary from unit to unit. Ratio estimates are biased, and
adjustments must be made when they are used in experimental or survey work.

For example, in a household survey the average number of suits of clothes per adult male, the
average expenditure on cosmetics per adult female, and the average number of hours per
week spent watching TV per child aged 10-15 years may be of interest. Or the interest may
be in estimating the ratio of income to expenditure, the ratio of farmers' income to
non-farmers' income, the ratio of male to female enrolment in schools, etc.

In order to estimate the first of these items, we would record for the $i$th household
($i = 1, 2, \ldots, n$) the number of adult males $x_i$ who live in the household and the total
number of suits $y_i$ that they have. The population parameter to be estimated is the ratio $R$
of the total number of suits to the total number of adult males.

In-Text Question
Ratio estimation is a technique that uses available auxiliary information. True or False?

In-Text Answer
True

6.1.1 Definitions and Notations


Let $y_i$ = the value of the characteristic under study for the $i$th unit of the population;

$x_i$ = the value of the auxiliary characteristic on the same unit;

$Y$ = the population total of the $y$ characteristic;

$X$ = the population total of the $x$ characteristic;

$R = \dfrac{Y}{X} = \dfrac{\bar{Y}}{\bar{X}}$ = the ratio of the population totals or means of the characters $y$ and $x$, where $\bar{Y}$ and $\bar{X}$ denote the population means;

$\rho$ = the correlation coefficient between $x$ and $y$ in the population.

Suppose it is desired to estimate $Y$, $\bar{Y}$ or $R$ by drawing a simple random sample of $n$
units from the population. Assume that, based on $n$ pairs of observations, $\bar{y}$ and $\bar{x}$ are the
sample means of the characteristics $y$ and $x$, respectively, and that the population total $X$ or
mean $\bar{X}$ is known. The ratio estimators of the population ratio $R = Y/X$, the total $Y$, and the
mean $\bar{Y}$ may be defined by

$$\hat{R} = \frac{\bar{y}}{\bar{x}} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i} \tag{6.2i}$$

$$\hat{Y}_R = \frac{\bar{y}}{\bar{x}}\,X = \hat{R}X \tag{6.2ii}$$

$$\hat{\bar{Y}}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X} = \hat{R}\bar{X} \tag{6.2iii}$$

respectively.

The ratio estimator $\hat{R}$, unlike its components $\bar{y}$ and $\bar{x}$, is generally biased.
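
The three estimators in (6.2i) to (6.2iii) can be sketched in a few lines of Python. This is only an illustration: the function name `ratio_estimates`, its parameters, and the small household data set are hypothetical and are not taken from the study session.

```python
from statistics import mean


def ratio_estimates(y, x, X_total, N):
    """Return (R_hat, Y_hat_R, Ybar_hat_R) following (6.2i)-(6.2iii).

    y, x    : paired sample observations (hypothetical inputs)
    X_total : known population total of x
    N       : population size, used to obtain the population mean of x
    """
    if len(y) != len(x) or len(y) == 0:
        raise ValueError("y and x must be non-empty and of equal length")
    r_hat = mean(y) / mean(x)           # (6.2i)
    y_total_hat = r_hat * X_total       # (6.2ii)
    y_mean_hat = r_hat * (X_total / N)  # (6.2iii)
    return r_hat, y_total_hat, y_mean_hat


# Hypothetical sample of 4 households: suits owned and adult males.
suits = [4, 6, 3, 8]
adult_males = [2, 3, 1, 4]
r_hat, y_total_hat, y_mean_hat = ratio_estimates(
    suits, adult_males, X_total=500, N=200
)
print(r_hat, y_total_hat, y_mean_hat)
```

Note that only the sample means and the known population total (or mean) of $x$ are needed; the individual $x_i$ values outside the sample are never used.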

6.1.2 Approximate Variance of Ratio Estimator


Since ratio estimators are generally biased, their mean square errors are used when
comparing their efficiency with that of any other estimator. Ratio estimators,
although biased, are consistent, and with simple random sampling and moderately large
samples the bias is negligible.

Consequently, for most practical purposes, the approximate variance is equally valid for
comparing precision.

In simple random sampling without replacement, if the variates $y_i$ and $x_i$ are measured on each
unit of a simple random sample of size $n$, assumed large, the mean square error (MSE) and
variance of $\hat{R} = \bar{y}/\bar{x}$ are each approximately given by

$$\mathrm{MSE}(\hat{R}) \approx V(\hat{R}) \approx \frac{1-f}{n\bar{X}^2}\,\frac{\sum_{i=1}^{N}(y_i - Rx_i)^2}{N-1}$$

where $R = \bar{Y}/\bar{X}$ is the ratio of the population means and $f = n/N$ is the sampling fraction.

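Proof:

$$\hat{R} - R = \frac{\bar{y}}{\bar{x}} - R = \frac{\bar{y} - R\bar{x}}{\bar{x}} \tag{6.2.2.0}$$

If $n$ is large, $\bar{x}$ should not differ greatly from $\bar{X}$. In order to avoid having to work out the
distribution of the ratio of two random variables $(\bar{y} - R\bar{x})$ and $\bar{x}$, we replace $\bar{x}$ by $\bar{X}$ in the
denominator of (6.2.2.0) as an approximation. This gives

$$\hat{R} - R \approx \frac{\bar{y} - R\bar{x}}{\bar{X}} \tag{6.2.2.1}$$

Now, when we average over all simple random samples of size $n$,

$$E(\hat{R} - R) \approx \frac{E(\bar{y} - R\bar{x})}{\bar{X}} = \frac{\bar{Y} - R\bar{X}}{\bar{X}} = 0 \tag{6.2.2.2}$$

since $R = \bar{Y}/\bar{X}$. To this order of approximation $\hat{R}$ is unbiased, so its MSE and variance coincide.

From (6.2.2.1) we also obtain the result

$$\mathrm{MSE}(\hat{R}) = E(\hat{R} - R)^2 \approx \frac{1}{\bar{X}^2}\,E(\bar{y} - R\bar{x})^2 \tag{6.2.2.3}$$

The quantity $(\bar{y} - R\bar{x})$ is the sample mean of the variate $d_i = y_i - Rx_i$, whose population mean is
$\bar{D} = \bar{Y} - R\bar{X} = 0$. Its expected square is therefore the variance of the mean of a simple random sample of the $d_i$, and by the standard result for simple random sampling without replacement,

$$E(\bar{y} - R\bar{x})^2 = \frac{1-f}{n}\,\frac{\sum_{i=1}^{N}(y_i - Rx_i)^2}{N-1}.$$

Substituting this into (6.2.2.3) gives the stated result.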
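
The argument can be checked numerically on a small population by listing every possible simple random sample. The sketch below is an illustration under stated assumptions: the six-unit population, the sample size, and the function name `enumerate_srs` are hypothetical. It confirms that the numerator $\bar{y} - R\bar{x}$ averages to zero over all samples, and it compares the large-sample approximation with the exact bias and MSE of $\hat{R}$.

```python
from itertools import combinations
from statistics import mean


def enumerate_srs(y, x, n):
    """Exact behaviour of R_hat over all simple random samples of size n."""
    N = len(y)
    R = sum(y) / sum(x)   # population ratio
    X_bar = mean(x)       # population mean of x
    f = n / N             # sampling fraction

    d_bars, r_hats = [], []
    for idx in combinations(range(N), n):
        y_bar = mean([y[i] for i in idx])
        x_bar = mean([x[i] for i in idx])
        d_bars.append(y_bar - R * x_bar)  # numerator of (6.2.2.0)
        r_hats.append(y_bar / x_bar)

    # Large-sample approximation to MSE(R_hat) stated in the text.
    s_d2 = sum((yi - R * xi) ** 2 for yi, xi in zip(y, x)) / (N - 1)
    approx_mse = (1 - f) / (n * X_bar ** 2) * s_d2

    return {
        "mean_numerator": mean(d_bars),  # zero up to rounding
        "linearised_mse": mean([d * d for d in d_bars]) / X_bar ** 2,
        "approx_mse": approx_mse,
        "exact_bias": mean(r_hats) - R,
        "exact_mse": mean([(r - R) ** 2 for r in r_hats]),
    }


# Hypothetical population of N = 6 units.
y_pop = [3, 5, 8, 10, 14, 15]
x_pop = [2, 3, 4, 6, 7, 9]
result = enumerate_srs(y_pop, x_pop, n=3)
for name, value in result.items():
    print(f"{name}: {value:.6f}")
```

The `linearised_mse` and `approx_mse` entries agree because the variance formula for a sample mean is exact under simple random sampling without replacement. Any gap between them and `exact_mse` comes only from replacing $\bar{x}$ by $\bar{X}$ in the denominator, which is why the result is stated for large $n$.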

Note: The approximate variance of the ratio $\bar{y}/\bar{x}$ is equal to the variance of the sample mean $\bar{y}$
if the variate $y_i$ is replaced by the variate $(y_i - Rx_i)/\bar{X}$.

A sample estimate of the variance of the ratio estimator is given by

$$\begin{aligned}
v(\hat{R}) &= \frac{1-f}{n\bar{X}^2}\,\frac{\sum_{i=1}^{n}(y_i - \hat{R}x_i)^2}{n-1} \\
&= \frac{1-f}{n\bar{X}^2(n-1)}\left[\sum_{i=1}^{n} y_i^2 + \hat{R}^2\sum_{i=1}^{n} x_i^2 - 2\hat{R}\sum_{i=1}^{n} x_i y_i\right] \\
&= \frac{1-f}{n\bar{X}^2}\left(s_y^2 + \hat{R}^2 s_x^2 - 2\hat{R}s_{xy}\right)
\end{aligned} \tag{6.2.2.5}$$

where $s_y^2$ and $s_x^2$ are the sample variances of $y$ and $x$, respectively, and

$$s_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$$

is the sample covariance.

If the population mean $\bar{X}$ is not known, its sample estimate $\bar{x}$ may be used provided that the
sample size is large ($n \geq 30$).

For the estimated standard error of $\hat{R}$, this gives

$$S(\hat{R}) = \frac{\sqrt{1-f}}{\sqrt{n}\,\bar{X}}\sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{R}x_i)^2}{n-1}} \tag{6.2.2.6}$$

If $\bar{X}$ is not known, the sample estimate $\bar{x}$ is substituted in the denominator of (6.2.2.6).

One way to compute $S(\hat{R})$ is to express it as

$$S(\hat{R}) = \frac{\sqrt{1-f}}{\sqrt{n}\,\bar{x}}\sqrt{\frac{\sum y_i^2 - 2\hat{R}\sum y_i x_i + \hat{R}^2\sum x_i^2}{n-1}} \tag{6.2.2.7}$$
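
As a rough check that the residual form and the variance-covariance form of (6.2.2.5) give the same number, the following sketch computes both, together with the standard error. The function name `ratio_variance`, the `X_bar=None` fallback, and the five-unit sample are assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def ratio_variance(y, x, N, X_bar=None):
    """Estimate v(R_hat) by two forms of (6.2.2.5) and return the SE too.

    If X_bar (population mean of x) is None, the sample mean of x is
    substituted, which the text allows only for large samples.
    """
    n = len(y)
    f = n / N
    y_bar, x_bar = mean(y), mean(x)
    r_hat = y_bar / x_bar
    if X_bar is None:
        X_bar = x_bar
    scale = (1 - f) / (n * X_bar ** 2)

    # Form 1: sum of squared residuals about the fitted ratio line.
    rss = sum((yi - r_hat * xi) ** 2 for yi, xi in zip(y, x))
    v_resid = scale * rss / (n - 1)

    # Form 2: sample variances and covariance (divisor n - 1).
    s_y2, s_x2 = variance(y), variance(x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for yi, xi in zip(y, x)) / (n - 1)
    v_moment = scale * (s_y2 + r_hat ** 2 * s_x2 - 2 * r_hat * s_xy)

    return v_resid, v_moment, sqrt(v_resid)


# Hypothetical sample of n = 5 units from a population of N = 50.
y_s = [12.0, 15.0, 9.0, 20.0, 14.0]
x_s = [20.0, 26.0, 17.0, 33.0, 24.0]
v1, v2, se = ratio_variance(y_s, x_s, N=50, X_bar=25.0)
print(v1, v2, se)
```

The two forms agree because the residuals $y_i - \hat{R}x_i$ have sample mean zero, so their raw sum of squares equals their corrected sum of squares.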

The sample estimator of the bias is given by

$$\hat{B}(\hat{R}) = \frac{1-f}{n\bar{X}^2}\left(\hat{R}s_x^2 - s_{xy}\right) = \frac{1-f}{n(n-1)\bar{X}^2}\left[\hat{R}\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i y_i\right] \tag{6.2.2.8}$$

where

$$s_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$$
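
The two expressions in (6.2.2.8) can be compared in the same way. This is a minimal sketch: the function name `ratio_bias_estimate` and the sample values are hypothetical, and the population mean `X_bar` is assumed to be known.

```python
from statistics import mean, variance


def ratio_bias_estimate(y, x, N, X_bar):
    """Return the bias estimate of R_hat from both forms of (6.2.2.8)."""
    n = len(y)
    f = n / N
    y_bar, x_bar = mean(y), mean(x)
    r_hat = y_bar / x_bar

    # Form 1: sample variance of x and sample covariance.
    s_x2 = variance(x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for yi, xi in zip(y, x)) / (n - 1)
    b_moment = (1 - f) / (n * X_bar ** 2) * (r_hat * s_x2 - s_xy)

    # Form 2: raw sums of squares and products, divisor n(n - 1).
    sum_x2 = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for yi, xi in zip(y, x))
    b_raw = (1 - f) / (n * (n - 1) * X_bar ** 2) * (r_hat * sum_x2 - sum_xy)

    return b_moment, b_raw


# Hypothetical sample of n = 5 units from a population of N = 50.
y_s = [12.0, 15.0, 9.0, 20.0, 14.0]
x_s = [20.0, 26.0, 17.0, 33.0, 24.0]
b1, b2 = ratio_bias_estimate(y_s, x_s, N=50, X_bar=25.0)
print(b1, b2)
```

The factor $(n-1)$ appears in the denominator only in the raw-sum form; with $s_x^2$ and $s_{xy}$ it has already been absorbed into the sample variance and covariance.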

Example 6.1

A simple random sample (without replacement) of 20 villages was selected from the 88
villages in a given state. The area under maize cultivation $y$ and the land area of the
village $x$, both in hectares, were recorded for each sample village, as shown in the table
below:

Village    y_i        x_i
1          242.82     313.39
2          245.65     764.05
3          352.90     655.27
4          755.57     797.72
5          247.68     310.80
6          553.22     968.66
7          792.81     1134.42
8          609.48     1688.68
9          1055.46    1623.93
10         781.07     1186.22
11         246.06     422.17
12         448.81     701.89
13         489.69     600.88
14         674.63     789.95
15         434.65     486.92
16         244.03     248.64
17         668.97     841.25
18         929.19     1440.04
19         533.39     1235.43
20         295.43     372.96

The total land area of the 88 villages is 83,819.96 hectares.

(i) Estimate the ratio of total area under maize cultivation to total land area in the state.

(ii) Estimate the total area under maize by method of ratio estimation.

(iii) Calculate the bias and variance of your estimates in (i) and (ii).

Solution

$N = 88$, $n = 20$, $\sum_{i=1}^{20} x_i = 16583.77$, $\bar{x} = 829.1885$, $\sum_{i=1}^{20} x_i^2 = 17{,}367{,}948$;

$\sum_{i=1}^{20} y_i = 10601.51$, $\bar{y} = 530.0755$, $\sum_{i=1}^{20} y_i^2 = 6{,}794{,}275.214$;

$X = 83819.96$; $\bar{X} = X/N = 952.4995$; $f = \dfrac{20}{88} = 0.2273$

(i) Using (6.2i), the estimate of the ratio is

$$\hat{R} = \frac{\bar{y}}{\bar{x}} = \frac{530.0755}{829.1885} = 0.639270202$$

(ii) The estimate of the total area under maize cultivation is

$$\hat{Y}_R = \hat{R}X = 0.639270202 \times 83819.96 = 53{,}583.60 \text{ hectares}$$

(iii) For the estimation of the variance and the bias, we shall use these sample results:

$$s_y^2 = \frac{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}{n-1} = \frac{6794275.214 - 20(530.0755)^2}{19} = 61{,}824.974$$

$$s_x^2 = \frac{17367948 - 20(829.1885)^2}{19} = 190{,}361.928$$

$$s_{xy} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{n-1} = \frac{10468440.11 - 20(530.0755)(829.1885)}{19} = 88{,}304.733$$

The variance of $\hat{R}$ is calculated using (6.2.2.5):

$$\begin{aligned}
v(\hat{R}) &= \frac{1-f}{n\bar{X}^2}\left(s_y^2 + \hat{R}^2 s_x^2 - 2\hat{R}s_{xy}\right) \\
&= \frac{1 - 0.2273}{20(952.4995)^2}\left[61824.974 + (0.6393)^2(190361.928) - 2(0.6393)(88304.733)\right] \\
&= 0.00114
\end{aligned}$$

The estimate of the bias of $\hat{R}$, from (6.2.2.8), is

$$\hat{B}(\hat{R}) = \frac{1 - 0.2273}{20(952.4995)^2}\left(0.6393 \times 190361.928 - 88304.733\right) = 0.00142$$

The ratio of the bias of $\hat{R}$ to its standard error is

$$\frac{\hat{B}(\hat{R})}{\sqrt{v(\hat{R})}} = \frac{0.00142}{0.03376} = 0.042$$

or 4.2%.

For the variance of $\hat{Y}_R$, we use

$$v(\hat{Y}_R) = X^2 v(\hat{R}) = (83819.96)^2(0.00114) \approx 8{,}009{,}396$$
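
The arithmetic of Example 6.1 can be retraced with a short script. The sketch below starts from the sums quoted in the solution rather than from the raw table, and the variable names are chosen here for illustration. It keeps full precision throughout, so the last digits can differ slightly from a hand calculation that works with rounded intermediate values such as 0.6393 or 0.00114.

```python
import math

# Summary figures quoted in the solution to Example 6.1.
N, n = 88, 20
X_total = 83819.96
sum_x, sum_y = 16583.77, 10601.51
sum_x2, sum_y2, sum_xy = 17367948.0, 6794275.214, 10468440.11

f = n / N
x_bar, y_bar = sum_x / n, sum_y / n
X_bar = X_total / N

# (i) ratio estimate and (ii) ratio estimate of the total.
r_hat = y_bar / x_bar
y_total_hat = r_hat * X_total

# Sample variances and covariance (divisor n - 1).
s_y2 = (sum_y2 - n * y_bar ** 2) / (n - 1)
s_x2 = (sum_x2 - n * x_bar ** 2) / (n - 1)
s_xy = (sum_xy - n * x_bar * y_bar) / (n - 1)

# (iii) variance (6.2.2.5), bias (6.2.2.8) and variance of the total.
scale = (1 - f) / (n * X_bar ** 2)
v_r = scale * (s_y2 + r_hat ** 2 * s_x2 - 2 * r_hat * s_xy)
bias_r = scale * (r_hat * s_x2 - s_xy)
se_r = math.sqrt(v_r)
v_y_total = X_total ** 2 * v_r

print(f"R_hat       = {r_hat:.6f}")
print(f"Y_hat_R     = {y_total_hat:.2f}")
print(f"v(R_hat)    = {v_r:.5f}")
print(f"bias(R_hat) = {bias_r:.5f}")
print(f"bias / SE   = {bias_r / se_r:.3f}")
print(f"v(Y_hat_R)  = {v_y_total:.0f}")
```

Because the bias is only about 4% of the standard error, it can be neglected here when judging the accuracy of $\hat{R}$.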

In-Text Question
Ratio estimators are generally biased. True or False

In-Text Answer
True

6.1.3 Comparison with the simple average


The circumstances under which the ratio estimate will be better than the simple average
(sample mean) will now be pointed out. The variance of $\hat{Y} = N\bar{y}$ in simple random sampling
(without replacement) is

$$V(\hat{Y}) = N^2(1-f)\frac{S_y^2}{n} \tag{6.2.2.9}$$

In this case no use is made of the auxiliary information provided by $x$. If this information is
used to form the ratio estimate $\hat{Y}_R = X\hat{R}$, a first approximation to the mean square error around
$Y$ has been found to be
