Use of Auxiliary Information For Estimating Population Mean in Systematic Sampling Under Non-Response
Florentin Smarandache
University of New Mexico, Gallup, USA
Published in:
Rajesh Singh, Florentin Smarandache (Editors)
ON IMPROVEMENT IN ESTIMATING POPULATION PARAMETER(S)
USING AUXILIARY INFORMATION
Educational Publishing (Columbus) & Journal of Matter Regularity (Beijing),
USA - China, 2013
ISBN: 978-1-59973-230-5
pp. 5 - 16
Abstract
In this paper we have adapted the Singh and Shukla (1987) estimator to systematic
sampling using auxiliary information in the presence of non-response. The properties of the
suggested family are discussed, and expressions for its bias and mean square error (MSE)
are derived. The optimum estimator of the family is compared with the ratio, product,
dual to ratio and sample mean estimators in systematic sampling under non-response.
A numerical illustration is carried out to verify the theoretical results.
1. Introduction
For some natural populations, such as forests, simple random sampling and other
sampling schemes are difficult to apply when estimating population characteristics.
In such situations, systematic sampling offers an easy way to select a sample from the
population. In this sampling scheme, only the first unit is selected at random, the rest
being selected automatically according to a predetermined pattern. Systematic sampling
has been considered in detail by Madow and Madow (1944), Cochran (1946) and Lahiri (1954).
The application of systematic sampling to forest surveys has been illustrated by
Hasel (1942), Finney (1948) and Nair and Bhargava (1951).
Auxiliary information plays an important role in improving the efficiency of
estimators in systematic sampling. Kushwaha and Singh (1989) suggested a class of almost
unbiased ratio and product type estimators for estimating the population mean using the
jack-knife technique initiated by Quenouille (1956). Later Banarasi et al. (1993), Singh
and Singh (1998), Singh et al. (2012), Singh et al. (2012) and Singh and Solanki (2012)
attempted to improve the estimators of the population mean using auxiliary information
in systematic sampling.
In the sequence of improving the estimator, Singh and Shukla (1987) proposed a
family of factor-type estimators for estimating the population mean in simple random
sampling using an auxiliary variable, as
$$T_\alpha = \bar{y}\left[\frac{(A + C)\bar{X} + fB\bar{x}}{(A + fB)\bar{X} + C\bar{x}}\right] \qquad (1.1)$$
where $\bar{y}$ and $\bar{x}$ are the sample means of the study and auxiliary variables, whose
population means are $\bar{Y}$ and $\bar{X}$ respectively. $A$, $B$ and $C$ are functions of $\alpha$,
a scalar chosen so that the MSE of the estimator $T_\alpha$ is minimum. Here
$$A = (\alpha - 1)(\alpha - 2), \quad B = (\alpha - 1)(\alpha - 4), \quad C = (\alpha - 2)(\alpha - 3)(\alpha - 4), \quad f = \frac{n}{N}.$$
Remark 1 : If we take α = 1, 2, 3 and 4, the resulting estimators will be ratio, product, dual
to ratio and sample mean estimators of population mean in simple random sampling
respectively (for details see Singh and Shukla (1987) ).
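Remark 1 can be checked numerically. The following is a minimal Python sketch rather than code from Singh and Shukla (1987): the function names and the sample figures are hypothetical, and the third coefficient is assumed to be $C = (\alpha - 2)(\alpha - 3)(\alpha - 4)$, the form under which the four special cases of Remark 1 hold.

```python
def factor_type_coefficients(alpha):
    """Return (A, B, C) for a given alpha; the form of C is an assumption."""
    A = (alpha - 1) * (alpha - 2)
    B = (alpha - 1) * (alpha - 4)
    C = (alpha - 2) * (alpha - 3) * (alpha - 4)
    return A, B, C


def factor_type_estimate(alpha, y_bar, x_bar, X_bar, n, N):
    """Evaluate T_alpha of equation (1.1) from sample and population means."""
    A, B, C = factor_type_coefficients(alpha)
    f = n / N  # sampling fraction
    numerator = (A + C) * X_bar + f * B * x_bar
    denominator = (A + f * B) * X_bar + C * x_bar
    return y_bar * numerator / denominator


# Hypothetical figures, chosen only to illustrate the four special cases.
y_bar, x_bar, X_bar, n, N = 10.0, 4.0, 5.0, 20, 100

ratio = y_bar * X_bar / x_bar
product = y_bar * x_bar / X_bar
dual_to_ratio = y_bar * (N * X_bar - n * x_bar) / ((N - n) * X_bar)

for alpha, classical in [(1, ratio), (2, product), (3, dual_to_ratio), (4, y_bar)]:
    t_alpha = factor_type_estimate(alpha, y_bar, x_bar, X_bar, n, N)
    print(alpha, round(t_alpha, 6), round(classical, 6))
```

Each printed row pairs the family member with the classical estimator it is meant to reproduce, so the two numbers on a row should agree.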
In this paper, we have proposed a family of factor-type estimators for estimating the
population mean in systematic sampling in the presence of non-response adapting Singh and
Shukla (1987) estimator. The properties of the proposed family have been discussed with the
help of empirical study.
class respectively in the population. Obviously, $N_1$ and $N_2$ are not known but their unbiased
estimates can be obtained from the sample as
$$\hat{N}_1 = n_1 N / n; \qquad \hat{N}_2 = n_2 N / n.$$
Further, using the Hansen and Hurwitz (1946) technique we select a sub-sample of size
$h_2$ from the $n_2$ non-respondent units such that $n_2 = h_2 L$ ($L > 1$) and gather the information
on all the units selected in the sub-sample (for details on the Hansen and Hurwitz (1946)
technique see Singh and Kumar (2009)).
Let $Y$ and $X$ be the study and auxiliary variables with respective population means
$\bar{Y}$ and $\bar{X}$. Let $y_{ij}$ ($x_{ij}$) be the observation on the $j$th unit in the $i$th systematic sample under
the study (auxiliary) variable ($i = 1, \dots, k$; $j = 1, \dots, n$). Let us consider the situation in which
non-response is observed on the study variable and the auxiliary variable is free from non-response. The
Hansen-Hurwitz (1946) estimator of the population mean $\bar{Y}$ and the sample mean estimator of $\bar{X}$
based on a systematic sample of size $n$ are respectively given by
$$\bar{y}^* = \frac{n_1 \bar{y}_{n_1} + n_2 \bar{y}_{h_2}}{n}$$
and
$$\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_{ij}$$
where $\bar{y}_{n_1}$ and $\bar{y}_{h_2}$ are respectively the means based on $n_1$ respondent units and $h_2$
non-respondent units. Obviously, $\bar{y}^*$ and $\bar{x}$ are unbiased estimators of $\bar{Y}$ and $\bar{X}$ respectively. The
respective variances of $\bar{y}^*$ and $\bar{x}$ are expressed as
$$V(\bar{y}^*) = \frac{N-1}{nN}\{1 + (n-1)\rho_Y\}S_Y^2 + \frac{L-1}{n}W_2 S_{Y2}^2 \qquad (2.1)$$
and
$$V(\bar{x}) = \frac{N-1}{nN}\{1 + (n-1)\rho_X\}S_X^2 \qquad (2.2)$$
where $\rho_Y$ and $\rho_X$ are the correlation coefficients between a pair of units within the
systematic sample for the study and auxiliary variables respectively. $S_Y^2$ and $S_X^2$ are
respectively the mean squares of the entire group for the study and auxiliary variables. $S_{Y2}^2$ is the
population mean square of the non-response group under the study variable and $W_2$ is the
non-response rate in the population.
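The Hansen-Hurwitz estimator and the variance in (2.1) translate directly into code. The sketch below is illustrative only: the function names and the sample figures are hypothetical, and the population quantities ($\rho_Y$, $S_Y^2$, $S_{Y2}^2$, $W_2$) are taken as known inputs.

```python
from statistics import mean


def hansen_hurwitz_mean(y_respondents, y_subsample, n2):
    """Estimate the population mean as (n1 * ybar_n1 + n2 * ybar_h2) / n.

    y_respondents: values from the n1 responding units.
    y_subsample:   values from the h2 sub-sampled non-respondents.
    n2:            number of non-respondents in the original sample.
    """
    n1 = len(y_respondents)
    n = n1 + n2
    return (n1 * mean(y_respondents) + n2 * mean(y_subsample)) / n


def variance_hh_systematic(N, n, rho_Y, S2_Y, L, W2, S2_Y2):
    """Variance of the Hansen-Hurwitz mean under systematic sampling, eq. (2.1)."""
    systematic_part = (N - 1) / (n * N) * (1 + (n - 1) * rho_Y) * S2_Y
    nonresponse_part = (L - 1) / n * W2 * S2_Y2
    return systematic_part + nonresponse_part


# Hypothetical sample: 3 respondents, 4 non-respondents, sub-sample of 2 (L = 2).
print(hansen_hurwitz_mean([10.0, 12.0, 14.0], [20.0, 22.0], n2=4))
print(variance_hh_systematic(N=100, n=10, rho_Y=0.1, S2_Y=50.0, L=2, W2=0.2, S2_Y2=40.0))
```

The second term of the variance vanishes when $L = 1$, that is, when every non-respondent is followed up.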
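$$\bar{y}_R^* = \frac{\bar{y}^*}{\bar{x}}\bar{X}, \qquad (2.3)$$
$$\bar{y}_P^* = \frac{\bar{y}^*\,\bar{x}}{\bar{X}} \qquad (2.4)$$
and
$$\bar{y}_D^* = \bar{y}^*\,\frac{N\bar{X} - n\bar{x}}{(N - n)\bar{X}}. \qquad (2.5)$$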
Obviously, all the above estimators $\bar{y}_R^*$, $\bar{y}_P^*$ and $\bar{y}_D^*$ are biased. To derive the biases
and mean square errors (MSE) of the estimators $\bar{y}_R^*$, $\bar{y}_P^*$ and $\bar{y}_D^*$ under large sample
approximation, let
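$$\bar{y}^* = \bar{Y}(1 + e_0), \qquad \bar{x} = \bar{X}(1 + e_1),$$
so that, $\bar{y}^*$ and $\bar{x}$ being unbiased, $E(e_0) = E(e_1) = 0$ and
$$E(e_0^2) = \frac{V(\bar{y}^*)}{\bar{Y}^2} = \frac{N-1}{nN}\{1 + (n-1)\rho_Y\}C_Y^2 + \frac{L-1}{n}W_2\frac{S_{Y2}^2}{\bar{Y}^2}, \qquad (2.6)$$
$$E(e_1^2) = \frac{V(\bar{x})}{\bar{X}^2} = \frac{N-1}{nN}\{1 + (n-1)\rho_X\}C_X^2 \qquad (2.7)$$
and
$$E(e_0 e_1) = \frac{Cov(\bar{y}^*, \bar{x})}{\bar{Y}\bar{X}} = \frac{N-1}{nN}\{1 + (n-1)\rho_Y\}^{1/2}\{1 + (n-1)\rho_X\}^{1/2}\rho\, C_Y C_X \qquad (2.8)$$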
where $C_Y$ and $C_X$ are the coefficients of variation of the study and auxiliary variables
respectively in the population (for proof see Singh and Singh (1998) and Singh (2003, p. 138)).
The biases and MSEs of the estimators $\bar{y}_R^*$, $\bar{y}_P^*$ and $\bar{y}_D^*$ up to the first order of
approximation are, respectively,
$$B(\bar{y}_R^*) = \frac{N-1}{nN}\bar{Y}\{1 + (n-1)\rho_X\}(1 - K\rho^*)C_X^2, \qquad (2.9)$$
$$MSE(\bar{y}_R^*) = \frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[\rho^{*2}C_Y^2 + (1 - 2K\rho^*)C_X^2\right] + \frac{L-1}{n}W_2 S_{Y2}^2, \qquad (2.10)$$
$$B(\bar{y}_P^*) = \frac{N-1}{nN}\bar{Y}\{1 + (n-1)\rho_X\}K\rho^* C_X^2, \qquad (2.11)$$
$$MSE(\bar{y}_P^*) = \frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[\rho^{*2}C_Y^2 + (1 + 2K\rho^*)C_X^2\right] + \frac{L-1}{n}W_2 S_{Y2}^2, \qquad (2.12)$$
$$B(\bar{y}_D^*) = \frac{N-1}{nN}\bar{Y}\{1 + (n-1)\rho_X\}\frac{f}{1-f}\left[-\rho^* K\right]C_X^2, \qquad (2.13)$$
$$MSE(\bar{y}_D^*) = \frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[\rho^{*2}C_Y^2 + \frac{f}{1-f}\left\{\frac{f}{1-f} - 2\rho^* K\right\}C_X^2\right] + \frac{L-1}{n}W_2 S_{Y2}^2 \qquad (2.14)$$
where
$$\rho^* = \frac{\{1 + (n-1)\rho_Y\}^{1/2}}{\{1 + (n-1)\rho_X\}^{1/2}} \quad \text{and} \quad K = \rho\,\frac{C_Y}{C_X}.$$
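The role of $\rho^*$ and $K$ is easier to see once the moments (2.6) to (2.8) are rewritten with a common factor. The restatement below is a worked check, with $\theta = (N-1)/(nN)$ introduced here purely as shorthand (it is not notation from the original).

```latex
\begin{align*}
E(e_0^2) &= \theta\{1 + (n-1)\rho_X\}\,\rho^{*2} C_Y^2
            + \frac{L-1}{n} W_2 \frac{S_{Y2}^2}{\bar{Y}^2}, \\
E(e_1^2) &= \theta\{1 + (n-1)\rho_X\}\, C_X^2, \\
E(e_0 e_1) &= \theta\{1 + (n-1)\rho_X\}\,\rho^{*} K\, C_X^2 .
\end{align*}
```

For instance, the first-order MSE of the ratio estimator, $\bar{Y}^2\left[E(e_0^2) + E(e_1^2) - 2E(e_0 e_1)\right]$, then reduces term by term to (2.10).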
$$\bar{y}_{lr}^* = \bar{y}^* + b(\bar{X} - \bar{x}) \qquad (2.15)$$
$$MSE(\bar{y}_{lr}^*) = \frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[C_Y^2 - K^2 C_X^2\right]\rho^{*2} + \frac{L-1}{n}W_2 S_{Y2}^2 \qquad (2.16)$$
Adapting the estimator proposed by Singh and Shukla (1987), a family of factor-type
estimators of population mean in systematic sampling under non-response is written as
$$T_\alpha^* = \bar{y}^*\left[\frac{(A + C)\bar{X} + fB\bar{x}}{(A + fB)\bar{X} + C\bar{x}}\right]. \qquad (3.1)$$
It can easily be seen that the proposed family generates the non-response versions of
some well known estimators of population mean in systematic sampling on putting different
choices of α . For example, if we take α = 1, 2, 3 and 4, the resulting estimators will be ratio,
product, dual to ratio and sample mean estimators of population mean in systematic sampling
under non-response respectively.
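Obviously, the proposed family is biased for the population mean $\bar{Y}$. In order to find
the bias and MSE of $T_\alpha^*$, we use large sample approximations. Expressing the equation (3.1)
in terms of $e_0$ and $e_1$, we have
$$T_\alpha^* = \frac{\bar{Y}(1 + e_0)(1 + De_1)^{-1}\left[(A + C) + fB(1 + e_1)\right]}{A + fB + C} \qquad (3.2)$$
where $D = \dfrac{C}{A + fB + C}$.
Since $|D| < 1$ and $|e_i| < 1$, neglecting the terms of $e_i$'s ($i = 0, 1$) having power greater
than two, the equation (3.2) can be written as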
$$T_\alpha^* - \bar{Y} = \frac{\bar{Y}}{A + fB + C}\Big[(A + C)\left\{e_0 - De_1 + D^2 e_1^2 - De_0 e_1\right\} + fB\left\{e_0 - (D-1)e_1 + D(D-1)e_1^2 - (D-1)e_0 e_1\right\}\Big]. \qquad (3.3)$$
$$E\left[T_\alpha^* - \bar{Y}\right] = \frac{\bar{Y}(C - fB)}{A + fB + C}\left[\frac{C}{A + fB + C}E(e_1^2) - E(e_0 e_1)\right].$$
Let $\phi_1(\alpha) = \dfrac{fB}{A + fB + C}$ and $\phi_2(\alpha) = \dfrac{C}{A + fB + C}$, then
$$\phi(\alpha) = \phi_2(\alpha) - \phi_1(\alpha) = \frac{C - fB}{A + fB + C}.$$
Thus, we have
$$E\left[T_\alpha^* - \bar{Y}\right] = \bar{Y}\phi(\alpha)\left[\phi_2(\alpha)E(e_1^2) - E(e_0 e_1)\right]. \qquad (3.4)$$
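Putting the values of $E(e_1^2)$ and $E(e_0 e_1)$ from equations (2.7) and (2.8) into the
equation (3.4), we get the bias of $T_\alpha^*$ as
$$B(T_\alpha^*) = \phi(\alpha)\frac{N-1}{nN}\bar{Y}\{1 + (n-1)\rho_X\}\left[\phi_2(\alpha) - \rho^* K\right]C_X^2. \qquad (3.5)$$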
Squaring both sides of the equation (3.3) and then taking expectation, we get
$$E\left[T_\alpha^* - \bar{Y}\right]^2 = \bar{Y}^2\left[E(e_0^2) + \phi^2(\alpha)E(e_1^2) - 2\phi(\alpha)E(e_0 e_1)\right]. \qquad (3.6)$$
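The step from (3.3) to (3.6) keeps only the first-order terms of (3.3) before squaring. A sketch of that reduction, using $D = C/(A + fB + C)$ and $\phi(\alpha) = (C - fB)/(A + fB + C)$, is:

```latex
\begin{align*}
T_\alpha^* - \bar{Y}
  &\approx \frac{\bar{Y}}{A + fB + C}
     \left[(A + C)(e_0 - De_1) + fB\{e_0 - (D - 1)e_1\}\right] \\
  &= \frac{\bar{Y}}{A + fB + C}
     \left[(A + fB + C)e_0 - \{(A + fB + C)D - fB\}e_1\right] \\
  &= \bar{Y}\left[e_0 - \frac{C - fB}{A + fB + C}\,e_1\right]
   = \bar{Y}\{e_0 - \phi(\alpha)e_1\}, \\
E\left[T_\alpha^* - \bar{Y}\right]^2
  &\approx \bar{Y}^2\left[E(e_0^2) + \phi^2(\alpha)E(e_1^2)
     - 2\phi(\alpha)E(e_0 e_1)\right].
\end{align*}
```

The family therefore behaves, to first order, like $\bar{y}^*$ corrected by $\phi(\alpha)$ times the relative error of $\bar{x}$, which is why the MSE depends on $\alpha$ only through $\phi(\alpha)$.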
Substituting the values of $E(e_0^2)$, $E(e_1^2)$ and $E(e_0 e_1)$ from the respective equations
(2.6), (2.7) and (2.8) into the equation (3.6), we get the MSE of $T_\alpha^*$ as
$$MSE(T_\alpha^*) = \frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[\rho^{*2}C_Y^2 + \left\{\phi^2(\alpha) - 2\phi(\alpha)\rho^* K\right\}C_X^2\right] + \frac{L-1}{n}W_2 S_{Y2}^2. \qquad (3.7)$$
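Equation (3.7) can be evaluated for any $\alpha$ once $\phi(\alpha)$ is computed. The sketch below is a hedged illustration: the function names and the parameter values are hypothetical (they are not the Murthy (1967) data), and $C = (\alpha - 2)(\alpha - 3)(\alpha - 4)$ is assumed for the third coefficient.

```python
def phi(alpha, f):
    """phi(alpha) = (C - fB) / (A + fB + C); the form of C is an assumption."""
    A = (alpha - 1) * (alpha - 2)
    B = (alpha - 1) * (alpha - 4)
    C = (alpha - 2) * (alpha - 3) * (alpha - 4)
    return (C - f * B) / (A + f * B + C)


def mse_family(alpha, N, n, Y_bar, rho_X, rho_Y, rho, C_Y, C_X, L, W2, S2_Y2):
    """First-order MSE of T*_alpha following equation (3.7)."""
    f = n / N
    theta = (N - 1) / (n * N)
    design_X = 1 + (n - 1) * rho_X
    rho_star = ((1 + (n - 1) * rho_Y) / design_X) ** 0.5
    K = rho * C_Y / C_X
    p = phi(alpha, f)
    bracket = rho_star ** 2 * C_Y ** 2 + (p ** 2 - 2 * p * rho_star * K) * C_X ** 2
    return theta * Y_bar ** 2 * design_X * bracket + (L - 1) / n * W2 * S2_Y2


# Hypothetical population parameters, for illustration only.
params = dict(N=200, n=20, Y_bar=50.0, rho_X=0.05, rho_Y=0.08, rho=0.8,
              C_Y=0.4, C_X=0.5, L=2, W2=0.2, S2_Y2=100.0)

for alpha in (1, 2, 3, 4):
    print(alpha, phi(alpha, params["n"] / params["N"]), mse_family(alpha, **params))
```

With these definitions $\phi(1) = 1$, $\phi(2) = -1$, $\phi(3) = f/(1-f)$ and $\phi(4) = 0$, so the four calls correspond to (2.10), (2.12), (2.14) and the variance (2.1) respectively.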
3.2 Optimum Choice of α
In order to obtain the optimum choice of $\alpha$, we differentiate the equation (3.7) with
respect to $\alpha$ and, equating the derivative to zero, get the normal equation as
$$\frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[2\phi(\alpha)\phi'(\alpha) - 2\phi'(\alpha)\rho^* K\right]C_X^2 = 0 \qquad (3.8)$$
or
$$\phi(\alpha) = \rho^* K \qquad (3.9)$$
which is a cubic equation in $\alpha$. Thus $\alpha$ has at least one and at most three real roots,
and at each of them the MSE of the proposed family attains its minimum.
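Clearing the denominator in $\phi(\alpha) = V$, with $V = \rho^* K$, gives the polynomial $(1 - V)C - f(1 + V)B - VA = 0$, whose real roots can be located numerically. The following is a minimal sketch under stated assumptions: the function names, the search interval and the values of $V$ and $f$ are hypothetical, and $C = (\alpha - 2)(\alpha - 3)(\alpha - 4)$ is assumed. It uses a plain sign-change scan with bisection instead of a closed-form cubic solution.

```python
def phi(alpha, f):
    """phi(alpha) = (C - fB) / (A + fB + C); the form of C is an assumption."""
    A = (alpha - 1) * (alpha - 2)
    B = (alpha - 1) * (alpha - 4)
    C = (alpha - 2) * (alpha - 3) * (alpha - 4)
    return (C - f * B) / (A + f * B + C)


def optimum_alphas(V, f, lo=-50.0, hi=50.0, steps=100000, iterations=80):
    """Real roots in [lo, hi) of (1 - V)C - f(1 + V)B - V A = 0.

    Roots are found from sign changes on a grid and refined by bisection, so a
    root of even multiplicity (no sign change) would be missed. A root at which
    A + fB + C also vanishes would not be a valid solution of phi(alpha) = V.
    """
    def g(alpha):
        A = (alpha - 1) * (alpha - 2)
        B = (alpha - 1) * (alpha - 4)
        C = (alpha - 2) * (alpha - 3) * (alpha - 4)
        return (1 - V) * C - f * (1 + V) * B - V * A

    roots = []
    left, g_left = lo, g(lo)
    for i in range(1, steps + 1):
        right = lo + (hi - lo) * i / steps
        g_right = g(right)
        if g_left == 0.0:
            roots.append(left)  # grid point is an exact root
        elif g_left * g_right < 0:
            a, b, g_a = left, right, g_left
            for _ in range(iterations):  # bisection on the bracketing interval
                mid = 0.5 * (a + b)
                g_mid = g(mid)
                if g_a * g_mid <= 0:
                    b = mid
                else:
                    a, g_a = mid, g_mid
            roots.append(0.5 * (a + b))
        left, g_left = right, g_right
    return roots


# Hypothetical values of V = rho* K and f = n / N.
for alpha in optimum_alphas(V=0.6, f=0.1):
    print(alpha, phi(alpha, 0.1))
```

Every root returned gives the same minimum MSE, since (3.7) depends on $\alpha$ only through $\phi(\alpha)$.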
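Putting the value of $\phi(\alpha)$ from equation (3.9) into equation (3.7), we get
$$MSE(T_\alpha^*)_{\min} = \frac{N-1}{nN}\bar{Y}^2\{1 + (n-1)\rho_X\}\left[C_Y^2 - K^2 C_X^2\right]\rho^{*2} + \frac{L-1}{n}W_2 S_{Y2}^2$$
which is the MSE of the usual regression estimator of population mean in systematic
sampling under non-response, given in (2.16).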
4. Empirical Study
In support of the theoretical results, we have considered the data given in Murthy
(1967, p. 131-132). These data relate to the length and timber volume for ten blocks of
the Blacks Mountain Experimental Forest. The values of the intraclass correlation coefficients
$\rho_X$ and $\rho_Y$ have been given as approximately equal by Murthy (1967, p. 149) and Kushwaha
and Singh (1989) for the systematic sample of size 16 by enumerating all possible systematic
samples after arranging the data in ascending order of strip length. The particulars of the
population are given below:
$$S_{Y2}^2 = \frac{3}{4}S_Y^2 = 18086.0025.$$
Table 1 depicts the MSEs and variance of the estimators of the proposed family with
respect to the non-response rate ($W_2$).

α                         W2
1 (= $\bar{y}_R^*$)     371.37    484.41    597.45    710.48
2 (= $\bar{y}_P^*$)    1908.81   2021.85   2134.89   2247.93
3 (= $\bar{y}_D^*$)    1063.22   1176.26   1289.30   1402.33
4 (= $\bar{y}^*$)      1140.69   1253.13   1366.17   1479.205
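By (3.7), for a fixed $\alpha$ the MSE is linear in $W_2$ with slope $(L - 1)S_{Y2}^2/n$, so consecutive entries in any row of Table 1 should differ by $(L - 1)\,\Delta W_2\, S_{Y2}^2/n$. The sketch below only rearranges the printed figures to recover the implied value of $(L - 1)\,\Delta W_2$; it takes $n = 16$ from the sample size quoted above and $S_{Y2}^2 = 18086.0025$ as stated, and the variable names are hypothetical.

```python
# Rows of Table 1 as printed (MSE or variance at four successive values of W2).
table_1 = {
    "ratio (alpha = 1)": [371.37, 484.41, 597.45, 710.48],
    "product (alpha = 2)": [1908.81, 2021.85, 2134.89, 2247.93],
    "dual to ratio (alpha = 3)": [1063.22, 1176.26, 1289.30, 1402.33],
    "sample mean (alpha = 4)": [1140.69, 1253.13, 1366.17, 1479.205],
}

n = 16              # systematic sample size quoted in the text
S2_Y2 = 18086.0025  # mean square of the non-response group


def implied_steps(row):
    """Return (L - 1) * delta W2 implied by each pair of consecutive entries."""
    return [(later - earlier) * n / S2_Y2 for earlier, later in zip(row, row[1:])]


for label, row in table_1.items():
    print(label, [round(step, 4) for step in implied_steps(row)])
```

Each implied step comes out close to 0.1, which is consistent with the linear dependence on $W_2$ and with the statement that the MSE increases with the non-response rate.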
5. Conclusion
In this paper, we have adapted the Singh and Shukla (1987) estimator to systematic
sampling in the presence of non-response using an auxiliary variable and obtained the
optimum estimator of the proposed family. The proposed family generates the
non-response versions of a number of estimators of the population mean in
systematic sampling for different choices of $\alpha$. From Table 1, we observe that the proposed
family under the optimum condition has minimum MSE, which is equal to the MSE of the
regression estimator (most classes of estimators in the sampling literature attain, under
their optimum condition, an MSE equal to that of the regression estimator). It is also seen
that the MSE or variance of the estimators increases with the non-response rate in the
population.
References
1. Banarasi, Kushwaha, S.N.S. and Kushwaha, K.S. (1993): A class of ratio, product and
difference (RPD) estimators in systematic sampling, Microelectron. Reliab., 33, 4,
455–457.
2. Cochran, W. G. (1946): Relative accuracy of systematic and stratified random
samples for a certain class of population, AMS, 17, 164-177.
3. Finney, D.J. (1948): Random and systematic sampling in timber surveys, Forestry,
22, 64-99.
4. Hansen, M. H. and Hurwitz, W. N. (1946) : The problem of non-response in sample
surveys, Jour. of The Amer. Stat. Assoc., 41, 517-529.
5. Hasel, A. A. (1942): Estimation of volume in timber stands by strip sampling, AMS,
13, 179-206.
6. Kushwaha, K. S. and Singh, H.P. (1989): Class of almost unbiased ratio and product
estimators in systematic sampling, Jour. Ind. Soc. Ag. Statistics, 41, 2, 193–205.
7. Lahiri, D. B. (1954): On the question of bias of systematic sampling, Proceedings of
World Population Conference, 6, 349-362.
8. Madow, W. G. and Madow, L.H. (1944): On the theory of systematic sampling, I.
Ann. Math. Statist., 15, 1-24.
9. Murthy, M.N. (1967): Sampling Theory and Methods. Statistical Publishing Society,
Calcutta.
10. Nair, K. R. and Bhargava, R. P. (1951): Statistical sampling in timber surveys in
India, Forest Research Institute, Dehradun, Indian forest leaflet, 153.
11. Quenouille, M. H. (1956): Notes on bias in estimation, Biometrika, 43, 353-360.
12. Singh, R and Singh, H. P. (1998): Almost unbiased ratio and product type- estimators
in systematic sampling, Questiio, 22,3, 403-416.
13. Singh, R., Malik, S., Chaudhary, M.K., Verma, H. and Adewara, A. A. (2012): A
general family of ratio type-estimators in systematic sampling, Jour. Reliab. and Stat.
Stud., 5(1), 73-82.
14. Singh, R., Malik, S., Singh, V. K. (2012) : An improved estimator in systematic
sampling. Jour. Of Scie. Res., 56, 177-182.
15. Singh, H.P. and Kumar, S. (2009) : A general class of dss estimators of population
ratio, product and mean in the presence of non-response based on the sub-sampling of
the non-respondents. Pak J. Statist., 26(1), 203-238.
16. Singh, H.P. and Solanki, R. S. (2012) : An efficient class of estimators for the
population mean using auxiliary information in systematic sampling. Jour. of Stat.
Ther. and Pract., 6(2), 274-285.
17. Singh, S. (2003) : Advanced sampling theory with applications. Kluwer Academic
Publishers.
18. Singh, V. K. and Shukla, D. (1987): One parameter family of factor-type ratio
estimators, Metron, 45 (1-2), 273-283.