Chapter 3 - 2012
In some cases the nature of the survey requires recording attributes, that is, characteristics
that are expressed qualitatively. Qualitative information can be quantified by counting the
units that possess the attribute. Such attributes take various forms, such as living in an
urban or a rural area, being male or female, married or unmarried, literate or illiterate,
an adult between 18 and 45 years or an adult over 45 years, etc.
The main interest for such attributes is usually to estimate the total number of units and the
proportion of units in the population possessing a given characteristic. Attributes can be
changed into quantifiable information by allocating the score "1" or "0", while
measurable variables can also be changed into attributes by categorizing the population
into different groups.
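As a small illustration of turning a measured variable into a 0/1 attribute, the following Python sketch scores a list of ages by membership in the 18 to 45 age group mentioned above. The ages are hypothetical values chosen only for the example, and treating the bounds as inclusive is an assumption.

```python
# Hypothetical ages (in years) of a handful of surveyed adults
ages = [19, 52, 34, 47, 28, 61, 45, 23]

# Score 1 if the adult is between 18 and 45 years (bounds assumed inclusive), else 0
y = [1 if 18 <= age <= 45 else 0 for age in ages]

count_in_class = sum(y)                          # number of units with the attribute
proportion_in_class = count_in_class / len(y)    # share of units with the attribute

print(y, count_in_class, proportion_in_class)
```

Summing the scores gives the count of units with the attribute, and dividing by the number of units gives the proportion.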
It is worth presenting the special simple form that the variance of a proportion takes when
the design is simple random sampling. The following discussion considers a
population divided into two categories, in which each member of the population is
classified as either having or not having a specified characteristic of interest.
3.2 Variances and Standard Errors of the Estimates of the Population Proportion
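For any unit in the population or in the sample, we define an observation (variable) $y_i$ as
follows to facilitate counting, where C denotes the class of units having the characteristic:
\[ y_i = \begin{cases} 1, & \text{if the unit is in } C \\ 0, & \text{if the unit is not in } C \end{cases} \]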
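For the population,
\[ Y = \sum_{i=1}^{N} Y_i = A, \qquad \bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i = \frac{A}{N} = P, \qquad S^2 = \frac{NPQ}{N-1}, \]
and for the sample,
\[ y = \sum_{i=1}^{n} y_i = a, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{a}{n} = p, \qquad s^2 = \frac{npq}{n-1}, \]
where $Q = 1 - P$ and $q = 1 - p$ (verify).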
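The identity $s^2 = npq/(n-1)$ for a 0/1 variable can be checked numerically. The following Python sketch compares the usual sample variance with the shortcut formula; the sample values and the helper name `sample_variance_from_counts` are hypothetical and serve only as an illustration.

```python
from statistics import variance

def sample_variance_from_counts(a: int, n: int) -> float:
    """Shortcut s^2 = n*p*q/(n-1) for a 0/1 variable with a ones among n units."""
    p = a / n
    q = 1 - p
    return n * p * q / (n - 1)

# Hypothetical sample: 1 = unit is in class C, 0 = unit is not in C
y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
n = len(y)
a = sum(y)      # number of sample units in C
p = a / n       # sample proportion, equal to the sample mean of the y_i

direct = variance(y)                          # sum (y_i - ybar)^2 / (n - 1)
shortcut = sample_variance_from_counts(a, n)  # n*p*q / (n - 1)

print(a, p, direct, shortcut)
```

Both computations agree up to floating-point rounding, because for 0/1 scores the sum of squares equals the count $a$.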
Similar to the continuous case, a sample proportion, $p$, can be used to make inferences
about a population proportion $P$. Just like the sample mean $\bar{y}$, the sample proportion $p$ is
also a random variable that depends on which members of the population are included in
the sample.
Theorem 5:
Theorem 6:
Theorem 7:
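An unbiased estimate of the variance of $p$, computed from the sample, is
\[ \operatorname{var}(p) = (1-f)\,\frac{pq}{n-1} = \frac{N-n}{N}\cdot\frac{pq}{n-1}, \]
where $f = n/N$ is the sampling fraction. If N is large relative to n, the finite population
correction $(1-f)$ is negligible and the variance of $p$ is approximately
\[ \operatorname{var}(p) = \frac{pq}{n-1}. \]
(Prove this theorem.)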
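To make the variance estimate concrete, here is a minimal Python sketch that computes $p$, the estimated variance $(1-f)pq/(n-1)$, and the standard error. The function name `estimate_proportion` is hypothetical; the input values (n = 120, a = 72, N = 1200) are taken from the teacher example in this chapter.

```python
import math

def estimate_proportion(a: int, n: int, N: int):
    """Return (p, var_p, se_p) under simple random sampling without replacement."""
    p = a / n                          # sample proportion
    q = 1 - p
    f = n / N                          # sampling fraction
    var_p = (1 - f) * p * q / (n - 1)  # unbiased estimate of the variance of p
    return p, var_p, math.sqrt(var_p)

p, var_p, se_p = estimate_proportion(a=72, n=120, N=1200)
print(p, var_p, se_p)   # p = 0.6, s.e.(p) is about 0.0426
```

Dropping the factor `(1 - f)` gives the large-population approximation $pq/(n-1)$.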
3.4 Confidence Limits
For the proportion estimate, the confidence limits for large sample sizes can be obtained from
\[ P = p \pm Z_{\alpha/2}\,\mathrm{S.E.}(p). \]
Substituting the sample estimate s.e.(p) for S.E.(p) gives the confidence interval
\[ P = p \pm Z_{\alpha/2}\,\mathrm{s.e.}(p). \]
A slight improvement can be achieved by applying the continuity correction for the normal
approximation to the binomial, i.e.
\[ P = p \pm \left[ Z_{\alpha/2}\,\mathrm{s.e.}(p) + \frac{1}{2n} \right]. \]
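The confidence limits can be sketched in Python as follows. This is an illustration only: the function name `proportion_ci` is hypothetical, the default `z = 1.96` assumes a 95% confidence level, and s.e.(p) is taken as $\sqrt{(1-f)pq/(n-1)}$. The inputs reuse the teacher example in this chapter (n = 120, a = 72, N = 1200).

```python
import math

def proportion_ci(a: int, n: int, N: int, z: float = 1.96, continuity: bool = False):
    """Normal-approximation confidence limits for P; z = 1.96 assumes 95% confidence."""
    p = a / n
    se = math.sqrt((1 - n / N) * p * (1 - p) / (n - 1))  # s.e.(p) with fpc
    half_width = z * se
    if continuity:
        half_width += 1 / (2 * n)      # continuity correction 1/(2n)
    return p - half_width, p + half_width

lower, upper = proportion_ci(72, 120, 1200)
lower_cc, upper_cc = proportion_ci(72, 120, 1200, continuity=True)
print(lower, upper)        # roughly 0.516 to 0.684
print(lower_cc, upper_cc)  # slightly wider: roughly 0.512 to 0.688
```

The continuity correction widens the interval by $1/(2n)$ on each side, which matters most for small samples.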
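Generally, the coefficient of variation of an estimator $\hat{\theta}$ is given by
\[ CV(\hat{\theta}) = \frac{S.E.(\hat{\theta})}{E(\hat{\theta})}, \]
and its square is known as the rel-variance, i.e.,
\[ CV^2(\hat{\theta}) = \frac{Var(\hat{\theta})}{(E(\hat{\theta}))^2}. \]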
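For a sample proportion, the coefficient of variation can be estimated by plugging sample quantities into the definition, that is, s.e.(p) in place of the standard error and p in place of the expectation. The Python sketch below does this under those assumptions; the function name `cv_of_proportion` is hypothetical and the inputs come from the teacher example in this chapter.

```python
import math

def cv_of_proportion(a: int, n: int, N: int):
    """Estimated CV and rel-variance of p, using s.e.(p) and p as plug-in values."""
    p = a / n
    se = math.sqrt((1 - n / N) * p * (1 - p) / (n - 1))  # s.e.(p) with fpc
    cv = se / p
    return cv, cv ** 2

cv, rel_var = cv_of_proportion(72, 120, 1200)
print(cv, rel_var)   # cv is about 0.071, i.e. roughly 7.1%
```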
The sample size required for estimating a population proportion P can be obtained in a
similar way, and has a similar form, to that shown above for the mean. Assume that the
proportion estimate $p$ is normally distributed, with absolute margin of error $d = |p - P|$ or
relative error $\varepsilon$ such that $d = \varepsilon P$. The sample size n can then be calculated by
\[ n = \frac{Z^2 PQ / d^2}{1 - \dfrac{1}{N} + \dfrac{Z^2 PQ}{N d^2}} \]
(verify this). If we put $n_0 = Z^2 PQ / d^2$, then we get
\[ n = \frac{n_0}{1 - \dfrac{1}{N} + \dfrac{n_0}{N}}. \]
For large population size N we have the sample size
\[ n \approx \frac{n_0}{1 + \dfrac{n_0}{N}}, \]
and we can approximate n by $n_0$, as we have done for the mean.
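The "verify this" step can be sketched as a short derivation. It assumes the standard simple random sampling result $V(p) = \frac{N-n}{N-1}\cdot\frac{PQ}{n}$ for sampling without replacement, and sets the margin of error equal to $Z$ times the standard error:

```latex
\begin{align*}
d &= Z\sqrt{\frac{N-n}{N-1}\cdot\frac{PQ}{n}} \\
n\,(N-1)\,d^{2} &= Z^{2}PQ\,(N-n) \\
n\left[(N-1)\,d^{2} + Z^{2}PQ\right] &= N\,Z^{2}PQ \\
n &= \frac{N\,Z^{2}PQ}{(N-1)\,d^{2} + Z^{2}PQ}
   = \frac{Z^{2}PQ/d^{2}}{1 - \dfrac{1}{N} + \dfrac{Z^{2}PQ}{N d^{2}}}
   = \frac{n_0}{1 - \dfrac{1}{N} + \dfrac{n_0}{N}}
\end{align*}
```

The last line follows by dividing numerator and denominator by $N d^2$ and writing $n_0 = Z^2 PQ/d^2$.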
Using the relative error ($\varepsilon$) and the relation $d = \varepsilon P$, we set
\[ n_0 = \frac{Z^2 PQ}{d^2} = \frac{Z^2 Q}{\varepsilon^2 P}. \]
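The sample size calculation can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `sample_size_for_proportion` and its parameters are hypothetical, `z = 1.96` assumes a 95% confidence level, and the planning value of P must be supplied by the user. The usage line takes the teacher example in this chapter (N = 1200, tolerable error 0.08) and uses the sample estimate 72/120 = 0.6 as the planning value for P.

```python
import math

def sample_size_for_proportion(P: float, N: int, d: float = None,
                               eps: float = None, z: float = 1.96):
    """Return (n0, n) for absolute error d, or relative error eps with d = eps * P."""
    if (d is None) == (eps is None):
        raise ValueError("supply exactly one of d (absolute) or eps (relative)")
    if d is None:
        d = eps * P                    # convert relative error to absolute error
    Q = 1 - P
    n0 = z ** 2 * P * Q / d ** 2       # first approximation, ignoring the fpc
    n = n0 / (1 - 1 / N + n0 / N)      # adjustment for a finite population
    return n0, math.ceil(n)            # round up to a whole number of units

n0, n = sample_size_for_proportion(P=0.6, N=1200, d=0.08)
print(n0, n)   # n0 is about 144.06, n = 129
```

Under these assumptions the required sample is about 129 units, somewhat more than the 120 already sampled; a different confidence level or planning value of P would change the answer.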
In practice, the population parameters $S^2$, $\bar{Y}$, and $P$ must be estimated, while the other
factors, $Z$ and $\varepsilon$, are usually set by the investigator (researcher). The relation shows the
following summary points.
How do we get estimates of the population parameters in order to use these estimates in
sample size determination? In actual practice, there are four possible ways of estimating
the parameters.
Reading Assignment: Read Cochran 3rd ed., chapter 4, section 4.7, pages 78-81.
Examples:
1. A teacher training institute is interested in estimating the proportion (P) of teachers
who consider the semester system to be more suitable than the 3-term system of
education. A simple random sample of n = 120 teachers is taken, without replacement, from a
total of N = 1200 teachers. Some of the teachers are in favor of two semesters while others are
not, and it is found that 72 teachers are in favor of the semester system.
i) Estimate the proportion P along with the standard error of your estimate.
iii) Do you think the sample size 120 is sufficient if the tolerable error could be 0.08? If
not, how many more units should be included in the sample?
Solution: n= 120, a= 72 , N= 1200,