SP Sampling Lect 8
Lecture 8
Simple Random Sampling
Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
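Estimation of Confidence limits for the population mean: $\sigma^2$ unknown
When $\sigma^2$ is unknown, then the $100(1-\alpha)\%$ confidence
interval is
$$\bar{y} - t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})}$$
where $t_{\alpha/2}$ denotes the upper $100(\alpha/2)\%$ point of the t distribution with
$(n-1)$ degrees of freedom.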
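Proof: Useful Result when $\sigma^2$ is known and unknown
Assume that the population is normally distributed $N(\bar{Y}, \sigma^2)$.
Then
$$\frac{\bar{y} - \bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} \sim N(0, 1) \quad \text{when } \sigma^2 \text{ is known,}$$
$$\frac{\bar{y} - \bar{Y}}{\sqrt{\widehat{\mathrm{Var}}(\bar{y})}} \sim t_{(n-1)} \quad \text{when } \sigma^2 \text{ is unknown.}$$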
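Proof: Estimation of Confidence limits for the population
mean: $\sigma^2$ known
When $\sigma^2$ is known, then the $100(1-\alpha)\%$ confidence interval is
given by
$$P\left( -Z_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} \le Z_{\alpha/2} \right) = 1 - \alpha$$
or
$$P\left( \bar{y} - Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})} \right) = 1 - \alpha.$$
The confidence limits are
$$\left( \bar{y} - Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})},\; \bar{y} + Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})} \right)$$
where $Z_{\alpha/2}$ denotes the upper $100(\alpha/2)\%$ point of the $N(0, 1)$ distribution.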
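As a rough numerical illustration of the known-variance limits, the following Python sketch evaluates them for one sample. It assumes SRSWOR, so that Var(ȳ) = (N − n)/(Nn) · S² with S² treated as known; the function name `mean_confidence_limits` and all input values are hypothetical and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_limits(y_bar, S2, n, N, alpha=0.05):
    """Limits y_bar -/+ Z_{alpha/2} * sqrt(Var(y_bar)) with S^2 treated as known."""
    # Upper alpha/2 point of the N(0, 1) distribution
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Assumption: SRSWOR, so Var(y_bar) = (N - n) / (N n) * S^2
    var_y_bar = (N - n) / (N * n) * S2
    half_width = z * sqrt(var_y_bar)
    return y_bar - half_width, y_bar + half_width


# Hypothetical inputs, chosen only for illustration
lower, upper = mean_confidence_limits(y_bar=50.0, S2=100.0, n=25, N=1000)
print(round(lower, 2), round(upper, 2))
```

The interval is symmetric about the sample mean, and its half-width shrinks as n grows or as the sampling fraction n/N approaches 1.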
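Proof: Estimation of Confidence limits for the population
mean: $\sigma^2$ unknown
When $\sigma^2$ is unknown, then the $100(1-\alpha)\%$ confidence interval is
given by
$$P\left( -t_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{\widehat{\mathrm{Var}}(\bar{y})}} \le t_{\alpha/2} \right) = 1 - \alpha$$
or
$$P\left( \bar{y} - t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})} \right) = 1 - \alpha.$$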
$Y_T$ can be estimated by
$$\hat{Y}_T = N\hat{\bar{Y}} = N\bar{y}.$$
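Estimation of Population Total
Then $E(\hat{Y}_T) = N E(\bar{y}) = N\bar{Y} = Y_T$.
Variance of $\hat{Y}_T$ is
$$\mathrm{Var}(\hat{Y}_T) = N^2\,\mathrm{Var}(\bar{y}) = \begin{cases} N^2 \dfrac{N-n}{Nn} S^2 = \dfrac{N(N-n)}{n} S^2 & \text{for SRSWOR} \\[2ex] N^2 \dfrac{N-1}{Nn} S^2 = \dfrac{N(N-1)}{n} S^2 & \text{for SRSWR.} \end{cases}$$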
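Estimate of variance of $\hat{Y}_T$ is
$$\widehat{\mathrm{Var}}(\hat{Y}_T) = N^2\,\widehat{\mathrm{Var}}(\bar{y}) = \begin{cases} \dfrac{N(N-n)}{n} s^2 & \text{for SRSWOR} \\[2ex] \dfrac{N^2}{n} s^2 & \text{for SRSWR.} \end{cases}$$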
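The estimator of the population total and its estimated variance can be sketched in Python as follows. This is only an illustration: the helper name `estimate_total` and the sample values are hypothetical, and s² is taken to be the usual sample variance with divisor n − 1.

```python
from statistics import mean, variance


def estimate_total(sample, N, with_replacement=False):
    """Return (estimated total N * y_bar, estimated variance of that total)."""
    n = len(sample)
    y_bar = mean(sample)
    s2 = variance(sample)  # sample variance with divisor n - 1
    total_hat = N * y_bar
    if with_replacement:
        var_hat = N ** 2 * s2 / n        # SRSWR
    else:
        var_hat = N * (N - n) * s2 / n   # SRSWOR
    return total_hat, var_hat


# Hypothetical sample of n = 4 units from a population of N = 10 units
sample = [2, 4, 6, 8]
total_wor, var_wor = estimate_total(sample, N=10)
total_wr, var_wr = estimate_total(sample, N=10, with_replacement=True)
print(total_wor, var_wor, var_wr)
```

For the same sample, the SRSWOR variance estimate is smaller than the SRSWR one by the factor (N − n)/N.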
Determination of Sample Size
The size of the sample is needed before the survey starts and goes
into operation.
The sample size can be determined on the basis of prescribed
values of the
• error of estimation,
others.
An important requirement for determining the sample size is
that the population standard deviation S should be known
for these criteria.
The reason and need for this will be clear when we derive the
sample size.
A possible solution to this issue is to conduct a pilot survey and
collect a preliminary sample of small size, estimate S from it, and use
the estimate as the known value of S.
Now we find the sample size under different criteria assuming that
the samples have been drawn using SRSWOR. The case for SRSWR
can be derived similarly.
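A minimal Python sketch of the pilot-survey step is given below. The pilot observations and the helper name `estimate_S_from_pilot` are hypothetical; the estimate uses the sample standard deviation with divisor n − 1, which is one common choice rather than something prescribed in the lecture.

```python
from statistics import stdev


def estimate_S_from_pilot(pilot_sample):
    """Estimate S by the standard deviation (divisor n - 1) of a pilot sample."""
    return stdev(pilot_sample)


# Hypothetical pilot observations
pilot_sample = [12.1, 9.8, 11.4, 10.7, 13.0, 9.5]
S_pilot = estimate_S_from_pilot(pilot_sample)
print(round(S_pilot, 4))
```

The value `S_pilot` would then be plugged in wherever the sample size formulas require S.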
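1. Prespecified Variance
The sample size is to be determined such that the variance of $\bar{y}$
should not exceed a given value, say $V$. In this case, find $n$ such that
$$\mathrm{Var}(\bar{y}) \le V$$
or
$$\frac{N-n}{Nn} S^2 \le V$$
or
$$\frac{1}{n} - \frac{1}{N} \le \frac{V}{S^2}$$
or
$$\frac{1}{n} - \frac{1}{N} \le \frac{1}{n_e} \quad \text{where } n_e = \frac{S^2}{V}$$
or
$$n \ge \frac{n_e}{1 + \dfrac{n_e}{N}}.$$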
It may be noted here that $n_e$ can be known only when $S^2$ is known.
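The bound on n for a prespecified variance can be evaluated with a short Python sketch. The function name `n_for_variance` and the inputs are hypothetical, and rounding up to the next integer is an assumption made here so that the variance requirement is actually met.

```python
from math import ceil


def n_for_variance(S2, V, N):
    """Smallest integer n with (N - n) / (N n) * S^2 <= V under SRSWOR."""
    n_e = S2 / V  # n_e = S^2 / V, the value of n when N is very large
    return ceil(n_e / (1 + n_e / N))


# Hypothetical inputs: S^2 = 100, V = 4, N = 1000
n_required = n_for_variance(S2=100.0, V=4.0, N=1000)
print(n_required)
```

Here n_e = 25, and the finite population correction lowers the bound slightly before rounding up.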
2. Pre‐specified Estimation Error
$$P\left( |\bar{y} - \bar{Y}| \le e \right) = (1 - \alpha).$$
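Assuming the normal distribution for the population,
$$\bar{y} \sim N\left( \bar{Y},\; \frac{N-n}{Nn} S^2 \right),$$
we can write
$$P\left( \frac{|\bar{y} - \bar{Y}|}{\sqrt{\mathrm{Var}(\bar{y})}} \le \frac{e}{\sqrt{\mathrm{Var}(\bar{y})}} \right) = 1 - \alpha$$
or
$$Z_{\alpha/2}^2\,\mathrm{Var}(\bar{y}) = e^2.$$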
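Now
$$Z_{\alpha/2}^2\,\frac{N-n}{Nn} S^2 = e^2$$
or
$$n = \frac{\left( \dfrac{Z_{\alpha/2} S}{e} \right)^2}{1 + \dfrac{1}{N}\left( \dfrac{Z_{\alpha/2} S}{e} \right)^2}.$$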
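The sample size for a prespecified estimation error can be computed as in the following Python sketch. The name `n_for_error` and the inputs are hypothetical; the formula gives a real number, and rounding it up to an integer is an assumption made here.

```python
from math import ceil
from statistics import NormalDist


def n_for_error(S, e, N, alpha=0.05):
    """Sample size so that P(|y_bar - Y_bar| <= e) is about 1 - alpha (SRSWOR)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    n0 = (z * S / e) ** 2                    # (Z S / e)^2
    return ceil(n0 / (1 + n0 / N))


# Hypothetical inputs: S = 10, e = 2, N = 1000, alpha = 0.05
n_required = n_for_error(S=10.0, e=2.0, N=1000)
print(n_required)
```

Halving the permissible error e roughly quadruples (Z S / e)², so the required sample size grows quickly as the error tolerance tightens.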
3. Pre‐specified Width of the Confidence Interval
$$2 Z_{\alpha/2} \sqrt{\mathrm{Var}(\bar{y})} \le W$$
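This can be expressed as
$$2 Z_{\alpha/2} \sqrt{\frac{N-n}{Nn}}\; S \le W$$
or
$$4 Z_{\alpha/2}^2 S^2 \left( \frac{1}{n} - \frac{1}{N} \right) \le W^2$$
or
$$\frac{1}{n} \le \frac{1}{N} + \frac{W^2}{4 Z_{\alpha/2}^2 S^2}$$
or
$$n \ge \frac{\dfrac{4 Z_{\alpha/2}^2 S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2 S^2}{N W^2}}.$$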
If N is large, then
$$n \ge \frac{4 Z_{\alpha/2}^2 S^2}{W^2}$$
and
$$n_{smallest} = \frac{4 Z_{\alpha/2}^2 S^2}{W^2}.$$
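A short Python sketch of the width criterion follows. The name `n_for_width` and the inputs are hypothetical; it returns both the bound with the finite population term and the large-N value, each rounded up to an integer as an assumption made here.

```python
from math import ceil
from statistics import NormalDist


def n_for_width(S, W, N, alpha=0.05):
    """Return (n for population size N, n for large N) for interval width <= W."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    n0 = 4 * z ** 2 * S ** 2 / W ** 2        # 4 Z^2 S^2 / W^2
    return ceil(n0 / (1 + n0 / N)), ceil(n0)


# Hypothetical inputs: S = 15, W = 5, N = 500, alpha = 0.05
n_finite, n_large = n_for_width(S=15.0, W=5.0, N=500)
print(n_finite, n_large)
```

Since the width W corresponds to twice the estimation error, the same sample size is obtained from the estimation error criterion with e = W/2.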
4. Pre‐specified Coefficient of Variation
The coefficient of variation (CV) is defined as the ratio of the standard
error (or standard deviation) to the mean.
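$$CV(\bar{y}) \le C_0$$
or
$$\frac{\sqrt{\mathrm{Var}(\bar{y})}}{\bar{Y}} \le C_0$$
or
$$\frac{\dfrac{N-n}{Nn} S^2}{\bar{Y}^2} \le C_0^2$$
or
$$\frac{1}{n} - \frac{1}{N} \le \frac{C_0^2}{C^2}$$
or
$$n \ge \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}$$
is the required sample size where $C = \dfrac{S}{\bar{Y}}$ is the
population coefficient of variation.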
The smallest sample size needed in this case is
$$n_{smallest} = \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}.$$
If N is large, then
$$n \ge \frac{C^2}{C_0^2}$$
and
$$n_{smallest} = \frac{C^2}{C_0^2}.$$
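The coefficient of variation criterion can be sketched in Python as below. The name `n_for_cv` and the inputs are hypothetical; C is the population coefficient of variation S/Ȳ, assumed known (for example from a pilot survey), and results are rounded up to integers as an assumption made here.

```python
from math import ceil


def n_for_cv(C, C0, N):
    """Return (n for population size N, n for large N) so that CV(y_bar) <= C0."""
    ratio = C ** 2 / C0 ** 2  # C^2 / C0^2
    return ceil(ratio / (1 + ratio / N)), ceil(ratio)


# Hypothetical inputs: C = 0.5, C0 = 0.06, N = 1000
n_finite, n_large = n_for_cv(C=0.5, C0=0.06, N=1000)
print(n_finite, n_large)
```

The large-N value ignores the finite population correction, so it is never smaller than the value computed with N.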
5. Pre‐specified Relative Error
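When $\bar{y}$ is used for estimating the population mean $\bar{Y}$, then the
relative estimation error is defined as $\dfrac{\bar{y} - \bar{Y}}{\bar{Y}}$.
With $R$ denoting the prespecified relative error,
$$P\left( \frac{|\bar{y} - \bar{Y}|}{\sqrt{\mathrm{Var}(\bar{y})}} \le \frac{R\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} \right) = 1 - \alpha.$$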
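Assuming the population to be normally distributed,
$$\bar{y} \sim N\left( \bar{Y},\; \frac{N-n}{Nn} S^2 \right).$$
So it can be written that
$$\frac{R\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} = Z_{\alpha/2}$$
or
$$Z_{\alpha/2}^2\,\frac{N-n}{Nn} S^2 = R^2 \bar{Y}^2$$
or
$$\frac{1}{n} - \frac{1}{N} = \frac{R^2}{C^2 Z_{\alpha/2}^2}$$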
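or
$$n = \frac{\left( \dfrac{Z_{\alpha/2} C}{R} \right)^2}{1 + \dfrac{1}{N}\left( \dfrac{Z_{\alpha/2} C}{R} \right)^2}$$
where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation and should
be known.
If N is large, then
$$n = \left( \frac{Z_{\alpha/2} C}{R} \right)^2.$$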
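The relative error criterion can be evaluated with the Python sketch below. The name `n_for_relative_error` and the inputs are hypothetical; C = S/Ȳ is assumed known, and the results are rounded up to integers as an assumption made here.

```python
from math import ceil
from statistics import NormalDist


def n_for_relative_error(C, R, N, alpha=0.05):
    """Return (n for population size N, n for large N) for relative error R."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    n0 = (z * C / R) ** 2                    # (Z C / R)^2
    return ceil(n0 / (1 + n0 / N)), ceil(n0)


# Hypothetical inputs: C = 0.4, R = 0.05, N = 2000, alpha = 0.05
n_finite, n_large = n_for_relative_error(C=0.4, R=0.05, N=2000)
print(n_finite, n_large)
```

Only the ratio C/R enters the formula, so the criterion does not require S and Ȳ separately.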
6. Pre‐specified Cost
Let an amount of money C be designated for the sample
survey to collect n observations,
C0 be the overhead cost, and
C1 be the cost of collecting one unit in the sample.