Chapter 8 - Confidence Intervals - Lecture Notes
In this chapter we enter the part of statistics known as inferential statistics.
The purpose of collecting data on a sample is not only to describe the sample but
also to use the data to infer (generalize) to the population that the sample
represents. Inferential statistics is the collection of methods that help us make
decisions about a population based on sample data.
In statistics, µ is called the true population mean, and we use the sample mean $\bar{x}$ to
estimate it; p is called the true population proportion, and we use the
sample proportion $\hat{p}$ to estimate it.
Point Estimate: If we pick a random sample and calculate its mean $\bar{x}$, this value
is the point estimate of the population mean µ.
But remember from Chapter 7 that when we picked all possible samples, each sample
yielded a different $\bar{x}$, so the actual value depends on which sample was picked.
The point estimate therefore assigns a value to µ that is almost always different
from the true population mean.
Confidence Interval Estimate of a population mean
So instead of a single value, we now estimate the population mean with a
range of values computed from sample data. This interval is
constructed around the point estimate, and a level of confidence that the
interval will capture the parameter is stated along with it.
Before we can construct a confidence interval, certain conditions have
to be met:
1. The sample must be randomly selected.
2. The sampling distribution of $\bar{x}$ should follow a normal distribution.
Remember from Chapter 7 that for the sampling distribution of $\bar{x}$ to follow
a normal distribution:
Either the variable of interest x has a normal distribution in the
population,
Or the sample size n ≥ 30.
The width of the confidence interval is the difference between the upper limit
and the lower limit:
$$W = \text{Upper Limit} - \text{Lower Limit} = 2 Z \sigma_{\bar{x}} = 2\,ME$$
There are 3 factors that affect the width of the confidence interval:
The confidence level: the larger the confidence level, the larger the
absolute value of Z (for example, 95% gives Z = 1.96 and 99% gives Z = 2.575), so the larger the
margin of error (keeping everything else the same) and the wider the
confidence interval.
The sample size: the larger the sample size n, the smaller $\sigma_{\bar{x}}$
(remember $\sigma_{\bar{x}} = \sigma / \sqrt{n}$), and as a
result the narrower the confidence interval.
The population standard deviation: this is a property
of the population we are studying, and we have no control over it (but if we are
comparing 2 populations with everything else the same, the larger the
population standard deviation σ, the wider the confidence
interval).
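The effect of each factor on the width can be checked numerically. The sketch below is a minimal illustration that assumes a known population standard deviation; the values σ = 10 and n = 100 are hypothetical, and Z is computed with Python's `statistics.NormalDist` rather than read from a table (so the 99% value comes out near 2.576 instead of the rounded 2.575).

```python
from math import sqrt
from statistics import NormalDist


def z_for_confidence(level):
    """Return the two-sided critical value Z for a confidence level such as 0.95."""
    tail_area = (1 - level) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail_area)


def ci_width(level, sigma, n):
    """Width W = 2 * Z * sigma / sqrt(n) of a confidence interval for the mean."""
    return 2 * z_for_confidence(level) * sigma / sqrt(n)


# Hypothetical baseline: sigma = 10, n = 100 (not taken from the notes).
baseline = ci_width(0.95, sigma=10, n=100)
higher_confidence = ci_width(0.99, sigma=10, n=100)   # larger Z, wider interval
larger_sample = ci_width(0.95, sigma=10, n=400)       # larger n, narrower interval
larger_sigma = ci_width(0.95, sigma=20, n=100)        # larger sigma, wider interval

print(f"baseline (95%, sigma=10, n=100): {baseline:.3f}")
print(f"99% confidence level:            {higher_confidence:.3f}")
print(f"n = 400:                         {larger_sample:.3f}")
print(f"sigma = 20:                      {larger_sigma:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the width.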
n = 1500
If we take all possible samples of size n = 1500 and construct a 99% confidence interval
for µ from each sample, we can expect about 99% of these confidence intervals
to contain µ and about 1% not to.
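This long-run interpretation can be illustrated with a small simulation. The sketch below works only under stated assumptions: the population is taken to be normal with hypothetical values µ = 50 and σ = 12 (not from the notes), σ is treated as known, and 400 random samples stand in for "all possible samples".

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(8)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 12.0      # hypothetical population parameters
N = 1500                    # sample size used in the notes
LEVEL = 0.99                # confidence level
TRIALS = 400                # number of simulated samples

z = NormalDist().inv_cdf(1 - (1 - LEVEL) / 2)   # two-sided critical value
margin = z * SIGMA / sqrt(N)                    # same for every sample, since sigma is known

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    # The interval x_bar ± margin contains MU exactly when |x_bar - MU| <= margin.
    if abs(x_bar - MU) <= margin:
        hits += 1

coverage = hits / TRIALS
print(f"{hits} of {TRIALS} intervals contain mu (coverage = {coverage:.3f})")
```

The observed coverage varies from run to run but should stay close to 0.99. Any single interval either contains µ or does not; the 99% describes the procedure, not one particular interval.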
Now we are going to see how to construct a confidence interval in the case where
the population standard deviation is not known.
The conditions that the sample be a random sample and that the sampling
distribution of $\bar{x}$ follow a normal distribution must still be met.
So if the confidence level is 95% and the sample size is 10, the degrees of freedom
are df = n − 1 = 9 and the t coefficient we use is t = 2.262.
Example
8.43 A random sample of 20 acres gave a mean yield of wheat equal to 41.2
bushels per acre with a standard deviation of 3 bushels. Assuming that the yield
per acre is normally distributed, construct a 90% confidence interval for the
population mean.
Solution:
With n = 20 the degrees of freedom are df = n − 1 = 19. For a 90% confidence
level the area in each tail is 0.05, so t = 1.729.
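The rest of the calculation does not appear in these notes. As a sketch, assuming the usual interval $\bar{x} \pm t \, s/\sqrt{n}$ for the case where σ is unknown, the numbers from Exercise 8.43 can be worked through as follows. The t value is the table value quoted above, and the variable names are illustrative.

```python
from math import sqrt

n = 20          # sample size (acres)
x_bar = 41.2    # sample mean yield, bushels per acre
s = 3.0         # sample standard deviation, bushels
t = 1.729       # table value for df = 19 with area 0.05 in each tail

std_error = s / sqrt(n)            # estimated standard deviation of x_bar
margin = t * std_error             # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error  = {std_error:.3f}")
print(f"margin of error = {margin:.2f}")
print(f"90% CI: {lower:.2f} to {upper:.2f} bushels per acre")
```

Under these assumptions the interval comes out to roughly 40.04 to 42.36 bushels per acre.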
The standard deviation of the sampling distribution of $\hat{p}$ is
$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$
where q = 1 − p.
The sampling distribution of $\hat{p}$ is approximately normal if np ≥ 5 and
nq ≥ 5.
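But when estimating the population proportion we do not know p, since it is the value we are trying
to estimate. We therefore replace p by $\hat{p}$ (and q by $\hat{q} = 1 - \hat{p}$), so that
$\sigma_{\hat{p}} = \sqrt{pq/n}$ becomes
$$s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}}$$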
And the confidence interval for the population proportion p is:
$$\hat{p} \pm Z s_{\hat{p}} = \hat{p} \pm Z \sqrt{\frac{\hat{p}\hat{q}}{n}}$$
And $Z \sqrt{\hat{p}\hat{q}/n}$ is the margin of error, which is the maximum error of the estimate.
Solution:
$$s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{0.38 \times 0.62}{50}} = 0.069$$
b. The width of the confidence interval constructed in part a may be reduced by:
1. Lowering the confidence level
2. Increasing the sample size
The second alternative is better because lowering the confidence level results in
a less reliable estimate of p.
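The standard error above and the two options in part b can be checked with a short calculation. This is a sketch only: the confidence level of the original exercise does not appear in these notes, so the Z values 1.96 (95%) and 1.645 (90%) and the larger sample size of 200 are hypothetical choices made purely for illustration.

```python
from math import sqrt


def proportion_ci(p_hat, n, z):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    std_error = sqrt(p_hat * q_hat / n)
    margin = z * std_error
    return p_hat - margin, p_hat + margin


def width(interval):
    """Upper limit minus lower limit."""
    lower, upper = interval
    return upper - lower


p_hat, n = 0.38, 50                       # values from the solution above
std_error = sqrt(p_hat * (1 - p_hat) / n)
print(f"standard error = {std_error:.3f}")

# The Z values and n = 200 below are assumptions, not part of the exercise.
base = proportion_ci(p_hat, n, z=1.96)          # assumed 95% level
lower_level = proportion_ci(p_hat, n, z=1.645)  # option 1: lower the confidence level
bigger_n = proportion_ci(p_hat, 200, z=1.96)    # option 2: increase the sample size

for label, ci in [("base", base), ("lower level", lower_level), ("larger n", bigger_n)]:
    print(f"{label:12s} {ci[0]:.3f} to {ci[1]:.3f}  (width {width(ci):.3f})")
```

Both options narrow the interval, but only the larger sample does so without giving up confidence.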
Determining the Sample Size for the estimation of proportion
Just as we did previously with the mean, we can find the sample size
needed to estimate p with a given margin of error and confidence level.
$$\text{Margin of Error} = E = Z \sqrt{\frac{\hat{p}\hat{q}}{n}}$$
Z is found from the standard normal table depending on the confidence level.
$$n = \frac{Z^2 \hat{p}\hat{q}}{E^2}$$
In the case where $\hat{p}$ is not known, we make the most conservative estimate of the
sample size n by using $\hat{p}$ = 0.5 and $\hat{q}$ = 0.5. For a given margin of error these
values give the largest sample size, since their product 0.5 × 0.5 = 0.25 is greater
than the product of any other pair of values of $\hat{p}$ and $\hat{q}$.
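A quick numerical check makes this concrete. The sketch below simply evaluates the product of a proportion and its complement on a grid of values; the grid spacing of 0.01 is an arbitrary choice for illustration.

```python
# Evaluate p * q, with q = 1 - p, for p = 0.00, 0.01, ..., 1.00.
grid = [i / 100 for i in range(101)]
products = {p: p * (1 - p) for p in grid}

best_p = max(products, key=products.get)
print(f"largest product {products[best_p]:.2f} occurs at p = {best_p:.2f}")

# A few values for comparison: the product shrinks as p moves away from 0.5.
for p in (0.1, 0.3, 0.5, 0.78, 0.9):
    print(f"p = {p:.2f}, q = {1 - p:.2f}, p*q = {p * (1 - p):.4f}")
```

Since n is proportional to this product, using 0.5 can never give a sample that is too small.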
Example
Determine the sample size for the estimation of the population proportion for the
following:
E = 0.04, $\hat{p}$ = 0.78, confidence level 95%
$$n = \frac{Z^2 \hat{p}\hat{q}}{E^2}$$
Z from the standard normal distribution table for the area 0.025 in each tail is 1.96.
$$n = \frac{1.96^2 (0.78)(0.22)}{0.04^2} = 412.01$$
which we round up to n = 413.
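The same calculation can be written as a small function. This is a minimal sketch; the function name `sample_size_for_proportion` is illustrative, and the result is always rounded up because a fraction of an observation cannot be sampled.

```python
from math import ceil


def sample_size_for_proportion(z, p_hat, margin_of_error):
    """Sample size n = z^2 * p_hat * q_hat / E^2, rounded up to a whole number."""
    q_hat = 1 - p_hat
    n_exact = z ** 2 * p_hat * q_hat / margin_of_error ** 2
    return ceil(n_exact)   # always round up, never to the nearest integer


# Example from the notes: E = 0.04, p_hat = 0.78, 95% confidence (Z = 1.96).
n_example = sample_size_for_proportion(1.96, 0.78, 0.04)
print(n_example)

# Most conservative choice when p_hat is unknown: p_hat = q_hat = 0.5.
n_conservative = sample_size_for_proportion(1.96, 0.5, 0.04)
print(n_conservative)
```

With the example's values this gives 413, and the conservative choice gives 601 for the same margin of error and confidence level.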