4. Interval Estimation
• We make estimates without worrying about whether they are scientific, but
with the hope that they bear a reasonable resemblance to the outcome.
• Managers also use estimates to make rational decisions when dealing with
issues that involve incomplete information and a great deal of uncertainty.
• We use statistics to make more logical and useful estimates.
• Statistical inferences are based on estimations.
• In both estimation and hypothesis testing, inferences about
characteristics of the population are drawn from information contained in
samples.
Estimator and Estimates
• Any sample statistic that is used to estimate a population parameter is called an
estimator.
E.g. Suppose we calculate the mean odometer reading from a sample of used taxis
and find it to be 98,000 miles. If we use this specific value to estimate the mileage
for a whole fleet of used taxis, the value 98,000 miles would be an estimate.
The most commonly used estimator of the population proportion (P) is the
sample proportion (p̂).
Types of Estimates
We can make two types of estimates about the population. They are
referred to as point estimates and interval estimates.
• A point estimate is the sample statistic that is used to estimate the
population parameter.
• Single number
• Often insufficient because it is either right or wrong
• Useful when accompanied by an estimate of the error that might be
involved.
Problem
The National Stadium is considering expanding its seating capacity and needs to
know both the average number of people who attend events there and the
variability in this number. The following are the attendances (in thousands) at nine
randomly selected sporting events. Find point estimates of the mean and the
variance of the population from which the sample was drawn.
8.8 14.0 21.3 7.9 12.5 20.6 16.3 14.1 13.0
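The point estimates asked for in this problem can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the variable names are chosen here for readability, and the standard library's `statistics.variance` is used because it divides by n − 1.

```python
import statistics

# Attendances (in thousands) at the nine sampled sporting events.
attendance = [8.8, 14.0, 21.3, 7.9, 12.5, 20.6, 16.3, 14.1, 13.0]

# Point estimate of the population mean: the sample mean.
mean_estimate = statistics.mean(attendance)

# Point estimate of the population variance: the sample variance,
# i.e. the sum of squared deviations divided by n - 1.
variance_estimate = statistics.variance(attendance)

print(f"mean     = {mean_estimate:.2f} thousand")
print(f"variance = {variance_estimate:.2f} (thousand squared)")
```

The point estimates come out at about 14.28 thousand for the mean and about 21.12 for the variance.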
Properties of a Good Estimator
• Unbiasedness
• Efficiency
• Consistency
• Sufficiency
Unbiasedness
• An estimator is said to be unbiased if its expected value is equal to
the population parameter it estimates.
[Figure: an unbiased estimator is on target on average; a biased estimator
is off target on average, and the gap between the two is the bias.]
Efficiency
Consistency
[Figure: sampling distribution of a consistent estimator for n = 10 and for n = 100.]
Sufficiency
The sample variance (the sum of the squared deviations from the sample
mean, divided by n − 1) is an unbiased estimator of the population variance.
In contrast, the average squared deviation from the sample mean (dividing by
n) is a biased, though consistent, estimator of the population variance.
\[ E(s^2) = E\left(\frac{\sum (x - \bar{x})^2}{n - 1}\right) = \sigma^2 \]
\[ E\left(\frac{\sum (x - \bar{x})^2}{n}\right) < \sigma^2 \]
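The difference between dividing by n − 1 and dividing by n can be checked with a small simulation. The sketch below is illustrative only: the population (normal with σ = 2, so σ² = 4), the sample size of 5, and the number of trials are all assumptions picked for the demonstration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SIGMA = 2.0      # assumed population standard deviation, so sigma^2 = 4
N = 5            # assumed (small) sample size
TRIALS = 20_000  # assumed number of repeated samples

divide_by_n_minus_1 = []
divide_by_n = []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    xbar = statistics.fmean(sample)
    ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
    divide_by_n_minus_1.append(ss / (N - 1))
    divide_by_n.append(ss / N)

# Averages over many samples approximate the expected values.
avg_unbiased = statistics.fmean(divide_by_n_minus_1)
avg_biased = statistics.fmean(divide_by_n)

print(f"average of SS/(n-1): {avg_unbiased:.3f}  (target sigma^2 = {SIGMA**2})")
print(f"average of SS/n    : {avg_biased:.3f}  (systematically too small)")
```

On average the n − 1 version lands close to σ² = 4, while the n version lands close to (n − 1)/n · σ² = 3.2.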
Margin of error and the Interval Estimate
• A point estimator cannot be expected to provide the exact value of the
population parameter.
• An interval estimate can be computed by adding and subtracting a margin of
error to the point estimate.
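\[ P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95 \]
\[ P\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95 \]
Conversely, after sampling, approximately 95% of such intervals
\[ \bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} \]
will include the population mean (and 5% of them will not).
That is, x̄ ± 1.96 σ/√n is a 95% confidence interval for μ.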
Interval Estimate of Population Mean: σ Known
[Figure: sampling distribution of x̄. A share 1 − α of all x̄ values lies
within z_{α/2} σ_x̄ of μ, with α/2 in each tail. An interval built around an x̄
in the central region includes μ; an interval built around an x̄ in either
tail does not include μ.]
A 95% Interval around the Population Mean
[Figure: sampling distribution of the mean, with 95% of the area between
μ − 1.96σ/√n and μ + 1.96σ/√n and 2.5% in each tail.]
• Approximately 95% of sample means can be expected to fall within the
interval [μ − 1.96σ/√n, μ + 1.96σ/√n].
• Conversely, about 2.5% can be expected to be above μ + 1.96σ/√n and 2.5%
can be expected to be below μ − 1.96σ/√n.
• So 5% can be expected to fall outside the interval
[μ − 1.96σ/√n, μ + 1.96σ/√n].
95% Intervals around the Sample Mean
[Figure: sampling distribution of the mean, with intervals x̄ ± 1.96σ/√n drawn
around several sample means; intervals marked * miss μ.]
• Approximately 95% of the intervals x̄ ± 1.96σ/√n around the sample mean can
be expected to include the actual value of the population mean, μ. (This
happens when the sample mean falls within the 95% interval around the
population mean.)
• * 5% of such intervals around the sample mean can be expected not to
include the actual value of the population mean. (This happens when the
sample mean falls outside the 95% interval around the population mean.)
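The claim that roughly 95% of intervals x̄ ± 1.96σ/√n contain μ can be illustrated by simulation. This is a sketch under assumed values: the population mean, standard deviation, sample size, and trial count below are arbitrary choices for the demonstration, not figures from the slides.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N = 50.0, 10.0, 25   # assumed population mean, sd, and sample size
TRIALS = 10_000                 # assumed number of repeated samples

margin = 1.96 * SIGMA / math.sqrt(N)  # half-width of each 95% interval

covered = 0
for _ in range(TRIALS):
    # Draw one sample and compute its mean.
    xbar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    # Does the interval around this sample mean contain the true mean?
    if xbar - margin <= MU <= xbar + margin:
        covered += 1

coverage = covered / TRIALS
print(f"share of intervals containing mu: {coverage:.3f}")
```

The printed share should sit close to 0.95, and the intervals that miss are exactly those built around a sample mean more than 1.96σ/√n away from μ.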
A (1−α)100% Confidence Interval for μ
We define z_{α/2} as the z value that cuts off a right-tail area of α/2 under
the standard normal curve. (1−α) is called the confidence coefficient, α is
called the error probability, and (1−α)100% is called the confidence level.
[Figure: standard normal distribution with central area (1 − α) and area α/2
in each tail, cut off at −z_{α/2} and z_{α/2}.]
\[ P\left(z > z_{\alpha/2}\right) = \alpha/2 \]
\[ P\left(z < -z_{\alpha/2}\right) = \alpha/2 \]
\[ P\left(-z_{\alpha/2} < z < z_{\alpha/2}\right) = 1 - \alpha \]
(1−α)100% Confidence Interval:
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
Interval Estimate of a Population Mean: σ Known
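Interval Estimate of μ: x̄ ± z_{α/2} σ/√n, where x̄ is the sample mean, 1 − α is
the confidence coefficient, z_{α/2} is the z value with an area of α/2 in the
upper tail of the standard normal distribution, σ is the population standard
deviation, and n is the sample size.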
z scores for confidence intervals in relation to alpha

Confidence Level   α      α/2    Table Look-up Area   z_{α/2}
90%                .10    .05    .9500                1.645
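The table look-up can also be done in code. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` and the list of confidence levels are illustrative choices, not from the slides.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # The look-up area is 1 - alpha/2, the cumulative area to the left of z.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    alpha = 1.0 - level
    print(f"{level:.0%}  alpha={alpha:.2f}  alpha/2={alpha / 2:.3f}  "
          f"area={1 - alpha / 2:.4f}  z={z_critical(level):.3f}")
```

For the 90% level this reproduces the look-up area .9500 and the z value 1.645 shown in the table row.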
• We say that this interval has been established at the 90% confidence level.
• The value 0.90 is referred to as the confidence coefficient.
Using the Z Statistic for Estimating Population Mean
• The z statistic can be used for estimating the population parameter
on the basis of the sample statistic.
• Confidence interval for estimating the population mean μ:
x̄ − z_{α/2} σ/√n ≤ μ ≤ x̄ + z_{α/2} σ/√n
Discount Sounds has 260 retail outlets throughout the United States. The firm is evaluating a
potential location for a new outlet, based in part, on the mean annual income of the
individuals in the marketing area of the new location.
A sample of size n = 36 was taken; the sample mean income is $41,100. The population is not
believed to be highly skewed. The population standard deviation is estimated to be $4,500,
and the confidence coefficient to be used in the interval estimate is 0.95.
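Example: Discount Sounds
With a confidence coefficient of 0.95, z_{0.025} = 1.96, so the margin of error is
\[ z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 1.96\left(\frac{4{,}500}{\sqrt{36}}\right) = 1{,}470 \]
The interval estimate of μ is
$41,100 ± $1,470
or
$39,630 to $42,570
We are 95% confident that the interval contains the population mean.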
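The Discount Sounds interval can be reproduced with a short helper. This is a minimal sketch: the function name `z_interval` and its signature are assumptions made for illustration, and the z value comes from `statistics.NormalDist` rather than a printed table, so the result differs from the hand calculation only in the last decimals.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Interval estimate of a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * sigma / math.sqrt(n)                   # margin of error
    return xbar - margin, xbar + margin, margin

# Discount Sounds: n = 36, sample mean 41,100, sigma = 4,500, 95% confidence.
low, high, margin = z_interval(41_100, 4_500, 36, 0.95)
print(f"margin of error: {margin:,.0f}")
print(f"interval: {low:,.0f} to {high:,.0f}")
```

The same helper applies to any σ-known example by changing the four inputs.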
Interval Estimate of a Population Mean: σ Known
This result implies that the researcher is 90% confident that the
population mean will lie between 34.091 and 35.909. The point estimate
is 35.
Example
Comcast, the computer services company, is planning to invest heavily
in online television services. As part of the decision, the company wants
to estimate the average number of online shows a family of four would
watch per day. A random sample of n = 100 families is obtained, and in this
sample the average number of shows viewed per day is 6.5 and the population
standard deviation is known to be 3.2. Construct a 95% confidence interval
for the average number of online television shows watched by the entire
population of families.
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 6.5 \pm z_{0.025}\frac{3.2}{\sqrt{100}} = 6.5 \pm 1.96 \times 0.32 = 6.5 \pm 0.6272 \;\Rightarrow\; (5.87,\ 7.13) \]
or, equivalently,
\[ 5.87 \le \mu \le 7.13 \]
Example
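A survey of small businesses with Web sites found that the average amount spent on
a site was INR 11,500 per year. Given a sample of 50 businesses and a population
standard deviation of σ = 600, what is the margin of error? Use 95% confidence.
What sample size would you recommend if the study required a margin of error of
150?
\[ \text{Margin of error} = z_{0.025}\frac{\sigma}{\sqrt{n}} = 1.96\left(\frac{600}{\sqrt{50}}\right) = 166.31 \]
A larger sample size would be needed to reduce the margin of error to 150 or less:
\[ 1.96\left(\frac{600}{\sqrt{n}}\right) = 150 \;\Rightarrow\; \sqrt{n} = \frac{1.96(600)}{150} = 7.84 \;\Rightarrow\; n = 61.47 \]
Rounding up, a sample of 62 businesses is recommended.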
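Both parts of this example, the current margin of error and the sample size needed for a target margin, can be sketched as two small functions. The function names are illustrative assumptions; z = 1.96 is hard-coded to match the 95% confidence level used in the example.

```python
import math

Z_95 = 1.96  # z value for 95% confidence, as used in the example

def margin_of_error(sigma, n, z=Z_95):
    """Margin of error z * sigma / sqrt(n) for a sigma-known interval."""
    return z * sigma / math.sqrt(n)

def required_sample_size(sigma, target_margin, z=Z_95):
    """Smallest n whose margin of error does not exceed target_margin."""
    # Solve z * sigma / sqrt(n) <= E for n, then round up to a whole unit.
    return math.ceil((z * sigma / target_margin) ** 2)

current_margin = margin_of_error(600, 50)
needed_n = required_sample_size(600, 150)
print(f"margin of error with n = 50: {current_margin:.2f}")
print(f"sample size for a margin of 150: {needed_n}")
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the margin of error slightly above the target.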
• For more than 100 degrees of freedom, the standard normal z value
provides a good approximation to the t value.
• The standard normal z values can be found in the infinite degrees of
freedom (∞) row of the t distribution table.
Why n−1 (degrees of freedom) is used:
1. When estimating the population standard deviation (σ) using the sample
standard deviation (s), we calculate s from the deviations of the sample
values from the sample mean (x̄).
2. The sample mean itself is computed from the same sample, which introduces a
constraint: the sum of the deviations from the mean is always zero. This means
only n−1 deviations are "free" to vary independently; the nth deviation is
determined by the others.
Interval Estimate of a Population Mean: σ Unknown
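A (1−α)100% confidence interval for μ when σ is not known (assuming a normally
distributed population) is given by:
\[ \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} \]
where t_{α/2} is the value of the t distribution with n − 1 degrees of freedom
that cuts off a right-tail area of α/2.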
Large Sample Confidence Intervals for the Population Mean
A reporter for a student newspaper is writing an article on the cost of off-campus housing. A
sample of 16 one-bedroom apartments within a half-mile of campus resulted in a sample
mean of $750 per month and a sample standard deviation of $55.
Let us provide a 95% confidence interval estimate of the mean rent per month for the
population of one-bedroom apartments within a half-mile of campus. We will assume this
population to be normally distributed.
Interval Estimate of a Population Mean: σ Unknown
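Interval Estimate
With n − 1 = 15 degrees of freedom, t_{0.025} = 2.131, so
\[ \bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}} = 750 \pm 2.131\left(\frac{55}{\sqrt{16}}\right) = 750 \pm 29.30 \]
We are 95% confident that the mean rent per month for the population of
one-bedroom apartments within a half-mile of campus is between $720.70
and $779.30.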
Example
A stock market analyst wants to estimate the average return on a
certain stock. A random sample of 15 days yields an average
(annualized) return of x̄ = 10.37% and a standard deviation of s =
3.5%. Assuming a normal population of returns, give a 95%
confidence interval for the average return on this stock.
SOLUTION
The critical value of t for df = (n − 1) = (15 − 1) = 14 and a right-tail
area of 0.025 is:
\[ t_{0.025} = 2.145 \]
The corresponding confidence interval or interval estimate is:
\[ \bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}} = 10.37 \pm 2.145\left(\frac{3.5}{\sqrt{15}}\right) = 10.37 \pm 1.94 = [8.43,\ 12.31] \]
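The σ-unknown interval can be sketched as follows. The Python standard library has no t distribution, so this sketch assumes the critical value is read from a t table and passed in by hand; the function name `t_interval` and that design choice are illustrative, not from the slides.

```python
import math

def t_interval(xbar, s, n, t_critical):
    """Interval estimate of a mean when sigma is unknown.

    t_critical is t_{alpha/2} with n - 1 degrees of freedom,
    looked up in a t table by the caller (an assumption of this sketch).
    """
    margin = t_critical * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Stock returns: n = 15, df = 14, t_{0.025} = 2.145.
stock_low, stock_high = t_interval(10.37, 3.5, 15, 2.145)
print(f"stock return: {stock_low:.2f}% to {stock_high:.2f}%")

# Apartment rents: n = 16, df = 15, t_{0.025} = 2.131.
rent_low, rent_high = t_interval(750, 55, 16, 2.131)
print(f"monthly rent: {rent_low:.2f} to {rent_high:.2f}")
```

Passing the table value explicitly keeps the degrees-of-freedom look-up visible, which is the step most often done wrong by hand.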
Example: In order to estimate the customer loyalty for a particular product, a researcher poses the
following question to a sample of 100 customers: How many years have you been continuously using
this product? This sample yielded a mean period of 8 years with a sample standard deviation of 2
years. Construct a 95% confidence interval for estimating the population mean.
Because the sample is large (n = 100), z = 1.96 is used with s in place of σ:
\[ \bar{x} \pm z_{0.025}\frac{s}{\sqrt{n}} = 8 \pm 1.96\left(\frac{2}{\sqrt{100}}\right) = 8 \pm 0.392 \]
This result implies that the researcher is 95% confident that the
population mean (average years of continuous use in the population)
will lie between 7.608 years and 8.392 years.
Summary of Interval Estimation Procedures for a Population Mean
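Can the population standard deviation σ be assumed known?
• Yes: use x̄ ± z_{α/2} σ/√n (the σ known case).
• No: use the sample standard deviation s to estimate σ, and use
x̄ ± t_{α/2} s/√n (the σ unknown case).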
Suppose that Discount Sounds’ management team wants an estimate of the population mean
such that there is a 0.95 probability that the sampling error is $500 or less.
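A worked version of this sample-size requirement is sketched below. It assumes the planning value σ = 4,500 from the earlier Discount Sounds example and z_{0.025} = 1.96 for the 0.95 probability; E denotes the desired margin of error (500).

```latex
\[
n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}}
  = \frac{(1.96)^{2}(4{,}500)^{2}}{(500)^{2}}
  = 311.17
\]
```

Rounding up to the next whole unit gives a sample size of 312.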
For estimating p, a sample is considered large enough when both np and nq are greater
than 5.
Interval Estimate of a Population Proportion
Large-Sample Confidence Intervals for the Population Proportion, p
[Figure: sampling distribution of p̂, centered at p, with standard error
σ_p̂ = √(p(1 − p)/n); the marked limits are p − z_{α/2} σ_p̂ and p + z_{α/2} σ_p̂.]
A large-sample (1−α)100% confidence interval for the population proportion, p:
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
where the sample proportion, p̂, is equal to the number of successes in the sample, x,
divided by the number of trials (the sample size), n, and q̂ = 1- p̂.
Interval Estimate of a Population Proportion
Example: Political Science, Inc.
Political Science Inc. (PSI) specializes in voter polls and surveys designed to keep political
office seekers informed of their position in a race.
Using telephone surveys, PSI interviewers ask registered voters who they would vote for if
the election were held that day.
In a current election campaign, PSI has just found that 220 registered voters, out of 500
contacted, favor a particular candidate. PSI wants to develop a 95% confidence interval
estimate for the proportion of the population of registered voters that favor the candidate.
Interval Estimate of a Population Proportion
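The sample proportion is p̂ = 220/500 = 0.44, so
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.44 \pm 1.96\sqrt{\frac{0.44(0.56)}{500}} = 0.44 \pm 0.0435 \]
PSI is 95% confident that the proportion of all voters that favor the
candidate is between 0.3965 and 0.4835.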
Example: A research company conducted a survey of 300 randomly selected
taxpayers. It found that 180 of the 300 taxpayers had filled in the “SARAL”
form correctly. Construct a 95% confidence interval to estimate the percentage of
taxpayers in the population who have filled in the form correctly.
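The slides give no solution for this example; one way to work it, assuming the large-sample interval for a proportion with z_{0.025} = 1.96, is sketched below.

```latex
\[
\hat{p} = \frac{180}{300} = 0.60, \qquad
\hat{p} \pm z_{0.025}\sqrt{\frac{\hat{p}\hat{q}}{n}}
  = 0.60 \pm 1.96\sqrt{\frac{(0.60)(0.40)}{300}}
  = 0.60 \pm 0.0554
\]
```

Under these assumptions, the interval runs from about 54.46% to 65.54% of taxpayers.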
Example
A marketing research firm wants to estimate the share that foreign
companies have in the American market for certain products. A
random sample of 100 consumers is obtained, and it is found that
34 people in the sample are users of foreign-made products; the
rest are users of domestic products. Give a 95% confidence
interval for the share of foreign products in this market.
SOLUTION
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{100}} = 0.34 \pm (1.96)(0.04737) = 0.34 \pm 0.0928 = [0.2472,\ 0.4328] \]
Thus, the firm may be 95% confident that foreign manufacturers control anywhere
from 24.72% to 43.28% of the market.
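The large-sample proportion interval can be sketched as a small function. The name `proportion_interval` and the choice to raise an error when the sample is too small are assumptions of this sketch; the size check follows the rule stated earlier that both np and nq should exceed 5.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Large-sample condition: both n*p_hat and n*q_hat greater than 5.
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("sample too small for the normal approximation")
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Foreign-made products: 34 users out of 100 consumers, 95% confidence.
foreign_low, foreign_high = proportion_interval(34, 100)
print(f"foreign share: {foreign_low:.4f} to {foreign_high:.4f}")

# Political Science, Inc.: 220 of 500 voters favor the candidate.
psi_low, psi_high = proportion_interval(220, 500)
print(f"PSI poll:      {psi_low:.4f} to {psi_high:.4f}")
```

Both calls reproduce the intervals worked by hand, (0.2472, 0.4328) and (0.3965, 0.4835), to four decimals.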
Sample Size for an Interval Estimate of a Population Proportion
Example: Political Science, Inc.
Suppose that PSI would like a 0.99 probability that the sample proportion is within 0.03 of
the population proportion.
How large a sample size is needed to meet the required precision? (A previous sample of
similar units yielded 0.44 for the sample proportion.)
Sample Size for an Interval Estimate of a Population Proportion
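One way to work the PSI sample-size question is sketched below. It assumes the usual formula n = z² p*(1 − p*) / E², with the previous sample's 0.44 as the planning value p* and E = 0.03 as the desired margin of error; the function name is hypothetical, and the z value is computed with `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(planning_p, margin, confidence):
    """Smallest n giving the desired margin of error for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    n = z ** 2 * planning_p * (1 - planning_p) / margin ** 2
    return math.ceil(n)  # round up so the margin is not exceeded

# PSI: 0.99 probability of being within 0.03, planning value 0.44.
n_needed = sample_size_for_proportion(0.44, 0.03, 0.99)
print(f"required sample size: {n_needed}")
```

Under these assumptions the requirement works out to 1,817 registered voters.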