Chap08 PPT


Applied Statistics in Business & Economics

David P. Doane and Lori E. Seward

Vũ Võ
[email protected]

8-1
Chapter 8
Sampling Distributions and Estimation
Chapter Contents

8.1 Sampling and Estimation


8.2 Central Limit Theorem
8.3 Sample Size and the Standard Error
8.4 Confidence Interval for a Mean (μ) with Known σ
8.5 Confidence Interval for a Mean (μ) with Unknown σ
8.6 Confidence Interval for a Proportion (π)
8.7 Estimating from Finite Populations
8.8 Sample Size Determination for a Mean
8.9 Sample Size Determination for a Proportion
8.10 Confidence Interval for a Population Variance, σ² (Optional)

8-2
Chapter 8
Sampling Distributions and Estimation
(continued)
Chapter Learning Objectives (LOs)

LO8-1: Define sampling error, parameter, and estimator.


LO8-2: Explain the desirable properties of estimators.
LO8-3: State and apply the Central Limit Theorem for a mean.
LO8-4: Explain how sample size affects the standard error.
LO8-5: Construct a confidence interval for a population mean
using z.

8-3
Chapter 8
Sampling Distributions and Estimation
(continued, 2)
Chapter Learning Objectives (LOs) continued

LO8-6: Know when and how to use Student’s t instead of z to estimate a mean.
LO8-7: Construct a confidence interval for a population proportion.
LO8-8: Know how to modify confidence intervals when the
population is finite.
LO8-9: Calculate sample size to estimate a mean.
LO8-10: Calculate sample size to estimate a proportion.
LO8-11: Construct a confidence interval for a variance (optional).

8-4
Chapter 8
8.1 Sampling and Estimation
LO8-1: Define sampling error, parameter, and estimator.

• A sample statistic is a random variable whose value depends on which population items are included in the random sample.
• Depending on the sample size, the sample statistic
could either represent the population well or differ
greatly from the population (particularly if the sample
size is small).
• This sampling variation can easily be illustrated by
selecting random samples from a large population.

8-5
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued).

• Consider sampling from a large population of GMAT scores for MBA applicants. The population parameters are μ = 520.78 and σ = 86.80.
• Figure 8.1 shows a dot plot of the entire population.

8-6
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued, 2).

8-7
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued, 3).

• Figure 8.2 shows several random samples of n = 5 from this population.
• The individual items that happen to be included in the
samples vary.
• Sampling variation is inevitable, yet there is a
tendency for the sample means to be close to the
population mean (μ = 520.78) shown as a dashed line
in Figure 8.2.
• In larger samples, the sample means would tend to be
even closer to μ.
• This is the basis for statistical estimation.

8-8
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued, 4).
• The dot plots show that the sample means have much
less variation than the individual sample items.
Note: This chapter describes the behavior of the sample mean and other statistical estimators of population parameters, and explains how to make inferences about a population.

8-9
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued, 5).
Estimators
• An estimator is a statistic derived from a sample to infer the value
of a population parameter.
• An estimate is the value of the estimator in a particular sample.
• Population parameters are usually represented by
Greek letters and the corresponding statistic
by Roman letters.

8-10
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued, 6).
Examples of Estimators

8-11
Chapter 8
LO8-1: Define sampling error, parameter, and
estimator (continued, 7).

Sampling Error

• The sampling distribution of an estimator is the probability distribution of all possible values the statistic may assume when a random sample of size n is taken.
• Note: An estimator is a random variable since samples vary.
• Sampling error is the difference between an estimate and the corresponding population parameter. For example, if we use the sample mean x̄ as an estimate for the population mean μ, then Sampling Error = x̄ − μ.
8-12
Chapter 8
LO8-2: Explain the desirable properties of estimators.

Properties of Estimators

• Bias is the difference between the expected value of the estimator and the true parameter. For the mean, Bias = E(x̄) − μ.
• An estimator is unbiased if its expected value is the parameter being estimated. The sample mean is an unbiased estimator of the population mean since E(x̄) = μ.
• On average, an unbiased estimator neither overstates nor understates the true parameter.

8-13
Chapter 8
LO8-2: Explain the desirable properties of estimators
(continued).
Note: Also, being unbiased is a desirable property for an estimator.

8-14
Chapter 8
LO8-2: Explain the desirable properties of estimators
(continued, 2).
Efficiency
• Efficiency refers to the variance of the estimator’s sampling distribution.
• A more efficient estimator has smaller variance.
• Among all unbiased estimators, we prefer the minimum variance
estimator, referred to as MVUE (minimum variance unbiased estimator).
• Figure 8.5 (next slide) shows two unbiased estimators. Both patterns are
centered on the bull’s-eye, but the estimator on the left has less variation.
• A more efficient estimator is closer on average to the true value of the
parameter.
• You cannot assess efficiency from one sample, but it can be studied either
mathematically or by simulation.
• For a normal distribution, x̄ and s² are minimum variance estimators of µ and σ², respectively. Similarly, the sample proportion p is an MVUE of the population proportion π. That is one reason these statistics are widely used.
8-15
Chapter 8
LO8-2: Explain the desirable properties of estimators
(continued, 3).
Efficiency (continued)

8-16
Chapter 8
LO8-2: Explain the desirable properties of estimators
(continued, 4).
Consistency
• A consistent estimator converges toward the parameter being
estimated as the sample size increases.
• That is, the sampling distribution collapses on the true parameter, as
illustrated in Figure 8.6.

8-17
Chapter 8
8.2 Central Limit Theorem
LO8-3: State and apply the Central Limit Theorem
for a mean.
• The sampling distribution of an estimator is the probability
distribution of all possible values the statistic may assume when a
random sample of size n is taken.
• An estimator has a probability distribution with a mean and variance.
• Consider the sample mean x̄ used to estimate the population mean µ.
• Our objective is to use the sampling distribution of x̄ to say something about the population that we are studying.
• To describe the sampling distribution, we need to know the mean, variance, and shape of the distribution.
• Recall that the sample mean is an unbiased estimator for µ; therefore, E(x̄) = µ (the expected value of the sample mean equals the population mean).

8-18
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued).

• Recall also that x̄ is a random variable whose value will change whenever we take a different sample.
• And as long as our samples are random samples, the only type of error we will have in our estimating process is sampling error.
• The sampling error of the sample mean is described by its standard deviation.
• This value has a special name, the standard error of the mean.
• It is defined by σx̄ = σ/√n (the standard error of the mean).
• Notice that the standard error of the mean decreases as the sample size increases.

8-19
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 2).

• Furthermore, if the population is normal, then the sample mean follows a normal distribution for any sample size.
• Unfortunately, the population may not have a normal
distribution, or we may simply not know what the
population distribution looks like.
• What can we do in these circumstances?
• We can use one of the most fundamental laws of statistics,
the Central Limit Theorem.

8-20
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 3).

Central Limit Theorem for the Mean


If a random sample of size n is drawn from a population with mean µ and standard deviation σ, the distribution of the sample mean x̄ approaches a normal distribution with mean µ and standard deviation σx̄ = σ/√n as the sample size increases.

The Central Limit Theorem is a powerful result that allows us to approximate the shape of the sampling distribution of the sample mean even when we don’t know what the population looks like.

The following are three important facts about the sample mean.
8-21
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 4).
1. If the population is exactly normal, then the sample mean
follows a normal distribution.

8-22
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 5).
2. As the sample size n increases, the distribution of sample means
converges to the population mean µ (i.e., the standard error of the
mean gets smaller).

8-23
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 6).
3. Even if your population is not normal, by the Central Limit
Theorem, if the sample size is large enough, the sample
means will have approximately a normal distribution.

8-24
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 7).

You may have heard the rule of thumb that n ≥ 30 is required to ensure a normal distribution for the sample mean, but actually a much smaller n will suffice if the population is symmetric, as illustrated in Figure 8.7.

8-25
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 8).
Range of Sample Means
The Central Limit Theorem permits us to define an interval within
which the sample means are expected to fall. As long as the
sample size n is large enough, we can use the normal distribution
regardless of the population shape (or any n if the population is
normal to begin with).

Expected Range of Sample Means: μ ± z (σ/√n), with z determined by the desired confidence level.

8-26
Chapter 8
LO8-3: State and apply the Central Limit Theorem for a
mean (continued, 9).

Range of Sample Means (continued)

If we know µ and σ, the CLT allows us to predict the range of sample means for samples of size n, when the z-values for the standard normal distribution are used.

90% Interval: μ ± 1.645 (σ/√n)

95% Interval: μ ± 1.960 (σ/√n)

99% Interval: μ ± 2.576 (σ/√n)
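As a quick illustration (not part of the original slides), the following Python sketch computes these expected ranges for the GMAT population described earlier (μ = 520.78, σ = 86.80); the sample size n = 25 is an assumed value, and SciPy is assumed to be available for the z-values.

```python
from scipy.stats import norm

mu, sigma, n = 520.78, 86.80, 25   # GMAT population parameters; n = 25 is an assumed sample size

for conf in (0.90, 0.95, 0.99):
    z = norm.ppf(1 - (1 - conf) / 2)       # z-value for the chosen confidence level
    half_width = z * sigma / n ** 0.5      # z * sigma / sqrt(n)
    print(f"{conf:.0%} of sample means expected in "
          f"[{mu - half_width:.2f}, {mu + half_width:.2f}]")
```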

8-27
Chapter 8
8.3 Sample Size and Standard Error
LO8-4: Explain how sample size affects the standard error.

Sample Size and Standard Error


• Even if the population standard deviation σ is large, the sample means will fall within a narrow interval as long as n is large. The key is the standard error of the mean, σx̄ = σ/√n. The standard error decreases as n increases.
• For example, when n = 4 the standard error is halved. To halve it
again requires n = 16, and to halve it again requires n = 64. To
halve the standard error, you must quadruple the sample size (the
law of diminishing returns).
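A small sketch of this law of diminishing returns (illustrative only; the GMAT σ from earlier is reused as an assumed value):

```python
import math

sigma = 86.80                      # assumed population standard deviation (GMAT example)
for n in (1, 4, 16, 64, 256):
    se = sigma / math.sqrt(n)      # standard error of the mean; halves each time n quadruples
    print(f"n = {n:>3}  standard error = {se:7.2f}")
```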

8-28
Chapter 8
LO8-4: Explain how sample size affects the standard error
(continued).
Sample Size and Standard Error (continued)

8-29
Chapter 8
LO8-4: Explain how sample size affects the standard error
(continued, 2).
Illustration: All Possible Samples from a Uniform Population

• Consider a discrete uniform population consisting of the integers {0, 1, 2, 3}.

• The population parameters are: μ = 1.5, σ = 1.118.

8-30
Chapter 8
LO8-4: Explain how sample size affects the standard error
(continued, 3).
Illustration: All Possible Samples from a Uniform Population
(continued)

• The population is uniform, yet the distribution of all possible sample means of size 2 has a peaked triangular shape.
• The distribution of sample means is approaching a bell shape or normal distribution, as predicted by the Central Limit Theorem. See the images on the next slide.
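A short sketch (not from the text) that enumerates the 16 possible samples of size n = 2, assuming sampling with replacement from {0, 1, 2, 3}, and tallies the sample means, reproducing the triangular shape described above:

```python
from itertools import product
from collections import Counter

population = [0, 1, 2, 3]
# all 16 possible samples of size 2, drawn with replacement
means = [sum(sample) / 2 for sample in product(population, repeat=2)]

for mean, count in sorted(Counter(means).items()):
    print(f"sample mean {mean:>4}: probability {count}/16")
```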

8-31
Chapter 8
LO8-4: Explain how sample size affects the standard error
(continued, 4).
Illustration: All Possible Samples from a Uniform Population
(continued)

8-32
Chapter 8
8.4 Confidence Interval for a Mean
(µ) with known σ
LO8-5: Construct a confidence interval for a population
mean using z.
What Is a Confidence Interval?
• A sample mean x̄ calculated from a random sample x1, x2, …, xn is a point estimate of the unknown population mean µ.
• Now, because samples vary, we need to indicate our uncertainty about the true value of µ.
• Based on our knowledge of the sampling distribution of x̄, we can create an interval estimate for µ.
• We construct a confidence interval for the unknown mean µ by adding and subtracting a margin of error to and from x̄, the mean of our random sample.
• The confidence level for this interval is expressed as a percentage such as 90,
95, or 99 percent.

8-33
Chapter 8
LO8-5: Construct a confidence interval for a population
mean using z (continued).

Confidence Interval for a Mean µ with Known σ

• The confidence interval for μ with known σ can be written as x̄ ± zα/2 (σ/√n).
8-34
Chapter 8
LO8-5: Construct a confidence interval for a population
mean using z (continued, 2).
If samples are drawn from a normal population (or if the sample is large enough that x̄ is approximately normal by the
Central Limit Theorem) and σ is known,
then the margin of error is calculated using
the standard normal distribution. The
value zα/2 is determined by the desired
level of confidence, which we call 1 − α.
Because the sampling distribution is
symmetric, α/2 is the area in each tail of
the normal distribution.

8-35
Chapter 8
LO8-5: Construct a confidence interval for a population
mean using z (continued, 3).
Choosing a Confidence Level

• A higher confidence level leads to a wider confidence


interval.
• Greater confidence
implies loss of
precision (i.e., greater
margin of error).
• 95% confidence is
most often used.

8-36
Chapter 8
LO8-5: Construct a confidence interval for a population
mean using z (continued, 4).

Interpretation

• A confidence interval either does or does not contain μ.


• The confidence level quantifies the risk.
• Out of 100 confidence intervals constructed at the 95% level, approximately 95 would contain μ and approximately 5 would not.

8-37
Chapter 8
LO8-5: Construct a confidence interval for a population
mean using z (continued, 5).

When Can We Assume Normality?

• If σ is known and the population is normal, then we can safely use the formula to compute the confidence interval.
• If σ is known and we do not know whether the population is
normal, a common rule of thumb is that n ≥ 30 is sufficient
to use the formula as long as the distribution is
approximately symmetric with no outliers.
• Larger n may be needed to assume normality if you are
sampling from a strongly skewed population or one with
outliers.

8-38
Chapter 8
8.5 Confidence Interval for a Mean
(µ) with Unknown σ
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean.
Student’s t Distribution
• In situations where the population is normal but its
standard deviation σ is unknown, the Student’s
t distribution should be used instead of the
normal z distribution.
• This is particularly important when the sample size is small.
• When σ is unknown, the formula for a confidence interval
resembles the formula for known σ except
that t replaces z and s replaces σ.

8-39
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued).

Confidence Interval for a Mean µ with Unknown σ

Such a confidence interval can be expressed as

x̄ ± tα/2 (s/√n), with d.f. = n − 1,

where s/√n is the estimated standard error of the mean.

8-40
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 2).

• The interpretation of the confidence interval is the


same as when σ is known.
• However, the confidence intervals will be wider (other
things being the same) because tα/2 is always greater
than zα/2.
• Intuitively, our confidence interval will be wider
because we face added uncertainty when we use the
sample standard deviation s to estimate the unknown
population standard deviation σ.

8-41
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 3).

• The confidence level, using the t distribution, is shown in Figure 8.13.

8-42
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 4).

Student’s t Distribution
• t distributions are symmetric and shaped like the standard normal
distribution.
• However, they are somewhat less peaked and have thicker tails.
• The t distribution is dependent on the size n of the sample.

8-43
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 5).

Degrees of Freedom
• Degrees of Freedom (d.f.) is a parameter based on the
sample size that is used to determine the value of the t
statistic.
• Degrees of freedom tell how many observations are used to calculate s, less the number of intermediate estimates used in the calculation. The d.f. for the t distribution in this case is given by d.f. = n − 1.
• As n increases, the t distribution approaches the shape of
the normal distribution.
• For a given confidence level, t is always larger than z, so a
confidence interval based on t is always wider than if z were
used.
8-44
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 6).

Comparison of z and t

• For very small samples, t-values differ substantially from the normal.
• As the degrees of freedom increase, the t-values
approach the normal z-values.
• For example, for n = 31, the degrees of freedom,
d.f. = 31 – 1 = 30.
• So for a 90 percent confidence interval, we would use
t = 1.697, which is only slightly larger than z = 1.645.
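A quick numerical check of this comparison (illustrative sketch, assuming SciPy is available):

```python
from scipy.stats import norm, t

conf = 0.90
z = norm.ppf(1 - (1 - conf) / 2)           # about 1.645 for 90% confidence
for df in (5, 10, 30, 100, 1000):
    t_val = t.ppf(1 - (1 - conf) / 2, df)  # t approaches z as d.f. grows
    print(f"d.f. = {df:>4}  t = {t_val:.3f}  z = {z:.3f}")
```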

8-45
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 7).
Example: GMAT Scores

Figure 8.13 8-46


Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 8).

Example: GMAT Scores (continued)

• Construct a 90% confidence interval for the mean GMAT score of all MBA applicants.
• x̄ = 510, s = 73.77
• Since σ is unknown, use the Student’s t for the confidence interval with d.f. = 20 − 1 = 19.
• First find tα/2 = t.05 = 1.729 from Appendix D.

8-47
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 9).
Example: GMAT Scores (continued)
For a 90% confidence interval, use Appendix D to find t0.05 = 1.729 with d.f. = 19.

8-48
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 10).

Example: GMAT Scores (continued)

Note: One can use Excel, Minitab, etc. to obtain these values as well as to
construct confidence Intervals.

The 90 percent confidence interval is

x̄ ± tα/2 (s/√n) = 510 ± (1.729)(73.77/√20) = 510 ± 28.52.

So we are 90 percent confident that the true mean GMAT score is within the interval [481.48, 538.52].
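The same interval can be reproduced with a short Python sketch (assuming SciPy; the summary statistics are the ones given above):

```python
import math
from scipy.stats import t

xbar, s, n = 510, 73.77, 20
t_crit = t.ppf(0.95, n - 1)               # t_.05 with 19 d.f., about 1.729
margin = t_crit * s / math.sqrt(n)        # about 28.5
print(f"90% CI: [{xbar - margin:.2f}, {xbar + margin:.2f}]")   # roughly [481.5, 538.5]
```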

8-49
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 11).

Confidence Interval Width

• Confidence interval width reflects


• the sample size,
• the confidence level, and
• the standard deviation.
• To obtain a narrower interval and more precision
• increase the sample size, or
• lower the confidence level (e.g., from 90% to 80%
confidence).

8-50
Chapter 8
LO8-6: Know when and how to use Student’s t instead of z
to estimate a mean (continued, 12).

Using Appendix D
• Beyond d.f. = 50, Appendix D shows d.f. in steps of 5
or 10.
• If the table does not give the exact degrees of
freedom, use the t-value for the next lower degrees of
freedom.
• This is a conservative procedure since it causes the
interval to be slightly wider.
• A conservative statistician may use the t distribution for
confidence intervals when σ is unknown because
using z would underestimate the margin of error.
8-51
Chapter 8
8.6 Confidence Interval for a
Proportion (π)
LO8-7: Construct a confidence interval for a population
proportion.

• The Central Limit Theorem (CLT) applies to a sample proportion because a proportion is just a mean of a data set whose only values are 0 or 1.
• The distribution of a sample proportion p = x/n tends toward normality as n increases.
• The distribution is centered at the population proportion π.
• Its standard error σp will decrease as n increases, as in the case of the standard error for x̄.
• In other words, the sample proportion p = x/n is a consistent estimator of π.

8-52
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued).

Central Limit Theorem for a Proportion

As the sample size increases, the distribution of the sample proportion p = x/n approaches a normal distribution with mean π and standard error σp = √(π(1 − π)/n).
8-53
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 2).

Applying the CLT


• The distribution of a sample proportion p = x/n is symmetric
if π = .50 and regardless of π, approaches symmetry as n
increases.
• If we could actually take repeated samples, we could
empirically study the sampling distribution of p = x/n. This
can be done in a computer simulation. Figure 8.18 shows
histograms of 1,000 sample proportions of various sample
sizes taken from a population with π = .20.
• See the next slide for Figure 8.18.

8-54
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 3).
Applying the CLT (continued)

8-55
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 4).
When Is It Safe to Assume Normality of p?

The statistic p = x/n may be assumed normally distributed when the sample is “large.” How large must n be?

The sample proportion p = x/n may be assumed normal if both nπ ≥ 10 and n(1 − π) ≥ 10.

Rule of Thumb
The sample proportion p = x/n may be assumed normal when
the sample has at least 10 “successes” and at least 10 “failures,”
i.e., when x ≥ 10 and n − x ≥ 10.
Table 8.9 8-56
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 5).
When Is It Safe to Assume Normality of p? (continued)
Table 8.9 shows the minimum sample size needed to assume normality for p = x/n.

Table 8.9 8-57


Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 6).

Confidence Interval for π

• The confidence interval for π (assuming a large sample) is

p ± zα/2 √(p(1 − p)/n),   where p = x/n

8-58
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 7).
Example
A sample of 75 retail in-store purchases showed that 24 were paid in cash.
We will construct a 95 percent confidence interval for the proportion of all
retail in-store purchases that are paid in cash.

• The sample proportion is p = x/n = 24/75 = .32.

• We can assume that p is normally distributed because np and n(1 − p)


exceed 10. That is, np = (75)(.32) = 24 and
n(1 − p) = (75)(.68) = 51.

• The 95 percent confidence interval is (view the next slide):

8-59
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 8).

Example (continued)

We cannot know with certainty whether or not the true proportion lies within
the interval [.214, .426]. Either it does, or it does not. And it is true that
different samples could yield different intervals. But we can say that, on
average, 95 percent of the intervals constructed in this way would contain the
true population proportion π. Therefore, we are 95 percent confident that the
true proportion π is between .214 and .426.
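A short sketch reproducing this interval (assuming SciPy):

```python
import math
from scipy.stats import norm

x, n = 24, 75
p = x / n                                   # sample proportion = .32
z = norm.ppf(0.975)                         # about 1.96 for 95% confidence
margin = z * math.sqrt(p * (1 - p) / n)     # z times the estimated standard error of p
print(f"95% CI for pi: [{p - margin:.3f}, {p + margin:.3f}]")   # about [.214, .426]
```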

8-60
Chapter 8
LO8-7: Construct a confidence interval for a population
proportion (continued, 9).

Rule of Three

• A useful quick rule is the Rule of Three. If in n independent trials no events occur, the upper 95 percent confidence bound is approximately 3/n.
• For example, if no medical complications arise in 17 prenatal fetal
surgeries, the upper bound on such complications is roughly
3/17 = .18, or about 18 percent.
• This rule is sometimes used when limited data are available.
• This rule is especially useful because the formula for the standard
error σp breaks down when p = 0.
• The rule of three is a conservative approach.
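A tiny sketch comparing the 3/n approximation with the exact 95 percent upper bound obtained by solving (1 − π)^n = .05 when no events occur (the exact formula is an added detail, not from the slides):

```python
n = 17                                    # e.g., 17 surgeries with no complications
approx = 3 / n                            # Rule of Three upper bound
exact = 1 - 0.05 ** (1 / n)               # exact 95% upper bound when x = 0
print(f"Rule of Three: {approx:.3f}   exact: {exact:.3f}")
```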

8-61
Chapter 8
8.7 Estimating from Finite
Populations
LO8-8: Know how to modify confidence intervals when the
population is finite.
• In Chapter 2 we discussed infinite and finite populations and the implication of a finite
population when sampling without replacement.
• If the sample size n is less than 5 percent of the population, and we are sampling
without replacement, then we consider the size of the population to be effectively
infinite.
• However, on occasion we will take samples without replacement where n is greater than
5 percent of the population.
• When this happens, our margin of error on the interval estimate is actually less than
when the sample size is “small” relative to the population size.
• As we sample more of the population, we get more precise estimates. We need to
account for the fact that we are sampling a larger percentage of the population.
• The finite population correction factor (FPCF) reduces the margin of error and provides
a more precise interval estimate.

8-62
Chapter 8
LO8-8: Know how to modify confidence intervals when the
population is finite (continued).

Finite Population Correction Factor

√((N − n)/(N − 1)) is the finite population correction factor (FPCF),

where
N = the number of items in the population
n = the number of items in the sample

8-63
Chapter 8
LO8-8: Know how to modify confidence intervals when the
population is finite (continued, 2).
The FPCF can be omitted when the population is infinite (e.g., when we are sampling from an ongoing production process) or effectively infinite (when the population is at least 20 times as large as the sample). When n/N < .05, the FPCF is almost equal to 1 and will have a negligible effect on the confidence interval.
Confidence Intervals for Finite Populations

estimating µ with known σ:    x̄ ± zα/2 (σ/√n) √((N − n)/(N − 1))

estimating µ with unknown σ:  x̄ ± tα/2 (s/√n) √((N − n)/(N − 1))

estimating π:                 p ± zα/2 √(p(1 − p)/n) √((N − n)/(N − 1))
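A brief sketch applying the FPCF to the known-σ interval (all numbers are hypothetical):

```python
import math
from scipy.stats import norm

N, n = 400, 50                      # hypothetical finite population and sample sizes (n/N > .05)
xbar, sigma, conf = 100.0, 15.0, 0.95
fpcf = math.sqrt((N - n) / (N - 1))            # finite population correction factor
z = norm.ppf(1 - (1 - conf) / 2)
margin = z * (sigma / math.sqrt(n)) * fpcf     # corrected margin of error
print(f"FPCF = {fpcf:.4f}, 95% CI: [{xbar - margin:.2f}, {xbar + margin:.2f}]")
```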

8-64
Chapter 8
8.8 Sample Size Determination for
a Mean
LO8-9: Calculate sample size to estimate a mean.

Sample Size to Estimate μ

• To estimate a population mean with a precision of ± E (allowable error), you would need a sample of size n. Now, E = zα/2 (σ/√n).

Thus, the formula for the sample size can be written as n = (zα/2 σ/E)².

Note: Always round n to the next higher integer to be conservative.
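A minimal sketch of this calculation (the σ and E values are assumed for illustration):

```python
import math
from scipy.stats import norm

def sample_size_mean(sigma, E, conf=0.95):
    """n = (z * sigma / E)^2, rounded up to the next integer."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / E) ** 2)

print(sample_size_mean(sigma=86.80, E=10, conf=0.95))   # e.g., GMAT sigma, error of +/- 10 points
```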


8-65
Chapter 8
LO8-9: Calculate sample size to estimate a mean
(continued).

How to Estimate σ?
• Method 1: Take a Preliminary Sample
Take a small preliminary sample and use the sample s in place of
σ in the sample size formula.
• Method 2: Assume Uniform Population
Estimate rough upper and lower limits a and b and set σ = (b − a)/√12.
• Method 3: Assume Normal Population
Estimate rough upper and lower limits a and b and set σ = (b − a)/6. This assumes normality with most of the data within μ ± 3σ, so the range is 6σ.
• Method 4: Poisson Arrivals
In the special case when λ is a Poisson arrival rate, then σ = √λ.
8-66
Chapter 8
8.9 Sample Size Determination for
a Proportion
LO8-10: Calculate sample size to estimate a
proportion.

• To estimate a population proportion with a precision of ± E (allowable error), you would need a sample of size n.
• Since p is a number between 0 and 1, the allowable error E is also between 0 and 1.

8-67
Chapter 8
LO8-10: Calculate sample size to estimate a
proportion (continued).

With E = zα/2 √(π(1 − π)/n), we can solve for n.

Solving gives n = (zα/2/E)² π(1 − π).

Note: Always round n to the next higher integer to be conservative.
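A minimal sketch of this calculation (π = .50 is a conservative assumption; E is assumed):

```python
import math
from scipy.stats import norm

def sample_size_proportion(E, pi=0.50, conf=0.95):
    """n = (z/E)^2 * pi * (1 - pi), rounded up to the next integer."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil((z / E) ** 2 * pi * (1 - pi))

print(sample_size_proportion(E=0.03))   # +/- 3 percentage points at 95% confidence
```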

8-68
Chapter 8
LO8-10: Calculate sample size to estimate a
proportion (continued, 2).

How to Estimate π?

• Method 1: Assume that π = .50.


This conservative method ensures the desired precision.
However, the sample may end up being larger than
necessary.
• Method 2: Take a Preliminary Sample.
Take a small preliminary sample and use the sample p in
place of π in the sample size formula.
• Method 3: Use a Prior Sample or Historical Data.
Unfortunately, π might be different enough to make it a
questionable assumption.
8-69
Chapter 8
8.10 Confidence Interval for a Population
Variance, σ2 (Optional)
LO8-11: Construct a confidence interval for a variance
(optional).

Chi-Square Distribution

• If the population is normal, then the scaled sample variance (n − 1)s²/σ² follows the chi-square distribution (χ²) with degrees of freedom d.f. = n − 1.
• Lower (χ²L) and upper (χ²U) tail percentiles for the chi-square distribution can be found using Appendix E.

8-70
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued).

Confidence Interval

• Using the sample variance s², the confidence interval for the variance is computed from the following relationship:

(n − 1)s²/χ²U  ≤  σ²  ≤  (n − 1)s²/χ²L

• To obtain a confidence interval for the standard deviation σ, just take the square root of the interval bounds.

8-71
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued, 2).

Example

• On a particular Friday night, the charges for 40 pizza delivery orders from Mama Frida’s Pizza had a sample variance s² = 12.77. Construct a 95% confidence interval for the population variance of the charges. (Refer to the text for the actual data.)

8-72
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued, 3).

Example (continued)

• The sample data were nearly symmetric (median $24.98) with no outliers.
• Normality of the prices will be assumed.
• Using 39 degrees of freedom (d.f. = n − 1 = 40 − 1 = 39)
in Appendix E, we obtain bounds for the 95 percent
middle area, as illustrated on the next slide.

8-73
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued, 4).
Example (continued)
You can use Appendix E to find critical chi-square values.

8-74
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued, 5).

Example (continued)

8-75
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued, 6).

Example (continued)
The 95 percent confidence interval for the population variance σ² is

Lower bound: (n − 1)s²/χ²U = (39)(12.77)/58.12 = 8.569

Upper bound: (n − 1)s²/χ²L = (39)(12.77)/23.65 = 21.058

With 95 percent confidence, we believe that 8.569 ≤ σ² ≤ 21.058. If you want a confidence interval for the standard deviation, just take the square root of the interval bounds. In this example, that would give 2.93 ≤ σ ≤ 4.59.
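The same bounds can be computed with SciPy’s chi-square percentiles (a sketch; small differences from the slide’s values reflect rounding of s² and of the Appendix E critical values):

```python
from scipy.stats import chi2

n, s2, conf = 40, 12.77, 0.95
df = n - 1
chi_lower = chi2.ppf((1 - conf) / 2, df)        # about 23.65
chi_upper = chi2.ppf(1 - (1 - conf) / 2, df)    # about 58.12
lower = df * s2 / chi_upper                     # about 8.57
upper = df * s2 / chi_lower                     # about 21.06
print(f"95% CI for sigma^2: [{lower:.3f}, {upper:.3f}]")
print(f"95% CI for sigma:   [{lower**0.5:.2f}, {upper**0.5:.2f}]")
```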
8-76
Chapter 8
LO8-11: Construct a confidence interval for a variance
(optional) (continued, 7).

Caution: Assumption of Normality

• The methods described for confidence interval estimation


of the variance and standard deviation depend on the
population having a normal distribution.
• If the population does not have a normal distribution, then
the confidence interval should not be considered
accurate.
• In such a case, the best alternative is to look for software that can calculate a bootstrap estimate of σ².

8-77
