Chapter 3 Estimation
Outline
3.1 Descriptive and Inferential Statistics: Introduction
3.2 Interval Estimation for a Mean
• Small and Large Samples
3.3 Interval Estimation for the Difference Between Two Means (Independent Samples)
3.4 Interval Estimation for the Difference Between Two Means (Dependent Samples)
Copyright © 2015 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Copyright © 2012 The McGraw-Hill Companies, Inc.
Inferential Statistics
Inferential statistics has two branches: estimation and hypothesis testing.
Estimation:
Estimation is the process of estimating the value of a population parameter based on a sample statistic. It takes two forms: point estimation and interval estimation.
Hypothesis Testing:
Hypothesis testing is a decision-making process for evaluating claims about a population parameter.
Estimator
A point estimate of a population parameter is a single number obtained from the sample.
An estimate of a population parameter given by two numbers between which the parameter
may be considered to lie is called an interval estimate.
Example
To estimate µ, the population mean diameter of piston rings for a car engine produced by a
manufacturer, the diameters of a random sample of 15 piston rings were measured. The
sample mean was calculated and the value obtained is 8.23 mm.
If we estimate the mean diameter of all piston rings produced by the manufacturer as
8.23 mm we are giving the point estimate.
If we estimate the mean diameter as (8.216 mm, 8.244 mm) or 8.23 ± 0.014 mm, we are
giving the interval estimate.
Characteristics of a good estimator
a) Unbiasedness
b) Efficiency
c) Consistency
d) Sufficiency
The best point estimator of the population mean is the sample mean,
$$\bar{X} = \frac{\sum X_i}{n}$$
The best point estimator of the population standard deviation is the sample
standard deviation,
$$S = \sqrt{\frac{\sum X^2 - \dfrac{\left(\sum X\right)^2}{n}}{n-1}}$$
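As a quick numerical check of the two formulas above, here is a minimal Python sketch. It applies them to the seven yearly home-fire counts from the candle example later in this chapter. The function names `sample_mean` and `sample_sd` are illustrative choices, not part of the slides.

```python
import math
import statistics

def sample_mean(xs):
    """Sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_sd(xs):
    """Sample standard deviation via the shortcut formula
    S = sqrt((sum(X^2) - (sum X)^2 / n) / (n - 1))."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Home fires started by candles (data from the example in this chapter).
fires = [5460, 5900, 6090, 6310, 7160, 8440, 9930]
x_bar = sample_mean(fires)
s = sample_sd(fires)
print(round(x_bar, 1), round(s, 1))

# The shortcut formula should agree with the library's definition-based value.
print(abs(s - statistics.stdev(fires)) < 1e-6)
```

The rounded results should agree with the values 7041.4 and 1610.3 used in that example.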
Confidence Interval
An interval estimate may or may not contain the value of the parameter being estimated, so it is stated together with a confidence level.
Example
If an interval (a, b) is such that P(a < θ < b) = 0.95, then (a, b) is the 95% confidence interval for θ, and
0.95 is the confidence coefficient.
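Confidence Interval of the Mean for a Specific α, when σ is known, for large samples (n ≥ 30)
$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
where $z_{\alpha/2}$ is the upper 100(α/2) percentage point of the standard normal distribution. When σ is known it replaces s; for large samples, s serves as an estimate of σ.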
To be more confident that the interval contains the true population mean, you must make the
interval wider.
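The trade-off between confidence and width can be seen numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist` to obtain $z_{\alpha/2}$. The function name `z_interval` and the summary values (mean 100, standard deviation 15, n = 36) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sd, n, confidence):
    """Large-sample confidence interval: x_bar +/- z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    margin = z * sd / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical summary statistics, for illustration only.
x_bar, sd, n = 100.0, 15.0, 36

widths = {}
for confidence in (0.90, 0.95, 0.99):
    lower, upper = z_interval(x_bar, sd, n, confidence)
    widths[confidence] = upper - lower
    print(f"{confidence:.0%}: ({lower:.2f}, {upper:.2f}), width {upper - lower:.2f}")
```

With the sample held fixed, each increase in the confidence level produces a wider interval.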
Normal Distribution Properties
• The normal distribution curve is bell-shaped.
• The mean, median, and mode are equal and
located at the center of the distribution.
• The normal distribution curve is unimodal (i.e.,
it has only one mode).
• The curve is symmetrical about the mean,
which is equivalent to saying that its shape is
the same on both sides of a vertical line passing
through the center.
• The curve is continuous—i.e., there are no gaps
or holes. For each value of X, there is a
corresponding value of Y.
• The curve never touches the x-axis.
Theoretically, no matter how far in either
direction the curve extends, it never meets the
x-axis—but it gets increasingly close.
• The total area under the normal distribution
curve is equal to 1.00 or 100%.
Example:
Using Table 4, find $z_{0.025}$.
$z_{0.025} = 1.9600$
95% Confidence Interval of the Mean
Example:
A random sample of n = 50 males showed a mean average daily
intake of dairy products equal to 756 grams with a standard
deviation of 35 grams. Find a 95% confidence interval for the
population average μ.
$$\bar{x} \pm z_{0.05/2}\,\frac{s}{\sqrt{n}}, \qquad z_{0.025} = 1.96$$
$$\Rightarrow 756 \pm 1.96\,\frac{35}{\sqrt{50}}$$
$$\Rightarrow 756 \pm 9.70$$
$$\Rightarrow 746.30 < \mu < 765.70$$
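Example:
Here the sample mean is 54 days, the standard deviation is 6.0 days, the sample size is n = 50, and the confidence level is 95%.
$$54 \pm z_{0.025}\,\frac{6.0}{\sqrt{50}}$$
$$54 \pm (1.96)\,\frac{6.0}{\sqrt{50}}$$
$$54 \pm 1.7$$
$$52.3 < \mu < 55.7$$
We can say with 95% confidence that the mean number of days it takes an
automobile dealer to sell a Chevrolet Aveo is between 52 and 56 days.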
Example: Waiting Times
A large department store found that it averages 362 customers per hour.
Assume that the standard deviation is 29.6 and a random sample of 40
hours was used to determine the average. Find the 99% confidence
interval of the population mean.
$$362 \pm z_{0.005}\,\frac{29.6}{\sqrt{40}}$$
$$362 \pm (2.5758)\,\frac{29.6}{\sqrt{40}}$$
$$362 \pm 12.1$$
$$(350, 374)$$
Hence, one can be 99% confident that the mean number of customers that the
store averages is between 350 and 374 customers per hour.
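The large-sample examples above can be re-checked in a few lines. This is a sketch only: the helper `z_interval` and the dictionary labels are illustrative names, and the critical values come from `statistics.NormalDist` rather than from Table 4, so the last digit may differ slightly from a hand calculation with rounded z values.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sd, n, confidence):
    """Return (lower, upper) for x_bar +/- z_{alpha/2} * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# (sample mean, standard deviation, n, confidence level) from the examples.
examples = {
    "dairy intake (grams)": (756, 35, 50, 0.95),
    "days to sell a car": (54, 6.0, 50, 0.95),
    "customers per hour": (362, 29.6, 40, 0.99),
}

results = {name: z_interval(*args) for name, args in examples.items()}
for name, (lower, upper) in results.items():
    print(f"{name}: ({lower:.1f}, {upper:.1f})")
```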
Example: Credit Union Assets
The following data represent a sample of the assets (in millions
of dollars) of 30 credit unions in southwestern Pennsylvania.
Find the 90% confidence interval of the mean.
12.23 16.56 4.39
2.89 1.24 2.17
13.19 9.16 1.42
73.25 1.91 14.64
11.59 6.69 1.06
8.74 3.17 18.13
7.92 4.78 16.85
40.22 2.42 21.58
5.01 1.47 12.24
2.27 12.77 2.76
Step 1: Find the mean and standard deviation. Using a calculator, we
find $\bar{X} = 11.091$ and s = 14.405.
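The slides stop after Step 1. As a hedged sketch of how the remaining computation could go, the code below recomputes the mean and standard deviation from the 30 data values and then applies the large-sample formula $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$ with α = 0.10. The variable names are illustrative, and the z value comes from `statistics.NormalDist` (about 1.645), so the endpoints may differ in the second decimal from a calculation with a rounded table value.

```python
import math
import statistics
from statistics import NormalDist

# Assets (in millions of dollars) of the 30 credit unions in the example.
assets = [
    12.23, 16.56, 4.39, 2.89, 1.24, 2.17, 13.19, 9.16, 1.42, 73.25,
    1.91, 14.64, 11.59, 6.69, 1.06, 8.74, 3.17, 18.13, 7.92, 4.78,
    16.85, 40.22, 2.42, 21.58, 5.01, 1.47, 12.24, 2.27, 12.77, 2.76,
]

n = len(assets)
x_bar = statistics.mean(assets)
s = statistics.stdev(assets)          # sample standard deviation (n - 1)

confidence = 0.90
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}")
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```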
Characteristics of the t Distribution
The t distribution differs from the standard normal
distribution in the following ways:
1. The variance is greater than 1.
2. The t distribution is actually a family of curves based
on the concept of degrees of freedom, which is
related to sample size.
3. As the sample size increases, the t distribution
approaches the standard normal distribution.
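Points 1 and 3 can be illustrated with the variance of the t distribution. The sketch below relies on a standard result that is not stated on the slide: for d.f. > 2, the variance equals d.f./(d.f. − 2). The function name `t_variance` is an illustrative choice.

```python
def t_variance(df):
    """Variance of Student's t distribution; defined only for df > 2."""
    if df <= 2:
        raise ValueError("the variance is finite only for df > 2")
    return df / (df - 2)

# The variance is always above 1 and shrinks toward 1 (the standard
# normal variance) as the degrees of freedom grow.
variances = {df: t_variance(df) for df in (3, 5, 10, 30, 100, 1000)}
for df, var in variances.items():
    print(f"d.f. = {df:4d}: variance = {var:.4f}")
```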
Degrees of Freedom
• The symbol d.f. will be used for degrees of freedom.
• The degrees of freedom for a confidence interval for the
mean are found by subtracting 1 from the sample size. That
is, d.f. = n – 1.
Confidence interval for the mean using the t distribution:
$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
where d.f. = n − 1.
Using the t distribution table
Example
α = 0.02
α/2 = 0.01
Sample size, n = 4
Degrees of freedom, v = n − 1 = 3
$t_{0.01,3} = 4.541$
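When no table is at hand, the same critical value can be approximated numerically. The following is a rough sketch using only Python's standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names, the step count, and the search bracket [0, 100] are assumptions of this sketch, so it only handles critical values below 100. A statistics library would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Value t with P(T > t) = alpha, for 0 < alpha < 0.5, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid      # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

t_value = t_critical(0.01, 3)
print(round(t_value, 3))  # should be close to the table value 4.541
```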
Example : Infant Growth
A random sample of 10 children found that their average growth for the
first year was 9.8 inches. Assume the variable is normally distributed and
the sample standard deviation is 0.96 inch. Find the 95% confidence
interval of the population mean for growth during the first year.
$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
$$9.8 \pm t_{0.05/2}\left(\frac{0.96}{\sqrt{10}}\right)$$
$$9.8 \pm (2.262)\,\frac{0.96}{\sqrt{10}}$$
$$9.8 \pm 0.69$$
$$9.11 < \mu < 10.49$$
Therefore, one can be 95% confident that the population mean of the first-year
growth is between 9.11 and 10.49 inches.
Example : Home Fires by Candles
The data represent a sample of the number of home fires
started by candles for the past several years. Find the 99%
confidence interval for the mean number of home fires started
by candles each year.
5460 5900 6090 6310 7160 8440 9930
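From the data, $\bar{x} = 7041.4$ and s = 1610.3, with n = 7 and d.f. = 6.
$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
$$7041.4 \pm t_{0.005}\left(\frac{1610.3}{\sqrt{7}}\right)$$
$$7041.4 \pm (3.707)\,\frac{1610.3}{\sqrt{7}}$$
$$7041.4 \pm 2256.2$$
$$4785.2 < \mu < 9297.6$$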
One can be 99% confident that the population mean number of
home fires started by candles each year is between 4785.2 and
9297.6.
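The two small-sample examples above follow the same recipe, which can be sketched as a short function. The name `t_interval` is illustrative, and the critical values 2.262 and 3.707 are the table values quoted in the examples rather than values computed by the code.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n).

    t_crit is the table value t_{alpha/2} with n - 1 degrees of freedom.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Infant growth: n = 10, d.f. = 9, 95% confidence, t = 2.262.
growth = t_interval(9.8, 0.96, 10, 2.262)

# Home fires started by candles: n = 7, d.f. = 6, 99% confidence, t = 3.707.
fires = t_interval(7041.4, 1610.3, 7, 3.707)

print(f"growth: ({growth[0]:.2f}, {growth[1]:.2f})")
print(f"fires:  ({fires[0]:.1f}, {fires[1]:.1f})")
```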
Example:
To determine the flow characteristics of oil through a valve, the inlet oil
temperature is measured in degrees Fahrenheit. The following is a sample of 8
readings: 97, 93, 91, 94, 93, 92, 89, 90
Construct a 99% confidence interval for the mean inlet oil temperature.
(89.277, 95.473)
We are 99% confident that the mean inlet oil temperature lies between 89.277ºF and
95.473ºF.
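The slide gives only the final interval for this example. A possible route to it, sketched in Python, is shown below. The critical value $t_{0.005,7} = 3.499$ is taken from a standard t table and does not appear on the slide, so treat it as an assumption of this sketch. The endpoints may differ from the slide in the third decimal because of rounding in t.

```python
import math
import statistics

# Inlet oil temperature readings (degrees Fahrenheit) from the example.
readings = [97, 93, 91, 94, 93, 92, 89, 90]

n = len(readings)
x_bar = statistics.mean(readings)
s = statistics.stdev(readings)   # sample standard deviation (n - 1)

t_crit = 3.499                   # assumed table value t_{0.005} with d.f. = 7
margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar:.3f}, s = {s:.3f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```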
Confidence Intervals for the
Difference Between Two Means/Two Samples
The difference between the means of two populations, μ1 − μ2, can be estimated from:
• Independent samples
• Paired samples
Independent random samples
Two different test statistics are used depending on whether
the unknown population variances are equal or unequal.
Confidence interval for the difference between two population means
Variances unknown and equal (Small independent samples)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
is the pooled standard deviation.
The degrees of freedom of the t distribution is n1 + n2 − 2.
Assumptions
• The populations are normally distributed.
• The variances of the two populations, $\sigma_1^2$ and $\sigma_2^2$, are unknown but equal.
• Two independent random samples of sizes n1 and n2 are taken (n1 < 30, n2 < 30).
Exercise
Brand 1 43 48 38 41 51
Brand 2 30 26 37 31 34
Construct the 95% confidence interval on the difference between the average
lifetimes of the two brands. Can the supervising inspector of incoming quality
conclude that the average lifetimes of the two brands are equal?
Solution:
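$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = 4.743$$
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$(44.2 - 31.6) \pm t_{0.025,8}\,(4.743)\sqrt{\frac{1}{5} + \frac{1}{5}}$$
$$12.6 \pm (2.306)(4.743)\sqrt{\frac{1}{5} + \frac{1}{5}}$$
$$12.6 \pm 6.918$$
$$5.682 < \mu_1 - \mu_2 < 19.518$$
Since the interval does not include 0, the inspector cannot conclude that the average lifetimes of the two brands are equal.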
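The pooled calculation for the two brands can be reproduced from the raw data. This is a minimal sketch: the function name `pooled_interval` is illustrative, and the critical value 2.306 is the table value $t_{0.025,8}$ quoted in the solution, not something the code derives.

```python
import math
import statistics

def pooled_interval(sample1, sample2, t_crit):
    """Interval for mu1 - mu2 assuming equal, unknown variances.

    Returns (lower, upper, s_p); t_crit is t_{alpha/2} with
    n1 + n2 - 2 degrees of freedom.
    """
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)   # s1^2 with n1 - 1 in the denominator
    var2 = statistics.variance(sample2)   # s2^2 with n2 - 1 in the denominator
    s_p = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    margin = t_crit * s_p * math.sqrt(1 / n1 + 1 / n2)
    return diff - margin, diff + margin, s_p

brand1 = [43, 48, 38, 41, 51]
brand2 = [30, 26, 37, 31, 34]

lower, upper, s_p = pooled_interval(brand1, brand2, t_crit=2.306)
print(f"s_p = {s_p:.3f}")
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```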
SPSS Output:
Notes:
1. If the confidence interval includes 0, we can say that there is no significant difference between the means of the two populations at the given level of confidence (μ1 − μ2 = 0).
2. If the interval contains only positive values, there is a significant difference between the means of the two populations (μ1 > μ2).
3. If the interval contains only negative values, there is a significant difference between the means of the two populations (μ1 < μ2).
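These three rules translate directly into a small decision function. The sketch below is illustrative only: the function name and the wording of the returned messages are assumptions, and an endpoint exactly equal to 0 is treated as "includes 0".

```python
def interpret_difference_interval(lower, upper):
    """Classify a confidence interval for mu1 - mu2 by the sign of its endpoints."""
    if lower > upper:
        raise ValueError("lower endpoint must not exceed upper endpoint")
    if lower <= 0 <= upper:
        return "no significant difference (mu1 - mu2 = 0 is plausible)"
    if lower > 0:
        return "significant difference, mu1 > mu2"
    return "significant difference, mu1 < mu2"

# The interval from the two-brand exercise lies entirely above 0.
print(interpret_difference_interval(5.682, 19.518))
print(interpret_difference_interval(-1.2, 3.4))
print(interpret_difference_interval(-8.0, -2.5))
```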
Confidence interval for the difference between two population
means (Paired Samples)
Definition
Pair   Population 1   Population 2   Difference
1      x1             y1             d1 = x1 − y1
2      x2             y2             d2 = x2 − y2
...    ...            ...            ...
n      xn             yn             dn = xn − yn
$$\bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$$
where the degrees of freedom of t is n − 1.
The mean $\bar{d}$ and standard deviation $s_d$ of the paired differences d are calculated using a calculator.
Assumption
The distribution of d, the difference between paired observations, is approximately normal.
Exercise
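$$\bar{d} = 19.7, \qquad s_d = 4.398$$
$$\bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$$
$$19.7 \pm t_{0.005}\,\frac{4.398}{\sqrt{10}}$$
$$19.7 \pm (3.25)\,\frac{4.398}{\sqrt{10}}$$
$$19.7 \pm 4.52$$
$$15.18 < \mu_d < 24.22$$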
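The paired-sample arithmetic can be checked from the summary statistics alone, since the raw pairs are not shown here. The sketch below uses the values from this exercise ($\bar{d} = 19.7$, $s_d = 4.398$, n = 10) and the table value t = 3.25 for $t_{0.005}$ with 9 degrees of freedom. The function name `paired_interval` is an illustrative choice.

```python
import math

def paired_interval(d_bar, s_d, n, t_crit):
    """Interval for the mean paired difference: d_bar +/- t_crit * s_d / sqrt(n).

    t_crit is t_{alpha/2} with n - 1 degrees of freedom.
    """
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

lower, upper = paired_interval(d_bar=19.7, s_d=4.398, n=10, t_crit=3.25)
print(f"99% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

Because both endpoints are positive, the interval does not include 0.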
SPSS Output: