Lecture 11
Chapter 11
Estimation: Comparing two populations
The sampling distribution of $\bar{x}_1 - \bar{x}_2$
• Expected value of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$
• Variance of $\bar{x}_1 - \bar{x}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$
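• Interval estimator of $\mu_1 - \mu_2$ when the population variances are known:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$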
Example 11.1 revised, page 427
Solution
A 99% z-interval estimator for the difference
between two means is:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
There are 3 steps to follow:
1. Calculate $\bar{x}_1 = 153$ and $\bar{x}_2 = 142$
2. Find $z_{\alpha/2} = z_{0.005} = 2.575$
3. Calculate the confidence interval
$$(153 - 142) \pm 2.575\sqrt{\frac{10^2}{14} + \frac{10^2}{14}} = 11 \pm 9.73 = [1.27,\ 20.73]$$
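The three steps can be sketched in Python using only the standard library. This is a minimal illustration, not part of the textbook solution: the function name `z_interval_two_means` and its parameters are hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed table, so the last digit may differ slightly from a hand calculation with z = 2.575.

```python
import math
from statistics import NormalDist


def z_interval_two_means(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Confidence interval for mu1 - mu2 when both population sigmas are known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


# Example 11.1 inputs: sample means 153 and 142, sigma = 10 for both, n = 14 for both
lower, upper = z_interval_two_means(153, 142, 10, 10, 14, 14, conf=0.99)
print(f"99% interval: [{lower:.2f}, {upper:.2f}]")
```

Because the whole interval lies above zero, the data suggest that the first population mean is larger than the second at this confidence level.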
11.2 Estimating the difference between
two population means ($\mu_1 - \mu_2$) when
the variances $\sigma_1^2$ and $\sigma_2^2$ are unknown
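• When the population variances $\sigma_1^2$ and $\sigma_2^2$ are
unknown, we use the T-statistic instead of the Z-statistic:
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
• The $(1 - \alpha)$ confidence interval for $\mu_1 - \mu_2$ is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\text{d.f.}}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where $1 - \alpha$ is the confidence level and
$$\text{d.f.} = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$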
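The degrees-of-freedom formula is easy to mistype by hand, so a small helper can be useful. The sketch below is only an illustration: the name `welch_df` and the input values are hypothetical and are not taken from the textbook.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the unequal-variances t interval."""
    v1 = s1_sq / n1  # s1^2 / n1
    v2 = s2_sq / n2  # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical inputs, chosen only to illustrate the formula.
equal_case = welch_df(4.0, 12, 4.0, 12)     # equal variances and equal sample sizes
unequal_case = welch_df(1.0, 12, 25.0, 12)  # very different variances
print(equal_case, unequal_case)
```

With equal sample variances and equal sample sizes the formula reduces to $n_1 + n_2 - 2$; the more the variances differ, the smaller the value becomes. The result is usually not an integer and is rounded before looking up the t table.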
Example 11.3, page 432
• A sample of 30 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fibre cereal for breakfast. For each person, the number of kilojoules consumed at lunch was recorded.
• Estimate with 95% confidence the difference between the average kilojoule intake at lunch of consumers and non-consumers of high-fibre cereal.
Consumers ($n_1 = 10$): 2560, 2420, 2116, 2364, 2384, 2256, 2460, 2240, 2540, 2492
Non-consumers ($n_2 = 20$): 2008, 2812, 2940, 2828, 2092, 2136, 3072, 2504, 2480, 2356, 2944, 2260, 2744, 2116, 2528, 3804, 2976, 2528, 2372, 3388
Solution
Three steps to follow:
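1. Calculate $\bar{x}_1 = 2383.2$, $\bar{x}_2 = 2644.4$, $s_1 = 142.75$, $s_2 = 462.61$.
It appears that the population variances are unequal.
2. Find:
- the degrees of freedom: d.f. = 25.01, rounded to 25
- $t_{\alpha/2,\,\text{d.f.}} = t_{0.025,\,25} = 2.060$
3. Calculate the 95% confidence interval:
$$(2383.2 - 2644.4) \pm 2.060\sqrt{\frac{142.75^2}{10} + \frac{462.61^2}{20}} \approx -261.20 \pm 232.45 = [-493.65,\ -28.75]$$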
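The calculation can be checked with a short script. This is a sketch under stated assumptions: it works from the summary statistics quoted in the solution (with $n_1 = 10$ and $n_2 = 20$), takes the critical value 2.060 from the t table rather than computing it, and uses variable names that are purely illustrative. Small differences in the last digits relative to the hand calculation come from rounding.

```python
import math

# Summary statistics from the solution (consumers = sample 1, non-consumers = sample 2)
n1, xbar1, s1 = 10, 2383.2, 142.75
n2, xbar2, s2 = 20, 2644.4, 462.61
t_crit = 2.060  # t_{0.025, 25}, read from a t table (assumed, not computed here)

v1 = s1 ** 2 / n1
v2 = s2 ** 2 / n2

# Degrees of freedom for the unequal-variances case
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Margin of error and interval limits
margin = t_crit * math.sqrt(v1 + v2)
diff = xbar1 - xbar2
lower, upper = diff - margin, diff + margin

print(f"d.f. = {df:.2f}")
print(f"{diff:.2f} +/- {margin:.2f} -> [{lower:.2f}, {upper:.2f}]")
```

Both limits are negative, which points to a lower average lunch intake for consumers of high-fibre cereal than for non-consumers.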
Case II: The two variances are equal
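$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\text{d.f.}}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled variance and d.f. $= n_1 + n_2 - 2$.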
Solution
Three steps to follow:
1. Calculate $\bar{x}_1 = 39.95$, $\bar{x}_2 = 47.03$, $s_1^2 = 54.02$, $s_2^2 = 51.22$, and $n_1 = n_2 = 8$.
It appears that the variances are equal.
Calculate the pooled variance
$$s_p^2 = \frac{(8 - 1)(54.02) + (8 - 1)(51.22)}{8 + 8 - 2} = 52.62$$
2. Find:
- the degrees of freedom: d.f. = 8 + 8 - 2 = 14
- $t_{\alpha/2,\,\text{d.f.}} = t_{0.05,\,14} = 1.761$
3. Calculate the 90% confidence interval:
$$(39.95 - 47.03) \pm 1.761\sqrt{52.62\left(\frac{1}{8} + \frac{1}{8}\right)} = -7.08 \pm 6.39 = [-13.47,\ -0.69]$$
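The equal-variances calculation can be sketched as a small function. The name `pooled_t_interval` and its parameters are hypothetical, and the critical value 1.761 is taken from the t table rather than computed.

```python
import math


def pooled_t_interval(xbar1, xbar2, var1, var2, n1, n2, t_crit):
    """Equal-variances t interval for mu1 - mu2, given sample variances."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df   # pooled variance
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = xbar1 - xbar2
    return sp2, df, (diff - margin, diff + margin)


# Values from the solution; t_{0.05, 14} = 1.761 is read from a t table
sp2, df, (lower, upper) = pooled_t_interval(39.95, 47.03, 54.02, 51.22, 8, 8, 1.761)
print(f"pooled variance = {sp2:.2f}, d.f. = {df}")
print(f"90% interval: [{lower:.2f}, {upper:.2f}]")
```

Pooling weights each sample variance by its degrees of freedom, so with equal sample sizes the pooled variance is simply the average of the two sample variances.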
The $(1 - \alpha)$ confidence interval
• Calculate the sample of differences: XD = X1 – X2
Differences XD: -1.2 -4.1 0.3 -1.8 1.1 -4.3 -0.5 -1.4
We assume that $X_D$ is normally distributed (check this by
drawing a histogram). Calculate $\bar{x}_D$, the sample mean, and
$s_D$, the sample standard deviation.
• T-statistic
$$T = \frac{\bar{X}_D - \mu_D}{s_D/\sqrt{n_D}}$$
where $\mu_D = \mu_1 - \mu_2$ and $n_D$ is the size of the sample of differences $X_D$.
The T-statistic has a t distribution with $n_D - 1$ degrees of
freedom.
• The $(1 - \alpha)$ confidence interval for $\mu_D = \mu_1 - \mu_2$ is
$$\bar{x}_D \pm t_{\alpha/2,\,n_D - 1}\frac{s_D}{\sqrt{n_D}}$$
Solution
Three steps to follow:
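1. Calculate the sample differences, then $\bar{x}_D = -1.49$ and $s_D = 1.92$
2. Find:
- the degrees of freedom: d.f. = $n_D - 1 = 7$
- $t_{\alpha/2,\,\text{d.f.}} = t_{0.025,\,7} = 2.365$
3. Calculate the 95% confidence interval:
$$-1.49 \pm 2.365\,\frac{1.92}{\sqrt{8}} = -1.49 \pm 1.61 = [-3.10,\ 0.12]$$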
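The matched-pairs steps can be reproduced from the eight differences listed earlier. This is a minimal sketch using the standard library; the variable names are illustrative, and the critical value 2.365 is taken from the t table rather than computed. Because the script keeps full precision for the mean and standard deviation, the limits may differ in the last digit from a calculation with the rounded values.

```python
import math
import statistics

# Sample of differences X_D = X_1 - X_2
diffs = [-1.2, -4.1, 0.3, -1.8, 1.1, -4.3, -0.5, -1.4]

n_d = len(diffs)
xbar_d = statistics.mean(diffs)   # sample mean of the differences
s_d = statistics.stdev(diffs)     # sample standard deviation (n - 1 divisor)

t_crit = 2.365                    # t_{0.025, 7}, read from a t table
margin = t_crit * s_d / math.sqrt(n_d)
lower, upper = xbar_d - margin, xbar_d + margin

print(f"mean = {xbar_d:.2f}, sd = {s_d:.2f}")
print(f"95% interval: [{lower:.2f}, {upper:.2f}]")
```

The interval contains zero, so at the 95% level these differences do not rule out equal population means.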
11.4 Estimating the difference between
two population proportions (p1 – p2)
• In this section we deal with two populations whose
data are nominal.
• When data are nominal we can (only) ask questions
regarding the proportions of occurrence of certain
outcomes (called successes).
• Thus, we compare the two populations by estimating
the difference between the two population proportions,
p1 – p2.
• Consider the statistic (sample proportion) $\hat{p}_1 = X_1/n_1$, where
$X_1$ is the number of successes in a sample of size $n_1$ taken
from the 1st population, and the statistic $\hat{p}_2 = X_2/n_2$, where
$X_2$ is the number of successes in a sample of size $n_2$ taken
from the 2nd population.
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$
Estimating the difference between
two population proportions
• The $(1 - \alpha)$ confidence interval for the difference
between two population proportions, $p_1 - p_2$, is
given by:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
Solution: We perform the following three steps
1. Calculate
$$\hat{p}_1 = \frac{104}{11{,}000} = .009455; \quad \hat{p}_2 = \frac{189}{11{,}000} = .01718$$
2. Find $z_{\alpha/2} = z_{0.025} = 1.96$
3. The 95% confidence interval for the difference
between the proportion of regular aspirin takers
who had a heart attack and the proportion of
regular placebo takers who had a heart attack is
$$(\hat{p}_1 - \hat{p}_2) \pm z_{.025}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
$$= (.009455 - .01718) \pm 1.96\sqrt{\frac{(.009455)(.990545)}{11000} + \frac{(.01718)(.98282)}{11000}}$$
$$= [-.010753,\ -.004697]$$
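The same three steps can be sketched in Python. The function name `two_proportion_interval` is hypothetical, and the critical value is obtained from `statistics.NormalDist` instead of a table, so the result may differ from the hand calculation in the last digit.

```python
import math
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, conf=0.95):
    """Confidence interval for p1 - p2 from success counts x1, x2."""
    p1, p2 = x1 / n1, x2 / n2                       # sample proportions
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # z_{alpha/2}
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# 104 heart attacks among 11,000 aspirin takers; 189 among 11,000 placebo takers
lower, upper = two_proportion_interval(104, 11000, 189, 11000)
print(f"95% interval: [{lower:.6f}, {upper:.6f}]")
```

Both limits are negative, so the estimated heart-attack proportion is lower among aspirin takers than among placebo takers.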
Home assignment: