5. Confidence Interval
2020
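Confidence Interval
[Figure: a confidence interval drawn as a segment from the lower confidence limit to the upper confidence limit, with the point estimate between them; the distance between the two limits is the width of the confidence interval.]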
21.11.2020
Parameter             Population value   Point estimate (sample)
Mean                  $\mu$              $\bar{X}$
Standard deviation    $\sigma$           $S$
Proportion            $\pi$              $p$
• Since the actual mean is within this interval, this sample makes a correct statement about 𝜇
But what about the intervals from other possible samples of size 25?
[Figure: the estimation process. A random sample is drawn from a population whose mean $\mu$ is unknown; the sample mean is $\bar{X} = 50$, and the resulting statement is "I am 95% confident that $\mu$ is between 40 and 60."]
§ Confidence Level
§ A specific interval either will contain or will not contain the true
parameter
Confidence Intervals
  Population Mean
    $\sigma$ known
    $\sigma$ unknown
  Population Proportion
§ Assumptions
§ Population standard deviation 𝜎 is known
§ Population is normally distributed
§ If population is not normal, use large sample
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
where
$\bar{X}$ is the point estimate
$Z_{\alpha/2}$ is the normal distribution critical value for a probability of $\alpha/2$ in each tail
$\frac{\sigma}{\sqrt{n}}$ is the standard error
[Figure: standard normal curve with $\frac{\alpha}{2} = 0.025$ in each tail.]
[Table: Confidence Level | Confidence Coefficient | $Z_{\alpha/2}$ value]
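The link between a confidence level and its critical value can be sketched in Python. This is a minimal illustration using the standard library's `statistics.NormalDist`; the helper name `z_critical` and the list of levels are assumptions chosen for illustration, not values taken from the slides.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Two-sided standard normal critical value Z_{alpha/2}."""
    alpha = 1.0 - confidence_level
    # Leave alpha/2 in the upper tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Illustrative levels (an assumption, not the slide's table rows).
for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  coefficient = {level:.2f}  Z = {z_critical(level):.3f}")
```

Rounded to the usual table precision, the 90%, 95% and 99% levels give 1.645, 1.96 and 2.58.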
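Confidence Intervals
[Figure: sampling distribution of the mean, centered at $\mu_{\bar{X}} = \mu$, with area $1-\alpha$ in the middle and $\alpha/2$ in each tail, shown above the intervals built from different sample means $\bar{x}_1, \bar{x}_2, \ldots$]
Intervals extend from $\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.
$(1-\alpha)100\%$ of intervals constructed contain $\mu$; $(\alpha)100\%$ do not.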
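The claim that $(1-\alpha)100\%$ of the intervals contain $\mu$ can be checked with a small simulation. The sketch below is only an illustration: the population values (`mu=50`, `sigma=10`), the sample size 25, the number of trials and the seed are all hypothetical choices, and the function name `coverage` is made up for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence_level, trials, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and average it.
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        # The interval xbar +/- half_width contains mu when |xbar - mu| is small enough.
        if half_width >= abs(xbar - mu):
            hits += 1
    return hits / trials


rate = coverage(mu=50, sigma=10, n=25, confidence_level=0.95, trials=4000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to 0.95, while any single interval either contains $\mu$ or does not.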
§ Determine a 90% confidence interval for the true mean resistance of the
population.
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$2.20 \pm 1.645\,\frac{0.35}{\sqrt{11}}$
$2.026 \le \mu \le 2.374$
§ Although the true mean may or may not be in this interval, 90%
of intervals formed in this manner will contain the true mean
§ Determine a 95% confidence interval for the true mean resistance of the
population.
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$2.20 \pm 1.96\,\frac{0.35}{\sqrt{11}} = 2.20 \pm 0.2068$
$1.9932 \le \mu \le 2.4068$
§ Determine a 99% confidence interval for the true mean resistance of the
population.
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$2.20 \pm 2.58\,\frac{0.35}{\sqrt{11}}$
$1.928 \le \mu \le 2.472$
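The three resistance intervals can be reproduced with a short Python sketch. The inputs (mean 2.20, $\sigma = 0.35$, $n = 11$) and the critical values 1.645, 1.96 and 2.58 are the ones used in the slides; the function name `z_interval` is a hypothetical helper for this illustration.

```python
from math import sqrt


def z_interval(xbar, sigma, n, z):
    """Confidence interval for the mean when sigma is known."""
    half_width = z * sigma / sqrt(n)  # critical value times standard error
    return xbar - half_width, xbar + half_width


for level, z in (("90%", 1.645), ("95%", 1.96), ("99%", 2.58)):
    lower, upper = z_interval(2.20, 0.35, 11, z)
    print(f"{level}: {lower:.4f} to {upper:.4f}  (width {upper - lower:.4f})")
```

The width grows with the confidence level: being more confident costs a wider interval for the same sample.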
§ Probably not!
§ Assumptions
§ Population standard deviation 𝜎 is unknown
§ Population is normally distributed
§ If population is not normal, use large sample
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$
where
$\bar{X}$ is the point estimate
$t_{\alpha/2}$ is the critical value of the t distribution with $n-1$ degrees of freedom and an area of $\alpha/2$ in each tail
$S$ is the standard deviation of the sample
Idea: Number of observations that are free to vary after sample mean has been
calculated
Example: Suppose the mean of 3 numbers is 8.0
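The idea can be made concrete with a few lines of Python. The first two values below (7 and 8) are hypothetical choices made only for illustration; once they are picked, the third value is forced by the mean.

```python
n = 3
mean = 8.0

# Two of the three numbers can be chosen freely (hypothetical values).
x1, x2 = 7.0, 8.0

# The third is no longer free: the three numbers must sum to n * mean.
x3 = n * mean - (x1 + x2)

degrees_of_freedom = n - 1
print(x3, degrees_of_freedom)
```

Only $n - 1 = 2$ observations are free to vary, which is the degrees of freedom used for the t distribution.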
Note: t → Z as n increases
t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.
[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5).]
[Table: Confidence Level | t (10 d.f.) | t (20 d.f.) | t (30 d.f.) | Z ($\infty$ d.f.)]
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$
$50 \pm 2.064\,\frac{8}{\sqrt{25}}$
$46.698 \le \mu \le 53.302$
Sample data: 5, 7, 6, 8, 9, 11, 10, 9, 8, 7
Construct a 99% confidence interval for $\mu$.
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$
$8 \pm 3.25\,\frac{1.83}{\sqrt{10}}$
$6.12 \le \mu \le 9.88$
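The same interval can be computed from the raw data. In this sketch the critical value 3.25 is copied from the slide (t with 9 degrees of freedom, 0.005 in each tail) rather than computed, because the Python standard library has no t quantile function; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

data = [5, 7, 6, 8, 9, 11, 10, 9, 8, 7]
t_crit = 3.25  # t critical value for df = 9 and 99% confidence, from the slide

n = len(data)
xbar = mean(data)   # point estimate of mu
s = stdev(data)     # sample standard deviation (divides by n - 1)

half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"xbar = {xbar}, S = {s:.2f}, interval = {lower:.2f} to {upper:.2f}")
```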
$\bar{X} = \frac{81}{9} = 9$
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$9 \pm 1.96\,\frac{3}{\sqrt{9}}$
$7.04 \le \mu \le 10.96$
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$
$9 \pm 2.306\,\frac{3.082}{\sqrt{9}}$
$6.63 \le \mu \le 11.37$
Interval with $\sigma$ known, for comparison: $7.04 \le \mu \le 10.96$
Standard error of the sample proportion, estimated from the sample:
$\sqrt{\frac{p(1-p)}{n}}$
§ Upper and lower confidence limits for the population proportion are
calculated with the formula
$p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
§ where
§ 𝑍𝛼/2 is the standard normal value for the level of confidence desired
§ 𝑝 is the sample proportion
§ 𝑛 is the sample size
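A minimal Python sketch of the proportion formula follows. The inputs `p=0.25`, `n=100` and the 95% level are assumptions: they are not stated in the slides, and were chosen because they reproduce the interval $0.1651 \le \pi \le 0.3349$ shown on the next slide. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p, n, confidence_level):
    """Large-sample confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    half_width = z * sqrt(p * (1 - p) / n)  # critical value times standard error
    return p - half_width, p + half_width


# Assumed inputs (not given in the slides).
lower, upper = proportion_interval(p=0.25, n=100, confidence_level=0.95)
print(f"{lower:.4f} to {upper:.4f}")
```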
0.1651 ≤ 𝜋 ≤ 0.3349
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$n = \frac{Z_{\alpha/2}^2 \sigma^2}{e^2}$
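Solving the margin of error $e$ for $n$ can be sketched as below. The numbers in the usage line ($\sigma = 45$, $e = 5$, 90% confidence) are hypothetical, and the function name is made up for this example; the result is rounded up because a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence_level):
    """Smallest n whose margin of error for the mean is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * sigma / e) ** 2)  # n = Z^2 * sigma^2 / e^2, rounded up


# Hypothetical inputs for illustration.
n_required = sample_size_for_mean(sigma=45, e=5, confidence_level=0.90)
print(n_required)
```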
$p \pm Z_{\alpha/2} \sqrt{\frac{\pi(1-\pi)}{n}}$
$n = \frac{Z_{\alpha/2}^2\, \pi(1-\pi)}{e^2}$
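The sample size formula for a proportion can be sketched the same way. The inputs below are hypothetical: a planning value $\pi = 0.12$ with $e = 0.03$, and the conservative choice $\pi = 0.5$, which maximizes $\pi(1-\pi)$ and is a common fallback when no prior estimate exists.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(pi, e, confidence_level):
    """Smallest n whose margin of error for a proportion is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)


# Hypothetical planning values for illustration.
n_with_guess = sample_size_for_proportion(pi=0.12, e=0.03, confidence_level=0.95)
n_worst_case = sample_size_for_proportion(pi=0.5, e=0.03, confidence_level=0.95)
print(n_with_guess, n_worst_case)
```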
$\sigma_{\bar{X}} = \frac{40}{\sqrt{100}} \sqrt{\frac{N-n}{N-1}} = \frac{40}{10} \sqrt{\frac{1000-100}{1000-1}} = 3.8$
§ Using FPC,
$50 \pm 1.984\,\frac{10}{\sqrt{100}} \sqrt{\frac{N-n}{N-1}} = 50 \pm 1.88$
$48.12 \le \mu \le 51.88$
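The finite population correction (FPC) can be sketched in Python as follows. The first call uses the slide's values (40, $n = 100$, $N = 1000$). For the interval, $N = 1000$ is an assumption: the slide leaves $N$ symbolic there, but this value reproduces its margin of 1.88. The function name is hypothetical.

```python
from math import sqrt


def fpc_standard_error(sd, n, N):
    """Standard error of the mean with the finite population correction."""
    return sd / sqrt(n) * sqrt((N - n) / (N - 1))


se = fpc_standard_error(40, 100, 1000)
print(f"corrected standard error: {se:.1f}")

# Interval from the slide: mean 50, standard deviation 10, t = 1.984.
# N = 1000 is assumed here (not stated on that slide).
half_width = 1.984 * fpc_standard_error(10, 100, 1000)
lower, upper = 50 - half_width, 50 + half_width
print(f"{lower:.2f} to {upper:.2f}")
```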
Confidence Intervals
  INDEPENDENT SAMPLES
    $\sigma_1 = \sigma_2$
    $\sigma_1 \ne \sigma_2$
  DEPENDENT SAMPLES
INDEPENDENT SAMPLES
PARAMETERS            POPULATION 1    POPULATION 2
MEAN                  $\mu_1$         $\mu_2$
VARIANCE              $\sigma_1^2$    $\sigma_2^2$
STANDARD DEVIATION    $\sigma_1$      $\sigma_2$
                      SAMPLE 1        SAMPLE 2
SIZE                  $n_1$           $n_2$
MEAN                  $\bar{X}_1$     $\bar{X}_2$
VARIANCE              $S_1^2$         $S_2^2$
STANDARD DEVIATION    $S_1$           $S_2$
INDEPENDENT SAMPLES
$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \mathrm{Var}(\bar{X}_1) + \mathrm{Var}(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
INDEPENDENT SAMPLES
§ 120 marketing students and 90 finance students are randomly selected from two
independent populations with standard deviations 0.42 and 0.64,
respectively. The mean GPA of the marketing students is 3.08 and that of the
finance students is 2.88.
§ Construct a 95% confidence interval for the difference of the population
means.
$n_1 = 120$, $n_2 = 90$
$\sigma_1 = 0.42$, $\sigma_2 = 0.64$
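INDEPENDENT SAMPLES
$n_1 = 120$, $n_2 = 90$
$\sigma_1 = 0.42$, $\sigma_2 = 0.64$
$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$(3.08 - 2.88) \pm 1.96 \sqrt{\frac{0.42^2}{120} + \frac{0.64^2}{90}}$
$0.0479 \le \mu_1 - \mu_2 \le 0.3521$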
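The GPA example with known population standard deviations can be reproduced with a short sketch. All inputs come from the slides; the function name `two_sample_z_interval` is a hypothetical helper.

```python
from math import sqrt


def two_sample_z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, z):
    """Interval for mu1 - mu2 with independent samples and known sigmas."""
    # Variances of the two sample means add because the samples are independent.
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


lower, upper = two_sample_z_interval(3.08, 2.88, 0.42, 0.64, 120, 90, z=1.96)
print(f"{lower:.4f} to {upper:.4f}")
```

Because the whole interval lies above zero, the data are consistent with a higher mean GPA in the first population at this confidence level.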
INDEPENDENT SAMPLES
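§ Since we assume $\sigma_1 = \sigma_2 = \sigma$, we need to estimate a common variance
using the sample standard deviations
§ If $n_1 = n_2$, then
$S_{pool}^2 = \frac{S_1^2 + S_2^2}{2}$
§ If $n_1 \ne n_2$, then
$S_{pool}^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$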
INDEPENDENT SAMPLES
§ 25 marketing students and 25 finance students are randomly selected from two
independent populations with equal variances. The mean GPA of the
marketing students is 3.08 and that of the finance students is 2.88.
§ Construct a 95% confidence interval for the difference of the population
means.
$n_1 = 25$, $n_2 = 25$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
INDEPENDENT SAMPLES
$n_1 = 25$, $n_2 = 25$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
$S_{pool}^2 = \frac{0.36 + 0.64}{2} = 0.50$
$(3.08 - 2.88) \pm 2.02 \sqrt{\frac{0.50}{25} + \frac{0.50}{25}}$
$-0.20 \le \mu_1 - \mu_2 \le 0.60$
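The equal-variance case can be sketched in Python. The code uses the standard weighted pooled variance, which reduces to the simple average of the two sample variances when $n_1 = n_2$. The inputs are the slide's values for the 25 and 25 example, the critical value 2.02 is copied from the slide, and the function names are hypothetical.

```python
from math import sqrt


def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Weighted average of two sample variances (weights are n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_t_interval(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_crit):
    """Interval for mu1 - mu2 assuming equal population variances."""
    sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
    se = sqrt(sp_sq / n1 + sp_sq / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * se, diff + t_crit * se


sp_sq = pooled_variance(0.36, 25, 0.64, 25)
lower, upper = pooled_t_interval(3.08, 2.88, 0.36, 0.64, 25, 25, t_crit=2.02)
print(f"pooled variance = {sp_sq:.2f}, interval = {lower:.2f} to {upper:.2f}")
```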
INDEPENDENT SAMPLES
§ 25 marketing students and 40 finance students are randomly selected from two
independent populations with equal variances. The mean GPA of the
marketing students is 3.08 and that of the finance students is 2.88.
§ Construct a 95% confidence interval for the difference of the population
means.
$n_1 = 25$, $n_2 = 40$
$\bar{X}_1 = 3.08$, $\bar{X}_2 = 2.88$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
INDEPENDENT SAMPLES
$n_1 = 25$, $n_2 = 40$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{\frac{S_{pool}^2}{n_1} + \frac{S_{pool}^2}{n_2}}$
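INDEPENDENT SAMPLES
$n_1 = 25$, $n_2 = 40$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
$S_{pool}^2 = \frac{(25 - 1)(0.36) + (40 - 1)(0.64)}{25 + 40 - 2} = 0.533$
$(3.08 - 2.88) \pm 2.00 \sqrt{\frac{0.533}{25} + \frac{0.533}{40}}$
$-0.17 \le \mu_1 - \mu_2 \le 0.57$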
INDEPENDENT SAMPLES
§ Since we assume $\sigma_1 \ne \sigma_2$, we cannot estimate a common variance
§ We use the sample standard deviations and calculate a common
degrees of freedom
$df = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$
Round to the nearest integer!
INDEPENDENT SAMPLES
§ 50 marketing students and 25 finance students are randomly selected from two
independent populations. The mean GPA of the marketing students is 3.08
and that of the finance students is 2.88.
§ Construct a 95% confidence interval for the difference of the population
means.
$n_1 = 50$, $n_2 = 25$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
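INDEPENDENT SAMPLES
$n_1 = 50$, $n_2 = 25$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
$df = \frac{\left(\frac{0.36}{50} + \frac{0.64}{25}\right)^2}{\frac{(0.36/50)^2}{50 - 1} + \frac{(0.64/25)^2}{25 - 1}} = 37.93 \Rightarrow 38$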
INDEPENDENT SAMPLES
$n_1 = 50$, $n_2 = 25$
$S_1^2 = 0.36$, $S_2^2 = 0.64$
$(3.08 - 2.88) \pm 2.02 \sqrt{\frac{0.36}{50} + \frac{0.64}{25}}$
$-0.17 \le \mu_1 - \mu_2 \le 0.57$
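The unequal-variance calculation can be sketched as follows. The inputs and the critical value 2.02 come from the slides; the function names are hypothetical. The degrees of freedom formula is the one shown above, rounded to the nearest integer before looking up t.

```python
from math import sqrt


def unequal_variance_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom when variances are not assumed equal."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def unequal_variance_interval(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_crit):
    """Interval for mu1 - mu2 using separate sample variances."""
    se = sqrt(s1_sq / n1 + s2_sq / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * se, diff + t_crit * se


df = unequal_variance_df(0.36, 50, 0.64, 25)
lower, upper = unequal_variance_interval(3.08, 2.88, 0.36, 0.64, 50, 25, t_crit=2.02)
print(f"df = {df:.2f} -> {round(df)}, interval = {lower:.2f} to {upper:.2f}")
```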
EXERCISES
§ An experiment was conducted in which two types of engines, A and B,
were compared. Gas mileage, in miles per gallon, was measured.
§ The average mileage for engine A was 36 mpg and the average for
engine B was 42 mpg.
[Flowchart: choosing the distribution for a two-sample confidence interval]
Independent samples?
  YES: Do you know the population variance?
    YES: Z distribution. Use $\sigma$.
    NO: Assume equal variances?
      YES: t distribution. Calculate common variance, $S_{pool}^2$.
      NO: t distribution. Calculate df.
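EXERCISES
$n_A = 50$, $n_B = 75$
$\bar{X}_A = 36$, $\bar{X}_B = 42$
$\sigma_A = 6$, $\sigma_B = 8$
$(\bar{X}_B - \bar{X}_A) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_B^2}{n_B} + \frac{\sigma_A^2}{n_A}}$
$(42 - 36) \pm 2.05 \sqrt{\frac{8^2}{75} + \frac{6^2}{50}}$
$3.43 \le \mu_B - \mu_A \le 8.57$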
EXERCISES
$n_1 = 120$, $n_2 = 90$
$\sigma_1 = 0.42$, $\sigma_2 = 0.64$
$0.0479 \le \mu_1 - \mu_2 \le 0.3521$
EXERCISES
§ Two independent sampling stations were chosen for a study, one located
downstream from the acid mine discharge point and the other located
upstream.
§ For 12 monthly samples collected at the downstream station, the species
diversity index had a mean value of 3.11 and a standard deviation of 0.771,
while 10 monthly samples collected at the upstream station had a mean
index value of 2.04 and a standard deviation of 0.448.
§ Find a 90% confidence interval for the difference between the population
means for the two locations, assuming that the populations are
approximately normally distributed with equal variances.
EXERCISES
$n_1 = 12$, $n_2 = 10$
$S_1 = 0.771$, $S_2 = 0.448$
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{\frac{S_{pool}^2}{n_1} + \frac{S_{pool}^2}{n_2}}$
$S_{pool}^2 = \frac{11 (0.771)^2 + 9 (0.448)^2}{20} = 0.417$
EXERCISES
$n_1 = 12$, $n_2 = 10$
$S_1 = 0.771$, $S_2 = 0.448$
$(3.11 - 2.04) \pm 1.725 \sqrt{\frac{0.417}{12} + \frac{0.417}{10}}$
$0.59 \le \mu_1 - \mu_2 \le 1.55$
EXERCISES
§ A study was conducted by the Department of Zoology at Virginia Tech to
estimate the difference in the amounts of the chemical orthophosphorus
measured at two different stations on the James River.
§ Find a 95% confidence interval for the difference in the true average
orthophosphorus contents at these two stations, assuming that the observations
came from normal populations with different variances.
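EXERCISES
$n_1 = 15$, $n_2 = 12$
$S_1 = 3.07$, $S_2 = 0.80$
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
$df = \frac{\left(\frac{3.07^2}{15} + \frac{0.80^2}{12}\right)^2}{\frac{(3.07^2/15)^2}{15 - 1} + \frac{(0.80^2/12)^2}{12 - 1}} = 16.3 \Rightarrow 16$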
EXERCISES
$n_1 = 15$, $n_2 = 12$
$S_1 = 3.07$, $S_2 = 0.80$
$(3.84 - 1.49) \pm 2.12 \sqrt{\frac{3.07^2}{15} + \frac{0.80^2}{12}}$
$0.60 \le \mu_1 - \mu_2 \le 4.10$
DEPENDENT SAMPLES
§ We are often concerned with data sets that consist of pairs of values that
have some relationship to each other.
§ The paired t-test, a special case of the two-sample t-test, applies when the
observations on the two populations of interest are collected in pairs.
[Flowchart: choosing the distribution for a two-sample confidence interval]
Independent samples?
  NO: Paired t-test
  YES: Do you know the population variance?
    YES: Z distribution. Use $\sigma$.
    NO: Assume equal variances?
      YES: t distribution. Calculate common variance, $S_{pool}^2$.
      NO: t distribution. Calculate df.
DEPENDENT SAMPLES
Find a 99% CI for the difference of the population means.
Observation   Drug A   Drug B
1             29       26
2             32       27
3             31       28
4             32       27
5             30       -    (missing value)
6             32       30
7             29       26
8             31       33
9             30       36
DEPENDENT SAMPLES
Observation   Drug A   Drug B   Difference
1             29       26       3
2             32       27       5
3             31       28       3
4             32       27       5
5             32       30       2
6             29       26       3
7             31       33       -2
8             30       36       -6
$\bar{d} = \frac{13}{8} = 1.625$
$S_d = \sqrt{\frac{\sum_{i=1}^{8} (x_i - \bar{d})^2}{n - 1}} = 3.78$
DEPENDENT SAMPLES
$\mu_d = \bar{d} \pm t_{\alpha/2} \frac{S_d}{\sqrt{n}}$
$\mu_d = 1.625 \pm 3.499\,\frac{3.78}{\sqrt{8}}$
$-3.05 \le \mu_d \le 6.30$
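The paired calculation can be reproduced from the eight complete pairs (the observation with the missing Drug B value is dropped, as in the slides). The critical value 3.499 is copied from the slide (t with 7 degrees of freedom, 0.005 in each tail); the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Complete pairs only; the pair with the missing Drug B value is excluded.
drug_a = [29, 32, 31, 32, 32, 29, 31, 30]
drug_b = [26, 27, 28, 27, 30, 26, 33, 36]
t_crit = 3.499  # t critical value for df = 7 and 99% confidence, from the slide

# Work on the within-pair differences, then treat them as one sample.
diffs = [a - b for a, b in zip(drug_a, drug_b)]
d_bar = mean(diffs)
s_d = stdev(diffs)

half_width = t_crit * s_d / sqrt(len(diffs))
lower, upper = d_bar - half_width, d_bar + half_width
print(f"d_bar = {d_bar}, S_d = {s_d:.2f}, interval = {lower:.2f} to {upper:.2f}")
```

Since the interval includes zero, these data do not show a difference between the two drugs at the 99% level.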