Lecture7 Confidence
IT Literacy
Lecture 7:
Confidence Intervals,
Hypothesis Testing and
Test of Significance
Learning Outcomes
In this lecture, you will learn:
• The properties of a confidence interval
• How to carry out hypothesis testing
• How to carry out a test of significance using a t-test
Confidence Interval
• Confidence Intervals for the Population Mean, μ
– when Population Standard Deviation σ is Known
Point and Interval Estimates
[Figure: a point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
Point Estimates
Quantity      Population parameter   Sample statistic (point estimate)
Mean          µ                      X̄
Proportion    π                      p
Confidence Intervals
Confidence Interval Estimate
• An interval gives a range of values:
– Takes into consideration variation in sample statistics from sample to sample
– Based on observations from 1 sample
– Gives information about closeness to unknown population parameters
– Stated in terms of level of confidence
• e.g. 95% confident, 99% confident
• Can never be 100% confident
Confidence Interval Example
Simple example
• Population has µ = 368 and σ = 15.
• If you take a sample of size n = 25:
– If you already know µ = 368:
• 368 ± 1.96 × 15/√25 = (362.12, 373.88) contains 95% of the sample means
– When you don't know µ, you use X̄ to estimate µ:
• If X̄ = 362.3, the interval is 362.3 ± 1.96 × 15/√25 = (356.42, 368.18)
• Since the true mean is µ = 368 and 356.42 ≤ 368 ≤ 368.18, the interval based on this sample makes a correct statement about µ.
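The arithmetic in this example can be checked with a short Python sketch. The helper name `z_interval` is an assumption made for illustration; the numbers (σ = 15, n = 25, Z = 1.96, X̄ = 362.3) come from the slide.

```python
import math

def z_interval(center, sigma, n, z=1.96):
    """Return (lower, upper) for center ± z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return center - margin, center + margin

# Values from the slide: sigma = 15, n = 25, z = 1.96 for 95% confidence.
around_mu = z_interval(368, sigma=15, n=25)      # range holding 95% of sample means
around_xbar = z_interval(362.3, sigma=15, n=25)  # interval built from one sample mean

print([round(v, 2) for v in around_mu])          # [362.12, 373.88]
print([round(v, 2) for v in around_xbar])        # [356.42, 368.18]
print(around_xbar[0] <= 368 <= around_xbar[1])   # True: this interval covers mu
```

Note that the margin of error, 1.96 × 15/√25 = 5.88, is the same for both intervals; only the center changes.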
Confidence Level, (1 − α)
• Suppose confidence level = 95%
• Also written (1 − α) = 0.95 (so α = 0.05)
• A relative frequency interpretation:
– 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true parameter
– No probability involved in a specific interval
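The relative frequency interpretation can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it draws repeated samples from a normal population with µ = 368 and σ = 15 (the values of the earlier example), and the function name `coverage`, the number of trials, and the seed are arbitrary choices.

```python
import math
import random

def coverage(mu, sigma, n, trials, z=1.96, seed=0):
    """Fraction of simulated intervals X̄ ± z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1                    # this particular interval covers mu
    return hits / trials

rate = coverage(mu=368, sigma=15, n=25, trials=5000)
print(f"{rate:.3f}")  # close to 0.95; the exact value depends on the seed
```

Each simulated interval either covers µ or it does not; the 95% describes the long-run share of intervals that do.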
Confidence Interval for μ
(σ Known)
• Assumptions
– Population standard deviation σ is known
– Population is normally distributed
– If population is not normal, use large sample
• Confidence interval estimate:
$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
where X̄ is the point estimate
Zα/2 is the normal distribution critical value for a probability of α/2 in each tail
σ/√n is the standard error
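Finding the Critical Value, Zα/2
• Consider a 95% confidence interval:
1 − α = 0.95 so α = 0.05
α/2 = 0.025 in each tail
Zα/2 = ±1.96
[Figure: standard normal curve with an area of α/2 = 0.025 in each tail, cut off at Z = −1.96 and Z = +1.96.]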
Common Levels of Confidence
• Commonly used confidence levels are 90%, 95%, and 99%
Confidence Level   Confidence Coefficient, 1 − α   Zα/2 value
80%                0.80                            1.28
90%                0.90                            1.645
95%                0.95                            1.96
98%                0.98                            2.33
99%                0.99                            2.58
99.8%              0.998                           3.09
99.9%              0.999                           3.29
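The critical values can also be computed instead of read from a table. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # The area to the left of the upper critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  Z = {z_critical(level):.3f}")
```

Rounded to the precision used in tables, these come out as 1.28, 1.645, 1.96, 2.33 and 2.58.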
Area between the mean and Zα/2 (half of the confidence level):
• For 80%, 0.500 − 0.100 = 0.400, so Zα/2 = 1.28
• For 90%, 0.500 − 0.050 = 0.450, so Zα/2 = 1.645
• For 99%, 0.500 − 0.005 = 0.495, so Zα/2 = 2.58
Hypothesis Testing
What is a Hypothesis?
• A hypothesis is a claim (assumption) about a population parameter:
– population mean
Example: The mean monthly cell phone bill in this city is µ = $42
– population proportion
The Null Hypothesis, H0
• States the claim or assertion to be tested
Example: The average number of TV sets in U.S. homes is equal to three (H0: µ = 3)
• Is always about a population parameter, not about a sample statistic
– Correct: H0: µ = 3
– Incorrect: H0: X̄ = 3
The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
– e.g., The average number of TV sets in U.S. homes is not equal to 3 (H1: μ ≠ 3)
• Challenges the status quo
• Never contains the "=", "≤" or "≥" sign
• May or may not be proven
• Is generally the hypothesis that the researcher is trying to prove
The Hypothesis Testing Process
[Figure: a sample is drawn from the population.]
The Hypothesis Testing Process
[Figure: sampling distribution of X̄, centered at µ = 50 if H0 is true, with the observed sample mean X̄ = 20 far out in the lower tail.]
• If it is unlikely that you would get a sample mean of this value when in fact µ = 50 were the population mean, then you reject the null hypothesis that µ = 50.
The Test Statistic and Critical Values
[Figure: sampling distribution of the test statistic, with the region of non-rejection in the middle and a region of rejection in each tail; the critical values separate the regions.]
• σ known: Z test
• σ unknown: t test
• For the Z test, the test statistic is:
$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
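Two-Tail Tests
H0: µ = 3
H1: µ ≠ 3
• There are two cutoff values (critical values), defining the regions of rejection
[Figure: sampling distribution of X̄ centered at 3 with an area of α/2 in each tail. On the Z scale: reject H0 below the lower critical value −Zα/2, do not reject H0 between −Zα/2 and +Zα/2, reject H0 above the upper critical value +Zα/2.]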
Hypothesis Testing Example
Test the claim that the true mean number of TV sets in US homes is equal to 3.
(Assume σ = 0.8)
Hypothesis Testing Example
3. Determine the appropriate technique
– σ is assumed known so this is a Z test.
[Figure: two-tail test with α/2 = 0.05/2 = 0.025 in each tail.]
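A sketch of how this Z test could be carried out in Python. The claim (µ = 3), σ = 0.8 and α = 0.05 come from the slides; the sample size and sample mean below are hypothetical values chosen only for illustration, since the sample data are not given here. The function name `z_stat` is also an assumption.

```python
import math
from statistics import NormalDist

def z_stat(xbar, mu0, sigma, n):
    """Z test statistic: (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

mu0, sigma, alpha = 3, 0.8, 0.05   # from the slides
n, xbar = 100, 2.84                # HYPOTHETICAL sample, for illustration only

z = z_stat(xbar, mu0, sigma, n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper critical value, about 1.96
reject = abs(z) > z_crit                      # two-tail rejection rule

print(round(z, 2), round(z_crit, 2), reject)  # -2.0 1.96 True
```

With these hypothetical numbers the test statistic falls in the lower rejection region, so H0 would be rejected; a different sample could lead to the opposite decision.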
t-Test
Difference Between Two Means
• Point estimate of µ1 – µ2: X̄1 – X̄2
• σ1 and σ2 unknown, not assumed equal
Difference Between Two Means: Independent Samples
• Different data sources
– Unrelated
– Independent
• Sample selected from one population has no effect on the sample selected from the other population
• Population means, independent samples, σ1 and σ2 unknown, assumed equal:
– Use Sp to estimate unknown σ. Use a Pooled-Variance t test.
Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples
Lower-tail test:      Upper-tail test:      Two-tail test:
H0: µ1 – µ2 ≥ 0       H0: µ1 – µ2 ≤ 0       H0: µ1 – µ2 = 0
H1: µ1 – µ2 < 0       H1: µ1 – µ2 > 0       H1: µ1 – µ2 ≠ 0
α in the lower tail   α in the upper tail   α/2 in each tail
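The three rejection rules can be written as one small function. This is a sketch, not a full test: the function name `reject_h0` and its arguments are assumptions, the critical value must be looked up separately for the chosen α and degrees of freedom, and the numbers in the usage lines are hypothetical.

```python
def reject_h0(t_stat, t_crit, tail):
    """Decide whether to reject H0 from a test statistic and a positive critical value.

    tail is "lower", "upper", or "two". t_crit is the critical value for
    alpha in one tail (lower/upper) or for alpha/2 in each tail (two).
    """
    if tail == "lower":
        return t_stat < -t_crit      # rejection region is the lower tail only
    if tail == "upper":
        return t_stat > t_crit       # rejection region is the upper tail only
    if tail == "two":
        return abs(t_stat) > t_crit  # rejection regions in both tails
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# Hypothetical values, for illustration only.
print(reject_h0(2.5, 2.0, "two"))    # True
print(reject_h0(2.5, 2.0, "lower"))  # False
```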
Hypothesis tests for µ1 – µ2 with σ1 and σ2 unknown and assumed equal
• Assumptions:
– Populations are normally distributed or both sample sizes are at least 30
• The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
• Where tSTAT has d.f. = (n1 + n2 – 2)
t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:
                 NYSE    NASDAQ
Number           21      25
Sample mean      3.27    2.53
Sample std dev   1.30    1.16
The test statistic is:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021 \left( \frac{1}{21} + \frac{1}{25} \right)}} = 2.040$$
where the pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
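The calculation can be reproduced from the summary statistics alone. A minimal Python sketch; the function name `pooled_t` and its argument names are assumptions made for illustration, while the data are those of the NYSE/NASDAQ example.

```python
import math

def pooled_t(n1, mean1, s1, n2, mean2, s2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return sp2, t, df

# NYSE: n = 21, mean = 3.27, s = 1.30; NASDAQ: n = 25, mean = 2.53, s = 1.16
sp2, t, df = pooled_t(21, 3.27, 1.30, 25, 2.53, 1.16)
print(round(sp2, 4), round(t, 3), df)  # about 1.5021, 2.04, 44
```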
Pooled-Variance t Test Example: Hypothesis Test Solution
H0: μ1 − μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 − μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 − 2 = 44
Critical Values: t = ± 2.0154
Test Statistic:
$$t = \frac{3.27 - 2.53}{\sqrt{1.5021 \left( \frac{1}{21} + \frac{1}{25} \right)}} = 2.040$$
[Figure: t distribution with an area of 0.025 in each tail; reject H0 below −2.0154 or above 2.0154; the test statistic 2.040 falls in the upper rejection region.]
Decision: Reject H0 at α = 0.05
Conclusion: There is evidence of a difference in means.
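Approximating the critical value for df = 44 by linear interpolation between the t-table entries for df = 40 (2.021) and df = 60 (2.000):
t = 2.021 − ((2.021 − 2.000) × 4/20) = 2.0168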
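The interpolation can be sketched in Python as follows. It assumes the two-tail α = 0.05 table entries for df = 40 (2.021) and df = 60 (2.000); the function name `interpolate_t` is hypothetical.

```python
def interpolate_t(df, df_lo, t_lo, df_hi, t_hi):
    """Linearly interpolate a t-table critical value between two tabulated rows."""
    fraction = (df - df_lo) / (df_hi - df_lo)   # here (44 - 40) / (60 - 40) = 4/20
    return t_lo - (t_lo - t_hi) * fraction

# Assumed table entries for two-tail alpha = 0.05: df = 40 -> 2.021, df = 60 -> 2.000
t_crit = interpolate_t(44, 40, 2.021, 60, 2.000)
print(round(t_crit, 4))     # 2.0168
print(abs(2.040) > t_crit)  # True: the test statistic 2.040 is in the rejection region
```

The interpolated value (2.0168) is slightly larger than the critical value 2.0154 quoted for df = 44, so it is a little conservative; the decision to reject H0 is the same with either value.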