Lecture7 Confidence

Uploaded by

irisyan0217
Numerical &

IT Literacy
Lecture 7:
Confidence Intervals,
Hypothesis Testing and
Test of Significance
Learning Outcomes
In this lecture, you will learn:
• The properties of a confidence interval
• How to carry out hypothesis testing
• How to carry out a test of significance using the t test

Confidence Interval
• Confidence intervals for the population mean, µ
– when the population standard deviation σ is known

Point and Interval Estimates

• A point estimate is a single number.
• A confidence interval provides additional information about the variability of the estimate.

[Figure: a number line showing the lower confidence limit, the point estimate and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]

Point Estimates

We can estimate a population parameter with a sample statistic (a point estimate):

Quantity      Population parameter   Sample statistic (point estimate)
Mean          µ                      X̄
Proportion    π                      p

Confidence Intervals

• How much uncertainty is associated with a point estimate of a population parameter?
• An interval estimate provides more information about a population characteristic than a point estimate does.
• Such interval estimates are called confidence intervals.

Confidence Interval Estimate
• An interval gives a range of values:
– Takes into consideration the variation in sample statistics from sample to sample
– Is based on observations from one sample
– Gives information about closeness to the unknown population parameter
– Is stated in terms of a level of confidence
• e.g. 95% confident, 99% confident
• Can never be 100% confident

Confidence Interval Example

Simple example
• A population has µ = 368 and σ = 15.
• If you take a sample of size n = 25, the standard error is σ/√n = 15/√25 = 3:
– If you already know µ = 368:
• 368 ± 1.96 × 15/√25 = (362.12, 373.88) contains 95% of the sample means.
– When you don't know µ, you use X̄ to estimate µ:
• If X̄ = 362.3, the interval is 362.3 ± 1.96 × 15/√25 = (356.42, 368.18).
• Because the true mean is µ = 368 and 356.42 ≤ 368 ≤ 368.18, the interval based on this sample makes a correct statement about µ.

But what about the intervals from other possible samples of size 25?

Confidence Interval Example

Sample #   X̄        Lower limit   Upper limit   Contains µ?
1          362.30   356.42        368.18        Yes
2          369.50   363.62        375.38        Yes
3          360.00   354.12        365.88        No
4          362.12   356.24        368.00        Yes
5          373.88   368.00        379.76        Yes

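The five intervals in this example can be recomputed with a short script. This is a minimal sketch, assuming the values from the example (µ = 368, σ = 15, n = 25, Z = 1.96); the function name `z_interval` and the rounding to two decimals are illustrative choices, not part of the lecture.

```python
import math

MU = 368.0     # true population mean from the example
SIGMA = 15.0   # known population standard deviation
N = 25         # sample size
Z = 1.96       # critical value for 95% confidence

def z_interval(sample_mean, sigma=SIGMA, n=N, z=Z):
    """Return (lower, upper) for sample_mean ± z * sigma / sqrt(n).

    Limits are rounded to 2 decimals to match the precision of the slide.
    """
    margin = z * sigma / math.sqrt(n)
    return round(sample_mean - margin, 2), round(sample_mean + margin, 2)

sample_means = [362.30, 369.50, 360.00, 362.12, 373.88]
rows = []
for number, xbar in enumerate(sample_means, start=1):
    lower, upper = z_interval(xbar)
    contains_mu = lower <= MU <= upper
    rows.append((number, xbar, lower, upper, contains_mu))
    answer = "Yes" if contains_mu else "No"
    print(f"{number}  {xbar:7.2f}  {lower:7.2f}  {upper:7.2f}  {answer}")
```

Only sample 3 produces an interval that misses µ, because its sample mean lies more than 5.88 below 368.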
Estimation Process

[Figure: a random sample with sample mean X̄ = 50 is drawn from a population whose mean µ is unknown, leading to the statement "I am 95% confident that µ is between 40 and 60."]

Confidence Level, (1 − α)
• Suppose the confidence level is 95%
• Also written (1 − α) = 0.95 (so α = 0.05)
• A relative frequency interpretation:
– 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true parameter
– No probability is involved in a specific interval

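The relative frequency interpretation can be checked by simulation. The sketch below is an illustration rather than part of the lecture: it assumes a normal population with µ = 368 and σ = 15, samples of size 25, and a hypothetical helper `coverage` that counts how often the 95% interval contains µ. The number of trials and the seed are arbitrary choices.

```python
import math
import random

def coverage(mu=368.0, sigma=15.0, n=25, z=1.96, trials=2000, seed=0):
    """Fraction of simulated intervals xbar ± z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(f"Share of intervals containing mu: {coverage():.3f}")
```

The share should come out close to 0.95. Each individual interval either contains µ or does not; the 95% describes the procedure over many samples.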
Confidence Interval for µ (σ Known)
• Assumptions
– Population standard deviation σ is known
– Population is normally distributed
– If the population is not normal, use a large sample
• Confidence interval estimate:

\[ \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

where X̄ is the point estimate,
Zα/2 is the normal distribution critical value for a probability of α/2 in each tail, and
σ/√n is the standard error.

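Finding the Critical Value, Zα/2
• Consider a 95% confidence interval:
1 − α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail
• The critical values are Zα/2 = ±1.96

[Figure: standard normal curve with an area of α/2 = 0.025 in each tail. In Z units the marked points are −1.96, 0 and +1.96; in X̄ units they correspond to the lower confidence limit, the point estimate and the upper confidence limit.]

• For 95%, the value to look up in the normal table is 0.500 − 0.025 = 0.475, which gives Z = 1.96
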
Common Levels of Confidence
• Commonly used confidence levels are 90%, 95%, and 99%

Confidence level   Confidence coefficient, 1 − α   Zα/2 value
80%                0.80                            1.28
90%                0.90                            1.645
95%                0.95                            1.96
98%                0.98                            2.33
99%                0.99                            2.58
99.8%              0.998                           3.09
99.9%              0.999                           3.29

• For 80%, 0.500 − 0.100 = 0.400, which gives Zα/2 = 1.28
• For 90%, 0.500 − 0.050 = 0.450, which gives Zα/2 = 1.645
• For 99%, 0.500 − 0.005 = 0.495, which gives Zα/2 = 2.58

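Instead of a table lookup, the critical values can be computed from the inverse of the standard normal distribution. A minimal sketch using Python's standard library (`statistics.NormalDist`, available from Python 3.8); the function name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    # The area to the left of the upper critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: Z = {z_critical(level):.3f}")
```

The printed values carry more decimals than a printed table, so 99% appears as 2.576 rather than the rounded 2.58.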
Example

• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.

Example (continued)

• Solution:

\[ \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left( \frac{0.35}{\sqrt{11}} \right) = 2.20 \pm 0.2068 \]

\[ 1.9932 \le \mu \le 2.4068 \]

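The arithmetic of the circuit example can be reproduced in a few lines. This is a sketch under the assumptions of the example (σ known, 95% confidence, so Z = 1.96); `mean_ci_known_sigma` is a hypothetical helper name.

```python
import math

def mean_ci_known_sigma(xbar, sigma, n, z=1.96):
    """Confidence interval xbar ± z * sigma / sqrt(n) when sigma is known."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Values from the circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms.
lower, upper = mean_ci_known_sigma(xbar=2.20, sigma=0.35, n=11)
print(f"{lower:.4f} <= mu <= {upper:.4f}")
```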
Interpretation

• We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms.
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean.

Hypothesis Testing

What is a Hypothesis?
• A hypothesis is a claim (assumption) about a population parameter:
– population mean
Example: The mean monthly cell phone bill in this city is µ = $42
– population proportion
Example: The proportion of adults in this city with cell phones is π = 0.68

The Null Hypothesis, H0
• States the claim or assertion to be tested
Example: The average number of TV sets in U.S. homes is equal to three (H0: µ = 3)
• Is always about a population parameter, not about a sample statistic:
write H0: µ = 3, not H0: X̄ = 3

The Null Hypothesis, H0 (continued)

• Begin with the assumption that the null hypothesis is true
– Similar to the notion of innocent until proven guilty
• Refers to the status quo or historical value
• Always contains the "=", "≤" or "≥" sign
• May or may not be rejected

The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
– e.g., The average number of TV sets in U.S. homes is not equal to 3 (H1: µ ≠ 3)
• Challenges the status quo
• Never contains the "=", "≤" or "≥" sign
• May or may not be proven
• Is generally the hypothesis that the researcher is trying to prove

The Hypothesis Testing Process

• Claim: the population mean age is 50.
– H0: µ = 50, H1: µ ≠ 50
• Sample the population and find the sample mean.

[Figure: a sample drawn from the population.]

The Hypothesis Testing Process (continued)

[Figure: sampling distribution of X̄, centred at µ = 50 if H0 is true, with an observed sample mean of X̄ = 20 far out in the tail.]

If it is unlikely that you would get a sample mean of this value (X̄ = 20) when in fact µ = 50 were the population mean, then you reject the null hypothesis that µ = 50.

The Test Statistic and Critical Values

[Figure: sampling distribution of the test statistic. The critical values separate a central region of non-rejection from a region of rejection in each tail, where the statistic is "too far away" from the mean of the sampling distribution.]

Level of Significance and the Rejection Region
H0: µ = 3
H1: µ ≠ 3
Level of significance = α

[Figure: sampling distribution with a rejection region of area α/2 in each tail, bounded by the critical values.]

This is a two-tail test because there is a rejection region in both tails.

Z Test of Hypothesis for the Mean (σ Known)
• Convert the sample statistic (X̄) to a ZSTAT test statistic
• Hypothesis tests for µ: use the Z test when σ is known and the t test when σ is unknown

The test statistic is:

\[ Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]

Two-Tail Tests
H0: µ = 3
H1: µ ≠ 3
• There are two cutoff values (critical values), defining the regions of rejection

[Figure: sampling distribution centred at X̄ = 3 (Z = 0) with area α/2 in each tail. Reject H0 below the lower critical value −Zα/2 or above the upper critical value +Zα/2; do not reject H0 in between.]

Hypothesis Testing Example
Test the claim that the true mean number of TV sets in US homes is equal to 3.
(Assume σ = 0.8)

1. State the appropriate null and alternative hypotheses
• H0: µ = 3, H1: µ ≠ 3 (this is a two-tail test)
2. Specify the desired level of significance and the sample size
• Suppose that α = 0.05 and n = 100 are chosen for this test

Hypothesis Testing Example (continued)
3. Determine the appropriate technique
• σ is assumed known, so this is a Z test
4. Determine the critical values
• For α = 0.05 the critical Z values are ±1.96
5. Collect the data and compute the test statistic
• Suppose the sample results are n = 100, X̄ = 2.84 (σ = 0.8 is assumed known)
So the test statistic is:

\[ Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0 \]

Hypothesis Testing Example (continued)
6. Is the test statistic in the rejection region?
• Reject H0 if ZSTAT < −1.96 or ZSTAT > 1.96; otherwise do not reject H0

[Figure: standard normal curve with α/2 = 0.025 in each tail. Reject H0 below −Zα/2 = −1.96 or above +Zα/2 = +1.96; do not reject H0 in between.]

Here, ZSTAT = −2.0 < −1.96, so the test statistic is in the rejection region.

Hypothesis Testing Example (continued)
6 (continued). Reach a decision and interpret the result

[Figure: standard normal curve with α/2 = 0.05/2 = 0.025 in each tail and critical values −Zα/2 = −1.96 and +Zα/2 = +1.96; ZSTAT = −2.0 is marked in the lower rejection region.]

Since ZSTAT = −2.0 < −1.96, reject the null hypothesis and conclude there is sufficient evidence that the mean number of TVs in US homes is not equal to 3.

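The six steps of the TV-sets example reduce to one formula and one comparison. The sketch below assumes the values from the example (µ0 = 3, σ = 0.8, n = 100, X̄ = 2.84, critical value 1.96); the function names `z_stat` and `two_tail_reject` are illustrative.

```python
import math

def z_stat(xbar, mu0, sigma, n):
    """Z test statistic (xbar - mu0) / (sigma / sqrt(n)) for a known sigma."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

def two_tail_reject(z, z_crit=1.96):
    """Two-tail decision rule: reject H0 when z falls beyond either critical value."""
    return z < -z_crit or z > z_crit

z = z_stat(xbar=2.84, mu0=3.0, sigma=0.8, n=100)
print(f"ZSTAT = {z:.2f}, reject H0: {two_tail_reject(z)}")
```

With these inputs the statistic is −2.0 (up to floating-point rounding), which lies below −1.96, so the rule rejects H0.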
Test of Significance

t-Test
Difference Between Two Means

Goal: test a hypothesis or form a confidence interval for the difference between two population means, µ1 − µ2.
The point estimate for the difference is X̄1 − X̄2.

Cases for population means, independent samples:
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal

Difference Between Two Means: Independent Samples
• Different data sources
– Unrelated
– Independent
• The sample selected from one population has no effect on the sample selected from the other population
• σ1 and σ2 unknown, assumed equal: use Sp to estimate the unknown σ. Use a pooled-variance t test.
• σ1 and σ2 unknown, not assumed equal: use S1 and S2 to estimate the unknown σ1 and σ2. Use a separate-variance t test.

Hypothesis Tests for Two Population Means
Two population means, independent samples

Lower-tail test        Upper-tail test        Two-tail test
H0: µ1 ≥ µ2            H0: µ1 ≤ µ2            H0: µ1 = µ2
H1: µ1 < µ2            H1: µ1 > µ2            H1: µ1 ≠ µ2
i.e.,                  i.e.,                  i.e.,
H0: µ1 − µ2 ≥ 0        H0: µ1 − µ2 ≤ 0        H0: µ1 − µ2 = 0
H1: µ1 − µ2 < 0        H1: µ1 − µ2 > 0        H1: µ1 − µ2 ≠ 0

Hypothesis tests for µ1 − µ2
Two population means, independent samples

                 Lower-tail test      Upper-tail test      Two-tail test
H0               µ1 − µ2 ≥ 0          µ1 − µ2 ≤ 0          µ1 − µ2 = 0
H1               µ1 − µ2 < 0          µ1 − µ2 > 0          µ1 − µ2 ≠ 0
Rejection area   α (lower tail)       α (upper tail)       α/2 in each tail
Critical value   −tα                  tα                   −tα/2 and tα/2
Reject H0 if     tSTAT < −tα          tSTAT > tα           tSTAT < −tα/2 or tSTAT > tα/2

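The three rejection rules can be written as a single decision function. This is an illustrative sketch, not part of the lecture: `reject_h0` and its `tail` argument are hypothetical names, and the caller is assumed to pass the positive critical value that matches the test (tα for a one-tail test, tα/2 for a two-tail test).

```python
def reject_h0(t_stat, t_crit, tail):
    """Apply the lower-tail, upper-tail or two-tail rejection rule.

    t_crit is the positive critical value: t_alpha for a one-tail test,
    t_alpha/2 for a two-tail test.
    """
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "upper":
        return t_stat > t_crit
    if tail == "two":
        return t_stat < -t_crit or t_stat > t_crit
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

# Example call: a two-tail test with critical value 2.0154 and tSTAT = 2.040.
print(reject_h0(2.040, 2.0154, "two"))
```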
Hypothesis tests for µ1 − µ2 with σ1 and σ2 unknown and assumed equal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed, or both sample sizes are at least 30
• Population variances are unknown but assumed equal

Hypothesis tests for µ1 − µ2 with σ1 and σ2 unknown and assumed equal (continued)

• The pooled variance is:

\[ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} \]

• The test statistic is:

\[ t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]

• where tSTAT has d.f. = (n1 + n2 − 2)

t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?

t Test Example: Calculating the Test Statistic (continued)

H0: µ1 − µ2 = 0, i.e. (µ1 = µ2)
H1: µ1 − µ2 ≠ 0, i.e. (µ1 ≠ µ2)

The test statistic is:

\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021 \left( \frac{1}{21} + \frac{1}{25} \right)}} = 2.040 \]

where the pooled variance is:

\[ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)\,1.30^2 + (25 - 1)\,1.16^2}{(21 - 1) + (25 - 1)} = 1.5021 \]

Pooled-Variance t Test Example: Hypothesis Test Solution
H0: µ1 − µ2 = 0, i.e. (µ1 = µ2)
H1: µ1 − µ2 ≠ 0, i.e. (µ1 ≠ µ2)
α = 0.05
df = 21 + 25 − 2 = 44
Critical values: t = ±2.0154

Test statistic:

\[ t = \frac{3.27 - 2.53}{\sqrt{1.5021 \left( \frac{1}{21} + \frac{1}{25} \right)}} = 2.040 \]

[Figure: t distribution with rejection regions of area 0.025 in each tail, beyond −2.0154 and +2.0154; the test statistic 2.040 falls in the upper rejection region.]

Decision: Reject H0 at α = 0.05, because 2.040 > 2.0154.
Conclusion: There is evidence of a difference in means.

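The pooled-variance calculation for the NYSE and NASDAQ data can be checked with a short script. This is a sketch under the stated assumptions (independent samples, equal variances); the function names are illustrative, and the critical value 2.0154 is copied from the example rather than computed, because the Python standard library has no t distribution.

```python
import math

def pooled_variance(n1, s1, n2, s2):
    """Pooled variance Sp^2 from two sample sizes and sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))

def pooled_t_stat(x1, s1, n1, x2, s2, n2, diff0=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = diff0."""
    sp2 = pooled_variance(n1, s1, n2, s2)
    return ((x1 - x2) - diff0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

T_CRIT = 2.0154  # two-tail critical value for df = 44, alpha = 0.05 (from the example)

sp2 = pooled_variance(21, 1.30, 25, 1.16)
t = pooled_t_stat(3.27, 1.30, 21, 2.53, 1.16, 25)
df = 21 + 25 - 2
reject = t < -T_CRIT or t > T_CRIT
print(f"Sp^2 = {sp2:.4f}, t = {t:.3f}, df = {df}, reject H0: {reject}")
```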
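Interpolating in the t table for df = 44, between the entries for df = 40 (2.021) and df = 60 (2.000):
t = 2.021 − ((2.021 − 2.000) × 4/20)
= 2.0168
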
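The interpolation can be expressed as a small function. In this sketch the table rows df = 40 and df = 60 are an assumption inferred from the factor 4/20 and the two-tail 0.05 t-table values 2.021 and 2.000; `interpolate_t` is a hypothetical helper name.

```python
def interpolate_t(df, df_lo, t_lo, df_hi, t_hi):
    """Linearly interpolate a critical value between two tabulated rows."""
    fraction = (df - df_lo) / (df_hi - df_lo)   # here (44 - 40) / (60 - 40) = 4/20
    return t_lo - (t_lo - t_hi) * fraction

t_approx = interpolate_t(df=44, df_lo=40, t_lo=2.021, df_hi=60, t_hi=2.000)
print(f"Interpolated critical value: {t_approx:.4f}")
```

The interpolated 2.0168 is slightly larger than the 2.0154 used in the solution. The test statistic 2.040 exceeds both, so the decision to reject H0 is unchanged.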
