Lecture 4

Uploaded by

Sanna Zommarin
TAMS65 - Lecture 4

Confidence interval - Two independent samples

Zhenxia Liu
Matematisk statistik
Matematiska institutionen
Content
▶ Review - Type I: One sample
▶ Type II: Two independent samples
  ▶ $I_{\mu_1-\mu_2}$ if σ1 and σ2 are known
    ▶ $I_{c_1\mu_1+c_2\mu_2}$
  ▶ $I_{\mu_1-\mu_2}$ if σ1 = σ2 = σ is unknown
    ▶ $I_{c_1\mu_1+c_2\mu_2}$
  ▶ $I_{\mu_1-\mu_2}$ if σ1 ≠ σ2 and both are unknown
  ▶ $I_{\sigma^2}$ or $I_{\sigma}$ when σ1 = σ2 = σ is unknown
  ▶ New notation $F_\alpha(r_1, r_2)$
  ▶ $I_{\sigma_1^2/\sigma_2^2}$ or $I_{\sigma_1/\sigma_2}$ if σ1 ≠ σ2 and both are unknown
▶ CI for paired data - Example 2
▶ Appendix

TAMS65 - Lecture 4 1/28


Review - Type I: One sample
Type I: One random sample {X1 , . . . , Xn } from N(µ, σ)

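(1) (1 − α) CI for µ:

(1.1) If σ is known,
$$\frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \;\Rightarrow\; I_\mu = \bar x \mp \lambda_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

(1.2) If σ is unknown,
$$\frac{\bar X - \mu}{S/\sqrt{n}} \sim t(n - 1) \;\Rightarrow\; I_\mu = \bar x \mp t_{\alpha/2}(n - 1) \cdot \frac{s}{\sqrt{n}}$$

(2) (1 − α) CI for σ²:
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1) \;\Rightarrow\; I_{\sigma^2} = \left(\frac{(n - 1)s^2}{\chi^2_{\alpha/2}(n - 1)},\ \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}(n - 1)}\right)$$

Note:
$$\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s = \sqrt{s^2}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar x)^2$$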



Review - Type I: One sample
Steps to find a two-sided confidence interval.

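Step I Find the sampling distribution of the point estimator of the unknown parameter θ.
▶ E.g. If σ is known, the sampling distribution of X̄ is
$$\frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
Step II Set up the probability statement with confidence level (1 − α):
$$P(\text{index 1} < \text{r.v. of the sampling distribution} < \text{index 2}) = 1 - \alpha$$
▶ E.g. $P\left(-\lambda_{\alpha/2} < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < \lambda_{\alpha/2}\right) = 1 - \alpha$
Step III Solve the inequalities for θ, plug in the observations and calculate the (1 − α) CI:
$$I_\theta = \left(f_1(x_1, \dots, x_n),\ f_2(x_1, \dots, x_n)\right).$$
▶ E.g. $I_\mu = \left(\bar x - \lambda_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar x + \lambda_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right)$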
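
The three steps can be sketched in Python for the case where σ is known. This is a minimal illustration only: the function name `z_interval`, the observations and the value σ = 0.2 are hypothetical and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(xs, sigma, alpha):
    """Two-sided (1 - alpha) CI for mu when sigma is known."""
    n = len(xs)
    x_bar = mean(xs)
    # lambda_{alpha/2}: point with upper-tail probability alpha/2 under N(0, 1)
    lam = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = lam * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical observations and a hypothetical known sigma
sample = [4.9, 5.3, 5.1, 4.7, 5.0]
lower, upper = z_interval(sample, sigma=0.2, alpha=0.05)
print(f"I_mu = ({lower:.3f}, {upper:.3f})")
```

The interval is symmetric around x̄, and its half-width shrinks like 1/√n.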



Type II: Two independent samples
Type II: Two independent random samples:
Two random samples X1 , X2 , . . . , Xn1 and Y1 , Y2 , . . . , Yn2 are from
independent populations N(µ1 , σ1 ) and N(µ2 , σ2 ), respectively.
Observations: x1 , x2 , . . . , xn1 and y1 , y2 , . . . , yn2 .

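▶ X1, X2, . . . , Xn1 are independent and each Xi ∼ N(µ1, σ1)
  ▶ $\bar X \sim N(\mu_1, \sigma_1/\sqrt{n_1})$
▶ Y1, Y2, . . . , Yn2 are independent and each Yi ∼ N(µ2, σ2)
  ▶ $\bar Y \sim N(\mu_2, \sigma_2/\sqrt{n_2})$
▶ X̄ and Ȳ are independent. Then
$$\bar X - \bar Y \sim N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$$

Usually we want to construct confidence intervals for

$$\mu_1 - \mu_2, \qquad c_1\mu_1 + c_2\mu_2, \qquad \sigma^2 \text{ or } \sigma \text{ if } \sigma_1 = \sigma_2 = \sigma, \qquad \sigma_1^2/\sigma_2^2.$$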



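Type II: $I_{\mu_1-\mu_2}$ with σ1 and σ2 known
Type II (1.1): If σ1 and σ2 are known (kända), find a confidence interval for µ1 − µ2.
Point estimate: $\hat\mu_1 - \hat\mu_2 = \bar x - \bar y$. Point estimator: $\bar X - \bar Y$.

▶ The sampling distribution of X̄ − Ȳ is
$$\frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
▶ The (1 − α) CI for µ1 − µ2 is
$$I_{\mu_1-\mu_2} = \bar x - \bar y \mp \lambda_{\alpha/2} \cdot \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
▶ $\bar x = \frac{1}{n_1}\sum_{i=1}^{n_1} x_i$, $\bar y = \frac{1}{n_2}\sum_{i=1}^{n_2} y_i$, $\lambda_{\alpha/2} = ?$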
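
As a sketch of how this interval could be computed, the following Python snippet obtains λ_{α/2} from the standard normal quantile function and builds the interval. The function name `two_sample_z_interval`, the two samples and the known σ1, σ2 are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_sample_z_interval(xs, ys, sigma1, sigma2, alpha):
    """(1 - alpha) CI for mu1 - mu2 when sigma1 and sigma2 are known."""
    diff = mean(xs) - mean(ys)
    # lambda_{alpha/2} satisfies P(Z >= lambda_{alpha/2}) = alpha/2 for Z ~ N(0, 1)
    lam = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(sigma1 ** 2 / len(xs) + sigma2 ** 2 / len(ys))
    return diff - lam * se, diff + lam * se

lam_95 = NormalDist().inv_cdf(0.975)  # lambda_{0.025}, roughly 1.96

# Hypothetical samples with hypothetical known standard deviations
xs = [10.2, 9.8, 10.5, 10.1]
ys = [9.6, 9.9, 9.4, 9.7, 9.9]
lower, upper = two_sample_z_interval(xs, ys, sigma1=0.3, sigma2=0.2, alpha=0.05)
print(f"I = ({lower:.3f}, {upper:.3f})")
```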



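Type II: $I_{c_1\mu_1+c_2\mu_2}$ with σ1 and σ2 known
Remark: For any constants c1, c2, find a confidence interval for c1µ1 + c2µ2.
Point estimate: $c_1\bar x + c_2\bar y$. Point estimator: $c_1\bar X + c_2\bar Y$.
▶ $c_1\bar X + c_2\bar Y \sim N\left(c_1\mu_1 + c_2\mu_2,\ \sqrt{\frac{c_1^2\sigma_1^2}{n_1} + \frac{c_2^2\sigma_2^2}{n_2}}\right)$
▶ The sampling distribution of $c_1\bar X + c_2\bar Y$ is
$$\frac{c_1\bar X + c_2\bar Y - (c_1\mu_1 + c_2\mu_2)}{\sqrt{\frac{c_1^2\sigma_1^2}{n_1} + \frac{c_2^2\sigma_2^2}{n_2}}} \sim N(0, 1)$$
▶ The (1 − α) CI for c1µ1 + c2µ2 is
$$I_{c_1\mu_1+c_2\mu_2} = c_1\bar x + c_2\bar y \mp \lambda_{\alpha/2} \cdot \sqrt{\frac{c_1^2\sigma_1^2}{n_1} + \frac{c_2^2\sigma_2^2}{n_2}}$$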



Repetition - Sample variance
Type I: One sample - confidence interval for σ²:
▶ The sampling distribution of the sample variance S² is
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1), \qquad S^2 = \text{sample variance}$$
Maximum Likelihood Method - on two samples from independent populations:
x1, . . . , xn1, where X1, . . . , Xn1 are independent and N(µ1, σ)
y1, . . . , yn2, where Y1, . . . , Yn2 are independent and N(µ2, σ)

▶ The point estimate of σ² is the combined/pooled (sammanvägda) sample variance s²,
$$\hat\sigma^2 = s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)},$$
where $s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1} (x_i - \bar x)^2$ and $s_2^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2} (y_i - \bar y)^2$.



Combined Sample Variance

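Remark:
• The degrees of freedom of $S_1^2$ is n1 − 1, since $\frac{(n_1 - 1)S_1^2}{\sigma^2} \sim \chi^2(n_1 - 1)$.

• The degrees of freedom of $S_2^2$ is n2 − 1, since $\frac{(n_2 - 1)S_2^2}{\sigma^2} \sim \chi^2(n_2 - 1)$.

We can prove that
$$\frac{(n_1 + n_2 - 2)S^2}{\sigma^2} = \frac{(n_1 - 1)S_1^2}{\sigma^2} + \frac{(n_2 - 1)S_2^2}{\sigma^2}.$$

Note that $S_1^2$ and $S_2^2$ are independent, and a sum of independent χ² variables is again χ² with the degrees of freedom added, so we have
$$\frac{(n_1 + n_2 - 2)S^2}{\sigma^2} \sim \chi^2(n_1 + n_2 - 2).$$
Therefore, the degrees of freedom of S² is n1 + n2 − 2.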



Type II: $I_{\mu_1-\mu_2}$ with σ1 = σ2 = σ unknown
Type II (1.2): If σ1 = σ2 = σ and σ is unknown (okänd), find a confidence interval for µ1 − µ2.
Note:
$$\frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim N(0, 1)$$

The above cannot be used as the sampling distribution, since σ is unknown.

▶ The sampling distribution of X̄ − Ȳ is
$$\frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$$
▶ The (1 − α) CI for µ1 − µ2 is
$$I_{\mu_1-\mu_2} = \bar x - \bar y \mp t_{\alpha/2}(n_1 + n_2 - 2) \cdot s \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
▶ Note: s² is the combined/pooled sample variance.



Type II: $I_{c_1\mu_1+c_2\mu_2}$ with σ1 = σ2 = σ unknown
Remark: For any constants c1, c2, find a confidence interval for c1µ1 + c2µ2 if σ1 = σ2 = σ is unknown.
▶ $c_1\bar X + c_2\bar Y \sim N\left(c_1\mu_1 + c_2\mu_2,\ \sigma\sqrt{\frac{c_1^2}{n_1} + \frac{c_2^2}{n_2}}\right)$
▶ The sampling distribution of $c_1\bar X + c_2\bar Y$ is
$$\frac{c_1\bar X + c_2\bar Y - (c_1\mu_1 + c_2\mu_2)}{S\sqrt{\frac{c_1^2}{n_1} + \frac{c_2^2}{n_2}}} \sim t(n_1 + n_2 - 2)$$
▶ The (1 − α) CI for c1µ1 + c2µ2 is
$$I_{c_1\mu_1+c_2\mu_2} = c_1\bar x + c_2\bar y \mp t_{\alpha/2}(n_1 + n_2 - 2) \cdot s \cdot \sqrt{\frac{c_1^2}{n_1} + \frac{c_2^2}{n_2}}$$
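
The pooled-variance interval for c1µ1 + c2µ2 can be sketched as below; choosing c1 = 1, c2 = −1 gives the interval for µ1 − µ2. The function name `pooled_t_interval` and the data are hypothetical. The t quantile is passed in as an argument because it is assumed to be read from a t table; the value 2.365 is the tabulated t_{0.025}(7).

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_interval(xs, ys, t_quantile, c1=1.0, c2=-1.0):
    """(1 - alpha) CI for c1*mu1 + c2*mu2, assuming sigma1 = sigma2 (unknown).

    t_quantile must be t_{alpha/2}(n1 + n2 - 2), e.g. taken from a table.
    """
    n1, n2 = len(xs), len(ys)
    # statistics.variance is the sample variance (divisor n - 1)
    s2 = ((n1 - 1) * variance(xs) + (n2 - 1) * variance(ys)) / (n1 + n2 - 2)
    estimate = c1 * mean(xs) + c2 * mean(ys)
    half_width = t_quantile * sqrt(s2) * sqrt(c1 ** 2 / n1 + c2 ** 2 / n2)
    return estimate - half_width, estimate + half_width

# Hypothetical data: n1 = 5, n2 = 4, so the degrees of freedom are 7
xs = [12.1, 11.8, 12.4, 12.0, 11.7]
ys = [11.2, 11.6, 11.0, 11.4]
lower, upper = pooled_t_interval(xs, ys, t_quantile=2.365)  # 95% CI for mu1 - mu2
print(f"I = ({lower:.3f}, {upper:.3f})")
```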



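Type II: $I_{\mu_1-\mu_2}$ with σ1 ≠ σ2 and both are unknown
Type II (1.3): If σ1 ≠ σ2 and both are unknown (okända), find a confidence interval for µ1 − µ2.
▶ The sampling distribution of X̄ − Ȳ is approximately
$$\frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \approx t(f), \qquad f = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}.$$
▶ The (1 − α) CI for µ1 − µ2 is
$$I_{\mu_1-\mu_2} = \bar x - \bar y \mp t_{\alpha/2}(f) \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$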
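
A small sketch of the degrees-of-freedom formula and the resulting interval. The function names and the summary statistics are hypothetical. Since f is usually not an integer, this sketch assumes that f is rounded down to 13 before the table lookup (t_{0.025}(13) = 2.160); that rounding convention is an assumption, not something stated in the lecture.

```python
from math import sqrt

def welch_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom f for unequal, unknown variances."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def welch_interval(x_bar, y_bar, s1_sq, n1, s2_sq, n2, t_quantile):
    """(1 - alpha) CI for mu1 - mu2; t_quantile is t_{alpha/2}(f) from a table."""
    se = sqrt(s1_sq / n1 + s2_sq / n2)
    diff = x_bar - y_bar
    return diff - t_quantile * se, diff + t_quantile * se

# Hypothetical summary statistics
f = welch_df(s1_sq=4.0, n1=10, s2_sq=1.0, n2=10)
print(f"f = {f:.2f}")

# Assumed convention: round f down to 13, then use the table value t_{0.025}(13)
lower, upper = welch_interval(20.0, 18.5, 4.0, 10, 1.0, 10, t_quantile=2.160)
print(f"I = ({lower:.3f}, {upper:.3f})")
```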



Type II: $I_{\sigma^2}$ or $I_{\sigma}$ with σ1 = σ2 = σ unknown
Type II (2): If σ1 = σ2 = σ is unknown (okänd), find a confidence interval for σ² or σ.
Point estimate s² ⇒ point estimator S², where s² is the combined/pooled sample variance.
▶ The sampling distribution of S² is
$$\frac{(n_1 + n_2 - 2)S^2}{\sigma^2} \sim \chi^2(n_1 + n_2 - 2)$$
▶ The (1 − α) CI for σ² is
$$I_{\sigma^2} = \left(\frac{(n_1 + n_2 - 2)s^2}{\chi^2_{\alpha/2}(n_1 + n_2 - 2)},\ \frac{(n_1 + n_2 - 2)s^2}{\chi^2_{1-\alpha/2}(n_1 + n_2 - 2)}\right)$$

▶ (1 − α) CI for σ, $I_\sigma = ?$
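
One way to approach the question about I_σ: the square root is increasing, so taking square roots of both endpoints of I_{σ²} gives an interval for σ with the same confidence level. A minimal sketch, with a hypothetical function name and a hypothetical pooled variance s² = 2.5 on 7 degrees of freedom; the χ² quantiles 16.01 and 1.69 are the tabulated values χ²_{0.025}(7) and χ²_{0.975}(7).

```python
from math import sqrt

def pooled_variance_interval(s_sq, df, chi2_upper, chi2_lower):
    """(1 - alpha) CI for sigma^2.

    chi2_upper = chi^2_{alpha/2}(df), chi2_lower = chi^2_{1-alpha/2}(df),
    both assumed to be read from a chi-square table.
    """
    return df * s_sq / chi2_upper, df * s_sq / chi2_lower

# Hypothetical pooled variance with df = n1 + n2 - 2 = 7, 95% confidence
var_lo, var_hi = pooled_variance_interval(2.5, 7, chi2_upper=16.01, chi2_lower=1.69)

# Square roots of the endpoints give the interval for sigma
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)
print(f"I_sigma^2 = ({var_lo:.3f}, {var_hi:.3f})")
print(f"I_sigma   = ({sd_lo:.3f}, {sd_hi:.3f})")
```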



Type II: Two independent samples
Remark 1: The methods for two samples can be generalized to multiple samples.

Remark 2: If you have two (or more) samples from independent normal distributions with the same σ, then you use the combined/pooled s² from all samples to estimate σ², even if you, e.g., just want to construct $I_{\mu_1}$.

Note: If there are k independent random samples from independent normal distributions with the same σ, then the combined/pooled sample variance s² is
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \ldots + (n_k - 1)s_k^2}{n_1 + n_2 + \ldots + n_k - k}.$$

The degrees of freedom for s² is n1 + n2 + . . . + nk − k.
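
The k-sample formula translates directly into code. In this sketch the function name `pooled_variance` and the three small samples are hypothetical.

```python
from statistics import variance

def pooled_variance(*samples):
    """Return (s^2, degrees of freedom) for k samples sharing the same sigma."""
    df = sum(len(s) - 1 for s in samples)
    # Each sample variance is weighted by its own degrees of freedom n_i - 1
    weighted_sum = sum((len(s) - 1) * variance(s) for s in samples)
    return weighted_sum / df, df

# Three hypothetical samples (k = 3, n1 + n2 + n3 - k = 7)
s2, df = pooled_variance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0], [5.0, 5.0, 6.0])
print(f"s^2 = {s2:.4f} with {df} degrees of freedom")
```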



New Notation Fα (r1 , r2 )

$F_\alpha(r_1, r_2)$ is a point such that $P(X \ge F_\alpha(r_1, r_2)) = \alpha$, where $X \sim F(r_1, r_2)$.

▶ F(r1, r2): F-distribution with degrees of freedom r1 and r2.
▶ The graph of the F-distribution.
▶ Note: Table book for F(r1, r2): pages 22-27.



New Notation Fα (r1 , r2 )

Theorem If X1 and X2 are independent, with $X_1 \sim \chi^2(r_1)$ and $X_2 \sim \chi^2(r_2)$, then
$$V = \frac{X_1/r_1}{X_2/r_2} \sim F(r_1, r_2)$$

Remark:
▶ $\frac{1}{V} \sim F(r_2, r_1)$.
▶ $F_\alpha(r_1, r_2) = \frac{1}{F_{1-\alpha}(r_2, r_1)}$
The proof is given in the Appendix.



New Notation Fα (r1 , r2 )
E.g. If α = 0.01, r1 = 4, r2 = 3, find $F_\alpha(r_1, r_2)$ and $F_{1-\alpha}(r_1, r_2)$.

$$F_{0.01}(4, 3) = 28.71, \qquad F_{0.99}(4, 3) = \frac{1}{F_{0.01}(3, 4)} = \frac{1}{16.69} \approx 0.0599.$$
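
The two table values can be sanity-checked by simulation. The sketch below is not part of the lecture: it draws V = (X1/r1)/(X2/r2) using the fact that χ²(r) is a Gamma distribution with shape r/2 and scale 2, and estimates the two tail probabilities, which should both come out close to α = 0.01. The helper name `f_variate`, the seed and the number of draws are arbitrary choices.

```python
import random

def f_variate(rng, r1, r2):
    """One draw from F(r1, r2) as a ratio of scaled chi-square variables."""
    # chi^2(r) is Gamma(shape = r/2, scale = 2)
    x1 = rng.gammavariate(r1 / 2, 2)
    x2 = rng.gammavariate(r2 / 2, 2)
    return (x1 / r1) / (x2 / r2)

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [f_variate(rng, 4, 3) for _ in range(200_000)]

# P(V >= F_0.01(4, 3)) and P(V <= F_0.99(4, 3)) should both be near 0.01
upper_tail = sum(v >= 28.71 for v in draws) / len(draws)
lower_tail = sum(v <= 1 / 16.69 for v in draws) / len(draws)
print(upper_tail, lower_tail)
```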



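Type II: $I_{\sigma_1^2/\sigma_2^2}$ or $I_{\sigma_1/\sigma_2}$
Type II (3): If σ1 and σ2 are unknown (okända), find a confidence interval for $\sigma_1^2/\sigma_2^2$ or $\sigma_1/\sigma_2$.
$$\frac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2(n_1 - 1), \qquad \frac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2(n_2 - 1).$$
▶ The sampling distribution of $S_1^2/S_2^2$ is
$$\frac{\frac{(n_1 - 1)S_1^2}{\sigma_1^2}\big/(n_1 - 1)}{\frac{(n_2 - 1)S_2^2}{\sigma_2^2}\big/(n_2 - 1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{S_1^2/S_2^2}{\sigma_1^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$$
▶ The (1 − α) CI for $\sigma_1^2/\sigma_2^2$ is
$$I_{\sigma_1^2/\sigma_2^2} = \left(\frac{s_1^2/s_2^2}{F_{\alpha/2}(n_1 - 1, n_2 - 1)},\ \frac{s_1^2/s_2^2}{F_{1-\alpha/2}(n_1 - 1, n_2 - 1)}\right)$$

▶ (1 − α) CI for σ1/σ2, $I_{\sigma_1/\sigma_2} = ?$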



Example 1 - More effective routines
Example 1: A company wants to create more effective routines for
its transport, and has therefore tried two different ways to organize
it. For each method, test transports have been carried out and
the total transport time (unit: hours), including loading and
unloading, has been measured:

Method   Measured times            x̄      s
A:       8.2  7.1  7.8  8.9  8.8   8.16   0.7436
B:       7.1  7.4  6.9  6.8        7.05   0.2646

Assume these are two independent samples from independent N(µi, σi), i = 1, 2.

(a) Is it possible that σ1 = σ2? Explain why by constructing an appropriate confidence interval with confidence level 98%.



Example 1- continued
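The sampling distribution of $S_1^2/S_2^2$ is:
$$\frac{S_1^2/S_2^2}{\sigma_1^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$$
The (1 − α) CI for $\sigma_1^2/\sigma_2^2$ is:
$$\left(\frac{s_1^2/s_2^2}{F_{\alpha/2}(n_1 - 1, n_2 - 1)},\ \frac{s_1^2/s_2^2}{F_{1-\alpha/2}(n_1 - 1, n_2 - 1)}\right)$$

So the 98% CI for σ1/σ2 is
$$I_{\sigma_1/\sigma_2} = \left(\sqrt{\frac{s_1^2/s_2^2}{F_{\alpha/2}(n_1 - 1, n_2 - 1)}},\ \sqrt{\frac{s_1^2/s_2^2}{F_{1-\alpha/2}(n_1 - 1, n_2 - 1)}}\right) = \left(\sqrt{\frac{(0.7436)^2/(0.2646)^2}{F_{0.01}(4, 3)}},\ \sqrt{\frac{(0.7436)^2/(0.2646)^2}{F_{0.99}(4, 3)}}\right) \approx (0.52, 11.48).$$

1 − α = 98% gives α = 0.02. Then $F_{\alpha/2}(n_1 - 1, n_2 - 1) = F_{0.01}(4, 3) = 28.71$ and
$F_{1-\alpha/2}(n_1 - 1, n_2 - 1) = F_{0.99}(4, 3) = \frac{1}{F_{0.01}(3, 4)} = \frac{1}{16.69} \approx 0.0599$.
Note: Since $1 \in I_{\sigma_1/\sigma_2}$, it is possible that σ1 = σ2 at the 98% confidence level.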
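
The interval in (a) can be reproduced from the raw measurements of Example 1. This is only a sketch: the variable names are hypothetical, and the two F quantiles are the table values quoted in the example.

```python
from math import sqrt
from statistics import variance

a_times = [8.2, 7.1, 7.8, 8.9, 8.8]  # method A, n1 = 5
b_times = [7.1, 7.4, 6.9, 6.8]       # method B, n2 = 4

# Ratio of sample variances s1^2 / s2^2
ratio = variance(a_times) / variance(b_times)

f_upper = 28.71      # F_{0.01}(4, 3), table value
f_lower = 1 / 16.69  # F_{0.99}(4, 3) = 1 / F_{0.01}(3, 4)

# 98% CI for sigma1 / sigma2: square roots of the endpoints for the variance ratio
lo = sqrt(ratio / f_upper)
hi = sqrt(ratio / f_lower)
print(f"I = ({lo:.2f}, {hi:.2f})")
```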



Example 1- continued
(b) Based on (a), give a proper assumption on the populations.
Then find an appropriate confidence interval for µ1 − µ2 with
confidence level 98%.
Based on (a), we assume that σ1 = σ2 = σ, which is unknown.
The sampling distribution of X̄ − Ȳ is
$$\frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$$

The (1 − α) confidence interval for µ1 − µ2 is
$$I_{\mu_1-\mu_2} = (\bar x - \bar y) \mp t_{\alpha/2}(n_1 + n_2 - 2) \cdot s \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
1 − α = 98% gives α = 0.02, thus
$t_{\alpha/2}(n_1 + n_2 - 2) = t_{0.01}(5 + 4 - 2) = t_{0.01}(7) \approx 3.00$.



Example 1- continued

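The combined sample variance s² is
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{4 \cdot 0.7436^2 + 3 \cdot 0.2646^2}{7} \approx 0.346,$$
where n1 + n2 − 2 = 7 is the degrees of freedom of s², so $s = \sqrt{0.346} \approx 0.588$.

The 98% confidence interval for µ1 − µ2 is
$$I_{\mu_1-\mu_2} = (\bar x - \bar y) \mp t_{\alpha/2}(n_1 + n_2 - 2) \cdot s \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (8.16 - 7.05) \mp 3.00 \cdot 0.588 \cdot \sqrt{\frac{1}{5} + \frac{1}{4}} \approx (-0.07, 2.29)$$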



Example 1- continued
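(c) Find a confidence interval for µ1 with confidence level 95%.
The sampling distribution of X̄ is
$$\frac{\bar X - \mu_1}{S/\sqrt{n_1}} \sim t(n_1 + n_2 - 2) = t(7),$$
where S is the combined sample standard deviation.
The (1 − α) confidence interval for µ1 is
$$I_{\mu_1} = \bar x \mp t_{\alpha/2}(n_1 + n_2 - 2) \cdot \frac{s}{\sqrt{n_1}} = 8.16 \mp t_{0.025}(7) \cdot \frac{\sqrt{s^2}}{\sqrt{5}} = 8.16 \mp 2.36 \cdot \frac{\sqrt{0.346}}{\sqrt{5}} \approx (7.54, 8.78),$$
where 1 − α = 95% gives α = 0.05, and $s^2 = (4 \cdot 0.7436^2 + 3 \cdot 0.2646^2)/7 \approx 0.346$ is the combined sample variance.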
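
Parts (b) and (c) can be recomputed from the raw measurements as a check. In this sketch the pooled variance uses the weights n1 − 1 = 4 and n2 − 1 = 3, and the t quantiles are the table values quoted in the example; the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

a_times = [8.2, 7.1, 7.8, 8.9, 8.8]  # method A, n1 = 5
b_times = [7.1, 7.4, 6.9, 6.8]       # method B, n2 = 4
n1, n2 = len(a_times), len(b_times)

# Pooled sample variance with n1 + n2 - 2 = 7 degrees of freedom
s2 = ((n1 - 1) * variance(a_times) + (n2 - 1) * variance(b_times)) / (n1 + n2 - 2)
s = sqrt(s2)

t_001_7 = 3.00   # t_{0.01}(7), table value used in (b)
t_0025_7 = 2.36  # t_{0.025}(7), table value used in (c)

# (b) 98% CI for mu1 - mu2
diff = mean(a_times) - mean(b_times)
half_b = t_001_7 * s * sqrt(1 / n1 + 1 / n2)
ci_b = (diff - half_b, diff + half_b)

# (c) 95% CI for mu1, still using the pooled s with 7 degrees of freedom
half_c = t_0025_7 * s / sqrt(n1)
ci_c = (mean(a_times) - half_c, mean(a_times) + half_c)

print(f"s^2 = {s2:.3f}")
print(f"(b) ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
print(f"(c) ({ci_c[0]:.2f}, {ci_c[1]:.2f})")
```

Since 0 lies inside the interval from (b), the data do not show a difference between µ1 and µ2 at the 98% level.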


Example 2 - CI for paired data
To study the effect of cigarette smoking on platelet aggregation,
Levine (1973) drew blood samples from 11 individuals before and
after they smoked a cigarette and measured the extent to which
the blood platelets aggregated. Platelets are involved in the
formation of blood clots, and it is known that smokers suffer more
often from disorders involving blood clots than nonsmokers do.

The data is shown in the following table, which gives the maximum
percentage of all the platelets that are aggregated after being
exposed to a stimulus.

Before xi 25 25 27 44 30 67 53 53 52 60 28
After yi 27 29 37 56 46 82 57 80 61 59 43

Question: Does smoking increase the formation of blood clots with 95%
confidence?



Example 2 - CI for paired data
Question: Does smoking increase the formation of blood clots with 95%
confidence?

Let xi be the maximum percentage of platelets that aggregated after exposure to a
stimulus, measured before the i-th person smoked, i = 1, 2, . . . , 11.
Let yi be the same quantity measured after the i-th person smoked, i = 1, 2, . . . , 11.
Model:
{x1, x2, . . . , x11} is a sample from population X ∼ N(µ1, σ1).
{y1, y2, . . . , y11} is a sample from population Y ∼ N(µ2, σ2).
Note: X and Y are not independent. Why?

The interesting thing is the change for each individual, so we are interested in the differences.

Valid assumption: We assume that D = Y − X ∼ N(µ2 − µ1, σ)



Example 2- continued
Model: We assume that D = Y − X ∼ N(µ2 − µ1, σ).
Observations: di = yi − xi, i = 1, . . . , 11, which are
{2, 4, 10, 12, 16, 15, 4, 27, 9, −1, 15}.
We want to investigate whether µ2 − µ1 > 0, so we need a one-sided lower-bound CI for µ2 − µ1.

The sampling distribution of D̄ is:
$$\frac{\bar D - (\mu_2 - \mu_1)}{S/\sqrt{n}} \sim t(n - 1)$$
Then the 95% one-sided lower-bound CI is
$$I_{\mu_2-\mu_1} = (a, \infty) = \left(\bar d - t_\alpha(n - 1) \cdot \frac{s}{\sqrt{n}},\ \infty\right) = \left(10.27 - 1.81 \cdot \frac{\sqrt{63.62}}{\sqrt{11}},\ \infty\right) \approx (5.9, \infty),$$
where $\bar d = \frac{1}{11}\sum_{i=1}^{11} d_i = 10.27$; 1 − α = 95% gives α = 0.05, so
$t_\alpha(n - 1) = t_{0.05}(10) = 1.81$; and $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (d_i - \bar d)^2 = 63.62$.

With 95% confidence, we can say µ2 − µ1 > 5.9 > 0, that is, µ2 > µ1, which means
smoking increases the formation of blood clots.
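
The paired analysis can be reproduced from the table in Example 2. A minimal sketch, with hypothetical variable names; the quantile 1.81 is the table value t_{0.05}(10) quoted above.

```python
from math import sqrt
from statistics import mean, variance

before = [25, 25, 27, 44, 30, 67, 53, 53, 52, 60, 28]
after = [27, 29, 37, 56, 46, 82, 57, 80, 61, 59, 43]

# Paired data: reduce to one sample of differences d_i = y_i - x_i
d = [y - x for x, y in zip(before, after)]
n = len(d)
d_bar = mean(d)
s2 = variance(d)  # sample variance of the differences

t_005_10 = 1.81   # t_{0.05}(10), table value
lower = d_bar - t_005_10 * sqrt(s2) / sqrt(n)

print(f"d_bar = {d_bar:.2f}, s^2 = {s2:.2f}")
print(f"95% one-sided CI: ({lower:.1f}, inf)")
```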



Paired data
Typical cases with paired data are
a) measurements on the same unit before and after treatment;
b) measurements by two different methods within pairs of
equivalent (likvärdiga) units.

Thus, if you have two measurement series of equal length and want
to investigate whether there is a "systematic difference/change" (e.g.,
µ2 − µ1 in Example 2) between them, think about:

• Are the two measurements paired? If the answer is yes, form the
differences and consider a Type I CI - one sample.

• Are the two measurements completely disconnected
from each other (i.e. independent)? If the answer is yes, then
consider a Type II CI - two samples.
Practice after the lecture:

Exercises:

(I) 12.21, 12.22, 12.25, PS-4.

(II) 12.27, 12.28.



Appendix
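Proof of
$$F_\alpha(r_1, r_2) = \frac{1}{F_{1-\alpha}(r_2, r_1)}$$

By the definition of $F_\alpha(r_1, r_2)$, we get
$$P(X \ge F_\alpha(r_1, r_2)) = \alpha, \quad \text{where } X \sim F(r_1, r_2).$$
Since X > 0, this is equivalent to
$$P\left(\frac{1}{X} \le \frac{1}{F_\alpha(r_1, r_2)}\right) = \alpha,$$
which in turn is equivalent to
$$P\left(\frac{1}{X} \ge \frac{1}{F_\alpha(r_1, r_2)}\right) = 1 - \alpha.$$
By the Theorem and Remark in the section "New Notation $F_\alpha(r_1, r_2)$", we get $\frac{1}{X} \sim F(r_2, r_1)$. That is,
$$P\left(\frac{1}{X} \ge \frac{1}{F_\alpha(r_1, r_2)}\right) = 1 - \alpha, \quad \text{where } \frac{1}{X} \sim F(r_2, r_1).$$
According to the definition of the F-notation, we get
$$F_{1-\alpha}(r_2, r_1) = \frac{1}{F_\alpha(r_1, r_2)} \iff F_\alpha(r_1, r_2) = \frac{1}{F_{1-\alpha}(r_2, r_1)}.$$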



Thank you!
