Lec01 Population

population test

Uploaded by

Thiên Cầm
LECTURE 1.

TWO POPULATION MEANS TESTS


▪ Dependent and Independent Samples
▪ Testing for Two Dependent Means
▪ Testing for Two Independent Means
▪ Testing for Two Variances

1.1. Dependent and Independent Samples
▪ Dependent samples (also called related or paired samples): two variables $X_1$, $X_2$ observed on the same individuals
  ▪ The numbers of observations must be equal
  ▪ The order of the values cannot be changed
▪ Independent samples: observations obtained from different, independent individuals; $X_1$ from one sample, $X_2$ from the other
  ▪ The numbers of observations can be different
  ▪ The order of the values can be changed

Example 1.1
▪ Dependent sample

Store   Before   After
1       72       76
2       75       79
3       70       77
4       82       80
5       70       75
6       83       89

Is the advertising policy good?

▪ Independent sample

Firm A   Firm B
76       90
79       82
77       85
80       90
75       80
89       79
         87
         88
         84

On average, are A and B really different?

1.2. Testing Two Dependent Means
▪ Paired sample $(X_{1i}, X_{2i})$, $i = 1, 2, \dots, n$
▪ The sample size is $n$ for both $X_1$ and $X_2$
▪ Testing whether the population means of $X_1$ and $X_2$ are unequal:
  $H_0: \mu_1 = \mu_2$
  $H_1: \mu_1 \neq \mu_2$
  $H_1$ could also be one-sided: $\mu_1 > \mu_2$ or $\mu_1 < \mu_2$
▪ Assumption:
  ▪ $X_1$ and $X_2$ are normally distributed
  ▪ Or the sample size is large enough

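Test Statistic and Critical Value
▪ Let $d = X_1 - X_2$; in the sample, $d_i = X_{1i} - X_{2i}$
▪ Hypotheses: $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \neq 0$, or equivalently $H_0: \mu_d = 0$, $H_1: \mu_d \neq 0$
▪ Test statistic:
  $$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$$
▪ Critical value: $t^{(n-1)}_{\alpha/2}$
▪ If $|t| > t^{(n-1)}_{\alpha/2}$ then reject $H_0$
▪ Similarly for the one-sided cases: with $H_1: \mu_d > 0$ reject $H_0$ if $t > t^{(n-1)}_{\alpha}$; with $H_1: \mu_d < 0$ reject $H_0$ if $t < -t^{(n-1)}_{\alpha}$
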
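Example 1.2
▪ Does the advertising policy increase sales? $\alpha = 5\%$
▪ (Sales are normally distributed)
▪ Hypotheses: $H_0: \mu_d = 0$, $H_1: \mu_d > 0$, with $d$ = After − Before

Store   Before   After   Difference
1       72       76       4
2       75       79       4
3       70       77       7
4       82       80      -2
5       70       75       5
6       83       89       6
Sum                      24

$$\bar{d} = \frac{1}{n}\sum d_i = \frac{24}{6} = 4$$
$$s_d^2 = \frac{1}{n-1}\sum (d_i - \bar{d})^2 = 10$$
$$t = \frac{4 - 0}{\sqrt{10}/\sqrt{6}} = 3.1$$

▪ Critical value: $t^{(5)}_{0.05} = 2.015$
▪ Since $t = 3.1 > 2.015$, reject $H_0$: the data support an increase in sales
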
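The arithmetic of Example 1.2 can be checked with a few lines of code. This is a minimal sketch using only Python's standard library; the variable names are illustrative, and the critical value 2.015 is copied from the slide rather than computed.

```python
import math
import statistics

# Sales of the six stores before and after the advertising policy (Example 1.2)
before = [72, 75, 70, 82, 70, 83]
after = [76, 79, 77, 80, 75, 89]

# Paired differences d_i = After_i - Before_i
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = statistics.mean(d)        # sample mean of the differences
s2_d = statistics.variance(d)     # sample variance (divides by n - 1)
t_stat = (d_bar - 0) / math.sqrt(s2_d / n)

t_crit = 2.015                    # t(5) at alpha = 0.05, taken from the slide
reject_h0 = t_stat > t_crit       # right-tailed test, H1: mu_d > 0

print(f"d_bar = {d_bar}, s_d^2 = {s2_d}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The differences are taken as After minus Before so that a positive mean difference corresponds to an increase in sales.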
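Estimate the Difference
▪ 95% confidence interval:
  $$\bar{d} - t^{(n-1)}_{\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t^{(n-1)}_{\alpha/2}\frac{s_d}{\sqrt{n}}$$
▪ $t^{(n-1)}_{\alpha/2} = t^{(5)}_{0.025} = 2.57$
  $$4 - 2.57\,\frac{3.16}{\sqrt{6}} < \mu_d < 4 + 2.57\,\frac{3.16}{\sqrt{6}}$$
  $$0.68 < \mu_d < 7.32$$
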
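The interval above follows directly from the summary values of Example 1.2. A short sketch, assuming the slide's critical value 2.57 (standard library only, names illustrative):

```python
import math

# Summary values of the paired differences in Example 1.2
n = 6
d_bar = 4.0
s_d = math.sqrt(10)     # sample standard deviation of the differences
t_crit = 2.57           # t(5) at alpha/2 = 0.025, as used on the slide

# Margin of error and the two interval bounds
margin = t_crit * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"{lower:.2f} < mu_d < {upper:.2f}")
```

Because the whole interval lies above zero, it agrees with the rejection of $H_0$ in Example 1.2.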
In General
▪ Hypotheses: $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 \neq D_0$ $\Leftrightarrow$ $H_0: \mu_d = D_0$, $H_1: \mu_d \neq D_0$
▪ Statistic:
  $$t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$$
▪ Critical value: $t^{(n-1)}_{\alpha/2}$
▪ Rule: $|t| >$ critical value → reject $H_0$

Example 1.3
▪ Using the following data, test the hypothesis that, on average, income is \$3,000 higher than expenditure, at the 5% significance level.
▪ Income and expenditure are normally distributed
▪ Summary statistics: $\bar{d} = 2.9$; $s_d^2 \approx 13.88$

Household   Income (1000)   Expenditure (1000)    d
1           12              8                     4
2           15              10                    5
3           18              12                    6
4           10              12                   -2
5           16              16                    0
6           16              9                     7
7           14              15                   -1
8           12              7                     5
9           15              8                     7
10          11              13                   -2
Sum                                              29

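One possible way to work Example 1.3 is sketched below. Two things are assumptions rather than part of the slide: the test is treated as two-sided ($H_0: \mu_d = 3$ against $H_1: \mu_d \neq 3$), and the critical value $t^{(9)}_{0.025} = 2.262$ is taken from a standard t table.

```python
import math
import statistics

# Household income and expenditure, in thousands (Example 1.3)
income = [12, 15, 18, 10, 16, 16, 14, 12, 15, 11]
expenditure = [8, 10, 12, 12, 16, 9, 15, 7, 8, 13]

d = [inc - exp for inc, exp in zip(income, expenditure)]
n = len(d)
D0 = 3                            # hypothesized mean difference (thousands)

d_bar = statistics.mean(d)
s2_d = statistics.variance(d)     # sample variance with n - 1 in the denominator
t_stat = (d_bar - D0) / math.sqrt(s2_d / n)

t_crit = 2.262                    # assumed: t(9) at alpha/2 = 0.025 from a standard table
reject_h0 = abs(t_stat) > t_crit  # assumed two-sided alternative

print(f"d_bar = {d_bar:.1f}, s_d^2 = {s2_d:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is very close to zero, so the data do not contradict a mean difference of 3 (thousand).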
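1.3. Testing Two Independent Means
▪ $X_1$ is normally distributed, $N(\mu_1, \sigma_1^2)$; $X_2$ is normally distributed, $N(\mu_2, \sigma_2^2)$
▪ Two independent samples of sizes $n_1$ and $n_2$
▪ Known $\sigma_1^2$, $\sigma_2^2$: Z-test (self-study)
▪ Unknown $\sigma_1^2$, $\sigma_2^2$:
  ▪ Assume $\sigma_1^2 = \sigma_2^2$: t-test with pooled variance $s_p^2$
  ▪ Assume $\sigma_1^2 \neq \sigma_2^2$: t-test with two variances $s_1^2$, $s_2^2$
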
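Population Variances $\sigma_1^2$, $\sigma_2^2$ are Known

Z-test statistic:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Test         Hypotheses                                                    Reject $H_0$ if                              P-value
Two-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 \neq D_0$     $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$    $2 \times P(Z > |z|)$
One-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 > D_0$        $z > z_{\alpha}$                             $P(Z > z)$
One-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 < D_0$        $z < -z_{\alpha}$                            $P(Z < z)$
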
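The statistic and the three P-values in the table can be sketched as a small helper. The function name `two_sample_z` and the input numbers are hypothetical (the lecture gives no data for this case); `statistics.NormalDist` from the standard library supplies the normal CDF.

```python
import math
from statistics import NormalDist

def two_sample_z(x1_bar, x2_bar, var1, var2, n1, n2, d0=0.0):
    """Return (z, two-tailed p, right-tailed p, left-tailed p) for known variances."""
    z = ((x1_bar - x2_bar) - d0) / math.sqrt(var1 / n1 + var2 / n2)
    std_normal = NormalDist()                      # standard normal N(0, 1)
    p_right = 1 - std_normal.cdf(z)                # P(Z > z)
    p_left = std_normal.cdf(z)                     # P(Z < z)
    p_two = 2 * (1 - std_normal.cdf(abs(z)))       # 2 * P(Z > |z|)
    return z, p_two, p_right, p_left

# Hypothetical inputs, for illustration only (not data from the lecture)
z, p_two, p_right, p_left = two_sample_z(52.0, 50.0, 16.0, 25.0, 40, 50)
print(f"z = {z:.3f}, two-tailed p = {p_two:.4f}")
```

Comparing a P-value with $\alpha$ gives the same decision as comparing $z$ with the corresponding critical value.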
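Unknown, Assumed $\sigma_1^2 = \sigma_2^2$
▪ Pooled variance:
  $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
▪ t-test statistic, with $df = n_1 + n_2 - 2$:
  $$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$

Test         Hypotheses                                                    Reject $H_0$ if                P-value
Two-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 \neq D_0$     $|t| > t^{(df)}_{\alpha/2}$    $2 \times P(T > |t|)$
One-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 > D_0$        $t > t^{(df)}_{\alpha}$        $P(T > t)$
One-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 < D_0$        $t < -t^{(df)}_{\alpha}$       $P(T < t)$
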
Example 1.4
▪ Sales data of two firms are given below. Assuming equal population variances, test whether the average sales of firms A and B are different, at the 5% and 1% significance levels.
▪ The variances are assumed to be equal
▪ $\bar{x}_1 = \dfrac{476}{6} = 79.33$; $\bar{x}_2 = \dfrac{765}{9} = 85$
▪ $s_1^2 = 25.87$; $s_2^2 = 16.75$

Firm A ($X_1$)   Firm B ($X_2$)
76               90
79               82
77               85
80               90
75               80
89               79
                 87
                 88
                 84

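A possible worked solution of Example 1.4 is sketched below with the standard library. The critical values $t^{(13)}_{0.025} = 2.160$ and $t^{(13)}_{0.005} = 3.012$ are not given on the slide; they are assumed here from a standard t table.

```python
import math
import statistics

# Sales of firm A and firm B (Example 1.4)
firm_a = [76, 79, 77, 80, 75, 89]
firm_b = [90, 82, 85, 90, 80, 79, 87, 88, 84]

n1, n2 = len(firm_a), len(firm_b)
x1_bar, x2_bar = statistics.mean(firm_a), statistics.mean(firm_b)
s2_1, s2_2 = statistics.variance(firm_a), statistics.variance(firm_b)

# Pooled variance and the equal-variance t statistic with D0 = 0
s2_p = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)
t_stat = (x1_bar - x2_bar) / math.sqrt(s2_p / n1 + s2_p / n2)
df = n1 + n2 - 2

# Assumed table values: t(13) at alpha/2 = 0.025 and at alpha/2 = 0.005
t_crit_5, t_crit_1 = 2.160, 3.012
reject_at_5 = abs(t_stat) > t_crit_5
reject_at_1 = abs(t_stat) > t_crit_1

print(f"s_p^2 = {s2_p:.3f}, t = {t_stat:.3f}, df = {df}")
print(f"reject at 5%: {reject_at_5}, reject at 1%: {reject_at_1}")
```

With these table values the decision depends on the significance level: the difference is significant at 5% but not at 1%.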
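Unknown, Assumed $\sigma_1^2 \neq \sigma_2^2$
▪ Welch's adjusted degrees of freedom:
  $$df = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
▪ t-test statistic:
  $$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
▪ If $n_1, n_2 > 30$ then $t^{(df)}_{\alpha} \approx z_{\alpha}$

Test         Hypotheses                                                    Reject $H_0$ if                P-value
Two-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 \neq D_0$     $|t| > t^{(df)}_{\alpha/2}$    $2 \times P(T > |t|)$
One-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 > D_0$        $t > t^{(df)}_{\alpha}$        $P(T > t)$
One-tailed   $H_0: \mu_1 - \mu_2 = D_0$, $H_1: \mu_1 - \mu_2 < D_0$        $t < -t^{(df)}_{\alpha}$       $P(T < t)$
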
Example 1.5
▪ Test whether the average sales of firms A and B are different
▪ Assume unequal variances
▪ Significance levels of 5% and 2%
▪ $\bar{x}_1 = 79.33$; $\bar{x}_2 = 85$
▪ $s_1^2 = 25.87$; $s_2^2 = 16.75$; $df = 9$

Firm A ($X_1$)   Firm B ($X_2$)
76               90
79               82
77               85
80               90
75               80
89               79
                 87
                 88
                 84

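Example 1.5 can be worked in the same way with the unequal-variance statistic and Welch's degrees of freedom. In this sketch the critical values $t^{(9)}_{0.025} = 2.262$ and $t^{(9)}_{0.01} = 2.821$ are assumed from a standard t table, using $df = 9$ as on the slide.

```python
import math
import statistics

# Sales of firm A and firm B (Example 1.5)
firm_a = [76, 79, 77, 80, 75, 89]
firm_b = [90, 82, 85, 90, 80, 79, 87, 88, 84]

n1, n2 = len(firm_a), len(firm_b)
x1_bar, x2_bar = statistics.mean(firm_a), statistics.mean(firm_b)

# Per-sample variance of the mean: s_k^2 / n_k
v1 = statistics.variance(firm_a) / n1
v2 = statistics.variance(firm_b) / n2

# Unequal-variance t statistic (D0 = 0) and Welch's degrees of freedom
t_stat = (x1_bar - x2_bar) / math.sqrt(v1 + v2)
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Assumed table values for df = 9: alpha/2 = 0.025 and alpha/2 = 0.01
t_crit_5, t_crit_2 = 2.262, 2.821
reject_at_5 = abs(t_stat) > t_crit_5
reject_at_2 = abs(t_stat) > t_crit_2

print(f"t = {t_stat:.3f}, Welch df = {df_welch:.2f}")
print(f"reject at 5%: {reject_at_5}, reject at 2%: {reject_at_2}")
```

The statistic only just exceeds the 5% critical value, so the conclusion at that level is sensitive to rounding of the degrees of freedom.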
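Estimate the Difference of Two Means, $\mu_1 - \mu_2$

The 95% confidence interval for $\mu_1 - \mu_2$:

▪ Assuming equal variances:
  $$(\mu_1 - \mu_2) \in (\bar{x}_1 - \bar{x}_2) \pm t^{(n_1 + n_2 - 2)}_{\alpha/2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
▪ Assuming unequal variances:
  $$(\mu_1 - \mu_2) \in (\bar{x}_1 - \bar{x}_2) \pm t^{(df)}_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
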
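Both intervals can be illustrated with the summary statistics of firms A and B from Examples 1.4 and 1.5. This is a sketch only: the critical values 2.160 (13 degrees of freedom) and 2.262 (9 degrees of freedom) are assumed from a standard t table and do not appear on the slides.

```python
import math

# Summary statistics of firms A and B as reported in Examples 1.4 and 1.5
n1, n2 = 6, 9
x1_bar, x2_bar = 476 / 6, 765 / 9
s2_1, s2_2 = 25.87, 16.75

diff = x1_bar - x2_bar

# Equal variances: pooled variance, df = n1 + n2 - 2 = 13
s2_p = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)
se_pooled = math.sqrt(s2_p / n1 + s2_p / n2)
t_pooled = 2.160          # assumed: t(13) at alpha/2 = 0.025
ci_pooled = (diff - t_pooled * se_pooled, diff + t_pooled * se_pooled)

# Unequal variances: separate variances, df = 9 as on the slide
se_welch = math.sqrt(s2_1 / n1 + s2_2 / n2)
t_welch = 2.262           # assumed: t(9) at alpha/2 = 0.025
ci_welch = (diff - t_welch * se_welch, diff + t_welch * se_welch)

print(f"equal variances:   {ci_pooled[0]:.2f} < mu1 - mu2 < {ci_pooled[1]:.2f}")
print(f"unequal variances: {ci_welch[0]:.2f} < mu1 - mu2 < {ci_welch[1]:.2f}")
```

Both intervals lie below zero, which matches rejecting equal means at the 5% level in the two-tailed tests.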
Example 1.6

Testing Two Independent Means
▪ Unknown $\sigma_1^2$, $\sigma_2^2$: assume $\sigma_1^2 = \sigma_2^2$ or assume $\sigma_1^2 \neq \sigma_2^2$
▪ How to know which assumption is correct? By testing two variances:
  $H_0: \sigma_1^2 = \sigma_2^2$
  $H_1: \sigma_1^2 \neq \sigma_2^2$

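1.4. Testing Two Variances
▪ $H_0: \sigma_X^2 = \sigma_Y^2$; $H_1: \sigma_X^2 \neq \sigma_Y^2$
▪ F-statistic:
  $$F = \frac{S_X^2}{S_Y^2}$$
▪ If $F > F^{(n_X - 1,\, n_Y - 1)}_{\alpha/2}$ or $F < F^{(n_X - 1,\, n_Y - 1)}_{1 - \alpha/2}$ then reject $H_0$
▪ Note:
  $$F^{(n_X - 1,\, n_Y - 1)}_{1 - \alpha/2} = \frac{1}{F^{(n_Y - 1,\, n_X - 1)}_{\alpha/2}}$$
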
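1.4. Testing Two Variances

F-test statistic:
$$F = \frac{S_X^2}{S_Y^2}$$

Hypotheses                                                              Reject $H_0$ if
$H_0: \sigma_X^2 = \sigma_Y^2$, $H_1: \sigma_X^2 \neq \sigma_Y^2$       $F > F^{(n_X - 1,\, n_Y - 1)}_{\alpha/2}$ or $F < F^{(n_X - 1,\, n_Y - 1)}_{1 - \alpha/2}$
$H_0: \sigma_X^2 = \sigma_Y^2$, $H_1: \sigma_X^2 > \sigma_Y^2$          $F > F^{(n_X - 1,\, n_Y - 1)}_{\alpha}$
$H_0: \sigma_X^2 = \sigma_Y^2$, $H_1: \sigma_X^2 < \sigma_Y^2$          $F < F^{(n_X - 1,\, n_Y - 1)}_{1 - \alpha}$
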
Example 1.7
▪ Test the hypothesis that the variances of the two firms are different, at the 5% significance level.
▪ $s_1^2 = 25.87$; $s_2^2 = 16.75$
▪ $F^{(6-1,\, 9-1)}_{0.025} = 4.81$
▪ $F^{(6-1,\, 9-1)}_{0.975} = \dfrac{1}{F^{(8,\, 5)}_{0.025}} = 0.148$

Firm A (X)   Firm B (Y)
76           90
79           82
77           85
80           90
75           80
89           79
             87
             88
             84

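The decision in Example 1.7 follows from comparing the variance ratio with the two critical values quoted on the slide. A minimal sketch with the standard library (the critical values 4.81 and 0.148 are copied from the slide, not computed):

```python
import statistics

# Sales of firm A (X) and firm B (Y) from Example 1.7
firm_a = [76, 79, 77, 80, 75, 89]
firm_b = [90, 82, 85, 90, 80, 79, 87, 88, 84]

s2_x = statistics.variance(firm_a)   # sample variance of X
s2_y = statistics.variance(firm_b)   # sample variance of Y
f_stat = s2_x / s2_y

# Critical values for (5, 8) degrees of freedom, taken from the slide
f_upper = 4.81     # upper alpha/2 = 0.025 point
f_lower = 0.148    # lower point, 1 / F(8, 5) at 0.025

reject_h0 = f_stat > f_upper or f_stat < f_lower
print(f"F = {f_stat:.3f}, reject H0: {reject_h0}")
```

Since the ratio falls between the two critical values, equal variances are not rejected, which supports using the pooled-variance t-test for these data.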
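Summary
Variables $X_1$ and $X_2$
▪ Paired sample?
  ▪ Yes: t-test on $d = X_1 - X_2$:
    $$t = \frac{\bar{d}}{\sqrt{S_d^2 / n}}$$
  ▪ No: known $\sigma_1^2$, $\sigma_2^2$?
    ▪ Yes: z-test:
      $$z = \frac{\bar{x}_1 - \bar{x}_2 - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
    ▪ No: $\sigma_1^2 = \sigma_2^2$? (F-test: $F = \dfrac{S_1^2}{S_2^2}$)
      ▪ Yes: t-test with $s_p^2$:
        $$t = \frac{\bar{x}_1 - \bar{x}_2 - D_0}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$$
      ▪ No: t-test with $s_1^2$, $s_2^2$:
        $$t = \frac{\bar{x}_1 - \bar{x}_2 - D_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
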
1.5. Testing Two Proportions
▪ Independent populations and samples 1, 2
▪ Population proportions: $p_1$, $p_2$
▪ Sample 1 from population 1: $\hat{p}_1 = \dfrac{f_1}{n_1}$
▪ Sample 2 from population 2: $\hat{p}_2 = \dfrac{f_2}{n_2}$

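1.5. Testing Two Proportions
$$\hat{p}_1 = \frac{f_1}{n_1}, \quad \hat{p}_2 = \frac{f_2}{n_2}, \quad \hat{p}_0 = \frac{f_1 + f_2}{n_1 + n_2}$$

Statistic value:
$$Z_{stat} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_0 (1 - \hat{p}_0)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Hypotheses                                   Reject $H_0$ if
$H_0: p_1 \leq p_2$, $H_1: p_1 > p_2$        $Z_{stat} > z_{\alpha}$
$H_0: p_1 \geq p_2$, $H_1: p_1 < p_2$        $Z_{stat} < -z_{\alpha}$
$H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$        $|Z_{stat}| > z_{\alpha/2}$
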
Example 1.8
▪ Customers are observed in three stores 1, 2, 3

         Store 1   Store 2   Store 3
Female   55        115       85
Male     45        55        35
Sum      100       170       120

▪ At the 5% level, test the hypothesis that "the female customer proportions in stores 1 and 2 are equal"
▪ What is the answer if the significance level is 1%?
▪ Compare the female proportions of stores 2 and 3
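One way to work Example 1.8 is sketched below. The helper names `two_proportion_z` and `z_two_tailed` are illustrative, the comparison of stores 2 and 3 is assumed to be a two-tailed test at 5%, and `statistics.NormalDist().inv_cdf` from the standard library supplies the normal critical values.

```python
import math
from statistics import NormalDist

def two_proportion_z(f1, n1, f2, n2):
    """Z statistic for H0: p1 = p2, using the pooled proportion p0."""
    p1_hat, p2_hat = f1 / n1, f2 / n2
    p0_hat = (f1 + f2) / (n1 + n2)
    se = math.sqrt(p0_hat * (1 - p0_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

def z_two_tailed(alpha):
    """Critical value z_{alpha/2} of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# Female customers out of all customers in each store (Example 1.8)
z_12 = two_proportion_z(55, 100, 115, 170)    # store 1 vs store 2
z_23 = two_proportion_z(115, 170, 85, 120)    # store 2 vs store 3

reject_12_at_5 = abs(z_12) > z_two_tailed(0.05)
reject_12_at_1 = abs(z_12) > z_two_tailed(0.01)
reject_23_at_5 = abs(z_23) > z_two_tailed(0.05)

print(f"stores 1 vs 2: z = {z_12:.3f}, reject at 5%: {reject_12_at_5}, at 1%: {reject_12_at_1}")
print(f"stores 2 vs 3: z = {z_23:.3f}, reject at 5%: {reject_23_at_5}")
```

Under these assumptions the proportions in stores 1 and 2 differ at the 5% level but not at the 1% level, while stores 2 and 3 show no significant difference.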

BUSINESS STATISTICS – Bui Duong Hai – NEU – www.mfe.edu.vn/buiduonghai


Confidence Interval of Difference
▪ The confidence interval of $p_1 - p_2$ is $(\hat{p}_1 - \hat{p}_2) \pm ME$
▪ Margin of error:
  $$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$

Example 1.9
▪ Among 200 male and 300 female customers, there are 126 and 144 regular ones, respectively. At the 5% significance level, is the proportion of regular customers among males higher than among females? If yes, estimate the difference with a 95% confidence level.

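Example 1.9 combines the right-tailed test with the confidence interval above. A possible worked sketch, with illustrative variable names and the normal critical values obtained from `statistics.NormalDist`:

```python
import math
from statistics import NormalDist

# Regular customers among males (sample 1) and females (sample 2), Example 1.9
f1, n1 = 126, 200
f2, n2 = 144, 300

p1_hat, p2_hat = f1 / n1, f2 / n2
p0_hat = (f1 + f2) / (n1 + n2)                 # pooled proportion under H0

# Right-tailed test, H1: p1 > p2
z_stat = (p1_hat - p2_hat) / math.sqrt(p0_hat * (1 - p0_hat) * (1 / n1 + 1 / n2))
z_alpha = NormalDist().inv_cdf(0.95)           # z_alpha for alpha = 0.05
reject_h0 = z_stat > z_alpha

# 95% confidence interval for p1 - p2 (unpooled standard error)
z_half = NormalDist().inv_cdf(0.975)           # z_{alpha/2}
me = z_half * math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
ci = (p1_hat - p2_hat - me, p1_hat - p2_hat + me)

print(f"z = {z_stat:.3f}, reject H0: {reject_h0}")
print(f"{ci[0]:.4f} < p1 - p2 < {ci[1]:.4f}")
```

Note that the test uses the pooled proportion $\hat{p}_0$ in its standard error, while the interval uses the two separate sample proportions, as in the formulas above.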
1.6. Correlation Coefficient
▪ Paired sample data of quantitative variables, $(x_i, y_i)$, $i = 1, \dots, n$
▪ Covariance:
  $$Cov(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
  ▪ $Cov > 0$: positively correlated
  ▪ $Cov = 0$: not correlated
  ▪ $Cov < 0$: negatively correlated
▪ Correlation coefficient:
  $$r_{X,Y} = \frac{Cov(X, Y)}{s_X s_Y}$$

Correlation
▪ (Pearson) sample correlation coefficient

Value of $r$     Interpretation
$r = -1$         Linear, negatively correlated
$-1 < r < 0$     Negatively correlated
$r = 0$          Not correlated
$0 < r < 1$      Positively correlated
$r = 1$          Linear, positively correlated

Example 1.10

$i$    $x_i$   $y_i$   $x_i - \bar{x}$   $y_i - \bar{y}$   $(x_i - \bar{x})(y_i - \bar{y})$   $(x_i - \bar{x})^2$   $(y_i - \bar{y})^2$
1      1       4       -1.4              -2.2              3.08                               1.96                  4.84
2      2       6       -0.4              -0.2              0.08                               0.16                  0.04
3      2       5       -0.4              -1.2              0.48                               0.16                  1.44
4      3       7        0.6               0.8              0.48                               0.36                  0.64
5      4       9        1.6               2.8              4.48                               2.56                  7.84
Sum    12      31       0                 0                8.6                                5.2                   14.8

▪ $\bar{x} = \dfrac{12}{5} = 2.4$; $s_x^2 = \dfrac{5.2}{4} = 1.3$; $Cov = \dfrac{8.6}{4} = 2.15$
▪ $\bar{y} = \dfrac{31}{5} = 6.2$; $s_y^2 = \dfrac{14.8}{4} = 3.7$; $r = \dfrac{2.15}{\sqrt{1.3 \times 3.7}} = 0.9803$

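The table of Example 1.10 can be reproduced step by step. This is a minimal sketch with plain Python, written out by hand so that each column of the table is visible in the code; the variable names are illustrative.

```python
import math

# Data of Example 1.10
x = [1, 2, 2, 3, 4]
y = [4, 6, 5, 7, 9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squared deviations (last three table columns)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

cov = sxy / (n - 1)       # sample covariance
s2_x = sxx / (n - 1)      # sample variance of x
s2_y = syy / (n - 1)      # sample variance of y
r = cov / math.sqrt(s2_x * s2_y)

print(f"Cov = {cov:.2f}, s_x^2 = {s2_x:.1f}, s_y^2 = {s2_y:.1f}, r = {r:.4f}")
```

The denominator of $r$ is $s_x s_y = \sqrt{s_x^2 \, s_y^2}$, which is why the product of the two variances sits under a square root.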
Correlation Test
▪ In the population, $\rho_{X,Y}$ is unknown
▪ Hypotheses:
  ▪ $H_0: \rho_{X,Y} = 0$: X and Y are not correlated
  ▪ $H_1: \rho_{X,Y} \neq 0$: X and Y are correlated
▪ Z-statistic: $z = r_{X,Y}\sqrt{n}$
▪ If $|z| > z_{\alpha/2}$ then reject $H_0$
▪ Confidence interval of the correlation coefficient:
  $$r_{X,Y} - z_{\alpha/2}\frac{1}{\sqrt{n}} < \rho_{X,Y} < r_{X,Y} + z_{\alpha/2}\frac{1}{\sqrt{n}}$$

Example 1.11
▪ The following data show the quantity sold (Q), the price (P), and the competitor's price (Z)
Q 20 25 24 26 28 29 25 26 28 26 27 28 26 25 27 28
P 18 18 17 15 15 12 16 17 14 15 14 13 15 16 16 15
Z 15 15 14 14 17 17 15 12 18 13 14 19 12 16 16 18
▪ Use Excel to calculate the correlation coefficients between Q and P, and between Q and Z
▪ At the 5% significance level, test for correlation between Q and P, and between Q and Z
▪ Find the 95% confidence interval for the significant correlation

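The example asks for Excel; as an alternative, the same quantities can be sketched in Python with the standard library. The helper `pearson_r` is an illustrative name, the test uses the statistic $z = r\sqrt{n}$ from the Correlation Test slide, and clipping the interval to $[-1, 1]$ is an added assumption, since a correlation cannot lie outside that range.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r = Cov(X, Y) / (s_X * s_Y)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)   # the (n - 1) factors cancel

# Data of Example 1.11
q = [20, 25, 24, 26, 28, 29, 25, 26, 28, 26, 27, 28, 26, 25, 27, 28]
p = [18, 18, 17, 15, 15, 12, 16, 17, 14, 15, 14, 13, 15, 16, 16, 15]
z_price = [15, 15, 14, 14, 17, 17, 15, 12, 18, 13, 14, 19, 12, 16, 16, 18]
n = len(q)

r_qp = pearson_r(q, p)
r_qz = pearson_r(q, z_price)

# Test statistic z = r * sqrt(n), compared with z_{0.025} = 1.96
z_crit = 1.96
z_qp, z_qz = r_qp * math.sqrt(n), r_qz * math.sqrt(n)
significant_qp = abs(z_qp) > z_crit
significant_qz = abs(z_qz) > z_crit

# 95% interval for the Q-P correlation, clipped to [-1, 1] (assumption)
half_width = z_crit / math.sqrt(n)
ci_qp = (max(-1.0, r_qp - half_width), min(1.0, r_qp + half_width))

print(f"r(Q,P) = {r_qp:.3f}, z = {z_qp:.3f}, significant: {significant_qp}")
print(f"r(Q,Z) = {r_qz:.3f}, z = {z_qz:.3f}, significant: {significant_qz}")
print(f"95% CI for rho(Q,P): {ci_qp[0]:.3f} to {ci_qp[1]:.3f}")
```

With this approximation only the correlation between Q and P is significant at 5%, so the interval is reported for that pair.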
1.7 Practice
▪ Testing for means - dependent samples
▪ Testing for two variances
▪ Testing for means - independent samples
