Topic - Chapter 10 - Two-Sample Hypothesis Tests
Two-sample hypothesis tests. Nguyen Thi Thu Van, November 26, 2022.
Two-sample tests compare two sample estimates with each other, whereas one-sample tests compare a sample estimate with a non-sample benchmark or target (a claim or prior belief about a population parameter).
For example: a new bumper is installed on selected vehicles in a corporate fleet. During a 1-year test period, 12 vehicles with the new bumper were involved in accidents, incurring mean damage of $1,101 with a standard deviation of $696. During the same year, 9 vehicles with the old bumpers were involved in accidents, incurring mean damage of $1,766 with a standard deviation of $838. Did the new bumper significantly reduce damage? Did it significantly reduce variation?
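As a quick check of the bumper example, the pooled (equal-variance) two-sample t test can be run directly from the summary statistics with scipy's `ttest_ind_from_stats`; the halving of the two-tailed p-value for the one-tailed claim "new < old" is our own step:

```python
# Two-sample t test on the bumper data, using only the reported summary statistics.
from scipy.stats import ttest_ind_from_stats

# New bumper: n=12, mean $1,101, sd $696; old bumper: n=9, mean $1,766, sd $838
t_stat, p_two = ttest_ind_from_stats(1101, 696, 12, 1766, 838, 9, equal_var=True)

# One-tailed test of H1: mean damage with the new bumper is lower
p_left = p_two / 2 if t_stat < 0 else 1 - p_two / 2
print(round(t_stat, 3), round(p_left, 4))  # t ≈ -1.99
```

With d.f. = 12 + 9 − 2 = 19, the one-tailed p-value is about .03, so at α = .05 the new bumper did significantly reduce mean damage.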
Basis of two-sample tests. The logic of two-sample tests rests on the fact that two samples drawn from the same population may yield different estimates of a parameter purely by chance. For example, exhaust emission tests could yield different results for two vehicles of the same type. Only if the two sample statistics differ by more than the amount attributable to chance can we conclude that the samples came from populations with different parameter values, as illustrated in the adjacent picture.
Two-sample tests are especially useful because they possess a built-in point of comparison. You can think of many situations where two groups are to
be compared: Before versus after; Old versus new; or Experimental versus control. Sometimes we don’t really care about the actual value of the
population parameter, but only whether the parameter is the same for both populations.
Comparing two means

- Variances $\sigma_1^2, \sigma_2^2$ known:
  $z_{calc} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
  To find $z_{crit}$, look it up in Table C.

- Variances unknown but assumed equal:
  $t_{calc} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled variance, with $d.f. = n_1 + n_2 - 2$.
  Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

- Variances unknown and unequal:
  If $n_1, n_2 \ge 30$, then $z_{calc} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
  If $n_1, n_2 < 30$, then $t_{calc} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ with Welch's degrees of freedom
  $d.f. = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
  Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
  To find $t_{crit}$, look it up in Table D or use an Excel function.

Comparing two proportions

$p_1 = \dfrac{x_1}{n_1}$, $p_2 = \dfrac{x_2}{n_2}$, with pooled proportion $p_c = \dfrac{x_1 + x_2}{n_1 + n_2}$.
$z_{calc} = \dfrac{p_1 - p_2}{\sqrt{p_c(1 - p_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Confidence interval: $(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

Comparing two variances

Assuming the populations are normal, the test statistic $F_{calc} = \dfrac{s_1^2}{s_2^2}$ follows the F distribution, named for Ronald A. Fisher in the 1930s, with $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$.
Critical values: $F_R = F_{df_1, df_2}$ and $F_L = \dfrac{1}{F_{df_2, df_1}}$. To find $F_{crit}$, look it up in Table F or use Excel:
$F_R \equiv \text{F.INV.RT}\!\left(\dfrac{\alpha}{2}, df_1, df_2\right)$; $F_L \equiv \dfrac{1}{\text{F.INV.RT}\!\left(\dfrac{\alpha}{2}, df_2, df_1\right)}$
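As a sketch of the variance comparison applied to the bumper data, scipy's F distribution can stand in for Table F or the Excel functions above (the critical-value calls mirror F.INV.RT):

```python
# Two-tailed F test for equal variances on the bumper data (alpha = .05).
from scipy.stats import f

s1, n1 = 696, 12   # new bumper
s2, n2 = 838, 9    # old bumper
F = s1**2 / s2**2            # F_calc ≈ 0.690
df1, df2 = n1 - 1, n2 - 1    # 11 and 8

FR = f.isf(0.025, df1, df2)      # right critical value, like F.INV.RT(.025, df1, df2)
FL = 1 / f.isf(0.025, df2, df1)  # left critical value, 1 / F.INV.RT(.025, df2, df1)
print(round(F, 3), round(FL, 3), round(FR, 3))
```

Since $F_{calc}$ falls between $F_L$ and $F_R$, we cannot reject equal variances: the new bumper did not significantly reduce the variation in damage.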
Paired t-test. When sample data consist of n matched pairs, a different approach is required. If the same individuals are observed twice but under different circumstances, we have a paired comparison, and the test statistic follows Student's t distribution. For example, weekly sales of Snapple at 12 Walmart stores are compared before and after installing a new eye-catching display. Did the new display increase sales?

$\bar{d} = \dfrac{\sum_{i=1}^{n} d_i}{n}$; $\quad s_d = \sqrt{\dfrac{\sum (d_i - \bar{d})^2}{n - 1}}$; $\quad t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$, with $d.f. = n - 1$.
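A minimal sketch of the Snapple example with scipy's paired test; the weekly sales figures below are invented for illustration, since the notes do not give the data:

```python
# Paired t test: the same 12 stores observed before and after the new display.
from scipy.stats import ttest_rel

# Hypothetical weekly Snapple sales (cases) at 12 Walmart stores
before = [28, 35, 31, 22, 40, 27, 33, 29, 36, 25, 30, 34]
after  = [32, 38, 30, 27, 44, 29, 35, 33, 37, 28, 34, 36]

t_stat, p_two = ttest_rel(after, before)        # works on the differences d_i
p_right = p_two / 2 if t_stat > 0 else 1 - p_two / 2  # H1: sales increased
print(round(t_stat, 3), round(p_right, 4))
```

Because each store is its own point of comparison, the pairing removes store-to-store variation that a two-independent-samples test would leave in the standard error.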
It is worth bearing in mind three questions when you are comparing two samples:
(1) Are the populations skewed? Are there outliers? This question refers to the assumption of normal populations, on which the tests are based.
(2) Are the sample sizes large ($n \ge 30$)? Thanks to the Central Limit Theorem, the t test is robust to non-normality as long as the samples are not too small and the populations are not too skewed.
(3) Is the difference important as well as significant? A small difference in means or proportions can be statistically significant when the sample size is large, because the standard error gets smaller as the sample size gets larger.
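Point (3) can be seen numerically with the two-proportion z statistic from this chapter: the same one-percentage-point gap is far from significant at n = 1,000 per group but overwhelmingly significant at n = 1,000,000 (the sample counts are invented for illustration):

```python
# Same difference in proportions, very different z statistics as n grows.
import math

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic: (p1 - p2) / SE."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se = math.sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z_small = two_prop_z(510, 1_000, 500, 1_000)                  # 51% vs 50%
z_large = two_prop_z(510_000, 1_000_000, 500_000, 1_000_000)  # same 1% gap
print(round(z_small, 2), round(z_large, 2))  # only the second exceeds 1.96
```

Whether a 1% difference matters is a practical question the p-value cannot answer.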