Hypothesis Testing For Two Sample Means
Ravindra S. Gokhale
IIM Indore
An Example
A company that sells educational materials reports statistical studies
to convince customers that its materials improve learning. One new
product supplies "directed reading activities" for classroom use.
These activities should improve the reading ability of elementary
school pupils.
A consultant arranges for a third-grade class of 21 students to take
part in these activities for an eight-week period. A control classroom
of 23 third-grade students follows the same curriculum without the
activities. At the end of the eight weeks, all students are given a
Degree of Reading Power (DRP) test, which measures the aspects of
reading ability that the treatment is designed to improve. The data
are as follows:
Treatment Group:
24 43 58 71 43 49 61 44 67 49 53 56 59
52 62 54 57 33 46 43 57
Control Group:
42 43 55 26 62 37 33 41 19 54 20 85 46
10 17 60 53 42 37 42 55 28 48
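As a starting point, the summary statistics of the two groups can be computed with a short script. This is only a sketch: the variable names `treatment` and `control` are chosen here for illustration, and the sample standard deviation uses the usual n - 1 divisor.

```python
from statistics import mean, stdev

# DRP scores copied from the slide above
treatment = [24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53, 56, 59,
             52, 62, 54, 57, 33, 46, 43, 57]
control = [42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85, 46,
           10, 17, 60, 53, 42, 37, 42, 55, 28, 48]

# Sample size, sample mean and sample standard deviation (n - 1 divisor)
for name, scores in (("treatment", treatment), ("control", control)):
    print(f"{name}: n={len(scores)}, mean={mean(scores):.2f}, s={stdev(scores):.2f}")
```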
The z-test (two sample)
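Null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$
The test statistic:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Under $H_0$, this becomes:
$$Z_0 = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$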
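The two-sample z statistic is straightforward to compute once the population standard deviations are known. The sketch below assumes they are; the function name `two_sample_z` and the numbers in the usage line are hypothetical and are not taken from the DRP example.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Z0 for H0: mu1 - mu2 = delta0, with KNOWN population sigmas."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta0) / standard_error

# Hypothetical inputs, for illustration only
z0 = two_sample_z(xbar1=10.0, xbar2=8.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)
print(f"z0 = {z0:.4f}")
```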
The z-test (two sample)
The computation of p-value
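Here $\Phi$ denotes the standard normal cumulative distribution function.
$p = 2\,[1 - \Phi(|z_0|)]$ for a two-tailed test with
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 \neq \Delta_0$
$p = 1 - \Phi(z_0)$ for an upper-tailed test with
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 > \Delta_0$
$p = \Phi(z_0)$ for a lower-tailed test with
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 < \Delta_0$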
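The three p-value rules can be sketched in a few lines using the standard normal CDF from Python's standard library. The function name `z_test_p_value` and the labels "two-sided", "greater" and "less" are choices made here for illustration.

```python
from statistics import NormalDist

def z_test_p_value(z0, alternative="two-sided"):
    """p-value of a z-test for the stated alternative hypothesis."""
    phi = NormalDist().cdf  # standard normal CDF
    if alternative == "two-sided":   # H1: mu1 - mu2 != delta0
        return 2 * (1 - phi(abs(z0)))
    if alternative == "greater":     # H1: mu1 - mu2 > delta0
        return 1 - phi(z0)
    if alternative == "less":        # H1: mu1 - mu2 < delta0
        return phi(z0)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

print(z_test_p_value(1.96))             # close to 0.05
print(z_test_p_value(1.96, "greater"))  # close to 0.025
```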
The z-test (two sample)
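Two-sided $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is:
$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
One-sided $100(1 - \alpha)\%$ confidence intervals
A $100(1 - \alpha)\%$ upper confidence bound for $\mu_1 - \mu_2$ is:
$$\mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
A $100(1 - \alpha)\%$ lower confidence bound for $\mu_1 - \mu_2$ is:
$$\bar{x}_1 - \bar{x}_2 - z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2$$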
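These intervals can be sketched as follows, again assuming known population standard deviations. The function names `z_interval` and `z_bound` and the input numbers are hypothetical; the quantiles come from `statistics.NormalDist().inv_cdf`.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-sided 100(1 - alpha)% interval for mu1 - mu2 (known sigmas)."""
    diff = xbar1 - xbar2
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return diff - z * se, diff + z * se

def z_bound(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05, upper=True):
    """One-sided 100(1 - alpha)% upper (or lower) bound for mu1 - mu2."""
    diff = xbar1 - xbar2
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = NormalDist().inv_cdf(1 - alpha)      # z_{alpha}
    return diff + z * se if upper else diff - z * se

# Hypothetical inputs, for illustration only
low, high = z_interval(10.0, 8.0, 3.0, 4.0, 9, 16)
print(f"95% CI: ({low:.3f}, {high:.3f})")
print(f"95% upper bound: {z_bound(10.0, 8.0, 3.0, 4.0, 9, 16):.3f}")
```

The one-sided bound uses $z_{\alpha}$ rather than $z_{\alpha/2}$, so at the same confidence level it lies closer to the point estimate than the corresponding end of the two-sided interval.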
The z-test (two sample)
What are the conditions under which the z-test can be used?
What are the conditions under which the z-test can be used as an approximation?
What are the conditions under which the z-test cannot be used?
The t-test (two sample)
What are the conditions under which the t-test can be used?
What are the conditions under which the z-test cannot be used?
The t-test (two sample)
Null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$
The test statistic:
$$T_0 = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Degrees of freedom: $\min(n_1 - 1,\ n_2 - 1)$.
Compare $T_0$ with the critical value from the t-distribution table.
However, most textbooks do not give this procedure. It may be
considered an approximate procedure.
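Applied to the DRP data, this approximate procedure might look like the sketch below. It assumes $\Delta_0 = 0$ and an upper-tailed alternative (the activities improve scores), which the slides do not state explicitly; the critical value 1.725 is the tabulated upper 5% point of the t-distribution with 20 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

treatment = [24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53, 56, 59,
             52, 62, 54, 57, 33, 46, 43, 57]
control = [42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85, 46,
           10, 17, 60, 53, 42, 37, 42, 55, 28, 48]

n1, n2 = len(treatment), len(control)
delta0 = 0.0  # assumption: H0 states that the two means are equal

# Standard error built from the two sample variances (n - 1 divisor)
se = sqrt(variance(treatment) / n1 + variance(control) / n2)
t0 = (mean(treatment) - mean(control) - delta0) / se
df = min(n1 - 1, n2 - 1)  # conservative degrees of freedom

t_crit = 1.725  # t table, upper 5% point, 20 degrees of freedom
print(f"t0 = {t0:.3f}, df = {df}, reject H0 at 5%: {t0 > t_crit}")
```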
The t-test (two sample)
Actual procedure
Two cases need to be considered:
Unknown but equal variances assumed: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Unknown and unequal variances assumed: $\sigma_1^2 \neq \sigma_2^2$
The t-test (two sample)
Case 1: Equal variances assumed: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Based on the normality assumptions and the preceding results, it
can be derived that:
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
has a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
Note: $s_p$ is derived from $s_p^2$ (the pooled estimator of $\sigma^2$), which is given
by:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
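A minimal sketch of the pooled (equal-variance) statistic for the DRP data follows. It assumes the hypothesized difference $\mu_1 - \mu_2$ is 0; whether equal variances are reasonable for these two classrooms is a separate question that the code does not check.

```python
from math import sqrt
from statistics import mean, variance

treatment = [24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53, 56, 59,
             52, 62, 54, 57, 33, 46, 43, 57]
control = [42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85, 46,
           10, 17, 60, 53, 42, 37, 42, 55, 28, 48]

n1, n2 = len(treatment), len(control)

# Pooled variance: weighted average of the two sample variances
sp2 = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
sp = sqrt(sp2)

# T under H0: mu1 - mu2 = 0 (assumption), with n1 + n2 - 2 degrees of freedom
t_pooled = (mean(treatment) - mean(control)) / (sp * sqrt(1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2
print(f"s_p = {sp:.3f}, T = {t_pooled:.3f}, df = {df_pooled}")
```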
The t-test (two sample)
Case 2: Unequal variances assumed: $\sigma_1^2 \neq \sigma_2^2$
Based on the normality assumptions and the preceding results, it
can be derived that:
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
has an approximate t-distribution with the degrees of freedom
given by:
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
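The unequal-variance statistic and its approximate degrees of freedom can be sketched for the DRP data as below. The code uses the Welch-Satterthwaite form of the degrees of freedom with $n_i - 1$ in the denominators, and again assumes a hypothesized difference of 0; some textbooks round $\nu$ down to an integer before using the table.

```python
from math import sqrt
from statistics import mean, variance

treatment = [24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53, 56, 59,
             52, 62, 54, 57, 33, 46, 43, 57]
control = [42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85, 46,
           10, 17, 60, 53, 42, 37, 42, 55, 28, 48]

n1, n2 = len(treatment), len(control)
v1 = variance(treatment) / n1  # s1^2 / n1
v2 = variance(control) / n2    # s2^2 / n2

# T under H0: mu1 - mu2 = 0 (assumption)
t_welch = (mean(treatment) - mean(control)) / sqrt(v1 + v2)

# Approximate degrees of freedom (Welch-Satterthwaite form)
nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
print(f"T = {t_welch:.3f}, nu = {nu:.2f}")
```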
The Paired t-test
The t-test used before cannot be used if the two samples are
related.
An example:
Eleven employees were put under the care of the company
nurse because of high cholesterol readings. They were put on
a diet. The following table shows the cholesterol readings of the
eleven employees both before they were put on the new diet and
one month after beginning the new diet. Has the new diet been
effective in reducing cholesterol?
The Paired t-test
Employee   Cholesterol Reading   Cholesterol Reading
Number     (Initial)             (One month after new diet)
1          255                   197
2          230                   225
3          290                   215
4          242                   215
5          300                   240
6          250                   235
7          215                   190
8          230                   240
9          225                   200
10         219                   203
11         236                   223
The Paired t-test
The cholesterol level of an employee is affected by the innate
characteristics of that employee.
So the cholesterol levels before and after the diet for the same
employee are related.
Therefore, a regular (unpaired) t-test is not appropriate.
We have to use the paired t-test in this kind of situation.
Another example: measuring the effect of training on the
productivity of employees.
The Paired t-test
Null hypothesis $H_0: \mu_D = \mu_1 - \mu_2 = \Delta_0$
Test statistic:
$$T_0 = \frac{\bar{D} - \Delta_0}{s_D / \sqrt{n}}$$
$\bar{D}$ = sample average of the n differences $D_1, D_2, \ldots, D_n$
$s_D$ = sample standard deviation of the differences
Note: The differences should have a normal distribution with a mean
of $\mu_D = \mu_1 - \mu_2$
The mechanics after that are the same as the regular t-test, with
$n - 1$ degrees of freedom.
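For the cholesterol data, the paired statistic can be sketched as follows. The code assumes the differences are taken as initial minus after, that $\Delta_0 = 0$, and that the alternative is upper-tailed (the diet lowers cholesterol); none of these choices is fixed by the slides. The resulting value is compared with the t table at $n - 1 = 10$ degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

initial = [255, 230, 290, 242, 300, 250, 215, 230, 225, 219, 236]
after   = [197, 225, 215, 215, 240, 235, 190, 240, 200, 203, 223]

# Per-employee differences (assumption: initial minus after)
diffs = [b - a for b, a in zip(initial, after)]
n = len(diffs)

d_bar = mean(diffs)   # sample average of the differences
s_d = stdev(diffs)    # sample standard deviation of the differences
delta0 = 0.0          # assumption: H0 states no change on average

t0 = (d_bar - delta0) / (s_d / sqrt(n))
print(f"D-bar = {d_bar:.3f}, s_D = {s_d:.3f}, T0 = {t0:.3f}, df = {n - 1}")
```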
The Paired t-test
Conclusion from the cholesterol example?