Post Hoc Analysis (Tukey's Test) : Dr. A. Ramesh
Dr. A. Ramesh
DEPARTMENT OF MANAGEMENT STUDIES
IIT ROORKEE
Lecture Objectives
After completing this lecture, you should be able to:
• Use Tukey’s test and Fisher’s LSD test to identify specific differences
between means
Designing engineering experiments
The completely randomized single-factor experiment: example
• A manufacturer of paper that is used for making
grocery bags is interested in improving the tensile
strength of the product
• The product engineer thinks that tensile strength is a
function of the hardwood concentration in the
pulp and that the range of hardwood
concentrations of practical interest is between 5
and 20%.
• A team of engineers responsible for the study decides to investigate four
levels of hardwood concentration: 5%, 10%, 15%, and 20%.
• They decide to make up six test specimens at each concentration level,
using a pilot plant.
• All 24 specimens are tested on a laboratory tensile tester, in random order.
The data from this experiment are shown in the table below.
• Tensile Strength of Paper (psi)

Hardwood                Observations
Concentration (%)    1    2    3    4    5    6    Total   Average
 5                   7    8   15   11    9   10       60     10.00
10                  12   17   13   18   19   15       94     15.67
15                  14   18   19   17   16   18      102     17.00
20                  19   25   22   23   18   20      127     21.17
All                                                  383     15.96
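
The totals and averages in the table can be checked with a few lines of Python. This is a minimal sketch, not the lecture's own notebook code; the variable names (`strength`, `totals`, `averages`) are chosen here for illustration.

```python
# Tensile strength (psi) by hardwood concentration (%), copied from the table.
strength = {
    5:  [7, 8, 15, 11, 9, 10],
    10: [12, 17, 13, 18, 19, 15],
    15: [14, 18, 19, 17, 16, 18],
    20: [19, 25, 22, 23, 18, 20],
}

# Treatment totals (y_i.) and treatment averages (y-bar_i.).
totals = {level: sum(obs) for level, obs in strength.items()}
averages = {level: totals[level] / len(obs) for level, obs in strength.items()}

# Grand total (y_..) and grand average (y-bar_..) over all N observations.
grand_total = sum(totals.values())
n_total = sum(len(obs) for obs in strength.values())
grand_average = grand_total / n_total

for level in strength:
    print(f"{level:>2}%  total = {totals[level]:>3}  average = {averages[level]:.2f}")
print(f"grand total = {grand_total}, grand average = {grand_average:.2f}")
```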
Typical Data for Single Factor Experiment
Sum of Squares
Total sum of squares:
$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$
Treatment sum of squares:
$$SS_{Treatments} = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2$$
Error sum of squares:
$$SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$
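
As an illustration, the three sums of squares can be evaluated directly from their definitions for the paper tensile strength data. This is a sketch under the assumption of equal sample sizes (n = 6 per treatment); the variable names are illustrative only.

```python
# Tensile strength (psi) by hardwood concentration (%), from the example data.
strength = {
    5:  [7, 8, 15, 11, 9, 10],
    10: [12, 17, 13, 18, 19, 15],
    15: [14, 18, 19, 17, 16, 18],
    20: [19, 25, 22, 23, 18, 20],
}
n = 6  # observations per treatment (equal sample sizes)

all_obs = [y for obs in strength.values() for y in obs]
grand_mean = sum(all_obs) / len(all_obs)
means = {level: sum(obs) / len(obs) for level, obs in strength.items()}

# SS_T: squared deviations of every observation from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)

# SS_Treatments: n times the squared deviations of treatment means from the grand mean.
ss_treatments = n * sum((m - grand_mean) ** 2 for m in means.values())

# SS_E: squared deviations of every observation from its own treatment mean.
ss_error = sum((y - means[level]) ** 2
               for level, obs in strength.items() for y in obs)

print(f"SS_T = {ss_total:.2f}")
print(f"SS_Treatments = {ss_treatments:.2f}")
print(f"SS_E = {ss_error:.2f}")
```

The partition SS_T = SS_Treatments + SS_E holds up to floating-point rounding.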
ANOVA with Equal Sample Sizes
$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments} = \frac{1}{n} \sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}$$
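
The computing formulas above can be applied to the paper data as follows. The mean squares and the F ratio at the end are the standard one-way ANOVA quantities (MS = SS / degrees of freedom, F0 = MS_Treatments / MS_E); they are included here as an assumption about what the ANOVA table contains, since the table itself is not reproduced in the text.

```python
# Tensile strength (psi) by hardwood concentration (%), from the example data.
strength = {
    5:  [7, 8, 15, 11, 9, 10],
    10: [12, 17, 13, 18, 19, 15],
    15: [14, 18, 19, 17, 16, 18],
    20: [19, 25, 22, 23, 18, 20],
}
a = len(strength)   # number of treatments
n = 6               # observations per treatment
N = a * n           # total number of observations

treatment_totals = [sum(obs) for obs in strength.values()]   # y_i.
grand_total = sum(treatment_totals)                          # y_..
correction = grand_total ** 2 / N                            # y_..^2 / N

# Shortcut (computing) formulas for equal sample sizes.
ss_total = sum(y ** 2 for obs in strength.values() for y in obs) - correction
ss_treatments = sum(t ** 2 for t in treatment_totals) / n - correction
ss_error = ss_total - ss_treatments

# Mean squares and F ratio (standard one-way ANOVA definitions).
ms_treatments = ss_treatments / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treatments / ms_error

print(f"SS_Treatments = {ss_treatments:.2f}, df = {a - 1}, MS = {ms_treatments:.2f}")
print(f"SS_E = {ss_error:.2f}, df = {N - a}, MS = {ms_error:.2f}")
print(f"SS_T = {ss_total:.2f}, df = {N - 1}")
print(f"F0 = {f0:.2f}")
```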
ANOVA with unequal Sample Sizes
$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N}$$
Problem: Analysis of variance
ANOVA Table
Jupyter code
Multiple Comparisons Following the ANOVA
• When the null hypothesis is rejected in the ANOVA, we know that some of
the treatment or factor level means are different
• ANOVA doesn’t identify which means are different
• Methods for investigating this issue are called multiple comparisons
methods
Fisher’s least significant difference (LSD) method
• The Fisher LSD method compares all pairs of means with the null
hypotheses $H_0: \mu_i = \mu_j$ (for all $i \neq j$) using the t-statistic
$$t_0 = \frac{\bar{y}_{i.} - \bar{y}_{j.}}{\sqrt{\dfrac{2\,MS_E}{n}}}$$
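• The pair of means $\mu_i$ and $\mu_j$ is declared significantly different if
$$|\bar{y}_{i.} - \bar{y}_{j.}| > \mathrm{LSD}$$
where LSD, the least significant difference, is
$$\mathrm{LSD} = t_{\alpha/2,\,a(n-1)} \sqrt{\frac{2\,MS_E}{n}}$$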
• If the sample sizes are different in each treatment, the LSD is defined as
$$\mathrm{LSD} = t_{\alpha/2,\,N-a} \sqrt{MS_E \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
Problem: LSD method
• Therefore, if any pair of treatment averages differs in absolute value by
more than 3.07, the corresponding pair of treatment means is declared different.
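
The 3.07 threshold can be reproduced for the paper tensile strength data with the equal-sample-size LSD formula. This is a sketch under two stated assumptions: MS_E = 6.51 with 20 error degrees of freedom (the value used in the Tukey-Kramer computation in this lecture), and t = 2.086, the standard t-table value for α/2 = 0.025 and 20 degrees of freedom, which is not quoted in the text.

```python
import math
from itertools import combinations

# Treatment averages (psi) by hardwood concentration (%), from the data table.
means = {5: 60 / 6, 10: 94 / 6, 15: 102 / 6, 20: 127 / 6}
n = 6              # observations per treatment
ms_error = 6.51    # assumed: error mean square with a(n - 1) = 20 df
t_crit = 2.086     # assumed: t-table value for alpha/2 = 0.025, 20 df

# LSD = t_{alpha/2, a(n-1)} * sqrt(2 * MS_E / n)
lsd = t_crit * math.sqrt(2 * ms_error / n)
print(f"LSD = {lsd:.2f}")

# Compare every pair of treatment averages against the LSD.
significant = []
for i, j in combinations(means, 2):
    diff = abs(means[i] - means[j])
    flag = diff > lsd
    if flag:
        significant.append((i, j))
    print(f"{i:>2}% vs {j:>2}%: |difference| = {diff:.2f}  significant = {flag}")
```

Under these assumptions, only the 10% and 15% concentrations fail to differ by more than the LSD.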
The Tukey-Kramer Test for Post Hoc analysis
[Figure: group distributions along the x axis, illustrating a case such as μ1 = μ2 ≠ μ3]
Tukey-Kramer Critical Range
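$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$$
where:
$Q_U$ = value from the Studentized Range Distribution with $c$ and $n - c$
degrees of freedom for the desired level of $\alpha$ ($c$ = number of groups,
$n$ = total number of observations)
$MSW$ = Mean Square Within
$n_j$ and $n_{j'}$ = sample sizes from groups $j$ and $j'$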
Problem: Tukey-Kramer test
The Tukey-Kramer Procedure
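1. Compute absolute mean differences (using the treatment averages 10.00, 15.67, 17.00, and 21.17 for the 5%, 10%, 15%, and 20% concentrations):
$|\bar{x}_1 - \bar{x}_2| = |10.00 - 15.67| = 5.67$
$|\bar{x}_1 - \bar{x}_3| = |10.00 - 17.00| = 7.00$
$|\bar{x}_1 - \bar{x}_4| = |10.00 - 21.17| = 11.17$
$|\bar{x}_2 - \bar{x}_3| = |15.67 - 17.00| = 1.33$
$|\bar{x}_2 - \bar{x}_4| = |15.67 - 21.17| = 5.50$
$|\bar{x}_3 - \bar{x}_4| = |17.00 - 21.17| = 4.17$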
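2. Find the $Q_U$ value from the studentized range table with $c = 4$ and $n - c = 24 - 4 = 20$ degrees of freedom at $\alpha = 0.05$: $Q_U = 3.96$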
• Q table: the critical values for q corresponding to alpha = .05 (top) and alpha = .01 (bottom)
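3. Compute Critical Range:
$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.96 \sqrt{\frac{6.51}{2}\left(\frac{1}{6} + \frac{1}{6}\right)} = 4.124$$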
5. Other than $|\bar{x}_2 - \bar{x}_3|$, all of the absolute mean differences are greater than
the critical range. Therefore, at the 5% level of significance there is a significant
difference between each pair of means, except between the 10% and 15%
concentrations.
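
The steps of the procedure can be sketched in Python as below. This is an illustrative sketch, not the lecture's notebook code: the function name `tukey_kramer_critical_range` is hypothetical, and Q_U = 3.96 and MSW = 6.51 are taken from the worked steps above rather than computed from the studentized range distribution.

```python
import math
from itertools import combinations

def tukey_kramer_critical_range(q_u, msw, n_j, n_j_prime):
    """Critical Range = Q_U * sqrt((MSW / 2) * (1/n_j + 1/n_j'))."""
    return q_u * math.sqrt((msw / 2) * (1 / n_j + 1 / n_j_prime))

# Treatment averages (psi) and sample sizes by hardwood concentration (%).
means = {5: 60 / 6, 10: 94 / 6, 15: 102 / 6, 20: 127 / 6}
sizes = {5: 6, 10: 6, 15: 6, 20: 6}

q_u = 3.96   # from the Q table: c = 4 groups, n - c = 20 df, alpha = 0.05
msw = 6.51   # mean square within, as used in step 3

not_significant = []
for j, k in combinations(means, 2):
    diff = abs(means[j] - means[k])
    crit = tukey_kramer_critical_range(q_u, msw, sizes[j], sizes[k])
    flag = diff > crit
    if not flag:
        not_significant.append((j, k))
    print(f"{j:>2}% vs {k:>2}%: |difference| = {diff:.2f}, "
          f"critical range = {crit:.3f}, significant = {flag}")
```

Because the critical range is computed per pair, the same function also covers unequal group sizes, which is the case the Tukey-Kramer form is designed for.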
Problem 2
Treatment        Observations
level         1    2    3    4    5    Total   Average
15            7    7   15   11    9       49       9.8
20           12   17   12   18   18       77      15.4
25           14   18   18   19   19       88      17.6
30           19   25   22   19   23      108      21.6
35            7   10   11   15   11       54      10.8
Grand total = 376, grand mean = 15.04
• $SSA = 5(9.8 - 15.04)^2 + 5(15.4 - 15.04)^2 + 5(17.6 - 15.04)^2 + 5(21.6 - 15.04)^2 + 5(10.8 - 15.04)^2 = 475.76$
• $SST = 636.96$
• $SSE = SST - SSA = 636.96 - 475.76 = 161.20$
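
These sums of squares can be verified from the Problem 2 data. The sketch below also derives the error mean square, MS_E = SSE / (N − c), which is the standard definition and the quantity the Tukey comparison needs; the variable names are illustrative.

```python
# Problem 2 data: five treatment levels, five observations each.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
c = len(data)                               # number of treatments
N = sum(len(obs) for obs in data.values())  # total observations

all_obs = [y for obs in data.values() for y in obs]
grand_mean = sum(all_obs) / N

# SSA: between-treatment sum of squares.
ssa = sum(len(obs) * (sum(obs) / len(obs) - grand_mean) ** 2
          for obs in data.values())
# SST: total sum of squares.
sst = sum((y - grand_mean) ** 2 for y in all_obs)
# SSE: error sum of squares, by subtraction.
sse = sst - ssa
# Error mean square with N - c degrees of freedom.
ms_error = sse / (N - c)

print(f"grand mean = {grand_mean:.2f}")
print(f"SSA = {ssa:.2f}, SST = {sst:.2f}, SSE = {sse:.2f}")
print(f"MS_E = {ms_error:.2f} on {N - c} df")
```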
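$$T_\alpha = q_\alpha(c,\, n - c) \sqrt{\frac{MS_E}{n}}, \qquad \alpha = 0.05$$
Here $c = 5$ is the number of treatments and $n - c = 25 - 5 = 20$ is the error
degrees of freedom; the $n$ under the square root is the number of observations
per treatment (5).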
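$|\bar{y}_{1.} - \bar{y}_{2.}| = |9.8 - 15.4| = 5.6^*$
$|\bar{y}_{1.} - \bar{y}_{3.}| = |9.8 - 17.6| = 7.8^*$
$|\bar{y}_{1.} - \bar{y}_{4.}| = |9.8 - 21.6| = 11.8^*$
$|\bar{y}_{1.} - \bar{y}_{5.}| = |9.8 - 10.8| = 1.0$
$|\bar{y}_{2.} - \bar{y}_{3.}| = |15.4 - 17.6| = 2.2$
$|\bar{y}_{2.} - \bar{y}_{4.}| = |15.4 - 21.6| = 6.2^*$
$|\bar{y}_{2.} - \bar{y}_{5.}| = |15.4 - 10.8| = 4.6$
$|\bar{y}_{3.} - \bar{y}_{4.}| = |17.6 - 21.6| = 4.0$
$|\bar{y}_{3.} - \bar{y}_{5.}| = |17.6 - 10.8| = 6.8^*$
$|\bar{y}_{4.} - \bar{y}_{5.}| = |21.6 - 10.8| = 10.8^*$
Starred values indicate pairs of means that are significantly different.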
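
The starred pairs can be reproduced with a short sketch. Two inputs are assumptions not printed in the text: q = 4.23, the standard studentized range table value for α = 0.05 with 5 treatments and 20 error degrees of freedom, and MS_E = 161.20 / 20 = 8.06, obtained from the SSE given for Problem 2.

```python
import math
from itertools import combinations

# Problem 2 treatment averages, keyed by treatment level.
means = {15: 9.8, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}
n = 5                    # observations per treatment
ms_error = 161.20 / 20   # assumed: SSE / (N - c) = 8.06
q_crit = 4.23            # assumed: q-table value for alpha = 0.05, c = 5, 20 df

# Tukey threshold: T_alpha = q_alpha(c, n - c) * sqrt(MS_E / n)
t_alpha = q_crit * math.sqrt(ms_error / n)
print(f"T_0.05 = {t_alpha:.2f}")

starred = []
for i, j in combinations(means, 2):
    diff = abs(means[i] - means[j])
    if diff > t_alpha:
        starred.append((i, j))
    mark = "*" if diff > t_alpha else ""
    print(f"level {i} vs level {j}: |difference| = {diff:.1f}{mark}")
```

Under these assumptions the threshold is about 5.37, and the pairs flagged by the code match the starred differences listed above.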
Thank you