Notes Class5

Notes sampling

Uploaded by

Vale Diode
Copyright
© © All Rights Reserved

Analysis of Variance

Blood coagulation time

Treatment   Coagulation times            Average
A           62 60 63 59                  61
B           63 67 71 64 65 66            66
C           68 66 71 67 68 68            68
D           56 62 60 61 63 64 63 59      61

Overall average: 64
[Figure: dot plot of blood coagulation time, with a "Combined" row; x-axis: Coagulation Time, 56 to 72]

Notation

Assume we have k treatment groups.

$n_t$: number of subjects in treatment group $t$
$N$: number of subjects (overall)
$Y_{ti}$: response $i$ in treatment group $t$
$\bar{Y}_{t\cdot}$: average response in treatment group $t$
$\bar{Y}$: average response (overall)
Variance contributions

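$$\sum_t \sum_i (Y_{ti} - \bar{Y})^2 = \sum_t n_t (\bar{Y}_{t\cdot} - \bar{Y})^2 + \sum_t \sum_i (Y_{ti} - \bar{Y}_{t\cdot})^2$$

$$S_T = S_B + S_W$$

with degrees of freedom

$$N - 1 = (k - 1) + (N - k)$$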
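The decomposition can be checked numerically on the blood coagulation data. The following is a minimal sketch in plain Python; the variable names (`data`, `s_total`, `s_between`, `s_within`) are illustrative and not part of the notes.

```python
# Coagulation times by treatment, taken from the data table in these notes.
data = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

def mean(xs):
    return sum(xs) / len(xs)

all_values = [y for ys in data.values() for y in ys]
grand_mean = mean(all_values)

# Total, between-group and within-group sums of squares.
s_total = sum((y - grand_mean) ** 2 for y in all_values)
s_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in data.values())
s_within = sum((y - mean(ys)) ** 2 for ys in data.values() for y in ys)

k = len(data)          # number of treatment groups
n = len(all_values)    # overall number of subjects

print("S_T =", s_total, " S_B =", s_between, " S_W =", s_within)
print("df:", n - 1, "=", k - 1, "+", n - k)
```

The two printed lines should reproduce the identities S_T = S_B + S_W and N − 1 = (k − 1) + (N − k) for these data.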

Estimating the variability

We assume that the data are random samples from four normal distributions having the same variance $\sigma^2$, differing only (if at all) in their means.

We can estimate the variance $\sigma^2$ for each treatment $t$, using the sum of squared differences from the averages within each group.

Define, for treatment group $t$,

$$S_t = \sum_{i=1}^{n_t} (Y_{ti} - \bar{Y}_{t\cdot})^2.$$

Then

$$E(S_t) = (n_t - 1)\,\sigma^2.$$
Within group variability

The within-group sum of squares is the sum of all treatment sums of squares:

$$S_W = S_1 + \cdots + S_k = \sum_t \sum_i (Y_{ti} - \bar{Y}_{t\cdot})^2$$

The within-group mean square is defined as

$$M_W = \frac{S_1 + \cdots + S_k}{(n_1 - 1) + \cdots + (n_k - 1)} = \frac{S_W}{N - k} = \frac{\sum_t \sum_i (Y_{ti} - \bar{Y}_{t\cdot})^2}{N - k}$$

It is our first estimate of $\sigma^2$.

Between group variability

The between-group sum of squares is

$$S_B = \sum_{t=1}^{k} n_t (\bar{Y}_{t\cdot} - \bar{Y})^2$$

The between-group mean square is defined as

$$M_B = \frac{S_B}{k - 1} = \frac{\sum_t n_t (\bar{Y}_{t\cdot} - \bar{Y})^2}{k - 1}$$

It is our second estimate of $\sigma^2$.

That is, if there is no treatment effect!


Important facts

The following are facts that we will exploit later for some formal
hypothesis testing:

• The distribution of $S_W/\sigma^2$ is $\chi^2$(df = N − k)

• The distribution of $S_B/\sigma^2$ is $\chi^2$(df = k − 1) if there is no treatment effect!

• $S_W$ and $S_B$ are independent

The F distribution

Let $Z_1 \sim \chi^2_m$ and $Z_2 \sim \chi^2_n$. Assume $Z_1$ and $Z_2$ are independent.

−→ Then $\dfrac{Z_1/m}{Z_2/n} \sim F_{m,n}$

[Figure: F distributions; density curves for df = (20, 10), (20, 20) and (20, 50); x-axis from 0 to 3]


ANOVA table

source               sum of squares                                              df      mean square

between treatments   $S_B = \sum_t n_t (\bar{Y}_{t\cdot} - \bar{Y})^2$            k − 1   $M_B = S_B/(k - 1)$

within treatments    $S_W = \sum_t \sum_i (Y_{ti} - \bar{Y}_{t\cdot})^2$          N − k   $M_W = S_W/(N - k)$

total                $S_T = \sum_t \sum_i (Y_{ti} - \bar{Y})^2$                   N − 1

Example

source sum of squares df mean square

between treatments 228 3 76.0

within treatments 112 20 5.6

total 340 23
The ANOVA model

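We write $Y_{ti} = \mu_t + \epsilon_{ti}$ with $\epsilon_{ti} \sim$ iid $N(0, \sigma^2)$.

Using $\tau_t = \mu_t - \mu$ we can also write

$$Y_{ti} = \mu + \tau_t + \epsilon_{ti}.$$

The corresponding analysis of the data is

$$y_{ti} = \bar{y}_{\cdot\cdot} + (\bar{y}_{t\cdot} - \bar{y}_{\cdot\cdot}) + (y_{ti} - \bar{y}_{t\cdot})$$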

The ANOVA model

Three different ways to describe the model:

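A. $Y_{ti}$ independent with $Y_{ti} \sim N(\mu_t, \sigma^2)$

B. $Y_{ti} = \mu_t + \epsilon_{ti}$ where $\epsilon_{ti} \sim$ iid $N(0, \sigma^2)$

C. $Y_{ti} = \mu + \tau_t + \epsilon_{ti}$ where $\epsilon_{ti} \sim$ iid $N(0, \sigma^2)$ and $\sum_t \tau_t = 0$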
Now what did we do...?

observations = grand average + treatment deviations + residuals

observations ($y_{ti}$), one column per treatment:

  A    B    C    D
 62   63   68   56
 60   67   66   62
 63   71   71   60
 59   64   67   61
      65   68   63
      66   68   64
                63
                59

grand average ($\bar{y}_{\cdot\cdot}$): 64 in each of the 24 positions

treatment deviations ($\bar{y}_{t\cdot} - \bar{y}_{\cdot\cdot}$), constant within each column:

  A    B    C    D
 −3    2    4   −3

residuals ($y_{ti} - \bar{y}_{t\cdot}$):

  A    B    C    D
  1   −3    0   −5
 −1    1   −2    1
  2    5    3   −1
 −2   −2   −1    0
      −1    0    2
       0    0    3
                 2
                −2

                     observations   grand average   treatment deviations   residuals
                     $y_{ti}$     = $\bar{y}_{\cdot\cdot}$ + $(\bar{y}_{t\cdot} - \bar{y}_{\cdot\cdot})$ + $(y_{ti} - \bar{y}_{t\cdot})$
Vector               Y            = A             + T                    + R
Sum of squares       98,644       = 98,304        + 228                  + 112
Degrees of freedom   24           = 1             + 3                    + 20

Hypothesis testing

We assume

$Y_{ti} = \mu + \tau_t + \epsilon_{ti}$ with $\epsilon_{ti} \sim$ iid $N(0, \sigma^2)$.

Equivalently, $Y_{ti} \sim$ independent $N(\mu_t, \sigma^2)$

We want to test

$H_0: \tau_1 = \cdots = \tau_k = 0$ versus $H_a$: $H_0$ is false.

Equivalently, $H_0: \mu_1 = \cdots = \mu_k$

For this, we use a one-sided F test.


Another fact

It can be shown that

$$E(M_B) = \sigma^2 + \frac{\sum_t n_t \tau_t^2}{k - 1}$$

Therefore

$E(M_B) = \sigma^2$ if $H_0$ is true

$E(M_B) > \sigma^2$ if $H_0$ is false

Recipe for the hypothesis test

Under $H_0$ we have

$$\frac{M_B}{M_W} \sim F_{k-1,\,N-k}.$$

Therefore

• Calculate $M_B$ and $M_W$.

• Calculate $M_B/M_W$.

• Calculate a p-value using $M_B/M_W$ as test statistic, using the right tail of an F distribution with k − 1 and N − k degrees of freedom.
Example (cont)

$H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0$ versus $H_a$: $H_0$ is false.

$M_B = 76$, $M_W = 5.6$, therefore $M_B/M_W = 13.57$.

Using an F distribution with 3 and 20 degrees of freedom, we get a pretty darn low p-value. Therefore, we reject the null hypothesis.
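To put a number on that p-value without a statistics library, here is a rough sketch. It relies on the standard fact that if F ~ F(d1, d2) then d2/(d2 + d1·F) follows a Beta(d2/2, d1/2) distribution, and integrates that Beta density with Simpson's rule. The helper name `f_right_tail` is hypothetical, and the sketch assumes x > 0 and d2 > 2.

```python
import math

def f_right_tail(x, d1, d2, steps=2000):
    """Approximate P(F > x) for F ~ F(d1, d2); assumes x > 0 and d2 > 2.

    Uses P(F > x) = P(U < z) with U ~ Beta(d2/2, d1/2) and z = d2 / (d2 + d1*x),
    and integrates the Beta density on [0, z] with Simpson's rule (steps must be even).
    """
    a, b = d2 / 2.0, d1 / 2.0
    z = d2 / (d2 + d1 * x)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def density(u):
        if u <= 0.0:
            return 0.0  # valid because a > 1 under the stated assumption
        return math.exp(log_norm + (a - 1) * math.log(u) + (b - 1) * math.log(1 - u))

    h = z / steps
    total = density(0.0) + density(z)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * density(j * h)
    return total * h / 3

mb, mw = 76.0, 5.6
f_stat = mb / mw
p_value = f_right_tail(f_stat, 3, 20)
print("F =", round(f_stat, 2), " p-value =", p_value)
```

In practice a library routine for the F distribution would be the usual choice; the numerical integration here is only meant to make the "right tail" in the recipe concrete.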

[Figure: F(3,20) density with the observed MB/MW marked; x-axis from 0 to 14]

Another example

[Figure: response by treatment; x-axis: response, 200 to 2000]

Are the population means the same?

By now, we know two ways of testing that:


Two-sample t-test, and ANOVA with two treatments.

−→ But do they give similar results?


ANOVA table

source               sum of squares                                              df      mean square

between treatments   $S_B = \sum_t n_t (\bar{Y}_{t\cdot} - \bar{Y})^2$            k − 1   $M_B = S_B/(k - 1)$

within treatments    $S_W = \sum_t \sum_i (Y_{ti} - \bar{Y}_{t\cdot})^2$          N − k   $M_W = S_W/(N - k)$

total                $S_T = \sum_t \sum_i (Y_{ti} - \bar{Y})^2$                   N − 1

ANOVA for two groups

The ANOVA test statistic is $M_B/M_W$, with

$$M_B = n_1(\bar{Y}_1 - \bar{Y})^2 + n_2(\bar{Y}_2 - \bar{Y})^2$$

and

$$M_W = \frac{\sum_{i=1}^{n_1} (Y_{1i} - \bar{Y}_1)^2 + \sum_{i=1}^{n_2} (Y_{2i} - \bar{Y}_2)^2}{n_1 + n_2 - 2}$$
Two-sample t-test

The test statistic for the two-sample t-test is

$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{s\,\sqrt{1/n_1 + 1/n_2}}$$

with

$$s^2 = \frac{\sum_{i=1}^{n_1} (Y_{1i} - \bar{Y}_1)^2 + \sum_{i=1}^{n_2} (Y_{2i} - \bar{Y}_2)^2}{n_1 + n_2 - 2}$$

This also assumes equal variance within the groups!

Reference distributions

−→ Result: $\dfrac{M_B}{M_W} = t^2$

If there was no difference in means, then

$$\frac{M_B}{M_W} \sim F_{1,\,n_1+n_2-2}$$

$$t \sim t_{n_1+n_2-2}$$

Now does this mean $F_{1,\,n_1+n_2-2} = (t_{n_1+n_2-2})^2$ ?
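The identity MB/MW = t² is easy to confirm numerically. The sketch below uses treatments A and B from the coagulation data purely as an illustration; any two groups would do, and the variable names are not from the notes.

```python
import math

# Two groups from the coagulation data (treatments A and B), chosen for illustration.
y1 = [62, 60, 63, 59]
y2 = [63, 67, 71, 64, 65, 66]

def mean(xs):
    return sum(xs) / len(xs)

n1, n2 = len(y1), len(y2)
m1, m2 = mean(y1), mean(y2)
grand = mean(y1 + y2)

# ANOVA with two groups: k - 1 = 1, so MB equals the between-group sum of squares.
mb = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
mw = (sum((y - m1) ** 2 for y in y1) + sum((y - m2) ** 2 for y in y2)) / (n1 + n2 - 2)
f_stat = mb / mw

# Two-sample t statistic with pooled variance s^2 (identical to MW).
s = math.sqrt(mw)
t_stat = (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))

print("F =", f_stat, " t^2 =", t_stat ** 2)
```

Both printed values agree up to floating-point rounding, so the two tests lead to the same conclusion whenever the alternative is two-sided.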


A few facts

$$F_{1,k} = (t_k)^2$$

$$F_{k,\infty} = \frac{\chi^2_k}{k}$$

$$N(0,1)^2 = \chi^2_1 = F_{1,\infty} = (t_\infty)^2$$

Fixed effects
[Figure: standard ANOVA model; the underlying group distributions, with fixed means µ1, ..., µ8, and the data]
Random effects

[Figure: random effects model; the distribution of group means centered at µ, the observed underlying group means µ1, ..., µ8, the underlying group distributions, and the data]

The random effects model

Two different ways to describe the model:

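A. $\mu_t \sim$ iid $N(\mu, \sigma_A^2)$

   $Y_{ti} = \mu_t + \epsilon_{ti}$ where $\epsilon_{ti} \sim$ iid $N(0, \sigma^2)$

B. $\tau_t \sim$ iid $N(0, \sigma_A^2)$

   $Y_{ti} = \mu + \tau_t + \epsilon_{ti}$ where $\epsilon_{ti} \sim$ iid $N(0, \sigma^2)$

−→ We add another layer of sampling.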


Hypothesis testing

→ In the standard ANOVA model, we considered the $\mu_t$ as fixed but unknown quantities. We test the hypothesis $H_0: \mu_1 = \cdots = \mu_k$ (versus $H_0$ is false) using the statistic $M_B/M_W$ from the ANOVA table, and then compare this to an F(k − 1, N − k) distribution.

→ In the random effects model, we consider the $\mu_t$ as random draws from a normal distribution with mean $\mu$ and variance $\sigma_A^2$. We seek to test the hypothesis $H_0: \sigma_A^2 = 0$ versus $H_a: \sigma_A^2 > 0$.

As it turns out, we end up with the same test statistic and same null distribution. For one-way ANOVA, that is!

Estimation

For the random effects model it can be shown that

$$E(M_B) = \sigma^2 + n_0 \times \sigma_A^2$$

where

$$n_0 = \frac{1}{k - 1}\left(N - \frac{\sum_t n_t^2}{\sum_t n_t}\right)$$

Recall also that $E(M_W) = \sigma^2$.

Thus, we may estimate $\sigma^2$ by $\hat{\sigma}^2 = M_W$.

And we may estimate $\sigma_A^2$ by $\hat{\sigma}_A^2 = (M_B - M_W)/n_0$ (provided that this is ≥ 0).
Random effects example

[Figure: response by subject ID; x-axis: response, 25 to 60]

Random effects example

The sample sizes for the 8 subjects were (14, 12, 11, 10, 10, 11, 15, 9), for a total sample size of 92. Thus, $n_0 \approx 11.45$.

source             SS     df   MS    F      P-value
between subjects   1485   7    212   4.60   0.0002
within subjects    3873   84   46
total              5358   91

We have $M_B = 212$ and $M_W = 46$. Thus

$\hat{\sigma} = \sqrt{46} = 6.8$

$\hat{\sigma}_A = \sqrt{(212 - 46)/11.45} = 3.81$

−→ overall sample mean = 40.3
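The arithmetic for n0 and the two standard deviation estimates can be reproduced as follows. This is a small sketch from the summary numbers quoted in the example; truncating a negative variance estimate at zero is an assumption made here, since the notes only say the estimate applies when it is ≥ 0.

```python
import math

# Per-subject sample sizes and mean squares quoted in the example.
sizes = [14, 12, 11, 10, 10, 11, 15, 9]
mb, mw = 212.0, 46.0

k = len(sizes)
n_total = sum(sizes)

# n0 = (N - sum(n_t^2) / sum(n_t)) / (k - 1)
n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

sigma_hat = math.sqrt(mw)
# Assumption: truncate the between-subject variance estimate at zero.
var_a_hat = max((mb - mw) / n0, 0.0)
sigma_a_hat = math.sqrt(var_a_hat)

print("N =", n_total, " n0 =", round(n0, 2))
print("sigma_hat =", round(sigma_hat, 2), " sigma_A_hat =", round(sigma_a_hat, 2))
```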
ANOVA assumptions

• Data in each group are a random sample from some population.


• Observations within groups are independent.
• Samples are independent.
• Underlying populations normally distributed.
• Underlying populations have the same variance.

−→ The Kruskal-Wallis test is a non-parametric, rank-based approach to assess differences in location between groups; it does not rely on the normality assumption.
−→ In the case of two groups, the Kruskal-Wallis test reduces exactly to the Wilcoxon rank-sum test.
−→ This is just like how ANOVA with two groups is equivalent to the two-sample t test.

Multiple comparisons

When we carry out an ANOVA on k treatments, we test

H0 : µ1= · · · =µk versus Ha : H0 is false

Assume we reject the null hypothesis, i.e. we have some evidence that not all treatment means are equal. Then we could for example be interested in which ones are the same, and which ones differ. For this, we might have to carry out some more hypothesis tests.

−→ This procedure is referred to as multiple comparisons.


Key issue

We will be conducting, say, T different tests, and we become concerned about the overall error rate (sometimes called the family-wise error rate).

Overall error rate = Pr( reject at least one H0 | all H0 are true )

  = $1 - \{1 - \Pr(\text{reject first} \mid \text{first } H_0 \text{ is true})\}^T$   (if independent)

  ≤ $T \times \Pr(\text{reject first} \mid \text{first } H_0 \text{ is true})$   (generally)
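To see how quickly the overall error rate grows with the number of tests, the two expressions above can be tabulated. This is an illustrative sketch; the per-test level 0.05 and the values of T are arbitrary choices, and `familywise_error` is a hypothetical helper name.

```python
def familywise_error(alpha, t):
    """Overall error rate for t independent tests, each run at level alpha."""
    return 1 - (1 - alpha) ** t

alpha = 0.05
for t in (1, 3, 6, 10, 45):
    exact = familywise_error(alpha, t)   # valid if the tests are independent
    bound = min(t * alpha, 1.0)          # general upper bound, capped at 1
    print(f"T = {t:2d}  independent: {exact:.3f}  bound: {bound:.3f}")
```

The independent-tests value never exceeds the bound, and both approach 1 as T grows.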


Types of multiple comparisons

There are two different types of multiple comparisons procedures:

Sometimes we already know in advance what questions we want to answer. Those comparisons are called planned (or a priori) comparisons.

Sometimes we do not know in advance what questions we want to answer, and the judgement about which group means will be studied depends on the ANOVA outcome. Those comparisons are called unplanned (or a posteriori) comparisons.
Former example

We previously investigated whether the mean blood coagulation times for subjects receiving different treatments (A, B, C or D) were the same.

Imagine A is the standard treatment, and we wish to compare each of treatments B, C, D to treatment A.
−→ planned comparisons!

After inspecting the treatment means, we find that A and D look similar, and B and C look similar, but A and D are quite different from B and C. We might want to formally test the hypothesis $\mu_A = \mu_D \neq \mu_B = \mu_C$.
−→ unplanned comparisons!

Adjusting the significance level

Assume the investigator plans to make T independent significance tests, all at the significance level $\alpha'$. If all the null hypotheses are true, the probability of making no Type I error is $(1 - \alpha')^T$. Hence the overall significance level is

$$\alpha = 1 - (1 - \alpha')^T$$

Solving the above equation for $\alpha'$ yields

$$\alpha' = 1 - (1 - \alpha)^{1/T}$$

The above adjustment is called the Dunn-Sidak method.


An alternative method

In the literature, investigators often use

$$\alpha'' = \frac{\alpha}{T}$$

where T is the number of planned comparisons.

This adjustment is called the Bonferroni method.
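The two adjustments give very similar per-test levels, with Bonferroni slightly more conservative. A minimal sketch, with hypothetical function names `sidak_level` and `bonferroni_level` and illustrative values of T:

```python
def sidak_level(alpha, t):
    """Per-test level so that t independent tests have overall level alpha."""
    return 1 - (1 - alpha) ** (1 / t)

def bonferroni_level(alpha, t):
    """Per-test level alpha / t; the overall level is then at most alpha."""
    return alpha / t

alpha = 0.05
for t in (3, 6, 10, 45):
    print(f"T = {t:2d}  Dunn-Sidak: {sidak_level(alpha, t):.5f}"
          f"  Bonferroni: {bonferroni_level(alpha, t):.5f}")

# Plugging the Dunn-Sidak level back in recovers the overall level alpha.
recovered = 1 - (1 - sidak_level(alpha, 6)) ** 6
print("recovered overall level:", recovered)
```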

“Unplanned” comparisons

Suppose we are comparing k treatment groups.

Suppose ANOVA indicates that you reject $H_0: \mu_1 = \cdots = \mu_k$

What next?
Which of the µ's are different from which others?

Consider testing $H_0: \mu_i = \mu_j$ for all pairs i, j.

There are $\binom{k}{2} = \frac{k(k-1)}{2}$ such pairs.

k = 5 −→ $\binom{k}{2} = 10$.
k = 10 −→ $\binom{k}{2} = 45$.
Bonferroni correction

Suppose we have 10 treatment groups, and so 45 pairs.

If we perform 45 t-tests at the significance level α = 0.05, we would expect to reject 5% × 45 ≈ 2 of them, even if all of the means were the same.

Let α = Pr(reject at least one pairwise test | all µ's the same)
      ≤ (no. tests) × Pr(reject test #1 | µ's the same)

The Bonferroni correction:

Use $\alpha' = \alpha$/(no. tests) as the significance level for each test.
For example, with 10 groups and so 45 pairwise tests, we would use $\alpha' = 0.05/45 \approx 0.0011$ for each test.

[Figure: dot plot of blood coagulation time, with a "Combined" row; x-axis: Coagulation Time, 56 to 72]
Pairwise comparisons

Comparison   p-value
A vs B       0.004
A vs C       < 0.001
A vs D       1.000
B vs C       0.159
B vs D       < 0.001
C vs D       < 0.001

Bonferroni-adjusted level: $\alpha'' = \alpha/k = 0.05/6 = 0.0083$ (here k = 6 is the number of comparisons).

Another example

[Figure: response by treatment; x-axis: response, 60 to 75]
ANOVA table

Source              SS       Df   MS      F-value   p-value

Between treatment   1077.3   4    269.3   49.4      < 0.001

Within treatment    245.5    45   5.5

$\binom{5}{2} = 10$ pairwise comparisons −→ $\alpha' = 0.05/10 = 0.005$

For each pair, consider $T_{i,j} = \dfrac{\bar{Y}_{i\cdot} - \bar{Y}_{j\cdot}}{\hat{\sigma}\,\sqrt{1/n_i + 1/n_j}}$

Use $\hat{\sigma} = \sqrt{M_W}$ ($M_W$ = within-group mean square)
and refer to a t distribution with df = 45.

A comparison
Uncorrected:
Each interval, individually, had (in advance) a 95% chance of covering the true mean difference.

Corrected:
(In advance) there was a greater than 95% chance that all of the intervals would cover their respective parameters.

[Figure: confidence intervals for the pairwise differences A:S, G:S, G:A, F:S, F:A, F:G, C:S, C:A, C:G, C:F under three methods (Uncorrected, Bonferroni, Tukey); x-axis: Difference in response, −10 to 15]
Newman-Keuls procedure

Goal: Identify sets of treatments whose mean responses are not significantly different.
(Assuming equal sample sizes for the treatment groups.)

Procedure: 1. Calculate the group sample means.

2. Order the sample means from smallest to largest.

3. Calculate a triangular table of all pairwise differences between sample means.

4. Calculate $q_i = Q_\alpha(i, \text{df})$ for i = 2, 3, . . . , k.
   The Q is called the studentized range distribution!

5. Calculate $R_i = q_i \times \sqrt{M_W/n}$.

Newman-Keuls procedure (continued)

Procedure: 6. If the difference between the biggest and the smallest means is less than $R_k$, draw a line under all of the means and stop.

7. Compare the second biggest and the smallest (and the second-smallest and the biggest) to $R_{k-1}$. If observed difference is smaller than the critical value, draw a line between these means.

8. Continue to look at means for which a line connecting them has not yet been drawn, comparing the difference to $R_i$ with progressively smaller i's.
Example

Sorted sample means:

A      F      G      S      C
58.0   58.2   59.3   64.1   70.1

Table of differences:

     F     G     S     C
A    0.2   1.3   6.1   12.1
F          1.1   5.9   11.9
G                4.8   10.8
S                      6.0

Example (continued)

From the ANOVA table:

$M_W = 5.46$, n = 10 for each group, $\sqrt{M_W/10} = 0.739$, df = 45

The $q_i$ (using df = 45 and α = 0.05):

q2     q3     q4     q5
2.85   3.43   3.77   4.02

$R_i = q_i \times \sqrt{M_W/10}$:

R2     R3     R4     R5
2.10   2.53   2.79   2.97

Results

Sorted sample means:

A F G S C
58.0 58.2 59.3 64.1 70.1

Interpretation:
A≈F≈G<S<C
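The step-down logic of the procedure can be sketched in a few lines. This is an illustrative implementation under the equal-sample-size assumption stated in the notes, not a reference one; the function name `newman_keuls_lines` is hypothetical, and the q values are copied from the example rather than computed from the studentized range distribution.

```python
import math

def newman_keuls_lines(means, q, mw, n):
    """Return index ranges (lo, hi) of sorted means joined by a 'line',
    i.e. sets of means that are not significantly different.

    means: sample means sorted from smallest to largest
    q:     dict mapping range size i to q_i (taken from a table)
    mw:    within-group mean square; n: common group size
    """
    k = len(means)
    r = {i: q[i] * math.sqrt(mw / n) for i in q}   # critical ranges R_i
    lines = []
    for span in range(k, 1, -1):                   # widest ranges first
        for lo in range(0, k - span + 1):
            hi = lo + span - 1
            # Ranges already under an existing line are not tested again.
            if any(a <= lo and hi <= b for a, b in lines):
                continue
            if means[hi] - means[lo] < r[span]:
                lines.append((lo, hi))
    return lines

labels = ["A", "F", "G", "S", "C"]
means = [58.0, 58.2, 59.3, 64.1, 70.1]
q = {2: 2.85, 3: 3.43, 4: 3.77, 5: 4.02}

lines = newman_keuls_lines(means, q, mw=5.46, n=10)
groups = [labels[lo:hi + 1] for lo, hi in lines]
print(groups)
```

For the example data the only line drawn joins A, F and G, which matches the interpretation A ≈ F ≈ G < S < C.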
Another example

Sorted sample means:

D C A B E
29.6 32.9 40.0 40.7 48.8

Interpretation:
{D, C, A, B} < E and D < {A, B}

Nested ANOVA: Example

We have:
−→ 3 hospitals
−→ 4 subjects within each hospital
−→ 2 independent measurements per subject

                Hospital I                Hospital II               Hospital III

Subject         1     2     3     4       1     2     3     4       1     2     3     4

Measurement 1   58.5  77.8  84.0  70.1    69.8  56.0  50.7  63.8    56.6  77.8  69.9  62.1
Measurement 2   59.5  80.9  83.6  68.3    69.8  54.5  49.3  65.8    57.5  79.2  69.2  64.5
The model

[Figure: the nested model in two columns of panels, showing distributions for Hospitals, Individuals and Residuals]

Nested ANOVA: models

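$$Y_{ijk} = \mu + \alpha_i + \beta_{ij} + \epsilon_{ijk}$$

$\mu$ = overall mean
$\alpha_i$ = "effect" for ith hospital
$\beta_{ij}$ = "effect" for jth subject within ith hospital
$\epsilon_{ijk}$ = random error

Random effects model:
$\alpha_i \sim \text{Normal}(0, \sigma_A^2)$
$\beta_{ij} \sim \text{Normal}(0, \sigma_{B|A}^2)$
$\epsilon_{ijk} \sim \text{Normal}(0, \sigma^2)$

Mixed effects model:
$\alpha_i$ fixed; $\sum_i \alpha_i = 0$
$\beta_{ij} \sim \text{Normal}(0, \sigma_{B|A}^2)$
$\epsilon_{ijk} \sim \text{Normal}(0, \sigma^2)$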
Example: sample means

                             Hospital I                    Hospital II                   Hospital III

Subject                      1      2      3      4        1      2      3      4        1      2      3      4

Measurement 1                58.5   77.8   84.0   70.1     69.8   56.0   50.7   63.8     56.6   77.8   69.9   62.1
Measurement 2                59.5   80.9   83.6   68.3     69.8   54.5   49.3   65.8     57.5   79.2   69.2   64.5
$\bar{Y}_{ij\cdot}$          59.00  79.35  83.80  69.20    69.80  55.25  50.00  64.80    57.05  78.50  69.55  63.30

$\bar{Y}_{i\cdot\cdot}$      72.84                         59.96                         67.10

$\bar{Y}_{\cdot\cdot\cdot}$  66.63

Calculations (equal sample sizes)

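Source                    Sum of squares                                                                          df

among groups              $SS_{among} = b\,n \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$        a − 1

subgroups within groups   $SS_{subgr} = n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})^2$            a (b − 1)

within subgroups          $SS_{within} = \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2$                    a b (n − 1)

TOTAL                     $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2$                          a b n − 1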
ANOVA table

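SS              df            MS                            F                           expected MS

$SS_{among}$    a − 1         $SS_{among}/(a - 1)$          $MS_{among}/MS_{subgr}$     $\sigma^2 + n\,\sigma_{B|A}^2 + n\,b\,\sigma_A^2$

$SS_{subgr}$    a (b − 1)     $SS_{subgr}/(a(b - 1))$       $MS_{subgr}/MS_{within}$    $\sigma^2 + n\,\sigma_{B|A}^2$

$SS_{within}$   a b (n − 1)   $SS_{within}/(ab(n - 1))$                                 $\sigma^2$

$SS_{total}$    a b n − 1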

Example

source df SS MS F P-value

among groups 2 665.68 332.84 1.74 0.23

among subgroups within groups 9 1720.68 191.19 146.88 < 0.001

within subgroups 12 15.62 1.30

TOTAL 23 2401.97
Variance components

Within subgroups (error; between measurements on each subject)

$s^2 = MS_{within} = 1.30$, $s = \sqrt{1.30} = 1.14$

Among subgroups within groups (among subjects within hospitals)

$s_{B|A}^2 = \dfrac{MS_{subgr} - MS_{within}}{n} = \dfrac{191.19 - 1.30}{2} = 94.94$, $s_{B|A} = \sqrt{94.94} = 9.74$

Among groups (among hospitals)

$s_A^2 = \dfrac{MS_{among} - MS_{subgr}}{nb} = \dfrac{332.84 - 191.19}{8} = 17.71$, $s_A = \sqrt{17.71} = 4.21$
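The sums of squares and variance components for the hospital example can be recomputed directly from the raw measurements. The sketch below follows the equal-sample-size formulas given earlier; the data layout (a dict of hospitals, each a list of subjects with two measurements) and the variable names are illustrative choices.

```python
# hospital -> list of subjects -> two measurements per subject
data = {
    "I":   [[58.5, 59.5], [77.8, 80.9], [84.0, 83.6], [70.1, 68.3]],
    "II":  [[69.8, 69.8], [56.0, 54.5], [50.7, 49.3], [63.8, 65.8]],
    "III": [[56.6, 57.5], [77.8, 79.2], [69.9, 69.2], [62.1, 64.5]],
}

def mean(xs):
    return sum(xs) / len(xs)

a, b, n = len(data), 4, 2   # hospitals, subjects per hospital, measurements per subject

subj_means = {h: [mean(s) for s in subjects] for h, subjects in data.items()}
hosp_means = {h: mean(ms) for h, ms in subj_means.items()}
grand = mean(list(hosp_means.values()))   # balanced design, so this is the overall mean

ss_among = b * n * sum((m - grand) ** 2 for m in hosp_means.values())
ss_subgr = n * sum((m - hosp_means[h]) ** 2 for h, ms in subj_means.items() for m in ms)
ss_within = sum((y - mean(s)) ** 2 for subjects in data.values() for s in subjects for y in s)

ms_among = ss_among / (a - 1)
ms_subgr = ss_subgr / (a * (b - 1))
ms_within = ss_within / (a * b * (n - 1))

# Variance component estimates.
var_within = ms_within
var_subj = (ms_subgr - ms_within) / n
var_hosp = (ms_among - ms_subgr) / (n * b)

print("SS:", round(ss_among, 2), round(ss_subgr, 2), round(ss_within, 2))
print("variance components:", round(var_within, 2), round(var_subj, 2), round(var_hosp, 2))
```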

Variance components (2)

$s^2 + s_{B|A}^2 + s_A^2 = 1.30 + 94.94 + 17.71 = 113.95$.

$s^2$ represents 1.30 / 113.95 = 1.1%

$s_{B|A}^2$ represents 94.94 / 113.95 = 83.3%

$s_A^2$ represents 17.71 / 113.95 = 15.6%

Note:

−→ $\text{var}(Y) = \sigma^2 + \sigma_{B|A}^2 + \sigma_A^2$

−→ $\text{var}(Y \mid A) = \sigma^2 + \sigma_{B|A}^2$

−→ $\text{var}(Y \mid A, B) = \sigma^2$
Subject averages

I-1 I-2 I-3 I-4 II-1 II-2 II-3 II-4 III-1 III-2 III-3 III-4
58.5 77.8 84.0 70.1 69.8 56.0 50.7 63.8 56.6 77.8 69.9 62.1
59.5 80.9 83.6 68.3 69.8 54.5 49.3 65.8 57.5 79.2 69.2 64.5
ave 59.0 79.4 83.8 69.2 69.8 55.2 50.0 64.8 57.0 78.5 69.6 63.3

ANOVA table

source df SS MS F P-value
between 2 332.8 166.4 1.74 0.23
within 9 860.3 95.6

Higher-level nested ANOVA models

You can have as many levels as you like. For example, here is a
three-level nested mixed ANOVA model:

$$Y_{ijkl} = \mu + \alpha_i + B_{ij} + C_{ijk} + \epsilon_{ijkl}$$

Assumptions: $B_{ij} \sim N(0, \sigma_{B|A}^2)$, $C_{ijk} \sim N(0, \sigma_{C|B}^2)$, $\epsilon_{ijkl} \sim N(0, \sigma^2)$.
Calculations

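Source                 Sum of squares                                                                                       df

among groups           $SS_{among} = b\,c\,n \sum_i (\bar{Y}_{i\cdot\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot\cdot})^2$        a − 1

among subgroups        $SS_{subgr} = c\,n \sum_i \sum_j (\bar{Y}_{ij\cdot\cdot} - \bar{Y}_{i\cdot\cdot\cdot})^2$            a (b − 1)

among subsubgroups     $SS_{subsubgr} = n \sum_i \sum_j \sum_k (\bar{Y}_{ijk\cdot} - \bar{Y}_{ij\cdot\cdot})^2$             a b (c − 1)

within subsubgroups    $SS_{within} = \sum_i \sum_j \sum_k \sum_l (Y_{ijkl} - \bar{Y}_{ijk\cdot})^2$                        a b c (n − 1)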

ANOVA table

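SS: $SS_{among} = bcn \sum_a (\bar{Y}_A - \bar{Y})^2$
MS: $SS_{among}/(a - 1)$
F: $MS_{among}/MS_{subgr}$
expected MS: $\sigma^2 + n\,\sigma_{C \subset B}^2 + nc\,\sigma_{B \subset A}^2 + ncb\,\dfrac{\sum \alpha^2}{a - 1}$

SS: $SS_{subgr} = cn \sum_a \sum_b (\bar{Y}_B - \bar{Y}_A)^2$
MS: $SS_{subgr}/(a(b - 1))$
F: $MS_{subgr}/MS_{subsubgr}$
expected MS: $\sigma^2 + n\,\sigma_{C \subset B}^2 + nc\,\sigma_{B \subset A}^2$

SS: $SS_{subsubgr} = n \sum_a \sum_b \sum_c (\bar{Y}_C - \bar{Y}_B)^2$
MS: $SS_{subsubgr}/(ab(c - 1))$
F: $MS_{subsubgr}/MS_{within}$
expected MS: $\sigma^2 + n\,\sigma_{C \subset B}^2$

SS: $SS_{within} = \sum_a \sum_b \sum_c \sum_n (Y - \bar{Y}_C)^2$
MS: $SS_{within}/(abc(n - 1))$
expected MS: $\sigma^2$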
Unequal sample size

It is best to design your studies such that you have equal sample
sizes in each cell. However, once in a while this is not possible.

In the case of unequal sample sizes, the calculations become really painful (though a computer can do all of the calculations for you).

Even worse, the F tests for the upper levels in the ANOVA table no longer have a clear null distribution.

−→ Maximum likelihood methods are more complicated, but can solve this problem.

Two-way ANOVA
          Treatment

Gender    1      2

Male      709    592
          679    538
          699    476

Female    657    508
          594    505
          677    539

Let
r be the number of rows in the two-way ANOVA,

c be the number of columns in the two-way ANOVA,

n be the number of observations within each of those r×c groups.


A picture

[Figure: response (500 to 700) for Female and Male within treatment 1 and within treatment 2]

All sorts of means

              Treatment

Gender        1        2        Row mean

Male          695.67   535.33   615.50

Female        642.67   517.33   580.00

Column mean   669.17   526.33   597.75

−→ This table shows the cell, row, and column means, plus the overall mean.
Two-way ANOVA table

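source            sum of squares                                                                               df

between rows      $SS_{rows} = c\,n \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$              r − 1

between columns   $SS_{columns} = r\,n \sum_j (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$          c − 1

interaction       $SS_{interaction}$                                                                           (r − 1)(c − 1)

error             $SS_{within} = \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2$                         rc(n − 1)

total             $SS_{total} = \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2$                  rcn − 1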

Example

source sum of squares df mean squares

sex 3781 1 3781

treatment 61204 1 61204

interaction 919 1 919

error 11667 8 1458
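The entries of this table can be reproduced from the raw data. The sketch below assumes the balanced layout of the example (n = 3 observations per cell) and computes the interaction sum of squares from the cell, row and column means; the variable names are illustrative.

```python
# (gender, treatment) -> observations, from the data table above
cells = {
    ("Male", 1): [709, 679, 699],
    ("Male", 2): [592, 538, 476],
    ("Female", 1): [657, 594, 677],
    ("Female", 2): [508, 505, 539],
}
rows, cols, n = ["Male", "Female"], [1, 2], 3
r, c = len(rows), len(cols)

def mean(xs):
    return sum(xs) / len(xs)

cell_mean = {key: mean(v) for key, v in cells.items()}
row_mean = {i: mean([y for j in cols for y in cells[(i, j)]]) for i in rows}
col_mean = {j: mean([y for i in rows for y in cells[(i, j)]]) for j in cols}
grand = mean([y for v in cells.values() for y in v])

ss_rows = c * n * sum((row_mean[i] - grand) ** 2 for i in rows)
ss_cols = r * n * sum((col_mean[j] - grand) ** 2 for j in cols)
ss_inter = n * sum((cell_mean[(i, j)] - row_mean[i] - col_mean[j] + grand) ** 2
                   for i in rows for j in cols)
ss_error = sum((y - cell_mean[key]) ** 2 for key, v in cells.items() for y in v)

# Fixed-effects F ratios: each mean square divided by the error mean square.
ms_error = ss_error / (r * c * (n - 1))
f_rows = (ss_rows / (r - 1)) / ms_error
f_cols = (ss_cols / (c - 1)) / ms_error
f_inter = (ss_inter / ((r - 1) * (c - 1))) / ms_error

print("SS:", round(ss_rows), round(ss_cols), round(ss_inter), round(ss_error))
print("F:", round(f_rows, 2), round(f_cols, 2), round(f_inter, 2))
```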


The ANOVA model

Let Yijk be the kth item in the subgroup representing the ith group
of factor A (r levels) and the jth group of factor B (c levels). We
write

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$

The corresponding analysis of the data is

yijk = ȳ··· + (ȳi·· − ȳ···) + (ȳ·j· − ȳ···) + (ȳij· − ȳi·· − ȳ·j· + ȳ···) + (yijk − ȳij·)

Towards hypothesis testing

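source            mean squares                                                                                                                                      expected mean squares

between rows      $\dfrac{cn \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2}{r - 1}$                                                                  $\sigma^2 + \dfrac{cn}{r - 1} \sum_i \alpha_i^2$

between columns   $\dfrac{rn \sum_j (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2}{c - 1}$                                                                 $\sigma^2 + \dfrac{rn}{c - 1} \sum_j \beta_j^2$

interaction       $\dfrac{n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2}{(r - 1)(c - 1)}$      $\sigma^2 + \dfrac{n}{(r - 1)(c - 1)} \sum_i \sum_j \gamma_{ij}^2$

error             $\dfrac{\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2}{rc(n - 1)}$                                                                         $\sigma^2$

This is for fixed effects, and equal number of observations per cell!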
Example (continued)

source SS df MS F p-value

sex 3781 1 3781 2.6 0.1460

treatment 61204 1 61204 42.0 0.0002

interaction 919 1 919 0.6 0.4503

error 11667 8 1458

Interaction in a 2-way ANOVA model

Let Yijk be the kth item in the subgroup representing the ith group
of factor A (r levels) and the jth group of factor B (c levels). We
write

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$

[Figure: three panels labelled "no interaction", "positive interaction" and "negative interaction", each showing the conditions N, A, B and A+B]


Expected mean squares

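source            fixed effects                                                     random effects                                                mixed effects

between rows      $\sigma^2 + \frac{cn}{r-1} \sum_i \alpha_i^2$                     $\sigma^2 + n\,\sigma_{R \times C}^2 + cn\,\sigma_R^2$        $\sigma^2 + n\,\sigma_{R \times C}^2 + \frac{cn}{r-1} \sum_i \alpha_i^2$

between columns   $\sigma^2 + \frac{rn}{c-1} \sum_j \beta_j^2$                      $\sigma^2 + n\,\sigma_{R \times C}^2 + rn\,\sigma_C^2$        $\sigma^2 + rn\,\sigma_C^2$

interaction       $\sigma^2 + \frac{n}{(r-1)(c-1)} \sum_i \sum_j \gamma_{ij}^2$     $\sigma^2 + n\,\sigma_{R \times C}^2$                         $\sigma^2 + n\,\sigma_{R \times C}^2$

error             $\sigma^2$                                                        $\sigma^2$                                                    $\sigma^2$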

Two-way ANOVA without replicates

                Physician
Concentration   A      B      C

60              9.6    9.3    9.3
80              10.6   9.1    9.2
160             9.8    9.3    9.5
320             10.7   9.1    10.0
640             11.1   11.1   10.4
1280            10.9   11.8   10.8
2560            12.8   10.6   10.7
ANOVA table

source df SS MS
physician 2 2.79 1.39
concentration 6 12.54 2.09
interaction 12 4.11 0.34
total 20

We have 21 observations. That means we have no degrees of freedom left to estimate an error!
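The degrees-of-freedom bookkeeping and the sums of squares for this table can be checked with a short sketch. With one observation per cell, whatever is left after removing the row and column sums of squares from the total is labelled "interaction", and the error degrees of freedom rc(n − 1) are zero. Variable names are illustrative.

```python
# Rows: concentrations 60, 80, 160, 320, 640, 1280, 2560; columns: physicians A, B, C.
table = [
    [9.6, 9.3, 9.3],
    [10.6, 9.1, 9.2],
    [9.8, 9.3, 9.5],
    [10.7, 9.1, 10.0],
    [11.1, 11.1, 10.4],
    [10.9, 11.8, 10.8],
    [12.8, 10.6, 10.7],
]
r, c, n = len(table), len(table[0]), 1   # n = 1: no replicates

grand = sum(sum(row) for row in table) / (r * c)
row_means = [sum(row) / c for row in table]
col_means = [sum(row[j] for row in table) / r for j in range(c)]

ss_conc = c * sum((m - grand) ** 2 for m in row_means)    # between concentrations
ss_phys = r * sum((m - grand) ** 2 for m in col_means)    # between physicians
ss_total = sum((y - grand) ** 2 for row in table for y in row)
ss_inter = ss_total - ss_conc - ss_phys                   # remainder

df_total = r * c * n - 1
df_inter = (r - 1) * (c - 1)
df_error = r * c * (n - 1)

print("SS:", round(ss_phys, 2), round(ss_conc, 2), round(ss_inter, 2))
print("df:", c - 1, r - 1, df_inter, "error df:", df_error, "total df:", df_total)
```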

Expected mean squares

In general, we have:

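source            fixed effects                                                     random effects                                                mixed effects

between rows      $\sigma^2 + \frac{cn}{r-1} \sum_i \alpha_i^2$                     $\sigma^2 + n\,\sigma_{R \times C}^2 + cn\,\sigma_R^2$        $\sigma^2 + n\,\sigma_{R \times C}^2 + \frac{cn}{r-1} \sum_i \alpha_i^2$

between columns   $\sigma^2 + \frac{rn}{c-1} \sum_j \beta_j^2$                      $\sigma^2 + n\,\sigma_{R \times C}^2 + rn\,\sigma_C^2$        $\sigma^2 + rn\,\sigma_C^2$

interaction       $\sigma^2 + \frac{n}{(r-1)(c-1)} \sum_i \sum_j \gamma_{ij}^2$     $\sigma^2 + n\,\sigma_{R \times C}^2$                         $\sigma^2 + n\,\sigma_{R \times C}^2$

error             $\sigma^2$                                                        $\sigma^2$                                                    $\sigma^2$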
Expected mean squares

If n=1 and there is no interaction in truth, we have:

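source            fixed effects                                   random effects                  mixed effects

between rows      $\sigma^2 + \frac{c}{r-1} \sum_i \alpha_i^2$    $\sigma^2 + c\,\sigma_R^2$      $\sigma^2 + \frac{c}{r-1} \sum_i \alpha_i^2$

between columns   $\sigma^2 + \frac{r}{c-1} \sum_j \beta_j^2$     $\sigma^2 + r\,\sigma_C^2$      $\sigma^2 + r\,\sigma_C^2$

error             $\sigma^2$                                      $\sigma^2$                      $\sigma^2$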

Expected mean squares

If n=1 but there is an interaction, we have:

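source            fixed effects                                                     random effects                                           mixed effects

between rows      $\sigma^2 + \frac{c}{r-1} \sum_i \alpha_i^2$                      $\sigma^2 + \sigma_{R \times C}^2 + c\,\sigma_R^2$       $\sigma^2 + \sigma_{R \times C}^2 + \frac{c}{r-1} \sum_i \alpha_i^2$

between columns   $\sigma^2 + \frac{r}{c-1} \sum_j \beta_j^2$                       $\sigma^2 + \sigma_{R \times C}^2 + r\,\sigma_C^2$       $\sigma^2 + r\,\sigma_C^2$

interaction       $\sigma^2 + \frac{1}{(r-1)(c-1)} \sum_i \sum_j \gamma_{ij}^2$     $\sigma^2 + \sigma_{R \times C}^2$                       $\sigma^2 + \sigma_{R \times C}^2$

error             $\sigma^2$                                                        $\sigma^2$                                               $\sigma^2$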
