1.017/1.010 Class 19 Analysis of Variance: Concepts and Definitions
Analysis of Variance
Specify one or more factors that could account for variability (e.g.
location, time, etc.). Each factor is associated with a particular set of
populations or treatments (e.g. particular sampling stations, sampling
days, etc.). One-way analysis of variance (ANOVA) considers only a
single factor.
Suppose a random sample $[x_{i1}, x_{i2}, \ldots, x_{iJ}]$ is obtained for
treatment $i$. There are $I$ treatments, indexed $i = 1, \ldots, I$ (e.g. each
treatment may correspond to a different sampling location).
Each random sample has a CDF $F_{x_i}(x_i)$. The different $F_{x_i}(x_i)$ are
assumed identical except for their means, which may differ. Classical ANOVA
also assumes that all data are normally distributed.
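The objective is to estimate or test the values of the $a_i$'s, which are the
unknown distributional parameters of the $F_{x_i}(x_i)$'s. Each $a_i$ is the
deviation of the mean of treatment $i$ from the overall mean, so the $a_i$'s
measure the effect of the factor.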
If the factor does not affect variability in the data then every $a_i = 0$.
Use the hypothesis test:
$$H_0: a_1 = a_2 = \cdots = a_I = 0$$
or, equivalently,
$$H_0: \sum_{i=1}^{I} a_i^2 = 0$$
Sums-of-Squares Computations
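$$m_{x_i} = \frac{1}{J} \sum_{j=1}^{J} x_{ij} = \bar{x}_{i\cdot}$$
$$m_x = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} x_{ij} = \bar{x}_{\cdot\cdot}$$
$$\begin{aligned}
SST &= \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - m_x)^2 \\
    &= \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - m_{x_i})^2 + \sum_{i=1}^{I} \sum_{j=1}^{J} (m_{x_i} - m_x)^2 \\
    &= SSE + SSTr
\end{aligned}$$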
SST can be divided into error sum-of-squares SSE and treatment sum-
of-squares SSTr.
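The decomposition can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set with one row per treatment and one column per replicate. The values and the names `data` and `sums_of_squares` are illustrative and do not come from these notes.

```python
def sums_of_squares(data):
    """Return (SST, SSE, SSTr) for a balanced one-way layout.

    data[i][j] is replicate j of treatment i; every row must hold J values.
    """
    I = len(data)
    J = len(data[0])
    if any(len(row) != J for row in data):
        raise ValueError("balanced design required: equal replicates per treatment")

    treatment_means = [sum(row) / J for row in data]        # m_xi
    grand_mean = sum(sum(row) for row in data) / (I * J)    # m_x

    sst = sum((x - grand_mean) ** 2 for row in data for x in row)
    sse = sum((x - m) ** 2 for row, m in zip(data, treatment_means) for x in row)
    # The inner sum over j repeats the same term J times, hence the factor J.
    sstr = sum(J * (m - grand_mean) ** 2 for m in treatment_means)
    return sst, sse, sstr


# Hypothetical data: I = 3 treatments, J = 4 replicates each.
data = [
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 9.0, 8.0],
    [1.0, 2.0, 3.0, 2.0],
]
sst, sse, sstr = sums_of_squares(data)
print("SST =", sst, " SSE =", sse, " SSTr =", sstr)
print("SSE + SSTr =", sse + sstr)
```

For any balanced data set the last two printed totals should agree up to rounding, because the cross term in the expansion of SST sums to zero within each treatment.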
$$SSE = \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - m_{x_i})^2$$
$$SSTr = \sum_{i=1}^{I} \sum_{j=1}^{J} (m_{x_i} - m_x)^2$$
$$MSE = \frac{SSE}{I(J-1)}$$
$$MSTr = \frac{SSTr}{I-1}$$
$$E[MSE] = \sigma^2$$
$$E[MSTr] = \sigma^2 + \frac{J}{I-1} \sum_{i=1}^{I} a_i^2$$
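The two expectations can be illustrated by simulation. The sketch below is not part of the original notes: it assumes the additive model x_ij = mu + a_i + e_ij with independent normal residuals e_ij of variance sigma^2, and the values of `a`, `J`, `sigma`, `mu`, `trials` and `seed` are arbitrary choices made for illustration.

```python
import random


def mean_squares(data):
    """Return (MSE, MSTr) for a balanced one-way layout data[i][j]."""
    I, J = len(data), len(data[0])
    means = [sum(row) / J for row in data]
    grand = sum(means) / I  # equals the grand mean because the design is balanced
    sse = sum((x - m) ** 2 for row, m in zip(data, means) for x in row)
    sstr = J * sum((m - grand) ** 2 for m in means)
    return sse / (I * (J - 1)), sstr / (I - 1)


def simulate(a, J, sigma, mu=10.0, trials=10000, seed=0):
    """Average MSE and MSTr over synthetic data sets x_ij = mu + a_i + e_ij."""
    rng = random.Random(seed)
    total_mse = total_mstr = 0.0
    for _ in range(trials):
        data = [[mu + ai + rng.gauss(0.0, sigma) for _ in range(J)] for ai in a]
        mse, mstr = mean_squares(data)
        total_mse += mse
        total_mstr += mstr
    return total_mse / trials, total_mstr / trials


a = [-1.0, 0.0, 0.0, 1.0]   # assumed treatment effects, chosen to sum to zero
J, sigma = 6, 1.0
avg_mse, avg_mstr = simulate(a, J, sigma)
expected_mstr = sigma ** 2 + J / (len(a) - 1) * sum(ai ** 2 for ai in a)
print("average MSE :", avg_mse, " expected:", sigma ** 2)
print("average MSTr:", avg_mstr, " expected:", expected_mstr)
```

The averages should fall close to the theoretical values, with MSTr exceeding MSE only because the assumed a_i's are not all zero. Setting every entry of `a` to zero makes both averages approach sigma^2.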
Test Statistic
$$F(MSE, MSTr) = \frac{MSTr}{MSE}$$
When $H_0$ is true and the $x_{ij}$'s are normally distributed, this statistic
follows an F distribution with $\nu_{Tr} = I - 1$ and $\nu_E = I(J-1)$ degrees
of freedom. Check normality by plotting the residuals $(x_{ij} - m_{x_i})$
with normplot.
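As a rough illustration of these steps, the sketch below computes the F statistic, its degrees of freedom and the residuals used in the normality check. It is written in plain Python rather than MATLAB, and the function name `f_statistic` and the data values are hypothetical.

```python
def f_statistic(data):
    """Return (F, nu_Tr, nu_E, residuals) for balanced data[i][j]."""
    I, J = len(data), len(data[0])
    means = [sum(row) / J for row in data]
    grand = sum(means) / I  # grand mean of a balanced design
    # Residuals x_ij - m_xi are what a normal probability plot would examine.
    residuals = [[x - m for x in row] for row, m in zip(data, means)]
    sse = sum(r ** 2 for row in residuals for r in row)
    sstr = J * sum((m - grand) ** 2 for m in means)
    nu_tr, nu_e = I - 1, I * (J - 1)
    return (sstr / nu_tr) / (sse / nu_e), nu_tr, nu_e, residuals


# Hypothetical data: I = 3 treatments (rows), J = 4 replicates (columns).
data = [
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 9.0, 8.0],
    [1.0, 2.0, 3.0, 2.0],
]
F, nu_tr, nu_e, residuals = f_statistic(data)
print("F =", F, " nu_Tr =", nu_tr, " nu_E =", nu_e)
```

The returned residuals sum to zero within each treatment, so only their spread and shape carry information about normality.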
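One-sided p-value:
$$p = 1 - F_{F,\nu_{Tr},\nu_E}[F(MSE, MSTr)]$$
where $F_{F,\nu_{Tr},\nu_E}$ is the CDF of the F distribution with $\nu_{Tr}$
and $\nu_E$ degrees of freedom.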
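A library routine (for example MATLAB's fcdf) would normally be used to evaluate the F CDF. To show what the p-value formula computes, the sketch below approximates the upper tail in plain Python. It relies on the standard identity between the F upper tail and the regularized incomplete beta function, evaluated here by Simpson integration. The function name `f_upper_tail` and the step count `n` are assumptions made for illustration, and the routine is only intended for F > 0 and nu_E >= 2.

```python
import math


def f_upper_tail(f_value, nu_tr, nu_e, n=2000):
    """Approximate p = 1 - F_cdf(f_value) for an F(nu_tr, nu_e) distribution.

    Uses p = I_x(nu_e/2, nu_tr/2) with x = nu_e / (nu_e + nu_tr * f_value),
    where I_x is the regularized incomplete beta function.
    """
    if f_value <= 0 or nu_e < 2 or n % 2:
        raise ValueError("requires f_value > 0, nu_e >= 2 and an even n")
    a, b = nu_e / 2.0, nu_tr / 2.0
    x = nu_e / (nu_e + nu_tr * f_value)

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Composite Simpson rule on [0, x].
    h = x / n
    total = integrand(0.0) + integrand(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    integral = total * h / 3.0

    # Normalize by the complete beta function B(a, b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)


# F value and degrees of freedom taken from the worked example in these notes.
p = f_upper_tail(29.8, 3, 20)
print("p =", p)
```

With the example's F = 29.8, nu_Tr = 3 and nu_E = 20, the result should be on the order of 1E-7, in line with the tabulated p value.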
Source      SS    df            MS               F             p
Treatments  SSTr  νTr = I-1     MSTr = SSTr/νTr  F = MSTr/MSE  p = 1 - F_{F,νTr,νE}(F)
Error       SSE   νE = I(J-1)   MSE = SSE/νE
Total       SST   νT = IJ-1     MST = SST/νT
The MATLAB anova1 function derives the error and treatment sums of
squares and computes the p-value. When using anova1, be sure to
transpose the data array if it is stored with treatments in rows (MATLAB
requires treatments in columns and replicates in rows).
Source      SS       df  MS=SS/df  F     p
Treatments  47.1642   3  15.7214   29.8  1.4E-7
Error       10.5518  20   0.5276
Total       57.716   23
The very low p-value indicates that seasonality is highly significant in this
case. Note that MSTr, which depends on the $a_i$'s, is much larger than MSE.
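The arithmetic behind the example table can be retraced from its SS and df columns. The short Python sketch below uses only the tabulated numbers; the variable names are illustrative. It also recovers the layout implied by the degrees of freedom, assuming a balanced design.

```python
# Values copied from the SS and df columns of the example table.
ss_tr, ss_e = 47.1642, 10.5518
df_tr, df_e = 3, 20

ms_tr = ss_tr / df_tr      # MSTr = SSTr / nu_Tr
ms_e = ss_e / df_e         # MSE  = SSE / nu_E
f = ms_tr / ms_e           # F    = MSTr / MSE
ss_t = ss_tr + ss_e        # SST  = SSTr + SSE
df_t = df_tr + df_e        # nu_T = nu_Tr + nu_E

# Balanced layout implied by nu_Tr = I - 1 and nu_E = I * (J - 1).
I = df_tr + 1
J = df_e // I + 1

print("MSTr =", round(ms_tr, 4), " MSE =", round(ms_e, 4), " F =", round(f, 1))
print("SST =", round(ss_t, 3), " nu_T =", df_t, " I =", I, " J =", J)
```

The degrees of freedom correspond to I = 4 treatments with J = 6 replicates each, which gives the IJ - 1 = 23 total degrees of freedom shown in the table.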
[Figure: F CDF, νTr = 3, νE = 5]