Ssta032 Guide 2024
UNIVERSITY OF LIMPOPO
SSTA032
Contents
4 Balanced Incomplete Block Designs 34
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2 Analysis Of Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Pairwise Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . 35
6 Factorial Experiments 50
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.1.1 Interaction in Factorial Experiments . . . . . . . . . . . . . . 51
6.2 Analysis of Factorial Experiments . . . . . . . . . . . . . . . . . . . . 51
6.2.1 Two factor factorial Design in a CRD . . . . . . . . . . . . . . 51
6.2.2 Two factor factorial Design in a RBD . . . . . . . . . . . . . . 58
6.3 2k Factorial Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.3.1 22 Factorial experiment . . . . . . . . . . . . . . . . . . . . . . 63
6.4 23 Factorial Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
8 QUESTIONS 77
9 Formulae 83
Chapter 1
1.1 Introduction
Experimental Design concerns the arrangement of various conditions or situations
to which experimental subjects (for example, people or rats) will be exposed. The
process of designing an experiment for comparing treatment or factor level means
begins by stating the objective(s) of the experiment clearly. The statement of the
objectives indicates to us what measurements are to be made (how, when, where) and
on what. Precise and accurate comparisons among treatments over an appropriate
range of conditions are the primary objectives of most experiments. These objectives
require precise estimates of means and powerful statistical tests. Reduced experimental
errors increase the possibility of achieving these objectives. Local control describes
the actions an investigator employs to reduce or control experimental errors, increase
accuracy of observations, and establish the inference base of a study. The investigator
controls the:
• Technique
• Measure of covariates
Example 1.1.1 To compare the mean weight gains of steers that are fed diets A and
B.
Example 1.1.2 To compare the mean weight gains of two year old Holstein steers
that are fed diets A and B for a period of six months.
The objective of the experiment in example 1.1.1 is vague. Why? On the other
hand, the objective of the experiment in example 1.1.2 is specific. It is clear from the
statement of the objective that the experiment should be conducted as follows:
3. Assign one group of steers to diet A and the other group to diet B.
4. Feed the steers with their respective diets for six months and then weigh them.
Example 1.1.2 shows that a clear statement of the objectives of an experiment specifies
2. the set of experimental units (two-year old Holstein steers) to be used; and
Definitions
Definition 1.1.1 A treatment is the level or class of a factor whose effects are to
be investigated.
Example 1.1.3 Refer to example 1.1.2. The factor is diet and the levels of the
factor (or treatments) are A and B.
Example 1.1.4 Suppose that we wish to compare the mean yield of maize varieties
X and Y grown under the same management and climatic conditions. In this case
Variety is the factor and the maize varieties X and Y are the factor levels.
The unit may be a plot of land, a patient in a hospital, or a lump of dough, or it may
be a group of pigs in a pen, or a batch of seed.
Example 1.1.5 Refer to example 1.1.2. If the two year old Holstein steers are
individually fed their assigned diets, then the steers are the experimental units. If
the steers are group fed their assigned diets, then the groups are the experimental
units.
Example 1.1.6 Refer to example 1.1.4. If the varieties are assigned individual
plots, then each plot on which a variety is grown is the experimental unit.
Example 1.1.7 Refer to example 1.1.2. The response variable is the weight gain of
a steer.
Example 1.1.8 Refer to example 1.1.4. The response variable is the yield per unit
area of the plot.
2. Identify the factor we wish to study. What are the levels of the factor?
1.1.1 Randomisation
Refer to example 1.1.4. If the mean yields of the two varieties are actually the same,
what would be our conclusion from the experiment in which we assign fertile plots
to variety X and poor plots to variety Y? The answer to this question is simple.
Our experimental procedure would favour variety X. We would make the erroneous
conclusion that variety X has a higher yield than variety Y when in actual fact they
have the same yield. If allocation of the plots to the varieties is done (as above)
deliberately, then the bias in our conclusion is called subjective bias or bias due to
deliberate selection, otherwise, the bias is called systematic bias.
of analysing data including ANOVA.
Example 1.1.9 Suppose that we have developed a new diet A which we believe is
better than the existing diet B in terms of increasing the daily weight gain of steers
(of the same age) fed the diets. Furthermore suppose that four two year old Brahman
steers and four two year old Nguni steers are available for the experiment to verify our
claim. Known or unknown to us is that naturally Brahman steers grow faster than
Nguni steers of the same age raised under the same environmental conditions. If in
actual fact diet A is as good as diet B, what will be the conclusion from the experiment
whereby we assign Brahman steers to diet A and Nguni steers to diet B?
1.2 Replication
We define a basic experiment as one in which only one experimental unit is as-
signed to each treatment. Thus, each treatment appears once in a basic experiment.
Example 1.2.1 Suppose that we wish to compare the effects of two treatments (T1, T2)
on some response. Furthermore, suppose that we have 2n (n ≥ 1 an integer) identical
experimental units available for experimentation. The plan of conduct of the experi-
ment is to randomly allocate n experimental units to T1 and the remainder to T2 . The
experiment for n = 1 is our basic experiment, for n = 2 we have two replications of
our basic experiment etc.
Recall that the pooled t-test assumptions are that the errors in the Yij's are
independent and normally distributed with mean 0 and variance σ² (unknown). The
σ² is a measure of the experimental error and it is estimated by the pooled variance
which, in this case, is given by:

Sp² = (S1² + S2²)/2
What happens to Sp² if we do not replicate the basic experiment?
The estimate of the difference between the T1 mean and the T2 mean is given by:
Ȳ1. − Ȳ2.  with variance  σ12² = 2σ²/n

The variance σ12² is estimated by σ̂12² = (2/n)Sp². The σ12² is a measure of precision or
the reliability of Ȳ1. − Ȳ2. in estimating the difference between the treatment means.
The estimate is precise or reliable if σ12² is small. The precision or the reliability of the
estimate improves as we increase the number of replications of the basic experiment
since σ12² → 0 as n → ∞. The width of the confidence interval for the difference
between treatment means decreases as n increases. That is, the confidence interval
becomes more and more precise as n is increased.
If the pooled t-test assumptions (specified above) hold, then the appropriate test
statistic is given by:

t = (Ȳ1. − Ȳ2.)/σ̂12,

which has a Student t distribution with 2n − 2 degrees of freedom. If the variance of
the errors σ² is known, then the appropriate test is the z-test. The test statistic for
the z-test is given by:

z = (Ȳ1. − Ȳ2.)/σ12,

which has a standard normal distribution. Both the t-test and the z-test reject
H0 in favour of H1 if the values of |t| and |z| are large.
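The replicated basic experiment above can be checked with a short Python sketch. The data below are hypothetical, and the function name `pooled_t` is ours; the calculation follows the pooled-variance and t-statistic formulas of this section:

```python
from math import sqrt

def pooled_t(sample1, sample2):
    """Pooled two-sample t statistic for a replicated basic experiment
    with equal group sizes n and a common unknown variance."""
    n = len(sample1)
    m1 = sum(sample1) / n
    m2 = sum(sample2) / n
    s1sq = sum((y - m1) ** 2 for y in sample1) / (n - 1)
    s2sq = sum((y - m2) ** 2 for y in sample2) / (n - 1)
    sp_sq = (s1sq + s2sq) / 2          # pooled variance Sp^2
    se = sqrt(2 * sp_sq / n)           # estimated sd of Ybar1 - Ybar2
    return (m1 - m2) / se, 2 * n - 2   # t statistic and its 2n - 2 df

# Hypothetical responses for two treatments, n = 4 replications each
t_stat, df = pooled_t([12.1, 11.8, 12.5, 12.0], [11.0, 11.4, 10.8, 11.2])
```

With n = 4 per treatment the statistic has 2n − 2 = 6 degrees of freedom, and replicating further (larger n) shrinks the standard error, as the section argues.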
A measure of the sensitivity of a test for comparing treatment means is the power
of the test to detect differences between treatment means. The power of a test is
the probability of rejecting H0 (treatment means are not different) when H1 is true
(treatment means are different).
1.3 Blocking
Blocking an experiment refers to arranging the experimental units into groups (called
blocks) within each of which the experimental units are relatively homogeneous with
respect to one or more characteristics of the units that may influence the response
of interest. Randomisation is then done independently within each block. Note that
blocking may also be based on external variables (variables that may influence the
response) associated with the experimental setting, e.g. observer if two or more
people perform the experiment.
Blocking an experiment allows us to account for the variation in the responses that
is due to differences among the experimental units. If we block an experiment using
external variables such as time or observer, then blocking allows us to account for
the variation in responses that is due to these external variables. The consequence of
not blocking when we are supposed to block is that the variation due to differences
among the experimental units or due to the external variables cannot be separated
from that due to the random errors. This results in an estimate of the experimental
error (σ 2 ) that is biased upwards.
Example 1.3.1 Suppose that we wish to compare the effects of diet A and diet B
on the daily weight gain of steers (of the same age) fed the diets. Furthermore,
suppose that four two year old Afrikaner steers and four two year old Nguni steers
are available for the experiment. How can we design a simple experiment that can
allow us to remove the between breed variation from the experimental error?
1. Technique
5. Measure of covariates
Chapter 2
2.1 Introduction
In a Completely Randomised Design (CRD) the experimental units are assigned to
the treatments completely at random. Complete randomisation ensures that every
unit has an equal chance of being assigned any one of the treatments. The value of
complete randomisation is insurance against subjective and systematic biases. If we
have n = tr experimental units and t treatments, then we randomly assign r units to
each of the t treatments to obtain a balanced completely randomised design.
We use a CRD when both the experimental units and experimental conditions are
uniform. That is, when the experimental units are identical with respect to their
characteristics that can affect the response and there are no external variables
associated with the experimental setting that can also affect the response.
2. The number of experimental units (sample size) can be varied from treatment
to treatment without complicating the analysis of the experiment.
3. The statistical analysis of the experiment (ANOVA and estimation) is easy even
when:
2.3 Disadvantages of using CRD
1. Although CRD can be used for any number of treatments, it is best suited for
situations where there are relatively few treatments.
Example 2.3.1 Suppose that you want to compare the effects of three types of fer-
tiliser (X, Y, Z) on a certain variety of maize. Furthermore, suppose that 9 plots of
the same size are available for the experiment.
1. List at least two characteristics of the plots that can affect the response of in-
terest.
2. List at least two environmental factors that can affect the response of interest.
3. List at least two management practices that can affect the response of interest
besides the method and the level of application of the fertilisers.
4. In view of your answers to (1)-(3), under what conditions would a CRD
be appropriate for the experiment?
2.4.1 ANOVA
General Model
The general linear statistical model for a CRD has the form:

Yij = µ + τi + ǫij  (2.1)

Where
• µ is the overall population mean;
• τi is the effect of the ith treatment;
• the ǫij's are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ²; and
• i = 1, 2, ..., t; j = 1, 2, ..., r.
Sum of Squares
The total variability in the observations (Yij's) is measured using the total sum of
squares and is given by:
SSTO = Σᵢ₌₁ᵗ Σⱼ₌₁ʳ (Yij − Ȳ..)² = Σᵢ₌₁ᵗ Σⱼ₌₁ʳ Yij² − (1/tr)Y..²  with tr − 1 df  (2.2)
It is possible to partition the total sum of squares into two separate sources of
variability, i.e.
1. due to the variability among treatments (treatment sum of squares (SST)), and
2. due to the variability among the Yij's which is not accounted for by treatments
(error sum of squares (SSE)).
The partition can be done as follows:

SSTO = Σᵢ₌₁ᵗ Σⱼ₌₁ʳ (Yij − Ȳ..)²
= Σᵢ₌₁ᵗ Σⱼ₌₁ʳ [(Ȳi. − Ȳ..) + (Yij − Ȳi.)]²
= Σᵢ₌₁ᵗ Σⱼ₌₁ʳ (Ȳi. − Ȳ..)² + Σᵢ₌₁ᵗ Σⱼ₌₁ʳ (Yij − Ȳi.)² + 2 Σᵢ₌₁ᵗ Σⱼ₌₁ʳ (Yij − Ȳi.)(Ȳi. − Ȳ..)

where the first term is SST, the second term is SSE, and the cross-product term
equals 0.
The sum of squares formulas just discussed are mathematical and are not convenient
to use in calculations. The computational formulae are given by:

SST = (1/r) Σᵢ₌₁ᵗ Yi.² − (1/tr)Y..²  with t − 1 df
SSE = SSTO − SST  with t(r − 1) df
The mean sum of squares are obtained by dividing the sums of squares by the corre-
sponding degrees of freedom and they are used for testing the hypothesis about the
treatment effects.
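The computational formulae for a balanced CRD can be sketched in Python. The helper below is illustrative only (its name and the data passed to it are ours, not from the text):

```python
def crd_anova(groups):
    """Balanced CRD sums of squares via the computational formulae.
    `groups` is a list of t lists, each holding r responses."""
    t, r = len(groups), len(groups[0])
    n = t * r
    grand = sum(sum(g) for g in groups)                 # Y..
    cf = grand ** 2 / n                                 # correction factor Y..^2 / tr
    ssto = sum(y ** 2 for g in groups for y in g) - cf  # total SS,     tr - 1  df
    sst = sum(sum(g) ** 2 for g in groups) / r - cf     # treatment SS, t - 1   df
    sse = ssto - sst                                    # error SS,     t(r-1)  df
    mst, mse = sst / (t - 1), sse / (t * (r - 1))
    return {"SSTO": ssto, "SST": sst, "SSE": sse, "F": mst / mse}

# Tiny hypothetical example: t = 2 treatments, r = 3 replications
result = crd_anova([[1, 2, 3], [2, 3, 4]])
```

For this toy data the treatment totals are 6 and 9, so SST = (6² + 9²)/3 − 15²/6 = 1.5, and the F ratio is MST/MSE.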
Hypothesis Testing
We use model I to analyse a CRD when the conclusions of the experiment are to
pertain to the particular set of treatments included in the experiment i.e conclusions
cannot be extended to any other treatments that were not included in the experiment.
The objective is to test the null hypothesis of no difference among the treatment
means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τi ≠ 0 (at least one of the treatment means differs from the rest)
To show that F = MST/MSE is a reasonable test statistic for the above hypothesis
we need to show that MSE is an unbiased estimate of the experimental error (σ²), i.e.

E[MSE] = σ²

Under model I, we can also show that

E[MST] = σ² + (1/(t − 1)) Σᵢ₌₁ᵗ rτi²

When testing the hypothesis H0 versus H1 at the α level of significance we note that
when H0 is true, both MST and MSE are unbiased estimates of σ². If H1 is true,
MSE is still an unbiased estimate of σ², but MST is an unbiased estimate of

σ² + (1/(t − 1)) Σᵢ₌₁ᵗ rτi² > σ²
H0 : µi. = µi′.
H1 : µi. ≠ µi′. ∀ i ≠ i′
NOTE: Comparisons should be made at the same level of significance as that used
in the ANOVA. If we choose to use the least significant difference method for making
the comparisons, then the hypotheses are tested using the t-test whose test statistic is
given by:

t = (Ȳi. − Ȳi′.)/√(2MSE/r)

where √(2MSE/r) is the standard error of Ȳi. − Ȳi′.. The test statistic has a Student
t-distribution with t(r − 1) degrees of freedom. We declare means to be significantly
different at the α level of significance if:

|Ȳi. − Ȳi′.|/√(2MSE/r) > t_{α/2, t(r−1)}

or equivalently

|Ȳi. − Ȳi′.| > t_{α/2, t(r−1)} √(2MSE/r)

The least significant difference quantity is:

LSD = t_{α/2, t(r−1)} √(2MSE/r)

The LSD procedure simply compares the observed absolute difference between each
pair of averages to the corresponding LSD. The means µi. and µi′. are declared sig-
nificantly different if

|Ȳi. − Ȳi′.| > LSD.
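The LSD procedure can be sketched as follows. The critical value t_{α/2, t(r−1)} is supplied by the caller from a t table (here t_{0.025,12} ≈ 2.179); the function names and the numerical inputs are illustrative:

```python
from math import sqrt

def lsd(mse, r, t_crit):
    """Least significant difference for a balanced CRD.
    `t_crit` is the two-sided critical value t_{alpha/2, t(r-1)}."""
    return t_crit * sqrt(2 * mse / r)

def significantly_different(mean_i, mean_j, mse, r, t_crit):
    """Declare two treatment means different if |difference| exceeds the LSD."""
    return abs(mean_i - mean_j) > lsd(mse, r, t_crit)

# Hypothetical values: MSE = 7.15 on 12 df, r = 5, t_{0.025,12} = 2.179
value = lsd(7.15, 5, 2.179)
```

With these inputs the LSD is about 3.69, so two treatment averages would need to differ by more than that to be declared different at the 0.05 level.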
Example 2.4.1 A study was undertaken to compare the distance travelled in kilome-
tres per litre of three competing brands of petrol. Fifteen identical cars were available
for the experiment. The brands of petrol i.e brand A, B, and C were each assigned
five cars. The cars were operated under the same conditions and the distance trav-
elled by each car per litre of the assigned brand of petrol was recorded. The results are
displayed in the following table. Analyse the data and draw appropriate conclusions.
Replication
Brand 1 2 3 4 5
A 9.5 11.0 13.0 15.0 18.0
B 10.5 12.0 14.0 16.0 10.5
C 10.0 10.5 13.5 14.5 10.0
1. Assume that model I is appropriate for the experiment, test the relevant hypothesis
at the 0.05 level of significance.
3. If you are to perform pairwise comparisons of the brand means, what value of
LSD would you use?
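As a numerical check on question 1, the ANOVA quantities for the petrol data can be computed in Python. This is a sketch of the hand calculation; the critical value F_{0.05; 2, 12} ≈ 3.89 quoted in the comment is read from standard F tables:

```python
# Distance per litre for brands A, B, C (t = 3 treatments, r = 5 cars each)
data = {
    "A": [9.5, 11.0, 13.0, 15.0, 18.0],
    "B": [10.5, 12.0, 14.0, 16.0, 10.5],
    "C": [10.0, 10.5, 13.5, 14.5, 10.0],
}
t, r = len(data), 5
n = t * r
grand = sum(sum(v) for v in data.values())            # Y.. = 188.0
cf = grand ** 2 / n                                   # correction factor
ssto = sum(y ** 2 for v in data.values() for y in v) - cf
sst = sum(sum(v) ** 2 for v in data.values()) / r - cf
sse = ssto - sst
mst, mse = sst / (t - 1), sse / (t * (r - 1))
f = mst / mse
# f is far below F_{0.05; 2, 12} (about 3.89), so H0 is not rejected:
# there is no evidence of a difference among the brand means.
```

The same MSE = SSE/t(r − 1) computed here is the quantity needed for the LSD in question 3.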
2.4.3 Random Effects Model (Model II)
For an experiment where the treatments or factor levels are randomly chosen from
a population of treatments or factor levels, the appropriate model is a random
effects model (Model II). Model II is also model (2.1) but with the:
• τi's random variables which are independent and normally distributed with mean
zero and variance στ²; and
• ǫij's and τi's assumed to be independent random variables.
Hypothesis Testing
We wish to test the hypothesis

H0 : στ² = 0
H1 : στ² > 0

Under model II, E[MSE] = σ² and E[MST] = σ² + rστ², so that

στ² = (E[MST] − E[MSE])/r

Therefore the most obvious estimate of στ² is given by:

σ̂τ² = max[0, (MST − MSE)/r]
Example 2.4.2 To compare the lightning discharge intensities at all areas in South
Africa a CRD was used. Three areas were randomly chosen for the study and the
lightning tracking equipment assembled at those areas. On each day of the month of
December between 0800hrs and 1700hrs lightning was monitored at the three areas
until the maximum intensity had been recorded for five separate storms. The sample
data is in the following table. Analyse the data and draw appropriate conclusions.
1. Assume that model II is appropriate for the experiment, test the relevant hypoth-
esis at the 0.05 level of significance.
Intensity
Area 1 2 3 4 5
A 20 1050 3200 5600 50
B 4300 70 2560 3650 80
C 100 7700 8500 2960 3340
The ith treatment total and the estimate of the overall mean (µ) are given by:

Yi. = Σⱼ₌₁^ri Yij  and  Ȳ.. = (1/n) Σᵢ₌₁ᵗ Σⱼ₌₁^ri Yij

respectively.
The formulae to calculate the sums of squares are given by:

SSTO = Σᵢ₌₁ᵗ Σⱼ₌₁^ri Yij² − (1/n)Y..²  with n − 1 df
SST = Σᵢ₌₁ᵗ (1/ri)Yi.² − (1/n)Y..²  with t − 1 df  (2.4)
SSE = SSTO − SST  with n − t df
The mean sum of squares are obtained by dividing the sums of squares by the corre-
sponding degrees of freedom and they are used for testing the hypothesis about the
treatment effects.
The Analysis of variance table for a completely randomised design with missing ob-
servations is as follows:
The LSD for comparing two treatment means is given by

LSD = t_{α/2, n−t} √(MSE(1/ri + 1/ri′))
Table 2.3: ANOVA table for a CRD with missing observations
Source df SS MS F
Treatments t − 1 SST M ST M ST /M SE
Error n − t SSE M SE
Totals n − 1 SST O
Example 2.5.1 Refer to example 2.4.1. Calculate the residuals for the data.
Normality Assumption
A histogram of the residuals can help to detect skewness in the distribution of the
residuals. We expect the histogram to be approximately bell shaped about zero if the
distribution of the errors is indeed normal with mean 0. We can also use a normal
probability plot to detect non-normality of the errors. If the error distribution is
normal then this plot should be approximately straight. To construct the probability
plot we follow the following steps:
1. Compute the standardised residuals by dividing each residual by √MSE.
2. Order the standardised residuals from smallest to largest.
3. Calculate the cumulative probability point (pi) for the ith ordered stan-
dardised residual using the formula pi = (i − 1/2)/n where n = rt.
4. Plot the value of the ith ordered standardised residual on the horizontal axis
against pi on the vertical axis. The vertical axis goes from 0.01 to 0.99 and this
corresponds to the cumulative distribution function of a normal distribution.
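The steps above can be sketched as a small helper that returns the plot coordinates (the residuals and MSE in the usage line are hypothetical):

```python
from math import sqrt

def probability_plot_points(residuals, mse):
    """Coordinates for the normal probability plot: the sorted
    standardised residuals paired with p_i = (i - 1/2)/n."""
    n = len(residuals)
    standardised = sorted(e / sqrt(mse) for e in residuals)  # steps 1 and 2
    return [(d, (i - 0.5) / n)                               # step 3
            for i, d in enumerate(standardised, start=1)]

# Hypothetical residuals from a CRD fit with MSE = 4.0
pts = probability_plot_points([-2.0, 1.0, 0.5, -1.5, 2.0], 4.0)
```

Plotting the first coordinate against the second (step 4) should give a roughly straight line when the errors are normal.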
NOTE that: Moderate departures from normality of the error distribution do not
seriously invalidate the conclusions of the analysis of variance of the fixed effects
model. They only cause the level of significance to differ from the specified value
and the tests to be slightly less powerful in detecting treatment differences.
Conclusions of the ANOVA of the random effects model are more seriously
affected by non-normality in the sense that the estimates of the variance components
may be inaccurate.
Independence Assumption
When responses are measured at successive time points, the errors in the responses
may become related through time. The assumption of independent errors would
be violated if this happens. We detect violations of this assumption by plotting the residuals
versus time. If the plot shows any obvious pattern, then this would imply that the
assumption of independence is violated. Once this assumption is violated, all the
conclusions of the analysis of variance become invalid.
Chapter 3
1. The experimental units are arranged into groups (called blocks) in such a
way that within each block, the experimental units are relatively homogeneous
with respect to one or more characteristics of the units that may influence the
response of interest.
Blocking allows us to remove the variation due to the differences among the blocks
from the experimental error. Complete randomisation within each block offers insur-
ance against subjective and systematic biases, thereby making the conclusions from
the experiment more accurate.
Advantages of an RBD
1. It can be used to accommodate any number of treatments and replications in any
number of blocks.
2. If blocking is effective, then the RBD can provide more precise conclusions than
a CRD that uses the same experimental units.
3. The statistical analysis is simple (two way ANOVA) even when an entire block
or treatment is dropped.
Disadvantages of an RBD
1. Since the experimental units within a block must be homogeneous, the design
is best suited for a relatively small number of treatments.
2. The statistical analysis of the experiment is complicated if there are some miss-
ing observations within blocks.
3. Some degrees of freedom for the experimental error are lost to blocking.
4. The model for the RBD experiment is more complicated than that for a CRD
experiment, and requires more assumptions.
Example 3.1.1 A study was undertaken to compare the starting salaries of bach-
elor’s degree candidates at the University of Limpopo from the school of computa-
tional and mathematical sciences for the academic years 2004-2005, 2005 -2006 and
2006-2007. Three students from mathematics, three students from statistics and three
students from computer science were available for the experiment. It should be noted
that only those students who had accepted a job were considered in this study.
2. Which design, CRD or RBD, is appropriate for the experiment? Give reasons
for your answer.
3.2 Analysis of the RBD
A randomised block design can be used to compare t population treatment means
when an additional source of variability (blocks) is present.
General model
Both the fixed effects model (model I) and the random effects model (model
II) for the RBD have the form:

Yij = µ + τi + βj + ǫij  (3.1)

Where
• µ is the overall population mean;
• τi is the effect of the ith treatment;
• βj is the effect of the jth block;
• the ǫij's are random errors associated with the response on treatment i, block
j, which are assumed to be independent and normally distributed with mean 0
and variance σ²; and
• i = 1, 2, ..., t; j = 1, 2, ..., b.
• Yij is the response to the ith treatment in the jth block;
• Yi. = Σⱼ₌₁ᵇ Yij and Ȳi. = (1/b)Yi. are the ith treatment total and treatment mean
respectively; and
• Y.j = Σᵢ₌₁ᵗ Yij and Ȳ.j = (1/t)Y.j are the jth block total and block mean respec-
tively.
Sum of Squares
The total variability in the observations (Yij's) is measured using the total sum of
squares and is given by:
SSTO = Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ (Yij − Ȳ..)² = Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Yij² − (1/n)Y..²  with n − 1 df  (3.2)
It is possible to partition the total sum of squares into three separate sources of
variability i.e.
2. due to the variability among the blocks (block sum of squares (SSB))and
3. due to the variability among the Yij's which is not accounted for by either treat-
ments or blocks (error sum of squares (SSE)).
The sum of squares formulas just discussed are mathematical and are not convenient
to use in calculations. The computational formulae are given by:

SST = (1/b) Σᵢ₌₁ᵗ Yi.² − (1/n)Y..²  with t − 1 df
SSB = (1/t) Σⱼ₌₁ᵇ Y.j² − (1/n)Y..²  with b − 1 df
SSE = SSTO − SST − SSB  with n − t − b + 1 = (t − 1)(b − 1) df
The mean sum of squares are obtained by dividing the sums of squares by the corre-
sponding degrees of freedom and they are used for testing the hypothesis about the
treatment and the block effects.
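The RBD computational formulae can be sketched in Python (an illustrative helper with a tiny hypothetical data grid):

```python
def rbd_anova(table):
    """Randomised block design sums of squares via the computational
    formulae; `table` is a t x b grid, rows = treatments, cols = blocks."""
    t, b = len(table), len(table[0])
    n = t * b
    grand = sum(sum(row) for row in table)
    cf = grand ** 2 / n
    ssto = sum(y ** 2 for row in table for y in row) - cf     # n - 1 df
    sst = sum(sum(row) ** 2 for row in table) / b - cf        # t - 1 df
    ssb = sum(sum(row[j] for row in table) ** 2
              for j in range(b)) / t - cf                     # b - 1 df
    sse = ssto - sst - ssb                                    # (t-1)(b-1) df
    return ssto, sst, ssb, sse

# Hypothetical 2 x 2 grid: perfectly additive rows and columns
ssto, sst, ssb, sse = rbd_anova([[1, 2], [3, 4]])
```

For this additive toy grid the treatment and block effects account for all the variability, so SSE is 0.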
Hypothesis Testing
The objective is to test the null hypothesis of no difference among the treatment
means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τi ≠ 0 (at least one of the treatment means differs from the rest)

We can also test the null hypothesis of no difference among the block means, i.e.

H0 : β1 = β2 = ... = βb = 0
H1 : at least one βj ≠ 0 (at least one of the block means differs from the rest)
The test statistic for the hypothesis about the block effects is given by:

F = MSB/MSE  (3.4)

where MSB and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with b − 1 degrees of freedom
on the numerator and (t − 1)(b − 1) degrees of freedom on the denominator. We
conclude that blocking was effective at the α level of significance if F > F_{α; b−1, (t−1)(b−1)}.
The Analysis of variance table for a randomised block design is as follows:
H0 : τi. = τi′.
H1 : τi. ≠ τi′. ∀ i ≠ i′
NOTE: Comparisons should be made at the same level of significance as that used
in the ANOVA. If we choose to use the least significant difference method for making
the comparisons, then the least significant difference is given by:

LSD = t_{α/2, (t−1)(b−1)} √(2MSE/b)
Example 3.2.1 A study was undertaken to compare the starting salaries of bach-
elor’s degree candidates at the University of Limpopo from the school of computa-
tional and mathematical sciences for the academic years 2004-2005, 2005 -2006 and
2006-2007. Three students from mathematics, three students from statistics and three
students from computer science were available for the experiment. It should be noted
that only those students who had accepted a job were considered in this study.
Curriculum
Year Mathematics Statistics Computer Science
2004-2005 10.6 12.0 11.0
2005-2006 9.0 15.0 12.0
2006-2007 12.0 17.4 13.0
1. Assume that model I is appropriate for the experiment, test the relevant hypothesis
at the 0.05 level of significance.
3. If you are to perform pairwise comparisons of the yearly salary means, what
value of LSD would you use?
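As a numerical check on question 1, the RBD calculation for the salary data can be sketched in Python (years are treated as treatments and curricula as blocks; the critical value F_{0.05; 2, 4} ≈ 6.94 in the comment is read from standard F tables):

```python
# Starting salaries: rows = academic years (treatments), cols = curricula (blocks)
table = {
    "2004-2005": [10.6, 12.0, 11.0],
    "2005-2006": [9.0, 15.0, 12.0],
    "2006-2007": [12.0, 17.4, 13.0],
}
t = b = 3
n = t * b
grand = sum(sum(row) for row in table.values())             # Y.. = 112.0
cf = grand ** 2 / n
ssto = sum(y ** 2 for row in table.values() for y in row) - cf
sst = sum(sum(row) ** 2 for row in table.values()) / b - cf   # years
ssb = sum(sum(row[j] for row in table.values()) ** 2
          for j in range(b)) / t - cf                         # curricula
sse = ssto - sst - ssb
f_treat = (sst / (t - 1)) / (sse / ((t - 1) * (b - 1)))
# f_treat is below F_{0.05; 2, 4} (about 6.94), so H0 is not rejected
# at the 0.05 level: no evidence of a difference among the yearly means.
```

The MSE = SSE/(t − 1)(b − 1) computed here is also the quantity needed for the LSD in question 3.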
• τi's random variables which are independent and normally distributed with mean
zero and variance στ²;
• βj's random variables which are independent and normally distributed with
mean zero and variance σβ²;
Hypothesis Testing
We wish to test the hypothesis H0 : στ² = 0 versus H1 : στ² > 0 about the treatment
effects, and similarly H0 : σβ² = 0 versus H1 : σβ² > 0 about the block effects.
NOTE Model II is used when the set of treatments and the set of blocks used
in the RBD experiment are random samples from their respective treatment and
block populations, and the conclusions of the experiment are to be extended to these
populations.
3.2.3 Checking model assumptions
The graphical methods presented in section 2.4.2 apply in this case and subsequent
cases. However, in this case the residuals for model (3.1) are given by:

eij = Yij − Ŷij

where

Ŷij = Ȳi. + Ȳ.j − Ȳ..

Ŷij is the predicted mean response to the ith treatment in the jth block, i.e. is an
estimate of µij.
Definition 3.3.1 Two factors A and B are said to interact if the difference in mean
responses for two levels of one factor is not constant across levels of the second factor.
The presence of block by treatment interaction effects tends to inflate the estimate
of the experimental error and this has the effect of making the tests for comparing the
treatment means insensitive to treatment differences. To account for the variation in
the responses that is due to the block by treatment interaction we can replicate the
basic RBD.
• Yijk is the kth response to the ith treatment in the jth block;
• Yi.. = Σⱼ₌₁ᵇ Σₖ₌₁ʳ Yijk and Ȳi.. = (1/br)Yi.. are the ith treatment total and treatment
mean respectively; and
• Y.j. = Σᵢ₌₁ᵗ Σₖ₌₁ʳ Yijk and Ȳ.j. = (1/tr)Y.j. are the jth block total and block mean
respectively.
To check for the presence of the block by treatment interaction we can use the
following methods:
1. If the differences Ȳij. − Ȳi′j. are approximately the same for all i ≠ i′ and all j,
then there may be no block by treatment interaction.
2. If Ȳij. ≈ Ȳi.. + Ȳ.j. − Ȳ... for all i and j, then there may be no block by treatment
interactions.
3. We can also check for interaction using graphs plotted from the treatment means,
i.e. Ȳij. versus j (or Ȳij. versus i). If the curves of the graph are almost parallel
then there may be no block by treatment interactions.
Operator
Filter 1 2
1 7.6,8.8 22.2,23.4
2 19.5,17.6 30.1,24.2
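Method 1 of the checks above can be illustrated on the tabled Operator by Filter data (the dictionary layout and variable names are ours):

```python
# Cell observations from the table above (r = 2 replicates per cell);
# keys are (filter, operator)
cells = {
    (1, 1): [7.6, 8.8],   (1, 2): [22.2, 23.4],
    (2, 1): [19.5, 17.6], (2, 2): [30.1, 24.2],
}
mean = {k: sum(v) / len(v) for k, v in cells.items()}

# Operator differences within each filter should be roughly equal
# when there is no interaction
diff_f1 = mean[(1, 2)] - mean[(1, 1)]   # 22.8  - 8.2   = 14.6
diff_f2 = mean[(2, 2)] - mean[(2, 1)]   # 27.15 - 18.55 = 8.6
# The two differences are far apart, so the profiles are not parallel,
# which suggests a Filter by Operator interaction may be present.
```

The same cell means can be plotted (method 3): one curve per filter against operator, checking whether the curves are roughly parallel.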
3.4 ANOVA
The model for an RBD with interactions has the form

Yijk = µ + τi + βj + (τβ)ij + ǫijk  (3.5)

Where
• µ is the overall population mean;
• τi is the effect of the ith treatment;
• βj is the effect of the jth block;
• (τβ)ij is the interaction effect of the ith treatment and the jth block;
• µij is the mean of the ith treatment when in the jth block; and
• the ǫijk's are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ²; and
• i = 1, 2, ..., t; j = 1, 2, ..., b; k = 1, 2, ..., r.
Sum of Squares
The total variability in the observations (Yijk's) is measured using the total sum of
squares and is given by:

SSTO = Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Yijk − Ȳ...)² = Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ Yijk² − (1/tbr)Y...²  with tbr − 1 df  (3.6)
1. Model sum of squares (SSModel) which measures the variability in the ob-
servations that is due to the block, treatment and block by treatment interaction
effects;
2. Error Sum of Squares (SSE) which measures the variability in the observa-
tions that is due to the random errors.
t X
b t b
X 1 XX 2 1 2
SSM odel = (Yij. − Y¯... )2 = Yij. − Y with tb − 1 df
i=1 j=1
r i=1 j=1 tbr ...
It is possible to partition the total sum of squares into two separate sources of
variability i.e.
SSTO = Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Yijk − Ȳ...)²
= Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ [(Ȳij. − Ȳ...) + (Yijk − Ȳij.)]²
= Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Ȳij. − Ȳ...)² + Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Yijk − Ȳij.)²
+ 2 Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Ȳij. − Ȳ...)(Yijk − Ȳij.)  (3.7)

where the first term is SSModel, the second term is SSE, and the cross-product term
equals 0.
We can also decompose SSModel into three separate sources of variation as follows:
1. Treatment Sum of Squares (SST) which measures the variation due to the
treatment effects;
2. Block Sum of Squares (SSB) which measures the variation due to the block
effects; and
3. Block by Treatment Interaction Sum of Squares (SSB × T) which measures
the variation due to the block by treatment interaction effects.
SSModel = Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Ȳij. − Ȳ...)²
= Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ [(Ȳi.. − Ȳ...) + (Ȳ.j. − Ȳ...) + (Ȳij. − Ȳi.. − Ȳ.j. + Ȳ...)]²
= Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Ȳi.. − Ȳ...)² + Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Ȳ.j. − Ȳ...)²
+ Σᵢ₌₁ᵗ Σⱼ₌₁ᵇ Σₖ₌₁ʳ (Ȳij. − Ȳi.. − Ȳ.j. + Ȳ...)² + cross-product terms (= 0)

where the first term is SST, the second term is SSB, and the third term is SSB × T.
The computational formulae are given by:

SST = (1/br) Σᵢ₌₁ᵗ Yi..² − (1/tbr)Y...²  with t − 1 df
SSB = (1/tr) Σⱼ₌₁ᵇ Y.j.² − (1/tbr)Y...²  with b − 1 df
SSB × T = SSModel − SST − SSB  with (t − 1)(b − 1) df
Table 3.4: ANOVA table for a randomised block design with interactions
Source df SS MS F
Treatments t − 1 SST M ST M ST /M SE
Blocks b − 1 SSB M SB M SB/M SE
Interaction (t − 1)(b − 1) SSB × T M SB × T M SB × T /M SE
Error bt(r − 1) SSE M SE
Totals tbr − 1 SST O
Hypothesis Testing
Interaction effects
We always first test the hypothesis about the block by treatment interaction effects.
The hypothesis can be stated as follows:

H0 : all (τβ)ij = 0
H1 : at least one (τβ)ij ≠ 0.
The test statistic is F = MSB × T/MSE, where MSB × T and MSE are mean squares
computed from the appropriate sums of squares in the ANOVA table. F has an F
distribution with (t − 1)(b − 1) degrees of freedom on the numerator and bt(r − 1)
degrees of freedom on the denominator.
Main Effects
If we fail to reject the null hypothesis about the interaction effects we test the main
effects, i.e. the treatment or block effects. Under both the fixed and random effects
models the tests about the main effects are meaningful only if there are no block by
treatment interaction effects. It follows that we do the tests for the main effects only
if the test about the interaction effects concludes that the interaction effects are not
present.
NOTE Tests for the main effects are the same as the tests for an RBD without
replications.
Pairwise Comparisons
They are done only if the ANOVA tests conclude that the block by treatment interac-
tion effects are absent and that some treatment means are different. The hypotheses to
be tested are the same as those for pairwise comparisons of an RBD without replications.
The LSD for the tests is given by

LSD = t_{α/2, bt(r−1)} √(2MSE/br)
1. Assume that model 1 is appropriate for the data and hence test the hypothesis
about Operator by Filter interaction effects.
3.4.2 Random Effects Model (Model II)
Model II is also model 3.5, but with:
• the τi's random variables which are independent and normally distributed with mean zero and variance στ²;
• the βj's random variables which are independent and normally distributed with mean zero and variance σβ²;
• the (τβ)ij's random variables which are independent and normally distributed with mean zero and variance στβ²;
• the ǫijk's, τi's, βj's and (τβ)ij's assumed to be mutually independent random variables.
Hypothesis Testing
We also start by testing for the interaction effects.
Main Effects
Tests are the same as the tests for Model II of a randomised block design without
replications.
The estimate of a missing observation in a randomised block design is given by
M = (tT + bB − G) / ((t − 1)(b − 1))
where T is the sum of all observations on the treatment assigned to the missing observation; B is the sum of all measurements in the block with the missing observation and G is the sum of all the measurements.
Diet
Dairy 1 2 3 4
1 15.4 9.6 9.5 8.4
2 14.8 9.3 9.4
3 15.9 9.8 9.7 9.3
4 15.5 9.4 9.2 8.1
5 14.7 9.2 9.0 7.9
significance.
The complete and reduced models for testing treatments are as follows:
Complete (full) model (model 1): Yij = µ + τi + βj + ǫij
Reduced model (model 2): Yij = µ + βj + ǫij
By fitting model 1 to the data we obtain SSEF. Similarly, a fit of model 2 yields SSER. The difference of the two error sums of squares, SSER − SSEF, gives the drop in the sum of squares due to treatments. Since this is an unbalanced design the block effects do not cancel out when comparing treatment means as they do in a balanced randomised block design. The difference in the sums of squares has been adjusted for any effects due to blocks caused by the imbalance in the design. The difference is called the sum of squares due to treatments adjusted for blocks, i.e.
SSTadj = SSER − SSEF.
The sum of squares due to blocks unadjusted for any treatment differences is obtained by subtraction:
SSB = SSTO − SSTadj − SSE,
where SSTO and SSE are sums of squares from the complete model.
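The full-versus-reduced fit can be sketched with ordinary least squares (hypothetical unbalanced data; the dummy coding and names are ours):

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Hypothetical unbalanced layout: (treatment, block, response)
data = [(0, 0, 15.4), (0, 1, 15.9), (1, 0, 9.6), (1, 1, 9.8),
        (1, 2, 9.2), (2, 0, 9.5), (2, 2, 9.0)]
t, b = 3, 3
y = np.array([d[2] for d in data])

def design(with_treatments):
    cols = [np.ones(len(data))]                     # intercept (mu)
    if with_treatments:                             # treatment dummies (tau)
        cols += [np.array([1.0 if d[0] == i else 0.0 for d in data])
                 for i in range(1, t)]
    cols += [np.array([1.0 if d[1] == j else 0.0 for d in data])  # block dummies
             for j in range(1, b)]
    return np.column_stack(cols)

SSE_F = sse(design(True), y)     # complete model (model 1)
SSE_R = sse(design(False), y)    # reduced model (model 2)
SST_adj = SSE_R - SSE_F          # treatments adjusted for blocks
```

Because the models are nested, SSE_F can never exceed SSE_R, so the adjusted treatment sum of squares is non-negative.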
The analysis of variance table for testing the effect of treatments is as follows:
Table 3.5: ANOVA table for testing the effects of treatments, unbalanced randomised
block design
Source            df               SS       MS       F
Blocks            b − 1            SSB
Treatments (adj)  t − 1            SSTadj   MSTadj   MSTadj/MSE
Error             by subtraction   SSE      MSE
Totals            n − 1            SSTO
For the blocks, the corresponding sum of squares for testing the effect of blocks has the same complete model (model 1) as before, i.e.
Complete (full) model (model 1): Yij = µ + τi + βj + ǫij
Reduced model: Yij = µ + τi + ǫij
The drop in the sums of squares, SSER − SSEF, is the sum of squares due to blocks after adjusting for the effects of the treatments, SSBadj. By subtraction, we obtain the unadjusted treatment sum of squares:
SST = SSTO − SSBadj − SSE.
The analysis of variance table for testing the effect of blocks is as follows:
Table 3.6: ANOVA table for testing the effects of blocks, unbalanced randomised
block design
Source        df               SS       MS       F
Blocks (adj)  b − 1            SSBadj   MSBadj   MSBadj/MSE
Treatments    t − 1            SST
Error         by subtraction   SSE      MSE
Totals        n − 1            SSTO
Chapter 4
Balanced Incomplete Block Designs
4.1 Introduction
A balanced incomplete RBD is an incomplete RBD in which every pair of treatments appears together in a block an equal number of times. We use it when we are forced to design an experiment in which we must sacrifice some balance to perform the experiment, and this is when the size of the blocks (k) is less than the number of treatments (t). For example, suppose we have three treatments (A, B, C) and blocks (B1, B2, B3) of size two each. Then we can construct a balanced incomplete RBD by randomly assigning each of the combinations to one of the three blocks.
In general, if k < t, then we have (t choose k), written tCk, treatment combinations. Note that balanced incomplete block designs can also be constructed with fewer than tCk blocks. Although these designs are not balanced according to the definition we had in Chapter 3, they do retain some balance, i.e. even though all treatments do not appear in the same block, each pair of treatments appears together in a block the same number of times (the pairs AB, BC and AC each appear once in a block).
The Total Sum of Squares (SSTO) is given by the formula:
SSTO = Σ_{i=1}^{t} Σ_{j=1}^{b} Y²ij − (1/n) Y²..   with n − 1 df
The treatment sum of squares adjusted for blocks is given by
SSTadj = k Σ_{i=1}^{t} Q²i / (λt)   with t − 1 df,   where Qi = Yi. − (1/k) Bi,
k is the number of treatments per block or the size of each block, Bi is the sum of all the observations for the blocks that contain the ith treatment and λ is the number of blocks in which each pair of treatments appears together.
The estimated standard error of µ̂i. − µ̂i′. is given by
√(2kMSE / (tλ))
The least significant difference for pairwise comparisons of the µi. and µi′. is given by
LSD = t^{α/2}_{n−t−b+1} √(2kMSE / (tλ))
Example 4.3.1 A chemical experiment was conducted to determine whether the re-
action time was a function of the type of catalyst used. A balanced incomplete RBD
was used for the experiment. The treatments were four catalysts and the blocks were
four batches of raw material. The data is displayed in the following table.
Batch
Catalyst 1 2 3 4
1 73 74 ... 71
2 ... 75 67 72
3 73 75 68 ...
4 75 ... 72 75
Test the equality of the catalyst effects at the 0.05 level of significance.
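A sketch of the adjusted-treatment computation for these data, using the standard BIBD quantities Qi = Ti − Bi/k and SSTadj = k ΣQi²/(λt), where λ is the number of blocks in which each pair of treatments appears together:

```python
# Catalyst BIBD data from Example 4.3.1 (None = catalyst not run in that batch)
data = {1: [73, 74, None, 71],
        2: [None, 75, 67, 72],
        3: [73, 75, 68, None],
        4: [75, None, 72, 75]}
t, b, k, r = 4, 4, 3, 3
lam = r * (k - 1) / (t - 1)        # each pair of catalysts appears lam = 2 times

batch_totals = [sum(data[c][j] for c in data if data[c][j] is not None)
                for j in range(b)]
Q = {}
for c, obs in data.items():
    T_c = sum(v for v in obs if v is not None)      # treatment total
    B_c = sum(batch_totals[j] for j, v in enumerate(obs) if v is not None)
    Q[c] = T_c - B_c / k           # adjusted treatment total; the Q's sum to 0
SST_adj = k * sum(q**2 for q in Q.values()) / (lam * t)   # 22.75 for these data
```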
Chapter 5
Latin Square Designs
Definition 5.1.1 A t × t Latin Square Design contains t rows and t columns. The t treatments are randomly assigned to experimental units within the rows and columns so that each treatment appears in every row and in every column.
3. The t treatments are randomly assigned to the experimental units within the rows and columns so that each treatment appears once in every row and once in every column.
4. If the letters in the first row and first column are arranged alphabetically (in regular ascending order), then the latin square is called a standard latin square.
For a given number of treatments, e.g. for t = 3, there are 12 different latin square designs. The question is: if there exist many latin square designs, which one should we use for the design?
NOTE: The process of choosing a latin square design at random is called randomisation of the latin square design. In principle, to choose a random latin square we proceed as follows
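One common randomisation procedure (a sketch; the specific steps listed in the notes may differ) starts from a standard cyclic square and independently permutes its rows, columns and treatment labels:

```python
import random

def random_latin_square(t, seed=None):
    """Generate a t x t latin square by permuting a standard (cyclic) square."""
    rng = random.Random(seed)
    standard = [[(i + j) % t for j in range(t)] for i in range(t)]
    rows = rng.sample(range(t), t)      # random row order
    cols = rng.sample(range(t), t)      # random column order
    labels = rng.sample(range(t), t)    # random relabelling of treatments
    return [[labels[standard[rows[i]][cols[j]]] for j in range(t)]
            for i in range(t)]
```

Permuting rows, columns and labels preserves the latin property; for t = 3 every one of the 12 squares can arise this way, although for larger t only a subset of all latin squares is reachable.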
3. The latin square design can only be used when there are no interactions between the blocking factors and the treatments, or between the blocking factors themselves.
5.2.1 ANOVA
The fixed effects model (model I) and the mixed effects model (model II) for a latin
square experiment with t treatments have the form:
Yijk = µ + ρi + βj + τk + ǫijk
Where
• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 .
Model I
Model I is model 5.1 but with the ρi's, βj's and τk's regarded as fixed constants satisfying the constraints
Σ_{i=1}^{t} ρi = Σ_{j=1}^{t} βj = Σ_{k=1}^{t} τk = 0
Hypothesis Testing
The objective is to test the null hypothesis of no difference among the treatment
means. This is equivalent to testing
H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τk ≠ 0. (at least one of the
treatment means differs from the rest)
Model II
Model II is model 5.1 but with some effects regarded as fixed real constants and some regarded as random variables. If we consider a case whereby the treatments are fixed and both the rows and columns are random, the assumptions for the model are that:
• the ρi's are random variables which are independent and normally distributed with mean 0 and variance σρ²;
• the βj's are random variables which are independent and normally distributed with mean 0 and variance σβ²; and
• the τk's are fixed constants satisfying Σ_{k=1}^{t} τk = 0.
Hypothesis Testing
The objective is to test the null hypothesis of no difference among the treatment
means. This is equivalent to testing
H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τk ≠ 0. (at least one of the
treatment means differs from the rest)
Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares and is given by:
SSTO = Σ_{i=1}^{t} Σ_{j=1}^{t} (Yijk − Ȳ...)² = Σ_{i=1}^{t} Σ_{j=1}^{t} Y²ijk − (1/t²) Y²...   with t² − 1 df
It is possible to partition the total sum of squares into four separate sources of variability, i.e.
1. the variability that is due to the treatment effects (treatment sum of squares (SST)),
2. the variability that is due to the row effects (row sum of squares (SSR)),
3. the variability that is due to the column effects (column sum of squares (SSC)) and
4. the variability in the responses that is due to the random errors (error sum of squares (SSE)).
The mathematical sum of squares formulae are not convenient to use in calculations. The computational formulae are given by:
SST = (1/t) Σ_{k=1}^{t} Y²..k − (1/t²) Y²...   with t − 1 df
SSR = (1/t) Σ_{i=1}^{t} Y²i.. − (1/t²) Y²...   with t − 1 df
SSC = (1/t) Σ_{j=1}^{t} Y².j. − (1/t²) Y²...   with t − 1 df
SSE = SSTO − SST − SSR − SSC   with (t − 2)(t − 1) df
The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom and they are used for testing the hypotheses about the treatments.
Hypothesis testing
The test statistic for treatment effects is given by:
F = MST / MSE   (5.1)
where MST and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with t − 1 degrees of freedom
on the numerator and (t − 1)(t − 2) degrees of freedom on the denominator.
The test statistics for checking the effectiveness of blocking by the row blocking factor
and the column blocking factor are given by:
F = MSR / MSE   and   F = MSC / MSE
respectively. Both F ratios have an F distribution with t − 1 degrees of freedom
on the numerator and (t − 1)(t − 2) degrees of freedom on the denominator.
Pairwise Comparisons of the treatments
They are done only if the ANOVA test under the fixed effects model concludes that some treatment means are different. In this case the hypotheses to be tested are:
H0 : τi. = τi′.
H1 : τi. ≠ τi′. ∀ i ≠ i′
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{(t−1)(t−2)} √(2MSE / t)
Example 5.2.1 A traffic engineer wished to compare the total unused green time for
3 different signal-control sequencing devices at 3 different intersections of a city. It
was assumed that the intersections were far enough apart that they, in effect, acted
independently, regardless of the signal sequencing device employed. In addition to
comparing the devices at the 3 different intersections, the engineer wished to compare
the devices at different time periods during the day. The data collected are tabulated
in the following table. Analyse the data and draw conclusions.
Table 5.4: A 3 × 3 latin square design for the traffic delay experiment
Time period
1 2 3
1 23 (II) 31 (III) 51 (I)
Intersection 2 71 (I) 42 (II) 35 (III)
3 34 (III) 67 (I) 29 (II)
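As a sketch, the sums of squares for this example can be computed numerically (treatments coded I = 0, II = 1, III = 2; the coding is ours):

```python
import numpy as np

# Responses y[i, j] and treatment codes trt[i, j] from Table 5.4
y = np.array([[23., 31., 51.],
              [71., 42., 35.],
              [34., 67., 29.]])
trt = np.array([[1, 2, 0],
                [0, 1, 2],
                [2, 0, 1]])
t = y.shape[0]
G = y.sum()
C = G**2 / t**2                                    # correction term
SSTO = (y**2).sum() - C
SSR = (y.sum(axis=1)**2).sum() / t - C             # rows (intersections)
SSC = (y.sum(axis=0)**2).sum() / t - C             # columns (time periods)
trt_totals = np.array([y[trt == k].sum() for k in range(t)])
SST = (trt_totals**2).sum() / t - C                # treatments (devices)
SSE = SSTO - SST - SSR - SSC
```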
M = [t(T + R + C) − 2G] / [(t − 1)(t − 2)]
where T, R and C represent the treatment, row and column totals respectively, cor-
responding to the missing observation and t is the number of treatments in the latin
square design. After replacing the missing value the analysis can proceed as for a
balanced latin square design with degrees of freedom for SST O = t2 − 2. If there
are significant differences due to treatments we need to make pairwise comparisons.
The least significant difference between the treatment with the missing value and any other treatment is
LSD = t^{α/2}_{(t−1)(t−2)} √( MSE ( 2/t + 1/((t − 1)(t − 2)) ) )
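The missing-value formula can be wrapped in a small helper (a sketch; the function name is ours):

```python
def latin_square_missing_value(t, T, R, C, G):
    """Estimate of a missing observation in a t x t latin square.

    T, R and C are the treatment, row and column totals corresponding to
    the missing cell (computed from the remaining observations) and G is
    the grand total of the remaining observations."""
    return (t * (T + R + C) - 2 * G) / ((t - 1) * (t - 2))
```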
• two or more experimental units can be obtained for each cell defined by levels
of the row blocking factor and the column blocking factor or,
• the latin square can be repeated two or more times using the same experimental
units.
Suppose that the number of replications within each cell is r. Then the model for the data is given by:
Yijkm = µ + ρi + βj + τk + ǫijkm
where
• Yijkm is the mth response to the kth treatment in the ith row and jth column;
• the ǫijkm's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ²; and
• Ȳijk. = (1/r) Yijk. , Ȳi... = (1/tr) Yi... , Ȳ.j.. = (1/tr) Y.j.. and Ȳ.... = (1/t²r) Y.... are the (ij)th cell mean, ith row mean, jth column mean and overall mean respectively, and
It is possible to partition the total sum of squares into four separate sources of vari-
ability. The computational formulae for the sum of squares are given by:
SST = (1/tr) Σ_{k=1}^{t} Y²..k. − (1/t²r) Y²....   with t − 1 df
SSR = (1/tr) Σ_{i=1}^{t} Y²i... − (1/t²r) Y²....   with t − 1 df
SSC = (1/tr) Σ_{j=1}^{t} Y².j.. − (1/t²r) Y²....   with t − 1 df
SSE = SSTO − SST − SSR − SSC   with t²r − 3t + 2 df
Note: by replicating the latin square design we have increased the degrees of freedom for SSE by (r − 1)t².
Example 5.4.1 A team of educators were interested in determining the relative ef-
fectiveness of instruction methods A (Video instruction), B (traditional classroom)
and C (programmed study) on student learning. They felt that the IQ and the age
of the student could influence the scores too much. To take these two factors into account, they used a 3 × 3 latin square design with two students in each IQ-Age cell of the latin square.
The scores for the instruction methods are in the following table. Analyse the data
and draw conclusions.
Table 5.5: A 3 × 3 latin square design for the scores
IQ
High Average Low
20 40,50 (C) 40,40 (B) 50,40 (A)
Age 30 70,60 (B) 30,20 (A) 55,50 (C)
40 20,30 (A) 70,80 (C) 25,25 (B)
Yijk = µ + ρi + βj + τk + ǫijk
Where
• Yijk is the response to the k th treatment administered to the experimental unit
j during period i;
• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 ; and
The totals and the means are as follows:
P Pt Pt Prt
• Yi.. = rt j=1 Yijk , Y.j. = i=1 Yijk and Y... = i=1 j=1 Yijk are the i
th
period
th
total, j experimental total and the overall total respectively, and
• Ȳi.. = (1/rt) Yi.. , Ȳ.j. = (1/t) Y.j. and Ȳ... = (1/t²r) Y... are the ith period mean, jth experimental unit mean and overall mean respectively, and
Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares as usual and is given by:
SSTO = Σ_{i=1}^{t} Σ_{j=1}^{rt} (Yijk − Ȳ...)² = Σ_{i=1}^{t} Σ_{j=1}^{rt} Y²ijk − (1/t²r) Y²...   with t²r − 1 df   (5.3)
The computational formulae for the period sum of squares, the experimental unit
sum of squares, the treatment sum of squares and the error sum of squares are given
by:
SST = (1/tr) Σ_{k=1}^{t} Y²..k − (1/t²r) Y²...   with t − 1 df
SSP = (1/tr) Σ_{i=1}^{t} Y²i.. − (1/t²r) Y²...   with t − 1 df
SSU = (1/t) Σ_{j=1}^{rt} Y².j. − (1/t²r) Y²...   with tr − 1 df
SSE = SSTO − SST − SSP − SSU   with (tr − 2)(t − 1) df
The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom and they are used for testing the hypotheses about the treatments.
Hypothesis testing
The objective is to test the null hypothesis of no difference among the treatment
means. This is equivalent to testing
H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τk ≠ 0. (at least one of the
treatment means differs from the rest)
The test statistic for treatment effects is given by:
F = MST / MSE   (5.4)
where MST and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with t − 1 degrees of freedom
on the numerator and (t − 1)(rt − 2) degrees of freedom on the denominator.
Table 5.6: ANOVA table for a Latin Square Design
Source      df                SS     MS     F
Treatment   t − 1             SST    MST    MST/MSE
Period      t − 1             SSP    MSP    MSP/MSE
Unit        rt − 1            SSU    MSU    MSU/MSE
Error       (rt − 2)(t − 1)   SSE    MSE
Totals      t²r − 1           SSTO
Example 5.5.1 A crossover design was used to study the effects of three diets on the
daily weight gain of 2 year old goats. A sufficiently long period of time was allowed to
pass before a goat was fed its new diet in order to eliminate the effect of its previous
diets on the response to the new diet. The following data was recorded (in grams per
day). Analyse the data and draw conclusions.
Goat
1 2 3 4 5 6
1 40.5 (C) 30.0 (B) 30.5 (A) 30.5 (C) 30.0 (B) 40.0 (A)
Period 2 45.0 (B) 20.5 (A) 40 (C) 60.5 (B) 10.0 (A) 45.5 (C)
3 20.0 (A) 65.5 (C) 20.0 (B) 20.5 (A) 60.5 (C) 30.0 (B)
Chapter 6
Factorial Experiments
6.1 Introduction
Consider a situation in which it is of interest to study the effect of two factors A
and B on some response. For example, in a chemical experiment we would like to
simultaneously vary the reaction pressure and reaction time and study the effect of
each on the yield. The term factor is used in a general sense to denote any feature of
the experiment such as temperature, time or pressure that may be varied from trial
to trial. The levels of a factor are the actual values used in the experiment.
A factorial design can either be a completely randomised design (CRD) or a block
design (BD). We use CRD factorial design if we have homogeneous experimental
units and the experimental conditions are uniform. If we have heterogeneous exper-
imental units and/or if there are external variables that may influence the response
then a randomised block design or a latin square is appropriate for the factorial ex-
periment.
To illustrate a simple factorial design, let us suppose that we wish to study the effects of the combinations of the levels of factors A and B and that each factor has two levels, i.e. for A we have A1 and A2 and for B we have B1 and B2. The combinations of the two factors that are to be investigated are:
A1 B1 , A1 B2 , A2 B1 , A2 B2 .
In this type of experiment it is important not only to determine if the two factors
have an influence on the response but also if there is a significant interaction between
the two factors.
6.1.1 Interaction in Factorial Experiments
If we consider the illustration above (which is an example of a two factor experiment)
the effects of A and B, often called the main effects, take on a different meaning
in the presence of interaction. In general, there could be experimental situations in
which factor A has a positive effect on the response at one level of factor B, while at
a different level of factor B the effect of A is negative.
Example 6.1.1 Consider, for example, the following hypothetical data taken on two
factors each at three levels. Assume that the values given are averages for each treat-
ment. Check for the presence of the A by B interactions.
B
A B1 B2 B3
A1 80.30 80.65 80.30
A2 80.20 80.55 80.00
A3 80.60 80.85 80.25
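A quick numerical check for this table (a sketch): if there were no interaction, the difference between any two rows would be constant across the levels of B (parallel profiles).

```python
# Cell means from Example 6.1.1 (rows A1..A3, columns B1..B3)
means = [[80.30, 80.65, 80.30],
         [80.20, 80.55, 80.00],
         [80.60, 80.85, 80.25]]

# Row differences at each level of B; a constant list means parallel profiles
diff_12 = [round(a1 - a2, 2) for a1, a2 in zip(means[0], means[1])]
diff_23 = [round(a2 - a3, 2) for a2, a3 in zip(means[1], means[2])]
print(diff_12)   # [0.1, 0.1, 0.3] -> not constant, so an A by B interaction is indicated
```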
NOTE:
• The analysis of the data usually begins by checking the presence of interaction.
Then if interaction is not significant we proceed to make tests on the effects
of the main factors. If the data indicate the presence of interaction, we might
need to observe the influence of each factor at fixed levels of the other.
Table 6.1: Two-factor factorial experiment with r replications in a CRD
Factor B
Factor A B1 B2 ... Bb Total Mean
A1 Y111 , Y112 , ..., Y11r Y121 , Y122 , ..., Y12r ... Y1b1 , Y1b2 , ..., Y1br Y1.. Ȳ1..
A2 Y211 , Y212 , ..., Y21r Y221 , Y222 , ..., Y22r ... Y2b1 , Y2b2 , ..., Y2br Y2.. Ȳ2..
. . . ... . . .
. . . ... . . .
. . . ... . . .
Aa Ya11 , Ya12 , ..., Ya1r Ya21 , Ya22 , ..., Ya2r ... Yab1 , Yab2 , ..., Yabr Ya.. Ȳa..
Total Y.1. Y.2. ... Y.b. Y...
Mean Ȳ.1. Ȳ.2. ... Ȳ.b. Ȳ...
The ANOVA model for a two-factor factorial design in a CRD has the form
Yijk = µ + τi + βj + (τβ)ij + ǫijk
where
• τi is the effect of the ith level of factor A;
• βj is the effect of the jth level of factor B;
• (τβ)ij is the interaction effect of the ith level of factor A and the jth level of factor B;
• the ǫijk's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².
The fixed effects model assumes that the τi's, βj's and (τβ)ij's are fixed real constants satisfying the constraints:
Σ_{i=1}^{a} τi = Σ_{j=1}^{b} βj = Σ_{i=1}^{a} (τβ)ij = Σ_{j=1}^{b} (τβ)ij = 0
The estimate of the effect of the ith level of factor A is
• τ̂i = Ȳi.. − Ȳ...
1. The first hypothesis that we test is about the treatment effects.
2. If the treatment effects are significant we test the hypothesis about the A × B interaction effects:
H0 : all (τβ)ij = 0
H1 : at least one (τβ)ij ≠ 0.
3. If there are no interaction effects we proceed to test the main effects, i.e. the A factor level effects or the B factor level effects.
(a) The hypotheses about the A factor level effects are
H0 : τ1 = τ2 = ... = τa = 0
H1 : at least one of the τi's ≠ 0.
(b) The hypotheses about the B factor level effects are
H0 : β1 = β2 = ... = βb = 0
H1 : at least one of the βj's ≠ 0.
Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares and is given by:
SSTO = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳ...)² = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} Y²ijk − (1/abr) Y²...   with abr − 1 df
The total sum of squares can be decomposed into two separate sources of variability
i.e.
1. the variability that is due to the treatments (factor level combination) (treatment
sum of squares (SST)) and
2. the variability in the responses that is due to the random errors (error sum of
squares (SSE))
SST is obtained by performing a one way ANOVA of the data with the factor level combinations AiBj as the treatments. The formula for SST is given by:
SST = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} (Ȳij. − Ȳ...)² = (1/r) Σ_{i=1}^{a} Σ_{j=1}^{b} Y²ij. − (1/abr) Y²...   with ab − 1 df
The variability in the observations that is due to the treatment effects is attributed to the main effects and the interaction of the main effects, i.e.
• A factor level effects,
• B factor level effects and
• A × B interaction effects.
It follows that we can partition the treatment sum of squares into
• the variation due to the factor A effects (SSA),
• the variation due to the factor B effects (SSB) and
• the variation due to the A × B interaction effects (SSAB).
The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom and they are used for testing the hypotheses about the treatments.
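The partition SSTO = SSA + SSB + SSAB + SSE can be sketched numerically (NumPy, hypothetical a × b × r data):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(80, 0.3, size=(3, 3, 2))    # y[i, j, k], a = b = 3, r = 2
a, b, r = y.shape
G = y.sum()
C = G**2 / (a * b * r)                     # correction term
SSTO = (y**2).sum() - C
SST = (y.sum(axis=2)**2).sum() / r - C     # treatments (AiBj combinations)
SSA = (y.sum(axis=(1, 2))**2).sum() / (b * r) - C
SSB = (y.sum(axis=(0, 2))**2).sum() / (a * r) - C
SSAB = SST - SSA - SSB                     # interaction, by subtraction
SSE = SSTO - SST
```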
Hypothesis testing
1. We first test for treatment effects and the test statistic is given by:
F = MST / MSE   (6.1)
where MST and MSE are mean squares computed from the appropriate sums of squares in the ANOVA table. F has an F distribution with ab − 1 degrees of freedom on the numerator and ab(r − 1) degrees of freedom on the denominator.
2. If the treatment effects are significant we test for the A × B interaction effects using
F = MSAB / MSE
which has an F distribution with (a − 1)(b − 1) degrees of freedom on the numerator and ab(r − 1) degrees of freedom on the denominator.
3. We test the hypotheses about the A factor level effects or the B factor level effects only if there are no A × B interaction effects. The test statistics for the A and B factor level effects are given by:
F = MSA / MSE   and   F = MSB / MSE
which have F distributions with a − 1 and b − 1 degrees of freedom on the numerator respectively, and ab(r − 1) degrees of freedom on the denominator.
Pairwise Comparisons of the treatments
They are done only if the ANOVA test under the fixed effects model concludes that some treatment means are different. Why? When the treatment means are not all equal and the A × B interaction effects are present, pairwise comparisons of the A or B factor level means do not make sense. In the presence of A × B interactions we must compare the A factor level means at each level of B, or vice versa. Hence the hypotheses to be tested for the A factor level means at the jth level of factor B in this case are:
H0 : τij = τi′j
H1 : τij ≠ τi′j ∀ i ≠ i′
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{ab(r−1)} √(2MSE / r)
Thus τij and τi′j are significantly different if
|Ȳij. − Ȳi′j.| > LSD
Similarly, for the B factor level means at the ith level of factor A the hypotheses are:
H0 : τij = τij′
H1 : τij ≠ τij′ ∀ j ≠ j′
The least significant difference for comparing the means is as above and the means are significantly different if:
|Ȳij. − Ȳij′.| > LSD
If we conclude that A × B interactions are absent and some A and/or B factor level means are not equal then we can do pairwise comparisons of the A factor level means and/or the B factor level means. The hypotheses to be tested for factor A are:
H0 : τi. = τi′.
H1 : τi. ≠ τi′. ∀ i ≠ i′
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{ab(r−1)} √(2MSE / (br))
and the means are significantly different if
|Ȳi.. − Ȳi′..| > LSD
The hypotheses to be tested for factor B are:
H0 : τ.j = τ.j′
H1 : τ.j ≠ τ.j′ ∀ j ≠ j′
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{ab(r−1)} √(2MSE / (ar))
and the means are significantly different if:
|Ȳ.j. − Ȳ.j′.| > LSD
Example 6.2.1 In a chemical process the most important variables that are thought
to affect the yield are pressure and temperature. Three levels of each factor are selected
and a factorial experiment in a CRD with two replications was performed. The results
are as shown in table 6.2.1. Analyse the data and draw conclusions.
Pressure
Temperature 215 230 235
30 80.4 80.7 80.2
80.2 80.6 80.4
40 80.1 80.5 79.9
80.3 80.6 80.1
50 80.5 80.8 80.4
80.7 80.9 80.1
6.2.2 Two factor factorial Design in a RBD
In a two-factor factorial experiment in a randomised block design the layout of the experiment is as follows:
The ANOVA model for a two-factor factorial design in a RBD has the form
Yijk = µ + τi + βj + (τβ)ij + ρk + ǫijk
where
• ρk is the effect of the kth block;
• (τβ)ij is the interaction effect of the ith level of factor A and the jth level of factor B;
• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 .
The fixed effects model assumes that the τi's, ρk's, βj's and (τβ)ij's are fixed real constants satisfying the constraints:
Σ_{i=1}^{a} τi = Σ_{j=1}^{b} βj = Σ_{k=1}^{r} ρk = Σ_{i=1}^{a} (τβ)ij = Σ_{j=1}^{b} (τβ)ij = 0
3. If there are no interaction effects we proceed to test the main effects, i.e. the A factor level effects or the B factor level effects.
(a) The hypotheses about the A factor level effects are
H0 : τ1 = τ2 = ... = τa = 0
H1 : at least one of the τi's ≠ 0.
If we reject H0 we compare the A factor level means.
(b) The hypotheses about the B factor level effects are
H0 : β1 = β2 = ... = βb = 0
H1 : at least one of the βj's ≠ 0.
If we reject H0 we compare the B factor level means.
4. The hypotheses about the block effects are
H0 : ρ1 = ρ2 = ... = ρr = 0
H1 : at least one of the ρk's ≠ 0.
If we reject H0 we conclude that blocking was effective.
Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares and is given by:
SSTO = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳ...)² = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} Y²ijk − (1/abr) Y²...   with abr − 1 df
The total sum of squares can be decomposed into three separate sources of variability
i.e.
1. the variability that is due to the treatments (factor level combination) (treatment
sum of squares (SST))
2. the variability that is due to the block effects (block sum of squares (SSBlk))
and
3. the variability in the responses that is due to the random errors (error sum of
squares (SSE))
SST , SSBlk, and SSE are obtained by performing a two way ANOVA of the
data with the factor level combinations Ai Bj as the treatments. The formula for SST
is given by:
SST = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} (Ȳij. − Ȳ...)² = (1/r) Σ_{i=1}^{a} Σ_{j=1}^{b} Y²ij. − (1/abr) Y²...   with ab − 1 df
The variability in the observations that is due to the treatment effects is attributed to the main effects and the interaction of the main effects, i.e.
• A factor level effects,
• B factor level effects and
• A × B interaction effects.
It follows that we can partition the treatment sum of squares into
• the variation due to the factor A effects (SSA),
• the variation due to the factor B effects (SSB) and
• the variation due to the A × B interaction effects (SSAB).
The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom and they are used for testing the hypotheses about the treatments.
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{(ab−1)(r−1)} √(2MSE / r)
Similarly, for the B factor level means at the ith level of factor A the hypotheses are:
H0 : τij = τij′
H1 : τij ≠ τij′ ∀ j ≠ j′
The least significant difference for comparing the means is as above and the means are significantly different if:
|Ȳij. − Ȳij′.| > LSD
If we conclude that A × B interactions are absent and some A and/or B factor level means are not equal then we can do pairwise comparisons of the A factor level means and/or the B factor level means. The hypotheses to be tested for factor A are:
H0 : τi.. = τi′..
H1 : τi.. ≠ τi′.. ∀ i ≠ i′
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{(ab−1)(r−1)} √(2MSE / (br))
The hypotheses to be tested for factor B are:
H0 : τ.j. = τ.j′.
H1 : τ.j. ≠ τ.j′. ∀ j ≠ j′
The least significant difference for comparing the means is given by:
LSD = t^{α/2}_{(ab−1)(r−1)} √(2MSE / (ar))
The means are significantly different if:
|Ȳ.j. − Ȳ.j′.| > LSD
Example 6.2.2 Consider a paper manufacturer who is interested in studying the effect of four different cooking temperatures for three different pulp mixtures on the tensile strength of the paper. The experimenter has decided to run three replicates for each treatment combination. However, the plant can only make 12 runs a day, so the experimenter decided to run one replicate on each of three days and to consider the days as blocks. The data is as shown in table 6.2.2.
Block
Treatment (A,B) Day 1 Day 2 Day 3
(200,1) 5.2 5.9 6.3
(200,2) 7.4 7.0 7.6
(200,3) 6.3 6.7 6.1
(225,1) 7.1 7.4 7.5
(225,2) 7.4 7.3 7.1
(225,3) 7.3 7.5 7.2
(250,1) 7.6 7.2 7.4
(250,2) 7.6 7.5 7.8
(250,3) 7.2 7.3 7.0
(275,1) 7.2 7.5 7.2
(275,2) 7.4 7.0 6.9
(275,3) 6.8 6.6 6.4
6.3 2^k Factorial Designs
6.3.1 2² Factorial experiment
The model for a 2² factorial experiment in a CRD has the form
Yijk = µ + τi + βj + (τβ)ij + ǫijk
where
• (τβ)ij is the interaction effect of the ith level of factor A and the jth level of factor B;
• the ǫijk's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².
• a represents the total of r observations taken at the High level of A and Low
level of B;
• b represents the total of r observations taken at the High level of B and Low
level of A;
• ab represents the total of r observations taken at the High level of A and High
level of B and
• (1) represents the total of r observations taken at the Low level of A and Low
level of B;
Table 6.5 gives a two-way table of these total yields
The main effect of a factor is defined as the change in the mean response due to
the change in the level of the factor. For example, the main effect of factor A from
table 6.5 above is:
A = Ȳ2.. − Ȳ1.. = (1/2r)(a + ab) − (1/2r)((1) + b)
  = (1/2r)[a + ab − b − (1)]   (6.2)
The main effect of factor B is:
B = Ȳ.2. − Ȳ.1. = (1/2r)(b + ab) − (1/2r)((1) + a)
  = (1/2r)[b + ab − a − (1)]   (6.3)
The AB interaction is the difference between the diagonal means in table 6.5. That is
AB = (1/2)(Ȳ22. + Ȳ11.) − (1/2)(Ȳ21. + Ȳ12.) = (1/2r)(ab + (1)) − (1/2r)(a + b)
   = (1/2r)[ab + (1) − a − b]   (6.4)
The quantities in the square brackets [.] of equations 6.2, 6.3 and 6.4 are called
contrasts and contrasts are always orthogonal. We define the contrasts among the
treatment totals as follows:
A contrast = a + ab − b − (1)
B contrast = b + ab − a − (1)
AB contrast = ab + (1) − a − b
The sum of squares for each contrast is found using the following formula
SS_Factor = (Contrast_Factor)² / ( r Σ (Contrast_Factor coefficients)² )   (6.5)
where Σ(Contrast_Factor coefficients)² is the sum of the squares of the coefficients of the terms in the contrast.
Using the formula in 6.5, the Sum of Squares for A (SSA) is given by:
SSA = (1/4r)(a + ab − b − (1))²
the Sum of Squares for B (SSB) is given by:
SSB = (1/4r)(b + ab − a − (1))²
and the Sum of Squares for AB (SSAB) is given by:
SSAB = (1/4r)(ab + (1) − a − b)²
The total sum of squares (SSTO) is computed using the usual formula, i.e.:
SSTO = Σ_{i=1}^{2} Σ_{j=1}^{2} Σ_{k=1}^{r} (Yijk − Ȳ...)² = Σ_{i=1}^{2} Σ_{j=1}^{2} Σ_{k=1}^{r} Y²ijk − (1/4r) Y²...   with 4r − 1 df
ANOVA table for a 2² Factorial experiment
Treatment Replicate
combination I II III
(1) 28 25 27
a 36 32 32
b 18 19 23
ab 31 30 29
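For the replicate totals in this table, the contrasts, effects and sums of squares work out as follows (a sketch):

```python
# Treatment-combination totals over the r = 3 replicates above
totals = {'(1)': 28 + 25 + 27, 'a': 36 + 32 + 32,
          'b': 18 + 19 + 23, 'ab': 31 + 30 + 29}
r = 3

A_contrast = totals['a'] + totals['ab'] - totals['b'] - totals['(1)']
B_contrast = totals['b'] + totals['ab'] - totals['a'] - totals['(1)']
AB_contrast = totals['ab'] + totals['(1)'] - totals['a'] - totals['b']

A_effect = A_contrast / (2 * r)      # main effect of A
SSA = A_contrast**2 / (4 * r)        # sums of squares via equation 6.5
SSB = B_contrast**2 / (4 * r)
SSAB = AB_contrast**2 / (4 * r)
```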
6.4 2³ Factorial Design
The model for a 2³ factorial experiment in a CRD has the form
Yijkl = µ + τi + βj + ρk + (τβ)ij + (τρ)ik + (βρ)jk + (τβρ)ijk + ǫijkl
where
• µ is the overall population mean;
• (τβ)ij is the interaction effect of the ith level of factor A and the jth level of factor B;
• (τρ)ik is the interaction effect of the ith level of factor A and the kth level of factor C;
• (βρ)jk is the interaction effect of the jth level of factor B and the kth level of factor C;
• (τβρ)ijk is the interaction effect of the ith level of factor A, the jth level of factor B and the kth level of factor C;
• the ǫijkl's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².
In computing the sums of squares for the main effects it is convenient to present the total yields of the treatment combinations along with the appropriate algebraic signs for each contrast, as in table 6.7. The treatment combinations and the appropriate algebraic signs for each contrast in table 6.7 are used in computing the sums of squares for the main effects and interaction effects.
For example, the main effect of factor A from table 6.7 above is:
A = (1/4r)[−(1) + a − b − c + ab + ac − bc + abc]
Rearranging, we obtain
A = (1/4r)[a + ab + ac + abc − (1) − b − c − bc]
The sum of squares for each contrast is found using the following formula
SS_Factor = (Contrast_Factor)² / ( r Σ (Contrast_Factor coefficients)² )   (6.6)
where Σ(Contrast_Factor coefficients)² is the sum of the squares of the coefficients of the terms in the contrast.
Using formula 6.6, the sums of squares are given by:

SS_A = \frac{1}{8r}\left(a + ab + ac + abc - (1) - b - c - bc\right)^2

SS_B = \frac{1}{8r}\left(b + ab + bc + abc - (1) - a - c - ac\right)^2

SS_C = \frac{1}{8r}\left(c + ac + bc + abc - (1) - a - b - ab\right)^2

SS_{AB} = \frac{1}{8r}\left(abc + ab + c + (1) - a - b - ac - bc\right)^2

SS_{AC} = \frac{1}{8r}\left(abc + ac + b + (1) - a - c - ab - bc\right)^2

SS_{BC} = \frac{1}{8r}\left(abc + bc + a + (1) - b - c - ab - ac\right)^2

SS_{ABC} = \frac{1}{8r}\left(abc + a + b + c - ab - ac - bc - (1)\right)^2
The error sum of squares (SSE) is obtained by subtraction as usual:

SSE = SS_{TO} - SS_A - SS_B - SS_C - SS_{AB} - SS_{AC} - SS_{BC} - SS_{ABC} \quad \text{with } 2^k(r-1) \text{ df} \qquad (6.7)
ANOVA table for a 2³ Factorial experiment

Example 6.4.1 An engineer is trying to improve the life of a cutting tool. He has run a 2³ factorial experiment using cutting speed (A), metal hardness (B) and cutting angle (C). He replicated the experiment twice and obtained the data displayed in the table below.

Treatment      Replicate
combination      I     II
(1)            284    248
a              450    410
b              349    353
c              455    438
ab             502    522
ac             398    385
bc             545    560
abc            403    408

3. Advise on the best factor level combination for improving the life of the tool.
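One way to carry out the contrast method for this (or any) 2^k experiment is to generate the ± signs algorithmically: the sign of a treatment total in the contrast for an effect is the product, over the letters of the effect, of +1 if that letter appears in the combination and −1 otherwise. A Python sketch using the treatment totals from the tool-life data above (r = 2):

```python
# Treatment totals over the r = 2 replicates of the tool-life data
totals = {"(1)": 284 + 248, "a": 450 + 410, "b": 349 + 353, "c": 455 + 438,
          "ab": 502 + 522, "ac": 398 + 385, "bc": 545 + 560, "abc": 403 + 408}
r, k = 2, 3

def contrast(effect):
    """Contrast for an effect (e.g. 'ac'): sum of signed treatment totals."""
    total = 0
    for combo, y in totals.items():
        sign = 1
        for letter in effect:        # +1 if the factor letter appears, else -1
            sign *= 1 if letter in combo else -1
        total += sign * y
    return total

def ss(effect):
    # SS = contrast^2 / (r * 2^k), since all 2^k coefficients are +-1
    return contrast(effect) ** 2 / (r * 2 ** k)

for e in ["a", "b", "c", "ab", "ac", "bc", "abc"]:
    print(e, contrast(e), ss(e))
```

The resulting sums of squares agree with hand computation via the contrast formulas in (6.6), and feed directly into the ANOVA table.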
Chapter 7

Analysis of Covariance

7.1 Introduction

Analysis of covariance (ANCOVA) is a combination of the analysis of variance and regression analysis methods: we compare treatment means while incorporating information on a quantitative variable x. In this chapter we present the analysis of covariance for a completely randomised design with one covariate.
During the analysis, covariates are used to adjust the observed responses for the effects of heterogeneity of the experimental units, i.e. the responses Y are adjusted for the values of the covariate X. We can write the model to be fitted to the data as follows:

Y_{ij} = \mu + \tau_i + \beta(X_{ij} - \bar{X}_{..}) + \epsilon_{ij} \qquad (7.1)

where \beta is the slope of the regression of Y on X. The error \epsilon_{ij} in equation 7.1 is reduced because part of it is now accounted for by \beta(X_{ij} - \bar{X}_{..}).

NOTE: Adding covariate(s) to any ANOVA model has the effect of reducing the experimental error, and this makes the ANOVA tests more sensitive to treatment differences.
7.3 Analysis

The model for ANCOVA for a CRD with one covariate is as in equation 7.1, where

• \mu is the overall population mean;
• \tau_i is the effect of the i-th treatment;
• \beta is the slope of the regression of Y on X;
• the \epsilon_{ij}'s are random errors which are assumed to be independent and normally distributed with mean 0 and variance \sigma^2.
To adjust the estimates of the model parameters for the effect of the covariate we note that:

1.

\bar{Y}_{..} = \mu + \bar{\epsilon}_{..}

since \sum_{i=1}^{t}\tau_i = \sum_{i=1}^{t}\sum_{j=1}^{r}(X_{ij} - \bar{X}_{..}) = 0. If \bar{\epsilon}_{..} \approx 0 then the estimate of \mu is \bar{Y}_{..}.
2.

\bar{Y}_{i.} = \mu + \tau_i + \beta(\bar{X}_{i.} - \bar{X}_{..}) + \bar{\epsilon}_{i.}

and

Y_{ij} - \bar{Y}_{i.} = \beta(X_{ij} - \bar{X}_{i.}) + \epsilon_{ij} - \bar{\epsilon}_{i.}

Let Y^*_{ij} = Y_{ij} - \bar{Y}_{i.}, X^*_{ij} = X_{ij} - \bar{X}_{i.} and \epsilon^*_{ij} = \epsilon_{ij} - \bar{\epsilon}_{i.}; then

Y^*_{ij} = \beta X^*_{ij} + \epsilon^*_{ij}

is a simple linear regression model and the least squares estimate of \beta is given by

\hat{\beta}^* = \frac{\sum_{i=1}^{t}\sum_{j=1}^{r} Y^*_{ij} X^*_{ij}}{\sum_{i=1}^{t}\sum_{j=1}^{r} X^{*2}_{ij}} = \frac{E_{xy}}{E_{xx}}
3. If

\bar{Y}_{i.} = \mu + \tau_i + \beta(\bar{X}_{i.} - \bar{X}_{..}) + \bar{\epsilon}_{i.}

is evaluated at \mu = \bar{Y}_{..}, \beta = \hat{\beta}^* and \bar{\epsilon}_{i.} \approx 0, then

4.

\hat{\mu}^*_{i.} = \bar{Y}_{i.} - \hat{\beta}^*(\bar{X}_{i.} - \bar{X}_{..})

is the adjusted estimate of the i-th treatment mean.
ANCOVA

E_{xx} and E_{xy} have been defined above. Let E_{yy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y^{*2}_{ij} and the total sum of squares be given by the formula

SS_{TO} = \sum_{i=1}^{t}\sum_{j=1}^{r}(Y_{ij} - \bar{Y}_{..})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } n-1 \text{ df} \qquad (7.2)
1. The total sum of squares (SSTO) includes the effects of the covariate:

SS_{TO_{adj}} = SS_{TO} - SS_{Reg} = SS_{TO} - \hat{\beta}SS_{xy} = SS_{TO} - \frac{SS_{xy}^2}{SS_{xx}} \quad \text{with } rt-2 \text{ df}

SS_{TO_{adj}} is the total sum of squares that includes only the treatment effects and the random errors.
2.

SS_{E_{adj}} = SSE - SS_{Reg}^* = SSE - \hat{\beta}^* E_{xy} = SSE - \frac{E_{xy}^2}{E_{xx}} \quad \text{with } t(r-1)-1 \text{ df}

SSE is the usual ANOVA error sum of squares, SS_{Reg}^* is the regression sum of squares from regressing Y^*_{ij} = Y_{ij} - \bar{Y}_{i.} on X^*_{ij} = X_{ij} - \bar{X}_{i.}, and SS_{E_{adj}} is the error sum of squares adjusted for the effects of the covariate.
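The adjustment formulas above can be traced on a small made-up data set (two treatments, r = 3, chosen so that the covariate explains all of the within-treatment variation; the numbers are illustrative only):

```python
# Hypothetical CRD with one covariate: t = 2 treatments, r = 3 units each
x = {1: [1, 2, 3], 2: [1, 2, 3]}          # covariate values
y = {1: [3, 5, 7], 2: [4, 6, 8]}          # responses
t, r = 2, 3

xbar = {i: sum(x[i]) / r for i in x}       # treatment means of X
ybar = {i: sum(y[i]) / r for i in y}       # treatment means of Y

# Within-treatment sums of squares and cross-products
Exx = sum((xij - xbar[i]) ** 2 for i in x for xij in x[i])
Exy = sum((xij - xbar[i]) * (yij - ybar[i])
          for i in x for xij, yij in zip(x[i], y[i]))
beta_hat = Exy / Exx                       # slope estimate: beta* = Exy / Exx

# Usual ANOVA error SS, then the covariate-adjusted error SS
SSE = sum((yij - ybar[i]) ** 2 for i in y for yij in y[i])
SSE_adj = SSE - Exy ** 2 / Exx             # df = t(r-1) - 1 = 3

# Adjusted treatment means: mu_i* = Ybar_i. - beta*(Xbar_i. - Xbar..)
xbar_all = sum(xij for i in x for xij in x[i]) / (t * r)
mu_adj = {i: ybar[i] - beta_hat * (xbar[i] - xbar_all) for i in y}
print(beta_hat, SSE_adj, mu_adj)
```

Because the X values are balanced across treatments here, the adjusted means coincide with the raw means; with unequal covariate means the adjustment would shift them.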
3.

SS_{T_{adj}} = SS_{TO_{adj}} - SS_{E_{adj}} \quad \text{with } t-1 \text{ df}
The following formulae are used to compute SS_{xy}, SS_{yy}, SS_{xx}, E_{xy} and E_{xx}:

SS_{xy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}X_{ij} - \frac{1}{rt}Y_{..}X_{..}

SS_{yy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - \frac{1}{rt}Y_{..}^2

SS_{xx} = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}^2 - \frac{1}{rt}X_{..}^2

E_{xx} = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}^2 - \frac{1}{r}\sum_{i=1}^{t} X_{i.}^2

E_{xy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}X_{ij} - \frac{1}{r}\sum_{i=1}^{t} Y_{i.}X_{i.}

E_{yy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - \frac{1}{r}\sum_{i=1}^{t} Y_{i.}^2
NOTE: SS_{TO_{adj}} = SS_{T_{adj}} + SS_{E_{adj}} and not SS_{TO_{adj}} = SS_{T_{adj}} + SS_{E_{adj}} + SS_{Reg}^*.

The hypotheses for the slope are:

H_0: \beta = 0
H_1: \beta \neq 0.
The test statistic for the treatment hypothesis is given by:

F = \frac{MS_{T_{adj}}}{MS_{E_{adj}}} \qquad (7.3)

F has an F distribution with t-1 degrees of freedom on the numerator and t(r-1)-1 degrees of freedom on the denominator. We conclude that treatment effects are significant at the \alpha level of significance if F_{cal} > F^{\alpha}_{t-1,\,t(r-1)-1}. The test statistic for the linear relationship is given by:

F = \frac{SS_{Reg}^*}{MS_{E_{adj}}} \qquad (7.4)

which has an F distribution with 1 degree of freedom on the numerator and t(r-1)-1 degrees of freedom on the denominator.
The hypotheses for pairwise comparisons of the treatment means are:

H_0: \mu_{i.} = \mu_{i'.}
H_1: \mu_{i.} \neq \mu_{i'.} \quad \forall\, i \neq i'
Example 7.3.1 Three different types of hand trucks have been developed and a soft drink distributor wants to study the effectiveness of using these trucks in delivery. An experiment was carried out in the company's methods engineering laboratory. The variable of interest is the delivery time in minutes (y); however, delivery time is also strongly related to the case volume delivered (x). Each hand truck is used five times and the data that follow are obtained. Analyse these data and draw conclusions.
Chapter 8
QUESTIONS
Chapter 2: CRD
1. The yields of maize, in tonnes, were recorded for 4 different varieties of maize, P, Q, R and S. The experiment was done in a controlled greenhouse environment; each variety was randomly assigned to 8 of the 32 plots available for the experiment. The yields are as in the following table:
Yield
Variety
P 2.5 3.6 2.8 2.7 3.1 3.4 2.9 3.5
Q 3.6 3.9 4.1 4.3 2.9 3.5 3.8 3.7
R 4.3 4.4 4.5 4.1 3.5 3.4 3.2 4.6
S 2.8 2.9 3.1 2.4 3.2 2.5 3.6 2.7
Weight
Method
1 80 92 87 83
2 70 81 78 74
3 63 76 70
Chapter 3: RBD
1. An experiment is conducted in which four treatments (A, B, C and D) are to be compared in three blocks (1, 2 and 3). Four experimental units are available for each block. The labels for Block 1 are S1, S2, S3, S4, for Block 2 are S5, S6, S7, S8, and for Block 3 are S9, S10, S11, S12. Use the following set of random numbers to randomise the experiment.
Block
Treatment
1 12.8 10.6 11.7 10.7 11.0
2 11.7 14.2 11.8 9.9 13.8
3 11.5 14.7 13.6 10.7 15.9
4 12.6 16.5 15.4 9.6 17.1
Perform the analysis of variance, separating out the treatment, block, and error sums of squares. Use the α = 0.05 level of significance to test the hypothesis that there is no difference between the treatment means.
Chapter 4: BIBD

the balanced design with the five blocks that follow. Analyse the data and draw conclusions.
Car
Additive 1 2 3 4 5
1 ... 17 14 13 12
2 14 14 ... 13 10
3 12 ... 13 12 9
4 13 11 11 12 ...
5 11 12 10 ... 8
Chapter 5: Latin Square and Crossover Designs

Operator
Order of Assembly 1 2 3 4
1 10 (C) 14(D) 7(A) 8(B)
2 7(B) 18(C) 11(D) 8(A)
3 5(A) 10(B) 11(C) 9(D)
4 10(D) 10(A) 12(B) 14(C)
2. The effects of two drugs (A,B) on the duration of sleep were studied using a
group of 8 patients. Four patients were randomly assigned to drug A during
period 1 and the other four to drug B. During period 2, the two groups of
patients switched drugs. The sleep duration data is as follows.
Patient
Period 1 2 3 4 5 6 7 8
1 8.6(A) 7.1(B) 8.3(A) 7.3(B) 7.9(A) 7.5(B) 6.3(A) 6.8(B)
2 8.0(B) 7.5(A) 7.4(B) 8.4(A) 7.3(B) 7.6(A) 6.4(B) 7.5(A)
(b) Give a model for this design. State all the relevant assumptions of the
model.
(c) Analyse the data and draw conclusions.
Chapter 6: Factorial Experiments

Temperature
Glass Type 100 125 150
1 580 1090 1392
570 1085 1386
2. The following data were obtained from a 2³ factorial experiment replicated three times. Evaluate the sums of squares for all factorial effects by the contrast method.
Chapter 7: Analysis of Covariance

1. Four different formulations of an industrial glue are being tested. The tensile strength of the glue is also related to the thickness. Five observations on strength (y) and thickness (x) are obtained for each formulation. The data are shown in the following table. Analyse these data and draw appropriate conclusions.

                Glue Formulation
        1           2           3           4
      Y    X      Y    X      Y    X      Y    X
1   46.5   13   48.7   12   46.3   15   44.7   16
2   45.9   14   49.0   10   47.1   14   43.0   15
3   49.8   12   50.1   11   48.9   11   51.0   10
4   46.1   12   48.5   12   48.2   11   48.1   12
5   44.3   14   45.2   14   50.3   10   48.6   11
Chapter 9
Formulae
2.

SS_T = \frac{1}{r}\sum_{i=1}^{t} Y_{i.}^2 - \frac{1}{tr}Y_{..}^2 \quad \text{with } t-1 \text{ df}

SSE = SS_{TO} - SS_T \quad \text{with } t(r-1) \text{ df}

LSD = t^{\alpha/2}_{t(r-1)}\sqrt{2\,MSE/r}
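As an illustration of the LSD formula, suppose a balanced CRD has t = 4 treatments, r = 8 replicates and MSE = 0.12 (hypothetical numbers; the critical value t_{0.025, 28} ≈ 2.048 is taken from tables rather than computed):

```python
import math

# Hypothetical balanced CRD: t = 4, r = 8, so error df = t(r-1) = 28
r, mse = 8, 0.12
t_crit = 2.048                  # t_{0.025, 28} from tables (assumed value)

lsd = t_crit * math.sqrt(2 * mse / r)
print(round(lsd, 3))            # any pair of treatment means differing by
                                # more than this is declared different
```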
3. Missing Observations
4.

SS_{TO} = \sum_{i=1}^{t}\sum_{j=1}^{r_i} Y_{ij}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } n-1 \text{ df}

SS_T = \sum_{i=1}^{t}\frac{1}{r_i}Y_{i.}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } t-1 \text{ df} \qquad (9.2)

SSE = SS_{TO} - SS_T \quad \text{with } n-t \text{ df}

LSD = t^{\alpha/2}_{n-t}\sqrt{MSE\left(\frac{1}{r_i} + \frac{1}{r_{i'}}\right)}
SS_T = \frac{1}{b}\sum_{i=1}^{t} Y_{i.}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } t-1 \text{ df}

SS_B = \frac{1}{t}\sum_{j=1}^{b} Y_{.j}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } b-1 \text{ df}

SSE = SS_{TO} - SS_T - SS_B \quad \text{with } n-t-b+1 = (t-1)(b-1) \text{ df}

LSD = t^{\alpha/2}_{(t-1)(b-1)}\sqrt{2\,MSE/b}
SS_{Model} = r\sum_{i=1}^{t}\sum_{j=1}^{b}(\bar{Y}_{ij.} - \bar{Y}_{...})^2 = \frac{1}{r}\sum_{i=1}^{t}\sum_{j=1}^{b} Y_{ij.}^2 - \frac{1}{tbr}Y_{...}^2 \quad \text{with } tb-1 \text{ df}

SS_T = \frac{1}{br}\sum_{i=1}^{t} Y_{i..}^2 - \frac{1}{tbr}Y_{...}^2 \quad \text{with } t-1 \text{ df}

SS_B = \frac{1}{tr}\sum_{j=1}^{b} Y_{.j.}^2 - \frac{1}{tbr}Y_{...}^2 \quad \text{with } b-1 \text{ df}

SS_{B \times T} = SS_{Model} - SS_T - SS_B \quad \text{with } (t-1)(b-1) \text{ df}

LSD = t^{\alpha/2}_{tb(r-1)}\sqrt{2\,MSE/(br)}
Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}

Y_{ij} = \mu + \beta_j + \epsilon_{ij}
SS_{T_{adj}} = \frac{t-1}{nk(k-1)}\sum_{i=1}^{t}\left(kY_{i.} - B_i\right)^2 \quad \text{with } t-1 \text{ df}

SS_B = \frac{1}{k}\sum_{j=1}^{b} Y_{.j}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } b-1 \text{ df}

LSD = t^{\alpha/2}_{n-t-b+1}\sqrt{\frac{2k\,MSE}{t\lambda}}
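To see the adjusted treatment sum of squares in action, consider a small hypothetical BIBD with t = 3 treatments in b = 3 blocks of size k = 2 (so r = 2, λ = 1, n = 6); here B_i is the total of all blocks containing treatment i:

```python
# Hypothetical BIBD: each block lists its (treatment, observation) pairs
blocks = [[(1, 10), (2, 12)], [(1, 11), (3, 15)], [(2, 13), (3, 16)]]
t, b, k = 3, 3, 2
n = b * k                                # total number of observations

treat_tot = {1: 0, 2: 0, 3: 0}           # Y_i. : treatment totals
B = {1: 0, 2: 0, 3: 0}                   # B_i : totals of blocks containing i

for blk in blocks:
    blk_total = sum(obs for _, obs in blk)
    for trt, obs in blk:
        treat_tot[trt] += obs
        B[trt] += blk_total

# SST_adj = (t-1) * sum_i (k*Y_i. - B_i)^2 / (n*k*(k-1))
sst_adj = (t - 1) * sum((k * treat_tot[i] - B[i]) ** 2
                        for i in treat_tot) / (n * k * (k - 1))
print(sst_adj)
```

The same value is obtained from the equivalent textbook form k Σ Q_i² / (λt) with Q_i = Y_i. − B_i/k, which is a useful arithmetic check.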
Latin Square and Crossover Designs

1.

SS_{TO} = \sum_{i=1}^{t}\sum_{j=1}^{t}(Y_{ijk} - \bar{Y}_{...})^2 = \sum_{i=1}^{t}\sum_{j=1}^{t} Y_{ijk}^2 - \frac{1}{t^2}Y_{...}^2 \quad \text{with } t^2-1 \text{ df}

SS_T = \frac{1}{t}\sum_{k=1}^{t} Y_{..k}^2 - \frac{1}{t^2}Y_{...}^2 \quad \text{with } t-1 \text{ df}

SS_R = \frac{1}{t}\sum_{i=1}^{t} Y_{i..}^2 - \frac{1}{t^2}Y_{...}^2 \quad \text{with } t-1 \text{ df}

SS_C = \frac{1}{t}\sum_{j=1}^{t} Y_{.j.}^2 - \frac{1}{t^2}Y_{...}^2 \quad \text{with } t-1 \text{ df}

SSE = SS_{TO} - SS_T - SS_R - SS_C \quad \text{with } (t-2)(t-1) \text{ df}

LSD = t^{\alpha/2}_{(t-1)(t-2)}\sqrt{2\,MSE/t}
M = \frac{t(T + R + C) - 2G}{(t-1)(t-2)}

where T, R and C are the totals of the treatment, row and column containing the missing observation, and G is the grand total, all excluding the missing value.

LSD = t^{\alpha/2}_{(t-1)(t-2)}\sqrt{MSE\left(\frac{2}{t} + \frac{1}{(t-1)(t-2)}\right)}
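For example, with hypothetical totals for a 4×4 Latin square whose missing cell lies in a treatment with total T = 28, row total R = 25, column total C = 31 and grand total G = 90 (all excluding the missing value; numbers invented for illustration):

```python
# Hypothetical 4x4 Latin square with one missing observation
t = 4
T, R, C, G = 28, 25, 31, 90   # totals excluding the missing cell (assumed)

M = (t * (T + R + C) - 2 * G) / ((t - 1) * (t - 2))
print(M)                      # estimated missing value
```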
SS_{TO} = \sum_{i=1}^{t}\sum_{j=1}^{t}\sum_{m=1}^{r}(Y_{ijkm} - \bar{Y}_{....})^2 = \sum_{i=1}^{t}\sum_{j=1}^{t}\sum_{m=1}^{r} Y_{ijkm}^2 - \frac{1}{t^2 r}Y_{....}^2 \quad \text{with } t^2r-1 \text{ df}

It is possible to partition the total sum of squares into four separate sources of variability. The computational formulae for the sums of squares are given by:
SS_T = \frac{1}{tr}\sum_{k=1}^{t} Y_{..k.}^2 - \frac{1}{t^2 r}Y_{....}^2 \quad \text{with } t-1 \text{ df}

SS_R = \frac{1}{tr}\sum_{i=1}^{t} Y_{i...}^2 - \frac{1}{t^2 r}Y_{....}^2 \quad \text{with } t-1 \text{ df}

SS_C = \frac{1}{tr}\sum_{j=1}^{t} Y_{.j..}^2 - \frac{1}{t^2 r}Y_{....}^2 \quad \text{with } t-1 \text{ df}

SSE = SS_{TO} - SS_T - SS_R - SS_C \quad \text{with } t^2r - 3t + 2 \text{ df}
SS_{TO} = \sum_{i=1}^{t}\sum_{j=1}^{rt}(Y_{ijk} - \bar{Y}_{...})^2 = \sum_{i=1}^{t}\sum_{j=1}^{rt} Y_{ijk}^2 - \frac{1}{t^2 r}Y_{...}^2 \quad \text{with } t^2r-1 \text{ df} \qquad (9.5)

SS_T = \frac{1}{tr}\sum_{k=1}^{t} Y_{..k}^2 - \frac{1}{t^2 r}Y_{...}^2 \quad \text{with } t-1 \text{ df}

SS_P = \frac{1}{tr}\sum_{i=1}^{t} Y_{i..}^2 - \frac{1}{t^2 r}Y_{...}^2 \quad \text{with } t-1 \text{ df}

SS_U = \frac{1}{t}\sum_{j=1}^{rt} Y_{.j.}^2 - \frac{1}{t^2 r}Y_{...}^2 \quad \text{with } tr-1 \text{ df}

SSE = SS_{TO} - SS_T - SS_P - SS_U \quad \text{with } (tr-2)(t-1) \text{ df}
Factorial Experiments

1. Two-factor Factorial Design in a CRD:

SS_{TO} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}(Y_{ijk} - \bar{Y}_{...})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r} Y_{ijk}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } abr-1 \text{ df}

SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}(\bar{Y}_{ij.} - \bar{Y}_{...})^2 = \frac{1}{r}\sum_{i=1}^{a}\sum_{j=1}^{b} Y_{ij.}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } ab-1 \text{ df}

SS_A = \frac{1}{br}\sum_{i=1}^{a} Y_{i..}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } a-1 \text{ df}

SS_B = \frac{1}{ar}\sum_{j=1}^{b} Y_{.j.}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } b-1 \text{ df}

SS_{AB} = SS_T - SS_A - SS_B \quad \text{with } (a-1)(b-1) \text{ df}
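The computational formulae above can be traced on a tiny made-up data set (a = b = 2 factor levels, r = 2 replicates; the numbers are illustrative only):

```python
# Hypothetical two-factor CRD: a = b = 2 levels, r = 2 replicates (toy data)
y = {(1, 1): [10, 12], (1, 2): [14, 16],
     (2, 1): [20, 18], (2, 2): [24, 26]}
a, b, r = 2, 2, 2

grand = sum(sum(v) for v in y.values())
cf = grand ** 2 / (a * b * r)              # correction factor Y...^2 / abr

ssto = sum(obs ** 2 for v in y.values() for obs in v) - cf
sst  = sum(sum(v) ** 2 for v in y.values()) / r - cf   # from cell totals
ssa  = sum(sum(sum(y[(i, j)]) for j in (1, 2)) ** 2 for i in (1, 2)) / (b * r) - cf
ssb  = sum(sum(sum(y[(i, j)]) for i in (1, 2)) ** 2 for j in (1, 2)) / (a * r) - cf
ssab = sst - ssa - ssb                     # interaction SS by subtraction
sse  = ssto - sst                          # error SS by subtraction
print(ssto, ssa, ssb, ssab, sse)
```

Each sum of squares here mirrors one line of the formulae above, with the marginal totals Y_{i..} and Y_{.j.} built from the cell totals.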
Factor A:

LSD = t^{\alpha/2}_{ab(r-1)}\sqrt{2\,MSE/(br)}

Factor B:

LSD = t^{\alpha/2}_{ab(r-1)}\sqrt{2\,MSE/(ar)}
2. Two-factor Factorial Design in an RBD:

SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}(\bar{Y}_{ij.} - \bar{Y}_{...})^2 = \frac{1}{r}\sum_{i=1}^{a}\sum_{j=1}^{b} Y_{ij.}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } ab-1 \text{ df}

SS_{Blk} = \frac{1}{ab}\sum_{k=1}^{r} Y_{..k}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } r-1 \text{ df}

SS_A = \frac{1}{br}\sum_{i=1}^{a} Y_{i..}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } a-1 \text{ df}

SS_B = \frac{1}{ar}\sum_{j=1}^{b} Y_{.j.}^2 - \frac{1}{abr}Y_{...}^2 \quad \text{with } b-1 \text{ df}

SS_{AB} = SS_T - SS_A - SS_B \quad \text{with } (a-1)(b-1) \text{ df}
In the presence of A×B interaction effects the appropriate pairwise comparisons are between the A factor level means at the j-th level of factor B, or vice versa:

LSD = t^{\alpha/2}_{(ab-1)(r-1)}\sqrt{2\,MSE/r}

A factor level means:

LSD = t^{\alpha/2}_{(ab-1)(r-1)}\sqrt{2\,MSE/(br)}

B factor level means:

LSD = t^{\alpha/2}_{(ab-1)(r-1)}\sqrt{2\,MSE/(ar)}
2^k Factorial Experiment

SS_{Factor} = \frac{(\text{Contrast}_{Factor})^2}{r \sum (\text{Contrast coefficients})^2} \qquad (9.6)
Analysis of Covariance

1.

\hat{\beta}^* = \frac{\sum_{i=1}^{t}\sum_{j=1}^{r} Y^*_{ij}X^*_{ij}}{\sum_{i=1}^{t}\sum_{j=1}^{r} X^{*2}_{ij}} = \frac{E_{xy}}{E_{xx}} \qquad (9.7)

SS_{TO} = \sum_{i=1}^{t}\sum_{j=1}^{r}(Y_{ij} - \bar{Y}_{..})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - \frac{1}{n}Y_{..}^2 \quad \text{with } n-1 \text{ df} \qquad (9.8)

SS_{TO_{adj}} = SS_{TO} - SS_{Reg} = SS_{TO} - \hat{\beta}SS_{xy} = SS_{TO} - \frac{SS_{xy}^2}{SS_{xx}} \quad \text{with } rt-2 \text{ df}

SS_{E_{adj}} = SSE - SS_{Reg}^* = SSE - \hat{\beta}^* E_{xy} = SSE - \frac{E_{xy}^2}{E_{xx}} \quad \text{with } t(r-1)-1 \text{ df}
SS_{xy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}X_{ij} - \frac{1}{rt}Y_{..}X_{..}

SS_{yy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - \frac{1}{rt}Y_{..}^2

SS_{xx} = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}^2 - \frac{1}{rt}X_{..}^2

E_{xx} = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}^2 - \frac{1}{r}\sum_{i=1}^{t} X_{i.}^2

E_{xy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}X_{ij} - \frac{1}{r}\sum_{i=1}^{t} Y_{i.}X_{i.}

E_{yy} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - \frac{1}{r}\sum_{i=1}^{t} Y_{i.}^2
LSD = t^{\alpha/2}_{t(r-1)-1}\, se(\hat{\mu}^*_{i.} - \hat{\mu}^*_{i'.})

where

se(\hat{\mu}^*_{i.} - \hat{\mu}^*_{i'.}) = \sqrt{MSE_{adj}\left(\frac{2}{r} + \frac{(\bar{X}_{i.} - \bar{X}_{i'.})^2}{E_{xx}}\right)}