Ssta032 Guide 2024

Design and Analysis of Experiments

Department of Statistics and Operations


Research

School of Mathematical and Computer Sciences


Faculty of Science and Agriculture

UNIVERSITY OF LIMPOPO

SSTA032
Contents

1 The Principles Of Experimental Design 1


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Randomisation . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Blocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 Completely Randomised Designs 7


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Advantages of using CRD . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Disadvantages of using CRD . . . . . . . . . . . . . . . . . . . . . . . 8
2.4 Analysis of the CRD . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.4.1 ANOVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4.2 Fixed effects model (Model I) . . . . . . . . . . . . . . . . . . 10
2.4.3 Random Effects Model (Model II) . . . . . . . . . . . . . . . . 13
2.5 Unbalanced CRD/Missing Observations . . . . . . . . . . . . . . . . . 14
2.5.1 Checking model assumptions . . . . . . . . . . . . . . . . . . . 15

3 Randomised Block Design (RBDs) 17


3.1 Randomised Block Design . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Analysis of the RBD . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.1 Fixed effects model . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2.2 Random Effects Model (Model II) . . . . . . . . . . . . . . . . 23
3.2.3 Checking model assumptions . . . . . . . . . . . . . . . . . . . 24
3.3 Block × Treatment Interaction Effects . . . . . . . . . . . . . . . . . 24
3.4 ANOVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4.1 Fixed effects model . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4.2 Random Effects Model (Model II) . . . . . . . . . . . . . . . . 30
3.5 Unbalanced RBD/Missing Observations . . . . . . . . . . . . . . . . . 30
3.5.1 One missing Observation . . . . . . . . . . . . . . . . . . . . . 30
3.5.2 Two or more missing Observations . . . . . . . . . . . . . . . 31

4 Balanced Incomplete Block Designs 34
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2 Analysis Of Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Pairwise Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . 35

5 Latin Square and Crossover Designs 37


5.1 Introduction: Latin Square Design . . . . . . . . . . . . . . . . . . . . 37
5.1.1 Advantages of a Latin Square Design . . . . . . . . . . . . . . 38
5.1.2 Disadvantages of a Latin Square Design . . . . . . . . . . . . 38
5.2 Analysis of a Latin Square Design . . . . . . . . . . . . . . . . . . . . 39
5.2.1 ANOVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.3 A Latin Square Design with Missing Data . . . . . . . . . . . . . . . 43
5.4 Replication of a Latin Square . . . . . . . . . . . . . . . . . . . . . . 44
5.5 Crossover Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

6 Factorial Experiments 50
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.1.1 Interaction in Factorial Experiments . . . . . . . . . . . . . . 51
6.2 Analysis of Factorial Experiments . . . . . . . . . . . . . . . . . . . . 51
6.2.1 Two factor factorial Design in a CRD . . . . . . . . . . . . . . 51
6.2.2 Two factor factorial Design in a RBD . . . . . . . . . . . . . . 58
6.3 2ᵏ Factorial Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.3.1 2² Factorial experiment . . . . . . . . . . . . . . . . . . . . . . 63
6.4 2³ Factorial Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

7 The Analysis of Covariance (ANCOVA) 71


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
7.2 A CRD with One Covariate . . . . . . . . . . . . . . . . . . . . . . . 71
7.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.3.1 Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . 74

8 QUESTIONS 77

9 Formulae 83

List of Tables

2.1 Layout of the data in CRD . . . . . . . . . . . . . . . . . . . . . . . . 8


2.2 ANOVA table for a CRD design . . . . . . . . . . . . . . . . . . . . . 11
2.3 ANOVA table for a CRD with missing observations . . . . . . . . . . 15

3.1 A Randomised block design . . . . . . . . . . . . . . . . . . . . . . . 19


3.2 ANOVA table for a randomised block design . . . . . . . . . . . . . . 22
3.3 Replicated RBD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.4 ANOVA table for a randomised block design with interactions . . . . 28
3.5 ANOVA table for testing the effects of treatments, unbalanced ran-
domised block design . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.6 ANOVA table for testing the effects of blocks, unbalanced randomised
block design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4.1 ANOVA table for a balanced incomplete block design . . . . . . . . . 35

5.1 Standard 4 × 4 latin square design . . . . . . . . . . . . . . . . . . . . 38


5.2 A 4 × 4 Latin Square design . . . . . . . . . . . . . . . . . . . . . . . 39
5.3 ANOVA table for a Latin Square Design . . . . . . . . . . . . . . . . 42
5.4 A 3 × 3 latin square design for the traffic delay experiment . . . . . . 43
5.5 A 3 × 3 latin square design for the scores . . . . . . . . . . . . . . . . 46
5.6 ANOVA table for a Latin Square Design . . . . . . . . . . . . . . . . 49

6.1 Two-factor factorial experiment with r replications in a CRD . . . . . .


6.2 ANOVA table for a Two-factor experiment in a CRD . . . . . . . . . 55
6.3 Two-factor factorial experiment in a RBD . . . . . . . . . . . . . . . 58
6.4 ANOVA table for a Two-factor experiment in a RBD . . . . . . . . . 61
6.5 2² Factorial experiment . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.6 ANOVA table for a 2² factorial experiment . . . . . . . . . . . . . . . 67
6.7 Signs for contrasts in a 2³ Factorial Experiment . . . . . . . . . . . . 68
6.8 ANOVA table for a 2³ factorial experiment . . . . . . . . . . . . . . . 70

7.1 ANCOVA table for a CRD with one covariate . . . . . . . . . . . . . 74

Chapter 1

The Principles Of Experimental


Design

1.1 Introduction
Experimental Design concerns the arrangement of various conditions or situations
to which experimental subjects (for example, people or rats) will be exposed. The
process of designing an experiment for comparing treatment or factor level means
begins by stating the objective(s) of the experiment clearly. The statement of the
objectives indicates to us what measurements are to be made (how, when, where) and
on what. Precise and accurate comparisons among treatments over an appropriate
range of conditions are the primary objectives of most experiments. These objectives
require precise estimates of means and powerful statistical tests. Reduced experimental
errors increase the possibility of achieving these objectives. Local control describes
the actions an investigator employs to reduce or control experimental errors, increase
accuracy of observations, and establish the inference base of a study. The investigator
controls the:

• Technique

• Selection of experimental units

• Blocking or ensuring parity of information on all treatments,

• Experimental design, and

• Measure of covariates

Consider the following two examples of the objectives of an experiment.

Example 1.1.1 To compare the mean weight gains of steers that are fed diets A and
B.

Example 1.1.2 To compare the mean weight gains of two year old Holstein steers
that are fed diets A and B for a period of six months.

The objective of the experiment in example 1.1.1 is vague. Why? On the other
hand the objective of the experiment in example 1.1.2 is specific. It is clear from the
statement of the objective that the experiment should be conducted as follows:

1. Weigh the available two-year old Holstein steers.

2. Divide the steers into two groups.

3. Assign one group of steers to diet A and the other group to diet B.

4. Feed the steers with their respective diets for six months and then weigh them.

5. Weight gain = final weight - initial weight.

Example 1.1.2 shows that a clear statement of the objectives of an experiment spec-
ifies

1. the set of treatments (diets A and B) whose effects are to be investigated;

2. the set of experimental units (two-year old Holstein steers) to be used; and

3. the response variable(s) (weight gain) of interest.

Definitions
Definition 1.1.1 A treatment is the level or class of a factor whose effects are to
be investigated.

In the selection of treatments it is important to define each treatment clearly and to
understand the role that each treatment will play in reaching the objectives of the
experiment.

Example 1.1.3 Refer to example 1.1.2. The factor is diet and the levels of the
factor (or treatments) are A and B.

Example 1.1.4 Suppose that we wish to compare the mean yield of maize varieties
X and Y grown under the same management and climatic conditions. In this case
Variety is the factor and the maize varieties X and Y are the factor levels.

Definition 1.1.2 An experimental unit is the smallest unit of experimental material
upon which a treatment is applied.

The unit may be a plot of land, a patient in a hospital, or a lump of dough, or it may
be a group of pigs in a pen, or a batch of seed.
Example 1.1.5 Refer to example 1.1.2. If the two year old Holstein steers are indi-
vidually fed their assigned diets, then the steers are the experimental units, otherwise,
if the steers are group fed their assigned diets then the groups are the experimental
units.

Example 1.1.6 Refer to example 1.1.4. If the varieties are assigned individual
plots, then each plot on which a variety is grown is the experimental unit.

Definition 1.1.3 The response variable is the characteristic of the experimental
unit that is measured after applying the treatment to the unit. A response variable is
also called a dependent variable.

Example 1.1.7 Refer to example 1.1.2. The response variable is the weight gain of
a steer.

Example 1.1.8 Refer to example 1.1.4. The response variable is the yield per unit
area of the plot.

Exercise 1.1.1 We wish to conduct an experiment to compare the mean caffeine
content of three brands of tea leaves. We will analyse ten tea bags of each brand
for caffeine content and record the amount of caffeine in each tea bag in milligrams.
1. What is our response variable?

2. Identify the factor we wish to study. What are the levels of the factor?

3. Identify the experimental units.

1.1.1 Randomisation
Refer to example 1.1.4. If the mean yields of the two varieties are actually the same,
what would be our conclusion from the experiment in which we assign fertile plots
to variety X and poor plots to variety Y? The answer to this question is simple.
Our experimental procedure would favour variety X. We would make the erroneous
conclusion that variety X has a higher yield than variety Y when in actual fact they
have the same yield. If allocation of the plots to the varieties is done (as above)
deliberately, then the bias in our conclusion is called subjective bias or bias due to
deliberate selection, otherwise, the bias is called systematic bias.

Subjective and systematic biases can be eliminated by randomising the experiment.
Furthermore, randomising the experiment makes the errors in the measurements
statistically independent - an assumption required by many statistical methods

of analysing data including ANOVA.

By definition, randomisation of an experiment is the random allocation of the
experimental units to the treatments or factor levels. That is, it is the allocation of
the experimental units to the treatments in a haphazard way.

Example 1.1.9 Suppose that we have developed a new diet A which we believe is
better than the existing diet B in terms of increasing the daily weight gain of steers
(of the same age) fed the diets. Furthermore suppose that four two-year-old Brahman
steers and four two-year-old Nguni steers are available for the experiment to verify our
claim. Known or unknown to us is that naturally Brahman steers grow faster than
Nguni steers of the same age raised under the same environmental conditions. If in
actual fact diet A is as good as diet B, what will be the conclusion from the experiment
whereby we assign Brahman steers to diet A and Nguni steers to diet B?

Example 1.1.10 Randomise the experiment in example 1.1.9
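One possible randomisation for example 1.1.10 can be sketched in Python. The steer labels and the random seed below are hypothetical placeholders, not part of the text.

```python
import random

# Hypothetical labels for the eight steers in example 1.1.9
steers = ["Brahman1", "Brahman2", "Brahman3", "Brahman4",
          "Nguni1", "Nguni2", "Nguni3", "Nguni4"]

random.seed(2024)       # fixing the seed makes the allocation reproducible
random.shuffle(steers)  # put the steers in a random order

# The first four shuffled steers receive diet A, the remainder diet B
allocation = {"A": steers[:4], "B": steers[4:]}
print(allocation)
```

Because the order is random, each diet is expected to receive a mixture of Brahman and Nguni steers, which is exactly what protects the comparison from the breed effect described in example 1.1.9.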

1.2 Replication
We define a basic experiment as one in which only one experimental unit is as-
signed to each treatment. Thus, each treatment appears once in a basic experiment.

Replication is the repetition of the basic experiment. In other words, replication
is the assignment of at least two experimental units to each of the treatments
whose effects are under investigation. Replication allows the accurate estimation of
the experimental error, improves the reliability of the estimates of the treatment
means and also improves the sensitivity of statistical tests for comparing treatment
means.

Example 1.2.1 Suppose that we wish to compare the effects of two treatments (T1 , T2 )
on some response. Furthermore, suppose that we have 2n (n a positive integer) identical
experimental units available for experimentation. The plan of conduct of the experi-
ment is to randomly allocate n experimental units to T1 and the remainder to T2 . The
experiment for n = 1 is our basic experiment, for n = 2 we have two replications of
our basic experiment etc.

Recall that the pooled t-test assumptions are that the errors in the Yij's are
independent and normally distributed with mean 0 and variance σ² (unknown). The
σ² is a measure of the experimental error and it is estimated by the pooled variance
which, in this case, is given by:

Sp² = (S1² + S2²)/2

What happens to Sp² if we do not replicate the basic experiment?

The estimate of the difference between the T1 mean and the T2 mean is given by:

Ȳ1. − Ȳ2.,   with variance   σ12² = 2σ²/n

The variance σ12² is estimated by σ̂12² = (2/n)Sp². The σ12² is a measure of the precision
or reliability of Ȳ1. − Ȳ2. in estimating the difference between the treatment means.
The estimate is precise or reliable if σ12² is small. The precision or reliability of the
estimate improves as we increase the number of replications of the basic experiment,
since σ12² → 0 as n → ∞. The width of the confidence interval for the difference
between treatment means decreases as n increases. That is, the confidence interval
becomes more and more precise as n is increased.
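The claim that precision improves with replication can be illustrated numerically; the error variance σ² = 4 below is an arbitrary assumed value, not from the text.

```python
# Illustration: variance of Ybar1. - Ybar2. as the number of
# replications n grows, for an assumed error variance sigma^2 = 4.
sigma2 = 4.0

variances = {n: 2 * sigma2 / n for n in (1, 2, 5, 10, 50)}
for n, v in variances.items():
    print(n, v)   # the variance 2*sigma^2/n shrinks towards 0 as n grows
```

Doubling the replication halves the variance of the estimated difference, so the gain from each extra replicate diminishes as n grows.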

Suppose that we wish to test the hypothesis

H0 : T1 mean = T2 mean   versus   H1 : T1 mean ≠ T2 mean.

If the pooled t-test assumptions (specified above) hold, then the appropriate test
statistic is given by:

t = (Ȳ1. − Ȳ2.)/σ̂12,

which has a Student t distribution with 2n − 2 degrees of freedom. If the variance of
the errors σ² is known, then the appropriate test is the z-test. The test statistic for
the z-test is given by:

z = (Ȳ1. − Ȳ2.)/σ12,

which has a standard normal distribution. Both the t-test and the z-test reject
H0 in favour of H1 if the values of |t| and |z| are large.

Consider testing the above hypothesis using the z-test. How does |z| vary with n?

A measure of the sensitivity of a test for comparing treatment means is the power
of the test to detect differences between treatment means. The power of a test is
the probability of rejecting H0 (treatment means are not different) when H1 is true
(treatment means are different).

1.3 Blocking
Blocking an experiment refers to arranging the experimental units into groups (called
blocks) within each of which the experimental units are relatively homogeneous with
respect to one or more characteristics of the units that may influence the response
of interest. Randomisation is then done independently within each block. Note that
blocking may also be based on external variables (variables that may influence the
response) associated with the experimental setting, e.g. the observer if two or more
people perform the experiment.

Blocking an experiment allows us to account for the variation in the responses that
is due to differences among the experimental units. If we block an experiment using
external variables such as time or observer, then blocking allows us to account for
the variation in responses that is due to these external variables. The consequence of
not blocking when we are supposed to block is that the variation due to differences
among the experimental units or due to the external variables cannot be separated
from that due to the random errors. This results in an estimate of the experimental
error (σ 2 ) that is biased upwards.

Example 1.3.1 Suppose that we wish to compare the effects of diet A and diet B
on the daily weight gain of steers (of the same age) fed the diets. Furthermore,
suppose that four two year old Afrikaner steers and four two year old Nguni steers
are available for the experiment. How can we design a simple experiment that can
allow us to remove the between breed variation from the experimental error?

NOTE Precise and accurate comparisons among treatments over an appropriate
range of conditions are the primary objectives of most experiments. These objectives
require precise estimates of means and powerful statistical tests. Reduced experimental
errors increase the possibility of achieving these objectives. Local control describes
the actions an investigator employs to reduce or control experimental errors, increase
accuracy of observations and establish the inference base of a study. The investigator
controls:

1. Technique

2. Selection of experimental units

3. Blocking or ensuring parity of information on all treatments

4. Choice of experimental design

5. Measure of covariates

Chapter 2

Completely Randomised Designs

2.1 Introduction
In a Completely Randomised Design (CRD) the experimental units are assigned to
the treatments completely at random. Complete randomisation ensures that every
unit has an equal chance to be assigned any one of the treatments. The value of
complete randomisation is insurance against subjective and systematic biases. If we
have n = tr experimental units and t treatments, then we randomly assign r units to
each of the t treatments to obtain a balanced completely randomised design.

We use a CRD when both the experimental units and experimental conditions are
uniform. That is, when the experimental units are identical with respect to their
characteristics that can affect the response, and there are no external variables
associated with the experimental setting that can also affect the response.

2.2 Advantages of using CRD


1. The design can be used with any number of treatments, resources permitting.

2. The number of experimental units (sample size) can be varied from treatment
to treatment without complicating the analysis of the experiment.

3. The statistical analysis of the experiment (ANOVA and estimation) is easy even
when:

• there are observations that are missing by accident or by design and


• a treatment is dropped.

2.3 Disadvantages of using CRD
1. Although CRD can be used for any number of treatments, it is best suited for
situations where there are relatively few treatments.

2. The experimental units to which treatments are applied must be homogeneous,
with no extraneous source of variability affecting them.

Example 2.3.1 Suppose that you want to compare the effects of three types of
fertiliser (X, Y, Z) on a certain variety of maize. Furthermore, suppose that 9 plots
of the same size are available for the experiment.

1. List at least two characteristics of the plots that can affect the response of in-
terest.

2. List at least two environmental factors that can affect the response of interest.

3. List at least two management practices that can affect the response of interest
besides the method and the level of application of the fertilisers.

4. In view of your answers to (1)-(3), under what conditions would a CRD be
appropriate for the experiment?

2.4 Analysis of the CRD


Table 2.1 displays the layout of the data in a CRD with t treatments and r experimental
units per treatment, where r is the number of replications of the basic CRD.
The symbols in the table have the following meanings:

Table 2.1: Layout of the data in CRD


Replication
Treatment 1 2 3 ... r Total Mean
T1 Y11 Y12 Y13 ... Y1r Y1. Ȳ1.
T2 Y21 Y22 Y23 ... Y2r Y2. Ȳ2.
. . . . ... . . .
Tt Yt1 Yt2 Yt3 ... Ytr Yt. Ȳt.

• Yij is the j th response to the ith treatment;


• Yi. = Σⱼ Yij and Ȳi. = Yi./r are the ith treatment total and treatment mean
respectively; and

• i = 1, 2, ..., t; j = 1, 2, ..., r; n = tr = total number of observations

2.4.1 ANOVA
General Model
The general linear statistical model for a CRD has the form:

Yij = µ + τi + ǫij (2.1)

Where
• µ is the overall population mean;

• τi is the effect of the ith treatment;

• µ + τi is the mean of the ith treatment;

• the ǫij's are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ²; and

• i = 1, 2, ..., t; j = 1, 2, ..., r.

Sum of Squares
The total variability in the observations (the Yij's) is measured using the total sum
of squares and is given by:

SSTO = ΣᵢΣⱼ (Yij − Ȳ..)² = ΣᵢΣⱼ Yij² − Y..²/(tr)   with tr − 1 df   (2.2)

where the sums run over i = 1, ..., t and j = 1, ..., r.

It is possible to partition the total sum of squares into two separate sources of
variability, i.e.

1. the variability among treatments (treatment sum of squares (SST)), and

2. the variability among the Yij's which is not accounted for by treatments (error
sum of squares (SSE)).

The partition can be done as follows:

SSTO = ΣᵢΣⱼ (Yij − Ȳ..)²
     = ΣᵢΣⱼ [(Ȳi. − Ȳ..) + (Yij − Ȳi.)]²
     = ΣᵢΣⱼ (Ȳi. − Ȳ..)² + ΣᵢΣⱼ (Yij − Ȳi.)² + 2 ΣᵢΣⱼ (Yij − Ȳi.)(Ȳi. − Ȳ..)
     = SST + SSE + 0,

since the cross-product term is zero.

The sum of squares formulas just discussed are definitional and are not convenient
to use in calculations. The computational formulae are given by:

SST = (1/r) Σᵢ Yi.² − Y..²/(tr)   with t − 1 df
SSE = SSTO − SST   with t(r − 1) df

The mean squares are obtained by dividing the sums of squares by the corresponding
degrees of freedom, and they are used for testing hypotheses about the treatment
effects.
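As a rough illustration of the computational formulae, the following sketch evaluates SSTO, SST and SSE for a small made-up balanced data set (t = 2 treatments, r = 3 replications); the numbers are not from the text.

```python
# Minimal sketch of the computational sums of squares for a balanced CRD.
# data[i] holds the r responses for treatment i (made-up numbers).
data = [
    [9.0, 11.0, 10.0],   # T1
    [12.0, 14.0, 13.0],  # T2
]
t, r = len(data), len(data[0])

grand_total = sum(sum(row) for row in data)                 # Y..
ss_all = sum(y * y for row in data for y in row)            # sum of Yij^2
correction = grand_total ** 2 / (t * r)                     # Y..^2 / (tr)

ssto = ss_all - correction                                  # tr - 1 df
sst = sum(sum(row) ** 2 for row in data) / r - correction   # t - 1 df
sse = ssto - sst                                            # t(r - 1) df
print(ssto, sst, sse)
```

For these numbers SSTO = 17.5, SST = 13.5 and SSE = 4.0, and the two computational shortcuts agree with the definitional forms.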

2.4.2 Fixed effects model (Model I)


The τi's are regarded as fixed real constants satisfying the constraint:

Σᵢ τi = 0,   where the sum runs over i = 1, ..., t.

Hypothesis Testing
We use model I to analyse a CRD when the conclusions of the experiment are to
pertain to the particular set of treatments included in the experiment, i.e. conclusions
cannot be extended to any other treatments that were not included in the experiment.

The objective is to test the null hypothesis of no difference among the treatment
means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τi ≠ 0 (at least one of the treatment means differs from the rest)

The test statistic for these hypotheses is given by:


M ST
F = (2.3)
M SE
where MST and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with t − 1 degrees of freedom
in the numerator and t(r − 1) degrees of freedom in the denominator.

To show that F is a reasonable test statistic for the above hypothesis we need to
show that MSE is an unbiased estimate of the experimental error (σ²), i.e.

E[MSE] = σ²

Under model I, we can also show that:

E[MST] = σ² + (r/(t − 1)) Σᵢ τi²

When testing the hypothesis H0 versus H1 at the α level of significance we note that
when H0 is true, both MST and MSE are unbiased estimates of σ². If H1 is true,
MSE is still an unbiased estimate of σ², but MST is an unbiased estimate of

σ² + (r/(t − 1)) Σᵢ τi² > σ²

From the explanation above we expect the value of F to be close to 1 if H0 is true
and to be large if H1 is true. How large is large enough is the question. The decision
rule for the testing problem is to reject H0 in favour of H1 if Fcal > F(α; t − 1, t(r − 1)),
the critical value obtained from the F-tables. The analysis of variance table for a
completely randomised design is as follows:

Table 2.2: ANOVA table for a CRD design


Source df SS MS F
Treatments t−1 SST M ST M ST /M SE
Error t(r − 1) SSE M SE
Totals tr − 1 SST O

Pairwise Comparisons of the treatments


They are done only if the ANOVA test under the fixed effects model concludes that
some treatment means are different. We perform pairwise comparisons of the treat-
ment means in order to determine which means are different. In this case the
hypotheses to be tested are:

H0 : µi. = µi′.
H1 : µi. ≠ µi′.   for all i ≠ i′

NOTE: Comparisons should be made at the same level of significance as that used
in the ANOVA. If we choose to use the least significant difference method for making

the comparisons, then the hypotheses are tested using the t-test whose test statistic
is given by:

t = (Ȳi. − Ȳi′.)/√(2MSE/r)

where √(2MSE/r) is the standard error of Ȳi. − Ȳi′.. The test statistic has a Student
t-distribution with t(r − 1) degrees of freedom. We declare means to be significantly
different at the α level of significance if:

|Ȳi. − Ȳi′.|/√(2MSE/r) > t(α/2; t(r − 1))

or equivalently

|Ȳi. − Ȳi′.| > t(α/2; t(r − 1)) √(2MSE/r)

The least significant difference quantity is:

LSD = t(α/2; t(r − 1)) √(2MSE/r)

where t(α/2; ν) is the upper α/2 point of the t distribution with ν degrees of freedom.
The LSD procedure simply compares the observed absolute difference between each
pair of averages to the corresponding LSD. The means µi. and µi′. are declared
significantly different if

|Ȳi. − Ȳi′.| > LSD.
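The LSD procedure can be sketched as follows. The MSE, replication count and treatment means are illustrative values (they correspond to a hand calculation of example 2.4.1's data, which should be checked independently), and 2.179 is the two-sided 5% critical value of t with t(r − 1) = 12 degrees of freedom.

```python
import math

# Sketch of the LSD comparison for a balanced CRD (illustrative numbers).
mse, r = 7.15, 5
t_crit = 2.179   # t(0.025; 12) from standard tables

lsd = t_crit * math.sqrt(2 * mse / r)
means = {"A": 13.3, "B": 12.6, "C": 11.7}

pairs = [("A", "B"), ("A", "C"), ("B", "C")]
for i, j in pairs:
    diff = abs(means[i] - means[j])
    print(i, j, "significant" if diff > lsd else "not significant")
```

Here the LSD is about 3.69, larger than every pairwise difference, so no pair of means would be declared different; this is consistent with only running the comparison after a significant ANOVA F-test.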

Example 2.4.1 A study was undertaken to compare the distance travelled in kilome-
tres per litre of three competing brands of petrol. Fifteen identical cars were available
for the experiment. The brands of petrol, i.e. brands A, B and C, were each assigned
five cars. The cars were operated under the same conditions and the distance trav-
elled by each car per litre of the assigned brand of petrol was recorded. The results are
displayed in the following table. Analyse the data and draw appropriate conclusions.

Replication
Brand 1 2 3 4 5
A 9.5 11.0 13.0 15.0 18.0
B 10.5 12.0 14.0 16.0 10.5
C 10.0 10.5 13.5 14.5 10.0

1. Assuming that model I is appropriate for the experiment, test the relevant
hypothesis at the 0.05 level of significance.

2. Is it appropriate to perform pairwise comparisons of the brand means? Give
reasons for your answer.

3. If you are to perform pairwise comparisons of the brand means, what value of
LSD would you use?
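One way to check the ANOVA computations for this example is a short pure-Python sketch (no statistical library assumed):

```python
# ANOVA computations for example 2.4.1 (balanced CRD, t = 3, r = 5).
data = {
    "A": [9.5, 11.0, 13.0, 15.0, 18.0],
    "B": [10.5, 12.0, 14.0, 16.0, 10.5],
    "C": [10.0, 10.5, 13.5, 14.5, 10.0],
}
t, r = len(data), 5
n = t * r

grand = sum(sum(v) for v in data.values())
correction = grand ** 2 / n
ssto = sum(y * y for v in data.values() for y in v) - correction
sst = sum(sum(v) ** 2 for v in data.values()) / r - correction
sse = ssto - sst

mst, mse = sst / (t - 1), sse / (t * (r - 1))
f = mst / mse
print(round(f, 2))
```

The resulting F is about 0.45, well below the 5% critical value F(0.05; 2, 12) ≈ 3.89, so on these data H0 would not be rejected: there is no evidence of a difference among the brand means, which also answers question 2 above.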

2.4.3 Random Effects Model (Model II)
For an experiment where the treatments or the factor levels of a factor are randomly
chosen from a population of treatments or factor levels, the appropriate model is a
random effects model (Model II). Model II is also model (2.1) but with the:

• τi's random variables which are independent and normally distributed with mean
zero and variance στ²; and

• ǫij's and τi's assumed to be independent random variables.

Hypothesis Testing
We wish to test the hypothesis

H0 : στ² = 0, i.e. the treatment effects are identical

H1 : στ² > 0, i.e. the treatment effects are not identical

variance component Estimation


If we reject H0 under model II we conclude that στ² > 0 at the α level of significance.
The question still remains, "How large is στ²?" We estimate στ² as follows. Since

E[MSE] = σ² and E[MST] = σ² + rστ²

then

στ² = (E[MST] − E[MSE])/r

Therefore the most obvious estimate of στ² is given by:

σ̂τ² = max[0, (MST − MSE)/r]
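The estimator can be sketched as a small function; the MST, MSE and r values in the calls below are made-up numbers, not from the text.

```python
# Sketch of the Model II variance-component estimator.
def tau_variance(mst: float, mse: float, r: int) -> float:
    """Return max(0, (MST - MSE)/r), the usual estimate of sigma_tau^2."""
    return max(0.0, (mst - mse) / r)

print(tau_variance(mst=10.0, mse=4.0, r=3))   # (10 - 4)/3 = 2.0
print(tau_variance(mst=3.0, mse=4.0, r=3))    # negative, truncated to 0.0
```

The truncation at zero is the point of the max: when MST falls below MSE the method-of-moments value is negative, which is impossible for a variance, so the estimate is set to 0.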
r
Example 2.4.2 To compare the lightning discharge intensities at all areas in South
Africa a CRD was used. Three areas were randomly chosen for the study and the
lightning tracking equipment assembled at those areas. On each day of the month of
December between 0800hrs and 1700hrs lightning was monitored at the three areas
until the maximum intensity had been recorded for five separate storms. The sample
data is in the following table. Analyse the data and draw appropriate conclusions.

1. Assuming that model II is appropriate for the experiment, test the relevant
hypothesis at the 0.05 level of significance.

2. Is it necessary to estimate στ²? Give reasons for your answer.

3. If you are to estimate στ², what would be its estimate?

Intensity
Area 1 2 3 4 5
A 20 1050 3200 5600 50
B 4300 70 2560 3650 80
C 100 7700 8500 2960 3340

2.5 Unbalanced CRD/Missing Observations


Missing observations may occur by design or by accident. An unbalanced CRD is
a CRD for which the numbers of responses within the treatments are not all equal. The
analysis of variance for the unbalanced CRD is the same as that for the balanced one
with some slight modifications in the formulae. The general model is the same as (2.1)
but with j = 1, 2, ..., ri, where ri is the number of replications of the ith treatment.
The total number of observations (experimental units) becomes

n = Σᵢ ri

The ith treatment total and the estimate of the overall mean (µ) are given by:

Yi. = Σⱼ Yij   and   Ȳ.. = (1/n) ΣᵢΣⱼ Yij

respectively, where the sums run over i = 1, ..., t and j = 1, ..., ri.
The formulae to calculate the sums of squares are given by:

SSTO = ΣᵢΣⱼ Yij² − Y..²/n   with n − 1 df
SST = Σᵢ Yi.²/ri − Y..²/n   with t − 1 df   (2.4)
SSE = SSTO − SST   with n − t df

The mean squares are obtained by dividing the sums of squares by the corresponding
degrees of freedom, and they are used for testing hypotheses about the treatment
effects.

The analysis of variance table for a completely randomised design with missing
observations is as follows. The LSD for comparing two treatment means is given by

LSD = t(α/2; n − t) √(MSE(1/ri + 1/ri′))
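A sketch of the unbalanced-CRD computations, using made-up data with unequal replication; 2.447 is the two-sided 5% critical value of t with n − t = 6 degrees of freedom.

```python
import math

# Unbalanced CRD sums of squares and LSD (illustrative data).
groups = [
    [9.0, 11.0, 10.0],        # r1 = 3
    [12.0, 14.0],             # r2 = 2 (one observation missing)
    [8.0, 9.0, 10.0, 9.0],    # r3 = 4
]
t = len(groups)
n = sum(len(g) for g in groups)

grand = sum(sum(g) for g in groups)
correction = grand ** 2 / n                                 # Y..^2 / n
ssto = sum(y * y for g in groups for y in g) - correction   # n - 1 df
sst = sum(sum(g) ** 2 / len(g) for g in groups) - correction  # t - 1 df
sse = ssto - sst                                            # n - t df
mse = sse / (n - t)

# LSD for treatments 1 and 2; 2.447 is t(0.025; 6) from standard tables.
lsd_12 = 2.447 * math.sqrt(mse * (1 / len(groups[0]) + 1 / len(groups[1])))
print(round(sst, 3), round(sse, 3), round(lsd_12, 3))
```

Note that the LSD now depends on which pair of treatments is compared, because ri and ri′ differ; this is the only structural change from the balanced case.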

Table 2.3: ANOVA table for a CRD with missing observations
Source df SS MS F
Treatments t − 1 SST M ST M ST /M SE
Error n − t SSE M SE
Totals n − 1 SST O

2.5.1 Checking model assumptions


There are graphical and formal statistical methods used to check the assumptions
of the analysis of variance. These methods are based on the estimates of the errors
called residuals. The underlying assumptions of ANOVA are that the data are
adequately described by the specified model and that the errors (ǫ′ij s) are independent
and normally distributed with mean 0 and constant variance σ 2 . In this case the
residuals for model 2.1 are given by:

ǫ̂ij = Yij − Ȳi.

Example 2.5.1 Refer to example 2.4.1. Calculate the residuals for the data.

Normality Assumption
A histogram of the residuals can help to detect skewness in the distribution of the
residuals. We expect the histogram to be approximately bell shaped about zero if the
distribution of the errors is indeed normal with mean 0. We can also use a normal
probability plot to detect non-normality of the errors. If the error distibution is
normal then this plot, should be approximately straight. To construct the probability
plot we follow the following steps:

1. Compute the standardised residuals by dividing each residual by √MSE.

2. Arrange the standardised residuals in increasing order of magnitude.

3. Calculate the cumulative probability point (pi) for the ith ordered standardised residual using the formula pi = (i − 1/2)/n, where n = rt.

4. Plot the value of the ith ordered standardised residual on the horizontal axis against pi on the vertical axis. The vertical axis goes from 0.01 to 0.99 and corresponds to the cumulative distribution function of a normal distribution.
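The four steps above can be sketched numerically; the residuals and the MSE below are hypothetical:

```python
import numpy as np

def probability_plot_points(residuals, mse):
    """Steps 1-3: standardise the residuals, order them, and attach the
    cumulative probability points pi = (i - 1/2)/n."""
    d = np.asarray(residuals) / np.sqrt(mse)   # step 1: standardised residuals
    d = np.sort(d)                             # step 2: increasing order
    n = d.size
    p = (np.arange(1, n + 1) - 0.5) / n        # step 3: cumulative points
    return d, p

# hypothetical residuals from a small CRD, with an assumed MSE of 0.64
d, p = probability_plot_points([-1.2, 0.4, 0.8, -0.3, 0.3], mse=0.64)
# step 4: plot d on the horizontal axis against p on the vertical axis
```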

NOTE that: Moderate departures from normality do not seriously invalidate the conclusions of the analysis of variance for the fixed effects model. They only cause the level of significance to differ from the specified value and the tests to be slightly less powerful in detecting treatment differences. Conclusions of the ANOVA for the random effects model are more seriously affected by non-normality, in the sense that the estimates of the variance components may be inaccurate.

Constant Variance Assumption


A plot of residuals versus the treatment averages (Ȳi. ) can be used to check the
assumption of constant error variance. If error assumptions are not violated and
model 2.1 is correct then the errors and hence the residuals should be patternless.
The conclusions of the F-test are slightly affected by the violation of the constant error
variance assumption in the case of a fixed effects model for balanced data. Conclusions
become invalid if the data are unbalanced and/or some error variances are much larger
than others. For the random effects model, the conclusions are generally invalid even
for balanced data if the constant error variance assumption is violated.

Independence Assumption
When responses are measured at successive time points, the errors in the responses
may become related through time. The assumption of independent errors would
be violated if this happens. We check this assumption by plotting the residuals versus time. If the plot shows any obvious pattern, this implies that the assumption of independence is violated. Once this assumption is violated, all the conclusions of the analysis of variance become invalid.

Chapter 3

Randomised Block Design (RBDs)

3.1 Randomised Block Design


In many experimental problems, it is necessary to design the experiment so that
variability arising from extraneous sources can be systematically controlled. The
randomised complete block design is perhaps the most widely used experimental
design. Situations for which the randomised complete block design is appropriate are
numerous, and easily detected with practice. For example, units of test equipment or
machinery are often different in their operating characteristics and would be a typical
blocking factor.
NOTE
The word "complete" indicates that each block contains all the treatments. By using this design, the experimental units within each block form a relatively homogeneous group.

We construct an RBD as follows

1. The experimental units are arranged into groups (called blocks) in such a
way that within each block, the experimental units are relatively homogeneous
with respect to one or more characteristics of the units that may influence the
response of interest.

2. Complete randomisation is then done independently within each block.

Blocking allows us to remove the variation due to the differences among the blocks from the experimental error. Complete randomisation within each block offers insurance against subjective and systematic biases, thereby making the conclusions from the experiment more accurate.
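Step 2 of the construction (independent randomisation within each block) can be sketched as follows; the treatment and block labels are hypothetical:

```python
import random

def randomise_rbd(treatments, blocks):
    """Assign every treatment once per block, in an independently
    randomised order within each block (step 2 above)."""
    layout = {}
    for block in blocks:
        order = treatments[:]      # each block receives the complete set
        random.shuffle(order)      # independent randomisation per block
        layout[block] = order
    return layout

plan = randomise_rbd(["A", "B", "C"], ["block1", "block2", "block3", "block4"])
```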

Advantages of an RBD
1. It can be used to accommodate any number of treatments and replications in any
number of blocks.

2. If blocking is effective, then the RBD can provide more precise conclusions than
a CRD that uses the same experimental units.

3. The statistical analysis is simple (two way ANOVA) even when an entire block
or treatment is dropped.

4. The design is easy to construct and allows variable experimental units to be used without sacrificing the precision of the conclusions.

Disadvantages of an RBD
1. Since the experimental units within a block must be homogeneous, the design
is best suited for a relatively small number of treatments.

2. The statistical analysis of the experiment is complicated if there are some missing observations within blocks.

3. Degrees of freedom for the experimental error are lost to blocking.

4. The model for the RBD experiment is more complicated than that for a CRD
experiment, and requires more assumptions.

Example 3.1.1 A study was undertaken to compare the starting salaries of bach-
elor’s degree candidates at the University of Limpopo from the school of computa-
tional and mathematical sciences for the academic years 2004-2005, 2005 -2006 and
2006-2007. Three students from mathematics, three students from statistics and three
students from computer science were available for the experiment. It should be noted
that only those students who had accepted a job were considered in this study.

1. What are the experimental units in the proposed experiment?

2. Which design, CRD or RBD, is appropriate for the experiment? Give reasons for your answer.

3. In case your answer to (2) is RBD, what is the blocking factor?

3.2 Analysis of the RBD
A randomised block design can be used to compare t population treatment means
when an additional source of variability (blocks) is present.

Definition 3.2.1 A randomised block design is an experimental design for comparing


t treatments in b blocks. Treatments are randomly assigned to experimental units
within a block, with each treatment appearing exactly once in every block.

General model
Both the fixed effects model (model I) and the random effects model (model
II) for the RBD have the form:

Yij = µ + τi + βj + ǫij          (3.1)

Where

• µ is the overall population mean;

• τi is the effect of the ith treatment;

• βj is the effect of the j th block;

• µ + τi is the mean of the ith treatment;

• the ǫ′ij s are random errors associated with the response on treatment i, block
j which are assumed to be independent and normally distributed with mean 0
and variance σ 2 ; and

• i = 1, 2, ..., t; j = 1, 2, ..., b.

Table 3.1: A Randomised block design


Block
Treatment 1 2 3 ... b Total Mean
1 Y11 Y12 Y13 ... Y1b Y1. Ȳ1.
2 Y21 Y22 Y23 ... Y2b Y2. Ȳ2.
. . . . ... . . .
t Yt1 Yt2 Yt3 ... Ytb Yt. Ȳt.
Total Y.1 Y.2 Y.3 ... Y.b Y..
Mean Ȳ.1 Ȳ.2 Ȳ.3 ... Ȳ.b Ȳ..

The symbols in the table have the following meanings:

• Yij is the response to the ith treatment in the j th block;

• Yi. = Σ_{j=1}^{b} Yij and Ȳi. = (1/b)Yi. are the ith treatment total and treatment mean respectively, and

• Y.j = Σ_{i=1}^{t} Yij and Ȳ.j = (1/t)Y.j are the j th block total and block mean respectively, and

• i = 1, 2, ..., t; j = 1, 2, ..., b; n = tb = total number of observations

Sum of Squares
The total variability in the observations (Yij's) is measured using the total sum of squares and is given by:

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{b} (Yij − Ȳ..)² = Σ_{i=1}^{t} Σ_{j=1}^{b} Yij² − (1/n)Y..²   with n − 1 df   (3.2)

It is possible to partition the total sum of squares into three separate sources of
variability i.e.

1. due to the variability among treatments (treatment sum of squares (SST)),

2. due to the variability among the blocks (block sum of squares (SSB)) and

3. due to the variability among the Yij's which is not accounted for by either treatments or blocks (error sum of squares (SSE))

The partition can be done as follows


SSTO = Σ_{i=1}^{t} Σ_{j=1}^{b} (Yij − Ȳ..)²
     = Σ_{i=1}^{t} Σ_{j=1}^{b} [(Ȳi. − Ȳ..) + (Ȳ.j − Ȳ..) + (Yij − Ȳi. − Ȳ.j + Ȳ..)]²
     = b Σ_{i=1}^{t} (Ȳi. − Ȳ..)²   (= SST)
       + t Σ_{j=1}^{b} (Ȳ.j − Ȳ..)²   (= SSB)
       + Σ_{i=1}^{t} Σ_{j=1}^{b} (Yij − Ȳi. − Ȳ.j + Ȳ..)²   (= SSE)

The sum of squares formulas just discussed are definitional and are not convenient to use in calculations. The computational formulae are given by:

SST = (1/b) Σ_{i=1}^{t} Yi.² − (1/n)Y..²   with t − 1 df

SSB = (1/t) Σ_{j=1}^{b} Y.j² − (1/n)Y..²   with b − 1 df

SSE = SSTO − SST − SSB   with n − t − b + 1 = (t − 1)(b − 1) df

The mean sum of squares are obtained by dividing the sums of squares by the corre-
sponding degrees of freedom and they are used for testing the hypothesis about the
treatment and the block effects.

3.2.1 Fixed effects model


The τi's and βj's are regarded as fixed real constants satisfying the constraints:

Σ_{i=1}^{t} τi = Σ_{j=1}^{b} βj = 0

Hypothesis Testing
The objective is to test the null hypothesis of no difference among the treatment means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τi ≠ 0 (at least one of the treatment means differs from the rest)

The test statistic for these hypotheses is given by:

F = MST / MSE          (3.3)
where MST and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with t − 1 degrees of freedom
on the numerator and (t − 1)(b − 1) degrees of freedom on the denominator.

We may also be interested in testing whether it was advantageous to block. The hypotheses to be tested are:

H0 : β1 = β2 = ... = βb = 0
H1 : at least one βj ≠ 0 (at least one of the block means differs from the rest)

The test statistic for these hypotheses is given by:

F = MSB / MSE          (3.4)

where MSB and MSE are mean squares computed from the appropriate sums of squares in the ANOVA table. F has an F distribution with b − 1 degrees of freedom on the numerator and (t − 1)(b − 1) degrees of freedom on the denominator. We conclude that blocking was effective at the α level of significance if F > F^α_{b−1,(t−1)(b−1)}.
The Analysis of variance table for a randomised block design is as follows:

Table 3.2: ANOVA table for a randomised block design


Source df SS MS F
Treatments t−1 SST M ST M ST /M SE
Blocks b−1 SSB M SB M SB/M SE
Error (b − 1)(t − 1) SSE M SE
Totals bt − 1 SSTO

Pairwise Comparisons of the treatments


They are done only if the ANOVA test under the fixed effects model concludes that
some treatment means are different. In this case the hypotheses to be tested are:

H0 : µi. = µi′.
H1 : µi. ≠ µi′.   for all i ≠ i′

NOTE: Comparisons should be made at the same level of significance as that used in the ANOVA. If we choose to use the least significant difference method for making the comparisons, then the least significant difference is given by:

LSD = t_{(t−1)(b−1)}^{α/2} √(2MSE/b)

The means µi. and µi′ . are declared significantly different if

|Ȳi. − Ȳi′.| > LSD.

Example 3.2.1 A study was undertaken to compare the starting salaries of bach-
elor’s degree candidates at the University of Limpopo from the school of computa-
tional and mathematical sciences for the academic years 2004-2005, 2005 -2006 and
2006-2007. Three students from mathematics, three students from statistics and three
students from computer science were available for the experiment. It should be noted
that only those students who had accepted a job were considered in this study.

Curriculum
Year Mathematics Statistics Computer Science
2004-2005 10.6 12.0 11.0
2005-2006 9.0 15.0 12.0
2006-2007 12.0 17.4 13.0

1. Assuming that model I is appropriate for the experiment, test the relevant hypothesis at the 0.05 level of significance.

2. Is it appropriate to perform pairwise comparisons of the yearly salary means? Give reasons for your answer.

3. If you are to perform pairwise comparisons of the yearly salary means, what
value of LSD would you use?
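As a sketch of the calculations for this example (taking the three academic years as treatments and the three curricula as blocks, the roles suggested by the questions; the critical value quoted in the comment is F(0.05; 2, 4) = 6.94):

```python
import numpy as np

# rows = academic years (treatments), columns = curricula (blocks)
Y = np.array([[10.6, 12.0, 11.0],
              [ 9.0, 15.0, 12.0],
              [12.0, 17.4, 13.0]])
t, b = Y.shape
n = t * b
CF = Y.sum() ** 2 / n                      # correction factor (1/n)Y..^2

SSTO = (Y ** 2).sum() - CF
SST = (Y.sum(axis=1) ** 2).sum() / b - CF  # treatments (years)
SSB = (Y.sum(axis=0) ** 2).sum() / t - CF  # blocks (curricula)
SSE = SSTO - SST - SSB

MST = SST / (t - 1)
MSE = SSE / ((t - 1) * (b - 1))
F = MST / MSE                              # compare with F(0.05; 2, 4) = 6.94
```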

3.2.2 Random Effects Model (Model II)


Model II is also model (3.1) but with the:

• τi's random variables which are independent and normally distributed with mean zero and variance στ²;

• βj's random variables which are independent and normally distributed with mean zero and variance σβ²;

• ǫij's, τi's and βj's assumed to be independent random variables

Hypothesis Testing
We wish to test the hypothesis

H0 : στ² = 0  i.e. the treatment effects are identical
H1 : στ² > 0  i.e. the treatment effects are not identical

We may also wish to test the effectiveness of blocking:

H0 : σβ² = 0  i.e. blocking was not effective
H1 : σβ² > 0  i.e. blocking was effective

NOTE Model II is used when the set of treatments and the set of blocks used
in the RBD experiment are random samples from their respective treatment and
block populations, and the conclusions of the experiment are to be extended to these
populations.

3.2.3 Checking model assumptions
The graphical methods presented in section 2.5.1 apply in this case and in subsequent cases. However, in this case the residuals for model (3.1) are given by:

ǫ̂ij = Yij − Ŷij

where
Ŷij = Ȳi. + Ȳ.j − Ȳ..
Ŷij is the predicted mean response to the ith treatment in the j th block, i.e. it is an estimate of µij.

3.3 Block × Treatment Interaction Effects


When the difference in treatment means is not the same for different blocks, the model
is no longer additive and we say that the two factors treatment and blocks interact.

Definition 3.3.1 Two factors A and B are said to interact if the difference in mean
responses for two levels of one factor is not constant across levels of the second factor.

The presence of block by treatment interaction effects tends to inflate the estimate of the experimental error, and this has the effect of making the tests for comparing the treatment means insensitive to treatment differences. To account for the variation in
the responses that is due to the block by treatment interaction we can replicate the
basic RBD.

Table 3.3: Replicated RBD


Block
Treatment 1 2 ... b Total Mean
1 Y111 , Y112 , ..., Y11r Y121 , Y122 , ..., Y12r ... Y1b1 , Y1b2 , ..., Y1br Y1.. Ȳ1..
Y11. (Ȳ11.) Y12. (Ȳ12.) ... Y1b. (Ȳ1b.)
2 Y211 , Y212 , ..., Y21r Y221 , Y222 , ..., Y22r ... Y2b1 , Y2b2 , ..., Y2br Y2.. Ȳ2..
Y21. (Ȳ21. ) Y22. (Ȳ22. ) ... Y2b. (Ȳ2b. )
. . . . ... . .
. . . . ... . .
. . . . ... . .
t Yt11 , Yt12 , ..., Yt1r Yt21 , Yt22 , ..., Yt2r ... Ytb1 , Ytb2 , ..., Ytbr Yt.. Ȳt..
Yt1. (Ȳt1. ) Yt2. (Ȳt2. ) ... Ytb. (Ȳtb. )
Total Y.1. Y.2. ... Y.b. Y...
Mean Ȳ.1. Ȳ.2. ... Ȳ.b. Ȳ...

The symbols in the table Have the following meanings:

• Yijk is the k th replicate response to the ith treatment in the j th block;

• Yi.. = Σ_{j=1}^{b} Σ_{k=1}^{r} Yijk and Ȳi.. = (1/br)Yi.. are the ith treatment total and treatment mean respectively,

• Y.j. = Σ_{i=1}^{t} Σ_{k=1}^{r} Yijk and Ȳ.j. = (1/tr)Y.j. are the j th block total and block mean respectively, and

• i = 1, 2, ..., t; j = 1, 2, ..., b; k = 1, 2, ..., r; n = tbr = total number of observations

To check for the presence of the block by treatment interaction we can use the
following methods

1. If the differences Ȳij. − Ȳi′j. are approximately the same for all i ≠ i′ and all j, then there may be no block by treatment interaction.

2. If Ȳij. ≈ Ȳi.. + Ȳ.j. − Ȳ... for all i and j, then there may be no block by treatment interactions.

3. We can also check for interaction using graphs plotted from the treatment means
i.e Ȳij. versus j (or Ȳij. versus i). If the curves of the graph are almost parallel
then there may be no block by treatment interactions.

Example 3.3.1 To estimate the various components of variability in a filtration process, the percent of material lost in the mother liquor was measured for 8 experimental units, two runs on each condition. Two filters and two operators were selected at random to use in the experiment, resulting in the following measurements:
Do the data suggest the presence of the operator by filter interaction?

Operator
Filter 1 2
1 7.6,8.8 22.2,23.4

2 19.5,17.6 30.1,24.2
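Method 1 above can be applied directly to the cell means of these data; the difference between the two filter means is computed within each operator, and the two differences are compared:

```python
import numpy as np

# rows = filters, columns = operators, two runs per cell (Example 3.3.1)
runs = np.array([[[7.6, 8.8], [22.2, 23.4]],
                 [[19.5, 17.6], [30.1, 24.2]]])
cell_means = runs.mean(axis=2)            # the Ȳij. cell means

# difference between the two filter means within each operator
diff_op1 = cell_means[0, 0] - cell_means[1, 0]
diff_op2 = cell_means[0, 1] - cell_means[1, 1]
# the differences are far apart, suggesting an operator-by-filter interaction
```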

3.4 ANOVA
The model for an RBD with interactions has the form

Yijk = µ + τi + βj + (τβ)ij + ǫijk          (3.5)

Where

• µ is the overall population mean;

• τi is the effect of the ith treatment;

• βj is the effect of the j th block;

• (τ β)ij is the interaction effect of the ith treatment and the j th block;

• µij is the mean of the ith treatment when in the j th block and

• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 ; and

• i = 1, 2, ..., t; j = 1, 2, ..., b; k = 1, 2, ...r.

Sum of Squares
The total variability in the observations (Yijk's) is measured using the total sum of squares and is given by:

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳ...)² = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} Yijk² − (1/tbr)Y...²   with tbr − 1 df   (3.6)

We can break down the Total Sum of Squares into:

1. Model sum of squares (SSModel) which measures the variability in the ob-
servations that is due to the block, treatment and block by treatment interaction
effects;

2. Error Sum of Squares (SSE) which measures the variability in the observa-
tions that is due to the random errors.

We obtain SSModel by performing a two-way ANOVA of the data with Model as the factor. It follows that the formula for SSModel is:

SSModel = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Ȳij. − Ȳ...)² = (1/r) Σ_{i=1}^{t} Σ_{j=1}^{b} Yij.² − (1/tbr)Y...²   with tb − 1 df

We obtain SSE by subtraction as usual, i.e.

SSE = SSTO − SSModel   with tb(r − 1) df

It is possible to partition the total sum of squares into these two separate sources of variability, i.e.

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳ...)²
     = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} [(Ȳij. − Ȳ...) + (Yijk − Ȳij.)]²
     = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Ȳij. − Ȳ...)²   (= SSModel)
       + Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳij.)²   (= SSE)
       + 2 Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Ȳij. − Ȳ...)(Yijk − Ȳij.)          (3.7)

where the final cross-product term equals 0.

We can also decompose SSModel into three separate sources of variation as follows:

1. Treatment Sum of Squares (SST) which measures the variation due to the treatment effects;

2. Block Sum of Squares (SSB) which measures the variation due to the block effects and

3. Block by Treatment Sum of Squares (SSB×T) which measures the variation due to the block by treatment interaction effects.

It follows that SSModel can be decomposed as follows

SSModel = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} (Ȳij. − Ȳ...)²
        = Σ_{i=1}^{t} Σ_{j=1}^{b} Σ_{k=1}^{r} [(Ȳi.. − Ȳ...) + (Ȳ.j. − Ȳ...) + (Ȳij. − Ȳi.. − Ȳ.j. + Ȳ...)]²
        = br Σ_{i=1}^{t} (Ȳi.. − Ȳ...)²   (= SST)
          + tr Σ_{j=1}^{b} (Ȳ.j. − Ȳ...)²   (= SSB)
          + r Σ_{i=1}^{t} Σ_{j=1}^{b} (Ȳij. − Ȳi.. − Ȳ.j. + Ȳ...)²   (= SSB×T)

where all cross-product terms equal 0.

The computational formulae are given by:

SST = (1/br) Σ_{i=1}^{t} Yi..² − (1/tbr)Y...²   with t − 1 df

SSB = (1/tr) Σ_{j=1}^{b} Y.j.² − (1/tbr)Y...²   with b − 1 df

SSB×T = SSModel − SST − SSB   with (t − 1)(b − 1) df

Table 3.4: ANOVA table for a randomised block design with interactions
Source df SS MS F
Treatments t−1 SST M ST M ST /M SE
Blocks b−1 SSB M SB M SB/M SE
Interaction (t − 1)(b − 1) SSB × T M SB × T M SB × T /M SE
Error bt(r − 1) SSE M SE
Totals tbr − 1 SSTO
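For the filtration data of Example 3.3.1 (t = b = r = 2), these sums of squares can be computed as a sketch; the comparison value quoted in the comment is F(0.05; 1, 4) = 7.71:

```python
import numpy as np

# Example 3.3.1: axis 0 = filters (treatments), axis 1 = operators (blocks),
# axis 2 = the r replicate runs
Y = np.array([[[7.6, 8.8], [22.2, 23.4]],
              [[19.5, 17.6], [30.1, 24.2]]])
t, b, r = Y.shape
CF = Y.sum() ** 2 / (t * b * r)                  # (1/tbr) Y...^2

SSTO = (Y ** 2).sum() - CF
SSModel = (Y.sum(axis=2) ** 2).sum() / r - CF    # (1/r) ΣΣ Yij.^2 - CF
SSE = SSTO - SSModel
SST = (Y.sum(axis=(1, 2)) ** 2).sum() / (b * r) - CF
SSB = (Y.sum(axis=(0, 2)) ** 2).sum() / (t * r) - CF
SSBT = SSModel - SST - SSB                       # interaction sum of squares

F_int = (SSBT / ((t - 1) * (b - 1))) / (SSE / (t * b * (r - 1)))
# compare F_int with F(0.05; 1, 4) = 7.71
```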

3.4.1 Fixed effects model


Model I is the same as model 3.5 but with the τi's, βj's and (τβ)ij's regarded as fixed real constants satisfying the constraints:

Σ_{i=1}^{t} τi = Σ_{j=1}^{b} βj = Σ_{j=1}^{b} (τβ)ij = Σ_{i=1}^{t} (τβ)ij = 0

Hypothesis Testing
Interaction effects
We always first test the hypothesis about the block by treatment interaction effects.
The hypotheses can be stated as follows:

H0 : all (τβ)ij = 0
H1 : at least one (τβ)ij ≠ 0.

The test statistic for these hypotheses is given by:

F = MSB×T / MSE          (3.8)

where M SB × T and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with (t − 1)(b − 1) degrees of
freedom on the numerator and tb(r − 1) degrees of freedom on the denominator.

Main Effects
If we fail to reject the null hypothesis about the interaction effects, we test the main effects, i.e. the treatment or block effects. Under both the fixed and random effects models, tests about the main effects are meaningful only if there are no block by treatment interaction effects. It follows that we carry out the tests for the main effects only if the test about the interaction effects concludes that interaction effects are not present.
NOTE Tests for the main effects are the same as the tests for an RBD without
replications.

Pairwise Comparisons
They are done only if the ANOVA tests conclude that the block by treatment interac-
tion effects are absent and that some treatment means are different. Hypothesis to be
tested are the same as those for pairwise comparisons of an RBD without replications.
The LSD for the tests is given by

LSD = t_{tb(r−1)}^{α/2} √(2MSE/(br))

The means µi. and µi′. are declared significantly different if

|Ȳi.. − Ȳi′..| > LSD.

Example 3.4.1 Refer to example 3.3.1

1. Assume that model I is appropriate for the data and hence test the hypothesis about Operator by Filter interaction effects.

2. Based on your conclusions in (1), is it appropriate to test the hypothesis about the effectiveness of blocking by operator? If so,

(a) Was blocking by Operator effective?

(b) Are the Filter means significantly different?

3.4.2 Random Effects Model (Model II)
Model II is also model 3.5 but with the:

• τi's random variables which are independent and normally distributed with mean zero and variance στ²;

• βj's random variables which are independent and normally distributed with mean zero and variance σβ²;

• (τβ)ij's random variables which are independent and normally distributed with mean zero and variance στβ²;

• ǫijk's, τi's, βj's and (τβ)ij's are assumed to be independent random variables

Hypothesis Testing
We also start by testing for interactions

H0 : στβ² = 0  i.e. there are no interaction effects
H1 : στβ² > 0  i.e. there are interaction effects

Main Effects
Tests are the same as the tests for Model II of a randomised block design without
replications.

3.5 Unbalanced RBD/Missing Observations


The usual ANOVA methods of the previous sections do not apply directly to unbalanced data for the reason that:

SSTO ≠ SST + SSB + SSB×T + SSE

3.5.1 One missing Observation


After replacing the missing value with an estimated one, the ANOVA of unbalanced
data with one missing observation will be the same as that for balanced data. The
formula for estimating the missing observation M is given by
M = (tT + bB − G) / ((t − 1)(b − 1))
where
t is the number of treatments;
b is the number of blocks;

T is the sum of all observations on the treatment assigned to the missing observation;
B is the sum of all measurements in the block with the missing observation and
G is the sum of all the measurements

Example 3.5.1 An experiment was conducted to determine the nutritional value of 4 diets for cows. Five dairies were involved in the study. Each cow in a sample of 4 cows from a dairy was randomly assigned to one of the four diets, so that a total of 5 cows were fed each diet. The response measured was the amount consumed per day. Unfortunately one of the cows (Diet 4) developed an infection (unrelated to the treatment) and was dropped from the study for safety reasons. The results are as follows:
Diet
Dairy 1 2 3 4
1 15.4 9.6 9.5 8.4
2 14.8 9.3 9.4
3 15.9 9.8 9.7 9.3
4 15.5 9.4 9.2 8.1
5 14.7 9.2 9.0 7.9

Estimate the missing value and then perform an analysis of variance at the 0.01 level of significance.
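A quick check of the estimate for this example (t = 4 diets as treatments, b = 5 dairies as blocks; the missing cell is the Diet 4 observation in Dairy 2):

```python
# Example 3.5.1: None marks the missing observation (Dairy 2, Diet 4)
data = [[15.4, 9.6, 9.5, 8.4],
        [14.8, 9.3, 9.4, None],
        [15.9, 9.8, 9.7, 9.3],
        [15.5, 9.4, 9.2, 8.1],
        [14.7, 9.2, 9.0, 7.9]]
t, b = 4, 5
T = sum(row[3] for row in data if row[3] is not None)  # Diet 4 total
B = sum(y for y in data[1] if y is not None)           # Dairy 2 total
G = sum(y for row in data for y in row if y is not None)
M = (t * T + b * B - G) / ((t - 1) * (b - 1))          # estimated missing value
```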

3.5.2 Two or more missing Observations


The formulae for estimating more than one missing observation are complicated and
hence we resort to some other simpler methods. We discuss one such method which
involves fitting what are called full (complete) and reduced models using what is
called the regression approach to fitting ANOVA models.

The reduced and complete models for testing treatments are as follows:

Complete (Full) model (model 1)

Yij = µ + τi + βj + ǫij

Reduced model (model 2)

Yij = µ + βj + ǫij

where βj is the j th block effect and τi is the ith treatment effect.

By fitting model 1 to the data we obtain SSEF . Similarly, a fit of model 2 yields
SSER . The difference of the two sums of squares for error SSER − SSEF , gives the
drop in the sum of squares due to treatments. Since this is an unbalanced design
the block effects do not cancel out when comparing treatment means as they do in a
balanced randomised block design. The difference in the sums of squares has been
adjusted for any effects due to blocks caused by the imbalance in the design. The
difference is called the sum of squares due to treatments adjusted for blocks i.e

SSER − SSEF = SSTadj

The sum of squares due to blocks unadjusted for any treatment differences is
obtained by subtraction:

SSB = SST O − SSTadj − SSE

Where SSTO and SSE are sums of squares from the complete model.

The analysis of variance table for testing the effect of treatments is as follows:

Table 3.5: ANOVA table for testing the effects of treatments, unbalanced randomised
block design
Source df SS MS F
Blocks b−1 SSB
Treatmentsadj t−1 SSTadj M STadj M STadj /M SE
Error by subtraction SSE M SE
Totals n−1 SST O

Note n is the number of actual observations.
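The fitting of the full and reduced models can be sketched with ordinary least squares on dummy-coded design matrices. The tiny unbalanced data set below is hypothetical, purely to illustrate that SSTadj = SSE_R − SSE_F:

```python
import numpy as np

def sse(y, X):
    """Residual sum of squares from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def dummies(levels):
    """Indicator columns for a factor, dropping the first level."""
    uniq = sorted(set(levels))
    return np.column_stack([[1.0 if l == u else 0.0 for l in levels] for u in uniq[1:]])

# hypothetical unbalanced layout: 2 treatments, 2 blocks, unequal cell counts
y = np.array([10.0, 12.0, 9.0, 14.0, 13.0])
treat = [1, 2, 1, 2, 2]
block = [1, 1, 2, 2, 2]
ones = np.ones((len(y), 1))

X_full = np.hstack([ones, dummies(treat), dummies(block)])  # Yij = µ + τi + βj
X_red = np.hstack([ones, dummies(block)])                   # Yij = µ + βj
SST_adj = sse(y, X_red) - sse(y, X_full)                    # SSE_R − SSE_F
```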

For the blocks the corresponding sum of squares for testing the effect of blocks
has the same complete model (model 1) as before i.e
Complete (Full) model (model 1)

Yij = µ + τi + βj + ǫij

and Reduced model (model 2)

Yij = µ + τi + ǫij

The sums of squares drop SSER − SSEF , is the sum of squares due to blocks
after adjusting for the effects of the treatments. By subtraction, we obtain:

SST = SST O − SSBadj − SSE

The analysis of variance table for testing the effect of blocks is as follows:

Table 3.6: ANOVA table for testing the effects of blocks, unbalanced randomised
block design
Source df SS MS F
Blocksadj b−1 SSBadj M SBadj M SBadj /M SE
Treatments t−1 SST
Error by subtraction SSE M SE
Totals n−1 SST O

Chapter 4

Balanced Incomplete Block


Designs

4.1 Introduction
A balanced incomplete RBD is an incomplete RBD in which any combination of treatments appears together in a block an equal number of times. We use it when we are forced to design an experiment in which we must sacrifice some balance to perform the experiment, and this is when the size of the blocks (k) is less than the number of treatments (t). For example, suppose we have three treatments (A, B, C) and blocks (B1, B2, B3) of size two each. Then we can construct a balanced incomplete RBD by randomly assigning each of the treatment combinations to one of the three blocks.

In general, if k < t, then we have (t choose k) = tCk treatment combinations. Note that balanced incomplete block designs can also be constructed with fewer than tCk blocks. Although these designs are not balanced in the sense of the definition in Chapter 3, they do retain some balance, i.e. even though all treatments do not appear in the same block, each combination of treatments appears together in a block the same number of times (the pairs AB, BC and AC each appear once in a block).

4.2 Analysis Of Variance


The analysis of variance for a balanced incomplete block design can be performed either by using specifically developed formulas or by using the method of fitting complete and reduced models as discussed for unbalanced designs. The developed formulas are as follows:

The Total Sum of Squares (SSTO) is given by the formula:

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{b} Yij² − (1/n)Y..²   with n − 1 df

where n is the total number of observations. SSTO can be decomposed into

SSTO = SSTadj + SSB + SSE

SSTadj is the adjusted treatment sum of squares, given by:

SSTadj = ((t − 1)/(nk(k − 1))) Σ_{i=1}^{t} (kYi. − Bi)²   with t − 1 df

where k is the number of treatments per block (the size of each block) and Bi is the sum of all the observations for the blocks that contain the ith treatment.

The Block Sum of Squares is given by:

SSB = (1/k) Σ_{j=1}^{b} Y.j² − (1/n)Y..²   with b − 1 df

The Error Sum of Squares is given by

SSE = SSTO − SSTadj − SSB   with n − t − b + 1 df

Table 4.1: ANOVA table for a balanced incomplete block design


Source df SS MS F
Blocks b−1 SSB
Treatmentsadj t−1 SSTadj M STadj M STadj /M SE
Error by subtraction SSE M SE
Totals n − 1 SSTO

4.3 Pairwise Comparisons

The treatment means are estimated using the adjusted treatment means. An estimate of the difference µi. − µi′. has standard error

√( 2kMSE/(λt) )

where λ = r(k − 1)/(t − 1) is the number of blocks in which any pair of treatments appears together and r is the number of replications of each treatment. The least significant difference for pairwise comparisons of µi. and µi′. is therefore given by

LSD = t_{n−t−b+1}^{α/2} √( 2kMSE/(λt) )
Example 4.3.1 A chemical experiment was conducted to determine whether the re-
action time was a function of the type of catalyst used. A balanced incomplete RBD
was used for the experiment. The treatments were four catalysts and the blocks were
four batches of raw material. The data is displayed in the following table.

Batch
Catalyst 1 2 3 4
1 73 74 ... 71
2 ... 75 67 72
3 73 75 68 ...
4 75 ... 72 75

Test the equality of the catalyst effects at the 0.05 level of significance.
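The adjusted treatment sum of squares for this example (t = b = 4, k = 3, n = 12) can be computed as a sketch:

```python
# Example 4.3.1: rows = catalysts (treatments), columns = batches (blocks);
# None marks the cells not run in this incomplete design
data = [[73, 74, None, 71],
        [None, 75, 67, 72],
        [73, 75, 68, None],
        [75, None, 72, 75]]
t, b, k = 4, 4, 3
n = sum(y is not None for row in data for y in row)          # 12 observations

Yi = [sum(y for y in row if y is not None) for row in data]  # catalyst totals
Yj = [sum(row[j] for row in data if row[j] is not None) for j in range(b)]
# Bi: total of all the blocks that contain catalyst i
Bi = [sum(Yj[j] for j in range(b) if data[i][j] is not None) for i in range(t)]

SST_adj = (t - 1) / (n * k * (k - 1)) * sum((k * Yi[i] - Bi[i]) ** 2
                                            for i in range(t))
```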

Chapter 5

Latin Square and Crossover


Designs

5.1 Introduction: Latin Square Design


In general, a Latin Square Design can be used to compare t treatment means in the presence of two extraneous sources of variability, which we block off into t rows and t columns. The t treatments are then randomly assigned to the rows and columns so that each treatment appears in every row and every column of the design. For example, if the experimental units are animals and both the sex and the age of the animal affect the response of interest, then age and sex can be used as blocking factors in the experiment.

Definition 5.1.1 A t × t Latin Square Design contains t rows and t columns. The t treatments are randomly assigned to experimental units within the rows and columns so that each treatment appears in every row and in every column.

A general latin square design has the following features:

1. There are t treatments

2. There are two blocking factors each with t blocks or levels.

3. The t treatments are randomly assigned to the experimental units within the rows and columns so that each treatment appears once in every row and once in every column.

4. If the letters in the first row and first column are arranged alphabetically (in a
regular ascending order), then the latin square is called a standard latin square.

If t = 2 or t = 3 there is only one standard square. For t = 4 there are 4 standard squares, for t = 5 there are 56, and for t = 6 there are 9408.

For a given number of treatments, e.g. for t = 3, there are 12 different latin square designs. The question is: if there exist many latin square designs, which one should we use for the design?

Example 5.1.1 Randomise the following 4 × 4 standard latin square:

Table 5.1: Standard 4 × 4 latin square design


A B C D
B A D C
C D A B
D C B A

NOTE: The process of choosing a latin square design at random is called ran-
domisation of the latin square design. In principle, to choose a random latin
square we proceed as follows

1. Choose one of the standard squares at random.

2. Randomly permute the columns

3. Randomly permute the rows

4. Randomly permute the symbols in the body of the diagram.
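Steps 2–4 can be sketched as follows, starting from the standard square of Example 5.1.1 (step 1); permuting the rows, columns and symbols of a Latin square always yields another Latin square:

```python
import random

def randomise_latin_square(square):
    """Randomly permute the rows, columns and symbols of a standard
    square (steps 2-4 above); the result is still a Latin square."""
    t = len(square)
    rows = random.sample(range(t), t)            # step 3: permute rows
    cols = random.sample(range(t), t)            # step 2: permute columns
    symbols = list(square[0])                    # first row holds all t symbols
    relabel = dict(zip(symbols, random.sample(symbols, t)))  # step 4
    return [[relabel[square[i][j]] for j in cols] for i in rows]

standard = [["A", "B", "C", "D"],
            ["B", "A", "D", "C"],
            ["C", "D", "A", "B"],
            ["D", "C", "B", "A"]]
design = randomise_latin_square(standard)
```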

5.1.1 Advantages of a Latin Square Design


1. The design is particularly appropriate for comparing t treatment means in the presence of two sources of extraneous variation, each measured at t levels. This results in substantial reductions in the experimental error.

2. In experiments where the t treatments are applied to each experimental unit in succession, the design is used to account for 'period' effects, i.e. the effects of the order of running the treatments.

3. The analysis is still quite simple.

5.1.2 Disadvantages of a Latin Square Design


1. While a latin square can be constructed for any value of t, it is best suited for
comparing t treatments when 5 ≤ t ≤ 10

2. The two blocking factors cannot have different numbers of levels.

3. The latin square design can only be used when there are no interactions
between either the blocking factors and the treatments or between the blocking
factors.

4. Randomising a latin square experiment is more complicated than randomising a CRD or RBD experiment.

5.2 Analysis of a Latin Square Design


Consider a 4 × 4 latin square design displayed in table 5.2

Table 5.2: A 4 × 4 Latin Square design


Column
1 2 3 4 Total Mean
1 Y111 Y122 Y134 Y143 Y1.. Ȳ1..
Row 2 Y212 Y223 Y231 Y244 Y2.. Ȳ2..
3 Y313 Y324 Y332 Y341 Y3.. Ȳ3..
4 Y414 Y421 Y433 Y442 Y4.. Ȳ4..
Total Y.1. Y.2. Y.3. Y.4. Y...
Mean Ȳ.1. Ȳ.2. Ȳ.3. Ȳ.4. Ȳ...
Treatments A B C D
Total Y..1 Y..2 Y..3 Y..4
Mean Ȳ..1 Ȳ..2 Ȳ..3 Ȳ..4

The symbols in the table have the following meanings:

• Yijk is the response to the kth treatment in the ith row and jth column;

• Yi.. = Σ_{j=1}^{t} Yijk , Y.j. = Σ_{i=1}^{t} Yijk and Y... = Σ_{i=1}^{t} Σ_{j=1}^{t} Yijk are the ith row total, jth column total and the overall total respectively, and

• Ȳi.. = (1/t) Yi.. , Ȳ.j. = (1/t) Y.j. and Ȳ... = (1/t²) Y... are the ith row mean, jth column mean and overall mean respectively, and

• i = 1, 2, ..., t; j = 1, 2, ..., t; k = 1, 2, ..., t; n = t² = total number of observations.

For the treatment sum of squares we have:

• Y..k = Σ Yijk , summed over the t cells that receive treatment k, and

• Ȳ..k = (1/t) Y..k , and

• SST = t Σ_{k=1}^{t} (Ȳ..k − Ȳ...)².

5.2.1 ANOVA
The fixed effects model (model I) and the mixed effects model (model II) for a latin
square experiment with t treatments have the form:

Yijk = µ + ρi + βj + τk + ǫijk

Where

• µ is the overall population mean;

• τk is the k th treatment effect;

• ρi is the ith row effect;

• βj is the effect of the j th column;

• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 .

Model I
Model I is the model above, but with the ρi's, βj's and τk's regarded as fixed constants satisfying the constraints:

Σ_{i=1}^{t} ρi = Σ_{j=1}^{t} βj = Σ_{k=1}^{t} τk = 0

Hypothesis Testing
The objective is to test the null hypothesis of no difference among the treatment means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τk ≠ 0 (at least one of the treatment means differs from the rest)

If we reject H0 we wish to determine which means are different.

Model II
Model II is the model above, but with some effects regarded as fixed real constants and some regarded as random variables. If we consider a case whereby the treatments are fixed and both the rows and columns are random, the assumptions for the model are that:

• the ρi's are random variables which are independent and normally distributed with mean 0 and variance σ²ρ;

• the βj's are random variables which are independent and normally distributed with mean 0 and variance σ²β; and

• the τk's are fixed constants satisfying Σ_{k=1}^{t} τk = 0.

Hypothesis Testing
The objective is to test the null hypothesis of no difference among the treatment means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τk ≠ 0 (at least one of the treatment means differs from the rest)

If we reject H0 we wish to determine which means are different.

Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares and is given by:

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{t} (Yijk − Ȳ...)² = Σ_{i=1}^{t} Σ_{j=1}^{t} Y²ijk − (1/t²) Y...²   with t² − 1 df

It is possible to partition the total sum of squares into four separate sources of variability, i.e.

1. the variability that is due to the treatments (treatment sum of squares (SST)),

2. the variability that is due to the row effects (row sum of squares (SSR)),

3. the variability that is due to the column effects (column sum of squares (SSC)), and

4. the variability in the responses that is due to the random errors (error sum of squares (SSE)).

The defining sum of squares formulae above are not convenient to use in calculations. The computational formulae are given by:

SST = (1/t) Σ_{k=1}^{t} Y²..k − (1/t²) Y...²   with t − 1 df

SSR = (1/t) Σ_{i=1}^{t} Y²i.. − (1/t²) Y...²   with t − 1 df

SSC = (1/t) Σ_{j=1}^{t} Y².j. − (1/t²) Y...²   with t − 1 df

SSE = SSTO − SST − SSR − SSC   with (t − 2)(t − 1) df

The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom, and they are used for testing the hypotheses about the treatments.

Hypothesis testing
The test statistic for treatment effects is given by:

F = MST / MSE   (5.1)

where MST and MSE are mean squares computed from the appropriate sums of squares in the ANOVA table. F has an F distribution with t − 1 degrees of freedom on the numerator and (t − 1)(t − 2) degrees of freedom on the denominator.

The test statistics for checking the effectiveness of blocking by the row blocking factor and the column blocking factor are given by:

F = MSR / MSE   and   F = MSC / MSE

respectively. Both F ratios have an F distribution with t − 1 degrees of freedom on the numerator and (t − 1)(t − 2) degrees of freedom on the denominator.

Table 5.3: ANOVA table for a Latin Square Design


Source df SS MS F
Treatment t−1 SST M ST M ST /M SE
Rows t−1 SSR M SR M SR/M SE
Columns t−1 SSC M SC M SC/M SE
Error (t − 2)(t − 1) SSE M SE
Totals t2 − 1 SST O

Pairwise Comparisons of the treatments
They are done only if the ANOVA test under the fixed effects model concludes that some treatment means are different. In this case the hypotheses to be tested are:

H0 : τk = τk′
H1 : τk ≠ τk′   ∀ k ≠ k′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, (t−1)(t−2)} √(2MSE/t)

Example 5.2.1 A traffic engineer wished to compare the total unused green time for
3 different signal-control sequencing devices at 3 different intersections of a city. It
was assumed that the intersections were far enough apart that they, in effect, acted
independently, regardless of the signal sequencing device employed. In addition to
comparing the devices at the 3 different intersections, the engineer wished to compare
the devices at different time periods during the day. The data collected are tabulated
in the following table. Analyse the data and draw conclusions.

Table 5.4: A 3 × 3 latin square design for the traffic delay experiment
Time period
1 2 3
1 23 (II) 31 (III) 51 (I)
Intersection 2 71 (I) 42 (II) 35 (III)
3 34 (III) 67 (I) 29 (II)
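As a check on the hand analysis of Example 5.2.1, the computational formulae of this section can be applied directly to the data in Table 5.4. The sketch below uses plain Python (no statistical package is assumed; the helper names are ours):

```python
# Traffic data from Table 5.4: (row, column, treatment, response).
data = [(1, 1, "II", 23), (1, 2, "III", 31), (1, 3, "I", 51),
        (2, 1, "I", 71), (2, 2, "II", 42), (2, 3, "III", 35),
        (3, 1, "III", 34), (3, 2, "I", 67), (3, 3, "II", 29)]
t = 3

def total(key):
    """Marginal totals of the responses, grouped by row (0), column (1) or treatment (2)."""
    out = {}
    for obs in data:
        out[obs[key]] = out.get(obs[key], 0) + obs[3]
    return out

G = sum(y for *_, y in data)                 # grand total Y...
CF = G**2 / t**2                             # correction factor Y...^2 / t^2
SSTO = sum(y**2 for *_, y in data) - CF
SSR = sum(v**2 for v in total(0).values()) / t - CF
SSC = sum(v**2 for v in total(1).values()) / t - CF
SST = sum(v**2 for v in total(2).values()) / t - CF
SSE = SSTO - SST - SSR - SSC
MST, MSE = SST / (t - 1), SSE / ((t - 1) * (t - 2))
F_treatment = MST / MSE                      # compare with F(t-1, (t-1)(t-2))
```

For these data F = MST/MSE = 303.25 on (2, 2) degrees of freedom, far above the critical value F(0.05; 2, 2) = 19.0, so the sequencing devices differ significantly in total unused green time.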

5.3 A Latin Square Design with Missing Data


The techniques discussed in Section 3.5 of Chapter 3 also apply to the latin square design. The formula for estimating a single missing value in a latin square design is

M = [t(T + R + C) − 2G] / [(t − 1)(t − 2)]

where T, R and C represent the treatment, row and column totals, respectively, corresponding to the missing observation, G is the grand total of the observed values, and t is the number of treatments in the latin square design. After replacing the missing value the analysis can proceed as for a balanced latin square design, with the degrees of freedom for SSTO reduced to t² − 2. If there are significant differences due to treatments we need to make pairwise comparisons.

The least significant difference between the treatment with the missing value and any other treatment is

LSD = t_{α/2, (t−1)(t−2)} √( MSE (2/t + 1/[(t − 1)(t − 2)]) )

For any other pair of treatments, the LSD is as before:

LSD = t_{α/2, (t−1)(t−2)} √(2MSE/t)
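The estimation formula is easy to apply by hand or in code. A minimal sketch (the function name is ours); as a hypothetical illustration, suppose the observation 42 in Example 5.2.1 (treatment II, row 2, column 2) were missing, so that T = 52, R = 106, C = 98 and G = 341:

```python
def estimate_missing(t, T, R, C, G):
    """Estimate a single missing value in a t x t latin square:
    M = [t(T + R + C) - 2G] / [(t - 1)(t - 2)], where T, R and C are
    the treatment, row and column totals of the observed values for
    the missing cell and G is the grand total of the observed values."""
    return (t * (T + R + C) - 2 * G) / ((t - 1) * (t - 2))

# Example 5.2.1 with the observation 42 treated as missing:
M = estimate_missing(t=3, T=52, R=106, C=98, G=341)
```

Here M = (3·256 − 682)/2 = 43, close to the deleted value 42; the ANOVA then proceeds with SSTO on t² − 2 = 7 degrees of freedom.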

Example 5.3.1 Refer to example 5.2.1 with one missing value.

5.4 Replication of a Latin Square


This is feasible when

• two or more experimental units can be obtained for each cell defined by levels
of the row blocking factor and the column blocking factor or,

• the latin square can be repeated two or more times using the same experimental
units.

Suppose that the number of replications within each cell is r. Then the model for the data is given by:

Yijkm = µ + ρi + βj + τk + ǫijkm

Where

• Yijkm is the mth response to the k th treatment in the ith row and j th column;

• µ is the overall population mean;

• τk is the k th treatment effect;

• ρi is the ith row effect;

• βj is the effect of the j th column;

• the ǫijkm's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².

The totals can be redefined as follows:


• Yijk. = Σ_{m=1}^{r} Yijkm , Yi... = Σ_{j=1}^{t} Σ_{m=1}^{r} Yijkm , Y.j.. = Σ_{i=1}^{t} Σ_{m=1}^{r} Yijkm and Y.... = Σ_{i=1}^{t} Σ_{j=1}^{t} Σ_{m=1}^{r} Yijkm are the (ij)th cell total, ith row total, jth column total and the overall total respectively, and

• Ȳijk. = (1/r) Yijk. , Ȳi... = (1/tr) Yi... , Ȳ.j.. = (1/tr) Y.j.. and Ȳ.... = (1/t²r) Y.... are the (ij)th cell mean, ith row mean, jth column mean and overall mean respectively, and

• i = 1, 2, ..., t; j = 1, 2, ..., t; k = 1, 2, ..., t; m = 1, 2, ..., r; n = t²r = total number of observations.

The total variability in the observations (the Yijkm's) is measured using the total sum of squares and is given by:

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{t} Σ_{m=1}^{r} (Yijkm − Ȳ....)² = Σ_{i=1}^{t} Σ_{j=1}^{t} Σ_{m=1}^{r} Y²ijkm − (1/t²r) Y....²   with t²r − 1 df   (5.2)

It is possible to partition the total sum of squares into four separate sources of variability. The computational formulae for the sums of squares are given by:

SST = (1/tr) Σ_{k=1}^{t} Y²..k. − (1/t²r) Y....²   with t − 1 df

SSR = (1/tr) Σ_{i=1}^{t} Y²i... − (1/t²r) Y....²   with t − 1 df

SSC = (1/tr) Σ_{j=1}^{t} Y².j.. − (1/t²r) Y....²   with t − 1 df

SSE = SSTO − SST − SSR − SSC   with t²r − 3t + 2 df

Note: by replicating the latin square design we have increased the degrees of freedom for SSE by (r − 1)t².

Example 5.4.1 A team of educators were interested in determining the relative effectiveness of instruction methods A (video instruction), B (traditional classroom) and C (programmed study) on student learning. They felt that the IQ and the age of the student could also influence the scores. To take these two factors into account, they used a 3 × 3 latin square design with two students in each IQ–Age cell of the latin square. The scores for the instruction methods are in the following table. Analyse the data and draw conclusions.

Table 5.5: A 3 × 3 latin square design for the scores
IQ
High Average Low
20 40,50 (C) 40,40 (B) 50,40 (A)
Age 30 70,60 (B) 30,20 (A) 55,50 (C)
40 20,30 (A) 70,80 (C) 25,25 (B)
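For Example 5.4.1 the replicated formulae above can be checked numerically. The sketch below (plain Python; variable names are ours) computes the sums of squares and the treatment F ratio for the scores in Table 5.5:

```python
# Scores from Table 5.5: (age_row, iq_col, method, [r replicates]).
cells = [(1, 1, "C", [40, 50]), (1, 2, "B", [40, 40]), (1, 3, "A", [50, 40]),
         (2, 1, "B", [70, 60]), (2, 2, "A", [30, 20]), (2, 3, "C", [55, 50]),
         (3, 1, "A", [20, 30]), (3, 2, "C", [70, 80]), (3, 3, "B", [25, 25])]
t, r = 3, 2
n = t * t * r

def marg(key):
    """Marginal totals grouped by row (0), column (1) or treatment (2)."""
    out = {}
    for cell in cells:
        out[cell[key]] = out.get(cell[key], 0) + sum(cell[3])
    return out

G = sum(sum(ys) for *_, ys in cells)         # grand total Y....
CF = G**2 / n
SSTO = sum(y**2 for *_, ys in cells for y in ys) - CF
SSR = sum(v**2 for v in marg(0).values()) / (t * r) - CF
SSC = sum(v**2 for v in marg(1).values()) / (t * r) - CF
SST = sum(v**2 for v in marg(2).values()) / (t * r) - CF
SSE = SSTO - SST - SSR - SSC                 # t^2 r - 3t + 2 = 11 df
F_method = (SST / (t - 1)) / (SSE / (t**2 * r - 3 * t + 2))
```

Here F ≈ 3.41 on (2, 11) degrees of freedom, which is below F(0.05; 2, 11) ≈ 3.98, so at the 5% level these data do not show a significant difference between the instruction methods.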

5.5 Crossover Designs


In this experiment, each experimental unit receives a sequence of all the t treatments
in t successive time periods, the treatment sequences being different for different ex-
perimental units. At the end of each time period, the response to the treatment is
measured on the experimental unit, and a period of time is allowed to pass in order
to eliminate the effect of the current treatment before the next treatment is admin-
istered to the unit.

A crossover design can be a group of two or more t × t latin squares. With rt experimental units, the crossover design is a group of r t × t latin square designs with t common time periods and t different experimental units forming the columns of each latin square.
The crossover design model has the form:

Yijk = µ + ρi + βj + τk + ǫijk

Where
• Yijk is the response to the k th treatment administered to the experimental unit
j during period i;

• µ is the overall population mean;

• τk is the k th treatment effect;

• ρi is the period i effect;

• βj is the experimental unit j effect;

• the ǫijk's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².
The totals and the means are as follows:

• Yi.. = Σ_{j=1}^{rt} Yijk , Y.j. = Σ_{i=1}^{t} Yijk and Y... = Σ_{i=1}^{t} Σ_{j=1}^{rt} Yijk are the ith period total, jth experimental unit total and the overall total respectively, and

• Ȳi.. = (1/rt) Yi.. , Ȳ.j. = (1/t) Y.j. and Ȳ... = (1/t²r) Y... are the ith period mean, jth experimental unit mean and overall mean respectively, and

• i = 1, 2, ..., t; j = 1, 2, ..., rt; k = 1, 2, ..., t; n = rt² = total number of observations.

Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares as usual and is given by:

SSTO = Σ_{i=1}^{t} Σ_{j=1}^{rt} (Yijk − Ȳ...)² = Σ_{i=1}^{t} Σ_{j=1}^{rt} Y²ijk − (1/t²r) Y...²   with t²r − 1 df   (5.3)

The computational formulae for the treatment sum of squares, the period sum of squares, the experimental unit sum of squares and the error sum of squares are given by:

SST = (1/tr) Σ_{k=1}^{t} Y²..k − (1/t²r) Y...²   with t − 1 df

SSP = (1/tr) Σ_{i=1}^{t} Y²i.. − (1/t²r) Y...²   with t − 1 df

SSU = (1/t) Σ_{j=1}^{rt} Y².j. − (1/t²r) Y...²   with tr − 1 df

SSE = SSTO − SST − SSP − SSU   with (tr − 2)(t − 1) df

The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom, and they are used for testing the hypotheses about the treatments.

Hypothesis testing
The objective is to test the null hypothesis of no difference among the treatment means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τk ≠ 0 (at least one of the treatment means differs from the rest)

The test statistic for treatment effects is given by:

F = MST / MSE   (5.4)

where MST and MSE are mean squares computed from the appropriate sums of
squares in the ANOVA table. F has an F distribution with t − 1 degrees of freedom
on the numerator and (t − 1)(rt − 2) degrees of freedom on the denominator.

Table 5.6: ANOVA table for a Crossover Design
Source df SS MS F
Treatment t−1 SST M ST M ST /M SE
Period t−1 SSP M SP M SP/M SE
Unit rt − 1 SSU M SU M SU/M SE
Error (rt − 2)(t − 1) SSE M SE
Totals t2 r − 1 SST O

Example 5.5.1 A crossover design was used to study the effects of three diets on the
daily weight gain of 2 year old goats. A sufficiently long period of time was allowed to
pass before a goat was fed its new diet in order to eliminate the effect of its previous
diets on the response to the new diet. The following data was recorded (in grams per
day). Analyse the data and draw conclusions.

Goat
1 2 3 4 5 6
1 40.5 (C) 30.0 (B) 30.5 (A) 30.5 (C) 30.0 (B) 40.0 (A)
Period 2 45.0 (B) 20.5 (A) 40 (C) 60.5 (B) 10.0 (A) 45.5 (C)
3 20.0 (A) 65.5 (C) 20.0 (B) 20.5 (A) 60.5 (C) 30.0 (B)
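The crossover formulae can be checked on the goat data of Example 5.5.1 (t = 3 diets, rt = 6 goats, so r = 2 squares). A plain-Python sketch (variable names are ours):

```python
# Daily weight gains (g/day): (period, goat, diet, response).
obs = [(1, 1, "C", 40.5), (1, 2, "B", 30.0), (1, 3, "A", 30.5),
       (1, 4, "C", 30.5), (1, 5, "B", 30.0), (1, 6, "A", 40.0),
       (2, 1, "B", 45.0), (2, 2, "A", 20.5), (2, 3, "C", 40.0),
       (2, 4, "B", 60.5), (2, 5, "A", 10.0), (2, 6, "C", 45.5),
       (3, 1, "A", 20.0), (3, 2, "C", 65.5), (3, 3, "B", 20.0),
       (3, 4, "A", 20.5), (3, 5, "C", 60.5), (3, 6, "B", 30.0)]
t, r = 3, 2                      # 3 diets, 2 latin squares (6 goats)
n = r * t * t

def marg(key):
    """Marginal totals grouped by period (0), goat (1) or diet (2)."""
    out = {}
    for o in obs:
        out[o[key]] = out.get(o[key], 0.0) + o[3]
    return out

G = sum(y for *_, y in obs)
CF = G**2 / n
SSTO = sum(y**2 for *_, y in obs) - CF
SST = sum(v**2 for v in marg(2).values()) / (t * r) - CF   # diets
SSP = sum(v**2 for v in marg(0).values()) / (t * r) - CF   # periods
SSU = sum(v**2 for v in marg(1).values()) / t - CF         # goats
SSE = SSTO - SST - SSP - SSU                               # (tr-2)(t-1) = 8 df
F_diet = (SST / (t - 1)) / (SSE / ((t * r - 2) * (t - 1)))
```

Here F ≈ 2.92 on (2, 8) degrees of freedom, below F(0.05; 2, 8) = 4.46, so at the 5% level the diets do not differ significantly in daily weight gain.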

Chapter 6

Factorial Experiments

6.1 Introduction
Consider a situation in which it is of interest to study the effect of two factors A
and B on some response. For example, in a chemical experiment we would like to
simultaneously vary the reaction pressure and reaction time and study the effect of
each on the yield. The term factor is used in a general sense to denote any feature of
the experiment such as temperature, time or pressure that may be varied from trial
to trial. The levels of a factor are the actual values used in the experiment.
A factorial design can either be a completely randomised design (CRD) or a block
design (BD). We use CRD factorial design if we have homogeneous experimental
units and the experimental conditions are uniform. If we have heterogeneous exper-
imental units and/or if there are external variables that may influence the response
then a randomised block design or a latin square is appropriate for the factorial ex-
periment.

To illustrate a simple factorial design, let us suppose that we wish to study the effects of the combinations of the levels of factors A and B, and that each factor has two levels, i.e. for A we have A1 and A2 and for B we have B1 and B2. The combinations of the two factors that are to be investigated are:

A1 B1 , A1 B2 , A2 B1 , A2 B2 .

These four factor level combinations are the treatments.

Definition 6.1.1 A factorial experiment is an experiment in which the response y is observed at all factor-level combinations of the independent variables.

In this type of experiment it is important not only to determine if the two factors
have an influence on the response but also if there is a significant interaction between
the two factors.

6.1.1 Interaction in Factorial Experiments
If we consider the illustration above (which is an example of a two factor experiment)
the effects of A and B, often called the main effects, take on a different meaning
in the presence of interaction. In general, there could be experimental situations in
which factor A has a positive effect on the response at one level of factor B, while at
a different level of factor B the effect of A is negative.

Example 6.1.1 Consider, for example, the following hypothetical data taken on two
factors each at three levels. Assume that the values given are averages for each treat-
ment. Check for the presence of the A by B interactions.

B
A B1 B2 B3
A1 80.30 80.65 80.30
A2 80.20 80.55 80.00
A3 80.60 80.85 80.25
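One simple numerical check for interaction is to compute the interaction residuals Ȳij − Ȳi. − Ȳ.j + Ȳ.. for each cell; if the two factors did not interact, these would all be (near) zero. A sketch for the hypothetical means above:

```python
# Treatment means from Example 6.1.1 (rows = levels of A, cols = levels of B).
means = [[80.30, 80.65, 80.30],
         [80.20, 80.55, 80.00],
         [80.60, 80.85, 80.25]]
a, b = len(means), len(means[0])

row_mean = [sum(row) / b for row in means]
col_mean = [sum(means[i][j] for i in range(a)) / a for j in range(b)]
grand = sum(map(sum, means)) / (a * b)

# Interaction residuals: (near) zero everywhere means no A x B interaction.
resid = [[round(means[i][j] - row_mean[i] - col_mean[j] + grand, 4)
          for j in range(b)] for i in range(a)]
```

Here, for instance, moving from B2 to B3 changes the mean by −0.35 at A1 but by −0.60 at A3, so the residuals are non-zero and an A × B interaction appears to be present; whether it is statistically significant can only be judged with replicated data.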

NOTE:

• The analysis of the data usually begins by checking the presence of interaction.
Then if interaction is not significant we proceed to make tests on the effects
of the main factors. If the data indicate the presence of interaction, we might
need to observe the influence of each factor at fixed levels of the other.

• Interaction and experimental error are separated in factorial experiments only if multiple observations are taken at the various treatment combinations.

6.2 Analysis of Factorial Experiments


We shall consider the analysis of a two-factor factorial design run in a completely randomised design and also in a randomised block design. In both cases we assume that the fixed effects models are appropriate for the factorial experiments.

6.2.1 Two factor factorial Design in a CRD


We shall consider two factors A and B with a and b levels respectively and a case of r replications for each treatment combination. Each treatment combination defines a cell in our array. Table 6.1 displays the layout of the factorial design when replicated r times.

Table 6.1: Two-factor factorial experiment with r replications in a CRD
Factor B
Factor A B1 B2 ... Bb Total Mean
A1 Y111 , Y112 , ..., Y11r Y121 , Y122 , ..., Y12r ... Y1b1 , Y1b2 , ..., Y1br Y1.. Ȳ1..
A2 Y211 , Y212 , ..., Y21r Y221 , Y222 , ..., Y22r ... Y2b1 , Y2b2 , ..., Y2br Y2.. Ȳ2..
. . . ... . . .
. . . ... . . .
. . . ... . . .
Aa Ya11 , Ya12 , ..., Ya1r Ya21 , Ya22 , ..., Ya2r ... Yab1 , Yab2 , ..., Yabr Ya.. Ȳa..
Total Y.1. Y.2. ... Y.b. Y...
Mean Ȳ.1. Ȳ.2. ... Ȳ.b. Ȳ...

The ANOVA model for a two-factor factorial design in a CRD has the form

Yijk = µ + τi + βj + (τ β)ij + ǫijk

where

• µ is the overall population mean;

• τi is the effect of the ith level of factor A;

• βj is the effect of the j th level of factor B;

• (τ β)ij is the interaction effect of the ith level of factor A and the j th level of
factor B;

• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 .

The fixed effects model assumes that the τi's, βj's and (τβ)ij's are fixed real constants satisfying the constraints:

Σ_{i=1}^{a} τi = Σ_{j=1}^{b} βj = Σ_{i=1}^{a} (τβ)ij = Σ_{j=1}^{b} (τβ)ij = 0

The estimates of τi, βj and (τβ)ij are given by

• τ̂i = Ȳi.. − Ȳ...

• β̂j = Ȳ.j. − Ȳ... and

• (τβ)̂ij = Ȳij. − Ȳi.. − Ȳ.j. + Ȳ...

The three hypotheses to be tested are as follows:

1. The first hypothesis that we test is about the treatment effects. The hypotheses are as follows:

H0 : all the µij's are equal
H1 : at least one pair of the µij's are not equal

2. If we reject H0 in favour of H1 then we test for the AB interaction effects, and the hypotheses are:

H0 : (τβ)11 = (τβ)12 = ... = (τβ)ab = 0
H1 : at least one of the (τβ)ij's ≠ 0.

3. If there are no interaction effects we proceed to test about the main effects i.e
A factor level effects or the B factor level effects.

(a) The hypotheses about the A factor level effects are

H0 : τ1 = τ2 = ... = τa = 0
H1 : at least one of the τi's ≠ 0.

If we reject H0 we compare the A factor level means.

(b) The hypotheses about the B factor level effects are

H0 : β1 = β2 = ... = βb = 0
H1 : at least one of the βj's ≠ 0.

If we reject H0 we compare the B factor level means.

Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares and is given by:

SSTO = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳ...)² = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} Y²ijk − (1/abr) Y...²   with abr − 1 df

The total sum of squares can be decomposed into two separate sources of variability
i.e.

1. the variability that is due to the treatments (factor level combinations) (treatment sum of squares (SST)) and

2. the variability in the responses that is due to the random errors (error sum of
squares (SSE))

SST is obtained by performing a one-way ANOVA of the data with the factor level combinations AiBj as the treatments. The formula for SST is given by:

SST = r Σ_{i=1}^{a} Σ_{j=1}^{b} (Ȳij. − Ȳ...)² = (1/r) Σ_{i=1}^{a} Σ_{j=1}^{b} Y²ij. − (1/abr) Y...²   with ab − 1 df

The error sum of squares is obtained as follows:

SSE = SSTO − SST   with ab(r − 1) df

The variability in the observations that is due to the treatment effects is attributed to the main effects and the interaction of the main effects, i.e.

• A factor level effects;

• B factor level effects and

• A × B interaction effects.

It follows that we can partition the treatment sum of squares into

• the variation due to the factor A effects (SSA)

• the variation due to the factor B effects (SSB)

• the variation due to the factor A × B interaction effects (SSAB)

The computational formulae for SSA, SSB and SSAB are as follows:

SSA = (1/br) Σ_{i=1}^{a} Y²i.. − (1/abr) Y...²   with a − 1 df

SSB = (1/ar) Σ_{j=1}^{b} Y².j. − (1/abr) Y...²   with b − 1 df

SSAB = SST − SSA − SSB   with (a − 1)(b − 1) df

The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom, and they are used for testing the hypotheses about the treatments.

Hypothesis testing
1. We first test for treatment effects and the test statistic is given by:

F = MST / MSE   (6.1)

where MST and MSE are mean squares computed from the appropriate sums of squares in the ANOVA table. F has an F distribution with ab − 1 degrees of freedom on the numerator and ab(r − 1) degrees of freedom on the denominator.

2. Next we test for the significance of the A × B interaction effects. This is only possible if we reject H0 in number 1 above. The test statistic is given by:

F = MSAB / MSE

which also has an F distribution with (a − 1)(b − 1) degrees of freedom on the numerator and ab(r − 1) degrees of freedom on the denominator.

3. We test the hypotheses about the A factor level effects or the B factor level effects only if there are no A × B interaction effects. The test statistics for the A and B factor level effects are given by:

F = MSA / MSE   and   F = MSB / MSE

which have an F distribution with (a − 1) degrees of freedom on the numerator and ab(r − 1) degrees of freedom on the denominator, and with (b − 1) degrees of freedom on the numerator and ab(r − 1) degrees of freedom on the denominator, respectively.

Table 6.2: ANOVA table for a Two-factor experiment in a CRD


Source df SS MS F
Treatment ab − 1 SST M ST M ST /M SE
Main Effects
A a−1 SSA M SA M SA/M SE
B b−1 SSB M SB M SB/M SE
Interactions
AB (a − 1)(b − 1) SSAB M SAB M SAB/M SE
Error ab(r − 1) SSE M SE
Total abr − 1 SST O

Pairwise Comparisons of the treatments
They are done only if the ANOVA test under the fixed effects model concludes that some treatment means are different. Why? When the treatment means are not all equal and the A × B interaction effects are present, pairwise comparisons of the A or B factor level means do not make sense. In the presence of A × B interactions we must compare the A factor level means at each level of B, or vice versa. Hence the hypotheses to be tested for the A factor level means at the jth level of factor B in this case are:

H0 : τij = τi′j
H1 : τij ≠ τi′j   ∀ i ≠ i′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, ab(r−1)} √(2MSE/r)

Thus the τij and τi′ j are significantly different if

|Ȳij. − Ȳi′ j. | > LSD

Similarly, for the B factor level means at the ith level of factor A the hypotheses are:

H0 : τij = τij′
H1 : τij ≠ τij′   ∀ j ≠ j′

The least significant difference for comparing the means is as above and the means
are significantly different if:
|Ȳij. − Ȳij ′ . | > LSD
If we conclude that A × B interactions are absent and some A and/or B factor level means are not equal then we can do pairwise comparisons of the A factor level means and/or the B factor level means. The hypotheses to be tested for factor A are:

H0 : τi. = τi′.
H1 : τi. ≠ τi′.   ∀ i ≠ i′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, ab(r−1)} √(2MSE/br)

Thus τi. and τi′. are significantly different if

|Ȳi.. − Ȳi′..| > LSD

Similarly, for the B factor level means the hypotheses are:

H0 : τ.j = τ.j′
H1 : τ.j ≠ τ.j′   ∀ j ≠ j′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, ab(r−1)} √(2MSE/ar)

The least significant difference for comparing the means is as above and the means
are significantly different if:
|Ȳ.j. − Ȳ.j ′ . | > LSD

Example 6.2.1 In a chemical process the most important variables that are thought to affect the yield are pressure and temperature. Three levels of each factor were selected and a factorial experiment in a CRD with two replications was performed. The results are shown in the following table. Analyse the data and draw conclusions.

Pressure
Temperature 215 230 235
30 80.4 80.7 80.2
80.2 80.6 80.4
40 80.1 80.5 79.9
80.3 80.6 80.1
50 80.5 80.8 80.4
80.7 80.9 80.1
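The analysis of Example 6.2.1 can be verified with the computational formulae of this section. A plain-Python sketch (a = 3 temperatures, b = 3 pressures, r = 2; variable names are ours):

```python
# Yield data from Example 6.2.1: (temperature, pressure) -> r = 2 replicates.
cells = {(30, 215): [80.4, 80.2], (30, 230): [80.7, 80.6], (30, 235): [80.2, 80.4],
         (40, 215): [80.1, 80.3], (40, 230): [80.5, 80.6], (40, 235): [79.9, 80.1],
         (50, 215): [80.5, 80.7], (50, 230): [80.8, 80.9], (50, 235): [80.4, 80.1]}
a = b = 3
r = 2
n = a * b * r

G = sum(sum(ys) for ys in cells.values())
CF = G**2 / n
SSTO = sum(y**2 for ys in cells.values() for y in ys) - CF
SST = sum(sum(ys)**2 for ys in cells.values()) / r - CF    # treatments, ab - 1 df

A_tot, B_tot = {}, {}
for (ai, bj), ys in cells.items():
    A_tot[ai] = A_tot.get(ai, 0.0) + sum(ys)               # temperature totals
    B_tot[bj] = B_tot.get(bj, 0.0) + sum(ys)               # pressure totals
SSA = sum(v**2 for v in A_tot.values()) / (b * r) - CF
SSB = sum(v**2 for v in B_tot.values()) / (a * r) - CF
SSAB = SST - SSA - SSB
SSE = SSTO - SST
MSE = SSE / (a * b * (r - 1))
F_A = (SSA / (a - 1)) / MSE
F_B = (SSB / (b - 1)) / MSE
F_AB = (SSAB / ((a - 1) * (b - 1))) / MSE
```

For these data F_AB ≈ 0.97 < F(0.05; 4, 9) = 3.63, so there is no significant temperature × pressure interaction, while F_A ≈ 8.47 and F_B ≈ 21.59 both exceed F(0.05; 2, 9) = 4.26, so temperature and pressure each affect the yield.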

6.2.2 Two factor factorial Design in a RBD
In a two-factor experiment run in a randomised block design, the layout of the experiment is as follows:

Table 6.3: Two-factor factorial experiment in a RBD


Block
Treatment 1 2 ... r Total Mean
A1 B1 Y111 , Y112 ... Y11r Y11. Ȳ11.
A1 B2 Y121 Y122 ... Y12r Y12. Ȳ12.
. . . ... . . .
. . . ... . . .
A1 Bb Y1b1 Y1b2 ... Y1br Y1b. Ȳ1b.
A2 B1 Y211 Y212 ... Y21r Y21. Ȳ21.
A2 B2 Y221 Y222 ... Y22r Y22. Ȳ22.
. . . ... . . .
. . . ... . . .
A2 Bb Y2b1 Y2b2 ... Y2br Y2b. Ȳ2b.
. . . ... . . .
. . . ... . . .
Aa B1 Ya11 Ya12 ... Ya1r Ya1. Ȳa1.
Aa B2 Ya21 Ya22 ... Ya2r Ya2. Ȳa2.
. . . ... . . .
. . . ... . . .
Aa Bb Yab1 Yab2 ... Yabr Yab. Ȳab.
Total Y..1 Y..2 ... Y..r Y...
Mean Ȳ..1 Ȳ..2 ... Ȳ..r Ȳ...

The ANOVA model for a two-factor factorial design in a RBD has the form

Yijk = µ + τi + βj + ρk + (τ β)ij + ǫijk

where

• µ is the overall population mean;

• τi is the effect of the ith level of factor A;

• βj is the effect of the j th level of factor B;

• ρk is the effect of the k th block;

• (τ β)ij is the interaction effect of the ith level of factor A and the j th level of
factor B;

• the ǫ′ijk s are random errors which are assumed to be independent and normally
distributed with mean 0 and variance σ 2 .
The fixed effects model assumes that the τi's, ρk's, βj's and (τβ)ij's are fixed real constants satisfying the constraints:

Σ_{i=1}^{a} τi = Σ_{j=1}^{b} βj = Σ_{k=1}^{r} ρk = Σ_{i=1}^{a} (τβ)ij = Σ_{j=1}^{b} (τβ)ij = 0

The hypotheses to be tested are as follows:


1. The first hypotheses that we test are about the treatment effects, and they are as follows:

H0 : all the µij's are equal
H1 : at least one pair of the µij's are not equal

2. If we reject H0 in favour of H1 then we test for the AB interaction effects, and the hypotheses are:

H0 : (τβ)11 = (τβ)12 = ... = (τβ)ab = 0
H1 : at least one of the (τβ)ij's ≠ 0.

3. If there are no interaction effects we proceed to test about the main effects i.e
A factor level effects or the B factor level effects.

(a) The hypotheses about the A factor level effects are

H0 : τ1 = τ2 = ... = τa = 0
H1 : at least one of the τi's ≠ 0.

If we reject H0 we compare the A factor level means.

(b) The hypotheses about the B factor level effects are

H0 : β1 = β2 = ... = βb = 0
H1 : at least one of the βj's ≠ 0.

If we reject H0 we compare the B factor level means.

4. The hypotheses about the blocking effects are

H0 : ρ1 = ρ2 = ... = ρr = 0
H1 : at least one of the ρk's ≠ 0.

If we reject H0 we conclude that blocking was effective.

Sum of Squares
The total variability in the observations (the Yijk's) is measured using the total sum of squares and is given by:

SSTO = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} (Yijk − Ȳ...)² = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{r} Y²ijk − (1/abr) Y...²   with abr − 1 df

The total sum of squares can be decomposed into three separate sources of variability
i.e.

1. the variability that is due to the treatments (factor level combination) (treatment
sum of squares (SST))

2. the variability that is due to the block effects (block sum of squares (SSBlk))
and

3. the variability in the responses that is due to the random errors (error sum of
squares (SSE))

SST, SSBlk and SSE are obtained by performing a two-way ANOVA of the data with the factor level combinations AiBj as the treatments. The formula for SST is given by:

SST = r Σ_{i=1}^{a} Σ_{j=1}^{b} (Ȳij. − Ȳ...)² = (1/r) Σ_{i=1}^{a} Σ_{j=1}^{b} Y²ij. − (1/abr) Y...²   with ab − 1 df

The formula for SSBlk is given by:

SSBlk = (1/ab) Σ_{k=1}^{r} Y²..k − (1/abr) Y...²   with r − 1 df

The error sum of squares is obtained as follows:

SSE = SSTO − SST − SSBlk   with (ab − 1)(r − 1) df

The variability in the observations that is due to the treatment effects is attributed to the main effects and the interaction of the main effects, i.e.

• A factor level effects;

• B factor level effects and

• A × B interaction effects.

It follows that we can partition the treatment sum of squares into

• the variation due to the factor A effects (SSA)

• the variation due to the factor B effects (SSB)

• the variation due to the factor A × B interaction effects (SSAB)


The computational formulae for SSA, SSB and SSAB are as follows:

SSA = (1/br) Σ_{i=1}^{a} Y²i.. − (1/abr) Y...²   with a − 1 df

SSB = (1/ar) Σ_{j=1}^{b} Y².j. − (1/abr) Y...²   with b − 1 df

SSAB = SST − SSA − SSB   with (a − 1)(b − 1) df

The mean squares are obtained by dividing the sums of squares by the corresponding degrees of freedom, and they are used for testing the hypotheses about the treatments.

Table 6.4: ANOVA table for a Two-factor experiment in a RBD


Source df SS MS F
Treatment ab − 1 SST M ST M ST /M SE
Main Effects
A a−1 SSA M SA M SA/M SE
B b−1 SSB M SB M SB/M SE
Interactions
AB (a − 1)(b − 1) SSAB M SAB M SAB/M SE
Blocks (r − 1) SSBlk M SBlk M SBlk/M SE
Error (ab − 1)(r − 1) SSE M SE
Total abr − 1 SST O

Pairwise Comparisons of the treatments

They are done only if the ANOVA test under the fixed effects model concludes that some treatment means are different. In the presence of A × B interaction effects the appropriate pairwise comparisons are of the A factor level means at the jth level of factor B, or vice versa. For the A factor level means at the jth level of factor B the hypotheses are:

H0 : τij. = τi′j.
H1 : τij. ≠ τi′j.   ∀ i ≠ i′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, (ab−1)(r−1)} √(2MSE/r)

Thus the τij. and τi′ j. are significantly different if

|Ȳij. − Ȳi′ j. | > LSD

Similarly, for the B factor level means at the ith level of factor A the hypotheses are:

H0 : τij. = τij′.
H1 : τij. ≠ τij′.   ∀ j ≠ j′

The least significant difference for comparing the means is as above and the means
are significantly different if:
|Ȳij. − Ȳij ′ . | > LSD
If we conclude that A × B interactions are absent and some A and/or B factor level means are not equal then we can do pairwise comparisons of the A factor level means and/or the B factor level means. The hypotheses to be tested for factor A are:

H0 : τi.. = τi′..
H1 : τi.. ≠ τi′..   ∀ i ≠ i′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, (ab−1)(r−1)} √(2MSE/br)

Thus τi.. and τi′.. are significantly different if

|Ȳi.. − Ȳi′..| > LSD

Similarly, for the B factor level means the hypotheses are:

H0 : τ.j. = τ.j′.
H1 : τ.j. ≠ τ.j′.   ∀ j ≠ j′

The least significant difference for comparing the means is given by:

LSD = t_{α/2, (ab−1)(r−1)} √(2MSE/ar)

The least significant difference for comparing the means is as above and the means
are significantly different if:
|Ȳ.j. − Ȳ.j ′ . | > LSD

Example 6.2.2 Consider a paper manufacturer who is interested in studying the effect of four different cooking temperatures and three different pulp mixtures on the tensile strength of a paper. The experimenter has decided to run three replicates for each treatment combination. However, the plant can only make 12 runs a day, so the experimenter decided to run one replicate on each of three days and to consider the days as blocks. The data are shown in the following table.
Block
Treatment (A,B) Day 1 Day 2 Day 3
(200,1) 5.2 5.9 6.3
(200,2) 7.4 7.0 7.6
(200,3) 6.3 6.7 6.1
(225,1) 7.1 7.4 7.5
(225,2) 7.4 7.3 7.1
(225,3) 7.3 7.5 7.2
(250,1) 7.6 7.2 7.4
(250,2) 7.6 7.5 7.8
(250,3) 7.2 7.3 7.0
(275,1) 7.2 7.5 7.2
(275,2) 7.4 7.0 6.9
(275,3) 6.8 6.6 6.4

6.3 2^k Factorial Designs


These are experimental designs in which the experimental plan calls for the study of the effect on a response of k factors, each at two levels. The levels are often denoted as 'high' and 'low', or + and −, respectively. The complete factorial design requires that each level of every factor occur with each level of every other factor, giving a total of 2^k treatment combinations. In this chapter, the letters a, b, c, ... will be used to denote the higher levels of factors A, B, C, ..., and (1) will be used to denote the treatment combination in which every factor is at its lower level. In the presence of other letters we omit the symbol (1). The next sections will look at special methods of analysing 2² and 2³ factorial designs in a CRD. We will assume that the factor effects are fixed and that the errors in the responses are independent and normally distributed with mean 0 and variance σ².

6.3.1 2² Factorial experiment


Consider a 2² factorial experiment in which there are r experimental observations per treatment combination. The full model for a 2² factorial design is given by:

Yijk = µ + τi + βj + (τβ)ij + ǫijk

where

• µ is the overall population mean;

• τi is the effect of the ith level of factor A;

• βj is the effect of the j th level of factor B;

• (τ β)ij is the interaction effect of the ith level of factor A and the j th level of
factor B;

• the ǫijk's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².

Using the notation defined in section 6.3 above we note that

• a represents the total of r observations taken at the High level of A and Low
level of B;

• b represents the total of r observations taken at the High level of B and Low
level of A;

• ab represents the total of r observations taken at the High level of A and High
level of B and

• (1) represents the total of r observations taken at the Low level of A and Low
level of B;

Table 6.5 gives a two-way table of these total yields.

Table 6.5: 2² Factorial experiment

                            Factor B
                     Low(−)           High(+)          Total
Factor A  Low(−)     (1) = Y11.       b = Y12.         (1) + b = Y1..
          High(+)    a = Y21.         ab = Y22.        a + ab = Y2..
Total                (1) + a = Y.1.   b + ab = Y.2.    (1) + a + b + ab = Y...

The main effect of a factor is defined as the change in the mean response due to the change in the level of the factor. For example, the main effect of factor A from table 6.5 above is:

A = Ȳ2.. − Ȳ1.. = (1/2r)(a + ab) − (1/2r)((1) + b)
  = (1/2r)[a + ab − b − (1)]                                            (6.2)

The main effect of factor B is:

B = Ȳ.2. − Ȳ.1. = (1/2r)(b + ab) − (1/2r)((1) + a)
  = (1/2r)[b + ab − a − (1)]                                            (6.3)

The AB interaction is the difference between the diagonal means in table 6.5. That is

AB = (1/2)(Ȳ22. + Ȳ11.) − (1/2)(Ȳ21. + Ȳ12.) = (1/2r)(ab + (1)) − (1/2r)(a + b)
   = (1/2r)[ab + (1) − a − b]                                           (6.4)
The quantities in the square brackets [·] of equations 6.2, 6.3 and 6.4 are called contrasts; these contrasts are mutually orthogonal. We define the contrasts among the treatment totals as follows:

A contrast = a + ab − b − (1)
B contrast = b + ab − a − (1)
AB contrast = ab + (1) − a − b

The sum of squares for each contrast is found using the following formula:

SS_Factor = (Contrast_Factor)² / [r Σ(Contrast_Factor coefficients)²]           (6.5)

where Σ(Contrast_Factor coefficients)² is the sum of the squares of the coefficients of the terms in the contrast.

Using the formula in 6.5, the sum of squares for A (SSA) is given by:

SSA = (1/4r)(a + ab − b − (1))²

the sum of squares for B (SSB) is given by:

SSB = (1/4r)(b + ab − a − (1))²

and the sum of squares for AB (SSAB) is given by:

SSAB = (1/4r)(ab + (1) − a − b)²

The total sum of squares (SSTO) is computed using the usual formula, i.e.:

SSTO = Σi Σj Σk (Yijk − Ȳ...)² = Σi Σj Σk Yijk² − (1/4r) Y...²   with 4r − 1 df

The error sum of squares (SSE) is obtained by subtraction as follows:

SSE = SSTO − SSA − SSB − SSAB   with 2^k(r − 1) df

ANOVA table for a 2² Factorial experiment

Table 6.6: ANOVA table for a 2² factorial experiment

Source         df             SS       MS       F
Main Effects
  A            1              SSA      MSA      MSA/MSE
  B            1              SSB      MSB      MSB/MSE
Interactions
  AB           1              SSAB     MSAB     MSAB/MSE
Error          2^k(r − 1)     SSE      MSE
Total          r2^k − 1       SSTO

Example 6.3.1 Consider an investigation into the effect of concentration of reactant and the presence of a catalyst on the reaction time of a chemical process. Let the reactant concentration be factor A with two levels of interest, 10% and 20%, and let the catalyst be factor B, with the high level denoting the presence of the catalyst and the low level denoting its absence. Assuming three replicates, the data from the experiment are displayed in table 6.3.1.

Treatment              Replicate
combination     I      II     III
(1)             28     25     27
a               36     32     32
b               18     19     23
ab              31     30     29

1. Calculate the sums of squares for the data.

2. Which effects are significantly different from zero?
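As a check on the hand calculations for example 6.3.1, the contrasts and sums of squares can be evaluated directly from the treatment totals using formula 6.5. A minimal sketch:

```python
# Treatment totals for example 6.3.1 (r = 3 replicates per combination)
r = 3
one, a, b, ab = 28 + 25 + 27, 36 + 32 + 32, 18 + 19 + 23, 31 + 30 + 29

# Contrasts among the treatment totals
A_contrast = a + ab - b - one
B_contrast = b + ab - a - one
AB_contrast = ab + one - a - b

# Each contrast has four +/-1 coefficients, so the divisor is 4r (formula 6.5)
SSA = A_contrast ** 2 / (4 * r)
SSB = B_contrast ** 2 / (4 * r)
SSAB = AB_contrast ** 2 / (4 * r)

print(A_contrast, B_contrast, AB_contrast)          # 50 -30 10
print(round(SSA, 2), round(SSB, 2), round(SSAB, 2))  # 208.33 75.0 8.33
```

Comparing each mean square (here equal to its sum of squares, since each effect has 1 df) against MSE then answers which effects differ from zero.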

6.4 2³ Factorial Design


Suppose that three factors, A, B and C, each at two levels, are under study. The design is called a 2³ factorial and there are eight treatment combinations. The full model for a 2³ factorial design is given by:

Yijkl = µ + τi + βj + ρk + (τβ)ij + (τρ)ik + (βρ)jk + (τβρ)ijk + ǫijkl

where

67
• µ is the overall population mean;

• τi is the effect of the ith level of factor A;

• βj is the effect of the j th level of factor B;

• ρk is the effect of the k th level of factor C;

• (τ β)ij is the interaction effect of the ith level of factor A and the j th level of
factor B;

• (τρ)ik is the interaction effect of the ith level of factor A and the kth level of factor C;

• (βρ)jk is the interaction effect of the j th level of factor B and the k th level of
factor C;

• (τ βρ)ijk is the interaction effect of the ith level of factor A, the j th level of factor
B and the k th level of factor C;

• the ǫijkl's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ².

In computing the sums of squares for the main effects it is convenient to present the total yields of the treatment combinations along with the appropriate algebraic signs for each contrast, as in table 6.7.

Table 6.7: Signs for contrasts in a 2³ Factorial Experiment

Treatment                    Factorial effect
combination    A    B    C    AB    AC    BC    ABC
(1)            −    −    −    +     +     +     −
a              +    −    −    −     −     +     +
b              −    +    −    −     +     −     +
c              −    −    +    +     −     −     +
ab             +    +    −    +     −     −     −
ac             +    −    +    −     +     −     −
bc             −    +    +    −     −     +     −
abc            +    +    +    +     +     +     +

The treatment combinations and the appropriate algebraic signs for each contrast in table 6.7 are used in computing the sums of squares for the main effects and interaction effects.
For example, the main effect of factor A from table 6.7 above is:

A = (1/4r)[−(1) + a − b − c + ab + ac − bc + abc]

Rearranging, we obtain

A = (1/4r)[a + ab + ac + abc − (1) − b − c − bc]

We define the contrasts among the treatment totals as follows:

A contrast = a + ab + ac + abc − (1) − b − c − bc
B contrast = b + ab + bc + abc − (1) − a − c − ac
C contrast = c + ac + bc + abc − (1) − a − b − ab
AB contrast = abc + ab + c + (1) − a − b − ac − bc
AC contrast = abc + ac + b + (1) − a − c − ab − bc
BC contrast = abc + bc + a + (1) − b − c − ab − ac
ABC contrast = abc + a + b + c − ab − ac − bc − (1)

The sum of squares for each contrast is found using the following formula:

SS_Factor = (Contrast_Factor)² / [r Σ(Contrast_Factor coefficients)²]           (6.6)

where Σ(Contrast_Factor coefficients)² is the sum of the squares of the coefficients of the terms in the contrast.

Using formula 6.6, the sums of squares are given by:

SSA = (1/8r)(a + ab + ac + abc − (1) − b − c − bc)²
SSB = (1/8r)(b + ab + bc + abc − (1) − a − c − ac)²
SSC = (1/8r)(c + ac + bc + abc − (1) − a − b − ab)²
SSAB = (1/8r)(abc + ab + c + (1) − a − b − ac − bc)²
SSAC = (1/8r)(abc + ac + b + (1) − a − c − ab − bc)²
SSBC = (1/8r)(abc + bc + a + (1) − b − c − ab − ac)²
SSABC = (1/8r)(abc + a + b + c − ab − ac − bc − (1))²

The error sum of squares (SSE) is obtained by subtraction as usual:

SSE = SSTO − SSA − SSB − SSC − SSAB − SSAC − SSBC − SSABC   with 2^k(r − 1) df   (6.7)

ANOVA table for a 2³ Factorial experiment

Table 6.8: ANOVA table for a 2³ factorial experiment

Source         df             SS        MS        F
Main Effects
  A            1              SSA       MSA       MSA/MSE
  B            1              SSB       MSB       MSB/MSE
  C            1              SSC       MSC       MSC/MSE
Interactions
  AB           1              SSAB      MSAB      MSAB/MSE
  AC           1              SSAC      MSAC      MSAC/MSE
  BC           1              SSBC      MSBC      MSBC/MSE
  ABC          1              SSABC     MSABC     MSABC/MSE
Error          2^k(r − 1)     SSE       MSE
Total          r2^k − 1       SSTO

Example 6.4.1 An engineer is trying to improve the life of a cutting tool. He has run a 2³ factorial experiment using cutting speed (A), metal hardness (B) and cutting angle (C). He replicated the experiment twice and obtained the data displayed in table 6.4.1.

Treatment       Replicate
combination     I       II
(1)             284     248
a               450     410
b               349     353
c               455     438
ab              502     522
ac              398     385
bc              545     560
abc             403     408

1. Calculate the sums of squares for the data.

2. Which effects are significantly different from zero?

3. Advise on the best factor level combination for improving the life of the tool.
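The sign table in table 6.7 need not be memorised: the sign of a treatment combination for any effect is the product of its ±1 factor levels over the factors appearing in that effect. A sketch of this idea, applied to the replicate totals of example 6.4.1:

```python
r = 2  # replicates in example 6.4.1
# Replicate totals keyed by the (A, B, C) levels, coded -1 (low) / +1 (high)
totals = {
    (-1, -1, -1): 284 + 248,  # (1)
    (+1, -1, -1): 450 + 410,  # a
    (-1, +1, -1): 349 + 353,  # b
    (-1, -1, +1): 455 + 438,  # c
    (+1, +1, -1): 502 + 522,  # ab
    (+1, -1, +1): 398 + 385,  # ac
    (-1, +1, +1): 545 + 560,  # bc
    (+1, +1, +1): 403 + 408,  # abc
}

contrast, ss = {}, {}
for eff in ["A", "B", "C", "AB", "AC", "BC", "ABC"]:
    c = 0
    for levels, total in totals.items():
        sign = 1
        for factor in eff:                   # sign = product of the ±1 levels
            sign *= levels["ABC".index(factor)]
        c += sign * total
    contrast[eff] = c
    ss[eff] = c ** 2 / (8 * r)               # eight ±1 coefficients, formula 6.6

print(contrast["A"], ss["A"])  # 246 3782.25
```

The loop reproduces exactly the contrast columns of table 6.7, so the sums of squares agree with the letter formulas above.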

Chapter 7

The Analysis of Covariance (ANCOVA)

7.1 Introduction

ANCOVA is a combination of the analysis of variance and regression analysis methods; it is used when we compare treatment means while incorporating information on a quantitative variable x. In this chapter we present the analysis of covariance for a completely randomised design with one covariate.

7.2 A CRD with One Covariate


Definition 7.2.1 Covariates or concomitant variables are characteristics of experimental units which may affect the response variable and which can be measured before the treatments are imposed on the experimental units.

During the analysis, covariates are used to adjust the observed responses for the effects of heterogeneity of the experimental units, i.e., the responses Y are adjusted for the values of the covariate X. We can write the model to be fitted to the data as follows:

Yij = µ + τi + β(Xij − X̄..) + ǫij                                       (7.1)

where β is the slope of the regression of Y on X. The error ǫij in equation 7.1 has been reduced because part of it is now being accounted for by β(Xij − X̄..).

NOTE: Adding covariate(s) to any ANOVA model has the effect of reducing
the experimental error and this makes the ANOVA tests more sensitive to treatment
differences.

7.3 Analysis

The model for ANCOVA for a CRD with one covariate is as in equation 7.1, where

• Yij is the jth response to the ith treatment;

• µ is the overall population mean;

• τi is the ith treatment effect;

• Xij is the value of the covariate corresponding to Yij;

• β is the regression coefficient (slope);

• the ǫij's are random errors which are assumed to be independent and normally distributed with mean 0 and variance σ²;

• i = 1, 2, ..., t and j = 1, 2, ..., r.

To adjust the estimates of the model parameters for the effect of the covariate we note that:

1.
Ȳ.. = µ + ǭ..

since Σi τi = Σi Σj (Xij − X̄..) = 0. If ǭ.. ≈ 0 then the estimate of µ is Ȳ...

2.
Ȳi. = µ + τi + β(X̄i. − X̄..) + ǭi.

and

Yij − Ȳi. = β(Xij − X̄i.) + ǫij − ǭi.

Let Y*ij = Yij − Ȳi., X*ij = Xij − X̄i. and ǫ*ij = ǫij − ǭi.; then

Y*ij = βX*ij + ǫ*ij

is a simple linear regression model, and the least squares estimate of β is given by

β̂* = Σi Σj Y*ij X*ij / Σi Σj X*ij² = Exy / Exx

NOTE: β̂* is free of the treatment effects.
3. If

Ȳi. = µ + τi + β(X̄i. − X̄..) + ǭi.

is evaluated at µ = Ȳ.., β = β̂* and ǭi. ≈ 0, then

τ̂*i = Ȳi. − Ȳ.. − β̂*(X̄i. − X̄..)

is the adjusted estimate of τi.

4.
µ̂*i. = Ȳi. − β̂*(X̄i. − X̄..)

is the adjusted estimate of the ith treatment mean.

ANCOVA

Exx and Exy have been defined. Let Eyy = Σi Σj Y*ij², and let the total sum of squares be given by the formula

SSTO = Σi Σj (Yij − Ȳ..)² = Σi Σj Yij² − (1/n) Y..²   with n − 1 df   (7.2)

ANCOVA adjusts the usual ANOVA sums of squares for the effect of the covariate as follows:

1. The total sum of squares (SSTO) includes the effect of the covariate:

SSTOadj = SSTO − SSReg = SSTO − β̂ SSxy = SSTO − SSxy²/SSxx   with rt − 2 df

SSTOadj is the total sum of squares that includes only the treatment effects and the random errors.

2.
SSEadj = SSE − SSReg* = SSE − β̂* Exy = SSE − Exy²/Exx   with t(r − 1) − 1 df

SSE is the usual ANOVA error sum of squares, and SSReg* is the regression sum of squares from regressing Y*ij = Yij − Ȳi. on X*ij = Xij − X̄i.. SSEadj is the error sum of squares adjusted for the effect of the covariate.

3.
SSTadj = SSTOadj − SSEadj   with t − 1 df

The following formulae are used to compute SSxy, SSyy, SSxx, Exy and Exx, where the sums run over i = 1, ..., t and j = 1, ..., r:

SSxy = Σi Σj Yij Xij − (1/rt) Y.. X..
SSyy = Σi Σj Yij² − (1/rt) Y..²
SSxx = Σi Σj Xij² − (1/rt) X..²
Exx = Σi Σj Xij² − (1/r) Σi Xi.²
Exy = Σi Σj Yij Xij − (1/r) Σi Yi. Xi.
Eyy = Σi Σj Yij² − (1/r) Σi Yi.²

Table 7.1: ANCOVA table for a CRD with one covariate

Source        df               SS         MS        F
Regression    1                SSReg*     SSReg*    SSReg*/MSEadj
Treatment     t − 1            SSTadj     MSTadj    MSTadj/MSEadj
Error         t(r − 1) − 1     SSEadj     MSEadj
Total         rt − 2           SSTOadj

NOTE: SSTOadj = SSTadj + SSEadj, and not SSTOadj = SSTadj + SSEadj + SSReg*.

7.3.1 Hypothesis Testing

The objective is to test the null hypothesis of no difference among the treatment means. This is equivalent to testing

H0 : τ1 = τ2 = ... = τt = 0
H1 : at least one τi ≠ 0 (at least one of the treatment means differs from the rest)

and for the covariate effect we have

H0 : β = 0
H1 : β ≠ 0.

The test statistic for the treatment hypothesis is given by:

F = MSTadj / MSEadj                                              (7.3)

F has an F distribution with t − 1 degrees of freedom in the numerator and t(r − 1) − 1 degrees of freedom in the denominator. We conclude that treatment effects are significant at the α level of significance if Fcal > F(α; t − 1, t(r − 1) − 1).

The test statistic for the linear relationship is given by:

F = SSReg* / MSEadj                                              (7.4)

F has an F distribution with 1 degree of freedom in the numerator and t(r − 1) − 1 degrees of freedom in the denominator. We conclude that there is a linear relationship between the response and the covariate at the α level of significance if Fcal > F(α; 1, t(r − 1) − 1).
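The full sequence of ANCOVA computations (β̂*, the adjusted sums of squares and the F statistic of equation 7.3) can be traced in a few lines. The data below are made up purely to exercise the formulas, with t = 2 treatments and r = 3 observations each:

```python
# Hypothetical data: t = 2 treatments, r = 3 observations each
y = {1: [2, 4, 6], 2: [5, 7, 8]}
x = {1: [1, 2, 3], 2: [2, 3, 4]}
t, r = 2, 3
n = t * r

Ydd = sum(sum(v) for v in y.values())    # grand total Y..
Xdd = sum(sum(v) for v in x.values())    # grand total X..
sum_y2 = sum(v ** 2 for vs in y.values() for v in vs)
sum_x2 = sum(v ** 2 for vs in x.values() for v in vs)
sum_xy = sum(yi * xi for i in y for yi, xi in zip(y[i], x[i]))

SSyy = sum_y2 - Ydd ** 2 / n             # = SSTO
SSxx = sum_x2 - Xdd ** 2 / n
SSxy = sum_xy - Ydd * Xdd / n

# Within-treatment ("error") sums of squares and cross-products
Eyy = sum_y2 - sum(sum(v) ** 2 for v in y.values()) / r
Exx = sum_x2 - sum(sum(v) ** 2 for v in x.values()) / r
Exy = sum_xy - sum(sum(y[i]) * sum(x[i]) for i in y) / r

beta_star = Exy / Exx                    # slope estimate, free of treatment effects
SSTO_adj = SSyy - SSxy ** 2 / SSxx       # rt - 2 df
SSE_adj = Eyy - Exy ** 2 / Exx           # t(r - 1) - 1 df
SST_adj = SSTO_adj - SSE_adj             # t - 1 df

MST_adj = SST_adj / (t - 1)
MSE_adj = SSE_adj / (t * (r - 1) - 1)
F_trt = MST_adj / MSE_adj                # test statistic of equation 7.3
```

The computed F would then be compared against F(α; t − 1, t(r − 1) − 1) from tables.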

Pairwise Comparisons of the Treatments

In this case the hypotheses to be tested are:

H0 : µi. = µi′.
H1 : µi. ≠ µi′. ∀ i ≠ i′

The least significant difference is given by:

LSD = t(α/2; t(r − 1) − 1) se(µ̂*i. − µ̂*i′.)

where

se(µ̂*i. − µ̂*i′.) = √[ MSEadj ( 2/r + (X̄i. − X̄i′.)²/SSxx ) ]

The means µi. and µi′. are declared not equal if

|µ̂*i. − µ̂*i′.| > LSD.

Example 7.3.1 Three different types of hand trucks have been developed and a soft drink distributor wants to study the effectiveness of using these trucks in delivery. An experiment was carried out in the company's methods engineering laboratory. The variable of interest is the delivery time in minutes (y); however, delivery time is also strongly related to the case volume delivered (x). Each hand truck is used five times and the data that follow are obtained. Analyse these data and draw conclusions.

Hand truck type


1 2 3
y x y x y x
36 20 40 22 35 21
41 25 48 28 37 23
39 24 39 22 42 26
42 25 45 30 34 21
49 32 44 28 32 15

Chapter 8

QUESTIONS

Chapter 2: CRD
1. The yields of maize, in tonnes, were recorded for 4 different varieties of maize, P, Q, R and S. The experiment was done in a controlled greenhouse environment; each variety was randomly assigned to 8 of the 32 plots available for the experiment. The yields are as in the following table:

Yield
Variety
P 2.5 3.6 2.8 2.7 3.1 3.4 2.9 3.5
Q 3.6 3.9 4.1 4.3 2.9 3.5 3.8 3.7
R 4.3 4.4 4.5 4.1 3.5 3.4 3.2 4.6
S 2.8 2.9 3.1 2.4 3.2 2.5 3.6 2.7

(a) Write an appropriate statistical model.


(b) Perform an analysis of variance on these data and draw conclusions.
(c) Use Fisher’s LSD procedure to run all pairwise comparisons.
(d) Obtain a computer solution for the data. Compare your results to those obtained in (b).

2. A clinical psychologist wished to compare three methods for reducing weight in obese patients. Eleven female patients who had equal body mass indices (BMI) were used in the experiment. Four were selected at random from among the 11 patients and were assigned to Method 1. Four of the remaining 7 patients were selected at random and treated with Method 2. The remaining 3 patients were treated with Method 3. All treatments were continued for six months, and each patient's weight was measured at the end of the six months. The results are shown in the following table.

Weight
Method
1 80 92 87 83
2 70 81 78 74
3 63 76 70

(a) Analyse the data and draw appropriate conclusions.


(b) Is it necessary to perform pairwise comparisons of the mean weights for the three methods? Give reasons for your answer.

Chapter 3: RBD
1. An experiment is conducted in which four treatments (A, B, C and D) are to be compared in three blocks (1, 2 and 3). Four experimental units are available in each block. The labels for Block 1 are S1, S2, S3, S4, for Block 2 are S5, S6, S7, S8, and for Block 3 are S9, S10, S11, S12. Use the following sets of random numbers to randomise the experiment.

Set 1={10,9,4,5,6,3,1,8,12,2,7,11}; Set 2={6,3,2,11,5,8,12,9,4,7,1,10};


Set 3={2,11,5,1,9,4,10,6,8,12,3,7}

2. An experiment was carried out in which four treatments are to be compared in five blocks. The following results were obtained.

                            Block
Treatment     1      2      3      4      5
1             12.8   10.6   11.7   10.7   11.0
2             11.7   14.2   11.8    9.9   13.8
3             11.5   14.7   13.6   10.7   15.9
4             12.6   16.5   15.4    9.6   17.1

Perform the analysis of variance, separating out the treatment, block, and error sums of squares. Use the α = 0.05 level of significance to test the hypothesis that there is no difference between the treatment means.

Chapter 4: Balanced Incomplete Block Designs


1. An engineer is studying the mileage performance characteristics of five types of gasoline additives. In the road test he wishes to use cars as blocks; however, because of a time constraint, he must use an incomplete block design. He runs the balanced design with the five blocks that follow. Analyse the data and draw conclusions.

Car
Additive 1 2 3 4 5
1 ... 17 14 13 12
2 14 14 ... 13 10
3 12 ... 13 12 9
4 13 11 11 12 ...
5 11 12 10 ... 8

Chapter 5: Latin Square and Crossover Designs


1. An industrial engineer is investigating the effect of four assembly methods
(A,B,C,D) on the assembly time for a color television component. Four op-
erators are selected for the study. Furthermore, the engineer knows that each
method produces such fatigue that the time required for the last assembly may
be greater than the time required for the first, regardless of the method. That is, a trend develops in the required assembly time. To account for this source of variability, the engineer uses the Latin square design shown below.

Operator
Order of Assembly 1 2 3 4
1 10 (C) 14(D) 7(A) 8(B)
2 7(B) 18(C) 11(D) 8(A)
3 5(A) 10(B) 11(C) 9(D)
4 10(D) 10(A) 12(B) 14(C)

(a) Analyse the data and draw appropriate conclusions.


(b) Suppose that, the observation from operator 4 on order of assembly 4 is
missing. Estimate the missing value and perform the analysis using this
value.

2. The effects of two drugs (A,B) on the duration of sleep were studied using a
group of 8 patients. Four patients were randomly assigned to drug A during
period 1 and the other four to drug B. During period 2, the two groups of
patients switched drugs. The sleep duration data is as follows.

(a) Identify the design.

Patient
Period 1 2 3 4 5 6 7 8
1 8.6(A) 7.1(B) 8.3(A) 7.3(B) 7.9(A) 7.5(B) 6.3(A) 6.8(B)
2 8.0(B) 7.5(A) 7.4(B) 8.4(A) 7.3(B) 7.6(A) 6.4(B) 7.5(A)

(b) Give a model for this design. State all the relevant assumptions of the
model.
(c) Analyse the data and draw conclusions.

Chapter 6: Factorial Designs


1. An experiment is conducted to study the influence of operating temperature and
three types of face-plate glass in the light output of TV tube. The following
data are collected. Assume that both factors are fixed. Analyse the data and
draw conclusions.

Temperature
Glass Type 100 125 150
1 580 1090 1392
570 1085 1386

2 530 1035 1312


579 1000 1299

3 546 1045 867


575 1053 904

2. The following data were obtained from a 2³ factorial experiment replicated three times. Evaluate the sums of squares for all factorial effects by the contrast method.

Treatment    Replicate 1    Replicate 2    Replicate 3
(1)          12             19             10
a            15             20             16
b            24             16             17
ab           23             17             27
c            17             25             21
ac           16             19             19
bc           24             23             29
abc          28             25             20

Chapter 7: Analysis of Covariance
1. Four different formulations of an industrial glue are being tested. The ten-
sile strength of the glue is also related to the thickness. Five observations on
strength (y) and thickness (x) are obtained for each formulation. The data
are shown in the following table. Analyse these data and draw appropriate
conclusions.

Glue Formulation
1 2 3 4
Y X Y X Y X Y X
1 46.5 13 48.7 12 46.3 15 44.7 16
2 45.9 14 49.0 10 47.1 14 43.0 15
3 49.8 12 50.1 11 48.9 11 51.0 10
4 46.1 12 48.5 12 48.2 11 48.1 12
5 44.3 14 45.2 14 50.3 10 48.6 11

Chapter 9

Formulae

The Principles of Experimental Design


1.

Completely Randomised Designs


1.
SSTO = Σi Σj (Yij − Ȳ..)² = Σi Σj Yij² − (1/tr) Y..²   with tr − 1 df   (9.1)

2.
SST = (1/r) Σi Yi.² − (1/tr) Y..²   with t − 1 df
SSE = SSTO − SST   with t(r − 1) df

LSD = t(α/2; t(r − 1)) √(2MSE/r)

3. Missing Observations

SSTO = Σi Σj Yij² − (1/n) Y..²   with n − 1 df

SST = Σi (1/ri) Yi.² − (1/n) Y..²   with t − 1 df   (9.2)

SSE = SSTO − SST   with n − t df

LSD = t(α/2; n − t) √[ MSE (1/ri + 1/ri′) ]

Randomised Block Design

1.
SSTO = Σi Σj (Yij − Ȳ..)² = Σi Σj Yij² − (1/n) Y..²   with n − 1 df   (9.3)

SST = (1/b) Σi Yi.² − (1/n) Y..²   with t − 1 df

SSB = (1/t) Σj Y.j² − (1/n) Y..²   with b − 1 df

SSE = SSTO − SST − SSB   with n − t − b + 1 = (t − 1)(b − 1) df

LSD = t(α/2; (t − 1)(b − 1)) √(2MSE/b)

Block × Treatment Interaction Effects

SSTO = Σi Σj Σk (Yijk − Ȳ...)² = Σi Σj Σk Yijk² − (1/tbr) Y...²   with tbr − 1 df   (9.4)

SSModel = r Σi Σj (Ȳij. − Ȳ...)² = (1/r) Σi Σj Yij.² − (1/tbr) Y...²   with tb − 1 df

SSE = SSTO − SSModel   with tb(r − 1) df

SST = (1/br) Σi Yi..² − (1/tbr) Y...²   with t − 1 df

SSB = (1/tr) Σj Y.j.² − (1/tbr) Y...²   with b − 1 df

SSB×T = SSModel − SST − SSB   with (t − 1)(b − 1) df

LSD = t(α/2; tb(r − 1)) √(2MSE/br)

One Missing Value

M = (tT + bB − G) / ((t − 1)(b − 1))

Two or More Missing Values

Complete (full) model (model 1):

Yij = µ + τi + βj + ǫij

Reduced model (model 2):

Yij = µ + βj + ǫij

SSE_R − SSE_F = SSTadj

SSB = SSTO − SSTadj − SSE
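As a quick numeric sketch of the one-missing-value formula above (all totals hypothetical):

```python
def estimate_missing(t, b, T, B, G):
    """Estimate one missing value in a randomised block design.

    t, b : numbers of treatments and blocks
    T, B : totals of the treatment and the block containing the missing
           observation, computed from the observations actually present
    G    : grand total of the present observations
    """
    return (t * T + b * B - G) / ((t - 1) * (b - 1))

# Hypothetical totals for t = 4 treatments and b = 3 blocks
M = estimate_missing(t=4, b=3, T=30.0, B=25.0, G=120.0)
print(M)  # (120 + 75 - 120) / 6 = 12.5
```

The estimated value is then inserted into the table and the analysis proceeds with one fewer error degree of freedom.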

Balanced Incomplete Block Designs

1.
SSTO = Σi Σj Yij² − (1/n) Y..²   with n − 1 df

SSTadj = [(t − 1) / (nk(k − 1))] Σi (kYi. − Bi)²   with t − 1 df

SSB = (1/k) Σj Y.j² − (1/n) Y..²   with b − 1 df

SSE = SSTO − SSTadj − SSB   with n − t − b + 1 df

LSD = t(α/2; n − t − b + 1) √(2kMSE/(λt))

Latin Square and Crossover Designs

1.
SSTO = Σi Σj (Yijk − Ȳ...)² = Σi Σj Yijk² − (1/t²) Y...²   with t² − 1 df

SST = (1/t) Σk Y..k² − (1/t²) Y...²   with t − 1 df
SSR = (1/t) Σi Yi..² − (1/t²) Y...²   with t − 1 df
SSC = (1/t) Σj Y.j.² − (1/t²) Y...²   with t − 1 df
SSE = SSTO − SST − SSR − SSC   with (t − 2)(t − 1) df

LSD = t(α/2; (t − 1)(t − 2)) √(2MSE/t)

One missing value:

M = [t(T + R + C) − 2G] / ((t − 1)(t − 2))

For a pair of treatments involving the estimated missing value:

LSD = t(α/2; (t − 1)(t − 2)) √[ MSE ( 2/t + 1/((t − 1)(t − 2)) ) ]

For any other pair of treatments, the LSD is as before:

LSD = t(α/2; (t − 1)(t − 2)) √(2MSE/t)

Replicated Latin squares:

SSTO = Σi Σj Σm (Yijkm − Ȳ....)² = Σi Σj Σm Yijkm² − (1/t²r) Y....²   with t²r − 1 df

It is possible to partition the total sum of squares into four separate sources of variability. The computational formulae for the sums of squares are given by:

SST = (1/tr) Σk Y..k.² − (1/t²r) Y....²   with t − 1 df
SSR = (1/tr) Σi Yi...² − (1/t²r) Y....²   with t − 1 df
SSC = (1/tr) Σj Y.j..² − (1/t²r) Y....²   with t − 1 df
SSE = SSTO − SST − SSR − SSC   with t²r − 3t + 2 df

Crossover designs:

SSTO = Σi Σj (Yijk − Ȳ...)² = Σi Σj Yijk² − (1/t²r) Y...²   with t²r − 1 df   (9.5)

SST = (1/tr) Σk Y..k² − (1/t²r) Y...²   with t − 1 df
SSP = (1/tr) Σi Yi..² − (1/t²r) Y...²   with t − 1 df
SSU = (1/t) Σj Y.j.² − (1/t²r) Y...²   with tr − 1 df
SSE = SSTO − SST − SSP − SSU   with (tr − 2)(t − 1) df

Factorial Experiments

1. Two-factor factorial design:

SSTO = Σi Σj Σk (Yijk − Ȳ...)² = Σi Σj Σk Yijk² − (1/abr) Y...²   with abr − 1 df

SST = (1/r) Σi Σj Yij.² − (1/abr) Y...²   with ab − 1 df

SSA = (1/br) Σi Yi..² − (1/abr) Y...²   with a − 1 df
SSB = (1/ar) Σj Y.j.² − (1/abr) Y...²   with b − 1 df
SSAB = SST − SSA − SSB   with (a − 1)(b − 1) df

In the presence of A × B interactions we must compare the factor level means at each level of the other factor:

LSD = t(α/2; ab(r − 1)) √(2MSE/r)

Factor A:
LSD = t(α/2; ab(r − 1)) √(2MSE/br)

Factor B:
LSD = t(α/2; ab(r − 1)) √(2MSE/ar)

Two-factor factorial in an RBD:

SSTO = Σi Σj Σk (Yijk − Ȳ...)² = Σi Σj Σk Yijk² − (1/abr) Y...²   with abr − 1 df

SST = (1/r) Σi Σj Yij.² − (1/abr) Y...²   with ab − 1 df

SSBlk = (1/ab) Σk Y..k² − (1/abr) Y...²   with r − 1 df

SSE = SSTO − SST − SSBlk   with (ab − 1)(r − 1) df

SSA = (1/br) Σi Yi..² − (1/abr) Y...²   with a − 1 df
SSB = (1/ar) Σj Y.j.² − (1/abr) Y...²   with b − 1 df
SSAB = SST − SSA − SSB   with (a − 1)(b − 1) df

In the presence of A × B interaction effects the appropriate pairwise comparisons are of the A factor level means at the jth level of factor B, or vice versa:

LSD = t(α/2; (ab − 1)(r − 1)) √(2MSE/r)

A factor level means:
LSD = t(α/2; (ab − 1)(r − 1)) √(2MSE/br)

B factor level means:
LSD = t(α/2; (ab − 1)(r − 1)) √(2MSE/ar)

2^k Factorial Experiment:

SS_Factor = (Contrast_Factor)² / [r Σ(Contrast_Factor coefficients)²]   (9.6)

SSE = SSTO − SSA − SSB − SSAB   with 2^k(r − 1) df

Analysis of Covariance

1.
β̂* = Σi Σj Y*ij X*ij / Σi Σj X*ij² = Exy / Exx   (9.7)

SSTO = Σi Σj (Yij − Ȳ..)² = Σi Σj Yij² − (1/n) Y..²   with n − 1 df   (9.8)

SSTOadj = SSTO − SSReg = SSTO − β̂ SSxy = SSTO − SSxy²/SSxx   with rt − 2 df

SSEadj = SSE − SSReg* = SSE − β̂* Exy = SSE − Exy²/Exx   with t(r − 1) − 1 df

SSTadj = SSTOadj − SSEadj   with t − 1 df

SSxy = Σi Σj Yij Xij − (1/rt) Y.. X..
SSyy = Σi Σj Yij² − (1/rt) Y..²
SSxx = Σi Σj Xij² − (1/rt) X..²
Exx = Σi Σj Xij² − (1/r) Σi Xi.²
Exy = Σi Σj Yij Xij − (1/r) Σi Yi. Xi.
Eyy = Σi Σj Yij² − (1/r) Σi Yi.²

LSD = t(α/2; t(r − 1) − 1) se(µ̂*i. − µ̂*i′.)

where

se(µ̂*i. − µ̂*i′.) = √[ MSEadj ( 2/r + (X̄i. − X̄i′.)²/SSxx ) ]
