Chapter 4 Design of Experiment
Chapter Four
Contents
Randomized Blocks
“Statistics are like women; mirrors of purest virtue and truth, or like
Lecture notes for Design and analysis of experiments (Stat 2043) Chapter - 4
The completely randomized design (CRD) is usually applied when the experimental units are
homogeneous, so that blocking is not necessary. When the experimental units are
heterogeneous, we need a more advanced design of experiment that reduces the error
variance. Experimental error makes inference difficult: as the variance of the experimental error
increases, confidence intervals get wider and the power of the test decreases.
When we have a single blocking factor available for our experiment we will try to utilize a
randomized complete block design (RCBD).
All other things being equal, we would thus prefer to conduct our experiments with
units that are homogeneous so that the variance will be small. Among the designs that
reduce experimental error by introducing blocks, the Randomized Complete Block Design (RCBD)
is our interest in this chapter.
When the nuisance source of variability is known and controllable, a design technique called
blocking can be used to systematically eliminate its effect on the statistical comparison among
treatments. Blocking is an extremely important design technique.
CRD is not appropriate if the experimental units are not alike. The simplest design which
enables us to take care of variability among the units is the Randomized Complete Block Design
(RCBD).
Remark: The word complete indicates that each block contains all treatments.
We have seen the drawback of CRD when the experimental units are really heterogeneous. One
may ask what the need for blocking is. The answer is simple: the investigator introduces
blocking in the design in order to reduce variability. Doing so increases the power of the
test and reduces the experimental error to some extent. Note that blocking is worthwhile only
when the experimental units are not homogeneous; otherwise it is not necessary to include blocks in
your design.
Use of blocking
The Randomized Complete Block Design is the basic blocking design. Assume there are $a$
treatments and each treatment will be assigned to $b$ units for a total of $N = ab$ units. We
partition the $N$ units into $b$ groups of $a$ units each; these groups are our blocks. We make this
partition into blocks in such a way that the units within a block are somehow alike; we anticipate
that these alike units will have similar responses. In the first block, we randomly assign the $a$
treatments to the $a$ units; we then do an independent randomization, assigning treatments to units, in
each of the other blocks. This is the RCBD.
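The randomization step described above can be sketched in a few lines of Python. This is only an illustration: the function name `randomize_rcbd`, the treatment labels and the seed are assumptions made for the example, not part of the notes.

```python
import random

def randomize_rcbd(treatments, n_blocks, seed=None):
    """Return one independent random ordering of all treatments per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)   # every block receives every treatment once
        rng.shuffle(order)         # independent randomization inside this block
        layout.append(order)
    return layout

# Hypothetical example: a = 4 treatments, b = 3 blocks.
layout = randomize_rcbd(["T1", "T2", "T3", "T4"], n_blocks=3, seed=1)
for block_number, order in enumerate(layout, start=1):
    print(f"Block {block_number}: {order}")
```

Each printed row is the run order of the treatments within one block; because every block contains all treatments, the design is complete.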
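Suppose we have, in general, $a$ treatments that are to be compared and $b$ blocks. There is one
observation per treatment (single replication) in each block and the order in which the treatments
are run within each block is determined randomly. Let $y_{ij}$ be the response for the $i$-th treatment in
the $j$-th block. The standard linear model for RCBD has a grand mean, a treatment effect, a block
effect and experimental error, as in
$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a \text{ and } j = 1, 2, \ldots, b$$
Where $\mu$ is an overall mean, $\tau_i$ is the effect of the $i$-th treatment, $\beta_j$ is the effect of the $j$-th block and $\varepsilon_{ij}$ is
the usual NID$(0, \sigma^2)$ random error term.
When the treatments and the blocks are assumed to be fixed,
$$\sum_{i=1}^{a} \tau_i = 0 \quad \text{and} \quad \sum_{j=1}^{b} \beta_j = 0$$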
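In an experiment involving the RCBD, we are interested in testing the equality of the
treatment means. Thus, the hypotheses of interest are
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a \quad \text{Vs} \quad H_1: \mu_i \neq \mu_j \text{ for at least one pair } (i, j)$$
An equivalent way to write the above hypotheses, in terms of the treatment effects, is:
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0 \quad \text{Vs} \quad H_1: \tau_i \neq 0 \text{ for at least one } i$$
o As we test the treatment effect, we can also test the significance of blocking. The
hypothesis is
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0 \quad \text{Vs} \quad H_1: \beta_j \neq 0 \text{ for at least one } j$$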
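The ANOVA that we have seen in CRD can be extended to the RCBD easily. Let $y_{i.}$ be the total
of all observations taken under treatment $i$, $y_{.j}$ be the total of all observations in block $j$, $y_{..}$ be the
grand total of all observations, and $N = ab$ be the total number of observations:
$$y_{i.} = \sum_{j=1}^{b} y_{ij}, \quad y_{.j} = \sum_{i=1}^{a} y_{ij}, \quad y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}$$
In a similar fashion, $\bar{y}_{i.}$ is the average of the $i$-th treatment, $\bar{y}_{.j}$ is the average of the $j$-th block and
$\bar{y}_{..}$ is the grand average of all observations. That is,
$$\bar{y}_{i.} = \frac{y_{i.}}{b}, \quad \bar{y}_{.j} = \frac{y_{.j}}{a}, \quad \bar{y}_{..} = \frac{y_{..}}{N}$$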
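We can see that the cross products are zero and hence we can get the following expression.
$$\sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij} - \bar{y}_{..})^2 = b\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 + a\sum_{j=1}^{b}(\bar{y}_{.j} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$$
that is, $SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$.
Since there are $N$ observations, $SS_T$ has $N-1$ degrees of freedom. There are $a$ treatments and
$b$ blocks, so $SS_{Treatments}$ and $SS_{Blocks}$ have $a-1$ and $b-1$ degrees of freedom,
respectively, and $SS_E$ has $(a-1)(b-1)$. For simplicity of computational procedures:
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments} = \frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}, \quad SS_{Blocks} = \frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{N}$$
$$SS_E = SS_T - SS_{Treatments} - SS_{Blocks}$$
Source of variation   Sum of squares    DF            Mean square                    F
Treatments            SS_Treatments     a - 1         SS_Treatments/(a - 1)          MS_Treatments/MS_E
Blocks                SS_Blocks         b - 1         SS_Blocks/(b - 1)              MS_Blocks/MS_E
Error                 SS_E              (a-1)(b-1)    SS_E/((a - 1)(b - 1))
Total                 SS_T              ab - 1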
NB: If the null hypothesis of equal treatment means is rejected, we need to know which means
differ. Therefore, we conduct multiple comparisons as we have
done in chapter 3. Let's use the Fisher LSD procedure. The procedure is the same
as in chapter 3. The only difference is the degrees of freedom for $MS_E$. In this case it is
$(a-1)(b-1)$, so that with $b$ observations per treatment $LSD = t_{\alpha/2,\,(a-1)(b-1)}\sqrt{2MS_E/b}$.
Example: Consider the hardness testing experiment. There are four tips and
four available metal coupons. Each tip is tested once on each coupon, resulting
your answer.
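Solution: a = 4, b = 4, N = ab = 4*4 = 16
$MS_{Treatments} = \frac{SS_{Treatments}}{a-1} = \frac{38.50}{3} = 12.83$, $F_0 = \frac{MS_{Treatments}}{MS_E} = \frac{12.83}{0.89} = 14.42$
$MS_{Blocks} = \frac{SS_{Blocks}}{b-1} = \frac{82.50}{3} = 27.50$, $F_0 = \frac{MS_{Blocks}}{MS_E} = \frac{27.50}{0.89} = 30.89$
$MS_E = \frac{SS_E}{(a-1)(b-1)} = \frac{8.00}{9} = 0.89$
Hypotheses:
i, for treatment
$H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0$ Vs $H_1: \tau_i \neq 0$ for at least one $i$
ii, for block
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ Vs $H_1: \beta_j \neq 0$ for at least one $j$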
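Test statistics:
$F_0 = \frac{MS_{Treatments}}{MS_E} = 14.42$ for treatment
$F_0 = \frac{MS_{Blocks}}{MS_E} = 30.89$ for block
Source of variation   Sum of squares   DF                  Mean square   F
Treatments            38.50            a - 1 = 3           12.83         14.42
Blocks                82.50            b - 1 = 3           27.50         30.89
Error                 8.00             (a-1)(b-1) = 9      0.89
Total                 129.00           N - 1 = 15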
Exercise: Since the null hypothesis is rejected, make pairwise comparisons.
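The computational formulas for the RCBD can be checked numerically. The sketch below is a plain-Python illustration; the function name `rcbd_anova` is hypothetical, and the data table is an assumption: the coupon-by-tip table did not survive in these notes, so the coded values used here were chosen because they reproduce the mean squares quoted in the solution (12.83, 27.50 and 0.89).

```python
def rcbd_anova(y):
    """y[i][j] is the response for treatment i in block j (one value per cell)."""
    a, b = len(y), len(y[0])
    n = a * b
    trt_totals = [sum(row) for row in y]
    blk_totals = [sum(y[i][j] for i in range(a)) for j in range(b)]
    grand = sum(trt_totals)
    cf = grand ** 2 / n                                   # correction factor y..^2 / N
    ss_total = sum(v * v for row in y for v in row) - cf
    ss_trt = sum(t * t for t in trt_totals) / b - cf
    ss_blk = sum(t * t for t in blk_totals) / a - cf
    ss_err = ss_total - ss_trt - ss_blk                   # error SS by subtraction
    ms_trt = ss_trt / (a - 1)
    ms_blk = ss_blk / (b - 1)
    ms_err = ss_err / ((a - 1) * (b - 1))
    return {
        "ss_trt": ss_trt, "ss_blk": ss_blk, "ss_err": ss_err, "ss_total": ss_total,
        "ms_trt": ms_trt, "ms_blk": ms_blk, "ms_err": ms_err,
        "f_trt": ms_trt / ms_err, "f_blk": ms_blk / ms_err,
    }

# ASSUMED coded hardness data: rows are tips (treatments), columns are coupons (blocks).
hardness = [
    [-2, -1, 1, 5],
    [-1, -2, 3, 4],
    [-3, -1, 0, 2],
    [2, 1, 5, 7],
]
result = rcbd_anova(hardness)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these assumed values the sums of squares come out as 38.50, 82.50, 8.00 and 129.00. The unrounded F ratios are about 14.44 and 30.94; the slightly smaller values in the solution are consistent with dividing by the rounded $MS_E = 0.89$.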
The investigator must consider carefully and critically the reasons why values are missing. If
only one datum is missing from a randomized blocks experiment and provided it is missing for
reasons unassociated with the treatment, the correct analysis of the remaining data can be
performed relatively simply by hand. Suppose, however, that values are missing for reasons
possibly associated with treatment. For example, a patient may decide to drop out of the study
and not return for follow-up assessment because of a poor (or good) response to treatment.
The statistical methods of this section are then inappropriate. They may produce seriously biased
results and the investigator will have to adopt ad hoc methods for analyzing the data.
Assume, to begin with, that only one value is missing, say the response to treatment $i$ in block $j$.
The analysis begins with the estimation, using the method of least squares, of the value predicted
for treatment $i$ in block $j$ on the basis of the average level of responses in that block and of the
average level of response to that treatment.
The formula for $\hat{x}$, the predicted value, is
$$\hat{x} = \frac{a\,y'_{i.} + b\,y'_{.j} - y'_{..}}{(a-1)(b-1)}$$
Where $y'_{.j}$ denotes the sum of the observed responses in block $j$, $y'_{i.}$ is the sum of the
observed responses to treatment $i$ and $y'_{..}$ is the sum of all $ab - 1$ observed responses. The next
step in the analysis is to obtain an unbiased estimator of the inherent variance in the data by
calculating the residual sum of squares in the usual way on the table obtained by inserting $\hat{x}$ in
the empty cell. This table is called the augmented table. The degrees of freedom of $SS_E$ are no
longer $(a-1)(b-1)$ but $(a-1)(b-1) - 1$. The analysis is not quite done, because both the
sum of squares for treatments and the sum of squares for blocks are slightly too high.
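The corrected (or adjusted) sum of squares for treatments is
$$SS_{Treatments(adjusted)} = SS_{Treatments} - \frac{[\,y'_{.j} - (a-1)\hat{x}\,]^2}{a(a-1)}$$
For blocks,
$$SS_{Blocks(adjusted)} = SS_{Blocks} - \frac{[\,y'_{i.} - (b-1)\hat{x}\,]^2}{b(b-1)}$$
Source of variation       DF
Treatments (adjusted)     a - 1
Blocks                    b - 1
Error                     (a - 1)(b - 1) - 1
Total                     ab - 2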
The final two columns of the analysis of variance table are completed in the standard manner. It
must be remembered that the F ratio for treatments is to be compared to $F_{\alpha,\,a-1,\,(a-1)(b-1)-1}$ for
significance. The investigator confronted with two or more missing values in a randomized
blocks study, and the investigator who prefers not to apply the preceding (admittedly tedious)
algebra even when only one value is missing, will be able to obtain correct results by running the
data through a computer using a program for multiple regression analysis. The reader is assumed
to be familiar with the use of dummy or indicator variables to represent treatments and blocks.
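The single-missing-value estimate, $(a \times \text{treatment total} + b \times \text{block total} - \text{grand total})/((a-1)(b-1))$ computed from the observed values only, can be sketched as follows. The function name `estimate_missing_rcbd` and the 3 x 4 data table are hypothetical and serve only to illustrate the arithmetic.

```python
def estimate_missing_rcbd(y):
    """y[i][j]: treatment i in block j; exactly one cell must be None."""
    a, b = len(y), len(y[0])
    missing = [(i, j) for i in range(a) for j in range(b) if y[i][j] is None]
    if len(missing) != 1:
        raise ValueError("this sketch handles exactly one missing value")
    mi, mj = missing[0]
    trt_total = sum(v for v in y[mi] if v is not None)      # observed total, treatment i
    blk_total = sum(y[i][mj] for i in range(a) if i != mi)  # observed total, block j
    grand = sum(v for row in y for v in row if v is not None)
    x_hat = (a * trt_total + b * blk_total - grand) / ((a - 1) * (b - 1))
    return mi, mj, x_hat

# Hypothetical data: a = 3 treatments (rows), b = 4 blocks (columns).
data = [
    [12, 15, None, 14],
    [10, 13, 11, 12],
    [14, 17, 15, 16],
]
mi, mj, x_hat = estimate_missing_rcbd(data)
print(f"estimate for treatment {mi + 1}, block {mj + 1}: {x_hat}")

# Least-squares property: in the augmented table the residual of the filled cell is 0.
filled = [row[:] for row in data]
filled[mi][mj] = x_hat
a, b = len(filled), len(filled[0])
row_mean = sum(filled[mi]) / b
col_mean = sum(filled[i][mj] for i in range(a)) / a
grand_mean = sum(sum(row) for row in filled) / (a * b)
residual = x_hat - row_mean - col_mean + grand_mean
print(f"residual at the filled cell: {residual}")
```

For this made-up table the estimate is 13.0, and the zero residual is why the error sum of squares loses one degree of freedom rather than gaining information from the inserted value.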
Example:
When using the RCBD, sometimes an observation in one of the blocks is missing. This may
happen because of carelessness or error or for reasons beyond our control, such as unavoidable
damage to an experimental unit. A missing observation introduces a new problem into the
analysis because treatments are no longer orthogonal to blocks.
Find the missing value x and construct the ANOVA table (exercise); the data are shown below.
Solution: a = 4 ,b = 4
Hint: First find the sum of square treatments, sum of square blocks and sum of
square error and sum of square total after substituting the missing value. Then
find sum of square treatment adjusted, sum of square block adjusted and the
The assumptions of RCBD are the same as those of CRD, and the possible ways of detecting
violations of the assumptions are the same. The only difference is the calculation of the residuals
(errors). In case of CRD, $e_{ij} = y_{ij} - \bar{y}_{i.}$ but in case of RCBD $e_{ij} = y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}$
Assumptions
The three basic assumptions we need to check are that the errors are:-
1) Independent,
2) Normally distributed
3) Have constant variance.
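Residuals are the raw material for checking these assumptions. The sketch below computes the RCBD residuals (observation minus treatment mean minus block mean plus grand mean); the function name `rcbd_residuals` and the data are hypothetical, chosen only for illustration.

```python
def rcbd_residuals(y):
    """Residual for each cell: y_ij - treatment mean - block mean + grand mean."""
    a, b = len(y), len(y[0])
    trt_mean = [sum(row) / b for row in y]
    blk_mean = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    grand_mean = sum(sum(row) for row in y) / (a * b)
    return [
        [y[i][j] - trt_mean[i] - blk_mean[j] + grand_mean for j in range(b)]
        for i in range(a)
    ]

# Hypothetical data: 3 treatments (rows) in 4 blocks (columns).
data = [
    [12, 15, 13, 14],
    [10, 13, 11, 12],
    [14, 17, 15, 17],
]
residuals = rcbd_residuals(data)
for row in residuals:
    print([round(e, 3) for e in row])
```

The residuals sum to zero within every treatment and within every block, which is a quick arithmetic check. They can then be plotted against fitted values (constant variance), in run order (independence) and on a normal probability plot (normality).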
experiment”. E. RUTHERFORD
The Latin Square (LS) design is used where the researcher desires to control the variation in an
experiment that is related to rows and columns in the field.
Some of the basic features of the LS (Latin square) design are presented below:
The treatments are assigned at random within rows and columns, with each treatment appearing
once per row and once per column.
There are equal numbers of treatments, rows and columns.
It is useful where the experimenter desires to control variation in two directions.
The LS design is often considered more efficient than the RCBD because, in addition to detecting
differences due to treatments, it removes variation due to both rows and columns and not due to a
single blocking factor alone. One
important feature of the LS design is that the number of replications is always equal to the
number of treatments. As such, a design with $p$ treatments requires exactly $p^2$ experimental
units. Because of this requirement, the main disadvantage of the design is that it is not
advisable for a large number of treatments. Thus, in practice, the LS design is applicable mainly to
experiments in which the number of treatments is not less than four but not more than eight.
More formally, a Latin Square is a $p \times p$ array filled with $p$ different Latin letters, each
occurring exactly once in each row and exactly once in each column. The name Latin Square is
motivated by Leonhard Euler, who used Latin characters as symbols.
Of course, other symbols can be used instead of Latin letters. That is, the alphabetic
sequence A, B, C, etc. can be replaced by the integer sequence 1, 2, 3, etc.
For a better visualization of the Latin Square design, let us see an example for a particular value of
$p$. Rows and columns are blocks and Latin letters are treatments. Every row contains all the
Latin letters and every column contains all the Latin letters.
Layout for 4 x 4 Latin Square Design
                          Columns
             1          2          3          4
        1    Y111(A)    Y122(B)    Y133(C)    Y144(D)
Rows    2    Y221(B)    Y232(C)    Y243(D)    Y214(A)
        3    Y331(C)    Y342(D)    Y313(A)    Y324(B)
        4    Y441(D)    Y412(A)    Y423(B)    Y434(C)
NB: Every Latin letter appears once in each row and once in each column.
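A layout of this kind can be generated and checked mechanically. The sketch below builds a cyclic square (each row is the previous one shifted by one position, which gives the 4 x 4 layout shown above), tests the Latin property, and then permutes whole rows and whole columns at random. The function names and the seed are assumptions for illustration; permuting rows and columns is one common way to randomize, not necessarily the procedure intended in these notes.

```python
import random

def cyclic_latin_square(symbols):
    """Row i is the symbol list shifted left by i positions."""
    p = len(symbols)
    return [[symbols[(i + j) % p] for j in range(p)] for i in range(p)]

def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    if len(symbols) != p:
        return False
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(p))
    return rows_ok and cols_ok

def randomize_square(square, seed=None):
    """Permute whole rows and whole columns; the Latin property is preserved."""
    rng = random.Random(seed)
    p = len(square)
    row_order = rng.sample(range(p), p)
    col_order = rng.sample(range(p), p)
    return [[square[i][j] for j in col_order] for i in row_order]

standard = cyclic_latin_square(["A", "B", "C", "D"])
shuffled = randomize_square(standard, seed=2043)
for row in shuffled:
    print(" ".join(row))
print("still a Latin square:", is_latin_square(shuffled))
```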
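The statistical model for a Latin square is
$$y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k + \varepsilon_{ijk}, \quad i, j, k = 1, 2, \ldots, p$$
Where $y_{ijk}$ is the observation in the $i$-th row and $k$-th column for the $j$-th treatment, $\mu$ is the overall
mean, $\alpha_i$ is the $i$-th row effect, $\tau_j$ is the $j$-th treatment effect, $\beta_k$ is the $k$-th column effect and $\varepsilon_{ijk}$ is
the random error. The model is completely additive; that is, there is no interaction between rows,
columns and treatments. This is due to the existence of one observation per cell.
The analysis of variance consists of partitioning the total sum of squares of the $N = p^2$
observations into components for rows, columns, treatments and errors. That is,
$$SS_T = SS_{Rows} + SS_{Columns} + SS_{Treatments} + SS_E$$
with degrees of freedom $p^2 - 1 = (p-1) + (p-1) + (p-1) + (p-2)(p-1)$, where
$$SS_T = \sum_i\sum_j\sum_k y_{ijk}^2 - \frac{y_{...}^2}{N}, \quad SS_{Treatments} = \frac{1}{p}\sum_{j=1}^{p} y_{.j.}^2 - \frac{y_{...}^2}{N}$$
$$SS_{Rows} = \frac{1}{p}\sum_{i=1}^{p} y_{i..}^2 - \frac{y_{...}^2}{N}, \quad SS_{Columns} = \frac{1}{p}\sum_{k=1}^{p} y_{..k}^2 - \frac{y_{...}^2}{N}$$
and $SS_E$ is obtained by subtraction.
Where $y_{i..}$ is the sum of the $i$-th row observations, $y_{.j.}$ is the sum of the $j$-th treatment values and
$y_{..k}$ is the sum of the $k$-th column observations. Under the usual assumption that $\varepsilon_{ijk}$ is NID$(0, \sigma^2)$,
each sum of squares is, upon division by $\sigma^2$, an independently distributed chi-square random
variable. Then the appropriate test statistic for testing no differences in treatment means is
$$F_0 = \frac{MS_{Treatments}}{MS_E}$$
which is distributed as $F_{p-1,\,(p-2)(p-1)}$ under the null hypothesis.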
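We may also test for no row effect and no column effect by forming the ratio of $MS_{Rows}$ or
$MS_{Columns}$ to $MS_E$. However, because the rows and columns represent restrictions on
randomization, these tests may not be appropriate. From the computational formulas for the sums
of squares, we can see that the analysis is a simple extension of the RCBD, with the sum of squares
resulting from rows obtained from the row totals.
NB: $N = p^2$
The following are the test statistics for treatment, row and column.
o The test statistic for equality of means (treatment effects) is: $F_0 = \frac{MS_{Treatments}}{MS_E}$
o The test statistic for rows is: $F_0 = \frac{MS_{Rows}}{MS_E}$
o The test statistic for columns is: $F_0 = \frac{MS_{Columns}}{MS_E}$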
Missing Data
Occasionally, one observation in a Latin square is missing. For a $p \times p$ Latin square, the missing
value may be estimated by
$$y_{ijk} = \frac{p\,(y'_{i..} + y'_{.j.} + y'_{..k}) - 2y'_{...}}{(p-2)(p-1)}$$
Where the primes indicate totals for the row, column, and treatment with the missing value, and
$y'_{...}$ is the grand total with the missing value.
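As a small illustration of the Latin square estimate, $[\,p(\text{row} + \text{column} + \text{treatment totals}) - 2 \times \text{grand total}\,]/((p-2)(p-1))$ with all totals taken over the observed values, consider the sketch below. The function name `estimate_missing_latin` and the 3 x 3 data are hypothetical.

```python
def estimate_missing_latin(y, trt):
    """y[i][k]: response in row i, column k (one None); trt[i][k]: treatment label."""
    p = len(y)
    missing = [(i, k) for i in range(p) for k in range(p) if y[i][k] is None]
    if len(missing) != 1:
        raise ValueError("this sketch handles exactly one missing value")
    mi, mk = missing[0]
    label = trt[mi][mk]
    row_total = sum(v for v in y[mi] if v is not None)
    col_total = sum(y[i][mk] for i in range(p) if i != mi)
    trt_total = sum(
        y[i][k] for i in range(p) for k in range(p)
        if trt[i][k] == label and y[i][k] is not None
    )
    grand = sum(v for row in y for v in row if v is not None)
    return (p * (row_total + col_total + trt_total) - 2 * grand) / ((p - 2) * (p - 1))

# Hypothetical 3 x 3 square; the response in row 2, column 2 (treatment C) is missing.
layout = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
responses = [[10, 12, 14], [11, None, 9], [15, 8, 13]]
x_hat = estimate_missing_latin(responses, layout)
print("estimated missing value:", x_hat)
```

Here the row, column and treatment totals are 20, 20 and 29 and the grand total is 92, so the estimate is (3 x 69 - 184)/2 = 11.5.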
Solution: $N = p^2 = 25$, $p = 5$
The sum of squares for the total, batches (rows), operators (columns) and treatments
(formulations) are computed as follows:
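p = 5
$MS_{Treatments} = \frac{SS_{Treatments}}{p-1} = \frac{330}{4} = 82.50$, $F_0 = \frac{MS_{Treatments}}{MS_E} = \frac{82.50}{10.67} = 7.73$
$MS_{Rows} = \frac{SS_{Rows}}{p-1} = \frac{68}{4} = 17.00$, $F_0 = \frac{MS_{Rows}}{MS_E} = \frac{17.00}{10.67} = 1.59$
$MS_{Columns} = \frac{SS_{Columns}}{p-1} = \frac{150}{4} = 37.50$, $F_0 = \frac{MS_{Columns}}{MS_E} = \frac{37.50}{10.67} = 3.51$
$MS_E = \frac{SS_E}{(p-2)(p-1)} = \frac{128}{12} = 10.67$
Hypotheses:
i, for treatment
$H_0: \tau_1 = \tau_2 = \cdots = \tau_5 = 0$ Vs $H_1: \tau_j \neq 0$ for at least one $j$
ii, for row
$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_5 = 0$ Vs $H_1: \alpha_i \neq 0$ for at least one $i$
iii, for column
$H_0: \beta_1 = \beta_2 = \cdots = \beta_5 = 0$ Vs $H_1: \beta_k \neq 0$ for at least one $k$
Critical value: for treatment, for row and for column they have the same critical value,
$F_{0.05,\,4,\,12} = 3.26$.
Decision: 7.73 & 3.51 > 3.26, so reject $H_0$ for treatment and column. But 1.59 < 3.26, so
do not reject $H_0$ for row.
Step 7: Conclusion: There is a significant difference among the treatment (formulation) means and
among the columns (operators); there is no evidence of a difference among the rows (batches).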
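Source of variation   Sum of squares    DF            Mean square   F
Treatments            SS_Treatments     p - 1         MS_Treatments MS_Treatments/MS_E
Rows                  SS_Rows           p - 1         MS_Rows       MS_Rows/MS_E
Columns               SS_Columns        p - 1         MS_Columns    MS_Columns/MS_E
Error                 SS_E              (p-2)(p-1)    MS_E
Total                 SS_T              p^2 - 1

Source of variation   Sum of squares    DF    Mean square   F
Treatments            330               4     82.50         7.73
Rows                  68                4     17.00         1.59
Columns               150               4     37.50         3.51
Error                 128               12    10.67
Total                 676               24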
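The Latin square computations can be reproduced with a short script. This is a sketch under assumptions: the function name `latin_square_anova` is hypothetical, and the 5 x 5 data table (rows are batches, columns are operators, letters are formulations) is not reproduced in these notes, so the coded values below are assumed; they were chosen because they give the mean squares quoted in the solution (82.50, 17.00, 37.50 and 10.67).

```python
def latin_square_anova(y, trt):
    """y[i][k]: response in row i, column k; trt[i][k]: treatment letter in that cell."""
    p = len(y)
    n = p * p
    grand = sum(sum(row) for row in y)
    cf = grand ** 2 / n                                   # correction factor
    row_tot = [sum(row) for row in y]
    col_tot = [sum(y[i][k] for i in range(p)) for k in range(p)]
    trt_tot = {}
    for i in range(p):
        for k in range(p):
            trt_tot[trt[i][k]] = trt_tot.get(trt[i][k], 0) + y[i][k]
    ss = {
        "total": sum(v * v for row in y for v in row) - cf,
        "rows": sum(t * t for t in row_tot) / p - cf,
        "columns": sum(t * t for t in col_tot) / p - cf,
        "treatments": sum(t * t for t in trt_tot.values()) / p - cf,
    }
    ss["error"] = ss["total"] - ss["rows"] - ss["columns"] - ss["treatments"]
    ms_err = ss["error"] / ((p - 2) * (p - 1))
    f = {name: (ss[name] / (p - 1)) / ms_err for name in ("treatments", "rows", "columns")}
    return ss, ms_err, f

# ASSUMED coded data: rows = batches, columns = operators, letters = formulations.
layout = ["ABCDE", "BCDEA", "CDEAB", "DEABC", "EABCD"]
responses = [
    [-1, -5, -6, -1, -1],
    [-8, -1, 5, 2, 11],
    [-7, 13, 1, 2, -4],
    [1, 6, 1, -2, -3],
    [-3, 5, -5, 4, 6],
]
ss, ms_err, f = latin_square_anova(responses, layout)
print("sums of squares:", ss)
print("MS_E:", round(ms_err, 2))
print("F ratios:", {name: round(value, 2) for name, value in f.items()})
```

With these assumed values the sums of squares are 330, 68, 150, 128 and 676, matching the table above; the unrounded F ratios are about 7.73, 1.59 and 3.52.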
Consider a $p \times p$ Latin square, and superimpose on it a second $p \times p$ Latin square in which the
treatments are denoted by Greek letters. If the two squares when superimposed have the property
that each Greek letter appears once and only once with each Latin letter, the two Latin squares
are said to be orthogonal, and the design obtained is called a Graeco-Latin square.
This is another name for a pair of orthogonal Latin squares superimposed on one another, with
the treatments represented by Greek letters in one square and by Latin letters in the other
square.
Keeping in mind the definition or the requirements of the Latin Square design, let us visualize the
superimposition of two orthogonal Latin square designs. Assume the treatments in the first Latin
square design are denoted by Latin letters and the treatments in the second Latin square design
are denoted by Greek letters such as α, β, γ, δ. In addition, each Latin letter occurs exactly once with
each Greek letter.
Example: One 4 x 4 Graeco-Latin square is the following:
                  Columns
           1     2     3     4
      1    Aα    Bβ    Cγ    Dδ
Rows  2    Bδ    Aγ    Dβ    Cα
      3    Cβ    Dα    Aδ    Bγ
      4    Dγ    Cδ    Bα    Aβ
Note: Each treatment occurs once in each row block, once in each column block and once with
each Greek letter.
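Orthogonality is easy to verify by machine: superimpose the two squares and count the distinct (Latin, Greek) pairs. The sketch below uses one valid 4 x 4 pair as an illustration; the function names are hypothetical, and the Greek letters are written as lowercase a, b, g, d (for α, β, γ, δ) purely for convenience.

```python
def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    if len(symbols) != p:
        return False
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(p))
    return rows_ok and cols_ok

def are_orthogonal(latin, greek):
    """True if every (Latin, Greek) pair occurs exactly once when superimposed."""
    p = len(latin)
    pairs = {(latin[i][j], greek[i][j]) for i in range(p) for j in range(p)}
    return len(pairs) == p * p

# Illustrative 4 x 4 squares; a, b, g, d stand for alpha, beta, gamma, delta.
latin = ["ABCD", "BADC", "CDAB", "DCBA"]
greek = ["abgd", "dgba", "badg", "gdab"]

print("Latin square valid:", is_latin_square(latin))
print("Greek square valid:", is_latin_square(greek))
print("orthogonal:", are_orthogonal(latin, greek))
for lrow, grow in zip(latin, greek):
    print(" ".join(l + g for l, g in zip(lrow, grow)))
```

A square is never orthogonal to itself, since superimposing it on itself yields only the $p$ pairs (A, A), (B, B), and so on.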
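The usual model for a Graeco-Latin design has terms for treatments, rows, columns and Greek
letter blocks and assumes that all these terms are additive.
$$y_{ijkl} = \mu + \theta_i + \tau_j + \omega_k + \Psi_l + \varepsilon_{ijkl}, \quad i, j, k, l = 1, 2, \ldots, p$$
Where $y_{ijkl}$ is the observation in the $i$-th row and $l$-th column for Latin letter $j$ and Greek letter $k$,
$\mu$ is the overall mean, $\theta_i$ is the effect of the $i$-th row, $\tau_j$ is the effect of the $j$-th Latin letter
treatment, $\omega_k$ is the effect of the $k$-th Greek letter treatment, $\Psi_l$ is the effect of the $l$-th column
and $\varepsilon_{ijkl}$ is a NID$(0, \sigma^2)$ random error component.
As usual the total sum of squares is partitioned into five components. That is, Latin letter
treatments sum of squares, Greek letter treatments sum of squares, row sum of squares, column
sum of squares and error sum of squares.
$$SS_T = \sum_i\sum_j\sum_k\sum_l y_{ijkl}^2 - \frac{y_{....}^2}{N}, \quad SS_L = \frac{1}{p}\sum_{j=1}^{p} y_{.j..}^2 - \frac{y_{....}^2}{N}, \quad SS_G = \frac{1}{p}\sum_{k=1}^{p} y_{..k.}^2 - \frac{y_{....}^2}{N}$$
$$SS_{Rows} = \frac{1}{p}\sum_{i=1}^{p} y_{i...}^2 - \frac{y_{....}^2}{N}, \quad SS_{Columns} = \frac{1}{p}\sum_{l=1}^{p} y_{...l}^2 - \frac{y_{....}^2}{N}$$
$$SS_E = SS_T - SS_L - SS_G - SS_{Rows} - SS_{Columns}$$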
The null hypotheses of equal row, column, Latin letter and Greek letter treatment effects are
tested by dividing the corresponding mean square by the mean square error. The rejection region is
the upper tail of the $F_{p-1,\,(p-3)(p-1)}$ distribution.
NB: $N = p^2$
Test the significance of Latin letter treatments means also check significance of
Greek letter treatments , rows and columns using 5% level of significance and
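p = 5
$MS_L = \frac{SS_L}{p-1} = \frac{330}{4} = 82.50$, $F_0 = \frac{MS_L}{MS_E} = \frac{82.50}{8.25} = 10.00$
$MS_G = \frac{SS_G}{p-1} = \frac{62}{4} = 15.50$, $F_0 = \frac{MS_G}{MS_E} = \frac{15.50}{8.25} = 1.88$
$MS_{Rows} = \frac{SS_{Rows}}{p-1} = \frac{68}{4} = 17.00$, $F_0 = \frac{MS_{Rows}}{MS_E} = \frac{17.00}{8.25} = 2.06$
$MS_{Columns} = \frac{SS_{Columns}}{p-1} = \frac{150}{4} = 37.50$, $F_0 = \frac{MS_{Columns}}{MS_E} = \frac{37.50}{8.25} = 4.55$
$MS_E = \frac{SS_E}{(p-3)(p-1)} = \frac{66}{8} = 8.25$
Hypotheses (the same form is used for Greek letter treatments, rows and columns):
$H_0: \tau_1 = \tau_2 = \cdots = \tau_5 = 0$ Vs $H_1: \tau_j \neq 0$ for at least one $j$
$H_0: \omega_1 = \omega_2 = \cdots = \omega_5 = 0$ Vs $H_1: \omega_k \neq 0$ for at least one $k$
Critical value: for Latin letter treatment, for Greek letter treatment, for row and for column they have the
same critical value, $F_{0.05,\,4,\,8} = 3.84$.
Decision: 10.00 & 4.55 > 3.84, so reject $H_0$ for Latin letter treatment and column.
But 1.88 & 2.06 < 3.84, so don't reject $H_0$ for Greek letter treatment and
row.
Step 7: Conclusion: There is a significant difference among the Latin letter treatments
(formulations) and among the columns (operators). There is no evidence of a difference among the
rows (batches) or among the Greek letter treatments (assemblies).
Source of variation        Sum of squares   DF                  Mean square   F
Latin letter treatments    330              p - 1 = 4           82.50         10.00
Greek letter treatments    62               p - 1 = 4           15.50         1.88
Rows                       68               p - 1 = 4           17.00         2.06
Columns                    150              p - 1 = 4           37.50         4.55
Error                      66               (p-3)(p-1) = 8      8.25
Total                      676              p^2 - 1 = 24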
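A balanced incomplete block design (BIBD) is used in a situation where the number of treatments
exceeds the number of units per block, so that not every treatment can appear in every block.
The number of treatments is $a$
The number of blocks is $b$
Replication per treatment is $r$
Block size is $k$ (with $k < a$)
Total number of units is $N = ar = bk$
Each pair of treatments appears together in the same number of blocks, $\lambda = \frac{r(k-1)}{a-1}$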
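The counting relations among the design parameters can be checked on a concrete design. One simple construction (an assumption for this illustration, not something the notes prescribe) takes every subset of $k$ of the $a$ treatments as a block. The sketch counts blocks, replications and how often each pair of treatments shares a block, a number commonly written $\lambda = r(k-1)/(a-1)$; the function name is hypothetical.

```python
from itertools import combinations

def bibd_from_all_subsets(a, k):
    """Use every k-subset of the treatments 1..a as one block."""
    treatments = range(1, a + 1)
    blocks = list(combinations(treatments, k))
    b = len(blocks)
    r = sum(1 for blk in blocks if 1 in blk)      # replications of treatment 1
    pair_counts = {
        pair: sum(1 for blk in blocks if set(pair) <= set(blk))
        for pair in combinations(treatments, 2)
    }
    return blocks, b, r, pair_counts

a, k = 4, 3
blocks, b, r, pair_counts = bibd_from_all_subsets(a, k)
lam = r * (k - 1) // (a - 1)

print("blocks:", blocks)
print("b =", b, " r =", r, " N = a*r =", a * r, " b*k =", b * k)
print("lambda =", lam, " pair counts:", pair_counts)
```

For $a = 4$ and $k = 3$ this gives $b = 4$, $r = 3$, $N = 12$ and every pair of treatments occurring together in exactly 2 blocks, so the design is balanced.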
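The statistical model for the BIBD is
$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$$
Where $y_{ij}$ is the response of the $i$-th treatment in the $j$-th block, $\mu$ is an overall mean, $\tau_i$ is the
effect of the $i$-th treatment, $\beta_j$ is the effect of the $j$-th block and $\varepsilon_{ij}$ is the usual NID$(0, \sigma^2)$ random error
term.
Our usual methods for estimating treatment effects don't work for the BIBD. The total
variability in the data is expressed by the corrected total sum of squares.
$$SS_T = \sum_i\sum_j y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_T = SS_{Treatments(adjusted)} + SS_{Blocks} + SS_E$$
Where the sum of squares for treatments is adjusted to separate the treatment and block effects.
This adjustment is necessary because each treatment is represented in a different set of blocks.
Thus, differences between the unadjusted treatment totals $y_{1.}, y_{2.}, \ldots, y_{a.}$ are also affected by
differences between blocks.
The block sum of squares is
$$SS_{Blocks} = \frac{1}{k}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{N}$$
Where $y_{.j}$ is the total in the $j$-th block. $SS_{Blocks}$ has $b - 1$ degrees of freedom. The adjusted
treatment sum of squares is
$$SS_{Treatments(adjusted)} = \frac{k\sum_{i=1}^{a} Q_i^2}{\lambda a}, \quad Q_i = y_{i.} - \frac{1}{k}\sum_{j=1}^{b} n_{ij}\,y_{.j}$$
where $\lambda = r(k-1)/(a-1)$, and $n_{ij} = 1$ if treatment $i$ appears in block $j$ and $n_{ij} = 0$ otherwise.
$SS_{Treatments(adjusted)}$ has $a - 1$ degrees of freedom, and $SS_E$, obtained by subtraction, has
$N - a - b + 1$.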
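The appropriate statistic for testing the equality of the treatment effects is
$$F_0 = \frac{MS_{Treatments(adjusted)}}{MS_E}$$
Source of variation      Sum of squares              DF              Mean square                  F
Treatments (adjusted)    SS_Treatments(adjusted)     a - 1           MS_Treatments(adjusted)      MS_Treatments(adjusted)/MS_E
Blocks                   SS_Blocks                   b - 1           MS_Blocks
Error                    SS_E                        N - a - b + 1   MS_E
Total                    SS_T                        N - 1
Note that: In the analysis that we have described above, the total variability
has been partitioned into an adjusted sum of squares for treatments, an
unadjusted sum of squares for blocks and an error sum of squares. Sometimes
we would also like to assess the block effects. This requires the alternative partition
$SS_T = SS_{Treatments} + SS_{Blocks(adjusted)} + SS_E$, where $SS_{Treatments} = \frac{1}{r}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}$ is unadjusted and
$$SS_{Blocks(adjusted)} = \frac{r\sum_{j=1}^{b} (Q'_j)^2}{\lambda b}, \quad Q'_j = y_{.j} - \frac{1}{r}\sum_{i=1}^{a} n_{ij}\,y_{i.}$$
where $\lambda = r(k-1)/(a-1)$, and $n_{ij} = 1$ if treatment $i$ appears in block $j$ and $0$ otherwise.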
Source of variation      DF
Treatments               a - 1
Treatments (adjusted)    a - 1
Blocks                   b - 1
Blocks (adjusted)        b - 1
Error                    N - a - b + 1
Total                    N - 1
Example: Suppose that a chemical engineer thinks that the time of reaction for
selecting a batch of raw material, and observing the reaction time .Because
variations in the batches of raw material may affect the performance of the
incomplete block design for this experiment, along with the observations
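$SS_T = \sum_i\sum_j y_{ij}^2 - \frac{y_{..}^2}{N} = 81.00$
$SS_{Blocks(adjusted)} = \frac{r\sum_j (Q'_j)^2}{\lambda b} = 66.08$
$MS_{Treatments} = \frac{SS_{Treatments}}{a-1} = \frac{11.67}{3} = 3.89$
$MS_{Treatments(adjusted)} = \frac{SS_{Treatments(adjusted)}}{a-1} = \frac{22.75}{3} = 7.58$, $F_0 = \frac{MS_{Treatments(adjusted)}}{MS_E} = \frac{7.58}{0.65} = 11.66$
$MS_{Blocks} = \frac{SS_{Blocks}}{b-1} = \frac{55.00}{3} = 18.33$
$MS_{Blocks(adjusted)} = \frac{SS_{Blocks(adjusted)}}{b-1} = \frac{66.08}{3} = 22.03$, $F_0 = \frac{MS_{Blocks(adjusted)}}{MS_E} = \frac{22.03}{0.65} = 33.90$
$MS_E = \frac{SS_E}{N-a-b+1} = \frac{3.25}{5} = 0.65$
Notice: The sums of squares associated with the mean squares do not add to the
total sum of squares.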
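i, for treatment
$H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0$ Vs $H_1: \tau_i \neq 0$ for at least one $i$
ii, for block
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ Vs $H_1: \beta_j \neq 0$ for at least one $j$
$F_0 = \frac{MS_{Treatments(adjusted)}}{MS_E} = 11.66$ for treatment
$F_0 = \frac{MS_{Blocks(adjusted)}}{MS_E} = 33.90$ for block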
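Source of variation      Sum of squares   DF                  Mean square   F
Treatments               11.67            a - 1 = 3           3.89
Treatments (adjusted)    22.75            a - 1 = 3           7.58          11.66
Blocks                   55.00            b - 1 = 3           18.33
Blocks (adjusted)        66.08            b - 1 = 3           22.03         33.90
Error                    3.25             N - a - b + 1 = 5   0.65
Total                    81.00            N - 1 = 11
NB: DF = degrees of freedom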
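The adjusted and unadjusted sums of squares can be reproduced with a short script. This is a sketch under assumptions: the function name `bibd_anova` is hypothetical, and the observation table is not reproduced in these notes, so the reaction times below are assumed values, chosen because they give the totals quoted in the example (a total sum of squares of 81.00 and an adjusted treatment sum of squares of 22.75). Rows are treatments (catalysts), columns are blocks (batches), and `None` marks a treatment that does not appear in a block.

```python
def bibd_anova(y):
    """y[i][j]: treatment i in block j, None where the treatment is absent."""
    a, b = len(y), len(y[0])
    k = sum(1 for i in range(a) if y[i][0] is not None)     # block size
    r = sum(1 for v in y[0] if v is not None)               # replications
    n = a * r
    lam = r * (k - 1) / (a - 1)
    obs = [v for row in y for v in row if v is not None]
    cf = sum(obs) ** 2 / n                                  # correction factor
    trt_tot = [sum(v for v in row if v is not None) for row in y]
    blk_tot = [sum(y[i][j] for i in range(a) if y[i][j] is not None) for j in range(b)]
    # Adjusted totals: remove the block (or treatment) totals each one is mixed with.
    q_trt = [trt_tot[i] - sum(blk_tot[j] for j in range(b) if y[i][j] is not None) / k
             for i in range(a)]
    q_blk = [blk_tot[j] - sum(trt_tot[i] for i in range(a) if y[i][j] is not None) / r
             for j in range(b)]
    ss = {
        "total": sum(v * v for v in obs) - cf,
        "treatments": sum(t * t for t in trt_tot) / r - cf,
        "blocks": sum(t * t for t in blk_tot) / k - cf,
        "treatments_adj": k * sum(q * q for q in q_trt) / (lam * a),
        "blocks_adj": r * sum(q * q for q in q_blk) / (lam * b),
    }
    ss["error"] = ss["total"] - ss["treatments_adj"] - ss["blocks"]
    ms_err = ss["error"] / (n - a - b + 1)
    f_trt = (ss["treatments_adj"] / (a - 1)) / ms_err
    f_blk = (ss["blocks_adj"] / (b - 1)) / ms_err
    return ss, ms_err, f_trt, f_blk

# ASSUMED reaction-time data: rows = catalysts, columns = batches of raw material.
times = [
    [73, 74, None, 71],
    [None, 75, 67, 72],
    [73, 75, 68, None],
    [75, None, 72, 75],
]
ss, ms_err, f_trt, f_blk = bibd_anova(times)
print({name: round(value, 2) for name, value in ss.items()})
print("MS_E:", round(ms_err, 2), " F (treatments adj):", round(f_trt, 2),
      " F (blocks adj):", round(f_blk, 2))
```

With these assumed values the script returns 81.00, 11.67, 55.00, 22.75, 66.08 and 3.25 for the sums of squares. It also illustrates the two valid partitions: adjusted treatments plus unadjusted blocks plus error, and unadjusted treatments plus adjusted blocks plus error, both add up to the total.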