Chapter 5
5. Factorial Designs
Factorial experiments are experiments that investigate the effects of two or more factors, or input
parameters, on the output response of a process. Factorial experiment design, or simply factorial
design, is a systematic method for formulating the steps needed to implement a factorial
experiment successfully. Estimating the effects of various factors on the output of a process with a
minimal number of observations is crucial to being able to optimize the output of the process.
Many experiments involve the study of the effects of two or more factors. In general, factorial
designs are the most efficient for this type of experiment. In a factorial design, all possible
combinations of the levels of the factors are investigated in each complete trial or replication.
An experiment in which the treatments consist of all possible combinations of selected levels of
two or more factors is referred to as a factorial experiment. When factors are arranged in a
factorial design, they are said to be crossed.
The effect of a factor is defined to be the change in response produced by a change in the level
of the factor. This is frequently called a main effect because it refers to the primary factors of
interest in the experiment.
For example, we may have two factors A and B, each with two levels (a0 and a1 for factor A,
and b0 and b1 for factor B). The four treatment combinations are denoted by a0b0, a1b0, a0b1,
and a1b1. In addition, we define and describe the measurement of the simple effect and the main
effect of each of the two factors A and B.
The treatment combinations can be displayed in a two-way table:

                Factor B
Factor A        b0          b1
a0              a0b0        a0b1
a1              a1b0        a1b1
Compute the simple effect of factor A as the difference between its two levels at a given level
of factor B. That is:
The simple effect of A at b0 = a1b0 − a0b0
The simple effect of A at b1 = a1b1 − a0b1
In the same manner, compute the simple effect of factor B at each of the two levels of factor A
as:
The simple effect of B at a0 = a0b1 − a0b0
The simple effect of B at a1 = a1b1 − a1b0
Compute the main effect of factor A as the average of the simple effects of factor A over all
levels of factor B. That is:

Main effect of A = ½[(a1b0 − a0b0) + (a1b1 − a0b1)]
Example: The effect of factor A is the change in response due to a change in the level of A. For
instance, consider a two-factor experiment in which the two factors A and B have two levels
each and the experiment is run once. The resulting output is:
        B1      B2
A1      30      20
A2      40      30
The main effect of A is

A = (40 + 30)/2 − (30 + 20)/2 = 35 − 25 = 10
In this case there is no interaction since the effect of factor A is the same at all levels of B:
40 – 30 = 10 and 30 – 20 = 10
Definition of a factor effect: The change in the mean response when the factor is changed from
low to high.
For example, suppose instead that the responses are 20 and 40 at the low and high levels of A
when B is at its low level, and 30 and 52 when B is at its high level. Then

A = ȳ(A high) − ȳ(A low) = (40 + 52)/2 − (20 + 30)/2 = 21
B = ȳ(B high) − ȳ(B low) = (30 + 52)/2 − (20 + 40)/2 = 11
AB = (52 + 20)/2 − (30 + 40)/2 = 1
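The arithmetic above is simple enough to check by hand, but a short sketch makes the
definitions concrete. The following Python snippet is illustrative only (the variable names are
ours, not from the notes); it reproduces the simple effects, main effects, and interaction for the
2x2 example with responses 20, 40, 30, 52:

# Responses at the four treatment combinations
low_b  = {"a0": 20, "a1": 40}   # responses at the low level of B
high_b = {"a0": 30, "a1": 52}   # responses at the high level of B

# Simple effects of A at each level of B
simple_a_at_b0 = low_b["a1"] - low_b["a0"]     # 40 - 20 = 20
simple_a_at_b1 = high_b["a1"] - high_b["a0"]   # 52 - 30 = 22

# Main effects: average response at the high level minus the low level
A  = (low_b["a1"] + high_b["a1"]) / 2 - (low_b["a0"] + high_b["a0"]) / 2  # 21
B  = (high_b["a0"] + high_b["a1"]) / 2 - (low_b["a0"] + low_b["a1"]) / 2  # 11
AB = (simple_a_at_b1 - simple_a_at_b0) / 2                                # 1

print(A, B, AB)  # 21.0 11.0 1.0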
• Knowledge of the interaction is often more useful than knowledge of the main effects.
• A significant interaction will often mask the significance of main effects.
• When there is a significant interaction, the main effect of one factor must be examined with
the levels of the other factors fixed.
• When an interaction is large, the corresponding main effects have little practical meaning.
• An interaction is present when the effects of one independent variable on behavior change at
the different levels of the second independent variable.
• An interaction is present when the pattern of differences associated with an independent
variable changes at the different levels of the other independent variable.
• An interaction is present when the simple effects of one independent variable are not the
same at all levels of the second independent variable.
• An interaction is present when the main effect of an independent variable is not
representative of the simple effects of that variable.
• An interaction is present when the differences between cell means representing the effects of
Factor A at one level of Factor B do not equal the corresponding differences at another level
of Factor B.
• An interaction is present when the effects of one of the independent variables are
conditionally related to the levels of the other independent variable.
• An interaction is present when one of the independent variables does not have a constant
effect at all levels of the other independent variable.
• In the absence of a significant interaction, you would focus your attention on the main
effects. That is, because the simple effects are all telling you basically the same story,
there is little reason to examine them separately. The sketch after this list illustrates the
rule numerically.
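As a numerical illustration of these statements, the following sketch (our own; the helper name
is made up) compares the simple effects of A across the levels of B for the two 2x2 tables used
earlier. The effects differ exactly when an interaction is present:

def simple_effects_of_A(table):
    # table[i][j] = response at level Ai, Bj for a 2x2 single-replicate design
    return [table[1][j] - table[0][j] for j in (0, 1)]

no_interaction = [[30, 20], [40, 30]]   # effect of A is 10 at both levels of B
interaction    = [[20, 30], [40, 52]]   # effect of A is 20 at B1 but 22 at B2

for tbl in (no_interaction, interaction):
    eff = simple_effects_of_A(tbl)
    print(eff, "interaction present" if eff[0] != eff[1] else "no interaction")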
5.3 The two-factor factorial design (with and without interaction)
We shall now study the statistical properties of the two-factor design. Let the factors be A and B,
with a and b levels respectively. Suppose the experiment is run n times at each combination of the
levels of A and B. The following table displays the data arrangement of such an experiment.
5.3.1 Two-factor design with interaction
This design is a specific example of the general case of a two-factor factorial. To pass to the
general case, let y_ijk be the observed response when factor A is at the ith level (i = 1, 2, …, a)
and factor B is at the jth level (j = 1, 2, …, b) for the kth replicate (k = 1, 2, …, n). In general, a
two-factor factorial experiment will appear as in Table 5.2. The order in which the abn
observations are taken is selected at random, so this design is a completely randomized
design.
The observations in a factorial experiment can be described by a model. There are several ways
to write the model for a factorial experiment. The effects model is

y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk,   i = 1, …, a; j = 1, …, b; k = 1, …, n
where μ is the overall mean effect, τ_i is the effect of the ith level of the row factor A, β_j is the
effect of the jth level of the column factor B, (τβ)_ij is the effect of the interaction between τ_i and
β_j, and ε_ijk is the random error component. Both factors are assumed to be fixed, and the
treatment effects are defined as deviations from the overall mean, so that

Σ_{i=1..a} τ_i = 0   and   Σ_{j=1..b} β_j = 0.

Similarly,
the interaction effects are fixed and are defined such that Σ_{i=1..a} Σ_{j=1..b} (τβ)_ij = 0. Because
both factors are of equal interest, we wish to test the equality of the row treatment effects,
H0: τ_1 = τ_2 = … = τ_a = 0, and the equality of the column treatment effects,
H0: β_1 = β_2 = … = β_b = 0.
We are also interested in determining whether row and column treatments interact. Thus, we
also wish to test

H0: (τβ)_ij = 0 for all i, j   versus   H1: at least one (τβ)_ij ≠ 0.

We now discuss how these hypotheses are tested using a two-factor analysis of variance.
Let y_i·· denote the total of all observations under the ith level of factor A, y_·j· the total of
all observations under the jth level of factor B, y_ij· the total of all observations in the ijth
cell, and y_··· the grand total of all the observations. Define ȳ_i··, ȳ_·j·, ȳ_ij·, and ȳ_··· as the
corresponding row, column, cell, and grand averages.
The total corrected sum of squares may be written as

Σ_i Σ_j Σ_k (y_ijk − ȳ_···)² = bn Σ_i (ȳ_i·· − ȳ_···)² + an Σ_j (ȳ_·j· − ȳ_···)²
        + n Σ_i Σ_j (ȳ_ij· − ȳ_i·· − ȳ_·j· + ȳ_···)² + Σ_i Σ_j Σ_k (y_ijk − ȳ_ij·)²

that is, SS_T = SS_A + SS_B + SS_AB + SS_E.
Each sum of squares divided by its degrees of freedom is a mean square. The expected values of
the mean squares are

E(MS_A) = σ² + bn Σ_i τ_i² / (a − 1)
E(MS_B) = σ² + an Σ_j β_j² / (b − 1)
E(MS_AB) = σ² + n Σ_i Σ_j (τβ)_ij² / [(a − 1)(b − 1)]
E(MS_E) = σ²
The test procedure is usually summarized in an analysis of variance table, as shown in Table 5.3
on the next page.
However, manual computing formulas for the sums of squares may be obtained easily. The total
sum of squares is computed as usual by

SS_T = Σ_i Σ_j Σ_k y_ijk² − y_···²/(abn)

and the main-effect sums of squares by

SS_A = (1/bn) Σ_i y_i··² − y_···²/(abn)
SS_B = (1/an) Σ_j y_·j·² − y_···²/(abn)
It is convenient to obtain SS_AB in two stages. First we compute the sum of squares between
the ab cell totals, which is called the sum of squares due to “subtotals”:

SS_Subtotals = (1/n) Σ_i Σ_j y_ij·² − y_···²/(abn)

This sum of squares also contains SS_A and SS_B. Therefore, the second step is to compute SS_AB
as

SS_AB = SS_Subtotals − SS_A − SS_B

and the error sum of squares is then found by subtraction: SS_E = SS_T − SS_Subtotals.
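These manual formulas translate directly into code. The sketch below is ours, not from the
notes; it assumes the observations are stored in an a x b x n NumPy array y (for the battery life
example that follows, y would have shape (3, 3, 4)).

import numpy as np

def two_factor_anova_ss(y):
    # y[i, j, k] = kth replicate at level i of A and level j of B
    a, b, n = y.shape
    correction = y.sum() ** 2 / (a * b * n)            # y_...^2 / (abn)
    ss_t = (y ** 2).sum() - correction                 # total sum of squares
    ss_a = (y.sum(axis=(1, 2)) ** 2).sum() / (b * n) - correction
    ss_b = (y.sum(axis=(0, 2)) ** 2).sum() / (a * n) - correction
    ss_subtotals = (y.sum(axis=2) ** 2).sum() / n - correction
    ss_ab = ss_subtotals - ss_a - ss_b                 # interaction
    ss_e = ss_t - ss_subtotals                         # error by subtraction
    return {"A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e, "T": ss_t}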
An engineer is studying the effective life of a certain type of battery. Two factors, plate material
and temperature, are involved. There are three types of plate material (1, 2, 3) and three
temperature levels (15, 70, 125). Four batteries are tested at each combination of plate material
and temperature, and all 36 tests are run in random order. The experiment and the resulting
observed battery-life data are given below.
Two questions:
What effects do material type and temperature have on the life of the battery?
Is there a choice of material that would give uniformly long life regardless of temperature?
Before the conclusions from the analysis of variance are adopted, the adequacy of the
underlying model should be checked. As before, the primary diagnostic tool is residual analysis.
The residuals for the two-factor factorial model are

e_ijk = y_ijk − ŷ_ijk

and because the fitted value is ŷ_ijk = ȳ_ij· (the average of the observations in the ijth cell), the
equation becomes

e_ijk = y_ijk − ȳ_ij·
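Continuing the array convention of the earlier sketch (again ours, not from the notes), the
residuals come out in one line, since every observation's fitted value is its cell average:

import numpy as np

def residuals(y):
    # Broadcasting subtracts each cell mean ybar_ij. from all n replicates
    return y - y.mean(axis=2, keepdims=True)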
[Figure: normal probability plot of the residuals, and plot of residuals versus predicted values,
for the battery life data.]
[Figure: DESIGN-EXPERT plots of the residuals versus run order, material type, and
temperature for the battery life data.]
Example 2: Bottling experiment
A soft drink bottler is interested in obtaining more uniform fill heights in the bottles produced by
his manufacturing process. An experiment is conducted to study three factors of the process:
The percent carbonation (A): 10, 12, and 14 percent
The operating pressure (B): 25 and 30 psi
The line speed (C): 200 and 250 bottles per minute
The response is the deviation from the target fill height. Each combination of the three factors
has two replicates, and all 24 runs are performed in random order. The experiment and data are
shown below.
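As an illustration of how such a run sheet could be generated and randomized (a sketch; the
factor levels come from the description above, everything else is assumed):

import itertools, random

carbonation = [10, 12, 14]   # percent (A)
pressure = [25, 30]          # psi (B)
speed = [200, 250]           # bottles per minute (C)

# Two replicates of each of the 12 factor-level combinations: 24 runs
runs = [combo for combo in itertools.product(carbonation, pressure, speed)
        for _ in range(2)]
random.shuffle(runs)         # perform the trials in random order
for i, (a, b, c) in enumerate(runs, 1):
    print(i, a, b, c)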
Example
The response values can be arranged in a three-dimensional contingency table. The effects are
determined by the linear contrasts.
Note that once a few rows of this table have been determined, the rest can be obtained by simple
multiplication of the symbols. For example, consider the column corresponding to a: A has a +
sign and B has a − sign, so AB has a − sign (the sign of A times the sign of B). Since AB has a −
sign and C has a − sign, ABC has the sign of AB times the sign of C, which is a + sign, and so on.
The first row is a basic element; with it, the overall total 1'Y* can be computed, where 1 is a
column vector with all elements unity. If any other row is multiplied by the first row, it stays
unchanged (we therefore call the first row the identity and denote it by I). Every other row has
the same number of + and − signs. If each + is replaced by 1 and each − by −1, we obtain the
vectors of orthogonal contrasts with norm 8 = 2^3.
If each row is multiplied by itself, we obtain I (the first row). The product of any two rows leads
to a different row in the table; for example, the product of the AB and BC rows gives the AC
row, since the repeated symbol B squares to the identity. A short sketch verifying these
properties follows.
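The sketch below (ours) builds the plus/minus table for the 2^3 design in the standard order
(1), a, b, ab, c, ac, bc, abc and checks the properties just described:

import numpy as np

# Factor settings (-1 = low, +1 = high) for the 8 runs in standard order
A = np.array([-1,  1, -1,  1, -1,  1, -1,  1])
B = np.array([-1, -1,  1,  1, -1, -1,  1,  1])
C = np.array([-1, -1, -1, -1,  1,  1,  1,  1])
I = np.ones(8, dtype=int)

table = {"I": I, "A": A, "B": B, "AB": A * B,
         "C": C, "AC": A * C, "BC": B * C, "ABC": A * B * C}

# Every non-identity row has equal numbers of + and - signs ...
assert all(row.sum() == 0 for name, row in table.items() if name != "I")
# ... a row times itself gives I, and the product of two rows is another row
assert np.array_equal(table["A"] * table["A"], table["I"])
assert np.array_equal(table["AB"] * table["BC"], table["AC"])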
The structure of the table helps in estimating the average effects. For example, the simple
effects of A at the four combinations of B and C are (i) a − (1), (ii) ab − b, (iii) ac − c, and
(iv) abc − bc. Hence, over all combinations of B and C, the average effect of A is the average of
the simple effects (i)–(iv):

A = (1/4n)[a + ab + ac + abc − (1) − b − c − bc]

Similarly, the other main and interaction effects are as follows.
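In the same treatment-total notation (these are the standard forms, with n replicates per
combination):

B = (1/4n)[b + ab + bc + abc − (1) − a − c − ac]
C = (1/4n)[c + ac + bc + abc − (1) − a − b − ab]
AB = (1/4n)[(1) + ab + c + abc − a − b − ac − bc]
AC = (1/4n)[(1) + b + ac + abc − a − ab − c − bc]
BC = (1/4n)[(1) + a + bc + abc − b − ab − c − ac]
ABC = (1/4n)[a + b + c + abc − (1) − ab − ac − bc]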
The various sums of squares in the 2^3 factorial experiment are obtained as

SS_Effect = (Contrast_Effect)² / (8n)

each of which, divided by σ², follows a chi-square distribution with one degree of freedom under
normality of Y*. The corresponding mean sum of squares is obtained as

MS_Effect = SS_Effect / 1 = SS_Effect

and the ratio F0 = MS_Effect / MS_E follows an F-distribution with 1 and the error degrees of
freedom under the respective null hypothesis. The decision rule is to reject the corresponding
null hypothesis at the α level of significance whenever F0 > F(α; 1, error df).
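Putting the pieces together, a sketch of the contrast-based tests (ours; it assumes SciPy is
available and reuses the sign table from the earlier snippet):

import numpy as np
from scipy import stats

def effect_tests(totals, sign_table, n, ms_error, df_error):
    # totals: length-8 vector of treatment totals in standard order
    # sign_table: dict of +/-1 contrast rows, as built earlier
    results = {}
    for name, signs in sign_table.items():
        if name == "I":
            continue
        contrast = float(np.dot(signs, totals))
        ss = contrast ** 2 / (8 * n)        # SS_Effect, 1 degree of freedom
        f0 = ss / ms_error                  # MS_Effect = SS_Effect / 1
        p = stats.f.sf(f0, 1, df_error)     # reject H0 when p < alpha
        results[name] = (ss, f0, p)
    return results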
Sometimes we deliberately vary the experimental conditions to ensure that the treatments are
equally effective across the many situations that are likely to be encountered in practice. The
design technique used in these situations is blocking.
We often need to eliminate the influence of extraneous factors when running an experiment. We
do this by "blocking". Blocking was introduced earlier, when randomized block designs
were discussed. There we were concerned with one factor in the presence of one or more
nuisance factors. In this section we look at a general approach that enables us to divide two-level
factorial experiments into blocks.
For example, assume we anticipate predictable shifts will occur while an experiment is being
run. This might happen when one has to change to a new batch of raw materials halfway
through the experiment. The effect of the change in raw materials is well known, and we want
to eliminate its influence on the subsequent data analysis.
5.6. Unbalanced data in a factorial design
When it is not the case that all factor-level combinations have the same amount of replication,
the data are said to be unbalanced. The analysis of unbalanced data is more complicated, in part because
there are no simple formulae for the quantities of interest. Thus we will need to rely on
statistical software for all of our computation, and we will need to know just exactly what the
software is computing, because there are several variations on the basic computations.
When the data are balanced, a contrast for one main effect or interaction is orthogonal to a
contrast for any other main effect or interaction. One consequence of this orthogonality is that
we can estimate effects and compute sums of squares one term at a time, and the results for that
term do not depend on what other terms are in the model.
When the data are unbalanced, the results we get for one term depend on what other terms are in
the model, so we must to some extent do all the computations simultaneously.
When the sum of squares for a term is computed as the difference in error sums of squares
between a model containing the term and a reduced model without it, the term of interest is said
to have been "adjusted for" the terms in the reduced model. For balanced data there are simple
formulae for these sums of squares. When the data are unbalanced, we still compute the sum of
squares for a term as a difference in error sums of squares for two models, but there are no
simple formulae to accomplish that task.
The orthogonality property of main effects and interactions present in balanced data does not
carry over to the unbalanced case. This means that the usual analysis of variance techniques do
not apply. Consequently, the analysis of unbalanced factorials is much more difficult than that
for balanced designs.
In this section we give a brief overview of methods for dealing with unbalanced factorials,
concentrating on the case of the two-factor fixed effects model. Suppose that the number of
observations in the ijth cell is n_ij. Furthermore, let n_i· = Σ_{j=1..b} n_ij be the number of
observations in the ith row (the ith level of factor A), n_·j = Σ_{i=1..a} n_ij be the number of
observations in the jth column (the jth level of factor B), and n_·· = Σ_{i=1..a} Σ_{j=1..b} n_ij be
the total number of observations.
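As a concrete illustration of relying on software, the sketch below (ours; it assumes pandas and
statsmodels, which the notes do not name) fits the two-factor model to a small made-up
unbalanced data set and requests Type II sums of squares, in which each term is adjusted for the
other terms in the model:

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical unbalanced data: unequal cell counts n_ij
df = pd.DataFrame({
    "A": ["a1"] * 5 + ["a2"] * 3,
    "B": ["b1", "b1", "b2", "b2", "b2", "b1", "b2", "b2"],
    "y": [12.1, 11.8, 14.0, 13.6, 13.9, 10.2, 12.8, 12.5],
})

model = smf.ols("y ~ C(A) * C(B)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # each SS adjusted for the other terms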
5.7. The 2^k factorial design
5.7.1. Introduction
Factorial designs are widely used in experiments involving several factors where it is necessary
to study the joint effect of the factors on a response.
5.7.2 The 2^2 design
In a two-level factorial design, we may define the average effect of a factor as the change in
response produced by a change in the level of that factor, averaged over the levels of the other
factor. The symbols (1), a, b, and ab now represent the total of all n replicates taken at the
corresponding treatment combination. The effect of A at the low level of B is [a − (1)]/n, and the
effect of A at the high level of B is [ab − b]/n. Averaging these two quantities yields the main
effect of A:

A = (1/2n)[a + ab − b − (1)]

In the same way, the main effect of B is

B = (1/2n)[b + ab − a − (1)]

and the interaction effect AB is the average difference between the effect of A at the high level
of B and the effect of A at the low level of B:

AB = (1/2n)[ab + (1) − a − b]
Alternatively, we may define AB as the average difference between the effect of B at the high
level of A and the effect of B at the low level of A; this leads to the same expression.
The sum of squares for any contrast equals the square of the contrast divided by the number of
observations in each total in the contrast times the sum of the squares of the contrast
coefficients. Consequently, we have

SS_A = [a + ab − b − (1)]² / (4n)
SS_B = [b + ab − a − (1)]² / (4n)
SS_AB = [ab + (1) − a − b]² / (4n)
as the sums of squares for A, B, and AB.
Example:
Using the experiment in Figure 6-1, we may estimate the average effects from the equations
above.
Using the experiment in Figure 6-1, we may also find the sums of squares using the above
equations. The total sum of squares is found in the usual way, that is,

SS_T = Σ_i Σ_j Σ_k y_ijk² − y_···²/(4n)
5.8. The 2^3 design
From Table 6-6 we note that the main effects are highly significant (all have very small
p-values). The AB interaction is significant at about the 10 percent level; thus there is some mild
interaction between carbonation and pressure.
Example 7-1: Consider the chemical process experiment first described in previous sections.
Suppose that only four experimental trials can be made from a single batch of raw material.
Therefore, three batches of raw material will be required to run all three replicates of this
design. Table 7-1 (on the previous page) shows the design, where each batch of raw material
corresponds to a block.
Confounding in the 2^k Factorial Design
There are many problems for which it is impossible to perform a complete replicate of a
factorial design in one block. Confounding is a design technique for arranging a complete
factorial experiment in blocks, where the block size is smaller than the number of treatment
combinations in one replicate. The technique causes information about certain treatment effects
(usually high-order interactions) to be indistinguishable from, or confounded with, blocks. In
this chapter we concentrate on confounding systems for the 2^k factorial design. Note that even
though the designs presented are incomplete block designs, because each block does not contain
all the treatments or treatment combinations, the special structure of the 2^k factorial system
allows a simplified method of analysis.
We consider the construction and analysis of the 2^k factorial design in 2^p incomplete blocks,
where p < k. Consequently, these designs can be run in two blocks, four blocks, eight blocks,
and so on.
Suppose we estimate the main effects of A and B just as if no blocking had occurred:

A = ½[ab + a − b − (1)]
B = ½[ab + b − a − (1)]
Note that both A and B are unaffected by blocking because in each estimate there is one plus
and one minus treatment combination from each block. That is, any difference between block 1
and block 2 will cancel out.
Because the two treatment combinations with the plus sign [ab and (1)] are in block 1 and the
two with the minus sign (a and b) are in block 2, the block effect and the AB interaction are
identical; that is, AB is confounded with blocks. The reason for this is apparent from the table
of plus and minus signs for the 2^2 design.
From this table, we see that all treatment combinations that have a plus sign on AB are assigned
to block 1, whereas all treatment combinations that have a minus sign on AB are assigned to
block 2. This approach can be used to confound any effect (A, B, or AB) with blocks. For
example, if (1) and b had been assigned to block 1 and a and ab to block 2, the main effect A
would have been confounded with blocks. The usual practice is to confound the highest-order
interaction with blocks.
For example, consider a 2^3 design run in two blocks. Suppose we wish to confound the three-factor
interaction ABC with blocks. From the table of plus and minus signs shown in Table 7-4, we
assign the treatment combinations that are minus on ABC to block 1 and those that are plus on
ABC to block 2. The resulting design is shown in Figure 7-2. Once again, we emphasize that the
treatment combinations within a block are run in random order.
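The assignment rule is mechanical, as the following sketch (ours) shows: compute the ABC sign
of each treatment combination and route it to the corresponding block.

import itertools

def label(a, b, c):
    # Name a treatment combination from its factor settings, e.g. (1), a, ab
    name = ("a" if a == 1 else "") + ("b" if b == 1 else "") + ("c" if c == 1 else "")
    return name or "(1)"

block1, block2 = [], []
for c, b, a in itertools.product([-1, 1], repeat=3):  # a varies fastest
    (block2 if a * b * c == 1 else block1).append(label(a, b, c))

print("Block 1 (minus on ABC):", block1)  # ['(1)', 'ab', 'ac', 'bc']
print("Block 2 (plus on ABC): ", block2)  # ['a', 'b', 'c', 'abc']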
Other Methods for Constructing the Blocks