Lecture 1
Design of Experiments
DR IQBAL SHAMSUDHEEN
JULY – SEPTEMBER 2023
Some Housekeeping Stuff
Course Learning Outcomes
u CLO1: Describe the concepts of experimental design.
u CLO2: Determine the key factor in the process.
u CLO3: Apply the concepts in creating a designed experiment, including randomisation, blocking, and replication.
u CLO4: Design and complete their own scientific experiment, interpret the statistical results, and report them in non-technical language.
Assessment
Quizzes: 60%
Final Exam: 40%
Total: 100%
References
Main Reference
u Montgomery, D. C. (2020). Design and Analysis of Experiments, 10th Edition. John Wiley & Sons.
u If you would like an electronic copy of this book, please request it in the Telegram group.
Introduction to Design of Experiments (DOE)
• THE SCIENTIFIC METHOD
What is the Scientific Method?
u Do you remember learning about this back in high school or junior high even? What
were those steps again?
u Decide what phenomenon you wish to investigate. Specify how you can manipulate
the factor and hold all other conditions fixed, to ensure that these extraneous
conditions aren't influencing the response you plan to measure.
u Then measure your chosen response variable at several (at least two) settings of the
factor under study. If changing the factor causes the phenomenon to change, then
you conclude that there is indeed a cause-and-effect relationship at work.
u How many factors are involved when you do an experiment? Some say two -
perhaps this is a comparative experiment? Perhaps there is a treatment group and a
control group? If you have a treatment group and a control group then, in this case,
you probably only have one factor with two levels.
What is the Scientific Method?
u How many of you have baked a cake? What factors are involved in ensuring a successful cake? Factors might include preheating the oven, baking time, ingredients, amount of moisture, baking temperature, and so on. What else? You probably
follow a recipe so there are many additional factors that control the ingredients - i.e.,
a mixture. In other words, someone did the experiment in advance! What parts of the
recipe did they vary to make the recipe a success? Probably many factors,
temperature and moisture, various ratios of ingredients, and presence or absence of
many additives. Now, should one keep all the factors involved in the experiment at a
constant level and just vary one to see what would happen? This is a strategy that
works but is not very efficient. This is one of the concepts that we will address in this
course.
A Quick History of DOE
“All experiments are designed experiments, it is just that some are poorly designed and
some are well-designed.”
u If we had infinite time and resource budgets there probably wouldn't be a big fuss
made over designing experiments. In production and quality control we want to
control the error and learn as much as we can about the process or the underlying
theory with the resources at hand. From an engineering perspective we're trying to
use experimentation for the following purposes:
u reduce time to design/develop new products & processes
u improve performance of existing processes
u improve reliability and performance of products
u achieve product & process robustness
u evaluate materials and design alternatives, and set component & system tolerances, etc.
A Quick History of DOE
u We always want to fine-tune or improve the process. In today's global world this drive
for competitiveness affects all of us both as consumers and producers.
u Every experiment design has inputs. Back to the cake baking example: we have our
ingredients such as flour, sugar, milk, eggs, etc. Regardless of the quality of these
ingredients we still want our cake to come out successfully. In every experiment there
are inputs and in addition, there are factors (such as time of baking, temperature,
geometry of the cake pan, etc.), some of which you can control and others that you
can't control. The experimenter must think about factors that affect the outcome. We
also talk about the output and the yield or the response to your experiment. For the cake, the output might be measured as texture, flavour, height, or size.
A Quick History of DOE
u The first industrial era, 1951 – late 1970s
u Box & Wilson, response surfaces
u Applications in the chemical & process industries
u The modern era, beginning circa 1990, when economic competitiveness and globalization are driving all sectors of the economy to be more competitive.
u A lot of what we are going to learn in this course goes back to what Sir Ronald Fisher
developed in the UK in the first half of the 20th century. He really laid the foundation
for statistics and for design of experiments. He and his colleague Frank Yates
developed many of the concepts and procedures that we use today. Basic
concepts such as orthogonal designs and Latin squares began there in the '20s
through the '40s. World War II also had an impact on statistics, inspiring sequential analysis, which arose as a method to improve the accuracy of long-range artillery guns.
u Immediately following World War II, the first industrial era marked another resurgence in the use of DOE. It was at this time that Box and Wilson (1951) wrote the key paper on response surface designs, thinking of the output as a response function and trying to
find the optimum conditions for this function. George Box died early in 2013. And, an
interesting fact here - he married Fisher's daughter! He worked in the chemical
industry in England in his early career and then came to America and worked at the
University of Wisconsin for most of his career.
Introduction to Design of Experiments (DOE)
• EXPERIMENTS
• THE BASIC PRINCIPLES OF DOE
• STEPS FOR PLANNING, CONDUCTING AND ANALYSING AN EXPERIMENT
Why Experiment?
u The drawback of observational studies is that the grouping into “treatments” is not
under the control of the experimenter and its mechanism is usually unknown. Thus
observed differences in responses between treatment groups could very well be due
to these other hidden mechanisms, rather than the treatments themselves.
u It is important to say that while experiments have some advantages, observational
studies are also useful and can produce important results. For example, studies of
smoking and human health are observational, but the link that they have established
is one of the most important public health issues today. Similarly, observational studies
established an association between heart valve disease and the diet drug fen-phen
that led to the withdrawal of the drugs fenfluramine and dexfenfluramine from the
market (Connolly et al. 1997 and US FDA 1997).
Components of an Experiment
u Not all experimental designs are created equal. A good experimental design must
u Avoid systematic error
u Be precise
u Allow estimation of error
u Have broad validity
Avoid Systematic Error
u Even without systematic error, there will be random error in the responses, and this will
lead to random error in the treatment comparisons. Experiments are precise when this
random error in treatment comparisons is small. Precision depends on the size of the
random errors in the responses, the number of units used, and the experimental
design used.
Allow Estimation of Error
Randomisation
u This is an essential component of any experiment that is going to have validity. If you
are doing a comparative experiment where you have two treatments, a treatment
and a control, for instance, you need to include in your experimental process the
assignment of those treatments by some random process. An experiment includes
experimental units. You need to have a deliberate process to eliminate potential
biases from the conclusions, and random assignment is a critical step.
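u As a small illustration (not from the textbook; the unit labels and seed are made up), the Python sketch below randomly assigns a treatment and a control to ten experimental units.

```python
import random

# Hypothetical experimental units (e.g., plots, subjects, or batches)
units = [f"unit_{i}" for i in range(1, 11)]

# Five units receive the treatment and five the control
assignments = ["treatment"] * 5 + ["control"] * 5

random.seed(42)               # fixed seed so the allocation can be reproduced
random.shuffle(assignments)   # random assignment guards against systematic bias

for unit, assignment in zip(units, assignments):
    print(unit, "->", assignment)
```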
The Basic Principles of DOE
Replication
u Replication is in some sense the heart of all of statistics. To make this point... Remember what the standard error of the mean is? It is the square root of the estimate of the variance of the sample mean, i.e., $\sqrt{s^2/n}$. The width of the confidence interval is determined by this statistic. Our estimates of the mean become less variable as the sample size increases.
u Replication is the basic issue behind every method we will use in order to get a
handle on how precise our estimates are at the end. We always want to estimate or
control the uncertainty in our results. We achieve this estimate through replication.
Another way we can achieve short confidence intervals is by reducing the error
variance itself. However, when that isn't possible, we can reduce the error in our
estimate of the mean by increasing n.
u Another way to reduce the length of the confidence interval is to reduce the error variance, which brings us to blocking.
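u To make the square-root-of-n effect concrete, this illustrative sketch (the standard deviation value is assumed) computes the standard error of the mean and an approximate 95% confidence-interval half-width for several sample sizes.

```python
import math

s = 2.0  # assumed sample standard deviation (illustrative value only)

# Standard error of the mean is s / sqrt(n); an approximate 95% CI
# half-width is 1.96 times that, so quadrupling n halves the width.
for n in (4, 16, 64, 256):
    se = s / math.sqrt(n)
    print(f"n = {n:3d}  standard error = {se:.3f}  approx. 95% half-width = {1.96 * se:.3f}")
```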
The Basic Principles of DOE
Blocking
u Blocking is a technique to include other factors in our experiment which contribute to
undesirable variation. Much of the focus in this class will be to creatively use various
blocking techniques to control sources of variation that will reduce error variance.
u For example, in human studies, the gender of the subjects is often an important
factor. Age is another factor affecting the response. Age and gender are often
considered nuisance factors which contribute to variability and make it difficult to
assess systematic effects of a treatment. By using these as blocking factors, you can
avoid biases that might occur due to differences in the allocation of subjects to the treatments, and account for some of the noise in the experiment.
u We want the unknown error variance at the end of the experiment to be as small as
possible. Our goal is usually to find out something about a treatment factor (or a
factor of primary interest), but in addition to this, we want to include any blocking
factors that will explain variation.
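u A minimal sketch of the idea (the treatments and blocks are assumed, not an example from the text): in a randomised complete block design, every treatment appears in every block, and the run order is randomised separately within each block.

```python
import random

random.seed(1)
treatments = ["A", "B", "C"]      # hypothetical treatment factor levels
blocks = ["male", "female"]       # gender as a blocking (nuisance) factor

# Randomised complete block design: each block contains every treatment,
# with the order randomised independently within each block.
for block in blocks:
    order = treatments[:]
    random.shuffle(order)
    print(block, order)
```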
The Basic Principles of DOE
Multi-factor Designs
u We will spend a big part of this course talking about multi-factor experimental
designs: 2^k designs, 3^k designs, response surface designs, etc.
u The approach behind all of these multi-factor designs runs contrary to the classical scientific method, where everything is held constant except one factor, which is varied. The one-factor-at-a-time method is a very inefficient way of making scientific advances.
u It is much better to design an experiment that simultaneously includes combinations
of multiple factors that may affect the outcome. Then you learn not only about the
primary factors of interest but also about these other factors. These may be blocking
factors which deal with nuisance parameters or they may just help you understand
the interactions or the relationships between the factors that influence the response.
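u As a preview of the 2^k designs discussed later, the sketch below (the factor names are placeholders) enumerates the runs of a full two-level factorial in coded -1/+1 units.

```python
from itertools import product

factors = ["time", "temperature", "pressure"]   # hypothetical factors, k = 3

# Full 2^k factorial: every combination of the low (-1) and high (+1) levels.
runs = list(product((-1, +1), repeat=len(factors)))

for i, run in enumerate(runs, start=1):
    settings = ", ".join(f"{name}={level:+d}" for name, level in zip(factors, run))
    print(f"run {i}: {settings}")

print("total runs:", len(runs))   # 2**3 = 8
```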
The Basic Principles of DOE
Confounding
u Confounding is something that is usually considered bad! Here is an example. Let's
say we are doing a medical study with drugs A and B. We put 10 subjects on drug A
and 10 on drug B. If we categorize our subjects by gender, how should we allocate
our drugs to our subjects? Let's make it easy and say that there are 10 male and 10
female subjects. A balanced way of doing this study would be to put five males on
drug A and five males on drug B, five females on drug A and five females on drug B.
This is a perfectly balanced experiment such that if there is a difference between
male and female at least it will equally influence the results from drug A and the
results from drug B.
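u The balanced allocation described above can be generated mechanically; this illustrative sketch (subject IDs and seed are made up) randomises drug A and drug B separately within each gender, so that gender is not confounded with the drug effect.

```python
import random

random.seed(7)
subjects = {
    "male": [f"M{i}" for i in range(1, 11)],
    "female": [f"F{i}" for i in range(1, 11)],
}

allocation = {}
for gender, ids in subjects.items():
    drugs = ["A"] * 5 + ["B"] * 5    # five subjects per drug within each gender
    random.shuffle(drugs)            # randomise within the gender stratum
    for subject, drug in zip(ids, drugs):
        allocation[subject] = drug

print(allocation)
```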
Recognition - Optimisation
u After the system has been characterised and we are reasonably certain that the
important factors have been identified, the next objective is usually optimisation, that
is, find the settings or levels of the important factors that result in desirable values of
the response.
u For example, if a screening experiment on a chemical process results in the
identification of time and temperature as the two most important factors, the
optimisation experiment may have as its objective to find the levels of time and temperature that maximise yield, or perhaps to maximise yield while keeping some product property that is critical to the customer within specifications.
u An optimisation experiment is usually a follow-up to a screening experiment. It would
be very unusual for a screening experiment to produce the optimal settings of the
important factors.
Recognition - Robustness
u These experiments often address questions such as under what conditions do the
response variables of interest seriously degrade?
u Or what conditions would lead to unacceptable variability in the response variables?
u A variation of this is determining how we can set the factors in the system that we can control to minimise the variability transmitted into the response from factors that we cannot control very well.
Selection of the response variable
u In selecting the response variable, the experimenter should be certain that this
variable really provides useful information about the process under study
u Most often, the average or standard deviation (or both) of the measured
characteristic will be the response variable. Multiple responses are not unusual. The
experimenters must decide how each response will be measured, and address issues
such as how will any measurement system be calibrated and how this calibration will
be maintained during the experiment
u The gauge or measurement system capability (or measurement error) is also an
important factor. If gauge capability is inadequate, only relatively large factor effects
will be detected by the experiment or perhaps additional replication will be required
Selection of the response variable
u In some situations where gauge capability is poor, the experimenter may decide to
measure each experimental unit several times and use the average of the repeated
measurements as the observed response
u It is usually critically important to identify issues related to defining the responses of
interest and how they are to be measured before conducting the experiment
u Sometimes designed experiments are employed to study and improve the
performance of measurement systems
Choice of factors, levels, and range
Factors
u When considering the factors that may influence the performance of a process or
system, the experimenter usually discovers that these factors can be classified as
either potential design factors or nuisance factors
u The potential design factors are those factors that the experimenter may wish to vary
in the experiment. Often we find that there are a lot of potential design factors, and
some further classification of them is helpful.
u Some useful classifications are design factors, held-constant factors, and allowed-to-
vary factors. The design factors are the factors actually selected for study in the
experiment. Held-constant factors are variables that may exert some effect on the
response, but for purposes of the present experiment these factors are not of interest,
so they will be held at a specific level.
Choice of factors, levels, and range
u Nuisance factors, on the other hand, may have large effects that must be accounted for,
yet we may not be interested in them in the context of the present experiment. Nuisance
factors are often classified as controllable, uncontrollable, or noise factors.
u A controllable nuisance factor is one whose levels may be set by the experimenter. For
example, the experimenter can select different batches of raw material or different days
of the week when conducting the experiment. The blocking principle, discussed in the
previous section, is often useful in dealing with controllable nuisance factors.
u If a nuisance factor is uncontrollable in the experiment, but it can be measured, an
analysis procedure called the analysis of covariance can often be used to compensate
for its effect. For example, the relative humidity in the process environment may affect
process performance, and if the humidity cannot be controlled, it probably can be
measured and treated as a covariate.
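u If the humidity covariate were recorded for each run, the analysis of covariance could be set up roughly as in the sketch below; the data frame, column names, and numbers are purely hypothetical, and the statsmodels formula interface is used.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: a categorical treatment factor plus a measured covariate.
df = pd.DataFrame({
    "response": [28.1, 27.5, 30.2, 29.8, 31.0, 30.4],
    "treatment": ["A", "A", "B", "B", "C", "C"],
    "humidity": [41.0, 55.0, 43.0, 57.0, 40.0, 58.0],
})

# Analysis of covariance: the treatment effect is adjusted for humidity.
model = smf.ols("response ~ C(treatment) + humidity", data=df).fit()
print(model.summary())
```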
Choice of factors, levels, and range
u When a factor that varies naturally and uncontrollably in the process can be
controlled for purposes of an experiment, we often call it a noise factor. In such
situations, our objective is usually to find the settings of the controllable design factors
that minimise the variability transmitted from the noise factors.
Choice of factors, levels, and range
Levels
u Once the experimenter has selected the design factors, he or she must choose the
ranges over which these factors will be varied and the specific levels at which runs will
be made. Thought must also be given to how these factors are to be controlled at
the desired values and how they are to be measured.
u Process knowledge is required to do this. This process knowledge is usually a
combination of practical experience and theoretical understanding. It is important to
investigate all factors that may be of importance and not to be overly influenced by
past experience, particularly when we are in the early stages of experimentation or
when the process is not very mature.
Choice of experimental design
u If the above pre-experimental planning activities are done correctly, this step is
relatively easy. Choice of design involves consideration of sample size (number of
replicates), selection of a suitable run order for the experimental trials, and
determination of whether or not blocking or other randomisation restrictions are
involved.
u Design selection also involves thinking about and selecting a tentative empirical
model to describe the results. The model is just a quantitative relationship (equation)
between the response and the important design factors. In many cases, a low-order
polynomial model will be appropriate. A first-order model in two variables is
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Choice of experimental design
u where $y$ is the response, the $x$'s are the design factors, the $\beta$'s are unknown parameters that will be estimated from the data in the experiment, and $\varepsilon$ is a random error term that accounts for the experimental error in the system that is being studied. The first-order model is also sometimes called a main effects model. First-order models are used extensively in screening or characterisation experiments.
u A common extension of the first-order model is to add an interaction term, say
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$
u where the cross-product term $x_1 x_2$ represents the two-factor interaction between the design factors. Because interactions between factors are relatively common, the first-order model with interaction is widely used.
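u For illustration only (the data are synthetic and the coefficient values are assumed), the sketch below builds the model matrix for the first-order model with interaction and estimates the β's by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Coded factor settings for a 2^2 factorial replicated three times (illustrative).
x1 = np.tile([-1.0, -1.0, 1.0, 1.0], 3)
x2 = np.tile([-1.0, 1.0, -1.0, 1.0], 3)

# Synthetic response generated from assumed "true" coefficients plus random error.
y = 50 + 4 * x1 - 2 * x2 + 1.5 * x1 * x2 + rng.normal(scale=1.0, size=x1.size)

# Model matrix: columns for the intercept, the two main effects, and the interaction.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated beta_0, beta_1, beta_2, beta_12:", np.round(beta_hat, 2))
```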
Choice of experimental design
u Higher-order interactions can also be included in experiments with more than two
factors if necessary. Another widely used model is the second-order model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \varepsilon$
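u Extending the previous sketch, the second-order model only adds quadratic columns to the model matrix; note that this assumes at least three levels per factor (a hypothetical 3^2 layout is used here), since a two-level design cannot estimate the pure quadratic terms.

```python
import numpy as np

# Coded settings for a hypothetical 3^2 factorial (three levels per factor).
levels = np.array([-1.0, 0.0, 1.0])
x1, x2 = (g.ravel() for g in np.meshgrid(levels, levels))

# Second-order model matrix: intercept, main effects, interaction, and quadratics.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1 ** 2, x2 ** 2])
print(X.shape)   # (9, 6): six parameters estimable from nine runs
```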
u When running the experiment, it is vital to monitor the process carefully to ensure that
everything is being done according to plan. Errors in experimental procedure at this stage
will usually destroy experimental validity.
u One of the most common mistakes is that the people conducting the experiment fail to set the variables to the proper levels on some runs. Someone should be assigned to check
factor settings before each run. Up-front planning to prevent mistakes like this is crucial to
success. It is easy to underestimate the logistical and planning aspects of running a
designed experiment in a complex manufacturing or research and development
environment.
u Coleman and Montgomery (1993) suggest that prior to conducting the experiment a few
trial runs or pilot runs are often helpful. These runs provide information about consistency of
experimental material, a check on the measurement system, a rough idea of experimental
error, and a chance to practice the overall experimental technique. This also provides an
opportunity to revisit the decisions made in steps 1-4, if necessary.
Statistical analysis of the data
u Statistical methods should be used to analyse the data so that results and conclusions are
objective rather than judgmental in nature. If the experiment has been designed correctly
and performed according to the design, the statistical methods required are not
elaborate. There are many excellent software packages designed to assist in data
analysis.
u Often we find that simple graphical methods play an important role in data analysis and
interpretation. Because many of the questions that the experimenter wants to answer can
be cast into an hypothesis-testing framework, hypothesis testing and confidence interval
estimation procedures are very useful in analysing data from a designed experiment.
u It is also usually very helpful to present the results of many experiments in terms of an
empirical model, that is, an equation derived from the data that expresses the relationship
between the response and the important design factors.
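u As one concrete, made-up example of the kind of analysis meant here, the sketch below compares two treatment groups with a two-sample t-test and a 95% confidence interval for the difference in means; the response values are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical responses from a simple comparative experiment with two treatments.
group_a = np.array([24.1, 25.3, 26.0, 24.8, 25.5])
group_b = np.array([27.2, 26.8, 28.1, 27.5, 26.9])

t_stat, p_value = stats.ttest_ind(group_a, group_b)   # pooled two-sample t-test
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")

# 95% confidence interval for the difference in means (equal group sizes,
# so the pooled and unpooled standard errors coincide; 8 degrees of freedom).
diff = group_b.mean() - group_a.mean()
se = np.sqrt(group_a.var(ddof=1) / group_a.size + group_b.var(ddof=1) / group_b.size)
half_width = stats.t.ppf(0.975, df=group_a.size + group_b.size - 2) * se
print(f"difference in means = {diff:.2f} +/- {half_width:.2f}")
```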
Conclusions and recommendations
u Once the data have been analysed, the experimenter must draw practical conclusions
about the results and recommend a course of action.
u Graphical methods are often useful in this stage, particularly in presenting the results to
others.
u Follow-up runs and confirmation testing should also be performed to validate the
conclusions from the experiment.
u Throughout this entire process, it is important to keep in mind that experimentation is an
important part of the learning process, where we tentatively formulate hypotheses about
a system, perform experiments to investigate these hypotheses, and on the basis of the
results formulate new hypotheses, and so on. This suggests that experimentation is iterative.
It is usually a major mistake to design a single, large, comprehensive experiment at the
start of a study.
Conclusions and recommendations