Design of Experiment (G2)

This document describes the design of an experiment conducted by a group of students at Birla Institute of Technology and Science, Pilani. It includes the names and student IDs of the six students who conducted the experiments. The document discusses key principles of experimental design, including randomization, replication, and local control. It also describes different methods of experimental design, such as completely randomized design, randomized block design, and generalized randomized block design. Finally, it provides an example of ANOVA procedure for randomized block design and includes references.

Uploaded by

YANA UPADHYAY

Design Of Experiments
24th March, 2021

Bharat Agarwal 2019B4A70725P (Leader)
Sriram Ranga 2017A7PS0047P
Anmol Kaushik 2018B4AB0888P
Rituj Mittal 2018B4A80915P
Sonam Kataria 2018B4TS1165P
Neha Saini 2018B4TS1168P
Atmesh Mahapatra 2019B4A30560P
Pranay Bansal 2019B4A10471P

Birla Institute of Technology and Science, Pilani


TABLE OF CONTENTS

1. Design of Experiments
2. Principle of Experimentation
   a. Randomisation
   b. Replication
   c. Local Control
3. Methods of Experimental Design
   a. Completely Randomized Design
   b. Randomised Block Design
   c. Generalised Randomised Block Design
   d. Optimal Design
   e. Bayesian Experimental Design
   f. Quasi-Experimental Design
4. Examples of Methods of Experimental Designs
5. ANOVA Procedure for Randomised Block Design
6. References

DESIGN OF EXPERIMENTS
'Design for the experiment, don't experiment for the design.'

In the literature on the design and analysis of experiments far more emphasis has been
placed on analysis than on design. In fact, the experimental design determines the manner
in which the experiment is carried out, which is reflected in the statistical model, which in
turn determines the appropriate techniques of analysis.

In layman's terms, Design of Experiments means designing an experiment so that the
observations and measurements are obtained in a way that answers a query validly,
efficiently, and economically. Proper design of an experiment ensures that the data
generated are valid and that proper analysis of the data provides valid statistical
inferences. So, the main objective of Design of Experiments is to verify the
hypothesis in an efficient and economical way. Some techniques used in acquiring
these results are based on certain statistical assumptions.

The main question that now arises is how to obtain the data such that the assumptions
are met and the data are readily usable by the tools of statistical inference. The
first step is obtaining sufficient experimental units; the next is using an
appropriate randomisation procedure to allocate the experimental units to the
treatments. The Design of Experiments provides a method by which the treatments are
placed at random on the experimental units in such a way that the responses are
estimated with the utmost precision possible.

PRINCIPLES OF EXPERIMENTAL DESIGN:


1. Randomisation.
2. Replication.
3. Local Control.

Each of these basic principles is analogous to one of the three rectangular faces of a
triangular prism: each plays an important role, and together they cover the Design of the
Experiment exhaustively.

Randomisation
Randomisation forms the basis of a valid experiment: treatments are allocated to the
experimental units at random, which ensures that every possible allotment of treatments
has the same probability.

● Helps obtain a representative sample from the population.
● Eliminates systematic bias by randomly assigning the experimental units to the
treatments.
● Helps in distributing the unknown variation due to variables influencing both the
independent and dependent variables throughout the experiment. Thus, the errors
become random and independent, which in turn makes the observations random as well.

Replication
Replication serves as the validator of the experiment: by repeating the experimental
situation on replicated experimental units, we obtain a more reliable estimate of
the experimental error.

● Repeating the same treatment a certain number of times helps us obtain a stronger,
more reliable estimate than the one gained from a single observation.
● Effectively increases the precision of the experiment as the number of observations
increases. For example, if the variance of a random variable x is $\sigma^2$, then the
variance of the sample mean $\bar{x}$ based on n independent observations is $\sigma^2/n$.
Thus, as n increases, the variance of $\bar{x}$ decreases.
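
The $\sigma^2/n$ relation can be checked with a small simulation. The sketch below is only illustrative: the normal distribution, the value sigma = 2, the number of trials and the function name are assumptions made for the example, not part of the report.

```python
import random
import statistics


def variance_of_sample_mean(n, sigma=2.0, trials=5000, seed=0):
    """Empirical variance of the mean of n independent N(0, sigma^2) draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        draws = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(sum(draws) / n)
    # Population variance of the simulated sample means.
    return statistics.pvariance(means)


for n in (1, 4, 16):
    # The simulated value should be close to the theoretical sigma^2 / n.
    print(n, round(variance_of_sample_mean(n), 3), "theory:", 2.0 ** 2 / n)
```

Quadrupling the number of replications should cut the variance of the mean to roughly a quarter, which is the precision gain that replication buys.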

Local Control
Local Control involves grouping homogeneous experimental units into blocks, which
minimises the variation within the blocks with the aim of reducing the
experimental error.

● Increases the efficiency and precision of the experiment.
● Takes care of those extraneous sources of variation that contribute to the
experimental error and that Randomisation and Replication are unable to control.
● Thus, ideally, the variation among blocks is separated out and no longer contributes
to the error component.

METHODS OF EXPERIMENTAL DESIGN:


1. Completely Randomized Design.
2. Randomised Block Design.
3. Generalised Randomised Block Design.
4. Optimal Design.
5. Bayesian Experimental Design.
6. Quasi-Experimental Design.

Completely Randomized Design (CRD)


Our main objective is to compare the treatments effectively and distinctly. When the
experimental units are assumed to be relatively homogeneous with respect to the measured
response variable, the resulting design is referred to as a Completely Randomized Design.

Here each experimental unit is randomly assigned to a group, and each group receives a
different treatment. Every unit in the same group therefore receives the same treatment,
and we can finally compare the results for each treatment.

Advantages of CRD
● This is a fairly basic setup among the different experimental designs and hence is
relatively easy to implement and to analyse statistically.
● It requires certain strong a priori assumptions to carry out further analysis.
● There is no restriction on the number of replications for the different treatments.
● Example: Suppose there are 3 treatments and 15 experimental units:

Treatment 1 is replicated 6 times and given to 6 experimental units.

Treatment 2 is replicated 4 times and given to 4 experimental units.

Treatment 3 is replicated 5 times and given to 5 experimental units.
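
The allocation in this example can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name, the unit numbering 1 to 15 and the seed are assumptions.

```python
import random
from collections import Counter


def completely_randomized_design(replications, seed=None):
    """Randomly assign units 1..N to treatments with the given replication counts."""
    rng = random.Random(seed)
    # One label per experimental unit, e.g. six copies of "T1".
    labels = [t for t, reps in replications.items() for _ in range(reps)]
    rng.shuffle(labels)  # every arrangement of the labels is equally likely
    return {unit: t for unit, t in enumerate(labels, start=1)}


allocation = completely_randomized_design({"T1": 6, "T2": 4, "T3": 5}, seed=42)
counts = Counter(allocation.values())
print(allocation)
print(counts)
```

Shuffling the full list of labels keeps the replication counts fixed while leaving the choice of which unit gets which treatment entirely to chance.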

Disadvantages of CRD
● Used only when the entire experimental material is homogeneous, i.e. every
experimental unit has identical characteristics. Hence it is often inefficient, because
it is not always possible to gather sufficient numbers of homogeneous units for an
experiment.
● The results may be skewed if we forcibly assume our experimental material to be
homogeneous, because nuisance/extraneous factors that were left unaccounted for, not
being of primary interest, may be present.
● The variability among the experimental units, i.e. the variation among the
responses, goes entirely into the experimental error, a quantity which we are
expected to minimise.
● Not suitable for a larger number of treatments, as it would necessitate more
experimental material, which would increase the variation.

Randomised Block Design


“Block what you can, randomise what you cannot”

Here, the concept of Blocking is used to remove the effects of a few of the most important
extraneous / nuisance variables. The basic concept is to create homogeneous blocks in which
these extraneous factors are held constant and the factor of interest is allowed to vary.

Randomisation: Every treatment occurs in every block, so the number of treatments is
equal to the number of units in any block. We randomly allocate the treatments to the
experimental units in each block.

Replication: Each treatment appears once in every block. Hence, the number of
replications of each treatment is the same as the number of blocks.

Local Control: The experimental units are first sorted into homogeneous groups / blocks
and then all the treatment combinations are randomly assigned to the units within the
blocks.
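
The within-block randomisation can be sketched as follows. The block labels, unit names and function name are hypothetical and chosen only for illustration; the sketch assumes each block has exactly as many units as there are treatments.

```python
import random


def randomized_block_design(blocks, treatments, seed=None):
    """Assign every treatment once per block, in an independent random order."""
    rng = random.Random(seed)
    layout = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)  # separate randomisation inside each block
        layout[block] = dict(zip(units, order))
    return layout


# Hypothetical material: 4 homogeneous blocks of 3 units each.
blocks = {
    "B1": ["u1", "u2", "u3"],
    "B2": ["u4", "u5", "u6"],
    "B3": ["u7", "u8", "u9"],
    "B4": ["u10", "u11", "u12"],
}
layout = randomized_block_design(blocks, ["T1", "T2", "T3"], seed=7)
for block, assignment in layout.items():
    print(block, assignment)
```

Each treatment ends up replicated once per block, so with four blocks every treatment has four replications.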

Advantages of RBD
● Effectively handles non-homogeneous experimental material.
● It has flexibility to accommodate any number of treatments, blocks, and replications.
● The different treatments need not have equal sample sizes.
● Smaller error variance: the Local Control principle ensures this, because the blocks
are homogeneous and the variance due to differences among blocks is separated from the
error variance. Thus, this design dominates the Completely Randomized Design, which
has high experimental error due to high variability among experimental units.
● Relatively easy statistical analysis, even with missing data.
● If an entire treatment or a block needs to be dropped from the analysis for some reason,
such as spoiled results, the analysis is not complicated thereby.

Disadvantages of RBD
● Not suitable for a large number of treatments because the block size becomes too large.
The basic idea of the Randomised Block Design is to reduce the variability within
blocks, but as the block size increases we deviate from this basic setup.
● It requires stronger assumptions than a completely randomised design, such as no
interaction between treatments and blocks and constant variance from block to block.
Any interaction between block and treatment effects inflates the error.

Generalised Randomised Block Design


In RBD, each treatment occurs only once in each block, and it is not possible to test for the
interaction between block and treatment. The Generalised Randomised Block Design (GRBD),
in contrast, allows replications of each treatment level within a block. Also, unlike RBD,
in GRBD the two factors, treatment and block, are interchangeable.

When the experimental units represent physical entities, smaller groups or blocks of
experimental units usually result in greater homogeneity. Hence, we do not favour using a
block design with more than the minimum x experimental units per block, where x represents
the number of levels of the treatment factor. But in cases where the experimental runs
represent trials rather than physical entities, larger block sizes do not necessarily
increase the variability of experimental units within a block, and experimental runs can
be made quickly.

Advantages of GRBD
● Helpful in cases where there is uncertainty over the block-treatment interaction.
● Helpful in cases where the experimental units represent trials rather than physical
entities.

Disadvantages of GRBD
● Higher cost of experimentation because of the use of replications.
● In some experimental settings, blocking the experimental units on one factor does not
lead to satisfactory precision; two blocking factors other than the treatment
factor may then be required to yield higher precision.

Optimal Design
When the factor levels, i.e. treatments, are continuous rather than discrete, designs
such as the Randomised Block Design cannot study all the factor levels to determine
the value of the factor that optimises the response variable: they assume a simple
experimental setup that is inappropriate for this goal under the given constraints.
This is accomplished with the help of Optimal Design.

It provides a principled approach to accommodating the entire range of the
continuous treatment levels. It is model specific, i.e. it depends on the
statistical model and is assessed with respect to a statistical criterion.

Various types of Optimality include: A, C, D, E, G, I, V.

Advantages of Optimal Design


● Reduces the cost of experimentation by estimating the required statistical model with
fewer experimental runs.
● It allows parameters to be estimated without bias and with minimum variance.
● Can take care of treatments i.e. factor levels which are continuous rather than discretely
categorical.
● Designs can be optimised when the design space is constrained, for example when some
factor settings are practically infeasible to run in the experiment.
● Can accommodate multiple types of factors, such as process, mixture, and discrete
factors, and hence gives more precision.

Disadvantages of Optimal Design

● Complexity is high, as specifying a suitable model and the corresponding criterion
function requires a proper understanding of statistical theory as well as thorough
practical knowledge of designing experiments.

Bayesian Experimental Design


Bayesian Experimental Design (BED) is a general probability-theoretic tool based on
Bayesian inference that interprets experimental data to update the uncertainties in the
model. It facilitates model comparison by providing a formal metric for model selection
when, unlike in the Frequentist approach, prior information such as parameter
uncertainty is available alongside the experimental data.

It is useful for guiding experiments and is founded on the principle of expected
information gain, a metric that helps prevent overfitting of the model to the data by
measuring the relative change in the model parameters as data become available.

A utility function is defined as follows:

$$F = \int_{y} \int_{\theta} U(d, \theta, \eta, y)\, p(\theta \mid y, \eta)\, p(y \mid \eta)\, d\theta\, dy$$

Where η = set of experimental design, y = observed data for each design, θ = unknown
parameters for each design, d = decision taken for data y. The aim of this approach is to
maximize the utility function, and the value of d obtained by doing so will be the best
decision, and the value of η obtained by pairing this with the value of d obtained will be
the Bayesian Experimental Design.
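
As a rough illustration of maximising this expected utility, the sketch below works through one simple, assumed setting that is not taken from the report: θ is a success probability with a Beta(a, b) prior, the design η is the number of draws n, the decision d is the posterior mean, and the utility is minus the squared error of d minus a hypothetical cost per draw. With these choices the inner integral over θ reduces to minus the posterior variance, and the outer integral becomes a finite sum over y.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def predictive_pmf(y, n, a, b):
    """Beta-binomial p(y | eta = n): prior predictive probability of y successes."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return math.exp(log_choose + log_beta(a + y, b + n - y) - log_beta(a, b))


def posterior_variance(y, n, a, b):
    """Variance of the Beta(a + y, b + n - y) posterior for theta."""
    a_post, b_post = a + y, b + n - y
    total = a_post + b_post
    return a_post * b_post / (total ** 2 * (total + 1))


def expected_utility(n, a=1.0, b=1.0, cost_per_draw=0.0005):
    """F(eta = n) for U = -(d - theta)^2 - cost * n, with d the posterior mean."""
    expected_loss = sum(
        predictive_pmf(y, n, a, b) * posterior_variance(y, n, a, b)
        for y in range(n + 1)
    )
    return -expected_loss - cost_per_draw * n


# Pick the candidate design (number of draws) with the highest expected utility.
candidates = range(1, 41)
best_n = max(candidates, key=expected_utility)
print(best_n, expected_utility(best_n))
```

The cost term is what makes the trade-off interesting: without it, more draws always reduce the expected posterior variance, so the largest candidate design would always win.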

Advantages of Bayesian Experimental Design


● It provides a strategic way of utilizing prior information with data. When new
observations are available, we can use the previous posterior distribution of a
parameter as the prior for future analysis.
● The utility function can be tailored to the given problem and it helps to estimate the
function parameters directly without relying on estimated parameters.
● Takes model uncertainty into account.

Disadvantages of Bayesian Experimental Design


● Choosing a prior needs a lot of understanding and intuition. Since the posterior
distribution of the parameters is influenced by the prior, the validity of the
chosen prior is of utmost importance.
● The utility function must be approximated, since a closed form is difficult to obtain
for nonlinear models.
● High computational cost for models involving large numbers of parameters.

Quasi-Experimental Design
A Quasi-Experimental Design aims at establishing a cause-and-effect relationship
between the dependent and independent variable where assignment is done on some
specific, non-random criterion thus allowing the experimenter to control the
assignment to the treatment condition. It is usually used in cases when randomisation
is impractical and / or unethical.

Advantages of Quasi-Experimental Design


● Involves real world data as compared to that obtained in the laboratory.
● Can include both categorical as well as continuous variables.
● Effective as they use pre-post testing, i.e. tests are done before the treatment is
applied and the post-test results are recorded after the experiment is performed.
● Easily conducted as compared to true experiments as they bring in features from both
experimental and non-experimental designs.
● Maximised internal and external validity.

Disadvantages of Quasi-Experimental Design


● Provides weaker evidence as it doesn’t include randomisation. Hence, it is difficult to
get completely unbiased results not influenced by other external factors.
● Internal validity concerns remain as a convincing link between the treatment condition
and observed outcomes cannot be demonstrated especially when the groups are not
equal.

EXAMPLES
Completely Randomised Design
Suppose the BITS Admin wants to test which academic structure is more stress-free for the
students:
1) T1 T2 T3 or 2) Midsem-Compre.
Assumptions:
● The population corresponding to both treatments is normally distributed.
● Both treatments have the same population variance.
● All experimental units are independent of each other in their working.
There will be 2 groups, one for each academic structure. For a particular batch of,
let's say, 1000 students, 500 randomly chosen students will be placed in each of the
two groups.

Randomised Block Design


Since all students might not have the same courses, we consider their branch as the blocking
variable and further split them into xyz branches.

Generalised Randomised Block Design


To counter any interaction existing between stress due to academic structure and branch we
can replicate within each block.

Optimal Design
Suppose a research student in Robotics from Massachusetts Institute of Technology wants to
study the response of his model to 4 different sound patterns : A with 5 levels, B with 2 levels,
and C with 3 levels and D with 7 levels. One complete replication of this experiment would
require 5*2*3*7 = 210 experimental units. But the student could afford only 37 units. The
question now arises as to which 37 out of the 210 units should the student choose. The
D-Optimal Design algorithm provides a reasonable choice here.

Bayesian Experimental Design


Suppose during the lecture hour of the MATHF113 course in BITS Pilani, the Professor wants
to explain the concept of the Frequentist and Bayesian approaches to the students. The
Professor takes the help of 25 TAs and divides them into two groups based on their belief
in the Frequentist vs the Bayesian approach. They take up an experiment of drawing a black
ball from a bag of 17 black balls and 17 white balls. Suppose they get 7 blacks out of the
10 times they drew a ball and are now asked to give an estimate of the probability of
getting a black ball. The Frequentist group say that $\hat{p}$ = 7/10 = 0.7. The Bayesian
group say they know that $\hat{p}$ will be close to 0.5, but they also want to blend this
prior idea with the current data to get an estimate. Here BED helps, as they propose a
prior distribution with mean 0.5 and standard deviation 0.023.
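
To see how the two estimates differ numerically, the sketch below assumes the "certain distribution" is a Beta prior matched to the stated mean and standard deviation, and that the 10 draws are independent (with replacement). Both assumptions, and the helper name, are introduced here for illustration and are not stated in the report.

```python
def beta_from_mean_sd(mean, sd):
    """Method-of-moments Beta(alpha, beta) with the requested mean and sd."""
    common = mean * (1 - mean) / sd ** 2 - 1
    if common <= 0:
        raise ValueError("sd is too large for a Beta distribution with this mean")
    return mean * common, (1 - mean) * common


alpha, beta = beta_from_mean_sd(0.5, 0.023)

blacks, draws = 7, 10
# Conjugate update: add the successes to alpha and the failures to beta.
alpha_post = alpha + blacks
beta_post = beta + (draws - blacks)

frequentist = blacks / draws
bayesian = alpha_post / (alpha_post + beta_post)
print("frequentist estimate:", frequentist)
print("Bayesian posterior mean:", round(bayesian, 4))
```

Because a standard deviation of 0.023 corresponds to a very concentrated prior, ten draws move the Bayesian estimate only slightly away from 0.5, whereas the Frequentist estimate jumps to 0.7.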

Quasi-Experimental Design

A fuel company claims that its fuel produces fewer pollutants from vehicle exhausts than
the existing fuel standards. To verify the claim, the Delhi govt proposes an odd-even scheme
for all the cars in Delhi in which the odd numbered cars will run on the existing fuel
standardized by the pollution control board and the even numbered cars will run on the new
fuel. After a month of implementation, the level of pollutant in the cars' engines will be
measured and a conclusion will be drawn accordingly.

Here odd numbered cars are the control group and even numbered cars are the treatment
group. Odd-even grouping is easier to keep track of, since random grouping would make
measuring the results impractical.

ANOVA PROCEDURE FOR RANDOMIZED BLOCK DESIGN
To implement ANOVA procedure for RBD, we shall consider the following case:

A college primarily uses 3 proctored examination platforms for conducting exams. The college
administration is under the impression that all the 3 platforms are equally effective, and wants
to check if there is a difference in the effectiveness of the platforms.

(Note: effectiveness is measured by the difference in the average test score compared with
the other platforms.)

Assumptions:
● The population corresponding to the three treatments is normally distributed.
● All the treatments have the same population variance.
● All experimental units are independent of each other in their working.

Step 1:
Stating the null and alternative hypothesis based on the treatments:
H0 : The mean marks on all the platforms are equal.
Ha : At least one of the platforms has a different mean.

Step 2:
We have to decide on an extraneous factor that could be affecting the averages of an exam. One
factor is the scoring ability of the students. So far, the students have been tested for 90 marks.
So, we divide the class into blocks of students with different post mid sem scores.
We can consider the following 6 groups:
Students with scores: (i) 0-15 (ii) 16-30 (iii) 31-45 (iv) 46-60 (v) 61-75 (vi) 76-90.

Step 3:
Performing the experiment (conducting another exam of 90 marks):

Group   Mean-Platform 1   Mean-Platform 2   Mean-Platform 3
1       5                 6                 8
2       24                25                28
3       35                37                34
4       54                56                50
5       70                65                65
6       81                82                81

Step 4:
Determining suitable significance level. We shall take α = 0.05.

Step 5:
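Calculating treatment means, block means and the overall mean:
Treatment means (mean for each platform, j = 1, 2, 3):
$\bar{x}_{\cdot 1} = 44.8333$, $\bar{x}_{\cdot 2} = 45.1666$, $\bar{x}_{\cdot 3} = 44.3333$

Block means (mean for each score range, i = 1, ..., 6):
$\bar{x}_{1\cdot} = 6.3333$, $\bar{x}_{2\cdot} = 25.666$, $\bar{x}_{3\cdot} = 35.333$, $\bar{x}_{4\cdot} = 53.333$, $\bar{x}_{5\cdot} = 66.666$, $\bar{x}_{6\cdot} = 81.333$

Overall mean:
$\bar{x} = 44.777$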
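
These means can be reproduced directly from the Step 3 table. The sketch below is a minimal check; the variable names are illustrative, with rows standing for the six score-range blocks and columns for the three platforms.

```python
from statistics import fmean

# Rows are blocks (score ranges 1..6); columns are platforms 1..3 (Step 3 table).
scores = [
    [5, 6, 8],
    [24, 25, 28],
    [35, 37, 34],
    [54, 56, 50],
    [70, 65, 65],
    [81, 82, 81],
]

treatment_means = [fmean(column) for column in zip(*scores)]
block_means = [fmean(row) for row in scores]
overall_mean = fmean([x for row in scores for x in row])

print("treatment means:", [round(m, 4) for m in treatment_means])
print("block means:", [round(m, 4) for m in block_means])
print("overall mean:", round(overall_mean, 4))
```

The printed values are rounded, so they can differ in the last digit from figures that were truncated rather than rounded.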

Step 6:
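(i) Calculating SST, SSTR, SSBL, SSE:

$$SST = \sum_{i} \sum_{j} (x_{ij} - \bar{x})^2 = 11517.111$$

$$SSTR = b \sum_{j} (\bar{x}_{\cdot j} - \bar{x})^2 = 2.111$$

$$SSBL = k \sum_{i} (\bar{x}_{i\cdot} - \bar{x})^2 = 11463.11$$

Here, i ranges from 1 to 6 (blocks), j ranges from 1 to 3 (treatments), b = 6, k = 3,
$\bar{x}_{\cdot j}$ is the mean of treatment j, $\bar{x}_{i\cdot}$ is the mean of block i,
and $\bar{x}$ is the overall mean.

$$SSE = SST - SSTR - SSBL = 51.888$$

(ii) Calculating corresponding Mean Squares:

$$MSTR = \frac{SSTR}{k-1} = 1.055$$

$$MSBL = \frac{SSBL}{b-1} = 2292.622$$

$$MSE = \frac{SSE}{(k-1)(b-1)} = 5.188$$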

Step 7:
Calculating the test statistic: $F_{\text{observed}} = \frac{MSTR}{MSE} = 0.203$

Step 8:
Performing the hypothesis test:
For F = 0.203 with (2, 10) degrees of freedom, the corresponding p-value is 0.819 > α = 0.05.
Thus, we fail to reject H0.
Thus, there is no significant difference between the effectiveness of the 3 platforms.
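
The whole procedure from Step 5 to Step 8 can be reproduced with a short script. This is a sketch using only the Step 3 table; the variable names are illustrative. The p-value uses a closed form of the F survival function that holds only when the numerator has 2 degrees of freedom, which is the case here because k - 1 = 2; a general implementation would call a statistics library instead.

```python
# Rows are blocks (score ranges 1..6); columns are platforms 1..3 (Step 3 table).
scores = [
    [5, 6, 8],
    [24, 25, 28],
    [35, 37, 34],
    [54, 56, 50],
    [70, 65, 65],
    [81, 82, 81],
]

b, k = len(scores), len(scores[0])  # number of blocks, number of treatments
grand_mean = sum(sum(row) for row in scores) / (b * k)
treatment_means = [sum(column) / b for column in zip(*scores)]
block_means = [sum(row) / k for row in scores]

# Sums of squares: total, treatments, blocks, and error by subtraction.
sst = sum((x - grand_mean) ** 2 for row in scores for x in row)
sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
sse = sst - sstr - ssbl

df_treatment, df_block = k - 1, b - 1
df_error = df_treatment * df_block
mstr, msbl, mse = sstr / df_treatment, ssbl / df_block, sse / df_error
f_observed = mstr / mse

# Closed-form upper tail of F(2, df_error); valid only for numerator df = 2.
assert df_treatment == 2
p_value = (1 + 2 * f_observed / df_error) ** (-df_error / 2)

alpha = 0.05
print("SST, SSTR, SSBL, SSE:", round(sst, 3), round(sstr, 3), round(ssbl, 3), round(sse, 3))
print("F:", round(f_observed, 3), "p-value:", round(p_value, 3))
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Almost all of the total variation is absorbed by the blocks, which is exactly what blocking on prior scores is meant to achieve; the little that remains for treatments is small relative to the error.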

REFERENCES

1. Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and
quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
2. Shah, Kirti R., and Bikas Sinha. Theory of optimal designs. Vol. 54.
Springer Science & Business Media, 2012.
3. Clyde, M. A. (2001). Experimental design: A Bayesian perspective.
International Encyclopedia of the Social and Behavioral Sciences.
4. https://en.wikipedia.org/wiki/Bayesian_experimental_design
5. https://en.wikipedia.org/wiki/Quasi-experiment
6. https://scidesign.github.io/designbook/randomized-block-designs.html
7. https://web.ma.utexas.edu/users/mks/384E09/gcbdslides.pdf
8. http://users.stat.umn.edu/~gary/classes/5303/lectures/Blocking.pdf
9. https://stattrek.com/statistics/dictionary.aspx?definition=randomized%20block%20design
10. https://www.itl.nist.gov/div898/handbook/pri/section3/pri332.htm
11. https://www.sciencedirect.com/topics/mathematics/randomized-block-design
12. http://homepage.divms.uiowa.edu/~gwoodwor/AdvancedDesign/Chaloner%20Verdinelli.pdf
13. https://arxiv.org/ftp/arxiv/papers/1909/1909.03861.pdf
