
Bahir Dar University


College of Agriculture and Environmental Sciences
Department of Plant Sciences
M.Sc. Agronomy & Horticulture, Regular 2008EC

The same course is offered under two different codes:


1. Advanced Biometry and Software Applications, Plag601
2. Advanced Biometry and Software Applications, Hort601

Instructor:
 Getachew Alemayehu Damot (PhD, Associate Professor)

The course consists of three parts:


Part I: Some Statistical Concepts Important for Designs and Analysis of Experiments
1. Definition and Fundamentals of Statistics
2. Sampling and Sampling Distributions
3. Estimation
4. Hypothesis Testing
5. Analysis of Variance
Part II: Designs and Analysis of Experiments
6. Designs and analysis of Simple Experiments
6.1 Complete Designs: CRD, RCBD and LD
6.2 Incomplete Designs: BLD, PBLD, Augmented Designs
7. Designs and analysis of Factorial Experiments
7.1 Complete Designs: CRD, RCBD and LD
7.2 Incomplete Designs: Confounding in FE, SPD, SSPD
7.3 Nested Design
Part III: Correlation and Regression Analysis
8. Correlation
9. Regression

Part I: Some Important Statistical Concepts Useful for Experimental Designs and Analysis

1. Definition and Fundamentals of Statistics


1.1 Statistics
The term “Statistics” has three different meanings.
a) Statistics (plural) refers to collections of figures, facts or quantitative information
describing social and economic phenomena, e.g., total biomass production, prices, GDP, etc.
b) Statistics (plural) are the values computed from a sample, as opposed to parameters,
which are the values of a population.
c) Statistics as a branch of scientific method deals with the planning and design of data
collection, and with the organization, presentation, analysis and interpretation of data and
the drawing of conclusions from it.

Statistics is the science, pure and applied, of creating, developing and applying techniques
such that the uncertainty of inductive inferences may be evaluated.

Applications and uses of statistics


Statistical concepts and methods are widely applied to gathering data in various fields of
study, e.g., engineering, physical science, biological science, economics, politics, sociology,
psychology, meteorology, and so on. Statistical concepts and methods bring a logical,
objective and systematic approach to decision making. They do not replace intuition and
common sense, but assist in structuring a problem and in bringing concrete judgment to
bear on it.

Statistics is used in almost all fields of human activities and used by government bodies,
private business firms and research agencies as an indispensable tool.

Limitations of statistics
1. It cannot deal with a single observation or value.
2. Statistical methods are not applicable to studies of qualitative characters that cannot be
coded as numerical values.
3. A statistical study does not take account of changes occurring in individuals.
4. Statistical statements or conclusions are generally not true of, or applicable to,
individuals, but apply to the majority of a class.
5. Misuse of statistics arises from deliberate motivation, from lack of knowledge, or from
the application of inappropriate methodology.
6. Complete accuracy in statistics is often impossible.

In summary, statistics is a highly developed science with a deep-rooted mathematical base. It is
applicable to a large number of economic, social and business phenomena handled by
different bodies of government and the private sector. It is the backbone of industrial research.

1.2 Fundamental mathematical concepts useful in statistics


i. Probability - The mathematical basis of statistical inference is the theory of
probability (the laws of chance).
ii. Random variable - Under probability, an important and widely used term is random
variable, whose antonym is systematic variable. A random variable is a variable
whose values change depending upon chance, while a systematic variable is a variable
whose values change systematically. A variable (character) is the actual property
measured by the individual observations. An example of a random variable: let y be the
number of off-types among 10,000 wheat plants; then y is a random variable.
Random indicates chance, whereas variable means any quantitative or qualitative
observation (value).
iii. Types of random variables - a random variable is either discrete or continuous.
(a) Discrete random variable - if the set of all values of a random variable is
finite or countably infinite, then the random variable is discrete. Anything which
can be counted is a discrete random variable; it takes values which are
non-negative integers. Examples: the number of married males in Addis
Ababa, the number of coffee beans per plant, or the number of stars on a clear night.
A discrete random variable is countable, but may be countably infinite.
(b) Continuous random variable - if the set of all values of the random variable
is an interval or range, then it is a continuous random variable. Examples: the height
of a person, the milk yield of a cow, or the length of the roots of a plant.
Expected value of a random variable - the long-run average of a random
variable. It is mathematically defined as:
μ = E(x), if x is the random variable of interest, where E(x) is the expected value of the random variable x.

E(x) = ∫_{-∞}^{+∞} x f(x) dx, if x is a continuous random variable

E(x) = Σ x p(x), if x is a discrete random variable, where p(x) is the probability of x
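As a minimal sketch of the discrete case (the fair-die values are a hypothetical example, not from the notes), E(x) = Σ x p(x) can be computed directly:

```python
# Expected value of a discrete random variable: E(x) = sum of x * p(x).
# Hypothetical example: x = face shown by a fair six-sided die.
xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6
mu = sum(x * p for x, p in zip(xs, ps))
print(mu)  # 3.5, the long-run average of the rolls
```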

Mean (μ) - a measure of central tendency; the value about which the observations
aggregate.

Variance (v) - the measure of spread (dispersion) of a probability distribution. In
other words, variance expresses how far each value lies from the
mean; it is an indicator of how close to, or far from, one another the values are.
Variance is based on the deviation of each value from the mean.

Examples:

[Figure: two frequency curves A and B; B is more spread out, so B is more variable than A.]

Variance can be defined symbolically as:

σ² = E(x-μ)² = ∫_{-∞}^{+∞} (x-μ)² f(x) dx, if x is a continuous random variable

σ² = E(x-μ)² = Σ (x-μ)² p(x), if x is a discrete random variable

A) Properties of expected value


1) The expected value of a constant is the constant itself: E(c) = c, where c is a constant.
Constant values do not change over the long run (e.g., an established stand of avocado trees).
2) E(x) = μ, where x is a random variable.
3) The expected value of the product of a constant and a random variable is the
constant times the expected value of the random variable: E(cx) = cE(x) = cμ.
4) Assume we have two random variables x1 and x2 with:
E(x1) = μ1, E(x2) = μ2
v(x1) = σ1², v(x2) = σ2²
Then, (a) E(x1+x2) = E(x1) + E(x2) = μ1 + μ2
(b) E(x1·x2) = E(x1)·E(x2); indeed, this is not always true. It is true if x1
and x2 are independent, but not if they are dependent. For example, the
Tail and Head outcomes of a single coin are dependent on each other,
whereas the Tails and Heads of different coins are independent.
(c) E(x1/x2) ≠ E(x1)/E(x2)

B) Properties of variance
1) v(c) = 0
2) v(x) = E(x-μ)² = σ²
3) v(cx) = E[c(x-μ)]² = c²E(x-μ)² = c²σ²
4) v(x1+x2) = v(x1) + v(x2) + 2cov(x1,x2) = σ1² + σ2² + 2σ12, where cov(x1,x2) = σ12 is the
covariance of x1 and x2
5) v(x1-x2) = v(x1) + v(x2) - 2cov(x1,x2) = σ1² + σ2² - 2σ12
Note: cov(x1,x2) = 0 if x1 and x2 are independent.
If x1 and x2 are dependent, cov(x1,x2) ≠ 0.
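As a minimal sketch (the Gaussian samples are an assumed illustration, not from the notes), properties 3) and 4) can be checked by simulation:

```python
import random

# Check v(cx) = c^2 * v(x) and v(x1 + x2) = v(x1) + v(x2) for independent x1, x2.
n = 100_000
x1 = [random.gauss(0, 2) for _ in range(n)]   # variance sigma1^2 = 4
x2 = [random.gauss(0, 3) for _ in range(n)]   # variance sigma2^2 = 9

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

print(var([5 * x for x in x1]))               # ~ 25 * 4 = 100
print(var([a + b for a, b in zip(x1, x2)]))   # ~ 4 + 9 = 13, since cov ~ 0
```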

2. Sampling and Sampling Distributions
a) Observation - an elementary unit, or a measurement of some attribute of a population.
Attributes are the characters of interest. Examples: grain yield per
spike, percent silt in a soil, amount of fat in the milk of Zebu cows, etc.
b) Population - the universe, or the aggregate or totality, of elementary units or
measurements. The whole set of measurements in which we are interested is statistically
called a population. Examples: all the grain yields of a given wheat variety, the soils of
Debre Zeit, all the milk of Zebu cows, etc. Some populations are finite and others are
infinite.
c) Sample - a set of n observations or elementary units drawn from a population.
Normally, n designates the sample size, whereas N designates the population size.

The objective of statistical analysis is to draw inferences about a population based on the
results of a sample. That is, to draw a conclusion about a population, a sample should be taken
and analyzed statistically; otherwise, it is difficult to draw a conclusion about the population.

Most of the time, it will be assumed that a random sample is taken rather than a systematic
sample. Assume that the population consists of N individuals and a sample of n individuals
is taken. The total number of possible samples is C(N, n) = N!/[n!(N-n)!]. If every possible
sample is given an equal opportunity of being selected, the sample is said to be a random
sample. Example: a population consists of the letters A, B, C, D and E (N = 5). If 2 letters are
taken as a sample, how many possible samples can be formed randomly?
C(5, 2) = 5!/[2!(5-2)!] = (5×4×3×2×1)/[(2×1)(3×2×1)] = 10 possible samples can be formed
from this population.
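A minimal sketch with the standard library confirms the count and enumerates the samples (this code is illustrative, not part of the notes):

```python
import math
from itertools import combinations

# Number of possible samples of size n = 2 from a population of N = 5 letters.
letters = ["A", "B", "C", "D", "E"]
print(math.comb(5, 2))                 # 10
print(list(combinations(letters, 2)))  # the 10 possible samples themselves
```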

d) Parameter - some value of a population, which is usually not known; it is a value that
we would like to estimate. Parameters are expressed (designated) in terms of
Greek letters such as μ (mu), σ² (sigma squared), ρ (rho), τ, etc. Parameters are
constants, i.e., they are not random variables.
e) Statistic - a sample estimate of a parameter. Statistics are estimates computed from a
sample and are designated by Latin letters such as ȳ, s², etc. Suppose y1, y2, y3, …, yn is a
sample of observations from the population. Compute the sample mean as:

ȳ = (Σ_{i=1}^{n} yi)/n = (y1 + y2 + y3 + … + yn)/n = sample mean.

s² = [Σ_{i=1}^{n} (yi - ȳ)²]/(n-1) = sample variance, where n-1 is the degrees of freedom.

s = √s², where s is the standard deviation. ȳ, s² and s are all statistics. What is the meaning
of statistics? Statistics are estimates (values) computed from a sample. ȳ estimates μ
(ȳ is the least-squares, i.e., minimum-sum-of-squares, estimate of μ); s² estimates σ²;
and s estimates σ.
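As a minimal sketch of these statistics (the data reuse the first fat column of the doughnut example given later in the notes):

```python
import statistics

y = [64, 72, 68, 77, 56, 99]      # a sample of observations
ybar = statistics.mean(y)          # ybar estimates mu
s2 = statistics.variance(y)        # sample variance, divisor n - 1
s = statistics.stdev(y)            # s = sqrt(s^2) estimates sigma
print(ybar, s2, s)
```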

f) Sampling Distribution - the probability distribution of a statistic. A table of the values
of a random variable together with their probabilities is called a probability distribution.
g) Some useful probability distributions – There are quite a large number of probability
distributions. Some of the more important ones are the followings:
1. Normal distribution
2. Chi-square distribution
3. Binomial distribution
4. Poisson distribution
5. F-distribution
6. t-distribution, etc. Indeed, 1st, 2nd, 5th and 6th are important probability distributions
which are dealt further here in this course.

2.1 Normal probability distribution


This is the most important probability distribution, because most
real-life phenomena can be described in terms of the normal law. Many other distributions
can also be approximated by the normal distribution.

The normal curve is the theoretical model that is used to analyze data in real life.
Density Function - The frequency or density function of the normal law is given as follows:

f(x) = [1/√(2πσ²)] e^{-(x-μ)²/(2σ²)}, where -∞ < x < +∞, -∞ < μ < +∞, σ² > 0, and e ≈ 2.718…
Frequency curve - the frequency curve of the normal distribution is the familiar bell-shaped
curve. In this case, we say x has a normal distribution with mean μ and variance σ².
Symbolically, x ~ N(μ, σ²), where ~ means "is distributed as".

Properties of the normal distribution


1. The curve is bell-shaped
2. It is symmetric and, as a result, the mean divides the curve into 2 equal parts
3. The mean, median and mode are equal
4. The curve extends to infinity in both directions, coming very close to the horizontal
axis but never touching it.

Standard Normal Distribution


The standard normal distribution serves as a working distribution. It is a special case of the
general normal distribution. Why do we need the standard normal distribution? The
general normal distribution is quite difficult to work with; the standard normal
distribution was developed to solve this problem.

If x is distributed normally with mean μ and variance σ², i.e., x ~ N(μ, σ²), then
define Z as:

Z = (x - μ)/σ; then Z is distributed normally with mean 0 and variance 1. In other words,
Z ~ N(0, 1).

The density function of Z is given as follows: f(z) = [1/√(2π)] e^{-z²/2}, -∞ < z < +∞.

The reason we use the standard normal distribution is that the population μ and σ² are
generally not known; since any normal random variable can be standardized to Z, the Z
distribution can be used in practice.
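A minimal sketch of the standardization (the numbers are assumed for illustration):

```python
from statistics import NormalDist

mu, sigma = 100, 5         # hypothetical population mean and standard deviation
x = 102
z = (x - mu) / sigma       # Z = (x - mu) / sigma ~ N(0, 1)
print(z)                                   # 0.4
print(NormalDist().cdf(z))                 # P(Z <= 0.4) from the standard normal
print(NormalDist(mu, sigma).cdf(x))        # the same probability, unstandardized
```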

The standard normal curve

The probability distribution for the standard normal variate has been completely
tabulated.

A disadvantage is that many statistical techniques assume that the random variable
under consideration is normally distributed and use this distribution to calculate various
probabilities. Consequently, many people apply the standard normal distribution whether or
not their study actually follows a normal distribution. Indeed, there are cases which do not
fit the normal distribution: for example, counts such as the number of females born in the
country annually, or the number of telephone calls, which force us to use another
distribution known as the Poisson distribution.

Central Limit Theorem
The central limit theorem is closely related to the law of large numbers.
If y1, y2, y3, …, yn is a sequence of n independent random variables with E(yi) = μi and
v(yi) = σi², and x = y1 + y2 + y3 + … + yn, then

Zn = (x - Σ_{i=1}^{n} μi)/√(Σ_{i=1}^{n} σi²), and Zn ~ N(0, 1) as n becomes large.

Therefore, the central limit theorem gives us the justification for using the normal
distribution. Even if the data are not normal but, say, Poisson or binomial or of some other
distribution, normal-based methods can be used for large samples, because the standardized
sum follows the (standard) normal distribution by the central limit theorem.
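A minimal simulation sketch (uniform summands are an assumed example) shows the standardized sum behaving like N(0, 1):

```python
import random
import statistics

def zn(n):
    # one standardized sum of n independent Uniform(0, 1) variables
    mu_i, var_i = 0.5, 1 / 12
    x = sum(random.random() for _ in range(n))
    return (x - n * mu_i) / (n * var_i) ** 0.5

zs = [zn(30) for _ in range(10_000)]
print(statistics.mean(zs))   # ~ 0
print(statistics.stdev(zs))  # ~ 1, as N(0, 1) predicts
```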

2.2 Chi-square distribution ( χ 2)


Suppose we have a sequence of normally distributed random variables y1, y2, y3, …, yn
with mean E(yi) = μ and variance v(yi) = σ². Let xi = (yi - μ)/σ, so xi ~ N(0, 1).

Define u = x1² + x2² + x3² + … + xn²; then u will have the chi-square distribution with the
following frequency function:

f(u) = [1/((n/2 - 1)! 2^{n/2})] u^{n/2 - 1} e^{-u/2}, u ≥ 0.

[Figure: chi-square frequency curves f(u) for n = 1, 5 and 20, over 0 < χ² < +∞.]

From the curves, we note that as the sample size increases, the chi-square curve approaches
the normal curve. So when the sample size is large, it is convenient to use the normal
distribution, since it is easier to work with than the chi-square.

Properties of the χ² (chi-square) distribution

1. E(u) = n, where n is the number of degrees of freedom (df)
2. v(u) = 2n. (In goodness-of-fit applications, df = number of classes - 1.)

Goodness of fit

Define X² = Σ_{i=1}^{k} (Observed - Expected)²/Expected = Σ_{i=1}^{k} (Oi - Ei)²/Ei

Here X² is used as a measure of discrepancy between theoretical and observed
frequencies. This type of chi-square test is used in plant and animal breeding.
Example: crossing two sorghum plants heterozygous for height, Aa × Aa ⇒ 1AA, 2Aa,
1aa. If you cross 100 plants, as a breeder you would expect 25 AA, 50 Aa and 25 aa.
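A minimal sketch of the statistic for this cross; the observed counts are assumed for illustration (the notes give only the expected 25:50:25):

```python
# Chi-square goodness of fit for the 1AA : 2Aa : 1aa ratio among 100 plants.
observed = [28, 49, 23]          # hypothetical counts
expected = [25, 50, 25]
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(x2)  # compare with the chi-square table at (classes - 1) = 2 df
```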
If y1, y2, y3, …, yn is a random sample from a normal distribution, that is, yi ~ N(μ, σ²),
and the sum of squares is defined as SS = Σ_{i=1}^{n} (yi - ȳ)², then SS/σ² ~ χ²(n-1),
where n-1 is the degrees of freedom.

If the random variables in the sample are independently and identically distributed, that is,
yi ~ N(μ, σ²), then with s² = SS/(n-1):

s² ~ [σ²/(n-1)] χ²(n-1), equivalently (n-1)s²/σ² ~ χ²(n-1), since (n-1)s² = SS.

2.3 The t-distribution


If Z and χ²k are independent standard normal and chi-square random variables
respectively, then define another random variable tk = Z/√(χ²k/k) ~ t(k), where k is the
degrees of freedom. The t-distribution has the following frequency function:

f(t) = Γ[(k+1)/2] / {√(kπ) Γ(k/2)} × [1 + t²/k]^{-(k+1)/2}, where -∞ < t < +∞

Γ is the gamma function.

[Figure: the t frequency curve f(t), symmetric about 0.]
Properties of the t-distribution
1. As the sample size increases, the curve approaches the normal curve.
2. E(t) = 0
3. v(t) = k/(k-2), where k > 2
4. The curve is symmetric
If y1, y2, y3, …, yn is a random sample from the normal distribution, that is,
yi ~ N(μ, σ²), then define t as t = (ȳ - μ)/(s/√n) ~ t(n-1), where s/√n is the standard error.
Let x1, x2, x3, …, xm be a random sample from a normal distribution with mean μ1 and
variance σ1², and let y1, y2, y3, …, yn be another random sample with mean μ2 and variance
σ2². Then define

Z = [(x̄ - ȳ) - (μ1 - μ2)]/√(σ1²/m + σ2²/n) ~ N(0, 1),

where m and n are the sample sizes for the first and second samples, respectively. This
holds if the two variances σ1² and σ2² are known. If the two population variances are not
known, then they have to be estimated from the samples. Let

s1² = Σ(xi - x̄)²/(m-1) and s2² = Σ(yi - ȳ)²/(n-1).

Compute the pooled variance as

sp² = (SSx + SSy)/[(m-1) + (n-1)] = [(m-1)s1² + (n-1)s2²]/(m + n - 2).

Here, we have assumed that the populations have the same variance, that is, σ1² = σ2² = σ².
The t-statistic is defined as

t = [(x̄ - ȳ) - (μ1 - μ2)]/[sp√(1/m + 1/n)] ~ t(m + n - 2).

This is known as the two-sample t-test for independent samples; it is valid when σ1 = σ2.
If σ1² ≠ σ2², there is no pooled sp; instead the standard error is √(s1²/m + s2²/n).

2.4 The F-distribution


This is another sampling distribution. Let χ²u and χ²v be two independent chi-square
random variables; then define F as F = (χ²u/u)/(χ²v/v) = vχ²u/(uχ²v) ~ F(u, v).

The frequency function of the F-distribution is as follows:

h(f) = {Γ[(u+v)/2] (u/v)^{u/2} f^{u/2 - 1}} / {Γ(u/2) Γ(v/2) [1 + (u/v)f]^{(u+v)/2}}, where 0 < f < +∞

Like the t-distribution, the F-distribution is a sampling distribution.


[Figure: the F frequency curve h(f), skewed to the right.]

Application
Assume that you have two samples x1, x2, x3, …, xm and y1, y2, y3, …, yn from populations
with the same variance; then define F as F = s1²/s2² ~ F(m-1, n-1). Here, we have
assumed that s1² > s2²; usually the larger variance is placed in the numerator and the
smaller in the denominator.

The F-distribution is also related to Fisher's z: F = e^{2z}, where z = ½ ln F is Fisher's z-transformation.

3. Estimation
In statistics we are often interested in estimating some property of the population, or in
testing a hypothesis concerning that property. A sample is usually taken from the population,
the value or values concerning the property or properties are calculated, and these values are
taken as estimates of the population properties.

Assume that we are interested in determining the mean of the normal distribution; then we
take a sample and compute x̄ = (Σxi)/n, and x̄ will be the estimate of μ.

Criteria for comparison of estimators


1. Consistency - An estimator is said to be consistent if, as the sample size increases, its
value approaches the population value. For example, the sample mean is a consistent
estimator.
2. Unbiasedness - An estimator is said to be unbiased if its expected value is equal to the
population value. Example: let the population mean μ be estimated by the sample mean x̄,
and let 2x̄, x̄/3 and x̄ be three estimators to be compared:
E(2x̄) = 2E(x̄) = 2μ (biased)
E(x̄/3) = (1/3)E(x̄) = μ/3 (biased)
E(x̄) = μ, so x̄ is a good (unbiased) estimator.
3. Sufficiency - A sufficient estimator is one which has all the necessary information, or
which utilizes all the information that a sample contains. The sample mean is a sufficient
statistic: in this case, no other estimator of the population mean μ, such as the median or
mode, gives additional information.

x̄ = (Σxi)/n = (x1 + x2 + x3 + … + xn)/n

4. Efficiency - An estimator is said to be efficient if it has the minimum (smallest) variance.
Let θ̂1, θ̂2, θ̂3 be estimators of a parameter θ, with respective variances. The hat (^) over a
designated variable means estimator: θ̂ is an estimator of the variable θ, and μ̂ is an estimator
of μ. For instance, if v(θ̂1), v(θ̂2), v(θ̂3) are 2.00, 2.32 and 1.81, respectively, then θ̂3 is the
most efficient.

Methods of estimation
There are 3 important methods of estimation in statistics. These are:
1. The least squares method
2. The maximum likelihood method
3. The minimum χ² method (χ² = chi-square)
Why do we need these methods? The answer is: to estimate the population parameters.
1. Maximum likelihood (M.L.)
The properties of M.L. are such that it gives estimators with desirable properties such as
consistency, efficiency and sufficiency, but not unbiasedness. Assume that we have the
following parameters: θ1, θ2, θ3, …, θk to be estimated. Let x be the random variable (r.v.)
of interest. Take a random sample x1, x2, x3, …, xn. Then the likelihood is computed as
Π f(xi; θ1, θ2, θ3, …, θk), where Π denotes the product; this is known as the likelihood
function. The values of θ1, θ2, θ3, …, θk are chosen in such a way that the likelihood is
maximized.
Example: for a normal sample, each observation contributes
f(xi; μ, σ²) = [1/√(2πσ²)] e^{-(xi-μ)²/(2σ²)}; taking the product of these factors over the
sample and maximizing it gives the M.L. estimates of μ and σ².
2. Least squares method (L.S.)
This is the most widely used technique of estimation. For the least squares estimate, take
the values of the estimators which result in the smallest deviation of the actual values
from the expected values. Let x1, x2, x3, …, xn be a sample from a given population; then
the least squares criterion is L = Σ_{i=1}^{n} [xi - E(xi)]² = Σ_{i=1}^{n} (xi - μ)², which is
minimized.

The least squares technique gives a good result when

(a) the deviations are unrelated (uncorrelated), and
(b) the deviations have the same variance (σ², the population variance). M.L. and L.S.
give the same result if the deviations are independently and identically distributed
(having the same σ²).
3. Minimum χ 2-method

This technique is used with frequency data. Say there are γ attributes (classes), and
take a sample of size n. Let ni of the observations belong to the ith attribute, and let pi
be the true population proportion of the ith attribute; then the minimum χ² criterion is
computed as follows:

χ* = Σ_{i=1}^{γ} (ni - npi)²/(npi)

Find the values of pi which minimize χ*. χ* is asymptotically efficient: when the sample
size is large, it will be very efficient, and for large samples χ* is distributed as χ². In this
course, we will mainly be interested in the least squares method, because it gives unbiased
estimators and is simpler to compute than the other methods.

4. Hypotheses Testing
Definition - A hypothesis is a statement about some attribute(s) of a population. Examples: the
proportion of silt in the soil of Debre Zeit is 50%; there is no difference in yield between
fertilized and unfertilized crops; the butterfat percentage of Zebu cows is 4; etc.

Hypothesis testing - a procedure by which an experimenter decides, with a specified level of
risk of making an error, which of two dichotomous hypotheses to accept or reject. There are
two kinds of hypotheses, namely:
(a) Exact hypothesis = a two-tailed (two-sided) test = Zα/2 or tα/2. Examples:
(i) H0 (null) hypothesis: μ = 4; and
(ii) Ha (alternative) hypothesis: μ ≠ 4.
(b) Inexact hypothesis = a one-tailed (one-sided) test = Zα or tα. Examples:
(i) H0 (null) hypothesis: μ ≥ 4; and
(ii) Ha (alternative) hypothesis: μ < 4.

Normally, the H0 (null) hypothesis is the hypothesis of interest, whereas the Ha (alternative)
hypothesis is the one which becomes true when the null hypothesis is false. For
instance, although in reality differences are expected among treatments, the hypothesis of the
study would be the null: there will be no differences among treatments in crop growth and
yield.

Steps in hypothesis testing
1. State the null hypothesis,
2. Determine the test statistics and sample statistics (Ƶ, t, F, χ 2, etc.),
3. Determine the significance level,
4. Decide on sample size,
5. Take a sample and compute sample statistics, and
6. Make a decision by comparing sample statistic with test statistic. Reject your hypothesis
if computed value is larger than table value of your statistic.

For example, for testing means we can use the Z- or t-statistic: Z = (x̄ - μ)/(σ/√n),
t = (x̄ - μ)/(s/√n).
The t-test is used - (1) when σ is not known, or
(2) when the sample size is < 30.
Statistical inference includes (a) estimation, and (b) hypothesis testing.

Example: let's say that our hypothesis is H0: μ = 100. Assume that the random variable under
study is normally distributed with mean μ and variance σ² = 25. Let the significance level
be 0.05. Take a sample of size 100 and calculate x̄ = 102. Then Z is computed as:

Z = (x̄ - μ)/(σ/√n) = (102 - 100)/(√25/√100) = 2/0.5 = 4.

Whereas Z0.05 from the standard normal distribution table = 1.645.
Then, the decision would be: reject H0.
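A minimal sketch reproducing this one-sample Z-test:

```python
from statistics import NormalDist
import math

mu0, sigma2, n, xbar, alpha = 100, 25, 100, 102, 0.05
z = (xbar - mu0) / (math.sqrt(sigma2) / math.sqrt(n))   # (102 - 100) / 0.5
z_crit = NormalDist().inv_cdf(1 - alpha)                # one-tailed 1.645
print(z, z_crit)                                        # 4.0 1.6448...
print("reject H0" if z > z_crit else "accept H0")       # reject H0
```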

Acceptance and rejection region


The rejection region is that region of the theoretical frequency curve in which the values of
the random variable under study are unlikely to occur if the null hypothesis were true.
[Figure: one-tailed tests for inexact hypotheses. On the left (less than), the rejection region of area α, the significance level, lies in the lower tail; on the right (greater than), it lies in the upper tail; the rest is the acceptance region around μ.]

[Figure: two-tailed test for an exact hypothesis; rejection regions of area α/2 in each tail, with the acceptance region around μ.]
The significance level α (alpha) is the probability which determines the rejection region. In
industry it is called the tolerance level, while in economics or business it is called the risk level.

If the work must be very precise, the α level is set very small, whereas if high precision is not
required, the level of risk can be larger. Usually the 0.01 and 0.05 error levels are used for
most agricultural and biological studies.

[Figure: overlapping distributions under H0 and Ha, showing the regions α, β, 1-α and 1-β.]

Two types of errors


In making statistical inference, we generally commit two types of errors.
1. Type I error – This error is committed when we reject a true hypothesis. This is
designated by α. That is, α is the probability of rejecting a true hypothesis.
2. Type II error - This error is committed when we accept a false hypothesis. The
probability of making a type II error is designated by β. Note that as α increases, β
decreases. The error α is set by the experimenter, whereas β is a hypothetical construct.
The seriousness of the errors depends upon the nature of the experiment. Example: assume
that a certain doctor has cancer patients, and that he has given them a certain drug,
Drug A, which he feels arrests the growth of cancer cells. Which of the two errors is more
serious in this case? Type II error is more serious. Why?

Decision category

                         True situation
Decision         μ = 100                μ = 105
μ = 100          1-α (correct)          β (Type II error)
μ = 105          α (Type I error)       1-β (correct)
Power of a test
Power of a test is the probability of rejecting a hypothesis when it is false. This probability is
designated by 1-β. An experimenter should select the experimental design and the set of
decision rules which give the highest power.

Methods of increasing power of test
1. Increase sample size,
2. Taking samples randomly,
3. Select an appropriate experimental design which precisely measures the treatment
effects and results in a small error variance

Types of hypotheses of interest


1. Hypotheses concerning means
(a) H0: μ = c, e.g., H0: μ = 4. This is a test concerning a single mean; we call it a
one-sample hypothesis.
(b) H0: μ1 = μ2, two population means are equal. Here, we have two samples and the test
is known as a two samples test. Assumption, population variance is the same.
(c) H0: μ1 = μ2 = μ3 = … = μk. This means all the k means are equal.

How do we go about testing our hypothesis?


(i) The first hypothesis: H0: μ = c vs. Ha: μ ≠ c.
Compute: Z = (x̄ - c)/(σ/√n), if σ is known;
t = (x̄ - c)/(s/√n) ~ tα/2(n-1), if σ is not known.
(ii) Second hypothesis: H0: μ1 = μ2 vs. Ha: μ1 ≠ μ2.
Compute: Z = [(x̄1 - x̄2) - (μ1 - μ2)]/[σ√(1/m + 1/n)], if σ is known;
t = [(x̄1 - x̄2) - (μ1 - μ2)]/[sp√(1/m + 1/n)], if σ is not known.

In order to compute t, we must first compute the following:

s1² = Σ_{i=1}^{m} (x1i - x̄1)²/(m-1),

s2² = Σ_{i=1}^{n} (x2i - x̄2)²/(n-1),

sp² = [(m-1)s1² + (n-1)s2²]/[(m-1) + (n-1)].

Here, we have assumed that the population variances are equal. This test is known as the
independent t-test.

But if the two samples are dependent (paired), t is computed as:

t = (d̄ - μd)/(sd/√n) ~ tα/2(n-1), where di = x1i - x2i, d̄ = Σdi/n, and

sd² = [Σdi² - (Σdi)²/n]/(n-1) = Σ(di - d̄)²/(n-1).

Here x1 and x2 are dependent random variables measured on the same units (for example,
albumin and hemoglobin in the same blood samples), and the two samples have the same
size (m = n). Pairing removes unit-to-unit variation and thus reduces the error. When two
samples are unrelated, use the independent-samples t-test; for dependent samples, use this
paired (mean-difference) test.

(iii) Several-samples hypothesis: H0: μ1 = μ2 = μ3 = … = μk. Here it is possible to compute
all the pairwise two-sample t's and make two-sample t-tests, but it is preferable to compute
F under the analysis of variance, i.e., to perform the analysis of variance.
2. Hypotheses concerning variances
(a) H0: σ² = c vs. Ha: σ² ≠ c - a one-sample test.
(b) H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2² - a two-sample test.
(c) H0: σ1² = σ2² = σ3² = … = σk² vs. Ha: not all variances are equal - a multiple-sample test.

A hypothesis for testing variance

(i) H0: σ² = c vs. Ha: σ² ≠ c. To test this hypothesis, use the χ² distribution.
Note: SS/σ² ~ χ²(n-1).
(ii) H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2². To test this hypothesis, use the F distribution:
s1²/s2² ~ F(m-1, n-1).
(iii) H0: equality of several variances (σ1² = σ2² = σ3² = … = σk²) vs. Ha: not all variances
equal. To test this hypothesis use either (a) the Fmax test, where
Fmax = Maximum variance / Minimum variance, or (b) Bartlett's test for homogeneity of
variances. The F-test is also called the variance-ratio test.

Assignment: Is Bartlett's test the best? Read about Bartlett's test for homogeneity of variances.

Developing two sample test (independent)

Let x1, x2, x3, …, xm be a random sample and y1, y2, y3, …, yn be another random sample;
then x ~ N(μx, σx²) and y ~ N(μy, σy²).

Assume that the two variances σx² and σy² are equal; in other words, the two population
variances are equal. Then, we can test the hypothesis H0: μx = μy, or H0: μx - μy = 0.

Procedure
1. Decide on significant level
2. Decide on sample size
3. Take the samples
4. Compute the two variances, then compute pooled variance
5. Compute two sample t
6. Reject the hypothesis if t-calculated > tα/2(m+n-2). This is for a two-tailed test; for a
one-tailed (inexact) hypothesis use tα, not tα/2.

Computation of estimates
Note that the hat sign (^) denotes a sample estimate.

1. μ̂x = x̄ = Σ_{i=1}^{m} xi/m
2. μ̂y = ȳ = Σ_{i=1}^{n} yi/n
3. σ̂x² = sx² = Σ_{i=1}^{m} (xi - x̄)²/(m-1)
4. σ̂y² = sy² = Σ_{i=1}^{n} (yi - ȳ)²/(n-1)
5. σ̂² = sp² = (SSx + SSy)/(m+n-2)

What is the sampling distribution of x̄ - ȳ? Note: if xi ~ N(μx, σx²), then
x̄ ~ N(μx, σx²/m). Similarly, if yi ~ N(μy, σy²), then ȳ ~ N(μy, σy²/n). Also
x̄ - ȳ ~ N[(μx - μy), (σx²/m + σy²/n)], provided x and y are independent random variables.
If σx² = σy² = σ², then x̄ - ȳ ~ N[(μx - μy), σ²(1/m + 1/n)], and
t = [(x̄ - ȳ) - (μx - μy)]/[sp√(1/m + 1/n)].
Example: test H0: μx = μy, with m = n = 4.

x: 2, 1, 3, 2    Σx = 8,  x̄ = 2
y: 5, 3, 4, 8    Σy = 20, ȳ = 5

sx² = Σ_{i=1}^{m} (xi - x̄)²/(m-1) = [(2-2)² + (1-2)² + (3-2)² + (2-2)²]/3 = 2/3 = 0.667

sx² = Σ_{i=1}^{m} (xi - x̄)²/(m-1)  - deviation formula

sx² = [Σ_{i=1}^{m} xi² - (Σ_{i=1}^{m} xi)²/m]/(m-1)  - machine formula

Σxi² = 2² + 1² + 3² + 2² = 18
(Σxi)²/m = 8²/4 = 64/4 = 16
sx² = (18 - 16)/(4 - 1) = 2/3 = 0.667

According to the machine formula, sy² = [Σyi² - (Σyi)²/n]/(n-1):
Σyi² = 25 + 9 + 16 + 64 = 114
(Σyi)²/n = 20²/4 = 400/4 = 100
sy² = (114 - 100)/(4 - 1) = 14/3 = 4.667

The pooled sample variance is estimated as: sp² = (SSx + SSy)/(m+n-2) = (2 + 14)/[(4+4) - 2] = 16/6 = 2.667
sp = √sp² = √2.667 = 1.633
t = [(x̄ - ȳ) - 0]/[sp√(1/m + 1/n)] = (2 - 5)/[1.633 × √(1/4 + 1/4)] = -3/1.155 = -2.60

Tabulated t-value: t0.05/2(6) = 2.447.
Decision: Reject H0, since |t-calculated| > t-tabulated, indicating that the population means of x and
y are not equal.
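A minimal sketch checking the example with SciPy (assuming scipy is available):

```python
from scipy import stats

x = [2, 1, 3, 2]
y = [5, 3, 4, 8]
t, p = stats.ttest_ind(x, y, equal_var=True)  # pooled-variance two-sample t
print(t)   # ~ -2.60, with m + n - 2 = 6 degrees of freedom
print(p)   # two-tailed p-value; p < 0.05, so reject H0
```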

If we do not wish to assume that the two variances are equal, then we can first test the
hypothesis H0: σx² = σy² vs. Ha: σx² ≠ σy².
Procedure:
1. Choose the α-level (error level)
2. Take the two samples (xi and yi)
3. Compute F as F = (sx²/σx²)/(sy²/σy²); under H0, σx²/σy² = 1, so if H0 is true,
F = sx²/sy² ~ Fα(m-1, n-1)
4. Reject H0 if F > Fα(m-1, n-1)

Example: test the variances of x and y given above: sx² = 0.667 and sy² = 4.667.
F = sy²/sx² = 4.667/0.667 = 7.0
F-tabulated at the 0.05 error level: F0.05(3, 3) = 9.28
Decision: Accept H0. Since the F-distribution is skewed to the right, the test uses one tail
rather than two.
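A minimal sketch of the same variance-ratio test (assuming scipy is available for the tabulated value):

```python
from scipy import stats

sx2, sy2 = 0.667, 4.667
F = sy2 / sx2                               # larger variance in the numerator
F_crit = stats.f.ppf(0.95, dfn=3, dfd=3)    # F_0.05(3, 3) = 9.28
print(F, F_crit)
print("reject H0" if F > F_crit else "accept H0")   # accept H0
```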

Development of many sample tests


Let’s assume that we have the following samples:
x11, x12, x13, …, x1n (1st sample)
x21, x22, x23, …, x2n (2nd sample)
. . . .
. . . .
. . . .
xk1, xk2, xk3, …, xkn (kth sample)
Let xij ~ N(μi, σ²), and assume that all variances are the same, where xij is the jth observation
of the ith sample. The parameters to be estimated are μ1, μ2, μ3, …, μk and σ².

Estimation of means of many samples

μ̂1 = x̄1. = Σ_{j=1}^{n} x1j/n
μ̂2 = x̄2. = Σ_{j=1}^{n} x2j/n
μ̂3 = x̄3. = Σ_{j=1}^{n} x3j/n
…
μ̂k = x̄k. = Σ_{j=1}^{n} xkj/n

Estimation of variances of many samples

σ̂1² = s1² = Σ_{j=1}^{n} (x1j - x̄1.)²/(n-1)
σ̂2² = s2² = Σ_{j=1}^{n} (x2j - x̄2.)²/(n-1)
σ̂3² = s3² = Σ_{j=1}^{n} (x3j - x̄3.)²/(n-1)
…
σ̂k² = sk² = Σ_{j=1}^{n} (xkj - x̄k.)²/(n-1)

σ̂² = sp² = [(n-1)s1² + (n-1)s2² + (n-1)s3² + … + (n-1)sk²]/[k(n-1)]
         = Σ_{i=1}^{k} Σ_{j=1}^{n} (xij - x̄i.)²/[k(n-1)]

The error variance is this pooled variance.
To test the hypothesis H0: μ1 = μ2 = μ3 = … = μk:
We know that under the null hypothesis, x̄1., x̄2., x̄3., …, x̄k. is a random sample from
N(μ, σ²/n), so we can estimate μ as:

μ̂ = x̄.. = Σ_{i=1}^{k} Σ_{j=1}^{n} xij/N, where N = kn.

The variance of the means is σ²/n, and its estimate is
s²/n = Σ_{i=1}^{k} (x̄i. - x̄..)²/(k-1).

Distribution of s² and sp²

Note: (k-1)s²/σ² ~ χ²(k-1)
k(n-1)sp²/σ² ~ χ²[k(n-1)]
Compute F as: F = (s²/σ²)/(sp²/σ²) = s²/sp² ~ Fα[k-1, k(n-1)]

5. Analysis of Variance

Definition - The analysis of variance (AOV or ANOVA) is defined as the breakdown of
variability into its component parts. The F-distribution plays a big role in the analysis
of variance.

Basic underlying assumptions for the analysis of variance

1. Observations are drawn from a normally distributed population,
2. Observations represent a random sample,
3. Population variances are equal,
4. The numerator and denominator of the F-ratio are independent.

5.1 Analysis of Variance Models


Definition of effect - consider the following model of one-way analysis of variance:
yij = μ + βj + εij, where yij - typical observation
μ - grand mean
βj - treatment effect
εij - random error
Substituting sample values, we have:

yij = ȳ.. + β̂j + ε̂ij  - (1)

yij - ȳ.. = β̂j + ε̂ij  - (2)
The effect of treatment j (βj) is defined as the deviation of the jth treatment mean from the
grand mean.
Symbolically, βj = μj - μ
The estimate: β̂j = μ̂j - μ̂ = ȳj. - ȳ..  - (3)
Inserting equation (3) in equation (2), we have:
yij - ȳ.. = (ȳj. - ȳ..) + ε̂ij  - (4)
ε̂ij = yij - ȳ.. - (ȳj. - ȳ..) = yij - ȳj.  - (5)
Types of models
1. Fixed model - This model is appropriate when all levels of a treatment are included, or
when some of them are selected deliberately (not at random) from the possible levels:
yij = μ + βj + εij. For this model, βj is a fixed constant.
2. Random model - This is a case in which the k levels used are taken at random from the
possible levels: yij = μ + βj + εij. In this model, βj is a random variable.
3. Mixed model - This is a case where one of blocks or treatments is fixed and the other is
random: yijk = μ + αi + βj + εijk. Here αi is a fixed constant, whereas βj is a random
variable.
Numerical example
A study on the absorption of fat by doughnuts: assume that we have 4 types of fat, each
tested on 6 batches.

                 Fat 1   Fat 2   Fat 3   Fat 4
                 64      78      75      55
                 72      91      93      66
Batch            68      97      78      49
                 77      82      71      64
                 56      85      63      70
                 99      77      76      68
yi. = Σj yij     436     510     456     372
ȳi.              72.67   85      76      62
What would be the statement of the null hypothesis?
H0: There is no difference among the fats in absorption by doughnuts.

Computation of sums of squares

1. Total sum of squares (SStotal)
= between-groups SS + within-groups SS
= treatment SS + error SS
= ΣΣyij² - (ΣΣyij)²/kn = (64² + 72² + … + 68²) - (64 + 72 + … + 68)²/(4×6)
= 134968 - 131128.17 = 3839.83
2. Between-groups sum of squares
SSBg = Σ_{i=1}^{k} (yi.)²/n - (ΣΣyij)²/kn
SSBg = (436²/6) + (510²/6) + (456²/6) + (372²/6) - 131128.17
SSBg = 132752.67 - 131128.17 = 1624.5
3. Within-groups sum of squares
SSWg = ΣΣyij² - Σ(yi.)²/n
SSWg = SStotal - SSBg = 3839.83 - 1624.5 = 2215.33

AOV (ANOVA) Table

Source of variation   df            SS        MS        Fcal
Between groups        k-1 = 3       1624.5    541.5     4.89
Within groups         k(n-1) = 20   2215.3    110.77    -
Total                 kn-1 = 23     3839.83   -         -

where df = degrees of freedom, SS = sum of squares, MS = mean square (SS/df),
Fcal = calculated F (MSBg/MSWg).
When Fcal < Ftab, H0 is accepted; when Fcal > Ftab, H0 is rejected.
F-tabulated (Ftab) at the 0.05 level of error = F0.05(3, 20) = 3.10, so the conclusion is to
reject the null hypothesis (H0), since Fcal > Ftab (4.89 > 3.10). In other words, there is a
difference between groups (treatments).
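A minimal sketch reproducing the doughnut ANOVA (assuming scipy is available):

```python
from scipy import stats

fat1 = [64, 72, 68, 77, 56, 99]
fat2 = [78, 91, 97, 82, 85, 77]
fat3 = [75, 93, 78, 71, 63, 76]
fat4 = [55, 66, 49, 64, 70, 68]
F, p = stats.f_oneway(fat1, fat2, fat3, fat4)  # one-way ANOVA F-test
print(F)   # ~ 4.89, matching the ANOVA table
print(p)   # p < 0.05, so reject H0
```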

The F-test is the usual test in the analysis of variance. If the F-test shows that there is a
difference between treatment means, we would like to know which of the means differ;
this process is known as multiple comparison or mean separation (separation of
means).

5.2 Multiple comparison or contrast

If the overall F-test declares a significant difference between treatments (groups), the
experimenter faces the question: which of the means under study are different? The F-test
only indicates to the experimenter that something has, or has not, happened among the
treatments. If the null hypothesis were true, the observed differences between treatments
would have only a small probability of arising by chance.

A comparison or contrast is a difference between means, regardless of sign (decreasing or
increasing). For example, assume that we have 4 means: ȳ1, ȳ2, ȳ3 and ȳ4. Then the
possible comparisons between or among the means include:

ȳ1 - ȳ2, ȳ1 - ȳ3, ȳ1 - ȳ4, ȳ2 - ȳ3, ȳ2 - ȳ4, ȳ3 - ȳ4, ȳ1 + ȳ2 + ȳ3 - 3ȳ4, etc.

If Ψ (psi) is a contrast for the population, its sample estimate is given by ψ̂. For instance,
ψ̂1 = ȳ1 - ȳ2, ψ̂2 = ȳ1 - ȳ3, ψ̂3 = ȳ1 + ȳ2 + ȳ3 - 3ȳ4, etc.

Methods of multiple comparison or mean separation

1. Least Significant Difference (LSD)
2. Newman-Keuls test
3. Tukey's Honestly Significant Difference
4. Duncan's New Multiple Range Test
5. Scheffé's S-method

Each method of mean comparison has advantages and disadvantages, but the 1st and the 4th
methods are the most commonly used. Indeed, the 5th method is very appropriate when the
sample sizes (n) of the treatments are not the same.

1. Least Significant Difference (LSD)

To use this method, first of all perform the F-test. Then make all pairwise comparisons,
i.e., compare the differences against

LSD = d = tα[k(n-1)] √(2MSE/n),

where tα[k(n-1)] is the t-value at the error degrees of freedom, MSE is the error mean square,
and n is the sample size per treatment. If a calculated difference exceeds d, reject the null
hypothesis; if not, accept the null hypothesis. For instance, for the doughnut study above,
put the means in descending and ascending order on the two axes, respectively, and
calculate the differences between the means:
           ȳ2=85    ȳ3=76    ȳ1=72.67   ȳ4=62
ȳ4=62      23       14       10.67      -
ȳ1=72.67   12.33    3.33     -          -
ȳ3=76      9        -        -          -
ȳ2=85      -        -        -          -

d = tα[k(n-1)] √(2MSE/n) = t0.05(20) √(2(110.4)/6) = 1.7 × 6.07 ≈ 10.3

Then, the difference between two means is significant if it exceeds 10.3. That means there is
no difference between ȳ2 & ȳ3 or between ȳ3 & ȳ1, whereas there are significant differences
between ȳ2 & ȳ4, ȳ3 & ȳ4, ȳ1 & ȳ4, and ȳ2 & ȳ1. To make the presence or absence of
significant differences between treatments easier to recognize, it is good practice to label
means that differ significantly with different letters and those that do not with the same
letter. For the doughnut study results, list the means in their natural order and differentiate
them with letters as follows:

ȳ1 = 72.67 b
ȳ2 = 85 a
ȳ3 = 76 ab
ȳ4 = 62 c

In some cases, however, the LSD can lead to anomalous (unusual) results: the F-test
may be significant while none of the pairwise mean differences is.
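A minimal sketch of the LSD procedure on the doughnut means (assuming scipy is available; it uses MSE = 110.77 from the ANOVA table, so d comes out near 10.5 rather than the rounded 10.3 above):

```python
from itertools import combinations
from scipy import stats

means = {"y1": 72.67, "y2": 85.0, "y3": 76.0, "y4": 62.0}
MSE, n, df_error, alpha = 110.77, 6, 20, 0.05

t = stats.t.ppf(1 - alpha, df_error)        # t_0.05(20), as in the notes
d = t * (2 * MSE / n) ** 0.5                # the least significant difference
print(round(d, 2))                          # ~ 10.48
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    print(a, b, round(diff, 2), "s" if diff > d else "ns")
```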

2. Duncan's New Multiple Range Test

In this method, pairwise comparison among means is performed using the following
formula:

wr = qr,v √(MSE/n),

where qr,v is obtained from special tables (at the 0.01 or 0.05 level of error) for v, the error
degrees of freedom, and r, the range (number of ordered means spanned) between the two
means being compared.

Procedure
1. Rank the means
2. Obtain qr and v
3. Compute wr
4. Compare the differences with wr

Example: take the results of the doughnut study. First rank the means from the highest to
the smallest: ȳ2, ȳ3, ȳ1, ȳ4. Then compute as follows:

r   Comparison   Difference   wr      Difference vs. wr   Conclusion
4   ȳ2 - ȳ4      23           13.64   >                   s
3   ȳ2 - ȳ1      12.34        13.30   <                   ns
2   ȳ2 - ȳ3      9            12.65   <                   ns
3   ȳ3 - ȳ4      14           13.30   >                   s
2   ȳ3 - ȳ1      3.34         12.65   <                   ns
2   ȳ1 - ȳ4      10.6         12.65   <                   ns

Letter display: ȳ1 ab, ȳ2 a, ȳ3 a, ȳ4 b

w2 = 2.95 √(110.4/6) = 12.65
w3 = 3.10 √(110.4/6) = 13.30
w4 = 3.18 √(110.4/6) = 13.64

There are differences only between ȳ2 & ȳ4 and between ȳ3 & ȳ4. When we compare the
results of the LSD and DNMRT methods, LSD appears more sensitive than DNMRT.

LSD        DNMRT
ȳ1 b       ȳ1 ab
ȳ2 a       ȳ2 a
ȳ3 ab      ȳ3 a
ȳ4 c       ȳ4 b

3. Scheffé's S-method
This is known as a simultaneous confidence bound.
S = √[(k-1) Fα(v1, v2) MSE Σ(ci²/n)], where the ci are the contrast coefficients.
Example:

H0: μ1 + μ2 - μ3 - μ4 = 0
ψ̂ = ȳ1 + ȳ2 - ȳ3 - ȳ4 = 72.67 + 85 - 76 - 62 = 19.67

S = √{3(4.90) × 110.4 × [(1)²/6 + (1)²/6 + (-1)²/6 + (-1)²/6]}

S = 30.45, and the decision is to accept the hypothesis, since the
estimate ψ̂ is less than S.

5.3 Sample Size

Determination of sample size is very important for researchers, for two reasons:
1. Too large a sample would be expensive, and
2. Too small a sample would reduce the precision of our estimate.

How to determine sample size?


Sample size in estimating means:
If xi ~ N(μ, σ²), then x̄ ~ N(μ, σ²/n).

The confidence interval for the mean is:

x̄ - tα/2 s/√n ≤ μ ≤ x̄ + tα/2 s/√n, or x̄ - Zα/2 σ/√n ≤ μ ≤ x̄ + Zα/2 σ/√n.

The difference between the two limits is known as the length (L); L is also known as the
tolerance level.
L = (x̄ + Zα/2 σ/√n) - (x̄ - Zα/2 σ/√n) = 2Zα/2 σ/√n
L² = 4Z²α/2 σ²/n
n = 4Z²α/2 σ²/L². If σ² is not known, use the sample estimate s².

Example: given σ² = 50, α = 0.05 and L = 2, n is calculated as:
n = 4Z²α/2 σ²/L² = [4(1.96)² × 50]/2² ≈ 192. To use this formula for determining the sample
size (n), we need to know the variance.
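A minimal sketch of the sample-size formula (rounding up to a whole unit):

```python
from statistics import NormalDist
import math

def sample_size(sigma2, L, alpha=0.05):
    # n = 4 * z_{alpha/2}^2 * sigma^2 / L^2
    z = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    return math.ceil(4 * z ** 2 * sigma2 / L ** 2)

print(sample_size(50, 2))   # 193, i.e. about 192 before rounding up
```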

Possible ways of obtaining s²

1. Previous experience (literature review)
2. A pilot experiment (a small preliminary experiment); the s² determined this way may not
be exact, but it at least gives a clue.

Experimental error and its significance

When we take a sample from a given population, say a normal population, not all of the
observations will be equal to the population mean (μ). A typical observation can be
expressed as yi = μ + εi, where μ is the population mean and εi (epsilon) is the deviation of
the observation from the population mean; we call this term the random error.

When we sum all observations and divide by the sample size, we obtain an estimate of μ.
But each individual observation may be smaller or larger than μ. This deviation (difference),
regardless of sign (increase or decrease), is called experimental error.

Experimental error is used to determine the accuracy of an estimate or the accuracy of the
experiment. Suppose we have two populations with means μ1 and μ2, and we want to test the
hypothesis H0: μ1 = μ2. There are two possible cases in relation to error.
1. In the first case (case I), the experimental error is large. In this case, the difference
between the two means must also be large in order to be detected. A large experimental
error impairs our estimation.
2. In the second case (case II), the experimental error is small. In this case, small differences
can be detected.

t = [(x̄1 - x̄2) - (μ1 - μ2)]/[sp√(1/m + 1/n)]

When the experimental error is reduced, the accuracy of the experiment increases.

5.4 Increasing accuracy

How is accuracy improved? There are several ways of increasing accuracy. The important
ones are:
1. Increase the sample size
2. Refine the experimental technique (replication and randomization)
3. Handle the experimental material in such a way that variability is reduced and
homogeneity is secured.

A. Increasing Sample Size

If the experimental material that the researcher is working with is fairly homogeneous,
increasing the sample size will enable us to detect small differences. This may be seen from
the following: if yi ~ N(μ, σ²), then ȳ ~ N(μ, σ²/n).

Similarly, if we have samples distributed as yi ~ N(μ1, σ²) and xi ~ N(μ2, σ²), then the
difference between their means is distributed as ȳ - x̄ ~ N[(μ1 - μ2), σ²(1/m + 1/n)].
From this, we note that the variance of a mean is smaller than that of a single observation.

B. Choice of Proper Experimental Design

In order to reduce the experimental error, choose a design which is appropriate to the
experiment. By choosing an appropriate design which ensures homogeneous experimental
units for the treatments, it is possible to reduce the experimental error.

Relative efficiency - a measure used to compare two designs in terms of which has the
smaller experimental error. It depends upon the error variances of the two designs. Suppose
we wish to compare design one (I) against design two (II). If we have large samples,
the relative efficiency can be computed as: R.E. = (s2²/s1²) × 100.

If we have small samples, on the other hand, the relative efficiency can be calculated as:
R.E. = {[(n1+1)(n2+3)s2²]/[(n1+3)(n2+1)s1²]} × 100, where s² is the error
variance. The latter takes into account the degrees of freedom in comparing the
efficiency of the designs.

A design is said to be more efficient than another if the R.E. is greater than 100%. A
design which reduces the error variance but also reduces the degrees of freedom may
not be the best; a design which reduces the error variance while increasing the sample size,
and hence the degrees of freedom, is preferable. Example: let the error variance obtained
for design I be 127 and that of design II be 152. Let their respective sample sizes be 6 and
14. Symbolically, s1² = 127; s2² = 152
n1 = 6; n2 = 14
(i) The relative efficiency of design I over design II without considering the df is:
R.E. = (152/127) × 100 = 119.7%
(ii) If we take into account the sample sizes, then
R.E. = [(n1+1)(n2+3)s2²]/[(n1+3)(n2+1)s1²] × 100
R.E. = [(6+1)(14+3)152]/[(6+3)(14+1)127] × 100 = 105.5%
Since the R.E. value is greater than 100, we say that design I is more efficient than design
II.
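A minimal sketch of both relative-efficiency formulas applied to this example:

```python
def re_large(s1_sq, s2_sq):
    # large-sample form: ratio of error variances, in percent
    return s2_sq / s1_sq * 100

def re_small(s1_sq, s2_sq, n1, n2):
    # small-sample form, adjusting for degrees of freedom
    return (n1 + 1) * (n2 + 3) * s2_sq / ((n1 + 3) * (n2 + 1) * s1_sq) * 100

print(round(re_large(127, 152), 1))          # 119.7
print(round(re_small(127, 152, 6, 14), 1))   # 105.5
```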

Definition of some important terms used in experimental design


a) Experimental unit – This is an object or group of objects which are measured to
result in an observation. In other words, it is an object on which an experiment is
carried out. It can be a person, an animal, a chemical reaction, a plot of land, etc.
b) Randomization - The selection of experimental material and the assignment of
treatments are done in a random fashion; this technique is known as
randomization.
Significance of randomization
i. Statistical methods require that the observations be independently distributed;
randomization makes this possible.
ii. Proper randomization assists us in getting rid of the effects of extraneous
factors which are not of interest, e.g., soil variability. If we placed treatments
systematically, their effects would be confounded with the natural soil variability
within the replication, because it is very difficult to control all sorts of soil
variability by blocking.
c) Replication - Replication is the repetition of the basic experiment. Note that taking
repeated measurements on the same experimental set-up does not constitute replication;
that is observation or sampling. Plot-to-plot variability can be reduced by repeating
the same set of treatments on different plots.

Both randomization and replication are necessary to obtain a valid estimate of
experimental error. Increasing the number of observations is important for reducing errors
caused by the observer, but replication is important for increasing accuracy and precision.
d) Treatment - An experimental material in which the experimenter is interested and
whose effect he wishes to measure. Examples: (i) the effect of fertilizers on root
development, (ii) the effect of the quantity of food consumed on weight gain, (iii) the effect
of the quantity of an enzyme on a chemical reaction, etc.
e) Factors and levels - Treatments may consist of main kinds and their sub-constituents. For
instance, nitrogen and phosphorus fertilizers on the one hand and their different rates
on the other hand:

Nitrogen   0, 25, 50, 60, 70
P2O5       0, 50, 100, 150, 200

The items nitrogen and phosphorus are the factors, while their rates are the levels.
f) Treatment combinations - A combination of factor levels is known as a
treatment combination. Assume that we have t factors with r levels each; then we
will have r^t treatment combinations. But if the numbers of levels are different, then the
number of treatment combinations is r1 × r2 × r3 × … × rt. In the case of the above example,
nitrogen and phosphorus with 5 levels each give 5² = 25 treatment combinations.
g) Blocking - This is a very useful term in statistics and experimental design. It is a
technique in which experimental units are classified into homogeneous groups. This is
done to reduce the variability arising from differences among experimental units.

Assume that we have t treatments and we wish to observe each treatment r times
(replications); then we have rt experimental units to work with. Suppose r = 4 and t = 4,
so rt = 4×4 = 16. With the application of blocking, these 16 experimental units are arranged
into four homogeneous groups of four units each: Group I, Group II, Group III and Group IV.

Each homogeneous group is assigned to one block. Note that the units within each block are
homogeneous, and this is the idea of blocking. If land is the experimental unit, its condition
may vary along the slope, or across East-West, South-North, or both. In other words, it may
vary one-way or two-way.

Blocking, treatment placement and comparison of their effects

Each treatment has to be allocated within each block, which is considered homogeneous.
The treatments found in the same homogeneous block are therefore comparable. But any
two treatments found in two different blocks are not comparable, since their differences may
not be due entirely to treatment effects; they may partly be due to differences among the
experimental units. Within each block, each treatment has to be assigned randomly.

Suppose that we have t treatments and wish to observe each treatment r times
(replications). Then we have 3 different cases for placing treatments on the
experimental units so that the treatments are comparable.
1. Case I - Suppose that the experimental units are homogeneous; in other words, there is no
restriction on the experimental units. Indeed, if there is no unit-to-unit variation, there is no
need for blocking, and the sources of variation will be the treatments and the experimental
error. Under this condition, the experimental design is known as the Completely Randomized
Design (CRD) and the AOV (ANOVA) is a One-Way ANOVA, since the only main source of
variation is the treatment.
2. Case II - Suppose not all experimental units are homogeneous, and there is one
restriction on the experimental units. Within blocks/groups, the experimental units are
homogeneous, while between blocks/groups, they are heterogeneous. Each treatment is
randomly assigned to one experimental unit within a group, and that group is
known as a block. Under this case, each treatment occurs within each block and
the blocks are therefore said to be complete; each treatment occurs once in each
block. If we need to observe each treatment more than once in each block, say k
times, then we need kt homogeneous experimental units to form a complete block.

For instance, when land as the experimental unit varies only one-way, either along the
slope, East-West or South-North, it is necessary to apply one-way blocking, and each
block is considered as a replication of the set of basic treatments. Under this
condition, the design is known as the Randomized Complete Block Design (RCBD) and the
ANOVA is a Two-Way ANOVA, the one-way variation of the experimental
units (blocks) and the treatments being the two main sources of variation.

Block 1   Block 2   Block 3   Block 4   Block 5

3. Case III - In the case of two-way variation (restriction) in the experimental units, two-way
blocking is applied and the design is known as the Latin Square Design (LSD); the
ANOVA is a Three-Way ANOVA, the row blocks, the column blocks and the
treatments being the three sources of variation.

Two-way blocking requires, however, as many row blocks and column blocks as
there are treatments. If the number of treatments is 4, then the number of row
blocks and of column blocks each has to be 4.

For instance, a plot of land may vary both across East-West and South-North:

[Figure: field layout with row blocks in one direction and column blocks in the other.]

Complete and incomplete blocks

Complete blocks are those in which every treatment occurs in every block. On the contrary,
if each treatment does not occur in every block, the blocks are said to be incomplete. If there
are k blocks and each treatment does not occur k times, the blocks are said to be
incomplete.

There are two major classes of incomplete blocks:


1. Balanced incomplete blocks, and
2. Partially balanced incomplete blocks.

Balanced incomplete blocks (BIB)

Let's assume that we have b blocks, t treatments and r observations per treatment. Let k
be the number of treatments per block, where k ≠ t and t ≥ k, because there are not enough
experimental units to observe every treatment in every block. For a balanced incomplete
block design, the following must hold true: rt = bk.

For example, let r = 3, t = 4, b = 6, and k = 2. Then rt = bk, i.e., 3×4 = 6×2, and it can be
simply laid out as follows:

b1: T1 T2    b2: T1 T3    b3: T1 T4    b4: T2 T3    b5: T2 T4    b6: T3 T4

In this case, all the treatments occur an equal number of times.
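A minimal sketch generating this balanced layout: with t = 4 and k = 2, the b = 6 blocks are exactly the pairs of treatments:

```python
from itertools import combinations

treatments = ["T1", "T2", "T3", "T4"]
blocks = list(combinations(treatments, 2))   # b = 6 blocks of size k = 2
print(blocks)
# Each treatment appears r = 3 times, so rt = 3*4 = 12 = bk = 6*2.
counts = {t: sum(t in b for b in blocks) for t in treatments}
print(counts)
```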

Note that we can compare treatments which are found in the same block, not in different
blocks. The function of local control (blocking) is to make the experimental design more efficient.

Partially balanced incomplete blocks

Under this case, the number of observations is not the same for all treatments; that
is, rt ≠ bk. Example: assume that we have 24 experimental units and
10 treatments that we wish to study. Suppose that we can group these units into 6 blocks;
the size of each block will then be 4. Clearly rt ≠ bk: we cannot find an integer value of r
such that 10r = 24. This means that some of the treatments occur more often
than others.

Block 1: T1 T2 T3 T5
Block 2: T1 T2 T4 T6
Block 3: T2 T3 T4 T7
Block 4: T3 T2 T1 T8
Block 5: T4 T1 T3 T9
Block 6: T2 T4 T1 T10

In this layout, the treatments do not occur an equal number of times.

Note: treatments T1 & T2 occur 5 times each,
      treatments T3 & T4 occur 4 times each, and
      the other treatments occur only once.

Analysis of complete block experiments

Let's assume that we have b groups (blocks) with nt homogeneous experimental units in
each group. Let's assume also that the n observations of the jth treatment in the ith block are
a random sample from a normal distribution with mean μij and variance σ².
That is, yijk ~ N(μij, σ²).

1. General Model
First it is necessary to write the model in terms of its parameters:
yijk = μij + εijk, where i = 1, 2, 3, …, b
j = 1, 2, 3, …, t
k = 1, 2, 3, …, n
In other words, yijk is the kth observation of the jth treatment in the ith block. By
transforming the above equation, we will have the following:
yijk = μ + βi + ζj + εijk, where yijk = typical observation
μ = grand mean
βi = ith block effect
ζj = jth treatment effect
εijk = kth random error of the jth treatment
If we leave the model as yijk = μij + εijk, it is not possible to separate the effects of
treatments and blocks. The next step will be the estimation of the parameters.

2. Estimation of parameters
The second step, after writing down the model, is estimating the parameters. To make the
effects estimable, we must impose the constraints Σi βi = 0 (i = 1,…,b) and
Σj ζj = 0 (j = 1,…,t); the effects sum to zero because some of the means are < μ and
some others are > μ.

Parameters: μ, μi., μ.j, βi, ζj, σ²

In the ANOVA, all the σ² are assumed to be the same. If they are not, it is not possible
to pool them; if they are the same, pooling is possible.

μ = (Σi Σj μij)/bt,  μi. = (Σj μij)/t,  μ.j = (Σi μij)/b,  βi = μi. − μ,  ζj = μ.j − μ

Note that Σi (μi. − μ) = 0 and Σj (μ.j − μ) = 0.

Estimates
μ̂ = ȳ… = (Σi Σj Σk yijk)/btn
μ̂i. = ȳi.. = (Σj Σk yijk)/tn
μ̂.j = ȳ.j. = (Σi Σk yijk)/bn

Effects and their estimates

βi = μi. − μ, estimated by β̂i = ȳi.. − ȳ…, where i = 1,2,3,…,b

ζj = μ.j − μ, estimated by ζ̂j = ȳ.j. − ȳ…, where j = 1,2,3,…,t

σ̂² = [Σi Σj Σk (yijk − ȳij.)²]/bt(n−1) = pooled variance. Note that if there is only one
observation per cell (n = 1), the error degrees of freedom bt(n−1) = 0, so σ̂² cannot be
computed (no MSE).
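These estimators are easy to verify numerically. The following is a minimal sketch (not part of the original notes) that computes them with numpy for a hypothetical b x t x n data array; the array y and its dimensions are illustrative assumptions only:

```python
# A hypothetical sketch of the estimators above: given observations in an
# array y of shape (b, t, n), compute grand, block and treatment means,
# the effect estimates, and the pooled error variance.
import numpy as np

rng = np.random.default_rng(0)
b, t, n = 4, 3, 4
y = rng.normal(loc=5.0, scale=1.0, size=(b, t, n))   # illustrative data only

grand = y.mean()                          # y-bar ...
block_means = y.mean(axis=(1, 2))         # y-bar i..
treat_means = y.mean(axis=(0, 2))         # y-bar .j.
cell_means = y.mean(axis=2)               # y-bar ij.

beta_hat = block_means - grand            # block effects, sum to 0
zeta_hat = treat_means - grand            # treatment effects, sum to 0
sigma2_hat = ((y - cell_means[:, :, None]) ** 2).sum() / (b * t * (n - 1))

print(beta_hat.sum().round(10), zeta_hat.sum().round(10), sigma2_hat)
```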

Additive Model
The model discussed here above (yijk = μ + βi + ζ j +Ɛijk) is known as additive model.

A model is said to be additive if the difference between two treatments is the same for
all blocks. In other words, in an additive model the difference between any two
treatments does not vary from block to block. For example, if we have 4 treatments,
the treatment differences would be ζ1−ζ2, ζ1−ζ3, ζ1−ζ4, ζ2−ζ3, ζ2−ζ4, ζ3−ζ4, and each of
these differences is the same in all blocks.

This can be illustrated graphically by plotting the treatment means against the blocks:
under additivity the treatment profiles are parallel.

[Figure: treatment effects (vertical axis, scale 0–10) for ζ1, ζ2, ζ3 and ζ4 plotted
against the block effects β1–β5; the four lines are parallel.]

An additive model never includes interaction effects. It is applicable where the difference
between treatments is the same in all blocks, i.e., ζ1 − ζ3 = c (a constant) in every block.

Non-additive model or interacting model


A model is said to be non-additive if the difference between two treatments depends upon
the block in which they occur. Mathematically, the non-additive or interacting model is
written as follows: yijk = μ + βi + ζj + δij + Ɛijk, where δij = interaction effect of
blocks and treatments, specifically the interaction between the ith block and the jth treatment.

To relate this model to the general model yijk = μij + Ɛijk:

μij = μ + (μi. − μ) + (μ.j − μ) + (μij − μi. − μ.j + μ)
        block effect   treatment effect   interaction effect

δ̂ij = ȳij. − ȳi.. − ȳ.j. + ȳ…

To obtain the estimates of δij, we need to impose the constraint Σi Σj δij = 0.

The next step is hypothesis testing.

40
Test hypotheses
1. Hypothesis on treatments
H0: There is no difference among treatments.
H0: ζ1 = ζ2 = ζ3 = … = ζt, or
H0: μ.1 = μ.2 = μ.3 = … = μ.t

If yijk ~ N(μij, σ²), then ȳ.j. ~ N(μ.j, σ²/bn). If the null hypothesis is true,
ȳ.1., ȳ.2., ȳ.3., …, ȳ.t. is a random sample from a normal distribution with mean μ and
variance σ²/bn. The sample variance of the ȳ.j., namely [Σj (ȳ.j. − ȳ…)²]/(t−1),
therefore estimates σ²/bn, which reflects the variability among treatments.

Multiplying by bn gives an estimate of σ²: Q1 = bn Σj (ȳ.j. − ȳ…)²/(t−1).

To test the above hypothesis, compute F = Q1/σ̂² ~ Fα[t−1, bt(n−1)], where

σ̂² = [Σi Σj Σk (yijk − ȳij.)²]/bt(n−1)

2. Hypothesis on blocks
H0: There is no difference among blocks.
H0: β1 = β2 = β3 = … = βb, or
H0: μ1. = μ2. = μ3. = … = μb. If H0 is true, ȳ1.., ȳ2.., ȳ3.., …, ȳb.. is a random sample
from a normal distribution with mean μ and variance σ²/tn; then ȳi.. ~ N(μ, σ²/tn).
The sample variance of the ȳi.., [Σi (ȳi.. − ȳ…)²]/(b−1), estimates σ²/tn, and the
corresponding estimate of σ² is Q2 = tn Σi (ȳi.. − ȳ…)²/(b−1). Then compute

F = Q2/σ̂² ~ Fα[b−1, bt(n−1)]

3. Hypothesis on interactions
H0: There is no difference among interactions of blocks and treatments.
H0: δ11 = δ12 = δ13 = … = δbt = 0.
The sample variance of the estimates ȳij. − ȳi.. − ȳ.j. + ȳ…, namely
[Σi Σj (ȳij. − ȳi.. − ȳ.j. + ȳ…)²]/(b−1)(t−1),
estimates σ²/n, so the corresponding estimate of σ² is
Q3 = n[Σi Σj (ȳij. − ȳi.. − ȳ.j. + ȳ…)²]/(b−1)(t−1).
Then F is computed as F = Q3/σ̂² ~ Fα[(b−1)(t−1), bt(n−1)].

Note that whether the model is additive or non-additive is determined after conducting the
experiment and examining whether the observations are affected by the interaction of
blocks and treatments.

Under the additive model, SStotal = SSblock + SStreatment + SSerror, whereas under the
non-additive model, SStotal = SSblock + SStreatment + SSinteraction + SSerror.

AOV (ANOVA) Table (non-additive model)


Source of variation   df            Sum of squares                                Mean squares
Blocks                b−1           tn Σ(ȳi.. − ȳ…)² = B                          B/(b−1)
Treatments            t−1           bn Σ(ȳ.j. − ȳ…)² = T                          T/(t−1)
Interaction           (b−1)(t−1)    n ΣΣ(ȳij. − ȳi.. − ȳ.j. + ȳ…)² = I            I/[(b−1)(t−1)]
Error                 bt(n−1)       ΣΣΣ(yijk − ȳij.)² = E                         E/[bt(n−1)]
Total                 btn−1         ΣΣΣ(yijk − ȳ…)²                               –

If the interaction effect is insignificant, the interaction and error sums of squares may
be pooled, giving MSerror = (E + I)/[(b−1)(t−1) + bt(n−1)], which does not differ
appreciably from E/bt(n−1).

AOV (ANOVA) Table (additive model)


Source of variation   df                                         Sum of squares   Mean squares
Blocks                b−1                                        B                B/(b−1)
Treatments            t−1                                        T                T/(t−1)
Error                 (b−1)(t−1) + bt(n−1),                      I+E              (I+E)/[(b−1)(t−1)+bt(n−1)]
                      i.e. btn−1 − [(b−1)+(t−1)]
Total                 btn−1                                      –                –
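The decomposition in the two tables is mechanical enough to script. Below is a minimal numpy sketch (an assumption, not code from the notes): block_design_ss returns B, T, I and E for a (b, t, n) array; pooling I with E then gives the additive-model error line.

```python
# A minimal numpy sketch (an assumption, not code from the notes): it
# computes the block (B), treatment (T), interaction (I) and error (E)
# sums of squares of the tables above from a (b, t, n) data array.
import numpy as np

def block_design_ss(y):
    b, t, n = y.shape
    grand = y.mean()
    ybar_i = y.mean(axis=(1, 2))               # block means, y-bar i..
    ybar_j = y.mean(axis=(0, 2))               # treatment means, y-bar .j.
    ybar_ij = y.mean(axis=2)                   # cell means, y-bar ij.
    B = t * n * ((ybar_i - grand) ** 2).sum()
    T = b * n * ((ybar_j - grand) ** 2).sum()
    I = n * ((ybar_ij - ybar_i[:, None] - ybar_j[None, :] + grand) ** 2).sum()
    E = ((y - ybar_ij[:, :, None]) ** 2).sum()
    assert np.isclose(B + T + I + E, ((y - grand) ** 2).sum())
    return B, T, I, E

# Pooling I with E reproduces the additive-model error line:
B, T, I, E = block_design_ss(np.random.default_rng(1).normal(size=(4, 3, 4)))
pooled_mse = (I + E) / ((4 - 1) * (3 - 1) + 4 * 3 * (4 - 1))
```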

Part II: Design and Analysis of Experiments

Definition – By statistical design and analysis of experiments, we mean the process of planning
an experiment so that appropriate data can be collected and analyzed to yield valid and
objective conclusions.

A) Types of experimental design from randomization point of view: There are two major
types of experimental design:
1. Systematic experimental designs
2. Randomized experimental designs

i. Systematic Experimental Designs


Prior to the development of modern experimental designs, researchers tried various
arrangements of their treatments. These arrangements were not subject to the laws of
chance (probability). For example, if we have the treatments A, B, C these could be
arranged as follows:
i)
A A A A B B B B C C C C
ii)
A A A A
B B B B
C C C C
iii)
A B C
A B C
A B C
A B C
iv)
A A B B C C
A A B B C C

One of the commonest systematic arrangements of treatments, in which the treatments occur
several times, is the following:
A B C A B C A B C

Diagonal squares:

A B C      A B C
C A B      B C A
B C A      C A B

Advantages of the systematic designs


1. Simplicity – many experimental fields are handled easily including planting,
harvesting and other agronomic practice
2. No need for randomization
3. Dissimilar varieties can be alternated to observe natural crossing or mechanical
mixture
4. Varieties can be arranged in order of maturity, or fertilizers can be applied in
order of increasing magnitude
5. Systematic arrangement allows for proper utilization of the experimental area; that
is, it allows intelligent placement of the treatments

Disadvantages of the systematic designs


1. It does not provide proper estimation of experimental error
2. Correlation of adjacent plots may result in systematic errors in estimating
treatment effects

ii. Randomized Designs


The treatments are applied to the experimental units within blocks at random. Some of
these designs are:
1) Completely randomized design (CRD)
2) Randomized complete block design (RCBD)
3) Latin square design (LSD)
Cross-over design (COD)
Switch-back design (SBD)
Graeco-Latin square Design (GLSD)
4) Split-plot design
Split-block design
Strip-block design
5) Lattice square design
Simple lattice square design
Double lattice square design
Triple lattice square design
Cubic lattice square design
Quadratic lattice square design
6) Youden square design
Quasi-Latin square design
Semi-Latin square design, etc.

B) Types of experimental design from completeness point of view: There are two major
types of experimental design:
1. Complete designs
2. Incomplete designs

Selection of experimental design


The above list of designs indicates that there is quite an array of experimental designs
available to a researcher. There is no one best experimental design for all situations.
Each design was developed to control variability under a given experimental condition.
The choice of an experimental design depends upon the nature of the experimental material
to be tested and the variability present. However, as a guiding principle, an experimenter
should choose the design that is simplest in terms of layout and analysis and that
adequately controls variability.

The chief advantage of the randomized designs is that they lend themselves to valid
statistical analysis.

6. Simple/Single Factor Experiments


6.1 Complete Designs
A. Completely Randomized Design (CRD)
The completely randomized design (CRD) is the basic design. All other designs
spring from CRD by putting restriction on the allotment of treatments to the
experimental units.

Definition – The simplest experimental design, in which the treatments are assigned at
random to homogeneous experimental units, is known as the completely randomized
design (CRD). It is selected when the overall variation among experimental units is
relatively small or insignificant.

Advantages of Completely Randomized Design (CRD)


1. Simple layout
2. Statistical analysis is straightforward
3. Does not require equal sample sizes for the treatments
4. Provides maximum degrees of freedom for estimating the error variance

Disadvantages of Completely Randomized Design (CRD)

1. Appropriate only where the number of treatments is small and where the experimental
units (material) are homogeneous, such as in laboratories and greenhouses.
2. Not appropriate for field experiments, because it is nearly impossible to get
homogeneous experimental units in the field.
3. It does not provide method for estimating interaction of treatments with blocks
Note that the term layout refers to the placement of experimental treatments on the
experimental site whether it be over space, time, or type of material.

Analysis of a completely randomized experiment


Statistical model for CRD:
yij = μi + Ɛij  (1)
Transform the general model indicated above (1) into:
yij = μ + ζi + Ɛij  (2), where ζi = μi − μ

AOV (ANOVA) Table for CRD

Source of variation   df        SS
Treatments            k−1       [Σi (Σj yij)²]/n − (Σi Σj yij)²/kn
Error                 k(n−1)    Σi Σj yij² − [Σi (Σj yij)²]/n
Total                 kn−1      Σi Σj yij² − (Σi Σj yij)²/kn
Example, assume that we have four treatments and 32 homogenous experimental units.
The data from this experiment are as follows:
Treatments
t1 t2 t3 t4
3 4 7 7
6 5 8 8
3 4 7 9
3 3 6 8
1 2 5 10
2 3 6 10
2 4 5 9
2 3 6 11
∑yij = 22 28 50 72
1. Total sum of squares (TSS) = Σi Σj yij² − (Σi Σj yij)²/kn
   = (3² + 6² + 3² + … + 11²) − (3 + 6 + 3 + … + 11)²/(4x8)
   = 1160.0 − 924.5
   = 235.5
2. Treatment sum of squares (TreatSS) = [Σi (Σj yij)²]/n − (Σi Σj yij)²/kn
   = (22² + 28² + 50² + 72²)/8 − (3 + 6 + 3 + … + 11)²/(4x8)
   = 1119 − 924.5
   = 194.5
3. Error sum of squares = TSS − TreatSS = 235.5 − 194.5 = 41.0


AOV (ANOVA) Table
Source of variation   df    Sum of squares   Mean squares   F-calculated value
Treatments            3     194.5            64.8           64.8/1.464 = 44.2
Error                 28    41.0             1.464
Total                 31    235.5
F-tabulated value at the 0.05 level = F0.05(3, 28) = 2.95
H0: No difference of treatment effects
Decision: Reject H0, since the calculated F greatly exceeds the tabulated F.
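As a cross-check, the CRD analysis above is easy to reproduce in software. The sketch below (hypothetical, not part of the notes) feeds the same four treatment groups to scipy's one-way ANOVA:

```python
# A cross-check (hypothetical, not in the notes) of the CRD analysis above,
# using the same 4 treatments x 8 units data.
from scipy.stats import f_oneway

t1 = [3, 6, 3, 3, 1, 2, 2, 2]
t2 = [4, 5, 4, 3, 2, 3, 4, 3]
t3 = [7, 8, 7, 6, 5, 6, 5, 6]
t4 = [7, 8, 9, 8, 10, 10, 9, 11]

F, p = f_oneway(t1, t2, t3, t4)
print(F, p)   # F is about 44.28 (the notes round to 44.2); p far below 0.05
```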

Assignment: please carry out multiple comparisons for the above experiment using LSD
and DMRT!

B. Randomized Complete Block Design (RCBD)


Definition – A design is said to be an RCBD if treatments are assigned at random to
homogeneous experimental units within a block and replicated in the other blocks. The
reason for blocking is to eliminate variability due to differences between experimental
units. Blocking minimizes the experimental error.

In this case, the sum of squares due to error is partitioned into block and error sums of
squares.

Relationships between CRD and RCBD


The two designs are related in that each block of an RCBD consists of a CRD; that is, an
RCBD is a group of CRDs. The RCBD is a complete block design in that each treatment
occurs at least once in every block. Also, all treatments occur an equal number of times
and all blocks are of the same size.

Treatment combinations in RCBD


Assume that we have t-number of treatments, b-number of blocks, and n-number of
observations per treatment per block.

tn = number of observations per block
btn = total number of observations, or
btn = total number of treatment combinations

Illustration of treatment combinations in the following table:
Where the total treatment combinations = btn
Treatments
t1 t2 t3 .. tt
b1 t11 t21 t31 … tt1
t12 t22 t32 … tt2
t13 t23 t33 … tt3
. . . … .
. . . … .
. . . … .
t1n t2n t3n … ttn
b2 t11 t21 t31 … tt1
t12 t22 t32 … tt2
t13 t23 t33 … tt3
. . . … .
. . . … .
. . . … .
t1n t2n t3n … ttn
Blocks b3 t11 t21 t31 … tt1
t12 t22 t32 … tt2
t13 t23 t33 … tt3
. . . … .
. . . … .
. . . … .
t1n t2n t3n … ttn
.. … … … … …
.. … … … … …
.. … … … … …
bb t11 t21 t31 … tt1
t12 t22 t32 … tt2
t13 t23 t33 … tt3
. . . … .
. . . … .
. . . … .
t1n t2n t3n … ttn

If n = 1 (a single observation per treatment per block), it is not possible to
determine the sum of squares of the interaction: under that condition, the error sum
of squares is equivalent to the interaction sum of squares. Thus, an RCBD requires
more than one observation per treatment per block if the interaction effects are to
be examined.

Statistical model for RCBD
There are two different statistical models for RCBD depending upon
absence or presence of interaction effects between treatments and blocks.
Case 1:In additive model
yijk = μ + βi + ζ j +Ɛijk
Case 2:In non-additive model
yijk = μ + βi + ζ j +δij+Ɛijk

Example, assume that we have 3 treatments, 4 blocks and 4 observations per


treatment per block with the following results:
Treatments
t1 t2 t3 Block total
b1 1 6 4
3 2 4
0 2 1
4 4 3
∑n 8 14 12 34
Blocks b2 2 7 6
2 3 2
4 0 6
8 2 6
∑n 16 12 20 48
b3 3 5 2
4 1 4
5 6 2
0 4 0
∑n 12 16 8 36
b4 1 7 0
0 5 1
9 0 9
2 8 6
∑n 12 20 16 48
Treatment total 48 62 56 Grand total
= 166

Estimates (cell means ȳij, row means ȳi., column means ȳ.j):
       t1    t2    t3   | ȳi.
b1     2     3.5   3    | 3
b2     4     3     5    | 4
b3     3     4     2    | 3
b4     3     5     4    | 4
ȳ.j    3     4     3.5  | ȳ.. = 3.5
Computation of treatment and block effects
a) Block effects: βi = μi. − μ..
   β̂i = ȳi. − ȳ..
   β̂1 = ȳ1. − ȳ.. = 3 − 3.5 = −0.5
   β̂2 = ȳ2. − ȳ.. = 4 − 3.5 = 0.5
   β̂3 = ȳ3. − ȳ.. = 3 − 3.5 = −0.5
   β̂4 = ȳ4. − ȳ.. = 4 − 3.5 = 0.5
Note that Σi β̂i = 0: the estimated block effects always sum to zero by construction.

b) Treatment effects: ζj = μ.j − μ..
   ζ̂j = ȳ.j − ȳ..
   ζ̂1 = ȳ.1 − ȳ.. = 3 − 3.5 = −0.5
   ζ̂2 = ȳ.2 − ȳ.. = 4 − 3.5 = 0.5
   ζ̂3 = ȳ.3 − ȳ.. = 3.5 − 3.5 = 0
Again Σj ζ̂j = 0: the estimated treatment effects sum to zero by construction.

c) Interaction effects: δij = μij − μi. − μ.j + μ..
   δ̂ij = ȳij − ȳi. − ȳ.j + ȳ..
   δ̂11 = 2 − 3 − 3 + 3.5 = −0.5      δ̂12 = 3.5 − 3 − 4 + 3.5 = 0.0     δ̂13 = 3 − 3 − 3.5 + 3.5 = 0.0
   δ̂21 = 4 − 4 − 3 + 3.5 = 0.5       δ̂22 = 3 − 4 − 4 + 3.5 = −1.5      δ̂23 = 5 − 4 − 3.5 + 3.5 = 1.0
   δ̂31 = 3 − 3 − 3 + 3.5 = 0.5       δ̂32 = 4 − 3 − 4 + 3.5 = 0.5       δ̂33 = 2 − 3 − 3.5 + 3.5 = −1.0
   δ̂41 = 3 − 4 − 3 + 3.5 = −0.5      δ̂42 = 5 − 4 − 4 + 3.5 = 0.5       δ̂43 = 4 − 4 − 3.5 + 3.5 = 0.0
Σi Σj δ̂ij ≈ 0 (exactly 0 with unrounded means): the interaction estimates also sum to
zero by construction.

Test of hypothesis
a) On treatments: H0: ζ1 = ζ2 = ζ3, or H0: μ.1 = μ.2 = μ.3
If the null hypothesis is true, the sample estimates ȳ.j. ~ N(μ, σ²/bn), since μ.j = μ.
Q1 = bn Σj (ȳ.j. − ȳ…)²/(t−1) = 4x4[(3 − 3.5)² + (4 − 3.5)² + (3.5 − 3.5)²]/2 = 4

σ̂² = [Σi Σj Σk (yijk − ȳij.)²]/[bt(n−1)]
   = [(1 − 2)² + (3 − 2)² + … + (9 − 4)² + (6 − 4)²]/[4x3(4 − 1)]
   = 7.44
F = Q1/σ̂² = 4/7.44 = 0.537
F0.05(2, 36) ≈ 3.26
Conclusion: Accept H0

b) On blocks: H0: β1 = β2 = β3 = β4, or H0: μ1. = μ2. = μ3. = μ4. = μ..

Compute Q2 = tn Σi (ȳi.. − ȳ…)²/(b−1)
   = 3x4[(3 − 3.5)² + (4 − 3.5)² + (3 − 3.5)² + (4 − 3.5)²]/(4 − 1)
   = 4
F = Q2/σ̂² = 4/7.44 = 0.537
F0.05(3, 36) ≈ 2.87
Conclusion: Accept H0

c) On interactions: H0: δ11 = δ12 = … = δ43 = 0

Compute Q3 = n[Σi Σj (ȳij. − ȳi.. − ȳ.j. + ȳ…)²]/(b−1)(t−1)
   = 4[(2 − 3 − 3 + 3.5)² + … + (4 − 4 − 3.5 + 3.5)²]/[(4 − 1)(3 − 1)]
   = 4
F = Q3/σ̂² = 4/7.44 = 0.537
F0.05(6, 36) ≈ 2.36
Conclusion: Accept H0

Analysis of Variance
SV            df            SS                                      MS
Block         b−1           tn Σ(ȳi.. − ȳ…)²                        tn Σ(ȳi.. − ȳ…)²/(b−1)
Treatment     t−1           bn Σ(ȳ.j. − ȳ…)²                        bn Σ(ȳ.j. − ȳ…)²/(t−1)
Interaction   (b−1)(t−1)    n Σ(ȳij. − ȳi.. − ȳ.j. + ȳ…)²           n Σ(ȳij. − ȳi.. − ȳ.j. + ȳ…)²/[(b−1)(t−1)]
Error         bt(n−1)       Σ(yijk − ȳij.)²                         Σ(yijk − ȳij.)²/[bt(n−1)]
Total         btn−1         Σ(yijk − ȳ…)²                           –

AOV (ANOVA) Table of the above results
SV            df    SS       MS     F
Block         3     12       4      0.537
Treatment     2     8        4      0.537
Interaction   6     24       4      0.537
Error         36    267.84   7.44   –
Total         47    311.84   –      –

Expected mean square


The expected mean square is important in hypothesis testing. The kind of expected mean
square to be used depends upon the model used.

1. Model I (fixed model)


AOV Table. Note E(Q1) = bn E(ȳ.j. − ȳ…)² = σ² + bnδ1²
SV df Expected MS
Treatments t-1 σ2+bnδ12

Blocks b-1 σ2+tnδ22

BxT (b-1) (t-1) σ2+nδ32

Error bt(n-1) σ2
Total btn-1

2. Model II (mixed model)


Blocks are random, but treatments are fixed
AOV Table
SV df Expected MS
Treatments t-1 σ2+bnδ2 + nσBT2

Blocks b-1 σ2+tnσB2

BxT (b-1) (t-1) σ2+nσBT2

Error bt(n-1) σ2
Total btn-1

3. Model III (random model)
Both blocks and treatments are found in random model
AOV Table
SV df Expected MS
Treatments t-1 σ2+ nσBT2+bnσT2

Blocks b-1 σ2+nσBT2 +tnσB2

BxT (b-1) (t-1) σ2+nσBT2

Error bt(n-1) σ2
Total btn-1

How to use these 3 different models?


We need the expected mean squares for testing hypotheses. To test hypotheses under the
different models, proceed as follows:
1. Model I: to test treatment, block and interaction effects, use MSerror as the denominator
2. Model II: a) to test the treatment effect, use MSinteraction as the denominator
             b) to test the block effect, use MSerror as the denominator
3. Model III: to test treatment and block effects, use MSinteraction as the denominator for
both treatments and blocks.

Number of observations per cell

1. If blocks and treatments are fixed, use more than one observation per cell. Why?
Because this model is tested against MSerror, and with only one observation per cell
there is no MSE.
2. Under models II & III, there is no need to take more than one observation per cell;
it is not economical to do so.

Advantages of RCBD
1. More accurate than the CRD for most types of experiments
2. Analysis is straightforward
3. No restriction on the number of treatments or replications (high flexibility)
4. Possible to estimate missing observations

Disadvantages of RCBD
The chief disadvantage of the RCBD is that it is not appropriate for experiments with a
very large number of treatments, or where the experimental units within a block have
considerable variability.

Its flexibility and ease of application have made it the most popular design in use, the
Latin square design being its closest rival.

Comparison of RCBD with CRD
Relative efficiency – The R.E of RCBD as compared to CRD is:
R.E = [(n1+1)(n2+3)s2²]/[(n1+3)(n2+1)s1²] x 100%
Where n1 = error df of RCBD, s1² = MSE of RCBD, n2 = error df of CRD, s2² = MSE of CRD.
(Related formula: a single missing observation in an RCBD is estimated as
x = [rB + tT − G]/[(r−1)(t−1)], where r = number of replications (blocks), B and T are
the totals of the block and treatment containing the missing plot, and G is the grand total.)
Example: compare the previous RCBD AOV results with those of a CRD AOV.

AOV Table (RCBD):                          AOV Table (CRD):
SV           df   SS    MS                 SV           df   SS    MS
Blocks       3    60    20                 Treatments   2    50    25
Treatments   2    50    25                 Error        9    180   20
Error        6    120   20                 Total        11   230
Total        11   230

R.E = [(n1+1)(n2+3)s2²]/[(n1+3)(n2+1)s1²] x 100%
    = [(6+1)(9+3)20]/[(6+3)(9+1)20] x 100%
    = [(7x12)20]/[(9x10)20] x 100%
    = 1680/1800 x 100% = 93%
Conclusion: Blocking is not efficient; the CRD is in fact more efficient than the RCBD
here, so blocking is not necessary.
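The relative-efficiency formula lends itself to a small helper function; the following sketch (an assumed utility, not from the notes) reproduces the 93% figure:

```python
# A small helper (an assumption, not from the notes) implementing the
# relative-efficiency formula above.
def relative_efficiency(n1, s1_sq, n2, s2_sq):
    """R.E. of design 1 vs design 2, with error df n1, n2 and MSEs s1^2, s2^2."""
    return ((n1 + 1) * (n2 + 3) * s2_sq) / ((n1 + 3) * (n2 + 1) * s1_sq) * 100

print(round(relative_efficiency(6, 20, 9, 20), 1))   # 93.3 -> blocking not efficient
```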

Efficiency of designs can be known:


1. From previous experience and review
2. By carrying out pilot experiment

C. Latin Square Design (LSD)


In the completely randomized design (CRD), the treatments are allocated to homogeneous
experimental material at random; there is no restriction. In the RCBD, one restriction is
put on the allocation of treatments to the experimental units within a block: all
treatments must appear an equal (or proportional) number of times in a block.

When the experimental units have two-way restrictions, the experimental area is divided
into rows and columns so that each treatment occurs once in each row and once in each
column. This type of design is known as the Latin square design. By eliminating row and
column effects from the error variance, the mean square due to error is reduced below
that of the RCBD. The Latin square design is used in biology, agriculture, industry,
economics and many other fields.

Advantages of LSD
1. For experimental units with two-way restrictions, the LSD controls variability better
than the RCBD; in other words, the error mean square is much smaller than that of the RCBD.
2. Analysis is simple compared with that of other two-way-control designs, although it is
more complicated than that of the CRD & RCBD.
3. Analysis remains relatively simple with missing data.

Disadvantages of LSD
1. The number of treatments is restricted by the number of rows or columns. As a rule of
thumb, the LSD is not generally used when the number of treatments exceeds 10.
2. For fewer than 5 treatments, the df left for error after controlling heterogeneity is
disproportionately small, so the error estimate is poor and there is a greater tendency
to accept the null hypothesis. It is then necessary to use a repeated Latin square or
another appropriate design.
3. If the assumption of no interaction among rows, columns and treatments does not hold,
the analysis is invalid (see the assumptions listed below).

Construction and arrangement of LSD


Definition of terms
1. Standard Square – a L.S is said to be standard if its first row and first column are
arranged alphabetically.
Example, 3x3 L.S:
A B C                        A B C
C A B   Not a standard       B C A   Standard square
B C A   square               C A B
2. Conjugate squares – Two standard squares are said to be conjugate, if the rows of the
one are the columns of the other.
3. Self conjugate – a L.S is said to be self conjugate, if its row and column arrangement
is the same.
Examples,
i. A 2x2 L.S has only one standard square:
   A B
   B A
Its 1st row equals its 1st column and its 2nd row equals its 2nd column, so this L.S is
self conjugate.
ii. A 3x3 L.S has only one standard square:
   A B C
   B C A
   C A B
iii. A 4x4 L.S This has number of Latin squares, but it has only 4 standard
squares.
a) ABCD b) ABCD c) ABCD d) ABCD
BADC BCDA BDAC BADC
CDBA CDAB CADB CDAB
DCAB DABC DCBA DCBA
Enumeration of L.S: The letters of a Latin square of side k can be arranged to provide a
total of A·k!·(k−1)! Latin squares, where A is the number of possible standard squares.
Level of L.S   Possible Latin squares   Possible standard squares
1. 2x2         2                        1
2. 3x3         12                       1
3. 4x4         576                      4
4. 5x5         161,280                  ?
5. 6x6         812,851,200              ?
6. 7x7         ?                        16,942,080
Randomization: Theoretically a researcher is supposed to select one Latin square from
array of possible Latin squares of a given dimension. This must be done at random.
Practically this is not possible when number of treatments exceed 5. The advice is to
select one standard square and then randomize its rows and columns independently.

Example: consider a 4x4 L.S. How do you independently randomize its rows and columns?
For randomization, follow these steps:
1. Draw 3 sets of 4 random digits (1, 2, 3, 4) from table of random digits.
Assume the 3 sets of random digits are: 2, 1, 3, 4
3, 1, 2, 4
4, 3, 1, 2
2. Select the square according to the first number of the first set. Since the first number
is two (2), we pick the 2nd standard square of the 4x4 L.S., that is, square (b) above:
   A B C D
   B C D A
   C D A B
   D A B C
3. Arrange rows of step 2 according to the 2nd set of numbers (3, 1, 2, 4)
C D A B
A B C D
B C D A
D A B C
4. Randomize the columns of step 3 according to the 3rd set (4, 3, 1, 2), and work with
this L.S:
   B A C D
   D C A B
   A D B C
   C B D A
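The row/column/letter randomization described above can be sketched in code. The function below is an illustrative assumption: it starts from a cyclic standard square rather than sampling from all possible squares, which is the practical shortcut recommended above when the number of treatments exceeds 5.

```python
# A sketch (assumed procedure, mirroring the steps above) of randomizing a
# Latin square: start from a cyclic square, then shuffle rows, columns and
# letter assignments independently. Each step preserves the LS property.
import random

def random_latin_square(treatments):
    t = len(treatments)
    square = [[(i + j) % t for j in range(t)] for i in range(t)]  # cyclic square
    random.shuffle(square)                                        # permute rows
    cols = list(range(t))
    random.shuffle(cols)                                          # permute columns
    square = [[row[c] for c in cols] for row in square]
    labels = list(treatments)
    random.shuffle(labels)                                        # permute letters
    return [[labels[cell] for cell in row] for row in square]

for row in random_latin_square("ABCD"):
    print(" ".join(row))
```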
Construction of L.S with a one step cyclic permutation: the simplest and commonest
way of constructing L.S is using a one step cyclic permutation of the letters. Example,
consider a chemical experiment in which there are 6 chemicals under investigation, with
6 methods of mixing and 6 technicians to do the job. The layout of this experiment on the
basis of a one step cyclic permutation is as the following:
A B C D E F
F A B C D E
E F A B C D
D E F A B C
C D E F A B
B C D E F A
Assumptions: before we analyze a Latin square experiment, we need to make the
following assumptions:
1. No row by column interaction,
2. No column by treatment interaction,
3. No row by treatment interaction,
4. No row by column by treatment interaction.

If these assumptions do not hold, we would need a large number of observations: for
example, for an experiment in a 6x6 Latin square we would need 6³ = 216 observations to
estimate all the interactions.

Analysis of Latin square experiments


Model: yijk = μij + Ɛijk - (1), the general model

1 2 3
I A B C
II C A B
III B C A

The general model is transformed into:

yijk = μ + ρi + δj + ζk + Ɛijk  (2)
Where, μ = grand mean = μ…
ρi = row effect = μi.. − μ…
δj = column effect = μ.j. − μ…
ζk = treatment effect = μ..k − μ…
Let Q = {(i, j, k): the index triples (row, column, treatment) actually observed in the square}.

μi.. = (Σ μijk)/t, summing over the t triples (i, j, k) ϵ Q with row index i
i=1

Sample estimates
Estimate of the treatment mean is ȳ..k; of the row mean, ȳi..; of the column mean, ȳ.j.
For the above example, the mean of treatment A is:
ȳ..A = ⅓(y11A + y22A + y33A)

ρ̂i = ȳi.. − ȳ…,  δ̂j = ȳ.j. − ȳ…,  ζ̂k = ȳ..k − ȳ…

Estimation of the error mean square

σ̂² = Σ(yijk − ȳi.. − ȳ.j. − ȳ..k + 2ȳ…)²/(t−1)(t−2)
   = [Σ yijk² − tΣ ȳi..² − tΣ ȳ.j.² − tΣ ȳ..k² + 2t²ȳ…²]/(t−1)(t−2)

Test, 1) H0: ζ1 = ζ2 = ζ3 = … = ζk
      2) H0: ρ1 = ρ2 = ρ3 = … = ρi
      3) H0: δ1 = δ2 = δ3 = … = δj
Then compute Q1 = tΣ(ȳ..k − ȳ…)²/(t−1) = [tΣ ȳ..k² − t²ȳ…²]/(t−1)
             Q2 = tΣ(ȳi.. − ȳ…)²/(t−1) = [tΣ ȳi..² − t²ȳ…²]/(t−1)
             Q3 = tΣ(ȳ.j. − ȳ…)²/(t−1) = [tΣ ȳ.j.² − t²ȳ…²]/(t−1)
To test each hypothesis, compute F = Q/σ̂² ~ Fα[(t−1), (t−1)(t−2)]

Analysis of variance table

Source of variation   df           SS                                        MS
Row blocks            t−1          tΣ ȳi..² − t²ȳ…² = R                      R/(t−1)
Column blocks         t−1          tΣ ȳ.j.² − t²ȳ…² = C                      C/(t−1)
Treatments            t−1          tΣ ȳ..k² − t²ȳ…² = T                      T/(t−1)
Error                 (t−1)(t−2)   Σyijk² − t²ȳ…² − R − C − T = E            E/[(t−1)(t−2)]
Total                 t²−1         Σyijk² − t²ȳ…²

Example of Latin square experiment on feeding trial on sheep having different age and
breed.
Row blocks – age differences
Column blocks – breed differences

Treatments:
A. Grazing only (control)

B. Grazing and maize supplement
C. Grazing, maize and protein supplement (P1)
D. Grazing, maize and protein supplement (P2)
E. Grazing, maize and protein supplement (P3)

Interested parameter: wool yield


Layout and wool yields (rows = age groups, columns = breeds):
D = 32   E = 33   C = 30   B = 28   A = 24   | Row total 147, mean 29.4
C = 51   D = 45   A = 41   E = 45   B = 29   | Row total 211, mean 42.2
E = 41   A = 29   B = 24   D = 36   C = 35   | Row total 165, mean 33.0
B = 38   C = 39   E = 42   A = 23   D = 37   | Row total 179, mean 35.8
A = 26   B = 24   D = 21   C = 29   E = 26   | Row total 126, mean 25.2
Column totals: 188   170   158   161   151  | Grand total 828
Column means:  37.6  34.0  31.6  32.2  30.2 | Grand mean 33.12

Treatment totals and means:
         A      B      C      D      E     | Overall
Total    143    143    184    171    187   | 828
Mean     28.6   28.6   36.8   34.2   37.4  | 33.12

Computation of sums of squares

1. Total SS = Σyijk² − t²ȳ…², where t²ȳ…² (CT) = 828²/25 = 27423.36
   = 1578.64
2. Row SS = tΣ ȳi..² − t²ȳ…² = 831.04
3. Column SS = tΣ ȳ.j.² − t²ȳ…² = 162.64
4. Treatment SS = tΣ ȳ..k² − t²ȳ…² = 369.44
5. Error SS = SStotal − SSrow − SScolumn − SStreatment = 215.52
AOV Table
Source of variation   df    SS        MS       F
Rows                  4     831.04    207.76   11.57**
Columns               4     162.64    40.66    2.26
Treatments            4     369.44    92.36    5.14*
Error                 12    215.52    17.96
Total                 24    1578.64
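The wool-yield analysis can be verified numerically; the sketch below (hypothetical code, not from the notes) rebuilds the four sums of squares from the layout and the plot values:

```python
# A numerical cross-check (hypothetical) of the wool-yield analysis above:
# the sums of squares are rebuilt from row, column and treatment totals.
import numpy as np

y = np.array([[32, 33, 30, 28, 24],
              [51, 45, 41, 45, 29],
              [41, 29, 24, 36, 35],
              [38, 39, 42, 23, 37],
              [26, 24, 21, 29, 26]], dtype=float)
letters = ["DECBA", "CDAEB", "EABDC", "BCEAD", "ABDCE"]   # treatment layout

t = 5
cf = y.sum() ** 2 / t**2                      # correction term, 27423.36
ss_total = (y**2).sum() - cf
ss_row = (y.sum(axis=1) ** 2).sum() / t - cf
ss_col = (y.sum(axis=0) ** 2).sum() / t - cf
treat_tot = {c: 0.0 for c in "ABCDE"}
for i, row in enumerate(letters):
    for j, c in enumerate(row):
        treat_tot[c] += y[i, j]
ss_treat = sum(v**2 for v in treat_tot.values()) / t - cf
ss_error = ss_total - ss_row - ss_col - ss_treat
print(ss_row, ss_col, ss_treat, round(ss_error, 2))   # 831.04 162.64 369.44 215.52
```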

Compute relative efficiency
1. If we consider rows as blocks, then:
   S²B(rows) = [4(40.66) + 12(17.96)]/16 = 23.64
   R.E = (13x19x23.64)/(15x17x17.96) x 100% ≈ 127%
   Conclusion: row blocking is efficient.
   Relative efficiency of the LS as compared with an RCBD consisting of row blocks:
   R.E = [(t−1)(t−2)+1][(t−1)(t−1)+3]S²RCBD / {[(t−1)(t−2)+3][(t−1)(t−1)+1]S²LSD}
2. Columns as blocks:
   S²B(columns) = [4(207.76) + 12(17.96)]/16 = 65.41
   R.E = (13x19x65.41)/(15x17x17.96) x 100% ≈ 353%
3. R.E of the LS as compared with a CRD:
   R.E = [(13x23)/(21x15)][60.45/17.96] x 100% = 319.5%
   Conclusion: the LSD is more efficient than the CRD.

Repeated Latin Square


Sometimes we may wish to conduct a single Latin square at more than one location, or more
than one Latin square at a single location. Under such circumstances, there are two
possibilities:
a) Blocks are the same at all locations. Example: if we are interested in studying the
   bed-time of students at 4 different campuses across the freshman, sophomore, junior
   and senior years of study, with student IQ of 80, 90, 100 or 110, then at the 4
   different campuses the blocks are the same: IQ and year of study.
b) Blocks are different at different locations. Example: a soil study at Alemaya,
   Debrezeit and Ambo, where either the one-way or the two-way restrictions (blocks) are
   quite different from location to location.

Analysis of such type data (repeated Latin square)


To analyze such data, we consider the following:
Case I: Both rows and columns are the same at different locations.
Model: yijkl = μ+ρi+δj+ζk+Ϩl+εijkl,
where: ρi = row effect Ϩl = location effect
δj = column effect εijkl = experimental error
ζk = treatment effect
AOV Table
Source of variation   df
Rows                  t−1
Columns               t−1
Treatments            t−1
Locations             s−1
Error                 (st²−1) − [3(t−1)+(s−1)]
Total                 st²−1

Case II: Rows are the same, but columns are different at different locations.
Model: yijkl = μ+ρi+δjl+ζk+Ϩl+εijkl, where δjl = jth column at l location.
AOV Table
Source of variation df
Rows t-1
Treatments t-1
Locations s-1
Columns within location s(t-1)
Error (st2-1) - [2(t-1)+(s-1)+s(t-1)]
Total st2-1

Case III: Both rows and columns are different at different locations.
Model: yijkl = μ+ρil+δjl+ζk+Ϩl+εijkl, where ρil = ith row at l location
δjl = jth column at l location
AOV Table
Source of variation   df
Rows/location         s(t−1)
Columns/location      s(t−1)
Treatments            t−1
Locations             s−1
Error                 (st²−1) − [2s(t−1)+(t−1)+(s−1)]
Total                 st²−1

Types of repeated Latin squares


a) Change-over or switch-over or cross-over design.
This is a design which combines the features of a LS design and that of RCBD.
Normally, it has been used to study two to four treatments. Example, consider two
treatments A and B, supplemental and no-supplemental feeding, respectively,
administered to 6 cows. Each cow receives treatments A & B in periods of 1 & 2. The
treatments are allotted to the periods at random with the restriction that half of the
cows receive treatment A and the other half receive treatment B in the period 1. The
cows received treatment A in period 1 will receive treatment B in period 2. Similarly,
cows received treatment B in period 1 will receive treatment A in period 2.

The design layout is given as below:


Columns (cows)
Rows I II III IV V VI
Period 1 B B A A B A
Period 2 A A B B A B

AOV Table as cross-over design
Source of var. df
Columns 5 (r-1)
Rows 1 (t-1)
Treatments 1 (t-1)
Error 4 (t-1)(r-2)
Total tr-1

If the experiment is conducted as a 2x2 repeated LS, the layout would be as follow:
SQ1 SQ2 SQ3
Columns 1 2 1 2 1 2
1 A B A B B A
Rows 2 B A B A A B

The repeated LS is like case III where rows and columns are different at different
locations. Thus, the AOV Table of this LSD is that of case III.
AOV Table as LSD
Source of variation   df
Squares               2 = s−1
Periods/square        3 = s(t−1)
Columns/square        3 = s(t−1)
Treatments            1 = t−1
Error                 2 = (st²−1) − [2s(t−1)+(t−1)+(s−1)]
Total                 11 = st²−1

Note: The change-over design gives more degrees of freedom for error than the repeated
LS, so using the change-over design is preferable to using the repeated LSD.

The cross-over design may be used for any number of treatments, with the restriction
that the number of replications be a multiple of the number of treatments. But it is not
advisable to use the repeated LSD instead of the cross-over design if the number of
treatments exceeds 4.

The cross-over or switch-over design is commonly used in most animal science studies,
whereby different treatments are tested one after another on the same animals. In animal
experiments, a large source of error variation is the variation from animal to animal.
One way of overcoming this problem is to apply the treatments to the same animal, which
involves changing the treatments from time to time. Therefore, it is necessary to use a
change-over, switch-over or cross-over design.

Many characteristics under study change with time. For instance, milk yield varies
with the stage of lactation. If all the animals are given the same sequence of
treatments, say A, B, C, in three periods in that order, then treatment C will be
underestimated because it will be given to the cows at the declining stage of their
lactation. To overcome this problem, it is necessary to apply the treatments in
different sequences.
Sequences
I II III
P1 A B C
Periods P2 B C A
P3 C A B

How about persistency?


If our animals vary in persistency, that is, if the rate at which milk yield changes
over time varies from animal to animal, then some treatments will be overestimated and
others underestimated. One way of overcoming this difficulty is to have blocks of
animals homogeneous in persistency and to allot each animal to a different sequence. If
this is not possible, there is a design that takes care of this problem, known as the
switch-back or double reversal design. The advantages of the double change-over design are:
1. It allows estimating both the residual and the direct effects of treatments
2. A high degree of precision
3. Suitable for a small number of treatments

Carry-over effect
In the use of the switch-over design there is another problem, called the carry-over
effect. What we call the treatment effect in a switch-over trial for a given period may
not be due solely to the treatment applied in that period; it may also be due to a
carry-over of the treatment applied in the period before.

If one is interested only in the direct effect of a particular period, it is possible to
allow a sufficiently long rest period (the carry-over period) for the effect of the
previous treatment to disappear. During this period the animals are treated equally, or
can be switched to the next treatment with the yield discarded. The rest period should be
long enough that the residual effect of the previous period disappears. Make sure that
the carry-over effect and the direct effect are not confused (confounded).

In many cases, such as lactation or growth, the periods themselves are of limited
duration; on the other hand, the treatments must be given sufficient time to express
their effects. In other words, it may not be possible to allow a rest period long enough
to remove, or at least appreciably reduce, the effect of the previous treatment, and one
may not be able to arrange the experiment in such a way that both the carry-over and the
direct effects can be estimated. Hence, note also that switch-back or switch-over designs
do not work for many treatments, because they would require periods too long for the
available time.

Design for estimating carry-over effect = Balanced Latin square design


In the estimation of carry-over effects, we must make the following assumptions:
1. The direct and the carry-over effects don’t interact, and
2. The carry-over effect lasts for one succeeding period.

The analysis of the cross-over and the direct effects becomes much simpler if each
treatment is preceded by every other treatment an equal number of times. This
makes the design balanced for residual effects.

Example, for a 4x4 LS:
A B C D
D A B C
B C D A
C D A B
Here each treatment is preceded once by every other treatment. When the treatments are
many, a design in which each treatment is preceded by every other treatment more than
once but an equal number of times is likewise called balanced for residual effects.

b) Orthogonal Latin Square


Two Latin squares are said to be orthogonal if, when one is superimposed on the other,
each letter of one square occurs once and only once with every letter of the other square.
Example, for 4x4 LS:
A B C D A B C D A B C D
B A D C D C B A C D A B
C D A B B A D C D C B A
D C B A C D A B B A D C
I II III
These are orthogonal Latin square.

If we employ orthogonal Latin squares, each treatment will be preceded by the others an
equal number of times not only in the immediately preceding period, but also one or two
periods before. Balancing is thereby achieved for residual effects of second and higher
order.

c) Graeco Latin Square
Assume that we have a LS of a given dimension with Greek letters superimposed on the
Latin letters such that every Greek letter occurs once and only once with every Latin
letter.
Aα Bβ Cδ Dζ
Dβ Cα Bζ Aδ
Bδ Aζ Dα Cβ
Cζ Dδ Aβ Bα
For instance, 4 poultry feeds and 4 feed additives are tested on 4 different age groups
and 4 different breeds of chickens. Latin letters can represent main poultry feeds,
while Greek letters represent feed additives.

AOV Table for GLS


Source of var. df
Rows t-1
Columns t-1
Treatments A t-1
Treatments B t-1
Error (t-1)(t-3)
Total t2-1
The disadvantage of the GLS is inadequate degrees of freedom for error, especially when
the number of treatments (t) is < 6.

6.2 Incomplete Designs (please refer Gomez and Gomez from page 39 to page 83)
A) Lattice Design
1. Balanced Lattice Square Design
2. Partially Balanced Lattice Square Design
 Simple Lattice Square Design
 Triple Lattice Square Design
 Quadruple Lattice Square Design
B) Group Balanced Square Design

7. Factorial Experiments
This is the condition of experimentation in which there are several factors with many levels
and all treatment combinations are observed. A group of treatments which contain two or
more levels of two or more factors or substances in all combinations is known as factorial
experiment. Example, a fertilizer study on N, P, K, S & Zn, each with 5 levels:
N1, N2, N3, N4, N5
P1, P2, P3, P4, P5
K1, K2, K3, K4, K5
S1, S2, S3, S4, S5
Zn1, Zn2, Zn3, Zn4, Zn5
Then we will have 5⁵ = 5x5x5x5x5 = 3125 treatment combinations. Here, we say that
we have 5 factors, each factor at 5 levels.

Some common examples where factorial experiments are used:


1. Effect of light, temperature and moisture on seed germination
2. Spacing and rate of planting on yield of potatoes
3. Levels of ingredients and baking temperatures on bread quality
4. Levels of protein and carbohydrate in the feeding trials
5. Levels of herbicides, fungicides and varieties on growth of plants.
Advantages of factorial experimentation
1. All experimental material is utilized in evaluating every effect. The use of all
experimental units in evaluating an effect increases the efficiency of the experiment;
as a result, it makes the most efficient use of resources.
2. Effects are evaluated over a wide range of experimental conditions with a minimum
outlay of resources.
3. It is possible to estimate interactions.
4. Unbiased estimates of the factor effects are obtained whether or not time trends exist.
5. The factorial treatment combination is optimal for estimating effects and interactions.

Disadvantages of factorial experimentation


1. A sometimes prohibitive number of treatments to work with
2. With a large number of treatments sometimes comes a decrease in efficiency

Effects in factorial experiment are composed of main effects and interactions.


Notation in factorial experimentation: Small letters denote factors and numbers denote
levels; capital letters denote the effects of the respective factors.
Example, a0 – 0th level of factor a
         a1 – 1st level of factor a
         a2 – 2nd level of factor a
         …
         ak – kth level of factor a
Assume we have two factors a & b, each at 3 levels. We will have the following treatment
combinations:

      a0     a1     a2
b0   a0b0   a1b0   a2b0
b1   a0b1   a1b1   a2b1
b2   a0b2   a1b2   a2b2

Choice of levels in factorial experiments


The choice of levels in a factorial experiment depends upon the nature of the experimental
yield and the objective of the experiment. The form of the response curve and the portion
of the curve to be studied determine the number and location of the levels.

[Figure: two possible response curves, A and B]

If the range of the levels is known and we are interested in the nature of the response
curve, we should take as many levels as possible. The levels may be equally spaced or not;
to ease the computation of the linear regression curve, it is advisable to use levels
which are equally spaced. If the range over which the factor is effective is not known,
but the lower or upper limit is known, use the formula c ± nk to determine the levels,
where c is the lower or upper value, n indexes the levels (0, 1, 2, 3, etc.) and k is the
interval, which is usually constant.

Factorial experiments of the pⁿ series

Let n be the number of factors and p the number of levels; then we will have pⁿ treatment
combinations. This holds when all factors have an equal number of levels. Let's take
factors a & b, each at 2 levels: this is known as a 2x2 factorial experiment.

      a0     a1                      a0    a1
b0   a0b0   a1b0     means     b0   μ11   μ21
b1   a0b1   a1b1               b1   μ12   μ22

Model: yijl = μij + εijl, where: i = 1,2; j = 1,2; l = 1,2,3,…,n

Some definitions in factorial experiments


1. Simple effect – the simple effect of a is the difference between the levels of a for each
level of b. That is, μ21 - μ11, μ22 - μ12.
Similarly, the simple effect of b is the difference between the levels of b for each level of
a. That is, μ12 - μ11, μ22 - μ21
2. Main effect – Main effect of A is the average of simple effects of A, while the main
effect of B is the average of simple effects of B. Note that main effect is given by capital
letter.
A = [(μ21 - μ11) + (μ22 - μ12)]/2
= [(μ21 + μ22) - (μ11 + μ12)]/2

= μ2.- μ1., the inner subscript is the same, but the outer is different.
B = [(μ12 - μ11) + (μ22 - μ21)]/2
= [(μ12 + μ22) - (μ11 + μ21)]/2
= μ.2 - μ.1
3. Interaction effect – interaction effect is defined as the failure of one factor, in the above
case a, to retain the same order and magnitude of performance at all levels of the other
factor, in the above case b. In other words, if the difference between the two levels of
factor a varies at each level of b, then we say the factors a & b interact. The interaction of
factors a & b is designated as AB. Symbolically, interaction of A & B is defined as the
average difference between the simple effects of A or B.

For example, take the above 2x2 factorial experiment. The interaction (AB) is estimated
as follow:
AB = [(μ21 - μ11) - (μ22 - μ12)]/2
= [μ21 + μ12- μ11 - μ22]/2
Note: the interaction effect calculated in line with A is the same as in line with B
AB = [(μ12 - μ11) - (μ22 - μ21)]/2
= [μ12 + μ21- μ11 - μ22]/2

For example, computing the different effects of a 2x2 factorial experiment:


      a0   a1                      a0    a1
b0    4    2       means     b0   μ11   μ21
      6    4
b1    4    7                 b1   μ12   μ22
      8    5

Estimates of the means:
      a0   a1
b0    5    3
b1    6    6

1. Simple effects of A: μ21 − μ11 = 3 − 5 = −2;  μ22 − μ12 = 6 − 6 = 0
2. Simple effects of B: μ12 − μ11 = 6 − 5 = 1;  μ22 − μ21 = 6 − 3 = 3
3. Main effect of A: [(μ21 − μ11) + (μ22 − μ12)]/2 = (−2 + 0)/2 = −1
4. Main effect of B: [(μ12 − μ11) + (μ22 − μ21)]/2 = (1 + 3)/2 = 2
5. Interaction AB: [(μ21 − μ11) − (μ22 − μ12)]/2 = (−2 − 0)/2 = −1, in line with A
                   [(μ12 − μ11) − (μ22 − μ21)]/2 = (1 − 3)/2 = −1, in line with B
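A minimal sketch (not from the notes) of these definitions, applied to the 2x2 cell means just computed:

```python
# A short sketch (an assumption, not course code) computing the simple,
# main and interaction effects for the 2x2 example above from cell means.
import numpy as np

# cell means: rows = levels of b (b0, b1), columns = levels of a (a0, a1)
m = np.array([[5.0, 3.0],
              [6.0, 6.0]])

simple_a = m[:, 1] - m[:, 0]                 # [-2, 0]: effect of a at b0, b1
simple_b = m[1, :] - m[0, :]                 # [1, 3]:  effect of b at a0, a1
main_a = simple_a.mean()                     # -1
main_b = simple_b.mean()                     # 2
inter_ab = (simple_a[0] - simple_a[1]) / 2   # -1, same result via simple_b
print(main_a, main_b, inter_ab)
```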
Test of hypotheses in FE: several hypotheses can be tested in factorial experiments. For
instance, in a 2x2 FE, the hypotheses of interest would be the followings:
1. H0: A = 0 or H0: μ2. - μ1.= 0
2. H0: B = 0 or H0: μ.2 – μ.1= 0
3. H0: AB = 0, or no interaction between A & B

Computation of sums of squares in FE

There are two ways of computing the sums of squares: (a) using means, or (b) using variances.
(a) Computing SS using means, taking the above 2x2 data (n observations per cell):
1. SSA = nA², where A = 1/2[(μ21 − μ11) + (μ22 − μ12)] = 1/2[μ21 + μ22 − μ11 − μ12]
   ≈ 1/2[ȳ21. + ȳ22. − ȳ11. − ȳ12.]
2. SSB = nB², where B = 1/2[(μ12 − μ11) + (μ22 − μ21)] = 1/2[μ12 + μ22 − μ11 − μ21]
   ≈ 1/2[ȳ12. + ȳ22. − ȳ11. − ȳ21.]
3. SSAB = nAB², where AB = 1/2[(μ21 − μ11) − (μ22 − μ12)] = 1/2[μ21 + μ12 − μ11 − μ22]
   ≈ 1/2[ȳ21. + ȳ12. − ȳ11. − ȳ22.]
Computation of F
1. To test H0: A = 0, compute F = (SSA/1)/σ̂² ~ Fα[1, 4(n−1)]
2. To test H0: B = 0, compute F = (SSB/1)/σ̂² ~ Fα[1, 4(n−1)]
3. To test H0: AB = 0, compute F = (SSAB/1)/σ̂² ~ Fα[1, 4(n−1)]
7.1 FE in Complete Designs
Note: The degrees of freedom for error depend on the type of design used. Here, the
ANOVA is computed under a CRD.
The AOV Table under CRD using means:
Source   df            SS
A        1             nA²
B        1             nB²
AB       1             nAB²
Error    4(n−1) = 4    by subtraction (SStotal − SSA − SSB − SSAB)
Total    4n−1 = 7      ΣΣyij² − c.f
(b) Computing SS using variances, taking the above 2x2 FE data:
1. SStotal = Σyijl² − (y…)²/ijl, where y… = ΣΣΣyijl and (y…)²/ijl is known as the c.f
   = (4² + 6² + 2² + 4² + 4² + 8² + 7² + 5²) − 40²/(2x2x2)
   = 226 − 200 (c.f)
   = 26
2. SSA = Σ(yi..)²/jl − c.f = (22² + 18²)/4 − 200 = 202 − 200 = 2.0;  MSA = SSA/dfA = 2.0/1 = 2.0
3. SSB = Σ(y.j.)²/il − c.f = (16² + 24²)/4 − 200 = 208 − 200 = 8.0;  MSB = SSB/dfB = 8.0/1 = 8.0
4. SSAB = Σ(yij.)²/l − c.f − SSA − SSB = (10² + 12² + 6² + 12²)/2 − 200 − 2.0 − 8.0 = 2.0;
   MSAB = 2.0/1 = 2.0
5. SSerror = SStotal − SSA − SSB − SSAB = 26 − 2.0 − 8.0 − 2.0 = 14;
   MSerror = SSerror/[ij(l−1)] = 14/4 = 3.5
Computation of F using variances
1. To test H0: A = 0, compute F = MSA/MSerror ~ Fα[1, 4(n−1)]
2. To test H0: B = 0, compute F = MSB/MSerror ~ Fα[1, 4(n−1)]
3. To test H0: AB = 0, compute F = MSAB/MSerror ~ Fα[1, 4(n−1)]
AOV Table under CRD using variances:
Source   df   SS    MS    Fcal   F0.05   SD
A        1    2.0   2.0   0.57   7.71    NS
B        1    8.0   8.0   2.28   7.71    NS
AB       1    2.0   2.0   0.57   7.71    NS
Error    4    14    3.5
Total    7    26
Under RCBD, the data must first be arranged as follows:
A     B     BI    BII
a0    b0    4     6
      b1    2     4
a1    b0    4     7
      b1    8     5
Block total 18    22

The SS of blocks is computed as (18² + 22²)/4 − c.f = 202 − 200 = 2.0

AOV Table of the 2x2 FE under RCBD using variances:
Source   df   SS     MS    Fcal   F0.05   SD
Block    1    2.0    2.0   0.50   10.13   NS
A        1    2.0    2.0   0.50   10.13   NS
B        1    8.0    8.0   2.00   10.13   NS
AB       1    2.0    2.0   0.50   10.13   NS
Error    3    12.0   4.0
Total    7    26

Suppose that we use a Latin square design; then the layout of the 2x2 FE would be as
follows (plot numbers in brackets):
a0b0 (1)    a1b0 (2)    a0b1 (3)    a1b1 (4)    | Row total R1
a1b1 (5)    a0b0 (6)    a1b0 (7)    a0b1 (8)    | Row total R2
a0b1 (9)    a1b1 (10)   a0b0 (11)   a1b0 (12)   | Row total R3
a1b0 (13)   a0b1 (14)   a1b1 (15)   a0b0 (16)   | Row total R4
Column totals: C1 C2 C3 C4; grand total G
Treatment totals: T1 = a0b0 total, T2 = a0b1 total, T3 = a1b0 total, T4 = a1b1 total
(used for the interaction SS)
Factor A & B totals:
       a0        a1        | B-total
b0     T1        T3        | b0total
b1     T2        T4        | b1total
A-total: a0total  a1total  | G

AOV Table of the 2x2 FE under LS design using variances:

Source   df   SS
Row      3    1/4[Σ(R1²+R2²+R3²+R4²)] − c.f
Column   3    1/4[Σ(C1²+C2²+C3²+C4²)] − c.f
A        1    1/8[Σ(a0total²+a1total²)] − c.f
B        1    1/8[Σ(b0total²+b1total²)] − c.f
AB       1    1/4[Σ(T1²+T2²+T3²+T4²)] − c.f − SSA − SSB
Error    6    by subtraction (SStotal − SSrow − SScolumn − SSA − SSB − SSAB)
Total    15   Σ[(1)²+(2)²+(3)²+…+(16)²] − c.f, where c.f = G²/16

Note: In factorial experiments under RCBD or LS design, it is not necessary to look at the
interaction between treatments or factors and blocks.

Means with the plus-minus technique in FE
Any factor at the 0 level is at its lowest, and the combination with all factors at level
0 is designated by (1), while a factor at its next level is designated by the small
letter of the factor itself. With this conversion, a0b0 = (1), a0b1 = b, a1b0 = a, and
a1b1 = ab.

With this conversion, the plus-minus table for a 2x2 factorial experiment is presented
as follows:
       (1)   a    b    ab   | Divisor (coefficient)
Mean   +     +    +    +    | 4
A      –     +    –    +    | 2
B      –     –    +    +    | 2
AB     +     –    –    +    | 2

The plus-minus table helps to determine the effects:

M  = 1/4[(1) + a + b + ab], where M = mean effect
A  = 1/2[−(1) + a − b + ab], A = effect of A
B  = 1/2[−(1) − a + b + ab], B = effect of B
AB = 1/2[(1) − a − b + ab], AB = interaction effect
2³ factorial experiments
Here, 3 refers to the number of factors, while 2 refers to the number of levels of each
factor. Under this FE, there are 8 treatment combinations (2x2x2 = 8). Let the factors
be a, b & c, and the levels 0 & 1. Then the factors and levels are combined as follows:

            a0                   a1
      c0        c1         c0        c1
b0  a0b0c0    a0b0c1     a1b0c0    a1b0c1
b1  a0b1c0    a0b1c1     a1b1c0    a1b1c1

In converted form:
           a0        a1
      c0   c1    c0   c1
b0   (1)   c     a    ac
b1    b    bc    ab   abc

Mean table:
           a0              a1
      c0      c1      c0      c1
b0   μ111    μ112    μ211    μ212
b1   μ121    μ122    μ221    μ222
1. Simple effect of A: i) μ211 -μ111, ii) μ221 -μ121, iii) μ212-μ112, iv) μ222 - μ122
2. Simple effect of B: i) μ121 -μ111, ii) μ122 -μ112, iii) μ221-μ211, iv) μ222 - μ212
3. Simple effect of C: i) μ112 -μ111, ii) μ122 -μ121, iii) μ212-μ211, iv) μ222 - μ221
4. Main effect of A: = the average of the simple effects of A
= 1/4[(μ211 -μ111)+(μ221 -μ121)+(μ212-μ112)+(μ222 - μ122)]
= 1/4(μ211 + μ221 + μ212+ μ222 -μ111-μ121-μ112- μ122)
= μ2.. - μ1..
5. Similarly, main effect of B: = μ.2. – μ.1.

6. Main effect of C: = μ..2 – μ..1
7. Simple effect of AB: = the interaction between A & B at each level of C
AB1 = 1/2[(μ211 -μ111) - (μ221 -μ121)]
= 1/2(μ211+μ121-μ111-μ221)
AB2 = 1/2[(μ212-μ112) - (μ222 - μ122)]
= 1/2(μ212+μ122-μ112-μ222)
8. Interaction of AB: = the average of the simple effects of AB
= 1/2(AB1+AB2)
9. Simple effect of AC: = the interaction between A & C at each level of B
AC1 = 1/2[(μ112 − μ111) − (μ212 − μ211)]
AC2 = 1/2[(μ122 − μ121) − (μ222 − μ221)]
10. The AC interaction: = 1/2(AC1+AC2)
11. Simple effect of BC: = the interaction between B & C at each level of A
BC1 = 1/2[(μ121 -μ111) – (μ122 -μ112)]
BC2 = 1/2[(μ221 -μ211) – (μ222 - μ212)]
12. The BC interaction: = 1/2(BC1+BC2)
13. The ABC interaction is the average difference between the simple effects of AB or AC or
BC: ABC = 1/4[(μ211+μ121+μ112+μ222) – (μ111+μ221+μ212+μ122)]

The plus-minus table for the 2³ FE:

        (1)  a   b   ab  c   ac  bc  abc | divisor
Mean    +    +   +   +   +   +   +   +   | 8
A       –    +   –   +   –   +   –   +   | 4
B       –    –   +   +   –   –   +   +   | 4
AB      +    –   –   +   +   –   –   +   | 4
C       –    –   –   –   +   +   +   +   | 4
AC      +    –   +   –   –   +   –   +   | 4
BC      +    +   –   –   –   –   +   +   | 4
ABC     –    +   +   –   +   –   –   +   | 4

Numerical example for a 2³ FE, considering the following artificial data (n = 3
observations per treatment combination):

Treatment combinations and observations:
              a0                          a1
       c0           c1             c0          c1
b0   (1): 7, 4, 1   c: 4, 3, 2     a: 5, 5, 5  ac: 4, 1, 1   | Total b0 = 42
b1    b: 3, 3, 1   bc: 0, 6, 6    ab: 9, 4, 8 abc: 3, 8, 7   | Total b1 = 57
Totals for a: a0 = 39, a1 = 60;  totals for c: c0 = 54, c1 = 45

Population means μ111, …, μ222 are arranged as in the mean table above; the observation
mean estimates (cell totals in brackets) are:
            a0               a1
      c0       c1       c0       c1
b0   4 (12)   3 (9)    5 (15)   2 (6)
b1   2 (6)    4 (12)   7 (21)   6 (18)

Computation of effects
Simple effects:
A: μ211 − μ111 = 5 − 4 = 1;  μ221 − μ121 = 7 − 2 = 5;  μ212 − μ112 = 2 − 3 = −1;  μ222 − μ122 = 6 − 4 = 2
B: μ121 − μ111 = 2 − 4 = −2;  μ122 − μ112 = 4 − 3 = 1;  μ221 − μ211 = 7 − 5 = 2;  μ222 − μ212 = 6 − 2 = 4
C: μ112 − μ111 = 3 − 4 = −1;  μ122 − μ121 = 4 − 2 = 2;  μ212 − μ211 = 2 − 5 = −3;  μ222 − μ221 = 6 − 7 = −1
Main effects:
A = 1/4[(5−4) + (7−2) + (2−3) + (6−4)] = 1/4(1 + 5 − 1 + 2) = 1.75
B = 1/4[(2−4) + (4−3) + (7−5) + (6−2)] = 1.25
C = 1/4[(3−4) + (4−2) + (2−5) + (6−7)] = −0.75
Interaction effects:
AB = 1/4[(μ111+μ221+μ112+μ222) − (μ211+μ121+μ212+μ122)] = 1/4[(4+7+3+6) − (5+2+2+4)] = 1/4(20−13) = 1.75
AC = 1/4[(μ111+μ121+μ212+μ222) − (μ211+μ221+μ112+μ122)] = 1/4[(4+2+2+6) − (5+7+3+4)] = 1/4(14−19) = −1.25
BC = 1/4[(μ111+μ211+μ122+μ222) − (μ121+μ221+μ112+μ212)] = 1/4[(4+5+4+6) − (2+7+3+2)] = 1/4(19−14) = 1.25
ABC = 1/4[(μ211+μ121+μ112+μ222) − (μ111+μ221+μ212+μ122)] = 1/4[(5+2+3+6) − (4+7+2+4)] = 1/4(16−17) = −0.25

Sum of squares using means (SSeffect = 2nE², with n = 3):

SSA = 2nA² = 6(1.75)² = 18.375      SSAC = 2nAC² = 6(−1.25)² = 9.375
SSB = 2nB² = 6(1.25)² = 9.375       SSBC = 2nBC² = 6(1.25)² = 9.375
SSC = 2nC² = 6(−0.75)² = 3.375      SSABC = 2nABC² = 6(−0.25)² = 0.375
SSAB = 2nAB² = 6(1.75)² = 18.375

Sum of squares using variances:
SSA = (39² + 60²)/12 − 99²/24 = 426.75 − 408.375 = 18.375
SSB = (42² + 57²)/12 − 99²/24 = 417.75 − 408.375 = 9.375
SSC = (54² + 45²)/12 − 99²/24 = 411.75 − 408.375 = 3.375
SSAB = (21² + 21² + 18² + 39²)/6 − 99²/24 − SSA − SSB = 454.5 − 408.375 − 18.375 − 9.375 = 18.375
SSAC = (18² + 21² + 36² + 24²)/6 − 99²/24 − SSA − SSC = 439.5 − 408.375 − 18.375 − 3.375 = 9.375
SSBC = (27² + 15² + 27² + 30²)/6 − 99²/24 − SSB − SSC = 430.5 − 408.375 − 9.375 − 3.375 = 9.375
SSABC = (12² + 9² + 15² + 6² + 6² + 12² + 21² + 18²)/3 − 99²/24 − SSA − SSB − SSC − SSAB − SSAC − SSBC
      = 477 − 408.375 − 18.375 − 9.375 − 3.375 − 18.375 − 9.375 − 9.375 = 0.375
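The totals-based computations above can be checked with a few lines of code; the following sketch (an assumption, not course code) reproduces the main-effect and first-order-interaction sums of squares from the marginal totals:

```python
# A cross-check (hypothetical) of the 2^3 sums of squares above, built from
# the marginal totals, with G = 99 and N = 24 observations in total.
G, N = 99, 24
cf = G**2 / N                                    # 408.375

def ss(totals, reps):
    """SS of a classification from its level totals and obs per total."""
    return sum(x**2 for x in totals) / reps - cf

ss_a = ss([39, 60], 12)                          # 18.375
ss_b = ss([42, 57], 12)                          # 9.375
ss_c = ss([54, 45], 12)                          # 3.375
ss_ab = ss([21, 21, 18, 39], 6) - ss_a - ss_b    # 18.375
ss_ac = ss([18, 21, 36, 24], 6) - ss_a - ss_c    # 9.375
ss_bc = ss([27, 15, 27, 30], 6) - ss_b - ss_c    # 9.375
print(ss_a, ss_b, ss_c, ss_ab, ss_ac, ss_bc)
```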

2⁴ factorial experiments
This refers to experiments having 4 factors at 2 levels each.
Determination of factorial effects using the algebraic technique
In order to use the algebraic technique, we set the first level of a factor to 1 and the
2nd level of a factor to the factor's own small letter; that is, for instance, a0 = 1 and
a1 = a. The factors are a, b, c & d. Each effect carries the multiplier 1/2ⁿ⁻¹, where n
is the number of factors (here 1/8).

            a0                        a1
      c0         c1             c0         c1
   d0    d1    d0    d1      d0    d1    d0    d1
b0 (1)   d     c     cd      a     ad    ac    acd
b1  b    bd    bc    bcd     ab    abd   abc   abcd

The main effects:

A = 1/8[(a−1)(b+1)(c+1)(d+1)]
B = 1/8[(a+1)(b−1)(c+1)(d+1)]
C = 1/8[(a+1)(b+1)(c−1)(d+1)]
D = 1/8[(a+1)(b+1)(c+1)(d−1)]
Interactions:
AB = 1/8[(a−1)(b−1)(c+1)(d+1)]
AC = 1/8[(a−1)(b+1)(c−1)(d+1)]
AD = 1/8[(a−1)(b+1)(c+1)(d−1)]
BC = 1/8[(a+1)(b−1)(c−1)(d+1)]    } 1st order interactions
BD = 1/8[(a+1)(b−1)(c+1)(d−1)]
CD = 1/8[(a+1)(b+1)(c−1)(d−1)]
ABC = 1/8[(a−1)(b−1)(c−1)(d+1)]
ABD = 1/8[(a−1)(b−1)(c+1)(d−1)]   } 2nd order interactions
ACD = 1/8[(a−1)(b+1)(c−1)(d−1)]
BCD = 1/8[(a+1)(b−1)(c−1)(d−1)]
ABCD = 1/8[(a−1)(b−1)(c−1)(d−1)]  } 3rd order interaction
Plus-minus table for the 2⁴ FE:

Effects  (1) d  c  cd a  ad ac acd b  bd bc bcd ab abd abc abcd
A        -   -  -  -  +  +  +  +   -  -  -  -   +  +   +   +
B        -   -  -  -  -  -  -  -   +  +  +  +   +  +   +   +
C        -   -  +  +  -  -  +  +   -  -  +  +   -  -   +   +
D        -   +  -  +  -  +  -  +   -  +  -  +   -  +   -   +
AB       +   +  +  +  -  -  -  -   -  -  -  -   +  +   +   +
AC       +   +  -  -  -  -  +  +   +  +  -  -   -  -   +   +
AD       +   -  +  -  -  +  -  +   +  -  +  -   -  +   -   +
BC       +   +  -  -  +  +  -  -   -  -  +  +   -  -   +   +
BD       +   -  +  -  +  -  +  -   -  +  -  +   -  +   -   +
CD       +   -  -  +  +  -  -  +   +  -  -  +   +  -   -   +
ABC      -   -  +  +  +  +  -  -   +  +  -  -   -  -   +   +
ABD      -   +  -  +  +  -  +  -   +  -  +  -   -  +   -   +
ACD      -   +  +  -  +  -  -  +   -  +  +  -   +  -   -   +
BCD      -   +  +  -  -  +  +  -   +  -  -  +   +  -   -   +
ABCD     +   -  -  +  -  +  +  -   -  +  +  -   +  -   -   +

Assignment: Use the algebraic technique and estimate various effects of a 23 experiment.

General 2ⁿ factorial experiments

Here, we have n factors, each at 2 levels.
Determination of the numbers of main effects and interactions:
1. Main effects = C(n,1) = n!/[1!(n−1)!]
2. Two-factor interactions = C(n,2) = n!/[2!(n−2)!]
3. Three-factor interactions = C(n,3) = n!/[3!(n−3)!]
   …
n. n-factor interactions = C(n,n) = n!/[n!(n−n)!]

Example: using these formulas, compute the numbers of main effects and interactions of a
2⁴ experiment.
1. Main effects = 4!/[1!(4−1)!] = 4!/(1!3!) = 4
2. Two-factor interactions = 4!/[2!(4−2)!] = 4!/(2!2!) = 6
3. Three-factor interactions = 4!/[3!(4−3)!] = 4!/(3!1!) = 4
4. Four-factor interactions = 4!/[4!(4−4)!] = 4!/4! = 1
In total, 15 factorial effects.

In general, we have 2ⁿ − 1 contrasts, which explain the factorial effects in totality.
For instance, in the above example we have 2⁴ − 1 = 15 factorial effects (2x2x2x2 − 1).
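These counts are binomial coefficients, so they can be checked directly; a one-line sketch (assumed, not in the notes):

```python
# A quick check (an assumption) of the effect counts above for a 2^4
# experiment: 4 + 6 + 4 + 1 = 15 = 2**4 - 1 factorial effects.
from math import comb

n = 4
counts = [comb(n, k) for k in range(1, n + 1)]   # [4, 6, 4, 1]
print(counts, sum(counts), 2**n - 1)
```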

The odd-even technique for 2ⁿ factorial experiments

General rule: Every factorial effect has half positive and half negative signs.
1. Main effects: For the main effect of A, B or C, any treatment combination containing
"a", "b" or "c", respectively, receives a positive sign, while the rest receive a
negative sign.
2. Interactions:
a. Even-order interactions – A treatment combination having zero or an even number of
letters in common with the effect letters receives a positive sign, while the rest
receive a negative sign.
b. Odd-order interactions – A treatment combination having an odd number of letters in
common with the interaction letters receives a plus sign, while the rest receive a
negative sign.
Example: take a 2³ experiment and assign plus or minus signs using the odd-even technique.
       (1)  a   b   ab  c   ac  bc  abc
A      –    +   –   +   –   +   –   +
B      –    –   +   +   –   –   +   +
AB     +    –   –   +   +   –   –   +
C      –    –   –   –   +   +   +   +
AC     +    –   +   –   –   +   –   +
BC     +    +   –   –   –   –   +   +
ABC    –    +   +   –   +   –   –   +
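The odd-even rule is simple to implement; the following sketch (an assumed helper) regenerates the 2³ sign table above from the parity rule (a combination gets "+" exactly when the number of letters it shares with the effect has the same parity as the effect's own length):

```python
# A minimal implementation (an assumption, not course code) of the
# odd-even rule above for assigning signs in a 2^n plus-minus table.
def sign(effect, combo):
    combo = "" if combo == "(1)" else combo
    shared = len(set(effect.lower()) & set(combo))
    # '+' when shared-letter count has the same parity as the effect length
    return "+" if shared % 2 == len(effect) % 2 else "-"

combos = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
for effect in ["A", "B", "AB", "C", "AC", "BC", "ABC"]:
    print(f"{effect:>3}: " + " ".join(sign(effect, c) for c in combos))
```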

A 3² factorial experiment
This means an experiment consisting of 2 factors, each at 3 levels. Let's assume that we
have two factors A and B, with levels designated as follows: Factor A (a0, a1, a2) and
Factor B (b0, b1, b2). The treatment combinations of this FE are as follows:

      a0     a1     a2                       a0    a1    a2
b0   a0b0   a1b0   a2b0      Means     b0   μ00   μ10   μ20
b1   a0b1   a1b1   a2b1                b1   μ01   μ11   μ21
b2   a0b2   a1b2   a2b2                b2   μ02   μ12   μ22

Before we attempt to measure the factorial effects, we will assume that the levels are
equally spaced and partition A into linear and quadratic contrasts, and similarly
partition B into linear and quadratic contrasts.
                      Factor A (a0, a1, a2)    Factor B (b0, b1, b2)
Linear contrast       −1    0    1             −1    0    1
Quadratic contrast     1   −2    1              1   −2    1

In other words, these are the coefficients of orthogonal polynomials.

Determination of effects and interactions in a 3² FE

1. Simple effect of A(linear)
The simple effect of A(linear) is the linear contrast of the means of A at each level of B.
At b0 = (-1)(μ00) + (0)(μ10) + (1)(μ20) = μ20 - μ00
   b1 = (-1)(μ01) + (0)(μ11) + (1)(μ21) = μ21 - μ01
   b2 = (-1)(μ02) + (0)(μ12) + (1)(μ22) = μ22 - μ02
2. Simple effect of A(quadratic)
This is obtained by taking the quadratic contrast of the means of A at each level of B.
At b0 = (1)(μ00) + (-2)(μ10) + (1)(μ20) = μ00 - 2μ10 + μ20
   b1 = (1)(μ01) + (-2)(μ11) + (1)(μ21) = μ01 - 2μ11 + μ21
   b2 = (1)(μ02) + (-2)(μ12) + (1)(μ22) = μ02 - 2μ12 + μ22
3. Simple effect of B(linear)
At a0 = (-1)(μ00) + (0)(μ01) + (1)(μ02) = μ02 - μ00
   a1 = (-1)(μ10) + (0)(μ11) + (1)(μ12) = μ12 - μ10
   a2 = (-1)(μ20) + (0)(μ21) + (1)(μ22) = μ22 - μ20
4. Simple effect of B(quadratic)
At a0 = (1)(μ00) + (-2)(μ01) + (1)(μ02) = μ00 - 2μ01 + μ02
   a1 = (1)(μ10) + (-2)(μ11) + (1)(μ12) = μ10 - 2μ11 + μ12
   a2 = (1)(μ20) + (-2)(μ21) + (1)(μ22) = μ20 - 2μ21 + μ22
5. Main effects (averages of the corresponding simple effects)
a) A(L) = 1/3[(μ20 - μ00) + (μ21 - μ01) + (μ22 - μ02)]
b) A(Q) = 1/6[(μ00 - 2μ10 + μ20) + (μ01 - 2μ11 + μ21) + (μ02 - 2μ12 + μ22)]
c) B(L) = 1/3[(μ02 - μ00) + (μ12 - μ10) + (μ22 - μ20)]
d) B(Q) = 1/6[(μ00 - 2μ01 + μ02) + (μ10 - 2μ11 + μ12) + (μ20 - 2μ21 + μ22)]
Note: each contrast is scaled by 2/Σ|ci|, so the divisor of a main effect is 3(Σ|ci|)/2,
i.e. 3 for the linear and 6 for the quadratic effect.
6. Interactions
We will have four possible interactions.
These are: 1) ALBL, 2) ALBQ, 3) AQBL, 4) AQBQ
Determination of interactions
6.1 ALBL is the average of the BL contrasts of the AL simple effects.
The simple effects of AL are μ20 - μ00, μ21 - μ01, μ22 - μ02; the BL coefficients are -1, 0, 1.
ALBL = 1/2[-1(μ20 - μ00) + 0(μ21 - μ01) + 1(μ22 - μ02)]
     = 1/2(μ00 - μ20 + μ22 - μ02)
6.2 ALBQ is the average of the BQ contrasts of the AL simple effects.
ALBQ = 1/4[1(μ20 - μ00) - 2(μ21 - μ01) + 1(μ22 - μ02)]
6.3 AQBL is the average of the BL contrasts of the AQ simple effects.
The simple effects of AQ are μ00 - 2μ10 + μ20, μ01 - 2μ11 + μ21, μ02 - 2μ12 + μ22,
with BL coefficients -1, 0, 1:
AQBL = 1/4[-1(μ00 - 2μ10 + μ20) + 0(μ01 - 2μ11 + μ21) + 1(μ02 - 2μ12 + μ22)]
6.4 AQBQ is the average of the BQ contrasts of the AQ simple effects,
with BQ coefficients 1, -2, 1:
AQBQ = 1/8[1(μ00 - 2μ10 + μ20) - 2(μ01 - 2μ11 + μ21) + 1(μ02 - 2μ12 + μ22)]
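A compact sketch (in Python with made-up cell means, not from the notes) of all 3² effects as cross products of the orthogonal-polynomial coefficients, using the divisor rule Σ|cA|·Σ|cB|/2 quoted above:

import numpy as np

mu = np.array([[4.0, 5.0, 6.5],
               [5.5, 7.0, 8.0],
               [6.0, 7.5, 9.5]])          # rows: a0..a2, columns: b0..b2 (illustrative numbers)

ones = np.array([1.0, 1.0, 1.0])          # "sum over" a factor
L = np.array([-1.0, 0.0, 1.0])            # linear contrast
Q = np.array([1.0, -2.0, 1.0])            # quadratic contrast

def effect(cA, cB):
    d = np.abs(cA).sum() * np.abs(cB).sum() / 2   # divisor rule from the text
    return cA @ mu @ cB / d

for name, cA, cB in [("AL", L, ones), ("AQ", Q, ones),
                     ("BL", ones, L), ("BQ", ones, Q),
                     ("ALBL", L, L), ("ALBQ", L, Q),
                     ("AQBL", Q, L), ("AQBQ", Q, Q)]:
    print(name, round(effect(cA, cB), 3))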

A 2x3x4 factorial experiment

This is a peculiar FE in which the factors have different numbers of levels. The experiment
consists of one factor at 2 levels, another factor at 3 levels and a 3rd factor at 4 levels.
The number of treatment combinations of this FE = 2x3x4 = 24.
Factors  Levels            Contrast table
A        a0 a1             A  = -1  1
B        b0 b1 b2          B1 = -1  1  0
                           B2 = -1 -1  2
C        c0 c1 c2 c3       C1 = -1  1  0  0
                           C2 = -1 -1  2  0
                           C3 = -1 -1 -1  3
Using the algebraic method
We generate main effects and interactions as follows by using the algebraic method:
1. A = 1/12[(a1-a0)(b0+b1+b2)(c0+c1+c2+c3)]
2. B1 = 1/8[(a1+a0)(-b0+b1)(c0+c1+c2+c3)]
3. B2 = 1/16[(a1+a0)(-b0-b1+2b2)(c0+c1+c2+c3)]
4. C2 = 1/12[(a1+a0)(b0+b1+b2)(-c0-c1+2c2)]
5. AB1 = 1/8[(a1-a0)(-b0+b1)(c0+c1+c2+c3)]
6. AB2 = 1/16[(a1-a0)(-b0-b1+2b2)(c0+c1+c2+c3)]
7. B1C1 = 1/4[(a1+a0)(-b0+b1)(-c0+c1)]
8. B2C2 = 1/24[(a1+a0)(-b0-b1+2b2)(-c0-c1-c2+3c3)]
9. AB1C1 = 1/4[(a1-a0)(-b0+b1)(-c0+c1)]
10. AB2C2 = 1/24[(a1-a0)(-b0-b1+2b2)(-c0-c1-c2+3c3)]

The AOV Table
Source df
A 1
B 2
AB 2
C 3
AC 3
BC 6
ABC 6
Error Depends upon a design used

7.2 FE in Incomplete Designs

7.2.1 Confounding the Interaction Effects of FE
Confounding
A factorial experiment with a large number of factors, or with a few factors each having several
levels, generates quite a large number of treatment combinations. For numbers of treatments
exceeding 10, it is impractical to use the latin square design. Also, as the number of treatments
increases, it becomes exceedingly difficult to find homogeneous experimental material that
accommodates all treatments at the recommended replications under a randomized complete block
design. As a result, the variation within replicates increases, and thereby the experimental
error increases. To overcome this problem, one could use the idea of confounding, in which only
a given portion of the treatment combinations is used in each incomplete block within a
replication. This, in other words, results in incomplete blocks.

Definition: Confounding involves tying up or mixing up some effects with block differences.
The segregation of blocks within replicates results in a decrease in the degrees of freedom
associated with the error or the treatment sum of squares, or even both. This means that
information concerning some treatment effects will be mixed up with block differences.

Random assignment or indiscriminate allocation of treatments to incomplete blocks will result in
a complete loss of some factorial effects. This means that we should confound with incomplete
blocks those contrast comparisons which are of little or negligible interest to us.

For illustration, consider the following. We have three factors A, B & C, each at 2 levels, and
the treatment combinations are: a0b0c0, a1b0c0, a0b1c0, a1b1c0, a0b0c1, a1b0c1, a0b1c1, a1b1c1.
Assume that it is not possible for us to find 8 homogeneous experimental units to form a
complete block. Rather, we can find 2 blocks with 4 homogeneous experimental units within each
block, but heterogeneous between blocks. The question is how to allocate the 8 treatments
between the 2 incomplete blocks. Suppose the following choice was made:
Block B1 = a1b0c0, a0b1c0, a0b0c1, a1b1c1
Block B2 = a0b1c1, a1b0c1, a1b1c0, a0b0c0
What does the difference between the two blocks measure?
B1-B2 = [(a1b0c0+a0b1c0+a0b0c1+a1b1c1) – (a0b1c1+a1b0c1+a1b1c0+a0b0c0)]
A little algebraic manipulation shows that this is equivalent to (a1-a0)(b1-b0)(c1-c0), which
stands for the ABC effect. I.e., this measure of the interaction ABC is identical with the
block difference. Since blocks are formed in such a way that the variation between blocks is
maximized, the effect ABC is estimated with relatively poor precision from this replication.

This is because it is subject not only to within-block variability, but is also affected by
the variation between the two blocks. This means that ABC is confounded with the blocks B1 &
B2. Hence, it is customary to ignore the ABC interaction, which is already confounded with
blocks in this replication.

On the other hand, the other effects are free from block differences and thereby they are not
confounded. That is, A, B, C, AB, AC & BC are obtained by taking within-block comparisons
(intra-block differences).

To check this fact, consider for instance the estimate of C.

C = (a1+a0)(b1+b0)(c1-c0)
  = a1b1c1 - a1b1c0 + a1b0c1 - a1b0c0 + a0b1c1 - a0b1c0 + a0b0c1 - a0b0c0
Within block B1 = {a1b0c0, a0b1c0, a0b0c1, a1b1c1}, two of these treatments enter with a plus
sign and two with a minus sign, and the same holds within block B2, so the block effects cancel.
Note that the effect of C is not obtained as a difference between blocks B1 & B2. Therefore,
the effect of C is not confounded.

Exercise: check that AB is not confounded!

The implication is that we estimate the ABC interaction as an inter-block difference, whereas
the others are estimated as intra-block differences.
The unconfounded effects are estimated with an error variance σ4² corresponding to the
relatively homogeneous blocks of 4 units. Had a complete block of 8 units been used, the
error variance for estimating effects would have been σ8², which is expected to be larger.
On the other hand, confounding decreases the degrees of freedom of both the treatment and the
error, and therefore the treatment effects must be relatively conspicuous to turn out
statistically significant.

Advantages of confounding
1. Confounding is of utility if the gain in efficiency through reduction of the error variance
is materialized
2. The loss of information concerning the confounded effect can in some cases be tolerated;
the reduced block size can then be used to increase the efficiency of the experiment and to
estimate the other treatment effects more precisely
Disadvantages of confounding
1. Confounded effects are replicated fewer times than the unconfounded effects
2. The calculation procedures are usually more difficult
3. Considerable difficulties are encountered if the treatments interact with incomplete blocks

Types of confounding
There are two general types of confounding used by experimenters. These are:
a) Complete confounding, and
b) Partial confounding
a) Complete confounding
If we find that an effect is of no use or of negligible importance, we confound that effect with
incomplete blocks in all replications, and this is known as complete confounding. Before using
complete confounding, the experimenter must make sure that it is desirable to do so.
Sometimes, in an experiment where effects are completely confounded, it may be necessary to
adjust means.

Complete confounding in a 2ⁿ series

Example: assume that we have a 2³ experiment in blocks of size 4 with 3 replications, with the
effect ABC completely confounded.
  Rep I          Rep II         Rep III
B1     B2      B1     B2      B1     B2
abc    (1)     abc    (1)     abc    (1)
a      ab      a      ab      a      ab
b      ac      b      ac      b      ac
c      bc      c      bc      c      bc
To determine which effects are confounded, we will look at the expected values of the means of
each treatment in each replication. Note that if a treatment is in block B1, we will assume that
its block effect is B1, and if it is in block B2, its block effect is B2.

Let’s look at the ABC effect.


In block B1 In block B2
μabc+B1 μ(1)+B2
μa+B1 μab+B2
μb+B1 μac+B2
μc+B1 μbc+B2
Est(ABC) = 1/4[-(μ(1)+B2) + (μa+B1) + (μb+B1) + (μc+B1) - (μab+B2) - (μac+B2) - (μbc+B2) + (μabc+B1)]
         = 1/4[(μa+μb+μc+μabc) - (μab+μac+μbc+μ(1)) + 4B1 - 4B2]
         = ABC + (B1-B2)
Had there been no confounding, the ABC effect would not carry the B1-B2 difference.

We note that the effect ABC is completely confounded with blocks in all replications, because
its expected value includes the block difference.

Exercise: show that the remaining effects are not confounded by using plus-minus technique!
Block contents:  B1: abc, a, b, c      B2: (1), ab, ac, bc

         B1 signs      B2 signs      Result
ABC:     + + + +       - - - -       confounded
A:       + + - -       - + + -       not confounded
B:       + - + -       - + - +       not confounded
C:       + - - +       - - + +       not confounded
AB:      + - - +       + + - -       not confounded
AC:      + - + -       + - + -       not confounded
BC:      + + - -       + - - +       not confounded

AOV of a 2³ experiment in which an effect is completely confounded
The analysis of an experiment in which an effect is completely confounded is straightforward.
The breakdown of the sources of variation and the calculation of the sums of squares are
similar to those of a simple factorial experiment in RCBD. The only exceptions are the
following:
1. The completely confounded effect cannot be estimated, and
2. The divisor of the block sum of squares will be 4 (the block size) instead of 8.
The AOV Table
Source df
Block 5
A 1
B 1
AB 1
C 1
AC 1
BC 1
Error 12
Total 23

b) Partial confounding
If the experimenter doesn't want to lose all information on the effect ABC, he/she should not
confound it completely in all replications.

He/she may, for instance, confound ABC in Rep I, BC in Rep II, etc. The information on the ABC
effect can then be obtained from the replications in which it is not confounded.

Since complete confounding results in total loss of information concerning the confounded
effect, we may wish not to confound it completely in all replications. Instead we may confound
some effects in some replications and other effects in other replications. We can then get
information about each effect from the replications in which it is not confounded; this type of
confounding is known as partial confounding.

Example: assume that we have a 2³ experiment laid out in blocks of 4 units with 3 replications.


Rep I Rep II Rep III
B1 B2 B1 B2 B1 B2
abc (1) abc a abc ab
a ab ac c bc ac
b ac (1) ab a b
c bc b bc (1) c
Note that the effect ABC is confounded in Rep I, AC in Rep II and BC in Rep III. All the other
effects are not confounded. For example, we want to show that the effect A is not confounded,
using the plus-minus technique:
Rep I Rep II Rep III
B1 B2 B1 B2 B1 B2
abc (1) abc a abc ab
a ab ac c bc ac
b ac (1) ab a b
c bc b bc (1) c

Rep I Rep II Rep III


B1 B2 B1 B2 B1 B2
A= + - + + + +
+ + + - - +
- + - + + -
- - - - - -
Results not confounded not confounded not confounded

Verify that ABC, AC & BC are confounded at Rep I, II and III, respectively!
Rep I Rep II Rep III
B1 B2 B1 B2 B1 B2
ABC = + - + + + -
+ - - + - -
+ - - - + +
+ - + - - +
Results confounded not confounded not confounded

Rep I Rep II Rep III


B1 B2 B1 B2 B1 B2
AC = + + + - + -
- - + - - +
+ + + - - +
- - + - + -
Results no confounding confounding no confounding

Rep I Rep II Rep III


B1 B2 B1 B2 B1 B2
BC = + + + + + -
+ - - - + -
- - + - + -
- + - + + -
Results not confounded not confounded confounded

Note that when an effect is confounded, all "+" signs are found in only one block and all the
"-" signs are found in the other block. Conversely, if it is not confounded, both positive and
negative signs are found in both blocks.

Relative information: This is the amount of information which can be obtained about a
factorial effect. If the effect is not confounded, its information is 1 (100%). If it is
confounded in some replications, its information is less than 1. Here, the information on ABC,
AC & BC is 2/3 (each is unconfounded in 2 of the 3 replications), while all the rest carry
100% information.

Analysis of a 23 experiment in blocks of 4 units


Rep I Rep II Rep III Total
Block I Block II Block III Block IV Block V Block VI
12 b 13 bc 14 ab 10 (1) 16 ab 15 a
12 ac 10 a 13 c 11 a 14 (1) 14 ac
14 abc 12 c 12 ac 15 abc 14 c 16 b
11 (1) 16 ab 12 b 12 bc 14 abc 15 bc
49 51 51 48 58 60 317

Treatment totals:  (1)  a   b   ab  c   ac  bc  abc
                   35   36  40  46  39  38  40  43    (sum = 317)

The first task in analyzing a factorial experiment is to find out which effect is confounded.
Computation of sum of squares
1. SStotal = Σ(122+122+…+152) – (3172)/24 = 76.0
2. SSblock = Σ(492+512+…+602)/4 – c.f. = 30.8
3. SSmain effects (A, B & C)
b0 b1 Total
a0 74 80 154
a1 74 89 163
Total 148 169 317
SSA = (154²+163²)/12 – c.f. = 3.4
SSB = (148²+169²)/12 – c.f. = 18.4
SSC = (157²+160²)/12 – c.f. = 0.375
If AB had not been confounded in any replication, it would have been obtained as:
SSAB = (74²+…+89²)/6 – c.f. – SSA – SSB

But, the problem is AB is confounded in Rep III. Therefore, use Rep I & Rep II to
calculate SSAB. Then, we must form a new A by B table using data from Rep I & II.
b0 b1 Total
a0 46 49 95
a1 45 59 104
Total 91 108 199

SSAB(adjusted) = (46²+…+59²)/4 – (199)²/16 – [(95²+104²)/8 – (199)²/16] – [(91²+108²)/8 – (199)²/16]
               = 2505.75 - 2475.0625 - 5.0625 - 18.0625 = 7.5625
c0 c1 Total
a0 52 54 106
a1 56 55 111
Total 108 109 217
SSAC = (52²+54²+56²+55²)/4 – 2943.0625 – 1.5625 – 0.0625 = 0.5625

c0 c1 Total
b0 50 52 102
b1 60 56 116
Total 110 108 218
SSBC = (50²+52²+60²+56²)/4 – 218²/16 – [(102²+116²)/8 – 2970.25] – [(110²+108²)/8 – 2970.25]
     = 14.75 - 12.25 - 0.25 = 2.25

SSABC = (35²+…+43²)/3 – 317²/24 – SSA – SSB – SSAB – SSC – SSAC – SSBC
The procedure outlined is straightforward, but some additional work is required. An easier
way to find effects in the 2x2x… series is to use Yates' method of sums and differences.

Yate’s procedure of sums and differences for a 2n experiment


It is known as treatment total round 1, 3 & 4 effect adjustment
Treatment Total Round effect Adjustment Adjusted Squared Estimated
1 2 3 value of divisor effect (E)
(1) 35 71 157 317 M 24
a 36 86 160 9 A 9 24 9/24 = 0.38
b 40 77 7 21 B 21 24 21/24 = 0.88
ab 46 83 2 9 AB -(58-60) 11 16 11/16 = 0.69
c 39 1 15 3 C 3 24 3/24 = 0.12
ac 38 6 6 -5 AC -(49-51) -3 16 -3/16 = -0.19
bc 40 -1 5 -9 BC -(48-51) -6 16 -6/16 = -0.38
abc 43 3 4 -1 ABC -1 24 -1/24 = -0.04

SS of an effect = (adjusted value)²/divisor = divisor × [E(effect)]²; e.g. SSA = 9²/24 = 3.4,
SSAB = 11²/16 = 7.56, etc.


Procedures
1. Write down the treatment totals in standard order
2. In the 2nd column, write the sums of consecutive pairs of values in succession. This
fills half the column
3. Fill the remaining half with the differences of the successive pairs (the 2nd minus the 1st)
4. The entire procedure is repeated in Round 2 using the values generated in Round 1
5. The procedure is repeated until the first entry of the latest round equals the grand total
(n rounds for a 2ⁿ experiment)

The latest round gives us the effect totals M, A, B, AB, C, AC, BC & ABC, in that order. If an
effect is not confounded, to obtain its sum of squares all we need to do is square this value
and divide by the total number of observations.

On the other hand, if an effect is confounded, we have to adjust for confounding. For example,
the effect AB is confounded between blocks 5 & 6 in Rep III. Therefore, we adjust this effect
by removing the block difference, that is, B5-B6 = 58-60 = -2. To get the adjusted AB value we
subtract this number: 9-(-2) = 11.
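A compact sketch (in Python, not part of the original notes) of Yates' sums-and-differences algorithm, applied to the treatment totals of the worked example above:

def yates(totals):
    """Totals in standard order (1), a, b, ab, c, ... -> effect totals."""
    col = list(totals)
    n = len(col)
    rounds = n.bit_length() - 1            # n = 2^k needs k rounds
    for _ in range(rounds):
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    return col                             # order: M, A, B, AB, C, AC, BC, ABC

effects = yates([35, 36, 40, 46, 39, 38, 40, 43])
print(effects)                             # [317, 9, 21, 9, 3, -5, -9, -1]
# Confounded effects (AB, AC, BC here) must still be adjusted for block
# differences and divided by 16 instead of 24, as described in the text.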

AOV Table
Source df Sum of squares
Blocks 5 30.8
A 1 3.4
B 1 18.4
AB 1 7.6
C 1 0.4
AC 1 0.6
BC 1 2.25
ABC 1 0.04
Error 11 12.6
Total 23 76.0

Techniques for constructing confounded designs

1. The sign method: Assume that we have a 2ⁿ experiment in blocks of size 2ⁿ⁻¹. We first
choose the effect to be confounded; call it ∆ (delta), the defining contrast. Assign those
treatment combinations having a plus sign in the ∆ contrast to block I and the rest to
block II.

Example: let ∆ = ABC. Then in a 2³ FE,
ABC = [-(1)+a+b+c-ab-ac-bc+abc]. To confound ABC, put the treatments with "+" signs in one
block and those with "-" signs in the other block.
Block I: a, b, c, abc
Block II: (1), ab, ac, bc
2. Even-odd technique: Put those treatment combinations having zero or an even number of
letters in common with ∆ in one block, and the rest in the other block.

Example: let AB be the effect to be confounded. According to the even-odd technique, the
blocks look like the following:
Block I: (1), ab, c, abc
Block II: a, b, bc, ac

3. Permutation method: Random assignment of treatments to incomplete blocks. Here we don't
purposely choose which effect is confounded.

Example: in a 2³ FE we have 8 treatment combinations [(1) a b ab c ac bc abc], but each block
has only 4 homogeneous units. If we want to run this experiment with 3 replications, we will
have 6 blocks in total, where each replication encompasses two blocks. One allocation of the
treatment combinations by the permutation method is as follows:
Rep I:    Block I:   (1) a b ab       Block II:  c ac bc abc
Rep II:   Block III: abc (1) a b      Block IV:  ab c ac bc
Rep III:  Block V:   bc abc (1) a     Block VI:  b ab c ac

4. Modular arithmetic method: It can be defined simply as follows: an integer x modulo an
integer m equals the remainder I when x is divided by m.

Mathematically, x(mod m) = I. Example: let x = 10 and m = 4; then
10(mod 4) = 2
0(mod 4) = 0
Modular arithmetic has the following features:
2(mod 3) + 4(mod 3) = 6(mod 3) = 0
3(mod 2) + 0(mod 2) = 3(mod 2) = 1
For a 2ⁿ experiment, let x1, x2, …, xn denote the levels of the 1st, 2nd, …, nth factors,
respectively; each xi takes the value 0 or 1. For a 2⁴ experiment, for instance, n = 4 and
each factor has m = 2 levels. Let Hi take the value 1 or 0 depending upon whether or not
factor i appears in the effect to be confounded.

Example: suppose we have a 2³ experiment and desire to confound AB. Then we will have the
following: HA = 1; HB = 1; and HC = 0. Based on modular arithmetic, these can be combined into
the equation HAXA + HBXB + HCXC = I (mod 2).

Procedure
i.   Determine the contrast to be confounded
ii.  Compute HAXA + HBXB + HCXC for all treatment combinations
iii. Assign to one block those treatment combinations for which HAXA + HBXB + HCXC = 0 (mod 2),
     and to the other block those for which HAXA + HBXB + HCXC = 1 (mod 2)

Taking the above example to confound AB, the modular arithmetic is done as follows:
Treatment     XA  XB  XC  HAXA+HBXB   Mod 2          Block I   Block II
combination
(1)           0   0   0   0           0(mod 2) = 0   (1)       a
a             1   0   0   1           1(mod 2) = 1   ab        b
b             0   1   0   1           1(mod 2) = 1   c         ac
ab            1   1   0   2           2(mod 2) = 0   abc       bc
c             0   0   1   0           0(mod 2) = 0
ac            1   0   1   1           1(mod 2) = 1
bc            0   1   1   1           1(mod 2) = 1
abc           1   1   1   2           2(mod 2) = 0
Since HC = 0, only HAXA + HBXB contributes; the modulus is 2 because each factor has 2 levels.
Note: Hi takes the value 1 (not 0) for each factor that appears in the confounded effect,
whether a main effect or an interaction.

Another example: if ABC is to be confounded, then HA = 1, HB = 1, & HC = 1.

Treatment     XA  XB  XC  HAXA+HBXB+HCXC  Mod 2          Block I   Block II
combination
(1)           0   0   0   0               0(mod 2) = 0   (1)       a
a             1   0   0   1               1(mod 2) = 1   ab        b
b             0   1   0   1               1(mod 2) = 1   ac        c
ab            1   1   0   2               2(mod 2) = 0   bc        abc
c             0   0   1   1               1(mod 2) = 1
ac            1   0   1   2               2(mod 2) = 0
bc            0   1   1   2               2(mod 2) = 0
abc           1   1   1   3               3(mod 2) = 1
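A small sketch (illustrative Python, not from the notes) of the modular arithmetic method: treatments with H·X = 0 (mod 2) go to block I, the rest to block II.

from itertools import product

factors = "abc"

def blocks(defining_contrast):
    """defining_contrast e.g. 'AB' -> H = (1, 1, 0)."""
    H = [1 if f.upper() in defining_contrast.upper() else 0 for f in factors]
    block_I, block_II = [], []
    for X in product((0, 1), repeat=len(factors)):      # X = (XA, XB, XC)
        name = "".join(f for f, x in zip(factors, X) if x) or "(1)"
        I = sum(h * x for h, x in zip(H, X)) % 2
        (block_I if I == 0 else block_II).append(name)
    return block_I, block_II

print(blocks("AB"))    # (['(1)', 'c', 'ab', 'abc'], ['b', 'bc', 'a', 'ac'])
print(blocks("ABC"))   # (['(1)', 'bc', 'ac', 'ab'], ['c', 'b', 'a', 'abc'])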

7.2.2 Factorial experiment with main effect confounded


A) Split-plot design
The nature of the experimental material may be such that it excludes the use of small plots for
one factor, or the experimenter may know that the levels of that factor differ greatly. Such
factors are laid out in relatively large plots called main plots. The other factor is assigned
to the split plots (sub-plots). Such a design is known as a split-plot design.

Since in a split-plot experiment the variation among sub-units is expected to be less than
that among the whole units:
1. The factors which require smaller amount of experimental material, and/or
2. The factors which are of major importance, and/or
3. Factors which are expected to exhibit smaller differences, and/or
4. Factors for which greater precision is required
are assigned to the sub plots.

The underlying principle is that whole plots, to which one or more factors are applied, are
divided into sub-units to which the levels of one or more other factors are applied. The whole
plot unit becomes the main plot, whereas the sub-unit becomes the sub-plot.

The split-plot design is an incomplete block design in which the main effect of the whole-plot
factor is confounded. The design has complete blocks as regards the sub-plot treatments.
Examples:
1. Study to determine the effect of 4 levels of feed on 3 breeds of cows. The main plots here
would be the breeds.
2. Study of 3 types of nozzles of irrigation equipment and fertilizer application on yield of
onion
3. Effect of spraying 3 chemicals using an airplane, and spacing, on yield of corn

Advantages of split-plots
1. Experimental material which is large by necessity can be utilized, and subsidiary treatments
(sub-units) can be compared within it
2. Precision is increased over the complete block design with regard to the sub-unit
treatments and the interaction of the sub-plot and main-plot factors
3. The overall precision of this design relative to RCBD can be increased further if the
sub-unit treatments are laid out in a latin square or an incomplete latin square design

Disadvantages of split-plots
1. The whole-plot factor is less precisely estimated as compared to RCBD
2. When missing data exist, the analysis becomes complicated compared to RCBD

Randomization depends upon the design used for the whole plots. Example: assume that one factor
in our study is fertilizer, designated by a, and the other factor is variety, designated by b.
Assume that a has p levels and b has q levels. Assume also that a is the main-plot factor and
b is the sub-plot factor. Then the design looks like the following:
a1   a2   …   ap
b1   b1   …   b1
b2   b2   …   b2
.    .    …   .
bq   bq   …   bq

Here we want to know the variety effects precisely. The main-plot factor a is confounded: the
design is complete with respect to the sub-units, but incomplete with respect to the main plots.

From this design, we note that we have an incomplete block design as regards the main effect a,
and a complete block design as regards the sub-unit treatment b. We also note that the main-plot
treatment is confounded with incomplete blocks. That is why we say a split-plot design is an
incomplete block design in which a main effect is confounded. Note that an RCBD would require
pq experimental units to form a complete block.

Analysis of split-plot experiment


Model: yijk = μ + ρi + δj + Ϩij + βk + (δβ)jk + εijk
Where: i (replication or block) = 1, 2, …, r
       j (treatment a) = 1, 2, …, a
       k (treatment b) = 1, 2, …, b
       Ϩij ~ N(0, σ²Ϩ) – whole-plot error
       εijk ~ N(0, σ²ε) – sub-plot error

Suppose we have two factors, each at 3 levels, replicated 3 times, and let the following be the
experimental layout:
      Rep I           Rep II          Rep III
   a1  a3  a2       a2  a3  a1      a3  a1  a2
   b1  b3  b2       b1  b1  b3      b3  b1  b3
   b2  b1  b3       b2  b2  b1      b1  b3  b2
   b3  b2  b1       b3  b3  b2      b2  b2  b1

If we assume that there is no interaction between whole plot treatments and replications, as well
as, between sub-plot treatments and blocks, we will have two types of comparisons:
1. Between whole plots, and
2. Between sub-plots within whole plots
Within whole plot comparison
To make the within whole plot comparison, we will divide data into 3 groups. In the first group,
we put the treatment combination having the first level of a. In second group, second level of a,
and so on.
Group I Group II Group III
Rep I a1b1 a2b1 a3b1
a1b2 a2b2 a3b2
a1b3 a2b3 a3b3
Rep II a1b1 a2b1 a3b1
a1b2 a2b2 a3b2
a1b3 a2b3 a3b3
Rep III a1b1 a2b1 a3b1

a1b2 a2b2 a3b2
a1b3 a2b3 a3b3

a) Sub-plot AOV table

Source     df   Sum of squares
B           2   9 Σk(ȳ..k − ȳ...)²
AB          4   3 ΣjΣk(ȳ.jk − ȳ.j. − ȳ..k + ȳ...)²
Error (b)  12   by subtraction
Total      18   SStotal − SS(main plots)

b) Whole-plot analysis

To compute the whole-plot AOV, take the mean ȳij. of each main plot within each replication:
      Rep I                Rep II               Rep III
   a1    a2    a3       a1    a2    a3       a1    a2    a3
   ȳ11.  ȳ12.  ȳ13.     ȳ21.  ȳ22.  ȳ23.     ȳ31.  ȳ32.  ȳ33.

Whole-plot AOV table

Source          df   Sum of squares
Rep (R)          2   9 Σi(ȳi.. − ȳ...)²
A                2   9 Σj(ȳ.j. − ȳ...)²
AxR (Error a)    4   3 ΣiΣj(ȳij. − ȳi.. − ȳ.j. + ȳ...)²
Total            8

c) Complete analysis
Source       df                   Sum of squares
Main-plots    8
Rep           2  (r-1)            SSR
A             2  (a-1)            SSA
Error (a)     4  [(r-1)(a-1)]     SS(a)
Sub-plots    18
B             2  (b-1)            SSB
AB            4  [(a-1)(b-1)]     SSAB
Error (b)    12  [a(b-1)(r-1)]    SS(b)
Total        26  (abr-1)          SStotal

An example of a split-plot analysis
Yields of three varieties of alfalfa (ton/acre) with four dates of cutting.
                                Blocks                              Variety
Variety   Date    I      II     III    IV     V      VI            total
Ladak A 2.17 1.88 1.62 2.34 1.58 1.66 11.25
B 1.58 1.26 1.22 1.59 1.25 0.94 7.84
C 2.29 1.60 1.67 1.91 1.39 1.12 9.98
D 2.23 2.01 1.82 2.10 1.66 1.10 10.92
Total 8.29 6.75 6.33 7.94 5.88 4.82 39.99
Cossack   A     2.33   2.01   1.70   1.78   1.42   1.35   10.59
          B     1.38   1.30   1.85   1.09   1.13   1.06   7.81
C 1.86 1.70 1.81 1.54 1.67 0.88 9.46
D 2.27 1.81 2.01 1.40 1.31 1.06 9.86
Total 7.84 6.82 7.37 5.81 5.53 4.35 37.72
Ranger A 1.75 1.95 2.13 1.78 1.31 1.30 10.22
B 1.52 1.47 1.80 1.37 1.01 1.31 8.48
C 1.55 1.61 1.82 1.56 1.23 1.13 8.90
D 1.56 1.72 1.99 1.55 1.51 1.33 9.66
Total 6.38 6.75 7.74 6.26 5.06 5.07 37.26
Block Total 22.49 20.32 21.44 20.01 16.47 14.24 114.97

Date of cutting by variety

                  Date of cutting
Variety      A       B       C       D      Total
Ladak      11.25    7.84    9.98   10.92    39.99
Cossack    10.59    7.81    9.46    9.86    37.72
Ranger     10.22    8.48    8.90    9.66    37.26
Total      32.06   24.13   28.34   30.44   114.97
SS computation
1. Correction factor (c.f) = (114.97)²/72 = 183.585
2. SStotal = (2.17)²+…+(1.33)² – c.f = 9.1218
3. SSmain plots = [(8.29)²+…+(5.07)²]/4 – c.f = 5.6902
4. SSreplication = [(22.49)²+…+(14.24)²]/(3x4) – c.f = 4.1496
5. SSvariety = [(39.99)²+(37.72)²+(37.26)²]/(4x6) – c.f = 0.1778
6. SSerror(main-plot) = SSmain plots – SSrep – SSvariety = 5.6902 – 4.1496 – 0.1778 = 1.3628
7. SSsub-plot = [(11.25)²+…+(9.66)²]/6 – c.f = 2.3511
8. SSdate = [(32.06)²+…+(30.44)²]/(3x6) – c.f = 1.9625
9. SSdxv = SSsub-plot – SSdate – SSvariety = 2.3511 – 1.9625 – 0.1778 = 0.2108
10. SSerror(sp) = SStotal – SSmain plots – SSdate – SSdxv = 9.1218 – 5.6902 – 1.9625 – 0.2108 = 1.2583

AOV table
Source       df    SS      MS
Main plots   17   5.6902
Blocks        5   4.1496  0.8299
Varieties     2   0.1778  0.0889
Error(a)     10   1.3628  0.1363
Sub-plots    54   2.3511
Dates         3   1.9625  0.6542
DxV           6   0.2108  0.0351
Error(b)     45   1.2583  0.0280
Total        71   9.1218

General formula to analyze the variances of a split-plot experiment:

Model: yijk = μ + ρi + δj + Ϩij + βk + (δβ)jk + εijk
Where: i (replication or block) = 1, 2, …, r
       j (treatment a) = 1, 2, …, a
       k (treatment b) = 1, 2, …, b
Source          df            Sum of squares
Replication     r-1           Σ(yi..)²/ab – (y…)²/abr
Treatment a     a-1           Σ(y.j.)²/br – (y…)²/abr
RxTa (Error a)  (r-1)(a-1)    Σ(yij.)²/b – (y…)²/abr – SSA – SSR
Treatment b     b-1           Σ(y..k)²/ar – (y…)²/abr
AB              (a-1)(b-1)    Σ(y.jk)²/r – (y…)²/abr – SSA – SSB
Error (b)       a(r-1)(b-1)   SStotal – SSR – SSA – SSError(a) – SSB – SSAB
Total           abr-1         Σ(yijk)² – (y…)²/abr
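A numerical sketch (Python with made-up data, not part of the notes) of the split-plot sums of squares from the general formula table above, with y[i, j, k] for rep i, main-plot level j and sub-plot level k:

import numpy as np

rng = np.random.default_rng(1)
r, a, b = 3, 3, 3
y = rng.normal(10, 1, size=(r, a, b))              # illustrative data only

cf = y.sum() ** 2 / (r * a * b)                    # correction factor
ss_total = (y ** 2).sum() - cf
ss_rep = (y.sum(axis=(1, 2)) ** 2).sum() / (a * b) - cf
ss_a = (y.sum(axis=(0, 2)) ** 2).sum() / (r * b) - cf
ss_err_a = (y.sum(axis=2) ** 2).sum() / b - cf - ss_rep - ss_a
ss_b = (y.sum(axis=(0, 1)) ** 2).sum() / (r * a) - cf
ss_ab = (y.sum(axis=0) ** 2).sum() / r - cf - ss_a - ss_b
ss_err_b = ss_total - ss_rep - ss_a - ss_err_a - ss_b - ss_ab

ms_err_a = ss_err_a / ((r - 1) * (a - 1))          # tests the main-plot factor
ms_err_b = ss_err_b / (a * (r - 1) * (b - 1))      # tests B and AB
print(round(ss_total, 3), round(ms_err_a, 3), round(ms_err_b, 3))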

Relative efficiency of the split-plot design

How does the split-plot design, which is an incomplete block design, compare with the
randomized complete block design as regards the main effects and interactions? The main effect
of the sub-plot factor and its interaction with the main-plot factor are much more precisely
estimated in the split-plot experiment than in the randomized complete block design. But
concerning the main-plot effect, estimation is poorer in the split-plot design. This is because
the effect of a factor in the main plot is confounded with that of the incomplete blocks.

There are also fewer degrees of freedom in the split-plot design for comparing the main
effects. The average variance of a difference over the two designs is about the same; in other
words, there is no overall gain in precision by using the split-plot design.

B) Split-split-plot design
In a split-plot design, it is possible to have more than two factors. Assume that we have three
factors (A, B, & C), each with several levels, in all possible combinations, and that these
combinations are of interest to the experimenter. The first factor is less important than B & C,
while B is relatively more important than A but less important than C. In other words, C is
much more important than either A or B.

The levels of factor A can be laid out as the main plots, the levels of factor B laid out under
each level of A, and the levels of factor C laid out under each level of factor B. Such a
design is known as the split-split-plot design.

It is possible to have more than 3 factors under such a design. One can continue the same
procedure as long as the factors under consideration are hierarchical in terms of importance.

Analysis of variance of a split-split-plot design


In setting up the analysis of variance table, it is advisable to figure out the degrees of
freedom and the various types of error. Assume that we have an axbxc experiment with the
factors ordered in importance as a < b < c.

Model: yijkl = μ + ρi + δj + Ϩij + βk + (δβ)jk + γijk + αl + (δα)jl + (βα)kl + (δβα)jkl + εijkl

AOV table
Source          df               Sum of squares
Main plots      ra-1             [Σ(yij..)²]/bc - c.f
Replication (R) r-1              [Σ(yi...)²]/abc - c.f
A               a-1              [Σ(y.j..)²]/rbc - c.f
RA (Error a)    (r-1)(a-1)       [Σ(yij..)²]/bc - c.f – SSR – SSA
Sub-plots       ra(b-1)
B               b-1              [Σ(y..k.)²]/rac - c.f
AB              (a-1)(b-1)       [Σ(y.jk.)²]/rc - c.f – SSA – SSB
Error (b)       a(r-1)(b-1)      [Σ(yijk.)²]/c - c.f – SSR – SSA – SS(a) – SSB – SSAB
Sub-sub-plots   rab(c-1)
C               c-1              [Σ(y…l)²]/rab - c.f
AC              (a-1)(c-1)       [Σ(y.j.l)²]/rb - c.f – SSA – SSC
BC              (b-1)(c-1)       [Σ(y..kl)²]/ra - c.f – SSB – SSC
ABC             (a-1)(b-1)(c-1)  [Σ(y.jkl)²]/r - c.f – SSA – SSB – SSC – SSAB – SSAC – SSBC
Error (c)       ab(c-1)(r-1)     SStotal-SSR-SSA-SS(a)-SSB-SSAB-SS(b)-SSC-SSAC-SSBC-SSABC
Total           abcr-1           [Σ(yijkl)²] – c.f

Calculating the degrees of freedom and sums of squares, and setting out the sources of
variation, for higher numbers of factors in similar designs follows the same pattern.
On the experimental errors
The whole-plot error, conventionally called Ea, is larger than the sub-plot error, called Eb.
This is because the sub-units within a whole unit tend to be positively correlated and
therefore behave more alike than sub-units lying in different main units.

Error a cannot be less than error b unless this occurs by chance. If this happens, we assume
that both error a and error b estimate the same population error variance σ², and so we pool
the two errors and use the pooled error variance to test hypotheses.

Test of hypotheses about various effects


There are several hypotheses that can be tested using the split-plot design. Some of these are:
1) H0: A = 0, 2) H0: B = 0, 3) H0: AB = 0, 4) H0: C = 0, 5) H0: ABC = 0, etc.
If we happen to reject any of the above hypotheses, we may need to perform multiple
comparisons. Hence, we need to know which standard error is appropriate for each comparison.

Standard error in the split-plot design


1. To compare the means of A, such as μa1-μa2, μa1-μa3, etc., we use the standard error:
   S.E = √[2MSE(a)/rb], where b is the number of b levels
2. To compare any two levels of factor b, such as b1, b2, b3, etc., we use:
   S.E = √[2MSE(b)/ra]
3. To compare any two levels of a at the same level of factor b, e.g. a1b1-a2b1 or a3b2-a2b2,
   use S.E = √{2[(b-1)Eb + Ea]/rb}, with a pooled (weighted) t-value.
4. To compare any two levels of factor b at the same level of factor a, e.g. a1b2-a1b3 or
   a2b2-a2b1, use the following standard error:
   S.E = √[2MSE(b)/r]

C) Strip-plot/split-block design
Sometimes the factors a and b may not be as important as the interaction between the two
factors. Experimental conditions may be such that both factors a and b need large plots, while
the interaction is of prime importance. Examples:
1) Fertilizer with the broadcast method of application, and tillage with an ordinary
implement
2) Study on using sprinkler irrigation and spraying of some chemicals
To cope with situations like these, a design variously named split-block, strip-plot, or
strip-cropping is employed. In this design, the levels of factor a are laid out in a randomized
complete block design, latin square design or any other design. Then the levels of factor b are
laid out across the levels of factor a. Such a design effectively has the levels of both a and
b as whole-plot levels; the only sub-plot information is that of the AB interaction.

Advantages of the split-block design
1. The sub-plots may be kept relatively small, even though the whole plots for both factors
may be large
2. The AB interaction is estimated precisely

Disadvantages of the split-block design

1. The main-plot factors, that is, A & B, are estimated less precisely than they would be
using the randomized complete block design
2. The analysis is more complex than that of RCBD

Layout of split-block experiments

        Rep I             Rep II            Rep III
     a1 a2 a3 a4       a1 a4 a3 a2       a4 a2 a3 a1
b1                  b2                b5
b2                  b1                b2
b3                  b3                b3
b4                  b4                b4
b5                  b5                b1

Here we have factors a and b in randomized complete blocks. Note that both factors are
randomized simultaneously in each block. Each b level is imposed across all the a levels, and
each a level is imposed across all the b levels.

AOV table for split-block design


Source df Sum of squares
Replications (R) 2 (r-1) Σ(yi..)2/ab – (y…)2/abr
A 3 (a-1) Σ(y.j.)2/br – (y…)2/abr
RA (Ea) 6 (r-1)(a-1) Σ(yij.)2/b – (y…)2/abr – SSA-SSR
B 4 (b-1) Σ(y..k)2/ra – (y…)2/abr
RB (Eb) 8 (r-1)(b-1) Σ(yi.k)2/a – (y…)2/abr – SSR-SSB
AB 12 (a-1)(b-1) Σ(y.jk)2/r – (y…)2/abr – SSA-SSB
Error(c) 24 (r-1)(a-1)(b-1) Σ(yijk)2 – c.f – SSR-SSA-SSRA-SSB-SSRB-SSAB
Total 59 (rab-1) Σ(yijk)2 – c.f

In this design, the two factors are laid out in a randomized complete block design. Comparison
between levels of either factor is done using the three replications.
a) To compare two levels of a, use:
   S.E = √[2MSE(a)/rb]
b) To compare any two levels of b, use:
   S.E = √[2MSE(b)/ra]

The name strip-cropping implies that it requires large plots/areas. If a and b are equally
important but c is less important, c should be laid out in the main plots and then a & b
within each c level as a split-block. The experimental layout looks like the following:

      c2              c1              c3
  a1  a3  a2      a2  a1  a3      a3  a2  a1
  b2              b2              b1
  b3              b1              b2
  b1              b3              b3

D) Strip-split-plot design
 Please refer Gomez and Gomez from page 154 to page 167
E) Fractional Factorial Design
 Please refer Gomez and Gomez from page 167 to page 186

7.3 Nested or hierarchical design

Definition: This is a design in which the levels of one factor are nested within the levels of
another factor. This means that the levels of the sub-factor are not the same for all levels of
the main factor.

Example 1: consider an industrial experiment in which the performance of 5 machines is tested
at 4 different stations each, with the following layout:

Machines
M1 M2 M3 M4 M5
Stations 1 5 9 13 17
2 6 10 14 18
3 7 11 15 19
4 8 12 16 20

Could it be considered a split-plot design? No, it can't be considered a split-plot design,
because there are different stations under each machine. The split-plot design requires that
the levels of the sub-unit factor be the same for all levels of the main-plot factor.

Nested design is very useful for survey type research.

Example 2: assume that a physician wants to study the effectiveness of 2 drugs with respect to
some criteria. Suppose that the design calls for the administration of drug 1 to n patients in
each of three hospitals. Similarly, drug 2 is administered to n patients in each of the
remaining three hospitals.

This is shown below:


Drug 1 Drug 2
Hosp 1 Hosp 2 Hosp 3 Hosp 4 Hosp 5 Hosp 6
Patients 1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
. . . . . .
. . . . . .
n1 n2 n3 n4 n5 n6

The effects of the drugs are masked or mixed with the effects of hospitals. The effect we
measure for drug 1 is not uniquely due to the drug; it also includes the effects of hospitals
1, 2 & 3.

Note: The drug comparison cannot be separated from hospital differences, but it is possible to
compare the hospitals within each drug.

Mathematical model for the machine study indicated above

Yijk = μ + Mi + Sj(i) + εijk, where: Mi = machine effect
                                     Sj(i) = effect of station j within machine i
Note: We can't compare stations under different machines, because machines and stations vary
simultaneously; comparison is possible only among stations within each machine.

Test of hypotheses
I. On machines: H0: μ1. = μ2. = μ3. = μ4. = μ5.
II. On stations (one hypothesis per machine):
1. H0: μ11 = μ12 = μ13 = μ14
2. H0: μ21 = μ22 = μ23 = μ24
3. H0: μ31 = μ32 = μ33 = μ34
4. H0: μ41 = μ42 = μ43 = μ44
5. H0: μ51 = μ52 = μ53 = μ54

Computation of sums of squares (stations within each machine, r observations per station)

1. SSstations(machine 1) = Σj(y1j.)²/r – (y1..)²/sr
2. SSstations(machine 2) = Σj(y2j.)²/r – (y2..)²/sr
3. SSstations(machine 3) = Σj(y3j.)²/r – (y3..)²/sr
4. SSstations(machine 4) = Σj(y4j.)²/r – (y4..)²/sr
5. SSstations(machine 5) = Σj(y5j.)²/r – (y5..)²/sr

Analysis of variance table
Source df Sum of squares
Between machines 4 (m-1) Σ(yi..)2/sr – (Σy...)2/msr
Between stations/machine 15 [m(s-1)] Σ(yij.)2/r – Σ(yi..)2/sr
Between period/station/machine ms(r-1) Σ(yijk)2 – Σ(yij.)2/r
Total msr-1 Σ(yijk)2 – (Σy…)2/msr

Nested design as used in sampling experiments

Assume that we wish to employ a multi-stage sampling procedure to select people for an
interview. We will use the following procedure:
1. Select s Administrative Regions at random from the country
2. Select a Awrajas at random from each Administrative Region
3. Select w Woredas at random from each Awraja
4. Select r kebeles at random from each selected woreda
5. Select h households from each selected kebele, etc.
Here we can see how the stages are nested.

Model: yijklm = μ + si + aj(i) + wk(ij) + rl(ijk) + hm(ijkl) + εijklm

The AOV table


Source                          df          Sum of squares
1. Between Adm. Reg.            s-1         Σ(yi….)²/awrh – (y…..)²/sawrh
2. Between Awrajas/Adm          s(a-1)      Σ(yij...)²/wrh – Σ(yi....)²/awrh
3. Between Woredas/Awr/Adm      sa(w-1)     Σ(yijk..)²/rh – Σ(yij...)²/wrh
4. Between kebeles/Wor/Aw/Ad    saw(r-1)    Σ(yijkl.)²/h – Σ(yijk..)²/rh
5. Between HH/keb/Wor/Aw/Ad     sawr(h-1)   Σ(yijklm)² – Σ(yijkl.)²/h
6. Error                        subtraction subtraction
7. Total                        sawrh-1     Σ(yijklm)² – (y…..)²/sawrh

Test of hypotheses                  Fcalculated       Ftabulated

1. Test for Adm. Reg. effect        F = MSs/MSa       Fα(dfs, dfa)
2. Test for Awrajas effect          F = MSa/MSw       Fα(dfa, dfw)
3. Test for Woredas effect          F = MSw/MSr       Fα(dfw, dfr)
4. Test for kebeles effect          F = MSr/MSh       Fα(dfr, dfh)
5. Test for households effect       F = MSh/MSerror   Fα(dfh, dferror)

Note: In a nested design, we cannot study interaction effects, because the sub-unit factor
levels are different for different units.

There are different types of nested designs; for example, the two-way nested design used in the
split-block design.

Part III: Correlation and Regression

8. Correlation
Correlation is one of the most widely used concepts in statistical methodology. In earlier days
it was used mainly by biologists, but it has now found use in many fields such as agriculture,
industry, chemistry, psychology, etc.

In correlation, we look into two broad aspects:


1. It is concerned with co-variability between two variables, say x & y
2. It measures the closeness of fit of the regression to the observation. In this course we will
be interested in the relationship between the variables.

Assumptions governing the use of correlation

1. For each value of x, there is a normally distributed sub-population of y values
2. For each value of y, there is a normally distributed sub-population of x values
3. The y's and x's are jointly (bivariate) normally distributed

Bivariate normal distribution


Let us say that we have x & y which have values that are jointly normally distributed. This
distribution is known as bivariate normal distribution.

Example, consider height and weight of individuals:


                          Height (x)
Weight (y)   46  47  48  49  50  51  52  53  54  55   fy
40 1 1
41 1 1 1 3
42 2 2 2 1 7
43 3 3 2 8
44 2 4 2 1 9
45 1 3 5 2 1 12
46 3 4 3 2 12
47 2 4 2 1 1 10
48 2 1 2 1 1 7
49 1 1 1 1 4
fx 1 3 9 18 20 9 7 3 2 1 73
fy distribution is known as y-marginal distribution
fx distribution is known as x-marginal distribution
Whereas the middle distribution is known as the joint distribution

Correlation coefficient

Correlation coefficient measures the amount of co-variability between the two variables.
1. Population correlation coefficient: Let us assume that we have two variables x and y,
such that E(x) = μx, v(x) = σx²; E(y) = μy, v(y) = σy².
Then the covariance between x and y is given by cov(x,y) = E(x-μx)(y-μy).
The population correlation coefficient is given as
ρ = E(x-μx)(y-μy) / √[E(x-μx)² E(y-μy)²] = cov(x,y) / √[v(x)v(y)] = σxy / (σxσy)
ρ (rho) measures the strength of the linear relationship between x and y. ρ² is sometimes
known as the population coefficient of determination, which measures the amount of variability
in one variable that is accounted for by the other variable. ρ assumes values between negative
and positive one. That is, ρ lies between -1 and +1; symbolically, -1 ≤ ρ ≤ 1.

Three cases regarding the correlation coefficient

1. Case where ρ = 1: a case of perfect positive relationship
2. Case where ρ = -1: a case of perfect negative relationship
3. Case where ρ = 0: there is no linear relationship between the two variables

Note: In correlation, there is no dependent-independent relationship; that is, one variable
is not assumed to cause the other.

2. Sample correlation coefficient (r): The sample correlation coefficient is sometimes known
as the product-moment correlation coefficient. It measures the strength of the relationship
between the observations of two variables, say x1 & x2.

Scatter diagram: The relationship between two variables can be depicted diagrammatically.
(Figures: five scatter plots of x1 against x2 — points rising along a line for r = 1, falling
along a line for r = -1, and three patterns with r = 0: a random cloud, a horizontal band, and
a curvilinear pattern.)

Sample correlation coefficient

r = [Σxy – (Σx)(Σy)/n] / √{[Σx² – (Σx)²/n][Σy² – (Σy)²/n]}
  = [nΣxy – (Σx)(Σy)] / √{[nΣx² – (Σx)²][nΣy² – (Σy)²]}

Like the population correlation coefficient, r lies between -1 and 1, that is, -1 ≤ r ≤ 1.

A positive value of r indicates that as the values of one variable increase, the values of
the other variable also increase, while a negative value of r shows that as the values of one
variable increase, the values of the other variable decrease.

Note that when the corrected sum of cross products Σxy – (Σx)(Σy)/n = 0, r = 0.


Example: suppose a random sample of five students is selected and their English and
Mathematics grades out of 10 points are as shown below. Find the sample correlation
coefficient.
English (x)  Mathematics (y)  xy    x²    y²
2            3                6     4     9
5            4                20    25    16
3            4                12    9     16
7            8                56    49    64
8            9                72    64    81
Sum: 25      28               166   151   186

r = [nΣxy – (Σx)(Σy)] / √{[nΣx² – (Σx)²][nΣy² – (Σy)²]}
  = [5(166) – (25)(28)] / √{[5(151) – 25²][5(186) – 28²]}
  = 130/√(130x146) = 0.94

This value of r shows that there is a very strong positive relationship between the students'
grades in Mathematics and English.
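A short sketch (Python, not part of the notes) of the product-moment formula on the grades example above:

from math import sqrt

x = [2, 5, 3, 7, 8]                 # English
y = [3, 4, 4, 8, 9]                 # Mathematics
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
syy = sum(b * b for b in y)

r = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
print(round(r, 2))                  # 0.94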

Test of hypothesis concerning the correlation coefficient
To perform a test on the correlation coefficient, we need to know the sampling distribution of
r. If x and y have a bivariate normal distribution, then we will have two cases:
1. ρ = 0
2. ρ ≠ 0
1. Case 1, if ρ = 0: we can easily find the sampling distribution of r. r will be approximately
normally distributed with mean 0 and variance σr², where σr² = (1-r²)/(n-2). Note: E(r) = 0.

(Figure: frequency curves of r under ρ = 0, symmetric about 0 on the interval (-1, 1); the
curve for n = 50 is much more concentrated around 0 than the curve for n = 10.)
The characteristic of the sampling distribution of r is that it depends upon ρ and n. In this
case, since we have assumed that ρ = 0, the distribution depends upon n only.

To test H0: ρ = 0 vs. Ha: ρ ≠ 0, we use the t-test with n-2 degrees of freedom:
t = (r - ρ)/Sr = r/√[(1-r²)/(n-2)]

Example: Suppose a sample of n = 10 pairs of brothers and sisters is selected and the sample
correlation coefficient is found to be r = 0.7. Test the significance of this value against
ρ = 0.
t = 0.7/√[(1-0.7²)/(10-2)] = 0.7/√(0.51/8) = 0.7/0.2525 = 2.77
t0.05(8) = 2.31

Conclusion: we reject the null hypothesis, since the calculated t-value is greater than the
tabulated t-value.
2. In the case where ρ ≠ 0, say ρ = c, the frequency distribution of r is skewed.
(Figure: frequency curves of r for ρ = 0.5 and ρ = 0.8 on (-1, 1), increasingly skewed toward
+1 as ρ grows; the same is true, mirrored, for negative values of ρ.)
If ρ ≠ 0, we cannot use the t-distribution to test a hypothesis such as H0: ρ = 0.6. To test
such a hypothesis, we need to make the following transformation (Fisher's z):
Ƶr = ½ln[(1+r)/(1-r)] = ½(2.3026)log10[(1+r)/(1-r)]
Then Ƶr will be approximately normally distributed with mean
E(Ƶr) = ½ln[(1+ρ)/(1-ρ)] and var(Ƶr) = 1/(n-3).
Once transformed, we can use normal distribution theory to test hypotheses regarding the
population correlation coefficient. Note that Ƶr = r + r³/3 + r⁵/5 + r⁷/7 + …, which is the
inverse hyperbolic tangent of r (tanh⁻¹ r). For values of r with 0 ≤ r < 1, 0 ≤ Ƶr < ∞.

To test the hypothesis, we compute the normal deviate Ƶ = (Ƶr - Ƶρ)/σƵ, where σƵ² = 1/(n-3).
Example: Suppose 12 pairs of observations are collected, with r = 0.866. Test the hypothesis
ρ = 0.75.
Ƶr = ½ln[(1+0.866)/(1-0.866)] = 1.3169
Ƶρ = ½ln[(1+0.75)/(1-0.75)] = 0.973
Note: σƵ² = 1/(n-3) = 1/(12-3) = 1/9 = 0.111, σƵ = 0.333

Ƶ = (Ƶr - Ƶρ)/σƵ = (1.3169 - 0.973)/0.333 = 1.032

Ƶ0.05(2-sided) = 1.96

Conclusion: we do not reject the hypothesis.

Construction of a confidence interval about ρ

Since Ƶr is approximately normally distributed, we can easily find a confidence interval for
Ƶρ. Then, using the inverse r-Ƶ transformation (or a table), we can find the confidence
interval for ρ. The confidence interval for Ƶρ is: Ƶr - Ƶα/2 σƵ ≤ Ƶρ ≤ Ƶr + Ƶα/2 σƵ

Example: Let us use the previous example and find the 95% confidence interval.
r = 0.866
Ƶr = 1.3169
σƵ = 0.333
Ƶr ± Ƶ0.05(2) σƵ = 1.3169 ± (1.96)(0.333), giving 0.664 ≤ Ƶρ ≤ 1.970
Then the 95% confidence interval for ρ is 0.581 ≤ ρ ≤ 0.962, obtained by referring to the
table or by back-transforming Ƶ to ρ (ρ = tanh Ƶ).
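A sketch (Python, illustrative only) of the Fisher z procedures above — the test of H0: ρ = 0.75 and the 95% confidence interval for r = 0.866, n = 12:

from math import atanh, tanh, sqrt

r, n, rho0 = 0.866, 12, 0.75

zr, zp = atanh(r), atanh(rho0)          # z-transform = 1/2 ln[(1+r)/(1-r)]
se = 1 / sqrt(n - 3)
z = (zr - zp) / se
print(round(z, 3))                      # 1.032 -> |z| < 1.96, do not reject

lo, hi = zr - 1.96 * se, zr + 1.96 * se
print(round(tanh(lo), 3), round(tanh(hi), 3))   # 0.581 <= rho <= 0.962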

Comparing two correlation coefficients

We may have correlations from two sources. Assume, for instance, two species of birds: let A
stand for the first species and B for the second. Assume that you took measurements of wing
length and tail length on 15 birds from the first species and 18 birds from the second, with
the following results, where x stands for wing length and y for tail length:

rxy(A) = 0.843, and rxy(B) = 0.784

Test the hypothesis that the two samples came from populations with a common correlation
coefficient, that is, ρ1 = ρ2 = ρ.
H0: ρ1 = ρ2 vs. Ha: ρ1 ≠ ρ2
Then we compute Ƶ = (Ƶ1 - Ƶ2)/√[1/(n1-3) + 1/(n2-3)]
For our data, r1 = 0.843, r2 = 0.784
Ƶ1 = 1.2315, Ƶ2 = 1.0557
n1 = 15, n2 = 18
Ƶ = (1.2315 - 1.0557)/√(1/12 + 1/15) = 0.1758/0.3873 = 0.454

Since 0.454 is less than Ƶ0.05(2-sided) = 1.96, we accept H0. Under this condition, we may
calculate the average or weighted correlation coefficient (on the Ƶ scale):
Ƶw = [(n1-3)Ƶ1 + (n2-3)Ƶ2] / [(n1-3) + (n2-3)] = [(12)(1.2315) + (15)(1.0557)]/(12+15) = 1.1338
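A sketch (Python, not from the notes) of the two-sample comparison above for the bird species A and B:

from math import atanh, sqrt, tanh

r1, n1 = 0.843, 15
r2, n2 = 0.784, 18

z1, z2 = atanh(r1), atanh(r2)
z = (z1 - z2) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
print(round(z, 3))                        # 0.454 -> do not reject H0

# Weighted common estimate on the z scale, then back to the r scale.
zw = ((n1 - 3) * z1 + (n2 - 3) * z2) / ((n1 - 3) + (n2 - 3))
print(round(zw, 4), round(tanh(zw), 3))   # 1.1338, about 0.812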

Rank correlation
There are cases where it is known that a relationship exists between two variables, but the
distributions of these variables are not known. Under such circumstances it is not possible to
compute r using our previous procedure. To take care of such a situation, that is, to
investigate the association of x and y when their distributions are not known, two statistics
have been developed by the famous statisticians Spearman and Kendall. These statistics are
called, respectively, Spearman's rho (ρs) and Kendall's tau (τ). Statistics which do not depend
upon a specific distribution are known as non-parametric (distribution-free) statistics.

To compute an estimate of Spearman's rho (ρs), we use the formula rs = 1 - 6Σdi²/(n³-n), where
rs is the estimate of ρs, n is the sample size and di is the difference in ranks between x & y.
Tied observations receive the average of the ranks they would occupy; e.g., if the two smallest
values are equal, each is ranked (1+2)/2 = 1.5.

Procedure
1. Rank the data in both x & y separately

109
2. Find the differences in ranks between x & y
3. Compute sum of squares of the differences in ranks
4. Then compute rs as given before
5. For n exceeding 20, rs is approximately normally distributed, which means that the normal
distribution can be used to test hypotheses. If n is less than 20, a special table of critical
values of rs for non-parametric statistics should be used.

Example: Test H0: ρs = 0 using the following data:

x      Rank of x (Rx)   y     Rank of y (Ry)   Difference (Rx-Ry)   di²
10.4   4                7.4   5                -1.0                 1.00
10.8   8.5              7.6   7                 1.5                 2.25
11.1   10               7.9   11               -1.0                 1.00
10.2   1.5              7.2   2.5              -1.0                 1.00
10.3   3                7.4   5                -2.0                 4.00
10.2   1.5              7.1   1                 0.5                 0.25
10.7   7                7.4   5                 2.0                 4.00
10.5   5                7.2   2.5               2.5                 6.25
10.8   8.5              7.8   9.5              -1.0                 1.00
11.2   11               7.7   8                 3.0                 9.00
10.6   6                7.8   9.5              -3.5                12.25
11.4   12               8.3   12                0                   0

n = 12; Σdi² = 42
rs = 1 - 6Σdi²/(n³-n) = 1 - 6(42)/(12³-12) = 0.853
From the specialized table for rs, the critical value is 0.648. Since the computed value
exceeds the tabulated value, we reject H0: ρs = 0.
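A sketch (Python, not from the notes) of Spearman's rank correlation with midranks for ties, applied to the example above:

x = [10.4, 10.8, 11.1, 10.2, 10.3, 10.2, 10.7, 10.5, 10.8, 11.2, 10.6, 11.4]
y = [7.4, 7.6, 7.9, 7.2, 7.4, 7.1, 7.4, 7.2, 7.8, 7.7, 7.8, 8.3]

def midranks(values):
    """Rank 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of positions i..j (1-based)
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

rx, ry = midranks(x), midranks(y)
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rs = 1 - 6 * d2 / (n ** 3 - n)
print(d2, round(rs, 3))               # 42.0 0.853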

Intra-class correlation
Sometimes we have data in which the members of a pair cannot be designated as x and y, or as
x1 and x2. For instance, information on some characteristic of twins falls in this category.

Example: correlation data on the weights of some human twins:
Group Values (kg) Values (kg)
1 70.4 71.3
2 68.2 67.4
3 77.3 75.2
4 61.2 66.7

5 74.1 72.9
6 74.1 69.5
7 72.3 74.2
The intra-class correlation is defined (for groups of two) as:
rI = (Group MS - Error MS) / (Group MS + Error MS)
Using our data, we compute the various sums of squares; the analysis of variance table looks
like the following:

AOV table
Source                df   SS       MS      F
Between groups         6   198.31   33.05   10.6
Within groups/Error    7   21.86    3.12
Total                 13   220.17

rI = (33.05 - 3.12)/(33.05 + 3.12) = 0.827

To test H0: ρI = 0, we use the F-test.
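A minimal sketch (Python, my own illustration) of rI and its F-test computed from the mean squares in the table above:

ms_between, ms_within = 33.05, 3.12      # from the AOV table

F = ms_between / ms_within               # F-test of H0: rho_I = 0
r_I = (ms_between - ms_within) / (ms_between + ms_within)
print(round(F, 1), round(r_I, 3))        # 10.6 0.827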

9. Regression
One of the most frequently used techniques in research for finding the relationship between
variables which are causally related is known as regression.

What is regression?
We observe that the taller a person is, the heavier he is. Then, what is the relationship
between weight and height? If, for instance, we are given the height of a person, we can
estimate his weight (using a regression model); or, if weight and height are related, we may
also wish to know the closeness of the relationship.

Other examples:
1. Relationship between consumption and family income. Does the expenditure on
consumption depend on the income of a family?
2. What is the relationship between milk yield and consumption of ration by cows?

In regression, there is a functional relationship between the dependent and independent
variables: the magnitude of the dependent variable is determined by the magnitude of the
independent variables. There is a cause and effect relationship.

Relationship between regression and correlation


In regression, we have a functional relationship between y & x, where y is the dependent
variable and x is the independent variable, whereas in correlation there is no such
dependent-independent relationship. In regression there is a cause and effect relationship,
but not in correlation.

Simple regression
A regression is said to be simple if there is one independent variable.
Examples:
1. Relationship between temperature and armyworm population.
2. Fruit size of orange and the level of phosphorus in the soil, etc.

Multiple regression
In this type of regression, we have more than one independent variable defining the dependent
variable. Example: weight gain of an animal as a function of feed, temperature, breed, disease
and age.

In this course, we are interested in simple linear regression, and we will concentrate on
dealing with linear regression.

Linear regression
This means that the relationship between the dependent and independent variables can be
expressed as a straight line.

The regression model

In the regression model, we look into the relationship between, say, x & y. Here we consider
two aspects: one is the relationship per se between the two variables, and the other is
estimating the value of the dependent variable given that we know the value of the independent
variable.

Example, suppose we have height (inches) and weight (lb) of a group of children as follows:
Height (x) Weight (y) E(y)
50 40 41 42 43 44 42
51 41 43 44 46 46 44
52 41 44 45 48 52 46
53 43 46 47 49 55 48
54 44 46 49 51 60 50

Note that we have 25 pairs of numbers, such as (50, 40), (50, 41), etc. These 25 pairs comprise
the population.
(Figure: scatter plot of weight against height with the fitted line ŷ = b0 + b1x passing
through the sub-population means E(y|x).)

As we see in the graph, we have a sub-population of y values for each x value. The x values
are always fixed. We can talk about the distribution of the y values. There are two cases:
1. In the first case, the distribution of y is specified, such as normal
2. In the second case, the distribution is not specified

The average of the y values is known as the expected value of the sub-population.
Example: E(y|x=54) = (44+46+49+51+60)/5 = 50

The best fitting line

This is the line which goes through the observations, dividing them in such a way that their
deviations from it are minimized.
(Figure: scatter of points about the fitted line ŷ = b0 + b1x, which crosses the y-axis at b0.)
The regression equation
The typical regression equation is given by:
yi = β0 + β1xi + εi, where: yi = a typical observation
                            xi = the independent variable
                            εi = error term
                            β0 & β1 = the regression coefficients
                            β0 = the intercept
                            β1 = the slope
Intercept: This is the point at which the best fitting line crosses the vertical axis; it is
the value of y when x = 0, i.e. the part of y that is not related to x.

Slope: represented by β1, it measures the amount of change in y effected by a unit change in x.
That is, if x increases by one unit, the change in y is measured by β1.

Both β0& β1 are known as regression coefficient. The equation is known as the regression curve
or the regression function.

Estimation of the parameters

In the regression equation, the parameters β0 & β1 are unknown and have to be estimated using
sample data. How do we estimate these parameters?

Least squares technique: The coefficients β0 & β1 are estimated by the best fitting line, given
by the equation ŷ = b0 + b1x. This line is drawn by finding b0 & b1 in such a way that the sum
of the squares of the deviations from this line is minimized. This is done using the technique
of least squares.

Development of the procedure for least squares:

Let yi = β0 + β1xi + εi                          (1)
    E(yi) = β0 + β1xi                            (2)
The estimate of the expected value is ŷi = b0 + b1xi.
Consider the deviations (yi - ŷi), and let S = Σ(yi - ŷi)².

Find the values of b0 & b1 such that S is minimized. Using differential calculus, the values
of b0 & b1 that minimize S are given by the following formulas:
b1 = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n] = SSxy/SSx
b0 = ȳ - b1x̄

Example: Let us consider the following data, in which age x (days) and wing length y (cm) of
sparrows are given.

Age x (days):        3.0  4.0  5.0  6.0  8    9    10   11   12   14   15   16   17
Wing length y (cm):  1.4  1.5  2.2  2.4  3.1  3.2  3.2  3.9  4.1  4.7  4.5  5.2  5.0

n = 13;  Σx = 130,  Σy = 44.4;  x̄ = 10.0,  ȳ = 3.42;  Σx² = 1562.0,  Σy² = 171.3;  Σxy = 514.8
SSx  = Σx² - (Σx)²/n    = 1562 - (130)²/13 = 262
SSy  = Σy² - (Σy)²/n    = 171.3 - (44.4)²/13 = 19.66
SSxy = Σxy - (Σx)(Σy)/n = 514.8 - (130)(44.4)/13 = 70.80

b1 = SSxy/SSx = 70.8/262 = 0.27 cm/day
b0 = ȳ - b1x̄ = 3.42 - (0.27)(10) = 0.72

The equation of the line will be ŷ = b0 + b1xi = 0.72 + 0.27xi
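A short sketch (Python, not part of the notes) of the least-squares computations for the sparrow data above:

x = [3, 4, 5, 6, 8, 9, 10, 11, 12, 14, 15, 16, 17]            # age, days
y = [1.4, 1.5, 2.2, 2.4, 3.1, 3.2, 3.2, 3.9, 4.1, 4.7, 4.5, 5.2, 5.0]
n = len(x)

ssx = sum(a * a for a in x) - sum(x) ** 2 / n                  # 262
ssxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n  # 70.8
b1 = ssxy / ssx                                                # 0.27 cm/day
b0 = sum(y) / n - b1 * sum(x) / n                              # 0.713, about 0.72
print(round(b1, 3), round(b0, 3))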

Standard error of the estimates
Now we have found the equation of the best fitting line. What remains is to find out how well
the line fits the set of observations.

For example, given the age of a sparrow, how accurately can we estimate its wing length?
(Figure: two scatter plots about their fitted lines — in (a) the points are widely scattered,
in (b) they are concentrated near the line.)

Note that the values in (a) are scattered, whereas they are concentrated in (b). This means
that line (b) fits its observations better than line (a) does. The standard deviation in (a)
is larger than that in (b). The variance of y is given by Sy² = Σ(y - ȳ)²/(n-1).

Using the same approach, we can define the conditional variance of y given x. The deviations of interest are now of the form yi − ŷi.

The variance of y given x is as shown below:

Sy²/x = Σ(yi − ŷi)²/(n − 2)

The reason for dividing by n − 2 is that two parameters, β0 and β1, have been estimated in the regression equation. Defined in this way, Sy²/x is an unbiased estimate of the population conditional variance σy²/x. For a large sample, computing every deviation yi − ŷi is too cumbersome, and the simpler computational formula is the following:

Sy²/x = (1/(n − 2)) [SSy − (SSxy)²/SSx]
      = (1/(n − 2)) [Σy² − (Σy)²/n − (Σxy − (Σx)(Σy)/n)² / (Σx² − (Σx)²/n)]

The two formulas are algebraically identical. Using this conditional variance, the variances of the regression coefficients are:

V(b0) = S²b0 = Sy²/x Σx² / [n(Σx² − (Σx)²/n)] = Sy²/x Σx² / (n SSx)
V(b1) = S²b1 = Sy²/x / [Σx² − (Σx)²/n] = Sy²/x / SSx
Following our previous example, we can compute the standard errors of the estimates as follows:
SSy = Σy² − (Σy)²/n = 19.6569
SSxy = Σxy − (Σx)(Σy)/n = 70.8
SSx = Σx² − (Σx)²/n = 262

Then, Sy²/x = (1/(n − 2)) [SSy − (SSxy)²/SSx]
            = (1/11) [19.6569 − (70.8)²/262]
            = 0.5247/11 = 0.0477

Sy/x = √0.0477 = 0.2184

S²b1 = Sy²/x / SSx = 0.0477/262 = 0.000182, so Sb1 = √0.000182 = 0.0135

S²b0 = Sy²/x Σx² / (n SSx) = (0.0477)(1562) / (13 × 262) = 0.0219, so Sb0 = √0.0219 = 0.1479
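
Continuing the code sketch above (same assumptions and variable names), the conditional variance and the standard errors follow directly from these computational formulas:

    SSy = np.sum(y**2) - np.sum(y)**2 / n    # corrected sum of squares of y: about 19.657

    s2_yx = (SSy - SSxy**2 / SSx) / (n - 2)  # conditional variance Sy²/x: about 0.0477
    s_yx = s2_yx ** 0.5                      # Sy/x: about 0.218

    s_b1 = (s2_yx / SSx) ** 0.5                       # standard error of the slope: about 0.0135
    s_b0 = (s2_yx * np.sum(x**2) / (n * SSx)) ** 0.5  # standard error of the intercept: about 0.148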

The Y-intercept
Given the value of b1, we can have different lines depending upon what value b0 assumes. If b1 remains the same but b0 changes, we obtain several lines that are parallel to one another.

[Figure: four parallel lines ŷk = b0(k) + b1x, k = 1, ..., 4, with a common slope b1 and four different intercepts b0(1), ..., b0(4).]

If, on the other hand, the intercept b0 remains the same and b1 changes, we obtain lines which start from one point and diverge from one another. This phenomenon is known as concurrence: the lines are said to concur at the point b0.

[Figure: four lines ŷk = b0 + b1(k)x, k = 1, ..., 4, with a common intercept b0 and four different slopes b1(1), ..., b1(4), all passing through the same point on the y-axis.]
Extrapolation beyond the data points
We are generally advised not to extrapolate beyond the range of the data we have. The reason for this is that we do not know the nature of the curve beyond our data; we do not know whether or not it will remain linear.
Assumptions underlying the use of regression
1. For any value of x, there is a normally distributed sub-population of y values.
2. The variances of these sub-populations are equal.
3. The errors in the dependent variable are additive.
4. The independent variable is measured without error, since its values are fixed.

Prediction and confidence interval

Sometimes we will be interested in estimating the values of the regression coefficients or in setting confidence intervals about them. To do so, we must first find the distribution of the estimates of the coefficients.

Note that the sample estimate of β1 is b1. If we take several samples, we will have several values of b1. This means that b1 has a sampling distribution with mean β1 and some variance. That is, the mean of b1 is E(b1) = β1, and
V(b1) = σ²b1 = σ²y/x / SSx
By a standard result in statistics, the sampling distribution of b1 is normal: any linear function of normal random variables is itself normal, and b1 is a linear function of the y values, which are normal under the model y = β0 + β1x + ε.

A confidence interval on β1 is given as follows:

b1 − Ƶα/2 σb1 ≤ β1 ≤ b1 + Ƶα/2 σb1

For a large sample, we can use the estimated standard error to set the confidence interval on β1:
b1 − Ƶα/2 Sb1 ≤ β1 ≤ b1 + Ƶα/2 Sb1

Otherwise, for a small sample, use the t distribution with n − 2 degrees of freedom:
b1 − tα/2 Sb1 ≤ β1 ≤ b1 + tα/2 Sb1

Using our previous example with the normal approximation, a 95% C.I. is:
0.27 − 1.96(0.0135) ≤ β1 ≤ 0.27 + 1.96(0.0135)

0.244 ≤ β1 ≤ 0.296

(Since n = 13 is small here, the t-based interval with t0.025(11) = 2.201 would strictly be more appropriate, giving 0.240 ≤ β1 ≤ 0.300.)
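
As a check on this interval, here is a sketch of the t-based computation in Python, continuing the earlier code and assuming scipy is available:

    from scipy import stats

    t_crit = stats.t.ppf(0.975, df=n - 2)   # t(0.025, 11): about 2.201
    lo = b1 - t_crit * s_b1
    hi = b1 + t_crit * s_b1
    print(f"95% CI for beta1: ({lo:.3f}, {hi:.3f})")  # about (0.240, 0.300)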

Test of hypothesis on the regression coefficient β1

Sometimes we will make a statement regarding the slope, i.e. the regression coefficient, and we may like to check the validity of this statement. The statement may be that the regression coefficient β1 is zero. This in effect means that there is no linear relationship between the dependent and the independent variable: the dependent variable can be expressed without the independent variable.

To test this hypothesis, either the Ƶ-test (for large samples) or the t-test (for small samples) can be used. The t-test is illustrated below.

Let the hypothesis be of the form H0: β1 = 0. Then the t statistic takes the following shape:
t = (b1 − β1)/Sb1 = (b1 − 0)/Sb1
For our example above, b1 = 0.27 and Sb1 = 0.0135, so t = 0.27/0.0135 = 20.
Conclusion: Since 20 far exceeds the critical value t0.025(11) = 2.201, the slope is significantly different from zero.
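
The same test in code, continuing the sketch above (scipy is assumed):

    t_stat = (b1 - 0) / s_b1                         # about 20
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)  # two-sided p-value, far below 0.05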

The F-test on β1
To develop this test, we must first of all define the various sums of squares.
1. Total sum of squares: SStotal = Σy² − (Σy)²/n
2. Regression sum of squares: SSregression = [Σxy − (Σx)(Σy)/n]² / [Σx² − (Σx)²/n] = (SSxy)²/SSx = b1 SSxy
3. Residual sum of squares: SSresidual = SStotal − SSregression

The Analysis of Variance Table

Source       df     SS                  MS                  F
Regression   1      b1 SSxy             SSregression/1      MSregression/MSresidual
Residual     n−2    by subtraction      SSresidual/(n−2)
Total        n−1    Σy² − (Σy)²/n

Continuing with our example, we have:

SStotal = SSy = 19.6569
SSreg = b1 SSxy = (SSxy)²/SSx = (70.8)²/262 = 19.1322
SSres = 19.6569 − 19.1322 = 0.5247

The AOV table

Source              df    SS        MS        F
Linear regression   1     19.1322   19.1322   401.1
Residual            11    0.5247    0.0477
Total               12    19.6569

F0.05(1, 11) = 4.84
Conclusion: Since 401.1 > 4.84, reject H0: β1 = 0. (Note that F ≈ 401 ≈ t² = 20², as it should be for a regression with a single independent variable.)
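
The table can be generated from the earlier code sketch as follows (scipy's F distribution is assumed for the p-value):

    SS_total = SSy                # about 19.657
    SS_reg = SSxy**2 / SSx        # = b1 * SSxy: about 19.132
    SS_res = SS_total - SS_reg    # about 0.525

    MS_reg = SS_reg / 1
    MS_res = SS_res / (n - 2)
    F = MS_reg / MS_res           # about 401; note F is approximately t_stat**2
    p = stats.f.sf(F, 1, n - 2)   # p-value for H0: beta1 = 0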

Coefficient of determination (R²)

The coefficient of determination measures the proportion of the variability in y which is accounted for by the independent variable x. It is given by:

R² = SSregression/SStotal

The remaining variability, 1 − R² = SSresidual/SStotal, is the part of the variation in y that is not related to x.

For our previous example, R² = 19.1322/19.6569 = 0.97

The high value of R² indicates that most of the variation in y is accounted for by the linear regression of y on x.
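
In code, continuing the sketch above, R² and its complement are one-liners:

    R2 = SS_reg / SS_total           # about 0.97
    unexplained = SS_res / SS_total  # 1 - R², the part of the variation not related to x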
