
5 - LSD and Missing Plot

The document discusses the analysis of Latin Square Designs (LSD) and techniques for handling missing data in experiments. It explains the structure of a Latin square, the process of randomizing rows and columns, and the statistical models used for analysis, including the formulation of hypotheses and the analysis of variance (ANOVA). Additionally, it covers methods for estimating and imputing missing observations to maintain the integrity of the data set during analysis.


Analysis of Variance and Design of Experiments
Experimental Designs and Their Analysis
Lecture 21
Analysis in Latin Square Design and Missing Plot Technique

Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Slides can be downloaded from http://home.iitk.ac.in/~shalab/sp


Latin Square:
A Latin square of order p is an arrangement of p symbols in $p^2$ cells
arranged in p rows and p columns such that each symbol occurs
once and only once in each row and in each column.
For example, to write a Latin square of order 4,
• choose four Latin letters as symbols: A, B, C and D.
• Write them so that each of the letters A, B, C and D occurs
once and only once in each row and each column.

A B C D
B C D A
C D A B
D A B C
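
The definition can be checked mechanically. The following is a minimal sketch, not part of the original slides: the helper names `cyclic_latin_square` and `is_latin_square` are hypothetical, and the cyclic shift is only one of many ways to write down a Latin square.

```python
def cyclic_latin_square(symbols):
    """Build a Latin square by cyclically shifting the first row."""
    p = len(symbols)
    return [[symbols[(i + j) % p] for j in range(p)] for i in range(p)]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    if len(symbols) != p:
        return False
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(p)} == symbols for j in range(p))
    return rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

With the four letters A, B, C and D, the cyclic construction gives the order-4 square written above.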
Analysis of LSD (one observation per cell):
In designing an LSD of order p,
• choose one Latin square at random from the set of all
possible Latin squares of order p.
• Select a standard Latin square from the set of all standard
Latin squares with equal probability.
• Randomize all the rows and columns as follows:
  - Choose a random number less than p, say $n_1$; then the 2nd
    row is the $n_1$th row.
  - Choose another random number less than p, say $n_2$; then
    the 3rd row is the $n_2$th row, and so on.
  - Then do the same for the columns.
For Latin squares of order less than 5, fix the first row, then
randomize the rows and then randomize the columns.

For Latin squares of order 5 or more, there is no need to fix even
the first row. Just randomize all rows and columns.

Analysis of LSD (one observation per cell): Example
Suppose the following Latin square is chosen:
A B C D E
B C D E A
D E A B C
E A B C D
C D E A B

Now randomize rows, e.g., 3rd row becomes 5th row and 5th row
becomes 3rd row. The Latin square becomes
A B C D E
B C D E A
C D E A B
E A B C D
D E A B C
Now randomize columns, say the 5th column becomes the 1st column,
the 1st column becomes the 4th column and the 4th column becomes
the 5th column:
E B C A D
A C D B E
B D E C A
D A B E C
C E A D B

Now use this Latin square for the assignment of treatments.
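
The row and column randomization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, indices are 0-based, and the fixed seed is there purely so that the random part is reproducible.

```python
import random


def permute_rows(square, order):
    """Return a new square whose r-th row is row order[r] of the input."""
    return [list(square[r]) for r in order]


def permute_columns(square, order):
    """Return a new square whose c-th column is column order[c] of the input."""
    return [[row[c] for c in order] for row in square]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    return (len(symbols) == p
            and all(set(row) == symbols for row in square)
            and all({row[j] for row in square} == symbols for j in range(p)))


chosen = [list("ABCDE"), list("BCDEA"), list("DEABC"), list("EABCD"), list("CDEAB")]

# Swap the 3rd and 5th rows (0-based indices 2 and 4).
after_rows = permute_rows(chosen, [0, 1, 4, 3, 2])

# 5th column -> 1st, 1st -> 4th, 4th -> 5th; the 2nd and 3rd columns stay in place.
after_cols = permute_columns(after_rows, [4, 1, 2, 0, 3])
for row in after_cols:
    print(" ".join(row))

# A fully random version: shuffle the row order and the column order independently.
rng = random.Random(0)  # fixed seed, an assumption made only for reproducibility
randomized = permute_columns(permute_rows(chosen, rng.sample(range(5), 5)),
                             rng.sample(range(5), 5))
print(is_latin_square(after_cols), is_latin_square(randomized))
```

Permuting whole rows or whole columns never breaks the Latin property, which is why the final check succeeds for any choice of permutations.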

Analysis of LSD (one observation per cell):
$y_{ijk}$: Observation on the kth treatment in the ith row and jth column,
$i = 1, 2, \ldots, v$; $j = 1, 2, \ldots, v$; $k = 1, 2, \ldots, v$.

The triplets $(i, j, k)$ take on only the $v^2$ values indicated by the
particular Latin square selected for the experiment.

The $y_{ijk}$'s are independently distributed as $N(\mu + \alpha_i + \beta_j + \tau_k, \sigma^2)$.

Linear model is
$$y_{ijk} = \mu + \alpha_i + \beta_j + \tau_k + \varepsilon_{ijk}, \quad i = 1, 2, \ldots, v;\; j = 1, 2, \ldots, v;\; k = 1, 2, \ldots, v$$
where $\varepsilon_{ijk}$ are random errors which are identically and
independently distributed following $N(0, \sigma^2)$ with
$$\sum_{i=1}^{v} \alpha_i = 0, \quad \sum_{j=1}^{v} \beta_j = 0, \quad \sum_{k=1}^{v} \tau_k = 0,$$
$\alpha_i$: main effect of rows
$\beta_j$: main effect of columns
$\tau_k$: main effect of treatments.

The null hypotheses under consideration are

$H_{0R}: \alpha_1 = \alpha_2 = \ldots = \alpha_v = 0$
$H_{0C}: \beta_1 = \beta_2 = \ldots = \beta_v = 0$
$H_{0T}: \tau_1 = \tau_2 = \ldots = \tau_v = 0.$

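The analysis of variance can be developed on the same lines as
earlier.
Minimizing $S = \sum_{i=1}^{v} \sum_{j=1}^{v} \sum_{k=1}^{v} \varepsilon_{ijk}^2$ with respect to $\mu$, $\alpha_i$, $\beta_j$ and $\tau_k$ gives
the least squares estimates as

$\hat{\mu} = \bar{y}_{ooo}$
$\hat{\alpha}_i = \bar{y}_{ioo} - \bar{y}_{ooo}, \quad i = 1, 2, \ldots, v$
$\hat{\beta}_j = \bar{y}_{ojo} - \bar{y}_{ooo}, \quad j = 1, 2, \ldots, v$
$\hat{\tau}_k = \bar{y}_{ook} - \bar{y}_{ooo}, \quad k = 1, 2, \ldots, v.$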
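
As a sketch of where these estimates come from (this intermediate step is not spelled out in the slides), the normal equations for $\mu$ and $\alpha_i$ can be written out as follows. The argument relies on the side conditions and on the fact that each row of a Latin square meets every column and every treatment exactly once.

```latex
\begin{align*}
\frac{\partial S}{\partial \mu} = 0
  &\;\Rightarrow\; \sum_{(i,j,k)} \left( y_{ijk} - \mu - \alpha_i - \beta_j - \tau_k \right) = 0 \\
  &\;\Rightarrow\; G - v^2 \mu - v \sum_i \alpha_i - v \sum_j \beta_j - v \sum_k \tau_k = 0
  \;\Rightarrow\; \hat{\mu} = \frac{G}{v^2} = \bar{y}_{ooo}, \\[4pt]
\frac{\partial S}{\partial \alpha_i} = 0
  &\;\Rightarrow\; \sum_{(j,k)\ \text{in row } i} \left( y_{ijk} - \mu - \alpha_i - \beta_j - \tau_k \right) = 0 \\
  &\;\Rightarrow\; v\,\bar{y}_{ioo} - v \mu - v \alpha_i - \sum_j \beta_j - \sum_k \tau_k = 0
  \;\Rightarrow\; \hat{\alpha}_i = \bar{y}_{ioo} - \bar{y}_{ooo}.
\end{align*}
```

Here $G$ denotes the grand total of all $v^2$ observations. The estimates of the column and treatment effects follow from the same argument applied to a fixed column and a fixed treatment.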

Using the fitted model based on these estimators, the total sum
of squares can be partitioned into the mutually orthogonal sums
of squares SSR, SSC, SSTr and SSE as

$$TSS = SSR + SSC + SSTr + SSE$$

where

TSS: Total sum of squares
$$TSS = \sum_{i=1}^{v} \sum_{j=1}^{v} \sum_{k=1}^{v} (y_{ijk} - \bar{y}_{ooo})^2 = \sum_{i=1}^{v} \sum_{j=1}^{v} \sum_{k=1}^{v} y_{ijk}^2 - \frac{G^2}{v^2}$$

SSR: Sum of squares due to rows
$$SSR = v \sum_{i=1}^{v} (\bar{y}_{ioo} - \bar{y}_{ooo})^2 = \frac{\sum_{i=1}^{v} R_i^2}{v} - \frac{G^2}{v^2}, \quad \text{where } R_i = \sum_{j=1}^{v} \sum_{k=1}^{v} y_{ijk}$$

SSC: Sum of squares due to columns
$$SSC = v \sum_{j=1}^{v} (\bar{y}_{ojo} - \bar{y}_{ooo})^2 = \frac{\sum_{j=1}^{v} C_j^2}{v} - \frac{G^2}{v^2}, \quad \text{where } C_j = \sum_{i=1}^{v} \sum_{k=1}^{v} y_{ijk}$$

SSTr: Sum of squares due to treatments
$$SSTr = v \sum_{k=1}^{v} (\bar{y}_{ook} - \bar{y}_{ooo})^2 = \frac{\sum_{k=1}^{v} T_k^2}{v} - \frac{G^2}{v^2}, \quad \text{where } T_k = \sum_{i=1}^{v} \sum_{j=1}^{v} y_{ijk}$$

Here G is the grand total of all $v^2$ observations.

Degrees of freedom carried by SSR, SSC and SSTr are (v - 1) each.
Degrees of freedom carried by TSS are $v^2 - 1$.
Degrees of freedom carried by SSE are (v - 1)(v - 2).
Thus
- under $H_{0R}$, $F_R = \dfrac{MSR}{MSE} \sim F((v-1), (v-1)(v-2))$
- under $H_{0C}$, $F_C = \dfrac{MSC}{MSE} \sim F((v-1), (v-1)(v-2))$
- under $H_{0T}$, $F_T = \dfrac{MSTr}{MSE} \sim F((v-1), (v-1)(v-2))$.

Decision rules:
Reject $H_{0R}$ at level $\alpha$ if $F_R > F_{1-\alpha;\,(v-1),(v-1)(v-2)}$
Reject $H_{0C}$ at level $\alpha$ if $F_C > F_{1-\alpha;\,(v-1),(v-1)(v-2)}$
Reject $H_{0T}$ at level $\alpha$ if $F_T > F_{1-\alpha;\,(v-1),(v-1)(v-2)}$.

If any null hypothesis is rejected, then use a multiple comparison test.
Analysis of LSD (one observation per cell): ANOVA Table
The analysis of variance table is as follows:

Source of variation   Degrees of freedom   Sum of squares   Mean squares   F-value
Rows                  v - 1                SSR              MSR            FR
Columns               v - 1                SSC              MSC            FC
Treatments            v - 1                SSTr             MSTr           FT
Error                 (v - 1)(v - 2)       SSE              MSE
Total                 v^2 - 1              TSS
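
The table can be filled in directly from the row, column and treatment totals. The following is a minimal sketch, not part of the original slides: the function name `lsd_anova` is hypothetical and the 4 x 4 responses are made-up numbers used only to exercise the formulas.

```python
def lsd_anova(layout, y):
    """ANOVA for a v x v Latin square with one observation per cell.

    layout[i][j] is the treatment label in row i, column j;
    y[i][j] is the response observed in that cell.
    """
    v = len(y)
    G = sum(sum(row) for row in y)      # grand total
    cf = G ** 2 / v ** 2                # correction factor G^2 / v^2

    row_totals = [sum(row) for row in y]
    col_totals = [sum(row[j] for row in y) for j in range(v)]
    trt_totals = {}
    for i in range(v):
        for j in range(v):
            trt_totals[layout[i][j]] = trt_totals.get(layout[i][j], 0.0) + y[i][j]

    tss = sum(val ** 2 for row in y for val in row) - cf
    ssr = sum(t ** 2 for t in row_totals) / v - cf
    ssc = sum(t ** 2 for t in col_totals) / v - cf
    sstr = sum(t ** 2 for t in trt_totals.values()) / v - cf
    sse = tss - ssr - ssc - sstr

    df_main, df_err = v - 1, (v - 1) * (v - 2)
    mse = sse / df_err
    table = {}
    for name, ss in (("Rows", ssr), ("Columns", ssc), ("Treatments", sstr)):
        ms = ss / df_main
        table[name] = {"df": df_main, "SS": ss, "MS": ms, "F": ms / mse}
    table["Error"] = {"df": df_err, "SS": sse, "MS": mse}
    table["Total"] = {"df": v * v - 1, "SS": tss}
    return table


# Hypothetical data laid out on the order-4 cyclic Latin square.
layout = ["ABCD", "BCDA", "CDAB", "DABC"]
y = [[12, 15, 9, 11],
     [14, 10, 13, 16],
     [8, 12, 14, 13],
     [11, 17, 15, 9]]

table = lsd_anova(layout, y)
for source, entry in table.items():
    print(source, {key: round(value, 4) for key, value in entry.items()})
```

Each F value would then be compared with the tabulated critical value of the F distribution with (v - 1) and (v - 1)(v - 2) degrees of freedom; the sketch stops short of that lookup to stay within the standard library.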

Analysis of LSD (one observation per cell):
The expectations of mean squares are obtained as
$$E(MSR) = E\left(\frac{SSR}{v-1}\right) = \sigma^2 + \frac{v}{v-1} \sum_{i=1}^{v} \alpha_i^2$$
$$E(MSC) = E\left(\frac{SSC}{v-1}\right) = \sigma^2 + \frac{v}{v-1} \sum_{j=1}^{v} \beta_j^2$$
$$E(MSTr) = E\left(\frac{SSTr}{v-1}\right) = \sigma^2 + \frac{v}{v-1} \sum_{k=1}^{v} \tau_k^2$$
$$E(MSE) = E\left(\frac{SSE}{(v-1)(v-2)}\right) = \sigma^2.$$

Missing plot techniques:
It often happens in conducting experiments that some
observations are missing.
This may happen for several reasons.
For example, in a clinical trial, suppose the readings of blood
pressure are to be recorded 3 days after giving the medicine to
the patients. Suppose the medicine is given to 20 patients and one
of the patients does not turn up to provide the reading.
Similarly, in an agricultural experiment, the seeds are sown and
yields are to be recorded after a few months. Suppose some cattle
destroy the crop of a plot, or the crop of a plot is destroyed
by a storm, insects, etc.
In such cases, one option is to
- somehow estimate the missing value on the basis of the available
data,
- put it back in the data and make the data set complete.

Now conduct the statistical analysis on the completed data set as if
no value was missing, making the necessary adjustments in the
statistical tools to be applied.

Such an area comes under the purview of “missing data models”,
in which a lot of development has taken place.
We discuss here the classical missing plot technique proposed
by Yates, which involves the following steps:
• Estimate the missing observations by the values which make
the error sum of squares minimum.
• Substitute unknown values for the missing observations.
• Express the error sum of squares as a function of these
unknown values.
• Minimize the error sum of squares using the principle of
maxima/minima, i.e., differentiate it with respect to the
missing value, set the derivative equal to zero and form a linear equation.
• Form as many linear equations as the number of unknown
values (i.e., differentiate the error sum of squares with
respect to each unknown value).
• Solve all the linear equations simultaneously; the solutions
provide the missing values.
• Impute the missing values with the estimated values and
complete the data.
• Apply the analysis of variance tools.
• The error sum of squares thus obtained is correct, but the
treatment sum of squares is not.
• The number of degrees of freedom associated with the total
sum of squares is reduced by the number of missing values,
and the same reduction is made in the error sum of squares.
• No change in the degrees of freedom of the sum of squares due
to treatment is needed.

Missing observations in RBD: One missing observation
Suppose one observation in the (i, j)th cell is missing and let it be x.
The arrangement of observations in the RBD will then be as follows:

                                 Treatments (Factor B)
Blocks (Factor A)    1     2     ...   j                 ...   v     Block totals
1                    y11   y12   ...   y1j               ...   y1v   B1
2                    y21   y22   ...   y2j               ...   y2v   B2
.                    .     .           .                       .     .
i                    yi1   yi2   ...   yij = x           ...   yiv   Bi = y'io + x
.                    .     .           .                       .     .
b                    yb1   yb2   ...   ybj               ...   ybv   Bb
Treatment totals     T1    T2    ...   Tj = y'oj + x     ...   Tv    Grand total G' = y'oo + x

where
y'oo: total of known observations,
y'io: total of known observations in the ith block,
y'oj: total of known observations in the jth treatment.
$$\text{Correction factor } (CF) = \frac{(G')^2}{n} = \frac{(y'_{oo} + x)^2}{bv}$$
$$TSS = \sum_{i=1}^{b} \sum_{j=1}^{v} y_{ij}^2 - CF = (x^2 + \text{terms which are constant with respect to } x) - CF$$
$$SSBl = \frac{1}{v} \left[ (y'_{io} + x)^2 + \text{terms which are constant with respect to } x \right] - CF$$
$$SSTr = \frac{1}{b} \left[ (y'_{oj} + x)^2 + \text{terms which are constant with respect to } x \right] - CF$$
$$SSE = TSS - SSBl - SSTr$$
$$\quad = x^2 - \frac{1}{v} (y'_{io} + x)^2 - \frac{1}{b} (y'_{oj} + x)^2 + \frac{(y'_{oo} + x)^2}{bv} + (\text{terms which are constant with respect to } x).$$

Find x such that SSE is minimum:

$$\frac{\partial (SSE)}{\partial x} = 0 \;\Rightarrow\; 2x - \frac{2(y'_{io} + x)}{v} - \frac{2(y'_{oj} + x)}{b} + \frac{2(y'_{oo} + x)}{bv} = 0$$
or
$$x = \frac{b\,y'_{io} + v\,y'_{oj} - y'_{oo}}{(b-1)(v-1)}.$$

The second-order derivative condition for x to provide the minimum
SSE can be easily verified.
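
The estimate can be checked numerically. The following is a minimal sketch with made-up yields for b = 3 blocks and v = 4 treatments; the function names are hypothetical and `None` marks the missing plot. It applies the formula and then evaluates SSE slightly to either side of the estimate.

```python
def rbd_missing_value(data, i, j):
    """Estimate the missing (i, j) cell of a b x v RBD; data[i][j] is None."""
    b, v = len(data), len(data[0])
    block_known = sum(val for val in data[i] if val is not None)        # y'_io
    trt_known = sum(row[j] for row in data if row[j] is not None)       # y'_oj
    total_known = sum(val for row in data for val in row
                      if val is not None)                               # y'_oo
    return (b * block_known + v * trt_known - total_known) / ((b - 1) * (v - 1))


def filled(data, i, j, x):
    """Copy of data with x placed in cell (i, j)."""
    out = [list(row) for row in data]
    out[i][j] = x
    return out


def rbd_sse(data):
    """Error sum of squares of a complete b x v RBD."""
    b, v = len(data), len(data[0])
    cf = sum(map(sum, data)) ** 2 / (b * v)
    tss = sum(val ** 2 for row in data for val in row) - cf
    ssbl = sum(sum(row) ** 2 for row in data) / v - cf
    sstr = sum(sum(row[j] for row in data) ** 2 for j in range(v)) / b - cf
    return tss - ssbl - sstr


# Hypothetical yields: rows are blocks, columns are treatments.
data = [[20, 24, 18, 22],
        [19, None, 17, 21],
        [23, 26, 20, 25]]

x_hat = rbd_missing_value(data, 1, 1)
print(round(x_hat, 4))
for x in (x_hat - 0.5, x_hat, x_hat + 0.5):
    print(round(x, 4), round(rbd_sse(filled(data, 1, 1, x)), 4))
```

For these numbers the formula gives (3·57 + 4·50 − 235)/6 = 136/6, and the printed SSE values are smallest at that point, as the minimization argument requires.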

Missing observations in RBD: Two missing observations
If there are two missing observations, let them be x and y.
- Let the corresponding row sums (block totals) be $(R_1 + x)$ and $(R_2 + y)$.
- Let the corresponding column sums (treatment totals) be $(C_1 + x)$ and $(C_2 + y)$.
- Let the total of known observations be S.
Then
$$SSE = x^2 + y^2 - \frac{1}{v} \left[ (R_1 + x)^2 + (R_2 + y)^2 \right] - \frac{1}{b} \left[ (C_1 + x)^2 + (C_2 + y)^2 \right] + \frac{1}{bv} (S + x + y)^2 + \text{terms independent of } x \text{ and } y.$$

Now differentiate SSE with respect to x and y:

$$\frac{\partial (SSE)}{\partial x} = 0 \;\Rightarrow\; x - \frac{R_1 + x}{v} - \frac{C_1 + x}{b} + \frac{S + x + y}{bv} = 0$$
$$\frac{\partial (SSE)}{\partial y} = 0 \;\Rightarrow\; y - \frac{R_2 + y}{v} - \frac{C_2 + y}{b} + \frac{S + x + y}{bv} = 0.$$

Thus, solving the following two linear equations in x and y, we
obtain the estimated missing values:

$$(b-1)(v-1)\,x = bR_1 + vC_1 - S - y$$
$$(b-1)(v-1)\,y = bR_2 + vC_2 - S - x.$$
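
The two equations can be solved in closed form. The following is a minimal sketch with made-up yields; the function name `rbd_two_missing` is hypothetical, and the code assumes, as the SSE expression does, that the two missing cells lie in different blocks and different treatments. Writing a = (b − 1)(v − 1), p = bR1 + vC1 − S and q = bR2 + vC2 − S, the system is ax + y = p and x + ay = q.

```python
def known_sum(values):
    """Sum of the entries that are not missing (None)."""
    return sum(val for val in values if val is not None)


def rbd_two_missing(data, cell_x, cell_y):
    """Estimate two missing cells of a b x v RBD lying in different blocks and treatments."""
    b, v = len(data), len(data[0])
    (i1, j1), (i2, j2) = cell_x, cell_y
    if i1 == i2 or j1 == j2:
        raise ValueError("this sketch assumes different blocks and different treatments")

    R1, R2 = known_sum(data[i1]), known_sum(data[i2])
    C1 = known_sum(row[j1] for row in data)
    C2 = known_sum(row[j2] for row in data)
    S = known_sum(val for row in data for val in row)

    a = (b - 1) * (v - 1)
    p = b * R1 + v * C1 - S
    q = b * R2 + v * C2 - S
    det = a * a - 1                 # zero only when b = v = 2
    return (a * p - q) / det, (a * q - p) / det


# Hypothetical yields: rows are blocks, columns are treatments.
data = [[20, None, 18, 22],
        [19, 23, 17, 21],
        [23, 26, None, 25]]

x_hat, y_hat = rbd_two_missing(data, (0, 1), (2, 2))
print(round(x_hat, 4), round(y_hat, 4))
```

An alternative used in practice is to iterate the one-missing-value formula, updating x and y in turn until they stabilize; for two unknowns the direct solution above is simpler.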


Adjustments to be done in analysis of variance:

i. Obtain the within-block sum of squares from the incomplete data.
ii. Subtract the correct error sum of squares from (i). This gives the
correct treatment sum of squares.
iii. Reduce the degrees of freedom of the error sum of squares by
the number of missing observations.
iv. No adjustments in other sums of squares are required.
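
Steps (i) to (iii) can be sketched for one missing observation as follows. This is only an illustration under stated assumptions: the yields are made up, the function names are hypothetical, and `None` marks the missing plot.

```python
def known(values):
    """List of the entries that are not missing (None)."""
    return [val for val in values if val is not None]


def within_block_ss(data):
    """Step (i): within-block sum of squares from the known observations only."""
    total = 0.0
    for row in data:
        obs = known(row)
        total += sum(val ** 2 for val in obs) - sum(obs) ** 2 / len(obs)
    return total


def estimate_missing(data, i, j):
    """Estimate of the single missing cell (i, j) that minimizes SSE."""
    b, v = len(data), len(data[0])
    block = sum(known(data[i]))
    trt = sum(known(row[j] for row in data))
    total = sum(known(val for row in data for val in row))
    return (b * block + v * trt - total) / ((b - 1) * (v - 1))


def treatment_and_error_ss(data):
    """(SSTr, SSE) of a complete b x v RBD."""
    b, v = len(data), len(data[0])
    cf = sum(map(sum, data)) ** 2 / (b * v)
    tss = sum(val ** 2 for row in data for val in row) - cf
    ssbl = sum(sum(row) ** 2 for row in data) / v - cf
    sstr = sum(sum(row[j] for row in data) ** 2 for j in range(v)) / b - cf
    return sstr, tss - ssbl - sstr


# Hypothetical yields: rows are blocks, columns are treatments.
data = [[20, 24, 18, 22],
        [19, None, 17, 21],
        [23, 26, 20, 25]]
b, v = len(data), len(data[0])

completed = [list(row) for row in data]
completed[1][1] = estimate_missing(data, 1, 1)

sstr_uncorrected, sse = treatment_and_error_ss(completed)
sstr_corrected = within_block_ss(data) - sse        # step (ii)
error_df = (b - 1) * (v - 1) - 1                    # step (iii): one missing value

print(round(sstr_uncorrected, 4), round(sstr_corrected, 4), error_df)
```

The treatment sum of squares computed from the completed data is never smaller than the corrected one, which is why the correction matters for the F test on treatments.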

Missing observations in LSD:
Let
- x be the missing observation in the (i, j, k)th cell, i.e.,
$y_{ijk}$, $i = 1, 2, \ldots, v$, $j = 1, 2, \ldots, v$, $k = 1, 2, \ldots, v$.
- R: Total of known observations in the ith row
- C: Total of known observations in the jth column
- T: Total of known observations receiving the kth treatment
- S: Total of known observations

Now
$$\text{Correction factor } (CF) = \frac{(S + x)^2}{v^2}$$
$$\text{Total sum of squares } (TSS) = x^2 + \text{terms which are constant with respect to } x - CF$$
$$\text{Row sum of squares } (SSR) = \frac{(R + x)^2}{v} + \text{terms which are constant with respect to } x - CF$$
$$\text{Column sum of squares } (SSC) = \frac{(C + x)^2}{v} + \text{terms which are constant with respect to } x - CF$$
$$\text{Treatment sum of squares } (SSTr) = \frac{(T + x)^2}{v} + \text{terms which are constant with respect to } x - CF$$
$$\text{Sum of squares due to error } (SSE) = TSS - SSR - SSC - SSTr$$
$$\quad = x^2 - \frac{1}{v} \left[ (R + x)^2 + (C + x)^2 + (T + x)^2 \right] + \frac{2 (S + x)^2}{v^2} + \text{terms which are constant with respect to } x.$$

Choose x such that SSE is minimum. So

$$\frac{d(SSE)}{dx} = 0 \;\Rightarrow\; 2x - \frac{2}{v} (R + C + T + 3x) + \frac{4 (S + x)}{v^2} = 0$$
or
$$x = \frac{v (R + C + T) - 2S}{(v-1)(v-2)}.$$
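
The same kind of numerical check works for the Latin square. The following is a minimal sketch with a made-up 4 x 4 data set; the function names are hypothetical and `None` marks the missing cell.

```python
def lsd_missing_value(layout, y, i, j):
    """Estimate the missing cell (i, j) of a v x v Latin square design."""
    v = len(y)
    trt = layout[i][j]
    R = sum(val for val in y[i] if val is not None)             # known row total
    C = sum(row[j] for row in y if row[j] is not None)          # known column total
    T = sum(y[r][c] for r in range(v) for c in range(v)
            if layout[r][c] == trt and y[r][c] is not None)     # known treatment total
    S = sum(val for row in y for val in row if val is not None) # known grand total
    return (v * (R + C + T) - 2 * S) / ((v - 1) * (v - 2))


def lsd_sse(layout, y):
    """Error sum of squares of a complete v x v Latin square design."""
    v = len(y)
    cf = sum(map(sum, y)) ** 2 / v ** 2
    tss = sum(val ** 2 for row in y for val in row) - cf
    ssr = sum(sum(row) ** 2 for row in y) / v - cf
    ssc = sum(sum(row[j] for row in y) ** 2 for j in range(v)) / v - cf
    totals = {}
    for r in range(v):
        for c in range(v):
            totals[layout[r][c]] = totals.get(layout[r][c], 0.0) + y[r][c]
    sstr = sum(t ** 2 for t in totals.values()) / v - cf
    return tss - ssr - ssc - sstr


def filled(y, i, j, x):
    """Copy of y with x placed in cell (i, j)."""
    out = [list(row) for row in y]
    out[i][j] = x
    return out


# Hypothetical data; the cell in row 2, column 3 (treatment D) is missing.
layout = ["ABCD", "BCDA", "CDAB", "DABC"]
y = [[12, 15, 9, 11],
     [14, 10, None, 16],
     [8, 12, 14, 13],
     [11, 17, 15, 9]]

x_hat = lsd_missing_value(layout, y, 1, 2)
print(round(x_hat, 4))
for x in (x_hat - 0.5, x_hat, x_hat + 0.5):
    print(round(x, 4), round(lsd_sse(layout, filled(y, 1, 2, x)), 4))
```

Here R = 40, C = 38, T = 34 and S = 186, so the formula gives (4·112 − 2·186)/6 = 76/6, and SSE evaluated on either side of this value is larger.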

Adjustment to be done in analysis of variance:
Do all the steps as in the case of RBD.
To get the correct treatment sum of squares, proceed as follows:
• Ignore the treatment classification and consider only the row and
column classification.
• Substitute the estimated value, obtained under this row and column
classification as in the RBD case, in place of the missing observation.
• Obtain the error sum of squares from the completed data, say SSE1.
• Let SSE2 be the error sum of squares based on the LSD obtained earlier.
• Find the corrected treatment sum of squares = SSE1 - SSE2.
• Reduce the degrees of freedom of the error sum of squares by the
number of missing values.
