Lecture4 Module2 Anova 1
MODULE II
LECTURE - 4
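$\mathrm{Var}(Y_i) = \sigma^2$.
This is the linear model in the expectation form, where $\beta_1, \beta_2, \ldots, \beta_p$ are the unknown parameters and the $x_{ij}$'s are the known values of the independent covariates $X_1, X_2, \ldots, X_p$. Alternatively, the linear model can be expressed as
$$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$
where the $\varepsilon_i$'s are identically and independently distributed random error components with mean 0 and variance $\sigma^2$, i.e.,
$$E(\varepsilon_i) = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2 \quad \text{and} \quad \mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0 \ (i \neq j).$$
In matrix notation, the linear model can be expressed as
$$Y = X\beta + \varepsilon$$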
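where $Y = (Y_1, Y_2, \ldots, Y_n)'$ is an $n \times 1$ vector of observations on the response variable, the matrix
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$
is an $n \times p$ matrix of $n$ observations on the $p$ independent covariates $X_1, X_2, \ldots, X_p$,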
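$\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is a $p \times 1$ vector of unknown regression parameters (or regression coefficients) $\beta_1, \beta_2, \ldots, \beta_p$ associated with $X_1, X_2, \ldots, X_p$, respectively, and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$ is an $n \times 1$ vector of random error components.
When the model is used for the analysis of variance, the matrix $X$ is termed the design matrix; the unknown $\beta_1, \beta_2, \ldots, \beta_p$ are termed effects; and the covariates $X_1, X_2, \ldots, X_p$ are counter variables or indicator variables, where $x_{ij}$ counts the number of times the effect $\beta_j$ occurs in the $i$th observation $x_i$.
The value $x_{ij} = 1$ indicates the presence of effect $\beta_j$ in $x_i$, and $x_{ij} = 0$ indicates the absence of effect $\beta_j$ in $x_i$.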
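As a small illustration of the counter-variable coding, the following Python sketch builds a design matrix from effect labels and evaluates $X\beta$. The labels, the helper name `indicator_design_matrix` and the numeric effect values are assumptions made purely for illustration; none of them come from the lecture.

```python
import numpy as np

def indicator_design_matrix(labels, p):
    """Return the n x p matrix X with x_ij = 1 if effect j is present in
    observation i and 0 otherwise (labels are 1-based effect numbers)."""
    labels = np.asarray(labels)
    X = np.zeros((labels.size, p), dtype=int)
    X[np.arange(labels.size), labels - 1] = 1
    return X

# Hypothetical data: 5 observations, each carrying exactly one of p = 3 effects.
labels = [1, 3, 2, 1, 3]
X = indicator_design_matrix(labels, p=3)

# Hypothetical effect values beta_1, beta_2, beta_3.
beta = np.array([2.0, 5.0, 7.0])

# X @ beta picks out the effect that is present in each observation.
expected_y = X @ beta

print(X)
print(expected_y)
```

Each row of `X` contains a single 1, so each entry of `expected_y` equals the one effect present in that observation.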
Note that in the linear regression model, the covariates are usually continuous variables. When some of the covariates are counter variables and the rest are continuous variables, the model is called a mixed model and is used in the analysis of covariance.
In the case of the analysis of variance model, the one-way classification model considers only one covariate, the two-way classification model considers two covariates, the three-way classification model considers three covariates, and so on. If $\beta$, $\gamma$ and $\delta$ denote the effects associated with the covariates $X$, $Z$ and $W$, which are counter variables, then in
Consider an example of agricultural yield. The study variable denotes the yield, which depends on various covariates $X_1, X_2, \ldots, X_p$. In the case of regression analysis, the covariates $X_1, X_2, \ldots, X_p$ are the different variables like temperature,
Now consider the case of the one-way model and try to understand its interpretation in terms of the multiple regression model. The covariate $X$ is now measured at different levels; e.g., if $X$ is the quantity of fertilizer, suppose there are $p$ possible values, say 1 Kg., 2 Kg., ..., $p$ Kg. Then $X_1, X_2, \ldots, X_p$ denote these $p$ values in the following way. The linear model can now be expressed as
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$
by defining
$$X_1 = \begin{cases} 1 & \text{if effect of 1 Kg. fertilizer is present} \\ 0 & \text{if effect of 1 Kg. fertilizer is absent} \end{cases}$$
$$X_2 = \begin{cases} 1 & \text{if effect of 2 Kg. fertilizer is present} \\ 0 & \text{if effect of 2 Kg. fertilizer is absent} \end{cases}$$
$$\vdots$$
$$X_p = \begin{cases} 1 & \text{if effect of } p \text{ Kg. fertilizer is present} \\ 0 & \text{if effect of } p \text{ Kg. fertilizer is absent.} \end{cases}$$
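If the effect of 1 Kg. of fertilizer is present, then the other effects will obviously be absent and the linear model is expressible as
$$Y = \beta_0 + \beta_1 (X_1 = 1) + \beta_2 (X_2 = 0) + \cdots + \beta_p (X_p = 0) + \varepsilon = \beta_0 + \beta_1 + \varepsilon.$$
If the effect of 2 Kg. of fertilizer is present, then
$$Y = \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 1) + \cdots + \beta_p (X_p = 0) + \varepsilon = \beta_0 + \beta_2 + \varepsilon.$$
If the effect of $p$ Kg. of fertilizer is present, then
$$Y = \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 0) + \cdots + \beta_p (X_p = 1) + \varepsilon = \beta_0 + \beta_p + \varepsilon$$
and so on.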
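If the experiment with 1 Kg. of fertilizer is repeated $n_1$ times, then $n_1$ observations on the response variable are recorded, which can be represented as
$$\begin{aligned} Y_{11} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \varepsilon_{11} \\ Y_{12} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \varepsilon_{12} \\ &\;\;\vdots \\ Y_{1n_1} &= \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \varepsilon_{1n_1}. \end{aligned}$$
If $X_2 = 1$ is repeated $n_2$ times, then on the same lines $n_2$ observations on the response variable are recorded, which can be represented as
$$\begin{aligned} Y_{21} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \cdots + \beta_p \cdot 0 + \varepsilon_{21} \\ Y_{22} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \cdots + \beta_p \cdot 0 + \varepsilon_{22} \\ &\;\;\vdots \\ Y_{2n_2} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \cdots + \beta_p \cdot 0 + \varepsilon_{2n_2}. \end{aligned}$$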
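The experiment is continued and if $X_p = 1$ is repeated $n_p$ times, then on the same lines
$$\begin{aligned} Y_{p1} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 1 + \varepsilon_{p1} \\ Y_{p2} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 1 + \varepsilon_{p2} \\ &\;\;\vdots \\ Y_{pn_p} &= \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 1 + \varepsilon_{pn_p}. \end{aligned}$$
All these $n_1 + n_2 + \cdots + n_p$ observations can be represented in matrix form as
$$Y = X\beta + \varepsilon.$$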
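The stacking of the replicated one-way observations into a single design matrix can be sketched in Python as follows. This is only an illustrative sketch: the helper name `one_way_design_matrix`, the replicate counts and the numeric values of $\beta_0, \beta_1, \ldots, \beta_p$ are hypothetical and are not taken from the lecture.

```python
import numpy as np

def one_way_design_matrix(replicates):
    """Stack the one-way layout: level j (j = 1..p) is repeated
    replicates[j-1] times. Column 0 is a column of ones carrying beta_0;
    column j carries the indicator X_j."""
    p = len(replicates)
    n = sum(replicates)
    X = np.zeros((n, p + 1), dtype=int)
    X[:, 0] = 1
    row = 0
    for j, n_j in enumerate(replicates, start=1):
        X[row:row + n_j, j] = 1  # X_j = 1 for the n_j replicates of level j
        row += n_j
    return X

replicates = [2, 3, 1]                  # hypothetical n_1, n_2, n_3
X = one_way_design_matrix(replicates)
beta = np.array([10.0, 1.0, 2.0, 3.0])  # hypothetical beta_0, beta_1, beta_2, beta_3

# Without the error term, every replicate of level j has mean beta_0 + beta_j.
mean_y = X @ beta

print(X)
print(mean_y)
```

The rows of `X` appear in the same order as the observations $Y_{11}, \ldots, Y_{1n_1}, Y_{21}, \ldots, Y_{pn_p}$, so the product `X @ beta` reproduces the non-random part of each equation.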
In the two-way analysis of variance model, there are two covariates and the linear model is expressible as
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \gamma_1 Z_1 + \gamma_2 Z_2 + \cdots + \gamma_q Z_q + \varepsilon$$
where $X_1, X_2, \ldots, X_p$ denote, e.g., the $p$ levels of quantity of fertilizer, say 1 Kg., 2 Kg., ..., $p$ Kg., and $Z_1, Z_2, \ldots, Z_q$ denote, e.g., the $q$ levels of irrigation, say 10 Cms., 20 Cms., ..., $10q$ Cms. The levels $X_1, X_2, \ldots, X_p$, $Z_1, Z_2, \ldots, Z_q$ are defined as counter variables indicating the presence or absence of the effect, as in the earlier case. If the effects of $X_1$ and $Z_1$ are present, i.e., 1 Kg. of fertilizer and 10 Cms. of irrigation are used, then the linear model is written as
$$Y = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \cdots + \beta_p \cdot 0 + \gamma_1 \cdot 1 + \gamma_2 \cdot 0 + \cdots + \gamma_q \cdot 0 + \varepsilon = \beta_0 + \beta_1 + \gamma_1 + \varepsilon.$$
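The case in which the first fertilizer level and the first irrigation level are both present can be checked numerically. The sketch below is a hypothetical illustration: the helper `two_way_row`, the choice $p = 3$, $q = 2$ and all numeric effect values are assumptions, not values from the lecture.

```python
import numpy as np

def two_way_row(i, j, p, q):
    """Design-matrix row (1, X_1..X_p, Z_1..Z_q) for an observation taken at
    fertilizer level i and irrigation level j (both 1-based)."""
    row = np.zeros(1 + p + q, dtype=int)
    row[0] = 1       # column of ones for beta_0
    row[i] = 1       # X_i = 1, all other X's are 0
    row[p + j] = 1   # Z_j = 1, all other Z's are 0
    return row

p, q = 3, 2                        # hypothetical numbers of levels
beta0 = 10.0                       # hypothetical beta_0
beta = [1.0, 2.0, 3.0]             # hypothetical fertilizer effects
gamma = [0.5, 1.5]                 # hypothetical irrigation effects
params = np.array([beta0, *beta, *gamma])

# 1 Kg. of fertilizer (X_1 = 1) together with 10 Cms. of irrigation (Z_1 = 1).
row = two_way_row(1, 1, p, q)
mean_y = row @ params              # non-random part of Y for this combination

print(row)
print(mean_y)
```

Only the intercept, the first fertilizer effect and the first irrigation effect contribute to `mean_y`; every other term is multiplied by zero.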
If all $\beta$'s are unknown constants, they are called the parameters of the model and the model is called a fixed-effects model or model I. The objective in this case is to make inferences about the parameters and the error variance $\sigma^2$.
If $x_{ij} = 1$ for all $i = 1, 2, \ldots, n$, then $\beta_j$ is termed an additive constant. In this case, $\beta_j$ occurs with every observation.
If all $\beta$'s are observable random variables except the additive constant, then the linear model is termed a random-effects model, model II or variance components model. The objective in this case is to make inferences about the variances of the $\beta$'s, i.e., $\sigma^2_{\beta_1}, \sigma^2_{\beta_2}, \ldots, \sigma^2_{\beta_p}$, and the error variance $\sigma^2$ and/or certain functions of them.
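To see why the variances are the quantities of interest in a random-effects model, the following Python sketch simulates a one-way layout with a random group effect. It rests on assumptions that the lecture does not state: normal distributions, independence between effects and errors, a single common effect variance `sigma_beta**2`, and all numeric values. Under these assumptions the variance of an observation is the sum of the effect variance and the error variance.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the sketch is reproducible

n_groups, n_rep = 5000, 4           # hypothetical layout
beta0 = 10.0                        # additive constant (fixed)
sigma_beta, sigma = 2.0, 1.0        # hypothetical std. dev. of effects and errors

# One random effect per group, shared by all replicates in that group.
effects = rng.normal(0.0, sigma_beta, size=n_groups)
errors = rng.normal(0.0, sigma, size=(n_groups, n_rep))
y = beta0 + effects[:, None] + errors

# Average within-group sample variance estimates the error variance sigma^2.
within_var = y.var(axis=1, ddof=1).mean()

# Overall sample variance is close to sigma_beta^2 + sigma^2.
total_var = y.var(ddof=1)

print(within_var, total_var)
```

With these assumed values, `within_var` should be near 1 and `total_var` near 5; the difference between the two reflects the variance contributed by the random effects.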
If some parameters are fixed and some are random variables, then the model is called a mixed-effects model or model III. In a mixed-effects model, at least one effect is a fixed constant and at least one is a random variable. The objective is to make inferences about the fixed-effect parameters, the variances of the random effects and the error variance $\sigma^2$.