
General Linear Models I

Presidency University

November, 2023
General Linear model

• In the most general form, consider a setup where we have a single response variable y which is quantitative, and p covariates x1, x2, ..., xp which can be quantitative, qualitative, or both.

• Suppose we have n observations on each of these p variables. That is, suppose we have observations y1, y2, ..., yn on the response y and x1i, x2i, ..., xni on the i th covariate xi, i = 1, 2, ..., p.

• Then the general linear model can be written as

y = Xβ + ε

where y = (y1, y2, ..., yn)′ is the response vector, β = (β1, ..., βp)′ is the vector of parameters, and

X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}

is the design matrix.


Linear Model (contd.)

• Further, ε = (ε1, ε2, ..., εn)′ is the vector of random errors, where we assume
  • E(εi | X) = 0 ∀ i.
  • Var(εi | X) = σ² ∀ i.
  • Cov(εi, εj | X) = 0 ∀ i ≠ j.
  • These assumptions can alternatively be stated as E(ε | X) = 0 and D(ε | X) = σ²In.

• More specifically, we assume a single quantitative response variable y and p covariates such that

E(y | X) = Xβ and Var(y | X) = σ²In.
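To make these assumptions concrete, here is a minimal numpy sketch (an illustration, not part of the course material): it simulates y = Xβ + ε with homoscedastic, uncorrelated errors and recovers β by least squares. The dimensions, coefficient values, and noise level are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate y = X beta + eps with E(eps|X) = 0 and D(eps|X) = sigma^2 I_n.
n, p = 100, 3
X = rng.normal(size=(n, p))          # covariate values
beta = np.array([1.5, -2.0, 0.5])    # true parameter vector (arbitrary)
sigma = 1.0
y = X @ beta + rng.normal(scale=sigma, size=n)

# Least-squares estimate of beta.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # close to [1.5, -2.0, 0.5]
```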
Example of linear model: Snell’s law

• Snell’s law relates the angle of incidence (θ1) to the angle of refraction (θ2), when light crosses the boundary between two media, as

sin θ2 = κ sin θ1

where κ is the ratio of refractive indices of the two media.

• This law can be used to find the refractive index of one medium, if the refractive index of the other medium is known, after observing θ1 and θ2.

• However, any measurement will involve some error, and hence we can use the following model to estimate κ:

y = βx + ε

where y = sin θ2, x = sin θ1 and β = κ.


Example (Contd.)

• With repeated measurements on the angles of incidence and refraction we have the linear model

y = Xθ + ε

where y = (y1, y2, ..., yn)′, θ = β and

X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}

where yi = sin θ2i and xi = sin θ1i.
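A short numerical sketch of this estimation (hypothetical data, assuming a true κ of 1/1.5, roughly that of light passing from air into glass): with no intercept in the model, least squares reduces to β̂ = Σ xᵢyᵢ / Σ xᵢ².

```python
import numpy as np

rng = np.random.default_rng(1)

kappa = 1 / 1.5                                   # assumed true ratio of indices
theta1 = np.deg2rad(np.linspace(10, 70, 12))      # angles of incidence
theta2 = np.arcsin(kappa * np.sin(theta1))        # exact angles of refraction

x = np.sin(theta1)
y = np.sin(theta2) + rng.normal(scale=0.005, size=x.size)  # noisy measurements

# No-intercept least squares for y = beta * x + eps.
beta_hat = np.sum(x * y) / np.sum(x * x)
print(beta_hat)   # estimate of kappa, close to 0.6667
```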


Example: Traffic control plan

• The more cars there are on a road, the slower the traffic flow becomes.

• A fairly precise understanding of this is important to the transportation planner, since reducing travel time is frequently the main purpose behind expanding transportation facilities.

• Data on density in vehicles per mile and the corresponding speed in miles per hour were obtained.

• Since congestion affects speed (and not the other way around), we are interested in determining the effect of density on speed.
Example (Contd.)

[Figure: scatter plot of SPEED (miles per hour, roughly 10–40) against DENSITY (vehicles per mile, roughly 20–140).]
Example (Contd.)

• The plot of speed against density suggests a quadratic relationship.

• Hence we opt for a model of the form

y = α + βx + γx² + ε

where y is the speed, x is the density and ε is the random error.

• With n observations, we shall have a linear model

y = Xθ + ε

where y = (y1, y2, ..., yn)′, θ = (α, β, γ)′ and

X = \begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{pmatrix}
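As a sketch of the fitting step (the density and speed values below are invented for illustration — the original data are not reproduced in these slides), we build the three-column design matrix and solve for θ = (α, β, γ)′:

```python
import numpy as np

# Invented density (vehicles/mile) and speed (mph) values, for illustration only.
x = np.array([20, 30, 40, 55, 70, 85, 100, 115, 130, 140], dtype=float)
y = np.array([38, 34, 31, 28, 25, 22, 19, 16, 13, 11], dtype=float)

# Design matrix with columns 1, x, x^2 for y = alpha + beta*x + gamma*x^2 + eps.
X = np.column_stack([np.ones_like(x), x, x**2])
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta_hat)   # estimates of (alpha, beta, gamma)
```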
Classification

• If all the columns of X (except the first column) contain values of continuous variables, then the linear model is called a regression model.

• If all the columns of X contain values of discrete variables (more specifically, if all the columns contain the values 0 or 1), then the linear model is called an ANOVA model.

• If some columns of X contain values of continuous variables and some columns contain values of discrete variables, then the linear model is called an ANCOVA (or ANOCOVA) model.
Dealing with Factors

• In linear models we need to deal with what we call factor variables or factors, which are categorical variables with different categories. The different categories of a factor are called the factor levels.

• Suppose we have a single factor A with k levels A1, A2, ..., Ak having a potential effect on the response y. A natural question is: how do we model the effects of all these levels in a single linear model?

• The answer is to use indicator variables or dummy variables x1, x2, ..., xk−1, where

x_i = \begin{cases} 1 & \text{if the observation receives the } i\text{th level} \\ 0 & \text{otherwise.} \end{cases}

• We can write a linear model as

y = α + β1x1 + ... + βk−1xk−1 + ε

where βi is the effect of the i th level of A.
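A small sketch of this dummy coding (hypothetical observations of a three-level factor; A3 is the omitted baseline level):

```python
import numpy as np

# Hypothetical observations of a factor with k = 3 levels A1, A2, A3.
levels = np.array(["A1", "A2", "A3", "A1", "A2", "A3", "A1"])

# k - 1 = 2 dummy variables; A3 is the omitted (baseline) level.
x1 = (levels == "A1").astype(float)
x2 = (levels == "A2").astype(float)

# Design matrix for y = alpha + beta1*x1 + beta2*x2 + eps.
X = np.column_stack([np.ones(levels.size), x1, x2])
print(X)
```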
Using dummy variables

• So why did we use k − 1 dummy variables when we had k levels? Where is the effect of Ak modeled in the linear model?

• The answer is that if the observation receives the k th level of the factor A, then all xi = 0, i = 1, 2, ..., k − 1, and as such α represents the expected value of y when the observation receives Ak.

• When an observation receives the level Ai, i = 1, 2, ..., k − 1, the expected value of y is α + βi. As such, βi, i = 1, 2, ..., k − 1, represents the change in the expected value of y due to Ai as compared to Ak.

• That means each βi represents the expected difference in y when the observation belongs to Ai rather than Ak. For this reason the βi’s are sometimes called contrasts between two classes.
• Here we compare the effects of the other levels with that of Ak. In such a case we call Ak the baseline level.

• Obviously we can take any level Ai (not necessarily Ak) to be the baseline level.

• A general rule is thus: if we are working with a factor having k levels, then we need to introduce k − 1 dummy variables.
Using dummy for all levels

• Now suppose in the same situation we use k dummy variables x1, x2, ..., xk instead of the k − 1 variables x1, x2, ..., xk−1, and fit a model as

y = α + β1x1 + β2x2 + ... + βkxk + ε

• Then we note that the variables x1, x2, ..., xk are not independent: they satisfy the constraint Σi xi = 1, that is, any observation must receive exactly one of the levels Ai.

• Here the design matrix is

X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix}

but it is not of full rank: the last k columns sum to the first (intercept) column.

• Statistical lesson: there can be alternative parametrizations for the same model.
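The rank deficiency is easy to verify numerically; a sketch continuing the hypothetical three-level factor from before:

```python
import numpy as np

levels = np.array(["A1", "A2", "A3", "A1", "A2", "A3", "A1"])

# Intercept plus all k = 3 dummies: k + 1 = 4 columns.
dummies = np.column_stack([(levels == a).astype(float) for a in ("A1", "A2", "A3")])
X = np.column_stack([np.ones(levels.size), dummies])

# The dummy columns sum to the intercept column, so rank(X) = k = 3 < 4.
print(X.shape[1], np.linalg.matrix_rank(X))   # prints: 4 3
```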
Example: ANOVA model (One way layout)

• Suppose we have a factor A and let A1, A2, ..., Ak be the levels of A, which constitute the populations of interest.

• Further assume there are ni observations receiving the level Ai, and let yij be the j th observation receiving the i th level Ai.

• The model we consider is

yij = µi + eij, j = 1, 2, ..., ni, i = 1, 2, ..., k,

where µi is the fixed effect due to Ai and eij is the random error.

• We assume that

eij ∼ N(0, σ²)

and the eij’s are independent.

• This implies that E(yij) = µi and Var(yij) = σ² for all j = 1, ..., ni, which means the µi’s are the factor level means for each i and σ² is the common variability among observations belonging to each group.
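A simulation sketch of this one-way layout (all numbers are arbitrary): under the cell-means model, least squares estimates each µi by the corresponding group sample mean.

```python
import numpy as np

rng = np.random.default_rng(2)

mu = np.array([10.0, 12.0, 9.0])   # hypothetical factor-level means (k = 3)
n_i = np.array([5, 7, 6])          # unequal group sizes are allowed
sigma = 1.5

groups = np.repeat(np.arange(3), n_i)   # level index of each observation
y = mu[groups] + rng.normal(scale=sigma, size=groups.size)

# Least squares in the cell-means model reduces to the group means.
mu_hat = np.array([y[groups == i].mean() for i in range(3)])
print(mu_hat)   # close to [10, 12, 9]
```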
Interpretation of parameters

• In an observational study, the factor level means µi correspond to the means for the different factor level populations, and σ² is the common variability in each such population.

• For example, consider a study of the productivity of employees in each of three shifts operated in a plant. The populations consist of the employee productivities for each of the three shifts. The population mean µ1 is the mean productivity for employees in shift 1, and µ2 and µ3 are interpreted similarly. The variance σ² refers to the variability of employee productivities within a shift.
Interpretation (Contd.)

• In an experimental study, the factor level mean µi stands for the mean response that would be obtained if the i th treatment were applied to all units in the population of experimental units about which inferences are to be drawn. Similarly, the variance σ² refers to the variability of responses if any given experimental treatment were applied to the entire population of experimental units.

• For example, consider a completely randomized design to study the effects of three different training programs on employee productivity, in which 90 employees participate; a third of these employees is assigned at random to each of the three programs. The mean µ1 here denotes the mean productivity if training program I were given to each employee in the population of experimental units; the means µ2 and µ3 are interpreted correspondingly. The variance σ² denotes the variability in productivities if any one training program were given to each employee in the population of experimental units.
Summary of assumptions in one-way ANOVA

• Corresponding to each factor level, there is a probability distribution of responses.

• For example, in a study of the effects of four types of incentive pay on employee productivity, there is a probability distribution of employee productivities for each type of incentive pay.

• The ANOVA model assumes that:
  • Each probability distribution is normal.
  • Each probability distribution has the same variance.
  • The probability distributions, if they differ among themselves, differ only with respect to their means.
  • The responses for each factor level are random selections from the corresponding probability distribution and are independent of the responses for any other factor level.
One way ANOVA as linear model

• Here we have introduced k dummy variables for the k levels, but without any intercept.

• In terms of dummy variables we can write

y = µ1x1 + µ2x2 + ... + µkxk + ε

where xi = 1 or 0 according as the observation receives Ai or not.
One way ANOVA as linear model

• Suppose we denote

y = \begin{pmatrix} y_{11} \\ y_{12} \\ \vdots \\ y_{1n_1} \\ y_{21} \\ \vdots \\ y_{2n_2} \\ \vdots \\ y_{k1} \\ \vdots \\ y_{kn_k} \end{pmatrix}, \quad \beta = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_k \end{pmatrix}, \quad X_{n \times k} = \begin{pmatrix} 1_{n_1} & 0 & \cdots & 0 \\ 0 & 1_{n_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1_{n_k} \end{pmatrix} \quad \text{and} \quad \epsilon = \begin{pmatrix} e_{11} \\ e_{12} \\ \vdots \\ e_{1n_1} \\ e_{21} \\ \vdots \\ e_{2n_2} \\ \vdots \\ e_{k1} \\ \vdots \\ e_{kn_k} \end{pmatrix}

• Then the above model can be written as

y = Xβ + ε

where ε ∼ Nn(0, σ²In).
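A sketch of how this block design matrix can be built (the group sizes are arbitrary). Note that X′X = diag(n1, ..., nk), which is why the normal equations decouple into the group means.

```python
import numpy as np

n_i = [5, 7, 6]                    # hypothetical group sizes, k = 3
n, k = sum(n_i), len(n_i)

# Block design matrix: column i is the 0/1 indicator of level A_i.
X = np.zeros((n, k))
start = 0
for i, ni in enumerate(n_i):
    X[start:start + ni, i] = 1.0
    start += ni

print(X.shape)            # (18, 3)
print(np.diag(X.T @ X))   # [5. 7. 6.]  -- X'X is diagonal with entries n_i
```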
Reparametrization

• At times, an alternative but completely equivalent formulation of the single-factor ANOVA model is used. This alternative formulation is called the factor effects model.

• Let us write

µi = µ̄ + (µi − µ̄) = µ + αi

where µ = µ̄ = (Σi ni µi)/n and αi = µi − µ̄.

• Then we note that Σi ni αi = 0.

• Now our linear model of interest becomes

yij = µ + αi + eij, i = 1, 2, ..., k, j = 1, 2, ..., ni,

where µ denotes the general effect or the average effect and αi denotes the additional (fixed) effect due to Ai, subject to the restriction Σi ni αi = 0, and eij denotes the random error.

• We assume that, for all i, j, the eij are independent N(0, σ²) variables.
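A quick numerical check of this reparametrization (hypothetical µi and ni): computing µ as the weighted mean and αi = µi − µ̄ makes the constraint hold by construction.

```python
import numpy as np

mu_i = np.array([10.0, 12.0, 9.0])   # hypothetical factor-level means
n_i = np.array([5, 7, 6])            # group sizes

mu = np.sum(n_i * mu_i) / n_i.sum()  # weighted grand mean, mu = mu-bar
alpha = mu_i - mu                    # factor effects alpha_i

print(mu, alpha)
print(np.sum(n_i * alpha))           # 0 (up to floating-point rounding)
```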


Reparametrized form as linear Model

• Now, in terms of dummy variables, we have included k dummy variables for the k levels, along with an intercept.

• In this case the linear model becomes

y = Xβ + ε

where β = (µ, α1, α2, ..., αk)T and

X_{n \times (k+1)} = \begin{pmatrix} 1_{n_1} & 1_{n_1} & 0 & \cdots & 0 \\ 1_{n_2} & 0 & 1_{n_2} & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1_{n_k} & 0 & 0 & \cdots & 1_{n_k} \end{pmatrix}
Example: More use of dummy variables

• Consider a setup where we need to judge the effectiveness of a treatment (or perhaps compare the effectiveness of two treatments, in which case the control group may be thought of as getting some treatment). This may be a controlled experiment or an observational study.

• Suppose the data are obtained in the form

Control      Treatment
y11          y21
y12          y22
...          ...
y1n1         y2n2

• Note that we allow the number of observations in the two groups to be different, and the y’s represent the values of the response.

• This situation can also be tackled with a linear model with the use of dummy variables.
More use of dummy (contd.)

• Let us define a dummy variable as

x = 1 or 0 according as the observation receives the treatment or not.

• Then the linear model can be written as

z = α + βx + ε

or more precisely

zi = α + βxi + εi, i = 1, 2, ..., n (= n1 + n2).

• Here

z_i = \begin{cases} y_{1i}, & i = 1, 2, ..., n_1 \\ y_{2(i-n_1)}, & i = n_1 + 1, ..., n_1 + n_2 \end{cases}

and ε1, ε2, ..., εn are the random errors.
More use (Contd.)

• Suppose we write the above linear model as

z = Xθ + ε.

• Then θ = (α, β)′ is the vector of parameters, z is the response vector and ε is the random error vector.

• It is instructive to have a look at the structure of the design matrix

X_{n \times 2} = \begin{pmatrix} 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\ 1 & 1 \\ \vdots & \vdots \\ 1 & 1 \end{pmatrix}

where the upper submatrix consists of n1 rows and the lower one contains n2 rows.

• Note that in the above formulation the effect of the treatment is α + β and the effect of the control is α, so the change in effect due to the treatment is β.
