Random Effect Models

The document discusses random effect models, contrasting them with fixed effect models, and explains the application of Expected Mean Squares (EMS) in statistical analysis. It outlines the characteristics of fixed and random effects, provides a statistical model for random effects, and details the assumptions and variance components involved. Additionally, it covers ANOVA tables, variance component estimation methods, and hypothesis testing related to random effects in various research contexts.

RANDOM EFFECT MODELS

Topic Overview
- Random vs. Fixed Effects
- Using Expected Mean Squares (EMS) to obtain appropriate tests in a Random or Mixed Effects Model

Fixed vs. Random Effects


So far we have considered only fixed effect models in
which the levels of each factor were fixed in advance
of the experiment and we were interested in
differences in response among those specific levels.
A random effects model considers factors for which
the factor levels are meant to be representative of a
general population of possible levels.

Fixed-effect:
i. All the levels of the factor are predetermined by the
experimenter.
ii. If we were to repeat the experiment we would select the
same levels.
iii. We will not extrapolate our statistical conclusions to
other treatments not included in the study.

Random-effect:
i. The levels of the factors in the experiment are randomly
selected from a population of all possible levels; or the
levels result from a process of a random nature.
ii. If we were repeating the experiment we would select
different levels.
iii. We will apply our statistical conclusions to all
treatments of the population from which our
experimental treatments were randomly selected.

All three of the points above can be used by a researcher in deciding whether a particular factor should be treated as fixed or random in the analysis. The third point (the inference space) is probably the most crucial one.


Note: In some situations it is clear from the experiment whether an effect is fixed or random. However, there are also situations in which calling an effect fixed or random depends on your point of view and on your interpretation and understanding. It is therefore sometimes a personal choice.

Random effect models are applicable in various fields, including genetics, quality control, and the design of research studies.

Example:
-A plant-breeder obtains 6 crosses between wild
strawberries and strawberry cultivars. These are a random
sample of all potential crosses that the breeder could obtain.
-Fruit weights are measured from 8 plants of each cross.
These 8 plants are a random sample of the progeny possible
from each cross.

Statistical model for random effects:

A random effect model is a model that consists only of random effects (random variables). The only constant (fixed) component of such a model is the overall mean, μ (the intercept).

For one random factor, the model is stated as:

yij = μ + ai + εij


yij  The response from the jth experimental unit receiving the ith treatment (e.g., the fruit weight of the jth plant from the ith cross)

μ    Overall mean, an unknown constant (e.g., the true mean fruit weight of the population of all plants from all possible crosses)

ai   An effect due to the ith treatment, assumed to be normally distributed with mean 0 and variance σa² (random effect)

εij  A random error associated with the response from the jth experimental unit receiving the ith treatment

The assumptions:
i. The random components ai and εij are independent of each other.
ii. The εij error effects are assumed to be a random sample from a population with a mean of 0 and common variance σε².
iii. The ai values (the random effects for the groups) are assumed to be a random sample from a normally distributed population with a mean of 0 and variance σa².

If a2 = 0, then all group effects are equal, but if a2 > 0
there is variability among the group effects. Since the
group effects in the experiment are only a sample from
a large population of effects, the difference among the
specific group means, μ +αi, are of no particular
interest. The variance of the distribution of the group
effects, a2 is the focus of interest with random effects.

4
The variance of a single observation (a single fruit weight), σy², may be expressed as the sum of the two variances σa² and σε²:

σy² = σa² + σε²

The variances σa² and σε² are called components of variance, and the random-effect model is called a variance component model.
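The decomposition σy² = σa² + σε² can be checked empirically with a small simulation. The sketch below uses hypothetical variance values (not the strawberry data) and many random groups so the empirical variance is stable:

```python
import random
import statistics

# Hypothetical variance components (assumed values, not from the example)
sigma2_a = 0.25   # variance among random group effects
sigma2_e = 0.09   # experimental error variance
mu = 10.0         # overall mean

random.seed(1)
t, r = 2000, 5    # 2000 random groups, 5 observations each

y = []
for _ in range(t):
    a_i = random.gauss(0.0, sigma2_a ** 0.5)   # random effect for this group
    for _ in range(r):
        y.append(mu + a_i + random.gauss(0.0, sigma2_e ** 0.5))

# The variance of a single observation should be close to sigma2_a + sigma2_e = 0.34
print(statistics.pvariance(y))
```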

In the plant-breeding study, the variance component among groups/treatments (σa²) represents genetic variation among the crosses, and the plant breeder may be interested in the ratio of this genetic variation to the total variation (σy²).

OUTLINE OF ANOVA TABLE FOR ONE-FACTOR RANDOM-EFFECT MODEL IN CRD

Source of variation   Degrees of freedom   Sum of squares   Mean square          Expected mean square
Treatment             t − 1                SST              MST = SST/(t − 1)    σε² + rσa²
Error                 tr − t = N − t       SSE              MSE = SSE/(tr − t)   σε²
Total                 tr − 1 = N − 1       TSS

In the above ANOVA outline, the computation of the sums of squares for treatment and error is the same as for the fixed effect model:

TSS = Σ_{i=1..t} Σ_{j=1..ni} (yij − ȳ..)²

SST = Σ_{i=1..t} ri (ȳi. − ȳ..)²

SSE = Σ_{i=1..t} Σ_{j=1..ni} (yij − ȳi.)²

The ANOVA also gives the formulae for computing the expected mean squares for treatment and error (the variance components for treatment and error).
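As a quick check of these formulas, the sketch below computes the three sums of squares directly on a small hypothetical balanced data set (made-up numbers, not the strawberry measurements) and confirms the identity TSS = SST + SSE:

```python
# Hypothetical balanced data: t = 3 treatments, r = 4 observations each
data = [
    [4.1, 3.8, 4.4, 4.0],   # treatment 1
    [5.2, 5.0, 5.5, 5.1],   # treatment 2
    [3.5, 3.9, 3.6, 3.8],   # treatment 3
]

grand_mean = sum(sum(g) for g in data) / sum(len(g) for g in data)
group_means = [sum(g) / len(g) for g in data]

# Sums of squares from the formulas above
TSS = sum((y - grand_mean) ** 2 for g in data for y in g)
SST = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(data, group_means))
SSE = sum((y - m) ** 2 for g, m in zip(data, group_means) for y in g)

print(round(TSS, 6), round(SST + SSE, 6))   # the two values agree
```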
Expected Mean Squares
The analysis of variance method is used to estimate the variance components. The analysis of variance is computed as if the model were a fixed effects model. Mean squares are computed by dividing each sum of squares by its respective df:

MS = SS/df

and then the expected mean squares are derived algebraically using the model formulation and assumptions:

yij = μ + ai + εij

The observed mean squares are estimates of the expected mean squares.

MSE, the mean square for error, is the unbiased estimate of the experimental error variance (σε²).

When the factor is fixed, the expected value of MSFactor (MSTreatment) is equal to:

σε² + r · Σ_{i=1..t} (μi − μ̄.)² / (t − 1),   where μ̄. = (1/N) Σ_{i=1..t} ri μi

The term Σ_{i=1..t} (μi − μ̄.)² / (t − 1) characterizes the variance among treatment means.

If there are no differences between treatments, then the expected mean square for treatments is also equal to σε².

Algebraically, the expected mean square for the random factor case can also be evaluated. When the factor is random, the expected value of MSFactor (MSTreatment) is:

σε² + rσa²

where the component σa² is the variance among the levels of the random factor.

ANOVA table for random-effect model

Source of variation   Degrees of freedom   Sum of squares   Mean square          Expected mean square
Treatment             t − 1                SST              MST = SST/(t − 1)    σε² + rσa²
Error                 tr − t = N − t       SSE              MSE = SSE/(tr − t)   σε²
Total                 tr − 1 = N − 1       TSS

Expected(MSE) = σε²

Expected(MST) = σε² + rσa²   (for equal replications)

Expected(MST) = σε² + {[N − Σ_{i=1..t} ri²/N] / (t − 1)} σa²   (for unequal replications)

Note: In SAS (PROC GLM and PROC MIXED) the


EMS (expected mean squares) are not computed
algebraically, but are evaluated using a different
procedure called Method of moments.

For the Example:

Source      df   Sum of squares   Mean square   Expected mean square
Treatment    5        9.97           1.994       σε² + 8σa²
Error       42        3.60           0.085       σε²
Total       47       13.57

Point estimates of the variance components, using the method of moments (the default for PROC GLM):

1. Obtain the mean squares.
2. Evaluate the expectations of the mean squares.
3. Equate the expected and observed mean squares.
4. Solve the resulting system of equations:

σ̂ε² = MSE

σ̂a² = (MST − MSE) / r

For unequal replications:

σ̂a² = (MST − MSE) / r0,   where

r0 = [N − Σ_{i=1..t} ri²/N] / (t − 1)
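For illustration, r0 can be computed for a hypothetical unbalanced version of the example (the ri values below are made up, not from the data):

```python
# Hypothetical unequal replications for t = 6 treatments
r_i = [8, 8, 7, 9, 8, 8]
t = len(r_i)
N = sum(r_i)   # total number of observations

# r0 = [N - sum(ri^2)/N] / (t - 1)
r0 = (N - sum(r ** 2 for r in r_i) / N) / (t - 1)
print(round(r0, 4))   # slightly below 8, the balanced value

# sigma2_a_hat would then be (MST - MSE) / r0, as in the balanced case
```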

For the Example:

σ̂ε² = MSE = 0.085

σ̂a² = (MST − MSE)/r = (1.994 − 0.085)/8 = 0.239
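The same arithmetic written out, using the MST, MSE, and r values from the example ANOVA table above:

```python
# Mean squares from the example ANOVA table
MST, MSE, r = 1.994, 0.085, 8

sigma2_e_hat = MSE                # error variance component estimate
sigma2_a_hat = (MST - MSE) / r    # treatment variance component estimate

print(sigma2_e_hat)               # 0.085
print(round(sigma2_a_hat, 3))     # 0.239
```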

Estimate the variance components in SAS – PROC GLM:

proc glm data=strawberry;
class genotype;
model fruit=genotype;
random genotype / test;
run;

The RANDOM statement declares genotype as random, and the /test option performs the appropriate tests (and produces the EMS).

PROC MIXED

In PROC MIXED fixed factors are placed in the


MODEL statement, random factors are placed in the
RANDOM statement.

Variance component estimates using the method of moments are obtained with the method=type3 option:

proc mixed data=strawberry method=type3;


class genotype;
model fruit=;
random genotype;
run;

There are several other methods for estimating variance components, e.g., maximum likelihood and its modification, restricted (or residual) maximum likelihood (REML). Maximum likelihood estimation uses an assumed distribution of the observations (e.g., normal) and constructs a likelihood function, which is a function of the model parameters. The maximum likelihood estimates of the parameters are those values of the parameters that maximize the likelihood function.

REML is the default method used in PROC MIXED:

proc mixed data=strawberry;


class genotype;
model fruit=;
random genotype;
run;

Part of the output with variance estimates:


Covariance Parameter Estimates
Cov Parm Estimate
genotype 0.2385
Residual 0.08577

Hypothesis testing about variance components using F statistics:

The random effects of the model are assumed to have a normal distribution. Thus, given the assumption of normally distributed effects, the significance of the treatment component of variance can be tested using the hypotheses:

H0: σa² = 0
Ha: σa² > 0

Test statistic: F = MST/MSE, compared with

F(α; df1, df2) = F(α; t − 1, N − t)

and H0 is rejected if Fcal > Ftab.

For the example, Fcal (23.4) > Ftab (2.3). We therefore reject H0 and conclude that the variance for the crosses is greater than zero.
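A minimal sketch of this test with the example values follows; the critical value 2.3 is the tabulated F quoted in the text (computing F quantiles exactly would require a statistics library):

```python
# Mean squares from the example ANOVA table
MST, MSE = 1.994, 0.085

F_cal = MST / MSE   # observed F statistic
F_tab = 2.3         # tabulated F(0.05; 5, 42), as quoted in the text

print(round(F_cal, 1))    # about 23.5
print(F_cal > F_tab)      # True: reject H0, so sigma_a^2 > 0
```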

Interval Estimates for Variance Components

For the error variance (σε²):

SSE / χ²(α/2; N−t)  <  σε²  <  SSE / χ²(1−α/2; N−t)

= 0.0607 and 0.14728

For the treatment variance (σa²):

SST(1 − F(α/2; t−1, N−t)/Fo) / [r·χ²(α/2; t−1)]  <  σa²  <  SST(1 − F(1−α/2; t−1, N−t)/Fo) / [r·χ²(1−α/2; t−1)]

= 0.08695 to 1.48932

Alternatively, an asymptotic confidence interval for σa² can be calculated as:

σ̂a² ± z(α/2) · se(σ̂a²)

where se(σ̂a²) is the estimated standard error of σ̂a².

proc mixed data=a cl alpha=0.05;
class genotype;
model fruit=;
random genotype;
run;

Covariance Parameter Estimates

Cov Parm    Estimate   Alpha   Lower     Upper
genotype    0.2385     0.05    0.09003   1.6137
Residual    0.08577    0.05    0.05831   0.1386

Intraclass correlation

The variance of the response variable, σ̂y² = σ̂a² + σ̂ε², can be proportionally allocated to the two sources of variability, the treatment and the experimental error.

Intraclass correlation:

Treatment: ρI = σ̂a² / σ̂y² = 0.7376

Experimental error: σ̂ε² / σ̂y² = 0.2623

Confidence interval for ρI:

[Fo − F(α/2; t−1, N−t)] / [Fo + (r − 1)F(α/2; t−1, N−t)]  <  ρI  <  [Fo − F(1−α/2; t−1, N−t)] / [Fo + (r − 1)F(1−α/2; t−1, N−t)]

The intraclass correlation coefficient is a measure of the proportion of variation accounted for by each variance component.
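With the method-of-moments estimates from the example, the intraclass correlation works out as follows (the values agree with the 0.7376 and 0.2623 in the text up to rounding):

```python
# Variance component estimates from the example (method of moments)
sigma2_a, sigma2_e = 0.239, 0.085
sigma2_y = sigma2_a + sigma2_e   # total variance of a single observation

rho_treatment = sigma2_a / sigma2_y   # proportion due to genetic variation
rho_error     = sigma2_e / sigma2_y   # proportion due to experimental error

print(round(rho_treatment, 4))   # 0.7377
print(round(rho_error, 4))       # 0.2623
```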
