
Introduction to Mixed-Effects Models for Hierarchical and Longitudinal Data (Part II)

Sociology 761

John Fox

1. Generalized Linear Mixed-Effects Models


1.1 Quick Review of Generalized Linear Models
• A generalized linear model consists of three components:

  1. A random component, specifying the conditional distribution of the
     response variable, $y_i$, given the predictors. Traditionally, the random
     component is an exponential family: the normal (Gaussian), binomial,
     Poisson, gamma, or inverse-Gaussian.

  2. A linear function of the regressors, called the linear predictor,
       $\eta_i = \alpha + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$
     on which the expected value $\mu_i$ of $y_i$ depends.

  3. A link function $g(\mu_i) = \eta_i$, which transforms the expectation of the
     response to the linear predictor. The inverse of the link function is called
     the mean function: $g^{-1}(\eta_i) = \mu_i$.

Copyright 2004 by John Fox



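• In the following table, the logit, probit, and complementary log-log links
  are for binomial or binary data:

Link                    $\eta_i = g(\mu_i)$                  $\mu_i = g^{-1}(\eta_i)$
identity                $\mu_i$                              $\eta_i$
log                     $\log_e \mu_i$                       $e^{\eta_i}$
inverse                 $\mu_i^{-1}$                         $\eta_i^{-1}$
inverse-square          $\mu_i^{-2}$                         $\eta_i^{-1/2}$
square-root             $\sqrt{\mu_i}$                       $\eta_i^2$
logit                   $\log_e \dfrac{\mu_i}{1 - \mu_i}$    $\dfrac{1}{1 + e^{-\eta_i}}$
probit                  $\Phi^{-1}(\mu_i)$                   $\Phi(\eta_i)$
complementary log-log   $\log_e[-\log_e(1 - \mu_i)]$         $1 - \exp[-\exp(\eta_i)]$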
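
The rows of the table can be checked numerically. The following is a minimal sketch using R's make.link(), which returns a link function together with its inverse (the mean function); the values of mu are arbitrary illustrations, not taken from the notes.

```r
# Illustrative means in (0, 1); the values are arbitrary, not from the notes
mu <- c(0.1, 0.5, 0.9)

for (name in c("logit", "probit", "cloglog")) {
  lnk     <- make.link(name)      # list holding linkfun (g) and linkinv (g^{-1})
  eta     <- lnk$linkfun(mu)      # eta = g(mu)
  mu_back <- lnk$linkinv(eta)     # mu = g^{-1}(eta), should reproduce mu
  print(round(cbind(mu, eta, mu_back), 4))
}

# The logit row of the table, written out by hand
eta_logit <- log(mu / (1 - mu))
mu_logit  <- 1 / (1 + exp(-eta_logit))
```

Applying the link and then the mean function returns the original means, which is all that "inverse" means here.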


Copyright © 2006 by John Fox


• In R, generalized linear models are fit with the glm function, and most of
  the arguments of glm are similar to those of lm:
  - The response variable and regressors are given in a model formula.
  - The data, subset, and na.action arguments determine the data on
    which the model is fit.
  - The additional family argument is used to specify a family-generator
    function, which may take other arguments, such as a link function.
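
As an illustration of these arguments, here is a minimal sketch of two glm calls on simulated data. The data frame dat, its variables, and the coefficient values used to generate it are all hypothetical.

```r
set.seed(761)  # arbitrary seed for reproducibility

# Hypothetical data: a binary response and one numeric predictor
n   <- 200
x   <- rnorm(n)
y   <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
dat <- data.frame(y = y, x = x)

# Same formula/data interface as lm(); family selects the distribution and link
fit_logit  <- glm(y ~ x, family = binomial, data = dat)                   # default logit link
fit_probit <- glm(y ~ x, family = binomial(link = "probit"), data = dat)  # non-default link

coef(fit_logit)
coef(fit_probit)
```

The family-generator function can be passed by name (binomial) to get the default link, or called with a link argument to choose another one.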


• The following table gives family generators and default links:

Family             Default Link   Range of $y_i$                     $\mathrm{Var}(y_i \mid \eta_i)$
gaussian           identity       $(-\infty, +\infty)$               $\phi$
binomial           logit          $\dfrac{0, 1, \ldots, n_i}{n_i}$   $\dfrac{\mu_i(1 - \mu_i)}{n_i}$
poisson            log            $0, 1, 2, \ldots$                  $\mu_i$
Gamma              inverse        $(0, \infty)$                      $\phi \mu_i^2$
inverse.gaussian   1/mu^2         $(0, \infty)$                      $\phi \mu_i^3$

  - For distributions in the exponential families, the conditional variance of
    $y_i$ is a function of the mean $\mu_i$ and a dispersion parameter $\phi$ (fixed to 1
    for the binomial and Poisson distributions).
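
The mean-dependent part of each variance can be inspected directly, since every R family object carries a variance function. A minimal sketch, with arbitrary illustrative means:

```r
mu <- c(0.2, 0.5, 2, 5)  # arbitrary illustrative means

# Each family object stores its variance function V(mu);
# the conditional variance is the dispersion phi times V(mu)
poisson()$variance(mu)            # mu
Gamma()$variance(mu)              # mu^2
inverse.gaussian()$variance(mu)   # mu^3
gaussian()$variance(mu)           # constant 1, so the variance is just phi

p <- c(0.2, 0.5)                  # binomial means must lie in (0, 1)
binomial()$variance(p)            # mu * (1 - mu), for a single binary trial
```

For binomial proportions based on $n_i$ trials, the division by $n_i$ shown in the table enters through the prior weights rather than through the variance function itself.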


• The following table shows the links available for each family in R, with
  the default links marked by an asterisk (X = available, - = not available):

family             identity  inverse  sqrt  1/mu^2  log  logit  probit  cloglog
gaussian           X*        X        -     -       X    -      -       -
binomial           -         -        -     -       X    X*     X       X
poisson            X         -        X     -       X*   -      -       -
Gamma              X         X*       -     -       X    -      -       -
inverse.gaussian   X         X        -     X*      X    -      -       -
quasi              X*        X        X     X       X    X      X       X
quasibinomial      -         -        -     -       -    X*     X       X
quasipoisson       X         -        X     -       X*   -      -       -

• The quasi, quasibinomial, and quasipoisson family generators
  do not correspond to exponential families.
  - The quasi-binomial and quasi-Poisson families can be used to fit
    over-dispersed binomial and Poisson GLMs.
  - Such models are estimated by quasi-likelihood methods (specifying
    the variance as a function of the mean and a dispersion parameter).
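
To see what the quasi-Poisson family changes in practice, the sketch below fits Poisson and quasi-Poisson models to simulated over-dispersed counts. The data-generating process (negative binomial counts with hypothetical coefficients) is an assumption made only for illustration.

```r
set.seed(761)  # arbitrary seed

# Hypothetical over-dispersed counts: negative binomial draws with mean mu
n  <- 500
x  <- rnorm(n)
mu <- exp(0.5 + 0.4 * x)
y  <- rnbinom(n, mu = mu, size = 1)   # Var(y) = mu + mu^2, larger than the Poisson variance mu

fit_pois  <- glm(y ~ x, family = poisson)
fit_quasi <- glm(y ~ x, family = quasipoisson)

# Identical coefficient estimates; only the dispersion (and hence the SEs) differ
coef(fit_pois)
coef(fit_quasi)
summary(fit_pois)$dispersion    # fixed at 1
summary(fit_quasi)$dispersion   # estimated from the Pearson residuals
```

The quasi-Poisson standard errors are the Poisson standard errors multiplied by the square root of the estimated dispersion.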


1.2 The Generalized Linear Mixed Model


• The generalized linear mixed-effects model (GLMM) is a straightforward
  extension of the generalized linear model, adding random effects to
  the linear predictor, and expressing the expected value of the response
  conditional on the random effects:
    $g(\mu_{ij}) = g[E(y_{ij} \mid b_{1i}, \ldots, b_{qi})] = \eta_{ij}$
    $\eta_{ij} = \beta_1 + \beta_2 x_{2ij} + \cdots + \beta_p x_{pij} + b_{1i} z_{1ij} + \cdots + b_{qi} z_{qij}$
  - The link function $g(\cdot)$ is as in generalized linear models.
  - The conditional distribution of $y_{ij}$ given the random effects is a
    member of an exponential family, or, for quasi-likelihood estimation,
    the variance of $y_{ij} \mid b_{1i}, \ldots, b_{qi}$ is a function of $\mu_{ij}$ and a dispersion
    parameter $\phi$.
  - We make the usual assumptions about the random effects: that they
    are multinormally distributed with mean 0 and covariance matrix $\Psi$.
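
One way to make the model concrete is to simulate from it. The sketch below generates binary data from a logistic GLMM with a random intercept and a random slope; the numbers of groups, the fixed effects beta, and the covariance matrix Psi are hypothetical values chosen only for illustration.

```r
set.seed(761)  # arbitrary seed

# Assumed (hypothetical) values: 30 groups of 20 observations each
n_groups <- 30
n_per    <- 20
beta     <- c(-1, 0.8)                      # fixed effects beta_1, beta_2
Psi      <- matrix(c(0.50, -0.10,
                     -0.10, 0.25), 2, 2)    # covariance matrix of (b_1i, b_2i)

# Draw b_i ~ N(0, Psi) for each group using the Cholesky factor of Psi
b <- matrix(rnorm(n_groups * 2), n_groups, 2) %*% chol(Psi)

group <- rep(seq_len(n_groups), each = n_per)
x     <- rnorm(n_groups * n_per)

# Linear predictor: eta_ij = beta_1 + beta_2 x_ij + b_1i + b_2i x_ij  (here z = x)
eta <- beta[1] + beta[2] * x + b[group, 1] + b[group, 2] * x

# Conditional mean through the inverse logit link, then a Bernoulli response
mu <- plogis(eta)
y  <- rbinom(length(mu), size = 1, prob = mu)

head(data.frame(group, x, y))
```

The response is Bernoulli only conditionally on the random effects; marginally, observations in the same group are correlated because they share $b_{1i}$ and $b_{2i}$.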


1.3 A Brief Example: Contraceptive Use in Bangladesh

• This example is borrowed from Harvey Goldstein, Multilevel Statistical
  Models, Third Edition (Arnold, 2003).
• The data are drawn from the 1988 Bangladesh Fertility Survey, and
  include 1934 women in 60 districts of the country.
• The response variable is whether or not the woman was using contraceptives
  at the time of the survey, and the explanatory variables, all at
  the individual level, include the woman's age (centered at the mean
  age), number of children (categorized as 0, 1, 2, or 3 or more),
  and whether the woman is living in an urban or a rural area.
  - Goldstein's description of the data seems to imply that the districts are
    either urban or rural, but examination of the data reveals that most
    districts have both urban and rural residents.


• The estimation of generalized linear mixed models by ML or REML is
  not so straightforward, because the likelihood function includes integrals
  that are analytically intractable.
  - There are two general practical approaches to estimating GLMMs:
    * Evaluate the likelihood using some method of numerical integration,
      such as quadrature (for simple cases) or simulation (for more
      complicated ones).
    * Use a method that approximates the maximum-likelihood estimate.
  - Implementations of both approaches are available in R.
    * One general implementation is the lmer function in the lme4
      package, which provides an approximate solution by the method of
      penalized quasi-likelihood (PQL), based on a linear approximation
      to the GLMM, as well as generally more accurate, but more
      time-consuming, Laplacian and adaptive Gaussian quadrature
      approximations.
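
The intractable integral can be illustrated for the simplest case, a logistic model with a single random intercept. The sketch below evaluates one group's contribution to the marginal likelihood with base R's general-purpose integrate() routine; this is not the algorithm used by lmer, and the data and parameter values are hypothetical.

```r
# Hypothetical data for one cluster: binary responses and a covariate
y <- c(1, 0, 1, 1, 0)
x <- c(-1.0, -0.5, 0.0, 0.5, 1.0)

# Assumed parameter values at which the likelihood is evaluated
beta <- c(-0.2, 0.7)   # fixed effects
psi  <- 0.8            # standard deviation of the random intercept b

# Conditional likelihood of the cluster given b, times the N(0, psi^2) density of b
integrand <- function(b) {
  sapply(b, function(bi) {
    p <- plogis(beta[1] + beta[2] * x + bi)
    prod(p^y * (1 - p)^(1 - y)) * dnorm(bi, mean = 0, sd = psi)
  })
}

# Marginal likelihood contribution: integrate the random effect out numerically
L_i <- integrate(integrand, lower = -Inf, upper = Inf)$value
L_i
log(L_i)
```

The full log-likelihood is the sum of such terms over groups, and each evaluation inside an optimizer repeats the integration, which is why the dimension of the random effects matters so much for computing time.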


• A random-coefficient logistic regression model for these data permits the
  coefficients for the explanatory variables to vary from district to district:
    $\log_e \dfrac{\pi_{ij}}{1 - \pi_{ij}} = \beta_1 + \beta_2 \mathrm{cage}_{ij} + \beta_3 \mathrm{kids1}_{ij} + \beta_4 \mathrm{kids2}_{ij} + \beta_5 \mathrm{kids3}_{ij} + \beta_6 \mathrm{urban}_{ij}$
    $\qquad + b_{1i} + b_{2i} \mathrm{cage}_{ij} + b_{3i} \mathrm{kids1}_{ij} + b_{4i} \mathrm{kids2}_{ij} + b_{5i} \mathrm{kids3}_{ij} + b_{6i} \mathrm{urban}_{ij}$
  where
  - $\pi_{ij}$ is the probability that woman $j$ in district $i$ uses contraceptives;
  - cage is centered age;
  - kids1, kids2, and kids3 are 0/1 dummy variables, treating 0 children
    as the baseline category (alternative parametrizations of the children
    effect are, of course, possible);
  - urban is a dummy variable coded 1 for women in urban areas and 0
    for those living in rural areas.
  Note that there are no level-2 explanatory variables.

The number of women per district in the sample ranges from 2 to 118.

• To test whether all of the random effects are required, I fit several models
  to the data.
  - I found that I could not get the estimates to converge when I included
    random effects for number of children, so I omitted these from the
    model.
  - I obtained the following approximate maximized log-likelihoods (with
    model 1 including the random effects for both cage and urban):

Model   Omitting Random Effects for   $\log_e L$
1       (none)                        $-1198.99$
2       cage                          $-1199.65$
3       urban                         $-1206.35$
4       intercept                     $-1220.94$


  - Likelihood ratio tests for the variance and covariance components are
    as follows:

Variance and Covariance Components for   $G^2$    df   $p$
cage                                     1.32     3    .73
urban                                    14.72    3    .002
intercept                                43.90    3    < .001

  - I therefore retained the random effects for urban and the random
    intercepts, but removed the random effects for age.

Recall that omitting a random effect omits the corresponding variance
and covariance components.
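
The likelihood-ratio statistics can be reproduced from the tabled log-likelihoods. The sketch below reads the log-likelihoods as negative values and uses 3 degrees of freedom per test, since dropping one of three random effects removes one variance and two covariances; because the inputs are rounded to two decimals, the p-values may differ slightly in the last digit from those reported.

```r
# Approximate maximized log-likelihoods from the table of models
logL <- c(full = -1198.99, no_cage = -1199.65,
          no_urban = -1206.35, no_intercept = -1220.94)

# G^2 = 2 * (logL_full - logL_reduced) for each reduced model
G2 <- 2 * (logL[["full"]] - logL[-1])

# Dropping one of three random effects removes one variance and two covariances
df <- 3
p  <- pchisq(G2, df = df, lower.tail = FALSE)

round(cbind(G2, df, p), 4)
```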

  - The fixed-effect and variance-component estimates (as standard
    deviations) are as follows:

Term          Parameter     PQL Estimate   Std. Error
intercept     $\beta_1$     $-1.7110$      0.1575
cage          $\beta_2$     $-0.0264$      0.0079
kids1         $\beta_3$     1.1323         0.1585
kids2         $\beta_4$     1.3579         0.1749
kids3         $\beta_5$     1.3532         0.1801
urban         $\beta_6$     0.8152         0.1665
(intercept)   $\psi_1$      0.6177
(urban)       $\psi_2$      0.8020
              $\psi_{12}$   $-0.3956$
              $\sigma$      0.9779


  - Notice that an estimate is produced of the residual standard
    deviation, $\hat{\sigma} = 0.9779$.
    * In a binomial GLMM, $\sigma = \phi$ should be 1, and a value substantially
      different from 1 would be indicative of over- or under-dispersion.
  - Contraceptive use declines with age, increases (up to two children)
    with number of children, and is higher in urban than in rural areas.
  - All of these results are highly statistically significant.
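
To put the fixed effects on the probability scale, the sketch below converts the tabled PQL estimates into fitted probabilities for a few hypothetical covariate profiles, for a district whose random effects are both 0. It assumes that the intercept and cage coefficients are negative, which is consistent with the statement that contraceptive use declines with age; the profiles themselves are illustrative.

```r
# PQL fixed-effect estimates from the table, with the intercept and cage
# coefficients taken as negative (an assumption consistent with the text)
beta <- c(intercept = -1.7110, cage = -0.0264, kids1 = 1.1323,
          kids2 = 1.3579, kids3 = 1.3532, urban = 0.8152)

# Hypothetical covariate profiles: women of average age (cage = 0)
profiles <- rbind(
  rural_0kids = c(1, 0, 0, 0, 0, 0),
  rural_2kids = c(1, 0, 0, 1, 0, 0),
  urban_0kids = c(1, 0, 0, 0, 0, 1),
  urban_2kids = c(1, 0, 0, 1, 0, 1)
)
colnames(profiles) <- names(beta)

# Fitted probability for a district whose random effects are both 0
eta  <- drop(profiles %*% beta)
prob <- plogis(eta)
round(prob, 3)

# Odds ratios: multiplicative change in the odds per unit change in each regressor
round(exp(beta[-1]), 3)
```

These are conditional (district-specific) probabilities; averaging over the distribution of the random effects would give somewhat different population-level values.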
