Intro Spatial Models INLA-3-43
Introduction
References
List of Figures
the literature, including parametric and non-parametric time trends and
interactions. The literature on Bayesian spatio-temporal disease mapping
is extensive. For example, Bernardinelli et al. (1995) use a spatio-temporal
model with a linear trend, while Assunção et al. (2001) consider a
second-degree polynomial trend model. Among the non-parametric models, the
work by Knorr-Held (2000), which proposes four types of space-time
interaction, deserves attention. Martínez-Beneito et al. (2008) focus on an
autoregressive approach to spatio-temporal disease mapping, and Ugarte et al.
(2009a) compare the performance of different space-time models. Most of the
research in disease mapping is based on conditional autoregressive (CAR)
priors for both spatial and temporal effects, extending the seminal work of
Besag et al. (1991). However, other approaches based on splines have been
developed. Within an EB approach, MacNab and Dean (2001) consider
autoregressive local smoothing in space and B-spline smoothing in time.
Ugarte et al. (2010, 2012b) consider a pure interaction P-spline model for
space and time, and Ugarte et al. (2012a) use an ANOVA-type P-spline model to
describe spatio-temporal patterns of prostate cancer mortality in Spain.
Spline smoothing has also been used in disease mapping from an FB approach
(see, for example, MacNab, 2007; MacNab and Gustafson, 2007).
In this chapter, we explore in depth the capabilities of INLA for fitting
space-time disease mapping models. Most of the work in spatial and
spatio-temporal disease mapping with INLA considers the Besag et al. (1991)
model (hereafter BYM model), which includes two spatial effects: one with a
Gaussian exchangeable prior to model unstructured heterogeneity, and another
with an intrinsic conditional autoregressive (iCAR) prior for the spatially
structured variability. See, for example, Schrödle et al. (2011); Schrödle
and Held (2011a); Held et al. (2010); Schrödle and Held (2011b) or
Blangiardo et al. (2013). However, the iCAR prior is improper and has the
undesirable large-scale property of tending to a negative pairwise
correlation for regions located further apart (see MacNab, 2011;
Botella-Rocamora et al., 2013). In addition, the variance components in the
BYM convolution model are not identifiable from the data (MacNab, 2014), and
informative hyperpriors are needed for posterior inference. In this chapter
we consider the prior proposed by Leroux et al. (1999), which has been shown
to outperform the iCAR prior (Lee, 2011). This model can be easily
implemented using the R-INLA package, as will be shown later. It has already
been used to construct a locally adaptive algorithm for spatial smoothing
(Lee and Mitchell, 2013).
In Section 1.2, different spatio-temporal models are described and the
necessary set of identifiability constraints for each model is provided.
Details about both PQL and INLA estimation techniques are given in
Section 1.3. The R-INLA package and some of its most useful tools are
described in detail in Section 1.4. Finally, in Section 1.5, male brain
cancer mortality data in Spanish provinces are used to illustrate how to fit
spatio-temporal models using INLA.
Spatio-temporal disease mapping
where $O_j$ and $N_j$ are, respectively, the number of counts and the
population size in 'age-and-sex' group $j \in \{1, \ldots, J\}$, so that
\[
O_j = \sum_{i=1}^{n} \sum_{t=1}^{T} O_{itj}, \qquad
N_j = \sum_{i=1}^{n} \sum_{t=1}^{T} N_{itj}.
\]
\[
\log(r_{it}) = \eta + \xi_i + (\beta + \phi_i) \cdot t \tag{1.1}
\]
where $\xi_{-i}$ denotes the random effect vector without the $i$th component,
$j \sim i$ indicates that areas $i$ and $j$ are neighbors, and $n_i$ is the
number of neighbors of area $i$.
Two different prior distributions for the differential trend $\phi_i$ are
examined. The first one assumes an exchangeable distribution, that is, the
prior for the differential trend $\phi = (\phi_1, \ldots, \phi_n)'$ is given
by $\phi \sim N(0, \tau_\phi^{-1} I_n)$. The second one considers an iCAR
prior distribution, so that $\phi \sim N(0, [\tau_\phi R_\xi]^{-})$.
Table 1.1: Specification and rank deficiency for the four possible types of
space-time interaction proposed by Knorr-Held (2000).

                                          Rank deficiency of Rδ
Space-time interaction   Rδ          RW1 prior for γ   RW2 prior for γ
Type I                   In ⊗ IT     −                 −
Type II                  In ⊗ Rγ     n                 2n
Type III                 Rξ ⊗ IT     T                 T
Type IV                  Rξ ⊗ Rγ     n + T − 1         2n + T − 2
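or a RW2
\[
R_\gamma =
\begin{pmatrix}
 1 & -2 &  1 &    &    &    &    \\
-2 &  5 & -4 &  1 &    &    &    \\
 1 & -4 &  6 & -4 &  1 &    &    \\
   & \ddots & \ddots & \ddots & \ddots & \ddots & \\
   &    &  1 & -4 &  6 & -4 &  1 \\
   &    &    &  1 & -4 &  5 & -2 \\
   &    &    &    &  1 & -2 &  1
\end{pmatrix}
\]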
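The random walk structure matrices can be generated from difference matrices. The following is a minimal sketch in base R, not the chapter's own code: the helper names `rw_structure` and `rank_def` are hypothetical, and the object names `R.gammaRW1` and `R.gammaRW2` are chosen only to match the names used in the model formulas of Section 1.5, whose construction is not shown here. It assumes that the structure matrix of a random walk of order d equals D'D, where D is the d-th order difference matrix.

```r
# Structure matrix of a random walk of a given order on n_time points,
# built as D'D with D the difference matrix of that order.
rw_structure <- function(n_time, order = 1) {
  D <- diff(diag(n_time), differences = order)
  crossprod(D)
}

# Rank deficiency counted as the number of (numerically) zero eigenvalues
rank_def <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  sum(abs(ev) < tol)
}

R.gammaRW1 <- rw_structure(6, order = 1)
R.gammaRW2 <- rw_structure(6, order = 2)

# First rows of the RW2 matrix show the 1 -2 1 / -2 5 -4 1 / 1 -4 6 -4 1 pattern
print(R.gammaRW2[1:3, ])

# The rank deficiency equals the order of the random walk
print(c(RW1 = rank_def(R.gammaRW1), RW2 = rank_def(R.gammaRW2)))
```

The rank deficiencies obtained this way are the building blocks of the values reported for the Kronecker products in Table 1.1.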
The interaction random effect
$\delta = (\delta_{11}, \ldots, \delta_{1T}, \ldots, \delta_{n1}, \ldots, \delta_{nT})'$
is assumed to follow the multivariate normal distribution
$\delta \sim N(0, [\tau_\delta R_\delta]^{-})$, where $R_\delta$ is the
$nT \times nT$ matrix obtained as the Kronecker product of the corresponding
spatial and temporal structure matrices (see Clayton, 1996). The four types
of interaction proposed by Knorr-Held (2000) are considered. In Type I
interactions ($R_\delta = I_n \otimes I_T$), all parameters $\delta_{it}$ are
a priori independent, with no structure in space or time. In Type II
interactions ($R_\delta = I_n \otimes R_\gamma$), each $\delta_{i\cdot}$,
$i = 1, \ldots, n$, follows a random walk independently of all other regions;
that is, temporal trends differ from region to region but have no structure
in space. In Type III interactions ($R_\delta = R_\xi \otimes I_T$), each
$\delta_{\cdot t}$, $t = 1, \ldots, T$, follows an independent intrinsic CAR
prior distribution; that is, a different spatial pattern is assumed for each
time point, with no temporal structure. Finally, in Type IV interactions
($R_\delta = R_\xi \otimes R_\gamma$), the $\delta_{it}$ are completely
dependent over space and time; that is, temporal trends differ from region to
region but are more likely to be similar for adjacent regions. The structure
matrices for the different types of interaction and their rank deficiencies
are summarized in Table 1.1.
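The rank deficiencies in Table 1.1 can be checked numerically on a small example. The sketch below is illustrative only: the toy graph (four areas on a line), the dimensions and the helper `rank_def` are assumptions introduced here, not part of the original analysis.

```r
# Rank deficiency counted as the number of (numerically) zero eigenvalues
rank_def <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  sum(abs(ev) < tol)
}

n <- 4        # number of areas (toy value)
n_time <- 5   # number of time points (toy value)

# Toy spatial graph: areas 1-2-3-4 on a line (assumed for illustration)
W <- matrix(0, n, n)
W[cbind(1:(n - 1), 2:n)] <- 1
W <- W + t(W)
R.xi <- diag(rowSums(W)) - W   # neighbors on the diagonal, -1 for each neighbor

# Temporal structure matrices for RW1 and RW2
R.rw1 <- crossprod(diff(diag(n_time), differences = 1))
R.rw2 <- crossprod(diff(diag(n_time), differences = 2))

# Structure matrix of each interaction type, given the temporal matrix Rg
types <- list(
  "Type I"   = function(Rg) kronecker(diag(n), diag(n_time)),
  "Type II"  = function(Rg) kronecker(diag(n), Rg),
  "Type III" = function(Rg) kronecker(R.xi, diag(n_time)),
  "Type IV"  = function(Rg) kronecker(R.xi, Rg)
)

deficiency <- sapply(types, function(make) {
  c(RW1 = rank_def(make(R.rw1)), RW2 = rank_def(make(R.rw2)))
})
print(deficiency)
```

For a connected graph, the computed values should agree with the formulas n, 2n, T, n + T − 1 and 2n + T − 2 given in the table.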
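Type II
  RW1 prior for γ: $\sum_{i=1}^{n} \xi_i = 0$, $\sum_{t=1}^{T} \gamma_t = 0$,
                   $\sum_{t=1}^{T} \delta_{it} = 0$, for $i = 1, \ldots, n$
  RW2 prior for γ: $\sum_{i=1}^{n} \xi_i = 0$, $\sum_{t=1}^{T} \gamma_t = \sum_{t=1}^{T} t\gamma_t = 0$,
                   $\sum_{t=1}^{T} \delta_{it} = 0$, for $i = 1, \ldots, n$

Type III
  RW1 prior for γ: $\sum_{i=1}^{n} \xi_i = 0$, $\sum_{t=1}^{T} \gamma_t = 0$,
                   $\sum_{i=1}^{n} \delta_{it} = 0$, for $t = 1, \ldots, T$
  RW2 prior for γ: $\sum_{i=1}^{n} \xi_i = 0$, $\sum_{t=1}^{T} \gamma_t = 0$,
                   $\sum_{i=1}^{n} \delta_{it} = 0$, for $t = 1, \ldots, T$

Type IV
  RW1 prior for γ: $\sum_{i=1}^{n} \xi_i = 0$, $\sum_{t=1}^{T} \gamma_t = 0$,
                   $\sum_{t=1}^{T} \delta_{it} = 0$, for $i = 1, \ldots, n$,
                   $\sum_{i=1}^{n} \delta_{it} = 0$, for $t = 1, \ldots, T$
  RW2 prior for γ: $\sum_{i=1}^{n} \xi_i = 0$, $\sum_{t=1}^{T} \gamma_t = 0$,
                   $\sum_{t=1}^{T} \delta_{it} = 0$, for $i = 1, \ldots, n$,
                   $\sum_{i=1}^{n} \delta_{it} = 0$, for $t = 1, \ldots, T$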
Breslow and Clayton (1993) applied it to the well-known Scottish lip cancer
data. The PQL analysis relies on a series of approximations to the mixed
model using a first-order Taylor series expansion of the link function. First,
an appropriate working vector (Schall, 1991) must be defined to achieve
correspondence with the normal mixed model. Once estimates of the fixed
and random effects are obtained, the restricted maximum likelihood (REML)
equations (see Harville, 1977) are used to estimate the variance components
of the model.
For notational simplicity, let us consider the spatio-temporal CAR model of
Equation (1.4) without the unstructured temporal random effect $\varphi_t$.
This model can be expressed in matrix form as
\[
Y = X\eta + Z_1 \xi + Z_2 \gamma + Z_3 \delta + (O - \mu)\, g'(\mu),
\]
where $X$ is the fixed effects matrix (here a column of ones),
$Z_1 = I_n \otimes 1_T$ is the design matrix of the main spatial random
effect, $Z_2 = 1_n \otimes I_T$ is the design matrix of the main temporal
random effect, $Z_3 = I_n \otimes I_T$ is the design matrix of the
interaction term, $\mu$ is the vector of means of the Poisson distribution,
$g$ is the link function (here the logarithmic function) and
$g'(\mu) = 1/\mu$. Then a correspondence with a normal mixed model is
attained as
\[
Y = X\eta + Z_1 \xi + Z_2 \gamma + Z_3 \delta + \epsilon,
\]
where $\epsilon = (O - \mu)\, g'(\mu) \sim N(0, W^{-1})$, and
$W = \mathrm{diag}\{\mu_{it}\}$. The fixed effect
estimator is obtained as
and
Given initial estimates of the parameters, PQL first solves for the fixed
and random effects, holding the variance parameters fixed. Then,
the variance parameters are updated from the REML equations. The
procedure is repeated until the convergence criterion is met. Finally, the
relative risks are estimated as
The second stage is the latent Gaussian field π(x|θ), where a multivariate
Gaussian prior with zero mean and precision matrix Q is assumed for x. This
precision matrix typically depends on the hyperparameters θ (third stage),
which are not necessarily Gaussian. That is, $x \sim N(0, Q^{-1}(\theta))$
with density function given by
\[
\pi(x \mid \theta) = (2\pi)^{-N/2}\, |Q(\theta)|^{1/2}
\exp\left( -\frac{1}{2}\, x' Q(\theta)\, x \right).
\]
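As a worked illustration of this density, the following base R sketch evaluates the log-density of a zero-mean Gaussian field from its precision matrix through a Cholesky factorization. The function name `gmrf_logdens` and the example precision matrices are assumptions made for illustration. The sketch only applies to proper (positive definite) precision matrices, so it does not cover intrinsic priors such as the iCAR or random walk priors.

```r
# Log-density of x ~ N(0, Q^{-1}) for a positive definite precision matrix Q
gmrf_logdens <- function(x, Q) {
  U <- chol(Q)                          # upper triangular factor, Q = U'U
  log_det_Q <- 2 * sum(log(diag(U)))    # log|Q|
  quad <- sum((U %*% x)^2)              # x'Qx
  -0.5 * length(x) * log(2 * pi) + 0.5 * log_det_Q - 0.5 * quad
}

# Tridiagonal precision matrix (arbitrary proper example)
Q <- matrix(c( 2, -1,  0,
              -1,  2, -1,
               0, -1,  2), 3, 3, byrow = TRUE)
x <- c(0.5, -1, 0.2)
print(gmrf_logdens(x, Q))

# Sanity check with a diagonal precision: independent normal components
Q_diag <- diag(c(1, 4, 9))
ld_field <- gmrf_logdens(x, Q_diag)
ld_indep <- sum(dnorm(x, mean = 0, sd = 1 / sqrt(diag(Q_diag)), log = TRUE))
print(all.equal(ld_field, ld_indep))
```

Working with the precision matrix rather than the covariance matrix is what allows sparse matrix algorithms to be exploited when Q is sparse.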
1.3 Model fitting and inference 15
(ii) π(xi |θ, y), which is needed to compute the marginal posteriors π(xi |y).
For the first task (i), the Laplace approximation method described in Tierney
and Kadane (1986) can be used, so that the joint posterior density of the
hyperparameters π(θ|y) is approximated as
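\[
\tilde{\pi}(\theta \mid y) \propto
\left. \frac{\pi(y \mid x, \theta)\, \pi(x \mid \theta)\, \pi(\theta)}
{\tilde{\pi}_G(x \mid \theta, y)} \right|_{x = x^{*}(\theta)},
\tag{1.7}
\]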
the properties of Equation (1.7) and find good evaluation points
θk for a numerical integration of Equation (1.6). This is done by an iterative
algorithm (Rue et al., 2009). Additionally, an appropriate area weight ∆k
must be assigned to each θk . Details about how posterior marginals π(θk |y)
are computed using numerical integration of an interpolant are available in
Martins et al. (2013).
For the second task (ii), three different approaches are possible: a
Gaussian approximation, a full Laplace approximation, and a simplified
Laplace approximation. In the Gaussian approximation, the posterior
conditional distributions π(xi |θ, y) are directly approximated as the
marginals of π̃G (x|θ, y). Although this approximation is the fastest option
and often gives accurate results, according to Rue and Martino (2007) it can
produce unsatisfactory results because of errors in the location of the
posterior marginals, errors due to the lack of skewness, or both. The
approximation can be improved by applying another Laplace approximation to
π(xi |θ, y), similar to the one described in Equation (1.7). However, this
“full Laplace” strategy can be computationally expensive. For this reason,
Rue et al. (2009) developed the simplified Laplace approximation, based on a
series expansion of the full Laplace approximation. This method is less
time consuming and is very competitive in many applications.
Finally, an approximation to the posterior marginal density of
Equation (1.6) is given by
\[
\tilde{\pi}(x_i \mid y) = \sum_{k} \tilde{\pi}(x_i \mid \theta_k, y)\,
\tilde{\pi}(\theta_k \mid y)\, \Delta_k .
\]
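The weighted sum above can be sketched with a toy example in base R. Everything in the block is a stand-in chosen for illustration: the grid of evaluation points, the equal area weights, the Gaussian conditional densities and the shape of the hyperparameter posterior are assumptions, not output of the INLA algorithm.

```r
# Evaluation points theta_k and their area weights Delta_k (toy regular grid)
theta_k <- seq(-2, 2, by = 0.5)
delta_k <- rep(0.5, length(theta_k))

# Stand-in for the approximated posterior of the hyperparameter at each
# grid point, renormalised so that sum(post_theta * delta_k) equals 1
post_theta <- dnorm(theta_k)
post_theta <- post_theta / sum(post_theta * delta_k)

# Stand-in for the approximated conditional density of x_i given theta_k
# (Gaussian, with mean and standard deviation depending on theta)
cond_dens <- function(x, theta) {
  dnorm(x, mean = 0.5 * theta, sd = exp(-theta / 2))
}

# Approximated marginal: finite mixture over the evaluation points
marginal <- function(x) {
  sapply(x, function(xi) sum(cond_dens(xi, theta_k) * post_theta * delta_k))
}

# The resulting marginal should integrate to one
total_mass <- integrate(marginal, -Inf, Inf)$value
print(total_mass)
```

Because the conditional densities change with each evaluation point, the resulting marginal can be skewed even when every component is Gaussian.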
for the testing version. To upgrade the package to the latest version (type
inla.version() to find out the currently installed version), use either the
1.4 The R-INLA package 17
> names(inla.models()$latent)
[1] "linear" "iid" "mec" "meb"
[5] "rgeneric" "rw1" "rw2" "crw2"
1 Type names(inla.models()$likelihood) to obtain the list of available likelihoods.
For each model, a detailed description and usage examples are provided in
http://www.r-inla.org/models/latent-models. Some of them are briefly
described in what follows. Assuming that $x = (x_1, \ldots, x_k)'$ is a
vector of length $k$:
• The "bym" model, defines the BYM (or convolution) prior for x proposed
by Besag et al. (1991). That is
Since each data point is represented by two random effects, only their sum
is identifiable. The "bym" model computes both the posterior distribution
of $u + v$ (first $k$ elements), and the posterior distribution of the
spatially structured effect $u$ (elements from $k + 1$ to $2k$). The model
automatically imposes the sum-to-zero constraint
$\sum_{i=1}^{k} (u_i + v_i) = 0$ (constr=TRUE).
• The "rw1" model, defines a first order random walk prior for x. It is
constructed assuming independent increments
The model automatically imposes the sum-to-zero constraint
$\sum_{i=1}^{k} x_i = 0$ (constr=TRUE).
• The "rw2" model, defines a second order random walk prior for x. It is
constructed assuming independent second-order increments
\[
x \sim N(0, [\tau C]^{-1}),
\]
where the structure matrix $C$ is passed to the program through the Cmatrix
argument (a dense or sparse matrix).
• The "generic3" model, defines a generic prior for x such that
\[
x \sim N\left( 0, \left[ \sum_{i=1}^{m} \tau_i C_i \right]^{-1} \right),
\]
where $n_i$ is the number of neighbors of the $i$th area. If
$\lambda_{\max} = 1$, then the covariance matrix defined in Equation (1.14)
takes the following expression
\[
\left[ \tau \left( I_k - \frac{\beta}{\lambda_{\max}} C \right) \right]^{-1}
= [\tau (I_k - \beta (I_k - R))]^{-1}
= [\tau (\beta R + (1 - \beta) I_k)]^{-1},
\]
2 At the present time, an experimental version of the model is implemented under the name "besagproper2".
Proof 1.1: First note that $C = (c_{ij})$ is an ML-matrix (see for example
Seneta, 1981, p.45), that is, a real matrix for which $c_{ij} \geq 0$,
$i \neq j$. Let us consider the non-negative matrix $T = \mu I_k + C$, where
$\mu = \max\{n_i\}_{i \in \{1, \ldots, k\}}$. Then, $C$ is an irreducible
matrix if $T$ is also irreducible. To show that $T$ is irreducible we assume
that, in the graph associated with the neighborhood matrix $R$ (see Rue and
Held, 2005, p.18), there is a path from node $i$ to node $j$ for all $i, j$
(i.e. regions $i$ and $j$ are connected). When the neighborhood structure is
defined by adjacency, this condition holds provided there is no isolated
region or group of regions (for example, islands). Let us suppose that $T$ is
not irreducible (i.e. $T$ is reducible). Hence, there exists a permutation
matrix $P$ (see Rao and Rao, 1998, p.468) such that
\[
P T P' = \begin{pmatrix} A & 0 \\ B & D \end{pmatrix}.
\]
where $\lambda_{\max}$ is the maximum eigenvalue of $C$ (see Seneta, 1981,
p.52). Clearly $\sum_{j} c_{ij} = 1$, $\forall i = 1, \ldots, k$, and hence
$\lambda_{\max} = 1$.
So, the lCAR prior can be specified inside the f() function as
> f(x, model="generic1", Cmatrix=<C.Leroux>, constr=TRUE, ...,
+ hyper=list(prec=list(...),beta=list(...)))
where C.Leroux is the structure matrix defined in Equation (1.15).
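The relationship between the "generic1" parametrization and the lCAR precision matrix can be verified numerically. The sketch below uses base R only and rests on assumptions stated here: a toy graph of five areas on a line, arbitrary values of τ and β, and a structure matrix taken as C = I − R, which is the form implied by the expression for the covariance matrix given earlier.

```r
k <- 5   # number of areas (toy value)

# Toy connected graph: areas 1-2-3-4-5 on a line (assumed for illustration)
W <- matrix(0, k, k)
W[cbind(1:(k - 1), 2:k)] <- 1
W <- W + t(W)

# Neighborhood structure matrix: n_i on the diagonal, -1 for neighbors
R <- diag(rowSums(W)) - W

# Structure matrix passed to "generic1" (assumed here to be I - R)
C.Leroux <- diag(k) - R

# Every row of C sums to one, and its largest eigenvalue is one
row_sums <- rowSums(C.Leroux)
lambda_max <- max(eigen(C.Leroux, symmetric = TRUE, only.values = TRUE)$values)
print(row_sums)
print(lambda_max)

# With lambda_max = 1 both parametrizations give the same precision matrix
tau <- 2      # arbitrary precision parameter
beta <- 0.7   # arbitrary spatial smoothing parameter
Q.generic1 <- tau * (diag(k) - (beta / lambda_max) * C.Leroux)
Q.leroux <- tau * (beta * R + (1 - beta) * diag(k))
print(all.equal(Q.generic1, Q.leroux))
```

For β strictly between 0 and 1 the resulting precision matrix is positive definite, which is why the lCAR prior is proper while the iCAR prior (β = 1) is not.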
θ = log(τ ) ∼ logGamma(1,5e-05).
Note that for the "bym" model, the vector of hyperparameters is represented
as θ = (log(τu ), log(τv )), where τu and τv are respectively the precision pa-
rameters of the spatially structured and unstructured components of the
model. For the lCAR model described in Section 1.4.2, the vector of hyper-
parameters is represented as θ = (log(τ ), logit(β)), with default hyperprior
distribution
\[
\mathrm{logit}(\beta) = \log\left( \frac{\beta}{1 - \beta} \right) \sim N(0, 0.1)
\]
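\[
\pi(\theta_2) = \pi(\mathrm{logit}(\beta))
= \pi(\beta) \cdot \frac{\partial \beta}{\partial \theta_2}
= \pi(\beta) \cdot \frac{\exp(\theta_2)}{(1 + \exp(\theta_2))^2}
= \pi(\beta) \cdot \beta (1 - \beta)
\]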
Once the "lunif" prior distribution has been defined, it can be included
inside the f() function as
> f(x, model=<model>, ...,
+ hyper=list(beta=list(prior=lunif,initial=0)))
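The change of variable behind this prior can be checked in base R. The sketch below is not the R-INLA definition of "lunif", which is not reproduced here. It assumes a Uniform(0, 1) prior on β and evaluates the implied log-density on the internal scale θ2 = logit(β), using the Jacobian β(1 − β). The function name `log_dens_theta2` is hypothetical.

```r
# Log-density of theta2 = logit(beta) implied by beta ~ Uniform(0, 1)
log_dens_theta2 <- function(theta2) {
  beta <- plogis(theta2)                     # exp(theta2) / (1 + exp(theta2))
  log_prior_beta <- 0                        # log of the Uniform(0, 1) density
  log_jacobian <- log(beta) + log1p(-beta)   # log of beta * (1 - beta)
  log_prior_beta + log_jacobian
}

# The implied density on the theta2 scale integrates to one
mass <- integrate(function(th) exp(log_dens_theta2(th)), -Inf, Inf)$value
print(mass)

# It coincides with the standard logistic density
same_as_logistic <- all.equal(exp(log_dens_theta2(c(-1, 0.3, 2))),
                              dlogis(c(-1, 0.3, 2)))
print(same_as_logistic)
```

Setting initial=0 on the internal scale, as in the call above, corresponds to starting the optimization at β = 0.5.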
where formula contains the specification of the latent Gaussian fields of the
model, t is the number of time periods, and ID.year1 and ID.year2 are
respectively the internal variable names of φt and γt .
\[
\nu^{*} = A\nu,
\]
where in this case the likelihood function is linked to the latent field
through $\nu^{*}$ instead of $\nu$, i.e.,
\[
\pi(y \mid x, \theta) = \prod_{i=1}^{N} \pi(y_i \mid \nu_i^{*}, \theta).
\]
where $y = (y_1, \ldots, y_N)'$ is the response vector, $\eta$ is an
intercept, $A = (a_{ij})$ is a design matrix of dimension $N \times k$ and
$x = (x_1, \ldots, x_k)'$ is a vector of unknown coefficients for which an
exchangeable prior is considered, i.e., $x \sim N(0, \tau I_k)$.
\[
\sum_{i=1}^{n} \delta_{it} = 0, \quad \text{for } t = 1, \ldots, T
\]
where Cmat is the Kronecker product of the spatial and temporal structure
matrices, rankdef is its rank deficiency, A.constr is the (n + T ) × nT dimen-
sion matrix given in Equation (1.16) and n.constr is equal to the number
of constraints to be imposed.
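The structure of such a constraint matrix can be illustrated in base R. The sketch below uses toy dimensions and an arbitrary vector δ, both assumptions made for illustration, with δ ordered as (δ11, ..., δ1T, ..., δn1, ..., δnT), that is, with the time index varying fastest.

```r
n <- 3        # number of areas (toy value)
n_time <- 4   # number of time points (toy value)

# Block summing delta_it over t, one row per area i
A1 <- kronecker(diag(n), matrix(1, 1, n_time))
# Block summing delta_it over i, one row per time point t
A2 <- kronecker(matrix(1, 1, n), diag(n_time))
A.constr <- rbind(A1, A2)
print(dim(A.constr))   # (n + n_time) rows and n * n_time columns

# Arbitrary interaction values arranged as an n x n_time array,
# then stacked area by area (time index varying fastest)
D <- matrix(1:(n * n_time), n, n_time)
delta <- as.vector(t(D))

# The first block returns the sums over time, the second the sums over areas
sums_over_time <- as.vector(A1 %*% delta)
sums_over_area <- as.vector(A2 %*% delta)
print(all.equal(sums_over_time, as.numeric(rowSums(D))))
print(all.equal(sums_over_area, as.numeric(colSums(D))))

# The n + n_time rows are linearly dependent: both blocks add up to the
# same total, so the row rank is n + n_time - 1
print(qr(A.constr)$rank)
```

The rank found here matches the rank deficiency n + T − 1 of the Type IV structure matrix with a RW1 prior in Table 1.1.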
More specifically, for members of the exponential family with E[y] = µ(x, θ),
the saturated deviance obtained by setting f (y) = π(y|µ(x, θ) = y) is
used. The deviance of the model measures the variability linked to the
likelihood, which is the probabilistic structure used for the observations
conditional on the parameters. Typically, the posterior mean deviance D(x, θ)
is used as a measure of fit, as it is very robust. However, more complex
models fit the data better, and consequently yield lower values of the mean
deviance. The Deviance Information Criterion (DIC, Spiegelhalter et al.,
2002) is the most commonly used measure of model fit based on the deviance
for Bayesian models. The DIC is computed as the sum of the posterior mean
deviance (a measure of goodness of fit) and the number of effective
parameters (a measure of model complexity), which is defined as
1.5 Illustration
In this section, Spanish male brain cancer mortality data during the period
1986-2010 are used to fit the spatio-temporal CAR models described
in Section 1.2 with INLA. Since the computational costs are substantially
lower than those of MCMC methods, a battery of models can be fitted
and compared in a reasonable time. The model selection criteria described
in Section 1.4.6 and Section 1.4.7 are used to select the best model. The
R-INLA code to fit these models is also included.
of death 64.05 years. Differences in brain cancer mortality risk among
Spanish provinces are known to exist, with Navarra and the Basque
provinces among those with a significantly high relative risk (Ugarte et al.,
2010, 2014).
Brain cancer mortality data (International Classification of Diseases-10:
code C71) registered during the period 1986-2010 in each of the 50 Spanish
provinces (excluding Ceuta and Melilla) were obtained from the Spanish
National Epidemiology Center. Of a total of 50450 deaths recorded
throughout the study period, 28426 correspond to males and 22024 to
females. The number of observed cases for males varies from 0 to 185, and
indirect age-standardization was used to compute the number of
expected deaths for each province and year.
The parametric and non-parametric models described in Section 1.2 have
been fitted to the real data. The DIC, WAIC and the Logarithmic Score have
been used as model selection criteria. In addition, a corrected version of the
DIC proposed by Plummer (2008) has been included (DICc), because it has
been shown that the DIC may under-penalize complex models in disease
mapping.
Improper uniform prior distributions are given to the standard deviations
in the model and a vague zero mean normal distribution with precision close
to zero is considered for the intercept (η). Finally, a Uniform(0, 1) distribu-
tion has been used for the spatial smoothing parameter of the lCAR prior.
The results for all the fitted models using the simplified Laplace approxima-
tion are shown in Table 1.3.
Parametric models exhibit low values of the effective number of parameters
but the highest values of posterior deviance, leading to the largest DIC
values. Additive models are also discarded because of their large values for
all the model selection criteria. In general, models with a RW1 prior
distribution for the structured temporal component show a better fit than
those with a RW2 prior. The model without unstructured temporal component
and with a completely structured (Type IV) interaction term is the best model
in terms of the trade-off between model fit and complexity (smallest DIC,
DICc and WAIC values), and also the best in terms of predictive power (lowest
Logarithmic Score value). To obtain more accurate posterior distributions,
this model has been fitted again using the full Laplace approximation.
The estimated log-relative risks can be split into different components:
an overall global risk (given by η̂); a risk related to the spatial location (ξ̂)
that can be attributed to factors associated with a particular region; a
temporal risk trend common to all areas (γ̂) that can be attributed to changes
in coding the disease, diagnostics or policies affecting the whole country;
and an area-specific temporal risk trend (δ̂) that reflects particular effects
of each province. Figure 1.1a shows the spatial mortality risk (ζi = exp(ξi ))
associated with each region and constant over the period, while Figure 1.1b
displays the posterior probability that the spatial risk is greater than one.
Probabilities above 0.9 (below 0.1) point toward high (low) risk regions (see
Richardson et al., 2004; Ugarte et al., 2009a,b for some discussion of
reference thresholds in relative risks and cut-off probabilities). These maps
show that a high risk is associated with some northern provinces of Spain.
The temporal risk trend exp(γt ) common to all regions is represented in
Figure 1.1c, together with its 95% credible interval. A clear increasing
trend is observed during the study period.

Table 1.3: Posterior mean of the deviance (D̄), number of effective
parameters (pD ), model selection criteria (DIC, DICc, WAIC and Logarithmic
Score) and computational time (in seconds) from fitted models in the analysis
of male brain cancer mortality data in Spain.
[Figure 1.1 panels: (a) Map of posterior means of the spatial pattern of
mortality risks ζi = exp(ξi ); (b) Map of spatial posterior probabilities
P (ζi > 1|O); (c) exp(γt ) plotted against Year.]
Figure 1.1: Spatial and temporal effects of male brain cancer mortality
relative risks in Spanish provinces.
Finally, Figure 1.2 shows the spatio-temporal evolution of male brain cancer
mortality risks for each province (compared with the whole of Spain) during
the study period 1986-2010, and the posterior probabilities that the relative
risks are greater than one. The risk scale was originally constructed on the
logarithmic scale so that excess and deficit of risk with respect to Spain
are expressed with the same magnitude. It was then back-transformed to make
the maps easier to read and interpret. For example, 1.67 means a 67% excess
of risk with respect to Spain in the study period, and 0.60 (1/1.67) means a
deficit of risk of the same magnitude. Combining the information provided by
both maps, an increase in risk is observed as the maps get darker over the
years. A group of provinces in the north and central-east of Spain exhibit
high risk.
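The construction of a risk scale that is symmetric on the logarithmic scale can be sketched in a few lines of base R. The range and the step of 0.125 on the log scale are assumptions chosen for illustration, not values stated in the text.

```r
# Equally spaced cut points on the log-relative risk scale (assumed step)
log_breaks <- seq(-0.5, 0.5, by = 0.125)

# Back-transformation to the relative risk scale
risk_breaks <- exp(log_breaks)
print(round(risk_breaks, 2))
# 0.61 0.69 0.78 0.88 1.00 1.13 1.28 1.45 1.65

# Each cut point above one has a reciprocal counterpart below one
print(all.equal(risk_breaks * rev(risk_breaks), rep(1, length(risk_breaks))))
```

On this scale a relative risk r and its reciprocal 1/r lie at the same distance from one, which is the sense in which 1.67 and 0.60 represent excess and deficit of the same magnitude.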
where observed and expected are, respectively, the vectors of observed and
expected deaths, and n and t are the number of areas and time periods for
which data are available (n=50 provinces and t=25 years for the brain cancer
mortality data). Note that the data must be ordered according to the Kronecker
product given for the structure matrix of the space-time interaction random
effect δ defined in Table 1.1. For details about how to introduce the data
or how the IDs must be specified in INLA, see the examples and tutorials
provided in http://www.r-inla.org/examples.
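The ordering requirement can be made concrete with a small base R sketch. It assumes that the interaction structure matrix is built with the spatial matrix as the first factor of the Kronecker product, so that the time index varies fastest within each area. The index names mirror those used in the model formulas of this section, but the way they are built here is an assumption, not the original code.

```r
n <- 50   # number of areas
t <- 25   # number of time periods

# Rows ordered area by area, with the years running fastest within each area
ID.area <- rep(1:n, each = t)
ID.year <- rep(1:t, times = n)
ID.area.year <- seq_len(n * t)

ids <- data.frame(ID.area, ID.year, ID.area.year)
print(head(ids, 3))

# Position of the pair (area i, year j) in the stacked interaction vector
position <- (ID.area - 1) * t + ID.year
print(all(position == ID.area.year))
```

If the data were instead sorted year by year, the Kronecker products would have to be taken in the reverse order for the interaction term to refer to the intended area and year.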
Then, we define the spatial neighborhood matrix Rξ and the structure
matrix to implement the lCAR prior distribution using the "generic1"
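model (see Section 1.4.2) as
> # Read the neighborhood graph of the provinces
> g <- inla.read.graph("prov_nb.inla")
> # Spatial structure matrix: number of neighbors on the diagonal,
> # -1 for each pair of neighboring areas
> R.xi <- matrix(0, g$n, g$n)
> for (i in 1:g$n){
+    R.xi[i,i] <- g$nnbs[[i]]
+    R.xi[i,g$nbs[[i]]] <- -1
+ }
> # Structure matrix of the lCAR prior for the "generic1" model
> R.Leroux <- diag(n)-R.xi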
[Figure 1.2 panels: maps for the years 1986, 1991, 1996, 2000, 2005 and 2010;
relative risk scale from 0.61 to 1.65 (top) and posterior probability scale
(bottom).]
Figure 1.2: Posterior mean distribution of male brain cancer mortality risks
(top) and P (r̂it > 1|O) posterior probability maps (bottom).
The formula objects for the models presented in Table 1.3 are defined as
M1 model
> f.M1 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.area1, year, model="iid",
+ hyper=list(prec=list(prior=sdunif))) +
+ I(year-mean(year))
M2 model
> f.M2 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.area1, year, model="besag", graph="prov_nb.inla",
+ hyper=list(prec=list(prior=sdunif))) +
+ I(year-mean(year))
M3 model
> f.M3 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw1", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif)))
M4 model
> f.M4 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw1", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="iid", constr=TRUE,
+ hyper=list(prec=list(prior=sdunif)))
M5 model
> R <- kronecker(diag(n),R.gammaRW1)
> r.def <- n
> A.constr <- kronecker(diag(n),matrix(1,1,t))
>
> f.M5 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw1", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="generic0", Cmatrix=R, rankdef=r.def,
+ constr=TRUE, hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=A.constr, e=rep(0,n)))
M6 model
> R <- kronecker(R.xi,diag(t))
> r.def <- t
> A.constr <- kronecker(matrix(1,1,n),diag(t))
>
> f.M6 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw1", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="generic0", Cmatrix=R, rankdef=r.def,
+ constr=TRUE, hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=A.constr, e=rep(0,t)))
M7 model
> R <- kronecker(R.xi,R.gammaRW1)
> r.def <- n+t-1
> A1 <- kronecker(diag(n),matrix(1,1,t))
> A2 <- kronecker(matrix(1,1,n),diag(t))
> A.constr <- rbind(A1,A2)
>
> f.M7 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw1", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="generic0", Cmatrix=R, rankdef=r.def,
+ constr=TRUE, hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=A.constr, e=rep(0,n+t)))
M8 model
> f.M8 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw2", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif)))
M9 model
> f.M9 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw2", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="iid", constr=TRUE,
+ hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=matrix(rep(1:t,n),1,n*t),e=0))
M10 model
> R <- kronecker(diag(n),R.gammaRW2)
> r.def <- 2*n
> A.constr <- kronecker(diag(n),matrix(1,1,t))
>
> f.M10 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw2", hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=matrix(1:t,1,t),e=0)) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="generic0", Cmatrix=R, rankdef=r.def,
+ constr=TRUE, hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=A.constr, e=rep(0,n)))
M11 model
> R <- kronecker(R.xi,diag(t))
> r.def <- t
> A.constr <- kronecker(matrix(1,1,n),diag(t))
>
> f.M11 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw2", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="generic0", Cmatrix=R, rankdef=r.def,
+ constr=TRUE, hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=A.constr, e=rep(0,t)))
M12 model
> R <- kronecker(R.xi,R.gammaRW2)
> r.def <- 2*n+t-2
> A1 <- kronecker(diag(n),matrix(1,1,t))
> A2 <- kronecker(matrix(1,1,n),diag(t))
> A.constr <- rbind(A1,A2)
>
> f.M12 <- O ~ f(ID.area, model="generic1", Cmatrix=R.Leroux, constr=TRUE,
+ hyper=list(prec=list(prior=sdunif),beta=list(prior=lunif))) +
+ f(ID.year, model="rw2", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.year1, model="iid", hyper=list(prec=list(prior=sdunif))) +
+ f(ID.area.year, model="generic0", Cmatrix=R, rankdef=r.def,
+ constr=TRUE, hyper=list(prec=list(prior=sdunif)),
+ extraconstr=list(A=A.constr, e=rep(0,n+t)))
As described in Section 1.4.5, the linear constraints that make each model
identifiable (see Table 1.2) are specified through the constr=TRUE (a
sum-to-zero constraint over the random effect) and extraconstr arguments. Note
and
\[
\sum_{i=1}^{n} \delta_{it} = 0, \ \text{for } t = 1, \ldots, T
\iff (1_n' \otimes I_T)\, \delta = 0.
\]
To define the formula object for the non-parametric models without
unstructured temporal component φt (models M13-M18), the f(ID.year1,...)
term has to be removed. For details about the implementation of the
hyperprior distributions in R-INLA, see Section 1.4.3.
Finally, we run the INLA algorithm with a call to the inla() function as