Spatial Probit Model in R
... $I_n$) (1)
for $z = (z_1, \ldots, z_n)'$, where the error variance $\sigma_\varepsilon^2$ is often set to $\sigma_\varepsilon^2 = 1$.
The R Journal Vol. 5/1, June ISSN 2073-4859
CONTRIBUTED RESEARCH ARTICLES 131
The data generating process for $z$ is
$$ z = (I_n - \rho W)^{-1} X\beta + (I_n - \rho W)^{-1} \varepsilon, \qquad \varepsilon \sim N(0, I_n). $$
Note that if $\rho = 0$ or $W = I_n$, the model reduces to an ordinary probit model, for which Wooldridge
(2002, chapter 17) and Albert (2007, section 10.1) are good references. The ordinary probit model can
be estimated in R by glm().
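As a rough illustration of the data generating process above, the following sketch simulates $z$ and the observed binary response $y$ without forming $(I_n - \rho W)^{-1}$ explicitly. The ring-shaped weight matrix (two neighbours per unit, row-standardised), the parameter values and all variable names are assumptions made for this example only.

```r
library(Matrix)

set.seed(1)
n    <- 200
rho  <- 0.5
beta <- c(0, 1)

# Assumed toy weight matrix: each unit has its left and right neighbour on a ring
i <- rep(seq_len(n), 2)
j <- c(n, seq_len(n - 1), 2:n, 1)
W <- sparseMatrix(i = i, j = j, x = 0.5, dims = c(n, n))

X   <- cbind(1, rnorm(n))
eps <- rnorm(n)
S   <- Diagonal(n) - rho * W          # sparse (I_n - rho W)

# z = S^{-1} (X beta + eps), obtained from a sparse linear solve
z <- drop(as.matrix(solve(S, X %*% beta + eps)))
y <- as.integer(z >= 0)               # observed 0/1 response
```

With rho set to 0 the same code produces data from an ordinary probit model, which could then be fitted with glm(y ~ X[, 2], family = binomial("probit")).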
Another popular spatial model is the spatial error model (SEM), which takes the form
$$ z = X\beta + u, \qquad u = \rho W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma_\varepsilon^2 I_n) \tag{2} $$
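$$ z = X\beta + (I_n - \rho W)^{-1} \varepsilon $$
[...]
$$ z \sim N\Bigl( (I_n - \rho W)^{-1} X\beta,\; \bigl[ (I_n - \rho W)'(I_n - \rho W) \bigr]^{-1} \Bigr) \tag{3} $$
subject to $z_i \ge 0$ for $y_i = 1$ and $z_i < 0$ for $y_i = 0$, which can be efficiently sampled from using
the method rtmvnorm.sparseMatrix() in package tmvtnorm (Wilhelm and Manjunath, 2013);
see the next section for details.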
2. For a normal prior $\beta \sim N(c, T)$, we can sample $p(\beta \mid \rho, z, y)$ from a multivariate normal as
$$ p(\beta \mid \rho, z, y) \propto N(c^*, T^*) \tag{4} $$
$$ c^* = (X'X + T^{-1})^{-1} (X'Sz + T^{-1} c) $$
$$ T^* = (X'X + T^{-1})^{-1} $$
$$ S = (I_n - \rho W) $$
The standard way for sampling from this distribution is rmvnorm() from package mvtnorm
(Genz et al., 2013).
3. The remaining conditional density $p(\rho \mid \beta, z, y)$ is
$$ p(\rho \mid \beta, z, y) \propto |I_n - \rho W| \exp\Bigl( -\tfrac{1}{2} (Sz - X\beta)'(Sz - X\beta) \Bigr) \tag{5} $$
which can be sampled from using Metropolis-Hastings or some other sampling scheme (e.g.
importance sampling). We implement the grid-based evaluation and numerical integration
proposed by LeSage and Pace and then draw from the inverse distribution function (LeSage
and Pace, 2009, p. 132).
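The grid-based draw for $\rho$ in step 3 can be sketched as follows. This is only a minimal illustration of the idea (evaluate the log of (5) on a grid, integrate numerically with the trapezoidal rule, invert the resulting distribution function); the function name draw_rho_grid(), the grid, the dense toy weight matrix and the simulated data are assumptions and not the package's actual implementation.

```r
# Draw one value from a density known up to a constant on a grid,
# by inverting the numerically integrated distribution function.
draw_rho_grid <- function(log_dens, grid) {
  d   <- exp(log_dens - max(log_dens))     # unnormalised density, numerically stabilised
  cdf <- c(0, cumsum(diff(grid) * (head(d, -1) + tail(d, -1)) / 2))  # trapezoidal rule
  cdf <- cdf / cdf[length(cdf)]
  u   <- runif(1)
  approx(x = cdf, y = grid, xout = u, ties = "ordered")$y
}

set.seed(1)
n <- 50
W <- matrix(0, n, n)                        # assumed toy ring-shaped weight matrix
W[cbind(rep(seq_len(n), 2), c(n, seq_len(n - 1), 2:n, 1))] <- 0.5
X    <- cbind(1, rnorm(n))
beta <- c(0, 1)
z    <- solve(diag(n) - 0.5 * W, X %*% beta + rnorm(n))

grid <- seq(-0.95, 0.95, by = 0.01)
log_dens <- sapply(grid, function(r) {
  S <- diag(n) - r * W
  # log of equation (5): log|I_n - rho W| - 0.5 (Sz - X beta)'(Sz - X beta)
  as.numeric(determinant(S, logarithm = TRUE)$modulus) -
    0.5 * sum((S %*% z - X %*% beta)^2)
})
rho_draw <- draw_rho_grid(log_dens, grid)
```

For large $n$ the dense determinant used here would be replaced by a precomputed log-determinant, as discussed in the section on log-determinants.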
Implementation issues
Since MCMC methods are widely considered to be (very) slow, we devote this section to the discussion
of some implementation issues in R. Key factors for estimating large spatial probit models in R include the
use of sparse matrices and compiled Fortran code, and possibly also parallelization, which was
introduced in R 2.14.0 with the package parallel. We report estimation times and memory requirements
for several medium-size and large-size problems and compare our times to those reported by LeSage
and Pace (2009, p. 291). We show that our approach allows the estimation of large spatial probit models
in R within reasonable time.
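Drawing $p(z \mid \beta, \rho, y)$
In this section we describe the efforts made to generate samples from $p(z \mid \beta, \rho, y)$:
$$ z \sim N\Bigl( (I_n - \rho W)^{-1} X\beta,\; \bigl[ (I_n - \rho W)'(I_n - \rho W) \bigr]^{-1} \Bigr) $$
subject to $z_i \ge 0$ for $y_i = 1$ and $z_i < 0$ for $y_i = 0$.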
The generation of samples from a truncated multivariate normal distribution in high dimensions
is typically done with a Gibbs sampler rather than with other techniques such as rejection sampling
(for a comparison, see Wilhelm and Manjunath (2010)). The Gibbs sampler draws from univariate
conditional densities
$$ f(z_i \mid z_{-i}) = f(z_i \mid z_1, \ldots, z_{i-1}, z_{i+1}, \ldots, z_n). $$
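The distribution of $z_i$ conditional on $z_{-i}$ is univariate truncated normal with variance
$$ \Sigma_{i.-i} = \Sigma_{ii} - \Sigma_{i,-i} \Sigma_{-i,-i}^{-1} \Sigma_{-i,i} \tag{6} $$
$$ \phantom{\Sigma_{i.-i}} = H_{ii}^{-1} \tag{7} $$
and mean
$$ \mu_{i.-i} = \mu_i + \Sigma_{i,-i} \Sigma_{-i,-i}^{-1} (z_{-i} - \mu_{-i}) \tag{8} $$
$$ \phantom{\mu_{i.-i}} = \mu_i - H_{ii}^{-1} H_{i,-i} (z_{-i} - \mu_{-i}) \tag{9} $$
and bounds $-\infty < z_i < 0$ for $y_i = 0$ and $0 \le z_i < \infty$ for $y_i = 1$.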
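To make equations (7) and (9) concrete, here is a minimal sketch of one Gibbs sweep over all $z_i$ written in terms of the precision matrix $H$. It uses a small dense $H$ for readability and a simple inverse-CDF draw for the univariate truncated normal; the helper names rtnorm1() and gibbs_sweep() and the toy inputs are assumptions, and this is not the compiled sampler in tmvtnorm.

```r
# One draw from N(m, s^2) truncated to [lower, upper] via the inverse CDF.
# The clamping guards against u being exactly 0 or 1 in the far tails.
rtnorm1 <- function(m, s, lower, upper) {
  u <- runif(1, pnorm(lower, m, s), pnorm(upper, m, s))
  u <- min(max(u, .Machine$double.eps), 1 - .Machine$double.eps)
  min(max(qnorm(u, m, s), lower), upper)
}

# One sweep of the Gibbs sampler using the precision matrix H
gibbs_sweep <- function(z, mu, H, y) {
  for (i in seq_along(z)) {
    h_ii <- H[i, i]
    # conditional mean, equation (9)
    m_i <- mu[i] - sum(H[i, -i] * (z[-i] - mu[-i])) / h_ii
    # conditional standard deviation from equation (7)
    s_i <- sqrt(1 / h_ii)
    if (y[i] == 1) {
      z[i] <- rtnorm1(m_i, s_i, 0, Inf)
    } else {
      z[i] <- rtnorm1(m_i, s_i, -Inf, 0)
    }
  }
  z
}

set.seed(42)
H  <- matrix(c( 2, -1,  0,
               -1,  2, -1,
                0, -1,  2), 3, 3)       # toy precision matrix (assumed)
mu <- c(0.5, -0.2, 0.1)
y  <- c(1, 0, 1)
z  <- ifelse(y == 1, 1, -1)             # feasible start value
for (s in 1:50) z <- gibbs_sweep(z, mu, H, y)
```

Only row $i$ of $H$ is touched in each update, which is what makes a sparse $H$ attractive.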
Package tmvtnorm has such a Gibbs sampler in place since version 0.9, based on the covariance
matrix $\Sigma$. Following Geweke (1991, 2005, p. 171), the Gibbs sampler is easier to state in terms of the
precision matrix $H$, simply because it requires fewer and easier operations in the above equations (7)
and (9). These equations will further simplify in the case of a sparse precision matrix $H$, in which most
of the elements are zero. With sparse $H$, operations involving $H_{i,-i}$ need only to be executed for all
non-zero elements rather than for all $n - 1$ variables in $z_{-i}$.
In our spatial probit model, the covariance matrix $\Sigma = [(I_n - \rho W)'(I_n - \rho W)]^{-1}$ is a dense matrix,
whereas the corresponding precision matrix $H = \Sigma^{-1} = (I_n - \rho W)'(I_n - \rho W)$ is sparse. Hence, using
$\Sigma$ for sampling $z$ is inefficient. For this reason we reimplemented the Gibbs sampler with the precision
matrix $H$ in package tmvtnorm instead (Wilhelm and Manjunath, 2013).
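The contrast between the sparse precision matrix and its dense inverse can be checked on a small example. The sketch below assumes a toy ring-shaped weight matrix with two neighbours per unit; the sizes it reports depend on that assumption and on the chosen n.

```r
library(Matrix)

n   <- 500
rho <- 0.5
i <- rep(seq_len(n), 2)
j <- c(n, seq_len(n - 1), 2:n, 1)
W <- sparseMatrix(i = i, j = j, x = 0.5, dims = c(n, n))   # assumed toy W

S     <- Diagonal(n) - rho * W
H     <- crossprod(S)              # (I_n - rho W)'(I_n - rho W), remains sparse
Sigma <- solve(as.matrix(H))       # explicit inverse, a full n x n matrix

nnz_H      <- nnzero(H)            # non-zero entries of H
size_H     <- object.size(H)
size_Sigma <- object.size(Sigma)
print(size_H, units = "Kb")
print(size_Sigma, units = "Kb")
```

For this banded example $H$ has at most five non-zero entries per row, while $\Sigma$ has to be stored as all $n^2$ numbers.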
Suppose one wants to draw $N$ truncated multinormal samples in $n$ dimensions, where the precision
matrix $H$ is sparse ($n \times n$) with only $m < n$ entries different from zero per row on average (e.g. $m = 6$
nearest neighbors or average branching factor in a network). With growing dimension $n$, two types of
problems arise with a usual dense matrix representation for $H$:
1. Data storage problem: Since matrix $H$ is $n \times n$, the space required to store the dense matrix
will be quadratic in $n$. One remedy is, of course, using some sparse matrix representation for $H$
(e.g. packages Matrix, SparseM etc.), which actually holds only $n \cdot m$ elements instead of $n \cdot n$.
The following code example shows the difference in object sizes for a dense vs. sparse identity
matrix $I_n$ for $n = 10000$ and $m = 1$.
> library(Matrix)
> I_n_dense <- diag(10000)
> print(object.size(I_n_dense), units = "Mb")
762.9 Mb
> rm(I_n_dense)
> I_n_sparse <- sparseMatrix(i = 1:10000, j = 1:10000, x = 1)
> print(object.size(I_n_sparse), units = "Mb")
0.2 Mb
2. Data access problem: Even with a sparse matrix representation for $H$, a naive strategy that tries
to access an arbitrary matrix element H[i,j] like in triplet or hash table representations, will
result in $N \cdot n \cdot n$ matrix accesses to $H$. This number grows quadratically in $n$. For example, for
$N = 30$ and $n = 10000$ it adds up to $30 \cdot 10000 \cdot 10000 = 3$ billion hash function calls, which is
inefficient and too slow for most applications.
Iterating only over the non-zero elements in row $i$, for example in term $H_{i,-i}$, reduces the number
of accesses to $H$ to only $N \cdot n \cdot m$ instead of $N \cdot n \cdot n$, which furthermore will only grow linearly
in $n$ for a fixed $m$. Suitable data structures to access all non-zero elements of the precision matrix
$H$ are linked lists of elements for each row $i$ (list-of-lists) or, thanks to the symmetry of $H$, the
compressed-sparse-column/row representation of sparse matrices, directly available from the
Matrix package.
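The access pattern described in the second item can be illustrated with the compressed-sparse-column slots of a dgCMatrix from the Matrix package (p holds the column pointers, i the zero-based row indices, x the values). Because $H$ is symmetric, column $i$ holds the same entries as row $i$. The helper name offdiag_dot() and the toy matrix are assumptions for this sketch.

```r
library(Matrix)

# Sum of H[i, j] * v[j] over the non-zero off-diagonal entries of row i,
# read from column i of the symmetric matrix H stored as a general dgCMatrix.
offdiag_dot <- function(H, i, v) {
  len  <- H@p[i + 1L] - H@p[i]                  # number of stored entries in column i
  idx  <- seq.int(H@p[i] + 1L, length.out = len)
  rows <- H@i[idx] + 1L                         # convert to 1-based row indices
  keep <- rows != i                             # skip the diagonal element
  sum(H@x[idx][keep] * v[rows[keep]])
}

n   <- 8
rho <- 0.5
i <- rep(seq_len(n), 2)
j <- c(n, seq_len(n - 1), 2:n, 1)
W <- sparseMatrix(i = i, j = j, x = 0.5, dims = c(n, n))   # assumed toy W
S <- Diagonal(n) - rho * W
H <- t(S) %*% S                                 # stored with both triangles

v      <- seq_len(n) / 10
sparse <- sapply(seq_len(n), function(k) offdiag_dot(H, k, v))
Hd     <- as.matrix(H)
dense  <- sapply(seq_len(n), function(k) sum(Hd[k, -k] * v[-k]))
```

Each call touches only the stored entries of one column, so the work per update is proportional to $m$ rather than $n$.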
How fast is the random number generation for $z$ now? We performed a series of tests for varying
sizes $N$ and $n$ and two different values of $m$. The results presented in Table 1 show that the sampler
with sparse $H$ is indeed very fast and works fine in high dimensions. The time required scales with
$N \cdot n$.
          N \ n     10^1    10^2    10^3    10^4    10^5    10^6
m = 2     10^1                              0.03    0.25    2.60
          10^2                      0.03    0.25    2.75
          10^3              0.03    0.25    2.84
          10^4      0.03    0.25    2.80
          10^5      0.26    2.79
m = 6     10^1                              0.03    0.23    2.48
          10^2                      0.01    0.23    2.53
          10^3              0.03    0.23    2.59
          10^4      0.02    0.24    2.57
          10^5      0.25    2.68
Table 1: Performance test results for the generation of truncated multivariate normal samples $z$ with
rtmvnorm.sparseMatrix() (tmvtnorm) for a varying number of samples $N$ and dimension $n$. The
precision matrix in each case is $H = (I_n - \rho W)'(I_n - \rho W)$, the spatial weight matrix $W$ contains $m$
non-zero entries per row. Times are in seconds and measured on an Intel Core i7-2600 CPU @ 3.40
GHz.
One more performance issue we discuss here is the burn-in size in the innermost Gibbs sampler
generating $z$ from $p(z \mid \beta, \rho, y)$. Depending on the start value, Gibbs samplers often require a certain
burn-in phase until the sampler mixes well and draws from the desired target distribution. In our
MCMC setup, we only draw one sample of $z$ from $p(z \mid \beta, \rho, y)$ in each MCMC iteration, but possibly
in very high dimensions (e.g. $N = 1$, $n = 10000$). With burn.in=20 samples, we have to generate 21
draws in order to keep just one. In our situation, a large burn-in size will dramatically degrade the
MCMC performance, so the number of burn-in samples for generating $z$ has to be chosen carefully.
LeSage and Pace (2009) discuss the role of the burn-in size and often use burn.in=1, when the
Gibbs sampler starts from zero. Alternatively, they also propose to use no burn-in phase at all (e.g.
burn.in=0), but then to set the start value of the sampler to the previous value of $z$ instead of zero.
QR decomposition of $(I_n - \rho W)$
The mean vector $\mu$ of the truncated normal samples $z$ in equation (3) takes the form
$$ \mu = (I_n - \rho W)^{-1} X\beta \tag{10} $$
However, inverting the sparse matrix $S = (I_n - \rho W)$ will produce a dense matrix and will therefore
offset all benefits from using sparse matrices (i.e. memory consumption and size of problems that
can be solved). It is preferable to determine $\mu$ by solving the equations $(I_n - \rho W)\mu = X\beta$ with a QR
decomposition of $S = (I_n - \rho W)$. We point out that there is a significant performance difference
between the usual code mu <- qr.solve(S, X %*% beta) and mu <- solve(qr(S), X %*% beta). The
latter will apply a QR decomposition for a sparse matrix S and will use a method qr() from the
package Matrix, whereas the first function from the base package will coerce S into a dense matrix.
solve(qr(S),X %*% beta) takes only half the time required by qr.solve(S,X %*% beta).
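A small check of the sparse QR route for equation (10) might look as follows. The sketch uses qr() and qr.coef() from the Matrix package on an assumed toy weight matrix and compares the result with the explicit dense inverse; the matrix size and parameter values are illustrative assumptions, and no timing claims are made here.

```r
library(Matrix)

set.seed(1)
n    <- 1000
rho  <- 0.5
beta <- c(0, 1, -1)
i <- rep(seq_len(n), 2)
j <- c(n, seq_len(n - 1), 2:n, 1)
W <- sparseMatrix(i = i, j = j, x = 0.5, dims = c(n, n))   # assumed toy W
S <- Diagonal(n) - rho * W
X <- cbind(1, rnorm(n), rnorm(n))
b <- drop(X %*% beta)

# Solve S mu = X beta through a sparse QR decomposition (no explicit inverse)
mu_qr <- drop(as.matrix(qr.coef(qr(S), b)))

# Reference: explicit dense inverse, feasible only for small n
mu_dense <- drop(solve(as.matrix(S)) %*% b)
```

The two results agree up to rounding error, while the QR route never stores the dense $n \times n$ inverse.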
Drawing $p(\beta \mid z, \rho, y)$
The conditional distribution $p(\beta \mid z, \rho, y)$ in equation (4) is multivariate normal, and its variance $T^*$
does not depend on $z$ and $\rho$. Therefore, we can efficiently vectorize and generate temporary draws
$\beta_{tmp} \sim N(0, T^*)$ for all MCMC iterations before running the chain and then just shift these temporary
draws by $c^*$ in each iteration to obtain $\beta \sim N(c^*, T^*)$.
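The precomputation described above can be sketched in base R. The prior settings, dimensions and the helper name draw_beta() are assumptions for illustration; the vector Sz stands for the current value of $Sz = (I_n - \rho W) z$ inside the chain and is simulated here.

```r
set.seed(1)
n     <- 100
k     <- 3
ndraw <- 1000
X     <- cbind(1, matrix(rnorm(n * (k - 1)), n))
c0    <- rep(0, k)                 # prior mean c (assumed)
T0inv <- diag(k) / 1e4             # prior precision T^{-1} (assumed diffuse prior)

# T* = (X'X + T^{-1})^{-1} is fixed, so all N(0, T*) draws can be made up front
Tstar    <- solve(crossprod(X) + T0inv)
R        <- chol(Tstar)            # Tstar = R'R
beta_tmp <- matrix(rnorm(ndraw * k), ndraw, k) %*% R   # each row ~ N(0, T*)

# Inside MCMC iteration t only the mean c* has to be recomputed
draw_beta <- function(t, Sz) {
  cstar <- Tstar %*% (crossprod(X, Sz) + T0inv %*% c0)
  drop(cstar) + beta_tmp[t, ]
}

Sz     <- rnorm(n)                 # placeholder for the current (I_n - rho W) z
beta_1 <- draw_beta(1, Sz)
```

The Cholesky factorisation and the random number generation are done once, so each iteration costs only one matrix-vector product.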
Determining log-determinants and drawing $p(\rho \mid z, \beta, y)$
The computation of log-determinants for $\ln |I_n - \rho W|$, whose evaluation is frequently needed for
drawing $p(\rho \mid z, \beta, y)$, becomes a challenging task for large matrices. Bivand (2010) gives a survey of the
different methods for calculating the Jacobian like the Pace and Barry (1997) grid evaluation of $\rho$ in the
interval $[-1, +1]$ and spline approximations between grid points or the Chebyshev approximation
of the log-determinant (Pace and LeSage, 2004). All of them are implemented in package spdep. For
the estimation of the probit models we are using the existing facilities in spdep (do_ldet()).
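The grid-plus-spline idea can be illustrated without spdep. The sketch below evaluates the log-determinant with determinant() from the Matrix package on a coarse grid and interpolates with splinefun(); the toy weight matrix, the grid and the function name logdet_fun are assumptions, and the grid deliberately stays inside $(-1, 1)$ because $I_n - \rho W$ is singular at $\rho = 1$ for a row-standardised $W$.

```r
library(Matrix)

n <- 200
i <- rep(seq_len(n), 2)
j <- c(n, seq_len(n - 1), 2:n, 1)
W <- sparseMatrix(i = i, j = j, x = 0.5, dims = c(n, n))   # assumed toy W

# Evaluate ln|I_n - rho W| once on a coarse grid ...
rho_grid <- seq(-0.9, 0.9, by = 0.1)
logdet <- sapply(rho_grid, function(r) {
  as.numeric(determinant(Diagonal(n) - r * W, logarithm = TRUE)$modulus)
})

# ... and interpolate between grid points with a cubic spline
logdet_fun <- splinefun(rho_grid, logdet)

approx_val <- logdet_fun(0.25)
exact_val  <- as.numeric(determinant(as.matrix(Diagonal(n) - 0.25 * W),
                                     logarithm = TRUE)$modulus)
```

Inside the sampler only logdet_fun() would then be called, so no determinant has to be recomputed per iteration.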
Computation of marginal effects
In spatial lag models (and their probit variants), a change of an explanatory variable $x_{ir}$ will affect
both the same response variable $z_i$ (direct effects) and possibly all other responses $z_j$ ($j \ne i$; indirect
effects or spatial spillovers). The resulting effects matrix $S_r(W)$ is $n \times n$ for $x_r$ ($r = 1, \ldots, k$) and three
summary measures of $S_r(W)$ will be computed: average direct effects ($M_r(D) = n^{-1} \operatorname{tr} S_r(W)$), average
total effects ($M_r(T) = n^{-1} 1_n' S_r(W) 1_n$) and the average indirect effects as the difference of total and
direct effects ($M_r(I) = M_r(T) - M_r(D)$). In contrast to the SAR probit, there are no spatial spill-over
effects in the SEM probit model (2). In the MATLAB spatial econometrics toolbox (LeSage, 2010), the
computation of the average total effects requires an inversion of the $n \times n$ matrix $S = (I_n - \rho W)$ at
every MCMC iteration. Using the same QR-decomposition of $S$ as described above and solving the
equation solve(qr(S),rep(1,n)) speeds up the calculation of total effects by orders of magnitude.
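The three summary measures can be sketched for a simplified effects matrix. The example assumes the linear SAR form $S_r(W) = (I_n - \rho W)^{-1} \beta_r$, which omits the additional scaling by the normal density that the probit model involves; the weight matrix and parameter values are also assumptions. It shows that the average total effect only needs one linear solve with the vector of ones.

```r
library(Matrix)

n      <- 300
rho    <- 0.4
beta_r <- 1.5
i <- rep(seq_len(n), 2)
j <- c(n, seq_len(n - 1), 2:n, 1)
W <- sparseMatrix(i = i, j = j, x = 0.5, dims = c(n, n))   # assumed toy W
S <- Diagonal(n) - rho * W
ones <- matrix(1, n, 1)

# Average total effect: n^{-1} 1' S^{-1} 1 * beta_r, from a single sparse solve
total <- beta_r * mean(drop(as.matrix(solve(S, ones))))

# Direct effect needs the diagonal of S^{-1}; the dense inverse is for checking only
Sinv     <- solve(as.matrix(S))
direct   <- beta_r * mean(diag(Sinv))
indirect <- total - direct
```

For a row-standardised $W$ the total effect in this simplified setting equals $\beta_r / (1 - \rho)$, which gives 2.5 for the values above.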
Examples
Package spatialprobit
After describing the implementation issues to ensure that the estimation is fast enough and capable
to handle large problems, we briefly describe the interface of the methods in package spatialprobit
(Wilhelm and de Matos, 2013) and then turn to some examples.
The main estimation method for the SAR probit model sar_probit_mcmc(y,X,W) takes a vector of
dependent variables y, a model matrix X and a spatial weight matrix W. The method sarprobit(formula,
W,data) is a wrapper which allows a model formula and a data frame. Both methods require a spatial
weight matrix W to be passed as an argument. Additionally, the number of MCMC start values,
the number of burn-in iterations and a thinning parameter can be specified. The estimation fit
<- sarprobit(y ~ x,W,data) returns an object of class sarprobit. The model coefficients can be
extracted via coef(fit). summary(fit) returns a coefficient table with z-values, impacts(fit) gives
the marginal effects and the plot(fit) method provides MCMC trace plots, posterior density plots
as well as autocorrelation plots of the model parameters. logLik(fit) and AIC(fit) return the log
likelihood and the AIC of the model for model comparison and testing.
Experiment from LeSage/Pace
We replicate the experiment from LeSage and Pace (2009, section 10.1.5) for n = 400 and n = 1000
random points in a plane and a spatial weight matrix with the 6 nearest neighbors. The spatial probit
model parameters are $\sigma_\varepsilon^2 = 1$, $\beta = (0, 1, -1)'$ and $\rho = 0.3$.
> summary(sarprobit.fit)
--------MCMC spatial autoregressive probit--------
Execution time = 25.35 secs
N draws = 3, N omit (burn-in)= 2
N observations = 200, K covariates = 2
# of 0 Y values = 151, # of 1 Y values = 49
Min rho = -1., Max rho = 1.
--------------------------------------------------
       Estimate Std. Dev p-level t-value Pr(>|z|)
Xconst -1.25361  0.20035 0.00000   -6.26  2.3e-09 ***
Xx      2.05238  0.28529 0.00000    7.19  1.2e-11 ***
rho     0.24796  0.10571 0.00967    2.35     0.02 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The direct, indirect and total marginal effects are extracted using
> impacts(sarprobit.fit)
--------Marginal Effects--------
(a) Direct effects
lower_5 posterior_mean upper_95
Xx .231 .268 .3
(b) Indirect effects
lower_5 posterior_mean upper_95
Xx -.299 -.266 -.23
(c) Total effects
lower_5 posterior_mean upper_95
Xx .149 .179
The corresponding non-spatial probit model is estimated using the glm() function:
> glm1 <- glm(y ~ x, family = binomial("probit"))
> summary(glm1, digits = 4)
Call:
glm(formula = y ~ x, family = binomial("probit"))
Deviance Residuals:
Min 1Q Median 3Q Max
-2.2337 -.3488 -.87 -.2 2.417
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -1.491      0.208   -7.18  7.2e-13 ***
x              1.966      0.281    6.99  2.7e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 222.71 on 199 degrees of freedom
Residual deviance: 13.48 on 198 degrees of freedom
AIC: 17.5
Number of Fisher Scoring iterations: 7
Figures 2 and 3 show the trace plots and posterior densities as part of the MCMC estimation
results. Table 4 compares the SAR probit and standard probit estimates.
Diffusion of innovation and information
A last example looks at the information flow in social networks. Coleman et al. (1957, 1966) have
studied how innovations (i.e. new drugs) diffuse among physicians, until the new drug is widely
adopted by all physicians. They interviewed 246 physicians in 4 different US cities and looked at the
role of 3 interpersonal networks (friends, colleagues and discussion circles) in the diffusion process
(November, 1953 to February, 1955). Figure 4 illustrates one of the 3 social structures based on the
question "To whom do you most often turn for advice and information?". The data set is available at
http://moreno.ss.uci.edu/data.html#ckm, but is also part of spatialprobit (data(CKM)). See also Burt
(1987) and den Bulte and Lilien (2001) for further discussion of this data set. The dependent variable
in the model is the month in which a doctor first prescribed the new drug. Explanatory variables are
the social structure and individual variables.
Figure 2: MCMC trace plots for model parameters with horizontal lines marking the true parameters
Figure 3: Posterior densities for model parameters with vertical markers for the true parameters
Figure 4: Social network of physicians in 4 different cities based on "advisorship" relationship
                 LeSage (2009)         sarprobit             McSpatial ML          McSpatial GMM
Estimates        Mean       Std dev    Mean       Std dev    Mean       Std dev    Mean       Std dev
n = 400, m = 10 (LeSage, sarprobit); n = 400 (McSpatial)
β_1 = 0          -0.1844    0.0686     0.0385     0.0562     0.0286     0.0302     -0.1042    0.0729
β_2 = 1          0.9654     0.1179     0.9824     0.1139     1.0813     0.1115     0.7690     0.0815
β_3 = -1         -0.8816    0.1142     -1.0014    0.1163     -0.9788    0.1040     -0.7544    0.0829
ρ = 0.75         0.6653     0.0564     0.7139     0.0427     0.7322     0.0372     1.2208     0.1424
Time (sec.)      1,276 (a) / 623 (b)   11.5                  1.4 (c)               0.0 (c)
n = 1000, m = 1 (LeSage, sarprobit); n = 1000 (McSpatial)
β_1 = 0          0.05924    0.0438     -0.0859    0.0371     0.0010     0.0195     -0.1045    0.0421
β_2 = 1          0.96105    0.0729     0.9709     0.0709     1.0608     0.0713     0.7483     0.0494
β_3 = -1         -1.04398   0.0749     -0.9858    0.0755     -1.1014    0.0728     -0.7850    0.0528
ρ = 0.75         0.69476    0.0382     0.7590     0.0222     0.7227     0.0229     1.4254     0.0848
Time (sec.)      586 (a) / 813 (b)     15.5                  18.7 (c)              0.0 (c)
Table 2: Bayesian SAR probit estimates for n = 400 and n = 1000 with N = 1000 draws and 200
burn-in samples. m is the burn-in size in the Gibbs sampler drawing z. Timings in R include the
computation of marginal effects and were measured using R 2.15.2 on an Intel Core i7-2600 CPU
@ 3.40 GHz. Times marked with (a) are taken from LeSage and Pace (2009, Table 10.1). To allow for
a better comparison, these models were estimated anew (b) using MATLAB R2007b and the spatial
econometrics toolbox (LeSage, 2010) on the very same machine used for getting the R timings. The ML
and GMM timings (c) do not involve the computation of marginal effects.
                 m = 1                 m = 2                 m = 5                 m = 10
Estimates        Mean       Std dev    Mean       Std dev    Mean       Std dev    Mean       Std dev
β_1 = 0          0.0385     0.0571     0.0447     0.0585     0.0422     0.0571     0.0385     0.0562
β_2 = 1          1.0051     0.1146     0.8294     0.0960     0.9261     0.1146     0.9824     0.1139
β_3 = -1         -1.0264    0.1138     -0.8417    0.0989     -0.9446    0.1138     -1.0014    0.1163
ρ = 0.75         0.7226     0.0411     0.6427     0.0473     0.6922     0.0411     0.7139     0.0427
Time (sec.)      10.2                  10.2                  10.5                  11.0
Time (sec.) in
LeSage (2009)    195                   314                   -                     1270
Table 3: Effects of the Gibbs sampler burn-in size m on SAR probit estimates for n = 400, N = 1000
draws and burn.in=200
                 SAR probit                       Probit
Estimates        Mean       Std dev    p-level    Mean       Std dev    p-level
β_1 = -1         -1.2536    0.2004     0.0000     -1.4905    0.2077     0.0000
β_2 = 2          2.0524     0.2853     0.0000     1.9656     0.2812     0.0000
ρ = 0.3          0.2480     0.1057     0.0097
Time (sec.)      25.4
Table 4: SAR probit estimates vs. probit estimates for the random graph example
                 Probit                SAR Probit
Estimates        Mean       z value    Mean       z value
(Intercept)      0.3659     0.40       0.1309     0.16
influence        0.9042     2.67
city             0.0277     0.26       0.0371     0.39
med_sch_yr       0.2937     2.63       0.3226     3.22
meetings         0.1797     1.55       0.1813     1.64
jours            0.1592     2.46       0.1368     2.08
free_time        0.3179     2.31       0.3253     2.58
discuss          0.0010     0.01       0.0643     0.57
clubs            0.2366     2.36       0.2252     2.45
friends          0.0003     0.00       0.0185     0.31
community        0.1974     1.65       0.2298     2.08
patients         0.0402     0.67       0.0413     0.72
proximity        0.0284     0.45       0.0232     0.40
specialty        0.9693     8.53       1.0051     8.11
ρ                                      0.1138     1.33
Table 5: SAR probit and standard probit estimates for the Coleman data.
We are estimating a standard probit model as well as a SAR probit model based on the "advisorship"
network and trying to find determinants for all adopters of the new drug. We do not aim to fully
reanalyze the Coleman data set, nor to provide a detailed discussion of the results. We rather want to
illustrate the model estimation in R. We find a positive relationship for the social influence variable in
the probit model, but social contagion effects as captured by $\rho W$ in the more sound SAR probit model
are rather small and insignificant. This result suggests that social influence is a factor in information
diffusion, but the information flow might not be correctly described by a SAR model. Other drivers
for adoption which are ignored here, such as marketing efforts or aggressive pricing of the new drug,
may play a role in the diffusion process too (den Bulte and Lilien, 2001).
> library(spatialprobit)
> set.seed(12345)
> # load data set "CKM" and spatial weight matrices "W1","W2","W3"
> data(CKM)
> # 0/1 variable for early adopter
> y <- as.numeric(CKM$adoption.date <= "February, 1955")
> # create social influence variable
> influence <- as.double(W1 %*% as.numeric(y))
> # Estimate Standard probit model
> glm.W1 <- glm(y ~ influence + city + med_sch_yr + meetings + jours + free_time +
+ discuss + clubs + friends + community + patients + proximity + specialty,
+ data = CKM, family = binomial("probit"))
> summary(glm.W1, digits = 3)
> # Estimate SAR probit model without influence variable
> sarprobit.fit.W1 <- sarprobit(y ~ 1 + city + med_sch_yr + meetings + jours +
+ free_time + discuss + clubs + friends + community + patients + proximity +
+ specialty, data = CKM, W = W1)
> summary(sarprobit.fit.W1, digits = 3)
Table 5 presents the estimation results for the non-spatial probit and the SAR probit specification.
The coefficient estimates are similar in magnitude and sign, but the estimate for $\rho$ does not support the
idea of spatial correlation in this data set.
Parallel estimation of models
MCMC is, similar to the bootstrap, an embarrassingly parallel problem. It can be easily run in
parallel on several cores. From version 2.14.0, R offers a unified way of doing parallelization with the
parallel package. There are several different approaches available to achieve parallelization and not all
approaches are available for all platforms. See for example the conceptual differences between the two
main methods mclapply and parLapply, where the first will only work serially on Windows. Users are
therefore encouraged to read the parallel package documentation for choosing the appropriate way.
Here, we only sketch how easily the SAR probit estimation can be done in parallel with 2 tasks:
> library(parallel)
> mc <- 2
> run1 <- function(...) sarprobit(y ~ X - 1, W, ndraw = 5, burn.in = 2,
+ thinning = 1)
> system.time({
+ set.seed(123, "L'Ecuyer")
+ sarprobit.res <- do.call(c, mclapply(seq_len(mc), run1))
+ })
> summary(sarprobit.res)
Due to the overhead in setting up the cluster, it is reasonable to expect another 50% performance
gain when working with 2 CPUs.
Summary
In this article we presented the estimation of spatial probit models in R and pointed to the critical
implementation issues. Our performance studies showed that even large problems with n = 10,000
or n = 100,000 observations can be handled within reasonable time. We provided an update of
tmvtnorm and a new package spatialprobit on CRAN with the methods for estimating spatial probit
models implemented (Wilhelm and de Matos, 2013). The package currently implements three limited
dependent models: the spatial lag probit model (sarprobit()), the probit model with spatial errors
(semprobit()) and the SAR Tobit model (sartobit()). The Bayesian approach can be further extended
to other limited dependent spatial models, such as ordered probit or models with multiple spatial
weights matrices. We are planning to include these in the package in the near future.
Bibliography
J. Albert. Bayesian Computation in R. Springer, 2007. [p131]
J. Albert. LearnBayes: Functions for Learning Bayesian Inference, 2012. URL
http://CRAN.R-project.org/package=LearnBayes. R package version 2.12. [p131]
D. Bates and M. Maechler. Matrix: Sparse and Dense Matrix Classes and Methods, 2013. URL
http://CRAN.R-project.org/package=Matrix. R package version 1.0-12. [p130]
R. Bivand. Computing the Jacobian in spatial models: an applied survey. Discussion paper /
Norwegian School of Economics and Business Administration, Department of Economics; 2010,20,
August 2010. URL http://brage.bibsys.no/nhh/bitstream/URN:NBN:no-bibsys_brage_2393/1/dp21-2.pdf.
[p134]
R. Bivand. spdep: Spatial dependence: weighting schemes, statistics and models, 2013. URL
http://CRAN.R-project.org/package=spdep. R package version 0.5-57. [p130]
R. S. Burt. Social contagion and innovation: Cohesion versus structural equivalence. American Journal
of Sociology, 92(6):1287–1335, 1987. [p137]
C. T. Butts. sna: Tools for Social Network Analysis, 2013. URL http://CRAN.R-project.org/package=sna.
R package version 2.3-1. [p130]
C. T. Butts, M. S. Handcock, and D. R. Hunter. network: Classes for Relational Data. Irvine, CA, 2013.
URL http://statnet.org/. R package version 1.7-2. [p130]
J. Coleman, E. Katz, and H. Menzel. The diffusion of an innovation among physicians. Sociometry, 20:
253–270, 1957. [p137]
J. S. Coleman, E. Katz, and H. Menzel. Medical Innovation: A Diffusion Study. New York: Bobbs Merrill,
1966. [p137]
G. Csardi and T. Nepusz. The igraph software package for complex network research. InterJournal,
Complex Systems:1695, 2006. URL http://igraph.sf.net. [p135]
C. V. den Bulte and G. L. Lilien. Medical innovation revisited: Social contagion versus marketing
effort. American Journal of Sociology, 106(5):1409–1435, 2001. [p137, 140]
P. Diggle and P. Ribeiro. Model Based Geostatistics. Springer, New York, 2007. [p130]
A. O. Finley and S. Banerjee. spBayes: Univariate and Multivariate Spatial Modeling, 2013. URL
http://CRAN.R-project.org/package=spBayes. R package version 0.3-7. [p130]
R. J. Franzese, J. C. Hays, and S. Cook. Spatial-, temporal-, and spatiotemporal-autoregressive probit
models of interdependent binary outcomes: Estimation and interpretation. Prepared for the Spatial
Models of Politics in Europe & Beyond, Texas A&M University, February 2013. [p131]
A. Genz, F. Bretz, T. Miwa, X. Mi, F. Leisch, F. Scheipl, and T. Hothorn. mvtnorm: Multivariate Normal
and t Distributions, 2013. URL http://CRAN.R-project.org/package=mvtnorm. R package version
0.9-9995. [p131]
J. F. Geweke. Efficient simulation from the multivariate normal and Student-t distributions subject to
linear constraints and the evaluation of constraint probabilities, 1991. URL
http://www.biz.uiowa.edu/faculty/jgeweke/papers/paper47/paper47.pdf. [p132]
J. F. Geweke. Contemporary Bayesian Econometrics and Statistics. John Wiley and Sons, 2005. [p132]
T. Klier and D. P. McMillen. Clustering of auto supplier plants in the United States: Generalized
method of moments spatial probit for large samples. Journal of Business and Economic Statistics, 26:
460–471, 2008. [p131]
R. Koenker and P. Ng. SparseM: Sparse Linear Algebra, 2013. URL
http://CRAN.R-project.org/package=SparseM. R package version 0.99. [p130]
J. LeSage and R. K. Pace. Introduction to Spatial Econometrics. CRC Press, 2009. [p130, 131, 132, 133, 134,
135, 139]
J. P. LeSage. Bayesian estimation of limited dependent variable spatial autoregressive models.
Geographical Analysis, 32(1):19–35, 2000. [p130]
J. P. LeSage. Spatial econometrics toolbox for MATLAB, March 2010. URL
http://www.spatial-econometrics.com/. [p134, 139]
J. P. LeSage, R. K. Pace, N. Lam, R. Campanella, and X. Liu. New Orleans business recovery in the
aftermath of hurricane Katrina. Journal of the Royal Statistical Society: Series A (Statistics in Society),
174:1007–1027, 2011. [p131]
J. J. Majure and A. Gebhardt. sgeostat: An Object-oriented Framework for Geostatistical Modeling in S+,
2013. URL http://CRAN.R-project.org/package=sgeostat. R package version 1.0-25; S original
by James J. Majure Iowa State University and R port + extensions by Albrecht Gebhardt. [p130]
T. L. Marsh, R. C. Mittelhammer, and R. G. Huffaker. Probit with spatial correlation by field plot:
Potato leafroll virus net necrosis in potatoes. Journal of Agricultural, Biological, and Environmental
Statistics, 5:22–36, 2000. URL http://www.jstor.org/stable/14629. [p131]
D. McMillen. McSpatial: Nonparametric spatial data analysis, 2013. URL
http://CRAN.R-project.org/package=McSpatial. R package version 2.0. [p130, 131]
R. K. Pace and R. Barry. Quick computation of spatial autoregressive estimators. Geographical Analysis,
29:232–247, 1997. [p134]
R. K. Pace and J. LeSage. Chebyshev approximation of log-determinants of spatial weight matrices.
Computational Statistics & Data Analysis, 45(2):179–196, 2004. [p134]
G. Piras. sphet: Spatial models with heteroskedastic innovations in R. Journal of Statistical Software, 35
(1):1–21, 2010. URL http://www.jstatsoft.org/v35/i1/. [p130]
W. N. Venables and B. D. Ripley. Modern Applied Statistics with S. Springer, 2002. [p130]
S. Wilhelm and M. G. de Matos. spatialprobit: Spatial Probit Models, 2013. URL
http://CRAN.R-project.org/package=spatialprobit. R package version 0.9-9. [p130, 134, 141]
S. Wilhelm and B. G. Manjunath. tmvtnorm: A package for the truncated multivariate normal
distribution. The R Journal, 2(1):25–29, June 2010. URL
http://journal.r-project.org/archive/21-1/RJournal_21-1_Wilhelm+Manjunath.pdf. [p132]
S. Wilhelm and B. G. Manjunath. tmvtnorm: Truncated Multivariate Normal and Student t Distribution,
2013. URL http://CRAN.R-project.org/package=tmvtnorm. R package version 1.4-8. [p131, 132]
J. M. Wooldridge. Introductory Econometrics - A Modern Approach. South Western College Publishing,
2002. [p131]
Stefan Wilhelm
Department of Finance, WWZ (Wirtschaftswissenschaftliches Zentrum)
University of Basel
Switzerland
[email protected]
Miguel Godinho de Matos
Department of Engineering & Public Policy
Carnegie Mellon University
United States
[email protected]