ARM SAA EC EDR Slides PDF

Methods for Small Area Analyses
of Spatial and Space-time Data

Evan Carey
Robert Penfold
Elisabeth Dowling Root
AcademyHealth Conference, Seattle, WA

June 25, 2018
Outline
• Introduction
• Challenges of spatial data
• Representing space and defining spatial
relationships
• Spatial autocorrelation
• Focus on analysis techniques for area data
– Disease mapping & BYM CAR Models
• Focus on analysis techniques for continuous
data
Part 1: Foundational Concepts
• Why do I care about space: is space a parameter of interest,
or a nuisance parameter?
• What are different ways spatial data can be represented in
my data?
• How do I define ‘near’ and ‘far’?
• What does autocorrelation mean?
• How does spatial autocorrelation differ from spatial trends?
• Why is data irregularly distributed across space challenging
to model?
• How is this connected to small area analysis?
• What does ‘shrinkage’ mean, and why does it improve
models?
Why do you care about space?
I am interested in the relationship between location and my outcome.
• I want to identify areas with high or low disease rates.
• Potentially create maps showing above/below average outcomes.
• I want to estimate the effect of space!
I am not interested in the effect of location, but my data has spatial nature…
• Ignoring space in your models may give you biased results/incorrect p-values
• Correctly modeling space fixes the issue.
• Space is a ‘nuisance’ parameter here.
Geospatial Data & Public Health
Geographic data, Geographic Information Systems
(GIS), and spatial analysis provide public health officials
with the capability to perform two unique types of
analysis:
1. Find statistically significant areas of high or low incidence
2. Examine the spatial relationship between health
outcomes and population/contextual factors
Geographic Variation in Health
• People (demographics) and the risk factors
contributing to health are dispersed unevenly across
communities and regions
• Often we are interested in identifying patterns of
disease (or some other health outcome) across space
• We are also interested in understanding the reasons for
these patterns:
– Composition: differences in kinds of people who live in
places
– Context: differences in neighborhood or area-level physical
or social environments
But…“spatial is special”
• Data that are referenced to location bring important
additional information to your data analysis
• But, spatially referenced data also bring special
problems to your analysis
– heterogeneity of observational units → heteroskedasticity
– spatial autocorrelation → residual dependence
• A consequence of these “special problems” is that
traditional assumptions of standard regression
techniques are violated
– statistical inference from such a model is not valid
Spatial data is complex
• The methods we choose to cope with the
complexities of spatial data depend on how we
define space
– Discrete geographic phenomena have spatial bounds.
Locations may be within or outside a geographic
feature.
• Areal data: census tracts, counties, states
– Continuous geographic phenomena have properties
continuously distributed across the landscape.
Locations are specific and have value.
• Point data
• These definitions of space are represented by
different geographic data types
What are Spatial Data
• Location
• Attributes
• Spatial Relationships

Spatial data:
Object: Home
longitude, latitude (x, y): 76.9147, 107.6098
Object: Health Center

Attribute data: Survey data
ID  Tract  ChildDth  Race   DistPCP
1   1237   Yes       White  5000
2   1237   No        AA     3560
3   1238   No        White  10789
4   1238   No        Asian  7689

Attribute data: Census tract/PCSA characteristics
Tract  PctPov  PctAA  Foreclose  PCP
1237   .056    .241   .011       1
1238   .079    .443   .043       3
1239   .151    .078   .225       10
1240   .224    .011   .105       0

Spatial Relationships:
• Proximity to physician
• “Contained in” census tract
Spatial Data Types
Event Data (Points)
Lattice Data (Areas)
Geostatistical Data (Grid)

It’s important to understand that these
designations are not mutually exclusive
Points can be geolocated in some relevant
areal units
These aggregations can be used to produce rates
[Figure: map of areal units, each labeled with its rate]
GIS
Spatial Analysis (“Spatial Data Production”)
Spatial Data Analysis (“Spatial Statistics”)

Event (Point) Data: Point Pattern Analysis, Spatial Epidemiology, Crime Analysis
Lattice (Area) Data: Regional Count Data, Spatial Econometrics, Spatial Regression Analysis
Geostatistical Data: Spatial Prediction
Thinking in one dimension:
Does time affect the outcome?
Is there a time trend?
Spatial Autocorrelation and Trends (2D)
“Everything is related to everything else, but near
things are more related than distant things.”
• Correlation in space
– Is a variable in a location correlated with the values in nearby
places?
• Spatial trends in the outcome
– The outcome differs systematically as a function of spatial
location.
These are distinct concepts!

* Humans are pretty bad at identifying spatial trends by eye. We
tend to overinterpret noise when it is on a map ☺
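One common way to put a number on "correlation in space" is the global Moran's I statistic. The slides do not name it, so treat the sketch below as an illustration under assumptions: the six areas, their values, and the chain-shaped neighbor structure are all hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a spatial weight matrix w[i][j]."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighboring pairs.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def chain_weights(n):
    """Hypothetical layout: areas in a row, each neighboring the adjacent ones."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
i_clustered = morans_i([1, 1, 1, 5, 5, 5], w)    # similar values sit together
i_alternating = morans_i([1, 5, 1, 5, 1, 5], w)  # neighbors always differ
print(i_clustered, i_alternating)
```

Positive values indicate that neighbors tend to be alike, negative values that they tend to differ. The statistic depends on how "neighbor" is defined, which is the topic of the next slides.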
Defining spatial relationships
• What is a neighbor? What’s next to what?
• These spatial relationships can be defined in a
number of ways
– Contiguity (common boundary, K-nearest
neighbors)
• What is a “shared” boundary?
• How many “neighbors” to include?
– Distance (distance band)
• What distance do we use?
Contiguity-based neighbors
• For areas:
– All polygons that share a common border
• For points:
– Distance (Euclidean distance; e.g., bands of 1 km and 1.5 km)
– K-nearest neighbors (KNN; e.g., k=1, k=2, k=3)
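The two neighbor definitions for points can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the coordinates (in km) are hypothetical, and the function names are made up for the example.

```python
import math


def distance_band_neighbors(points, radius):
    """Neighbors of each point: all other points within `radius` (Euclidean)."""
    return [
        [j for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius]
        for i, p in enumerate(points)
    ]


def knn_neighbors(points, k):
    """Neighbors of each point: its k nearest other points (Euclidean)."""
    result = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(others[:k])
    return result


# Hypothetical point locations, in km.
points = [(0.0, 0.0), (0.8, 0.0), (0.0, 1.4), (3.0, 3.0)]
band_1km = distance_band_neighbors(points, 1.0)
band_1_5km = distance_band_neighbors(points, 1.5)
knn_2 = knn_neighbors(points, 2)
print(band_1km, band_1_5km, knn_2)
```

With a distance band, the remote fourth point has no neighbors at either radius, while KNN always assigns it k neighbors regardless of how far away they are. That trade-off is one reason the choice of definition matters for sparse data.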

The problem with sparse data…
General Shrinkage Idea
If we have observed last year’s hospital mortality
rate, what is your best prediction of next year’s
hospital mortality rate?
[Figure: hospital mortality rates plotted on an axis from Low to High]
Only use information from each hospital to predict
mortality.
No pooling of information (no shrinkage!)
Share (pool) information across hospitals.
Prediction is ‘shrunk’ towards the mean.
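A rough numerical sketch of the pooling idea follows. It assumes a simple weighted average in which each hospital's weight is n / (n + prior_strength); that weighting rule, the `prior_strength` value, and the hospital counts are all hypothetical choices for illustration, not a method stated in the slides.

```python
def shrink_rates(deaths, patients, prior_strength):
    """Shrink each hospital's raw rate toward the overall (pooled) rate."""
    overall = sum(deaths) / sum(patients)
    shrunk = []
    for d, n in zip(deaths, patients):
        # Hospitals with more patients keep more of their own raw rate.
        weight = n / (n + prior_strength)
        shrunk.append(weight * (d / n) + (1 - weight) * overall)
    return overall, shrunk


# Hypothetical hospitals: small, large, medium.
deaths = [1, 30, 12]
patients = [5, 500, 100]
raw = [d / n for d, n in zip(deaths, patients)]
overall, shrunk = shrink_rates(deaths, patients, prior_strength=50)
for r, s, n in zip(raw, shrunk, patients):
    print(f"n={n:4d}  raw={r:.3f}  shrunk={s:.3f}")
```

The small hospital's estimate moves most of the way to the overall rate, while the large hospital's barely changes. Setting `prior_strength` to 0 reproduces the no-pooling case.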
Sharing Spatial Data (Shrinkage)
Census Tract   Events/Population   Rate
A              2/8                 0.25
B              4/20                0.2
C              1/45                0.02
D              2/25                0.08
E              1/10                0.1
F              3/30                0.1
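Using the event and population counts shown on this slide, a fully pooled rate can be worked out directly. The sketch assumes that tracts B through F are all treated as neighbors of Tract A and that pooling means summing events and populations; both are simplifying assumptions for illustration.

```python
# (events, population) per census tract, as read from the slide.
tracts = {
    "A": (2, 8),
    "B": (4, 20),
    "C": (1, 45),
    "D": (2, 25),
    "E": (1, 10),
    "F": (3, 30),
}


def pooled_rate(names):
    """Total events divided by total population over the named tracts."""
    events = sum(tracts[name][0] for name in names)
    population = sum(tracts[name][1] for name in names)
    return events / population


raw_a = pooled_rate(["A"])
pooled_a = pooled_rate(list(tracts))  # assumes every tract neighbors A
print(f"raw rate in A = {raw_a:.3f}, pooled rate = {pooled_a:.3f}")
```

Tract A's raw rate rests on only 8 people, so pooling pulls it well below 0.25. Practical smoothers fall between these two extremes by weighting each tract's own data against its neighbors'.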

Focus on methods for
continuously indexed data
Spatial models implemented with R-
INLA
Motivating example: Outcomes of
Veterans in Colorado
Goal: Identify areas of high and low event probability.
What does the ideal method need to have?
Ideal method
• Identify spatial trend and make predictions at
all points.
• Resilient to irregularly spaced data (small area
analysis!)
• Exhibit shrinkage / stabilization
• Incorporate other patient level traits in the
model (‘adjust’)
• Converge in reasonable time in medium to
large datasets
Point pattern analysis versus point
referenced models.
Binary Outcome = Patient Location + Patient Demographics
http://open.lib.umn.edu/mapping/chapter/6-analysis/
Community care utilization in Colorado
(data simulation – no PHI here!)
Simulating Success of Community care
Referrals in the VHA
• Simulation 1:
– no spatial trend (pure spatial noise)
• Simulations 2-4:
– Spatial trend of varying strengths.
How successful are different methods at recovering
the underlying spatial trends of the binomial
process?
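A simulation of this kind can be sketched as follows. This is not the presenters' simulation: the unit-square region, the uniform patient locations, the west-to-east logistic trend, and the `trend_strength` values are all assumptions chosen to keep the example short.

```python
import math
import random


def simulate_outcomes(n, trend_strength, seed=0):
    """Simulate n binary outcomes at random locations in the unit square.

    The success probability follows a logistic west-to-east trend;
    trend_strength=0 gives pure spatial noise.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        logit = -1.0 + trend_strength * (x - 0.5)
        p = 1.0 / (1.0 + math.exp(-logit))
        outcome = 1 if rng.random() < p else 0
        data.append((x, y, outcome))
    return data


def success_rate(data, west):
    """Observed success rate in the western (x < 0.5) or eastern half."""
    side = [o for x, _, o in data if (x < 0.5) == west]
    return sum(side) / len(side)


no_trend = simulate_outcomes(2000, trend_strength=0.0)
strong_trend = simulate_outcomes(2000, trend_strength=4.0)
print(success_rate(no_trend, True), success_rate(no_trend, False))
print(success_rate(strong_trend, True), success_rate(strong_trend, False))
```

Because the true probability surface is known, any smoothing or modeling method can be judged by how closely its estimated surface matches it.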
Method 1: Simple Interpolation (2D
Smoother)
• Use a 2D smoother:
– Gaussian kernel weighting
– Allows smoothing of binary process at irregularly
spaced locations.
– Can compute mean and variance across space.
– Nadaraya-Watson smoother (Nadaraya, 1964,
1989; Watson, 1964)
• What results do you expect to get using this
method?
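The smoother described above can be written compactly: the estimate at a location is the Gaussian-kernel-weighted mean of the observed binary outcomes. The observations and the bandwidth of 0.2 below are hypothetical, and the sketch covers only the mean, not the variance.

```python
import math


def nw_smooth(observations, x0, y0, bandwidth):
    """Nadaraya-Watson estimate at (x0, y0) with a Gaussian kernel.

    observations: iterable of (x, y, outcome) with outcome in {0, 1}.
    """
    num = 0.0
    den = 0.0
    for x, y, outcome in observations:
        d2 = (x - x0) ** 2 + (y - y0) ** 2
        w = math.exp(-d2 / (2 * bandwidth ** 2))  # closer points weigh more
        num += w * outcome
        den += w
    return num / den


# Hypothetical data: failures in the southwest, successes in the northeast.
observations = [
    (0.10, 0.10, 0), (0.20, 0.15, 0), (0.15, 0.30, 0),
    (0.80, 0.90, 1), (0.90, 0.80, 1), (0.85, 0.70, 1),
    (0.50, 0.50, 1),
]
est_sw = nw_smooth(observations, 0.1, 0.1, bandwidth=0.2)
est_ne = nw_smooth(observations, 0.9, 0.9, bandwidth=0.2)
print(est_sw, est_ne)
```

The bandwidth controls how far information is borrowed. With a single global bandwidth, sparse regions are estimated from very few effective observations, which is worth keeping in mind when answering the question on the slide.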
Results for data with no spatial trend.
Results for data with a spatial trend (simulation 2).
Results for data with a spatial trend (simulations 3 and 4).
Spatial Models with R-INLA
• Integrated Nested Laplace Approximation (INLA). An
alternative to MCMC for fitting Bayesian models.
• Latent Gaussian models
– Fixed effects, structured and unstructured Gaussian random
effects combined linearly with likelihoods specified.
– ‘focus on the continuous representation of the GRF through an
(stochastic partial differential equation) SPDE’
• Coding is straightforward via R-INLA package.
• Convergence is fast in medium to larger datasets.
• The problem with large spatial data…most traditional
methods of spatial inference require inversion of the
covariance matrix, which is an O(n³) calculation!
Bakka, Haakon, Håvard Rue, Geir-Arne Fuglstad, Andrea Riebler, David Bolin, Elias Krainski, Daniel Simpson, and Finn Lindgren. “Spatial
Modelling with R-INLA: A Review.” arXiv:1802.06350 [stat], February 18, 2018. http://arxiv.org/abs/1802.06350.
R-INLA process
• Import data, use R-INLA package for ease of model
specification and fitting.
• Construct mesh for notion of spatial location:
– Helper functions in R-INLA.
– Expand mesh beyond boundaries of data
– Experiment with density of nodes.
• Connect mesh to observations (output is matrix)
• Create the model
– Spatial effect is connected to the mesh/observations object
– Other patient level effects not connected to location matrix.
• Fit the model.
• Results: Summarize hyperparameter distributions.
• Results: Make predictions on a dense grid of the region.
Construct mesh.
Results for data with no spatial trend.
Results for data with a spatial trend (simulation 2).
Results for data with a spatial trend (simulations 3 and 4).
Comparison to other methods
• R-INLA is easy to implement and works
well in larger datasets.
• Bayesian framework allows hierarchical model
specification, and flexible summary of the
posterior.
• Review article evaluated 7 possible approaches
to this problem:
– R-INLA and Fixed Rank Kriging performed optimally in
larger datasets (memory usage and CPU time)
– Methods generally provided similar estimates.
Bradley, Jonathan R., Noel Cressie, and Tao Shi. “A Comparison of Spatial Predictors When Datasets Could Be Very Large.” Statistics
Surveys 10 (2016): 100–131. https://doi.org/10.1214/16-SS115.
Focus on methods for
areal data
Spatial smoothing: Headbanging, Locally weighted
averaging, and Bayesian CARs
Elisabeth Dowling Root, MA, PhD

Department of Geography & Division of Epidemiology
The Ohio State University
Focus Area Name

Mapping Rates
• For small areas, rates and mortality ratios are
very unstable and maps of rates can be
misleading
– AND rates are spatially correlated
• Trade-off between geographic resolution and the
variability of mapped estimates
• Spatial smoothing can reduce the random noise
in maps of observable health data
– Highlight meaningful geographic patterns in the underlying risk
Shrinkage Estimation and Spatial
Smoothing
• Shrinkage methods are often used to stabilize rates
across small areas
– Smoothed estimates for each area “borrow strength” (precision) from
data in other areas by an amount depending on the precision of the raw
estimate of each area
• Estimated rate in area A is adjusted by combining
knowledge about:
– Observed rate in that area
– Average rate in surrounding areas
• The two rates are combined using some form of
weighted average; the weights depend on the population
size in area A
There are many techniques for spatial
smoothing
• Locally Weighted Average (Anselin, 2006)
– Smooths toward the mean
– Area value replaced by population weighted average of
surrounding areas
• Headbanging (Mungiole, Pickle, Simonson, 1999)
– Smooths toward the median
– Area values replaced if large deviation from the median
and population is not large
• Bayesian Hierarchical (CAR) Models (Lawson, 2013)
– Smooths toward the mean
– Area values calculated using a CAR model with a spatial
random effect term
Headbanging
[Figure: map of Census Tract A and its neighboring tracts, each labeled with Rate and N]
Rate   N
0.12   35
0.2    20
0.02   45
0.15   22
0.04   55
0.08   25
0.3    8    (Census Tract A)
0.1    10
0.1    30
0.02   45
0.03   60
Headbanging uses the median, but this technique can also
be applied to the neighborhood mean
Headbanging
Is center value between high and low medians? -- NO
Is the population much greater than neighbors? -- NO
REPLACE!! Census Tract A (Rate=0.3, N=8): RATE = 0.09
[Figure: map of Census Tract A and its neighboring tracts, each labeled with Rate and N]
Rate   N    Weighted Rate   Median
0.02   45   0.027
0.02   45   0.027
0.10   10   0.030           low 50%
0.20   8    0.048
0.03   60   0.054
0.08   25   0.060
0.04   55   0.066
0.10   30   0.090           high 50%
0.15   22   0.099
0.12   35   0.125
Headbanging
Is center value between high and low medians? -- NO
Is the population much greater than neighbors? -- YES
DON’T REPLACE!! Census Tract A (Rate=0.3, N=200)
[Figure: map of Census Tract A and its neighboring tracts, each labeled with Rate and N]
Rate   N    Weighted Rate   Median
0.02   45   0.027
0.02   45   0.027
0.10   10   0.030           low 50%
0.20   8    0.048
0.03   60   0.054
0.08   25   0.060
0.04   55   0.066
0.10   30   0.090           high 50%
0.15   22   0.099
0.12   35   0.125
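The two decisions on these slides can be reproduced with a simplified sketch. It follows the slide's table as read here, where the weighted rate equals rate × N divided by the mean N of the listed tracts; that formula is inferred from the numbers, not stated on the slide. The rule for "population much greater than neighbors" is not defined in the slides, so `pop_factor=3.0` is a hypothetical threshold. The full headbanging algorithm is more involved than this.

```python
from statistics import median


def half_medians(rows):
    """rows: (rate, population) pairs. Returns (low, high) half medians
    of the population-weighted rates."""
    mean_pop = sum(n for _, n in rows) / len(rows)
    weighted = sorted(rate * n / mean_pop for rate, n in rows)
    half = len(weighted) // 2
    return median(weighted[:half]), median(weighted[-half:])


def smooth_center(center_rate, center_pop, rows, pop_factor=3.0):
    """Keep or replace the center tract's rate (simplified headbanging)."""
    low, high = half_medians(rows)
    if low <= center_rate <= high:
        return center_rate  # already between the two medians
    typical_pop = median([n for _, n in rows])
    if center_pop > pop_factor * typical_pop:
        return center_rate  # large population: trust the observed rate
    return low if center_rate < low else high


# Rows of the slide's table: (rate, N).
rows = [(0.02, 45), (0.02, 45), (0.10, 10), (0.20, 8), (0.03, 60),
        (0.08, 25), (0.04, 55), (0.10, 30), (0.15, 22), (0.12, 35)]

low, high = half_medians(rows)
replaced = smooth_center(0.3, 8, rows)   # Census Tract A with N=8
kept = smooth_center(0.3, 200, rows)     # Census Tract A with N=200
print(round(low, 3), round(high, 3), round(replaced, 2), kept)
```

With N=8 the center rate of 0.3 is replaced by the high median (about 0.09), and with N=200 it is kept, matching the two slides.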
Example: Data Privacy and Spatial
Smoothing
Standardized Mortality Ratio
• Standardized Mortality Ratios show locations on a map with higher than
expected rates given the age, sex, and other demographic distribution of the
population in that area
$$SMR_i = \frac{Y_i}{E_i}$$
$Y_i$ is the observed number of events
$E_i$ is the expected number of events
$$E_i = \sum_j p_j n_{ij}$$
j is the population stratum (e.g., age*sex*race)
$p_j$ is the event rate in stratum j of the reference population
$n_{ij}$ is the number of people in area i in stratum j
• Spatial SMRs also smooth rates using surrounding area observed/expected rates
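As a worked example of the expected count and the ratio of observed to expected events, the sketch below uses two hypothetical age strata. The reference rates, the area's population, and the observed count of 21 are made-up numbers, and the ratio is left unscaled so that a value above 1 means more events than expected.

```python
def expected_events(reference_rates, area_population):
    """E_i: sum over strata j of (reference rate p_j) x (people n_ij)."""
    return sum(reference_rates[s] * n for s, n in area_population.items())


def smr(observed, expected):
    """Ratio of observed to expected events."""
    return observed / expected


# Hypothetical reference rates per stratum and population of one area.
reference_rates = {"under 65": 0.001, "65 and over": 0.01}
area_population = {"under 65": 4000, "65 and over": 1000}

expected = expected_events(reference_rates, area_population)
ratio = smr(21, expected)
print(f"expected = {expected:.1f}, SMR = {ratio:.2f}")
```

Here the area has 21 observed events against 14 expected, so it shows 50% more events than its age structure alone would predict.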
Model for spatially smoothed SMRs
$$Y_i \mid \mu_i \sim \text{Poisson}(\mu_i)$$
$$\log \mu_i = \log E_i + b_i$$
$$b_i \mid b_{j \neq i} \sim N\left( \frac{\sum_{j \neq i} w_{ij} b_j}{\sum_{j \neq i} w_{ij}},\ \frac{\sigma^2}{\sum_{j \neq i} w_{ij}} \right)$$
– $b_i$ are area-specific random effects with a correlated random effect distribution
– $w_{ij}$ are weights defining which regions j and i are neighbors
– $\sigma^2$ is the variance controlling how similar $b_i$ is to its neighbors
• In a Bayesian framework, weights depend on the precision of the
SMR ($1/E_i$) in area i and the variability (heterogeneity) of the true
risks across areas (local or regional mean)
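The conditional distribution of one area's random effect given its neighbors can be computed directly: the mean is the weighted average of the neighbors' effects, and the variance is σ² divided by the total neighbor weight. The sketch below evaluates that distribution only and does not fit the model. The four areas, their adjacency, the effect values, and σ² = 0.5 are hypothetical.

```python
def car_conditional(i, b, w, sigma2):
    """Conditional mean and variance of b[i] given all other effects.

    b: current random effects; w: neighbor weight matrix (w[i][j] > 0 if
    areas i and j are neighbors); sigma2: the variance parameter.
    """
    others = [j for j in range(len(b)) if j != i]
    total_weight = sum(w[i][j] for j in others)
    mean = sum(w[i][j] * b[j] for j in others) / total_weight
    variance = sigma2 / total_weight
    return mean, variance


# Hypothetical binary adjacency: area 0 borders 1 and 2; area 3 borders 1 and 2.
w = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
]
b = [0.0, 0.2, 0.4, -0.1]

mean_0, var_0 = car_conditional(0, b, w, sigma2=0.5)
mean_1, var_1 = car_conditional(1, b, w, sigma2=0.5)
print(mean_0, var_0, mean_1, var_1)
```

Area 1 has three neighbors and area 0 has two, so area 1's conditional variance is smaller: the more neighbors an area has, the more tightly its effect is tied to their average.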
Spatially smoothed SMRs
• The raw and smoothed standardized mortality ratios ($SMR_i$ and $\widehat{SMR}_i$) are:
$$SMR_i = \frac{Y_i}{E_i}$$
$$\widehat{SMR}_i = \frac{\hat{\mu}_i}{E_i}$$
• For areas with lots of data:
$$SMR_i \approx \widehat{SMR}_i$$
• For areas with sparse data:
$$\widehat{SMR}_i \approx \text{weighted average of SMR in the neighboring areas}$$
$SMR_i$ vs. $\widehat{SMR}_i$
(Age/Sex/Race Adjusted Suicides)
$SMR_i$ vs. $\widehat{SMR}_i$ in Dayton
Classifying areas with excess
(or lower) risk
• Classify an area as having an elevated/lower
risk if:
– Posterior probabilities [Prob (SMRi > 1)] > 0.8
– Outside 95% credible interval
• High specificity
– (false detection < 10%)
• Sensitivity 60%-95% for Ei of 5-20 and true
SMRi of 1.5-3.0
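Given posterior draws of an area's SMR from a fitted model, the posterior probability rule reduces to counting draws above 1. The sketch below assumes such draws are already available; the sample values are hypothetical, and applying the same 0.8 cutoff to Prob(SMR < 1) for "lower" risk is an assumption, since the slide states the rule only for Prob(SMR > 1).

```python
def exceedance_probability(samples, threshold=1.0):
    """Share of posterior draws above the threshold: Prob(SMR > threshold)."""
    return sum(s > threshold for s in samples) / len(samples)


def classify(samples, cutoff=0.8):
    """Label an area from its posterior SMR draws."""
    p_high = exceedance_probability(samples)
    p_low = sum(s < 1.0 for s in samples) / len(samples)
    if p_high > cutoff:
        return "elevated"
    if p_low > cutoff:  # assumed symmetric rule for lower risk
        return "lower"
    return "not classified"


# Hypothetical posterior draws for three areas.
area_high = [0.9, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
area_mixed = [0.6, 0.7, 0.8, 0.9, 0.95, 1.05, 1.1, 1.2, 1.3, 1.4]
area_low = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.9, 1.1]

labels = [classify(a) for a in (area_high, area_mixed, area_low)]
print(labels)
```

In practice the draws would come from MCMC output or from the posterior marginals of an INLA fit, with far more than ten samples per area.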
Areas of excess/less risk
Thoughts on when to smooth
Smoothing should be considered when:
1. The addition of one event or one more person at risk
results in a large difference in the rate (e.g., a change
of 25% or more)
2. The number of events that form the numerator is ≤ 3
3. The number of persons at risk per region is small and
the numbers change by an order of magnitude across
a region (e.g., 10 people in tract A vs. 100 people in
tract B)
*Smoothing reduces noise and makes trends and
patterns clearer
Methodology can be extended to
multivariate models
[Figure, left panel: Bayesian CAR model with maternal demographics]
[Figure, right panel: Bayesian CAR model with demographics + tract-level SDHs]
Software
• Estimation of Bayesian models requires
computationally intensive simulation methods
(MCMC)
– Implemented in free WinBUGS and GeoBUGS
software: www.mrc-bsu.cam.ac.uk/bugs
– Also R package CARBayes
• R package INLA implements fast approximation: www.r-inla.org
– R package diseasemapping calls INLA specifically for
disease mapping