Variograms: Brian Klinkenberg Geography
Brian Klinkenberg
Geography
To get there, we’ll consider
Correlation
Tobler’s first law of geography
Spatial autocorrelation
Three dimensions of scale
Units of observation
Regionalized variables
Variograms
Correlation
What is correlation? What does it measure?
[n] a statistical relation between two or more
variables such that systematic changes in the value
of one variable are accompanied by systematic
changes in the other
[n] a statistic representing how closely two variables
co-vary; it can vary from -1 (perfect negative
correlation) through 0 (no correlation) to +1 (perfect
positive correlation); "what is the correlation
between those two variables?"
[n] a reciprocal relation between two or more things
[Source: http://www.hyperdictionary.com/dictionary/correlation]
Correlation
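Suppose we have two variables X and Y, with means $\bar{X}$ and $\bar{Y}$ and
standard deviations $S_X$ and $S_Y$ respectively. The
correlation is computed as:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n-1)\, S_X S_Y}$$
For a single variable x observed at n locations, with spatial weights $w_{ij}$
between locations i and j, the analogous measure of spatial autocorrelation
(Geary's c) is:
$$c = \frac{(n-1) \sum_i \sum_j w_{ij} (x_i - x_j)^2}{2 \left( \sum_i \sum_j w_{ij} \right) \sum_i (x_i - \bar{x})^2}$$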
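As a rough illustration, the sketch below computes the ordinary correlation between two variables and Geary's c, a spatial autocorrelation statistic built from weighted squared differences. The function names and the simple "adjacent neighbours on a transect" weight matrix are assumptions made for this example only.

```python
import math

def pearson_r(x, y):
    """Correlation between two equal-length sequences (sample std. dev., n - 1)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
    sy = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    cov = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
    return cov / (sx * sy)

def gearys_c(x, w):
    """Geary's c for values x and an n-by-n spatial weight matrix w."""
    n = len(x)
    xbar = sum(x) / n
    num = (n - 1) * sum(w[i][j] * (x[i] - x[j]) ** 2
                        for i in range(n) for j in range(n))
    den = 2 * sum(sum(row) for row in w) * sum((v - xbar) ** 2 for v in x)
    return num / den

# Hypothetical weights: four sites on a line, neighbours are adjacent sites.
chain_w = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]

smooth = gearys_c([1, 2, 3, 4], chain_w)   # similar neighbours, c below 1
jagged = gearys_c([1, 4, 1, 4], chain_w)   # dissimilar neighbours, c above 1
```

Values of c below 1 indicate that neighbouring values are alike (positive spatial autocorrelation), and values above 1 indicate that neighbours are dissimilar.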
Three dimensions of scale
From a cartographic point of view, scale refers only to the output
scale (a large-scale map at 1:10,000 covers just a small area). In
ecology, scale is more often associated with the size of the study
area (a large-scale study encompasses a large area). In landscape
ecology, terms such as grain and extent are often used; resolution is
used in remote sensing; and ‘support’ is the term used in
geostatistics. Many disciplines use terms that share similar, but not
identical, shades of meaning, so always be explicit in your use of a
term. When fully considered, there are multiple dimensions of scale:
Three dimensions of scale
The phenomenon (or process) itself—the size and
phase spacing, and the range of action and extent
of the effect. (Examples: Appalachian Mountains
have an obvious size and phase spacing; clear
cutting has an obvious footprint with an effect that
can extend beyond that footprint). It is also
important to note that patches may have one set of
characteristics (e.g., mean patch size and variance)
and that within a class there will be another set of
characteristics (e.g., mean distance between
patches within a class).
Three dimensions of scale
The sampling units—the sample size and shape, the
spacing of the samples, and the extent of the study
area. (Relates to the fine-scale [high resolution]
variation that can be detected, and to the large-
scale [coarser resolution] variation that may be
detected.)
Three dimensions of scale
The analysis: the size of the analytical units, their
spacing, and the extent of the analysis can differ
from the scales associated with the phenomenon
being studied. For example, when computing a
variogram, the rule of thumb is that statistics should
not be calculated for lag distances greater than
one-third to one-half of the extent of the study
domain. If the data were a satellite image, the
samples might be randomly selected throughout the
image and the pixels might have 30 m or 1.2 km
resolution; also consider instantaneous field of view
(IFOV) issues.
Units of observation
When working in space, the choice of units of
observation has a significant impact on the
results. Is the unit of observation smaller than the
potential objects of interest, the same size as them,
or larger than them? (Think of quadrats, and of what
happens when the sampling point includes a tree;
such studies often require two or more sampling
frameworks.) If the unit is larger than the objects of
study, then aspects of the modifiable areal unit
problem (MAUP) must be considered.
Regionalized variables
A variable that takes on values according to its
spatial location is known as a regionalized variable.
Considering a variable z measured at location i, we
can partition the total variability in z into three
components:
(Figure: blue dots represent the data.)
Regionalized variables
The structural component (e.g., a linear trend)
The spatially correlated component
The random noise component (non-fitted)
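To make the three components concrete, here is a minimal sketch that builds a synthetic regionalized variable along a transect as trend plus spatially correlated variation plus noise. The function name, the parameter values, and the use of a moving average to create spatial correlation are all assumptions chosen for illustration.

```python
import random

def simulate_regionalized(n=100, slope=0.05, window=10, noise_sd=0.2, seed=42):
    """Return (trend, correlated, noise, z) along a transect of n locations."""
    rng = random.Random(seed)
    # Structural component: a linear trend along the transect.
    trend = [slope * i for i in range(n)]
    # Spatially correlated component: a moving average of white noise, so
    # nearby locations share most of the same underlying random draws.
    raw = [rng.gauss(0, 1) for _ in range(n + window - 1)]
    correlated = [sum(raw[i:i + window]) / window for i in range(n)]
    # Random noise component: independent from one location to the next.
    noise = [rng.gauss(0, noise_sd) for _ in range(n)]
    z = [t + c + e for t, c, e in zip(trend, correlated, noise)]
    return trend, correlated, noise, z

trend, correlated, noise, z = simulate_regionalized()
```

Plotting z against position would show the kind of pattern in which a fitted trend explains part of the variation, smooth local departures explain another part, and the remainder looks like scatter.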
Regionalized variables
Regionalized variables are variables that fall
between random variables and completely
deterministic variables.
Typical regionalized variables are functions
describing variables that have geographic
distributions (e.g. elevation of ground surface).
Unlike random variables, regionalized variables
exhibit spatial continuity; however, the change in the
variable is so complex that they cannot be described
by any deterministic function.
The variogram is used to describe regionalized
variables.
Variograms
In mathematical terms, the semi-variogram:
The semi-variogram is based on modelling the (squared) differences in
the z-values as a function of the distances between all of the known points.
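One common way to write this is the empirical semi-variogram, $\gamma(h) = \frac{1}{2N(h)} \sum (z_i - z_j)^2$, where the sum runs over the $N(h)$ pairs of points separated by roughly distance $h$. The sketch below is one plain-Python reading of that formula. The function name, the lag classes, and the tiny transect data set are assumptions for illustration.

```python
import math

def empirical_semivariogram(points, values, lag_width, max_lag):
    """Return (mean distance, semivariance, pair count) per non-empty lag class."""
    n_bins = math.ceil(max_lag / lag_width)
    sq_sums = [0.0] * n_bins
    dist_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            if h >= max_lag:
                continue  # ignore pairs at or beyond the maximum lag
            # Lag class k covers distances in [k * lag_width, (k + 1) * lag_width).
            k = min(int(h / lag_width), n_bins - 1)
            sq_sums[k] += (values[i] - values[j]) ** 2
            dist_sums[k] += h
            counts[k] += 1
    return [(dist_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Hypothetical data: four points on a transect, one unit apart.
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1, 2, 4, 7]
result = empirical_semivariogram(points, values, lag_width=1.0, max_lag=4.0)
```

Returning the pair count alongside each semivariance makes it easy to spot lag classes supported by too few pairs to be trusted.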
Variograms
In graphical terms:
This is an example of a variogram produced using ArcGIS's Geostatistical Analyst.
Variograms
Statistical assumptions:
Stationarity: the mean and variance are not a function of location. Second-order
(weak) stationarity is required: the (co)variance between values is a function
only of the separation distance.
Isotropy—no directional trends occur in the data (as contrasted with
anisotropy). However, you can compute directional variograms in order to
assess directional trends in the data.
Unbounded variograms (i.e., with no sill) are evidence of nonstationary
variables.
Trend surface analysis can be used to remove global trends in the data (to
transform a non-stationary variable, whose mean varies across space, into a
stationary one).
Lag distances – typically we group the distance intervals into classes so
that we can have enough sample points within any one distance class
(typically 30 is suggested as the minimum number). Small-scale (high
resolution) variation (at the resolution implied by the original sampling
scheme) may not be detectable as a result.
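Removing a global trend before computing a variogram can be sketched as fitting a first-order trend surface, z = a + b·x + c·y, by least squares and keeping the residuals. The function names are hypothetical, and the choice of a first-order (planar) surface is an assumption; higher-order surfaces follow the same pattern.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def detrend_first_order(points, values):
    """Fit z = a + b*x + c*y by least squares; return (coefficients, residuals)."""
    rows = [(1.0, x, y) for x, y in points]
    # Normal equations: (A^T A) coef = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, values)) for i in range(3)]
    coef = solve_linear(ata, atz)
    residuals = [z - sum(c * v for c, v in zip(coef, r))
                 for r, z in zip(rows, values)]
    return coef, residuals

# Hypothetical data lying exactly on the plane z = 2 + 3x - y.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
zs = [2 + 3 * x - y for x, y in pts]
coef, residuals = detrend_first_order(pts, zs)
```

The residuals, rather than the raw values, would then be passed to the variogram calculation.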
Variograms
The technique can provide a quantification of the scale of
variability exhibited by natural patterns of resource distributions
(although correlograms may be better for this, since you can
conduct statistical tests on the results) and an identification of the
spatial scale at which the sampled variable exhibits maximum
variance.
At larger lag distances (beyond the natural ‘scale’ of the
phenomenon) harmonic effects can be noted, in which the
variogram peaks or dips at lag distances that are multiples of the
natural scale.
Given the noise present in natural environmental data sets, it is
unlikely that you will be able to identify multiple scales clearly.
One approach might be to fit a semivariogram model to the data,
and to examine the residuals for the presence of multiple
patterns of scale.
Variograms
http://zappa.nku.edu/~longa/geomed/modules/geostats_lite/lec/illinois.html
Kriging
Kriging is a spatial interpolation technique based on
semi-variograms. Unlike deterministic interpolation
techniques such as a TIN, kriging also provides a map
of the uncertainty (prediction standard error)
associated with each prediction.
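The link between the semi-variogram and the uncertainty estimate can be seen in a small ordinary kriging sketch: the weights come from solving a system built from semivariances, and the same solution yields a kriging variance. Everything specific here is an assumption for illustration, including the spherical model, its nugget, sill and range values, the function names, and the four-point data set.

```python
import math

def spherical(h, nugget=0.0, sill=1.0, rng=3.0):
    """Spherical semivariogram model (hypothetical parameter values)."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(points, values, target, model=spherical):
    """Return (prediction, kriging variance) at target."""
    n = len(points)
    # Semivariances between data points, bordered by the unbiasedness constraint.
    a = [[model(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [model(math.dist(p, target)) for p in points] + [1.0]
    sol = solve_linear(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance

pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
zs = [1.0, 2.0, 3.0, 4.0]
centre_pred, centre_var = ordinary_kriging(pts, zs, (0.5, 0.5))
exact_pred, exact_var = ordinary_kriging(pts, zs, (0, 0))
```

With a zero nugget the prediction at a sampled location reproduces the observed value with zero variance, and the variance grows as the target moves away from the data. The square root of the variance, evaluated over a grid of targets, gives a prediction standard error map.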
Kriging
(Figure panels: TIN; prediction standard error map; prediction map.)
Integrating GIS and spatial statistics
Role of space
an organizing dimension for information
a source of context and linkage
an explanatory variable
a problem
Terminology
lattice, support, drift, topology, layer, coverage, region
Software as glue
within what conceptual framework?