Geostatistics Kriging
8.10.2015
Konetekniikka 1, Otakaari 4, 150
10-12
Background
What is Geostatistics
Concepts
Variogram: experimental, theoretical
Anisotropy, Isotropy
Lag, Sill, Range, Nugget
Types of Kriging
Example of kriging interpolation
Interpolation
How to estimate unknown values at specific
locations.
Spatial Interpolation
Examples
Trend surfaces
Nearest neighbours: Thiessen(voronoi)
Inverse distance weighting (IDW)
Splines
Kriging
Example:
[Figure: the five sample sites plotted on a grid running from 0 to 10 on both axes]
D to (5,5): 4.2426, 2.8284, 5.6569, 1.0000, 2.0000
Example: IDW
Value of z(x) is estimated from all known
values of z at all n points. (Weighted Moving
Average technique)
$$\hat{z}(x) = \sum_{i=1}^{n} w_i z_i, \qquad \sum_{i=1}^{n} w_i = 1$$
Example: IDW
In IDW, the weights are based on the
distance from each of the known points (i) to
the point we are trying to estimate (k): d_ik. In
IDW, we consider the inverse distance, 1/d_ik:
$$w_i = \frac{1/d_{ik}}{\sum_{j=1}^{n} 1/d_{jk}}$$
Example: IDW
Location (x,y)   D to (5,5)   ID (1/D)   Weights
(2,2)            4.2426       0.2357     0.1040
(3,7)            2.8284       0.3536     0.1560
(9,9)            5.6569       0.1768     0.0780
(6,5)            1.0000       1.0000     0.4413
(5,3)            2.0000       0.5        0.2207
sum                           2.2661
Z(5,5) = 4.1814
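The IDW table can be reproduced with a few lines of Python. This is a minimal sketch, not the lecturer's own code. The z values are not legible in the extracted slide, so the ones below are an assumption, chosen because they are consistent with the IDW result of about 4.18 quoted in the deck. The function name `idw` and the `power` parameter are illustrative.

```python
import math

# Sample locations from the table.
sites = [(2, 2), (3, 7), (9, 9), (6, 5), (5, 3)]
# ASSUMPTION: z values are not shown in the extracted slide; these are
# consistent with the IDW estimate of about 4.18 quoted in the deck.
z = [3, 4, 2, 4, 6]
target = (5, 5)


def idw(sites, z, target, power=1):
    """Return (estimate, weights) for inverse distance weighting at target."""
    dists = [math.hypot(x - target[0], y - target[1]) for x, y in sites]
    # If the target coincides with a sample, that sample gets all the weight.
    if 0.0 in dists:
        hit = dists.index(0.0)
        weights = [1.0 if i == hit else 0.0 for i in range(len(z))]
    else:
        inverse = [1.0 / d ** power for d in dists]
        total = sum(inverse)
        weights = [v / total for v in inverse]  # weights sum to 1
    estimate = sum(w * zi for w, zi in zip(weights, z))
    return estimate, weights


estimate, weights = idw(sites, z, target)
print("weights:", [round(w, 4) for w in weights])
print("Z(5,5):", round(estimate, 2))
```

With `power=1` this is the plain inverse distance used in the table; a larger `power` would give nearby points more influence.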
Historical background
Geostatistics was first developed by Georges Matheron (1930-2000), the French geomathematician. The major concepts
and theory were developed during 1954-1963 while he was
working with the French Geological Survey in Algeria and
France.
In 1963, he defined linear geostatistics and the concepts of
variography, variances of estimation and kriging (named
after Danie Krige) in the Traité de géostatistique appliquée.
"Principles of geostatistics" was published in Economic
Geology Vol. 58, 1246-1266.
Kriging was named in honour of Danie Krige (1919-2013),
the South African mining engineer who developed the
methods of interpolation.
What is Geostatistics
Techniques which are used for mapping of surfaces
from limited sample data and the estimation of values
at unsampled locations
Geostatistics is used for:
Mining
Geography
Geology
Geophysics
Oceanography
Hydrography
Meteorology
Biotechnology
Environmental studies
Agriculture
Variogram cloud
Variogram/Semivariogram
The variogram is the variance of the difference
between the random variables at two locations
To examine the spatial continuity of a
regionalized variable and how this continuity
changes as a function of distance and direction.
The computation of a variogram involves plotting
the relationship between the semivariance and
the lag distance
Measure the strength of correlation as a function
of distance
Quantify the spatial autocorrelation
Variogram
[Figure: semivariogram γ(h) plot, annotated "Variability increase"]
Variograms
Half of average squared difference between
the paired data values.
The variogram is calculated by
$$\gamma(h) = \frac{1}{2N(h)} \sum_{(i,j)\,|\,h_{ij} = h} (x_i - x_j)^2$$
where N(h) is the number of data pairs separated by lag h.
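The formula can be turned into a small routine that bins data pairs by lag distance. This is a rough sketch using NumPy. The function name, the binning rule (pairs with distance in [k*lag_width, (k+1)*lag_width) go to bin k) and the demo values are all assumptions made for illustration; they do not come from the slides.

```python
import numpy as np


def experimental_variogram(coords, values, lag_width, n_lags):
    """Omnidirectional experimental semivariogram.

    Returns the mean pair distance, the semivariance and the pair count
    for every non-empty lag bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # each pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    bins = np.floor(dist / lag_width).astype(int)

    lags, gamma, counts = [], [], []
    for k in range(n_lags):
        mask = bins == k
        n_pairs = int(mask.sum())
        if n_pairs == 0:
            continue
        lags.append(dist[mask].mean())
        gamma.append(sqdiff[mask].sum() / (2 * n_pairs))  # half the mean squared difference
        counts.append(n_pairs)
    return np.array(lags), np.array(gamma), np.array(counts)


# Hypothetical transect with unit spacing, for illustration only.
demo_coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
demo_values = [1, 3, 2, 5]
lags, gamma, counts = experimental_variogram(demo_coords, demo_values, 1.0, 4)
print(lags, gamma, counts)
```

Plotting `gamma` against `lags` gives the experimental variogram; bins with few pairs (low `counts`) are the ones most likely to look erratic.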
Variogram
Experimental variogram (sample or observed
variogram):
the variogram computed from sampled data.
The first step towards a quantitative description of
the regionalized variation.
Omnidirectional variogram
Omnidirectional variogram is a test for erratic
directional variograms
The omnidirectional variogram contains more
sample pairs than any directional variogram so it
is more likely to show a clearly interpretable
structure.
If the omnidirectional variogram is messy, then
try to discover the reasons for the erratic behaviour,
e.g. examining the h-scatterplots may reveal that a
single sample value has a large influence on the
calculations.
Isotropy
The spatial correlation structure has no
directional effects, the resulting variogram
averages the variogram over all directions.
The covariance function, correlogram, and
semivariogram depend only on the magnitude of
the lag vector h and not the direction
The empirical semivariogram can be computed by
pooling data pairs separated by the appropriate
distances, regardless of direction.
The semivariogram is omnidirectional.
Anisotropy
Spatial variation is not the same in all
directions
The variogram is computed for specific
directions
If the process is anisotropic, then so is the
variogram
Description
Lag: The distance between sampling pairs.
Range: The point on the lag (h) axis where the
semivariogram reaches the sill. Sample points
that are farther apart than the range are not spatially
autocorrelated.
Nugget: The value at which the semivariogram
intercepts the ordinate.
Sill: The value where the semivariogram first
flattens off, the maximum level of semivariance.
Points above the sill indicate negative spatial
correlation, and points below it positive spatial correlation.
[Figure: four variogram model shapes (Spherical, Exponential, Linear, Gaussian), each plotted against Lag (h), with nugget, sill and range marked where they apply]
[Figure: Spherical, Linear, Exponential and Gaussian variogram models]
Source: Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W., 2001, Geographic Information Systems and Science
Spherical model
Exponential model
Gaussian model
Linear model
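The formulas for these four models did not survive in the text, so the following is a sketch of commonly used textbook forms rather than a copy of the slides. It assumes a nugget c0, a partial sill c1 and a distance parameter a (the same symbols used in the ordinary kriging example), and sets the semivariance to 0 at zero lag. Conventions for the exponential and Gaussian distance parameter vary between texts (some scale h/a by 3), so treat the exact scaling here as an assumption.

```python
import numpy as np


def _with_zero_at_origin(h, g):
    # Semivariance is taken as 0 at h = 0; the nugget applies for h > 0.
    return np.where(h > 0, g, 0.0)


def spherical(h, c0, c1, a):
    """Spherical model: reaches the sill c0 + c1 exactly at the range a."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return _with_zero_at_origin(h, c0 + c1 * (1.5 * r - 0.5 * r ** 3))


def exponential(h, c0, c1, a):
    """Exponential model: approaches the sill asymptotically."""
    h = np.asarray(h, dtype=float)
    return _with_zero_at_origin(h, c0 + c1 * (1.0 - np.exp(-h / a)))


def gaussian(h, c0, c1, a):
    """Gaussian model: parabolic near the origin, asymptotic sill."""
    h = np.asarray(h, dtype=float)
    return _with_zero_at_origin(h, c0 + c1 * (1.0 - np.exp(-(h / a) ** 2)))


def linear(h, c0, slope):
    """Linear model: no sill, semivariance grows without bound."""
    h = np.asarray(h, dtype=float)
    return _with_zero_at_origin(h, c0 + slope * h)


h = np.linspace(0.0, 15.0, 4)
print(spherical(h, 2.5, 7.5, 10.0))
print(exponential(h, 2.5, 7.5, 10.0))
```

Only the spherical model has a true range; for the exponential and Gaussian models an effective range is usually quoted instead.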
Nested structure
A variogram model can be built by combining
several variogram models
$$\gamma(h) = \sum_{i=1}^{n} \lambda_i \gamma_i(h)$$
and
$$\gamma_t(h) = \gamma_1(h) + \gamma_2(h)$$
Nested Structure
Example: the nested spherical or double
spherical
Ordinary kriging
In ordinary kriging, a probability model is used
in which the bias and error variance can be
computed. Weights for the neighbouring sample
locations are selected so that the average
error for the model is 0 and the error variance
is minimized.
The procedure of ordinary kriging is similar to a
weighted moving average, except that the weights
are derived from geostatistical analysis.
Ordinary kriging
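The estimate by ordinary kriging can be
expressed by:
$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i z(x_i) \quad \text{where} \quad \sum_{i=1}^{n} \lambda_i = 1$$
The minimum variance of \hat{z}(x_0) is
$$\hat{\sigma}^2 = \sum_{i=1}^{n} \lambda_i \gamma(x_i, x_0) + \varphi$$
and is obtained when
$$\sum_{i=1}^{n} \lambda_i \gamma(x_i, x_j) + \varphi = \gamma(x_j, x_0) \quad \text{for all } j$$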
Example:
[Figure: the five sample sites plotted on a grid running from 0 to 10 on both axes]
D to (5,5): 4.2426, 2.8284, 5.6569, 1.0000, 2.0000
Example: Ordinary kriging
Computing kriging weights for the unsampled point x = 5,
y = 5. Let the spatial variation of the attribute sampled at
the five points be modelled by a spherical variogram with
parameters c0=2.5, c1=7.5 and range a = 10. The data at
the five sampled points are:
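We need to solve
$$A^{-1} b = \begin{bmatrix} \lambda \\ \varphi \end{bmatrix}$$
where A is the matrix of semivariances between the data points, b is the vector of semivariances between each data point and the point to be predicted, λ are the weights and φ is the Lagrange multiplier.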
Example from Principles of Geographical Information Systems by Burrough and McDonnell, 1998, Oxford University Press
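The kriging system for this example can be assembled and solved numerically. This is a minimal sketch under stated assumptions, not the textbook's worked solution. The z values are not legible in the extracted slides and are assumed here. The spherical model is the common form c0 + c1(1.5h/a - 0.5(h/a)^3), and the semivariance at zero lag is set to 0. A different treatment of the diagonal of A changes the numbers, so the output need not match the figures quoted in the slides.

```python
import numpy as np


def spherical(h, c0=2.5, c1=7.5, a=10.0):
    """Spherical variogram with the parameters of the example; 0 at h = 0."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return np.where(h > 0, c0 + c1 * (1.5 * r - 0.5 * r ** 3), 0.0)


def ordinary_kriging(coords, z, target, variogram=spherical):
    """Return (estimate, kriging variance, weights) at target."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(z, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(z)

    # A: semivariances between data points, bordered by ones for the
    # unbiasedness constraint (weights sum to 1).
    pair_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(pair_dist)
    A[n, n] = 0.0

    # b: semivariances between each data point and the target, then 1.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))

    solution = np.linalg.solve(A, b)
    weights, phi = solution[:n], solution[n]
    estimate = weights @ z
    variance = weights @ b[:n] + phi   # sum(lambda_i * gamma(x_i, x0)) + phi
    return estimate, variance, weights


coords = [(2, 2), (3, 7), (9, 9), (6, 5), (5, 3)]
z = [3, 4, 2, 4, 6]   # ASSUMPTION: values not shown in the extracted slide
estimate, variance, weights = ordinary_kriging(coords, z, (5, 5))
print("weights:", np.round(weights, 4))
print("estimate:", round(float(estimate), 4), "variance:", round(float(variance), 4))
```

Unlike IDW, the weights here depend on the configuration of the samples as well as their distance to the target, and some of them may be negative.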
Example: Ordinary kriging
Value at (5,5) = weights * z
= 4.3985
With estimation variance = (weights * b) + φ
= 4.2177 + (-0.1544)
= 4.0628
Note: The estimation error variance is also known as the kriging
variance.
Method                                Z(5,5)
IDW                                   4.1814
Ordinary Kriging (Kriging variance)   4.3985 (4.0628)
Block kriging
The modification of kriging equations to
estimate an average value z(B) of the variable
z over a block of area B.
[Figure: sample points z1 to z5 plotted on a grid running from 0 to 10 on both axes]
Block kriging
Example from An introduction to applied Geostatistics by Edward H. Isaaks and R.Mohan Srivastava
Block kriging
The average value of z(B) over the block B is
given by
$$z(B) = \frac{1}{\text{area } B} \int_B z(x)\,dx$$
and is estimated by
$$\hat{z}(B) = \sum_{i=1}^{n} \lambda_i z(x_i) \quad \text{with} \quad \sum_{i=1}^{n} \lambda_i = 1$$
Block kriging
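The minimum variance is
$$\hat{\sigma}^2(B) = \sum_{i=1}^{n} \lambda_i \bar{\gamma}(x_i, B) + \varphi - \bar{\gamma}(B, B)$$
and is obtained when
$$\sum_{i=1}^{n} \lambda_i \gamma(x_i, x_j) + \varphi = \bar{\gamma}(x_j, B) \quad \text{for all } j$$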
[Figure: sample locations and values: 1 (+531), 2 (+75), 3 (+326), 4 (+333), 5 (+280)]
Example from An Introduction to Applied Geostatistics by Edward H. Isaaks and R. Mohan Srivastava
Point     Estimate   Weight 1   Weight 2   Weight 3   Weight 4   Weight 5
1         336        0.17       0.11       0.09       0.60       0.03
2         361        0.22       0.03       0.05       0.56       0.14
3         313        0.07       0.12       0.17       0.61       0.03
4         339        0.11       0.03       0.12       0.62       0.12
Average   337        0.14       0.07       0.11       0.60       0.08
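The table can be checked with a short calculation. Each point estimate is the weighted sum of the five sample values, and averaging the four point estimates gives the same result as applying the averaged weights once. This is a sketch for verification only. It assumes the weight columns correspond to samples 1 to 5 in order, which is consistent with the printed estimates to within rounding.

```python
import numpy as np

# Sample values 1..5 from the figure labels.
values = np.array([531, 75, 326, 333, 280], dtype=float)

# Point kriging weights, one row per discretizing point inside the block.
# ASSUMPTION: columns are samples 1..5 in order.
point_weights = np.array([
    [0.17, 0.11, 0.09, 0.60, 0.03],
    [0.22, 0.03, 0.05, 0.56, 0.14],
    [0.07, 0.12, 0.17, 0.61, 0.03],
    [0.11, 0.03, 0.12, 0.62, 0.12],
])

point_estimates = point_weights @ values      # one estimate per point
block_weights = point_weights.mean(axis=0)    # averaged weights
block_estimate = block_weights @ values       # block estimate

print(np.round(point_estimates, 1))
print(np.round(block_weights, 2), round(float(block_estimate), 1))
```

Because the weights in the table are rounded to two decimals, the recomputed estimates agree with the printed ones only to within about one unit.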
Simple kriging
It is similar to ordinary kriging except that the
constraint that the weights sum to 1 is not imposed.
The mean is a known constant.
It uses the average of the entire data set
(ordinary kriging uses a local average: the
average of the points in the subset for a
particular interpolation point).
Cokriging
It is an extension of ordinary kriging where two or
more variables are interdependent.
How:
U and V are spatially correlated.
Variable U can be used to predict variable V; that is, information
about the spatial variation of U can help to map V.
Why:
V data may be expensive to measure or collect or have some
limitations in data collection process so the data may be
infrequent.
U data, on the other hand, may be cheap to measure and
possible to collect more observations.
Indicator kriging
Binary value
From a continuous variable z(x), an indicator
can be created by setting it to 1 where z(x) is less
than or equal to a cut-off value, z_c, and 0
otherwise
$$\omega(x) = \begin{cases} 1 & \text{if } z(x) \le z_c \\ 0 & \text{otherwise} \end{cases}$$
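The indicator transform is a one-line thresholding step. A minimal sketch, with a hypothetical function name and illustrative values that are not taken from the slides:

```python
import numpy as np


def indicator(z, cutoff):
    """Return 1 where z <= cutoff and 0 otherwise."""
    return (np.asarray(z, dtype=float) <= cutoff).astype(int)


# Hypothetical measurements and cut-off, for illustration only.
z = [3.0, 4.0, 2.0, 4.0, 6.0]
print(indicator(z, cutoff=4.0))
```

The resulting 0/1 values are then kriged like any other variable, and the kriged value can be read as the estimated probability that z(x) does not exceed the cut-off.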
Example
Elevation data set in Rastila
2000 laser scanning points
Statistic            Value
Minimum
Maximum              18
Mean                 4.7247
Median
Skewness             0.64651
Kurtosis             2.0344
Standard deviation   5.543
Variance             30.725
Example
Variogram model   range   nugget    sill     length
Exponential       0.5     0.10847   1.1232   0.18064
Linear            0.5     0.50754   1.806
Gaussian          0.5     0.4546    1.0447   0.22811
Spherical         0.75    0.35265   1.0666   0.48576
Exponential       0.75    0.13161   1.1388   0.19144
Linear            0.75    0.58671   1.5711
Gaussian          0.75    0.48089   1.0728   0.24717
Spherical         0.95    0.54285   1.9597   1.8513
Exponential       0.95    0.36872   1.4082   0.40794
Linear            0.95    0.55676   1.6424
Gaussian          0.95    0.57831   1.1922   0.34484
Example
Variogram model      RMSE
Exponential (0.5)    3.281
Linear (0.5)         3.367
Gaussian (0.5)       3.431
Spherical (0.75)     3.307
Exponential (0.75)   3.272
Linear (0.75)        3.397
Gaussian (0.75)      3.439
Spherical (0.95)     3.381
Exponential (0.95)   3.302
Linear (0.95)        3.386
Gaussian (0.95)      3.466
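The comparison can be made explicit with a short sketch. The RMSE values are copied from the table. The `rmse` helper shows the usual definition (square root of the mean squared difference between observed and predicted values) and is an assumption about how the figures were obtained, since the slides do not state the validation procedure.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# RMSE per fitted variogram model, copied from the table.
rmse_by_model = {
    "Exponential (0.5)": 3.281,
    "Linear (0.5)": 3.367,
    "Gaussian (0.5)": 3.431,
    "Spherical (0.75)": 3.307,
    "Exponential (0.75)": 3.272,
    "Linear (0.75)": 3.397,
    "Gaussian (0.75)": 3.439,
    "Spherical (0.95)": 3.381,
    "Exponential (0.95)": 3.302,
    "Linear (0.95)": 3.386,
    "Gaussian (0.95)": 3.466,
}

best = min(rmse_by_model, key=rmse_by_model.get)
print(best, rmse_by_model[best])
```

The differences between models are small (all values lie between about 3.27 and 3.47), so the lowest RMSE is only a weak basis for preferring one model.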
References:
Geographic Information Analysis by David O'Sullivan and
David Unwin
Geostatistics for Environmental Scientists by Richard
Webster and Margaret Oliver
Principles of Geographical Information Systems, Chapters 5
and 6, by Peter Burrough and Rachael McDonnell
An Introduction to Applied Geostatistics by Edward Isaaks
and Mohan Srivastava
Quality Aspects in Spatial Data Mining by Alfred Stein,
Wenzhong Shi and Wietske Bijker