A Step by Step Guide To Bi-Gaussian Disjunctive Kriging
Abstract
The Disjunctive Kriging formalism has been implemented for a number of tasks in geostatistics. Despite the advantages of this formalism, application has been hindered by presentations that assume a fairly advanced level of mathematical familiarity. This paper will go
through the steps to perform Disjunctive Kriging in a simple case.
The global stationary distribution of the variable under consideration is fit by Hermite
polynomials. Disjunctive Kriging amounts to simple kriging to estimate the polynomial values at an unsampled location. These estimated values completely define the local distribution
of uncertainty. It is straightforward to implement this formalism in computer code, but a
clear exposition of the theoretical details is required for confident application and any future
development.
Introduction
Disjunctive Kriging (DK) has been available for more than 25 years; however, its seemingly
complex theory makes it unappealing to most practitioners. DK provides advantages in many applications. It can be used to estimate the value of any function
of the variable of interest, making it useful to assess truncated statistics for recoverable
reserves. DK provides a solution space larger than the conventional kriging techniques that
only rely on linear combinations of the data. DK is more practical than the conditional
expectation, since it only requires knowledge of the bivariate law, instead of the full multivariate probability law of the data locations and location being estimated [7, 9, 11, 12, 14].
The theoretical basis of DK is sound, internally consistent, and has been extensively
developed and expanded among geostatisticians [1, 2, 4, 8, 10, 13]. In practice, those
developments have not been applied to their full potential. DK has been applied mainly
with the use of Hermite polynomials and the bivariate Gaussian assumption [6, 15]. The
discrete Gaussian model for change of support has been used in practice. Still, relatively few
practitioners have mastered DK. The discomfort of many practitioners is due in part to the
difficult literature, focused on theory rather than applications. The available theoretical
work uses complicated notation, and the steps are explained only for readers very comfortable
with mathematics. There is a need for detailed documentation of DK with emphasis on
implementation and practical details. This work aims to present DK in a rigorous manner,
with greater focus on its practical aspects.
We start by presenting some background on Hermite polynomials, the bivariate Gaussian
assumption, and then introduce DK with an example. More extensive theory can be found
in Chilès and Delfiner [3], Emery [5], and Rivoirard [14].
Hermite Polynomials
Before getting into DK, we need to define and review some of the properties of Hermite
polynomials. This family of polynomials is important because it will help us parameterize
conditional distributions later on.
Hermite polynomials are defined by Rodrigues' formula:

$$H_n(y) = \frac{1}{\sqrt{n!}\, g(y)} \frac{d^n g(y)}{dy^n}, \quad n \ge 0 \qquad (1)$$

where $g(y)$ is the standard normal probability density function. The polynomials can be computed with the recursive expression:

$$H_{n+1}(y) = -\frac{1}{\sqrt{n+1}}\, y\, H_n(y) - \sqrt{\frac{n}{n+1}}\, H_{n-1}(y), \quad n \ge 1 \qquad (2)$$
This expression, along with knowledge of the first two polynomials, is enough for fast
calculation up to any order. The first three polynomials are given as an example:

$$H_0(y) = 1, \quad H_1(y) = -y, \quad H_2(y) = \frac{1}{\sqrt{2}}(y^2 - 1) \qquad (3)$$
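The recurrence in Equation 2 is straightforward to implement. Below is a minimal Python sketch (the function name `hermite` is our own choice, not from the paper) that evaluates the normalized polynomials up to degree P, using the sign convention of the text (H_1(y) = -y):

```python
import math

def hermite(y, P):
    """Normalized Hermite polynomials H_0(y) ... H_P(y), via the
    recurrence H_{n+1} = -y H_n / sqrt(n+1) - sqrt(n/(n+1)) H_{n-1},
    with H_0 = 1 and H_1 = -y (Equations 2 and 3)."""
    H = [1.0, -y]
    for n in range(1, P):
        H.append(-y * H[n] / math.sqrt(n + 1)
                 - math.sqrt(n / (n + 1)) * H[n - 1])
    return H[:P + 1]
```

For example, hermite(-1.645, 2) returns approximately [1.0, 1.645, 1.206], matching H_2(y) = (y^2 - 1)/sqrt(2).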
These polynomials have the following property: they form an orthonormal basis with respect to the standard normal
distribution. Other polynomial families can be considered if a different transformation of
the original variable is performed [3, 5].
Under the bivariate Gaussian assumption, the pair (Y(u), Y(u+h)) has mean vector and variance-covariance matrix

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & \rho(h) \\ \rho(h) & 1 \end{pmatrix}$$

Notice that these two terms, the mean vector and variance-covariance matrix, fully
define the bivariate Gaussian distribution of Y(u) and Y(u+h). The correlogram ρ(h)
gives all the structural information of the bivariate relationship.
Under this assumption, one additional property of Hermite polynomials is of interest.
The covariance between polynomials of different order is always 0, and if the order is the
same, it equals the correlation raised to the power of the polynomial degree, that is:

$$\mathrm{Cov}\{H_n(Y(u)),\, H_p(Y(u+h))\} = \begin{cases} (\rho(h))^n & \text{if } n = p \\ 0 & \text{if } n \ne p \end{cases} \qquad (4)$$

The only term that is left is the covariance between polynomial values of the same degree
for locations separated by a vector h. Since ρ(h) < ρ(0) = 1, this spatial correlation tends
rapidly to zero as the power n increases, that is, the structure tends to pure nugget. Also,
notice that there is no spatial correlation between polynomials of different orders.
Any function of Y(u) with finite variance can be expanded in terms of Hermite polynomials:

$$f(Y(u)) = \sum_{n=0}^{\infty} f_n\, H_n(Y(u)) \qquad (5)$$

The only question that remains is how to find the coefficients f_n, ∀n. This can be done
by calculating the expected value of the product of the function and the polynomial of
degree n:
$$E\{f(Y(u))\, H_n(Y(u))\} = E\left\{ \sum_{p=0}^{\infty} f_p\, H_p(Y(u))\, H_n(Y(u)) \right\} = \sum_{p=0}^{\infty} f_p\, E\{H_p(Y(u))\, H_n(Y(u))\} \qquad (6)$$

The expected value can be taken inside the summation, since it is a linear operator and
the coefficients f_p are constants.
Notice that the expected value of the product of polynomials of different degrees corresponds to their covariance. The property of orthogonality comes in, so that all terms where
p ≠ n equal zero and only the term where p = n remains. In this case, the covariance
becomes the variance, which equals 1. We can then simplify Equation 6 and get an expression
to calculate the coefficients f_n:
$$f_n = E\{f(Y(u))\, H_n(Y(u))\}$$

We can rewrite this expression with the expected value in its integral form, which
will later be discretized for numerical calculation:

$$f_n = E\{f(Y(u))\, H_n(Y(u))\} = \int_{-\infty}^{+\infty} f(y)\, H_n(y)\, g(y)\, dy \qquad (7)$$
It is worth noting that the coefficient of degree 0 corresponds to the mean of the function
of the random variable. This can be seen directly from Equation 7, since, if n = 0, then
H_n(Y(u)) = 1 and the integral becomes the definition of the expected value of f(Y(u)). A
second point of interest is that the variance of the function of Y(u) can also be calculated
(although we are not going to show it here) and corresponds to the infinite sum of squared
coefficients [14]. In summary:

$$E\{f(Y(u))\} = f_0 \qquad (8)$$

$$\mathrm{Var}\{f(Y(u))\} = \sum_{n=1}^{\infty} (f_n)^2 \qquad (9)$$
The practical implementation of this expansion calls for some simplifications: the infinite
expansion is truncated at a given degree P. The truncation causes some minor problems,
such as generating values outside the range of the data. These values can simply be reset to
the minimum or maximum value. If the number of polynomials used is large enough, these
problems are of limited impact.
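Equations 7 through 9 can be illustrated with a short sketch. The snippet below (a sketch under our own assumptions: pure-Python trapezoidal integration on [-6, 6] and helper names of our choosing) fits a truncated expansion to f(y) = e^y, for which the mean e^(1/2) and variance e(e - 1) are known in closed form:

```python
import math

def g(y):
    """Standard normal probability density."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def hermite(y, P):
    """Normalized Hermite polynomials via the recurrence (Equation 2)."""
    H = [1.0, -y]
    for n in range(1, P):
        H.append(-y * H[n] / math.sqrt(n + 1) - math.sqrt(n / (n + 1)) * H[n - 1])
    return H[:P + 1]

def expansion_coefficients(f, P, lo=-6.0, hi=6.0, m=4001):
    """f_n = E{f(Y) H_n(Y)} (Equation 7), by the trapezoidal rule."""
    dy = (hi - lo) / (m - 1)
    coeff = [0.0] * (P + 1)
    for i in range(m):
        y = lo + i * dy
        w = dy * (0.5 if i in (0, m - 1) else 1.0)
        H = hermite(y, P)
        fyg = f(y) * g(y) * w
        for n in range(P + 1):
            coeff[n] += fyg * H[n]
    return coeff

fn = expansion_coefficients(math.exp, 30)
mean = fn[0]                       # Equation 8: E{f(Y)} = f_0
var = sum(c * c for c in fn[1:])   # Equation 9: Var{f(Y)} = sum of f_n^2
```

With 30 polynomials, the mean and variance of the lognormal variable e^Y are recovered to roughly three decimal places, illustrating how quickly the truncated expansion converges for a smooth function.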
Practical Implementation
Fitting a Global Distribution
Consider the normal score transform (also known as the Gaussian anamorphosis in most
DK literature) of a variable Z with N available samples at locations u_α, α = 1, ..., N. The
cumulative distribution function (cdf) of Z is denoted F_Z(z):

$$y = G^{-1}(F_Z(z))$$

where G is the standard normal cdf. The variable can then be written as a function φ of its normal score, Z(u) = φ(Y(u)), and the Hermite expansion
of this function up to a degree P is used to give a close approximation to the shape of the
function φ:

$$Z(u) = \varphi(Y(u)) \approx \sum_{p=0}^{P} \varphi_p\, H_p(Y(u))$$
Notice how we use φ as a function, like we used f in the previous section. We know how
to calculate the coefficients:

$$\varphi_0 = E\{\varphi(Y(u))\} = E\{Z(u)\}$$

$$\varphi_p = E\{Z(u)\, H_p(Y(u))\} = \int_{-\infty}^{+\infty} \varphi(y)\, H_p(y)\, g(y)\, dy \qquad (10)$$
The last expression can be approximated with the data at hand, as a finite summation.
Consider the data values y(u_1), y(u_2), ..., y(u_N) sorted in increasing order. The terms φ(y(u_α))
become z(u_α). Since we have N data, we approximate the integral as a sum of N elements, taking φ piecewise constant between successive transformed values:

$$\varphi_p = \sum_{\alpha=1}^{N} \int_{y(u_\alpha)}^{y(u_{\alpha+1})} z(u_\alpha)\, H_p(y)\, g(y)\, dy$$

with the first class extended to $-\infty$ and the last bounded by $y(u_{N+1}) = +\infty$. Substituting Rodrigues' formula (Equation 1) for $H_p(y)\,g(y)$:

$$\varphi_p = \sum_{\alpha=1}^{N} z(u_\alpha) \int_{y(u_\alpha)}^{y(u_{\alpha+1})} \frac{1}{\sqrt{p!}} \frac{d^p g(y)}{dy^p}\, dy = \sum_{\alpha=1}^{N} z(u_\alpha)\, \frac{1}{\sqrt{p}} \Big[ H_{p-1}(y)\, g(y) \Big]_{y(u_\alpha)}^{y(u_{\alpha+1})}$$

since $d^{p-1}g(y)/dy^{p-1} = \sqrt{(p-1)!}\, H_{p-1}(y)\, g(y)$. The boundary terms vanish because $g$ vanishes at $\pm\infty$, and the sum telescopes to:

$$\varphi_p = \sum_{\alpha=2}^{N} \frac{1}{\sqrt{p}}\, \big( z(u_{\alpha-1}) - z(u_\alpha) \big)\, H_{p-1}(y(u_\alpha))\, g(y(u_\alpha)) \qquad (11)$$
Figure 1: Graphical transformation to normal scores and anamorphosis function for a global
distribution.
The indicator function of the Gaussian variable at a threshold y_c, defined as I_Y(u; y_c) = 1 if Y(u) ≤ y_c and 0 otherwise, can also be expanded in terms of Hermite polynomials:

$$I_Y(u; y_c) \approx \sum_{p=0}^{P} \psi_p\, H_p(Y(u))$$

The coefficients ψ_p are the unique coefficients defined for the expansion using Hermite
polynomials for the indicator function, as the f_p were the coefficients for the expansion of
the general function f in Equation 5.
The coefficients can be calculated as:

$$\psi_0 = E\{I_Y(u; y_c)\} = G(y_c)$$

$$\psi_p = E\{I_Y(u; y_c)\, H_p(Y(u))\} = \int_{-\infty}^{y_c} H_p(y)\, g(y)\, dy = \frac{1}{\sqrt{p}}\, H_{p-1}(y_c)\, g(y_c)$$
For clarity, let us see how this last step was done. First, recall the definition of the
Hermite polynomial presented in Equation 1 and replace it in the previous equation:

$$\psi_p = \int_{-\infty}^{y_c} \frac{1}{\sqrt{p!}} \frac{d^p g(y)}{dy^p}\, dy = \frac{1}{\sqrt{p!}} \left[ \frac{d^{p-1} g(y)}{dy^{p-1}} \right]_{-\infty}^{y_c} = \frac{1}{\sqrt{p!}}\, H_{p-1}(y_c)\, \sqrt{(p-1)!}\, g(y_c) = \frac{1}{\sqrt{p}}\, H_{p-1}(y_c)\, g(y_c)$$
It is important to emphasize that the Hermite coefficients for the indicator function are
not the same as the ones shown for the continuous function. The idea behind the expansion
is that for each function a different set of coefficients is found that allows us to express the
function as a linear combination of Hermite polynomials. This expansion fits the function
under consideration.
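The closed form for the indicator coefficients can be checked numerically. A small sketch (function names are our own; G is the standard normal cdf computed via math.erf):

```python
import math

def g(y):
    """Standard normal probability density."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def G(y):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def hermite(y, P):
    """Normalized Hermite polynomials via the recurrence (Equation 2)."""
    H = [1.0, -y]
    for n in range(1, P):
        H.append(-y * H[n] / math.sqrt(n + 1) - math.sqrt(n / (n + 1)) * H[n - 1])
    return H[:P + 1]

def indicator_coefficients(yc, P):
    """psi_0 = G(yc); psi_p = H_{p-1}(yc) g(yc) / sqrt(p) for p >= 1."""
    H = hermite(yc, P)
    return [G(yc)] + [H[p - 1] * g(yc) / math.sqrt(p) for p in range(1, P + 1)]

psi = indicator_coefficients(0.5, 5)
```

Note that psi[1] equals g(0.5) exactly: the integral of H_1(y) g(y) = -y g(y) up to the threshold evaluates to g(y_c), as the derivation above predicts.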
In summary, the procedure to fit a global distribution is:

1. Transform the data to normal scores, y_i = G^{-1}(F_Z(z_i)), i = 1, ..., N.
2. Calculate the Hermite polynomials for all the Gaussian transformed y_i data values using Equations 2 and 3.
3. Calculate the coefficients using Equations 10 and 11.
4. Generate the approximate function as the linear combination of the Hermite polynomials weighted by the corresponding coefficients, as in Equation 5.
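The fitting step can be sketched in a few lines of Python. The routine below implements Equation 11 directly (function names are our own); as a check, running it on the ten sample values of the worked example later in the paper (Table 1) reproduces f_1 ≈ -3.828 and f_2 ≈ 1.248:

```python
import math

def g(y):
    """Standard normal probability density."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def hermite(y, P):
    """Normalized Hermite polynomials via the recurrence (Equation 2)."""
    H = [1.0, -y]
    for n in range(1, P):
        H.append(-y * H[n] / math.sqrt(n + 1) - math.sqrt(n / (n + 1)) * H[n - 1])
    return H[:P + 1]

def fit_coefficients(z, y, P):
    """Hermite coefficients of the anamorphosis (Equation 11).
    z: data values sorted increasingly; y: matching normal scores.
    f_0 is the sample mean; for p >= 1,
    f_p = sum_{a=2..N} (z_{a-1} - z_a) H_{p-1}(y_a) g(y_a) / sqrt(p)."""
    f = [sum(z) / len(z)] + [0.0] * P
    for a in range(1, len(z)):  # 0-based index of the alpha-th datum
        w = (z[a - 1] - z[a]) * g(y[a])
        H = hermite(y[a], P)
        for p in range(1, P + 1):
            f[p] += w * H[p - 1] / math.sqrt(p)
    return f
```

The reconstruction of the distribution is then the weighted sum of the polynomials, as in step 4 above.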
Data from a uniform distribution are presented in Figure 2, along with the approximated
histogram obtained by a Hermite expansion of 100 polynomials. Recall that the expansion
matches the function only if an infinite number of polynomials is used; therefore, the larger the
number of polynomials, the closer the expansion comes to reproducing the original distribution.
It was found that any expansion over the order of 25 gave satisfactory results, matching the
statistics up to the fourth significant digit.
Data from a lognormal distribution were fitted with expansions of 10, 50, and 100
polynomials, giving a very good match with an expansion of order 100 (Figure 3). Notice
that the approximation does not perform well for the high values seen in the q-q plot. This is
due to the truncation of the infinite expansion. The mean was reproduced up to the fifth
significant digit, while the variance was reproduced up to the third significant digit.
Disjunctive Kriging
Disjunctive Kriging (DK) allows the estimation of any function of Z(u), based on a bivariate
probability model. A bivariate Gaussian distribution of the normal scores of the data is
almost always chosen.
DK provides the solution that minimizes the estimation variance among all linear combinations of functions of one point at a time.
In simple words, DK relies on the decomposition of the variable (or a function of it) into
a sum of factors. These factors are orthogonal random variables, uncorrelated with each
other, and therefore the optimum estimate can be found by simple kriging each component.
Figure 2: Original uniform distribution and corresponding reproduction of uniform distribution with 100 Hermite polynomials. A q-q plot of the original and approximated global
distributions shows the excellent match.
[Figure 3: histograms of the original lognormal data and the approximated distribution, with a q-q plot of original versus approximated values.]
The DK estimate at an unsampled location u_0 is written as a linear combination of the simple kriging (SK) estimates of the factors:

$$[f(Y(u_0))]_{DK} = \sum_{p=0}^{P} f_p\, [H_p(Y(u_0))]_{SK}$$

To calculate the DK estimate, the normal score transformation of the data is necessary:

$$y(u_\alpha) = G^{-1}(F_Z(z(u_\alpha))), \quad \alpha = 1, ..., N$$

Then, the spatial covariance of the transformed variable ρ(h) is calculated and modelled.
The covariance function is the correlogram because Y has unit variance.
The Hermite polynomials are computed for all the transformed data up to a degree P
using Equations 2 and 3. Finally, the coefficients of the Hermitian expansion, up to a degree
P, can be calculated by Equations 10 and 11.

Simple kriging is performed P times. The estimate of the Hermite polynomial at an
unsampled location u_0 is calculated as:

$$[H_p(y(u_0))]_{SK} = \sum_{i=1}^{n(u_0)} \lambda_{p,i}\, H_p(y(u_i)), \quad \forall p > 0$$

where λ_{p,i} is the simple kriging weight for the datum y(u_i) and the degree p, and n(u_0) is the
number of samples found in the search neighborhood used for kriging. Notice that the term
for the mean is not present, since the mean value of the Hermite polynomial is 0 for all
p > 0. Also, note that the SK estimate for the polynomial of degree 0 is 1, since this is its
value by definition (Equation 3).
The weights are obtained by solving the following system of equations:

$$\begin{pmatrix} (\rho_{1,1})^p & \cdots & (\rho_{1,n(u_0)})^p \\ \vdots & \ddots & \vdots \\ (\rho_{n(u_0),1})^p & \cdots & (\rho_{n(u_0),n(u_0)})^p \end{pmatrix} \begin{pmatrix} \lambda_{p,1} \\ \vdots \\ \lambda_{p,n(u_0)} \end{pmatrix} = \begin{pmatrix} (\rho_{1,0})^p \\ \vdots \\ (\rho_{n(u_0),0})^p \end{pmatrix} \qquad (12)$$

where ρ_{i,j} is the correlogram value between locations u_i and u_j, and the index 0 denotes the location being estimated.
The DK estimate is then:

$$[z(u_0)]_{DK} = \sum_{p=0}^{P} f_p \sum_{i=1}^{n(u_0)} \lambda_{p,i}\, H_p(y(u_i))$$

The corresponding estimation variance is:

$$\sigma_{DK}^2(u_0) = \sum_{p=1}^{P} (f_p)^2\, \sigma_{SK,p}^2$$

where σ²_{SK,p} is the estimation variance from SK of the Hermite polynomials for the system
of order p, that is:
$$\sigma_{SK,p}^2 = \sigma_{H_p(Y)}^2 - \sum_{i=1}^{n(u_0)} \lambda_{p,i}\, (\rho_{i,0})^p$$

Since σ²_{H_p(Y)} = 1, ∀p > 0, the estimation variance of DK can be rewritten as:

$$\sigma_{DK}^2(u_0) = \sum_{p=1}^{P} (f_p)^2 \left( 1 - \sum_{i=1}^{n(u_0)} \lambda_{p,i}\, (\rho_{i,0})^p \right)$$
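Putting the pieces together, the following sketch performs the P simple krigings and assembles the DK estimate and variance. The small Gaussian-elimination solver and all function names are our own choices; the inputs are the Hermite coefficients f_p, the polynomial values at the informing samples, and the correlogram values between samples and to the target:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def disjunctive_kriging(f, H_data, rho, rho0):
    """DK estimate and variance at an unsampled location.
    f      : coefficients f_0..f_P of the Hermite expansion
    H_data : H_data[i][p] = H_p(y(u_i)) for each informing sample i
    rho    : matrix of correlogram values between samples
    rho0   : correlogram values between each sample and the target
    Each degree p is simple-kriged with covariance rho**p (Equation 12)."""
    P, n = len(f) - 1, len(rho0)
    est, var = f[0], 0.0
    for p in range(1, P + 1):
        A = [[rho[i][j] ** p for j in range(n)] for i in range(n)]
        b = [rho0[i] ** p for i in range(n)]
        lam = solve(A, b)
        est += f[p] * sum(lam[i] * H_data[i][p] for i in range(n))
        var += f[p] ** 2 * (1.0 - sum(lam[i] * b[i] for i in range(n)))
    return est, var
```

Feeding this routine the correlations, coefficients, and polynomial values of the worked example below reproduces the estimate obtained by hand.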
Example
The following example is presented to illustrate the implementation of DK. Ten sample
values are considered. The sample statistics are:
Mean = 7.278
Variance = 24.280
To proceed with DK, the global distribution must be fitted with Hermite polynomials.
The normal score transforms have been calculated considering only these ten values as the
global distribution. Table 1 shows the original values, their transforms, the corresponding
probability densities and the values for the Hermite polynomials up to the tenth degree.
Once the Hermite polynomials have been calculated, the next step is to calculate the
coefficients that will be used to fit the global distribution. The coefficients in Equation
11 are calculated in Table 2. The values f_p are obtained by summing the elements in the
corresponding columns. Recall that f_0 = E{Z}. This expected value is estimated with the
sample mean, that is, f_0 = 7.278.
The global distribution is fitted with Hermite polynomials up to the tenth degree. For
comparison, the fitting using 100 Hermite polynomials is also shown in Figure 4.
We are interested in estimating the value of the variable Z at location u_0 = (0, 0) by
Disjunctive Kriging. Considering a spherical covariance with a range of 40 for the Gaussian
transform, we can calculate the estimate at that location given the locations of the nearby
samples, as shown in Figure 5.
[Figure 4 panels: cumulative distribution of the original declustered data (10 data; mean 7.28, std. dev. 4.67, maximum 16.63) overlaid with the fit using ten polynomials (mean 7.28, std. dev. 4.12, maximum 15.77) and, for comparison, with the fit using 100 polynomials (mean 7.28, std. dev. 4.17, maximum 15.21).]
Figure 4: Fitting of the global distribution using the ten polynomials. For comparison, a
fitting using 100 polynomials is shown. The fitted distribution is shown as a thick line.
Only three samples are found in the search neighborhood: z(u_1) = 3.377 located at
u_1 = (2, 0); z(u_2) = 12.586 located at u_2 = (-4, 0); and z(u_3) = 5.398 located at u_3 = (0, 4).
A matrix D of distances between the samples, and a vector d with the distances
between the samples and the point to be estimated, are built:

$$D = \begin{pmatrix} 0 & 6.00 & 4.47 \\ 6.00 & 0 & 5.66 \\ 4.47 & 5.66 & 0 \end{pmatrix}, \qquad d = \begin{pmatrix} 2.00 \\ 4.00 \\ 4.00 \end{pmatrix}$$
Figure 5: Data configuration. The figure shows the original z data. A search radius is
defined to find the nearby samples to estimate by DK the value of z(u) at location u_0 = (0, 0).
Only three samples are found in the search neighborhood.
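The correlations used in the kriging systems can be reproduced from the sample coordinates and the stated spherical model with range 40. A quick sketch (assuming u_2 = (-4, 0), which is consistent with the correlation values appearing in the systems; function names are ours):

```python
import math

def spherical(h, a=40.0):
    """Spherical correlogram with range a and unit sill."""
    if h >= a:
        return 0.0
    t = h / a
    return 1.0 - 1.5 * t + 0.5 * t ** 3

u = [(2.0, 0.0), (-4.0, 0.0), (0.0, 4.0)]   # sample locations (u_2 assumed at (-4, 0))
u0 = (0.0, 0.0)                             # location to estimate

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

rho = [[spherical(dist(a_, b_)) for b_ in u] for a_ in u]    # sample-to-sample
rho0 = [spherical(dist(a_, u0)) for a_ in u]                 # sample-to-target
```

This yields rho(u_1, u_0) = 0.925, rho(u_2, u_0) = rho(u_3, u_0) = 0.851, and sample-to-sample correlations 0.777, 0.833, and 0.789, matching the systems of equations below.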
Applying the spherical correlogram to these distances gives the systems of equations (Equation 12) for each degree. For p = 1:

$$\begin{pmatrix} 1.000 & 0.777 & 0.833 \\ 0.777 & 1.000 & 0.789 \\ 0.833 & 0.789 & 1.000 \end{pmatrix} \begin{pmatrix} \lambda_{1,1} \\ \lambda_{1,2} \\ \lambda_{1,3} \end{pmatrix} = \begin{pmatrix} 0.925 \\ 0.851 \\ 0.851 \end{pmatrix}$$

For p = 2:

$$\begin{pmatrix} (1.000)^2 & (0.777)^2 & (0.833)^2 \\ (0.777)^2 & (1.000)^2 & (0.789)^2 \\ (0.833)^2 & (0.789)^2 & (1.000)^2 \end{pmatrix} \begin{pmatrix} \lambda_{2,1} \\ \lambda_{2,2} \\ \lambda_{2,3} \end{pmatrix} = \begin{pmatrix} (0.925)^2 \\ (0.851)^2 \\ (0.851)^2 \end{pmatrix}$$

For p = 3:

$$\begin{pmatrix} (1.000)^3 & (0.777)^3 & (0.833)^3 \\ (0.777)^3 & (1.000)^3 & (0.789)^3 \\ (0.833)^3 & (0.789)^3 & (1.000)^3 \end{pmatrix} \begin{pmatrix} \lambda_{3,1} \\ \lambda_{3,2} \\ \lambda_{3,3} \end{pmatrix} = \begin{pmatrix} (0.925)^3 \\ (0.851)^3 \\ (0.851)^3 \end{pmatrix}$$

and similarly up to p = 10.
 z value  y value   g(y)     p=1     p=2     p=3     p=4     p=5     p=6     p=7     p=8     p=9    p=10
   2.582   -1.645  0.103   1.645   1.206  -0.198  -1.207  -0.711   0.624   1.046   0.024  -0.973  -0.529
   3.087   -1.036  0.233   1.036   0.052  -0.815  -0.468   0.512   0.644  -0.222  -0.683  -0.027   0.639
   3.377   -0.674  0.318   0.674  -0.385  -0.701   0.097   0.656   0.092  -0.584  -0.225   0.500   0.320
   3.974   -0.385  0.370   0.385  -0.602  -0.449   0.435   0.476  -0.322  -0.488   0.235   0.490  -0.163
   4.321   -0.126  0.396   0.126  -0.696  -0.153   0.593   0.170  -0.533  -0.183   0.490   0.193  -0.457
   5.398    0.126  0.396  -0.126  -0.696   0.153   0.593  -0.170  -0.533   0.183   0.490  -0.193  -0.457
   8.791    0.385  0.370  -0.385  -0.602   0.449   0.435  -0.476  -0.322   0.488   0.235  -0.490  -0.163
  12.037    0.674  0.318  -0.674  -0.385   0.701   0.097  -0.656   0.092   0.584  -0.225  -0.500   0.320
  12.586    1.036  0.233  -1.036   0.052   0.815  -0.468  -0.512   0.644   0.222  -0.683   0.027   0.639
  16.626    1.645  0.103  -1.645   1.206   0.198  -1.207   0.711   0.624  -1.046   0.024   0.973  -0.529

Table 1: Hermite polynomial values H_p(y) for the data up to the tenth degree.

 z value  y value   g(y)     p=1     p=2     p=3     p=4     p=5     p=6     p=7     p=8     p=9    p=10
   3.087   -1.036  0.233  -0.118  -0.086  -0.004   0.048   0.025  -0.025  -0.029   0.009   0.027   0.001
   3.377   -0.674  0.318  -0.092  -0.044   0.021   0.032  -0.004  -0.025  -0.003   0.019   0.007  -0.015
   3.974   -0.385  0.370  -0.221  -0.060   0.077   0.050  -0.043  -0.043   0.027   0.038  -0.017  -0.034
   4.321   -0.126  0.396  -0.137  -0.012   0.055   0.011  -0.036  -0.010   0.028   0.009  -0.022  -0.008
   5.398    0.126  0.396  -0.426   0.038   0.171  -0.033  -0.113   0.030   0.086  -0.028  -0.070   0.026
   8.791    0.385  0.370  -1.257   0.342   0.437  -0.282  -0.244   0.244   0.153  -0.217  -0.098   0.195
  12.037    0.674  0.318  -1.032   0.492   0.230  -0.361  -0.045   0.276  -0.036  -0.213   0.077   0.163
  12.586    1.036  0.233  -0.128   0.094  -0.004  -0.052   0.027   0.027  -0.031  -0.010   0.029  -0.001
  16.626    1.645  0.103  -0.417   0.485  -0.290  -0.041   0.225  -0.121  -0.098   0.154  -0.003  -0.128
     f_p                  -3.828   1.248   0.693  -0.629  -0.210   0.354   0.096  -0.238  -0.071   0.198

Table 2: Calculation of the coefficients f_p of Equation 11. Each row gives the terms for one datum (α = 2, ..., N); the values f_p are obtained by summing the elements in the corresponding columns.
                     p=1     p=2     p=3     p=4     p=5     p=6     p=7     p=8     p=9    p=10
 H_p(y(u_1))       0.674  -0.385  -0.701   0.097   0.656   0.092  -0.584  -0.225   0.500   0.320
 H_p(y(u_2))      -1.036   0.052   0.815  -0.468  -0.512   0.644   0.222  -0.683   0.027   0.639
 H_p(y(u_3))      -0.126  -0.696   0.153   0.593  -0.170  -0.533   0.183   0.490  -0.193  -0.457
 λ_{p,1}           0.596   0.590   0.580   0.566   0.548   0.528   0.505   0.480   0.535   0.428
 λ_{p,2}           0.287   0.281   0.272   0.259   0.244   0.227   0.209   0.190   0.234   0.153
 λ_{p,3}           0.128   0.139   0.147   0.150   0.150   0.147   0.142   0.134   0.148   0.115
 [H_p(y(u_0))]_SK  0.088  -0.309  -0.162   0.023   0.209   0.116  -0.223  -0.172   0.245   0.183

Table 3: Simple kriging weights and Hermite polynomial values. The estimated Hermite
polynomials for location u_0 are shown in the bottom row.
We solve for the weights λ_{p,α}, α = 1, ..., n(u_0), p = 1, ..., P, and estimate each one of the
polynomials at location u_0. Table 3 shows the weights, the Hermite polynomials for u_1, u_2,
and u_3, and the estimated Hermite polynomial values for u_0.

Finally, the DK estimate of z(u_0) is obtained by combining these estimated Hermite
polynomial values with the coefficients of the global transformation.
This gives:

$$[z(u_0)]_{DK} = [f(y(u_0))]_{DK} = \sum_{p=0}^{P} f_p\, [H_p(y(u_0))]_{SK} \approx 6.46$$
Conclusion
This paper presents the methodology to estimate the value of a regionalized variable at
an unsampled location by Disjunctive Kriging. The use of the Hermite polynomials as an
isofactorial family was discussed and the most fundamental equations were presented. A
simple example where a single point was estimated was presented, showing all calculations
required to obtain the estimated value.
DK can be applied under a variety of assumptions regarding the bivariate spatial law.
These extensions have not been presented in this paper. Implementation of DK under
the bi-Gaussian assumption and with other isofactorial families of polynomials could be
considered in the future.
References
[1] M. Armstrong and G. Matheron. Disjunctive kriging revisited - Part I. Mathematical
Geology, 18(8):711-728, 1986.
[2] M. Armstrong and G. Matheron. Disjunctive kriging revisited - Part II. Mathematical
Geology, 18(8):729-742, 1986.
[3] J. P. Chilès and P. Delfiner. Geostatistics: Modeling Spatial Uncertainty. John Wiley
& Sons, New York, 1999.
[4] X. Emery. Conditional simulation of non-Gaussian random functions. Mathematical
Geology, 34(1):79-100, 2002.