Calculation of Uncertainty in The Variogram: Mathematical Geology February 2002
All content following this page was uploaded by Julián M. Ortiz on 17 July 2014.
There are often limited data available in early stages of geostatistical modeling. This leads to
considerable uncertainty in statistical parameters including the variogram. This article presents an
approach to calculate the uncertainty in the variogram. A methodology to transfer this uncertainty
through geostatistical simulation and decision making is also presented.
The experimental variogram value 2γ̂(h) for a separation lag vector h is a mean of squared
differences. The variance of a mean can be calculated with a model of the correlation between the
pairs of data used in the calculation. The “data” here are squared differences; therefore, we need a
measure of a 4-point correlation. A theoretical multi-Gaussian approach is presented for this
uncertainty assessment together with a number of examples. The theoretical results are validated by
numerical simulation. The simulation approach permits generalization to non-Gaussian situations.
Multiple plausible variograms may be fit knowing the uncertainty at each variogram point, 2γ (h).
Multiple geostatistical realizations may then be constructed and subjected to process assessment to
measure the impact of this uncertainty.
INTRODUCTION
0882-8121/02/0200-0169/1 © 2002 International Association for Mathematical Geology
P1: GCR/GGT/GFQ P2: GVG
Mathematical Geology [mg] PP376-matg-367314 February 15, 2002 9:37 Style file version June 30, 1999
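$$2\hat{\gamma}(\mathbf{h}) = \frac{1}{n(\mathbf{h})} \sum_{i=1}^{n(\mathbf{h})} [Z(\mathbf{u}_i) - Z(\mathbf{u}_i + \mathbf{h})]^2 \tag{2}$$

$$\bar{X} = 2\hat{\gamma}(\mathbf{h}) = \frac{1}{n(\mathbf{h})} \sum_{i=1}^{n(\mathbf{h})} X_i \tag{3}$$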
From classical statistics, we know that the uncertainty in the mean X̄ is defined as
$$\operatorname{Var}\{\bar{X}\} = E\{(\bar{X} - E\{\bar{X}\})^2\} = E\{\bar{X}^2\} - (E\{\bar{X}\})^2 \tag{4}$$
Now, using expression (4) we can calculate the uncertainty in the variogram assuming
that we have a “reference” variogram model fitted to the experimental
points. X̄ is replaced by 2γ̂(h) and the variance of squared differences around
the model is calculated as follows:
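$$\sigma^2_{2\hat{\gamma}(\mathbf{h})} = E\left\{ \frac{1}{n(\mathbf{h})^2} \sum_{i=1}^{n(\mathbf{h})} \sum_{j=1}^{n(\mathbf{h})} [Z(\mathbf{u}_i) - Z(\mathbf{u}_i + \mathbf{h})]^2 \times [Z(\mathbf{u}_j) - Z(\mathbf{u}_j + \mathbf{h})]^2 \right\} - (2\hat{\gamma}(\mathbf{h}))^2 \tag{5}$$

$$\sigma^2_{2\hat{\gamma}(\mathbf{h})} = \frac{1}{n(\mathbf{h})^2} \sum_{i=1}^{n(\mathbf{h})} \sum_{j=1}^{n(\mathbf{h})} E\{[Z(\mathbf{u}_i) - Z(\mathbf{u}_i + \mathbf{h})]^2 \cdot [Z(\mathbf{u}_j) - Z(\mathbf{u}_j + \mathbf{h})]^2\} - (2\hat{\gamma}(\mathbf{h}))^2 \tag{6}$$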
Now, replacing $X_i$ and $X_j$ by the squared differences $[z(\mathbf{u}_i) - z(\mathbf{u}_i + \mathbf{h})]^2$ and
$[z(\mathbf{u}_j) - z(\mathbf{u}_j + \mathbf{h})]^2$, respectively, and $\bar{X}$ by the variogram $2\gamma(\mathbf{h})$,
$$\sigma^2_{2\hat{\gamma}(\mathbf{h})} = \frac{1}{n(\mathbf{h})^2} \sum_{i=1}^{n(\mathbf{h})} \sum_{j=1}^{n(\mathbf{h})} C_{ij}(\mathbf{h}) \tag{9}$$
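where $C_{ij}(\mathbf{h})$ is calculated as in Eq. (8). To avoid confusion, note that $C_{ij}(\mathbf{h})$ is the
covariance between the squared difference of pair i, $[Z(\mathbf{u}_i) - Z(\mathbf{u}_i + \mathbf{h})]^2$, and the
squared difference of pair j, $[Z(\mathbf{u}_j) - Z(\mathbf{u}_j + \mathbf{h})]^2$ (Fig. 1).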
Expression (9) tells us that the uncertainty in the variogram at a distance h
is the average covariance between “pairs of pairs” used to calculate the variogram
for that particular lag.
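
As a small numerical illustration of expression (9), the following Python sketch computes the variance of a mean as the average of all entries of a covariance matrix. The function name `variance_of_mean` and the two 3 x 3 matrices are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import numpy as np

def variance_of_mean(cov):
    """Variance of the arithmetic mean of n correlated variables.

    cov is the n x n covariance matrix of the variables; the result is the
    average of all its entries, i.e. (1/n^2) * sum_i sum_j C_ij.
    """
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    return cov.sum() / n**2

# Hypothetical example: three variables with unit variance.
independent = np.eye(3)
correlated = np.full((3, 3), 0.5) + 0.5 * np.eye(3)  # correlation 0.5 off the diagonal

print(variance_of_mean(independent))  # 1/3: the classical sigma^2 / n
print(variance_of_mean(correlated))   # 2/3: positive correlation inflates the variance
```

With independent variables the result reduces to the familiar sigma^2 / n; the off-diagonal covariances are what make the variogram case differ from that classical result.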
The covariance between “pairs of pairs” can be calculated theoretically under
a multi-Gaussian assumption. The following section presents this approach. The
next sections present the local and global simulation methods to check the results
given by the theoretical approach. The global simulation method is more general
in the sense that it gives the whole distribution of uncertainty in the variogram
values for each lag. Although the shape of the pointwise uncertainty distribution
is unknown, and variogram values must be nonnegative, a Gaussian shape was
assumed in order to present the confidence intervals calculated from the variance
given by the theoretical approach and the local simulation method. In theory, if all
the squared random variables were independent (which is clearly not the case), the
distribution of uncertainty in a variogram point would be χ² (chi-square). The
global simulation method shows asymmetric distributions in a few cases; however,
a Gaussian distribution is a good approximation in most cases.
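
The Gaussian confidence intervals mentioned above can be sketched as follows. This is a minimal illustration, assuming the interval is centered on a single variogram value with a known variance; the function name `gaussian_interval` and the numerical values are hypothetical and are not taken from the article.

```python
from statistics import NormalDist

def gaussian_interval(center, variance, level):
    """Central interval of probability `level` for a Gaussian with the given mean and variance."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * variance ** 0.5
    return center - half_width, center + half_width

# Hypothetical variogram value and variogram variance at one lag (illustration only).
center, variance = 1.0, 0.04
for level in (0.95, 0.75, 0.50, 0.25):
    low, high = gaussian_interval(center, variance, level)
    print(f"{level:.0%}: [{low:.3f}, {high:.3f}]")
```

Nothing in this sketch prevents the lower bound from becoming negative when the variance is large relative to the variogram value, which is one practical consequence of assuming a Gaussian shape for a nonnegative quantity.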
The following steps are required for all three methodologies:
1. Transform data to normal space: Any data distribution can be easily transformed
to a univariate Gaussian distribution. In the following examples the
program nscore in GSLIB (Deutsch and Journel, 1998) was used to perform
the transformation. This transformation is commonly done to allow
Gaussian simulation.
2. Check multigaussianity: To fulfil the multi-Gaussian condition, one should
ensure that not only the univariate distribution is Gaussian, but also the
bivariate and all higher-order multivariate distributions. In practice, tests can be
applied to the transformed data to check bigaussianity; however,
they are not often applied, especially in the presence of sparse data.
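
The normal score transformation in step 1 can be sketched with a simple rank-based mapping. This is not the GSLIB nscore program, which also handles declustering weights and ties; the function name `normal_scores` and the plotting position (rank + 0.5) / n are assumptions made for this illustration.

```python
import numpy as np
from statistics import NormalDist

def normal_scores(values):
    """Rank-based transform of a sample to standard normal scores.

    Each value is assigned the cumulative probability (rank + 0.5) / n and
    mapped through the inverse standard normal CDF. Ties are broken by
    order of appearance, which is a simplification.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    order = np.argsort(values, kind="mergesort")  # stable sort
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)
    probs = (ranks + 0.5) / n
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in probs])

# Hypothetical skewed data (illustration only).
data = [0.3, 12.0, 1.1, 0.8, 4.5]
print(normal_scores(data))
```

The transform preserves the ordering of the data, so the largest value receives the largest normal score.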
THEORETICAL APPROACH
Notice that those pairwise covariances are different from the $C_{ij}(\mathbf{h})$ presented
earlier, which are fourth-order statistics, since they correspond to the covariance
between pairs of squared differences (i.e., “pairs of pairs”). Then, the variogram
variance is calculated as a sum of fourth-order moments minus the square of two
times the variogram, $(2\gamma(\mathbf{h}))^2$.
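
A minimal Python sketch of this kind of calculation is given below. It does not expand the individual fourth-order moments; instead it uses a standard shortcut that holds for zero-mean jointly Gaussian variables A and B (Isserlis' theorem), Cov(A², B²) = 2·Cov(A, B)², which is equivalent under the multi-Gaussian assumption but may not match the form used in the article. The exponential covariance `cov_model`, its range, and the function names are hypothetical.

```python
import numpy as np

def cov_model(d, a=10.0):
    """Hypothetical unit-sill exponential covariance with practical range a."""
    return np.exp(-3.0 * np.asarray(d, dtype=float) / a)

def pairs_of_pairs_cov(tails, heads, cov=cov_model):
    """Matrix of covariances C_ij between squared differences of pairs i and j.

    tails[i] and heads[i] are the coordinates u_i and u_i + h of pair i.
    Assumes a stationary, zero-mean multi-Gaussian random function.
    """
    tails = np.asarray(tails, dtype=float)
    heads = np.asarray(heads, dtype=float)

    def c(p, q):
        dist = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
        return cov(dist)

    # Covariance between the increments Z(u_i) - Z(u_i + h) and Z(u_j) - Z(u_j + h)
    cross = c(tails, tails) - c(tails, heads) - c(heads, tails) + c(heads, heads)
    # Gaussian shortcut: Cov(A^2, B^2) = 2 * Cov(A, B)^2
    return 2.0 * cross**2

def variogram_variance(tails, heads, cov=cov_model):
    """Average of C_ij over all pairs of pairs."""
    return pairs_of_pairs_cov(tails, heads, cov).mean()

# Hypothetical pairs along a line, all separated by h = (5, 0).
tails = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
heads = tails + np.array([5.0, 0.0])
print(variogram_variance(tails, heads))
```

On the diagonal the shortcut gives C_ii = 2·(2γ(h))², the variance of a single squared Gaussian increment, which is a quick sanity check on the implementation.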
A simple program can perform these calculations. For each lag, the location
of pairs considered in the experimental variogram calculation is used to determine
SIMULATION ALTERNATIVE
The idea is to simulate each set of four-point locations in turn and evaluate the
fourth order moments in expression (10) by simple averages. Again, the assumption
of multigaussianity simplifies the simulation. A matrix or LU simulation approach
is very fast and efficient since only four points are considered at a time and there are
no conditioning data. All fourth order moments in expression (10) are estimated
as averages of products using the simulated values, and the variogram variance is
calculated with formula (9).
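
The local simulation idea can be sketched for a single set of four locations as follows. This is a minimal illustration, assuming a hypothetical unit-sill exponential covariance (`cov_model`) and using a Cholesky factor as the matrix (LU) simulation; the function name `simulate_cij`, the number of simulations, and the coordinates are not taken from the article.

```python
import numpy as np

def cov_model(d, a=10.0):
    """Hypothetical unit-sill exponential covariance with practical range a."""
    return np.exp(-3.0 * np.asarray(d, dtype=float) / a)

def simulate_cij(points, n_sim=200_000, seed=0, cov=cov_model):
    """Estimate the covariance between two squared differences by simulation.

    points is a 4 x dim array ordered as u_i, u_i + h, u_j, u_j + h.
    The four locations must be distinct so that the covariance matrix
    has a Cholesky factor.
    """
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    chol = np.linalg.cholesky(cov(dist))           # lower-triangular factor
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_sim, 4)) @ chol.T   # unconditional Gaussian values
    x_i = (z[:, 0] - z[:, 1]) ** 2
    x_j = (z[:, 2] - z[:, 3]) ** 2
    # Average of products minus product of averages
    return np.mean(x_i * x_j) - np.mean(x_i) * np.mean(x_j)

# Hypothetical pair of pairs along a line with h = (5, 0).
points = np.array([[0.0, 0.0], [5.0, 0.0], [2.0, 0.0], [7.0, 0.0]])
print(simulate_cij(points))
```

Repeating this estimate for every pair of pairs at a given lag and averaging the results gives the variogram variance of formula (9); the estimate is subject to Monte Carlo error that shrinks as the number of simulations grows.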
The theoretical approach has the following advantages over the two
simulation-based methods: (1) implementation is easier since the fourth-order
moments are calculated analytically and directly, (2) computer speed is much
improved since there is no need for random number generation or multiple
realizations, and (3) the result is not affected by simulation error, whereas the
simulation methods are approximate, although they converge to the correct result
provided the implementation is correct.
The global simulation method has the advantage that the entire distribution
of uncertainty is simulated.
EXAMPLE 1: CLUSTER.DAT
Table 1. Pointwise Variogram Uncertainty Calculated Using the Three Methods Presented
Figure 3. The experimental variogram, along with the variogram model fitted and the
central confidence intervals at 95, 75, 50, and 25% for each lag (Cluster database).
EXAMPLE 2: RED.DAT
Figure 4. Location map of samples and gold content taken from the
database red.dat.
Figure 5. The experimental variogram, along with the variogram model fitted and the central
confidence intervals at 95, 75, 50, and 25% for each lag (red.dat database). Left: Variogram for
thickness; Right: Variogram for gold content.
The calculation of confidence intervals was performed for each variable, and the
results are shown in Figure 5.
The global simulation method was used to obtain the entire uncertainty distribution
for each lag. One hundred nonconditional realizations of a Gaussian random
variable were generated using sgsim. The simulated values at the sampled locations
(obtained from the database red.dat) were extracted for each realization. The
experimental variogram was then calculated from the simulated values at the sampled
locations, with the same parameters that were used to find the experimental points
shown in Figure 5.
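
The procedure just described can be sketched in Python as below. Several simplifications are assumptions of this illustration and not the workflow used with sgsim: the unconditional Gaussian values are simulated directly at the sample locations with a Cholesky factor instead of being extracted from a simulated grid, the variogram is omnidirectional, the covariance `cov_model` is a hypothetical unit-sill exponential model, and the sample coordinates are random rather than those of red.dat.

```python
import numpy as np

def cov_model(d, a=10.0):
    """Hypothetical unit-sill exponential covariance with practical range a."""
    return np.exp(-3.0 * np.asarray(d, dtype=float) / a)

def global_simulation(coords, lag, tol, n_real=100, seed=0, cov=cov_model):
    """Experimental 2*gamma at one lag for each of n_real unconditional realizations."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    # Pairs whose separation distance falls in the lag bin [lag - tol, lag + tol]
    i, j = np.triu_indices(n, k=1)
    in_bin = np.abs(dist[i, j] - lag) <= tol
    i, j = i[in_bin], j[in_bin]
    if i.size == 0:
        raise ValueError("no pairs fall in this lag bin")

    chol = np.linalg.cholesky(cov(dist))
    rng = np.random.default_rng(seed)
    values = np.empty(n_real)
    for k in range(n_real):
        z = chol @ rng.standard_normal(n)        # one realization at the sample locations
        values[k] = np.mean((z[i] - z[j]) ** 2)  # mean of squared differences
    return values

# Hypothetical sample locations in a 50 x 50 area.
coords = np.random.default_rng(42).uniform(0.0, 50.0, size=(50, 2))
two_gamma = global_simulation(coords, lag=10.0, tol=2.5)
print(two_gamma.mean(), two_gamma.var(ddof=1))
```

The array returned for each lag is a sample from the full uncertainty distribution, so a histogram of it shows the shape of the distribution and its sample variance estimates the variogram variance.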
The experimental variograms calculated for each realization using the entire
simulated field (showing ergodic fluctuations) and those calculated using only
the simulated data at the sample locations (now considering the effect of ergodic
fluctuations and “sampling fluctuations”) are shown in Figure 6 for thickness and
gold content.
Table 2 shows the variogram variance for each variable and lag, calculated
using the theoretical approach, the local simulation method, and the global
simulation method. One hundred realizations were generated for the numerical methods.
The results obtained from the theoretical approach and the local simulation
method are similar; however, the global simulation method gives a lower variance
for all the lags. The main difficulty of the global simulation approach is to ensure
correct use of the variogram for all distances when a limited number of nearby
samples is used (Tran, 1994). The variogram calculated for each realization (using
all the simulated nodes) was presented in Figure 6 (Left). The variability in the
variograms calculated using all the nodes in the grid is lower than the expected
variability.
Histograms showing the entire uncertainty distribution for the corresponding
lags are presented in Figure 7. All the histograms generated through the global
simulation method are slightly asymmetric with a tail to the right. This asymmetry
was expected since the variogram is nonnegative.
                         Variogram               Variogram variance
Lag   Distance   Experimental   Model   Theoretical   Local sim.   Global sim.
Variable: Thickness
2      17.497    0.332          0.311   0.013         0.012        0.004
3      51.119    0.687          0.540   0.008         0.001        0.007
4      99.311    0.669          0.742   0.044         0.041        0.024
5     148.627    0.871          0.857   0.092         0.089        0.052
6     197.746    0.957          0.921   0.152         0.150        0.085
7     250.436    1.178          0.958   0.176         0.177        0.112
8     297.843    0.969          0.976   0.264         0.258        0.160
9     345.356    0.992          0.986   0.289         0.270        0.193
Variable: Gold content
2      17.497    0.493          0.554   0.044         0.041        0.014
3      54.099    0.706          0.712   0.015         0.001        0.008
4      99.435    0.715          0.833   0.030         0.005        0.015
5     149.221    0.865          0.908   0.053         0.043        0.028
6     198.912    1.065          0.949   0.078         0.075        0.056
7     249.254    1.216          0.972   0.096         0.092        0.066
8     297.879    0.961          0.985   0.134         0.140        0.079
9     345.618    1.088          0.991   0.160         0.161        0.110
Figure 6. The experimental variogram values for each lag calculated using (Left) all the simulated
data and (Right) only the simulated values at sampling locations (red.dat database). Top: Thickness;
Bottom: Gold.
COMMENTS
Figure 7. An example of the uncertainty distribution of the pointwise variogram values: Histograms of variogram
values for lags 6, 7, 8, and 9 for Gold (red.dat database).
REFERENCES
Cressie, N., and Hawkins, D., 1980, Robust estimation of the variogram: Math. Geology, v. 12, no. 2,
p. 115–126.
Cressie, N., 1991, Statistics for spatial data: John Wiley & Sons, New York, 900 p.
Deutsch, C., and Journel, A., 1998, GSLIB: Geostatistical software library and user’s guide, 2nd ed.:
Oxford University Press, New York, 369 p.
Genton, M. G., 1998, Highly robust variogram estimation: Math. Geology, v. 30, no. 2, p. 213–221.
Goovaerts, P., 1997, Geostatistics for natural resources evaluation: Oxford University Press, New York,
483 p.
Journel, A. G., 1994, Resampling from stochastic simulations: Environ. Ecol. Stat., v. 1, p. 63–84.
Matheron, G., 1965, Les variables régionalisées et leur estimation: Masson et Cie. Editeurs, Paris.
Olea, R. A., 1995, Fundamentals of semivariogram estimation, modeling, and usage, in Yarus, J. M.,
and Chambers, R. L., eds., Stochastic modeling and geostatistics: Principles, methods, and case
studies: American Association of Petroleum Geologists, Tulsa, OK, p. 27–36.
Omre, H., 1984, The variogram and its estimation, in Verly, G., and others, eds., Geostatistics for natural
resources characterization: Reidel, Dordrecht, Holland, v. 1, p. 107–125.
Tran, T. T., 1994, Improving variogram reproduction on dense simulation grids: Comput. Geosci., v. 20,
no. 7, p. 1161–1168.