
Kriging vs. Simulation, A 2D Map Example - GeostatsPy Well-Documented Demonstration Geostatistical Workflows

This document is a tutorial chapter on spatial estimation techniques, specifically comparing Kriging and Sequential Gaussian Simulation using a 2D map example. It outlines the principles of estimation versus simulation, the properties of Kriging, and the process of Sequential Gaussian Simulation, including necessary coding steps in Python. The chapter aims to provide a hands-on guide for applying these geostatistical methods using the GeostatsPy library.


Kriging vs. Simulation, a 2D Map Example
Contents
Estimation vs. Simulation
Spatial Estimation
Kriging
Sequential Gaussian Simulation
Load the Required Libraries
Declare Functions
Set the Working Directory
Loading Tabular Data
Set Limits for Plotting, Colorbars and Grid Specification
Data Analytics and Visualization
Simple Kriging
Sequential Gaussian Simulation
Visualize Simulated Realizations and a Kriged Estimation Model
Comments
About the Author
Want to Work Together?

Michael J. Pyrcz, Professor, The University of Texas at Austin

Twitter | GitHub | Website | GoogleScholar | Geostatistics Book | YouTube | Applied Geostats in Python e-book | Applied Machine Learning in Python e-book | LinkedIn

Chapter of e-book “Applied Geostatistics in Python: a Hands-on Guide with GeostatsPy”.



 Cite this e-Book as:

Pyrcz, M.J., 2024, Applied Geostatistics in Python: a Hands-on Guide with GeostatsPy [e-book]. Zenodo. doi:10.5281/zenodo.15169133

The workflows in this book and more are available here:

 Cite the GeostatsPyDemos GitHub Repository as:

Pyrcz, M.J., 2024, GeostatsPyDemos: GeostatsPy Python Package for Spatial Data
Analytics and Geostatistics Demonstration Workflows Repository (0.0.1) [Software].
Zenodo. doi:10.5281/zenodo.12667036. GitHub Repository:
 GeostatsGuy/GeostatsPyDemos DOI 10.5281/zenodo.12667036

By Michael J. Pyrcz
© Copyright 2024.

This chapter is a tutorial for / demonstration of Spatial Estimation with Kriging vs.
Simulation with Sequential Gaussian Simulation (SGSIM) with a 2D map example.

YouTube Lecture: check out my lectures on:

Trend Modeling
Kriging
Stochastic Simulation

For your convenience here’s a summary of the salient points.

Estimation vs. Simulation


Let’s start by comparing spatial estimation and simulation.

Estimation:

honors local data


locally accurate, primary goal of estimation is 1 estimate!
too smooth, appropriate for visualizing trends
too smooth, inappropriate for flow simulation
one model, no assessment of global uncertainty

Simulation:

honors local data


sacrifices local accuracy, reproduces histogram
honors spatial variability, appropriate for flow simulation
alternative realizations, change random number seed
many models (realizations), assessment of global uncertainty

Spatial Estimation
Consider the case of making an estimate at some unsampled location, $z(\mathbf{u}_0)$, where $z$ is the property of interest (e.g. porosity etc.) and $\mathbf{u}_0$ is a location vector describing the unsampled location.

How would you do this given data, $z(\mathbf{u}_1)$, $z(\mathbf{u}_2)$, and $z(\mathbf{u}_3)$?

It would be natural to use a set of linear weights to formulate the estimator given the
available data.


$$z^{*}(\mathbf{u}) = \sum_{\alpha=1}^{n} \lambda_{\alpha} z(\mathbf{u}_{\alpha})$$

We could add an unbiasedness constraint to impose the sum of the weights equal to one.
What we will do is assign the remainder of the weight (one minus the sum of weights) to the
global average; therefore, if we have no informative data we will estimate with the global
average of the property of interest.

$$z^{*}(\mathbf{u}) = \sum_{\alpha=1}^{n} \lambda_{\alpha} z(\mathbf{u}_{\alpha}) + \left(1 - \sum_{\alpha=1}^{n} \lambda_{\alpha}\right) \bar{z}$$
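To make the role of the leftover weight concrete, here is a minimal sketch of this estimator. The function name `linear_estimate`, the data values, the global mean and the weights are all hypothetical and chosen only for illustration; they are not kriging weights.

```python
import numpy as np

def linear_estimate(weights, values, global_mean):
    """Weighted linear estimate; the leftover weight goes to the global mean."""
    weights = np.asarray(weights, dtype=float)
    values = np.asarray(values, dtype=float)
    weight_on_mean = 1.0 - weights.sum()          # one minus the sum of weights
    return float(weights @ values + weight_on_mean * global_mean)

z_data = [8.0, 12.0, 10.0]    # hypothetical porosity data (%)
z_bar = 11.0                  # hypothetical global mean (%)

# no informative data: all weight falls on the global mean
no_info = linear_estimate([0.0, 0.0, 0.0], z_data, z_bar)

# some informative data: the remaining 0.2 weight is on the global mean
some_info = linear_estimate([0.5, 0.2, 0.1], z_data, z_bar)
print(no_info, some_info)
```

With all weights at zero the estimate falls back to the global mean, which is the behavior described above.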

We will make a stationarity assumption, so let’s assume that we are working with residuals,
y.

$$y^{*}(\mathbf{u}) = z^{*}(\mathbf{u}) - \bar{z}(\mathbf{u})$$

If we substitute this form into our estimator the estimator simplifies, since the mean of the residual is zero.

$$y^{*}(\mathbf{u}) = \sum_{\alpha=1}^{n} \lambda_{\alpha} y(\mathbf{u}_{\alpha})$$

while satisfying the unbiasedness constraint.
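A quick numerical check of this simplification, as a sketch with hypothetical data, mean and weights: estimating the residual and adding the mean back gives the same value as the estimator that carries the explicit weight on the mean.

```python
import numpy as np

z_data = np.array([8.0, 12.0, 10.0])   # hypothetical data values
z_bar = 11.0                           # hypothetical stationary mean
lam = np.array([0.5, 0.2, 0.1])        # hypothetical weights (not kriging weights)

# estimator with the explicit weight on the mean
z_star_direct = lam @ z_data + (1.0 - lam.sum()) * z_bar

# same estimator via residuals: estimate y*, then add the mean back
y_data = z_data - z_bar
y_star = lam @ y_data
z_star_residual = y_star + z_bar

print(z_star_direct, z_star_residual)
```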

Kriging
Now the next question is what weights should we use?

We could use equal weighting, $\lambda = \frac{1}{n}$, and the estimator would be the average of the local data applied for the spatial estimate. This would not be very informative.

We could assign weights considering the spatial context of the data and the estimate:

spatial continuity - as quantified by the variogram (and covariance function)
redundancy - the degree of spatial continuity between all of the available data with themselves
closeness - the degree of spatial continuity between the available data and the estimation location
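The three ingredients above can be sketched with a covariance function and two sets of distances. This is only an illustration: the isotropic exponential covariance, its unit sill and 150 m range, and the coordinates are assumptions, not values taken from the chapter's kriging run.

```python
import numpy as np

def exp_cov(h, sill=1.0, vrange=150.0):
    """Isotropic exponential covariance (assumed model); vrange is the practical range."""
    return sill * np.exp(-3.0 * np.asarray(h, dtype=float) / vrange)

data_xy = np.array([[100.0, 100.0], [120.0, 100.0], [400.0, 300.0]])  # hypothetical data
u0 = np.array([110.0, 110.0])                                         # hypothetical estimate location

# redundancy: covariance between all pairs of data
d_dd = np.linalg.norm(data_xy[:, None, :] - data_xy[None, :, :], axis=2)
C_dd = exp_cov(d_dd)

# closeness: covariance between each datum and the estimation location
d_d0 = np.linalg.norm(data_xy - u0, axis=1)
c_d0 = exp_cov(d_d0)

print(np.round(C_dd, 3))
print(np.round(c_d0, 3))
```

The first two data are close to each other (large off-diagonal covariance, so they are redundant) and both are close to the estimation location, while the third datum is far from everything.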

The kriging approach accomplishes this, calculating the best linear unbiased weights for the
local data to estimate at the unknown location. The derivation of the kriging system and the
resulting linear set of equations is available in the lecture notes. Furthermore kriging
provides a measure of the accuracy of the estimate! This is the kriging estimation variance
(sometimes just called the kriging variance).

$$\sigma^{2}_{E}(\mathbf{u}) = C(0) - \sum_{\alpha=1}^{n} \lambda_{\alpha} C(\mathbf{u}_0 - \mathbf{u}_{\alpha})$$

What is ‘best’ about this estimate? Kriging estimates are best in that they minimize the
above estimation variance.
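As a rough sketch of how the weights and the estimation variance come out of a linear system, the following solves a small simple kriging problem with NumPy. It is not the GeostatsPy implementation; the exponential covariance, the function names and all numbers are assumptions for illustration.

```python
import numpy as np

def exp_cov(h, sill, vrange):
    """Isotropic exponential covariance (assumed model); vrange is the practical range."""
    return sill * np.exp(-3.0 * np.asarray(h, dtype=float) / vrange)

def simple_kriging(data_xy, data_val, u0, mean, sill, vrange):
    """Return the simple kriging estimate, estimation variance and weights at u0."""
    data_xy = np.asarray(data_xy, dtype=float)
    data_val = np.asarray(data_val, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    d_dd = np.linalg.norm(data_xy[:, None, :] - data_xy[None, :, :], axis=2)
    d_d0 = np.linalg.norm(data_xy - u0, axis=1)
    C = exp_cov(d_dd, sill, vrange)       # data-to-data covariances (redundancy)
    c0 = exp_cov(d_d0, sill, vrange)      # data-to-estimate covariances (closeness)
    lam = np.linalg.solve(C, c0)          # kriging weights
    estimate = mean + lam @ (data_val - mean)
    variance = sill - lam @ c0            # C(0) minus the weighted covariances
    return float(estimate), float(variance), lam

xy = [[100.0, 100.0], [120.0, 100.0], [400.0, 300.0]]   # hypothetical data locations
por = [8.0, 12.0, 10.0]                                 # hypothetical porosity (%)

est, var, lam = simple_kriging(xy, por, [110.0, 110.0], mean=11.0, sill=9.0, vrange=150.0)
est_far, var_far, _ = simple_kriging(xy, por, [5000.0, 5000.0], mean=11.0, sill=9.0, vrange=150.0)
print(est, var, lam)
print(est_far, var_far)
```

Far from all data the weights vanish, so the estimate returns to the mean and the estimation variance returns to the sill, C(0).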

Properties of Kriging
Here are some important properties of kriging:

Exact interpolator - kriging estimates with the data values at the data locations



Kriging variance can be calculated before getting the sample information, as the kriging
estimation variance is not dependent on the values of the data nor the kriging estimate,
i.e. the kriging estimator is homoscedastic.
Spatial context - in addition to spatial continuity, closeness and redundancy, kriging accounts for the configuration of the data and the structural continuity of the variable being estimated.
Scale - kriging may be generalized to account for the support volume of the data and
estimate. We will cover this later.
Multivariate - kriging may be generalized to account for multiple secondary data in the
spatial estimate with the cokriging system. We will cover this later.
Smoothing effect of kriging can be forecast. We will use this to build stochastic
simulations later.
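Two of these properties are easy to check numerically. The sketch below uses a compact simple kriging helper with an assumed exponential covariance and hypothetical data; it is not the GeostatsPy code path. It checks exactness at a data location and that the kriging variance does not change when the data values change.

```python
import numpy as np

def sk(data_xy, data_val, u0, mean, sill=1.0, vrange=150.0):
    """Simple kriging estimate and variance with an assumed exponential covariance."""
    data_xy = np.asarray(data_xy, dtype=float)
    cov = lambda h: sill * np.exp(-3.0 * h / vrange)
    C = cov(np.linalg.norm(data_xy[:, None] - data_xy[None, :], axis=2))
    c0 = cov(np.linalg.norm(data_xy - np.asarray(u0, dtype=float), axis=1))
    lam = np.linalg.solve(C, c0)
    est = mean + lam @ (np.asarray(data_val, dtype=float) - mean)
    return float(est), float(sill - lam @ c0)

xy = [[100.0, 100.0], [300.0, 250.0], [600.0, 700.0]]   # hypothetical locations
vals_a = [8.0, 12.0, 10.0]                              # hypothetical values, set A
vals_b = [15.0, 9.0, 20.0]                              # hypothetical values, set B

# exact interpolator: estimating at the second data location returns its value
est_at_data, var_at_data = sk(xy, vals_a, xy[1], mean=11.0)

# homoscedastic: same configuration, different values, same kriging variance
_, var_a = sk(xy, vals_a, [200.0, 200.0], mean=11.0)
_, var_b = sk(xy, vals_b, [200.0, 200.0], mean=11.0)
print(est_at_data, var_at_data, var_a, var_b)
```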

Sequential Gaussian Simulation


With sequential Gaussian simulation we build on kriging by:

adding a random residual with the missing variance


sequentially adding the simulated values as data to correct the covariance between the
simulated values

The resulting model corrects the issues of kriging, as we now:

reproduce the global feature PDF / CDF


reproduce the global variogram
while providing a model of uncertainty through multiple realizations

In this chapter, we run kriging estimates and multiple simulation realizations, and compare
the statistics.
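Before running the library routine, a toy version may help fix the idea. The following is a minimal 1D sketch of the sequential approach in Gaussian units (mean 0, variance 1), assuming an exponential covariance and using all previously known values at every node. It is not `geostats.sgsim`; the function name `sgs_1d`, the grid, the data and the range are hypothetical.

```python
import numpy as np

def exp_cov(h, vrange):
    """Unit-sill exponential covariance (assumed model)."""
    return np.exp(-3.0 * np.abs(h) / vrange)

def sgs_1d(x_grid, x_data, y_data, vrange, seed):
    """Toy sequential Gaussian simulation on a 1D grid, in Gaussian units."""
    rng = np.random.default_rng(seed)
    x_known = [float(x) for x in x_data]
    y_known = [float(y) for y in y_data]
    sim = np.full(len(x_grid), np.nan)
    for i in rng.permutation(len(x_grid)):            # random path over the nodes
        x0 = float(x_grid[i])
        xk, yk = np.array(x_known), np.array(y_known)
        if np.any(np.isclose(xk, x0)):                # node sits on a datum: honor it
            sim[i] = yk[np.argmin(np.abs(xk - x0))]
            continue
        C = exp_cov(xk[:, None] - xk[None, :], vrange)
        c0 = exp_cov(xk - x0, vrange)
        lam = np.linalg.solve(C, c0)                  # simple kriging weights
        est = lam @ yk                                # kriging estimate (mean is 0)
        var = max(1.0 - lam @ c0, 0.0)                # kriging variance, the missing variance
        sim[i] = est + np.sqrt(var) * rng.standard_normal()   # add the random residual
        x_known.append(x0)                            # simulated value becomes data
        y_known.append(sim[i])
    return sim

x_grid = np.arange(0.0, 100.0, 5.0)
x_data, y_data = [10.0, 60.0], [1.5, -0.8]            # hypothetical Gaussian-space data
real_1 = sgs_1d(x_grid, x_data, y_data, vrange=30.0, seed=1)
real_2 = sgs_1d(x_grid, x_data, y_data, vrange=30.0, seed=2)
print(np.round(real_1, 2))
print(np.round(real_2, 2))
```

Changing only the seed gives a different realization that still honors the data, which is the basis for the uncertainty model.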

Load the Required Libraries


The following code loads the required libraries.

import geostatspy.GSLIB as GSLIB # GSLIB utilit


import geostatspy.geostats as geostats # GSLIB method
import geostatspy
print('GeostatsPy version: ' + str(geostatspy.__version__))



GeostatsPy version: 0.0.72

We will also need some standard packages. These should have been installed with
Anaconda 3.

import os # set working

from tqdm import tqdm # suppress the


from functools import partialmethod
tqdm.__init__ = partialmethod(tqdm.__init__, disable=True)

ignore_warnings = True # ignore warni


import numpy as np # ndarrays for
import pandas as pd # DataFrames f
import matplotlib.pyplot as plt # for plotting
from matplotlib.ticker import (MultipleLocator, AutoMinorLocator) # control
from matplotlib import gridspec # custom subpl
plt.rc('axes', axisbelow=True) # plot all gri
if ignore_warnings == True:
    import warnings
    warnings.filterwarnings('ignore')
from IPython.utils import io # mute output
cmap = plt.cm.inferno # color map

If you get a package import error, you may have to first install some of these packages. This
can usually be accomplished by opening up a command window on Windows and then
typing ‘python -m pip install [package-name]’. More assistance is available with the
respective package docs.
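One way to see which packages are missing before running the imports is sketched below with the standard library only. The helper name `missing_packages` is hypothetical, and the list of names is an assumption based on the imports above.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# assumed list, based on the imports used in this chapter
required = ["numpy", "pandas", "matplotlib", "tqdm", "geostatspy"]

for name in missing_packages(required):
    print(f"python -m pip install {name}")   # suggested install command per missing package
```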

Declare Functions
Here’s a convenience function for plotting variograms.

def vargplot(feature,lags,gamma_maj,gamma_min,npps_maj,npps_min,vmodel,azi,a
    index_maj,lags_maj,gmod_maj,cov_maj,ro_maj = geostats.vmodel(nlag=100,xl
    index_min,lags_min,gmod_min,cov_min,ro_min = geostats.vmodel(nlag=100,xl

    plt.scatter(lags,gamma_maj,color = 'dark'+rcolor,edgecolor = 'black',s =
    label = 'Experimental Major', alpha = 0.8,zorder=60)
    plt.scatter(lags,gamma_min,color = rcolor,edgecolor = 'black',s = npps_m
    label='Experimental Minor')
    plt.scatter(lags,gamma_maj,color = 'white',edgecolor = 'white',s = npps_
    alpha = 0.8,zorder=50)
    plt.plot(lags_maj,gmod_maj,color = 'dark'+mcolor,lw=3,label = 'Model Maj
    plt.plot(lags_maj,gmod_maj,color = 'white',lw=6*size,zorder=70)
    plt.plot(lags_min,gmod_min,color = mcolor,lw=1.5,label='Model Minor',zor
    plt.plot(lags_min,gmod_min,color = 'white',lw=3.0,zorder=50)
    plt.plot([0,2000],[sill,sill],color = 'black',zorder=30)
    plt.xlabel(r'Lag Distance $\bf(h)$, (m)')
    plt.ylabel(r'$\gamma \bf(h)$')
    if atol < 90.0:
        plt.title('Directional ' + feature + ' Variogram')
    else:
        plt.title('Omni Directional NSCORE ' + feature + ' Variogram')
    plt.xlim([0,1000]); plt.ylim([0,1.8*sill])
    plt.legend(loc=legend_pos)
    plt.grid(True)

def locpix_st(array,xmin,xmax,ymin,ymax,step,vmin,vmax,df,xcol,ycol,vcol,tit
    xx, yy = np.meshgrid(np.arange(xmin, xmax, step), np.arange(ymax, ymin,
    cs = plt.imshow(array,interpolation = None,extent = [xmin,xmax,ymin,ymax
    plt.scatter(df[xcol],df[ycol],s=20,c=df[vcol],marker='o',cmap=cmap,vmin=
    plt.scatter(df[xcol],df[ycol],s=40,c='white',marker='o',alpha=0.8,linewi
    plt.title(title); plt.xlabel(xlabel)
    plt.ylabel(ylabel); plt.xlim(xmin, xmax); plt.ylim(ymin, ymax)
    cbar = plt.colorbar(cs,orientation="vertical",cmap=cmap)
    cbar.set_label(vlabel, rotation=270, labelpad=20)
    return cs

Set the Working Directory


I always like to do this so I don’t lose files and to simplify subsequent reads and writes (avoid
including the full address each time).

#os.chdir("c:/PGE383") # set the work

Loading Tabular Data


Here’s the command to load our comma delimited data file into a pandas DataFrame
object.

note the “fraction_data” variable is an option to randomly take part of the data (i.e., 1.0 is
all data).
this is not a standard part of spatial estimation, but fewer data are easier to visualize
given our grid size (we want multiple cells between the data to see the behavior
away from data)
note, I often remove unnecessary data table columns. This clarifies workflows and
reduces the chance of blunders, e.g., using the wrong column!

fraction_data = 0.2 # extract a fr


df = pd.read_csv(r"https://raw.githubusercontent.com/GeostatsGuy/GeoDataSets
df = df.rename(columns = {'Por':'Porosity'}) # rename featu
df = df.sample(frac = fraction_data,replace = False,random_state=13) # rando
df = df.reset_index() # reset the re
df = df.loc[:,['X','Y','Porosity']]; # retain only
df.head()

X Y Porosity

0 390.800194 460.553846 5.774823

1 380.012934 519.379612 11.577469

2 885.824011 866.827752 13.808706

3 885.845177 355.124942 11.786347

4 855.906100 656.141070 10.422411
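The subsampling step can be tried on its own with a synthetic table. This is a sketch with made-up data: the 50 random rows stand in for the real file, and only `DataFrame.sample`, `reset_index` and the column selection mirror the code above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
full = pd.DataFrame({                       # synthetic stand-in for the data file
    "X": rng.uniform(0.0, 1000.0, 50),
    "Y": rng.uniform(0.0, 1000.0, 50),
    "Porosity": rng.normal(12.0, 3.0, 50),
    "Unused": np.arange(50),                # a column we do not need
})

fraction_data = 0.2                         # keep 20% of the rows
sub = full.sample(frac=fraction_data, replace=False, random_state=13)
sub = sub.reset_index(drop=True)            # renumber rows 0..n-1, drop the old index
sub = sub.loc[:, ["X", "Y", "Porosity"]]    # retain only the columns we use
print(len(full), len(sub))
print(sub.head())
```

Fixing `random_state` makes the subsample repeatable, so the maps and statistics that follow do not change from run to run.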

Set Limits for Plotting, Colorbars and Grid Specification
Limits are applied for data and model visualization and the grid parameters set the
coverage and resolution of our map.

xmin = 0.0; xmax = 1000.0 # spatial limi


ymin = 0.0; ymax = 1000.0

nx = 100; xmn = 5.0; xsiz = 10.0 # grid specifi


ny = 100; ymn = 5.0; ysiz = 10.0

pormin = 0.0; pormax = 22.0 # feature limi


porvar = np.var(df['Porosity'].values) # assume data

tmin = -9999.9; tmax = 9999.9 # triming limi
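To see how these grid parameters map to coordinates, here is a small sketch. It assumes the GSLIB convention that `xmn` is the center of the first cell, which is consistent with the 0 to 1000 m extents above; the helper `cell_index` is hypothetical.

```python
import numpy as np

nx, xmn, xsiz = 100, 5.0, 10.0      # number of cells, first cell center, cell size

x_centers = xmn + xsiz * np.arange(nx)                      # 5, 15, ..., 995
x_edges = (xmn - 0.5 * xsiz) + xsiz * np.arange(nx + 1)     # 0, 10, ..., 1000

def cell_index(x, mn, siz, n):
    """Index of the cell containing coordinate x, or -1 if x is outside the grid."""
    i = int(np.floor((x - (mn - 0.5 * siz)) / siz))
    return i if 0 <= i < n else -1

print(x_centers[0], x_centers[-1], x_edges[0], x_edges[-1])
print(cell_index(390.8, xmn, xsiz, nx))    # cell holding the first datum's X coordinate
```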

Data Analytics and Visualization


Let’s take a look at the available data:

location map
histogram
variogram



%%capture --no-display

plt.subplot(221) # location map


GSLIB.locmap_st(df,'X','Y','Porosity',0,1000,0,1000,0,25,'Porosity Location

plt.subplot(222) # histogram
plt.hist(df['Porosity'].values,bins=np.linspace(pormin,pormax,30),color='dar
label = 'Porosity')
plt.hist(df['Porosity'].values,bins=np.linspace(pormin,pormax,30),color='dar
label = 'Porosity')
plt.xlabel('Porosity (%)'); plt.ylabel('Frequency'); plt.title('Porosity His

plt.subplot(223) # variogram

lags, gamma_maj, npps_maj = geostats.gamv(df,"X","Y",'Porosity',tmin,tmax,xl


lags, gamma_min, npps_min = geostats.gamv(df,"X","Y",'Porosity',tmin,tmax,xl

nug = 0; nst = 2 # 2 nested str


it1 = 2; cc1 = 20.0; azi1 = 0; hmaj1 = 150; hmin1 = 150
it2 = 2; cc2 = 2.0; azi2 = 0; hmaj2 = 1000; hmin2 = 150

vmodel = GSLIB.make_variogram(nug,nst,it1,cc1,azi1,hmaj1,hmin1,it2,cc2,azi2,
vmodel_sim = GSLIB.make_variogram(nug,nst,it1,cc1/(cc1+cc2),azi1,hmaj1,hmin1

vargplot('Porosity',lags,gamma_maj,gamma_min,npps_maj,npps_min,vmodel,azi=0.
legend_pos='lower right') # plot the var

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=2.1, wspace=0.2, hs
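The nested model and its rescaled copy for simulation can be checked with a short sketch. It assumes that structure type 2 denotes an exponential structure with a practical range, following the GSLIB convention, and it evaluates the model along the major direction only. The listing above is truncated, so rescaling both contributions by `cc1 + cc2` is an assumption, and the function `nested_exp_variogram` is hypothetical.

```python
import numpy as np

def nested_exp_variogram(h, nug, contributions, ranges):
    """Sum of a nugget and exponential structures (assumed form, practical ranges)."""
    h = np.asarray(h, dtype=float)
    gamma = np.where(h > 0.0, nug, 0.0)               # nugget applies for h > 0 only
    for cc, a in zip(contributions, ranges):
        gamma = gamma + cc * (1.0 - np.exp(-3.0 * h / a))
    return gamma

cc1, cc2 = 20.0, 2.0                                  # contributions from the model above
h = np.array([0.0, 150.0, 1000.0, 1.0e6])             # lag distances along the major direction (m)

gamma_major = nested_exp_variogram(h, 0.0, [cc1, cc2], [150.0, 1000.0])

# rescale the contributions so the sill is 1.0, as needed in Gaussian space
total = cc1 + cc2
gamma_std = nested_exp_variogram(h, 0.0, [cc1 / total, cc2 / total], [150.0, 1000.0])
print(np.round(gamma_major, 3))
print(np.round(gamma_std, 3))
```

The shape of the model is unchanged by the rescaling; only the sill moves from 22 to 1, matching the unit variance of normal score transformed data.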



Simple Kriging
Let’s specify the variogram model, global stationary mean and variance, and kriging
parameters.

%%capture --no-display

vrange_maj = 250; vrange_min = 100 # variogram ra


vazi = 150.0 # variogram ma
vrel_nugget = 0.0 # variogram nu

skmean = np.average(df['Porosity'].values) # assume globa


sill = np.var(df['Porosity'].values) # assume sill

por_vario = GSLIB.make_variogram(nug=vrel_nugget*sill,nst=1,it1=1,cc1=(1.0-v
azi1=vazi,hmaj1=vrange_maj,hmin1=vrange_min) # porosity var

ktype = 0 # kriging type


radius = 600 # search radiu
nxdis = 1; nydis = 1 # number of gr
ndmin = 0; ndmax = 10 # minimum and



Now let’s pass this to kriging to make our porosity kriging estimate map.

%%capture --no-display

por_kmap, por_vmap = geostats.kb2d(df,'X','Y','Porosity',tmin,tmax,nx,xmn,xs


ndmin=0,ndmax=10,radius=500,ktype=0,skmean=skmean,vario=vmodel)

plt.subplot(221) # kriging esti


GSLIB.locpix_st(por_kmap,xmin,xmax,ymin,ymax,xsiz,pormin,pormax,df,'X','Y','
'X(m)','Y(m)','Porosity (%)',cmap)

plt.subplot(222) # kriging vari


GSLIB.locpix_st(por_vmap,xmin,xmax,ymin,ymax,xsiz,0,sill,df,'X','Y','X','Sim
'Kriging Variance (%^2)',cmap)

plt.subplot(223) # histograms
plt.hist(df['Porosity'].values,density=True,bins=np.linspace(pormin,pormax,5
edgecolor='black',label='Data',zorder=10)
plt.hist(por_kmap.flatten(),density=True,bins=np.linspace(pormin,pormax,50),
edgecolor='black',label='Kriging',zorder=1)
plt.xlabel('Porosity (%)'); plt.ylabel('Frequency'); plt.title('Porosity His

lags, sk_gamma_maj, npps_maj = geostats.gam(por_kmap,tmin,tmax,xsiz,ysiz,ixd


lags, sk_gamma_min, npps_min = geostats.gam(por_kmap,tmin,tmax,xsiz,ysiz,ixd

plt.subplot(224) # experimenta
vargplot('Porosity',lags,sk_gamma_maj,sk_gamma_min,npps_maj,npps_min,vmodel,
mcolor = 'red', rcolor = 'blue',size= 0.05,legend_pos = 'upper righ

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=2.1, wspace=0.2, hs



Sequential Gaussian Simulation
Let’s jump right to building a variety of models with simulation and visualizing the results.
We will start with a test, a comparison of simulation with simple and ordinary kriging.

%%capture --no-display

run = True # run the real

if run:
    por_sim_one = geostats.sgsim(df,'X','Y','Porosity',wcol=-1,scol=-1,tmin=
        twtcol=0,zmin=pormin,zmax=pormax,ltail=1,ltpar=0.0,utail=1,utpar
        nx=nx,xmn=xmn,xsiz=xsiz,ny=ny,ymn=ymn,ysiz=ysiz,seed=73073,
        ndmin=0,ndmax=20,nodmax=20,mults=0,nmult=2,noct=-1,
        ktype=0,colocorr=0.0,sec_map=0,vario=vmodel_sim)[0]

plt.subplot(221) # pixelplot an
locpix_st(por_sim_one,xmin,xmax,ymin,ymax,xsiz,pormin,pormax,df,'X','Y','Por

plt.subplot(222) # histograms
plt.hist(df['Porosity'].values,density=True,bins=np.linspace(pormin,pormax,3
plt.hist(por_sim_one.flatten(),density=True,bins=np.linspace(pormin,pormax,3
plt.xlabel('Porosity (%)'); plt.ylabel('Frequency'); plt.title('Porosity His
lags, sim_gamma_maj, npps_maj = geostats.gam(por_sim_one,tmin,tmax,xsiz,ysiz
lags, sim_gamma_min, npps_min = geostats.gam(por_sim_one,tmin,tmax,xsiz,ysiz

plt.subplot(223) # variograms
vargplot('Porosity',lags,sim_gamma_maj,sim_gamma_min,npps_maj,npps_min,vmode
mcolor = 'red', rcolor = 'green',size= 0.05,legend_pos = 'lower righ

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=2.1, wspace=0.2, hs
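Sequential Gaussian simulation works in Gaussian space, which is why the simulation variogram above has a sill of one. The sketch below shows one simple way to do a normal score transform and back-transform with the standard library and NumPy. It is an illustration only, not GeostatsPy's transform: it ignores declustering weights, ties and tail extrapolation, and the function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def gaussian_quantiles(n):
    """Standard normal quantiles at the plotting positions (k + 0.5) / n."""
    p = (np.arange(n) + 0.5) / n
    return np.array([NormalDist().inv_cdf(pk) for pk in p])

def normal_score(values):
    """Rank-preserving transform of the data to a standard normal distribution."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    scores = np.empty(len(values))
    scores[order] = gaussian_quantiles(len(values))   # smallest value gets lowest quantile
    return scores

def back_transform(scores, original_values):
    """Map Gaussian values back to data units by interpolating the data quantiles."""
    sorted_vals = np.sort(np.asarray(original_values, dtype=float))
    return np.interp(scores, gaussian_quantiles(len(sorted_vals)), sorted_vals)

por = np.array([5.77, 11.58, 13.81, 11.79, 10.42])    # the five porosity values shown above, rounded
ns = normal_score(por)
por_back = back_transform(ns, por)
print(np.round(ns, 3))
print(por_back)
```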

Visualize Simulated Realizations and a Kriged Estimation Model
%%capture --no-display

run = True # run the real


if run:
    por_sim = geostats.sgsim(df,'X','Y','Porosity',wcol=-1,scol=-1,tmin=tmin
        twtcol=0,zmin=pormin,zmax=pormax,ltail=1,ltpar=0.0,utail=1,utpar
        nx=nx,xmn=xmn,xsiz=xsiz,ny=ny,ymn=ymn,ysiz=ysiz,seed=73073,
        ndmin=0,ndmax=20,nodmax=20,mults=0,nmult=2,noct=-1,
        ktype=0,colocorr=0.0,sec_map=0,vario=vmodel_sim)
plt.subplot(221) # pixelplot an
locpix_st(por_sim[0],xmin,xmax,ymin,ymax,xsiz,pormin,pormax,df,'X','Y','Poro

plt.subplot(222) # pixelplot an
locpix_st(por_sim[1],xmin,xmax,ymin,ymax,xsiz,pormin,pormax,df,'X','Y','Poro

plt.subplot(223) # pixelplot an
locpix_st(por_sim[2],xmin,xmax,ymin,ymax,xsiz,pormin,pormax,df,'X','Y','Poro

plt.subplot(224) # pixelplot an
locpix_st(por_kmap,xmin,xmax,ymin,ymax,xsiz,pormin,pormax,df,'X','Y','Porosi

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=2.1, wspace=0.2, hs
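Once several realizations are available, they can be summarized cell by cell. The sketch below assumes the realizations are stacked in an array of shape (realization, ny, nx). The random array is only a stand-in with no spatial correlation, so it shows the bookkeeping rather than a realistic result.

```python
import numpy as np

rng = np.random.default_rng(73073)
n_real, ny, nx = 50, 20, 20
# stand-in for a stack of simulated porosity realizations (%), no spatial structure
realizations = rng.normal(12.0, 3.0, size=(n_real, ny, nx))

e_type = realizations.mean(axis=0)                   # local average over realizations
local_var = realizations.var(axis=0)                 # local spread over realizations
p10, p90 = np.percentile(realizations, [10, 90], axis=0)   # local percentiles

print(e_type.shape, local_var.shape)
print(round(float(realizations[0].var()), 2), round(float(e_type.var()), 2))
```

The average over realizations is much smoother than any single realization, in the same way that the kriged map is smoother than the simulated maps above.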

Comments
This was a basic demonstration and comparison of spatial estimation vs. spatial simulation
with kriging and sequential Gaussian simulation from GeostatsPy. Much more can be done; I
have other demonstrations for modeling workflows with GeostatsPy in the GitHub repository
GeostatsPy_Demos.

I hope this is helpful,


Michael

About the Author

Professor Michael Pyrcz in his office on the 40 acres, campus of The University of Texas
at Austin.

Michael Pyrcz is a professor in the Cockrell School of Engineering, and the Jackson School
of Geosciences, at The University of Texas at Austin, where he researches and teaches
subsurface, spatial data analytics, geostatistics, and machine learning. Michael is also,

the principal investigator of the Energy Analytics freshmen research initiative and a core
faculty in the Machine Learning Laboratory in the College of Natural Sciences, The
University of Texas at Austin,
an associate editor for Computers and Geosciences, and a board member for
Mathematical Geosciences, the International Association for Mathematical
Geosciences.

Michael has written over 70 peer-reviewed publications and a Python package for spatial data
analytics, co-authored a textbook on spatial data analytics, Geostatistical Reservoir
Modeling, and authored two recently released e-books, Applied Geostatistics in Python: a
Hands-on Guide with GeostatsPy and Applied Machine Learning in Python: a Hands-on
Guide with Code.

All of Michael’s university lectures are available on his YouTube Channel with links to 100s
of Python interactive dashboards and well-documented workflows in over 40 repositories on
his GitHub account, to support any interested students and working professionals with
evergreen content. To find out more about Michael’s work and shared educational resources
visit his Website.

Want to Work Together?


I hope this content is helpful to those that want to learn more about subsurface modeling,
data analytics and machine learning. Students and working professionals are welcome to
participate.

Want to invite me to visit your company for training, mentoring, project review, workflow
design and / or consulting? I’d be happy to drop by and work with you!
Interested in partnering, supporting my graduate student research or my Subsurface
Data Analytics and Machine Learning consortium (co-PI is Professor John Foster)? My
research combines data analytics, stochastic modeling and machine learning theory
with practice to develop novel methods and workflows to add value. We are solving
challenging subsurface problems!
I can be reached at [email protected].

I’m always happy to discuss,

Michael

Michael Pyrcz, Ph.D., P.Eng. Professor, Cockrell School of Engineering and The Jackson
School of Geosciences, The University of Texas at Austin

