Geostatistical Analyst
Course Code:    Course Title:    Date of Submission:

Submitted by:
Name: Palash Kumar Baidya
Student ID: 16ESD242
Year: MSc
Semester: 1st
Session: 2016-2017
Department of Environmental Science & Disaster Management
BSMRSTU, Gopalganj-8100

Submitted to:
Name: Samsunnahar Popy
Assistant Professor
Department of Environmental Science & Disaster Management
BSMRSTU, Gopalganj-8100
Introduction
Geostatistical Analyst can create a continuous surface, or map, from measured sample points stored in a point feature layer or raster layer, or from polygon centroids. Geostatistics is a class of statistics used to analyze and predict the values associated with spatial or spatiotemporal phenomena; it incorporates the spatial (and in some cases temporal) coordinates of the data within the analyses. Many geostatistical tools were originally developed as a practical means to describe spatial patterns and interpolate values for locations where samples were not taken. Those tools and methods have since evolved to provide not only interpolated values but also measures of uncertainty for those values. Measuring uncertainty is critical to informed decision making, as it provides information on the range of possible values (outcomes) for each location rather than a single interpolated value. These capabilities are especially helpful for atmospheric data analysis, petroleum and mining exploration, environmental analysis, precision agriculture, and fish and wildlife studies.
The tools featured in the ArcGIS Geostatistical Analyst extension are categorized as follows:

- The Geostatistical Wizard, which guides you through the process of constructing and evaluating an interpolation model
- The exploratory spatial data analysis (ESDA) tools, which help you investigate the properties of your data before interpolation
- The Geostatistical Analyst toolbox, which houses geoprocessing tools specifically designed to extend the capabilities of the Geostatistical Wizard and allow further analysis of the surfaces it generates
Before using the interpolation techniques, you should explore your data with the exploratory spatial data analysis (ESDA) tools. These tools allow you to gain insight into your data and to select the most appropriate method and parameters for the interpolation model. For example, when using ordinary kriging to produce a quantile map, you should examine the distribution of the input data, because this particular method assumes that the data is normally distributed. The ESDA tools are accessed through the Geostatistical Analyst toolbar (shown below) and are composed of the following: the Histogram, Voronoi Map, Normal QQ Plot, Trend Analysis, Semivariogram/Covariance Cloud, General QQ Plot, and Crosscovariance Cloud tools.
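Outside ArcGIS, the kind of distribution check the ESDA Histogram tool supports can be sketched in a few lines of Python. This hypothetical example computes the sample skewness of a set of measurements; a strongly right-skewed distribution would suggest trying a log transformation before ordinary kriging. The variable name, data values, and the |skewness| > 1 rule of thumb are illustrative assumptions, not part of the tool.

```python
import math

def skewness(values):
    """Sample skewness: positive values indicate a right-skewed distribution."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return (n / ((n - 1) * (n - 2))) * sum(((v - mean) / sd) ** 3 for v in values)

# Illustrative ozone-like measurements (hypothetical data, not from the study)
ozone = [0.04, 0.05, 0.05, 0.06, 0.07, 0.07, 0.08, 0.10, 0.15, 0.30]

skew = skewness(ozone)
# Common rule of thumb: |skewness| > 1 suggests trying a log transform
needs_log = abs(skew) > 1
log_ozone = [math.log(v) for v in ozone] if needs_log else ozone
print(f"skewness = {skew:.2f}, log transform suggested: {needs_log}")
```

A near-zero skewness after transformation would support the normality assumption that the quantile maps mentioned above rely on.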
Geostatistical Wizard
The Geostatistical Wizard is a dynamic set of pages that is designed to guide you through the
process of constructing and evaluating the performance of an interpolation model. Choices made
on one page determine which options will be available on the following pages and how you interact
with the data to develop a suitable model. The wizard guides you from the point when you choose
an interpolation method all the way to viewing summary measures of the model's expected
performance.
During construction of an interpolation model, the wizard allows you to change parameter values, suggests or provides optimized parameter values, and lets you move forward or backward in the process to assess the cross-validation results and decide whether the current model is satisfactory or some of the parameter values should be modified.
The Geostatistical Analyst toolbox includes tools for analyzing data, producing a variety of output surfaces, examining and exporting geostatistical layers to other formats, performing geostatistical simulation and sensitivity analysis, and aiding in the design of sampling networks. The tools are grouped into five toolsets: Interpolation, Sampling Network Design, Simulation, Utilities, and Working with Geostatistical Layers.
While cross-validation is provided for all methods available in the Geostatistical Wizard and
can also be run for any geostatistical layer using the Cross Validation geoprocessing tool, a
more rigorous way to assess the quality of an output surface is to compare predicted values
with measurements that were not used to construct the interpolation model. As it is not always
possible to go back to the study area to collect an independent validation dataset, one solution
is to divide the original dataset into two parts. One part can be used to construct the model and
produce a surface. The other part can be used to compare and validate the output surface. The
Subset Features tool enables you to split a dataset into training and test datasets.
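The split-and-validate workflow that Subset Features supports can be sketched outside ArcGIS. In this hypothetical Python example, randomly generated sample points are divided 80/20 into training and test sets, predictions at the held-out locations are made with a simple inverse-distance-weighted predictor (standing in for whatever interpolation model was built), and the RMSE against the held-out measurements summarizes the model's quality. The data, split ratio, and predictor are illustrative assumptions, not the tool's implementation.

```python
import math
import random

def idw_predict(x, y, samples, power=2):
    """Inverse distance weighted prediction at (x, y) from (xi, yi, zi) samples."""
    num = den = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return sz                      # exact hit: return the measured value
        w = 1.0 / d ** power
        num += w * sz
        den += w
    return num / den

# Hypothetical sample points: (x, y, measured value)
random.seed(42)
points = [(random.random(), random.random(), random.random() * 10) for _ in range(50)]

# Split into training (80%) and test (20%) parts, as Subset Features would
random.shuffle(points)
cut = int(len(points) * 0.8)
train, test = points[:cut], points[cut:]

# Validate: compare predictions at test locations with the held-out measurements
errors = [idw_predict(x, y, train) - z for x, y, z in test]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(f"validation RMSE on {len(test)} held-out points: {rmse:.3f}")
```

The key property of this check, unlike cross-validation, is that the test points play no part in building the model.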
Geostatistical Analyst Workflow
In this topic, a generalized workflow for geostatistical studies is presented, and the main steps are
explained. As noted in the introduction, geostatistics is a class of statistics used to analyze and predict the values associated with spatial or spatiotemporal phenomena. Geostatistical Analyst
provides a set of tools that allow models that use spatial coordinates to be constructed. These
models can be applied to a wide variety of scenarios and are typically used to generate predictions
for unsampled locations, as well as measures of uncertainty for those predictions.
The first step, as in almost any data-driven study, is to closely examine the data. This typically
starts by mapping the dataset, using a classification and color scheme that allow clear visualization
of important characteristics that the dataset might present, for example, a strong increase in values
from north to south or a mix of high and low values in no particular arrangement (possibly a sign
that the data was taken at a scale that does not show spatial correlation).
The second stage is to build the geostatistical model. This process can entail several steps,
depending on the objectives of the study (that is, the types of information the model is supposed
to provide) and the features of the dataset that have been deemed important enough to incorporate.
At this stage, information collected during a rigorous exploration of the dataset and prior
knowledge of the phenomenon determine how complex the model is and how good the interpolated
values and measures of uncertainty will be. In the figure above, building the model can involve
preprocessing the data to remove spatial trends, which are modeled separately and added back in
the final step of the interpolation process. While a lot of information can be derived by examining
the dataset, it is important to incorporate any knowledge you might have of the phenomenon. The
modeler cannot rely solely on the dataset to show all the important features; those that do not
appear can still be incorporated into the model by adjusting parameter values to reflect an expected
outcome. It is important that the model be as realistic as possible in order for the interpolated
values and associated uncertainties to be accurate representations of the real phenomenon.
In addition to preprocessing the data, it may be necessary to model the spatial structure (spatial correlation) in the dataset. Some methods, such as kriging, require this structure to be modeled explicitly using semivariogram or covariance functions, whereas other methods, such as inverse distance weighting, rely on an assumed degree of spatial structure, which the modeler must provide based on prior knowledge of the phenomenon.
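The spatial structure that kriging models explicitly is usually first estimated as an empirical semivariogram: half the mean squared difference between pairs of values, binned by separation distance. This is a minimal sketch of that computation; the bin width and the synthetic transect data are assumptions for illustration.

```python
import math

def empirical_semivariogram(samples, bin_width, n_bins):
    """gamma(h): average of 0.5*(zi - zj)^2 over point pairs binned by distance h."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            h = math.hypot(xi - xj, yi - yj)
            b = int(h / bin_width)
            if b < n_bins:
                sums[b] += 0.5 * (zi - zj) ** 2
                counts[b] += 1
    # Return (bin center, mean semivariance) for each non-empty bin
    return [(b * bin_width + bin_width / 2, sums[b] / counts[b])
            for b in range(n_bins) if counts[b] > 0]

# Hypothetical transect: values vary smoothly with x, so nearby points are similar
samples = [(x * 0.1, 0.0, math.sin(x * 0.1) * 5) for x in range(30)]
gamma = empirical_semivariogram(samples, bin_width=0.5, n_bins=5)
for h, g in gamma:
    print(f"lag {h:.2f}: semivariance {g:.3f}")
```

For spatially correlated data the semivariance rises with lag distance, which is exactly the behavior a fitted semivariogram model then describes.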
A final component of the model is the search strategy. This defines how many data points are used
to generate a value for an unsampled location. Their spatial configuration (location with respect to
one another and to the unsampled location) can also be defined. Both factors affect the interpolated
value and its associated uncertainty.
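A basic search strategy can be sketched as a k-nearest-neighbor query: only the closest measured points take part in the prediction for an unsampled location. The neighbor count and the sample grid below are illustrative assumptions, not Geostatistical Analyst's actual searching code.

```python
import math

def nearest_neighbors(x, y, samples, k=4):
    """Return the k sample points closest to the unsampled location (x, y)."""
    return sorted(samples, key=lambda s: math.hypot(x - s[0], y - s[1]))[:k]

# Hypothetical sample grid: (x, y, value)
samples = [(i, j, i + j) for i in range(5) for j in range(5)]

# Only the 4 nearest neighbors contribute to the prediction at (2.4, 2.6)
hood = nearest_neighbors(2.4, 2.6, samples, k=4)
print(hood)
```

Real search strategies also constrain the neighbors' spatial configuration, for example requiring points in several angular sectors so one cluster does not dominate the prediction.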
Once the model has been completely defined, it can be used in conjunction with the dataset to
generate interpolated values for all unsampled locations within an area of interest. The output is
usually a map showing values of the variable being modeled. The effect of outliers can be
investigated at this stage, as they will probably change the model's parameter values and thus the
interpolated map. Depending on the interpolation method, the same model can also be used to
generate measures of uncertainty for the interpolated values. Not all models have this capability, so it is important to define at the outset whether measures of uncertainty are needed, as this determines which models are suitable.
As with all modeling endeavors, the model's output should be checked: make sure that the interpolated values and associated measures of uncertainty are reasonable and match your expectations. Once the model has been satisfactorily built and adjusted, and its output checked, the results can be used in risk analysis and decision making.
Uses of Geostatistical Analyst
Geostatistics is widely used in many areas of science and engineering, for example:
- The mining industry uses geostatistics for several aspects of a project: initially to quantify mineral resources and evaluate the project's economic feasibility, then on a daily basis to decide which material is routed to the plant and which is waste, using updated information as it becomes available.
- In the environmental sciences, geostatistics is used to estimate pollutant levels in order to decide whether they pose a threat to environmental or human health and warrant remediation.
- Relatively new applications in the field of soil science focus on mapping soil nutrient levels (nitrogen, phosphorus, potassium, and so on) and other indicators (such as electrical conductivity) in order to study their relationships to crop yield and prescribe precise amounts of fertilizer for each location in the field.
- Meteorological applications include the prediction of temperatures, rainfall, and associated variables (such as acid rain).
- Most recently, there have been several applications of geostatistics in the area of public health, for example, the prediction of environmental contaminant levels and their relation to cancer incidence rates.
In all of these examples, the general context is that there is some phenomenon of interest occurring
in the landscape (the level of contamination of soil, water, or air by a pollutant; the content of gold
or some other metal in a mine; and so forth). Exhaustive studies are expensive and time consuming,
so the phenomenon is usually characterized by taking samples at different locations. Geostatistics
is then used to produce predictions (and related measures of uncertainty of the predictions) for the
unsampled locations.
The Geostatistical Analyst addresses a wide range of different application areas. The following is
a small sampling of applications in which Geostatistical Analyst was used.
Exploratory spatial data analysis
Using measured sample points from a study area, Geostatistical Analyst was used to create
accurate predictions for other unmeasured locations within the same area. Exploratory spatial data
analysis tools included with Geostatistical Analyst were used to assess the statistical properties of
data such as spatial data variability, spatial data dependence, and global trends.
A number of exploratory spatial data analysis tools were used in the example below to investigate
the properties of ozone measurements taken at monitoring stations in the Carpathian Mountains.
Semivariogram modeling
Geostatistical analysis of data occurs in two phases: modeling the semivariogram or covariance function to quantify the spatial structure of the data, and then using that model to produce a prediction surface. A number of kriging methods are available for surface creation in Geostatistical Analyst, including ordinary, simple, universal, indicator, probability, and disjunctive kriging. The following illustrates the two phases: first, the Semivariogram/Covariance wizard was used to fit a model to winter temperature data for the United States; this model was then used to create the temperature distribution map.
Figure 6: Geostatistical Analyst application for winter temperature
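The two phases can be sketched end to end in plain Python: a fitted semivariogram model (here a spherical model) is plugged into the ordinary kriging system, whose solution weights the neighboring observations. The nugget, sill, range, and data points are illustrative assumptions, not values fitted to the temperature data above.

```python
import math

def spherical(h, nugget=0.0, sill=1.0, rng=2.0):
    """Spherical semivariogram model: rises from the nugget to the sill at the range."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for the small kriging system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c and m[r][c]:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ordinary_kriging(x, y, samples):
    """Predict at (x, y); a Lagrange multiplier forces the weights to sum to 1."""
    n = len(samples)
    a = [[spherical(math.hypot(samples[i][0] - samples[j][0],
                               samples[i][1] - samples[j][1]))
          for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [spherical(math.hypot(x - sx, y - sy)) for sx, sy, _ in samples] + [1.0]
    w = solve(a, b)[:n]
    return sum(wi * sz for wi, (_, _, sz) in zip(w, samples))

# Hypothetical temperature-like observations: (x, y, value)
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 11.0), (1.0, 1.0, 13.0)]
print(f"predicted value at (0.5, 0.5): {ordinary_kriging(0.5, 0.5, samples):.2f}")
```

For a point at the center of a symmetric square of observations, the weights come out equal, so the prediction is simply the mean of the four values.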
The following shows Geostatistical Analyst used to produce a prediction map of radiocesium soil
contamination levels in the country of Belarus after the Chernobyl nuclear power plant accident.
Threshold mapping
Probability maps can be generated to predict where values exceed a critical threshold. In the
example below, locations shown in dark orange and red indicate a probability greater than 62.5
percent that radiocesium contamination exceeds the upper permissible level (critical threshold) in
forest berries.
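Under the common assumption that the kriging prediction error is normally distributed, the exceedance probability behind such a map can be computed per cell from the predicted value and its standard error. The threshold and numbers below are purely illustrative, not the actual values from the Belarus study.

```python
import math

def exceedance_probability(predicted, std_error, threshold):
    """P(true value > threshold), assuming a normally distributed prediction error."""
    z = (threshold - predicted) / std_error
    # 1 - standard normal CDF, via the error function
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical cell: predicted contamination 14.2 units with standard error 1.5,
# against a permissible level (critical threshold) of 15.0
p = exceedance_probability(14.2, 1.5, 15.0)
print(f"probability of exceeding the threshold: {p:.1%}")
```

Coloring each cell by this probability, rather than by the predicted value itself, is what turns a prediction surface into a threshold map.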
Surface prediction using cokriging
Cokriging, an advanced surface modeling method included in Geostatistical Analyst, can be used
to improve surface prediction of a primary variable by taking into account secondary variables,
provided that the primary and secondary variables are spatially correlated.
In the following example, exploratory spatial data analysis tools are used to explore spatial
correlation between ozone (primary variable) and nitrogen dioxide (secondary variable) in
California. Because the variables are spatially correlated, cokriging can use the nitrogen dioxide
data to improve predictions when mapping ozone.
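Before attempting cokriging, it is worth confirming that the primary and secondary variables really are correlated. A plain Pearson correlation on collocated measurements is one quick check; the paired values below are illustrative, not the California dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two collocated variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical collocated measurements at monitoring stations
ozone = [0.04, 0.06, 0.05, 0.09, 0.07, 0.11, 0.08, 0.10]   # primary variable
no2   = [0.02, 0.03, 0.03, 0.05, 0.04, 0.06, 0.04, 0.05]   # secondary variable

r = pearson_r(ozone, no2)
print(f"Pearson r = {r:.2f}")
```

A strong correlation like this supports using the secondary variable in cokriging; cokriging proper goes further and models the cross-covariance between the variables as a function of distance.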
Conclusions
Geostatistical Analyst serves as a bridge between geostatistics and GIS. GIS informs well placement and field location, while Geostatistical Analyst provides insight into trends in formation properties. Freely available data is useful, but rarely sufficient on its own. Other kriging software packages provide only a few geostatistical tools; Geostatistical Analyst makes the full set of tools available and integrates them within the GIS modeling environment. The most important feature of this integration is that it is now possible to quantify the quality of prediction surface models by measuring the statistical error of the predicted surfaces.