Spatial Interpolation - Perdinan
GEO: 874 – On-line Course Outline Perdinan
The following terms and concepts are introduced in this lesson. You are responsible for
understanding these terms and concepts as you may be tested about them in the exam.
- spatial interpolation
- nearest neighbor
- IDW
- TIN
- Delaunay triangulation
- kriging
- spline interpolation
- saddle points
- edge effects
Figure 1. A specific location (X) within a continuous surface of temperature (left), climate
stations (middle) and grids of a climate model (right)
Do you notice anything interesting? For example, how was the surface temperature data created?
How can we find the temperature at location X, which lies among several climate
stations (middle image) and grid points (right image)?
When we are dealing with such questions, spatial interpolation is a powerful analysis that can be
used to create a continuous surface and to find information at a particular location based on
information from the currently available dataset. As you may guess, spatial interpolation was applied
to create the surface temperature data presented in Figure 1, and it could be applied to obtain
information at location X based on information from nearby climate stations (Figure 1-middle) and
grid points (Figure 1-right).
2 4 6 .?. 10 12 14 .?. 18 20 22
Can you estimate the missing values? Indeed, it is very easy. The missing values are 8 and 16. In
this case, we apply linear interpolation to find values at particular points. In the spatial context,
this concept can also be applied to create an isoline map based on the distance between two
points. For example, you can create contours for temperature (Figure 2-right) based on available
measurements recorded by weather stations (Figure 2-left).
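The linear interpolation used to fill the two gaps above can be sketched in a few lines. This is a minimal illustration only; the function name `linear_interpolate` and the use of list positions as the x-coordinates are assumptions made for the example.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    t = (x - x0) / (x1 - x0)  # fractional position of x between x0 and x1
    return y0 + t * (y1 - y0)

# The sequence from the text; None marks the two missing values.
sequence = [2, 4, 6, None, 10, 12, 14, None, 18, 20, 22]
filled = list(sequence)
for i, value in enumerate(sequence):
    if value is None:
        # Interpolate between the immediate left and right neighbors,
        # using the list positions as x-coordinates (an assumption).
        filled[i] = linear_interpolate(i - 1, sequence[i - 1],
                                       i + 1, sequence[i + 1], i)

print(filled[3], filled[7])  # the two estimated values
```

The same formula applies along the line between two weather stations: the position of an isoline between them is found from the fractional distance `t`.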
Figure 2. Locations of weather stations (left) and temperature contours for Wisconsin
In GIS applications, many spatial interpolation approaches are available for predicting values at
unsampled locations based on available measurements. Laurini and Thompson (1992)
distinguish five categories of interpolation methods: nearest value, linear interpolation, spline
interpolation, stochastic interpolation and model-based interpolation (Figure 3). Nearest value
interpolation estimates the value at an unknown location from the value of the nearest known
point. Linear interpolation fits a straight line between two points. Spline interpolation uses
three or more points to predict values at unknown locations. Stochastic and model-based
interpolation use stochastic models and sophisticated statistical models, respectively, to assign
values to the unknown locations. In a GIS environment, these methods are mostly used to
transform irregular point or line data to raster or to alter raster resolution (Mitas & Mitasova, 1999).
For example, a raster dataset can be produced from point values of precipitation distributed
unevenly over a particular region (Figure 4). The unknown values are estimated using a
mathematical formula that derives them from the values of nearby known points. We
will discuss the interpolation methods most widely used in GIS applications in more detail in
the next section.
Figure 4. Conversion of point values into a raster dataset. Source: ESRI (2010)
Spatial interpolation within a GIS environment is frequently applied in a wide range of fields,
such as climatology, environmental studies and socioeconomics, because we rarely have sufficient
measurements. This is the case because field observations are relatively expensive in terms
of time and money. For example, only a limited number of weather stations (Figure 5-left) can be
used for assessing the impact of climate fluctuations on crop production at the county level
(Figure 5-middle), since maintaining weather stations is quite costly.
Physical and social phenomena are also frequently measured or sampled at particular locations
that may be irregularly distributed over space (Mitas & Mitasova, 1999). In addition, we are
often dealing with heterogeneous datasets with different resolutions, such as site observations,
regional-scale measurements and computer model outputs (Figure 5). Interpolation techniques make
it possible to combine the different data resolutions that may be required for a particular study.
For example, when we want to assess the impact of climate change at the county level, we can
assign climate stations (Figure 5-left) to each county (Figure 5-middle) using a particular
interpolation technique such as the nearest neighbor. You can find more information about this
technique in the next section.
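As a rough sketch of the station-to-county assignment described above, the nearest neighbor rule simply takes the value of the closest station. The station coordinates, temperatures and county centroids below are hypothetical, and representing each county by a single centroid is a simplifying assumption.

```python
import math

def nearest_neighbor(stations, target):
    """Return the value of the station closest to target, an (x, y) tuple."""
    closest = min(stations, key=lambda s: math.dist((s[0], s[1]), target))
    return closest[2]

# Hypothetical stations: (x, y, temperature)
stations = [(0.0, 0.0, 12.0), (10.0, 0.0, 15.0), (5.0, 8.0, 9.0)]

# Hypothetical county centroids (an assumption; real counties are polygons)
counties = {"A": (1.0, 1.0), "B": (9.0, 2.0), "C": (5.0, 6.0)}

assigned = {name: nearest_neighbor(stations, xy) for name, xy in counties.items()}
print(assigned)
```

Each county receives exactly one station value, so the result changes abruptly at the boundaries between station neighborhoods.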
Figure 5. Location of weather stations (site observations), crop production at county scale in
Wisconsin (regional scale measurement) and outputs of a regional climate model
(computer modeling)
Figure 7. Outputs of various interpolation methods, i.e. IDW (left), spline (middle), kriging
(right), produced using the same input data
Interpolation methods can generally be classified into deterministic and geostatistical methods.
A deterministic interpolator employs a mathematical formula to calculate values at unknown
locations based on values at known locations (ESRI, 2010). This group includes inverse
distance weighting (IDW), natural neighbor, trend and spline. Geostatistical methods utilize
statistical information derived from values at known locations and their spatial patterns to
estimate values and spatial arrangements at unknown locations (ESRI, 2010). Examples of this
group are kriging and trend surface analysis (PSU, 2011). Interpolation methods can also be
categorized into global and local interpolators (PSU, 2011). Global interpolators, such as trend
surface analysis, employ information from all known points to estimate values at unknown
locations. Local interpolators, on the other hand, use only values from nearby known
points to estimate values at the unknown points. Nearest neighbor and IDW belong to this
type.
In this course, we will introduce in more detail the general assumptions behind the most widely
used interpolation methods, namely Nearest Neighbor, IDW, Triangulated Irregular Network
(TIN), Kriging and Spline Interpolation. Those who are interested in the detailed formulation of
these techniques or in additional interpolation methods can refer to the GIS books and websites
listed in the Further Readings section.
Notice that, under this assumption, the spatial arrangement of the known points used for
estimating the unknown point's value is not considered in the calculation. As a consequence, the
interpolation results can vary depending on the cut-off distance or the number of
nearest neighboring points employed in the calculation. It is also important to note that the values
calculated for the unknown locations will never fall outside the range (i.e. between the minimum
and maximum values) of the known points used in the calculation. Therefore, careful attention
should be given to the accuracy of the interpolated values, as this method may not be appropriate
in some locations such as mountainous or valley regions.
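The behavior described above can be sketched as follows, assuming the method under discussion is inverse distance weighting (IDW) with weights of the form 1/d^p. The function name `idw`, its parameters `power`, `k` and `cutoff`, and the sample points are all hypothetical choices for illustration.

```python
import math

def idw(points, target, power=2.0, k=None, cutoff=None):
    """Inverse distance weighted estimate at target from (x, y, value) points."""
    # Distance from the target to every known point, nearest first.
    ranked = sorted((math.dist((x, y), target), v) for x, y, v in points)
    if cutoff is not None:
        ranked = [(d, v) for d, v in ranked if d <= cutoff]  # cut-off distance
    if k is not None:
        ranked = ranked[:k]  # keep only the k nearest points
    if not ranked:
        raise ValueError("no known points within the search neighborhood")
    if ranked[0][0] == 0.0:
        return ranked[0][1]  # target coincides with a known point
    weights = [1.0 / d ** power for d, _ in ranked]
    return sum(w * v for w, (_, v) in zip(weights, ranked)) / sum(weights)

# Hypothetical known points: (x, y, value)
points = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0), (0.0, 3.0, 30.0), (10.0, 10.0, 100.0)]
target = (1.0, 1.0)

print(idw(points, target))              # all points
print(idw(points, target, k=2))         # two nearest points only
print(idw(points, target, cutoff=4.0))  # points within a distance of 4
```

Because the estimate is a weighted average with positive weights, it always lies between the smallest and largest values of the points used, which is why the method cannot reproduce an unsampled peak or valley.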
To illustrate the effects of these choices, we can experiment with different distance exponents
(Figure 10). A larger distance exponent gives nearby points more weight, so local detail is more
pronounced, while a smaller exponent spreads the weight more evenly and gives a smoother result.
Likewise, the interpolation result is smoother as the cut-off distance is larger. This is
because more points located farther away from the unknown locations are included in the
calculation. Consequently, the influence of nearby points decreases. A similar situation
occurs when we include more neighboring points in the calculation (Figure 11). The
extreme rainfall conditions (brown colors) appear more clearly when we apply a small
number of nearest points or a shorter cut-off distance in the interpolation.
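To make the role of the distance exponent concrete, the short sketch below prints the normalized weights that three known points would receive under exponents one, two and three. It assumes weights of the form 1/d^p; the function name `idw_weights` and the distances are hypothetical.

```python
def idw_weights(distances, power):
    """Normalized inverse-distance weights for the given distance exponent."""
    raw = [1.0 / d ** power for d in distances]
    total = sum(raw)
    return [w / total for w in raw]

distances = [1.0, 2.0, 4.0]  # hypothetical distances to three known points
for power in (1, 2, 3):
    weights = idw_weights(distances, power)
    print(power, [round(w, 3) for w in weights])
```

With these distances, the share of the nearest point grows from roughly 0.57 at exponent one to roughly 0.88 at exponent three, so higher exponents make the surface follow the nearest observations more closely.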
Figure 10. Interpolation of rainfall using various distance exponents: one (left), two (middle) and
three (right). Source: PSU (2011)
Figure 11. Interpolation of rainfall using various nearest points: 5 (left), 12 (middle) and 25
(right). Source: PSU (2011)
Figure 12. Delaunay triangulation with circumcircles around the red sample data (left) and
interpolated TIN surface created from elevation vector points (right). Source: left
image GIS for Educators (2011) and right image Mitas and Mitasova (1999)
Kriging
Kriging is an interpolation method that uses a geostatistical approach. The basic premise of
kriging is similar to that of IDW, as kriging also estimates values at unknown points using a
weighted average of the values at neighboring locations (Figure 13-left). However, kriging derives
the weights from semivariances (γ) rather than from distance alone. Kriging assumes that spatial
autocorrelation is present among the values measured at multiple points over a particular space.
These correlations are then modeled as a function of the distance and direction between available
points, using variogram models to explain the variation in the surface. This differs from IDW,
which assumes that the weight of a control point depends only on its separation distance.
Kriging is a multi-step process. The first step is to calculate semivariances for all
pairs of known points. Using a graph called a semivariogram, we plot the semivariances against
the distances for all pairs and fit a function, here a spherical function, to them. Based on this
fit, we can determine the important components of the graph, namely the nugget, sill and range
(Figure 13-right). The spherical function is then applied to calculate semivariances based on the
distances between an unknown point and its known neighboring points. These semivariances are then
used to solve a system of linear equations that produces a set of weights (λs) with which the
value at the unknown location can be estimated. With this approach, the predicted values are
produced with the smallest expected estimation error under the fitted model.
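The steps above can be sketched in plain Python. This is a minimal illustration of ordinary kriging with a spherical semivariogram, not a production implementation: the fitting step is skipped, so the nugget, sill and range values are simply assumed, and all function names and sample points are hypothetical.

```python
import math

def pair_semivariances(points):
    """Step 1: (distance, semivariance) for every pair of known points."""
    return [(math.dist(p[:2], q[:2]), 0.5 * (p[2] - q[2]) ** 2)
            for i, p in enumerate(points) for q in points[i + 1:]]

def spherical(h, nugget, sill, rng):
    """Spherical semivariogram model; zero at zero separation."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(points, target, nugget, sill, rng):
    """Return (estimate, weights) at target from (x, y, value) points."""
    n = len(points)
    xy = [p[:2] for p in points]

    def gamma(p, q):
        return spherical(math.dist(p, q), nugget, sill, rng)

    # Semivariances between known points, bordered by a row and column of
    # ones that force the weights to sum to one.
    a = [[gamma(xy[i], xy[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(p, target) for p in xy] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * p[2] for w, p in zip(weights, points)), weights

# Hypothetical known points on the corners of a square: (x, y, value)
points = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 14.0), (2.0, 2.0, 18.0)]
print(pair_semivariances(points))

# Assumed semivariogram parameters (in practice read from the fitted curve).
estimate, weights = ordinary_kriging(points, (1.0, 1.0), nugget=0.0, sill=10.0, rng=5.0)
print(estimate, weights)
```

In this symmetric layout the target is equally far from all four points, so each receives the same weight. With irregularly spaced data the weights differ, because clustered points share their influence.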
Figure 13. An illustration of the kriging method (left) and a semivariogram (right). The nugget is
the value at which the fitted spherical curve meets the Y-axis. The sill is the highest level of
semivariance reached by the fitted curve. The range is the distance at which the
semivariance levels off. Source: PSU (2011)
If we compare the process of estimating values at unknown locations between kriging and IDW,
it appears that IDW applies a simpler approach than kriging. However, the more complex process of
kriging interpolation allows the spatial structure of our data to contribute to the
weighted average of known points. This approach may be preferable to the subjective decisions that
must be made in IDW interpolation, as explained above. The surface maps created
using these two approaches show that kriging produces more circular contour patterns than
IDW (Figure 14).
Figure 14. Surface maps created using IDW (left) and Kriging (right). The number of
neighboring points is identical for the two maps. Source: PSU (2011)
Currently, there are many kriging methods such as ordinary kriging, simple kriging, kriging with
an external drift and indicator kriging. In this course, we only discuss general procedures applied
in kriging. Please refer to other references suggested in further readings particularly Mitas and
Mitasova (1999) for further information on kriging.
Spline Interpolation
Spline is categorized as a deterministic interpolation method. This method uses a mathematical
function to fit the input data, resulting in a smooth surface. Spline fits sample points using
polynomial and least-squares methods in order to minimize errors. Whereas linear
interpolation uses a linear function to fit the input points in each interval of x (i.e. between x
and x+1), spline employs low-degree polynomials that pass exactly through (or close to) the data
points and join together smoothly (Figure 15). As an example, if we fit the points in Figure 15
(left) using a cubic spline, each interval will have its own function to fit the pair of points
(Figure 16). Compared to linear interpolation, spline interpolation is likely to produce a smaller
error.
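The idea of one cubic function per interval can be sketched with a natural cubic spline, which is one common variant (an assumption here; the lesson does not say which boundary conditions Figure 16 uses). The function names and the sample data are hypothetical.

```python
from bisect import bisect_right

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1  # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the quadratic coefficients.
    l, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution gives one cubic (ys[j], b[j], c[j], d[j]) per interval.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)  # interval containing x
        t = x - xs[j]
        return ys[j] + b[j] * t + c[j] * t ** 2 + d[j] * t ** 3

    return evaluate

def piecewise_linear(xs, ys, x):
    """Straight-line interpolation between the two points that bracket x."""
    j = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[j]) / (xs[j + 1] - xs[j])
    return ys[j] + t * (ys[j + 1] - ys[j])

# Hypothetical sample points
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.8, 0.9, 0.1, -0.8]
spline = natural_cubic_spline(xs, ys)
print(spline(1.5), piecewise_linear(xs, ys, 1.5))
```

Both interpolators pass through the data points. They differ between the points, where the spline curves smoothly while the linear version has a sharp change of slope at each point.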
Figure 15. An illustration of spline interpolation (left) and linear interpolation (right)
Figure 16. An illustration of cubic spline functions to fit the data points in Figure 15
In GIS applications, the same concept is applied to fit values from a specified number of
nearest known points in order to predict the value at an unknown point. This technique is
frequently applied to generate gently varying surfaces such as elevation, water table, temperature
and soil properties. Figure 17 shows a side-by-side comparison of the interpolation outputs using
spline and IDW. As you can see, spline interpolation produces smoother contours than IDW. Spline
interpolation also depicts different patterns surrounding the peak values compared to IDW
interpolation, where spline produces gradual changes around the peak values.
Figure 17. Outputs of spline and IDW interpolation. Source: ESRI (2009)
As with many other statistical methods, the accuracy of interpolation results generally increases
with the number of data points used. The distribution of sample locations also significantly
affects the outcomes of spatial interpolation. For example, the interpolation results for
mountainous regions and valleys may not be accurate if no or insufficient data points were
collected in these locations. As you can see from Figure 18, different numbers of data points used
in creating the rainfall surface yield different images. Major features of the rainfall pattern are
depicted less well as the number of data points used in the interpolation decreases. In this case,
the interpolated surface produced using the fewest data points (Figure 18-right) fails to capture
the rainfall patterns in the region. However, you should still bear in mind that using more data
points may increase the time needed to complete the interpolation. Arguably, this is less of a
concern today, as advances in computing power in recent decades have largely solved this
problem, unless you are dealing with a very large dataset such as the outputs of climate models for
the entire globe. In that case, you may need to define your region of interest so that you apply
interpolation only to that region.
Figure 18. Outputs of interpolation using different number of data points. Source: PSU (2011)
Furthermore, an insufficient number of sample points near the border of your study
region, known as the edge effect, can reduce the accuracy of interpolation outputs. For example,
suppose you want to interpolate your data points within the red square box. In the first
interpolation, you use only the data points within the red box (Figure 19-left), while in the
second you use all data points for the region. In the first case, values at unknown locations near
the edge of your region of interest must be extrapolated, as there is no information
from outside the boundary. In the second case, you include the data points from
outside the boundary in the interpolation. This inclusion mitigates the edge effect and gives you
better accuracy.
Figure 19. Outputs of interpolation using different number of data points. Source: PSU (2011)
Another challenge in interpolation is the saddle point, or alternative choice, problem. This
problem occurs when there are two pairs of diagonally opposite values in which the values
on one diagonal are greater than those on the other (Figure 20). In this case, there are
two possible solutions, since the isolines could be drawn to connect the values along either
diagonal. Robinson et al. (1994), as documented by PSU (2011), suggest using the average of the
values from both diagonal pairs of data points to draw the isolines (Figure 20-bottom).
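One way to read the averaging rule is sketched below: the average of the four values is treated as an estimate at the centre of the cell, and the isoline is routed according to whether that centre value is above or below the isoline level. This interpretation, the function names and the corner values are assumptions made for illustration.

```python
def saddle_centre(corners):
    """Average of the four corner values, used as an estimate at the cell centre."""
    return sum(corners) / len(corners)

def high_corners_connected(corners, level):
    """True if the centre estimate is at or above the isoline level, so the
    two high corners are joined through the centre of the cell."""
    return saddle_centre(corners) >= level

# Hypothetical cell: (top-left, top-right, bottom-left, bottom-right).
# The top-left/bottom-right diagonal is high, the other diagonal is low.
corners = (10.0, 5.0, 5.0, 10.0)

print(saddle_centre(corners))
print(high_corners_connected(corners, level=7.0))
print(high_corners_connected(corners, level=8.0))
```

For the isoline at 7 the centre (7.5) is on the high side, so the two high corners belong to one connected region. For the isoline at 8 the centre is on the low side, so each high corner is enclosed separately.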
Figure 20. Outputs of interpolation using different number of data points. Source: PSU (2011)
Unfortunately, the data points available for your case study are quite often heterogeneous,
unevenly distributed and at different spatial scales because they come from different measurement
techniques. In this case, in-depth knowledge of the real-world phenomenon that you
are investigating is critical in helping you choose a proper interpolation technique by
evaluating the accuracy of each interpolation method. Indeed, your objective is to choose the
interpolation method whose outputs are closest to reality. As practical guidance, the
selection of a specific interpolation method should consider your data points, the kind of surface
that you want to create, and the estimation errors that you can accept (GIS for Educators, 2011). In
general, the following steps can be performed to complete this task.
1) Evaluate the available data points to understand the spatial structure of the data. This step
may give you a hint about which interpolation methods are plausible for your dataset.
2) Evaluate the candidate interpolation methods based on the result of step 1.
3) Compare the results obtained from the different interpolation methods in order to select the
best interpolation technique, that is, the one whose result is closest to reality.
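The comparison in step 3 needs some measure of closeness to reality. The lesson does not prescribe one, so the sketch below assumes leave-one-out cross-validation: each known point is hidden in turn, predicted from the others, and the errors are summarized as a root-mean-square error (RMSE). The two candidate methods, the function names and the data are hypothetical.

```python
import math

def nearest(points, target):
    """Nearest neighbor estimate from (x, y, value) points."""
    return min(points, key=lambda p: math.dist(p[:2], target))[2]

def idw(points, target, power=2.0):
    """Inverse distance weighted estimate (assumes target is not a known point)."""
    weights = [1.0 / math.dist(p[:2], target) ** power for p in points]
    return sum(w * p[2] for w, p in zip(weights, points)) / sum(weights)

def loo_rmse(points, predict):
    """RMSE when each point is predicted from all the other points."""
    errors = []
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]  # hide point i
        errors.append(predict(others, p[:2]) - p[2])
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical observations at distinct locations: (x, y, value)
observations = [(0.0, 0.0, 10.0), (3.0, 0.0, 13.0), (0.0, 4.0, 12.0),
                (3.0, 4.0, 16.0), (1.5, 2.0, 12.5)]

for name, method in (("nearest neighbor", nearest), ("IDW", idw)):
    print(name, round(loo_rmse(observations, method), 3))
```

A lower RMSE suggests that a method reproduces the withheld observations better for this dataset, but it should be weighed together with your knowledge of the phenomenon, since cross-validation only tests locations where data exist.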
These steps may be time consuming at first, when you do not yet have any experience. But as
you get used to the process and gain experience with different interpolation methods,
you will be able to complete these steps much more efficiently. It is important to note that
careful selection of an appropriate interpolation method for your dataset will greatly benefit
the final result of your case study.
Summary
Let's wrap up what you have learned in this lesson.
In this lesson, we covered the interpolation methods most widely used in GIS applications,
i.e. Nearest Neighbor, IDW, TIN, Kriging and Spline. Each method may produce different
interpolation results because each has its own assumptions. In a GIS environment, these
interpolation methods are mostly used to transform geographic features (i.e. irregular or regular
point and line data) to raster or to alter raster resolution.
The number of sample points and their distribution over space are the two major issues in the
implementation of interpolation methods. An insufficient number of sample points over a
particular region can reduce the accuracy of interpolation results. The saddle point is another
problem in spatial interpolation. In this case, calculating an average value from the possible
solutions is suggested for drawing isolines.
Each interpolation method has advantages and disadvantages, so no single technique is
superior. A careful evaluation of possible interpolation methods for your dataset is highly
recommended in order to select an appropriate method for your case study.
Further Readings
Ferenc Sárközy. GIS Functions – Interpolation. Technical University Budapest Department of
Surveying, available at http://www.agt.bme.hu/public_e/funcint/funcint.html
Mitas, L., Mitasova, H. (1999): Spatial Interpolation. In: P.Longley, M.F. Goodchild, D.J.
Maguire, D.W.Rhind (Eds.), Geographical Information Systems: Principles, Techniques,
Management and Applications, Wiley.
Smith, M.J.d., Goodchild, M.F. and Longley, P.A., 2007. Geospatial Analysis: A comprehensive
Guide to Principles, Techniques and Software Tools. Matador, Leicester, UK.
DeMers, Michael N. 2000. Fundamentals of geographic information systems. New York : J.
Wiley (MSU Main Library: G70.212 .D46 2000 c.2)
ESRI, 2009. The ArcGIS Help Library, available at
http://resources.esri.com/help/9.3/arcgisdesktop/com/gp_toolref/geoprocessing/surface_creation_and_analysis.htm
ESRI, 2010. The ArcGIS Help Library, available at
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Understanding_interpolation_analysis/009z0000006w000000/
Longley, P.A. et al., 2005. Geographic information systems and science. New York : Wiley
Raskin, R.G., Funk, C.C., Webber, S.R. and Willmott, C.J., 1997. Spherekit: The Spatial
Interpolation Toolkit.
References
Anderson, S. (2008). An evaluation of spatial interpolation methods on air temperature in
Phoenix, AZ. Retrieved March 30th 2011, from
http://www.cobblestoneconcepts.com/ucgis2summer/anderson/anderson.htm.
ESRI. (2009). The ArcGIS Help Library. Retrieved March 30th 2011, from
http://resources.esri.com/help/9.3/arcgisdesktop/com/gp_toolref/geoprocessing/surface_creation_and_analysis.htm.
ESRI. (2010). The ArcGIS Help Library. Retrieved March 30th 2011, from
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Understanding_interpolation_analysis/009z0000006w000000/.
GIS for Educators. (2011). Topic 10: Spatial Analysis (Interpolation) [Electronic Version].
Retrieved March 30th 2011, from http://linfiniti.com/dla/worksheets/10_interpolation.pdf.
Laurini, R., & Thompson, D. (1992). Fundamentals of Spatial Information Systems. San Diego,
California: Academic Press.
Mitas, L., & Mitasova, H. (1999). Spatial Interpolation. In P.Longley, M. F. Goodchild, D. J.
Maguire & D.W.Rhind (Eds.), Geographical Information Systems: Principles,
Techniques, Management and Applications (pp. 481-492): GeoInformation International,
Wiley.
OGT. (2011). Kriging Overview. Retrieved March 30th 2011, from
http://oilandgastraining.org/data/gl61/G3921.asp?Code=23365.
PSU. (2011). Concept Gallery. Retrieved March 30th 2011, from
https://www.e-education.psu.edu/geog486/book/export/html/1761.
Smith, M. J. d., Goodchild, M. F., & Longley, P. A. (2007). Geospatial Analysis: A
comprehensive Guide to Principles, Techniques and Software Tools. Leicester, UK:
Matador.