Spatial Interpolation - Perdinan


GEO: 874 – On-line Course Outline Perdinan

Lesson: Spatial Interpolation

Key terms and concepts

What is Spatial Interpolation?

Definition and Application of Spatial Interpolation

Methods of Spatial Interpolation

- Nearest Neighbor Interpolation
- Inverse Distance Weighting
- Triangulated Irregular Network
- Kriging
- Spline Interpolation
- Spatial Interpolation in GIS Software

Challenges in Spatial Interpolation

Selection of Spatial Interpolation

Summary

Further Readings

References

Key terms and concepts

The following terms and concepts are introduced in this lesson. You are responsible for
understanding them, as you may be tested on them in the exam.

- spatial interpolation
- nearest neighbor
- IDW
- TIN
- Delaunay triangulation
- kriging
- spline interpolation
- saddle points
- edge effects

What is Spatial Interpolation?

Please look at the following figures carefully.

Figure 1. A specific location (X) within a continuous surface of temperature (left), climate
stations (middle) and grids of a climate model (right)

Do you notice anything interesting? For example, how was the surface temperature data created?
How can we find the temperature at location X, which lies among several climate stations
(middle image) or grid points (right image)?

Spatial interpolation is a powerful analysis for answering such questions. It can be used to
create a continuous surface and to estimate a value at a particular location based on the
currently available dataset. As you may guess, spatial interpolation was applied to create the
surface temperature data presented in Figure 1, and it could be applied to estimate the value at
location X from the surrounding climate stations (Figure 1-middle) or grid points
(Figure 1-right).


Definition and Application of Spatial Interpolation


Spatial interpolation is one of the most important components of Geographic Information
Systems (GIS). It is defined as the process of estimating a value at a particular location from
other locations whose values are known. The basic concept behind spatial interpolation is
Tobler’s First Law of Geography (1970), which states that everything is related to everything
else, but near things are more related than distant things. Interpolation also builds on the
mathematical idea of a sequence, which allows you to estimate an unknown value from other known
values. For example, suppose you have the following series:

2 4 6 .?. 10 12 14 .?. 18 20 22

Can you estimate the missing values? Indeed, it is very easy: the missing values are 8 and 16. In
this case, we apply linear interpolation to find values at particular points. In the spatial
context, the same concept can be applied to create an isoline map based on the distance between
two points. For example, you can create contours for temperature (Figure 2-right) based on
available measurements recorded by weather stations (Figure 2-left).
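
The series example above can be written as a short Python sketch. The function name
linear_interpolate and the choice of positions 0, 1, 2, ... for the series members are
assumptions made only for this illustration.

```python
def linear_interpolate(x, x0, y0, x1, y1):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    t = (x - x0) / (x1 - x0)          # relative position of x between x0 and x1
    return y0 + t * (y1 - y0)

# Series 2 4 6 ? 10 12 14 ? 18 20 22, stored at positions 0..10 (assumed spacing of 1)
missing_first = linear_interpolate(3, 2, 6, 4, 10)     # between 6 and 10
missing_second = linear_interpolate(7, 6, 14, 8, 18)   # between 14 and 18
print(missing_first, missing_second)  # 8.0 16.0
```

The same formula applies between two map locations if x is replaced by the distance along the
line connecting them, which is how the position of an isoline between two stations can be found.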

Figure 2. Locations of weather stations (left) and temperature contours for Wisconsin

In GIS applications, many spatial interpolation approaches are available for predicting values
at unsampled locations based on available measurements. Laurini and Thompson (1992) distinguish
five categories of interpolation methods: nearest value, linear interpolation, spline
interpolation, stochastic interpolation and model-based interpolation (Figure 3). Nearest value
interpolation estimates the value at an unknown location from the value of the nearest known
point. Linear interpolation fits a straight line between two points. Spline interpolation uses
three or more points to predict values at unknown locations. Stochastic and model-based
interpolation use stochastic models and sophisticated statistical models, respectively, to assign
values to the unknown locations. In a GIS environment, these methods are mostly used to
transform irregular point/line data to raster or to alter raster resolution (Mitas & Mitasova,
1999). For example, a raster dataset can be produced from point values of precipitation
distributed unevenly over a particular region (Figure 4). The unknown values are estimated with
a mathematical formula that derives them from the values of nearby known points. We will discuss
the interpolation methods most widely used in GIS applications in more detail in the next
section.


Figure 3. Several approaches of interpolation. Source: Laurini and Thompson (1992)

Figure 4. Conversion of point values into a raster dataset. Source: ESRI (2010)

Spatial interpolation within a GIS environment is frequently applied in a wide range of fields,
such as climatology, environmental science and socioeconomics, because we rarely have
sufficient measurements. Field observations are relatively expensive in terms of time and money.
For example, only a limited number of weather stations (Figure 5-left) are available for
assessing the impact of climate fluctuations on crop production at the county level
(Figure 5-middle), since maintaining weather stations is quite costly.

Physical and social phenomena are also frequently measured or sampled at particular locations
that may be irregularly distributed over space (Mitas & Mitasova, 1999). In addition, we often
deal with heterogeneous datasets with different resolutions, such as site observations,
regional-scale measurements and computer model outputs (Figure 5). Interpolation techniques make
it possible to combine these different data resolutions, which may be required for a particular
study. For example, when we want to assess the impact of climate change at the county level, we
can assign climate stations (Figure 5-left) to each county (Figure 5-middle) using a particular
interpolation technique such as nearest neighbor. You can find more information about this
technique in the next section.


Figure 5. Location of weather stations (site observations), crop production at county scale in
Wisconsin (regional scale measurement) and outputs of a regional climate model
(computer modeling)

Methods of Spatial Interpolation


Understanding how spatial interpolation works is essential knowledge for a GIS scientist. Many
GIS software packages, such as ArcGIS, offer a variety of fully automatic spatial interpolation
tools. Users just need to load their data, open the interpolation toolbox and specify the method
to use (Figure 6). Different methods are likely to produce different outputs because they rely
on different techniques (Figure 7). As a result, users should understand the basic assumptions
of each interpolation technique and identify whether a particular method meets their needs. It
is often necessary to evaluate the results of several interpolation techniques before selecting
one for a study, as no single technique is superior in every situation. We will discuss this
concern in more detail in the last section of this lesson.

Figure 6. Interpolation toolbox in ArcGIS


Figure 7. Outputs of various interpolation methods, i.e. IDW (left), spline (middle), kriging
(right), produced by using the same data input

Interpolation methods can generally be classified into deterministic and geostatistical methods.
Deterministic interpolators employ mathematical formulas to calculate values at unknown
locations based on values at known locations (ESRI, 2010). This group includes inverse distance
weighting (IDW), natural neighbor, trend and spline. Geostatistical methods use statistical
information derived from the values at known locations and their spatial patterns to estimate
values and spatial arrangements at unknown locations (ESRI, 2010). Examples of this group are
kriging and trend surface analysis (PSU, 2011). Interpolation methods can also be categorized
into global and local interpolators (PSU, 2011). Global interpolators, such as trend surface
analysis, employ information from all known points to estimate values at unknown locations.
Local interpolators, on the other hand, use only values from nearby known points to estimate
values at the unknown points. Nearest neighbor and IDW belong to this type.

In this course, we introduce the general assumptions behind the most widely used interpolation
methods, namely Nearest Neighbor, IDW, Triangulated Irregular Network (TIN), Kriging and Spline
Interpolation. Those who are interested in the detailed formulation of these techniques or in
additional interpolation methods can refer to the GIS books and websites listed in the Further
Readings section.

Nearest Neighbor Interpolation


Nearest neighbor interpolation assigns a value to an unknown point based on the value at the
nearest known location, without considering values from the other neighboring points
(Figure 8). This technique is often used under the following conditions: the data are
distributed nearly uniformly over space, no additional values are desired, or missing values
need to be filled in with values from nearby locations (Smith, Goodchild, & Longley, 2007).
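
As a minimal sketch of this idea, the following Python function returns the value of the closest
known point. The function name, the station coordinates and the temperature values are
hypothetical, and straight-line (Euclidean) distance in map units is assumed.

```python
import math

def nearest_neighbor(stations, x, y):
    """Return the value of the known point closest to (x, y)."""
    nearest = min(stations, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return nearest[2]

# Hypothetical stations given as (x, y, temperature)
stations = [(0.0, 0.0, 12.0), (4.0, 0.0, 15.0), (0.0, 3.0, 10.0)]
nn_value = nearest_neighbor(stations, 3.0, 1.0)
print(nn_value)  # 15.0, the value of the station at (4, 0)
```

Every location that shares the same nearest station receives the same value, which is why the
result looks like the colored cells in Figure 8.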


Figure 8. An illustration of Nearest Neighbor (left) and an example of nearest neighbor
interpolation of a random set of points (black dots) (right). Each colored cell indicates
the area for each black point in that cell.

Inverse Distance Weighting


Inverse Distance Weighting (IDW) is a local, deterministic interpolation method. It assumes
that an unknown value at a particular point is influenced more by values at nearby points than
by those farther away. The value at the unknown location (red dot) is calculated as a weighted
average of the values at neighboring known points, selected either within a certain distance or
as a given number of closest points, as illustrated in Figure 9. The method is called inverse
distance weighting because the influence of a known point on the estimate decreases with its
distance from the unknown point.

Note that under this assumption, the spatial arrangement of the known points used for
estimating the unknown point’s value is not considered in the calculation. As a consequence, the
interpolation results can vary depending on the cut-off distance or the number of closest
neighboring points employed in the calculation. It is also important to note that the values
calculated for the unknown locations will never fall outside the range (i.e. the maximum and
minimum values) of the known points used in the calculation. Therefore, careful attention should
be given to the accuracy of the interpolated values, as this method may not be appropriate in
some locations such as mountainous or valley regions, where unsampled peaks and valley floors
cannot be reproduced.
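
The weighted average described above can be sketched in Python as follows. This is an
illustrative implementation under stated assumptions, not the routine of any particular GIS
package: the weight of each known point is taken as 1 / distance^power, the parameter names
power and k are our own, and the sample coordinates and values are hypothetical.

```python
import math

def idw(points, x, y, power=2.0, k=None):
    """Inverse distance weighted estimate at (x, y).

    points: list of (px, py, value); power: distance exponent;
    k: use only the k closest points (None means use all points).
    """
    ranked = sorted((math.hypot(px - x, py - y), v) for px, py, v in points)
    if k is not None:
        ranked = ranked[:k]
    if ranked[0][0] == 0.0:
        return ranked[0][1]  # the location coincides with a known point
    weights = [1.0 / d ** power for d, _ in ranked]
    return sum(w * v for w, (_, v) in zip(weights, ranked)) / sum(weights)

# Hypothetical samples given as (x, y, value)
samples = [(0, 0, 10.0), (2, 0, 20.0), (0, 2, 30.0), (5, 5, 40.0)]
estimate = idw(samples, 1.0, 1.0, power=2, k=3)   # three equidistant neighbors
print(round(estimate, 6))
```

In this example the three closest points are equally distant from (1, 1), so they receive equal
weights and the estimate is their plain mean. Because the result is always a weighted average
with positive weights, it stays between the smallest and largest of the values used.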

Figure 9. Illustration of IDW. Source: PSU (2011)


To illustrate the effect of the weighting, we can experiment with different distance exponents
(Figure 10). The interpolation result is smoother when the exponent is smaller, because points
located farther away from the unknown location then keep more weight in the calculation and the
relative influence of nearby points decreases. A similar situation occurs when we include more
neighboring points or a longer cut-off distance in the calculation (Figure 11). The extreme
rainfall condition (brown colors) appears more clearly when we apply a small number of nearest
points or a shorter cut-off distance in the interpolation.
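
The influence of the distance exponent can be checked with a small one-dimensional sketch. The
gauge positions and rainfall values below are hypothetical, and the helper idw_1d is written
only for this illustration.

```python
def idw_1d(known, x, power):
    """IDW estimate at position x from (position, value) pairs on a line."""
    weights = [1.0 / abs(x - xi) ** power for xi, _ in known]
    return sum(w * v for w, (_, v) in zip(weights, known)) / sum(weights)

# Hypothetical rainfall (mm): one wet gauge close by, two drier gauges farther away
gauges = [(1.0, 100.0), (4.0, 20.0), (-5.0, 10.0)]
estimates = {p: idw_1d(gauges, 0.0, p) for p in (1, 2, 3)}
for p, value in estimates.items():
    print(p, round(value, 1))
```

As the exponent increases from one to three, the estimate at position 0 moves toward the value
of the nearest gauge (100 mm), because the distant gauges lose weight. A small exponent
therefore averages over a wider area and gives a smoother surface, while a large exponent
emphasizes local extremes.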

Figure 10. Interpolation of rainfall using various distance exponents: one (left), two (middle) and
three (right). Source: PSU (2011)

Figure 11. Interpolation of rainfall using various nearest points: 5 (left), 12 (middle) and 25
(right). Source: PSU (2011)

Triangulated Irregular Network


The triangulated irregular network (TIN) is a popular GIS technique frequently used to
represent a continuous surface such as elevation. It is an appropriate option for depicting the
earth’s surface, from which we can derive lakes, streams, ridges, valleys and hillsides using
elevation datasets. A TIN constructs a continuous surface from a network of triangles whose
vertices are the sample points; neighboring points are connected into triangles by an
algorithm. Delaunay triangulation is a common algorithm used to form the triangles. It chooses
the triangles so that the circumcircle of each triangle (the circle passing through its three
vertices) contains no other sample point, which yields a network of non-overlapping triangles
(Figure 12). In GIS, you can create a TIN surface from points, lines and polygons that store
elevation data.
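
Building a complete Delaunay triangulation takes more code than fits here, so the sketch below
shows only two building blocks, under our own naming: the empty-circumcircle test that the
Delaunay criterion relies on, and linear interpolation of elevation inside a single triangle.
The coordinates and elevations are hypothetical.

```python
def circumcircle_contains(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle a, b, c."""
    # Standard in-circle determinant, evaluated relative to p
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the vertex order, so correct for orientation
    orientation = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return det * orientation > 0

def tin_interpolate(a, b, c, x, y):
    """Linearly interpolate z at (x, y) on the plane through a, b, c (each (x, y, z))."""
    area = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    wa = ((b[0] - x) * (c[1] - y) - (c[0] - x) * (b[1] - y)) / area
    wb = ((c[0] - x) * (a[1] - y) - (a[0] - x) * (c[1] - y)) / area
    wc = 1.0 - wa - wb   # the three barycentric weights sum to one
    return wa * a[2] + wb * b[2] + wc * c[2]

# Hypothetical elevation points given as (x, y, elevation)
a, b, c = (0, 0, 100.0), (10, 0, 200.0), (0, 10, 300.0)
violates = circumcircle_contains(a, b, c, (9, 9))  # a fourth sample point at (9, 9)
z = tin_interpolate(a, b, c, 2.0, 3.0)
print(violates, round(z, 6))
```

If a fourth sample point falls inside the circumcircle, as (9, 9) does here, the triangle a, b,
c would not be part of a Delaunay triangulation and the points would have to be connected
differently. Within an accepted triangle, every location is estimated from the flat plane
through its three vertices, which explains the faceted appearance of TIN surfaces.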


Figure 12. Delaunay triangulation with circumcircles around the red sample data (left) and
interpolated TIN surface created from elevation vector points (right). Source: left
image GIS for Educators (2011) and right image Mitas and Mitasova (1999)

Kriging
Kriging is an interpolation method that uses a geostatistical approach. The basic premise of
kriging is relatively similar to that of IDW, as kriging also estimates values at unknown points
using a weighted average of the values at neighboring locations (Figure 13-left). However,
kriging derives the weights from semivariances (γ) instead of from distance alone. Kriging
assumes that spatial autocorrelation is present among the values measured at multiple points
over a particular space. These correlations are then modeled as a function of the distance and
direction between the available points, using variogram models to explain the variation in the
surface. This differs from IDW, which fixes the relationship between weight and separation
distance in advance instead of estimating it from the data.

Kriging involves several steps. The first step is to calculate the semivariances for all pairs
of known points. Using a graph called a semivariogram, we plot the semivariances against the
distances for all pairs and fit a model, such as the spherical function, to them. Based on this
fit, we can determine the important components of the graph, namely the nugget, sill and range
(Figure 13-right). The fitted function is then applied to calculate semivariances from the
distances between an unknown point and its neighboring known points. These semivariances are
used to set up a system of linear equations whose solution is a set of weights (λs), with which
the value at the unknown location can be estimated. With this approach, the predicted values
have the smallest estimation error variance under the fitted model.
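
The last steps of this procedure can be sketched in Python for the ordinary kriging variant.
This is a simplified illustration under several assumptions: the semivariogram has already been
fitted, so the nugget, sill and range values passed in are hypothetical; sill is taken as the
total sill including the nugget; and the function names and sample data are our own.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical semivariogram model; sill is the total sill (nugget included)."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ordinary_kriging(points, x, y, nugget, sill, rng):
    """Return (estimate, weights) at (x, y) from points given as (px, py, value)."""
    n = len(points)
    gamma = lambda p, q: spherical(math.hypot(p[0] - q[0], p[1] - q[1]), nugget, sill, rng)
    # Semivariances between all pairs of known points, plus the row and column
    # that force the weights to sum to one
    a = [[gamma(p, q) for q in points] + [1.0] for p in points] + [[1.0] * n + [0.0]]
    # Semivariances between each known point and the unknown location
    b = [gamma(p, (x, y)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return sum(w * p[2] for w, p in zip(weights, points)), weights

# Hypothetical samples on the corners of a square, estimated at its center
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0), (10.0, 10.0, 40.0)]
estimate, weights = ordinary_kriging(samples, 5.0, 5.0, nugget=0.0, sill=25.0, rng=30.0)
print(round(estimate, 6), [round(w, 6) for w in weights])
```

Because the unknown location is at the center of a symmetric arrangement, the four weights come
out equal. With irregularly placed samples the weights differ, and clustered points share their
influence, which is the main practical difference from IDW.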


Figure 13. An illustration of the kriging method (left) and a semivariogram (right). The nugget
is the value at which the fitted spherical curve meets the Y-axis. The sill is the highest
level of semivariance reached by the fitted curve. The range is the distance at which the
semivariance levels off. Source: PSU (2011)

If we compare how kriging and IDW estimate values at unknown locations, IDW applies a simpler
approach than kriging. However, the more complex process of kriging allows the spatial
structure of the data to contribute to the weighted average of the known points. This may be
preferable to the subjective choices (cut-off distance, number of neighbors, exponent) that
must be made in IDW interpolation, as explained above. The surface maps created using these
two approaches show that kriging produces more circular contour patterns than IDW (Figure 14).

Figure 14. Surface maps created using IDW (left) and Kriging (right). The number of
neighboring points is identical for the two maps. Source: PSU (2011)

Currently, there are many kriging methods, such as ordinary kriging, simple kriging, kriging
with an external drift and indicator kriging. In this course, we discuss only the general
procedure applied in kriging. Please refer to the references suggested in the Further Readings
section, particularly Mitas and Mitasova (1999), for further information on kriging.


Spline Interpolation
Spline is a deterministic interpolation method. It uses a mathematical function to fit the
input data, resulting in a smooth surface. Spline fits the sample points using polynomial and
least-squares methods in order to minimize errors. Whereas linear interpolation uses a linear
function to fit the input points on each interval of x (i.e. from x to x+1), spline employs
low-degree polynomials that pass exactly through (or close to) the data points and join
together smoothly (Figure 15). As an example, if we fit the points in Figure 15 (left) using a
cubic spline, each interval has its own cubic function fitting its pair of end points
(Figure 16). Compared to linear interpolation, spline interpolation is likely to produce a
smaller error.
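
To make the idea of one cubic function per interval concrete, here is a sketch of a natural
cubic spline in one dimension, written with the standard tridiagonal algorithm. The function
name is our own, the natural end condition (zero curvature at both ends) is an assumption, and
the sine curve is used only as a hypothetical smooth test dataset.

```python
import math

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] + [3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
                     for i in range(1, n)]
    # Forward sweep of the tridiagonal system for the quadratic coefficients c
    l, mu, z = [1.0], [0.0], [0.0]
    for i in range(1, n):
        l.append(2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1])
        mu.append(h[i] / l[i])
        z.append((alpha[i] - h[i - 1] * z[i - 1]) / l[i])
    # Back substitution: one set of coefficients (b, c, d) per interval
    c = [0.0] * (n + 1)
    b, d = [0.0] * n, [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        # Index of the interval containing x, clamped to the data range
        j = max(0, min(n - 1, sum(1 for k in xs[1:-1] if x >= k)))
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3
    return evaluate

# Hypothetical data: five samples of a sine curve between 0 and pi
xs = [k * math.pi / 4 for k in range(5)]
ys = [math.sin(x) for x in xs]
spline = natural_cubic_spline(xs, ys)
x_mid = math.pi / 8
linear_estimate = (ys[0] + ys[1]) / 2          # straight line between the first two points
spline_error = abs(spline(x_mid) - math.sin(x_mid))
linear_error = abs(linear_estimate - math.sin(x_mid))
print(spline_error < linear_error)
```

For this smooth curve the spline estimate halfway between the first two samples is closer to
the true value than the linear estimate. That advantage is not guaranteed for rough or noisy
data, where a spline can overshoot between points.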

Figure 15. An illustration of spline interpolation (left) and linear interpolation (right)

Figure 16. An illustration of cubic spline functions to fit the data points in Figure 15

In GIS applications, the same concept is applied to fit the values of a specified number of
nearest known points in order to predict a value at an unknown point. This technique is
frequently applied to generate gently varying surfaces such as elevation, water table,
temperature and soil properties. Figure 17 gives a side-by-side comparison of the interpolation
outputs of spline and IDW. As you can see, spline interpolation produces smoother contours than
IDW. Spline interpolation also depicts different patterns around the peak values: it produces
gradual changes around the peaks.


Figure 17. Outputs of spline and IDW interpolation. Source: ESRI (2009)

Spatial Interpolation in GIS Software


As mentioned above, spatial interpolation is one of the most important components of GIS
applications. Many current GIS software packages include interpolation tools with which you can
easily interpolate your data points. The methods described here can be found in the following
GIS packages: ArcGIS, IDRISI, GRASS, ILWIS, MFWorks, SPANS, and Vertical Mapper.

Challenges in Spatial Interpolation


As you already know, spatial interpolation is a statistical technique. The number of
observations and their distribution over a particular space can therefore significantly affect
the results, regardless of the interpolation method you choose.

As with many other statistical methods, the accuracy of interpolation results generally
increases with the number of data points used. The distribution of sample locations also
significantly affects the outcome. For example, the interpolation results for mountainous
regions and valleys may not be accurate if no or insufficient data points were collected at
these locations. As you can see from Figure 18, different numbers of data points used in
creating the rainfall surface yield different images. Major features of the rainfall pattern
may no longer be depicted well as the number of data points decreases. In this case, the
interpolated surface produced using the fewest data points (Figure 18-right) fails to capture
the rainfall patterns in the region. However, you should also consider that using more data
points may increase the time needed to complete the interpolation. Advances in computing power
in recent decades have largely reduced this problem, unless you are dealing with a very large
dataset such as the outputs of climate models for the entire globe. In that case, you may need
to define your region of interest so that you apply interpolation only to that region.


Figure 18. Outputs of interpolation using different number of data points. Source: PSU (2011)

Furthermore, an insufficient number of sample points near the border of your study region,
known as the edge effect, can reduce the accuracy of interpolation outputs. For example,
suppose you want to interpolate your data points within the red square box. In the first
interpolation, you use only the data points within the red box (Figure 19-left), while in the
second you use all data points for the region. In the first case, values at unknown locations
near the edge of your region of interest are effectively extrapolated, as there is no
information from outside the boundary. In the second case, the data points from outside the
boundary are included in the interpolation. This inclusion reduces the edge effect and gives
you better accuracy.

Figure 19. Outputs of interpolation using different number of data points. Source: PSU (2011)

Another challenge in interpolation is the saddle point, or alternative choice, problem. It
occurs when there are two pairs of diagonally opposite values in which the values of the pair on
one diagonal are greater than those on the other diagonal (Figure 20). In this case, there are
two possible solutions, as the isolines can be drawn to connect either diagonal pair. Robinson
et al. (1994) suggest that the average of the values of both diagonal pairs of data points
should be used to draw the isolines, as documented by PSU (2011) (Figure 20-bottom).
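
The averaging rule can be sketched as a small decision function. The function name, the corner
naming and the example values are hypothetical; the sketch assumes a single square cell with
one data value at each corner and one isoline value (level) to be drawn.

```python
def resolve_saddle(tl, tr, bl, br, level):
    """Resolve an ambiguous cell with corner values tl, tr, bl, br for one isoline level.

    Returns the averaged center value and a description of how to connect the corners.
    """
    center = (tl + tr + bl + br) / 4.0   # average of both diagonal pairs
    diag1_high = tl > level and br > level and tr < level and bl < level
    diag2_high = tr > level and bl > level and tl < level and br < level
    if not (diag1_high or diag2_high):
        return center, "no saddle"
    return center, ("join high corners" if center > level else "join low corners")

# Hypothetical cell: one diagonal holds 10 and 9, the other holds 4 and 5
print(resolve_saddle(10, 4, 5, 9, 6))   # (7.0, 'join high corners')
print(resolve_saddle(10, 4, 5, 9, 8))   # (7.0, 'join low corners')
```

When the averaged center value lies above the isoline value, the center is treated as part of
the high side, so the two high corners are connected and the isoline passes around each low
corner separately. When it lies below, the opposite choice is made.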


Figure 20. Outputs of interpolation using different number of data points. Source: PSU (2011)

Selection of Spatial Interpolation


Selecting a suitable interpolation method for a specific analysis is a challenging task, as no
single interpolation method can be applied to all situations. As you have seen in the previous
examples, different interpolation methods are likely to produce different outcomes. Each method
also has advantages and disadvantages (Table 1). A good understanding of these advantages and
disadvantages will help you choose a plausible interpolation technique for your case study. For
example, you may apply nearest neighbor when you want to interpolate data points that are
distributed evenly over a flat surface.

Unfortunately, the data points available for your case study are quite often heterogeneous,
distributed unevenly and of different spatial scales because they come from different
measurement techniques. In this case, in-depth knowledge of the real-world phenomenon you are
investigating is critical in choosing a proper interpolation technique, because it lets you
evaluate the accuracy of each interpolation method. Indeed, your objective is to choose the
interpolation method whose outputs are closest to reality. As practical guidance, the selection
of a specific interpolation method should consider your data points, the kind of surface that
you want to create, and the estimation errors that you can accept (GIS for Educators, 2011). In
general, the following steps can be performed to complete this task.

1) Evaluate the available data points to understand the spatial structure of the data. This step
may give you a hint about which interpolation methods are plausible for your dataset
2) Evaluate the candidate interpolation methods based on the result of step 1
3) Compare the results obtained from the different interpolation methods in order to select the
best interpolation technique, that is, the one whose result is closest to reality
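
One common way to carry out the comparison in step 3, although it is not prescribed by this
lesson, is leave-one-out cross-validation: each known point is removed in turn, predicted from
the remaining points, and the prediction errors are summarized as a root mean square error
(RMSE). The sketch below compares nearest neighbor and IDW in this way; all function names and
the sample data are hypothetical.

```python
import math

def nearest(points, x, y):
    """Nearest neighbor estimate at (x, y)."""
    return min(points, key=lambda p: math.hypot(p[0] - x, p[1] - y))[2]

def idw(points, x, y, power=2.0):
    """IDW estimate at (x, y); assumes (x, y) is not itself one of the points."""
    weights = [1.0 / math.hypot(p[0] - x, p[1] - y) ** power for p in points]
    return sum(w * p[2] for w, p in zip(weights, points)) / sum(weights)

def loo_rmse(points, method):
    """Leave-one-out RMSE: predict each point from all the others."""
    errors = []
    for i, (x, y, value) in enumerate(points):
        others = points[:i] + points[i + 1:]
        errors.append(method(others, x, y) - value)
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical samples on a 4 x 3 grid with a smooth east-west gradient (value = 2 * x)
samples = [(x, y, 2.0 * x) for x in range(4) for y in range(3)]
scores = {"nearest neighbor": loo_rmse(samples, nearest), "IDW": loo_rmse(samples, idw)}
best = min(scores, key=scores.get)
print(scores, best)
```

The method with the lowest RMSE reproduces the withheld measurements most closely for this
dataset. Such a score is only one piece of evidence: it should be read together with your
knowledge of the phenomenon and a visual check of the interpolated surface.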

These steps may be time consuming at the beginning, while you have little experience. But as
you get used to the process and gain experience with different interpolation methods, you will
be able to complete these steps much more efficiently. A careful selection of an appropriate
interpolation method for your dataset will greatly benefit the final result of your case
study.


Table 1. Advantages and disadvantages of several interpolation methods


Method: Nearest Neighbor
Advantages:
- simple
- computationally less intensive
Disadvantages:
- may not perform well if data points are distributed unevenly

Method: IDW
Advantages:
- simple
- computationally less intensive
Disadvantages:
- decrease the accuracy if data points are distributed unevenly [1]

Method: TIN
Advantages:
- data storage is much more efficient than raster-based representations [2]
- can be used to construct Thiessen polygons [2]
Disadvantages:
- may not produce smooth interpolated surface and give a jagged appearance [1]

Method: Kriging
Advantages:
- minimize the mean square error [3]
- a robust technique [3]
- able to measure the certainty or accuracy of the predictions [5]
Disadvantages:
- require the estimation of the variogram, which is not simple [3]
- computationally very intensive [3]
- need in-depth knowledge about the kriging procedure [3]

Method: Spline
Advantages:
- can generate sufficiently accurate surfaces from few sample points [4]
- able to retain small features [4]
- provide enough flexibility for local geometry analysis [6]
Disadvantages:
- the functions are sensitive to outliers
- may only be best for gently varying surfaces whose variance is low [5]

Source: [1] GIS for Educators (2011), [2] PSU (2011), [3] OGT (2011), [4] Anderson (2008),
[5] ESRI (2010), [6] Mitas and Mitasova (1999)

Summary
Let’s wrap up what you have learned through this lesson.

Interpolation is a statistical technique used to predict values at unknown locations based on
available data points. The basic premise of interpolation is Tobler’s First Law of Geography:
values at nearby locations have more influence than those farther away.

In this lesson, we covered the interpolation methods most widely used in GIS applications,
i.e. Nearest Neighbor, IDW, TIN, Kriging and Spline. Each method may produce different
interpolation results, as each has its own assumptions. In a GIS environment, these
interpolation methods are mostly used to transform geographic features (i.e. irregular/regular
point/line data) to raster or to alter raster resolution.

The number of sample points and their distribution over space are the two major concerns in the
implementation of interpolation methods. An insufficient number of sample points over a
particular region can reduce the accuracy of interpolation results. The saddle point is another
problem in spatial interpolation. In this case, calculating an average value from the possible
solutions is suggested for drawing isolines.

Each interpolation method has advantages and disadvantages, so no technique is superior in all
situations. A careful evaluation of the possible interpolation methods for your dataset is
highly recommended in order to select an appropriate method for your case study.


Further Readings
Sárközy, F. GIS Functions – Interpolation. Technical University Budapest, Department of
Surveying. Available at http://www.agt.bme.hu/public_e/funcint/funcint.html
Mitas, L., & Mitasova, H. (1999). Spatial Interpolation. In P. Longley, M. F. Goodchild, D. J.
Maguire & D. W. Rhind (Eds.), Geographical Information Systems: Principles, Techniques,
Management and Applications. Wiley.
Smith, M. J. d., Goodchild, M. F., & Longley, P. A. (2007). Geospatial Analysis: A
Comprehensive Guide to Principles, Techniques and Software Tools. Leicester, UK: Matador.
DeMers, M. N. (2000). Fundamentals of Geographic Information Systems. New York: J. Wiley
(MSU Main Library: G70.212 .D46 2000 c.2).
ESRI. (2009). The ArcGIS Help Library. Available at
http://resources.esri.com/help/9.3/arcgisdesktop/com/gp_toolref/geoprocessing/surface_creation_and_analysis.htm
ESRI. (2010). The ArcGIS Help Library. Available at
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Understanding_interpolation_analysis/009z0000006w000000/
Longley, P. A. et al. (2005). Geographic Information Systems and Science. New York: Wiley.
Raskin, R. G., Funk, C. C., Webber, S. R., & Willmott, C. J. (1997). Spherekit: The Spatial
Interpolation Toolkit.

References
Anderson, S. (2008). An evaluation of spatial interpolation methods on air temperature in
Phoenix, AZ. Retrieved March 30th 2011, from
http://www.cobblestoneconcepts.com/ucgis2summer/anderson/anderson.htm.
ESRI. (2009). The ArcGIS Help Library. Retrieved March 30th 2011, from
http://resources.esri.com/help/9.3/arcgisdesktop/com/gp_toolref/geoprocessing/surface_creation_and_analysis.htm.
ESRI. (2010). The ArcGIS Help Library. Retrieved March 30th 2011, from
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Understanding_interpolation_analysis/009z0000006w000000/.
GIS for Educators. (2011). Topic 10: Spatial Analysis (Interpolation) [Electronic Version].
Retrieved March 30th 2011, from http://linfiniti.com/dla/worksheets/10_interpolation.pdf.
Laurini, R., & Thompson, D. (1992). Fundamentals of Spatial Information Systems. San Diego,
California: Academic Press.
Mitas, L., & Mitasova, H. (1999). Spatial Interpolation. In P. Longley, M. F. Goodchild, D. J.
Maguire & D. W. Rhind (Eds.), Geographical Information Systems: Principles,
Techniques, Management and Applications (pp. 481-492): GeoInformation International,
Wiley.
OGT. (2011). Kriging Overview. Retrieved March 30th 2011, from
http://oilandgastraining.org/data/gl61/G3921.asp?Code=23365.
PSU. (2011). Concept Gallery. Retrieved March 30th 2011, from
https://www.e-education.psu.edu/geog486/book/export/html/1761.
Smith, M. J. d., Goodchild, M. F., & Longley, P. A. (2007). Geospatial Analysis: A
Comprehensive Guide to Principles, Techniques and Software Tools. Leicester, UK:
Matador.
