Interpolation

INTERPOLATION
Introduction to GIS
How are raster surfaces made?

Raster surfaces are generally made in one of two ways:
From remote sensing (covered later), which collects reflectance values at every pixel within the geographic extent; these can be classified later on.
From sample points, whose Z values are interpolated across space to fill in all the blank areas.
What is interpolation?
Interpolation is the process of creating a surface based on values at isolated sample points.
Sample points are locations where we collect data on some phenomenon and record the spatial coordinates.
We use mathematical estimation to guess at what the values are in between those points.
We can create either a raster or a vector interpolated surface.
Interpolation is used because field data are expensive to collect and can't be collected everywhere.
How does it look?

Let's say we have our groundwater pollution samples
This gives us
How does it work

This can be displayed as a 3D trend surface in 3D Analyst
How does it work

We can also use interpolation methods to create contours
Sample points
Also known as control points.
These are points where you or someone else has collected data (attributes) for a spatial coordinate (point)
Any number of attributes can be collected at that point
E.g. 1: weather stations collect data on temperature, rainfall, wind, humidity, etc.
E.g. 2: soil invertebrate samples would record the abundance of numerous species at each location
What isn't interpolation?

Interpolation only works where values are spatially dependent, or spatially autocorrelated, that is, where nearby locations tend to have similar Z values.
Examples of spatially autocorrelated features: elevation, property value, crime levels, precipitation
Non-autocorrelated examples: number of drum sets per city block; cheeseburgers consumed per household.
Where values across a landscape are
How does interpolation work

In ArcGIS, to interpolate:
Create or add a point shapefile with some attribute that will be used as a Z value
Click Spatial Analyst>>Interpolate to Raster and then choose the method
Where interpolation does not work

Cannot use interpolation where values are not spatially autocorrelated
Say you are looking at household income: in an income-segregated city, you could take a small sample of households for income and probably interpolate.
However, in a highly income-integrated city, where a given block has rich and poor, this would not work.
Interpolation examples
Elevation:
Elevation values tend to be highly spatially autocorrelated because elevation at location (x,y) is generally a function of the surrounding locations
Except in areas where terrain is very abrupt and precipitous, such as Patagonia or Yosemite
In this case, elevation would not be autocorrelated at the local (large) scale, but may still be autocorrelated at the regional (small) scale
Imagine this elevation cross section: If each dashed line represented a sample point (in 1-D), this spacing would miss major local sources of variation, like the gorge
Our interpolated surface (represented in 1-D by the blue line) would look like this
If we increased the sampling rate, we would pick up that local variation
Here our interpolated surface is much closer to reality at the local level, but we pay for this in the form of higher data gathering cost
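A minimal Python sketch of the cross-section idea above. The profile function `terrain`, the gorge position, the two sample spacings, and the use of straight-line interpolation between samples are all assumptions made for illustration; the slides do not say which method produced the blue line.

```python
def terrain(x):
    """Hypothetical elevation profile in metres: a 1000 m plateau cut by a gorge."""
    return 400.0 if 4.5 < x < 5.5 else 1000.0


def sample(spacing, extent=12.0):
    """Take samples every `spacing` units from 0 to `extent` (assumed to divide evenly)."""
    count = int(round(extent / spacing))
    xs = [i * spacing for i in range(count + 1)]
    return xs, [terrain(x) for x in xs]


def interpolate(xs, zs, x):
    """Straight-line interpolation between the two samples that bracket x."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return zs[i] + t * (zs[i + 1] - zs[i])
    raise ValueError("x lies outside the sampled extent")


sparse = sample(4.0)   # samples at 0, 4, 8, 12: none fall inside the gorge
dense = sample(0.5)    # samples every 0.5 units: one falls inside the gorge

print(interpolate(*sparse, 5.0))  # sparse estimate at the gorge centre
print(interpolate(*dense, 5.0))   # dense estimate at the gorge centre
```

With the sparse spacing the estimate at the gorge stays at plateau height, while the dense spacing recovers the drop, at the cost of eight times as many samples.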
Weather
Weather tends to be modeled on a regional level (e.g. your local weather report) because, in most places, weather systems and trends happen over a very large area. Hence the need for sample point density is not so great
In other places, local climate variability is very great, such as in the SF Bay Area where temperatures can vary 50 degrees within 10 miles due to ocean effects.
Weather
Weather is also extremely variable over time, so samples must be continually taken. This is why weather stations are usually permanent
Example: precipitation varying over a season
Source: LUBOS MITAS AND HELENA MITASOVA, University of Illinois
Groundwater contamination:
The needed density of points will depend on the geology and the type of terrain
Areas where geology allows for free groundwater flows across large areas will have less local variation and need less dense points, while areas with geologic features that inhibit or redirect flow (e.g. karst topography) will need denser points
Example
Here are some sample elevation points from which surfaces were derived using the three methods
Inverse Distance Weighting

IDW weights the value of each point by its distance to the cell being analyzed and averages the values.
IDW assumes that an unknown value is influenced more by nearby points than by faraway points, but we can control how rapidly that influence decays with distance.
To predict a value for any unmeasured location, IDW will use the measured values surrounding the prediction location. Those measured values closest to the prediction location will have more influence on the predicted value than those farther away. It weights the points closer to the prediction location greater than those farther away, hence the name inverse distance weighted. From ArcGIS help

IDW has no method of testing the quality of its predictions, so validity testing requires taking additional observations.
IDW is sensitive to sampling, often producing circular patterns around solitary data points.

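IDW assumes that the value of an attribute z at any unsampled point is a distance-weighted average of the sampled points lying within a defined neighborhood around that unsampled point. Essentially it is a weighted moving average:

$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i)$$

where the weights $\lambda_i$ are given by some weighting function and

$$\sum_{i=1}^{n} \lambda_i = 1$$

A common form of the weighting function is $d^{-p}$, yielding:

$$\hat{z}(x_0) = \frac{\sum_{i=1}^{n} z(x_i) \, d_{ij}^{-p}}{\sum_{i=1}^{n} d_{ij}^{-p}}$$

where $d_{ij}$ is the distance from sample point $x_i$ to the unsampled point.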
IDW-How it works
The Z value at an unknown location is a function of the Z values at known points, each multiplied by the inverse of its distance raised to a power P.
Z value field: the numeric attribute to be interpolated
Power: determines the relationship between weighting and distance. Where P = 0, there is no decrease in influence with distance; as P increases, distant points become less influential in interpolating the Z value at a given pixel.
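The weighting described above can be sketched in a few lines of Python. This is an illustrative implementation, not the ArcGIS one; the function name `idw`, the `(x, y, z)` tuple layout, and the sample values are assumptions.

```python
import math


def idw(samples, x0, y0, power=2.0):
    """Inverse distance weighted estimate at (x0, y0).

    samples: list of (x, y, z) tuples for the known points.
    power:   the exponent P; 0 gives a plain average, larger values
             make distant points less influential.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, z in samples:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            return z  # the location coincides with a sample point
        w = d ** -power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical sample points: two elevations 2 units apart
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

print(idw(samples, 0.5, 0.0, power=0.0))  # P = 0: simple average of all points
print(idw(samples, 0.5, 0.0, power=2.0))  # P = 2: pulled toward the nearer point
```

Dividing by the sum of the weights is what makes the weights add up to 1, so the estimate always stays within the range of the sampled values.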
IDW-How it works
There are two IDW method options, variable and fixed radius:
1. Variable (or nearest neighbor): the user defines how many neighbor points are going to be used to define the value for each cell
2. Fixed radius: the user defines a radius within which every point will be used to define the value for each cell
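The two search options differ only in how the neighboring points are selected before weighting. A small Python sketch, with hypothetical function names and sample data:

```python
import math


def distance(sample, x0, y0):
    """Straight-line distance from an (x, y, z) sample to (x0, y0)."""
    return math.hypot(sample[0] - x0, sample[1] - y0)


def variable_search(samples, x0, y0, count):
    """Variable radius: keep the `count` nearest points, however far away they are."""
    return sorted(samples, key=lambda s: distance(s, x0, y0))[:count]


def fixed_radius_search(samples, x0, y0, radius):
    """Fixed radius: keep every point within `radius`, however many there are."""
    return [s for s in samples if distance(s, x0, y0) <= radius]


# Hypothetical sample points along a line
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (5.0, 0.0, 30.0), (9.0, 0.0, 50.0)]

print(variable_search(samples, 0.5, 0.0, count=3))
print(fixed_radius_search(samples, 0.5, 0.0, radius=1.0))
```

The variable search always returns the requested number of points, so it reaches farther in sparsely sampled areas; the fixed radius search may return few or no points there.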
IDW-How it works
Can also define barriers: the user chooses whether to limit certain points from being used in the calculation of a new value for a cell, even if the point is near.
E.g. you wouldn't use an elevation point on one side of a ridge to create an elevation value on the other side of the ridge.
The user chooses a line theme to represent the barrier.
IDW-How it works
What is the best P to use?
It is the P where the Root Mean Squared Prediction Error (RMSPE) is lowest, as in the graph on right
To determine this, we would need a test, or validation data set, showing Z values in x,y locations that are not included in prediction data and then look for discrepancies between actual and predicted values. We keep changing the P value until we get the minimum level of error. Without this, we just guess.
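The search for the best P can be sketched as a loop over candidate values, keeping the one with the lowest RMSPE against a validation set. The data, the candidate list, and the function names below are hypothetical; the Geostatistical Wizard may search differently.

```python
import math


def idw(samples, x0, y0, power):
    """Inverse distance weighted estimate at (x0, y0) from (x, y, z) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, z in samples:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            return z
        w = d ** -power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total


def rmspe(prediction_points, validation_points, power):
    """Root mean squared prediction error at the validation locations."""
    errors = [idw(prediction_points, x, y, power) - z for x, y, z in validation_points]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def best_power(prediction_points, validation_points, candidates):
    """Return the candidate P with the lowest RMSPE."""
    return min(candidates, key=lambda p: rmspe(prediction_points, validation_points, p))


# Hypothetical data: points used for prediction and points held back for validation
prediction_points = [(0, 0, 10.0), (4, 0, 30.0), (0, 4, 20.0), (4, 4, 40.0)]
validation_points = [(1, 1, 15.0), (3, 1, 27.0), (2, 3, 29.0)]
candidates = [0.5, 1.0, 2.0, 3.0, 4.0]

for p in candidates:
    print(p, round(rmspe(prediction_points, validation_points, p), 3))
print("best P:", best_power(prediction_points, validation_points, candidates))
```

The validation points must not be part of the prediction data, otherwise the error is understated.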
IDW-How it works
This can be done in ArcGIS using the Geostatistical Wizard
You can look for an optimal P by testing your sample point data against a validation data set
This validation set can be another point layer or a raster layer
Example: we have elevation data points and we generate a DTM. We then validate our newly created DTM against an existing DTM, or against another existing elevation points data set. The computer determines the optimum P that minimizes our error
IDW-How it works
Example: IDW
Done with P =2. Notice how it is not as smooth as Spline. This is because of the weighting function introduced through P
Spline Method
Another option for interpolation method
This fits a curve through the sample data and assigns values to other locations based on their location on the curve
Thin plate splines create a surface that passes through the sample points with the least possible change in slope at all points, that is, a minimum curvature surface.
Uses piecewise functions fitted to a small number of data points, but the joins are continuous, hence one part of the curve can be modified without having to recompute the whole.
The overall function is continuous, with continuous first and second derivatives.
Spline Method
SPLINE has two types: regularized and tension
Tension results in a rougher surface that more closely adheres to abrupt changes in sample points
Regularized results in a smoother surface that smooths out abruptly changing values somewhat
Spline Method
Weight: this controls the tautness of the curves. With the Regularized type, a high weight value results in an increasingly smooth output surface. Under the Tension type, increases in the weight cause the surface to become stiffer, eventually conforming closely to the input points.
Number of points: the number of points around a cell that will be used to fit a polynomial function to a curve
Example: Spline
Note how smooth the curves of the terrain are; this is because Spline is fitting a simple polynomial equation through the points
Pros and Cons of Spline Method

Pros:
Splines retain smaller features, in contrast to IDW
Produce a clear overview of the data
Continuous, so it is easy to calculate derivatives for topology

Cons:
Results are sensitive to the locations of break points
No estimate of errors, as with IDW
Can often result in over-smooth surfaces
Kriging Method
Semivariograms measure the strength of statistical correlation as a function of distance; they quantify spatial autocorrelation
Because Kriging is based on the semivariogram, it is probabilistic, while IDW and Spline are deterministic
Kriging associates some probability with each prediction, hence it provides not just a surface, but some measure of the accuracy of that surface
Kriging equations are determined by fitting a line through the points so as to minimize the weighted sum of squares between the points and the line
These equations are weighted based on spatial autocorrelation, which is determined from the semivariograms
Kriging Method
Like IDW interpolation, Kriging forms weights from surrounding measured values to predict values at unmeasured locations. As with IDW interpolation, the closest measured values usually have the most influence. However, the kriging weights for the surrounding measured points are more sophisticated than those of IDW. IDW uses a simple algorithm based on distance, but kriging weights come from a semivariogram that was developed by looking at the spatial structure of the data. To create a continuous surface or map of the phenomenon, predictions are made for locations in the study area based on the semivariogram and the spatial arrangement of measured values that are nearby. --from ESRI Help
Kriging Method
Kriging is a geostatistical method and a probabilistic method, unlike the others, which are deterministic. That is, there is a probability associated with each prediction.
Kriging has both a deterministic and a probabilistic component, respectively $\mu(s)$ and $\varepsilon(s)$ in $Z(s) = \mu(s) + \varepsilon(s)$, where both are functions of distance
Assumes the spatial variation in the variable is too irregular to be modeled by a simple smooth function, and is better modeled with a stochastic surface
Interpolation parameters (e.g. weights) are chosen to optimize the function
Assumes that the variable in space can be modeled as the sum of three components: 1) a structural/deterministic part, 2) a random but spatially correlated part, and 3) a spatially uncorrelated random part
Kriging Method
Hence, the foundation of kriging is the notion of spatial autocorrelation, or the tendency of values of entities closer in space to be related.
This violates classical statistical models, in which observations are assumed to be independent.
Autocorrelation can be assessed using a semivariogram, which plots the difference in pair values (variance) against their distances.
Where autocorrelation exists, the semivariance should increase until a certain distance where SV = variance around the mean, so it flattens out. That value is called the sill.
The sloped area, or range, is where values are related to each other.
The intercept is the nugget.
Semivariance
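Semivariogram(distance h) = 0.5 * average[(value at location i - value at location j)^2], or

$$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \{ z(x_i) - z(x_i + h) \}^2$$

where n is the number of pairs of points separated by distance h.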
Based on the scatter of points, the computer (Geostatistical analyst) fits a curve through those points
The inverse is the covariance matrix which shows correlation over space
Steps
Variogram cloud; can use bins to make a box plot
Empirical variogram: choose bins and lags
Model variogram: fit a function through the empirical variogram
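The first two steps can be sketched in Python: every pair of points contributes one value to the variogram cloud, and the empirical variogram averages those values within distance bins. The function name, the equal-width binning, and the sample data are assumptions for illustration; direction is ignored here.

```python
import math
from itertools import combinations


def empirical_semivariogram(samples, lag_width, lag_count):
    """Binned semivariance from (x, y, z) samples, using distance only.

    Returns a list of (bin centre distance, semivariance, number of pairs)
    for every bin that contains at least one pair.
    """
    squared_sums = [0.0] * lag_count
    pair_counts = [0] * lag_count
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        h = math.hypot(x2 - x1, y2 - y1)
        bin_index = int(h // lag_width)
        if bin_index < lag_count:
            squared_sums[bin_index] += (z1 - z2) ** 2
            pair_counts[bin_index] += 1
    result = []
    for b in range(lag_count):
        if pair_counts[b] > 0:
            centre = (b + 0.5) * lag_width
            result.append((centre, squared_sums[b] / (2 * pair_counts[b]), pair_counts[b]))
    return result


# Hypothetical samples along a line, one distance unit apart
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]

for centre, gamma, pairs in empirical_semivariogram(samples, lag_width=1.0, lag_count=3):
    print(centre, gamma, pairs)
```

In this toy data set the semivariance rises with distance, which is the pattern expected where values are spatially autocorrelated.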
Functional forms?
Variogram
Plots semivariance against distance between points
Is binned to simplify
Can be binned based on just distance (top) or distance and direction (bottom)
Where autocorrelation exists, the semivariance should have a slope
Look at the variogram to find where the slope levels off
Binning based on distance only
Binning based on distance and direction
Variogram
The SV value where it flattens out is called the sill.
The distance range for which there is a slope is called the neighborhood; this is where there is positive spatial structure.
The intercept is called the nugget and represents random noise that is spatially independent.
[Figure: variogram with the sill, nugget, and range labeled]
Functional Forms
From Fortin and Dale Spatial Analysis
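Two widely used functional forms are the spherical and exponential models, sketched below in Python. Which forms the original figure shows is not recoverable here, so this choice is an assumption, as are the function names and parameter values. In both functions `sill` is the total sill (nugget included), and the exponential model uses the common convention that it reaches about 95% of the way from nugget to sill at the range.

```python
import math


def spherical(h, nugget, sill, range_):
    """Spherical model: rises from the nugget and is exactly flat beyond the range."""
    if h == 0:
        return 0.0
    if h >= range_:
        return sill
    ratio = h / range_
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


def exponential(h, nugget, sill, range_):
    """Exponential model: approaches the sill gradually and never quite reaches it."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / range_))


# Hypothetical parameters: nugget 1, sill 5, range 10 distance units
for h in (0.0, 2.5, 5.0, 10.0, 20.0):
    print(h, round(spherical(h, 1.0, 5.0, 10.0), 3), round(exponential(h, 1.0, 5.0, 10.0), 3))
```

Fitting the model variogram means choosing the nugget, sill, and range so that one of these curves follows the binned empirical values as closely as possible.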
Kriging Method
We can then use a scatter plot of predicted versus actual values to see the extent to which our model actually predicts the values.
If the blue line and the points lie along the 1:1 line, this indicates that the kriging model predicts the data well.
Kriging Method
The fitted variogram results in a series of matrices and vectors that are used in weighting and locally solving the kriging equation.
Basically, at this point, it is similar to other interpolation methods in that we are taking a weighted moving average, but the weights ($\lambda$) are based on statistically derived autocorrelation measures.
The $\lambda$s are chosen so that the estimate is unbiased and the estimated variance is less than for any other possible linear combination of the variables.

$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i)$$
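A compact Python sketch of how ordinary kriging weights can be obtained from a fitted variogram. It is an illustration under stated assumptions, not the Geostatistical Analyst implementation: the variogram is an exponential model with no nugget and made-up sill and range, all sample points are used (no search neighborhood), and the linear system is solved with a small hand-written Gaussian elimination so the example needs no external libraries.

```python
import math


def exponential_variogram(h, sill=1.0, range_=3.0):
    """Assumed variogram model with no nugget, so gamma(0) = 0."""
    return sill * (1.0 - math.exp(-3.0 * h / range_))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging(samples, x0, y0, variogram=exponential_variogram):
    """Return (prediction, kriging variance, weights) at (x0, y0)."""
    n = len(samples)
    # Semivariances between every pair of samples, plus the unbiasedness row and column
    a = [[variogram(math.hypot(xi - xj, yi - yj)) for xj, yj, _ in samples] + [1.0]
         for xi, yi, _ in samples]
    a.append([1.0] * n + [0.0])
    # Semivariances between each sample and the prediction location
    b = [variogram(math.hypot(xi - x0, yi - y0)) for xi, yi, _ in samples] + [1.0]
    solution = solve(a, b)
    weights, multiplier = solution[:n], solution[n]
    prediction = sum(w * z for w, (_, _, z) in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + multiplier
    return prediction, variance, weights


# Hypothetical sample points
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 14.0)]
prediction, variance, weights = ordinary_kriging(samples, 1.0, 1.0)
print(prediction, variance, weights)
```

The extra row and column of ones force the weights to sum to 1, which is what keeps the estimate unbiased; the returned variance is the measure of prediction uncertainty that IDW and Spline do not provide.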
Kriging Method
Produces four types of prediction maps:
Prediction map: predicted values
Probability map: probability that the value is over x
Prediction standard error map: fit of the model
Quantile maps: probability that the value is over a certain quantile
Example: Kriging
This one is kind of in between, because it fits an equation through the points but weights it based on probabilities
Kriging: Ordinary vs. Universal

Universal kriging is also known as kriging in the presence of universal trends.
Universal kriging is used where there is an underlying trend beyond the simple spatial autocorrelation
Generally this trend occurs at a different scale
The trend may be a function of some geographic feature that occurs on one part of the map
Kriging output: prediction
Other methods of interpolation

Thiessen polygons
This method builds polygons, rather than a raster surface, from control points
It grows polygons around sample points; each polygon is supposed to represent an area of homogeneity
Source: Jens-Ulrich Nomme http://www.tuharburg.de/sb3/pssd/GIS-Methods/thiessen.html
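The defining property of Thiessen polygons is that every location takes the value of its nearest sample point. The Python sketch below shows that rule on a grid of cell centres rather than building true vector polygons, which is a simplification; the function name, grid layout, and sample labels are hypothetical.

```python
import math


def thiessen_grid(samples, columns, rows, cell_size=1.0):
    """Assign each grid cell the value of its nearest (x, y, value) sample."""
    grid = []
    for row in range(rows):
        cy = (row + 0.5) * cell_size
        values = []
        for col in range(columns):
            cx = (col + 0.5) * cell_size
            nearest = min(samples, key=lambda s: math.hypot(s[0] - cx, s[1] - cy))
            values.append(nearest[2])
        grid.append(values)
    return grid


# Hypothetical samples labelled A and B at opposite ends of a 4 x 2 grid
samples = [(0.5, 0.5, "A"), (3.5, 1.5, "B")]

for line in thiessen_grid(samples, columns=4, rows=2):
    print(line)
```

Unlike IDW, Spline, or Kriging, the result changes abruptly at the polygon edges, since no averaging between sample points takes place.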
Density Functions
We can also use sample points to map out density raster surfaces. This does not require a z value for each point; it can simply be based on the abundance and distribution of points.
Density Functions
These settings would give us a raster density surface, based just on the abundance of points within a kernel or data frame. In this case, a z value for each point is not necessary.
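A minimal Python sketch of a density surface that uses only point locations. It counts the points within a circular neighborhood of each cell centre and divides by the neighborhood area; this is the simplest form of density and does not apply the smooth weighting that a true kernel function would. The function name, grid layout, and point coordinates are assumptions.

```python
import math


def point_density(points, columns, rows, cell_size, radius):
    """Points per unit area within `radius` of each cell centre; no z value is used."""
    area = math.pi * radius ** 2
    grid = []
    for row in range(rows):
        cy = (row + 0.5) * cell_size
        values = []
        for col in range(columns):
            cx = (col + 0.5) * cell_size
            count = sum(1 for x, y in points if math.hypot(x - cx, y - cy) <= radius)
            values.append(count / area)
        grid.append(values)
    return grid


# Hypothetical point locations clustered near the first cell
points = [(0.5, 0.5), (0.6, 0.5)]

for line in point_density(points, columns=2, rows=1, cell_size=1.0, radius=0.5):
    print(line)
```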
END
Spatial autocorrelation
Correlation of a field with itself
[Figure: examples of low, high, and maximum spatial autocorrelation]
Spatial optimization
www.giscenter.net/eng/work_03_e.html
Spatial interpolation
Linear interpolation
[Figure: points A, B, and C]
Halfway from A to B, the value is (A + B) / 2
Nonlinear Interpolation
When things aren't or shouldn't be so simple
Values computed by piecewise moving window
Basic types:
1. Trend surface analysis / Polynomial
2. Minimum Curvature Spline
3. Inverse Distance Weighted
4. Kriging
1. Trend Surface/Polynomial
point-based
Fits a polynomial to input points
When calculating the function that will describe the surface, uses a least-squares regression fit
approximate interpolator
Resulting surface doesn't pass through all data points
Models a global trend in the data, varying slowly, overlain by local but rapid fluctuations
1. Trend Surface cont.

Flat but TILTED plane to fit data: the surface is approximated by a linear equation (polynomial degree 1), $z = a + bx + cy$
Tilted but WARPED plane to fit data: the surface is approximated by a quadratic equation (polynomial degree 2), $z = a + bx + cy + dx^2 + exy + fy^2$
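The degree 1 case can be sketched in Python as a least-squares fit of z = a + bx + cy, solved through the normal equations. The function names, the small hand-written solver, and the sample points are assumptions for illustration; a GIS package would normally do this step internally.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_plane(samples):
    """Least-squares coefficients (a, b, c) of z = a + bx + cy for (x, y, z) samples."""
    design = [(1.0, x, y) for x, y, _ in samples]
    zs = [z for _, _, z in samples]
    normal_matrix = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    normal_rhs = [sum(r[i] * z for r, z in zip(design, zs)) for i in range(3)]
    return tuple(solve(normal_matrix, normal_rhs))


# Hypothetical samples: four corners on a tilted plane plus one high point in the middle
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (0.0, 1.0, 4.0), (1.0, 1.0, 6.0), (0.5, 0.5, 5.0)]

a, b, c = fit_plane(samples)
residuals = [z - (a + b * x + c * y) for x, y, z in samples]
print(a, b, c)
print(residuals)
```

The residuals are not zero, which shows why trend surfaces are called approximate interpolators: the fitted plane captures the overall tilt but does not pass through the sample points.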
Trend Surfaces
2. Minimum Curvature Splines

Fits a minimum-curvature surface through input points Like bending a sheet of rubber to pass through points
While minimizing curvature of that sheet
repeatedly applies a smoothing equation (piecewise polynomial) to the surface

Resulting surface passes through all points
best for gently varying surfaces, not for rugged ones (can overshoot data values)
3. Distance Weighted Methods
3. Inverse Distance Weighted

Each input point has a local influence that diminishes with distance
Estimates are averages of values at n known points within window R:

$$\hat{z} = \frac{\sum_{i=1}^{n} w_i z_i}{\sum_{i=1}^{n} w_i}$$

where w is some function of distance (e.g., $w = 1/d^k$)
IDW
IDW is popular and easy, but problematic
Interpolated values are limited by the range of the data
No interpolated value will be outside the observed range of z values
How many points should be included in the averaging?
What about irregularly distributed points?
What about the map edges?
IDW Example
Ozone concentrations at CA measurement stations
1. Estimate a complete field, make a map
2. Estimate ozone concentrations at specific locations (e.g., Los Angeles)
Ozone in S. Cal: Text Example
Measuring stations and concentrations (point shapefile)
CA cities (point shapefile)
CA outline (polygon shapefile)
DEM (raster)
IDW Wizard in Geostatistical Analyst define data source
Further define interpolation method

Power of distance
4 sectors
Cross validation
Remove one of the n observation points and use the remaining n-1 points to predict its value; repeat for every point.
Error = observed - predicted
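Cross validation as described above can be sketched in Python with IDW as the predictor. The function names and the three sample points are hypothetical; the same loop would work with any other interpolation function in place of `idw`.

```python
import math


def idw(samples, x0, y0, power=2.0):
    """Inverse distance weighted estimate at (x0, y0) from (x, y, z) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, z in samples:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            return z
        w = d ** -power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total


def leave_one_out_errors(samples, power=2.0):
    """For each point, predict it from the other n-1 points; error = observed - predicted."""
    errors = []
    for i, (x, y, observed) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        errors.append(observed - idw(others, x, y, power))
    return errors


# Hypothetical observations along a line
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (2.0, 0.0, 30.0)]

errors = leave_one_out_errors(samples)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(errors, rmse)
```

The two end points are predicted poorly because IDW cannot estimate values outside the range of the remaining data, which is one of the limitations listed earlier.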
Result
4. Kriging
Assumes the distance or direction between sample points shows a spatial correlation that helps describe the surface
Fits a function to a specified number of points OR all points within a window of specified radius
Based on an analysis of the data, then an application of the results of this analysis to interpolation
Most appropriate when you already know about spatially correlated distance or directional bias in the data
Involves several steps:
Exploratory statistical analysis of the data
Variogram modeling
Creating the surface based on the variogram
Kriging
Breaks up topography into 3 elements: Drift (general trend), small deviations from the drift and random noise.
To be stepped over
Explore with Trend analysis

You may wish to remove a trend from the dataset before using kriging. The Trend Analysis tool can help identify global trends in the input dataset.
Kriging Results
Once the variogram has been developed, it is used to estimate distance weights for interpolation
Computationally very intensive with lots of data points
Estimation of the variogram is complex
No one method is the absolute best
Results are never absolute; they depend on assumptions about distance and directional bias
Kriging Example
Surface has no constant mean; maybe no underlying trend
Surface has a constant mean, no underlying trend
Allows for a trend
Binary data
Analysis of Variogram
Fitting a Model, Directional Effects
How Many Neighbors?
Cross Validation
Kriging Result
Similar pattern to IDW
Less detail in remote areas
Smooth
IDW vs. Kriging

Kriging appears to give a smoother look to the data
Kriging avoids the bull's-eye effect
Kriging gives us a standard error
