
Species Distribution Modelling

Species distribution modelling can help conservation efforts by predicting patterns of species distribution when detailed distribution data is unavailable or costly to collect. Maxent is a presence-only species distribution modelling method that relates species occurrence data to environmental predictors to estimate a probability distribution of species presence across a study area. It requires species presence locations and environmental raster data as inputs. Maxent outputs a probability raster representing the likelihood of species occurrence that can be analyzed and mapped in GIS software to aid conservation planning.


Species distribution modelling

with MAXENT
Mikael von Numers
Åbo Akademi
Why model species distribution?
– Knowledge about the geographical distribution of species
is crucial for conservation and spatial planning.
– Detailed data on species distribution is usually not
available, and collecting such data is costly and
labor-intensive.
– In many cases conservationists have to rely on predictive
models to estimate patterns of species distribution and
to design conservation strategies.
– SDMs provide one of the best ways to overcome the
sparseness typical of distributional data, by relating them
to a set of geographic or environmental predictors.
What do we need for SDM?

– Reliable data on species presences (and absences)


– Environmental data as GIS rasters (predictors)
Typical workflow:
Maxent
A short introduction
• Maxent is a presence-only (po) modelling method, which means that
no absence data is needed.

• Maxent (or other po methods) might be a good choice, for instance,
when:
– No absence data are available, which is often the case (absences are
not recorded; data come from museums, herbaria etc.).

– There is reason to believe that the absence data is not reliable.

– Several other reasons, for instance:
• The species is not stationary (satellite-tagged animals, e.g. porpoises;
radiotelemetry data).
• The species is hard to detect (e.g. reptiles).
• The species is temporarily absent.
• The species occurs in patches.
• You have only a single observation within a large suitable territory (for instance,
singing male birds).
How it works
• The Maxent method does not need species absences; instead it
uses background environmental data for the entire study area. The
method focuses on how the environment where the species is
known to occur relates to the environment across the rest of the
study area. The idea is to find the probability distribution of maximum
entropy (the most spread out), subject to constraints imposed by
the information available on the species presences and the
environmental conditions across the study area (more in Phillips et
al. 2006 and Elith et al. 2011: A statistical explanation of MaxEnt for
ecologists).

• Maxent has similarities to GAM and GLM, but Maxent models a
probability distribution over all pixels in the study area, and in no
sense are pixels without species interpreted as absences, meaning
that “pseudoabsences” are not used.
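The maximum-entropy idea can be illustrated with a minimal sketch. It assumes a single linear feature per pixel (scaled to 0..1) and no regularization; the real Maxent program uses several feature classes and regularization, so this is only a toy version of the principle. The names `distribution` and `fit_maxent` and the toy `depth` values are hypothetical. The fitted distribution is the most spread-out one over all pixels whose expected feature value matches the mean observed at the presence pixels.

```python
import math

def distribution(features, lam):
    """Gibbs distribution over pixels: p(i) proportional to exp(lam . features[i])."""
    scores = [sum(l * f for l, f in zip(lam, row)) for row in features]
    top = max(scores)                      # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def fit_maxent(features, presence_idx, steps=2000, lr=1.0):
    """Find lam so that the expected features under p match the presence means.

    Assumes features are scaled to roughly 0..1 so the fixed step size is stable.
    """
    n_feat = len(features[0])
    # Constraints: mean of each feature over the pixels with presence records.
    target = [sum(features[i][k] for i in presence_idx) / len(presence_idx)
              for k in range(n_feat)]
    lam = [0.0] * n_feat                   # lam = 0 gives the uniform distribution
    for _ in range(steps):
        p = distribution(features, lam)
        expected = [sum(p[i] * features[i][k] for i in range(len(p)))
                    for k in range(n_feat)]
        # Gradient ascent on the log-likelihood of the presence pixels.
        lam = [l + lr * (t - e) for l, t, e in zip(lam, target, expected)]
    return lam

# Toy study area: six pixels with a scaled depth value; presences in the shallow ones.
depth = [[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]]
presences = [0, 1, 2]
lam = fit_maxent(depth, presences)
p = distribution(depth, lam)
```

Note that every pixel contributes to the normalisation, including those without records; none of them is treated as an absence.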
• Advantages:
• Maxent can use both continuous and categorical environmental variables (predictors)
• Maxent is able to fit complex relationships between the species and the environmental
variables (features in Maxent), also including interactions between the predictors.
• Produces test statistics, measures of variable importance and response curves.
• A possibility to make cross-validations.
• A possibility to shift “regularization parameters”. These determine how “focused” the output
distribution is. A larger parameter will give a less localized prediction.
• Works well together with, for instance, ArcView.
• Is reported to be effective with a relatively small number of presences.
• The output raster represents a continuous measure of probability of occurrence.

• Maxent is a fairly new method, but it has performed excellently in tests compared to other
similar methods.
• It is quite easy to use and has a nice, user-friendly interface.
• Shareware, active discussion group, lots of published papers recently.
• Download from: www.cs.princeton.edu/~schapire/maxent/
• Major conclusions drawn from Elith et al. 2006:
– Presence-only data are useful for modelling species' distributions
– Presence-only data can be sufficiently accurate to be used in conservation planning
– New modelling methods, such as MAXENT, generally outperform established methods
• Drawbacks:
– A “black box”: it is not as easy to understand how the method works as it is
for, for instance, GLM or GAM.
– According to the literature, not as “mature” a statistical method as GAM or
GLM.
– Sample selection bias is a bigger problem for presence-only methods than for
presence-absence methods. If the sampling is biased, you will get a model that
combines the species distribution with the distribution of sampling effort.
• There are methods to deal with this problem: you can provide Maxent
with a “bias” raster to correct for the bias in sampling effort.
– If absence data are available, a presence-absence method is a better choice
than a po-method.
With biased sampling, a fitted model might be closer to a model of survey effort than of distribution.
The Maxent user interface
Zostera marina
Species data:
75 presence points of
Zostera marina in the
S. Archipelago Sea
Species data format:
• data as a comma-delimited *.csv file (use Excel).
• only 3 columns needed: species name(s) and co-ordinates.

Species, X_coord, Y_coord
Zostera, 3214710, 6666810
Zostera, 3191860, 6681080
Zostera, 3195940, 6674130
Zostera, 3215030, 6679040
Zostera, 3208580, 6653860
Zostera, 3184780, 6642620
Zostera, 3205750, 6669300
Zostera, 3196800, 6646150
Zostera, 3213730, 6678190
Zostera, 3206280, 6678010
Zostera, 3199600, 6647510
Zostera, 3197280, 6646490
Zostera, 3200910, 6648660
Zostera, 3212160, 6647820
Zostera, 3212160, 6647890
Zostera, 3189660, 6683280
Zostera, 3205810, 6669390
Zostera, 3213530, 6654590
Fucus, 3209220, 6657510
Fucus, 3194840, 6646240
Fucus, 3196250, 6646940
Fucus, 3189310, 6683540
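As a quick check before a Maxent run, the three-column species file can be parsed and validated with a few lines of Python. This is only a sketch: the function name `read_presences` is hypothetical, and it assumes a header row followed by species name, x and y, as in the sample above.

```python
import csv
import io

def read_presences(text):
    """Parse 'species, x, y' rows into {species: [(x, y), ...]}."""
    records = {}
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)                           # skip the header row
    for row in reader:
        if not row:                        # ignore blank lines
            continue
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {row!r}")
        species = row[0].strip()
        x, y = float(row[1]), float(row[2])
        records.setdefault(species, []).append((x, y))
    return records

sample = """Species,X_coord,Y_coord
Zostera, 3214710, 6666810
Zostera, 3191860, 6681080
Fucus, 3209220, 6657510
"""
presences = read_presences(sample)
```

For a file on disk, the same function can be given the result of `open(path).read()`.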

Predictor layers describing the environmental variables

• the grids have to be in ASCII raster format (ESRI .asc).
• the grids must have the same geographic bounds and cell size.
• the layers can be continuous or categorical.

ncols 1827
nrows 2044
xllcorner 3176430
yllcorner 6636626
cellsize 25
NODATA_value -9999
-9999 -9999 -9999 -9999 -9999 -9999 -0.1697558 -0.3892355 -0.629083 -0.8858771 -1.15194
-1.418818 -1.683608 -1.943836 -2.19765 -2.453322 -2.724256 -3.016762 -3.336428
-3.700734 -4.129993 -4.631121 -5.202521 -5.847729 -6.573002 -7.368282 -8.198206
-9.017972 -9.795128 -10.51915 -11.18508 -11.76465 -12.1964 -12.40763 -12.36905
-12.19018 -12.21704 -12.41916 -13.14217 -14.7096 -17.03474 -19.10044 -20.86929
-22.32145 -23.51356 -24.54868 -25.52947 -26.52113 -27.53157 -28.51738 -29.42035
-30.20646 -30.87352 -31.43348 -31.87958 -32.1587 -32.1725 -31.80916 -30.98529 -29.68661
-27.99283 -26.07573 -24.18093 -22.60346 -21.62301 -21.31837 -21.2968 -21.14873
-20.38395 -18.66777 -15.95716 -13.17438 -10.75567 -8.945774 -6.735695 -4.542916
-2.320879 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -0.5598915
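Since all predictor grids must share the same bounds and cell size, it can be worth comparing their headers before a run. The sketch below assumes the six-line ESRI ASCII header shown above; the function names `read_asc_header` and `same_grid` and the second (slope) header are hypothetical.

```python
def read_asc_header(lines):
    """Collect the 'key value' header lines of an ESRI ASCII grid into a dict."""
    header = {}
    for line in lines:
        parts = line.split()
        # Header lines are exactly 'name value'; stop at the first data row.
        if len(parts) != 2 or not parts[0][0].isalpha():
            break
        header[parts[0].lower()] = float(parts[1])
    return header

def same_grid(h1, h2, keys=("ncols", "nrows", "xllcorner", "yllcorner", "cellsize")):
    """True if two headers describe the same extent and resolution."""
    return all(h1.get(k) == h2.get(k) for k in keys)

depth_lines = [
    "ncols 1827", "nrows 2044", "xllcorner 3176430", "yllcorner 6636626",
    "cellsize 25", "NODATA_value -9999", "-9999 -9999 -0.1697558",
]
slope_lines = list(depth_lines[:6]) + ["0.0 0.5 1.2"]   # hypothetical second layer

depth_header = read_asc_header(depth_lines)
slope_header = read_asc_header(slope_lines)
grids_match = same_grid(depth_header, slope_header)
```

For real files, pass an open file object instead of a list; iteration stops at the first data row, so the whole grid is not read.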
Predictors:
– Depth (DEM)
– Exposure
– Distance from sand: a proxy for sandy substrate (that is
not available).
– Slope (derived from the DEM)
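How the slope layer was derived is not stated here; it was presumably done in GIS software. As a rough illustration of the idea, slope can be approximated from a DEM with central differences. The function name `slope_degrees` is hypothetical, NODATA cells are not handled, and GIS packages typically use a 3x3 neighbourhood method instead of this simpler scheme.

```python
import math

def slope_degrees(dem, cellsize):
    """Slope in degrees for interior cells; edge cells are left as None."""
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences in the x and y directions.
            dzdx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cellsize)
            dzdy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cellsize)
            out[r][c] = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    return out

# A tilted plane: depth changes by 25 m per 25 m cell in the x direction.
plane = [[c * 25.0 for c in range(3)] for _ in range(3)]
slope = slope_degrees(plane, cellsize=25)
```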
The Maxent output
probability raster is an ASCII
(.asc) raster, which is easy
to export to ArcView for
further analysis and
symbolisation.
Substrate data = categorical data
Cormorant fishing areas
Substrate included as a categorical variable
Worth remembering when modelling:
1. Garbage in, garbage out.
2. Use a sufficient number of records. No algorithm can model extremely sparse species
data. Guideline: more than 30 records.
3. Each record should bring new information to the model; clusters of observations -> one
observation.
4. Samples should be spread across the whole area of interest. -> Stratified sampling.
5. Beware of sampling bias, especially in po-methods.
6. Pre-process the predictors carefully: resolution, collinearity etc.
7. Check the model fit (AUC, cross-validation, training and test datasets). Large literature
available.
8. Many sources of error. -> Predictions will always be uncertain. -> Be realistic and
cautious when interpreting the results.
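Two of the points above, reducing clusters to one observation and checking the fit with AUC, can be sketched in a few lines. Both functions are illustrative assumptions, not part of Maxent: `thin_to_cells` keeps one record per grid cell, and `auc` is the probability that a randomly chosen presence site scores higher than a randomly chosen background site (ties count as one half).

```python
def thin_to_cells(points, xll, yll, cellsize):
    """Keep only the first record that falls in each grid cell."""
    kept = {}
    for x, y in points:
        cell = (int((x - xll) // cellsize), int((y - yll) // cellsize))
        kept.setdefault(cell, (x, y))      # later records in the same cell are dropped
    return list(kept.values())

def auc(presence_scores, background_scores):
    """Pairwise AUC of presence scores against background scores."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical records: the first two fall in the same 25 m cell.
thinned = thin_to_cells([(10, 10), (12, 14), (40, 10)], xll=0, yll=0, cellsize=25)
fit = auc([0.9, 0.8], [0.1, 0.85])
```

An AUC of 0.5 corresponds to a model that ranks presences no better than random; values closer to 1 indicate better discrimination.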
Workflow:
1. The Maxent program
2. The Maxent output

3. Do a Maxent run using Zostera data and four predictor layers (individually or together)
4. Import the Maxent predictions to ArcView (together)

5. Use ArcView to mask out part of the study area (together).


6. Do a new Maxent run and compare the results.
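In the workflow above the masking is done in ArcView. As an alternative illustration only, the same effect can be sketched in Python by setting the masked cells of a grid to the NODATA value before writing it back to an .asc file. The function name `mask_grid` and the toy values are hypothetical.

```python
def mask_grid(grid, mask, nodata=-9999):
    """Return a copy of grid with cells set to nodata wherever mask is falsy."""
    if len(grid) != len(mask) or any(len(g) != len(m) for g, m in zip(grid, mask)):
        raise ValueError("grid and mask must have the same shape")
    return [[value if keep else nodata for value, keep in zip(g_row, m_row)]
            for g_row, m_row in zip(grid, mask)]

prediction = [[0.10, 0.80], [0.35, 0.60]]   # toy Maxent output values
keep = [[1, 0], [1, 1]]                     # 0 marks the part of the area to exclude
masked = mask_grid(prediction, keep)
```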
