Chapter 8 Variation Partitioning - Workshop 10 - Advanced Multivariate Analyses in R
Variation partitioning is a type of analysis that combines RDA and partial RDA to divide the
variation of a response variable among two, three or four explanatory data sets. For example,
you might want to partition the variation in a community matrix among a set of abiotic
environmental variables, and a set of biotic variables. You could also partition this community
variation among small-scale or large-scale variables, to test the effect of spatial scale on your
community.
The results of variation partitioning analyses are traditionally represented by a Venn diagram,
in which the percentage of variance explained by each explanatory data set is reported. When
the variation is partitioned between two explanatory matrices, the result can be represented
as follows:
Figure 8.2: Representing variation partitioning results.
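To make the link between RDA, partial RDA and the fractions more concrete, here is a minimal base-R sketch on simulated data. All object names (`rda_r2`, `X1`, `X2`, `Y` and so on) are hypothetical and not part of the workshop. It computes the unadjusted R2 of a multivariate regression, which is the quantity an RDA reports, and checks that the R2 of X1 after removing the effect of X2 (what a partial RDA measures) equals the R2 of the full model minus the R2 of X2 alone.

```r
# Simulated data: 30 sites, 3 response variables, two explanatory sets
set.seed(42)
n  <- 30
X2 <- matrix(rnorm(n * 2), n, 2)
X1 <- matrix(rnorm(n * 3), n, 3) + X2[, 1]  # X1 is correlated with X2
Y  <- cbind(X1[, 1] + X2[, 1], X1[, 2] - X2[, 2], X1[, 3]) +
  matrix(rnorm(n * 3), n, 3)

# Unadjusted R2 of a multivariate linear model of Y on X:
# variance of the fitted values divided by total variance of Y
rda_r2 <- function(Y, X) {
  Yc  <- scale(Y, center = TRUE, scale = FALSE)
  fit <- fitted(lm(Yc ~ X))
  sum(fit^2) / sum(Yc^2)
}

r2_x1   <- rda_r2(Y, X1)             # X1 alone plus the shared part
r2_x2   <- rda_r2(Y, X2)             # X2 alone plus the shared part
r2_both <- rda_r2(Y, cbind(X1, X2))  # everything explained by X1 and X2

# Partial effect of X1: remove the effect of X2 from X1 first
X1_res     <- resid(lm(X1 ~ X2))
r2_x1_only <- rda_r2(Y, X1_res)

# Expected to be TRUE up to numerical tolerance
all.equal(r2_x1_only, r2_both - r2_x2)
```

This identity holds for the unadjusted R2. varpart() reports fractions based on the adjusted R2, so its values will differ from these raw differences.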
To demonstrate how variation partitioning works in R, we will partition the variation of fish
species composition between chemical and topographic variables. The varpart() function
from vegan makes this easy for us.
## Partition table:
##                  Df R.squared Adj.R.squared Testable
## [a+c] = X1        7   0.60579       0.47439     TRUE
## [b+c] = X2        3   0.41526       0.34509     TRUE
## [a+b+c] = X1+X2  10   0.73414       0.58644     TRUE
## Individual fractions
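The individual fractions follow from the three adjusted R2 values in the partition table by simple subtraction. The sketch below reproduces that arithmetic in base R. It assumes n = 29 sites, which is inferred from the degrees of freedom in the first ANOVA table further down (3 + 25 + 1), and it uses the usual adjusted R2 formula, 1 - (1 - R2)(n - 1)/(n - m - 1), where m is the number of explanatory variables. The variable names are illustrative only.

```r
# Values copied from the partition table
df     <- c(X1 = 7, X2 = 3, both = 10)
r2     <- c(X1 = 0.60579, X2 = 0.41526, both = 0.73414)
adj_r2 <- c(X1 = 0.47439, X2 = 0.34509, both = 0.58644)

# Assumption: n = 29 sites, inferred from the ANOVA table Df (3 + 25 + 1)
n <- 29

# Adjusted R2 recomputed from the unadjusted R2; this should agree with
# the Adj.R.squared column up to rounding of the inputs
adj_check <- 1 - (1 - r2) * (n - 1) / (n - df - 1)
round(adj_check, 5)

# Individual fractions, using the labels of the partition table:
# [a+c] = X1, [b+c] = X2, [a+b+c] = X1+X2
fractions <- c(
  a = adj_r2[["both"]] - adj_r2[["X2"]],                   # X1 alone
  b = adj_r2[["both"]] - adj_r2[["X1"]],                   # X2 alone
  c = adj_r2[["X1"]] + adj_r2[["X2"]] - adj_r2[["both"]],  # shared by X1 and X2
  d = 1 - adj_r2[["both"]]                                 # unexplained
)
round(fractions, 5)
#       a       b       c       d
# 0.24135 0.11205 0.23304 0.41356
```

Because [c] is obtained by subtraction rather than by fitting a model, it has no significance test of its own. Only the fractions that correspond to an RDA or a partial RDA can be tested.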
You can then visualise the results with the plot() function.
anova.cca(rda(spe.hel, env.chem))
## Permutation: free
## Number of permutations: 999
##
## Model: rda(X = spe.hel, Y = env.chem)
## Df Variance F Pr(>F)
...
## Df Variance F Pr(>F)
## Model 3 0.20867 5.918 0.001 ***
## Residual 25 0.29384
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Model: rda(X = spe.hel, Y = env.chem, Z = env.topo)
## Df Variance F Pr(>F)
## Model 7 0.16024 3.0842 0.001 ***
## Residual 18 0.13360
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Variance F Pr(>F)
## Model 3 0.064495 2.8965 0.001 ***
## Residual 18 0.133599
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
All of the testable fractions in the variation partitioning are statistically significant!
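As a quick sanity check, the F values in the ANOVA tables above can be reproduced from their Df and Variance columns: the pseudo-F statistic is the model variance per degree of freedom divided by the residual variance per degree of freedom. The sketch below only redoes that arithmetic in base R; the data frame and its column names are illustrative.

```r
# Df and Variance columns copied from the three ANOVA tables above
anova_rows <- data.frame(
  model_df  = c(3, 7, 3),
  model_var = c(0.20867, 0.16024, 0.064495),
  resid_df  = c(25, 18, 18),
  resid_var = c(0.29384, 0.13360, 0.133599)
)

# Pseudo-F: (model variance / model Df) / (residual variance / residual Df)
pseudo_F <- with(anova_rows,
                 (model_var / model_df) / (resid_var / resid_df))
round(pseudo_F, 4)

# Smallest p-value attainable with 999 permutations
p_min <- (0 + 1) / (999 + 1)
p_min
```

A reported Pr(>F) of 0.001 therefore means that none of the 999 permuted F values reached the observed one.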
8.3 Challenge 3
Partition the variation in the mite species data according to substrate variables (SubsDens,
WatrCont) and significant spatial variables.
data("mite.pcnm")
ordiR2step()
varpart()
anova.cca(rda())
plot()
8.3.1 Challenge 3: Solution
There are a lot of spatial variables in this dataset (22!). We should select the most important
ones, to avoid overloading the model.
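One possible end-to-end sketch of this solution is given here, using the functions listed in the challenge and the `mite`, `mite.env` and `mite.pcnm` data sets that ship with vegan. The object names `mite.spe.hel`, `mite.subs`, `mite.spat` and `mite.part` follow the output shown below. The names `spat.null`, `spat.full`, `spat.sel` and `sel.names`, the Hellinger transformation step, the seed and the `ordiR2step()` settings are assumptions, so the selected variables may differ from those printed below.

```r
library(vegan)

# Built-in oribatid mite data sets
data("mite")       # community matrix
data("mite.env")   # environmental variables, including SubsDens and WatrCont
data("mite.pcnm")  # spatial variables

# Assumption: Hellinger-transform the community matrix
mite.spe.hel <- decostand(mite, method = "hellinger")

# Substrate variables named in the challenge
mite.subs <- mite.env[, c("SubsDens", "WatrCont")]

# Forward selection of spatial variables based on adjusted R2
set.seed(1)  # permutation tests are random; the seed is arbitrary
spat.null <- rda(mite.spe.hel ~ 1, data = mite.pcnm)
spat.full <- rda(mite.spe.hel ~ ., data = mite.pcnm)
spat.sel  <- ordiR2step(spat.null,
                        scope = formula(spat.full),
                        direction = "forward",
                        trace = FALSE)

# Keep only the selected spatial variables
sel.names <- attr(terms(spat.sel), "term.labels")
mite.spat <- mite.pcnm[, sel.names, drop = FALSE]

# Partition the variation between substrate and space
mite.part <- varpart(mite.spe.hel, mite.subs, mite.spat)
mite.part$part

# Test the fraction explained by substrate alone, controlling for space
anova.cca(rda(mite.spe.hel, mite.subs, mite.spat), permutations = 999)
```

The same partial RDA with `mite.subs` and `mite.spat` swapped would test the fraction explained by space alone.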
## rda(formula = mite.spe.hel ~ V2 + V3 + V8 + V1 + V6 + V4 + V9 +
## V16 + V7 + V20, data = mite.pcnm)
...
## Model: rda(X = mite.spe.hel, Y = mite.subs, Z = mite.spat)
## Df Variance F Pr(>F)
## Model 2 0.025602 4.4879 0.001 ***
## Residual 57 0.162583
...
...
# Step 5: Plot
plot(mite.part,
     digits = 2)
Space explains a lot of the variation in species abundances here: 19.4% (p = 0.001) of the
variation is explained by space alone, and 24.8% is jointly explained by space and substrate.
Substrate only explains ~6% (p = 0.001) of the variation in community composition across
sites on its own! Also note that half of the variation is not explained by the variables we
included in the model (look at the residuals!), so the model could be improved.
This large effect of space could be a sign that some spatial ecological process is
important here (like dispersal, for example). However, it could also be telling us that we
are missing an important environmental variable in our model, which itself varies in space!
All the content of the workshop series is under a Creative Commons Attribution-NonCommercial-
ShareAlike 4.0 International License.