BioTIMEr
Contents
BTsubset_data
BTsubset_meta
getAlphaMetrics
getBetaMetrics
getLinearRegressions
gridding
resampling
scale_color_biotime
Index
Description
Usage
BTsubset_data
Format
Source
<https://biotime.st-andrews.ac.uk/download.php>
Description
Usage
BTsubset_meta
Format
Source
<https://biotime.st-andrews.ac.uk/download.php>
getAlphaMetrics
Description
Calculates a set of standard alpha diversity metrics
Usage
getAlphaMetrics(x, measure)
Arguments
x (data.frame) BioTIME data table in the format of the output of the gridding
function and/or resampling function.
measure (character) chosen currency defined by a single column name.
Details
The function getAlphaMetrics computes nine alpha diversity metrics for a given community data
frame, where measure is a character input specifying the abundance or biomass field used for the
calculations. For each row of the data frame with data, getAlphaMetrics calculates the following
metrics:
- Species richness (S) as the total number of species in each year with currency > 0.
- Numerical abundance (N) as the total currency (sum) in each year (either total abundance or total
biomass).
- Maximum Numerical abundance (maxN) as the highest currency value reported in each year.
- Shannon or Shannon–Weaver index is calculated as $-\sum_i p_i \log_b p_i$, where $p_i$ is the proportional
abundance of species $i$ and $b$ is the base of the logarithm (natural logarithms), while exponential
Shannon is given by exp(Shannon).
- Simpson’s index is calculated as $1 - \sum_i p_i^2$, while Inverse Simpson as $1 / \sum_i p_i^2$.
- McNaughton’s Dominance is calculated as the sum of the $p_i$ of the two most abundant species.
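- Probability of interspecific encounter or PIE is calculated as
$\frac{N}{N-1}\left(1 - \sum_{i=1}^{S} \pi_i^2\right)$, where $\pi_i$ is the proportional abundance of
species $i$ (written $p_i$ above).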
Note that the input data frame needs to be in the format of the output of the gridding function
and/or resampling functions, which includes keeping the default BioTIME data column names. If
such columns are not found an error is issued and the computations are halted.
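The formulas above can be checked by hand on a single assemblage-year. The following is a minimal base R sketch, not the package's implementation: the abundance vector n and all variable names are hypothetical, and natural logarithms are assumed for the Shannon index.

```r
# Hypothetical abundances of four species in one assemblage-year
n <- c(a = 10, b = 5, c = 3, d = 2)

N <- sum(n)          # numerical abundance
S <- sum(n > 0)      # species richness
maxN <- max(n)       # highest single-species abundance
p <- n / N           # proportional abundances

shannon <- -sum(p * log(p))                     # natural-log Shannon
expShannon <- exp(shannon)
simpson <- 1 - sum(p^2)
invSimpson <- 1 / sum(p^2)
domMc <- sum(sort(p, decreasing = TRUE)[1:2])   # two most abundant species
PIE <- N / (N - 1) * (1 - sum(p^2))
```

Species with zero abundance would need to be dropped before taking logarithms, since log(0) is not finite.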
Value
Returns a data frame with results for species richness (S), numerical abundance (N), maximum
numerical abundance (maxN), Shannon Index (Shannon), Exponential Shannon (expShannon),
Simpson’s Index (Simpson), Inverse Simpson (InvSimpson), Probability of interspecific encounter (PIE)
and McNaughton’s Dominance (DomMc) for each year and assemblageID.
Examples
x <- data.frame(
resamp = 1L,
YEAR = rep(rep(2010:2015, each = 4), times = 4),
Species = c(replicate(n = 8L, sample(letters, 24L, replace = FALSE))),
ABUNDANCE = rpois(24 * 8, 10),
assemblageID = rep(LETTERS[1L:8L], each = 24)
)
res <- getAlphaMetrics(x, measure = "ABUNDANCE")
Description
Usage
getBetaMetrics(x, measure)
Arguments
x (data.frame) BioTIME data table in the format of the output of the gridding
function and/or resampling functions.
measure (character) chosen currency defined by a single column name.
Details
The function getBetaMetrics computes three beta diversity metrics for a given community data
frame, where measure is a character input specifying the abundance or biomass field used for the
calculations. getBetaMetrics calls the vegdist function which calculates for each row the following
metrics: Jaccard dissimilarity (method = "jaccard"), Morisita-Horn dissimilarity (method
= "horn") and Bray-Curtis dissimilarity (method = "bray"). Here, the dissimilarity metrics are
calculated against the baseline year of each assemblage time series i.e. the first year of each time
series. Note that the input data frame needs to be in the format of the output of the gridding and/or
resampling functions, which includes keeping the default BioTIME data column names. If such
columns are not found an error is issued and the computations are halted.
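The idea of comparing every year against the baseline (first) year can be sketched in base R. This is an illustration only, not the package's code: the community matrix is hypothetical, the dissimilarities are written out by hand rather than computed with vegdist, and the Jaccard variant shown is the presence/absence form, which is an assumption about how the index is applied.

```r
# Hypothetical year-by-species abundance matrix for one assemblage
comm <- rbind(
  "2010" = c(a = 10, b = 5, c = 0),
  "2011" = c(a = 8,  b = 5, c = 2),
  "2012" = c(a = 0,  b = 5, c = 10)
)

# Bray-Curtis dissimilarity between two abundance vectors
bray <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Presence/absence Jaccard dissimilarity (assumed variant)
jaccard_pa <- function(x, y) {
  shared <- sum(x > 0 & y > 0)
  total <- sum(x > 0 | y > 0)
  1 - shared / total
}

baseline <- comm[1, ]                   # first year of the time series
later <- comm[-1, , drop = FALSE]       # all subsequent years

diss <- data.frame(
  YEAR = as.integer(rownames(later)),
  BrayCurtis = apply(later, 1, bray, y = baseline),
  JaccardPA = apply(later, 1, jaccard_pa, y = baseline)
)
```

Each row of diss holds the dissimilarity of one later year relative to 2010, so a rising series indicates a community drifting away from its starting composition.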
Value
Returns a data.frame with results for Jaccard dissimilarity (JaccardDiss), Morisita-Horn
dissimilarity (MorisitaHornDiss), and Bray-Curtis dissimilarity (BrayCurtsDiss) for each year and
assemblageID.
Examples
x <- data.frame(
resamp = 1L,
YEAR = rep(rep(2010:2015, each = 4), times = 4),
Species = c(replicate(
n = 8L,
sample(letters, 24L, replace = FALSE))),
ABUNDANCE = rpois(24 * 8, 10),
assemblageID = rep(LETTERS[1L:8L], each = 24)
)
res <- getBetaMetrics(x = x, measure = "ABUNDANCE")
Description
Usage
Arguments
Details
The function getLinearRegressions fits simple linear regression models (see lm for details) to the
output (x) of either the getAlphaMetrics or the getBetaMetrics function. divType needs
to be specified in agreement with x. The typical model has the form metric ~ year. Note that
assemblages with fewer than 3 time points and/or single species time series are removed.
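A per-assemblage fit of the form metric ~ year can be sketched with lm directly. This is a minimal illustration under stated assumptions, not the function's actual code: the input table d, the output column names and the 0.05 cut-off are all hypothetical.

```r
# Hypothetical metric table: one richness value (S) per year and assemblage
d <- data.frame(
  assemblageID = rep(c("A", "B"), each = 6),
  YEAR = rep(2010:2015, times = 2),
  S = c(10, 12, 11, 13, 15, 14, 8, 8, 9, 7, 8, 8)
)

fits <- lapply(split(d, d$assemblageID), function(g) {
  # Skip assemblages with fewer than 3 time points
  if (length(unique(g$YEAR)) < 3) return(NULL)
  co <- summary(lm(S ~ YEAR, data = g))$coefficients
  data.frame(
    assemblageID = g$assemblageID[1],
    intercept = co["(Intercept)", "Estimate"],
    slope = co["YEAR", "Estimate"],
    pvalue = co["YEAR", "Pr(>|t|)"]
  )
})

trends <- do.call(rbind, fits)
# Flag slopes below an assumed significance threshold of 0.05
trends$significance <- as.integer(trends$pvalue < 0.05)
```

The slope gives the estimated change in the metric per year; keeping one row per assemblage makes it straightforward to compare trends across many time series.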
Value
Returns a single long data.frame with results of linear regressions (slope, p-value, significance,
intercept) for each assemblageID.
Examples
library(BioTIMEr)
x <- data.frame(
resamp = 1L,
YEAR = rep(rep(2010:2015, each = 4), times = 4),
Species = c(replicate(n = 8L * 6L, sample(letters[1L:10L], 4L, replace = FALSE))),
ABUNDANCE = rpois(24 * 8, 10),
assemblageID = rep(LETTERS[1L:8L], each = 24)
)
alpham <- getAlphaMetrics(x, "ABUNDANCE")
getLinearRegressions(x = alpham, divType = "alpha", pThreshold = 0.01)
betam <- getBetaMetrics(x = x, "ABUNDANCE")
getLinearRegressions(x = betam, divType = "beta")
Description
Grids BioTIME data into a discrete global grid based on the location of the samples (latitude/longitude).
Usage
gridding(meta, btf, res = 12, resByData = FALSE)
Arguments
meta (data.frame) BioTIME metadata.
btf (data.frame) BioTIME data.
res (integer) cell resolution. Must be in the range [0,30]. Larger values represent
finer resolutions. Default: 12 (~96 sq km). Passed to dgconstruct.
resByData (logical) FALSE by default. If TRUE, the function dg_closest_res_to_area
is called to adapt res to the data extent.
Details
Each BioTIME study contains distinct samples which were collected with a consistent methodology
over time, each with unique coordinates and date. These samples can be fixed plots (i.e. SL or
’single-location’ studies where measures are taken from a set of specific georeferenced sites at any
given time) or wide-ranging surveys, transects, tows, and so on (i.e. ML or ’multi-location’ studies
where measures are taken from multiple sampling locations over large extents that may or may not
align from year to year, see resampling). gridding is a function designed to deal with the issue
of varying spatial extent between studies by using a global grid of hexagonal cells derived from
dgconstruct and assigning the individual samples to the cells across the grid based on their latitude
and longitude. Specifically, each sample is assigned a combination of study ID and grid
cell, resulting in a unique identifier for each assemblage time series within each cell (assemblageID).
This allows for the integrity of each study and each sample to be maintained, while large extent
studies are split into local time series at the grid cell level. meta is a long form data
frame containing the metadata of the BioTIME studies and btf is a data frame containing long
form data from a main BioTIME query (see Examples). res defines the global grid cell resolution,
thus determining the size of the cells (see vignette("dggridR")). res = 12 was found to be the
most appropriate value when working on the whole BioTIME database (corresponding to ~96 km²
cell area), but the user can define their own grid resolution (e.g. res = 14) or, when resByData =
TRUE, allow the function to find the best res based on the average study extent.
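The way samples are turned into cell-level assemblages can be illustrated with a deliberately simplified stand-in. The sketch below bins coordinates into 0.1-degree squares instead of the hexagonal discrete global grid that gridding builds through dgconstruct, so the cell numbers are not comparable with real output. The column names STUDY_ID, LATITUDE and LONGITUDE and the "_" separator are assumptions made for the illustration.

```r
# Hypothetical samples from two studies (column names are assumptions)
samples <- data.frame(
  STUDY_ID = c(10, 10, 10, 20),
  LATITUDE = c(56.34, 56.36, 57.95, 56.34),
  LONGITUDE = c(-2.83, -2.86, -4.15, -2.83)
)

# Stand-in cell index: 0.1-degree square bins, NOT the hexagonal grid
bin <- paste(
  floor(samples$LATITUDE / 0.1),
  floor(samples$LONGITUDE / 0.1),
  sep = ":"
)
samples$cell <- as.integer(factor(bin))

# One identifier per study-by-cell combination (separator is an assumption)
samples$assemblageID <- paste(samples$STUDY_ID, samples$cell, sep = "_")
```

Samples of the same study falling in the same cell share an assemblageID, samples of that study in another cell start a separate local time series, and a different study in the same cell keeps its own identifier.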
Value
Returns a data.frame, with selected columns from the btf and meta data frames, an extra
integer column called cell and two character columns called StudyMethod and assemblageID
(concatenation of study_ID and cell).
Examples
library(BioTIMEr)
gridded_data <- gridding(BTsubset_meta, BTsubset_data)
Description
Takes the output of gridding and applies sample-based rarefaction to standardise the number of
samples per year within each cell-level time series (i.e. assemblageID).
Usage
resampling(x, measure, resamps = 1, conservative = FALSE)
Arguments
x (data.frame) BioTIME gridded data to be resampled (in the format of the out-
put of the gridding function).
measure (character) currency to be retained during the sample-based rarefaction. Can
be either defined by a single column name or a vector of two or more column
names.
resamps (integer) number of repetitions. Default is 1.
conservative (logical) FALSE by default. If TRUE, whenever an NA is found in the measure
field(s), the whole sample is removed instead of the missing observations only.
Details
Sample-based rarefaction prevents temporal variation in sampling effort from affecting diversity
estimates (see Gotelli N.J., Colwell R.K. 2001 Quantifying biodiversity: procedures and pitfalls in the
measurement and comparison of species richness. Ecology Letters 4(4), 379-391) by selecting an
equal number of samples across all years in a time series. resampling counts the number of unique
samples taken in each year (sampling effort), identifies the minimum number of samples across all
years, and then uses this minimum to randomly resample each year down to that number. With
sampling effort thus standardised between years, standard biodiversity metrics can be calculated
based on an equal number of samples (e.g. using getAlphaMetrics, getBetaMetrics). measure
is a character input specifying the chosen currency to be used during the sample-based
rarefaction. It can be a single column name or a vector of two or more column names - e.g. for BioTIME,
measure = "ABUNDANCE", measure = "BIOMASS" or measure = c("ABUNDANCE", "BIOMASS").
By default, any observations with NA within the currency field(s) are removed. You can choose
to remove the full sample where such observations are present by setting conservative to TRUE.
resamps can be used to define multiple iterations, effectively creating multiple alternative datasets
as in each iteration different samples will be randomly selected for the years where number of
samples > minimum. Note that the function always returns a single data frame, i.e. if resamps > 1,
the returned data frame is the result of individual data frames concatenated together, one from each
iteration identified by a numerical unique identifier 1:resamps.
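The rarefaction steps described above (count samples per year, find the minimum, randomly draw that many samples in every year, then sum the currency per species) can be sketched in base R. This is a minimal illustration for a single assemblage and a single iteration, not the package's code: the table obs and the sample identifier column SAMPLE_DESC are assumptions.

```r
set.seed(42)

# Hypothetical records for one assemblage; SAMPLE_DESC identifies a sample
obs <- data.frame(
  YEAR = c(2010, 2010, 2010, 2010, 2011, 2011, 2012, 2012, 2012),
  SAMPLE_DESC = c("s1", "s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"),
  Species = c("a", "b", "a", "c", "a", "b", "a", "b", "c"),
  ABUNDANCE = c(3, 1, 2, 4, 5, 2, 1, 1, 6)
)

# Unique samples per year and the minimum sampling effort across years
samples_by_year <- lapply(split(obs$SAMPLE_DESC, obs$YEAR), unique)
min_samples <- min(lengths(samples_by_year))

# Randomly keep min_samples samples in every year
keep <- unlist(lapply(samples_by_year, function(s) {
  s[sample.int(length(s), min_samples)]
}))
rarefied <- obs[obs$SAMPLE_DESC %in% keep, ]

# Total currency per species and year in the rarefied data
result <- aggregate(ABUNDANCE ~ YEAR + Species, data = rarefied, FUN = sum)
```

Years that already have the minimum number of samples keep all of them, so only the better-sampled years change between iterations.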
Value
Returns a single long form data.frame containing the total currency or currencies of interest (sum)
for each species in each year within each rarefied time series (i.e. assemblageID). An extra integer
column called resamp indicates the specific iteration.
Examples
library(BioTIMEr)
set.seed(42)
x <- gridding(BTsubset_meta, BTsubset_data)
resampling(x, measure = "BIOMASS")
resampling(x, measure = "ABUNDANCE")
resampling(x, measure = c("ABUNDANCE","BIOMASS"))
Description
Usage
scale_color_biotime(palette = "realms", discrete = TRUE, reverse = FALSE, ...)
Arguments
palette One of: "realms", "gradient", "cool", "warm". Defaults to "realms".
discrete See Details. Defaults to TRUE.
reverse Defaults to FALSE.
... Passed to discrete_scale or scale_color_gradient.
Details
USAGE NOTE: Remember to change these arguments (e.g. set discrete = FALSE) when plotting colours continuously.
Value
If discrete is TRUE, the function returns a colour palette produced by discrete_scale and if
discrete is FALSE, the function returns a colour palette produced by scale_color_gradient.
Author(s)
Cher F. Y. Chow
Index
∗ datasets: BTsubset_data, BTsubset_meta
BTsubset_data
BTsubset_meta
dg_closest_res_to_area
dgconstruct
discrete_scale
getAlphaMetrics
getBetaMetrics
getLinearRegressions
gridding
lm
resampling
scale_color_biotime
scale_color_gradient
scale_colour_biotime (scale_color_biotime)
scale_fill_biotime (scale_color_biotime)
vegdist