Uses the BWG allometry data to estimate biomass for available species.
devtools::install_github("ropenscilabs/datastorr")
devtools::install_github("SrivastavaLab/fwdata")
devtools::install_github("SrivastavaLab/bwgbiomass")
Acquire the latest release of allometry data or biomass data
# Acquire the latest allometry data release
allometry_data <- allometry()
# Acquire the latest bwgbiomass data release
biomass_data <- biomass()
Generate the biomass table from scratch:
biomass_data <- gen_biomass()
You can also create intermediate data frames needed to generate the final biomass table. See documentation below for a description of the data frames and the columns they contain:
# The allometry matrix
allometry_matrix <- create_allometry_matrix()
# The equation bank:
equation_bank <- create_equation_bank()
# The category lookup table:
category_lookup <- create_category_lookup()
# The biomass table cleared of original biomass info:
biomass_table <- create_biomass_table()
To do a data comparison with the original biomass estimates:
# Get the latest data
biomass_data <- biomass()
# Do the data check
check_data <- sanity_check(biomass_data)
Data versions are numbered with a `bwgbiomass` version and an `allometrydata` version.
For example:
v.0.0.1_0.0.1
The numbers before the `_` indicate which version of bwgbiomass is being used. When updates are made to the bwgbiomass code, this number will change. The numbers after the `_` indicate which version of the allometry data was used. When allometry data are added or changed, this number will change.
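As a small illustration of the numbering scheme above, a version tag can be split into its two parts. `parse_data_version()` is a hypothetical helper written for this example, not a function exported by bwgbiomass.

```r
# Hypothetical helper (not part of bwgbiomass): split a data version tag
# such as "v.0.0.1_0.0.1" into its code version and its data version.
parse_data_version <- function(tag) {
  # Drop the leading "v." and split on the underscore
  parts <- strsplit(sub("^v\\.", "", tag), "_", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  list(bwgbiomass = parts[1], allometrydata = parts[2])
}

versions <- parse_data_version("v.0.0.1_0.0.1")
versions$bwgbiomass     # version of the bwgbiomass code
versions$allometrydata  # version of the allometry data
```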
- Species in the biomass table with NA `length` and `size_category` are assumed to be of average `size_category`
- For species with a size `range`, `length_mm` is computed as the mean of the range
- Species with missing stage are given the `stage` larva, except for ostracods, which are marked as adult
- In the allometry matrix, `biomass_mg` was computed by dividing by `number_of_individuals`
- Species found at multiple sites are assumed to have the same biomass x length relationship and the linear model was built using all data, not separated by site, habitat, or researcher
- Sizes in the biomass table with an underscore (`_`) are assumed to be a size range and the average of the numbers on either side of the `_` was taken
- Species in the allometry matrix with no `bwg_name` are assumed not to have a `bwg_name` and given a placeholder name of `MISSING_X` where `X` is a unique number for that species
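The size-range assumption can be sketched as follows. `range_to_length()` is a hypothetical helper for illustration only; the package's own parsing may differ.

```r
# Hypothetical helper (not part of bwgbiomass): convert size strings to a
# single length in mm. A value such as "2_4" is read as the range 2 to 4
# and replaced by the mean of its two ends; plain numbers pass through.
range_to_length <- function(size) {
  vapply(
    strsplit(as.character(size), "_", fixed = TRUE),
    function(parts) mean(as.numeric(parts)),
    numeric(1)
  )
}

lengths_mm <- range_to_length(c("2_4", "5", "1.5_2.5"))
lengths_mm
```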
Allometric equations are calculated by linear regression of log10(mass_mg) against log10(length_mm).
95% confidence intervals were computed for biomass predictions.
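A minimal sketch of such a fit in base R is shown below. The measurements are made up for illustration, the column names follow the allometry matrix, and the use of `interval = "confidence"` is an assumption about how the intervals were obtained.

```r
# Made-up measurements for one species and stage (illustration only)
measurements <- data.frame(
  length_mm  = c(1, 2, 4, 8, 16),
  biomass_mg = c(0.011, 0.078, 0.66, 4.9, 42)
)

# Linear regression on the log10-log10 scale
fit <- lm(log10(biomass_mg) ~ log10(length_mm), data = measurements)

# Predict biomass at new lengths, with a 95% confidence interval
new_lengths <- data.frame(length_mm = c(3, 6))
pred_log <- predict(fit, newdata = new_lengths,
                    interval = "confidence", level = 0.95)

# Back-transform from log10(mg) to mg
pred_mg <- 10^pred_log
colnames(pred_mg) <- c("biomass_mg", "biomass_ci_lwr", "biomass_ci_upr")
pred_mg
```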
To determine biomass, species from the biomass table are looked up in the allometry matrix. The size specified in the biomass table is matched to the allometry matrix. If not found, it is either derived using an equation bank, or derived using the closest taxonomic relative.
Biomass was determined in the following order, with the meanings of `provenance` defined below:
- `length.raw`: Exact length was found in the allometry matrix and used directly to determine biomass
- `length.interp`: Length was used to interpolate biomass from the equation bank
- `category.raw`: Size category was used to look up a length estimate, which was found in the allometry matrix. Raw biomass was used based on that length estimate.
- `category.interp`: Size category was used to look up a length estimate, which was not found in the allometry matrix. Biomass was interpolated based on that length estimate, using the equation bank.
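The lookup order above can be sketched for a single species and stage. `estimate_biomass()` and its arguments are hypothetical, and averaging repeated measurements at the same length is an assumption made for this example, not something the package is known to do.

```r
# Hypothetical helper (not part of bwgbiomass).
# matrix_rows: allometry matrix rows for the species (length_mm, biomass_mg)
# equation:    list(intercept, slope) from a log10-log10 fit, or NULL
estimate_biomass <- function(length_mm, length_est_mm, matrix_rows, equation) {
  lookup <- function(len) {
    hit <- matrix_rows$biomass_mg[matrix_rows$length_mm == len]
    # Assumption: average repeated measurements at the same length
    if (length(hit) > 0) mean(hit) else NA_real_
  }
  interpolate <- function(len) {
    if (is.null(equation)) return(NA_real_)
    10^(equation$intercept + equation$slope * log10(len))
  }
  if (!is.na(length_mm)) {
    candidates <- c(length.raw = lookup(length_mm),
                    length.interp = interpolate(length_mm))
  } else if (!is.na(length_est_mm)) {
    candidates <- c(category.raw = lookup(length_est_mm),
                    category.interp = interpolate(length_est_mm))
  } else {
    candidates <- numeric(0)
  }
  # Keep the first method, in order, that produced a value
  found <- candidates[!is.na(candidates)]
  if (length(found) == 0) {
    return(list(biomass_mg = NA_real_, provenance = NA_character_))
  }
  list(biomass_mg = unname(found[1]), provenance = names(found)[1])
}

# Made-up inputs for illustration
matrix_rows <- data.frame(length_mm = c(2, 4), biomass_mg = c(0.1, 0.8))
equation <- list(intercept = -2, slope = 3)

estimate_biomass(2, NA, matrix_rows, equation)  # length found in the matrix
estimate_biomass(3, NA, matrix_rows, equation)  # interpolated from the equation
estimate_biomass(NA, 5, matrix_rows, equation)  # category estimate, interpolated
```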
Meanings of `provenance_species`:
- `exact`: Exact species was found in the allometry matrix
- `related`: The most closely related species (taxonomically) from the allometry matrix was used
When more than one closest relative was available, those relatives with the highest quality data were prioritized, and the median biomass was taken. Following the order of `provenance` above:
1. Look for relatives with `length_mm` matching the `length_mm` of the target species in the biomass table. If only one relative had a matching `length_mm`, this was used. If more than one relative was found, the median biomass was used. Provenance `length.raw` was assigned. If no relatives were found, go to step 2.
2. Look for relatives with an equation in the equation bank. Biomass of all relatives was computed using the `length_mm` of the target species in the biomass table. The median of computed biomasses was used and provenance `length.interp` assigned.
3. If the target species has only `length_est_mm`, look for relatives with `length_mm` matching the `length_est_mm` of the target species in the biomass table. The median of all such relatives was used and provenance `category.raw` assigned. If no relatives were found, go to step 4.
4. Look for relatives with an equation in the equation bank and calculate biomass using `length_est_mm`. Use the median of calculated biomasses.
If no relatives were found in any of the steps, no biomass was assigned.
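The first two steps for a target species with a known `length_mm` can be sketched like this. `biomass_from_relatives()` is a hypothetical helper, the example names and values are made up, and joining relative names with `;` is an assumption for this example.

```r
# Hypothetical helper (not part of bwgbiomass).
# relative_rows:      allometry matrix rows of the closest relatives
#                     (bwg_name, length_mm, biomass_mg)
# relative_equations: equation bank rows of the closest relatives
#                     (bwg_name, intercept, slope)
biomass_from_relatives <- function(target_length_mm, relative_rows,
                                   relative_equations) {
  # Step 1: relatives measured at exactly the target length
  exact <- relative_rows[relative_rows$length_mm == target_length_mm, ]
  if (nrow(exact) > 0) {
    return(list(
      biomass_mg = median(exact$biomass_mg),
      provenance = "length.raw",
      closest_relative = paste(unique(exact$bwg_name), collapse = ";")
    ))
  }
  # Step 2: interpolate with each relative's equation, then take the median
  if (nrow(relative_equations) > 0) {
    predicted <- 10^(relative_equations$intercept +
                       relative_equations$slope * log10(target_length_mm))
    return(list(
      biomass_mg = median(predicted),
      provenance = "length.interp",
      closest_relative = paste(relative_equations$bwg_name, collapse = ";")
    ))
  }
  # No relatives found: no biomass assigned
  list(biomass_mg = NA_real_, provenance = NA_character_,
       closest_relative = NA_character_)
}

# Made-up relatives for illustration
relative_rows <- data.frame(
  bwg_name   = c("Diptera.1", "Diptera.2", "Diptera.3"),
  length_mm  = c(5, 5, 7),
  biomass_mg = c(0.4, 0.6, 1.0)
)
relative_equations <- data.frame(
  bwg_name  = c("Diptera.1", "Diptera.2", "Diptera.3"),
  intercept = c(-2, -2, -2),
  slope     = c(2, 3, 4)
)

biomass_from_relatives(5, relative_rows, relative_equations)  # step 1
biomass_from_relatives(6, relative_rows, relative_equations)  # step 2
```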
All close relatives used to compute a median biomass are named in `closest_relative`. Because of the way the median was computed, the number of species named in `closest_relative` may differ from the number in `num_relatives`.
Dry biomass was always prioritized over wet biomass.
If two species had the same median biomass, the lower value was used. This gives a more conservative biomass estimate.
Description of output files and column meanings
The allometry matrix generated by `create_allometry_matrix()`: The imported and joined XLS files; column descriptions here are copied from the XLS template where possible
- `bwg_name`: BWG name assigned to each species in the BWG database (e.g., Diptera.200). You can check the BWG names for each species in BWGdb. However, if there is still no BWG name assigned to the species you are to include here, please enter 'NA'.
- `name`: (morpho)species name assigned by each researcher working at a given field site.
- `length_mm`: Length of the individual, in millimetres.
- `length_measured_as`: Information on how the body length of this individual or taxon was measured: e.g., head capsule only, from the head to the last abdominal segment, etc.; if this information is not available, please enter 'NA'.
- `number_of_individuals`: The number of individuals in a pooled sample of mass. For example, if we had to weigh 10 tiny instars of chironomid together because the mass of a single one was below the detection limit, then number of individuals = 10.
- `stage`: larva, pupa or adult
- `size_category`: size category name (if one assigned by researcher, e.g. "small", otherwise please enter 'NA')
- `instar_number`: instar number (if one noted, otherwise enter 'NA')
- `biomass_mg`: Wet or dry mass of the individual, in milligrams
- `biomass_type`: Wet or dry biomass
The allometric equation bank, generated with the function `create_equation_bank()`
- `bwg_name`: The BWG name from the database
- `stage`: The developmental stage (i.e. "larva", "pupa", "adult")
- `biomass_type`: Wet or dry biomass
- `fit`: The linear model fit
- `r_squared`: R² for the allometric equation
- `sample_size`: sample size for the allometric equation
- `intercept`: intercept for the allometric equation
- `slope`: slope for the allometric equation
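To show where the numeric columns come from, the sketch below builds one row of such a table from a base R `lm` fit. The measurements are made up, and the `fit` column is left out because storing a model object needs a list column.

```r
# Made-up measurements for one species, stage and biomass type
measurements <- data.frame(
  length_mm  = c(1, 2, 4, 8, 16),
  biomass_mg = c(0.011, 0.078, 0.66, 4.9, 42)
)
fit <- lm(log10(biomass_mg) ~ log10(length_mm), data = measurements)

# One row of an equation-bank-like table derived from the fit
equation_row <- data.frame(
  bwg_name     = "Diptera.200",
  stage        = "larva",
  biomass_type = "dry",
  r_squared    = summary(fit)$r.squared,
  sample_size  = nobs(fit),
  intercept    = unname(coef(fit)[1]),
  slope        = unname(coef(fit)[2])
)
equation_row
```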
The category lookup table for matching a category to a length estimate, generated with the function `create_category_lookup()`
- `bwg_name`: The BWG name from the database
- `stage`: The developmental stage (i.e. "larva", "pupa", "adult")
- `size_category`: The categorical name of the size (e.g. "small", "medium")
- `length_est_mm`: The estimated length for individuals in this size category, measured in millimetres
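One way such a lookup could be applied is a left join on the three key columns, sketched here with base R `merge()`. Both data frames are made up for illustration (including the name `Coleoptera.5`), and the package may join the tables differently.

```r
# Made-up biomass table rows that only have a size category
biomass_table <- data.frame(
  bwg_name      = c("Diptera.200", "Diptera.200", "Coleoptera.5"),
  stage         = c("larva", "larva", "adult"),
  size_category = c("small", "medium", "small"),
  stringsAsFactors = FALSE
)

# Made-up category lookup table
category_lookup <- data.frame(
  bwg_name      = c("Diptera.200", "Diptera.200"),
  stage         = c("larva", "larva"),
  size_category = c("small", "medium"),
  length_est_mm = c(2.5, 6),
  stringsAsFactors = FALSE
)

# Left join: keep every biomass table row; length_est_mm is NA when the
# lookup table has no matching category
with_estimates <- merge(
  biomass_table, category_lookup,
  by = c("bwg_name", "stage", "size_category"),
  all.x = TRUE
)
with_estimates
```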
The biomass table acquired from Dropbox and cleared of biomass measurements, generated with the function `create_biomass_table()`
- `species_id`: a unique ID for the species
- `measurement_id`: a unique ID for the size measurement
- `bwg_name`: The BWG name from the database
- `category_range`: Length measured as a category, or a range?
- `stage`: The developmental stage (i.e. "larva", "pupa", "adult")
- `size_category`: The categorical name of the size (e.g. "small", "medium")
- `length_mm`: The length (if available) of the individual in millimetres
The full table of new biomass estimates, generated with the function `gen_biomass()`
- `species_id`: a unique ID for the species
- `measurement_id`: a unique ID for the size measurement
- `bwg_name`: The BWG name from the database
- `stage`: The developmental stage (i.e. "larva", "pupa", "adult")
- `length_measured_as`: Information on how the body length of this individual or taxon was measured: e.g., head capsule only, from the head to the last abdominal segment, etc.; if this information is not available, please enter 'NA'.
- `length_mm`: The length (if available) of the individual in millimetres
- `length_est_mm`: If a category was used to make an estimate of length, that estimate is placed in this column
- `biomass_type`: Wet or dry biomass
- `biomass_mg`: Wet or dry mass of the individual, in milligrams
- `biomass_ci_upr` and `biomass_ci_lwr`: bounds on the 95% confidence interval of `biomass_mg` if interpolation was used
- `provenance`: method used to calculate biomass
- `provenance_species`: species on which biomass calculation was performed
- `closest_relative`: The name of the closest relative. If this contains multiple `bwg_name`s, the median biomass of those was used
- `num_relatives`: how many relatives were found in the closest shared taxonomic group
- `shared_taxon`: the taxonomic level at which relatives were found
- `r_squared`: R² for the allometric equation
- `sample_size`: sample size for the allometric equation
- `intercept`: intercept for the allometric equation
- `slope`: slope for the allometric equation
Comparison of biomass measurements with Jana's values (from the original table), generated using the function `sanity_check()`
- `measurement_id`: a unique ID for the size measurement
- `species_id`: a unique ID for the species
- `bwg_name`: The BWG name from the database
- `stage`: The developmental stage (i.e. "larva", "pupa", "adult")
- `length_mm`: The length (if available) of the individual in millimetres
- `provenance`: method used to calculate biomass
- `provenance_species`: species on which biomass calculation was performed
- `biomass_orig_mg`: Wet or dry mass of the individual, in milligrams, found in the original biomass table
- `biomass_new_mg`: Wet or dry mass of the individual, in milligrams, calculated by this script
- `new_over_old`: Original biomass divided by the new biomass. A value close to one indicates a close match; a value larger than one means the original biomass was larger; a value less than one means the new biomass was larger. Many of the original biomass values are in grams, so they were first multiplied by 1000 to get mg.
- `biomass_ci_upr` and `biomass_ci_lwr`: bounds on the 95% confidence interval of `biomass_new_mg` if interpolation was used
- `r_squared`: R² for the allometric equation used to generate `biomass_new_mg`, if applicable
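The ratio can be reproduced by hand as in the sketch below, which follows the written description (original divided by new) rather than what the column name might suggest. The values and the `biomass_orig_g` column are made up for illustration.

```r
# Made-up comparison rows; biomass_orig_g is a hypothetical column holding
# original values recorded in grams
comparison <- data.frame(
  measurement_id = 1:3,
  biomass_orig_g = c(0.0005, 0.002, 0.010),
  biomass_new_mg = c(0.5, 1.0, 20)
)

# Convert the original values from grams to milligrams
comparison$biomass_orig_mg <- comparison$biomass_orig_g * 1000

# Ratio as described above: original biomass divided by new biomass
comparison$new_over_old <- comparison$biomass_orig_mg / comparison$biomass_new_mg
comparison
```

In this example the first row is an exact match, the second has a larger original value, and the third has a larger new value.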
The package currently relies heavily on other packages to function, which makes it fairly heavy to install.
MIT + file LICENSE