The `funbiogeo` package requires that information is structured in three different datasets:

- a species x traits `data.frame` (`species_traits` in `funbiogeo`), which contains trait values for several traits (in columns) for several species (in rows).
- a site x species `data.frame` (`site_species` in `funbiogeo`), which contains the presence/absence, abundance, or cover information for species (in columns) by sites (in rows).
- a site x locations object (`site_locations` in `funbiogeo`), which contains the physical locations of the sites of interest.

Optionally, an additional dataset can be provided:

- a species x categories `data.frame` (`species_categories` in `funbiogeo`), which contains two columns: one for species, one for a potential categorization of species (whether taxonomic classes, specific diets, or any arbitrary classification).

In `funbiogeo` these datasets must be in a wide format (where one row hosts several variables across columns), but sometimes information is structured in a long format (one observation per row, also called tidy format).

For instance, the following dataset illustrates the wide format (the presence/absence of all species is spread across columns).

site | species_1 | species_2 | species_3 | species_4 |
---|---|---|---|---|
A | 1 | 0 | 1 | 1 |
B | 0 | 0 | 1 | 1 |
C | 1 | 1 | 1 | 0 |

The following dataset illustrates the long format (the column `species` contains the name of the species and the column `occurrence` contains the presence/absence of species).

site | species | occurrence |
---|---|---|
A | species_1 | 1 |
B | species_1 | 0 |
C | species_1 | 1 |
A | species_2 | 0 |
B | species_2 | 0 |
C | species_2 | 1 |
A | species_3 | 1 |
B | species_3 | 1 |
C | species_3 | 1 |
A | species_4 | 1 |
B | species_4 | 1 |
C | species_4 | 0 |
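
To make the relationship between the two layouts concrete, here is a minimal base R sketch that pivots the long table above into the wide one. It only illustrates the idea with `reshape()` and is not how `funbiogeo` does the conversion internally; the object names `long` and `wide` are chosen for this example.

```r
# Toy long-format data matching the table above (one row per site x species)
long <- data.frame(
  site       = rep(c("A", "B", "C"), times = 4),
  species    = rep(paste0("species_", 1:4), each = 3),
  occurrence = c(1, 0, 1,  0, 0, 1,  1, 1, 1,  1, 1, 0)
)

# Pivot to wide format: one row per site, one column per species
wide <- reshape(
  long,
  idvar     = "site",
  timevar   = "species",
  v.names   = "occurrence",
  direction = "wide"
)

# reshape() prefixes the new columns with "occurrence."; strip that prefix
names(wide) <- sub("^occurrence\\.", "", names(wide))
rownames(wide) <- NULL

wide
```

The result has one row per site (A, B, C) and one column per species, as in the wide table shown earlier.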

`fb_format_*()` functions

If your data are not split into these wide datasets, you can use the `fb_format_*()` functions to create these specific objects from a long format dataset.

- `fb_format_site_locations()` extracts the site x locations information from the long format data
- `fb_format_site_species()` extracts the site x species information from the long format data
- `fb_format_species_traits()` extracts the species x traits information from the long format data
- `fb_format_species_categories()` extracts the species x categories information from the long format data

All these functions take a long dataset as input (argument `data`), where one row corresponds to the occurrence/abundance/coverage of one species at one site, and output a wider object.

`funbiogeo` provides a small excerpt of long format data to show how to use the functions. This data sits at `system.file("extdata", "raw_mammals_data.csv", package = "funbiogeo")`.

Let's import the long format dataset provided by `funbiogeo`:

# Define the path to long format dataset ----
file_name <- system.file("extdata", "raw_mammals_data.csv", package = "funbiogeo")
# Read the file ----
all_data <- read.csv(file_name)
species | order | site | longitude | latitude | count | adult_body_mass | gestation_length | litter_size | max_longevity | sexual_maturity_age | diet_breadth |
---|---|---|---|---|---|---|---|---|---|---|---|
sp_001 | Cetartiodactyla | fb_103 | 7.27182 | 59.09736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_001 | Cetartiodactyla | fb_1001 | 20.77182 | 52.59736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_001 | Cetartiodactyla | fb_102 | 6.77182 | 59.09736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_001 | Cetartiodactyla | fb_104 | 7.77182 | 59.09736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_001 | Cetartiodactyla | fb_101 | 6.27182 | 59.09736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_001 | Cetartiodactyla | fb_1000 | 20.27182 | 52.59736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_001 | Cetartiodactyla | fb_1002 | 21.27182 | 52.59736 | 1 | 461900.76 | 235.00 | 1.25 | 324 | 668.20 | 1 |
sp_002 | Rodentia | fb_1000 | 20.27182 | 52.59736 | 1 | 21.11 | 19.89 | 5.64 | 48 | 76.04 | NA |
sp_002 | Rodentia | fb_1002 | 21.27182 | 52.59736 | 1 | 21.11 | 19.89 | 5.64 | 48 | 76.04 | NA |
sp_002 | Rodentia | fb_1001 | 20.77182 | 52.59736 | 1 | 21.11 | 19.89 | 5.64 | 48 | 76.04 | NA |

The function `fb_format_species_traits()` extracts species trait values from this long table to create the species x traits dataset. Note that each species must have a single value per trait (no trait variation across sites is allowed).

# Extract species x traits data ----
species_traits <- fb_format_species_traits(
  data    = all_data,
  species = "species",
  traits  = c("adult_body_mass", "gestation_length", "litter_size",
              "max_longevity", "sexual_maturity_age", "diet_breadth")
)
# Preview ----
head(species_traits, 10)
#> species adult_body_mass gestation_length litter_size max_longevity
#> 1 sp_001 461900.76 235.00 1.25 324.0
#> 2 sp_002 21.11 19.89 5.64 48.0
#> 3 sp_005 31.60 24.50 4.94 48.0
#> 4 sp_006 21.90 23.68 5.16 52.8
#> 5 sp_010 8.31 NA 1.73 252.0
#> 6 sp_013 31756.51 63.50 4.98 354.0
#> 7 sp_016 22502.01 196.00 1.79 204.0
#> 8 sp_017 240867.13 235.61 1.09 321.6
#> 9 sp_022 9.89 29.00 4.04 38.4
#> 10 sp_026 57224.61 230.00 1.00 300.0
#> sexual_maturity_age diet_breadth
#> 1 668.20 1
#> 2 76.04 NA
#> 3 43.27 NA
#> 4 57.93 4
#> 5 NA 1
#> 6 679.37 1
#> 7 400.97 NA
#> 8 659.91 5
#> 9 66.88 2
#> 10 543.28 2
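
Because trait values may not vary across sites, it can help to check the long table before formatting it. The following base R sketch is one way to do so, on a made-up four-row excerpt; the names `long`, `trait_cols`, `species_traits_check` and `conflicting` are illustrative and not part of `funbiogeo`.

```r
# Toy long-format data: trait values are repeated on every row of a species
long <- data.frame(
  species         = c("sp_001", "sp_001", "sp_002", "sp_002"),
  site            = c("fb_1000", "fb_1001", "fb_1000", "fb_1001"),
  adult_body_mass = c(461900.76, 461900.76, 21.11, 21.11),
  litter_size     = c(1.25, 1.25, 5.64, 5.64)
)

trait_cols <- c("adult_body_mass", "litter_size")

# Keep one row per distinct combination of species and trait values
species_traits_check <- unique(long[, c("species", trait_cols)])

# A species listed more than once here has conflicting trait values across sites
conflicting <- unique(
  species_traits_check$species[duplicated(species_traits_check$species)]
)
conflicting
```

An empty `conflicting` vector means every species carries a single set of trait values, which is the situation the formatting function expects.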

The function `fb_format_site_species()` extracts species occurrence/abundance/coverage from this long table to create the site x species dataset. Note that each species must be recorded only once per site (`funbiogeo` does not yet handle repeated temporal surveys).

# Format site x species data ----
site_species <- fb_format_site_species(
  data       = all_data,
  site       = "site",
  species    = "species",
  value      = "count",
  na_to_zero = TRUE
)
# Preview ----
head(site_species[ , 1:8], 10)
#> site sp_001 sp_002 sp_005 sp_006 sp_010 sp_013 sp_016
#> 1 fb_103 1 0 0 1 0 1 1
#> 2 fb_1001 1 1 1 1 1 1 1
#> 3 fb_102 1 0 0 1 0 1 1
#> 4 fb_104 1 0 0 1 0 1 1
#> 5 fb_101 1 0 0 1 0 1 1
#> 6 fb_1000 1 1 1 1 1 1 1
#> 7 fb_1002 1 1 1 1 1 1 1
#> 8 fb_1022 0 0 1 1 1 0 1
#> 9 fb_1018 0 0 1 1 1 0 0
#> 10 fb_1024 0 0 1 1 1 0 1
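
The requirement of a single record per species and site can be checked up front. Below is a small base R sketch that flags repeated site x species pairs; the toy data deliberately contains one repeated pair, and the names `long`, `keys` and `is_dup` are assumptions made for this example rather than anything provided by `funbiogeo`.

```r
# Toy long-format data; the last row repeats the first site x species pair
long <- data.frame(
  site    = c("fb_1000", "fb_1000", "fb_1001", "fb_1000"),
  species = c("sp_001", "sp_002", "sp_001", "sp_001"),
  count   = c(1, 1, 1, 1)
)

# Flag every row whose site x species pair appears more than once
keys   <- long[, c("site", "species")]
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

# Rows to inspect (and resolve) before formatting
long[is_dup, ]
```

Combining `duplicated()` in both directions returns all rows involved in a repeat, not only the second and later occurrences, which makes it easier to decide which record to keep.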

The function `fb_format_site_locations()` extracts site coordinates from this long table to create the site x locations dataset. Note that each site must have a single longitude x latitude pair.

# Format site x locations data ----
site_locations <- fb_format_site_locations(
  data      = all_data,
  site      = "site",
  longitude = "longitude",
  latitude  = "latitude",
  na_rm     = FALSE
)
# Preview ----
head(site_locations)
#> Simple feature collection with 6 features and 1 field
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 52.59736 ymin: 6.271821 xmax: 59.09736 ymax: 20.77182
#> Geodetic CRS: WGS 84
#> site geometry
#> 1 fb_103 POINT (59.09736 7.271821)
#> 2 fb_1001 POINT (52.59736 20.77182)
#> 3 fb_102 POINT (59.09736 6.771821)
#> 4 fb_104 POINT (59.09736 7.771821)
#> 5 fb_101 POINT (59.09736 6.271821)
#> 6 fb_1000 POINT (52.59736 20.27182)

The function `fb_format_species_categories()` extracts species values for one supra-category (optional) from this long table to create the species x categories dataset. This category (e.g. order, family, endemism status, conservation status, etc.) can later be used by several functions in `funbiogeo` to aggregate metrics at this level.

# Extract species x categories data ----
species_categories <- fb_format_species_categories(
  data     = all_data,
  species  = "species",
  category = "order"
)
# Preview ----
head(species_categories, 10)
#> species order
#> 1 sp_001 Cetartiodactyla
#> 8 sp_002 Rodentia
#> 11 sp_005 Rodentia
#> 27 sp_006 Rodentia
#> 59 sp_010 Chiroptera
#> 81 sp_013 Carnivora
#> 89 sp_016 Cetartiodactyla
#> 113 sp_017 Cetartiodactyla
#> 132 sp_022 Eulipotyphla
#> 138 sp_026 Cetartiodactyla
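
To give a sense of what aggregating at the category level means, here is a base R sketch that averages one trait per order. It uses `merge()` and `tapply()` on a five-species excerpt of the values shown above and does not call any `funbiogeo` function; the object names are chosen for this example only.

```r
# Excerpt of the species x categories table shown above
species_categories <- data.frame(
  species = c("sp_001", "sp_002", "sp_005", "sp_006", "sp_010"),
  order   = c("Cetartiodactyla", "Rodentia", "Rodentia", "Rodentia",
              "Chiroptera")
)

# Matching excerpt of the species x traits table (one trait only)
species_traits <- data.frame(
  species     = c("sp_001", "sp_002", "sp_005", "sp_006", "sp_010"),
  litter_size = c(1.25, 5.64, 4.94, 5.16, 1.73)
)

# Attach the category to each species, then average the trait per category
merged <- merge(species_categories, species_traits, by = "species")
mean_litter_by_order <- tapply(merged$litter_size, merged$order, mean)

mean_litter_by_order
```

The result is a named vector with one mean litter size per order, which is the kind of per-category summary the species x categories dataset makes possible.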

Once your data are in the right format, you can get started with `funbiogeo`.