---
title: "Special cases in funbiogeo"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Special Cases in funbiogeo}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  pngquant = "--speed=1 --quality=50"
)
```

```{r setup}
library(funbiogeo)
```

This vignette describes several special cases in the use of `funbiogeo` and provides a detailed example for each. If your case is missing or if you have additional questions, please [open an issue](https://github.com/FRBCesab/funbiogeo/issues/new/choose).

## Working with Categorical Traits

Traits are not always continuous. While `funbiogeo` was designed mainly for continuous trait data, it can also work with categorical trait data. This section describes how to use `funbiogeo` with categorical traits.

The default dataset provided in `funbiogeo` is an extract of the [WOODIV database](https://doi.org/10.1038/s41597-021-00873-3), describing the diversity of Mediterranean trees. It contains data for 24 species. To focus on categorical traits, we propose to add three more traits for each species: its leaf habit (whether it is deciduous or evergreen), its seed dispersal mode, and its shade tolerance.

The next chunk gives these traits for the 24 species. We coded seed dispersal as a categorical trait with two modalities, `"anemochory"` and `"endozoochory"`. We coded shade tolerance as a categorical trait with five ordered levels: `"very_intolerant"`, `"intolerant"`, `"moderately_tolerant"`, `"tolerant"`, and `"very_tolerant"`. We first give the complete dataset, and then randomly remove data points to show how `funbiogeo` displays missing categorical traits.
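As a brief aside on the ordered coding of shade tolerance, the following minimal sketch shows how an ordered factor behaves in base R. The names `shade_levels` and `toy_shade` are hypothetical and used only for illustration, with made-up values rather than WOODIV data.

```r
# Hypothetical level vector, ordered from least to most shade tolerant
shade_levels <- c(
  "very_intolerant", "intolerant", "moderately_tolerant",
  "tolerant", "very_tolerant"
)

# Hypothetical toy values (not WOODIV data), including one missing value
toy_shade <- factor(
  c("tolerant", "intolerant", NA, "very_intolerant"),
  levels  = shade_levels,
  ordered = TRUE
)

# Ordered factors can be compared against one of their levels
toy_shade < "moderately_tolerant"

# Missing values stay NA and can be counted explicitly
table(toy_shade, useNA = "ifany")
```

Declaring the levels explicitly keeps the ecological ordering, from very intolerant to very tolerant, instead of the alphabetical ordering that `factor()` would use by default. The full trait table for the 24 species follows.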
```{r woodiv_cat}
woodiv_cat <- data.frame(
  species = c(
    "AALB", "ACEP", "ANEB", "APIN", "CLIB", "CSEM", "JCOM", "JDEL",
    "JMAC", "JNAV", "JOXY", "JPHO", "JTHU", "PBRU", "PHAL", "PHEL",
    "PINI", "PMUG", "PPIA", "PPIR", "PSYL", "PUNC", "TART", "TBAC"
  ),
  # A single value is recycled: all 24 species are coded as evergreen
  leaf_habit = c("evergreen"),
  seed_dispersal = c(
    "anemochory", "anemochory", "anemochory", "anemochory",
    "anemochory", "anemochory", "endozoochory", "endozoochory",
    "endozoochory", "endozoochory", "endozoochory", "endozoochory",
    "endozoochory", "anemochory", "anemochory", "anemochory",
    "anemochory", "anemochory", "endozoochory", "anemochory",
    "anemochory", "anemochory", "anemochory", "endozoochory"
  ),
  shade_tolerance = c(
    "tolerant", "moderately_tolerant", "moderately_tolerant",
    "moderately_tolerant", "moderately_tolerant", "intolerant",
    "intolerant", "intolerant", "intolerant", "intolerant",
    "intolerant", "intolerant", "intolerant", "very_intolerant",
    "very_intolerant", "intolerant", "intolerant", "intolerant",
    "intolerant", "intolerant", "intolerant", "intolerant",
    "very_intolerant", "very_tolerant"
  )
)

head(woodiv_cat)
```

Then, to simulate missing trait data, we randomly remove two of the 24 values (about 8%) from each trait:

```{r woodiv_cat_na}
# Randomly remove floor(24 / 10) = 2 values from each trait column
set.seed(20260411)

woodiv_cat_na <- apply(
  woodiv_cat[, 2:4], 2, function(x) {
    x[sample(1:24, floor(24 / 10))] <- NA
    x
  }
)

# apply() returns a character matrix: convert back to a data.frame
woodiv_cat_na <- as.data.frame(woodiv_cat_na)

# Add the species column back and move it to the first position
woodiv_cat_na$species <- woodiv_cat$species
woodiv_cat_na <- woodiv_cat_na[, c(4, 1:3)]

# Code shade tolerance as an ordered factor
woodiv_cat_na$shade_tolerance <- factor(
  woodiv_cat_na$shade_tolerance,
  levels = c(
    "very_intolerant", "intolerant", "moderately_tolerant",
    "tolerant", "very_tolerant"
  ),
  ordered = TRUE
)

head(woodiv_cat_na)
```

We can now use most of the functions of `funbiogeo` in the same way as with continuous trait data:

```{r fb_plots}
# Show trait completeness overall
fb_plot_species_traits_completeness(woodiv_cat_na)

# Map site completeness per trait
fb_map_site_traits_completeness(
  woodiv_locations, woodiv_site_species, woodiv_cat_na
)
```

**Note**: the only two functions of `funbiogeo` that won't work with categorical traits are `fb_cwm()`, which computes an abundance-weighted trait average, and `fb_plot_trait_correlation()`, which displays trait-trait correlations.

## Considering Intraspecific Variation

Trait-based ecology tends to present its frameworks and analyses with species average traits. Most of its concepts can, however, apply to intraspecific trait variation, and `funbiogeo` is no different. All of the examples, including the dataset provided with the package, use species average traits. In this section, we detail how to work with data that include intraspecific variation within `funbiogeo`. The approach should be fairly similar to what is possible in other functional diversity R packages.

To include intraspecific variation, the user has to index species within specific sites. For example, if there are three individuals of *Abies alba* in site *A*, then the user has to give a different name to each individual, such as `Abies_alba_1`, `Abies_alba_2`, and `Abies_alba_3`. These names have to be reused consistently across the objects `site_species`, `species_traits`, and `species_categories`. In this way, intraspecific variation can be defined at as fine a level as needed.

It is also possible to provide individual trait values for one or several sites and species average traits for the remaining sites, following the same idea, as long as the naming of species and individuals is consistent across objects. In this case, the specified individuals will be treated as distinct species in trait completeness plots.

## Sites of Arbitrary Shapes

`funbiogeo` contains functions that require site-level data. A "site" is here defined as any geographic grain in which the studied organisms occur.
Depending on the underlying scientific question, a site could be a single geographic point (e.g., a precise sampling location), a polygon (e.g., the extent of a protected area), a (multi-)line (e.g., a transect or a sampling route), or a square in a grid (e.g., a cell of a sampling grid). The `fb_map_*()` functions in `funbiogeo` are agnostic to the shape of the sites: they work whatever the geometry of the sites, and their outputs adapt to it.

In this section, we show examples with sites of different types and see how this affects the output of the `funbiogeo` mapping functions.

We first select the first 100 sites in the `woodiv_locations` object:

```{r sampled_sites}
sampled_sites <- woodiv_locations[1:100, ]

fb_map_site_traits_completeness(
  sampled_sites, woodiv_site_species, woodiv_traits
)
```

We now convert the sites to points by taking their centroids, and use the same `fb_map_*()` function to see how this affects its output:

```{r convert_to_points}
# Convert all the sites into 'POINT' geometry
points_sites <- sf::st_centroid(sampled_sites)

points_sites

# Map the sites
fb_map_site_traits_completeness(
  points_sites, woodiv_site_species, woodiv_traits
)
```

As seen above, the sites are now drawn as points instead of the original squares: the function adapts to the geometry of the sites provided by the user.

`funbiogeo` can accommodate sites of any geometry. To show sites that represent lines, we group the point sites into lines and use the same function.
```{r convert_to_lines}
lines_sites <- points_sites

# Assign groups to create 10 lines of 10 sites each
site_ids <- data.frame(
  site = points_sites$site,
  group_line = rep(1:10, each = 10)
)

# Group sites geographically
lines_sites <- lines_sites |>
  dplyr::inner_join(site_ids, by = "site") |>
  dplyr::group_by(group_line) |>
  dplyr::summarise(country = unique(country)) |>
  sf::st_cast("LINESTRING") |>
  dplyr::rename(site = group_line)

lines_sites

# Group sites in woodiv_site_species
# A species is present on a line if it occurs in at least one of its sites
woodiv_site_lines <- woodiv_site_species |>
  dplyr::inner_join(site_ids, by = "site") |>
  dplyr::select(-site) |>
  dplyr::rename(site = group_line) |>
  dplyr::group_by(site) |>
  dplyr::summarise(
    dplyr::across(dplyr::everything(), function(x) as.numeric(sum(x) > 0))
  )

fb_map_site_traits_completeness(
  lines_sites, woodiv_site_lines, woodiv_traits
)
```

The map now displays lines. They are not a faithful representation of the actual sites, but they illustrate the capabilities of `funbiogeo`.

Similarly to the [upscaling vignette](vignettes/upscaling.Rmd), the map functions can also accommodate larger polygons, for example by aggregating sites per country.

```{r convert_to_polygons}
# Convert all sites to a single polygon
polygon_sites <- sampled_sites |>
  dplyr::group_by(country) |>
  dplyr::summarise(country = unique(country)) |>
  sf::st_cast("MULTIPOLYGON") |>
  dplyr::rename(site = country)

# Compute new site-species object
# The hard-coded site name must match the country name in polygon_sites
woodiv_site_polygon <- woodiv_site_species |>
  subset(site %in% sampled_sites$site) |>
  dplyr::select(-site) |>
  dplyr::mutate(site = "Portugal") |>
  dplyr::group_by(site) |>
  dplyr::summarise(
    dplyr::across(dplyr::everything(), function(x) as.numeric(sum(x) > 0))
  )

# Display the map
fb_map_site_traits_completeness(
  polygon_sites, woodiv_site_polygon, woodiv_traits
)
```

Now all of the sites are merged into a single large polygon.

Have fun with `funbiogeo`! If you have a question, an issue, or a suggestion, please file a report [on GitHub](https://github.com/FRBCesab/funbiogeo/issues/new).