Issue with Combining and Naming Branches in targets R Package


All the code and datasets are available at this GitHub repo.

I am currently working on a workflow for Species Distribution Modeling using the targets R package. I’ve encountered two issues in a specific part of my workflow. Firstly, I am downloading presences in parallel using the crew package since the actual dataset consists of around 40,000 species. I have provided the relevant code below:

library(targets)
source("R/functions.R")
library(crew)

tar_option_set(
  packages = c("readr", "SDMWorkflows", "janitor", "data.table"),
  controller = crew_controller_local(workers = 6),
  error = "null"
)
list(
  tar_target(file, "First_10_species.csv", format = "file"),
  # Read the file
  tar_target(data, get_data(file)),
  # Filter the species to only plants
  tar_target(Only_Plants, filter_plants(data)),
  # Parallelize and retrieve species presences for species within Denmark
  tar_target(Presences,
    get_plant_presences(Only_Plants),
    pattern = map(Only_Plants)
  ),
  # summarize the number of presences per species
  tar_target(Presence_summary, summarise_presences(Presences),
    pattern = map(Presences)
  ),
  # Filter to only the species that have 5 presences
  tar_target(Over_5, Filter_Over_5(Presence_summary))
)

SDMWorkflows is a package I wrote; it can be installed with:

remotes::install_github("Sustainscapes/SDMWorkflows")

The accompanying function script (R/functions.R) is as follows:

get_data <- function(file) {
  readr::read_csv(file) |>
    janitor::clean_names()
}

filter_plants <- function(df) {
  result <- df |>
    dplyr::filter(kingdom == "Plantae") |>
    dplyr::pull(species) |>
    unique() |>
    head(10)
  return(result)
}

get_plant_presences <- function(species) {
  SDMWorkflows::GetOccs(
    Species = unique(species),
    WriteFile = FALSE,
    Log = FALSE,
    country = "DK",
    limit = 100000,
    year = "1999,2023"
  )
}

summarise_presences <- function(df) {
  Sum <- as.data.table(df)[, .N, keyby = .(family, genus, species)]
  return(Sum)
}

Filter_Over_5 <- function(DT) {
  DT[N > 5]
}

While the workflow appears to work well, some branches of Presence_summary fail. The errors are listed in the following table and figure:

name                       error
Presence_summary_24c8afe2  object genus not found
Presence_summary_7044ad96  object genus not found
Presence_summary_a8f163ad  object genus not found
Presence_summary_c7ecffc9  object genus not found

knitr::include_graphics("PlotTarget.png")

These errors are expected for species that have no presences within Denmark. However, the summary itself looks fine: from the initial 10 species, it produces a data.table with 6 species, as shown in this table:

family       genus        species               N
Pinaceae     Abies        Abies cephalonica     1
Pinaceae     Abies        Abies koreana         3
Pinaceae     Abies        Abies nordmanniana    1130
Pinaceae     Abies        Abies sibirica        14
Pinaceae     Abies        Abies veitchii        2
Thuidiaceae  Abietinella  Abietinella abietina  9
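
Since the failing branches are the species without Danish records, one possible way to avoid the errors is to make the summary function tolerate input that lacks the grouping columns. The following is only a minimal sketch: summarise_presences_safe is a hypothetical name, and it assumes that GetOccs returns NULL, a zero-row table, or a table without the family, genus and species columns for such species (this has not been confirmed here).

```r
library(data.table)

# Hypothetical variant of summarise_presences(): returns an empty summary
# instead of failing when there is nothing to summarise.
summarise_presences_safe <- function(df) {
  needed <- c("family", "genus", "species")
  empty <- data.table(
    family = character(), genus = character(),
    species = character(), N = integer()
  )
  # Assumption: a species without records arrives as NULL, a zero-row table,
  # or a table lacking the taxonomy columns.
  if (!is.data.frame(df) || nrow(df) == 0 || !all(needed %in% names(df))) {
    return(empty)
  }
  as.data.table(df)[, .N, keyby = .(family, genus, species)]
}
```

With this guard, branches for species without records would contribute zero rows to Presence_summary rather than an error, so the aggregated table should be unchanged for the species that do have records.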

I have two specific questions:

  • Addressing Errors in summarise_presences: Despite the errors, the results of summarise_presences are as expected. How can I eliminate these errors from the summary?

  • Filtering Species in Presences for Plotting: Suppose I want to use the results of Presences to plot coordinates with a function like PlotPres, but I only want to include species that appear in the Over_5 object. How can I achieve this mapping, given that Over_5 identifies species by name, whereas Presences is split into branches?

library(ggplot2)

# Scatter plot of the occurrence coordinates of one species
PlotPres <- function(df) {
  G <- ggplot(df, aes(x = decimalLongitude, y = decimalLatitude)) +
    geom_point() +
    theme_bw()

  print(G)
}

As you can see, this works if I do it for branch 6:

PlotPres(tar_read("Presences", branches = 6)[[1]])
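
One way to select by species name rather than by branch index might be to filter the combined presences on the species column and then split the result by species. The sketch below uses a hypothetical helper, split_presences_over_5, and assumes that the presences table has a species column whose values match those in Over_5; the toy data are made up for illustration only.

```r
library(data.table)

# Hypothetical helper: keep only the records of species listed in `over_5`
# and return one table per species, named by species.
split_presences_over_5 <- function(presences, over_5) {
  dt <- as.data.table(presences)
  keep <- dt[species %in% over_5$species]
  split(keep, by = "species")
}

# Toy data (invented), standing in for the combined Presences and Over_5
toy_presences <- data.frame(
  species = c("A", "A", "B", "C"),
  decimalLongitude = c(10, 11, 12, 13),
  decimalLatitude = c(55, 56, 57, 58)
)
toy_over_5 <- data.table(species = c("A", "C"), N = c(6L, 7L))

by_species <- split_presences_over_5(toy_presences, toy_over_5)
names(by_species)
```

In the pipeline, this could become an ordinary target without a pattern, for example tar_target(Presences_Over_5, split_presences_over_5(Presences, Over_5)), since a target that refers to a dynamic target without mapping over it receives the aggregated branches. Each element could then be passed to PlotPres, for instance with lapply. Whether the aggregation succeeds when some branches are empty or shaped differently would need checking.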

Session info
sessioninfo::session_info()
#;-) ─ Session info ───────────────────────────────────────────────────────────────
#;-)  setting  value
#;-)  version  R version 4.3.2 (2023-10-31)
#;-)  os       Ubuntu 20.04.6 LTS
#;-)  system   x86_64, linux-gnu
#;-)  ui       X11
#;-)  language en_US:en
#;-)  collate  en_US.UTF-8
#;-)  ctype    en_US.UTF-8
#;-)  tz       Europe/Copenhagen
#;-)  date     2023-11-29
#;-)  pandoc   2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#;-) 
#;-) ─ Packages ───────────────────────────────────────────────────────────────────
#;-)  package     * version date (UTC) lib source
#;-)  backports     1.4.1   2021-12-13 [1] CRAN (R 4.3.0)
#;-)  base64url     1.4     2018-05-14 [1] CRAN (R 4.3.2)
#;-)  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
#;-)  cli           3.6.1   2023-03-23 [3] CRAN (R 4.2.3)
#;-)  codetools     0.2-19  2023-02-01 [4] CRAN (R 4.2.2)
#;-)  colorspace    2.1-0   2023-01-23 [3] CRAN (R 4.2.2)
#;-)  curl          5.1.0   2023-10-02 [1] CRAN (R 4.3.1)
#;-)  data.table  * 1.14.8  2023-02-17 [1] CRAN (R 4.3.0)
#;-)  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
#;-)  dplyr         1.1.4   2023-11-17 [1] CRAN (R 4.3.2)
#;-)  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
#;-)  fansi         1.0.5   2023-10-08 [1] CRAN (R 4.3.1)
#;-)  farver        2.1.1   2022-07-06 [3] CRAN (R 4.2.1)
#;-)  fastmap       1.1.1   2023-02-24 [3] CRAN (R 4.2.2)
#;-)  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
#;-)  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#;-)  ggplot2     * 3.4.4   2023-10-12 [3] CRAN (R 4.3.1)
#;-)  glue          1.6.2   2022-02-24 [3] CRAN (R 4.1.2)
#;-)  gtable        0.3.4   2023-08-21 [3] CRAN (R 4.3.1)
#;-)  highr         0.10    2022-12-22 [1] CRAN (R 4.3.0)
#;-)  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
#;-)  igraph        1.5.1   2023-08-10 [3] CRAN (R 4.3.1)
#;-)  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
#;-)  labeling      0.4.3   2023-08-29 [3] CRAN (R 4.3.1)
#;-)  lifecycle     1.0.4   2023-11-07 [3] CRAN (R 4.3.2)
#;-)  magrittr      2.0.3   2022-03-30 [3] CRAN (R 4.1.3)
#;-)  munsell       0.5.0   2018-06-12 [3] CRAN (R 4.0.0)
#;-)  pillar        1.9.0   2023-03-22 [3] CRAN (R 4.2.3)
#;-)  pkgconfig     2.0.3   2019-09-22 [3] CRAN (R 4.0.0)
#;-)  png           0.1-8   2022-11-29 [1] CRAN (R 4.3.0)
#;-)  processx      3.8.2   2023-06-30 [1] CRAN (R 4.3.1)
#;-)  ps            1.7.5   2023-04-18 [1] CRAN (R 4.3.0)
#;-)  purrr         1.0.2   2023-08-10 [3] CRAN (R 4.3.1)
#;-)  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.0)
#;-)  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#;-)  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#;-)  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.3.0)
#;-)  R6            2.5.1   2021-08-19 [3] CRAN (R 4.1.1)
#;-)  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.0)
#;-)  rlang         1.1.2   2023-11-04 [1] CRAN (R 4.3.2)
#;-)  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
#;-)  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
#;-)  scales        1.3.0   2023-11-28 [1] CRAN (R 4.3.2)
#;-)  sessioninfo   1.2.2   2021-12-06 [3] CRAN (R 4.1.2)
#;-)  styler        1.10.0  2023-05-24 [1] CRAN (R 4.3.0)
#;-)  targets     * 1.3.2   2023-10-12 [1] CRAN (R 4.3.2)
#;-)  tibble        3.2.1   2023-03-20 [3] CRAN (R 4.3.1)
#;-)  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#;-)  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.1)
#;-)  vctrs         0.6.4   2023-10-12 [1] CRAN (R 4.3.1)
#;-)  withr         2.5.2   2023-10-30 [1] CRAN (R 4.3.2)
#;-)  xfun          0.41    2023-11-01 [1] CRAN (R 4.3.2)
#;-)  xml2          1.3.5   2023-07-06 [1] CRAN (R 4.3.1)
#;-)  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.1)
#;-) 
#;-)  [1] /home/au687614/R/x86_64-pc-linux-gnu-library/4.3
#;-)  [2] /usr/local/lib/R/site-library
#;-)  [3] /usr/lib/R/site-library
#;-)  [4] /usr/lib/R/library
#;-) 
#;-) ──────────────────────────────────────────────────────────────────────────────