setDT() in functions fails when using future_lapply


Possibly related to this question.

future_lapply throws an error when setDT (or as.data.table, for that matter) is used inside a function, complaining that the object is not a data.table.

This only happens when either:

  1. tweaking multisession with plan(tweak(multisession, workers = 1)), or
  2. running plan(multicore).

Any idea what is happening?

library(data.table)
library(future.apply)
#> Loading required package: future

df <- data.frame(id = 1:4)
f <- function(df, x){
  df |> setDT()
  df[, temp := x]
}

# Runs fine
lapply(1:3, function(x) f(df, x))
#> [[1]]
#>    id temp
#> 1:  1    1
#> 2:  2    1
#> 3:  3    1
#> 4:  4    1
#> 
#> [[2]]
#>    id temp
#> 1:  1    2
#> 2:  2    2
#> 3:  3    2
#> 4:  4    2
#> 
#> [[3]]
#>    id temp
#> 1:  1    3
#> 2:  2    3
#> 3:  3    3
#> 4:  4    3

# Runs fine
plan(multisession)
future_lapply(1:3, function(x) f(df, x))
#> [[1]]
#>    id temp
#> 1:  1    1
#> 2:  2    1
#> 3:  3    1
#> 4:  4    1
#> 
#> [[2]]
#>    id temp
#> 1:  1    2
#> 2:  2    2
#> 3:  3    2
#> 4:  4    2
#> 
#> [[3]]
#>    id temp
#> 1:  1    3
#> 2:  2    3
#> 3:  3    3
#> 4:  4    3

# Tweaking the plan: fails
plan(tweak(multisession, workers = 1))
future_lapply(1:3, function(x) f(df, x))
#> Error in `:=`(temp, x): Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

# Trying as.data.table: fails
f2 <- function(df, x){
  dt = as.data.table(df)
  dt[, temp := x]
}
future_lapply(1:3, function(x) f2(df, x))
#> Error in `:=`(temp, x): Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

# Multicore also fails
# [the error is not shown in this reprex because multicore does not run in
# RStudio, but it was reproduced by running the script from the terminal]
plan(multicore)
#> Warning in supportsMulticoreAndRStudio(...): [ONE-TIME WARNING] Forked
#> processing ('multicore') is not supported when running R from RStudio
#> because it is considered unstable. For more details, how to control forked
#> processing or not, and how to silence this warning in future R sessions, see ?
#> parallelly::supportsMulticore
future_lapply(1:3, function(x) f(df, x))
#> [[1]]
#>    id temp
#> 1:  1    1
#> 2:  2    1
#> 3:  3    1
#> 4:  4    1
#> 
#> [[2]]
#>    id temp
#> 1:  1    2
#> 2:  2    2
#> 3:  3    2
#> 4:  4    2
#> 
#> [[3]]
#>    id temp
#> 1:  1    3
#> 2:  2    3
#> 3:  3    3
#> 4:  4    3

Created on 2022-04-20 by the reprex package (v2.0.1)
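
One way to narrow this down, offered as a sketch rather than a confirmed fix: the error text is the message `:=` emits when it is evaluated as an ordinary function call, which suggests that `[` is not being handled by data.table's method in the failing plans. The variant below (f3 is a hypothetical name, not part of the reprex above) adds the column with data.table's set(), which does not go through `[` at all. I have not verified it under the failing plans, so whether it succeeds there is an assumption to test.

```r
library(data.table)

# f3 is a hypothetical variant of f2 that avoids `[` and `:=` entirely.
f3 <- function(df, x) {
  dt <- as.data.table(df)         # as.data.table() copies, so the caller's df is untouched
  set(dt, j = "temp", value = x)  # add the column by reference without calling `[`
  dt
}

df <- data.frame(id = 1:4)

# Sequential baseline; replace lapply with future_lapply under each plan to compare.
res <- lapply(1:3, function(x) f3(df, x))
```

If f3 runs under plan(tweak(multisession, workers = 1)) while f and f2 still fail, that would point at `[` dispatch rather than at setDT or as.data.table themselves.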
