terra package returns an error when trying to run parallel operations


I'm working with the raster package and am trying to switch to terra, but for reasons I don't understand, terra cannot reproduce the same operation as raster when working in parallel with packages such as snowfall and future.apply. Here is a reproducible example.

library(terra)
library(future.apply) # provides plan() (via future) and future_lapply()
r <- rast()
r[] <- 1:ncell(r)
m <- rast()
m[] <- c(rep(1,ncell(m)/5),rep(2,ncell(m)/5),rep(3,ncell(m)/5),rep(4,ncell(m)/5),rep(5,ncell(m)/5))
ms <- separate(m,other=NA)
plot(ms)
mymask <- function(ind){
  mask <- ms[[ind]]
  
  masked <-
    terra::mask(
      r,
      mask
    )
  
  richard <- function(x){
    k <-0.2
    v <-0.3
    a <-200
    y0 <-2
    y <- k/v*x*(1-((x/a)^v))+y0
    return(y)
  }
  pred <- richard(masked)
  pred <- clamp(pred,lower=0)
  return(pred)
}
#the sequential usage works fine, faster than the `raster` counterpart
system.time(x <- mymask(1))#0.03

#when I try to run my function in parallel I receive an error
plan(multisession,workers=5)
system.time(pred_list <- future_lapply(1:5, FUN = mymask))

Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, : NULL value as symbol address.

Exactly the same code works well if I replace rast with raster and terra::mask with raster::mask. See below:

library(raster)
r <- raster(r)
ms <- stack(ms)
mymask <- function(ind){
  mask <- ms[[ind]]
  
  masked <-
    raster::mask(
      r,
      mask     
    )
  
  richard <- function(x){
    k <-0.2
    v <-0.3
    a <-200
    y0 <-2
    y <- k/v*x*(1-((x/a)^v))+y0
    return(y)
  }
  pred <- richard(masked)
  pred <- clamp(pred,lower=0)
  return(pred)
}
#this works fine
system.time(x <- mymask(1))#0.06
#this works too
plan(multisession,workers=5)
system.time(pred_list <- future_lapply(1:5, FUN = mymask))#15.48

I get the same behavior if I use snowfall instead of future:

library(snowfall)
sfInit(parallel = TRUE, cpus =5)
sfLibrary(terra)
sfExportAll()
system.time(pred_list <- sfLapply(1:5, fun = mymask))
sfStop()

This returns the same error as future_lapply. Why is this happening? I've never seen such an error. I was hoping to take advantage of the higher speed of terra, but now I'm stuck.

There is 1 answer

Robert Hijmans (best answer)

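A SpatRaster cannot be serialized, so you cannot send it to parallel compute nodes. The R object only holds a pointer to a C++ object, and that pointer is not valid in another R session. Have a look here for more discussion.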

Instead you can (a) send and receive filenames; (b) parallelize your custom function that you supply to app or lapp; (c) use the cores=n argument (where available, e.g. app and predict); (d) use a mechanism like wrap; (e) send a filename and a vector to make a SpatExtent to process and create a virtual raster from the output tiles (see ?vrt).

For example, you could use a function like this (option "a"):

prich <- function(filein, fileout) {
    r <- rast(filein)
    richard <- function(x) {
        k <-0.2
        v <-0.3
        a <-200
        y0 <-2
        y <- k/v*x*(1-((x/a)^v))+y0
        y[y<0] <- 0
        return(y)
    }
    # use the raster read from filein ("masked" is not defined in this function)
    x <- app(r, richard, filename=fileout, overwrite=TRUE)
    return(TRUE)
}

I use app because it is much more efficient for large rasters, as it can avoid writing temp files for each of the 10 arithmetic operations with a SpatRaster. Given that you want to parallelize this relatively simple function, I assume the files are very large.
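To make option "a" concrete, here is a minimal sketch of how such a function might be called in parallel with future.apply. Only file names cross the process boundary. The small test raster, the file names in tempdir() and the number of workers are assumptions made for illustration, not part of the original problem.

```r
library(terra)
library(future.apply)

# Hypothetical input: three small rasters written to disk
r <- rast(nrows=18, ncols=36, vals=1:648)
infiles  <- file.path(tempdir(), paste0("in_", 1:3, ".tif"))
outfiles <- file.path(tempdir(), paste0("out_", 1:3, ".tif"))
for (f in infiles) writeRaster(r, f, overwrite=TRUE)

# Each worker opens its own file, so no SpatRaster is serialized
prich <- function(filein, fileout) {
    r <- terra::rast(filein)
    richard <- function(x) {
        k <- 0.2
        v <- 0.3
        a <- 200
        y0 <- 2
        y <- k/v*x*(1-((x/a)^v))+y0
        y[y<0] <- 0
        return(y)
    }
    terra::app(r, richard, filename=fileout, overwrite=TRUE)
    return(TRUE)
}

plan(multisession, workers=2)
ok <- future_mapply(prich, infiles, outfiles)
plan(sequential)

# Back in the main session, open the results from disk
out <- rast(outfiles)
```

The function returns TRUE rather than the SpatRaster, so nothing that holds a C++ pointer is sent back to the main session.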

Or option "c":

richard <- function(x) {
    k <-0.2
    v <-0.3
    a <-200
    y0 <-2
    y <- k/v*x*(1-((x/a)^v))+y0
    y[y<0] <- 0
    return(y)
}
x <- app(masked, richard, cores=12)

In neither case did I include the masking. You could include it in option "a", but mask is disk I/O intensive, not computationally intensive, so it might be as efficient to do it in one step rather than in parallel.

With wrap you could do something like this:

f <- function(w) {
    x <- rast(w)
    y <- richard(x)
    wrap(y)
}

r <- rast(nrow=10, ncol=10, vals=1:100)
x <- f(wrap(r))
x <- rast(x)

Here f would be run in parallel. This only works for small rasters, but you could parallelize over tiles, which you can create with terra::makeTiles.
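As a rough sketch of what running f in parallel could look like, the following uses future_lapply on a list of packed rasters. The three small input rasters and the worker count are assumptions for illustration, and clamp is used for the lower bound as in the question.

```r
library(terra)
library(future.apply)

# Runs on a worker: unpack, compute, and pack the result again
f <- function(w) {
    richard <- function(x) {
        k <- 0.2
        v <- 0.3
        a <- 200
        y0 <- 2
        k/v*x*(1-((x/a)^v))+y0
    }
    x <- terra::rast(w)
    y <- terra::clamp(richard(x), lower=0)
    terra::wrap(y)
}

# Hypothetical inputs: small rasters, packed before they are sent out
r <- rast(nrow=10, ncol=10, vals=1:100)
packed <- lapply(1:3, function(i) wrap(r * i))

plan(multisession, workers=2)
res <- future_lapply(packed, f, future.packages="terra")
plan(sequential)

# Unpack the results in the main session and combine them
out <- rast(lapply(res, rast))
```

Packing copies the cell values into the R object, which is why this is only reasonable for rasters that fit comfortably in memory.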

More internal parallelization options will be coming, but don't hold your breath.
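In the meantime, the tile approach mentioned above could be sketched roughly as follows: write tiles with makeTiles, process each tile file on a worker, and combine the output tiles into a virtual raster with vrt. The 2 x 2 tiling, the file names and the small test raster are assumptions for illustration.

```r
library(terra)
library(future.apply)

# Hypothetical input raster and a 2 x 2 template that defines four tiles
r <- rast(nrows=20, ncols=20, vals=1:400)
template <- rast(nrows=2, ncols=2, extent=ext(r))
tilefiles <- makeTiles(r, template,
                       filename=file.path(tempdir(), "tile_.tif"),
                       overwrite=TRUE)
outfiles <- file.path(dirname(tilefiles),
                      sub("^tile_", "out_", basename(tilefiles)))

# Each worker reads one tile from disk and writes one output tile
ptile <- function(filein, fileout) {
    richard <- function(x) {
        k <- 0.2
        v <- 0.3
        a <- 200
        y0 <- 2
        y <- k/v*x*(1-((x/a)^v))+y0
        y[y<0] <- 0
        return(y)
    }
    terra::app(terra::rast(filein), richard, filename=fileout, overwrite=TRUE)
    return(fileout)
}

plan(multisession, workers=2)
done <- future_mapply(ptile, tilefiles, outfiles, USE.NAMES=FALSE)
plan(sequential)

# Combine the output tiles into a single virtual raster
v <- vrt(done, filename=file.path(tempdir(), "richard.vrt"), overwrite=TRUE)
```

This per-cell function needs no buffer around the tiles. A focal operation would need overlapping tiles, which this sketch does not handle.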