Understanding memory usage and performance of `furrr::future_walk`


I am running into problems parallelizing work with furrr::future_walk (see Optimizing memory usage in applying a furrr function to a large list of tibbles: experiencing unexpected memory increase, and memory not being released in future with nested plan(multisession)).

I reduced my setup to a minimal example:

    rm(list=ls(all=TRUE))
    
    require(future)
    require(furrr)
    require(dplyr)
    require(readr)
    require(parallel)
    set.seed(123)
    
    # fake data: 1,000,000 numeric vectors of length 1,000
    my_list <- replicate(1000000, rnorm(1000), simplify = FALSE)
    
    # function to parallelize
    f_to_parallelize <- function(x){
      
      y <- sum(x)
      
      return(y)
      
    }
    
    # plans to test
    plan(sequential)
    #plan(multisession, workers=2)
    #plan(multisession, workers=6)
    #plan(multisession, workers=15)
    
    l <- future_walk(my_list, f_to_parallelize)
    

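For scale: each element is a numeric vector of 1,000 doubles, so the list payload alone is on the order of 8 GB before any copies are made. A rough back-of-the-envelope check (a sketch using only base R's `utils::object.size`):

    # approximate size of one element, in bytes
    one_elem <- as.numeric(object.size(rnorm(1000)))   # ~8 KB per vector
    
    # approximate size of the whole list, in GiB (ignores per-list overhead)
    one_elem * 1e6 / 1024^3                            # roughly 7-8 GiB

Any plan that has to serialize this list (or chunks of it) to worker processes will necessarily duplicate some of that memory.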
When I profile memory and time for these 4 plans this is what I get:

[figure: memory usage over time, profiled for the four plans]

I launched 4 different jobs from RStudio Server, one per plan, while profiling the total memory used by my user's processes in a separate job to collect the data for the graph.

This is the output of sessionInfo() for the parallelization jobs:

    R version 4.2.2 Patched (2022-11-10 r83330)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Ubuntu 20.04.6 LTS
    
    Matrix products: default
    BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
    LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
    
    locale:
     [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
     [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
     [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
    [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
    
    attached base packages:
    [1] parallel  stats  graphics  grDevices  utils  datasets  methods  base
    
    other attached packages:
    [1] readr_2.1.2   dplyr_1.1.0   furrr_0.2.3   future_1.24.0
    
    loaded via a namespace (and not attached):
     [1] rstudioapi_0.13   parallelly_1.30.0 magrittr_2.0.2    hms_1.1.1
     [5] tidyselect_1.2.0  R6_2.5.1          rlang_1.1.1       fansi_1.0.2
     [9] globals_0.14.0    tools_4.2.2       utf8_1.2.2        cli_3.6.0
    [13] ellipsis_0.3.2    digest_0.6.29     tibble_3.1.6      lifecycle_1.0.3
    [17] crayon_1.5.0      tzdb_0.2.0        purrr_1.0.1       vctrs_0.5.2
    [21] codetools_0.2-18  glue_1.6.2        compiler_4.2.2    pillar_1.7.0
    [25] generics_0.1.2    listenv_0.8.0     pkgconfig_2.0.3

Is this behavior normal? I did not expect the steep memory increase under all of the plans, nor the increase in runtime as I add more workers.

I also tested the Sys.sleep(1) function in parallel, and there I got the result I expected: time decreases as I increase the number of workers.
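That check was along these lines (a sketch; the exact iteration count I used may have differed):

    # Sys.sleep is pure waiting, so it parallelizes cleanly with no data to ship
    plan(multisession, workers = 6)
    system.time(
      future_walk(1:30, ~ Sys.sleep(1))
    )
    # with 6 workers this takes roughly 30/6 = 5 seconds plus startup overhead,
    # instead of ~30 seconds under plan(sequential)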

What I actually need to parallelize is far more complex than this: a series of nested wrapper functions that train some time-series models, run inference, write a csv, and return nothing.
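The shape of the real per-item task is roughly the following (all function names here are placeholders, not my actual code):

    # hypothetical sketch of the real workload: train, predict, write, return nothing
    f_real <- function(item) {
      fit  <- train_model(item$series)        # placeholder for the model-training step
      pred <- run_inference(fit)              # placeholder for the inference step
      readr::write_csv(pred, file.path("out", paste0(item$id, ".csv")))
      invisible(NULL)                         # nothing is returned to the main session
    }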

I feel like I am missing something very simple, but I cannot wrap my head around it. What concerns me most is the memory increase, since the real function will be very memory intensive. Also, the production machine will run Windows, so I will not be able to use mclapply or other fork-based methods. I would much appreciate it if anyone could clarify this for me.
