Using jags.parallel from within a function (R language Error in get(name, envir = envir) : object 'y' not found)

Question

Using jags.parallel from within a function (R language Error in get(name, envir = envir) : object 'y' not found)

3.7k views Asked by goryh At 21 May 2014 at 17:50

Using jags.parallel from the command line or a script works fine. I can run this modified example from http://www.inside-r.org/packages/cran/R2jags/docs/jags just fine

# An example model file is given in:
  model.file <- system.file(package="R2jags", "model", "schools.txt")
#=================#
# initialization  #
#=================#

  # data
  J <- 8.0
  y <- c(28.4,7.9,-2.8,6.8,-0.6,0.6,18.0,12.2)
  sd <- c(14.9,10.2,16.3,11.0,9.4,11.4,10.4,17.6)

  jags.data <- list("y","sd","J")
  jags.params <- c("mu","sigma","theta")
  jags.inits <- function(){
    list("mu"=rnorm(1),"sigma"=runif(1),"theta"=rnorm(J))
  }


#===============================#
# RUN jags and postprocessing   #
#===============================#
#  jagsfit <- jags(data=jags.data, inits=jags.inits, jags.params, 
#    n.iter=5000, model.file=model.file)

  # Run jags parallely, no progress bar. R may be frozen for a while, 
  # Be patient. Currenlty update afterward does not run parallelly

  print("Running Parallel") 
  jagsfit <- jags.parallel(data=jags.data, inits=jags.inits, jags.params, 
    n.iter=5000, model.file=model.file)

However if I wrap it in a function

testparallel <- functions(out){
# An example model file is given in:
    .
    .
    .
jagsfit <- jags.parallel(data=jags.data, inits=jags.inits, jags.params, 
  n.iter=5000, model.file=model.file)
print(out)
return(jagsfit)
}

Then I get the error: Error in get(name, envir = envir) : object 'y' not found Based on what I found here I know that it is an issue with the environment exported to the cluster and I have fixed it by changing

J <- 8.0
y <- c(28.4,7.9,-2.8,6.8,-0.6,0.6,18.0,12.2)
sd <- c(14.9,10.2,16.3,11.0,9.4,11.4,10.4,17.6)

to

  assign("J",8.0,envir=globalenv()) 
  assign("y",c(28.4,7.9,-2.8,6.8,-0.6,0.6,18.0,12.2),envir=globalenv()) 
  assign("sd",c(14.9,10.2,16.3,11.0,9.4,11.4,10.4,17.6),envir=globalenv())

Is there a better way to get around this?

Thank you, Greg

P.S.

I am working on this code for someone else so I don't really want to changes things in the R2jags package to let me pass in the environment to export though I plan on suggesting it to the authors of the package.

Original Q&A

There are 4 answers

Philippe On 23 May 2014 at 14:38

if you use intensively JAGS in parrallel I can suggest you to look the package rjags combined with the package dclone. I think dclone is realy powerfull because the syntaxe was exactly the same as rjags. I have never see your problem with this package.

If you want to use R2jags I think you need to pass your variables and your init function to the workers with the function:

clusterExport(cl, list("jags.data", "jags.params", "jags.inits"))

Kalin On 10 February 2015 at 02:32

Without changing the code of R2jags, you can still assign those data variables to the global environment in an easier way by using list2env.

Obviously, there is is a concern that those variable names could be overwritten in the global environment, but you probably can control for that.

Below is the same code as the example given in the original post except I put the data into a list and sent that list's data into the global environment using the list2env function. (Also I took out the unused "out" variable in the function.) This currently runs fine for me; you may have to add more chains and/or add more iterations to see the parallelism in action, though.

testparallel <- function(){

    library(R2jags)

    model.file <- system.file(package="R2jags", "model", "schools.txt")

    # Make a list of the data with named items.
    jags.data.v2 <- list(
        J=8.0, 
        y=c(28.4,7.9,-2.8,6.8,-0.6,0.6,18.0,12.2),
        sd=c(14.9,10.2,16.3,11.0,9.4,11.4,10.4,17.6) )

    # Store all that data explicitly in the globalenv() as
    # was previosly suggesting using the assign(...) function.
    # This will do that for you.
    # Now R2jags will have access to the data without you having 
    # to explicitly "assign" each to the globalenv.
    list2env( jags.data.v2, envir=globalenv() )

    jags.params <- c("mu","sigma","theta")
    jags.inits <- function(){
        list("mu"=rnorm(1),"sigma"=runif(1),"theta"=rnorm(J))
    }

    jagsfit <- jags.parallel(
        data=names(jags.data.v2), 
        inits=jags.inits, 
        jags.params, 
        n.iter=5000, 
        model.file=model.file)

    return(jagsfit)
}

M.L. On 24 May 2023 at 09:57

This is esentially what r2jags is doing, but rewritten so you can put those variables into the environment by hand in clusterExport(), which loads variables into the blank R session set up separately for each cluster:

jinits <- function(){list(.RNG.name = 'lecuyer::RngStream',
                            .RNG.seed = round(1e+06*runif(1)))}
cl <- makeCluster(mcmc_params$nchains,methods=F, type="PSOCK")
JAGSmod <- function(seed){
      set.seed(seed) #note this affects jinits, but not rjags itself
      jags_mod <- jags.model(mod_path,datastruct,inits=jinits,n.adapt=mcmc_params$nadapt) 
      update(jags_mod, n.iter=mcmc_params$nburnin)
      mod_samp <- coda.samples(jags_mod, monitorparams, n.iter=mcmc_params$nsamples, thin=mcmc_params$nthin)
      return(mod_samp)
    }
clusterExport(cl,c('mod_path','datastruct','mcmc_params','monitorparams','jinits'),envir=environment()) ## could just do 'JAGSmod','jags.model','coda.samples','load.module' here instead but:
clusterEvalQ(cl, {
      library(rjags)
      load.module('lecuyer')
      load.module('glm')
    }) #runs code in each blank cl instance of R
res <- parLapply(cl,1:mcmc_params$nchains,JAGSmod)
stopCluster(cl)
l_mcmc <- as.mcmc.list(lapply(res,as.mcmc))
parsum <- summary(window(l_mcmc))

This can easily be wrapped in a function, although might require you to pass in c(datastruct,mcmc_params,monitorparams).

**goryh** · Accepted Answer · 2015-02-10T23:02:16+00:00

goryh On 10 February 2015 at 23:02 BEST ANSWER

So I have contacted the author of R2jags and he has added an addition argument to jags.parallel that lets you pass envir, which is then past onto clusterExport.

This works well except it allows clashes between the name of my data and variables in the jags.parallel function.

TechQA.

Using jags.parallel from within a function (R language Error in get(name, envir = envir) : object 'y' not found)

There are 4 answers

Related Questions in R

Related Questions in PARALLEL-PROCESSING

Related Questions in R2JAGS

Related Questions in JAGS.PARALLEL

Popular Questions

Trending Questions