I'm implementing a bootstrap-t procedure for confidence bands for a statistic. Here is my code:
# Compute bootstrap variance
bt.var <- function(x, statistic, R = 10000) {
  var(replicate(R, statistic(sample(x, replace = TRUE))))
}
# Compute studentized bootstrap statistic
bt.one.student <- function(x, statistic.0, statistic, R = 10000) {
  (statistic(x) - statistic.0) / sqrt(bt.var(x, statistic, R))
}
# Compute 95% confidence bands
bt.student <- function(x, statistic, R1 = 10000, R2 = 10000) {
  statistic.0 <- statistic(x)
  one.boot <- function(x, statistic.0, statistic, R2) {
    x.star <- sample(x, replace = TRUE)
    theta.hat <- statistic(x.star)
    out <- bt.one.student(x.star, statistic.0, statistic, R2)
    c(theta.hat, out)
  }
  output <- replicate(R1, one.boot(x, statistic.0, statistic, R2))
  var.est <- var(output[1, ])
  q <- quantile(output[2, ], c(0.025, 0.975))
  c(statistic.0 - sqrt(var.est) * q[2], statistic.0 - sqrt(var.est) * q[1])
}
Since I want to implement the function bt.student() using the parallel package to take advantage of multiple cores, I'm using the following code:
library(parallel)
cl <- makeCluster(detectCores())
bt.var <- function(x, statistic, R = 10000) {
  var(parSapply(cl, 1:R, function(i) statistic(sample(x, replace = TRUE))))
}
bt.one.student <- function(x, statistic.0, statistic, R = 10000) {
  (statistic(x) - statistic.0) / sqrt(bt.var(x, statistic, R))
}
one.boot <- function(x, statistic.0, statistic, R2) {
  x.star <- sample(x, replace = TRUE)
  theta.hat <- statistic(x.star)
  out <- bt.one.student(x.star, statistic.0, statistic, R2)
  c(theta.hat, out)
}
bt.student <- function(x, statistic, R1 = 10000, R2 = 10000) {
  statistic.0 <- statistic(x)
  output <- parSapply(cl, 1:R1, function(i) one.boot(x, statistic.0, statistic, R2))
  var.est <- var(output[1, ])
  q <- quantile(output[2, ], c(0.025, 0.975))
  c(statistic.0 - sqrt(var.est) * q[2], statistic.0 - sqrt(var.est) * q[1])
}
clusterExport(cl, c("bt.var", "bt.one.student", "one.boot"))
clusterSetRNGStream(cl)
x <- rnorm(40, mean = 3, sd = 2)
clusterExport(cl, "x")
bt.student(x, mean, R1 = 150, R2 = 150)
I get the following error:
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: could not find function "parSapply"
Do you know why I get this error? I have to use parSapply since there is no parallel equivalent of replicate in the parallel package.
It looks like you're trying to use nested parallelism, which is rather tricky to do and often isn't necessary. The error comes from "bt.var" calling "parSapply" on a worker, where that function isn't available. To make your example work, you'd have to create a cluster object on each worker, but then you'd have far too many workers, which could badly bog down your machine.
I suggest that you revert "bt.var" to the original sequential version, and only use "parSapply" in "bt.student". With the default R1, that gives you 10,000 good-sized tasks, which should work well and make good use of your cores.
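Here is a minimal sketch of that suggestion, reusing the function names from the question. The worker count of 2, the seed 123, and the call to force(R2) are my own assumptions for illustration, not part of the original code; detectCores() from the question can be used instead.

```r
library(parallel)

# Bootstrap variance, sequential again: this runs inside a worker,
# so it must not call parSapply itself.
bt.var <- function(x, statistic, R = 10000) {
  var(replicate(R, statistic(sample(x, replace = TRUE))))
}

# Studentized statistic for one bootstrap sample.
bt.one.student <- function(x, statistic.0, statistic, R = 10000) {
  (statistic(x) - statistic.0) / sqrt(bt.var(x, statistic, R))
}

# One outer replicate: resample, then studentize with an inner bootstrap.
one.boot <- function(x, statistic.0, statistic, R2) {
  x.star <- sample(x, replace = TRUE)
  c(statistic(x.star), bt.one.student(x.star, statistic.0, statistic, R2))
}

# Only the outer loop is parallel; `cl` is the global cluster object,
# as in the question.
bt.student <- function(x, statistic, R1 = 10000, R2 = 10000) {
  statistic.0 <- statistic(x)
  force(R2)  # assumption: evaluate R2 before the closure is shipped to workers
  output <- parSapply(cl, seq_len(R1),
                      function(i) one.boot(x, statistic.0, statistic, R2))
  var.est <- var(output[1, ])
  q <- quantile(output[2, ], c(0.025, 0.975))
  c(statistic.0 - sqrt(var.est) * q[2], statistic.0 - sqrt(var.est) * q[1])
}

cl <- makeCluster(2)  # assumption: 2 workers for the example
clusterExport(cl, c("bt.var", "bt.one.student", "one.boot"))
clusterSetRNGStream(cl, 123)  # hypothetical seed for the worker RNG streams
x <- rnorm(40, mean = 3, sd = 2)
ci <- bt.student(x, mean, R1 = 150, R2 = 150)
stopCluster(cl)
```

Each of the R1 tasks now runs its own R2 inner resamples sequentially on one worker, so no worker ever needs a cluster of its own. Exporting "x" separately should no longer be needed here, since it is captured by the function passed to parSapply.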