Here is the algorithm of what I want to do with R:
- Simulate 10 time series data set from
ARIMA
model througharima.sim()
function - Split the series into sub-series of possible
2s
,3s
,4s
,5s
,6s
,7s
,8s
, and9s
. - For each size take a resample the blocks with replacement, for new series and obtain the best
ARIMA
model from the subseries from each block size throughauto.arima()
function. - Obtain for each subseries of each block sizes
RMSE
.
The below R
function get that done.
## Load packages and prepare multicore process
library(forecast)
library(future.apply)
plan(multisession)
library(parallel)
library(foreach)
library(doParallel)
n_cores <- detectCores()
cl <- makeCluster(n_cores)
registerDoParallel(cores = detectCores())
## simulate ARIMA(1,0, 0)
#n=10; phi <- 0.6; order <- c(1, 0, 0)
bootstrap1 <- function(n, phi){
ts <- arima.sim(n, model = list(ar=phi, order = c(1, 0, 0)), sd = 1)
########################################################
## create a vector of block sizes
t <- length(ts) # the length of the time series
lb <- seq(n-2)+1 # vector of block sizes to be 1 < l < n (i.e to be between 1 and n exclusively)
########################################################
## This section create matrix to store block means
BOOTSTRAP <- matrix(nrow = 1, ncol = length(lb))
colnames(BOOTSTRAP) <-lb
########################################################
## This section use foreach function to do detail in the brace
BOOTSTRAP <- foreach(b = 1:length(lb), .combine = 'cbind') %do%{
l <- lb[b]# block size at each instance
m <- ceiling(t / l) # number of blocks
blk <- split(ts, rep(1:m, each=l, length.out = t)) # divides the series into blocks
######################################################
res<-sample(blk, replace=T, 10) # resamples the blocks
res.unlist <- unlist(res, use.names = FALSE) # unlist the bootstrap series
train <- head(res.unlist, round(length(res.unlist) - 10)) # Train set
test <- tail(res.unlist, length(res.unlist) - length(train)) # Test set
nfuture <- forecast::forecast(train, model = forecast::auto.arima(train), lambda=0, biasadj=TRUE, h = length(test))$mean # makes the `forecast of test set
RMSE <- Metrics::rmse(test, nfuture) # RETURN RMSE
BOOTSTRAP[b] <- RMSE
}
BOOTSTRAPS <- matrix(BOOTSTRAP, nrow = 1, ncol = length(lb))
colnames(BOOTSTRAPS) <- lb
BOOTSTRAPS
return(list(BOOTSTRAPS))
}
Calling the function
bootstrap1(10, 0.6)
I get the below result:
## 2 3 4 5 6 7 8 9
## [1,] 0.8920703 0.703974 0.6990448 0.714255 1.308236 0.809914 0.5315476 0.8175382
I want to repeat the above step 1
to step 4
chronologically, then I think of Monte Carlo
technology in R
. Thus, I load its package and run the below function:
param_list=list("n"=10, "phi"=0.6)
library(MonteCarlo)
MC_result<-MonteCarlo(func = bootstrap1, nrep=3, param_list = param_list)
expecting to get a like of the below result in matrix
form:
## [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0.8920703 0.703974 0.6990448 0.714255 1.308236 0.809914 0.5315476 0.8175382
## [2,] 0.8909836 0.8457537 1.095148 0.8918468 0.8913282 0.7894167 0.8911484 0.8694729
## [3,] 1.586785 1.224003 1.375026 1.292847 1.437359 1.418744 1.550254 1.30784
but I get the following error message:
Error in MonteCarlo(func = bootstrap1, nrep = 3, param_list = param_list) : func has to return a list with named components. Each component has to be scalar.
How can I find my way to obtain a desired result like the above and make the result reproducible?
You get this error message because MonteCarlo expects
bootstrap1()
to accept one parameter combination for the simulation and that it only returns one value (RMSE
) per replication. This is not the case here since the block length (lb
) is determined by the length of the simulated time series (n
) withinbootstrap1
and so you will get results forn - 2
block lengths for each call.A solution is to pass the block length as a parameter and rewrite
bootstrap1()
appropriately:To run the simulation, pass the parameters as well as
bootstrap1()
toMonteCarlo()
. For the simulation to be carried out in parallel you need to set the number of cores viancpus
. The MonteCarlo package uses snowFall, so it should run on Windows.Note that I also set
raw = T
(otherwise the outcomes would be averages over all replications). Setting the seed before will make the results reproducible.The result is an array. I think it's best to transform it into a data.frame via
MakeFrame()
:It's easy to get a
reps x lb
matrix though: