"Argument of length 0" during cross-validation in R


I'm attempting to perform cross-validation with XGBoost using the caret package in R. However, I'm encountering an error message: "Argument of length 0" during the cross-validation process. I've checked my code and data, but I'm unable to pinpoint the exact cause of the issue.

Here's a simplified version of my code:

predcvmodel=function(xtrain,ytrain,k=5)
{
  # xtrain=matrix of predictors
  # ytrain=vector of target 
  # k= # folds in CV
  

  n=nrow(xtrain)
  pred=rep(0,n)
  
  set.seed(123) # Set seed value for randomness
  library(xgboost)
  library(caret)
  
  
  # Create folds for cross-validation
  folds <- createFolds(as.factor(ytrain), k = k, list = TRUE, returnTrain = TRUE)
  
  for(i in 1:k) {
    cat("Iteration", i, "\n")
    
    # Create training and test sets
    train_indices <- unlist(folds[-i])
    test_indices <- folds[[i]]
    
    # Define XGBoost parameters
    xgb_params <- list(
      objective = "reg:tweedie",  # Using Tweedie distribution
      eval_metric = paste("tweedie-nloglik@", 1.5),  # Evaluation metric for Tweedie distribution
      max_depth = 6,             # Maximum tree depth
      eta = 0.3,                 # Learning rate
      nrounds = 100,             # Number of boosting rounds
      verbose = 0                # Verbosity level (0 for silent)
    )
    
    # Train XGBoost model with cross-validation
    xgb_cv <- xgb.cv(
      data = xgb.DMatrix(xtrain[train_indices,], label = ytrain[train_indices]),
      params = xgb_params,
      nrounds = xgb_params$nrounds,  # Include nrounds parameter
      nfold = k,
      early_stopping_rounds = 10,  # Early stopping to prevent overfitting
      maximize = FALSE,             # Whether to maximize the evaluation metric
      stratified = TRUE
    )
    
    # Find the optimal number of boosting rounds
    best_nrounds <- which.min(xgb_cv$evaluation_log$test_tweedie_nloglik_mean)
    
    # Retrain the model on the full training set with the optimal number of rounds
    xgb_model <- xgboost(
      data = xgb.DMatrix(xtrain[train_indices,], label = ytrain[train_indices]),
      params = xgb_params,
      nrounds = best_nrounds
    )
    
    # Make predictions on the test set
    pred[test_indices] <- predict(xgb_model, newdata = xgb.DMatrix(xtrain[test_indices,]))
  }
  
  # Return the predictions
  pred
} 
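
A side note on the fold handling in the function above, separate from the error itself: with `returnTrain = TRUE`, `createFolds` returns the training rows of each fold, so `folds[[i]]` is not a held-out set, and `unlist(folds[-i])` overlaps it heavily. The sketch below shows the intended split with disjoint train and test indices. It uses a base-R fold assignment as a hypothetical stand-in for `createFolds(..., returnTrain = FALSE)`; it is not stratified, and `n`, `k`, and `fold_id` are illustrative names only.

```r
set.seed(123)
n <- 20  # hypothetical number of rows
k <- 5   # number of folds

# Assumption: random, unstratified fold labels built with base R.
# Each element of `folds` holds the held-out (test) rows of one fold,
# which is the layout createFolds gives with returnTrain = FALSE.
fold_id <- sample(rep(seq_len(k), length.out = n))
folds <- split(seq_len(n), fold_id)

for (i in seq_len(k)) {
  test_indices  <- folds[[i]]                            # rows held out in fold i
  train_indices <- unlist(folds[-i], use.names = FALSE)  # all remaining rows

  # Train and test rows must not overlap, and together must cover every row.
  stopifnot(length(intersect(train_indices, test_indices)) == 0)
  stopifnot(length(train_indices) + length(test_indices) == n)
}
```

With this layout, `pred[test_indices]` is filled exactly once per row across the `k` iterations.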

I'm receiving the error message:

Error in begin_iteration:end_iteration : argument of length 0

Can anyone help me identify the source of this error and suggest possible solutions? Thank you.
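
One likely cause, offered as a hedged sketch rather than a confirmed diagnosis: R raises "argument of length 0" when either bound of `a:b` has length zero, which would happen inside `xgboost()` if `best_nrounds` is `integer(0)`. That is what `which.min()` returns when `xgb_cv$evaluation_log` has no column named exactly `test_tweedie_nloglik_mean`, for instance because the `@1.5` suffix of the metric ends up in the column name. The example below uses only base R. `eval_log`, its column names, and its values are hypothetical stand-ins assumed from the metric string, so check `names(xgb_cv$evaluation_log)` on a real run.

```r
# Hypothetical stand-in for xgb_cv$evaluation_log. The column names are an
# assumption derived from the metric string "tweedie-nloglik@1.5".
eval_log <- data.frame(
  iter = 1:5,
  "train_tweedie_nloglik@1.5_mean" = c(9.1, 8.4, 8.0, 7.8, 7.7),
  "test_tweedie_nloglik@1.5_mean"  = c(9.3, 8.7, 8.5, 8.6, 8.8),
  check.names = FALSE
)

# The lookup used in the question: no column matches, so `$` returns NULL.
missing_col <- eval_log$test_tweedie_nloglik_mean
bad_nrounds <- which.min(missing_col)  # zero-length integer vector

# A zero-length bound in `:` produces the message from the question.
err <- tryCatch(1:bad_nrounds, error = function(e) conditionMessage(e))

# More defensive lookup: select the test-mean column by pattern and
# fail early with a clear check if it is not found exactly once.
test_col <- grep("^test_.*_mean$", names(eval_log), value = TRUE)
stopifnot(length(test_col) == 1)
best_nrounds <- which.min(eval_log[[test_col]])

# Separate observation: paste() inserts a space, paste0() does not.
metric_with_space <- paste("tweedie-nloglik@", 1.5)
metric_no_space   <- paste0("tweedie-nloglik@", 1.5)
```

If this matches what you see, passing a `best_nrounds` found this way to `xgboost()` should avoid the zero-length `nrounds`. Building the metric name with `paste0()` also keeps the stray space out of the `eval_metric` string.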
