Unchanged `lambda.min` values for multiple LASSO regressions in R


I'm trying to perform multiple LASSO regressions in R using the following code:

library(readxl)
library(glmnet)
library(coefplot)

data <- read_excel("data.xlsx") # 20x20 matrix
A <- as.matrix(data)

# For each column i, regress it on all the other columns
results <- lapply(seq_len(ncol(A)), function(i) {
  list(
    fit_lasso = glmnet(A[, -i], A[, i], standardize = TRUE, alpha = 1),
    cvfit = cv.glmnet(A[, -i], A[, i], standardize = TRUE, type.measure = "mse", nfolds = 10, alpha = 1)
  )
})
coefficients <- lapply(results, function(x, fun) fun(coef(x$cvfit, s = "lambda.min")), function(x) x[x[, 1L] != 0L, 1L, drop = FALSE])

The resulting `results` object is a large list (20 elements, 1 MB) holding the same kind of LASSO output for each of the 20 variables, and `coefficients` contains only the selected (nonzero) coefficients in each case.

I notice that the results for the same dataset are not always the same, perhaps because lambda takes different values on each run; I'm not sure. I want my code to find the same `lambda.min` values and always give the same results when I run it on the dataset. I believe `set.seed()` might do it, but I can't figure out where to put it.

How can I make it always produce the same output for a specific dataset?

There is 1 answer

DaveArmstrong On BEST ANSWER

I got it to produce the same `lambda.min` values from run to run just by calling `set.seed()` before the `lapply()` call. `cv.glmnet()` assigns observations to cross-validation folds at random, so setting the seed first fixes those random draws.

library(readxl)
library(glmnet)
library(coefplot)

data <- read_excel("data.xlsx") # 20x20 matrix
A <- as.matrix(data)

# Fix the random fold assignments used by every cv.glmnet() call below
set.seed(54234)
results <- lapply(seq_len(ncol(A)), function(i) {
  list(
    fit_lasso = glmnet(A[, -i], A[, i], standardize = TRUE, alpha = 1),
    cvfit = cv.glmnet(A[, -i], A[, i], standardize = TRUE, type.measure = "mse", nfolds = 10, alpha = 1)
  )
})
coefficients <- lapply(results, function(x, fun) fun(coef(x$cvfit, s = "lambda.min")), function(x) x[x[, 1L] != 0L, 1L, drop = FALSE])
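
As a variation on the same idea, the fold assignment can be made explicit through the `foldid` argument of `cv.glmnet()`, so that every regression uses the same folds and the result no longer depends on the state of the random number generator when each call happens. This is only a minimal sketch: the matrix `A` below is simulated as a stand-in for `data.xlsx`, and the helper name `lambda_min_all` is made up for illustration. With 20 rows and 10 folds, `cv.glmnet()` may warn that it enforces `grouped = FALSE` because each fold has fewer than 3 observations.

```r
library(glmnet)

# Assumption: simulated 20x20 matrix standing in for the data read from data.xlsx
set.seed(1)
A <- matrix(rnorm(20 * 20), nrow = 20, ncol = 20)

# Assign each row to one of 10 folds once; the seed makes this assignment repeatable
nfolds <- 10
set.seed(54234)
foldid <- sample(rep(seq_len(nfolds), length.out = nrow(A)))

# Hypothetical helper: lambda.min for every column regressed on the others,
# always using the fixed fold assignment above
lambda_min_all <- function(A, foldid) {
  vapply(seq_len(ncol(A)), function(i) {
    cvfit <- cv.glmnet(A[, -i], A[, i], standardize = TRUE,
                       type.measure = "mse", alpha = 1, foldid = foldid)
    cvfit$lambda.min
  }, numeric(1))
}

run1 <- suppressWarnings(lambda_min_all(A, foldid))
run2 <- suppressWarnings(lambda_min_all(A, foldid))

# Compare the two runs; no seed was reset between them
print(identical(run1, run2))
```

Because the folds are passed in rather than drawn inside each call, reordering the regressions or rerunning a single one should not change its `lambda.min`, which a single `set.seed()` before the loop does not guarantee.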