glmnet training throws error on x,y dataframe arguments: am I using it wrong?

I'm trying to learn a penalized logistic regression method with glmnet. I'm trying to predict if a car from the mtcars example data will have an automatic transmission or manual. I think my code is pretty straightforward, but I seem to be getting an error:

This first block simply splits mtcars into an 80% training set and a 20% test set:

library(glmnet)
attach(mtcars)

smp_size <- floor(0.8 * nrow(mtcars))

set.seed(123)
train_ind <- sample(seq_len(nrow(mtcars)), size=smp_size)

train <- mtcars[train_ind,]
test <- mtcars[-train_ind,]

I know the x data is supposed to be in matrix form without the response, so I split the training set into a non-response matrix (train_x) and a response vector (train_y):

train_x <- train[,!(names(train) %in% c("am"))]
train_y <- train$am

But when trying to train the model,

p1 <- glmnet(train_x, train_y)

I get the error:

Error in elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian, :
  (list) object cannot be coerced to type 'double'

Am I missing something?

1 answer

Answer by agstudy:

Coercing the first argument to a matrix solved it for me:

p1 <- glmnet(as.matrix(train_x), train_y)

In fact, the help page (?glmnet) says that the first argument should be a matrix or sparse matrix:

x: input matrix, of dimension nobs x nvars; each row is an observation vector. Can be in sparse matrix format (inherit from class "sparseMatrix" as in package Matrix; not yet available for family="cox")
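To make the fix concrete, here is a minimal sketch of the whole workflow from the question with the matrix coercion applied. It is an illustration, not the only way to do it. The names x_train, x_test, y_train, fit and prob are made up for this example, and the value s = 0.05 is an arbitrary penalty chosen only to show the predict call. It also passes family = "binomial", on the assumption that a logistic model is what the question is after (glmnet's default family is "gaussian").

```r
library(glmnet)

# Same 80/20 split as in the question
set.seed(123)
smp_size  <- floor(0.8 * nrow(mtcars))
train_ind <- sample(seq_len(nrow(mtcars)), size = smp_size)
train <- mtcars[train_ind, ]
test  <- mtcars[-train_ind, ]

# Dropping a column from a data frame still leaves a data frame,
# i.e. a list of columns, which is what the error complains about
train_x <- train[, !(names(train) %in% "am")]
is.data.frame(train_x)  # TRUE
is.matrix(train_x)      # FALSE

# All mtcars columns are numeric, so as.matrix() yields a numeric matrix
x_train <- as.matrix(train_x)
x_test  <- as.matrix(test[, colnames(x_train)])
y_train <- train$am

# Assumption: a logistic model is wanted, so set family = "binomial"
fit <- glmnet(x_train, y_train, family = "binomial")

# Predicted probabilities of am == 1 on the test set.
# s = 0.05 is an arbitrary illustrative penalty, not a tuned value.
prob <- predict(fit, newx = x_test, s = 0.05, type = "response")
```

In practice the penalty would usually be picked with cv.glmnet rather than fixed by hand. Also note that as.matrix() only gives a numeric matrix here because every column is numeric; for a data frame with factor columns, building the design matrix with model.matrix() is the more common route.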