I have a large dataset (200,000+ rows, 140 variables) in which every row has at least one missing value, which I have replaced with NA. I am trying to use the caret library to predict. The rattle library can deal with the missing values, but does anyone know how to do this with caret?
The caret documentation says that you should use the following:
gbmFit1 <- train(twoplus ~ ., data = training, method = 'gbm', trControl = fitControl,
                 na.action = na.omit)
but this gives the error:
Error in train.formula(twoplus ~ ., data = training, method = "M5", trControl = fitControl, :
Every row has at least one missing value were found
The error means that na.omit dropped every row, because each row contains at least one NA. That is what happens when some columns have missing values for all records. Just exclude those columns from your dataset.
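As a rough sketch of that suggestion, the base R snippet below finds columns that are NA for every record, drops them, and counts how many rows na.omit would keep before and after. The data frame here is a small made-up stand-in for `training`, and the column names `x1`, `x2`, `x3` are hypothetical; only `twoplus` comes from the question.

```r
# Toy data frame standing in for `training` (hypothetical values and columns).
training <- data.frame(
  twoplus = c(1, 0, 1, 0),
  x1      = c(2.5, NA, 3.1, 4.0),
  x2      = c(NA, NA, NA, NA),   # missing for every record
  x3      = c(10, 20, 30, 40)
)

# Flag columns in which every value is NA.
all_na <- colSums(is.na(training)) == nrow(training)

# Keep only the columns that contain at least one observed value.
training_clean <- training[, !all_na, drop = FALSE]

# Number of rows na.omit would keep before and after dropping those columns.
rows_before <- sum(complete.cases(training))
rows_after  <- sum(complete.cases(training_clean))
```

If `rows_after` is still zero on the real data, the missing values are spread across columns rather than concentrated in a few, and dropping columns alone will not be enough for na.omit to leave any rows.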