I'm running a Naive Bayes model, and using the klaR
package directly is very fast, less than a second to compute on a standard laptop:
mod <- NaiveBayes(category ~ ., data=training, na.action = na.omit)
However, using the caret
package's train()
interface--which I thought was simply a wrapper for the above function--takes a very long time:
mod <- train(category ~ ., data=training, na.action = na.omit, method="nb")
I'm guessing this is because train
defaults to including some resampling. I tried adding trControl = trainControl(method = "none")
but received the following error:
Error in train.default(x, y, weights = w, ...) :
Only one model should be specified in tuneGrid with no resampling
Any ideas why this might occur or general thoughts on the speed difference between the two functions?
Also, is there any chance the speed difference is related to the formula interface? A few of my predictors are factors with over a hundred levels.
Because when you call
caret::train
without specifying any of trControl, tuneGrid, or tuneLength,
it defaults to running a grid search over the model's tuning parameters, and worse still, it builds that grid from the defaults for that particular model (NaiveBayes in this case).
And the default for
trainControl
is absolutely not what you want: method = "boot" with number = 25
, which means 25 full bootstrap passes over the data, plus saving intermediate results (returnData = TRUE
). So you override one bad default with
trControl = trainControl(method = "none")
, but that still leaves the grid search in place (tuneGrid = NULL, tuneLength = 3
), which is what triggers the error: with no resampling, train can only fit a single candidate model. You need to explicitly set/override those too (as @Khl4v already said in a comment).