randomForest package in R: MSE calculation

I feel like I'm missing something very basic here.

I've run a random forest regression:

INTERP.rf <- randomForest(y ~ ., data = df, importance = TRUE, mtry = 3, ntree = 300)

and then extracted the predictions for the training set:

rf.predict <- predict(INTERP.rf, df, type = "response")

The % Var explained looked too low, so I checked it by computing the MSE by hand:

MSE.rf <- mean((rf.predict - df$y)^2)

...and got a wildly different answer from the one reported when I inspect the fitted INTERP.rf object.

Please can someone highlight my error?

1 answer

Lisa Avery

The correct way to do this is to use:

rf.predict<-predict(INTERP.rf)

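I was not aware that I needed to call predict(model), with no newdata argument, rather than predict(model, trainingData), to get the out-of-bag (OOB) predictions. Passing the training data back in runs each observation through every tree, including the trees that were trained on it, so the resulting MSE is much lower than the OOB MSE that the fitted model reports.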
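To make the difference concrete, here is a minimal sketch that computes both versions of the MSE. The data frame is simulated purely for illustration (the original df is not shown in the question), and the names oob.predict, insample.predict, MSE.oob and MSE.insample are mine, not from the original post.

```r
library(randomForest)

set.seed(42)  # only so this illustration is repeatable

# Hypothetical stand-in data; the df from the question is not available
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
df$y <- df$x1 + 2 * df$x2 + rnorm(n)

INTERP.rf <- randomForest(y ~ ., data = df, importance = TRUE, mtry = 3, ntree = 300)

# No newdata: returns the out-of-bag predictions stored in the model
oob.predict <- predict(INTERP.rf)

# Training data passed as newdata: in-sample predictions from all trees
insample.predict <- predict(INTERP.rf, df)

MSE.oob <- mean((oob.predict - df$y)^2)
MSE.insample <- mean((insample.predict - df$y)^2)

# The hand-computed OOB MSE should agree with the final entry of the
# running MSE stored in the model (the value shown when printing it)
print(all.equal(MSE.oob, tail(INTERP.rf$mse, 1)))

# The in-sample MSE is expected to be the smaller of the two
print(c(oob = MSE.oob, insample = MSE.insample))
```

On data like this, the in-sample MSE should come out well below the OOB figure, which would explain the mismatch described in the question.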

Thank you to @joran and @Vlo for helpful comments