I am trying to get the AUC value from a logistic regression for a two-class classification problem. The response is a factor with levels 1 (terrible) and 2 (great).
The model is built as follows:
fit.ridge = glmnet(xmat, as.numeric(train$response), alpha = 0, family="binomial")
During the prediction step, I found two ways to compute the ROC curve with the pROC package. However, the two ways give different results:
xmat = model.matrix(response ~ ., data = train)[, -1]
Note: bestlam.ridge = best lambda value from cv.glmnet
Way 1)
pred = predict(fit.ridge, s = bestlam.ridge, newx = xmat, type="response")
auc = roc(as.numeric(train$response), as.numeric(pred))$auc
Way 2)
pred = predict(fit.ridge, s = bestlam.ridge, newx = xmat, type="response")
print(range(pred))
pred_bool = rep('bad', length(pred))
pred_bool[pred > 0.5] = 'good'
table(train$response, pred_bool)
roc(as.numeric(train$response), as.numeric(as.factor(pred_bool)))$auc
Way 1 gives an AUC about 0.1 higher than Way 2, and it matches the AUC reported by cv.glmnet when I used type.measure="auc".
But Way 2 also looks like a valid approach for a classification problem.
Which way gives the correct AUC in this case?
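For reference, the same gap seems to show up in a minimal sketch on simulated data. This is only an illustration under stated assumptions: plain glm stands in for the glmnet ridge fit, the data are simulated rather than my real training set, and auc_rank is a hypothetical helper (not part of pROC) that computes the AUC from the Mann-Whitney rank statistic. The only difference between the two calls is whether the scores are the predicted probabilities or the labels obtained by thresholding them at 0.5:

```r
# Simulated two-class data; every name below is illustrative, not from my real code.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(1.5 * x))   # 0 = bad, 1 = good

# Plain logistic regression stands in for the ridge fit (assumption).
fit  <- glm(y ~ x, family = binomial)
prob <- predict(fit, type = "response")

# Hypothetical helper: AUC via the Mann-Whitney rank statistic.
# rank() assigns mid-ranks to ties, which matters for the thresholded scores.
auc_rank <- function(labels, scores) {
  r     <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_prob  <- auc_rank(y, prob)                     # Way 1: continuous probabilities
auc_class <- auc_rank(y, as.numeric(prob > 0.5))   # Way 2: labels thresholded at 0.5

print(c(probabilities = auc_prob, thresholded = auc_class))
```

With thresholded labels the scores take only two values, so the ROC curve has a single interior point and the result reduces to the average of sensitivity and specificity at the 0.5 cutoff, whereas the probabilities are compared across every possible cutoff.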