I am using R v3.3.2 and caret 6.0.71 (the latest versions at the time of writing) to build a logistic regression classifier, and the confusionMatrix function to produce statistics for judging its performance.
logRegConfMat <- confusionMatrix(logRegPrediction, valData[,"Seen"])
- Reference 0, Prediction 0 = 30
- Reference 1, Prediction 0 = 14
- Reference 0, Prediction 1 = 60
- Reference 1, Prediction 1 = 164
Accuracy : 0.7239
Sensitivity : 0.3333
Specificity : 0.9213
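For completeness, these numbers can be reproduced without the original data by passing the counts to confusionMatrix as a table (a minimal sketch; the factor labels are taken from the listing above):

```r
library(caret)

# Rebuild the 2x2 table from the counts above
# (rows = Prediction, columns = Reference)
cm <- as.table(matrix(c(30, 60, 14, 164), nrow = 2,
                      dimnames = list(Prediction = c("0", "1"),
                                      Reference  = c("0", "1"))))

confusionMatrix(cm)  # reports Sensitivity 0.3333, Specificity 0.9213
```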
The target variable in my data (Seen) uses 1 for true and 0 for false. I assume the Reference (ground truth) columns and Prediction (classifier) rows in the confusion matrix follow the same convention. Therefore my results show:
- True Negatives (TN) 30
- True Positives (TP) 164
- False Negatives (FN) 14
- False Positives (FP) 60
Question: Why is sensitivity given as 0.3333 and specificity given as 0.9213? I would have thought it was the other way round - see below.
I am reluctant to believe there is a bug in the R confusionMatrix function, as nothing has been reported and this would be a significant error.
Most references on calculating sensitivity and specificity define them as follows (e.g. www.medcalc.org/calc/diagnostic_test.php):
- Sensitivity = TP / (TP+FN) = 164/(164+14) = 0.9213
- Specificity = TN / (FP+TN) = 30/(60+30) = 0.3333
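The same arithmetic as a quick check in R, using the counts from the matrix above:

```r
TP <- 164; FN <- 14; TN <- 30; FP <- 60
TP / (TP + FN)  # 0.9213 - the value I expected for sensitivity
TN / (FP + TN)  # 0.3333 - the value I expected for specificity
```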
According to the documentation (?confusionMatrix): "If there are only two factor levels, the first level will be used as the 'positive' result." Hence in your example the positive result will be 0, and the evaluation metrics will be the wrong way around. To override the default behaviour, you can set the positive argument to the correct value.
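A minimal sketch of the corrected call, assuming Seen is a factor with levels "0" and "1" as described in the question:

```r
# Make "1" (Seen = true) the positive class instead of the
# default first factor level ("0"):
logRegConfMat <- confusionMatrix(logRegPrediction, valData[,"Seen"],
                                 positive = "1")
```

With positive = "1", sensitivity becomes 164/(164+14) = 0.9213 and specificity becomes 30/(60+30) = 0.3333, matching the manual calculation above.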