R Confusion Matrix sensitivity and specificity labeling


I am using R v3.3.2 and caret 6.0.71 (the latest versions at the time of writing) to build a logistic regression classifier. I am using the confusionMatrix function to produce statistics for judging its performance.

logRegConfMat <- confusionMatrix(logRegPrediction, valData[,"Seen"])

          Reference
Prediction   0   1
         0  30  14
         1  60 164

Accuracy : 0.7239
Sensitivity : 0.3333
Specificity : 0.9213
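
For reference, output of this shape comes from a pipeline along the following lines. This is a minimal sketch with synthetic data; only the column name Seen and the object names logRegPrediction/valData are taken from the question, everything else is assumed.

library(caret)

set.seed(1)
df <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
df$Seen <- factor(ifelse(df$x1 + df$x2 + rnorm(300) > 0, "1", "0"))

# Hold out ~30% of the rows for validation
inTrain  <- createDataPartition(df$Seen, p = 0.7, list = FALSE)
trainDat <- df[inTrain, ]
valData  <- df[-inTrain, ]

# Two-class factor outcome, so caret fits a binomial glm
logRegModel      <- train(Seen ~ ., data = trainDat, method = "glm")
logRegPrediction <- predict(logRegModel, valData)

logRegConfMat <- confusionMatrix(logRegPrediction, valData[, "Seen"])
logRegConfMat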

The target value in my data (Seen) uses 1 for true and 0 for false. I assume the Reference (ground truth) columns and Prediction (classifier) rows in the confusion matrix follow the same convention. My results therefore show:

  • True Negatives (TN) 30
  • True Positives (TP) 164
  • False Negatives (FN) 14
  • False Positives (FP) 60

Question: Why is sensitivity given as 0.3333 and specificity given as 0.9213? I would have thought it was the other way round - see below.

I am reluctant to believe that there is a bug in the R confusionMatrix function, as nothing has been reported and this would be a significant error.


Most references on calculating sensitivity and specificity define them as follows (e.g. www.medcalc.org/calc/diagnostic_test.php):

  • Sensitivity = TP / (TP+FN) = 164/(164+14) = 0.9213
  • Specificity = TN / (FP+TN) = 30/(60+30) = 0.3333
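
As a quick sanity check, the same arithmetic in R, with the four counts taken from the matrix above and 1 treated as the positive class:

TP <- 164; FN <- 14; TN <- 30; FP <- 60

TP / (TP + FN)                   # sensitivity: 0.9213
TN / (TN + FP)                   # specificity: 0.3333
(TP + TN) / (TP + TN + FP + FN)  # accuracy:    0.7239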

There are 2 answers

mtoto

According to the documentation (see ?confusionMatrix):

"If there are only two factor levels, the first level will be used as the "positive" result."

Hence, in your example the positive result will be 0 and the evaluation metrics will be the wrong way around. To override the default behaviour, you can set the positive argument to the correct value, like so:

 confusionMatrix(logRegPrediction, valData[,"Seen"], positive = "1")
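
To verify which level caret used, you can inspect the returned object: its positive element records the positive class, and byClass holds the per-class statistics. A quick check (cm and cm1 are placeholder names):

cm <- confusionMatrix(logRegPrediction, valData[, "Seen"])
cm$positive                 # "0" -- the first factor level, per the default
cm$byClass["Sensitivity"]   # 0.3333, computed with 0 as the positive class

cm1 <- confusionMatrix(logRegPrediction, valData[, "Seen"], positive = "1")
cm1$positive                # "1"
cm1$byClass["Sensitivity"]  # 0.9213, matching the usual definition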
Convex1

confusionMatrix(y_hat, y, positive = "1")

will redefine all the metrics using "1" as the positive outcome; for example, sensitivity and specificity will swap relative to the default output. However, it will still display the confusion matrix in the same order as before, i.e. (0, 1). This can be rectified by reordering the factor levels of both vectors, as shown below.

y_hat = factor(y_hat, levels(y_hat)[c(2, 1)])

y = factor(y, levels(y)[c(2, 1)])

Now the matrix will be displayed in the order of (1, 0) with "1" as the positive outcome, and there is no need to use the positive="1" argument.
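
Putting it together, here is a self-contained sketch of the releveling approach, with small synthetic vectors standing in for y and y_hat:

library(caret)

y     <- factor(c(0, 1, 1, 0, 1, 1, 0, 1), levels = c("0", "1"))
y_hat <- factor(c(0, 1, 0, 1, 1, 1, 0, 1), levels = c("0", "1"))

# Move "1" to the front so it becomes the default positive class
y_hat <- factor(y_hat, levels(y_hat)[c(2, 1)])
y     <- factor(y,     levels(y)[c(2, 1)])

confusionMatrix(y_hat, y)  # table printed in (1, 0) order; positive class is "1"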