Suppose there are 8 red balls, 1 green, and 1 blue, so the data is very imbalanced. I naively predict that all 10 are red. Did I get the confusion matrix right (see the sketch below)? If so, summing the true positives over all categories and dividing by the sum of (true positives + false positives) over all categories should give the precision of my naive classification, right? Why does it look so good? The same goes for recall.
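To make it concrete, here is a minimal sketch of what I'm computing. I use sklearn's confusion_matrix only to build the matrix; the summing at the end is my own reasoning:

from sklearn.metrics import confusion_matrix
import numpy as np

y_true = ["R"] * 8 + ["G"] + ["B"]   # 8 red, 1 green, 1 blue
y_pred = ["R"] * 10                  # my naive prediction: everything is red

labels = ["R", "G", "B"]
cm = confusion_matrix(y_true, y_pred, labels=labels)
# rows = true class, columns = predicted class:
# [[8 0 0]    8 reds correctly called red
#  [1 0 0]    the green called red
#  [1 0 0]]   the blue called red

tp = np.diag(cm)             # true positives per class:  [8, 0, 0]
fp = cm.sum(axis=0) - tp     # false positives per class: [2, 0, 0]
fn = cm.sum(axis=1) - tp     # false negatives per class: [0, 1, 1]

print(tp.sum() / (tp.sum() + fp.sum()))   # my "precision": 8 / 10 = 0.8
print(tp.sum() / (tp.sum() + fn.sum()))   # my "recall":    8 / 10 = 0.8

Both come out as 0.8, which is just the fraction of balls that are red, so the naive classifier looks far better than it deserves.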
Also, they don't agree with the results computed by sklearn, which seem much more sensible:
from sklearn.metrics import precision_score, recall_score

a = ["R"] * 8 + ["G"] + ["B"]   # true labels
b = ["R"] * 10                  # predictions: everything red
print("Precision:", precision_score(a, b, average="macro"))
print("Recall:", recall_score(a, b, average="macro"))
This prints:

Precision: 0.26666666666666666
Recall: 0.3333333333333333
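If I understand the docs correctly, average="macro" first computes precision and recall separately for each class and then takes their unweighted mean, counting the undefined 0/0 precision of classes that are never predicted as 0. That would give exactly these numbers; here is a quick sketch of the arithmetic, assuming that reading is correct:

# Per-class counts for y_true = 8 x R + G + B and y_pred = 10 x R:
#   R: TP = 8, FP = 2, FN = 0
#   G: TP = 0, FP = 0, FN = 1
#   B: TP = 0, FP = 0, FN = 1
prec_R, prec_G, prec_B = 8 / 10, 0.0, 0.0   # G and B are never predicted, so 0/0 counted as 0
rec_R,  rec_G,  rec_B  = 8 / 8,  0 / 1, 0 / 1

print("macro precision:", (prec_R + prec_G + prec_B) / 3)   # 0.2666...
print("macro recall:",    (rec_R + rec_G + rec_B) / 3)      # 0.3333...

With only one of the three per-class scores being non-zero, the macro averages stay low no matter how large the majority class is.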