roc_curve from multilabel classification has slope

I have a multilabel classifier written in Keras from which I want to compute AUC and plot a ROC curve for every element classified from my test set.

Everything seems fine, except that some elements have a ROC curve with sloped segments.

I don't know how to interpret the slope in such cases.

My workflow is as follows. I have a pre-trained Keras model, the features X, and the binarized labels y. Every element of y is an array of length 1000. Because this is a multilabel classification problem, each element of y may contain many 1s, indicating that the element belongs to multiple classes. I therefore used the built-in binary_crossentropy loss, and the model's predictions are probability scores. I then plot the ROC curves as follows.

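import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
#...
# Plot one ROC curve per test sample, computed over that sample's 1000 labels.
for xi, yi in zip(X_test, y_test):
    y_pred = model.predict([xi])[0]
    fpr, tpr, _ = roc_curve(yi, y_pred)

    plt.plot(fpr, tpr, color='darkorange', lw=0.5)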

The predict method returns probabilities, as I'm using the functional API of Keras.

Does anyone know why the ROC curves look like this?

Answer by Ismael (best answer)

I asked on the scikit-learn mailing list, and they answered:

Slope usually means there are ties in your predictions.

That is the case in this problem: my predictions contain tied scores.
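To illustrate why ties give a slope, here is a minimal sketch on made-up toy data (the labels, scores, and the helper count_sloped_segments are hypothetical and not from my actual model). roc_curve treats every distinct score as one threshold, so when a positive and a negative label share a score, TPR and FPR both increase in the same step and the curve moves diagonally instead of straight up or straight right.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical toy data: one sample with 6 labels (not from the original post).
y_true = np.array([1, 0, 1, 0, 1, 0])

# All scores distinct: every threshold admits exactly one label.
scores_distinct = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])

# One positive and one negative label share the score 0.8.
scores_tied = np.array([0.9, 0.8, 0.8, 0.3, 0.2, 0.1])


def has_ties(scores):
    """Return True if at least two predictions share the same score."""
    return len(np.unique(scores)) < len(scores)


def count_sloped_segments(y, scores):
    """Count ROC segments where FPR and TPR both increase (diagonal moves)."""
    # drop_intermediate=False keeps every threshold so each step is visible.
    fpr, tpr, _ = roc_curve(y, scores, drop_intermediate=False)
    return int(np.sum((np.diff(fpr) > 0) & (np.diff(tpr) > 0)))


sloped_distinct = count_sloped_segments(y_true, scores_distinct)
sloped_tied = count_sloped_segments(y_true, scores_tied)

print(has_ties(scores_distinct), sloped_distinct)
print(has_ties(scores_tied), sloped_tied)
```

Note that a tie only shows up as a slope when the tied scores cover both positive and negative labels; a tie within a single class just makes a longer vertical or horizontal step. Running has_ties on each y_pred in the loop from the question should be a quick way to check which samples are affected.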