I am trying to obtain the confusion matrix from python CRFsuite.
This is my code:
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, pred_y, normalize='true', labels=lables)
This raises the following error:
ValueError: You appear to be using a legacy multi-label data representation. Sequence of sequences are no longer supported; use a binary array or sparse matrix instead - the MultiLabelBinarizer transformer can convert to this format.
I tried to use MultiLabelBinarizer(), but still couldn't get the confusion matrix.
After googling around I found this answer, which says that for the confusion matrix function you have to flatten y_test and pred_y. I took a look at the source code of CRFsuite for other metrics here, and they do use a flatten function:
def _flattens_y(func):
    @wraps(func)
    def wrapper(y_true, y_pred, *args, **kwargs):
        y_true_flat = flatten(y_true)
        y_pred_flat = flatten(y_pred)
        return func(y_true_flat, y_pred_flat, *args, **kwargs)
    return wrapper
But there is no function for obtaining the confusion matrix.
Both y_test and pred_y are nested lists.
How can I flatten y_test and pred_y to obtain the confusion matrix?
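To make the question concrete, here is a minimal sketch of the kind of flattening I have in mind, assuming each inner list holds the labels for one sentence and that y_test and pred_y have matching inner lengths. The helper names flatten_labels and flat_confusion_matrix and the sample labels are made up for illustration; they are not part of CRFsuite or scikit-learn.

```python
from itertools import chain


def flatten_labels(sequences):
    """Concatenate a list of label sequences into one flat list (hypothetical helper)."""
    return list(chain.from_iterable(sequences))


def flat_confusion_matrix(y_true, y_pred, **kwargs):
    """Flatten both nested label lists, then call scikit-learn's confusion_matrix."""
    # Imported lazily so flatten_labels() can be tried without scikit-learn installed.
    from sklearn.metrics import confusion_matrix
    return confusion_matrix(flatten_labels(y_true), flatten_labels(y_pred), **kwargs)


# Made-up example data: one inner list of labels per sentence.
y_test = [["B-PER", "O"], ["O", "B-LOC", "O"]]
pred_y = [["B-PER", "O"], ["O", "O", "O"]]
labels = ["B-PER", "B-LOC", "O"]

print(flatten_labels(y_test))  # ['B-PER', 'O', 'O', 'B-LOC', 'O']
```

With the nested lists flattened this way, a call such as flat_confusion_matrix(y_test, pred_y, labels=labels, normalize='true') would pass two flat label lists to confusion_matrix, which is the input shape the error message seems to be asking for.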
Thank you.