I want to perform multilabel classification. I have a dataset in ARFF format, which I load. However, I don't know how to convert the imported data to X and y vectors in order to apply sklearn's train_test_split.
How can I get X and y?
import pandas as pd
import scipy.io.arff
from sklearn.model_selection import train_test_split

data, meta = scipy.io.arff.loadarff('../yeast-train.arff')
df = pd.DataFrame(data)
#Get X, y
X, y = ??? <---
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
OK. This is multilabel data in which the features are in the columns Att1, Att2, Att3, ..., Att20 and the targets are in the columns Class1, Class2, ..., Class14. So you need to use those columns to get X and y. Do it like this:
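A minimal sketch of that column selection, assuming the feature columns all start with "Att" and the target columns all start with "Class". The helper name split_features_targets and its prefix parameters are hypothetical, and the small demo frame only stands in for pd.DataFrame(data). The byte decoding is there because loadarff returns nominal attributes as byte strings such as b'0' and b'1'; if your targets are already numeric, that branch is simply skipped.

```python
import pandas as pd


def split_features_targets(df, feature_prefix="Att", target_prefix="Class"):
    """Split a DataFrame into X (features) and y (multilabel targets) by column prefix.

    The prefixes are assumptions based on the column names described above.
    """
    feature_cols = [c for c in df.columns if c.startswith(feature_prefix)]
    target_cols = [c for c in df.columns if c.startswith(target_prefix)]

    # Features as a 2D float array of shape (n_samples, n_features).
    X = df[feature_cols].to_numpy(dtype=float)

    # Nominal ARFF attributes arrive as bytes, so decode them before casting to int.
    def to_int(value):
        return int(value.decode("utf-8")) if isinstance(value, bytes) else int(value)

    # Targets as a 2D 0/1 indicator array of shape (n_samples, n_labels).
    y = df[target_cols].apply(lambda col: col.map(to_int)).to_numpy()
    return X, y


# Tiny stand-in for pd.DataFrame(data) from the question (made-up values).
demo = pd.DataFrame({
    "Att1": [0.1, 0.2, 0.3, 0.4, 0.5],
    "Att2": [1.0, 0.9, 0.8, 0.7, 0.6],
    "Class1": [b"0", b"1", b"0", b"1", b"1"],
    "Class2": [b"1", b"1", b"0", b"0", b"1"],
})

X, y = split_features_targets(demo)
print(X.shape, y.shape)
```

With the real data, X, y = split_features_targets(df) should produce arrays that can be passed straight to train_test_split(X, y, test_size=0.2), since train_test_split accepts a 2D y.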