Multiclass vs Multilabel


I am currently working on a project in which I have to classify restaurant review data, using the multinomial Naive Bayes algorithm. I am a bit confused about whether my problem is multiclass or multilabel.

Review example:

Please treat your customer like customer, not dogs. .I will never go or advice anyone to go at Naivedyam, Hauz Khas.They guys are sick and complete businessman. Food was ver bad in taste, but place and staff were too dirty

This review belongs to three different classes:

Bad Experience
Staff Behavior
food quality

How should I create the training data set?

Should I use multilabel and create the training data set like this:

ID, Content,                   Tags
1,  "content of the review#1", "Bad Experience,Staff Behavior,food quality"

or as in multiclass, with one row per tag:

Review,       Tags
above review, Bad Experience
above review, Staff Behavior
above review, food quality

Any suggestions?

There is 1 answer

Hakim K On

Your problem is an example of multilabel classification, because a single review can carry several tags at once.

One approach is to treat each label as a separate binary classification problem:

   X           Y1    Y2 
0  1.438161    0     1
1 -0.283780    1     1
2  0.552564    1     0
3  1.931332    0     1
4  1.656010    0     1
5  0.944862    1     0

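Here X is a feature value, and Y1 and Y2 are binary indicators of whether "Bad Experience" and "Staff Behavior" apply to that row. Strictly speaking this is a multi-hot rather than a one-hot encoding, since more than one indicator can be 1 in the same row (as in row 1).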
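As a rough illustration of how such indicator columns could be built from the "one review, comma-separated tags" layout in the question, here is a minimal sketch in plain Python. The rows for reviews #2 and #3 and the helper name `to_indicators` are made up for the example; they do not come from the question.

```python
# Hypothetical training rows: (review text, comma-separated tags).
# Only the first row mirrors the question; the others are invented.
rows = [
    ("content of the review#1", "Bad Experience,Staff Behavior,food quality"),
    ("content of the review#2", "food quality"),
    ("content of the review#3", "Bad Experience,Staff Behavior"),
]

# Collect the distinct tags in sorted order so the column order is stable.
labels = sorted({tag.strip() for _, tags in rows for tag in tags.split(",")})


def to_indicators(tags, labels):
    """Return one 0/1 flag per label: 1 if the review carries that tag."""
    present = {tag.strip() for tag in tags.split(",")}
    return [1 if label in present else 0 for label in labels]


# One row of indicators per review, one column per label.
Y = [to_indicators(tags, labels) for _, tags in rows]

print(labels)
for row in Y:
    print(row)
```

Each column of `Y` can then serve as the target of its own binary classifier, which is the approach described above.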

You can find a worked-out example of multilabel classification in the scikit-learn documentation.
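For orientation, a minimal sketch of the one-binary-classifier-per-label idea with multinomial Naive Bayes might look like the following. It is not taken from the scikit-learn documentation. The toy reviews and their tags are invented, and four rows are far too few to train anything useful; the point is only the shape of the data and the calls involved.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: one review per row, each with a list of tags.
texts = [
    "rude staff and terrible food",
    "the food was cold and tasteless",
    "staff ignored us, awful visit",
    "dirty place, never going back",
]
tags = [
    ["Staff Behavior", "food quality"],
    ["food quality"],
    ["Staff Behavior", "Bad Experience"],
    ["Bad Experience"],
]

# Turn the tag lists into a 0/1 indicator matrix, one column per tag.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

# Bag-of-words features, then one MultinomialNB model per tag column.
model = make_pipeline(CountVectorizer(), OneVsRestClassifier(MultinomialNB()))
model.fit(texts, Y)

# Predict an indicator row for a new review and map it back to tag names.
pred = model.predict(["the staff was rude"])
predicted_tags = mlb.inverse_transform(pred)
print(mlb.classes_)
print(predicted_tags)
```

With this layout the training set keeps one row per review, as in the first format proposed in the question, and the binarizer takes care of expanding the tags into separate target columns.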