I am currently working on a project in which I have to classify restaurant review data. I am using the multinomial Naive Bayes algorithm. I am a bit confused about whether my problem is multiclass or multilabel.
Example review:
Please treat your customer like customer, not dogs. .I will never go or advice anyone to go at Naivedyam, Hauz Khas.They guys are sick and complete businessman. Food was ver bad in taste, but place and staff were too dirty
It belongs to three different classes:
Bad Experience
Staff Behavior
food quality
How to create the training data set?
Should I use multilabel and create the training data set like this:
ID, Content, Tags
1, "content of the review#1", Bad Experience,Staff Behavior,food quality
or like this, as in multiclass:
Review, Tags
above review, Bad Experience
above review, Staff Behavior
above review, food quality
Any suggestions?
Your problem is an example of multilabel classification, because a single review can carry several tags at once.
One approach is to treat each label as a separate binary classification problem, where Y1, Y2, ... are binary indicator columns (one per label) recording whether "Bad Experience", "Staff Behavior", and so on apply to a given review.
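As a rough sketch of that layout, the indicator columns can be built from the tag lists in plain Python. The reviews below are placeholders and the names `reviews`, `labels` and `Y` are made up for illustration; only the three tag names come from the question.

```python
# Hypothetical toy data: each review is paired with the list of tags that apply to it.
reviews = [
    ("content of the review#1", ["Bad Experience", "Staff Behavior", "food quality"]),
    ("content of the review#2", ["food quality"]),
    ("content of the review#3", []),  # a review with no tags at all
]

# Fix a column order for the labels (sorted so the order is reproducible).
labels = sorted({tag for _, tags in reviews for tag in tags})

# One row per review, one 0/1 column per label: Y[i][j] is 1 if label j applies to review i.
Y = [[1 if label in tags else 0 for label in labels] for _, tags in reviews]

print(labels)
for (text, _), row in zip(reviews, Y):
    print(text, row)
```

Each column of `Y` is then the target for one binary classifier, so a review keeps a single row no matter how many tags it has.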
You can find a worked-out example of multilabel classification in the scikit-learn documentation.
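For the multinomial Naive Bayes setup from the question, a minimal sketch with scikit-learn could look like the following. It assumes one binary `MultinomialNB` per label via `OneVsRestClassifier` and plain word counts as features; the toy reviews and their tags are invented for illustration and are far too few to train a useful model.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical training data: one entry per review, with all of its tags in one list.
# In this toy set the tags mark complaints, so positive reviews have an empty list.
texts = [
    "The food was cold and tasteless",
    "Rude staff and terrible food, never going back",
    "The waiter was rude and ignored us",
    "Great food and friendly staff",
    "Lovely place, will visit again",
    "Worst evening ever, never going back",
]
tags = [
    ["food quality"],
    ["Bad Experience", "Staff Behavior", "food quality"],
    ["Staff Behavior"],
    [],
    [],
    ["Bad Experience"],
]

# Turn the tag lists into a 0/1 indicator matrix with one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

# Word counts feed one independent MultinomialNB classifier per label column.
clf = make_pipeline(CountVectorizer(), OneVsRestClassifier(MultinomialNB()))
clf.fit(texts, Y)

# Predict an indicator row for a new review and map it back to tag names.
pred = clf.predict(["the food was bad and the staff were rude"])
print(mlb.inverse_transform(pred))
```

This corresponds to the first layout in the question (one row per review with a list of tags), not to repeating the same review once per tag.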