Multilabel Classification using NaiveBayes Classifier in Spark

Question

Asked by xeonzion on 19 December 2016 at 03:55 · 1.1k views

I have data in the following format:
blah sentence one --> label1, label2
blah sentence two --> label2, label4
blah sentence three --> label3

How can I use OneVsRest with NaiveBayes in Spark, and how should my data be structured for it? For multiclass classification with NaiveBayes, a LabeledPoint holds a single label and a feature vector. In the case above, each sentence can carry several labels, so how should the data be structured?

There is 1 answer

**marilena.oita** · Answer 1 · 2017-03-07T16:12:55+00:00

Structure the data as usual (LabeledPoint), but train multiple classifiers (e.g., in a one-vs-rest fashion) and change the data passed to each one according to your multi-labelled vectors. Another solution is to get the probabilities for all classes, instead of only the most probable class (predict(p.features())):
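
One way to read the "change the data passed to each classifier" suggestion is binary relevance: build one 0/1 target column per label and train a separate binary NaiveBayes model on each. The sketch below shows only that relabeling step in plain Java, with no Spark dependency. `BinaryRelevance` and `perLabelTargets` are hypothetical names, not Spark APIs, and pairing each target value with the sentence's feature vector in a LabeledPoint is left to the caller. As far as I know, Spark's built-in OneVsRest expects a single class label per row, so it may not accept multilabel rows directly.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not part of Spark): turns multilabel rows into
// one binary target column per label, aligned with the input row order.
class BinaryRelevance {

    // rowLabels[i] holds the labels attached to the i-th sentence.
    // Returns label -> array of 1.0 (row has the label) or 0.0 (it does not).
    static Map<String, double[]> perLabelTargets(String[][] rowLabels) {
        // Collect every distinct label, sorted for a stable iteration order.
        Set<String> allLabels = new TreeSet<>();
        for (String[] labels : rowLabels) {
            allLabels.addAll(Arrays.asList(labels));
        }

        Map<String, double[]> targets = new LinkedHashMap<>();
        for (String label : allLabels) {
            double[] column = new double[rowLabels.length];
            for (int i = 0; i < rowLabels.length; i++) {
                column[i] = Arrays.asList(rowLabels[i]).contains(label) ? 1.0 : 0.0;
            }
            targets.put(label, column);
        }
        return targets;
    }

    public static void main(String[] args) {
        // The three example rows from the question.
        String[][] rows = {
            {"label1", "label2"},
            {"label2", "label4"},
            {"label3"}
        };
        perLabelTargets(rows).forEach((label, column) ->
            System.out.println(label + " -> " + Arrays.toString(column)));
    }
}
```

For the question's data, label2 gets the targets 1.0, 1.0, 0.0, so the binary model for label2 sees the first two sentences as positives and the third as a negative.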

Vector prediction = model.predictProbabilities(p.features());

and then take the top-k most probable predictions, filtered by a probability threshold.
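
The top-k and threshold step could look roughly like the sketch below. It works on a plain `double[]`, which in Spark you would presumably get from `prediction.toArray()`. `ProbabilityFilter`, `topK`, and the example values of `k` and `threshold` are hypothetical and would need tuning on your data. The returned values are positions in the probability array; mapping them back to actual label values is assumed to go through the model's label ordering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spark): selects predicted labels
// from an array of per-class probabilities.
class ProbabilityFilter {

    // Keeps indices whose probability is at least `threshold`, sorts them by
    // descending probability, and returns at most `k` of them.
    static List<Integer> topK(double[] probs, int k, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            if (probs[i] >= threshold) {
                kept.add(i);
            }
        }
        // Stable sort: ties keep their original index order.
        kept.sort((a, b) -> Double.compare(probs[b], probs[a]));
        return new ArrayList<>(kept.subList(0, Math.min(k, kept.size())));
    }

    public static void main(String[] args) {
        // Made-up probabilities for four classes, for illustration only.
        double[] probs = {0.05, 0.50, 0.10, 0.35};
        System.out.println(topK(probs, 2, 0.2));
    }
}
```

Because naive Bayes posteriors sum to 1 across classes, a fixed threshold tends to become harder to reach as the number of labels per sentence grows, so it may be worth validating the cutoff against held-out data.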
