NaiveBayes classifier handling different data types in python

Question

4.4k views Asked by user 3317704 At 19 June 2015 at 12:01

I am trying to implement a Naive Bayes classifier in Python. My attributes have different data types: strings, int, float, Boolean, and ordinal.

I could use the Gaussian Naive Bayes classifier (sklearn.naive_bayes in the scikit-learn package), but I do not know how the different data types should be handled. The classifier throws an error stating that it cannot handle data types other than int or float.

One way I can think of is encoding the strings as numerical values, but I doubt how well the classifier would perform if I do this.

Original Q&A

There are 2 answers

Jijo Jose On 28 June 2016 at 08:51

Don't convert the data types manually; instead, use dict vectorization (DictVectorizer).

http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.DictVectorizer.html
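To make the suggestion concrete, here is a minimal sketch of how DictVectorizer could be combined with GaussianNB. The records, feature names, and labels are made up for illustration and are not from the question. String values are expanded into one 0/1 column per value, while numeric and Boolean values are passed through as numbers.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy records mixing strings, ints, floats, and booleans.
records = [
    {"colour": "red", "doors": 2, "engine_l": 1.6, "automatic": True},
    {"colour": "blue", "doors": 4, "engine_l": 2.0, "automatic": False},
    {"colour": "green", "doors": 4, "engine_l": 1.4, "automatic": True},
    {"colour": "red", "doors": 2, "engine_l": 3.0, "automatic": False},
]
labels = ["sport", "family", "family", "sport"]  # hypothetical classes

# sparse=False returns a dense NumPy array, which GaussianNB expects.
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(records)

# Column names in column order, e.g. "colour=red" for the string feature.
feature_names = sorted(vec.vocabulary_, key=vec.vocabulary_.get)
print(feature_names)

clf = GaussianNB()
clf.fit(X, labels)

# New records must go through the same fitted vectorizer.
new_car = [{"colour": "blue", "doors": 4, "engine_l": 1.8, "automatic": True}]
prediction = clf.predict(vec.transform(new_car))
print(prediction)
```

Note that GaussianNB still treats the 0/1 columns as Gaussian features, so this only addresses the type error, not the question of how well the model fits such data.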

**Numlet** · Accepted Answer · 2015-06-19T12:51:59+00:00

Yes, you will need to convert the strings to numerical values. The naive Bayes classifier cannot handle strings, as there is no way a string can enter into a mathematical equation.

If your strings have an inherent order, for example "large, medium, small", you might want to encode them as "3, 2, 1". However, if your strings are things without order, such as colours or names, you can either do the same or assign binary variables, with each variable referring to one colour or name, provided there are not too many of them.

For example, if you are classifying cars and they can be red, blue, or green, you can define the variables 'Red', 'Blue', and 'Green', each taking the value 0 or 1 depending on the colour of your car.
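Both encodings described above can be sketched in plain Python. The sample values and variable names below are hypothetical and only illustrate the idea.

```python
# Ordered categories: map each level to an integer that preserves the order.
size_order = {"small": 1, "medium": 2, "large": 3}
sizes = ["large", "small", "medium"]  # hypothetical column of sizes
size_codes = [size_order[s] for s in sizes]
print(size_codes)  # [3, 1, 2]

# Unordered categories: one binary (0/1) variable per colour.
colour_levels = ["red", "blue", "green"]
colours = ["red", "blue", "green", "red"]  # hypothetical column of car colours
colour_flags = [[int(c == level) for level in colour_levels] for c in colours]
print(colour_flags)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each row of colour_flags holds the values of the 'Red', 'Blue', and 'Green' variables for one car, so exactly one entry per row is 1.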
