I have two NumPy arrays, x and y, obtained from an SFrame. x has 6 feature columns and y (the target variable) has a single column.
x = np.array([[0, 0, 0, 24, 0, 34], [0, 0, 0, 22, 0, 34], ...])
y = np.array([[0], [0], [0], [1], [1], ...])
I am using scikit-learn to apply a naive Bayes classifier. When I try to fit the classifier on x and y, it gives the following error:
/home/.../local/lib/python2.7/site-packages/sklearn/utils/validation.py:526: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
Traceback (most recent call last):
File "main_naive.py", line 10, in <module>
main()
File "main_naive.py", line 7, in main
naive_bayes.predict()
File "/home/.../naive_bayes_model.py", line 184, in predict
self.naive_bayes.fit(x, y)
File "/home/.../local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 566, in fit
Y = labelbin.fit_transform(y)
File "/home/.../local/lib/python2.7/site-packages/sklearn/base.py", line 494, in fit_transform
return self.fit(X, **fit_params).transform(X)
File "/home/.../local/lib/python2.7/site-packages/sklearn/preprocessing/label.py", line 304, in fit
self.classes_ = unique_labels(y)
File "/home/.../local/lib/python2.7/site-packages/sklearn/utils/multiclass.py", line 98, in unique_labels
raise ValueError("Unknown label type: %s" % repr(ys))
ValueError: Unknown label type: (array([0, 0, 0, ..., 0, 0, 0], dtype=object),)
Here is my code:
import numpy as np
from sklearn.naive_bayes import BernoulliNB
naive_bayes = BernoulliNB(alpha=1e-2)
#x = self.training1[self.feature_columns].to_numpy()
#x = x.reshape(-len(self.feature_columns), len(self.feature_columns))
#y = self.training1[[target_column]].to_numpy()
#y = y.reshape(-1L,1L)
x = np.array([[0, 0, 0, 24, 0, 34], [0, 0, 0, 22, 0, 34], ...])
y = np.array([[0], [0], [0], [1], [1], ...])
naive_bayes.fit(x, y)
Where am I going wrong?
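I figured out the issue. y contained None values, which is why the array in the error message has dtype=object and scikit-learn could not determine the label type. I simply removed the None values from y (the matching rows of x have to be dropped as well so that both arrays keep the same length).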
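For anyone hitting the same error, here is a minimal sketch of that cleanup. The toy data and the names mask, x_clean and y_clean are made up for illustration, and it assumes the only bad labels are None. It also flattens y with ravel(), which is what the DataConversionWarning asks for.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy data: the None label forces y to dtype=object.
x = np.array([[0, 0, 0, 24, 0, 34],
              [0, 0, 0, 22, 0, 34],
              [1, 0, 0, 20, 0, 30],
              [0, 1, 0, 18, 1, 28],
              [0, 0, 1, 25, 0, 31]])
y = np.array([[0], [None], [0], [1], [1]], dtype=object)

# Boolean mask marking the rows whose label is not None.
mask = np.array([label is not None for label in y.ravel()])

# Drop the same rows from x and y, flatten y to shape (n_samples,),
# and cast the labels from object to int.
x_clean = x[mask]
y_clean = y.ravel()[mask].astype(int)

naive_bayes = BernoulliNB(alpha=1e-2)
naive_bayes.fit(x_clean, y_clean)
```

Checking y.dtype before calling fit is a quick way to spot this: a clean integer label array should not report dtype=object.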