I am trying to use sklearn.neural_network.BernoulliRBM with the iris dataset:
import pandas as pd  # needed for pd.DataFrame below
from sklearn import datasets
iris = datasets.load_iris()
collist = ['SL', 'SW', 'PL', 'PW']
dat = pd.DataFrame(data=iris.data, columns=collist)
from sklearn.neural_network import BernoulliRBM
model = BernoulliRBM(n_components=2)
scores = model.fit_transform(dat)
print(scores.shape)
print(scores)
However, I am only getting 1 as output for all rows:
(150, 2)
[[1. 1.]
[1. 1.]
[1. 1.]
[1. 1.]
[1. 1.] # same for all rows
Can I get values similar to scores for individual rows, as I can in principal component analysis? Otherwise, how can I get some useful numbers from the RBM? I tried model.score_samples(dat), but that also gives a value of 0 for the vast majority of rows.
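According to the documentation, BernoulliRBM assumes the inputs are either binary values or values between 0 and 1, each encoding the probability that the specific feature would be turned on.
Since most of your dat values are greater than 1, I'm guessing the model is truncating all input data to 1.0. If, for example, you normalize the data into the range [0, 1] before fitting, you'll get values with some variation.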
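A minimal sketch of one such normalization, assuming per-column min-max scaling with sklearn.preprocessing.MinMaxScaler. The choice of scaler, the random_state value, and the variable name scaled are assumptions for illustration, not necessarily what was originally used:

```python
import pandas as pd
from sklearn import datasets
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler

iris = datasets.load_iris()
collist = ['SL', 'SW', 'PL', 'PW']
dat = pd.DataFrame(data=iris.data, columns=collist)

# Assumption: min-max scaling, which maps each column into [0, 1]
# so the values can be read as activation probabilities.
scaled = MinMaxScaler().fit_transform(dat)

# random_state is fixed only to make the run repeatable.
model = BernoulliRBM(n_components=2, random_state=0)
scores = model.fit_transform(scaled)

print(scores.shape)
print(scores[:5])  # hidden-unit activation probabilities for the first rows
```

The returned scores are hidden-unit activation probabilities, so they lie between 0 and 1 and should now differ from row to row. Whether min-max scaling is a sensible choice depends on the problem.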
Since your input features must have an interpretation as probabilities, you'll want to think about what, if any, normalization is reasonable for the particular problem you are solving.