How to evaluate the performance of a deep belief network built from a stack of BernoulliRBMs?


I have a binary dataset of 1s and 0s (7248x2048) and I want to reduce the number of features from 2048 to 256. I have already tried an autoencoder, which performs well, and I thought that a deep belief network (a stack of BernoulliRBMs from scikit-learn) might also reduce the features, perhaps faster. I followed this previous implementation of a DBN.

  1. How can I assess the performance of the DBN? I tried building a pipeline with layers 1024 -> 512 -> 256 -> 512 -> 1024 -> 2048 and then calculating its "reconstruction error". Does this make sense?

  2. Is the decreasing pseudo-likelihood in the encoding part a promising sign? Also, if you know of similar DBN implementations in TensorFlow or PyTorch, I would appreciate a pointer.

  3. The .score_samples function calculates the pseudo-likelihood, and I am not sure how to interpret it.
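
On point 3, here is a minimal sketch of one way to read the values. As far as I can tell from the scikit-learn documentation and source, `score_samples` returns a stochastic per-sample estimate of the log pseudo-likelihood, so values are at most 0 and closer to 0 is better. The toy data, the layer size, and the `baseline` reference (the score of a model that assigns probability 0.5 to every bit) are illustrative assumptions, not part of the original setup.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.RandomState(0)

# Hypothetical toy data: 8 binary prototypes, each bit flipped with probability 0.05,
# so there is some structure for the RBM to pick up (unlike pure random bits).
n_features = 32
prototypes = rng.randint(0, 2, size=(8, n_features))
X = prototypes[rng.randint(0, 8, size=400)]
flips = rng.rand(*X.shape) < 0.05
X = np.where(flips, 1 - X, X).astype(float)
X_train, X_heldout = X[:300], X[300:]

rbm = BernoulliRBM(n_components=16, learning_rate=0.05, batch_size=16,
                   n_iter=20, random_state=0)
rbm.fit(X_train)

# One estimated log pseudo-likelihood per sample. The estimate is noisy because
# a single randomly chosen bit is corrupted per sample, so look at the mean.
train_scores = rbm.score_samples(X_train)
heldout_scores = rbm.score_samples(X_heldout)

# Reference point: a model that is maximally unsure about every bit.
baseline = n_features * np.log(0.5)

print("mean train score:   ", train_scores.mean())
print("mean held-out score:", heldout_scores.mean())
print("coin-flip baseline: ", baseline)
```

Each RBM in a stack is scored on its own input (the previous layer's hidden probabilities, with a different dimensionality), so the values printed for different layers are probably not comparable with one another. Comparing train against held-out scores for the same layer, and both against the baseline, is likely more informative.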

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import BernoulliRBM
    from sklearn.pipeline import Pipeline  # missing in the original snippet

    # Random placeholder data with the same shape as the real dataset
    df = pd.DataFrame(np.random.randint(0, 2, size=(7248, 2048)))
    X_train, X_test = train_test_split(df, test_size=0.2, random_state=0)
    X_train, X_val = train_test_split(X_train, test_size=0.15, random_state=0)

    learning_rate = 0.1
    total_units = 2048
    total_epochs = 20
    batch_size = 16

    # "Encoder" part: 1024 -> 512 -> 256
    rbm1 = BernoulliRBM(n_components=total_units // 2, learning_rate=learning_rate,
                        batch_size=batch_size, n_iter=total_epochs, verbose=1)
    rbm2 = BernoulliRBM(n_components=total_units // 4, learning_rate=learning_rate,
                        batch_size=batch_size, n_iter=total_epochs, verbose=1)
    rbm3 = BernoulliRBM(n_components=total_units // 8, learning_rate=learning_rate,
                        batch_size=batch_size, n_iter=total_epochs, verbose=1)
    # "Decoder" part: 512 -> 1024 -> 2048
    rbm4 = BernoulliRBM(n_components=total_units // 4, learning_rate=learning_rate,
                        batch_size=batch_size, n_iter=total_epochs, verbose=1)
    rbm5 = BernoulliRBM(n_components=total_units // 2, learning_rate=learning_rate,
                        batch_size=batch_size, n_iter=total_epochs, verbose=1)
    rbmout = BernoulliRBM(n_components=total_units, learning_rate=learning_rate,
                          batch_size=batch_size, n_iter=total_epochs, verbose=1)

    model = Pipeline(steps=[('rbm1', rbm1), ('rbm2', rbm2), ('rbm3', rbm3),
                            ('rbm4', rbm4), ('rbm5', rbm5), ('rbmout', rbmout)])
    model.fit(X_train)

    # Use transform, not fit_transform, so the model is not refitted on the validation set.
    # Work on NumPy arrays: subtracting DataFrames aligns on the (shuffled) index and yields NaNs.
    actual = X_val.to_numpy()
    preds = model.transform(X_val)  # hidden-unit probabilities of the last RBM, shape (n_val, 2048)

    # Sum of squared differences per sample, averaged over the validation set
    loss = np.square(preds - actual).sum(axis=1).mean()
    print(loss)
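
One way to make the "reconstruction error" from point 1 concrete, sketched under assumptions: rather than appending decoder RBMs (whose outputs are hidden-unit probabilities trained without reference to the original input), encode with the stack and then decode back down through the same RBMs using their transposed weights, via the fitted attributes `components_` and `intercept_visible_`. This swaps in the "unrolled" DBN idea and is not the pipeline shown above. `fit_stack`, `encode`, `decode`, and the toy data are hypothetical names and values chosen for illustration.

```python
import numpy as np
from scipy.special import expit  # logistic sigmoid
from sklearn.neural_network import BernoulliRBM


def fit_stack(X, layer_sizes, **rbm_kwargs):
    """Greedily train one RBM per layer on the previous layer's hidden probabilities."""
    rbms, h = [], X
    for i, n_hidden in enumerate(layer_sizes):
        rbm = BernoulliRBM(n_components=n_hidden, random_state=i, **rbm_kwargs)
        h = rbm.fit_transform(h)  # P(h=1 | v) for the data this layer was fitted on
        rbms.append(rbm)
    return rbms


def encode(rbms, X):
    """Propagate upwards: visible data -> top-layer hidden probabilities."""
    h = X
    for rbm in rbms:
        h = rbm.transform(h)
    return h


def decode(rbms, H):
    """Propagate downwards with the transposed weights: P(v=1 | h) at each layer."""
    v = H
    for rbm in reversed(rbms):
        v = expit(v @ rbm.components_ + rbm.intercept_visible_)
    return v


rng = np.random.RandomState(0)

# Hypothetical toy data: noisy copies of 8 binary prototypes
prototypes = rng.randint(0, 2, size=(8, 32))
X = prototypes[rng.randint(0, 8, size=400)]
X = np.where(rng.rand(*X.shape) < 0.05, 1 - X, X).astype(float)
X_train, X_val = X[:300], X[300:]

rbms = fit_stack(X_train, layer_sizes=[16, 8, 4],
                 learning_rate=0.05, batch_size=16, n_iter=10)

code = encode(rbms, X_val)    # reduced features, shape (100, 4)
recon = decode(rbms, code)    # reconstruction probabilities, shape (100, 32)

mse = np.mean((recon - X_val) ** 2)
bit_error_rate = np.mean((recon > 0.5) != (X_val > 0.5))
# Reference point: always predict each feature's training mean
baseline_mse = np.mean((X_train.mean(axis=0) - X_val) ** 2)

print("reconstruction MSE:", mse)
print("bit error rate:    ", bit_error_rate)
print("baseline MSE:      ", baseline_mse)
```

With the real data, `layer_sizes=[1024, 512, 256]` would match the target dimensionality, and the same `mse` could be computed for the autoencoder so that both models are compared on equal terms. Without joint fine-tuning, a greedily trained stack may well reconstruct worse than the autoencoder.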
    
