Since Google released out tensorflow, it becomes kind of trend in the current deep learning selections.
I'd like to do some experiments about RBM/DBN (Restricted Boltzmann Machine/Deep Belief Network), I've made some attempt by myself and kind of implement it well through the combination of available APIs from tensorflow. See code and previous answer.
So, if doesn't bother the code running performance, here's the gift for RBM/DBN implementation with tensorflow.
But, the running performance must be considered for the future. Because of the special progress of CD (Contrastive Divergence) algorithm, I think it just works against the framework (data flow graph) used by tensorflow. That's why my code seems weired.
So, the custom operation should be implemented for acceleration. I've followed the current documentation about adding custom ops.
REGISTER_OP("NaiveRbm")
.Input("visible: float32")
.Input("weights: float32")
.Input("h_bias: float32")
.Input("v_bias: float32")
.Output("hidden: float32")
.Doc(R"doc(
Naive Rbm for seperate training use. DO NOT mix up with other operations
)doc");
In my design, NaiveRbm
should is an operation that takes visible
,weights
,h_bias
,v_bias
as input, but output by only first 3 Variables
( simply sigmoid(X*W+hb) ), its gradient should return at least gradients for last 3 Variables
.
Imagine example psuedo code like this:
X = tf.placeholder()
W1, hb1, vb1 = tf.Variable()
W2, hb2, vb2 = tf.Variable()
rbm1 = NaiveRbm(X,W1,hb1,vb1)
train_op = tf.train.MomentumOptimizer(0.01, 0.5).minimize(rbm1)
rbm2 = NaiveRbm(tf.stop_gradient(rbm1), W2, hb2, vb2)
train_op2 = tf.train.MomentumOptimizer(0.01, 0.5).minimize(rbm2)
with tf.Session() as sess:
for batch in batches:
sess.run(train_op, feed_dict={X: batch})
for batch in batches:
sess.run(train_op2, feed_dict={X: batch})
But the tensorflow library is too complex for me. And after too much time seeking for how to implement these existing operations (sigmoid
, matmul
, ma_add
, relu
, random_uniform
) in custom operation, no solution is found by myself.
So, I'd like to ask if someone could help me achieve the remain works.
PS: before getting some ideas, I'd like to dive into Theano
since it implements RBM/DBN already. Just in my opinion, Caffe
is kind of not suitable for RBM/DBN because of its framework.
Update: After scratch through the tutorials from Theano, I found the key reason for Theano implemented the RBM/DBN while the tensorflow haven't is the scan
technology. So, there might wait tensorflow to implement scan
technology to prepare for RBM/DBN implementation.