RBM: Deriving the Replicated Softmax Model (RSM)

I am trying to derive the conditional distribution of the visible variables given the hidden units for the Replicated Softmax Model (RSM), or equivalently the Restricted Boltzmann Machine (RBM) for word counts, following the paper "Replicated Softmax: an Undirected Topic Model" by Salakhutdinov and Hinton.

Paper can be found at: http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=B04C8D67D381B8106FF6FA4203A86264?doi=10.1.1.164.71&rep=rep1&type=pdf

However, despite all efforts, I have been unable to see how this conditional turns out to be a softmax distribution.

Also, I am unsure whether the weights W form a 3D array and the visible biases b a 2D matrix, or whether they are instead a 2D matrix and a vector, respectively. I believe it is the latter. I am hoping someone can demonstrate the derivation.

I am looking to implement the RSM for topic modelling in Python with Theano. I am aware that implementations already exist, but I prefer to understand the derivation myself so that I can extend or optimize the code without the risk of breaking the model.


P.S. Apologies, this is a repost of https://math.stackexchange.com/questions/2085616/rbm-deriving-the-replicated-softmax-model-rsm. I reposted it here because there aren't as many users on Math Stack Exchange.

There is 1 answer

krenova (best answer)

After some time, I found where I had misunderstood things and managed to derive the equations. Please refer to the answer on Math Stack Exchange:
https://math.stackexchange.com/questions/2085616/rbm-deriving-the-replicated-softmax-model-rsm/2087272#2087272
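
For readers who cannot follow the link, here is a hedged sketch of how the softmax conditional can be obtained. It is not necessarily the same route as the linked answer. It assumes the energy function and notation as they appear in the Salakhutdinov and Hinton paper: D words in the document, K dictionary entries, F binary hidden units, and a one-hot encoding in which v_i^k = 1 exactly when word position i holds dictionary word k. The hidden-bias term does not depend on V, so it cancels in the ratio, and the rest factorizes over word positions.

```latex
% Notation assumed from the paper: D = document length, K = dictionary size,
% F = number of hidden units, and \sum_k v_i^k = 1 for every position i.
\begin{align}
E(V,h) &= -\sum_{i=1}^{D}\sum_{j=1}^{F}\sum_{k=1}^{K} W_{ij}^{k}\, h_j\, v_i^k
          - \sum_{i=1}^{D}\sum_{k=1}^{K} v_i^k\, b_i^k
          - \sum_{j=1}^{F} h_j\, a_j \\[4pt]
p(V \mid h) &= \frac{e^{-E(V,h)}}{\sum_{V'} e^{-E(V',h)}}
  = \prod_{i=1}^{D}
    \frac{\exp\!\Big(\sum_{k} v_i^k \big(b_i^k + \sum_{j} h_j W_{ij}^{k}\big)\Big)}
         {\sum_{q=1}^{K} \exp\!\big(b_i^q + \sum_{j} h_j W_{ij}^{q}\big)} \\[4pt]
p(v_i^k = 1 \mid h) &=
    \frac{\exp\!\big(b_i^k + \sum_{j} h_j W_{ij}^{k}\big)}
         {\sum_{q=1}^{K} \exp\!\big(b_i^q + \sum_{j} h_j W_{ij}^{q}\big)}
\end{align}

% Weight sharing across positions: W_{ij}^k = W_j^k and b_i^k = b^k,
% with word counts \hat{v}^k = \sum_{i=1}^{D} v_i^k.
\begin{align}
E(V,h) &= -\sum_{j=1}^{F}\sum_{k=1}^{K} W_{j}^{k}\, h_j\, \hat{v}^k
          - \sum_{k=1}^{K} \hat{v}^k\, b^k
          - D \sum_{j=1}^{F} h_j\, a_j \\[4pt]
p(v_i^k = 1 \mid h) &=
    \frac{\exp\!\big(b^k + \sum_{j} h_j W_{j}^{k}\big)}
         {\sum_{q=1}^{K} \exp\!\big(b^q + \sum_{j} h_j W_{j}^{q}\big)},
\qquad
p(h_j = 1 \mid V) = \sigma\!\Big(D\, a_j + \sum_{k=1}^{K} \hat{v}^k W_{j}^{k}\Big)
\end{align}
```

The denominator in the second line comes from summing over the K one-hot states of each position independently, which is what produces the softmax. On the dimensions: in the general form the weights carry three indices (i, j, k) and the visible biases two (i, k). After weight sharing, the weights reduce to a K x F matrix and the visible biases to a length-K vector, so every position shares the same softmax over the dictionary.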