Neural nets sum up weights, but RBMs... multiply weights into a probability? So is an RBM kind of like a bidirectional neural net that multiplies its weights instead of adding them?
Can you look at an RBM as being a kind of multiplicative NN?
2k views Asked by Josh T At
First off, a restricted Boltzmann machine (RBM) is itself a type of neural network, so "NN versus RBM" is not quite the right comparison. I think by NN you really mean the traditional feedforward neural network (FNN). Also, note that neither feedforward neural networks nor RBMs are fully connected networks in the graph-theoretic sense. There, "fully connected" means that each node is connected to every other node, which is clearly not the case here. Adjacent layers are, however, fully connected to one another.
Traditional feedforward neural networks
The traditional FNN model is a supervised learning algorithm for modelling data. To train this network one needs a dataset containing labelled instances. One presents each item to the network, computes the activations layer by layer until the output layer is reached, and then compares the output with the target output (the label). One then typically uses the backpropagation algorithm to obtain the gradient of the error with respect to the weights and biases of each unit, and updates these parameters via gradient descent. Typically, either the entire dataset or batches of it are passed through the network in one go, and the parameter updates are computed with respect to all the items in the batch.
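The training loop described above can be sketched in a few lines of plain Python. This is only a minimal illustration under my own assumptions, not a reference implementation: the network size, the sigmoid activation, the squared-error loss, the learning rate and the toy logical-OR dataset (`OR_DATA`) are all choices made for the example, and the function names `forward` and `train` are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, b1, W2, b2):
    """Compute activations layer by layer: input -> hidden -> output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def train(data, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent on a 1-hidden-layer network (assumed setup)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    n = len(data)
    losses = []
    for _ in range(epochs):
        # Gradient accumulators for the whole batch
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        loss = 0.0
        for x, t in data:
            h, y = forward(x, W1, b1, W2, b2)
            loss += 0.5 * (y - t) ** 2        # compare output with the label
            # Backpropagation: error signal at the output unit ...
            d_out = (y - t) * y * (1 - y)
            gb2 += d_out
            for j in range(n_hidden):
                # ... pushed back through W2 to hidden unit j
                d_hid = d_out * W2[j] * h[j] * (1 - h[j])
                gW2[j] += d_out * h[j]
                gb1[j] += d_hid
                for i in range(n_in):
                    gW1[j][i] += d_hid * x[i]
        # Gradient descent step using the batch-averaged gradients
        for j in range(n_hidden):
            W2[j] -= lr * gW2[j] / n
            b1[j] -= lr * gb1[j] / n
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i] / n
        b2 -= lr * gb2 / n
        losses.append(loss / n)
    return (W1, b1, W2, b2), losses

# Toy labelled dataset (logical OR), chosen only for illustration
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
params, losses = train(OR_DATA)
```

Note that every unit here computes a weighted sum of its inputs before applying the sigmoid, and that the labels `t` drive the updates. Both points matter for the comparison with RBMs.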
RBMs
The RBM model is a version of the Boltzmann machine (BM) model that has been restricted for computational efficiency: RBMs are BMs without connections between units in the same layer. This isn't the place to go into detail, but I will point you to some external resources. There are a number of variations of the algorithm, and the explanations online neither make this clear nor are very useful for the inexperienced.
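To connect this with the "multiplication" in the question, here is the standard formulation of a binary RBM as I understand it. The symbols are my own notational assumptions and do not come from the question: $v_i$ and $h_j$ are visible and hidden unit states, $b_i$ and $c_j$ their biases, $w_{ij}$ the weight between them, $\sigma$ the logistic sigmoid and $Z$ the normalising constant.

```latex
E(v, h) = -\sum_i b_i v_i \;-\; \sum_j c_j h_j \;-\; \sum_{i,j} v_i \, w_{ij} \, h_j

P(v, h) = \frac{1}{Z} \, e^{-E(v, h)}

P(h_j = 1 \mid v) = \sigma\Big( c_j + \sum_i v_i \, w_{ij} \Big)
\qquad
P(v_i = 1 \mid h) = \sigma\Big( b_i + \sum_j w_{ij} \, h_j \Big)
```

The energy contains no visible-visible or hidden-hidden terms, which is exactly the restriction. Each unit still sums its weighted inputs, just as in an FNN. A product appears only because the exponential of a sum factorises into a product of per-connection terms, so I would not describe an RBM as multiplying its weights.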
Neural networks are algorithms for fitting models to datasets. In an RBM, we attempt to do this using two layers of nodes: a "visible layer" that we set to the input and a "hidden layer" that we use to model the input layer. Crucially, the learning process is unsupervised. Training involves using the hidden layer to reconstruct the visible layer and updating the weights and biases using the difference between the node states before and after reconstruction. (I have very much simplified this explanation; for more information, note that this training algorithm is called contrastive divergence (CD).) Also note that neurons are activated probabilistically in this model. The connections between the two layers are bidirectional (undirected, with the same weight used in both directions), and because there are no connections within a layer, the network forms a bipartite graph.
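The reconstruction-based update can be sketched as a single step of contrastive divergence (CD-1) for a binary RBM. Treat this as a rough illustration under stated assumptions rather than the definitive algorithm: the function names (`cd1_step`, `hidden_probs`, `visible_probs`, `sample`), the layer sizes, the learning rate and the two toy patterns are all made up for the example, and using hidden probabilities rather than sampled states in the update is just one of several common variants.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(probs, rng):
    """Turn each unit on with its activation probability."""
    return [1 if rng.random() < p else 0 for p in probs]

def hidden_probs(v, W, c):
    # W[i][j] connects visible unit i to hidden unit j; c holds hidden biases
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    # The same weights are used in the other direction; b holds visible biases
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def cd1_step(v0, W, b, c, lr, rng):
    """One CD-1 update from a single visible vector v0 (modifies W, b, c)."""
    ph0 = hidden_probs(v0, W, c)      # hidden activations driven by the data
    h0 = sample(ph0, rng)             # probabilistic activation
    pv1 = visible_probs(h0, W, b)     # reconstruction of the visible layer
    v1 = sample(pv1, rng)
    ph1 = hidden_probs(v1, W, c)      # hidden activations from reconstruction
    for i in range(len(b)):
        for j in range(len(c)):
            # Difference between statistics before and after reconstruction
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(len(b)):
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    # Squared reconstruction error, useful only as a rough progress signal
    return sum((a - p) ** 2 for a, p in zip(v0, pv1))

# Toy unlabelled data and a tiny RBM (all sizes are illustrative assumptions)
rng = random.Random(0)
n_visible, n_hidden = 6, 2
W = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
patterns = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
errors = []
for epoch in range(200):
    for v in patterns:
        errors.append(cd1_step(v, W, b, c, lr=0.1, rng=rng))
```

No labels appear anywhere in this loop, which is the sense in which the training is unsupervised.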
Importantly, RBMs do not produce an output in the same way FNNs do. Because of this, they are often used to pre-train a network: the weights learned by the RBM are used to initialise another model, such as a deep autoencoder or a network with an added output layer, which is then trained further with another algorithm such as backpropagation.
Check out these resources:
In general
The performance of any network depends on its parameters and design choices as well as the problem to which it is applied. RBMs and FNNs are suited to different kinds of problems.
I highly recommend Geoffrey Hinton's course "Neural Networks for Machine Learning" on Coursera - the course has taken place but the lectures are available for free.