I'm trying to figure out whether it's more efficient to run an RNN over the whole input sequence, then run another RNN on those outputs, and so on (one horizontal layer at a time), or to run a single time step through all layers before advancing to the next step (one vertical slice at a time).
I know TensorFlow's MultiRNNCell class does the latter. Why is that method chosen over the former? Is the former equally efficient? Are there cases where stepping through all layers one time step at a time is preferable?
See http://karpathy.github.io/2015/05/21/rnn-effectiveness/ for reference on multi-layer RNNs.
1: How to easily implement an RNN

Use an LSTM cell; it generally works better than a plain RNN cell (LSTMs mitigate the vanishing-gradient problem), and TensorFlow makes stacking them easy:

    # TF 1.x API. Build a fresh cell per layer rather than reusing one
    # object, so each layer gets its own weights.
    import tensorflow as tf
    from tensorflow.python.ops.rnn_cell import BasicLSTMCell

    stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
        [BasicLSTMCell(state_dim) for _ in range(num_layers)],
        state_is_tuple=True)
Find out more on the TensorFlow website: https://www.tensorflow.org/tutorials/recurrent/
2: Horizontal or deep?

Just as you can have a multi-layer neural network, you can have a multi-layer RNN. Think of an RNN cell as a layer within your network, a special layer that lets it remember sequential inputs. In my experience you will still have linear transforms (depth) elsewhere in the network, but whether to stack multiple LSTM layers depends on your network topology, preference, and computational budget (the more the merrier, if you can afford it).

The number of inputs and outputs depends on your problem, and as far as I can remember there is no such thing as multiple "horizontal" RNN cells, only depth: at each time step, one input is pushed through all layers before the next input is processed. The MultiRNNCell function you referenced is awesome; it handles all of that under the hood. Just tell it which cells you want stacked and it does the rest.
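On the efficiency question itself: for a standard stacked RNN, the two orders produce identical numbers, since layer l at time t depends only on layer l-1 at time t and on layer l's own state at time t-1. The choice is an implementation detail, and running all layers per time step makes variable-length sequences and per-step outputs easy. A minimal NumPy sketch (hypothetical vanilla-RNN cells, not the TensorFlow API) demonstrating the equivalence:

```python
import numpy as np

def rnn_step(x, h, Wx, Wh, b):
    # One vanilla-RNN update: h' = tanh(x @ Wx + h @ Wh + b)
    return np.tanh(x @ Wx + h @ Wh + b)

rng = np.random.default_rng(0)
T, D, H, L = 5, 3, 4, 2  # time steps, input dim, hidden dim, layers
xs = rng.standard_normal((T, D))
# Per-layer weights: layer 0 reads D-dim inputs, deeper layers read H-dim.
params = [(rng.standard_normal((D if l == 0 else H, H)),
           rng.standard_normal((H, H)),
           rng.standard_normal(H)) for l in range(L)]

# Order A ("horizontal"): run each layer over the whole sequence,
# then feed its outputs to the next layer.
seq = xs
for Wx, Wh, b in params:
    h, outs = np.zeros(H), []
    for x in seq:
        h = rnn_step(x, h, Wx, Wh, b)
        outs.append(h)
    seq = np.stack(outs)
out_horizontal = seq

# Order B ("vertical", what MultiRNNCell does): at each time step,
# push the input through all layers before advancing.
hs = [np.zeros(H) for _ in range(L)]
outs = []
for x in xs:
    inp = x
    for l, (Wx, Wh, b) in enumerate(params):
        hs[l] = rnn_step(inp, hs[l], Wx, Wh, b)
        inp = hs[l]
    outs.append(inp)
out_vertical = np.stack(outs)

assert np.allclose(out_horizontal, out_vertical)  # same numbers either way
```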
Good Luck