I'm trying to implement an LSTM in CNTK (using Python) to classify a sequence.
Input:
Features are fixed-length sequences of numbers (a time series).
Labels are vectors of one-hot values.

Network:

    input = input_variable(input_dim)
    label = input_variable(num_output_classes)
    h = Recurrence(LSTM(lstm_dim))(input)
    final_output = C.sequence.last(h)
    z = Dense(num_output_classes)(final_output)
    loss = C.cross_entropy_with_softmax(z, label)

Output: A probability that the sequence matches a label.
All sizes are fixed, so I don't think I need any dynamic axis and haven't specified any.
However, CNTK is not happy and I get:

    return cross_entropy_with_softmax(output_vector, target_vector, axis, name)
    RuntimeError: Currently if an operand of a elementwise operation has any dynamic axes, those must match the dynamic axes of the other operands

If (as per some of the examples) I define label with a dynamic axis:

    label = input_variable(num_output_classes, dynamic_axes=[C.Axis.default_batch_axis()])

it no longer complains about this, and gets further, to:

    tf = np.split(training_features, num_minibatches)
    tl = np.split(training_labels, num_minibatches)
    for i in range(num_minibatches*num_passes): # multiply by the
        features = np.ascontiguousarray(tf[i%num_minibatches])
        labels = np.ascontiguousarray(tl[i%num_minibatches])
        # Specify the mapping of input variables in the model to actual minibatch data to be trained with
        trainer.train_minibatch({input : features, label : labels})

But it dies with this error:

    File "C:\Users\Dev\Anaconda3\envs\cntk-py34\lib\site-packages\cntk\cntk_py.py", line 1745, in train_minibatch
    return _cntk_py.Trainer_train_minibatch(self, *args)
    RuntimeError: Node '__v2libuid__Plus561__v2libname__Plus552' (Plus operation): DataFor: FrameRange's dynamic axis is inconsistent with matrix: {numTimeSteps:1, numParallelSequences:100, sequences:[{seqId:0, s:0, begin:0, end:1}, {seqId:1, s:1, begin:0, end:1}, {seqId:2, s:2, begin:0, end:1}, {seqId:3, s:3, begin:0, end:1}, {seq...

What do I need to do to fix this?
If I understand correctly, you have sequences of one-dimensional inputs. If so, your troubles stem from this line:

    input = input_variable(input_dim)

which declares a sequence of input_dim-dimensional vectors. If you change it to

    input = input_variable(1)

then I believe your initial attempt should work.
Update: The above is not sufficient by itself, because taking the last element of a sequence creates an output whose dynamic axes differ from the default dynamic axes with which the label is created. An easy fix is to define the label after you have defined the output z, so that the label can reuse z's dynamic axes, like this:

    label = input_variable(num_output_classes, dynamic_axes=z.dynamic_axes)

This works without any complaints for me. I then fed it some dummy data (assuming a minibatch of 4, a sequence length of 5, and 3 classes), and it worked as expected.
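A minimal sketch of dummy data with those sizes might look like the following. It uses NumPy only, and the names `features`, `labels`, `class_ids`, and `last_step` are illustrative rather than taken from the post. The slice at the end is just a stand-in that shows why the label needs one entry per sequence rather than one per time step. The exact array layout CNTK expects for the label (for example, whether an extra length-1 axis is needed) is an assumption that may depend on the CNTK version.

```python
import numpy as np

# Sizes from the answer: minibatch of 4, sequence length 5, 3 classes.
batch_size, seq_len, num_classes = 4, 5, 3

# Features: 4 sequences, each with 5 time steps holding a 1-dimensional value.
features = np.arange(batch_size * seq_len, dtype=np.float32).reshape(batch_size, seq_len, 1)

# Labels: one one-hot vector per sequence (not per time step).
class_ids = np.array([0, 1, 2, 0])  # arbitrary classes, for illustration only
labels = np.eye(num_classes, dtype=np.float32)[class_ids]

# Stand-in for "take the last element of each sequence": the time axis
# disappears, leaving one row per sequence, which is what the label must match.
last_step = features[:, -1, :]

print(features.shape, last_step.shape, labels.shape)
```

With arrays like these, the feed dictionary would map `input` to `features` and `label` to `labels`, in the same way as the `train_minibatch` call in the question.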