I am trying to train a custom subclassed TensorFlow model. The model takes a (batch_size, nH, nW, 1, 2) tensor as input and produces a (batch_size, n) output. Training works smoothly if I store the training data as a tensor and feed it to:
model = custom_model()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=tf.keras.losses.MeanAbsoluteError())
model.fit(x=train, y=target, epochs=epochs)
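For context, custom_model is a subclassed tf.keras.Model. My real model is larger, but a minimal stand-in with the same input/output shapes looks roughly like this (the Conv3D/Dense layers are placeholders, not my actual architecture; n, nH, nW are defined elsewhere in my code):

class custom_model(tf.keras.Model):
    """Minimal sketch: maps (batch, nH, nW, 1, 2) to (batch, n)."""
    def __init__(self):
        super().__init__()
        # Conv3D treats the trailing axis (size 2) as channels; building a
        # conv layer is the kind of code that raises the channel_axis error below
        self.conv = tf.keras.layers.Conv3D(16, 3, padding="same", activation="relu")
        self.flatten = tf.keras.layers.Flatten()
        self.dense = tf.keras.layers.Dense(n)  # n is the output size

    def call(self, inputs):
        x = self.conv(inputs)
        x = self.flatten(x)
        return self.dense(x)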
However, I want to train on larger datasets that don't fit in GPU memory. To this end I define a training pipeline using a tf.data.Dataset as follows:
tf_dataset = tf.data.Dataset.from_tensor_slices(range(100000))
tf_dataset = tf_dataset.map(data_pipeline_tf)
tf_dataset = tf_dataset.batch(batch_size)
Each element of tf_dataset yields a (batch_size, nH, nW, 1, 2) input and a (batch_size, n) target.
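For reference, data_pipeline_tf maps an integer index to one training pair, roughly as sketched below (load_sample is a placeholder for my actual loading code). I have read that map functions going through tf.py_function or similar ops lose the static shape, and that reattaching it with tf.ensure_shape is supposed to help, though I have not confirmed this fixes my case:

def data_pipeline_tf(idx):
    x, y = load_sample(idx)  # placeholder for the real per-index loading logic
    # reattach the per-sample static shapes in case map lost them (e.g. via
    # tf.py_function); without them, layers cannot infer the channel
    # dimension when they build
    x = tf.ensure_shape(x, [nH, nW, 1, 2])
    y = tf.ensure_shape(y, [n])
    return x, y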
I then train the model using:
model.fit(x=tf_dataset, epochs=epochs)
However, this gives me the following error:
if input_shape.dims[channel_axis].value is None:
TypeError: 'NoneType' object is not subscriptable
The dimensions of the tensors returned by tf_dataset seem to be correct, since I get an output with the desired shape if I do
for el in tf_dataset.take(1):
    model(el[0])
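Since model.fit builds the model from the static shapes the dataset advertises (rather than from a concrete batch like the loop above), I believe the relevant thing to inspect is element_spec:

# a None dimension or a fully unknown shape here would explain why
# fit fails even though calling the model on a real batch works
print(tf_dataset.element_spec)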
So the problem seems to occur only when I call model.fit(x=tf_dataset).
Basically, the exact same problem was reported here and here, with no satisfactory resolution using the model.fit() method.
I am using a Colab notebook with TF version 2.15.0 and a T4 GPU.
How do I resolve this issue? Any help is appreciated.
Thanks
One of the suggestions was to use run_eagerly=True. That worked when I was testing the above process on a CPU, but it was really slow.
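Concretely, that suggestion amounts to:

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=tf.keras.losses.MeanAbsoluteError(),
              run_eagerly=True)  # executes train steps eagerly; hence the slowdown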
I also tried model.build((batch_size, nH, nW, 1, 2)) after compiling the model but before calling model.fit(). However, this led to another error.
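Another workaround I have seen suggested, but have not verified on this model, is to build by calling the model once on a dummy batch instead of using model.build:

_ = model(tf.zeros((1, nH, nW, 1, 2)))  # forces variable creation with fully known shapes
model.fit(x=tf_dataset, epochs=epochs)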