I am training a time series forecasting model in TensorFlow. I create a tf.data.Dataset containing batches of windows of data using the approach presented in the example notebook https://www.tensorflow.org/tutorials/structured_data/time_series#4_create_tfdatadatasets.
However, profiling shows that this results in an input pipeline bottleneck. The TensorBoard profiler suggests performing the map operation offline, but I have not found a way to do so. I tried changing the number of parallel calls in map and also applying prefetch and cache transformations, without any improvement in training time; a sketch of what I tried is below.
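For reference, this is roughly the pipeline I had before the change described below (split is my window-splitting function from the tutorial; the AUTOTUNE values are just one of the variants I tried):

ds = tf.keras.utils.timeseries_dataset_from_array(
    data=data,
    targets=None,
    sequence_length=self.total_window_size,
    sequence_stride=1,
    shuffle=True,
    batch_size=32,
)
# Split each batch of windows into (inputs, labels) on the fly, as in the tutorial.
ds = ds.map(split, num_parallel_calls=tf.data.AUTOTUNE)
ds = ds.cache()
ds = ds.prefetch(tf.data.AUTOTUNE)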
Finally, I decided to replace the Dataset.map() step with a simple for loop that does the splitting eagerly, like this:
ds = tf.keras.utils.timeseries_dataset_from_array(
    data=data,
    targets=None,
    sequence_length=self.total_window_size,
    sequence_stride=1,
    shuffle=True,
    batch_size=32,
)

input_tensor_list = []
labels_tensor_list = []
# Split each batch of windows into inputs and labels eagerly, outside the graph.
for window_batch in ds.as_numpy_iterator():
    input_tensor, labels_tensor = split(window=window_batch)
    input_tensor_list.append(input_tensor)
    labels_tensor_list.append(labels_tensor)

# Stack the per-batch tensors and rebuild a dataset of ready-made batches.
result_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.stack(input_tensor_list), tf.stack(labels_tensor_list)))
This reduces my training time by about 25%. However, because tf.stack only accepts tensors of equal shape, this approach forces me to drop the last (smaller) batch, which the map-based approach did not require.
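Right now I work around this by skipping the short final batch before stacking, roughly like this (a rough sketch, assuming batch_size=32 as above):

for window_batch in ds.as_numpy_iterator():
    if window_batch.shape[0] < 32:
        # Skip the final partial batch so tf.stack only sees equally shaped tensors.
        continue
    input_tensor, labels_tensor = split(window=window_batch)
    input_tensor_list.append(input_tensor)
    labels_tensor_list.append(labels_tensor)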
I would like to know 1) whether there is a way to speed up map as used in the example notebook, or 2) how I can modify my code to avoid having to drop the last batch of data. Thank you.