How does tf.data.Dataset.load() handle loading when multiple datasets have been saved under the same name via tf.data.Dataset.save()?


I'm new to TensorFlow datasets and I'm trying to understand their inner workings.

As far as I can tell, tf.data.Dataset.save(dataset, path/name) creates a new folder named "name" at "path". The folder contains a "dataset_spec.pb" and a "snapshot.metadata" file, which I assume hold metadata, as well as another folder with a seemingly random number as its name, containing the data itself.
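A minimal sketch of that layout (the local path "demo_ds" is an arbitrary name chosen for illustration):

```python
import os
import tensorflow as tf

# Save a small dummy dataset, then inspect the directory save() created.
tf.data.Dataset.range(3).save("demo_ds")

print(sorted(os.listdir("demo_ds")))
# Typically contains 'dataset_spec.pb', 'snapshot.metadata', and one
# numerically named subdirectory holding the shard data itself.
```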

If I save another dataset with the same path/name, I would expect it to replace all of the content; however, it only replaces the metadata files and adds another "randomly named" folder.

How does tf.data.Dataset.load(path/name) handle this situation? Is it only capable of loading the latest save (as indicated by the metadata files), or can I control which folder is chosen? If the latter is true, I would expect to do it via the "reader_func" argument that load() takes. If so, is there any decent documentation?
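For what it's worth, here is a hedged sketch of the reader_func hook as I understand it from the docs: reader_func receives a dataset of shard datasets and must return a single dataset. The path "saved_ds" and the interleave choice are illustrative assumptions, not a statement about which snapshot folder gets picked:

```python
import tensorflow as tf

# Save a dummy dataset to an illustrative local path.
tf.data.Dataset.range(5).save("saved_ds")

def custom_reader_func(datasets):
    # 'datasets' is a dataset of shard datasets; roughly mirroring the
    # default behavior, interleave the shards into one dataset. One could
    # also shuffle or filter the shard datasets here.
    return datasets.interleave(lambda d: d,
                               num_parallel_calls=tf.data.AUTOTUNE)

loaded = tf.data.Dataset.load("saved_ds", reader_func=custom_reader_func)
print(sorted(int(x) for x in loaded.as_numpy_iterator()))  # [0, 1, 2, 3, 4]
```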

Thank you in advance for your help.

Stack Overflow asked me to describe what I tried to fix my problem. As this is a purely informational question and not related to a concrete coding problem, feel free to skip this part.

I saved multiple dummy datasets with different values using tf.data.Dataset.save(dataset, "same_name"). At first I expected it to overwrite the data, but I noticed that the folder "same_name" accumulated more and more entries. Afterwards I used tf.data.Dataset.load("same_name"), half expecting to get random sets or a concatenation of the different datasets. However, I only got the latest addition.
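The experiment above can be sketched as follows (the path "same_name" matches the question; the values are arbitrary, and the comment reflects the behavior I observed rather than documented guarantees):

```python
import tensorflow as tf

# Save two different datasets to the same path, one after the other.
tf.data.Dataset.range(3).save("same_name")       # first save: 0, 1, 2
tf.data.Dataset.range(10, 13).save("same_name")  # second save: 10, 11, 12

reloaded = tf.data.Dataset.load("same_name")
print([int(x) for x in reloaded.as_numpy_iterator()])
# Observed: only the latest save comes back, i.e. [10, 11, 12]
```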
