Dear ImageNet and TensorFlow specialists,
I originally downloaded the ImageNet dataset tar files ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar and used this PyTorch script to extract them. This results in the following folder structure:
# imagenet/train/
# ├── n01440764
# │ ├── n01440764_10026.JPEG
# │ ├── n01440764_10027.JPEG
# │ ├── ......
# ├── ......
# imagenet/val/
# ├── n01440764
# │ ├── ILSVRC2012_val_00000293.JPEG
# │ ├── ILSVRC2012_val_00002138.JPEG
# │ ├── ......
# ├── ......
How can I load this resulting folder structure into a TensorFlow Dataset (tfds)? An example is shown here, but it assumes extraction from the tar files (which I no longer have available). Please help : )
You don't need to use TensorFlow Datasets (`tfds`) to load the ImageNet dataset from the existing folder structure; you can instead build a `tf.data.Dataset` directly with the TensorFlow library.
Using the Keras API:
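A minimal sketch of such a call, assuming one subfolder per class as in the question. The `load_split` helper name, the 224x224 image size, and the batch size of 32 are illustrative assumptions, and the two paths are placeholders.

```python
import tensorflow as tf

# Placeholder paths: point these at the extracted train/ and val/ folders.
TRAIN_DIR = "path_to/imagenet/train"
VAL_DIR = "path_to/imagenet/val"


def load_split(directory, image_size=(224, 224), batch_size=32, shuffle=True):
    """Build a batched tf.data.Dataset from a folder with one subfolder per class."""
    return tf.keras.utils.image_dataset_from_directory(
        directory,
        labels="inferred",         # take labels from the subfolder names
        label_mode="categorical",  # one-hot encoded label vectors
        image_size=image_size,     # images are resized to this (height, width)
        batch_size=batch_size,
        shuffle=shuffle,
    )
```

With this helper, `train_ds = load_split(TRAIN_DIR)` and `val_ds = load_split(VAL_DIR, shuffle=False)` build the two datasets.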
That will create TensorFlow datasets from your folder structure.
Replace `'path_to/imagenet/train'` and `'path_to/imagenet/val'` with the actual paths to the training and validation directories.
The function `tf.keras.utils.image_dataset_from_directory` automatically infers the labels from the subdirectory names in the provided directory path. With `label_mode='categorical'`, the labels are one-hot encoded, which suits multi-class classification tasks.
See "Building a One Hot Encoding Layer with TensorFlow" by George Novack for an illustration of that concept.
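To picture how labels follow from the folder names, here is a rough sketch in plain Python. It assumes class indices are assigned in sorted order of the subdirectory names; `infer_class_indices` is a hypothetical helper for illustration, not the library's own implementation, which may differ in details.

```python
import os


def infer_class_indices(directory):
    """Map each class subdirectory name to an integer index in sorted order."""
    class_names = sorted(
        entry
        for entry in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, entry))  # ignore stray files
    )
    return {name: index for index, name in enumerate(class_names)}
```

For the ImageNet layout above, this would map synset IDs such as `n01440764` to indices, so the same folder names in `train/` and `val/` get the same index.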
The `label_mode='categorical'` argument of `tf.keras.utils.image_dataset_from_directory` instructs TensorFlow to one-hot encode the labels: each label becomes a binary vector in which 1 marks the correct class and 0 marks every other class.
That one-hot format suits multi-class classification because each instance belongs to exactly one of several possible classes (as opposed to `'binary'`, which is for the case of only two possible classes).
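As a small illustration of that encoding, the sketch below builds one-hot vectors with NumPy. The `one_hot` helper and the toy indices are assumptions for demonstration only; with `label_mode='categorical'` the function does this for you.

```python
import numpy as np


def one_hot(indices, num_classes):
    """Return a float32 array with a single 1.0 per row at the class index."""
    encoded = np.zeros((len(indices), num_classes), dtype=np.float32)
    encoded[np.arange(len(indices)), indices] = 1.0
    return encoded


# Three hypothetical samples with class indices 0, 2 and 1 out of 4 classes.
print(one_hot([0, 2, 1], num_classes=4))
```

For ImageNet the same idea applies with `num_classes=1000`, giving one 1,000-element vector per image.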
In the context of the ImageNet dataset, there are 1,000 possible classes, so one-hot encoding is a suitable choice for representing the class labels.
The function returns a `tf.data.Dataset` object that yields batches of images and labels. Images are `float32` tensors with pixel values between 0 and 255 (the function does not rescale them), and the labels are also in `float32` format.
You can then use these datasets to train your model.
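Because `image_dataset_from_directory` leaves pixel values in the 0 to 255 range, you may want to scale them before training. The following is a minimal sketch, assuming a simple division by 255 is the normalization your model expects; `scale_and_prefetch` is a hypothetical helper name.

```python
import tensorflow as tf


def scale_and_prefetch(ds):
    """Scale pixel values to [0, 1] and prefetch batches in the background."""
    ds = ds.map(
        lambda images, labels: (tf.cast(images, tf.float32) / 255.0, labels),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    return ds.prefetch(tf.data.AUTOTUNE)
```

The resulting datasets can be passed to `model.fit(train_ds, validation_data=val_ds)` for a compiled Keras model with a 1,000-way softmax output and a `categorical_crossentropy` loss. Pretrained models often expect their own preprocessing instead, so check the one that matches your architecture.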