Keras image augmentation: How to choose "steps per epoch" parameter and include specific augmentations during training?


I am training an image classification CNN using Keras. Using the ImageDataGenerator class, I apply some random transformations to the training images (e.g. rotation, shearing, zooming). My understanding is that these transformations are applied randomly to each image before it is passed to the model.

But some things are not clear to me:

1) How can I make sure that specific rotations of an image (e.g. 90°, 180°, 270°) are ALL included during training?

2) The steps_per_epoch parameter of model.fit_generator should be set to the number of unique samples in the dataset divided by the batch size defined in the flow_from_directory method. Does this still apply when using the above-mentioned image augmentation methods, since they increase the number of training images?

Thanks, Mario

Answer by captainst:

I asked myself the same questions some time ago, and I think a possible explanation is the following.

Consider this example:

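    from keras.preprocessing.image import ImageDataGenerator

    # Every parameter below defines a range (or switch) for a random
    # transformation that is drawn anew each time an image is yielded.
    aug = ImageDataGenerator(rotation_range=90, width_shift_range=0.1,
                             height_shift_range=0.1, shear_range=0.2,
                             zoom_range=0.2, horizontal_flip=True,
                             fill_mode="nearest")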

For question 1): I specify rotation_range=90, which means that as you flow (retrieve) the data, the generator rotates each image by a random angle drawn from the range -90 to +90 degrees. You cannot specify an exact set of angles through rotation_range, because that is what ImageDataGenerator does: it generates the rotation randomly. This randomness also matters for your second question.
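If you need exact right-angle rotations rather than a random angle, one option is to do the rotation yourself with NumPy. The following is only a minimal sketch, not something built into ImageDataGenerator: the function names random_right_angle_rotation and all_right_angle_rotations are hypothetical, and it assumes square images in (height, width, channels) layout, so that a 90° turn does not change the array shape.

```python
import numpy as np


def random_right_angle_rotation(image, rng=np.random):
    """Rotate an (H, W, C) image by 0, 90, 180 or 270 degrees, picked uniformly.

    Hypothetical helper; assumes H == W so the output shape equals the input shape.
    """
    k = rng.randint(0, 4)  # number of 90-degree turns: 0, 1, 2 or 3
    return np.rot90(image, k=k, axes=(0, 1))


def all_right_angle_rotations(image):
    """Return the four copies of the image rotated by 0, 90, 180 and 270 degrees."""
    return [np.rot90(image, k=k, axes=(0, 1)) for k in range(4)]
```

The first function could be passed as preprocessing_function=random_right_angle_rotation when constructing ImageDataGenerator, in which case each of the four angles becomes very likely to appear for every image over many epochs, but is still not guaranteed. If all four rotations must be present, the second function can be used to expand the dataset once before training.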

For question 2): Yes, it still applies when using data augmentation. I was also confused in the beginning. The generator does not add images to the dataset; it yields one randomly transformed version of each original image per pass, so an epoch still covers the same number of samples. Because the transformations are drawn anew each time, the network sees different variants of the images in each epoch than in the previous one. That is why the data is "augmented": the augmentation does not happen within an epoch, but across the entire training process. That said, I have seen other people specify twice the original steps_per_epoch value.
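As a rough illustration of the arithmetic, here is a small sketch. The helper name steps_per_epoch and the passes_per_epoch argument are made up for this example; passes_per_epoch=2 mimics the "2x" practice mentioned above. Rounding up is used so that the last, partially filled batch is still counted.

```python
import math


def steps_per_epoch(num_samples, batch_size, passes_per_epoch=1):
    """Number of batches needed to cover the dataset `passes_per_epoch` times.

    Hypothetical helper: augmentation does not change num_samples, it only
    changes what each yielded image looks like.
    """
    return math.ceil(passes_per_epoch * num_samples / batch_size)


# Example with assumed numbers: 1000 training images, batch size 32.
steps = steps_per_epoch(1000, 32)             # 1000 / 32 = 31.25, rounded up to 32
steps_doubled = steps_per_epoch(1000, 32, 2)  # 2000 / 32 = 62.5, rounded up to 63
```

The resulting value would then be passed as steps_per_epoch to model.fit_generator.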

Hope this helps