Keras : Dealing with large image datasets


I am trying to fit a model on a large image dataset. I have 14 GB of RAM, and the dataset is 40 GB. I tried to use fit_generator, but I end up with a method that does not release the loaded batches after using them.

If there is any way to solve this problem, or any resources on it, please point me to them.

Thanks.

The generator code is:

import numpy as np
import pandas as pd
from keras.utils import Sequence


class Data_Generator(Sequence):

    def __init__(self, image_filenames, labels, batch_size):
        self.image_filenames, self.labels = image_filenames, labels
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.image_filenames) / float(self.batch_size)))

    def __format_labels__(self, gd_truth):
        # Split the ground-truth DataFrame into one array per output column
        cols = gd_truth.columns
        y = []
        for col in cols:
            y.append(gd_truth[col].values)
        return y

    def __getitem__(self, idx):
        batch_x = self.image_filenames[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.labels[idx * self.batch_size:(idx + 1) * self.batch_size]
        gd_truth = pd.DataFrame(data=batch_y, columns=self.labels.columns)
        # read_image is defined elsewhere; it loads one image file into an array
        return np.array([read_image(file_name) for file_name in batch_x]), self.__format_labels__(gd_truth)
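The `read_image` helper is not shown in the question; a minimal sketch of what it might look like, assuming Pillow and a hypothetical `target_size` of 224×224 (both are assumptions, not from the original):

```python
import numpy as np
from PIL import Image

def read_image(file_name, target_size=(224, 224)):
    # Load one image from disk, resize it, and scale pixel values to [0, 1]
    img = Image.open(file_name).convert("RGB").resize(target_size)
    return np.asarray(img, dtype=np.float32) / 255.0
```

Because the generator calls this per batch, only `batch_size` images are decoded into memory at a time, rather than the whole 40 GB dataset.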

Then I have created two generators for train and validation images:

my_training_batch_generator = Data_Generator(training_filenames, trainTargets, batch_size)
my_validation_batch_generator = Data_Generator(validation_filenames, valTargets, batch_size)

The fit_generator call is as follows:

num_epochs = 10
model.fit_generator(generator=my_training_batch_generator,
                    steps_per_epoch=(num_training_samples // batch_size),
                    epochs=num_epochs,
                    verbose=1,
                    validation_data=my_validation_batch_generator,
                    validation_steps=(num_validation_samples // batch_size),
                    max_queue_size=16)
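As a sanity check on the batching arithmetic: the ceil-based `__len__` together with the slice in `__getitem__` covers every file exactly once per epoch, including a short final batch. A stand-alone check with synthetic filenames (no Keras required):

```python
import numpy as np

filenames = [f"img_{i}.png" for i in range(10)]
batch_size = 3

# Same arithmetic as Data_Generator.__len__ and __getitem__
num_batches = int(np.ceil(len(filenames) / float(batch_size)))
seen = []
for idx in range(num_batches):
    seen.extend(filenames[idx * batch_size:(idx + 1) * batch_size])

print(num_batches, len(seen))  # 4 batches cover all 10 files
```

Note also that with a `Sequence`, Keras only prefetches up to `max_queue_size` batches at once, so with `max_queue_size=16` roughly 16 batches of images can sit in RAM at any time; lowering this value is one way to reduce memory pressure.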
