I want to use transfer learning with Google's Inception network for an image recognition problem. I am using retrain.py from the TensorFlow examples for inspiration and following this official tutorial from Google.
However, its input pipeline is described as: "each step chooses 10 images (10 is batch_size) at random from the training set, finds their bottlenecks from the cache, and feeds them into the final layer to get predictions."
Specifically, in the code of retrain.py, a batch of batch_size images is assembled by selecting each image at random. For each image, its image index is found by
image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
where MAX_NUM_IMAGES_PER_CLASS = 2^27 - 1 (about 134 million); image_index is then reduced modulo the number of images available for that label (inside get_image_path).
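To make that two-step indexing concrete, here is a minimal sketch; the file list is hypothetical, and in retrain.py the modulo itself happens later, in get_image_path:

```python
import random

MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1  # constant from retrain.py

# Hypothetical per-class file list, for illustration only.
training_images = ['cat_001.jpg', 'cat_002.jpg', 'cat_003.jpg']

# Step 1: draw a raw index anywhere in [0, MAX_NUM_IMAGES_PER_CLASS].
image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)

# Step 2: reduce it modulo the class size to pick an actual file.
mod_index = image_index % len(training_images)
print(training_images[mod_index])
```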
My questions are:
1) Why does it create batches like that (draw a random number up to a large constant, then take it modulo the size of each class)? Why don't we just draw a random number up to the size of each class directly?
2) If we choose a random set of images each time we feed the model, don't we miss a large portion of the data and get duplicate examples? Why isn't the input pipeline arranged in the usual way: many epochs, each epoch covering the full dataset split into batches, with each batch fed to the model? (See the sketch below for the contrast I mean.)
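For reference, a minimal sketch of the two batching schemes the question contrasts; the function names here are hypothetical, not from retrain.py:

```python
import random

def random_batches(items, batch_size, steps):
    """Sampling with replacement: each step draws batch_size items
    independently at random, as retrain.py's loop effectively does."""
    for _ in range(steps):
        yield [random.choice(items) for _ in range(batch_size)]

def epoch_batches(items, batch_size):
    """The usual arrangement: shuffle once, then slice the full
    dataset into non-overlapping batches (one pass = one epoch)."""
    order = list(items)
    random.shuffle(order)
    for i in range(0, len(order), batch_size):
        yield order[i:i + batch_size]

images = [f'img_{i:03d}.jpg' for i in range(25)]
print(next(random_batches(images, 10, 1)))  # may repeat or skip images
print(list(epoch_batches(images, 10)))      # covers every image exactly once
```

With the first scheme a given image may appear in several consecutive batches, or not appear for many steps; with the second, every image appears exactly once per epoch.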
Thank you very much!!
Reference
In retrain.py, train_bottlenecks and train_ground_truth (these are fed into the model) are created by get_random_distorted_bottlenecks. In this function, we loop until we have collected batch_size images: pick a random label, then pick a random index:
label_index = random.randrange(class_count)
label_name = list(image_lists.keys())[label_index]
image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
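Putting those three lines in context, here is a simplified, self-contained sketch of the selection loop. bottleneck_for is a hypothetical stand-in for the cache/distortion lookup, image_lists is flattened to label -> file list, and the real code builds one-hot ground-truth vectors rather than bare label indices:

```python
import random

MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1  # constant from retrain.py

def get_random_batch(image_lists, batch_size, bottleneck_for):
    """Simplified shape of the loop in get_random_distorted_bottlenecks.

    image_lists: dict mapping label name -> list of image file names.
    bottleneck_for: hypothetical callable (label_name, raw_index) -> features.
    """
    bottlenecks, ground_truths = [], []
    class_count = len(image_lists)
    for _ in range(batch_size):
        # Pick a random class, then a random raw index for that class.
        label_index = random.randrange(class_count)
        label_name = list(image_lists.keys())[label_index]
        image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
        bottlenecks.append(bottleneck_for(label_name, image_index))
        ground_truths.append(label_index)
    return bottlenecks, ground_truths
```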