I'm currently working on a CNN project and I'm new to this area. I have a set of 500 images of fabric defects. How can I increase the number of images to about 2000? Are there any libraries I can use for this?
Data Augmentation using Python
803 views Asked by Ruvy Rathnayake At
There are 2 answers
The go-to library for image augmentation is imgaug.
The documentation is self-explanatory, but here is an example:
import numpy as np
from imgaug import augmenters as iaa
from PIL import Image
# load the image and convert it to a NumPy array
image = np.array(Image.open("<path to image>"))
# the augmenter accepts a list of images, so the single image is wrapped in a list for this demonstration
image = [image]
# each of these augmentation techniques is applied with a certain probability
augmenter = iaa.Sequential([
    iaa.Fliplr(0.5),  # horizontal flips
    iaa.Crop(percent=(0, 0.1)),  # random crops
    iaa.Sometimes(
        0.5,
        iaa.GaussianBlur(sigma=(0, 0.5))
    ),
    iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5),
], random_order=True)  # apply augmenters in random order
augmented_image = augmenter(images=image)
augmented_image is now a list that contains one augmented version of the original image.
Since you said you want to create 2000 images from 500, you can augment each image 4 times, like this:
total_images = []
# image_paths is assumed to be a list of paths to the original images
for image_path in image_paths:
    # load the image and convert it to a NumPy array
    image = np.array(Image.open(image_path))
    # create a list with four copies of the same image
    images = [image for i in range(4)]
    # pass it into the augmenter and get 4 different augmentations
    augmented_images = augmenter(images=images)
    # add all images to a list or save them some other way
    total_images += augmented_images
There are different data augmentation techniques, such as zooming, mirroring, rotating, and cropping. The idea is to create new images from your initial set so that the model has to take into account the new information introduced by these changes.
Several libraries let you do this: OpenCV, scikit-image, or Keras on top of TensorFlow, which provides a built-in high-level function for data generation.
I would recommend starting with simple and efficient techniques like mirroring and random cropping, and then continuing with color or contrast augmentation.
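As a rough illustration of mirroring and random cropping, here is a minimal sketch that uses only NumPy rather than any of the libraries named above. The helper names (mirror, random_crop, augment), the 90% crop size, and the random dummy array standing in for a fabric photo are all assumptions made for this example.

```python
import numpy as np

def mirror(image):
    """Flip an H x W (x C) image array left to right."""
    return image[:, ::-1]

def random_crop(image, crop_height, crop_width, rng):
    """Cut a randomly placed crop_height x crop_width window out of the image."""
    height, width = image.shape[:2]
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    return image[top:top + crop_height, left:left + crop_width]

def augment(image, n_copies, crop_fraction=0.9, seed=0):
    """Return n_copies variants: each is mirrored with probability 0.5, then cropped."""
    rng = np.random.default_rng(seed)
    crop_height = int(image.shape[0] * crop_fraction)
    crop_width = int(image.shape[1] * crop_fraction)
    copies = []
    for _ in range(n_copies):
        candidate = mirror(image) if rng.random() < 0.5 else image
        copies.append(random_crop(candidate, crop_height, crop_width, rng))
    return copies

# Hypothetical stand-in for a real photo: a 100 x 120 RGB array of random pixels.
dummy = np.random.default_rng(1).integers(0, 256, size=(100, 120, 3), dtype=np.uint8)
augmented = augment(dummy, n_copies=4)
```

Each crop is smaller than the original, so in practice you would probably resize the results back to the input size your CNN expects before training.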