Keras Style Transfer remove zero-center by mean pixel


I am working on image style transfer with Keras, but I'm stuck on the "remove zero-center by mean pixel" step.

from __future__ import print_function
from keras.preprocessing.image import load_img, img_to_array
from scipy.misc import imsave
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse

from keras.applications import vgg19
from keras import backend as K

base_image_path = "images/input.jpg"
style_reference_image_path = "images/style.jpg"
result_prefix = "output"
iterations = 10

# Weights
content_weight = 0.025
style_weight = 1.0
# total variation weight
total_variation_weight = 1.0

# Output size: fix the height at 400 rows and scale the width to keep the aspect ratio
width, height = load_img(base_image_path).size
img_nrows = 400
img_ncols = int(width * img_nrows / height)

# Load an image and convert it into the tensor format VGG19 expects
def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg19.preprocess_input(img)
    return img

# Convert a preprocessed tensor back into a displayable image
def deprocess_image(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((3, img_nrows, img_ncols))
        x = x.transpose((1, 2, 0))
    else:
        x = x.reshape((img_nrows, img_ncols, 3))
    # (Remove zero-center by mean pixel)
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR'->'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x

I searched Google for the final part, (Remove zero-center by mean pixel), but could not find a similar approach. I could not reproduce the values 103.939, 116.779 and 123.68 from the mean pixel values of my image.

And why is the image in "BGR"? Isn't it supposed to be in "RGB" order to begin with?

There is 1 answer

chantya127 (best answer)

1. Docstring of the preprocess_input function used by the VGG19 model:

def preprocess_input(x, data_format=None, mode='caffe', **kwargs):
    """Preprocesses a tensor or Numpy array encoding a batch of images.

    # Arguments
        x: Input Numpy or symbolic tensor, 3D or 4D.
            The preprocessed data is written over the input data
            if the data types are compatible. To avoid this
            behaviour, `numpy.copy(x)` can be used.
        data_format: Data format of the image tensor/array.
        mode: One of "caffe", "tf" or "torch".
            - caffe: will convert the images from RGB to BGR,
                then will zero-center each color channel with
                respect to the ImageNet dataset,
                without scaling.
            - tf: will scale pixels between -1 and 1,
                sample-wise.
            - torch: will scale pixels between 0 and 1 and then
                will normalize each channel with respect to the
                ImageNet dataset.

    # Returns
        Preprocessed tensor or Numpy array.
    """

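2. In short, in the default 'caffe' mode the images are converted from RGB to BGR, and then each color channel is zero-centered with respect to the ImageNet dataset, without scaling. The mean values subtracted from the channels are [103.939, 116.779, 123.68], in B, G, R order. They are means over the ImageNet dataset, not over your own image, which is why you cannot reproduce them from your input.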

3. In the deprocess_image() function, the same mean values ([103.939, 116.779, 123.68]) are added back to the respective channels, and the image is then converted back to its original channel order, from 'BGR' to 'RGB'.
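
To make points 2 and 3 concrete, here is a minimal NumPy-only sketch of the round trip. The names caffe_preprocess and caffe_deprocess are hypothetical stand-ins for illustration, not the Keras functions themselves, and the rounding step is my addition so that float error does not truncate a pixel value (the code in the question casts without rounding).

```python
import numpy as np

# ImageNet per-channel means in B, G, R order, as used in the question's code.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_preprocess(rgb):
    """Hypothetical stand-in for 'caffe' mode: RGB -> BGR, then subtract the means."""
    bgr = rgb[..., ::-1].astype(np.float32)  # reverse the channel axis (makes a copy)
    return bgr - IMAGENET_MEAN_BGR


def caffe_deprocess(x):
    """Inverse step: add the means back, BGR -> RGB, then clip to the 8-bit range."""
    bgr = x + IMAGENET_MEAN_BGR
    rgb = bgr[..., ::-1]
    # Rounding is an addition in this sketch; it guards against float error.
    return np.clip(np.rint(rgb), 0, 255).astype("uint8")


# A small random RGB "image" (made-up data) to check the round trip.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
restored = caffe_deprocess(caffe_preprocess(image))
print(np.array_equal(image, restored))
```

If the two functions really are inverses, the restored array should match the original pixels, which is all that the "remove zero-center by mean pixel" lines in deprocess_image() are doing.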

Note: The dataset mean is computed per colour channel, over the pixels of all the images in the dataset. Greyscale images have just one mean value, while colour datasets such as ImageNet have three, one per channel (e.g. R, G and B).

Usually the mean is calculated on the training set, and the same mean is used to normalize both the training and test images.
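
As a rough illustration of that last point, the sketch below computes a per-channel mean on a training set and reuses it for the test set. The arrays train_images and test_images are random, made-up stand-ins for real data, assumed to have shape (N, height, width, 3).

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up stand-ins for real data: batches of RGB images, shape (N, H, W, 3).
train_images = rng.integers(0, 256, size=(8, 4, 4, 3)).astype(np.float64)
test_images = rng.integers(0, 256, size=(2, 4, 4, 3)).astype(np.float64)

# One mean per colour channel, averaged over every pixel of every training image.
channel_mean = train_images.mean(axis=(0, 1, 2))  # shape (3,)

# The same training-set mean is subtracted from both splits.
train_centered = train_images - channel_mean
test_centered = test_images - channel_mean

print(channel_mean.shape)
```

After this step the training set has zero mean in each channel, while the test set is only approximately centered, because it was shifted by the training mean rather than its own.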