Reconstruct input image from feature maps for Neural Style transfer


I am working on Neural Style Transfer using the VGG19 model. I am following the paper A Neural Algorithm of Artistic Style (https://arxiv.org/pdf/1508.06576.pdf) and trying to reconstruct images from the feature maps at each convolution layer, as shown in the figure below.

[Figure from the paper: content and style reconstructions from the feature maps at increasing network depths]

I have extracted the feature maps for the convolution layers as shown in the code below.

    from tensorflow.keras.applications import VGG19
    from tensorflow.keras.models import Model

    vgg = VGG19(include_top = True, weights = "imagenet")
    model_layer_names = ["block1_conv1", "block1_conv2", "block2_conv1", "block2_conv2",
                         "block3_conv1", "block3_conv2", "block3_conv3", "block4_conv1",
                         "block4_conv2", "block4_conv3", "block5_conv1", "block5_conv2",
                         "block5_conv3"]
    layer_outputs = [vgg.get_layer(layer).output for layer in model_layer_names]
    viz_model = Model(inputs = vgg.input, outputs = layer_outputs)
    feature_map_preds = viz_model.predict(style_img)
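I plot a few channels of one layer roughly like this (the random array here is just a stand-in for one entry of `feature_map_preds`, and the grid size is arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

# Random stand-in for one entry of feature_map_preds, shape (1, H, W, C);
# block1_conv1 on a 224x224 input would give (1, 224, 224, 64).
fmap = np.random.rand(1, 224, 224, 64)

# Show the first 8 channels as grayscale images.
fig, axes = plt.subplots(2, 4, figsize=(12, 6))
for ch, ax in enumerate(axes.flat):
    ax.imshow(fmap[0, :, :, ch], cmap="gray")
    ax.set_title(f"channel {ch}")
    ax.axis("off")
plt.tight_layout()
plt.show()
```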

I am able to plot the feature maps as images. But I want to reconstruct the input image (like the content and style representation images above) from the feature maps, and I cannot figure out how to map the (64, 128, 256, 512)-channel activations back to a 3-channel image. Can someone please help me out with this?

Really appreciate any comments and help.

1 answer

Answered by Dan Jackson:

I've had the same problem! I eventually worked it out: set the content weight to zero and the style weight to some large positive number (I used 1000, but see what works for you). Then initialise the input with a white-noise array scaled to the range 0 to 255:

input_img = np.random.rand(1, 256, 256, 3) * 255  # uniform noise in [0, 255]

Then, depending on what level of style you want to reconstruct, set the weights for the five style layers (block1_conv1 through block5_conv1 in the paper). To reconstruct from conv1 only (the first style reconstruction), use:

style_layer_weights = [1, 0, 0, 0, 0]

or to reconstruct from conv1 and conv2 (second style reconstruction), use:

style_layer_weights = [0.5, 0.5, 0, 0, 0]

etc

Then accumulate the weighted style loss over the layers:

style_score = 0
for style_feature, target_feature, weight in zip(style_features, target_features, style_layer_weights):
    style_score += style_loss(style_feature, target_feature) * weight
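(`style_loss` itself isn't shown above; in Gatys et al. it is the normalised squared distance between the Gram matrices of the two feature maps. A NumPy sketch, with names of my own choosing:)

```python
import numpy as np

def gram_matrix(features):
    # features: (H, W, C) feature map from one conv layer.
    h, w, c = features.shape
    flat = features.reshape(h * w, c)  # one row per spatial position
    return flat.T @ flat               # (C, C) channel correlations

def style_loss(style_feature, target_feature):
    # Squared Frobenius distance between Gram matrices, normalised by
    # 4 * C^2 * (H*W)^2 as in Gatys et al.
    h, w, c = style_feature.shape
    diff = gram_matrix(style_feature) - gram_matrix(target_feature)
    return np.sum(diff ** 2) / (4.0 * c ** 2 * (h * w) ** 2)
```

Identical feature maps give zero loss, and each Gram matrix is a symmetric (C, C) array of channel correlations.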

Then run gradient descent and it should converge to the style reconstruction.
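To convince yourself that plain gradient descent on the Gram loss converges, here's a toy end-to-end run. It treats the raw pixels as the "feature map" of a single layer (a stand-in for the VGG activations, so no network is needed) and uses the analytic gradient of the unnormalised Gram loss; the whole setup is illustrative, not from the paper:

```python
import numpy as np

def gram(flat):
    # flat: (positions, channels)
    return flat.T @ flat

def gram_loss_and_grad(flat, target_gram):
    # Unnormalised loss ||G - A||_F^2 and its gradient 4 * F @ (G - A).
    diff = gram(flat) - target_gram
    return np.sum(diff ** 2), 4.0 * flat @ diff

rng = np.random.default_rng(0)
target = rng.random((64, 3))     # "style" image, flattened to (H*W, C)
target_gram = gram(target)

x = rng.random((64, 3))          # white-noise initialisation
lr = 1e-3
losses = []
for _ in range(500):
    loss, grad = gram_loss_and_grad(x, target_gram)
    losses.append(loss)
    x -= lr * grad
```

The loss drops steadily toward zero; with real VGG features you would let an optimiser like Adam or L-BFGS compute the gradients instead of writing them by hand.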