How to make VGGNet19 model for FCN: ValueError: Inputs have incompatible shapes. Received shapes (14, 14, 512) and (224, 224, 512)

I want to implement the architecture of the FCN-VGG19 adapted from Long et al. (2015), which learns to combine high-level information with fine, low-level information using skips from the third and fourth pooling layers. Hidden layers are equipped with rectified linear units (ReLUs), and the number of channels in the convolutional layers increases with the depth of the network. During training the input image has a fixed size of 224 × 224 pixels, while the receptive fields of all filters are 3 × 3 pixels throughout the whole network. This configuration allows the FCN to learn approximately 140 million parameters. Prediction is performed using upsampling layers with four channels, one for each class [ncl] in the reference data. The upsampling layers are fused with 1 × 1 convolutions of the third and fourth pooling layers with the same channel dimension [x, y, ncl]. The final upsampling layer predicts fine details using the fused information from the last convolutional layer and the third and fourth pooling layers, upsampled at stride 8. My code is here:

import tensorflow as tf
from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import Conv2D, Conv2DTranspose, Add

# Load the VGG19 model pretrained on ImageNet
base_model = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Get the third and fourth pooling layers
pool3 = base_model.get_layer('block3_pool').output  # (28, 28, 256)
pool4 = base_model.get_layer('block4_pool').output  # (14, 14, 512)

# Create the fully convolutional network (FCN) by adding convolutional layers and transpose convolutional layers
x = base_model.output  # block5_pool: (7, 7, 512)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)

# Upsample the last convolutional layer: (7, 7, 512) -> (14, 14, 512)
x = Conv2DTranspose(512, kernel_size=(3, 3), strides=(2, 2), padding='same')(x)

# Upsample the third pooling layer and fuse it with a 1x1 convolution
pool3_conv = Conv2D(512, (1, 1), activation='relu', padding='same')(pool3)
pool3_upsampled = Conv2DTranspose(512, kernel_size=(3, 3), strides=(8, 8), padding='same')(pool3_conv)  # (28, 28, 512) -> (224, 224, 512)

x = Add()([x, pool3_upsampled])  # <- fails: x is (14, 14, 512), pool3_upsampled is (224, 224, 512)

# Upsample the fourth pooling layer and fuse it with a 1x1 convolution
pool4_conv = Conv2D(512, (1, 1), activation='relu', padding='same')(pool4)
pool4_upsampled = Conv2DTranspose(512, kernel_size=(3, 3), strides=(8, 8), padding='same')(pool4_conv)  # (14, 14, 512) -> (112, 112, 512)

x = Add()([x, pool4_upsampled])  # would also mismatch for the same reason

# Final upsampling layer (note: 3 output channels here, although the text above says ncl = 4 classes)
x = Conv2DTranspose(3, kernel_size=(3, 3), strides=(8, 8), padding='same')(x)

# Create the model
model = tf.keras.Model(inputs=base_model.input, outputs=x)

# Make sure the base model layers are not trainable
for layer in base_model.layers:
    layer.trainable = False

# Compile the model (use appropriate loss function and metrics for your task)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Print model summary
model.summary()

but when I try that, I always get this error: ValueError: Inputs have incompatible shapes. Received shapes (14, 14, 512) and (224, 224, 512)

What should I do to fix this problem? I want to use this model for satellite image semantic segmentation.
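
The error occurs at the first Add(): after the 2× transpose convolution, x has shape (14, 14, 512), while pool3_upsampled has already been upsampled 8× from (28, 28, 512) to (224, 224, 512), and Add() requires identical input shapes. In the FCN-8s scheme of Long et al. (2015), each branch is first reduced to ncl channels with a 1 × 1 convolution, the skips are fused at their native resolutions (upsampling only 2× per step), and the single 8× upsampling comes last. Below is a minimal sketch along those lines, assuming ncl = 4 classes as described above; the name num_classes and the 4 × 4 / 16 × 16 kernel sizes are my own choices, not from the original code:

import tensorflow as tf
from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import Conv2D, Conv2DTranspose, Add

num_classes = 4  # [ncl]; set to the number of classes in your reference data

base_model = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base_model.layers:
    layer.trainable = False

pool3 = base_model.get_layer('block3_pool').output  # (28, 28, 256)
pool4 = base_model.get_layer('block4_pool').output  # (14, 14, 512)
pool5 = base_model.output                           # (7, 7, 512)

# Score each branch with a 1x1 convolution so all have num_classes channels
score5 = Conv2D(num_classes, (1, 1), padding='same')(pool5)
score4 = Conv2D(num_classes, (1, 1), padding='same')(pool4)
score3 = Conv2D(num_classes, (1, 1), padding='same')(pool3)

# Upsample 2x and fuse with the pool4 skip: (7, 7) -> (14, 14)
x = Conv2DTranspose(num_classes, (4, 4), strides=(2, 2), padding='same')(score5)
x = Add()([x, score4])

# Upsample 2x and fuse with the pool3 skip: (14, 14) -> (28, 28)
x = Conv2DTranspose(num_classes, (4, 4), strides=(2, 2), padding='same')(x)
x = Add()([x, score3])

# Single 8x upsampling at the end: (28, 28) -> (224, 224)
outputs = Conv2DTranspose(num_classes, (16, 16), strides=(8, 8),
                          padding='same', activation='softmax')(x)

model = tf.keras.Model(inputs=base_model.input, outputs=outputs)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # expects integer masks of shape (224, 224)
              metrics=['accuracy'])
model.summary()

With padding='same', a transpose convolution with stride s multiplies each spatial dimension by s, so the branches now meet at (14, 14) and (28, 28) before the final 8× step. Kernels of twice the stride are a common choice for smoother upsampling, but the 3 × 3 kernels from the original code would yield the same shapes.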
