How to remove first N layers from a Keras Model?


I would like to remove the first N layers from the pretrained Keras model. For example, an EfficientNetB0, whose first 3 layers are responsible only for preprocessing:

import tensorflow as tf

efinet = tf.keras.applications.EfficientNetB0(weights=None, include_top=True)

print(efinet.layers[:3])
# [<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7fa9a870e4d0>,
# <tensorflow.python.keras.layers.preprocessing.image_preprocessing.Rescaling at 0x7fa9a61343d0>,
# <tensorflow.python.keras.layers.preprocessing.normalization.Normalization at 0x7fa9a60d21d0>]

As M.Innat mentioned, the first layer is an Input layer, which should be either spared or re-attached. I would like to remove those layers, but a simple approach like this throws an error:

cut_input_model = tf.keras.Model(
    inputs=[efinet.layers[3].input], 
    outputs=efinet.outputs
)

This will result in:

ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(...)

What would be the recommended way to do this?


There are 3 answers

Accepted answer by M.Innat

You get the Graph disconnected error because you don't specify the Input layer. But that's not the main issue here. Removing intermediate layers from a Keras model is not always straightforward, whether you use the Sequential or the Functional API.

For a sequential model it is comparatively easy, whereas in a functional model you need to take care of multi-input blocks (e.g. multiply, add, etc.). For example, if you want to remove some intermediate layer in a sequential model, you can easily adapt this solution (see the sketch below). But for a functional model such as EfficientNet you can't, because of the multi-input internal blocks, and you will encounter this error: ValueError: A merged layer should be called on a list of inputs. So that needs a bit more work AFAIK; here is a possible approach to overcome it.
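For illustration (my own minimal sketch, not part of the original answer), this is what the "easy" sequential case looks like: re-stack the remaining layers on a fresh Input. It only works because every layer has a single input, which is exactly what the merge blocks inside EfficientNet break. The toy layer names and shapes here are arbitrary.

import tensorflow as tf

# A toy sequential model whose first layer plays the role of "preprocessing".
seq = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1. / 255, input_shape=(224, 224, 3)),  # in older TF2 this lives under tf.keras.layers.experimental.preprocessing
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Drop the first layer and rebuild the chain on a new Input.
inputs = tf.keras.Input(shape=(224, 224, 3))
x = inputs
for layer in seq.layers[1:]:
    x = layer(x)
cut_model = tf.keras.Model(inputs, x)
cut_model.summary()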


Here I will show a simple workaround for your case, but it's probably not general and can be unsafe in some cases. It is based on this approach, which uses the pop method (see why it can be unsafe to use). Okay, let's first load the model.

func_model = tf.keras.applications.EfficientNetB0()

for i, l in enumerate(func_model.layers):
    print(l.name, l.output_shape)
    if i == 8: break

input_19 [(None, 224, 224, 3)]
rescaling_13 (None, 224, 224, 3)
normalization_13 (None, 224, 224, 3)
stem_conv_pad (None, 225, 225, 3)
stem_conv (None, 112, 112, 32)
stem_bn (None, 112, 112, 32)
stem_activation (None, 112, 112, 32)
block1a_dwconv (None, 112, 112, 32)
block1a_bn (None, 112, 112, 32)

Next, remove the preprocessing layers with the .pop method:

func_model._layers.pop(1) # remove rescaling
func_model._layers.pop(1) # remove normalization

for i, l in enumerate(func_model.layers):
    print(l.name, l.output_shape)
    if i == 8: break

input_22 [(None, 224, 224, 3)]
stem_conv_pad (None, 225, 225, 3)
stem_conv (None, 112, 112, 32)
stem_bn (None, 112, 112, 32)
stem_activation (None, 112, 112, 32)
block1a_dwconv (None, 112, 112, 32)
block1a_bn (None, 112, 112, 32)
block1a_activation (None, 112, 112, 32)
block1a_se_squeeze (None, 32)
Answer by Joe Mattioni

I've been trying to do the same thing with the Keras/TensorFlow VGGFace model. After a lot of experimenting I found that this approach works. In this case all of the model is used except for the final classification layers, which are replaced with a custom embeddings layer:

# Imports assumed for this snippet (the VGGFace class typically comes from the keras_vggface package):
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from keras_vggface.vggface import VGGFace

vgg_model = VGGFace(include_top=True, input_shape=(224, 224, 3)) # full VGG16 model
inputs = Input(shape=(224, 224, 3))
x = inputs
# Assemble all layers except for the last layer
for layer in vgg_model.layers[1:-2]:
  x = vgg_model.get_layer(layer.name)(x)
    
# Now add a new last layer that provides the 128 embeddings output
x = Dense(128, activation='softmax', use_bias=False, name='fc8x')(x)
# Create the custom model
custom_vgg_model = Model(inputs, x, name='custom_vggface')

Unlike layers[x] or pop(), get_layer returns the actual layer object, which lets the layers be chained together into a new output graph; you can then create a new model from it. The for loop starts at 1 rather than 0 because the input layer is already provided by 'inputs'.

This method works for sequential (linear) models; it is not clear whether it would work for more complex models.
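As a quick sanity check (my own addition, assuming the model above built successfully), the new head should map a single 224x224 RGB image to a 128-dimensional embedding:

import numpy as np

dummy = np.random.rand(1, 224, 224, 3).astype('float32')
emb = custom_vgg_model.predict(dummy)
print(emb.shape)  # expected: (1, 128)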

Answer by Youcef4k

For me, @M.Innat's solution resulted in a disconnected graph, because just popping layers is not enough: a connection needs to be made between the input layer and the first convolution layer (you can check the problem with Netron).

The only proper solution that worked for me was to manually edit the model's config.

Here is a full script that removes the pre-processing part of EfficientNet-B1. Tested with TF2.

import tensorflow as tf

def split(model, start, end):
    confs = model.get_config()
    kept_layers = set()
    for i, l in enumerate(confs['layers']):
        if i == 0:
            confs['layers'][0]['config']['batch_input_shape'] = model.layers[start].input_shape
            if i != start:
                #confs['layers'][0]['name'] += str(random.randint(0, 100000000)) # rename the input layer to avoid conflicts on merge
                confs['layers'][0]['config']['name'] = confs['layers'][0]['name']
        elif i < start or i > end:
            continue
        kept_layers.add(l['name'])
    # filter layers
    layers = [l for l in confs['layers'] if l['name'] in kept_layers]
    layers[1]['inbound_nodes'][0][0][0] = layers[0]['name']
    # set conf
    confs['layers'] = layers
    confs['input_layers'][0][0] = layers[0]['name']
    confs['output_layers'][0][0] = layers[-1]['name']
    # create new model
    submodel = tf.keras.Model.from_config(confs)
    for l in submodel.layers:
        orig_l = model.get_layer(l.name)
        if orig_l is not None:
            l.set_weights(orig_l.get_weights())
    return submodel


model = tf.keras.applications.efficientnet.EfficientNetB1()

# first layer = 3, last layer = 341
new_model = split(model, 3, 341)
new_model.summary()
new_model.save("efficientnet_b1.h5")

The script is based on this great answer.
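To check that the split worked (my own sketch, not part of the original script), you can compare the original model against the new one fed with manually pre-processed inputs. EfficientNetB1 expects 240x240 images, and layers 1-2 of the original model are exactly the Rescaling and Normalization steps that were cut off:

import numpy as np

img = np.random.randint(0, 255, size=(1, 240, 240, 3)).astype('float32')

# Re-use the original model's first layers as a standalone preprocessing model.
preproc = tf.keras.Model(model.input, model.layers[2].output)
pre = preproc.predict(img)

# The predictions should match up to numerical noise.
print(np.allclose(model.predict(img), new_model.predict(pre), atol=1e-5))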