VGG16 and VGG19 not doing any learning during training, although AlexNet performs well?

54 views Asked by At

I'm replicating the results of a research paper for a ML Project. The paper is about Palm Vein recognition using CNNs. It trains 3 CNNs on different palm vein datasets, one of which is the FYODB Dataset.

I'm using Keras to train my models from scratch, and although AlexNet performs well with test accuracies over 95%, for some reason VGG16 and VGG19 both are unable to do ANY learning during training. Their accuracy in each epoch fails to reach even 0.1.

I'll share the code I'm using to build and train the model - Note that the paper I'm replicating intentionally reduced the number of filters per CONV2D layer to reduce training times (I've tried with the original architecture too though, same results).

Some key constants:

  • Number of classes: 160
  • Number of samples in dataset: 6400
  • Train-Val Split: 80-20
  • Train-Test Split: 80-20
  • Total training, testing, validation images: 4096, 1024, 1280
  • Image Shape: (224, 224, 3)

Here's the code I have for building and training. I also did try transfer learning with pre-trained VGG16 weights and that actually worked just fine, with over 95% test accuracy.

data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
        layers.RandomContrast(0.1),
        layers.RandomTranslation(0.1, 0.1),
        layers.RandomHeight(0.1),
        layers.RandomWidth(0.1),
    ]
)

def make_vgg16_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)

    # Block 1
    x = data_augmentation(inputs)
    x = layers.Rescaling(1.0 / 255)(inputs)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 2
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 3
    x = layers.Conv2D(96, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(96, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(96, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 4
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 5
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Flatten and Fully Connected Layers
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(4096, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)

    return keras.Model(inputs, outputs)
from tqdm import tqdm
num_epochs = 30

models = {
    "AlexNet": make_alexnet_model(input_shape=image_size, num_classes=num_classes),
    "VGG16": make_vgg16_model(input_shape=image_size, num_classes=num_classes),
    "VGG19": make_vgg19_model(input_shape=image_size, num_classes=num_classes),
}

model_histories = {}

for name, model in models.items():
    print(f'\x1b[34mTraining {name} Model...\x1b[0m')
    model.compile(
        optimizer=keras.optimizers.Adam(1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    start = time.time()
        
    # Wrap model.fit with tqdm for a progress bar
    progress_bar = tqdm(total=num_epochs, position=0, leave=True)
    history = model.fit(
        train_dataset,
        epochs=num_epochs,
        validation_data=val_dataset,
        verbose=1,
        callbacks=[
            tf.keras.callbacks.LambdaCallback(on_epoch_end=lambda epoch, logs: progress_bar.update(1)),
        ]
    )
    progress_bar.close()
    
    model_histories[name] = history
    
    end = time.time()
    print(f'Finished training {name} in {end-start:.2f}s\n')

Output Sample:

Epoch 14/30
128/128 [==============================] - ETA: 0s - loss: 5.0713 - accuracy: 0.0054
 47%|████▋     | 14/30 [05:35<06:16, 23.50s/it]
128/128 [==============================] - 23s 182ms/step - loss: 5.0713 - accuracy: 0.0054 - val_loss: 5.1037 - val_accuracy: 0.0023
Epoch 15/30
128/128 [==============================] - ETA: 0s - loss: 5.0709 - accuracy: 0.0081
 50%|█████     | 15/30 [05:58<05:53, 23.55s/it]
0

There are 0 answers