QAT using TF2 and MobileNetV3-Large for full integer quantization


I have a Google Colab notebook in which I am attempting to transfer-learn and fine-tune a MobileNetV3-Large model for binary classification, followed by full integer quantization. My goal is to run the model on the Coral TPU USB accelerator that I have attached to a Raspberry Pi.

Initially I tried post-training quantization, which I was able to get running, but because of the model's hard-swish activations, accuracy drops heavily after this process. Right now I'm trying to use quantization-aware training (QAT), but am having a lot of problems with tfmot. In a Sequential model, after the base model (MobileNetV3), I add GlobalAveragePooling2D, Dropout, and Dense layers. However, this results in the error "Quantizing a tf.keras Model inside another tf.keras Model is not supported" when attempting to run quantize_model from tfmot.

The code I’m using is as follows. Thanks for any help.


import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers, callbacks
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense
from tensorflow.keras.applications import MobileNetV3Large
import tensorflow_model_optimization as tfmot

print("#### Import GDrive ####")
from google.colab import drive
drive.mount('/content/drive')

# Define model name
model_name = ""

# Declare the training, validation, and testing directories
train_dir = r""
val_dir = r""
test_dir = r""

# Load the training, validation, and testing datasets
print("#### Dataset Information ####\nTraining Dataset:")
train_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    label_mode='binary',
    image_size=(224, 224),
    batch_size=32)

print("Validation Dataset:")
val_ds = tf.keras.utils.image_dataset_from_directory(
    val_dir,
    label_mode='binary',
    image_size=(224, 224),
    batch_size=32)

print("Testing Dataset:")
test_ds = tf.keras.utils.image_dataset_from_directory(
    test_dir,
    label_mode='binary',
    image_size=(224, 224),
    batch_size=32)

# Instantiate the base model
print("#### Download Model ####")
base_model = MobileNetV3Large(input_shape=(224, 224, 3),
                              alpha=1.0,
                              minimalistic=False,
                              include_top=False,
                              weights='imagenet',
                              include_preprocessing=True)
base_model.trainable = False

# Define the model
model = tf.keras.Sequential([
    base_model,
    GlobalAveragePooling2D(),
    Dropout(0.5),
    Dense(1, activation='sigmoid', kernel_regularizer=regularizers.l2(0.01))
])

# Compile the model
model.compile(optimizer=keras.optimizers.Adam(),
              loss=keras.losses.BinaryCrossentropy(),
              metrics=[keras.metrics.BinaryAccuracy()])

# Add early stopping
early_stopping = callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Train the model
print("\n#### Transfer Learning ####")
model.fit(train_ds,
          epochs=25,
          validation_data=val_ds,
          callbacks=[early_stopping])

# Save the model
model.save(f"{model_name}_initial_raw")
print("\nInitial raw model saved.")

# Unfreezing the top layers of the base model
base_model.trainable = True
for layer in base_model.layers[:50]:
    layer.trainable = False

# Implement QAT for fine-tuning
with tfmot.quantization.keras.quantize_scope():
  q_aware_model = tfmot.quantization.keras.quantize_model(model)

  # Re-compile the QAT model
  q_aware_model.compile(optimizer=keras.optimizers.Adam(1e-5),
                        loss=keras.losses.BinaryCrossentropy(),
                        metrics=[keras.metrics.BinaryAccuracy()])

  # Train the fine-tuned model
  print("#### Fine Tuning ####")
  q_aware_model.fit(train_ds,
                    epochs=25,
                    validation_data=val_ds,
                    callbacks=[early_stopping],
                    class_weight=class_weight)

  # Evaluate the fine-tuned model on the test dataset
  print("\n#### QAT Model Evaluation ####")
  q_aware_model.evaluate(test_ds)

  # Save raw model
  q_aware_model.save(f"{model_name}_fine_raw")
  print("\nFine-tuned raw model saved.")

# Convert the QAT fine-tuned model into a full int quantised TFLite model
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

quantized_tflite_model = converter.convert()
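
For reference, this is roughly what I believe the conversion step will need once QAT works, so that the Edge TPU compiler accepts the model (int8-only ops and integer input/output). The exact settings may depend on the TF version, and the output file name is just illustrative:

# Sketch only: additional converter settings that full integer conversion
# for the Edge TPU typically needs; the quantization ranges come from the
# QAT model itself
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quantized_tflite_model = converter.convert()

# Write the quantized model to disk (illustrative file name)
with open(f"{model_name}_fine_quant.tflite", "wb") as f:
    f.write(quantized_tflite_model)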

1 Answer

Answered by CrazyT

Had the same problem as you.

The comment here helped me with this:

https://github.com/tensorflow/tensorflow/issues/57034#issuecomment-1207889502

Basically, the problem is caused by the Sequential model containing a submodel.

So you need to chain the model together differently, with the help of the functional API, as shown in the sketch below.
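
Roughly, I mean something like this (a sketch reusing the names from your code): build the classification head directly on base_model.output, so everything ends up in one flat functional model instead of a Sequential model that wraps another tf.keras Model.

import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow.keras import layers, regularizers
from tensorflow.keras.applications import MobileNetV3Large

# Same base model as in your code
base_model = MobileNetV3Large(input_shape=(224, 224, 3),
                              include_top=False,
                              weights='imagenet',
                              include_preprocessing=True)
base_model.trainable = False

# Build the head on the base model's output tensor so the result is one
# flat functional model with no nested tf.keras Model inside it
x = layers.GlobalAveragePooling2D()(base_model.output)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation='sigmoid',
                       kernel_regularizer=regularizers.l2(0.01))(x)
flat_model = tf.keras.Model(inputs=base_model.input, outputs=outputs)

# quantize_model now sees a single model rather than a model-in-a-model
q_aware_model = tfmot.quantization.keras.quantize_model(flat_model)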

Sadly, you will probably run into another problem with the MobileNetV3Large model.

For example, you might get the following error:

 Exception encountered when calling layer "tf.__operators__.add_364" (type TFOpLambda).

'list' object has no attribute 'dtype'

I'm still figuring out what is going wrong, but my current guess is that it is a bug inside the quantize_model method.