Quantized TFLite model gives better accuracy than TF model


I am developing an end-to-end training and quantization-aware training example. Using the CIFAR-10 dataset, I load a pretrained MobileNetV2 model and then use the code from the TensorFlow guide to quantize it. After the whole process finishes properly, I get the following results:
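For context, the quantization step conceptually maps float weights onto an int8 grid and back. Below is a minimal pure-Python sketch of symmetric int8 weight quantization (illustrative only; the actual TFLite kernels and the guide's `tfmot` APIs do considerably more, and the weight values here are made up):

```python
# Sketch of symmetric int8 weight quantization (not the real TFLite code).

def quantize_int8(weights):
    """Map float weights to int8 with a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to floats; only rounding error remains."""
    return [v * scale for v in q]

weights = [0.91, -0.42, 0.07, -1.30, 0.55]   # hypothetical weights
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
errors = [abs(w - r) for w, r in zip(weights, recovered)]
print(max(errors))  # worst-case rounding error, bounded by scale / 2
```

The point is that quantization perturbs every weight by at most half a quantization step, which usually costs a little accuracy rather than a 20-point gain.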

Quant TFLite test_accuracy: 0.94462
Quant TF test accuracy:     0.744700014591217
TF test accuracy:           0.737500011920929

I wonder how this is possible. Quantization is supposed to reduce accuracy slightly, not improve it.

I have noticed that in the TensorFlow guide's example, accuracy is also improved slightly, but far less than in my case. To be more specific, when running that code, which uses the MNIST dataset, I get the results below, which the TensorFlow developers consider acceptable, as they state there is no change in accuracy.

Quant TFLite test_accuracy: 0.9817
Quant TF test accuracy:     0.9815
TF test accuracy:           0.9811

Note that I haven't changed the code I attached from the TensorFlow guide; I just use a different dataset and model.

1 Answer

Answer by dtlam26:

This can happen when your model is not fully converged and when your test set is not large enough to distinguish the two models. In addition, even if your model has converged, reducing the inference bit width limits the range of values at each node, which in some cases happens to act like regularization and helps the optimization find a better point. However, I still encourage you to expand your test set and check the model's convergence, because the gap here is far too large.
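One way to quantify "too large" is a binomial standard-error check on the reported numbers (this is my own back-of-the-envelope addition, assuming the standard 10,000-image CIFAR-10 test split):

```python
import math

# Sampling noise expected on a 10,000-image test set at the reported accuracy.
n = 10_000            # assumed CIFAR-10 test set size
p = 0.7447            # reported TF test accuracy
se = math.sqrt(p * (1 - p) / n)   # binomial standard error, ~0.0044
gap = 0.94462 - 0.7447            # reported quantized-vs-float gap, ~0.20

print(f"standard error ~ {se:.4f}")
print(f"observed gap    {gap:.4f}")
```

The 0.20 gap is dozens of standard errors wide, so sampling noise alone cannot explain it; something else (convergence, an evaluation bug, or mismatched preprocessing) is worth checking.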

A concrete example is Amazon's classification benchmark, where reducing float32 to float16 slightly increased accuracy. [The original answer included a screenshot of those results here.]
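To see why a precision reduction can nudge results either way, here is a small illustration of float32-to-float16 rounding using Python's standard-library half-precision `struct` format (numpy's `float16` behaves the same):

```python
import struct

def to_float16(x):
    """Round-trip a float through IEEE half precision ('e' format)."""
    return struct.unpack('e', struct.pack('e', x))[0]

x = 0.1
print(to_float16(x))  # 0.0999755859375 -- a small, deterministic shift
```

Each activation and weight is perturbed by a tiny amount, which can flip borderline predictions in either direction; on a finite test set that occasionally shows up as a small accuracy increase.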

Furthermore, for MNIST the data is simple, and accuracies clustered around a mean of about 0.9815 with such a small variance are not really different from one another. To my understanding, this is reasonable.
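The same standard-error check makes this concrete for the MNIST numbers in the question (assuming the standard 10,000-image MNIST test set):

```python
import math

n = 10_000            # assumed MNIST test set size
p = 0.9815            # reported mean accuracy
se = math.sqrt(p * (1 - p) / n)   # ~0.00135
diff = 0.9817 - 0.9811            # 0.0006, well within one standard error

print(se, diff)
```

Since the 0.0006 spread between the quantized and float models is smaller than one standard error, it is consistent with "no change in accuracy", unlike the 0.20 gap on CIFAR-10.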