OpenCV DNN inference with "training=True" for using sample mean and variance (Pix2Pix)


I have trained a Pix2Pix network using Keras/TensorFlow, following this tutorial. Pix2Pix uses Instance Normalization (batch normalization with a batch size of 1), so at inference time the normalization layers must compute the mean and variance from the input sample instead of using stored statistics. In TensorFlow, I run the forward pass as pred = model(x, training=True). The model is the generator part of Pix2Pix, which is a U-Net with Instance Normalization.

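import tensorflow as tf

# Load the saved generator and run it with sample statistics (img is prepared earlier)
model = tf.keras.models.load_model("pix2pix")
pred = model(img, training=True)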

https://www.tensorflow.org/tutorials/generative/pix2pix
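To make the "batch norm with a batch size of 1" point concrete, here is a minimal pure-Python sketch for a single channel. The function names are hypothetical and the learnable scale and offset are left out; it only illustrates that, with one sample in the batch, training-mode batch statistics coincide with per-sample (instance) statistics.

```python
from statistics import fmean, pvariance


def normalize(values, mean, var, eps=1e-5):
    """Standardize values with the given mean and variance."""
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def batch_norm_training(batch, eps=1e-5):
    """Training-mode batch norm for one channel (hypothetical helper).

    Statistics are pooled over every sample in the batch and every
    spatial position of that channel.
    """
    pooled = [v for sample in batch for v in sample]
    mean, var = fmean(pooled), pvariance(pooled)
    return [normalize(sample, mean, var, eps) for sample in batch]


def instance_norm(sample, eps=1e-5):
    """Instance norm for one channel: statistics come from this sample only."""
    mean, var = fmean(sample), pvariance(sample)
    return normalize(sample, mean, var, eps)


# Flattened spatial values of one channel of one image (made-up numbers).
channel = [0.0, 2.0, 4.0, 6.0]

bn_out = batch_norm_training([channel])[0]  # batch of size 1
in_out = instance_norm(channel)
```

With a batch of size 1 the pooled values are exactly the values of the single sample, so both functions compute the same mean and variance and return the same output.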

We are using this model in C++ with the OpenCV DNN module to perform inference. However, the "forward" call of OpenCV DNN behaves as if training=False, i.e. it uses the mean and variance stored in the model during training instead of computing the sample mean and variance. Additionally, the model has been converted with OpenVINO into its intermediate representation (IR).

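import cv2

print('OpenCV DNN Inference...')
print('OCV Version is', cv2.__version__)
# Load the OpenVINO IR model (weights in .bin, topology in .xml)
net = cv2.dnn.readNet("model.bin", "model.xml")
# Convert the input images (img, prepared earlier) into a 4D NCHW blob
blob = cv2.dnn.blobFromImages(img)
net.setInput(blob)

# Forward pass; this behaves like training=False
pred = net.forward()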

Is there a way to make the OpenCV DNN forward call run inference as if training=True?

1 Answer

Answer by Iffa_Intel

This question is really about OpenCV rather than OpenVINO. For a deeper explanation, it is best to ask on the OpenCV forum, since they are the experts on this.

Generally, the training argument tells each layer which code path to take, since some layers behave differently during training and during inference.

The tutorial you linked explains that training=True is intentional when generating images: you want the batch statistics even while running the model on the test dataset. If you use training=False, you get the accumulated statistics learned from the training dataset (which you don't want).
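As a rough illustration of those two paths, here is a toy normalization layer in pure Python. The class ToyNorm and its parameters are hypothetical and are not the Keras or OpenCV implementation; a real batch normalization layer also has a learnable scale and offset and updates its moving statistics during training, which this sketch leaves out.

```python
from statistics import fmean, pvariance


class ToyNorm:
    """Toy single-channel normalization layer with a `training` switch."""

    def __init__(self, moving_mean, moving_var, eps=1e-5):
        # Statistics accumulated over the training dataset (assumed given).
        self.moving_mean = moving_mean
        self.moving_var = moving_var
        self.eps = eps

    def __call__(self, values, training=False):
        if training:
            # Use the statistics of the current input.
            mean, var = fmean(values), pvariance(values)
        else:
            # Use the stored statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(v - mean) / (var + self.eps) ** 0.5 for v in values]


layer = ToyNorm(moving_mean=0.0, moving_var=1.0)
x = [10.0, 12.0, 14.0]  # made-up input far from the stored statistics

out_sample = layer(x, training=True)   # centered around zero
out_stored = layer(x, training=False)  # stays close to the raw values
```

When the input differs from what the stored statistics describe, the two calls give very different outputs, which is the discrepancy reported in the question.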

In short, you need to know how each layer should behave during training and during inference (depending on your software design), and then set the training argument accordingly for each phase.