I'm facing an issue with the TF2 Object Detection API that seems to have appeared overnight. I'm trying to resume training from a saved checkpoint. As usual, before resuming I changed the path in the config file to point at the directory containing the checkpoints, which has always worked.
Today it's throwing the error below. For some reason, the checkpoint dir and the model dir can no longer be the same. The big problem is that if I change the model dir, training restarts from zero instead of from the last epoch, so I'm stuck. This only happens in TF2; I also tried with TF1 and it works fine.
  File "/usr/local/lib/python3.7/dist-packages/object_detection/utils/variables_helper.py", line 230, in ensure_checkpoint_supported
    (' Please set model_dir to a different path.')))
RuntimeError: Checkpoint dir (/content/drive/MyDrive/Object_detection/training) and model_dir (/content/drive/MyDrive/Object_detection/training) cannot be same. Please set model_dir to a different path.
I faced the same problem. It said that model_dir and checkpoint_dir could not be the same, but when they were different, training would just start from the beginning.
It was due to a recent addition (May 7) of a check at the end of the file "research/object_detection/utils/variables_helper.py", which raises this RuntimeError when the checkpoint dir and model_dir are the same.
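Judging only from the error message in the traceback, the check behaves roughly like the sketch below. This is an illustration, not the library's source: the function name `assert_dirs_differ` and its arguments are hypothetical, and it assumes the check compares the directory holding the checkpoint files against model_dir.

```python
import os


def assert_dirs_differ(checkpoint_path, model_dir):
    """Hypothetical stand-in for the new check (not the actual library code)."""
    # Assumption: "checkpoint dir" means the directory containing the checkpoint.
    checkpoint_dir = os.path.dirname(checkpoint_path)
    if os.path.normpath(checkpoint_dir) == os.path.normpath(model_dir):
        # Message text copied from the traceback in the question.
        raise RuntimeError(
            'Checkpoint dir ({}) and model_dir ({}) cannot be same.'.format(
                checkpoint_dir, model_dir)
            + ' Please set model_dir to a different path.')
```

With a check like this, resuming in place (checkpoints written to and read from the same folder) always fails, which matches the behaviour described in the question.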
I managed to fix it by editing that check after cloning the GitHub repository and before installing the object_detection package.
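One way to apply such an edit from a notebook, between the clone and install steps, is a small exact-match patch helper. This is only a sketch: `patch_source` is a hypothetical helper, and the `old` and `new` snippets are not given here; you would copy them from your own checkout of variables_helper.py.

```python
from pathlib import Path


def patch_source(path, old, new):
    """Replace exactly one occurrence of `old` with `new` in the file at `path`."""
    source_path = Path(path)
    text = source_path.read_text()
    count = text.count(old)
    # Fail loudly rather than patch the wrong place or silently do nothing.
    if count != 1:
        raise ValueError(
            'expected exactly one match for the snippet, found {}'.format(count))
    source_path.write_text(text.replace(old, new))
```

Because the package is installed from the cloned files, the patch has to run before the install step; patching the clone afterwards leaves the already installed copy under dist-packages unchanged.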
I believe you could also have checked out an older version of the repository after cloning, one from before the check was added (this might need some editing to get it working).
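A rough sketch of that alternative, untested against the real repository: find the newest commit before a cutoff date and check it out. The helper names are hypothetical, the `master` branch name is an assumption, and `date` is a value you supply yourself (only "May 7" is mentioned above, without a year). It needs a full clone rather than a shallow one, since the older commits must be present locally.

```python
import subprocess


def rev_list_command(date, branch='master'):
    """Build the git command that prints the newest commit before `date`."""
    return ['git', 'rev-list', '-n', '1', '--before=' + date, branch]


def checkout_older_version(repo_dir, date, branch='master'):
    """Check out the newest commit on `branch` dated before `date`."""
    result = subprocess.run(
        rev_list_command(date, branch),
        cwd=repo_dir, check=True, capture_output=True, text=True)
    commit = result.stdout.strip()
    if not commit:
        raise ValueError('no commit found before {}'.format(date))
    # Leaves the repository in detached HEAD state at the older commit.
    subprocess.run(['git', 'checkout', commit], cwd=repo_dir, check=True)
    return commit
```

As with the patch, this would run after cloning and before installing the object_detection package, so that the installed copy comes from the older sources.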