I need to train a network to detect traffic signs using the TensorFlow 2 Object Detection API. However, many of the traffic sign instances in the training dataset are very small. Most of the images in the training set also have quite a high resolution (there is a variety of resolutions), and I am currently resizing them to 1024x1024. I am using a Faster R-CNN with a ResNet-101 backbone. The resizing makes the road signs extremely small and the network is not able to detect them. These are the scales I am using for the anchors: [0.1, 0.25, 0.75, 1.25, 1.75].
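For reference, the relevant part of my pipeline config looks roughly like this (a sketch, not my exact file: the field names follow the standard faster_rcnn_resnet101 configs from the TF2 model zoo, and the aspect ratios and strides shown here are the usual defaults, not values I have tuned):

```
model {
  faster_rcnn {
    image_resizer {
      # everything gets resized to 1024x1024
      fixed_shape_resizer {
        height: 1024
        width: 1024
      }
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        # the anchor scales mentioned above
        scales: [0.1, 0.25, 0.75, 1.25, 1.75]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
  }
}
```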
I think there are two possible solutions:
- use the original images without resizing them (I assume this requires a lot of GPU memory)
- randomly crop the training images and use the crops as the training dataset, so that the traffic signs do not end up too small

The second solution seems a bit tricky because, I guess, the bounding box coordinates also have to be adjusted to match each crop. Is there an easy way to do that?
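To illustrate what I have in mind for the second option, here is a rough sketch. It assumes the Albumentations library and boxes in Pascal VOC format ([xmin, ymin, xmax, ymax] in absolute pixels); the 1024 crop size, the min_visibility threshold, and the dummy data are arbitrary choices on my part:

```python
import numpy as np
import albumentations as A

# Dummy example: one 2000x3000 image containing a single small traffic sign box.
image = np.zeros((2000, 3000, 3), dtype=np.uint8)
bboxes = [[1500.0, 800.0, 1540.0, 840.0]]   # [xmin, ymin, xmax, ymax] in pixels
class_labels = ["speed_limit"]

# Random 1024x1024 crop; Albumentations remaps the box coordinates into the
# crop and drops boxes whose visible fraction falls below min_visibility.
transform = A.Compose(
    [A.RandomCrop(height=1024, width=1024)],
    bbox_params=A.BboxParams(
        format="pascal_voc",
        label_fields=["class_labels"],
        min_visibility=0.3,
    ),
)

out = transform(image=image, bboxes=bboxes, class_labels=class_labels)
cropped_image = out["image"]          # 1024x1024x3 crop
cropped_bboxes = out["bboxes"]        # boxes shifted into crop coordinates (may be empty)
cropped_labels = out["class_labels"]  # labels of the boxes that survived the crop
```

If the Object Detection API already has a built-in augmentation option that does this, pointing me to it would be even better.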
Any other ideas are welcome as well.
Thanks.