How can I converge loss to a lower value? (TensorFlow)

I used the TensorFlow Object Detection API.
Here is my environment.
All images are from the COCO dataset.

Tensorflow version : 1.13.1
Tensorboard version : 1.13.1
Number of test images : 3000
Number of train images : 24000
Pre-trained model : SSD mobilenet v2 quantized 300x300 coco
Number of detecting class : 1(person)

And here is my train_config.

train_config: {
  batch_size: 6
  optimizer {
    adam_optimizer: {
      learning_rate {
        exponential_decay_learning_rate: {
          initial_learning_rate: 0.000035
          decay_steps: 7
          decay_factor: 0.98
        }
      }
    }
  }
  fine_tune_checkpoint: "D:/TF/models/research/object_detection/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/model.ckpt"
  fine_tune_checkpoint_type: "detection"
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

I can't find an optimal learning rate or appropriate decay steps and decay factor.
I have run many training sessions, but the result is always similar.
How can I fix this?
I have already spent a week on this problem.
In another post, someone recommended adding noise to the dataset (images),
but I don't know what that means.
How can I make that happen?

1 Answer

Best answer, from dnl_anoj:

I think what was referenced in the other post was to do some data augmentation by adding noisy images to your training dataset. That means applying random transformations to the input so that the model learns to generalize better. One type of noise that can be used is random Gaussian noise (https://en.wikipedia.org/wiki/Gaussian_noise), which the Object Detection API applies per patch. Although it seems that you have enough training images, it is worth a shot. The noise option would look like:

...
data_augmentation_options {
  random_horizontal_flip {
  }
}
data_augmentation_options {
  ssd_random_crop {
  }
}
data_augmentation_options {
  random_patch_gaussian {
    # The patch size will be chosen to be in the range
    # [min_patch_size, max_patch_size).
    min_patch_size: 300
    max_patch_size: 300  # if you want the whole image to be noisy
  }
}
...
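To see what that does to an image, here is a minimal NumPy sketch of the idea (an illustration only, not the API's implementation; add_gaussian_noise_patch is a made-up helper for this example):

import numpy as np

def add_gaussian_noise_patch(image, y, x, patch_size, stddev=0.1):
    # Add zero-mean Gaussian noise to a square patch of a float image.
    # image: HxWxC array with values in [0, 1]; (y, x) is the patch's top-left corner.
    noisy = image.copy()
    patch = noisy[y:y + patch_size, x:x + patch_size]
    noise = np.random.normal(loc=0.0, scale=stddev, size=patch.shape)
    noisy[y:y + patch_size, x:x + patch_size] = np.clip(patch + noise, 0.0, 1.0)
    return noisy

# Noise over the whole 300x300 input, matching the config above:
image = np.random.rand(300, 300, 3).astype(np.float32)
noisy = add_gaussian_noise_patch(image, 0, 0, patch_size=300)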

For the full list of data augmentation options, see: https://github.com/tensorflow/models/blob/master/research/object_detection/protos/preprocessor.proto

Regarding the learning rate, one common strategy is to try one large learning rate (0.02, for instance) and one very small, as you have tried already. I would recommend trying 0.02, either leaving it constant for a while or using the exponential decay learning rate, to see if the results are better; the sketch below shows how quickly your current schedule shrinks the rate.
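Exponential decay computes decayed_lr = initial_lr * decay_factor ** (step / decay_steps) (the staircase variant floors the exponent, which barely changes the numbers). A quick plain-Python sketch with your values:

# Exponential decay: decayed_lr = initial_lr * decay_factor ** (step / decay_steps)
initial_lr, decay_steps, decay_factor = 0.000035, 7, 0.98

for step in (0, 100, 1000, 10000):
    decayed = initial_lr * decay_factor ** (step / decay_steps)
    print(f"step {step:>6}: lr = {decayed:.3e}")

# Output (continuous decay):
# step      0: lr = 3.500e-05
# step    100: lr = 2.623e-05
# step   1000: lr = 1.953e-06
# step  10000: lr = 1.023e-17

With decay_steps: 7, the rate is multiplied by 0.98 every 7 steps, so it has practically vanished long before your 200000 steps; a much larger decay_steps (in the thousands) keeps the rate meaningful for the whole run.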

Changing the batch_size can also have some benefits; try batch_size = 2 instead of 6.

I would also recommend leaving the training to run for more steps until you see no improvement at all, perhaps until the 200000 steps defined in your configuration.

Some deeper strategies can help the model perform better; they are described in this answer: https://stackoverflow.com/a/61699696/14203615

That being said, if your dataset is correctly built, you should get good results on your test set.