TensorFlow AlexNet accuracy does not increase during training on the ILSVRC2012 data set


We have been working with a TensorFlow AlexNet implementation based on the following model:

https://github.com/SidHard/tfAlexNet

We have been trying to train the model on the ILSVRC2012 training data, which contains 1000 categories of images. However, during training the accuracy is almost always reported as zero. The loss decreases and eventually levels out (~7.xx) after about 600 iterations. Here is the sample output for the first 1000 iterations:

lr 0.001 Iter 20, Minibatch Loss= 32345.199219, Training Accuracy= 0.00000

lr 0.001 Iter 40, Minibatch Loss= 19276.070312, Training Accuracy= 0.00000

lr 0.001 Iter 60, Minibatch Loss= 10468.625977, Training Accuracy= 0.00000

lr 0.001 Iter 80, Minibatch Loss= 8523.987305, Training Accuracy= 0.00000

lr 0.001 Iter 100, Minibatch Loss= 6895.601562, Training Accuracy= 0.00000

lr 0.001 Iter 120, Minibatch Loss= 5296.706055, Training Accuracy= 0.00000

lr 0.001 Iter 140, Minibatch Loss= 4541.700195, Training Accuracy= 0.00000

lr 0.001 Iter 160, Minibatch Loss= 3658.005371, Training Accuracy= 0.01562

lr 0.001 Iter 180, Minibatch Loss= 3368.450195, Training Accuracy= 0.00000

lr 0.001 Iter 200, Minibatch Loss= 2641.639160, Training Accuracy= 0.00000

lr 0.001 Iter 220, Minibatch Loss= 2349.733154, Training Accuracy= 0.00000

lr 0.001 Iter 240, Minibatch Loss= 2258.051270, Training Accuracy= 0.00000

lr 0.001 Iter 260, Minibatch Loss= 2042.907471, Training Accuracy= 0.00000

lr 0.001 Iter 280, Minibatch Loss= 1863.766602, Training Accuracy= 0.01562

lr 0.001 Iter 300, Minibatch Loss= 1761.209717, Training Accuracy= 0.00000

lr 0.001 Iter 320, Minibatch Loss= 1295.963623, Training Accuracy= 0.00000

lr 0.001 Iter 340, Minibatch Loss= 1295.362793, Training Accuracy= 0.00000

lr 0.001 Iter 360, Minibatch Loss= 1384.312988, Training Accuracy= 0.00000

lr 0.001 Iter 380, Minibatch Loss= 1054.358154, Training Accuracy= 0.00000

lr 0.001 Iter 400, Minibatch Loss= 1121.298584, Training Accuracy= 0.00000

lr 0.001 Iter 420, Minibatch Loss= 874.211853, Training Accuracy= 0.00000

lr 0.001 Iter 440, Minibatch Loss= 643.881165, Training Accuracy= 0.00000

lr 0.001 Iter 460, Minibatch Loss= 521.504211, Training Accuracy= 0.00000

lr 0.001 Iter 480, Minibatch Loss= 312.603027, Training Accuracy= 0.00000

lr 0.001 Iter 500, Minibatch Loss= 227.839951, Training Accuracy= 0.00000

lr 0.001 Iter 520, Minibatch Loss= 87.130402, Training Accuracy= 0.00000

lr 0.001 Iter 540, Minibatch Loss= 8.717880, Training Accuracy= 0.03125

lr 0.001 Iter 560, Minibatch Loss= 7.396675, Training Accuracy= 0.00000

lr 0.001 Iter 580, Minibatch Loss= 7.604724, Training Accuracy= 0.00000

lr 0.001 Iter 600, Minibatch Loss= 7.333892, Training Accuracy= 0.00000

lr 0.001 Iter 620, Minibatch Loss= 7.501148, Training Accuracy= 0.00000

lr 0.001 Iter 640, Minibatch Loss= 7.662533, Training Accuracy= 0.00000

lr 0.001 Iter 660, Minibatch Loss= 7.267312, Training Accuracy= 0.00000

lr 0.001 Iter 680, Minibatch Loss= 7.307860, Training Accuracy= 0.00000

lr 0.001 Iter 700, Minibatch Loss= 7.554455, Training Accuracy= 0.00000

lr 0.001 Iter 720, Minibatch Loss= 7.164557, Training Accuracy= 0.00000

lr 0.001 Iter 740, Minibatch Loss= 7.410737, Training Accuracy= 0.00000

lr 0.001 Iter 760, Minibatch Loss= 7.430292, Training Accuracy= 0.00000

lr 0.001 Iter 780, Minibatch Loss= 7.378484, Training Accuracy= 0.00000

lr 0.001 Iter 800, Minibatch Loss= 7.348798, Training Accuracy= 0.00000

lr 0.001 Iter 820, Minibatch Loss= 7.194911, Training Accuracy= 0.00000

lr 0.001 Iter 840, Minibatch Loss= 7.180418, Training Accuracy= 0.00000

lr 0.001 Iter 860, Minibatch Loss= 7.523817, Training Accuracy= 0.00000

lr 0.001 Iter 880, Minibatch Loss= 7.398925, Training Accuracy= 0.00000

lr 0.001 Iter 900, Minibatch Loss= 7.427280, Training Accuracy= 0.00000

lr 0.001 Iter 920, Minibatch Loss= 7.344418, Training Accuracy= 0.00000

lr 0.001 Iter 940, Minibatch Loss= 7.317279, Training Accuracy= 0.00000

lr 0.001 Iter 960, Minibatch Loss= 7.290085, Training Accuracy= 0.01562

lr 0.001 Iter 980, Minibatch Loss= 7.403413, Training Accuracy= 0.00000

lr 0.0001 Iter 1000, Minibatch Loss= 7.329219, Training Accuracy= 0.00000

We have run the training for more than 10,000 iterations, and neither the loss nor the accuracy changes after ~600 iterations.
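As a rough sanity check on these numbers, here is a minimal sketch of what chance-level values would look like. It assumes the reported loss is a mean softmax cross-entropy in nats and that accuracy is computed per minibatch; neither is stated in the log, and the helper names are only illustrative.

```python
import math

NUM_CLASSES = 1000  # number of ILSVRC2012 categories, as stated above


def chance_loss(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    return math.log(num_classes)


def chance_accuracy(num_classes):
    """Expected top-1 accuracy when guessing uniformly at random."""
    return 1.0 / num_classes


def implied_batch_size(accuracy_step):
    """Batch size implied if accuracy_step equals one correct sample per batch.

    This is an assumption: it only holds if accuracy is averaged over a
    single minibatch.
    """
    return round(1.0 / accuracy_step)


if __name__ == "__main__":
    print("chance-level loss:     %.4f" % chance_loss(NUM_CLASSES))
    print("chance-level accuracy: %.4f" % chance_accuracy(NUM_CLASSES))
    print("implied batch size:    %d" % implied_batch_size(0.01562))
```

Under those assumptions, a plateau at ~7.3 sits close to ln(1000), and the occasional 0.01562 reading is consistent with a single correct prediction in a batch of 64, so the output looks roughly like uniform guessing rather than learning.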

Does anyone have any thoughts/ideas about this training behavior?

Also, has anyone been able to set up and train a TensorFlow AlexNet model on the ILSVRC2012 data set?

Thanks.
