Deep neural network diverges after convergence


I implemented the A3C network in https://arxiv.org/abs/1602.01783 in TensorFlow.

At this point I'm 90% sure the algorithm is implemented correctly. However, the network diverges after convergence. See the attached image that I got from a toy example where the maximum episode reward is 7.

When it diverges, the policy network starts assigning a single action very high probability (>0.9) for most states.

What should I check for this kind of problem? Is there any reference for it?



1 Answer

jaromiru

Note that in Figure 1 of the original paper the authors say:

For asynchronous methods we average over the best 5 models from 50 experiments.

That can mean that in a lot of cases the algorithm does not work that well. In my experience, A3C often diverges, even after converging. Careful learning-rate scheduling can help. Or do what the authors did: train several agents with different seeds and pick the one that performs best on your validation data. You could also employ early stopping when the validation error starts to increase.
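The early-stopping idea can be sketched as follows. This is a minimal illustration, not A3C itself: the `evaluate` callback, the `patience` parameter, and the toy reward curve are all assumptions made for the example.

```python
# Hypothetical sketch of early stopping on validation reward: stop
# training when the reward has not improved for `patience` consecutive
# evaluations, and keep the best score seen so far.

def train_with_early_stopping(evaluate, max_steps=1000, patience=5):
    """Call `evaluate(step)` each step; return the best reward seen and
    the step at which training stopped."""
    best_reward = float("-inf")
    steps_since_best = 0
    for step in range(max_steps):
        reward = evaluate(step)
        if reward > best_reward:
            best_reward = reward
            steps_since_best = 0      # improvement: reset patience counter
        else:
            steps_since_best += 1
        if steps_since_best >= patience:
            break                     # validation reward stopped improving
    return best_reward, step

# Toy reward curve: rises to 7, then diverges (drops), like in the question.
curve = [1, 3, 5, 7, 6, 5, 4, 3, 2, 1, 0, 0]
best, stopped_at = train_with_early_stopping(
    lambda s: curve[min(s, len(curve) - 1)], max_steps=50, patience=3)
# Training halts shortly after the peak instead of riding the divergence down.
```

For the multi-seed approach the authors describe, you would run this loop once per seed and keep the agent whose `best_reward` is highest on held-out validation episodes.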