Using Reinforecement Learning or Not? How to solve specific optimization problem?

23 views Asked by anzin1122 At 16 January 2024 at 15:11

I have an optimization problem for which I have no idea how to find a solution. The concept involves receiving a simple observation (binary) at each timestep, making decisions accordingly, and subsequently receiving another observation. After 1000 timesteps, I obtain a value indicating the effectiveness of the selected decisions. My objective is to determine the best policy of decisions to maximize the end value. What methods can I use to solve this problem?

While Reinforcement Learning seems logical here, it may struggle to recognize that actions at the beginning can have the same influence as those at the end.

I tried DRL algorithms but any of them cannot find good policy. I assume the problem with the sparse reward at the end. Any solution method as a recommendation is appreciated.

Original Q&A

TechQA.

Using Reinforecement Learning or Not? How to solve specific optimization problem?

There are 0 answers

Related Questions in PYTHON

Related Questions in MATHEMATICAL-OPTIMIZATION

Related Questions in REINFORCEMENT-LEARNING

Related Questions in SUPERVISED-LEARNING

Popular Questions

Trending Questions