Error processing event when using Ray's PPO algorithm

I am using the PPO algorithm provided by Ray to train an RL agent to stabilize traffic. During training, I keep seeing this error: ValueError('Observation outside expected value range', Box(500,)

However, I don't know which part of my script is causing this issue, or whether it is caused by Flow at all.

There is 1 answer

Eugene Vinitsky

Oof, yes, that's a very small bug caused by the RLlib upgrade. Basically, the Ray version we used to use wasn't strict about observations staying within the bounds of the observation space, but the new version of Ray is. You can fix this by going into the corresponding environment and changing the low and high values of the observation space to be slightly more permissive (say, -2 to 2 instead of the current -1 to 1).
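
Before widening the bounds, it can help to see which observation entries actually overshoot and by how much. Here is a minimal sketch of such a check in plain Python; the helper name `find_out_of_bounds` and the sample observation are made up for illustration and are not part of Flow or RLlib.

```python
def find_out_of_bounds(obs, low, high):
    """Return (index, value) pairs for entries of obs outside [low, high].

    NaN entries are reported too, since they fail the range comparison.
    """
    return [(i, v) for i, v in enumerate(obs) if not (low <= v <= high)]


# Hypothetical observation: mostly normalized to [-1, 1], one entry overshoots.
obs = [0.2, -0.5, 1.3, 0.9]

# Entries that violate the original, strict bounds.
violations_strict = find_out_of_bounds(obs, low=-1.0, high=1.0)

# Entries that would still violate the more permissive bounds.
violations_relaxed = find_out_of_bounds(obs, low=-2.0, high=2.0)

print(violations_strict)
print(violations_relaxed)
```

If the second list comes back empty for the observations your environment produces, then declaring the space with the wider bounds should be enough. Assuming the environment builds its observation space with gym's `Box`, that would be something along the lines of `Box(low=-2, high=2, shape=(500,), dtype=np.float32)` in place of the -1 to 1 version. If values overshoot by a lot, the normalization in the state computation may be worth a look instead.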