Theory behind state normalization in Reinforcement Learning


I know that normalizing the observation state yields better results in reinforcement learning (see the Stable-Baselines documentation), but I could not find any theoretical background to back this up. I applied RL to robotic grasping: I receive raw depth sensor values and feed them through a series of convolutional layers, which produce a 512-dimensional feature vector. Without normalizing this output, the agent does not learn a working policy; with normalization, it achieves far better performance. I am not looking for a full mathematical proof; a logical explanation is enough.
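For context, here is the kind of normalization I mean: a minimal sketch of running-statistics observation normalization, similar in spirit to Stable-Baselines' VecNormalize. The class and function names here are my own illustration, not the library's API.

```python
# Illustrative sketch of running observation normalization (not Stable-Baselines' API).
import numpy as np

class RunningMeanStd:
    """Tracks a running mean and variance over batches of feature vectors."""
    def __init__(self, shape, epsilon=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = epsilon  # avoids division by zero before the first update

    def update(self, batch):
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        # Parallel-variance update (Chan et al.) to merge batch statistics.
        new_mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        m2 = m_a + m_b + delta**2 * self.count * batch_count / total
        self.mean, self.var, self.count = new_mean, m2 / total, total

def normalize(obs, rms, clip=10.0):
    """Standardize observations and clip outliers to a fixed range."""
    return np.clip((obs - rms.mean) / np.sqrt(rms.var + 1e-8), -clip, clip)

# Usage: update the statistics with each batch of raw features,
# then feed the normalized result to the policy network.
rms = RunningMeanStd(shape=(512,))
raw = np.random.randn(32, 512) * 50 + 100  # e.g. raw conv-feature vectors
rms.update(raw)
policy_input = normalize(raw, rms)
```

Applying something like this to the 512-dimensional conv output is what makes my agent learn, and I would like to understand why.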
