How to apply DDPG OUnoise to my environment


I am trying to do reinforcement learning with the DDPG algorithm in my custom environment. I looked at various OU noise implementations here, but I couldn't find one that fits my environment.

Details: the actor network outputs a total of four actions, e.g. tensor([0.5914, 0.5693, 0.5467, 0.6196], device='cuda:0'). All actions lie between 0 and 1 because the last layer applies a sigmoid function.

And the following is the OUNoise class that I use.

    import copy
    import random

    import numpy as np
    import torch


    class OUNoise:
        """Ornstein-Uhlenbeck process."""

        def __init__(self, size, seed, mu=0., theta=0.15, sigma=0.1):
            """Initialize parameters and noise process."""
            self.mu = mu * torch.ones(size)
            self.theta = theta
            self.sigma = sigma
            self.seed = random.seed(seed)

        def reset(self):
            """Reset the internal state (= noise) to mean (mu)."""
            self.state = copy.copy(self.mu)

        def sample(self):
            """Update internal state and return it as a noise sample."""
            x = self.state
            # Mean-reverting pull toward mu plus one Gaussian draw per action dimension.
            dx = self.theta * (self.mu - x) + self.sigma * torch.tensor(np.array([np.random.normal() for i in range(len(x))]))
            self.state = x + dx
            return self.state

When I train using action + OU noise, the agent ends up learning an action that does nothing. The task I am training on is tracking a target.
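
For reference, here is a minimal, framework-free sketch of what "action + OU noise" could look like when the actions live in [0, 1]. It assumes zero-mean noise and clips the result back into the valid range. The helper names `ou_step` and `noisy_action` are hypothetical and not part of my training code; with tensors the clipping step would correspond to `torch.clamp`.

```python
import random


def ou_step(state, mu=0.0, theta=0.15, sigma=0.1, rng=random):
    """One discrete OU update per dimension: x + theta * (mu - x) + sigma * N(0, 1)."""
    return [x + theta * (mu - x) + sigma * rng.gauss(0.0, 1.0) for x in state]


def noisy_action(action, noise, low=0.0, high=1.0):
    """Add the noise to the actor output, then clip back into [low, high]."""
    return [min(high, max(low, a + n)) for a, n in zip(action, noise)]


rng = random.Random(0)  # local RNG so the sketch is reproducible
action = [0.5914, 0.5693, 0.5467, 0.6196]  # example actor output from above
noise = [0.0] * len(action)  # OU state starts at its mean (mu = 0)

for _ in range(5):
    noise = ou_step(noise, rng=rng)
    print(noisy_action(action, noise))
```

Because the noise is centred on zero, the exploration is centred on the actor's own output. Clipping is only one possible way to keep the result in range.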

What I want to ask is how to set up the OU noise when the action range is 0 to 1 (the mean, the standard deviation of the noise, and so on). In particular, I think the torch.tensor(np.array([np.random.normal() for i in range(len(x))])) term is the important part.
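
To put numbers on "mean and standard deviation": the discrete update x <- x + theta * (mu - x) + sigma * N(0, 1) is an AR(1) process, so in the long run the noise has mean mu and variance sigma^2 / (1 - (1 - theta)^2). The sketch below only evaluates that formula; the function name `ou_stationary_std` is hypothetical, and it assumes independent unit-variance Gaussian draws and 0 < theta < 2.

```python
import math


def ou_stationary_std(theta, sigma):
    """Long-run std of x <- x + theta * (mu - x) + sigma * N(0, 1), for 0 < theta < 2."""
    return sigma / math.sqrt(1.0 - (1.0 - theta) ** 2)


# Defaults of the OUNoise class in this question: theta=0.15, sigma=0.1.
print(ou_stationary_std(theta=0.15, sigma=0.1))
```

With theta=0.15 and sigma=0.1 this comes out at roughly 0.19 per action dimension, which is a sizeable fraction of a 0 to 1 action range.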

I changed the noise term in the line

    dx = self.theta * (self.mu - x) + self.sigma * torch.tensor(np.array([np.random.normal() for i in range(len(x))]))

to np.random.normal(loc=0.5, scale=0.2), np.random.random(), and np.random.uniform(-1, 1), but there was no improvement. Also, the reason the action range is 0 to 1 is that, with a sigmoid rather than tanh on the output, it is easier to convert the action values into the values applied to the actual environment.


There are 0 answers