Parallel environments in Pong keep ending up in the same state despite random actions being taken

Question

Asked by Swami on 01 April 2022 at 08:26 (198 views)

Hi, I am trying to use SubprocVecEnv to run 8 parallel Pong environment instances. I tested the state transitions using random actions, but after 15 steps (with random left or right actions) the states of all the environments are the same. How did this happen, and did I do something wrong? Shouldn't the environments end up in different states? I checked the actions taken, and on net they differ across environments (i.e. after 15 steps, some of the agents have taken more left than right actions and vice versa).

Can someone explain why all the environments end in the same state even after 15 steps of random actions? My concern is that the parallel environments contribute no new experience for learning if they all follow the same trajectory. Thanks a lot!

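import gym
import numpy as np
from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv

env_name = 'PongDeterministic-v4'

def make_env(env_name, seed):
    # Return a thunk so that each subprocess builds its own environment
    def f_():
        env = gym.make(env_name)
        env.seed(seed)
        return env
    return f_

# Every environment is created with the same seed (42)
envs = [make_env(env_name, 42) for _ in range(8)]
envs = SubprocVecEnv(envs)

envs.reset()
for _ in range(15):
    # One random action per environment, drawn from {4, 5}
    fr1, _, _, _ = envs.step(np.random.choice([4, 5], 8))
base = fr1[0, :, :, :]
for i in range(fr1.shape[0]):
    # Note: .all() reduces each frame to a single boolean before the comparison
    if fr1[i, :, :, :].all() == base.all():
        print('Match :(')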

Match :(
Match :(
Match :(
Match :(
Match :(
Match :(
Match :(
Match :(
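One thing worth checking alongside the seeds: the test above compares `fr1[i].all()` with `base.all()`, and each of those is a single boolean (whether every pixel is nonzero), so the check can print a match even when two frames differ. The sketch below shows an element-wise comparison with `np.array_equal` instead. The tiny zero-filled arrays and the helper name `frames_match` are hypothetical stand-ins for real Pong observations, used only so the example runs without an emulator.

```python
import numpy as np

def frames_match(frames):
    """For each frame, report whether it equals frame 0 element-wise."""
    base = frames[0]
    return [bool(np.array_equal(frame, base)) for frame in frames]

# Hypothetical stand-in for a batch of 3 observations (tiny 4x4 RGB frames)
frames = np.zeros((3, 4, 4, 3), dtype=np.uint8)
frames[2, 0, 0, 0] = 255  # frame 2 differs from frame 0 in one pixel

# Reduction-based check from the question: compares two booleans
reduced = [bool(frame.all() == frames[0].all()) for frame in frames]

# Element-wise check: only frames identical to frame 0 count as matches
exact = frames_match(frames)

print(reduced)
print(exact)
```

With the element-wise check, frame 0 always matches itself, so at most 7 of the 8 matches in a run like the one above would be informative.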

Original Q&A

There is 1 answer

**Swami** · Answer 1 · 2022-04-03T06:42:38+00:00

Ok, figured it out. I was using the same seed for all the envs.

envs=[make_env(env_name,42) for _ in range(8)]

should be changed to

envs=[make_env(env_name,i) for i in range(8)] #Seed as some function of i
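As a toy illustration of why a shared seed matters, the sketch below uses NumPy generators as a stand-in for each environment's internal randomness; it does not touch gym or the Atari emulator. The helpers `make_seeds` and `sample_stream` are hypothetical names introduced here, and the `base_seed + i` scheme is just one possible "function of i".

```python
import numpy as np

def make_seeds(base_seed, n_envs):
    """Derive one distinct seed per environment from a base seed."""
    return [base_seed + i for i in range(n_envs)]

def sample_stream(seed, n=32):
    """Draw n pseudo-random integers in [0, 6) from a generator with this seed."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 6, size=n).tolist()

# Same seed everywhere: every stream is identical
same = [sample_stream(42) for _ in range(3)]

# One seed per environment index: the streams diverge
distinct = [sample_stream(s) for s in make_seeds(42, 3)]

print(same[0] == same[1] == same[2])
print(make_seeds(42, 3))
```

The same idea carries over to the environment factory: passing `make_seeds(42, 8)[i]` (or simply `i`) to `make_env` gives each subprocess its own seed.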
