I'm confused by what seems to be a bug in tf_agents. The following piece of code runs without error even though tf_env.action_spec().is_compatible_with(action) is False.

import tensorflow as tf
import tf_agents.environments.tf_py_environment as tf_py_environment
from tf_agents.environments.tf_py_environment_test import PYEnvironmentMock

py_env = PYEnvironmentMock()
tf_env = tf_py_environment.TFPyEnvironment(py_env)
assert tf_env.batch_size == 1
action = tf.constant(2, shape=(1,), dtype=tf.int32)
assert tf_env.action_spec().is_compatible_with(action) is False
obs = tf_env.step(action)

If I now change the shape of the action to (), the action is indeed compatible with the spec, but calling tf_env.step(action)

action = tf.constant(2, shape=(), dtype=tf.int32)
assert tf_env.action_spec().is_compatible_with(action) # Now ok
obs = tf_env.step(action) # raises an IndexError: list index out of range

raises the IndexError below, apparently because step() expects a 1-D action tensor:

Traceback (most recent call last):
  File "/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3291, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-42-f3211de6b571>", line 1, in <module>
  File "/lib/python3.6/site-packages/tf_agents/environments/tf_environment.py", line 232, in step
    return self._step(action)
  File "/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 147, in graph_wrapper
    return f(*args, **kwargs)
  File "/lib/python3.6/site-packages/tf_agents/environments/tf_py_environment.py", line 209, in _step
    dim_value = tensor_shape.dimension_value(action.shape[0])
  File "/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 837, in __getitem__
    return self._dims[key]
IndexError: list index out of range
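
Both observations would be consistent if the spec describes a single, unbatched action while step() reads a leading batch dimension from the action. The following is a minimal plain-Python sketch of that reading, not tf_agents code: the helper names is_compatible and outer_dim are hypothetical, and the scalar spec shape is only inferred from the fact that the shape-() action passes the compatibility check.

```python
# Plain-Python model of the two shape checks discussed above.
# Helper names are hypothetical; they are not part of tf_agents or TensorFlow.


def is_compatible(spec_shape, action_shape):
    """Model of a spec check: the action must match the unbatched spec shape."""
    return tuple(spec_shape) == tuple(action_shape)


def outer_dim(action_shape):
    """Model of the action.shape[0] lookup seen in the traceback."""
    # Indexing an empty list of dimensions raises IndexError (rank-0 action).
    return list(action_shape)[0]


spec_shape = ()   # assumption: the mock environment's action spec is a scalar
batched = (1,)    # shape of the action that step() accepts
unbatched = ()    # shape of the action that passes the spec check

print(is_compatible(spec_shape, batched))    # False
print(outer_dim(batched))                    # 1
print(is_compatible(spec_shape, unbatched))  # True

try:
    outer_dim(unbatched)
except IndexError as err:
    print("IndexError:", err)
```

Under this reading, the spec check and step() would simply be looking at different shapes: one without and one with the batch dimension.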

Is there anything wrong with my code?
