I'd like to build a custom environment (e.g. with OpenAI Gym or PettingZoo) to train RL agents. My use case is a bit unusual, though: agents don't just take actions within the environment (e.g. env.step(action)); things also happen inside the environment on their own (e.g. data changes, an event occurs), and the agents need to be told about these proactively so they can act on them. All of the examples I've seen so far have the agent take an action, observe the reward, etc. The interaction is basically unidirectional: it is always initiated by the agent.
I was thinking of extending the environment into a pub/sub system in which it emits events and agents subscribe to them. However, I'm not sure whether there is an established convention or set of best practices for this beyond what I've described.
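
To make the idea concrete, here is a rough sketch of what I have in mind. It is plain Python with no RL library, and every name in it (EventEmittingEnv, ListeningAgent, subscribe, emit, external_change, the "data_changed" topic) is hypothetical and made up for illustration; none of it is an existing Gym or PettingZoo API. The step() return value only loosely imitates the classic (observation, reward, done, info) tuple.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Listener = Callable[[Event], None]


class EventEmittingEnv:
    """Hypothetical toy environment: a step() loop plus a pub/sub channel."""

    def __init__(self) -> None:
        # topic name -> callbacks registered for that topic
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)
        self.state = 0

    def subscribe(self, topic: str, listener: Listener) -> None:
        self._listeners[topic].append(listener)

    def emit(self, topic: str, payload: Any) -> None:
        event = {"topic": topic, "payload": payload}
        for listener in self._listeners[topic]:
            listener(event)

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int):
        # Usual agent-driven direction: the action changes the state.
        self.state += action
        reward = float(action)
        done = self.state >= 3
        return self.state, reward, done, {}

    def external_change(self, new_state: int) -> None:
        # Environment-driven direction: the state changes without any
        # agent action, and subscribers are notified immediately.
        self.state = new_state
        self.emit("data_changed", new_state)


class ListeningAgent:
    """Hypothetical agent that buffers events and reacts on its next act()."""

    def __init__(self) -> None:
        self.inbox: List[Event] = []

    def on_event(self, event: Event) -> None:
        self.inbox.append(event)

    def act(self, observation: int) -> int:
        # Toy policy: hold (0) once if an event is pending, else advance (1).
        if self.inbox:
            self.inbox.clear()
            return 0
        return 1


env = EventEmittingEnv()
agent = ListeningAgent()
env.subscribe("data_changed", agent.on_event)

obs = env.reset()
env.external_change(2)              # happens outside of any env.step() call
events_seen = list(agent.inbox)     # snapshot of what the agent was told
first_action = agent.act(obs)       # consumes the pending event
obs, reward, done, info = env.step(agent.act(obs))
```

In this sketch the agent only buffers events and still acts through the normal step() loop. Whether events should instead trigger actions directly, or simply be folded into the next observation, is exactly the design question I'm unsure about.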