Resetting only the vectorized environments that are done? #73

Open
kfu02 opened this issue Dec 26, 2023 · 2 comments

Comments

@kfu02
Contributor

kfu02 commented Dec 26, 2023

Hi, sorry in advance if this isn't the right place to ask these kinds of questions.

I have been playing with VMAS in its vanilla form (no torchRL/RLLib) to understand how to implement my own Scenarios, and I am currently confused about how VMAS handles resetting the environment. The reset() function docstring states that it handles resetting "in a vectorized way". From my testing, it seems to reset all vectorized environments.

I was hoping "in a vectorized way" meant that it only resets the environments which are done and leaves the others alone. I would like this behavior so that, for instance, I can collect episode rewards from episodes that are allowed to run until termination. Does VMAS have this functionality built in? Am I misunderstanding reset()?

Thank you for the great library, by the way!

@matteobettini
Member

matteobettini commented Dec 27, 2023

Hello. Thanks for this question; this is a point that I think is worth clarifying and improving upon.

The current situation

Currently, as you say, there are 2 ways to reset an environment:

  • env.reset(), which resets all environments
  • env.reset_at(index), which resets the single environment at the given env_index: int

The way currently available to reset only the done environments is to loop over the done flags and reset each done env individually:

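# done: boolean flags of shape [n_envs], one per vectorized environment
for i in range(n_envs):
    if done[i]:
        # Reset only environment i; the others keep running
        env.reset_at(i)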
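
For the use case in the question (collecting the reward of episodes that run until termination), the loop above can be paired with a per-environment running return. The following is a minimal sketch, not VMAS code: ToyVecEnv is a hypothetical stand-in written with plain Python lists so the example runs on its own, and its step() signature (no actions, returns rewards and done flags) is an assumption made for brevity. Only the reset() / reset_at(index) pair mirrors what is described above.

```python
class ToyVecEnv:
    """Hypothetical stand-in for a vectorized env; NOT the VMAS API."""

    def __init__(self, horizons):
        # horizons[i]: number of steps after which sub-env i reports done
        self.horizons = list(horizons)
        self.n_envs = len(self.horizons)
        self.t = [0] * self.n_envs

    def reset(self):
        # Reset every sub-environment
        self.t = [0] * self.n_envs

    def reset_at(self, index):
        # Reset a single sub-environment
        self.t[index] = 0

    def step(self):
        # Advance every sub-env by one step; each step yields reward 1.0
        self.t = [t + 1 for t in self.t]
        rewards = [1.0] * self.n_envs
        done = [t >= h for t, h in zip(self.t, self.horizons)]
        return rewards, done


def collect_returns(env, n_steps):
    """Run n_steps, resetting only finished sub-envs and logging their returns."""
    env.reset()
    running = [0.0] * env.n_envs  # return accumulated in the current episode
    finished = []                 # (env_index, episode_return) pairs
    for _ in range(n_steps):
        rewards, done = env.step()
        for i in range(env.n_envs):
            running[i] += rewards[i]
            if done[i]:
                finished.append((i, running[i]))
                running[i] = 0.0
                env.reset_at(i)   # the other sub-envs keep running
    return finished


episodes = collect_returns(ToyVecEnv(horizons=[2, 3]), n_steps=6)
```

With a real environment, the same bookkeeping would apply to the rewards and done flags returned by its own step call; the running return has to be zeroed at the same moment reset_at is called for that index.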

The ideal situation

To improve efficiency and avoid this for loop, it would be awesome if the reset_at function also accepted a mask.

Something like:

env.reset_at(done)

This would be amazing. The only problem is that the reset_at function of every current scenario, along with a large part of the simulator logic, would need to be rewritten. So it is not a quick or easy effort.
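
Until something like that exists, the mask-style call can be approximated from user code. The sketch below is an assumption-laden illustration: reset_done is a hypothetical helper (not part of VMAS) that only relies on env.reset_at(index) taking an integer, and RecordingEnv is a stand-in used purely to make the example runnable. It still loops in Python, so it provides the convenient call shape but none of the efficiency gain a true masked reset would bring.

```python
def reset_done(env, done):
    """Hypothetical helper: reset every sub-env whose `done` entry is truthy.

    Assumes `env` exposes reset_at(index: int).
    Returns the list of indices that were reset.
    """
    indices = [i for i, flag in enumerate(done) if flag]
    for i in indices:
        env.reset_at(i)
    return indices


class RecordingEnv:
    """Minimal stand-in that only records reset_at calls (not a VMAS class)."""

    def __init__(self):
        self.reset_calls = []

    def reset_at(self, index):
        self.reset_calls.append(index)


env = RecordingEnv()
reset_indices = reset_done(env, [False, True, True, False])
```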

A consideration

For some of the scenarios I create, I do not implement a done function, so that all environments are done only after max_steps. This way all environments finish at the same time and you can always call env.reset(). I understand that this does not fit all tasks, but I figured I would mention it in case it is helpful.
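
As a rough illustration of why this works, here is a small self-contained sketch. FixedHorizonEnv is a hypothetical stand-in (not a VMAS class) whose sub-environments have no task-specific termination and share a single step counter, which is the assumption behind this approach.

```python
class FixedHorizonEnv:
    """Hypothetical stand-in: sub-envs only finish when max_steps is reached."""

    def __init__(self, n_envs, max_steps):
        self.n_envs = n_envs
        self.max_steps = max_steps
        self.steps = 0
        self.n_full_resets = 0

    def reset(self):
        # Reset every sub-environment at once
        self.steps = 0
        self.n_full_resets += 1

    def step(self):
        self.steps += 1
        # No task-specific termination: every sub-env shares the same flag
        return [self.steps >= self.max_steps] * self.n_envs


env = FixedHorizonEnv(n_envs=4, max_steps=3)
env.reset()
for _ in range(9):
    done = env.step()
    if all(done):    # all sub-envs finish on the same step
        env.reset()  # so a full reset never cuts an episode short
```

Because the done flags are always either all False or all True, there is never a partially finished batch, and the per-index loop is not needed.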

P.S. This change has long been on our TODOs https://github.com/proroklab/VectorizedMultiAgentSimulator?tab=readme-ov-file#todos

@matteobettini matteobettini pinned this issue Dec 27, 2023
@kfu02
Contributor Author

kfu02 commented Dec 27, 2023

Thank you! Your answer makes sense. I will think over these options.
