【feature request】support resource pools across multiple cloud providers #5910

Open
LangDaoAI opened this issue Feb 2, 2023 · 4 comments
Labels
feature Feature requests

Comments

@LangDaoAI

Hi team,

Is there any plan to support resource pools across multiple cloud providers?

From https://docs.determined.ai/latest/introduction.html:

image

Thanks!

@rb-determined-ai
Member

Is the feature request for one resource pool across multiple clouds, or multiple resource pools, one per cloud?

@LangDaoAI
Author

Hi @rb-determined-ai, I think the current system architecture cannot support one resource pool across multiple clouds (a master in one cloud region has no way to communicate with dynamic agents from different cloud providers). As for "multiple resource pools, one per cloud", that would not work either unless the master can reach each resource pool over some network.

I'd like to hear your suggestions.

Thanks!

@rb-determined-ai added the feature (Feature requests) label on Feb 3, 2023
@rb-determined-ai
Member

You are right, we really can't offer one resource pool across multiple clouds. Also, it would have severe impacts on training performance due to the network latency.

We are aware of the feature request for multiple resource pools, one per cloud. It is not currently on our roadmap (planned 12 months out).

But I've made our product team aware of your request.

@LangDaoAI
Author

@rb-determined-ai, regarding "multiple resource pools, one per cloud", I have the following idea for the system architecture:

multi masters

In this architecture, a private datacenter hosts master_proxy, which routes to master_aws and master_gcp, forming multiple resource pools, one per cloud. The WebUI is hosted alongside master_proxy, which can access the two Postgres databases in the different cloud providers for metadata.
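
To make the routing idea concrete, here is a minimal sketch of how master_proxy might map a resource pool to the master that owns it. This is only an illustration of the proposal, not anything Determined provides: the pool names, the URLs, and the `MasterEndpoint` and `route` helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterEndpoint:
    """A per-cloud master that the proxy can forward requests to (hypothetical)."""
    name: str
    url: str


# Hypothetical routing table: one resource pool per cloud, each owned by its own master.
# The pool names and URLs are placeholders, not real Determined configuration.
ROUTES = {
    "aws-pool": MasterEndpoint("master_aws", "https://master-aws.example.internal:8080"),
    "gcp-pool": MasterEndpoint("master_gcp", "https://master-gcp.example.internal:8080"),
}


def route(resource_pool: str) -> MasterEndpoint:
    """Return the master that owns the given resource pool, or raise ValueError."""
    try:
        return ROUTES[resource_pool]
    except KeyError:
        raise ValueError(f"unknown resource pool: {resource_pool!r}") from None
```

With a table like this, a job submitted for "gcp-pool" would be forwarded to master_gcp, and a request for a pool that no master owns would fail at the proxy instead of reaching either cloud.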

As far as I know, the current system architecture cannot support "multiple resource pools, one per cloud".

Please correct me if I've gotten anything wrong. Thanks!