
Use Petals without sharing GPU #14

Open
raihan0824 opened this issue Mar 15, 2023 · 11 comments


@raihan0824

Is it possible to use Petals for inference/prompt tuning without sharing my GPU?

@Muennighoff
Collaborator

Not sure about that, maybe one of @borzunov @justheuristic @mryab knows?

@borzunov

Hi @raihan0824,

Your GPU is not shared when you use a Petals client to run inference or fine-tuning. The GPU is only shared when you run a Petals server.

@raihan0824
Author

Yes, but I want to run BLOOM in Petals with my own GPU, not others'. Is that possible?

@mryab
Member

mryab commented Mar 15, 2023

Hi, do you mean you want to use Petals with your GPU, but don't want to let the others use it? I think you can set up a private swarm using these instructions. If you run into any troubles, the tutorial has a link to the Discord server, where we (and other users of Petals) can help you with technical issues.

Please keep in mind that you'll need around 176GB of GPU memory just for 8-bit parameters though; if you only have a single GPU, your best bet is offloading or joining the public swarm.
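For reference, launching a private swarm usually boils down to two commands; this is a sketch based on the private-swarm tutorial, and the model name, IP, port, and peer ID below are placeholders you would replace with your own (double-check flag names against the current Petals docs):

```shell
# Bootstrap a new private swarm with the first server.
# --new_swarm tells it to start its own swarm instead of joining the public one.
python -m petals.cli.run_server bigscience/bloom-petals \
    --new_swarm --host_maddrs /ip4/0.0.0.0/tcp/31337

# The first server prints its multiaddress on startup; every further server
# (and client) joins via --initial_peers. The address below is a placeholder.
python -m petals.cli.run_server bigscience/bloom-petals \
    --initial_peers /ip4/10.0.0.2/tcp/31337/p2p/QmPLACEHOLDER
```

These are CLI invocations against a running cluster, so treat them as a configuration template rather than something to copy-paste verbatim.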

@raihan0824
Author

Well noted.

Is it possible to do prompt tuning with that private swarm? Also, what if I want to use a smaller BLOOM model such as bloomz-7b1-mt?

My goal is to do prompt tuning on bloomz-7b1-mt.

@mryab
Member

mryab commented Mar 15, 2023

Yes, it is possible — you just need to specify a different set of initial peers in DistributedBloomConfig when you're creating DistributedBloomForCausalLM from the tutorial. By default, the config (and thus the model) will connect to peers from the public swarm — you need to change these to the addresses of your peers in the private swarm.

However, I'd say that for bloomz-7b1, you might not even need Petals (depends on your GPU setup, obviously). A reasonably new GPU should be able to host the whole model, so you'll be able to run it just with standard Transformers/PEFT. Do you have any specific reasons why you want to use Petals for this task?
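Concretely, pointing the client at a private swarm is a one-argument change. A hedged sketch (the multiaddress is a placeholder, and the exact import path may differ between Petals versions):

```python
from petals import DistributedBloomForCausalLM

# Bootstrap addresses of your own swarm's peers (placeholder multiaddress —
# use the one your first server prints on startup).
INITIAL_PEERS = ["/ip4/10.0.0.2/tcp/31337/p2p/QmPLACEHOLDER"]

# By default the client connects to the public swarm; passing initial_peers
# overrides that with your private swarm's bootstrap peers.
model = DistributedBloomForCausalLM.from_pretrained(
    "bigscience/bloom-petals",
    initial_peers=INITIAL_PEERS,
)
```

This snippet needs a live swarm to actually connect to, so it is a configuration sketch rather than a runnable test.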

@raihan0824
Author

The reason I want to use Petals is that it supports prompt tuning instead of full fine-tuning. I can't find other sources that provide prompt tuning for BLOOM.

@mryab
Member

mryab commented Mar 15, 2023

Have you checked out https://github.com/huggingface/peft#use-cases? I think PEFT even showcases bigscience/bloomz-7b1, and the model support matrix includes BLOOM for prompt tuning.
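For what it's worth, prompt tuning bloomz-7b1 with PEFT alone looks roughly like this. A sketch assuming a single GPU that fits the model; the init text and virtual-token count are chosen purely for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

model_name = "bigscience/bloomz-7b1-mt"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prompt tuning: learn a few "virtual token" embeddings prepended to the input,
# keeping the base model frozen.
peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Classify the sentiment of this review:",
    num_virtual_tokens=8,
    tokenizer_name_or_path=model_name,
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the prompt embeddings are trainable
```

Training then proceeds with a standard Transformers `Trainer` or a plain PyTorch loop. This requires downloading the 7B checkpoint, so it is meant as an API sketch rather than something to execute as-is.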

@raihan0824
Author

Thank you for the info! Will check it out.

So I want to confirm my initial question:
It's possible to use Petals with my own GPU to do inference and prompt tuning on the bigscience/bloomz-7b1 model.
Is that correct?

@mryab
Member

mryab commented Mar 15, 2023

Yes, it is possible, but not necessary: with PEFT, you are likely to get the same result with fewer intermediate steps for setup.

@raihan0824
Author

Thank you very much 🙏
