Suggestion to improve UX #32

Open
ch3rn0v opened this issue Apr 2, 2023 · 3 comments

Comments

ch3rn0v commented Apr 2, 2023

Thank you for this project! I tried it, and unlike some others it worked (with LLaMA 7B on a 2080 Ti).
Now I'd like to scale up my experiments.

  1. In order to do so, I would need an option to initiate training programmatically. Technically, I'd be able to extract what I need from main.py, but it would be great if there were an already-tested example.
  2. I'd also like to see an example of how to convert the directory with checkpoints into a standalone model.

Would you please share your thoughts on this or perhaps a link to where it's already implemented? Thank you in advance.

lxe (Owner) commented Apr 6, 2023

I just released v2, where I rewrote the whole thing from scratch. There's no way to merge the LoRA weights in the UI yet, but it's very simple. Check out this notebook for an example: https://github.com/lxe/cerebras-lora-alpaca/blob/main/merge_lora_weights_into_cerebras.ipynb
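
Roughly, the idea is the following (a minimal sketch using peft's `merge_and_unload`; the model path and output directories are placeholders, not the notebook's exact code):

```python
# Minimal sketch of merging a LoRA adapter into its base model with peft.
# All paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "path/to/base-model",          # placeholder: the original base checkpoint
    torch_dtype=torch.float16,     # merge in fp16; merging into 8-bit weights isn't supported
)
model = PeftModel.from_pretrained(base, "path/to/lora-checkpoint")  # placeholder adapter dir
merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights

merged.save_pretrained("path/to/standalone-model")
AutoTokenizer.from_pretrained("path/to/base-model").save_pretrained("path/to/standalone-model")
```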

I'll definitely add merging and downloading in the UI.

I think tloen/alpaca-lora is a better fit for programmatic finetuning.
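
For reference, the core of a programmatic LoRA finetune looks roughly like this (a generic sketch with transformers + peft, not this repo's or alpaca-lora's exact code; the model path, dataset, and hyperparameters are placeholders):

```python
# Rough outline of programmatic LoRA finetuning with transformers + peft.
# "path/to/base-model" and train_dataset are placeholders.
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

model = AutoModelForCausalLM.from_pretrained(
    "path/to/base-model", load_in_8bit=True, device_map="auto"
)
model = prepare_model_for_int8_training(model)  # casts norms to fp32, enables grad checkpointing

tokenizer = AutoTokenizer.from_pretrained("path/to/base-model")
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default

lora_config = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05, bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # wrap the base model with trainable LoRA layers

trainer = transformers.Trainer(
    model=model,
    train_dataset=train_dataset,  # placeholder: your tokenized dataset
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=3e-4,
        fp16=True,
        output_dir="lora-out",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # caching is incompatible with gradient checkpointing
trainer.train()
model.save_pretrained("lora-out")  # writes only the LoRA adapter weights
```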

ch3rn0v (Author) commented Apr 7, 2023

Thank you, I'll check it out!

ch3rn0v (Author) commented Apr 9, 2023

I managed to use parts of your library to set up programmatic finetuning. The two things it took were the Trainer class and some constants from config.py.
A few questions remain:

  • I see it sets load_in_8bit=True. Does this mean the original model is quantized on the fly while being loaded into memory, so that it takes roughly half the space?
  • Does this also mean that training updates these "rounded" (quantized) weights instead of the original ones?
  • Finally, if I repeat the whole process in the cloud with much more VRAM, will it be enough to simply set load_in_8bit=False in order to load the original LLaMA weights and compute a "diff" in fp16 instead of 8-bit?

Thank you in advance!

Edit: Perhaps the peft.prepare_model_for_int8_training call should also be removed when load_in_8bit=False, right?
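
To make the question concrete, this is roughly what I imagine the two loading paths would look like (a sketch; the model path is a placeholder):

```python
# Sketch of the two loading paths in question; "path/to/llama-7b" is a placeholder.
import torch
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_int8_training

load_in_8bit = False  # plenty of VRAM available in the cloud

if load_in_8bit:
    model = AutoModelForCausalLM.from_pretrained(
        "path/to/llama-7b", load_in_8bit=True, device_map="auto"
    )
    model = prepare_model_for_int8_training(model)  # only needed for int8 training
else:
    model = AutoModelForCausalLM.from_pretrained(
        "path/to/llama-7b", torch_dtype=torch.float16, device_map="auto"
    )
    # no prepare_model_for_int8_training here: the fp16 weights are trained against directly
```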
