Fine-tuning codellama dataset #1
Comments
I only tried it on plaintext for now. Do you have an example/link to a more detailed format description, and/or some open dataset in that format, so I can check?
Can I use the Alpaca dataset JSON format?
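For context, a minimal sketch of the Alpaca-style format being asked about (the instruction/input/output field names match the public Alpaca dataset; the file name here is just an example):

```python
import json

# Alpaca-style data is a single JSON array of records, each with
# "instruction", "input" (optionally empty), and "output" fields.
alpaca_samples = [
    {
        "instruction": "Explain what this function does.",
        "input": "def add(a, b):\n    return a + b",
        "output": "It returns the sum of its two arguments.",
    },
    {
        "instruction": "Give three uses for a paperclip.",
        "input": "",
        "output": "Holding papers together, resetting small devices, and acting as a makeshift hook.",
    },
]

with open("alpaca_sample.json", "w") as f:
    json.dump(alpaca_samples, f, indent=2)
```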
@Naozumi520 - let me try that out and update if needed.
@Naozumi520 -- needs some work, but should be possible. Will update here once I get it running successfully.
Testing a similar dataset (https://huggingface.co/datasets/databricks/databricks-dolly-15k) in https://github.com/okuvshynov/slowllama/tree/try_dolly
It works, in the sense that the loss is going down. Here's the log of finetuning llama7b on the first 100 samples from dolly-15k:
However, there are a few important things to do to improve:
Still, you can try it by doing something like
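The exact steps weren't preserved in this thread; as a rough, hypothetical sketch of pulling dolly-15k and flattening it into plain prompt/response text that a plaintext finetuning loop could consume (the prompt template and output file name are assumptions, not the branch's actual code):

```python
from datasets import load_dataset

# Load the first 100 samples of dolly-15k, mirroring the experiment above.
ds = load_dataset("databricks/databricks-dolly-15k", split="train[:100]")

def to_text(sample):
    # dolly-15k records have "instruction", "context", "response", "category".
    parts = [f"Instruction: {sample['instruction']}"]
    if sample["context"]:
        parts.append(f"Context: {sample['context']}")
    parts.append(f"Response: {sample['response']}")
    return "\n".join(parts)

# Write one training example per block, separated by blank lines,
# so it can be fed to a plaintext finetuning pipeline.
with open("dolly_100.txt", "w") as f:
    for sample in ds:
        f.write(to_text(sample) + "\n\n")
```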
Wow, thank you!
I tried it on my Intel Mac, as my M2 Mac is not at home. However, I got the error
I don't know much about Intel Macs - I assumed the 'mps' device targets the GPUs first introduced in Apple silicon devices, starting with the Apple M1. Maybe that framework works with older GPUs too, not sure. Could you post your Intel Mac config? Maybe it's possible to run it on its GPU? Running on CPU will likely be way too slow.
Yes, mps works with older GPUs too, including Intel Macs (AMD GPU). My Mac has an i9 CPU with a 5500M GPU (4 GB VRAM). Apple silicon uses RAM as VRAM, so the MPS backend memory should not be a problem there. However, the Intel Mac I'm using now has only 4 GB of VRAM, and because of the long holiday in Hong Kong right now I couldn't bring my M2 Mac. :(
I see. 4 GB is a little too low - it might still be possible to get it working with a short sequence + tiny batch size (=1). In that case it is probably critical to implement gradient accumulation, though.
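For reference, a minimal sketch of the gradient accumulation pattern in PyTorch (the model, data, and step counts here are placeholders, not slowllama's actual training loop):

```python
import torch
import torch.nn as nn

# Placeholder model/data, just to illustrate the accumulation pattern.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

accum_steps = 8          # effective batch = micro_batch * accum_steps
micro_batches = [(torch.randn(1, 16), torch.randn(1, 1)) for _ in range(32)]

optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches, start=1):
    loss = loss_fn(model(x), y) / accum_steps  # scale so gradients average out
    loss.backward()                            # gradients accumulate across micro-batches
    if step % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```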
It is probably also possible to do a 2-level prefetch (HDD -> RAM -> VRAM), while with unified memory I did everything in one level.
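A very rough sketch of what such a 2-level prefetch could look like: load the next layer's weights from disk into RAM on a background thread while the current layer runs on the GPU. The per-layer weight files and the layer forward pass here are assumptions, not the project's actual code.

```python
import threading
import torch

def load_from_disk(path):
    # Level 1: HDD/SSD -> RAM. Assumes each layer's weight tensor was
    # saved separately with torch.save().
    return torch.load(path, map_location="cpu")

def run_layers(layer_paths, x, device="mps"):
    next_weights = {}

    def prefetch(path):
        next_weights["w"] = load_from_disk(path)

    weights = load_from_disk(layer_paths[0])
    for i, _ in enumerate(layer_paths):
        # Start fetching the next layer from disk while this one computes.
        t = None
        if i + 1 < len(layer_paths):
            t = threading.Thread(target=prefetch, args=(layer_paths[i + 1],))
            t.start()

        # Level 2: RAM -> GPU memory, compute, then drop the GPU copy.
        w_gpu = weights.to(device)
        x = x @ w_gpu            # placeholder for the real layer forward pass
        del w_gpu

        if t is not None:
            t.join()
            weights = next_weights.pop("w")
    return x
```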
@Naozumi520 -- after b87cd7c it's possible to do both storage and finetuning in the fp16 datatype, which cuts both compute and RAM requirements considerably.
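As an illustration of why fp16 roughly halves the footprint, a small sketch of casting and saving weights in torch.float16 (the file names are arbitrary; this is not the commit's actual code):

```python
import torch
import torch.nn as nn

layer = nn.Linear(4096, 4096)  # ~16.8M parameters

# fp32: 4 bytes per parameter.
fp32_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())
torch.save(layer.state_dict(), "layer_fp32.pt")

# Cast to fp16 in place: 2 bytes per parameter, roughly half the storage
# and half the bandwidth when shuttling weights between disk, RAM and GPU.
layer.half()
fp16_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())
torch.save(layer.state_dict(), "layer_fp16.pt")

print(f"fp32 ~= {fp32_bytes / 1e6:.1f} MB, fp16 ~= {fp16_bytes / 1e6:.1f} MB")
```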
Wow, thank you!
Is there a particular dataset format required for finetuning codellama? I have the dataset in the OpenAI-suggested format, which is basically a jsonl where each entry is a {messages: [{role: 'system', content: ''}, {role: 'user', content: ''}, {role: 'assistant', content: ''}]} object. Will this format work?
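For concreteness, a small sketch of that jsonl layout and of flattening each messages record into a single training string (the "role: content" template and file names are assumptions; they would need to match whatever prompt format the finetuning script expects):

```python
import json

# One JSON object per line, each with a "messages" list in the
# system/user/assistant role format described above.
records = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": "Write a function that reverses a string."},
            {"role": "assistant", "content": "def reverse(s):\n    return s[::-1]"},
        ]
    }
]

with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Flatten each record into one plaintext example, e.g. for a plaintext pipeline.
with open("train.jsonl") as f:
    for line in f:
        messages = json.loads(line)["messages"]
        text = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
        print(text)
```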