How much do you cost on fine-tune starcoder? #11

Open
wangjiyang opened this issue Jul 19, 2023 · 5 comments
Comments

@wangjiyang

Hi. Thank you for your great work. Your approach is helpful to me. I am trying to fine-tune StarCoder to improve its performance on C code, so it would help me to know what fine-tuning StarCoder cost you. Could you share this information? Thanks.

@minosvasilias
Owner

Hey @wangjiyang, thanks, glad it's useful!

The finetuning process for the starcoder-15b model took 15h30m on an 8xA100-80GB instance.
I was able to get that instance for $12/h, so that's a total cost of about $186 (or $192 if the provider bills in full hours), assuming you're very efficient with your time while renting the instance.

In reality, downloading the base model, setting up the dataset, debugging a few short test finetunes to make sure there are no memory issues, and uploading the final finetune added a couple of extra hours on top of that, so I'd budget a bit more depending on how long those steps take you.
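For anyone budgeting a similar run, the arithmetic above can be sketched as a small helper. This is only an illustration: the function name `finetune_cost`, the `bill_full_hours` switch, and the 2-hour overhead in the last call are assumptions (the overhead stands in for "a couple extra hours"), not measured figures.

```python
import math


def finetune_cost(train_hours, rate_per_hour, overhead_hours=0.0, bill_full_hours=False):
    """Estimate the rental cost of a finetuning run.

    train_hours:     wall-clock training time in hours
    rate_per_hour:   instance price per hour
    overhead_hours:  setup, debugging, and upload time (assumed value)
    bill_full_hours: round the total up if the provider bills whole hours
    """
    hours = train_hours + overhead_hours
    if bill_full_hours:
        hours = math.ceil(hours)
    return hours * rate_per_hour


# 15h30m of training at $12/h on the 8xA100-80GB instance
print(finetune_cost(15.5, 12))                        # 186.0
print(finetune_cost(15.5, 12, bill_full_hours=True))  # 192
# Same run with an assumed 2 hours of setup and upload overhead
print(finetune_cost(15.5, 12, overhead_hours=2))      # 210.0
```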

@viktor-ferenczi

If your GPU cloud provider supports creating templates, using one may make the process cheaper. Prepare your template without a GPU (if possible) or with a cheap one. The template should include all model data already cached, so nothing big needs to be downloaded when the actual (expensive) training instance is created with all the GPUs. This approach can save most of the setup time, but it cannot help with memory-related issues, which need to be debugged with all the GPUs present.
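A minimal sketch of the cache-warming step, assuming the model is hosted on the Hugging Face Hub and the `huggingface_hub` package is installed on the template instance. The helper name `warm_model_cache`, the repo id `bigcode/starcoder`, and the cache path are assumptions for illustration; gated models also require an access token to be configured beforehand.

```python
from pathlib import Path


def warm_model_cache(repo_id, cache_dir, download=None):
    """Download a model snapshot into cache_dir and return its local path.

    `download` defaults to huggingface_hub.snapshot_download; it is a
    parameter so the helper can be exercised without network access.
    """
    if download is None:
        # Imported lazily so the package is only needed for real downloads.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    return download(repo_id=repo_id, cache_dir=str(cache_dir))


# Example call on the cheap template instance (repo id and path are assumptions):
# warm_model_cache("bigcode/starcoder", "/workspace/hf-cache")
```

On the training instance, pointing the training script at the same cache directory should then avoid re-downloading the weights, provided the template's disk is carried over.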

@wangjiyang
Author

Thanks @minosvasilias @viktor-ferenczi, your information is very helpful. I read some papers and found that some people have tried using ASTs (abstract syntax trees) to improve model quality on code generation. Do you have any experience with this approach?

@viktor-ferenczi

Yes, I do. In the AskYourCode ChatGPT plugin I was doing exactly that. Please feel free to message me directly on Discord; the invite is on the https://askyourcode.ai page.

@minosvasilias
Owner

Good point on the templates, @viktor-ferenczi.

I have no personal experience with syntax trees, but I'm also interested in finding out more.
