Instruction tuning with my own datasets #139
Comments
Thanks for your questions!
Thank you for your guidance! I've managed to fine-tune the model using multiple GPUs successfully. I suspect that the model's proficiency in Chinese can be attributed to the Vicuna model components, so further fine-tuning with additional Chinese instructions could potentially enhance its performance. I'm considering exploring this to see the impact on its language handling capabilities.
May I ask how many GPUs were used during the fine-tuning process? Thank you!
For a small fine-tuning dataset, I think 4-8 GPUs with more than 40GB of memory each should be enough. However, the current codebase may not be efficient. You can follow other repos like LAVIN and use lightweight fine-tuning strategies such as QLoRA.
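For intuition on why LoRA-style methods are so much lighter than full fine-tuning, here is a minimal pure-Python sketch (not code from this repo or LAVIN; the function name and layer sizes are illustrative) comparing the trainable-parameter count of a full linear layer against its rank-r LoRA adapter:

```python
def lora_param_counts(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Trainable parameters for a full d_in x d_out linear layer
    vs. its rank-r LoRA adapter (two low-rank matrices B @ A)."""
    full = d_in * d_out          # every weight is trainable
    lora = r * (d_in + d_out)    # only A (r x d_in) and B (d_out x r)
    return full, lora

# A hypothetical 4096 x 4096 projection (typical width for a 7B LLM), rank 16:
full, lora = lora_param_counts(4096, 4096, 16)
print(full, lora)  # 16777216 vs 131072 -- under 1% of the full layer
```

QLoRA combines this with 4-bit quantization of the frozen base weights, which is why a model that needs 8 large GPUs for full fine-tuning can often be adapted on one or two.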
I am planning to fine-tune the VideoChat2 model with custom instruction data to enhance its performance on downstream tasks. I have a couple of questions regarding the pre-training data and the process of fine-tuning with Chinese instructions. Your insights will be highly valuable to me.
1. Pre-Training Data Language:
Was Chinese video-text data utilized in the pre-training phase of the VideoChat2 model? I've experimented with some Chinese instructions, and the model's performance was quite satisfactory. Is it advisable to perform instruction tuning on the stage 3 model using Chinese instructions?
2. Multi-GPU Fine-Tuning:
I am interested in fine-tuning the model using multiple GPUs to expedite the training process. However, I couldn't find any related arguments or settings for enabling multi-GPU training in the provided configuration file ("/scripts/config_7b_stage3.py"). Could you provide guidance or examples on how to modify the configuration for multi-GPU support?
Your assistance will greatly aid in optimizing the model for my specific requirements. Thank you in advance for your help.
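On question 2: in PyTorch codebases of this kind, multi-GPU training is usually enabled by the launcher (e.g. `torchrun --nproc_per_node=8 train.py ...`) rather than by a flag in the config file itself; each spawned worker then reads its rank from environment variables that the launcher sets. A minimal sketch of that convention, assuming standard `torchrun` environment variables (the helper name is ours, not from this repo):

```python
import os

def parse_dist_env(env: dict) -> tuple[int, int, int]:
    """Read torchrun-style env vars as (rank, world_size, local_rank).
    The defaults correspond to single-process (single-GPU) training."""
    return (int(env.get("RANK", "0")),
            int(env.get("WORLD_SIZE", "1")),
            int(env.get("LOCAL_RANK", "0")))

# If launched as: torchrun --nproc_per_node=8 train.py ...
# worker 3 of 8 on a single node would see:
print(parse_dist_env({"RANK": "3", "WORLD_SIZE": "8", "LOCAL_RANK": "3"}))
# -> (3, 8, 3)
```

So the configuration file typically does not need changing for multi-GPU support; the number of processes is chosen at launch time, and the training script uses these variables to initialize `torch.distributed`.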