Is the maximum window length of 1.5 only 2048? Can it be set longer, e.g. 4096? #180
Comments
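What should I do if I want to extend the training length further? For example, if I want to extend it to 8192, do I need to start over from pre-training?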
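I don't think you need to redo pre-training from scratch. A model trained at 4k can be extended directly to 8k-10k without major problems. If you want to go to even longer lengths, you may need to fine-tune again on long-sequence data. You could also try our recently released Mini-InternVL-Chat-2B-V1-5 and Mini-InternVL-Chat-4B-V1-5; both models went through SFT at a length of 8k.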
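To make the length trade-off concrete, here is a minimal sketch of the token budgeting behind the 2048 / 4096 / 8192 question. It assumes each image tile contributes a fixed number of visual tokens; the default of 256 tokens per tile and the helper names `fits_in_context` and `max_tiles` are illustrative assumptions, not values or APIs confirmed in this thread.

```python
def fits_in_context(num_tiles: int, text_tokens: int, max_len: int,
                    tokens_per_tile: int = 256) -> bool:
    """Return True if visual tokens plus text tokens fit within max_len.

    tokens_per_tile is an assumed placeholder; check the real value
    for the model you are training.
    """
    total = num_tiles * tokens_per_tile + text_tokens
    return total <= max_len


def max_tiles(text_tokens: int, max_len: int, tokens_per_tile: int = 256) -> int:
    """Largest number of image tiles that still leaves room for the text."""
    return max(0, (max_len - text_tokens) // tokens_per_tile)


if __name__ == "__main__":
    # Compare how many tiles fit at the sequence lengths discussed above,
    # assuming a hypothetical 512-token text prompt plus answer.
    for max_len in (2048, 4096, 8192):
        print(max_len, max_tiles(text_tokens=512, max_len=max_len))
```

Under these assumptions, a longer maximum length mainly buys room for more image tiles or longer text per sample, which is why samples that are truncated at 2048 may fit at 4096 or 8192.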
Thanks. Roughly how many resources does SFT at a length of 4096 require? Without setting up a Slurm cluster, can it be done on 16 × 48 GB GPUs?