error when converting llama1 ckpts to hf format #30723
Comments
Same question!

Haha, that's annoying — we might have broken conversion for llama1 when adding llama3.

I encountered the same error when I was converting the Llama 2 model. Using …

Yep, it's not expected. I'll open a PR to fix conversion on all models 🤗

Any update on this issue, please?

If you just need to convert the old Llama models, you can quickly revert to a previous version of transformers (prior to 4.39.0) 😉
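For reference, the version pin suggested in that workaround could look like the following (a sketch — it installs whatever the latest transformers release before 4.39.0 is, which should still ship the old conversion script):

```shell
# Workaround sketch: pin transformers to a release prior to 4.39.0,
# where the Llama 1 conversion path reportedly still works.
pip install "transformers<4.39.0"
```

After installing, rerun the conversion command from that older version's copy of `convert_llama_weights_to_hf.py`.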
System Info

transformers version: 4.41.0.dev0

Who can help?
No response
Information

Tasks

examples folder (such as GLUE/SQuAD, ...)

Reproduction
python transformers/models/llama/convert_llama_weights_to_hf.py --input_dir path-to-llama1-7b-source --model_size 7B --output_dir path-to-llama1-7b-target --llama_version 1
Expected behavior
error: RuntimeError: shape '[32, 2, 2, 4096]' is invalid for input of size 16777216
This is raised at transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py, line 181 (at commit f26e407): k_proj in Llama 1 7B is 4096 × 4096, but dim1 here is 128, so the reshape target has far fewer elements than the weight. This looks like a bug in the Llama 1 checkpoint conversion path.
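The failing reshape can be reproduced in isolation. The snippet below is a sketch (with n_heads and dim hard-coded to Llama 1 7B values, not taken from the conversion script itself) showing why dim1 = 128 produces the reported error while dim1 = 4096 would succeed:

```python
import torch

n_heads = 32   # attention heads in Llama 1 7B
dim = 4096     # hidden size in Llama 1 7B

# k_proj in Llama 1 7B: 4096 x 4096 = 16,777,216 elements
k_proj = torch.empty(dim, dim)

# With dim1 mistakenly set to 128, the view target is
# [32, 2, 128 // 32 // 2, 4096] = [32, 2, 2, 4096] (524,288 elements),
# which cannot hold the 16,777,216-element weight:
try:
    k_proj.view(n_heads, 2, 128 // n_heads // 2, dim)
except RuntimeError as e:
    print(e)  # shape '[32, 2, 2, 4096]' is invalid for input of size 16777216

# With dim1 = 4096 (the full hidden size), the element counts match:
reshaped = k_proj.view(n_heads, 2, 4096 // n_heads // 2, dim)
print(reshaped.shape)  # torch.Size([32, 2, 64, 4096])
```

This matches the traceback exactly: 32 × 2 × 2 × 4096 is only a fraction of the 4096 × 4096 weight, hence the "invalid for input of size 16777216" error.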