
error when converting llama1 ckpts to hf format #30723

Open
2 of 4 tasks
a157801 opened this issue May 9, 2024 · 7 comments
Comments

@a157801

a157801 commented May 9, 2024

System Info

  • transformers version: 4.41.0.dev0
  • Platform: Linux-4.18.0-425.3.1.el8.x86_64-x86_64-with-glibc2.17
  • Python version: 3.9.12
  • Huggingface_hub version: 0.21.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.21.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

python transformers/models/llama/convert_llama_weights_to_hf.py --input_dir path-to-llama1-7b-source --model_size 7B --output_dir path-to-llama1-7b-target --llama_version 1

Expected behavior

The script fails with:

RuntimeError: shape '[32, 2, 2, 4096]' is invalid for input of size 16777216

at the line

dim1=dim // num_local_key_value_heads,

The k_proj weight in Llama 1 7B is 4096 × 4096, but dim1 here comes out to 128. This looks like a bug in the Llama 1 conversion path.
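For reference, here is a minimal NumPy sketch of the failing reshape. The real script operates on torch tensors, and this `permute` is a simplified reconstruction of the rotary-weight permutation in `convert_llama_weights_to_hf.py`, not its exact code; the size numbers match the error above:

```python
import numpy as np

dim = 4096    # hidden size of Llama 1 7B
n_heads = 32  # attention heads; Llama 1 has no GQA, so kv heads == heads

def permute(w, n_heads, dim1, dim2):
    # Un-interleave the rotary halves: view as (heads, dim1/heads/2, 2, dim2),
    # swap the two middle axes, then flatten back to (dim1, dim2).
    return (w.reshape(n_heads, dim1 // n_heads // 2, 2, dim2)
             .transpose(0, 2, 1, 3)
             .reshape(dim1, dim2))

w = np.zeros((dim, dim), dtype=np.float32)  # stand-in for k_proj

# Correct for Llama 1: dim1 == dim == 4096 -> view (32, 64, 2, 4096), works.
out = permute(w, n_heads, dim, dim)
assert out.shape == (dim, dim)

# Buggy path: dim1 = dim // num_key_value_heads = 4096 // 32 = 128
# -> view (32, 2, 2, 4096) has 1_048_576 slots for 16_777_216 elements,
# which is exactly the reported RuntimeError.
try:
    permute(w, n_heads, dim // n_heads, dim)
except ValueError as e:
    print("reshape fails:", e)
```

In other words, for a model without grouped-query attention, dividing dim by the number of key/value heads shrinks dim1 when it should stay equal to dim.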

@amyeroberts
Collaborator

cc @ArthurZucker

@ZHEGG

ZHEGG commented May 9, 2024

Same question!

@ArthurZucker
Collaborator

Haha, that's annoying. We might have broken the conversion for Llama 1 when adding Llama 3 support.
Could you test with transformers==4.38 or 4.39?

@RPC2

RPC2 commented May 10, 2024

I encountered the same error when converting the Llama 2 model. Downgrading to transformers==4.38 solved it.

@ArthurZucker
Collaborator

Yep, it's not expected. I'll open a PR to fix conversion on all models 🤗

@yoryis

yoryis commented May 16, 2024

Any update on this issue, please?

@ArthurZucker
Collaborator

If you just need to convert the old Llama models, you can quickly revert to a previous version of transformers (prior to 4.39.0) 😉
