
error when converting llama1 ckpts to hf format #30723

Open
2 of 4 tasks
a157801 opened this issue May 9, 2024 · 7 comments
Comments

@a157801

a157801 commented May 9, 2024

System Info

  • transformers version: 4.41.0.dev0
  • Platform: Linux-4.18.0-425.3.1.el8.x86_64-x86_64-with-glibc2.17
  • Python version: 3.9.12
  • Huggingface_hub version: 0.21.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.21.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

python transformers/models/llama/convert_llama_weights_to_hf.py --input_dir path-to-llama1-7b-source --model_size 7B --output_dir path-to-llama1-7b-target --llama_version 1

Expected behavior

The script fails with:

RuntimeError: shape '[32, 2, 2, 4096]' is invalid for input of size 16777216

at the line

dim1=dim // num_local_key_value_heads,

The k_proj weight in Llama 1 7B is 4096 × 4096, but dim1 here comes out to 128. This looks like a bug in the Llama 1 conversion path.
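For reference, here is a minimal NumPy sketch of the failing reshape. The real script operates on torch tensors, and this `permute` is a simplified reconstruction of the rotary-weight permutation in `convert_llama_weights_to_hf.py`, not its exact code; the size numbers match the error above:

```python
import numpy as np

dim = 4096    # hidden size of Llama 1 7B
n_heads = 32  # attention heads; Llama 1 has no GQA, so kv heads == heads

def permute(w, n_heads, dim1, dim2):
    # Un-interleave the rotary halves: view as (heads, dim1/heads/2, 2, dim2),
    # swap the two middle axes, then flatten back to (dim1, dim2).
    return (w.reshape(n_heads, dim1 // n_heads // 2, 2, dim2)
             .transpose(0, 2, 1, 3)
             .reshape(dim1, dim2))

w = np.zeros((dim, dim), dtype=np.float32)  # stand-in for k_proj

# Correct for Llama 1: dim1 == dim == 4096 -> view (32, 64, 2, 4096), works.
out = permute(w, n_heads, dim, dim)
assert out.shape == (dim, dim)

# Buggy path: dim1 = dim // num_key_value_heads = 4096 // 32 = 128
# -> view (32, 2, 2, 4096) has 1_048_576 slots for 16_777_216 elements,
# which is exactly the reported RuntimeError.
try:
    permute(w, n_heads, dim // n_heads, dim)
except ValueError as e:
    print("reshape fails:", e)
```

In other words, for a model without grouped-query attention, dividing dim by the number of key/value heads shrinks dim1 when it should stay equal to dim.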

@amyeroberts
Collaborator

cc @ArthurZucker

@ZHEGG

ZHEGG commented May 9, 2024

Same question!

@ArthurZucker
Collaborator

Haha, that's annoying. We might have broken the conversion for Llama 1 when adding Llama 3 support.
Could you test with transformers==4.38 or 4.39?

@RPC2

RPC2 commented May 10, 2024

I encountered the same error when converting the Llama 2 model. Downgrading to transformers==4.38 solved it.

@ArthurZucker
Collaborator

Yep, it's not expected. I'll open a PR to fix conversion on all models 🤗

@yoryis

yoryis commented May 16, 2024

Any update on this issue, please?

@ArthurZucker
Collaborator

If you just need to convert the old Llama models, you can quickly revert to a previous version of transformers (prior to 4.39.0) 😉
