Meet problems when I use the file src/transformers/models/llama/convert_llama_weights_to_hf.py to transfer LlaMa-7B #30734

wwxxyy1996 · 2024-05-09T20:26:53Z

System Info

When I use convert_llama_weights_to_hf.py to convert the LLaMA-7B weights to the Hugging Face Transformers format, an error is raised at line 154: "RuntimeError: shape '[32, 2, 2, 4096]' is invalid for input of size 16777216".
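As a sanity check on the numbers in the error message, here is a minimal sketch in plain Python (no PyTorch needed) that compares the element count of the requested shape with the reported input size. The helper `view_is_valid` is hypothetical and only for illustration. The last two checks assume the tensor is a square 4096 x 4096 weight split across 32 heads, which is consistent with the reported size but is not confirmed by the traceback.

```python
from math import prod

# Values copied from the error message.
requested_shape = (32, 2, 2, 4096)
input_size = 16_777_216

# Number of elements the requested shape would hold.
requested_size = prod(requested_shape)


def view_is_valid(shape, numel):
    """Hypothetical helper: a view is only possible when element counts match."""
    return prod(shape) == numel


print(requested_size)                              # 524288
print(view_is_valid(requested_shape, input_size))  # False

# Assumption: the input is a square 4096 x 4096 weight matrix.
print(input_size == 4096 * 4096)                   # True

# Assumption: a per-head split of the form (heads, rows_per_head // 2, 2, dim)
# with 32 heads, so the second dimension would be 4096 // 32 // 2 = 64.
assumed_shape = (32, 4096 // 32 // 2, 2, 4096)
print(view_is_valid(assumed_shape, input_size))    # True
```

The requested shape holds 32 times fewer elements than the input, so the second dimension (2) looks too small for a tensor of this size. Under the assumptions above it would be 64.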

My command
python convert_llama_weights_to_hf.py --input_dir ./ --model_size 7B --output_dir ./7B_hf

Here is my environment:
cuda-11.7, gcc-10.2.0, torch==2.0.0+cu117, torchvision==0.15.1+cu117, openai==0.27.8, transformers==4.41.0.dev0

Thank you very much!!

Who can help?

@ArthurZucker @pcuenca @Xe

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Run the command below to convert the model. The checkpoint file (consolidated.00.pth) is 13 GB.

python convert_llama_weights_to_hf.py --input_dir ./ --model_size 7B --output_dir ./7B_hf

Expected behavior

The conversion should finish without errors. Could you give me some suggestions?


ArthurZucker · 2024-05-10T06:09:18Z

Hey! This seems to be a duplicate of #30723, I'll check it out asap 😉

wwxxyy1996 · 2024-05-10T18:06:08Z

Thank you very much for your answer! It is very important to me. Have a good weekend. Best wishes, Xinyi
