RuntimeError: mat1 and mat2 shapes cannot be multiplied #2

Open
dsheng opened this issue Jun 1, 2023 · 3 comments

@dsheng

dsheng commented Jun 1, 2023

I am trying to train on a 12 GB GPU with: python qlora.py --model_name="chinese_alpaca" --model_name_or_path="./model_hub/chinese-alpaca-7b" --trust_remote_code=False --dataset="msra" --source_max_len=128 --target_max_len=64 --do_train --save_total_limit=1 --padding_side="right" --per_device_train_batch_size=8 --do_eval --bits=4 --save_steps=10 --gradient_accumulation_steps=1 --learning_rate=1e-5 --output_dir="./output/alpaca/" --lora_r=8 --lora_alpha=32
It fails with:
File "/mnt/data1ts/llm/training/qlora-chinese-LLM/qlora.py", line 1012, in
train()
File "/mnt/data1ts/llm/training/qlora-chinese-LLM/qlora.py", line 973, in train
train_result = trainer.train(resume_from_checkpoint=checkpoint_dir)

result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1536x4096 and 1x8388608)
What might be causing this? Thanks.

@taishan1994
Owner


The peft version needs to be 0.4.0.dev0.
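
The shapes in the error message can be checked with a little arithmetic. The sketch below assumes a hidden size of 4096 (typical for a 7B LLaMA-style model) and assumes that the 4-bit weight is stored packed, two values per byte, in a flat tensor. Neither assumption is stated in this thread.

```python
# Sanity-check the shapes reported in the RuntimeError.
# Assumptions (not confirmed in the thread): hidden size 4096, and 4-bit
# weights packed two per byte into a flat tensor.
hidden = 4096
batch_size = 8        # --per_device_train_batch_size=8
seq_len = 128 + 64    # --source_max_len=128 plus --target_max_len=64

mat1_rows = batch_size * seq_len      # token rows fed into the linear layer
packed_elems = hidden * hidden // 2   # bytes needed for a packed 4096x4096 weight

print(mat1_rows, packed_elems)  # 1536 8388608
```

Both numbers match the error (1536x4096 and 1x8388608). If the assumptions hold, mat2 is a still-packed quantized weight reaching a plain F.linear call, which would be consistent with a peft build that does not handle 4-bit layers.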

@dsheng
Author

dsheng commented Jun 2, 2023

Thanks, that solved it. No more problems.

@zlh1992

zlh1992 commented Jun 7, 2023

/opt/conda/envs/tch/lib/python3.9/site-packages/peft/tuners/lora.py:619 in forward │
│ │
│ 616 │ │ │ │ self.unmerge() │
│ 617 │ │ │ result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self. │
│ 618 │ │ elif self.r[self.active_adapter] > 0 and not self.merged: │
│ ❱ 619 │ │ │ result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self. │
│ 620 │ │ │ │
│ 621 │ │ │ x = x.to(self.lora_A[self.active_adapter].weight.dtype) │
│ 622 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1600x4096 and 1x8388608)

peft 0.4.0.dev0
bitsandbytes 0.39.0
deepspeed 0.9.3
transformers 4.30.0.dev0
I still hit this peft error with the versions above.
How can I resolve it with this configuration?
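
One thing that may be worth ruling out is a mismatch between the listed versions and what the running interpreter actually resolves, for example when several environments are installed. The sketch below uses only the standard library. The helper name `installed_versions` is hypothetical and the package list is taken from this comment. Note that a `.dev0` version string from a source install can correspond to many different commits, so a matching string does not by itself guarantee identical code.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    names = ["peft", "bitsandbytes", "deepspeed", "transformers"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this with the same interpreter that launches qlora.py (here /opt/conda/envs/tch, per the traceback) shows which versions that environment really sees.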
