
Prediction code #105

Open
qaqrt opened this issue Sep 28, 2023 · 2 comments

Comments

@qaqrt

qaqrt commented Sep 28, 2023

Running merge_lora raises an error: RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:

copying a param with shape torch.Size([3072, 8, 1]) from checkpoint, the shape in current model is torch.Size([4608, 8])

Why does 4608 appear?
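For context: 4608 is the output width of ChatGLM2-6B's fused query_key_value projection under multi-query attention, so a shape mismatch here suggests the LoRA checkpoint was produced against a different model version or LoRA configuration than the one being loaded. A minimal sketch of the arithmetic, assuming the stock ChatGLM2-6B config values (hidden_size=4096, 32 attention heads, multi_query_group_num=2):

```python
# Back-of-the-envelope check (assumed ChatGLM2-6B config values):
hidden_size = 4096
num_attention_heads = 32
multi_query_group_num = 2  # multi-query attention: K/V shared across 2 groups

head_dim = hidden_size // num_attention_heads    # 128
q_size = hidden_size                             # 4096 (one query per head)
kv_size = 2 * multi_query_group_num * head_dim   # 512  (K and V, one per group)

print(q_size + kv_size)  # 4608 -> fused query_key_value output dimension
```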

@qaqrt
Author

qaqrt commented Sep 28, 2023

```
Generation config file not found, using a generation config created from the model config.
07/07/2023 16:36:35 - INFO - utils.common - Fine-tuning method: LoRA
Traceback (most recent call last):
  File "……/ChatGLM-Efficient-Tuning/src/train_sft.py", line 105, in <module>
    main()
  File "……/ChatGLM-Efficient-Tuning/src/train_sft.py", line 25, in main
    model, tokenizer = load_pretrained(model_args, finetuning_args, training_args.do_train, stage="sft")
  File "……/ChatGLM-Efficient-Tuning/src/utils/common.py", line 244, in load_pretrained
    model = init_adapter(model, model_args, finetuning_args, is_trainable)
  File "……/ChatGLM-Efficient-Tuning/src/utils/common.py", line 117, in init_adapter
    model = PeftModel.from_pretrained(model, checkpoint)
  File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/peft_model.py", line 181, in from_pretrained
    model.load_adapter(model_id, adapter_name, **kwargs)
  File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/peft_model.py", line 376, in load_adapter
    set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
  File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 123, in set_peft_model_state_dict
    model.load_state_dict(peft_model_state_dict, strict=False)
  File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.2.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.2.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.3.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.3.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.4.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.4.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.5.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.5.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.6.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.6.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.7.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.7.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.8.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.8.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.9.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.9.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.10.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.10.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.11.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.11.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.12.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.12.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.13.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.13.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.14.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.14.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.15.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.15.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.16.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.16.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.17.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.17.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.18.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.18.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.19.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.19.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.20.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.20.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.21.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.21.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.22.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.22.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.23.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.23.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.24.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.24.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.25.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.25.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.26.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.26.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
```

@liucongg
Owner

When training with LoRA, is the parameter dimension lora_dim consistent (i.e., the same rank at training and loading time)?
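To check this, one can compare the rank recorded in the saved adapter against the saved weights themselves. A minimal sketch, assuming the adapter was saved by PEFT as adapter_config.json plus adapter_model.bin (newer PEFT versions save adapter_model.safetensors instead), with checkpoint_dir as a hypothetical path:

```python
import json
import os

import torch

# Hypothetical path to the saved LoRA checkpoint directory (adjust to your setup).
checkpoint_dir = "output/checkpoint-1000"

# PEFT records the rank used at training time in adapter_config.json under "r".
with open(os.path.join(checkpoint_dir, "adapter_config.json")) as f:
    adapter_config = json.load(f)
print("trained LoRA rank (r):", adapter_config["r"])

# Cross-check against the saved weights: lora_A has shape [r, in_features].
state_dict = torch.load(
    os.path.join(checkpoint_dir, "adapter_model.bin"), map_location="cpu"
)
for name, tensor in state_dict.items():
    if "lora_A" in name:
        print(name, tuple(tensor.shape))  # first dim should equal r
        break
```

If the r recorded in the checkpoint differs from the lora_dim used when rebuilding the model for merging or inference, PEFT creates LoRA matrices of the wrong size and load_state_dict fails exactly as in the traceback above.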
