
Fine-tuned a model with LoRA in bfloat16; when running inference with the base model plus the LoRA adapter, results differ before and after calling merge_and_unload(). Why does this happen? #1168

shaojh1 opened this issue Mar 24, 2024 · 3 comments


shaojh1 commented Mar 24, 2024

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model in bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto")

# Attach the LoRA adapter.
model = PeftModel.from_pretrained(
    model,
    adapter_path,
    torch_dtype=torch.bfloat16,
    device_map="auto")

# Fold the adapter weights into the base model.
model = model.merge_and_unload()
model.bfloat16()
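To make the comparison concrete, here is a minimal numpy sketch (not the actual peft code; all shapes and values are toy assumptions) of what merge_and_unload() does to one LoRA linear layer: the merged weight W + (alpha/r)·(B@A) is mathematically equivalent to the base-plus-adapter path, so the two should agree up to floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 64, 8                  # hidden size and LoRA rank (arbitrary toy values)
alpha = 16                    # lora_alpha; LoRA scaling = alpha / r
scale = alpha / r

W = rng.standard_normal((d, d)).astype(np.float32)          # frozen base weight
A = rng.standard_normal((r, d)).astype(np.float32) * 0.01   # LoRA down-projection
B = rng.standard_normal((d, r)).astype(np.float32) * 0.01   # LoRA up-projection
x = rng.standard_normal((4, d)).astype(np.float32)          # a small input batch

# Unmerged path: base output plus the adapter's low-rank correction.
y_adapter = x @ W.T + (x @ A.T) @ B.T * scale

# Merged path: fold the correction into the weight first,
# which is what merge_and_unload() does.
W_merged = W + scale * (B @ A)
y_merged = x @ W_merged.T

# Mathematically identical, but the operation order differs,
# so the results agree only up to rounding error.
print(np.max(np.abs(y_adapter - y_merged)))
```

The remaining difference is pure floating-point noise; in bfloat16, with far fewer mantissa bits than float32, it is correspondingly larger.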

Expected Behavior

In principle, inference results should be identical before and after calling the merge_and_unload() method.

Steps To Reproduce

No response

Environment

- OS: Ubuntu 18.04.6
- Python: 3.10
- Transformers: 4.32.0
- PyTorch: 2.0.1
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 11.4

Anything else?

No response

Contributor

jklj077 commented Mar 25, 2024

For accurate comparison of results between the two models, please adhere to these guidelines:

  1. Ensure that both models have the do_sample hyperparameter set to False in generation. This will guarantee that the models generate outputs deterministically rather than randomly sampling possible sequences.

  2. Be aware that minor discrepancies may exist in the output results because of inherent variations in floating-point arithmetic operations. Since the computational process diverges before and after adapter merging, this can lead to subtle differences in the final outcomes despite identical inputs.
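Point 2 can be seen even with plain Python floats: floating-point addition is not associative, so regrouping the same terms, as happens when the computation graph changes after merging, shifts the low-order bits of the result.

```python
# Floating-point addition is not associative: the same three terms,
# grouped differently, round to different results.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)   # False
print(a, b)     # 0.6000000000000001 0.6
```

In a deep network, billions of such tiny per-operation differences accumulate and can occasionally flip a greedy-decoding token choice, even with do_sample=False.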

If you have encountered substantial differences, steps and inference examples to reproduce the problem are welcome.


tyh4521 commented Apr 16, 2024

I ran into the same problem.

After fine-tuning, inference with the adapter behaves as expected,
but the model produced by merge_and_unload loses the fine-tuning effect.

Both modes already set do_sample=False and num_beams=1.
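One possible cause worth ruling out here (an assumption, not a confirmed diagnosis of this report): when the merge is computed and stored in low precision, a small LoRA delta can be rounded away entirely when added to a much larger base weight. The sketch below uses numpy's float16 as a stand-in, since numpy has no bfloat16; real bfloat16 has even fewer mantissa bits (8 vs. 10), so it loses more.

```python
import numpy as np

# float16 stands in for bfloat16 (numpy has no bf16 dtype; bf16 is worse).
w = np.float16(1.0)        # a base weight entry
delta = np.float16(1e-4)   # a small LoRA update to that entry

merged = w + delta         # computed and stored in half precision
print(merged == w)         # True: the update was rounded away entirely

# In float32 the same update survives:
merged32 = np.float32(1.0) + np.float32(1e-4)
print(merged32 == np.float32(1.0))   # False
```

If many LoRA deltas are small relative to the base weights, merging in bfloat16 can noticeably weaken the adapter's effect; merging in float32 and only then casting for inference avoids this particular loss.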


This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
