
After fine-tuning, llama_factory's vLLM deployment and Qwen's official vLLM deployment return different results #1241

Closed · 2 tasks done

lxb0425 opened this issue May 8, 2024 · 2 comments


lxb0425 commented May 8, 2024

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

After fine-tuning successfully with llama_factory, I deployed the resulting model in two ways: with llama_factory's built-in vLLM serving, and with the vLLM setup recommended in Qwen's official documentation. The two deployments return different results.
The llama_factory vLLM deployment always responds normally and has never had a problem.
The deployment following Qwen's official vLLM instructions always has issues: the reply quality is very poor, almost nonsensical, as in the screenshot below.
(screenshot: garbled replies from the officially-documented vLLM deployment)

What could be the cause?

Expected Behavior

Both deployments should return normal responses.

Steps To Reproduce

No response

Environment

- OS:
- Python: 3.10
- Transformers:
- PyTorch:
- CUDA: 12.2
- vLLM: 0.3.3

Anything else?

No response

jklj077 (Contributor) commented May 9, 2024

Hi, can you clarify the difference between deploying with "vllm from llama_factory" and "vllm from Qwen's official documentation"?

Based on the shared screenshot, it appears that you are using a custom frontend. Since vLLM is not fully compatible with Qwen(1.0) models (it is unaware of the chat template and the stop token IDs), the frontend has to at least pass stop_token_ids to the API created by vLLM. Alternatively, you could use FastChat + vLLM as introduced in the README. If you are using Qwen1.5, plain vLLM should work fine.
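
For reference, here is a minimal sketch of what "passing stop_token_ids" can look like against vLLM's OpenAI-compatible server. The endpoint URL, the model name placeholder, and the token IDs (151643 for `<|endoftext|>`, 151645 for `<|im_end|>`) are assumptions based on the Qwen(1.0) tokenizer; verify them on your own checkpoint (e.g. via `tokenizer.eod_id` / `tokenizer.im_end_id`), and confirm your vLLM version accepts `stop_token_ids` as an extra request field.

```python
# Minimal sketch, assuming a Qwen(1.0) checkpoint served via
#   python -m vllm.entrypoints.openai.api_server --model <your-checkpoint>
# Because vLLM does not know Qwen(1.0)'s ChatML template or stop tokens,
# the client must build the prompt and pass stop_token_ids itself.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# ChatML prompt format used by Qwen(1.0) chat models.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHello!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

response = client.completions.create(
    model="<your-checkpoint>",  # must match the --model flag at launch
    prompt=prompt,
    max_tokens=256,
    extra_body={
        # Assumed Qwen(1.0) special-token IDs: <|endoftext|> = 151643,
        # <|im_end|> = 151645. Without these, generation runs past the
        # turn boundary and the output degenerates as in the screenshot.
        "stop_token_ids": [151643, 151645],
    },
)
print(response.choices[0].text)
```

Omitting the stop token IDs is the most common cause of the "almost nonsensical" replies described above: the model keeps generating past `<|im_end|>` and starts hallucinating further turns.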

jklj077 (Contributor) commented May 22, 2024

As Qwen1.0 is no longer actively maintained, we kindly ask you to migrate to Qwen1.5 and direct your related questions there. Thanks for your cooperation.

jklj077 closed this as not planned on May 22, 2024