You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, can you clarify on the difference between deployment of "vllm from llama_factory" and "vllm from Qwen's official documentation"?
Based on the shared screenshot, it appears that you are using a custom frontend. As vllm is not fully compatible with Qwen(1.0) models (unaware of the chat template and the stop token ids), the frontend has to at least pass stop_token_ids to the API created by vllm. Or, you could use fastchat+vllm as introduced in the README. If you are using Qwen1.5, plain vllm should work fine.
As Qwen1.0 is no longer actively maintained, we kindly ask to you migrate to Qwen1.5 and direct your related question there. Thanks for you cooperation.
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
大佬 我使用llama_factory 微调成功后
使用llama_factory 的vllm与使用qwen官方文档推荐的vllm方式部署
返回不一样
llama_factory vllm部署的返回都很正常 从没出过问题
千问官方vllm部署的 总是有些问题 回复的效果很差 几乎乱回答 如下图
大概什么原因啊
期望行为 | Expected Behavior
期望返回都很正常
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response
The text was updated successfully, but these errors were encountered: