Lyzin changed the title (Dec 26, 2023):
[BUG] Qwen-1.8-Chat: converted to f16 with llama.cpp, but the inference answers are garbled. Is 1.8B not yet supported in llama.cpp?
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
First convert the model to f16 using the llama.cpp project:
python3 convert-hf-to-gguf.py models/Qwen-1_8B-Chat/
Then run inference:
./main -m ./models/Qwen-1_8B-Chat/ggml-model-f16.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
But the answers are garbled and unreadable. Is 1.8B not supported for llama.cpp quantization?
I also tried converting to an int4 quantized version, and the answers were garbled as well.
期望行为 | Expected Behavior
The model should answer normally.
复现方法 | Steps To Reproduce
Clone the llama.cpp project
Download the Qwen-1_8B-Chat model
Convert the model to f16 precision
Quantize it to an int4 version and run inference
The inference output is garbled and unreadable
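The steps above can be sketched as a single script. This is a hedged reproduction sketch, not the reporter's exact commands: it assumes a llama.cpp checkout from late 2023, where the quantizer binary was named `quantize` and `q4_0` was a common int4 type; the path variables are illustrative.

```shell
#!/usr/bin/env bash
# Reproduction sketch for the garbled-output report, run from a llama.cpp checkout.
set -u

MODEL_DIR="models/Qwen-1_8B-Chat"
F16_GGUF="$MODEL_DIR/ggml-model-f16.gguf"
Q4_GGUF="$MODEL_DIR/ggml-model-q4_0.gguf"

# 1) Convert the Hugging Face checkpoint to an f16 GGUF.
if [ -f convert-hf-to-gguf.py ]; then
    python3 convert-hf-to-gguf.py "$MODEL_DIR"
fi

# 2) Quantize the f16 GGUF down to int4 (q4_0).
if [ -x ./quantize ]; then
    ./quantize "$F16_GGUF" "$Q4_GGUF" q4_0
fi

# 3) Run interactive ChatML inference against the quantized model.
if [ -x ./main ]; then
    ./main -m "$Q4_GGUF" -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
fi
```

Each step is guarded so the script is safe to run outside a llama.cpp checkout; the same commands apply to the f16 model by pointing `-m` at `$F16_GGUF` and skipping step 2.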
运行环境 | Environment
备注 | Anything else?
No response