This is probably just because V3 is a work in progress, but I wanted to make sure.

When trying to run Qwen 1.5 0.5B, it works with the V2 script, but when swapping to V3 I get a 404 Not Found:

```
dtype not specified for model. Using the default dtype: q8.
GET https://huggingface.co/Xenova/Qwen1.5-0.5B-Chat/resolve/main/onnx/model_quantized.onnx 404 (Not Found)
```
Hi there 👋 v3 will use the name `model` instead of `decoder_model_merged`, as the latter is the result of a legacy conversion process which created multiple versions of the model (with and without past-key-value inputs). So, this change isn't needed.

If you want to override the behaviour yourself, you can use the `model_file_name` option when loading the model.
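To make the naming concrete, here is a rough sketch (not the library's actual resolver) of how a v3-style loader could build the file URL from a base name plus a `dtype` suffix. The suffix map and the `{repo}/resolve/main/onnx/{name}{suffix}.onnx` pattern are assumptions inferred from the URLs quoted in this thread, and it illustrates how a `model_file_name` override would point back at the legacy file name:

```javascript
// Hypothetical sketch of v3-style model-file resolution.
// The suffix map and URL pattern are inferred from the URLs in this
// thread; they are not transformers.js internals.
const DTYPE_SUFFIX = {
  fp32: "",          // model.onnx
  fp16: "_fp16",     // model_fp16.onnx
  q8: "_quantized",  // model_quantized.onnx (the default seen above)
};

function resolveModelUrl(repo, dtype = "q8", modelFileName = "model") {
  const suffix = DTYPE_SUFFIX[dtype] ?? "";
  return `https://huggingface.co/${repo}/resolve/main/onnx/${modelFileName}${suffix}.onnx`;
}

// The default dtype (q8) reproduces the 404 URL from the question:
console.log(resolveModelUrl("Xenova/Qwen1.5-0.5B-Chat"));
// Overriding model_file_name targets the legacy file name instead:
console.log(resolveModelUrl("Xenova/Qwen1.5-0.5B-Chat", "q8", "decoder_model_merged"));
```

Under this scheme, a repo whose ONNX files still use the legacy `decoder_model_merged_*` names will 404 for every `dtype` until either the repo is renamed or `model_file_name` is overridden.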
Question
It seems V3 is looking for a file that was renamed 3 months ago: "Rename `onnx/model_quantized.onnx` to `onnx/decoder_model_merged_quantized.onnx`".
I've tried setting `dtype` to 16 and 32, which does change the URL it tries to get, but those URLs also do not exist :-D e.g.

https://huggingface.co/Xenova/Qwen1.5-0.5B-Chat/resolve/main/onnx/model_fp16.onnx

when using `dtype: 'fp16'`.

Is there something I can do to make V3 find the correct files?
(I'm still trying to find that elusive small model with a large context size to do document summarization with)