I see that you can load individual GGUF files from \models\LLavacheckpoints, but how do you add GPTQ models like https://huggingface.co/TheBloke/U-Amethyst-20B-GPTQ? GPTQ is much, much faster than GGUF. I'm currently loading them with https://github.com/Zuellni/ComfyUI-ExLlama-Nodes but would love to switch to VLM Nodes, since there are a lot more features in this node pack.

VLM Nodes supports AutoGPTQ when loading https://huggingface.co/internlm/internlm-xcomposer2-vl-7b-4bit, so I assume it could load other GPTQ models as well. I just have no idea what the directory structure should look like and/or whether I have to rename the safetensors to a specific filename.

Any help would be awesome!
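For reference, this is how AutoGPTQ itself loads a GPTQ checkpoint folder; a minimal sketch, assuming the auto-gptq and transformers packages are installed, and using the U-Amethyst repo above purely as an example (whether VLM Nodes layers its own naming convention on top is the open question here):

```python
# Minimal AutoGPTQ loading sketch (assumes auto-gptq + transformers installed).
# A GPTQ checkpoint folder from TheBloke typically contains:
#   config.json, quantize_config.json, tokenizer files, and a *.safetensors shard.
# AutoGPTQ resolves the safetensors file from the folder itself, so no renaming
# is needed at this level.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "TheBloke/U-Amethyst-20B-GPTQ"  # or a local folder with the files above

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",
    use_safetensors=True,
)

inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```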
Just be aware that this will probably require a specially compiled version of llama-cpp-python in order to utilize the GPU. It's doable, but a massive headache, at least on Windows.
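For anyone hitting that, a sketch of the usual workaround follows; the CMake flag has been renamed across llama-cpp-python releases, so treat the exact CMAKE_ARGS value as an assumption to check against the version you pin, and the model path below is hypothetical:

```python
# Reinstall llama-cpp-python with CUDA support compiled in, e.g.:
#   CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
# (newer releases renamed the flag; on Windows cmd, run `set CMAKE_ARGS=...` first)
from llama_cpp import Llama

llm = Llama(
    model_path="models/LLavacheckpoints/some-model.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers; silently stays on CPU if the wheel lacks CUDA
)
print(llm("Q: 2+2? A:", max_tokens=8)["choices"][0]["text"])
```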