I see that you can load individual GGUF files from \models\LLavacheckpoints, but how do you add GPTQ models like https://huggingface.co/TheBloke/U-Amethyst-20B-GPTQ? GPTQ is much, much faster than GGUF. I'm currently loading them with https://github.com/Zuellni/ComfyUI-ExLlama-Nodes but would love to switch to VLM Nodes, since there are a lot more features in this node pack.

VLM Nodes supports AutoGPTQ when loading https://huggingface.co/internlm/internlm-xcomposer2-vl-7b-4bit, so I assume it could load other GPTQ models as well. I just have no idea what the directory structure should look like and/or whether I have to rename the safetensors to a specific filename.

Any help would be awesome!
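For reference, this is how AutoGPTQ itself loads a GPTQ checkpoint folder; a minimal sketch, assuming the auto-gptq and transformers packages are installed, and using the U-Amethyst repo above purely as an example (whether VLM Nodes layers its own naming convention on top is the open question here):

```python
# Minimal AutoGPTQ loading sketch (assumes auto-gptq + transformers installed).
# A GPTQ checkpoint folder from TheBloke typically contains:
#   config.json, quantize_config.json, tokenizer files, and a *.safetensors shard.
# AutoGPTQ resolves the safetensors file from the folder itself, so no renaming
# is needed at this level.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "TheBloke/U-Amethyst-20B-GPTQ"  # or a local folder with the files above

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",
    use_safetensors=True,
)

inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```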
Just be aware that this will probably require a specially compiled version of llama-cpp-python in order to utilize the GPU. It's doable, but a massive headache, at least on Windows.
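For anyone hitting that, a sketch of the usual workaround follows; the CMake flag has been renamed across llama-cpp-python releases, so treat the exact CMAKE_ARGS value as an assumption to check against the version you pin, and the model path below is hypothetical:

```python
# Reinstall llama-cpp-python with CUDA support compiled in, e.g.:
#   CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
# (newer releases renamed the flag; on Windows cmd, run `set CMAKE_ARGS=...` first)
from llama_cpp import Llama

llm = Llama(
    model_path="models/LLavacheckpoints/some-model.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers; silently stays on CPU if the wheel lacks CUDA
)
print(llm("Q: 2+2? A:", max_tokens=8)["choices"][0]["text"])
```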