where can I download the predictor of Relu-Falcon-40B (float16)? #120
Comments
Yes. All predictors we published are in FP16. To use one with an FP16 model, you can convert the model and predictor into PowerInfer GGUF as mentioned in our README. If you want to run an INT4-quantized model + predictor, you can quantize the generated FP16 model, and the predictor will be quantized at the same time.
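If you want to sanity-check what a converted file actually contains, here is a minimal sketch using the `gguf` Python package (shipped in the repo's `gguf-py` directory); the file name is a placeholder:

```python
# Minimal sketch: list each tensor's dtype/quantization in a converted GGUF file.
# The path below is a placeholder for your actual output file.
from gguf import GGUFReader

reader = GGUFReader("falcon-40b-relu.powerinfer.gguf")
for tensor in reader.tensors:
    # tensor_type should report F16 before quantization, Q4_0 (etc.) afterwards
    print(tensor.name, tensor.tensor_type, tensor.shape)
```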
I converted the model and predictor of Falcon-40B into PowerInfer GGUF as mentioned in your README, and kept the directory layout as shown there. However, it fails with the following error:

```
failed to offload anything to GPU
ggml_cuda_set_main_device: using device 0 (NVIDIA A100 80GB PCIe) as main device
llm_load_gpu_split_with_budget: error: failed to generate gpu split
```
Can you confirm that your PyTorch version aligns with our `requirements.txt`?
Here is the relevant content of your `requirements.txt`: `numpy>=1.24.4`. Here are my package versions: `numpy 1.26.2`. A quick way to compare installed versions against the pins is sketched below.
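A minimal sketch for printing the installed versions of the packages discussed in this thread (only `numpy` is quoted above; `torch` is included because the PyTorch version was asked about):

```python
# Minimal sketch: report installed versions of the packages under discussion.
from importlib.metadata import version

for pkg in ("numpy", "torch"):
    print(pkg, version(pkg))
```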
I tested the code around the error shown below, and I believe it's some kind of PyTorch incompatibility.

```python
# Load and sort activation data for each layer
freq = torch.load(f"{activation_path}/activation_{i}.pt")
freq, _ = torch.sort(freq, descending=True)
```

We assumed `freq` is a tensor, and it is in our environment with PyTorch 2.1.2. But if PyTorch loaded it as something other than a tensor, the sort would fail.
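A small defensive check around that load can confirm whether the loaded object is actually a tensor; this is a sketch only, with placeholder values for `activation_path` and `i` matching the snippet above:

```python
import torch

# Placeholders standing in for the variables used in the snippet above.
activation_path = "./ReluFalcon-40B-Predictor"
i = 0

# Sketch: verify what torch.load actually returns before sorting.
obj = torch.load(f"{activation_path}/activation_{i}.pt")
if not isinstance(obj, torch.Tensor):
    raise TypeError(f"expected a Tensor, got {type(obj)}; file may be corrupted")
freq, _ = torch.sort(obj, descending=True)
```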
Hmmmmm. Were the activation files corrupted or manually renamed before? They should be in the same format for all model architectures. I would suggest you purge and redownload all of these files to make sure everything is clean and as expected.
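If it helps, one way to force a clean re-download is via `huggingface_hub`; a sketch assuming the predictor repo linked below, with a placeholder local path:

```python
# Sketch: force a clean re-download of the predictor files,
# bypassing any cached (possibly corrupted) copies.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="PowerInfer/ReluFalcon-40B-Predictor",
    local_dir="./ReluFalcon-40B-Predictor",  # placeholder
    force_download=True,
)
```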
Are the predictors at https://huggingface.co/PowerInfer/ReluFalcon-40B-Predictor for Relu-Falcon-40B (float16) or Relu-Falcon-40B (int4)? If they are int4, where can I download the predictors for Relu-Falcon-40B (float16)?