
not enough space in the buffer with long prompts #129

Open
3 tasks done
RachelShalom opened this issue Jan 23, 2024 · 2 comments
Labels: bug-unconfirmed (Unconfirmed bugs)

Comments

RachelShalom commented Jan 23, 2024

Prerequisites

Before submitting your question, please ensure the following:

  • I am running the latest version of PowerInfer. Development is rapid, and as of now, there are no tagged versions.
  • I have carefully read and followed the instructions in the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).

Question Details

I am trying to run llama-7b-relu.q4.powerinfer.gguf with the following command:
PowerInfer/build/bin/main -m ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "$PROMPT" -c 1000 --n-gpu-layers 20
The prompt is about 800 tokens. When I run this, I get:

not enough space in the buffer (needed 1409024, largest block available 1389584)
GGML_ASSERT: /home/user_name/PowerInfer/ggml-alloc.c:116: !"not enough space in the buffer"
[Thread debugging using libthread_db enabled]

With a small prompt (10-20 tokens) it works as expected. Any ideas on how to solve this?
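
The full error suggests the compute buffer sized by ggml-alloc.c is too small for this prompt. A sketch of what I plan to try next, assuming PowerInfer inherits the upstream llama.cpp flags -c/--ctx-size and -b/--batch-size, both of which feed into how the buffers are sized:

# Larger context window (-c 2048) and smaller batch (-b 256) to change the buffer sizing
PowerInfer/build/bin/main -m ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "$PROMPT" -c 2048 -b 256 --n-gpu-layers 20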

@RachelShalom added the question (Further information is requested) label Jan 23, 2024
hodlen (Collaborator) commented Jan 24, 2024

Hi @RachelShalom, could you test again on the latest main branch? I tested the same model with a prompt of about 1.9K tokens and a 2048-token context window, and it worked fine.

If the error persists, would you mind posting the entire error log? It will help us pinpoint the root cause.
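
To make sure you are on a fresh build, a pull-and-rebuild along these lines should be enough (a sketch, with paths adjusted for running from inside the repo; the -DLLAMA_CUBLAS=ON option assumes the CUDA build from the README, since you are offloading layers with --n-gpu-layers):

cd PowerInfer
git pull origin main
cmake -S . -B build -DLLAMA_CUBLAS=ON   # assumes a CUDA-enabled build
cmake --build build --config Release
./build/bin/main -m ../ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "$PROMPT" -c 2048 --n-gpu-layers 20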

@hodlen added the bug-unconfirmed (Unconfirmed bugs) label and removed the question (Further information is requested) label Jan 24, 2024
tusharsoni42909 commented

One workaround is to shorten the prompt, or use abbreviations, so it fits within the current buffer:

PROMPT="Shortened version of your prompt"
PowerInfer/build/bin/main -m ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "$PROMPT" -c 1000 --n-gpu-layers 20
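
If a manual rewrite is impractical, a rough shell-side truncation can do it (a sketch: it cuts by words, which only approximates tokens, so leave headroom under the context limit):

# Squeeze whitespace, then keep roughly the first 400 words of the prompt
PROMPT=$(printf '%s' "$PROMPT" | tr -s '[:space:]' ' ' | cut -d' ' -f1-400)
PowerInfer/build/bin/main -m ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "$PROMPT" -c 1000 --n-gpu-layers 20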
