Thread safety #499
-
Is llama.cpp thread safe? I have encountered some problems and weird issues when creating a ctx on one thread and then using it on another.
Replies: 3 comments 5 replies
-
See #370 (comment) tl;dr not yet, but it's a priority and parallel inference is on the roadmap.
-
It would be nice if llama.cpp were thread safe: h2oai/h2ogpt#1017
-
This also affects package stability in the special case of CPU-only inference. I get segfaults during concurrent inference attempts (using Streamlit) even on CPU-only machines, where these errors are especially easy to reproduce given how long inference takes on CPU alone. It can be reproduced in the official ... I know it is a bit of a narrow and specialized case, but maybe ...