Is there a script to run the benchmarks mentioned in the paper, especially for the LLaMA 7B and 13B models? Also, I cannot run llama-bench and am getting a model-loading error. Any leads?
We benchmarked performance simply by running the main binary with 8 threads and a fixed output length over the same set of prompts, then checking the printed performance stats. You can do the same on your end:
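A minimal sketch of such an invocation, assuming the llama.cpp-style flags that PowerInfer inherits (`-t` for thread count, `-n` for output token count, `-p` for the prompt); the model path and prompt below are placeholders, not the exact values used in the paper:

```shell
# Hypothetical benchmark run; /PATH/TO/MODEL and the prompt are placeholders.
# -t 8  : use 8 threads, matching the setup described above
# -n 128: generate a fixed number of output tokens so runs are comparable
./build/bin/main \
  -m /PATH/TO/MODEL \
  -t 8 \
  -n 128 \
  -p "Once upon a time"
```

The per-token timing summary printed at the end of the run (prompt eval and generation speed) is the figure to compare against the paper.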
However, please be aware that the current open-sourced version of PowerInfer may differ from the internal version evaluated in our paper, particularly in performance. For instance, we simplified certain features, such as fine-grained tensor offloading and the comprehensive FFN neuron offloading policy, to make the system easier to use for a broader range of users.