Benchmarks #121

Open
hasanar1f opened this issue Jan 15, 2024 · 2 comments
Labels
question Further information is requested

Comments

@hasanar1f

Do we have a script to run the benchmarks mentioned in the paper, especially for the llama 7B and 13B models? Also, I cannot run llama-bench; I am getting a model loading error. Any leads?

hasanar1f added the question label Jan 15, 2024
@hodlen
Collaborator

hodlen commented Jan 22, 2024

We benchmarked performance simply by running the main binary with 8 threads, the expected output length, and the same set of prompts. You can use this command and check the printed performance stats on your end:

./build/bin/main -m "$model" -n "$output_len" -p "$prompt" -t "8" --ignore-eos

However, please be aware that the current open-source version of PowerInfer may differ from the internal version evaluated in our paper, particularly in performance. For instance, we have simplified certain features, such as fine-grained tensor offloading and the comprehensive FFN neuron offloading policy, to make it easier to use for a broader range of users.
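
To run the command above over a whole set of prompts, it can be wrapped in a small loop. This is only a minimal sketch, not the script used for the paper: the names MAIN_BIN, THREADS, and bench_prompts are hypothetical, and the prompt file (one prompt per line) is an assumed format. The flags are exactly the ones from the command above.

```shell
# Hypothetical wrapper around the benchmark command quoted in this thread.
# MAIN_BIN and THREADS are illustrative variables, not PowerInfer options.
MAIN_BIN="${MAIN_BIN:-./build/bin/main}"
THREADS="${THREADS:-8}"

# bench_prompts MODEL OUTPUT_LEN PROMPT_FILE
# Runs one generation per non-empty line of PROMPT_FILE (assumed format:
# one prompt per line) and leaves the binary's own output untouched.
bench_prompts() {
  model="$1"
  output_len="$2"
  prompt_file="$3"
  while IFS= read -r prompt || [ -n "$prompt" ]; do
    # Skip blank lines in the prompt file.
    [ -z "$prompt" ] && continue
    # stdin is redirected so the binary cannot consume the prompt file.
    "$MAIN_BIN" -m "$model" -n "$output_len" -p "$prompt" -t "$THREADS" --ignore-eos < /dev/null
  done < "$prompt_file"
}
```

With a model path in "$model" and prompts saved to a file such as prompts.txt (a hypothetical name), calling bench_prompts "$model" 128 prompts.txt runs each prompt in turn; the performance stats are whatever the main binary prints for each run.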

@hodlen
Collaborator

hodlen commented Jan 22, 2024

> Also, I can not run the llama-bench and I am getting a model loading error. Any lead?

We haven't tested with llama-bench yet, and there may be incompatibilities. Would you mind sharing your error log so we can better pinpoint any bug?
