Is there a script to run the benchmarks mentioned in the paper, especially for the LLaMA 7B and 13B models? Also, I cannot run llama-bench and am getting a model-loading error. Any leads?
We benchmarked performance simply by running the main binary with 8 threads and a fixed output length over the same set of prompts, then checking the printed performance stats. You can do the same on your end:
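A minimal sketch of such an invocation, assuming the llama.cpp-style flags that PowerInfer inherits (`-t` for thread count, `-n` for output token count, `-p` for the prompt); the model path and prompt below are placeholders, not the exact values used in the paper:

```shell
# Hypothetical benchmark run; /PATH/TO/MODEL and the prompt are placeholders.
# -t 8  : use 8 threads, matching the setup described above
# -n 128: generate a fixed number of output tokens so runs are comparable
./build/bin/main \
  -m /PATH/TO/MODEL \
  -t 8 \
  -n 128 \
  -p "Once upon a time"
```

The per-token timing summary printed at the end of the run (prompt eval and generation speed) is the figure to compare against the paper.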
However, please be aware that the current open-sourced version of PowerInfer may differ from the internal version evaluated in our paper, particularly in performance. For instance, we simplified certain features, such as fine-grained tensor offloading and the comprehensive FFN neuron offloading policy, to make the system easier to use for a broader range of users.