1) Configure the YAML to run llama2 with input = 1K, then launch the all-in-one benchmark:
ipex-llm/python/llm/dev/benchmark/all-in-one$ ./run-arc.sh
2) Configure the YAML to run llama3-instruct with input = 1K, then launch the all-in-one benchmark.
The GPU hangs when converting the llama3-instruct model.
If step 2) is run first and then step 1), the GPU hangs as well.
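For reference, the two steps above amount to editing the benchmark's config.yaml between runs. A minimal sketch of that config (model IDs and field values are illustrative; field names follow the usual layout of the ipex-llm all-in-one benchmark, so verify against your copy):

```yaml
# config.yaml for the all-in-one benchmark (illustrative values)
repo_id:
  - 'meta-llama/Llama-2-7b-chat-hf'
  # - 'meta-llama/Meta-Llama-3-8B-Instruct'   # step 2: swap to llama3-instruct
in_out_pairs:
  - '1024-128'            # input = 1K tokens
test_api:
  - 'transformer_int4_gpu'
```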
I can't reproduce your error; it works fine on my machine.
If you want to run both of them, please make sure your transformers version is 4.37.x.
My Arc A770 is the 16GB version; how about yours?
You can also try test_api transformer_int4_fp16_gpu, which uses about 1.3GB less memory than transformer_int4_gpu.
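A quick way to check the 4.37.x requirement before launching the benchmark is to compare the major/minor components of the installed version string. A small sketch (the helper name is ours, not part of ipex-llm):

```python
def is_supported(version: str) -> bool:
    """Return True if a transformers version string is 4.37.x."""
    major, minor = (int(x) for x in version.split(".")[:2])
    return (major, minor) == (4, 37)

# In practice, pass transformers.__version__ from the installed package.
print(is_supported("4.37.2"))  # True
print(is_supported("4.31.0"))  # False
```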
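Switching APIs is a one-line change in the same config.yaml. A hedged sketch (assuming the test_api field used by the all-in-one benchmark):

```yaml
test_api:
  # - 'transformer_int4_gpu'       # original API
  - 'transformer_int4_fp16_gpu'    # fp16 variant, roughly 1.3GB less memory
```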