
GPU hang when switching between Llama2 and Llama3 on ARC770 #10852

Open
moutainriver opened this issue Apr 23, 2024 · 1 comment
@moutainriver

1) Configure the yaml to run Llama2 with input = 1K, then launch the all-in-one benchmark:
ipex-llm/python/llm/dev/benchmark/all-in-one$ ./run-arc.sh
2) Configure the yaml to run llama3-instruct with input = 1K, then launch the all-in-one benchmark again.
The GPU hangs while converting the llama3-instruct model.

If 2) is run first and then 1), the GPU hangs as well.
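For reference, the switch between the two models in the steps above comes down to editing the benchmark's config.yaml. A minimal sketch is below; the key names and model IDs are assumptions based on the all-in-one benchmark harness and should be checked against the config.yaml shipped in that directory:

```yaml
# Sketch of a config.yaml for ipex-llm/python/llm/dev/benchmark/all-in-one
# (key names and model IDs are assumptions; verify against the shipped config.yaml)
repo_id:
  - 'meta-llama/Llama-2-7b-chat-hf'   # for step 2), swap to the llama3-instruct model
local_model_hub: '/path/to/models'
warm_up: 1
num_trials: 3
in_out_pairs:
  - '1024-128'                        # input = 1K tokens
test_api:
  - 'transformer_int4_gpu'
```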

@qiuxin2012 qiuxin2012 self-assigned this Apr 23, 2024
@qiuxin2012
Contributor

I can't reproduce your error; it works fine on my machine.
If you want to run both of them, please make sure your transformers version is 4.37.x.
My Arc770 is the 16GB version; how much memory does yours have?
You can also try test_api transformer_int4_fp16_gpu, which uses about 1.3GB less memory than transformer_int4_gpu.
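Concretely, the suggested memory saving amounts to changing the test_api entry in config.yaml, along these lines (the key name is an assumption consistent with the benchmark's config format):

```yaml
# Switch the benchmark API to the fp16 variant to reduce GPU memory use
# (per the comment above, roughly 1.3GB less than transformer_int4_gpu)
test_api:
  - 'transformer_int4_fp16_gpu'
```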
