CUDA memory keeps growing until an out-of-memory error #789
Comments
@sevenandseven, which reranker do you use?
bge-reranker-large, bge-reranker-base, bge-reranker-v2-m3, bge-reranker-v2-gemma, bge-reranker-v2-minicpm-layerwise.
You can reduce the
I encountered this situation during inference, without the above parameters.
@sevenandseven, you can pass
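The maintainer's suggestion above is truncated, but a common mitigation is to pass a smaller batch size (and optionally fp16) when scoring. The sketch below assumes FlagEmbedding's `FlagReranker.compute_score` interface; the parameter names and the model ID are assumptions to verify against your installed version, and the `try/except` keeps the sketch runnable when the library is absent.

```python
# Hedged sketch: reduce peak GPU memory by scoring with a smaller batch
# size. The FlagReranker API and parameter names below are assumptions;
# check them against your installed FlagEmbedding version.
scores = None
try:
    from FlagEmbedding import FlagReranker  # assumed import path

    # use_fp16=True roughly halves activation memory on GPU (assumption).
    reranker = FlagReranker("BAAI/bge-reranker-large", use_fp16=True)
    scores = reranker.compute_score(
        [["what is a panda?", "The giant panda is a bear native to China."]],
        batch_size=16,  # smaller batches -> lower peak GPU memory
    )
    print(scores)
except ImportError:
    # Illustrative fallback so the sketch runs without the library.
    print("FlagEmbedding not installed; sketch only.")
```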
OK, thanks.
Hello, I am using the officially provided method of loading the reranker to perform similarity calculations. During the calculation, I found that after GPU memory usage stabilizes for a while, it gradually increases until no memory is left.
How can I solve this problem?
I tried calling torch.cuda.empty_cache(), but it did not help much and the GPU still ran out of memory.
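One way to keep peak memory bounded, regardless of the reranker used, is to score the query-passage pairs in fixed-size batches and run any cleanup (such as `torch.cuda.empty_cache()`) after each batch rather than once at the end. The sketch below is a minimal, library-agnostic helper; `score_fn` is a hypothetical stand-in for the reranker's scoring call, and the dummy scorer exists only so the example runs.

```python
# Minimal sketch: bound peak memory by scoring in fixed-size batches.
# `score_fn` is a placeholder for a real reranker call (e.g. one wrapped
# in torch.no_grad()); `after_batch` is an optional cleanup hook such as
# torch.cuda.empty_cache. Both names are illustrative assumptions.

def score_in_batches(pairs, score_fn, batch_size=32, after_batch=None):
    """Score `pairs` batch by batch, running a cleanup hook between batches."""
    scores = []
    for start in range(0, len(pairs), batch_size):
        batch = pairs[start:start + batch_size]
        scores.extend(score_fn(batch))
        if after_batch is not None:
            after_batch()  # e.g. torch.cuda.empty_cache()
    return scores

# Dummy scorer for illustration only; real code would call the reranker.
pairs = [("query", f"passage {i}") for i in range(5)]
out = score_in_batches(pairs, lambda b: [float(len(p)) for _, p in b],
                       batch_size=2)
print(out)
```

If memory still grows across batches, it usually means tensors (or the autograd graph) are being retained between calls; making sure scoring happens under `torch.no_grad()` and that no per-batch outputs keep GPU references is typically more effective than `empty_cache()` alone.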