CUDA memory keeps growing until an out-of-memory error #789
Comments
@sevenandseven, which reranker do you use?
bge-reranker-large, bge-reranker-base, bge-reranker-v2-m3, bge-reranker-v2-gemma, bge-reranker-v2-minicpm-layerwise.
You can reduce the
I encountered this situation during inference, without the above parameters.
@sevenandseven, you can pass
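The maintainer's suggestion above is truncated, but a common mitigation is to pass a smaller batch size (and optionally fp16) when scoring. The sketch below assumes FlagEmbedding's `FlagReranker.compute_score` interface; the parameter names and the model ID are assumptions to verify against your installed version, and the `try/except` keeps the sketch runnable when the library is absent.

```python
# Hedged sketch: reduce peak GPU memory by scoring with a smaller batch
# size. The FlagReranker API and parameter names below are assumptions;
# check them against your installed FlagEmbedding version.
scores = None
try:
    from FlagEmbedding import FlagReranker  # assumed import path

    # use_fp16=True roughly halves activation memory on GPU (assumption).
    reranker = FlagReranker("BAAI/bge-reranker-large", use_fp16=True)
    scores = reranker.compute_score(
        [["what is a panda?", "The giant panda is a bear native to China."]],
        batch_size=16,  # smaller batches -> lower peak GPU memory
    )
    print(scores)
except ImportError:
    # Illustrative fallback so the sketch runs without the library.
    print("FlagEmbedding not installed; sketch only.")
```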
OK, thanks.
Hello, I am using the officially provided method of loading the reranker to perform similarity calculations. During the calculation, I found that after GPU memory usage stabilizes for a while, it gradually increases until no memory is left.
How can I solve this problem?
I tried calling torch.cuda.empty_cache(), but it did not help much and the GPU still ran out of memory.
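One way to keep peak memory bounded, regardless of the reranker used, is to score the query-passage pairs in fixed-size batches and run any cleanup (such as `torch.cuda.empty_cache()`) after each batch rather than once at the end. The sketch below is a minimal, library-agnostic helper; `score_fn` is a hypothetical stand-in for the reranker's scoring call, and the dummy scorer exists only so the example runs.

```python
# Minimal sketch: bound peak memory by scoring in fixed-size batches.
# `score_fn` is a placeholder for a real reranker call (e.g. one wrapped
# in torch.no_grad()); `after_batch` is an optional cleanup hook such as
# torch.cuda.empty_cache. Both names are illustrative assumptions.

def score_in_batches(pairs, score_fn, batch_size=32, after_batch=None):
    """Score `pairs` batch by batch, running a cleanup hook between batches."""
    scores = []
    for start in range(0, len(pairs), batch_size):
        batch = pairs[start:start + batch_size]
        scores.extend(score_fn(batch))
        if after_batch is not None:
            after_batch()  # e.g. torch.cuda.empty_cache()
    return scores

# Dummy scorer for illustration only; real code would call the reranker.
pairs = [("query", f"passage {i}") for i in range(5)]
out = score_in_batches(pairs, lambda b: [float(len(p)) for _, p in b],
                       batch_size=2)
print(out)
```

If memory still grows across batches, it usually means tensors (or the autograd graph) are being retained between calls; making sure scoring happens under `torch.no_grad()` and that no per-batch outputs keep GPU references is typically more effective than `empty_cache()` alone.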