
Two questions that I want to solve #167

Open
yeptttt opened this issue Mar 18, 2024 · 2 comments
Labels: question (Further information is requested)

Comments

yeptttt commented Mar 18, 2024

Hello, I have two pressing questions at the moment; could you help answer them?
1. GPU VRAM usage seems to hit an upper limit. Is it supported to raise the usable VRAM to 24GB to improve GPU utilization?
2. How can I submit multiple prompts at once for parallel inference?

yeptttt added the question label on Mar 18, 2024
hodlen (Collaborator) commented Apr 6, 2024

  1. Currently our implementation cannot estimate GPU memory usage very precisely, so a small amount of VRAM is wasted. If you want to fill the VRAM as much as possible, you can try setting --vram-budget to a value larger than the physical VRAM size, but one that still does not OOM under your workload.
  2. If you mean parallel inference with the same prompt, you can use examples/batched. For different prompts, --cont-batching in examples/server may help, but we do not recommend it: in our tests it produced clearly incorrect results.
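
For reference, a minimal command-line sketch of both suggestions. The binary and model paths, the prompt, and the positional arguments of the batched example are assumptions (modeled on upstream llama.cpp conventions), not confirmed by this thread; only --vram-budget and --cont-batching are named above, and --vram-budget is assumed to take a value in GB.

```bash
# Sketch only: paths, model file name, and prompt are placeholder assumptions.

# 1. Over-provision the VRAM budget slightly above physical VRAM (e.g. on a
#    24GB card, try a larger value), then back off if the workload OOMs.
./build/bin/main -m ./models/model.powerinfer.gguf -p "Once upon a time" \
  -n 128 --vram-budget 26

# 2a. Same prompt decoded in parallel: the batched example. The positional
#     argument order (model, prompt, n_parallel) follows upstream llama.cpp
#     and is an assumption here.
./build/bin/batched ./models/model.powerinfer.gguf "Once upon a time" 4

# 2b. Different prompts: the server with continuous batching enabled. Per the
#     comment above, this produced visibly wrong results in testing.
./build/bin/server -m ./models/model.powerinfer.gguf --cont-batching
```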

yeptttt (Author) commented Apr 11, 2024


Right now the first problem is that --vram-budget only lets me raise GPU VRAM to 11GB at most, but I would like to use more.
