[Feature] throw Turbomind error to python #1539
Hi @lijing1996, could you provide detailed information about the error, how it was triggered, and a minimal reproducible example? When the TurboMind engine reports an error, there are usually two situations. One is an unrecoverable error, such as an OOM, in which case the right behavior is simply to let the process crash. The other is an error that only affects a specific request, in which case letting that request fail and having the client retry is sufficient.
Regarding the first case: could the error be caught so that the model can be re-imported and re-loaded? I sometimes hit an OOM with a large batch size; with a small batch size the OOM goes away, but throughput is low.
In this situation, catching the error is pointless because it is a fatal error; the process should simply be allowed to crash so the problem is exposed. I also believe this is a bug that should be fixed. Could you provide detailed steps to reproduce it, including the model, request parameters, and the specific request content? For a program that runs on the server side for a long time, stability is very important, especially for Internet services.
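The per-request failure path suggested above (let the request fail and have the client retry) could look like this minimal sketch. Everything here is illustrative: `with_retries` and `request_fn` are hypothetical helpers, not part of lmdeploy's API.

```python
import time

def with_retries(request_fn, max_attempts=3, backoff_s=0.0):
    """Call request_fn, retrying on failure up to max_attempts times.

    request_fn stands in for a single inference request; any exception
    it raises is treated as a per-request failure worth retrying.
    """
    last_exc = None
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception as exc:  # a real client would narrow this type
            last_exc = exc
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise last_exc

# Example: a flaky call that fails twice, then succeeds on the third try.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient engine error")
    return "ok"

print(with_retries(flaky))  # → ok
```

With `max_attempts` exhausted, the last exception is re-raised, so a persistent failure still surfaces to the caller instead of being silently swallowed.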
It is just an OOM error. I use a VLM to caption lots of images, so I need to restart after a crash.
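For this restart-after-crash workflow, one common pattern is to run the captioning job in a child process and relaunch it when it dies, rather than trying to catch a fatal engine error in-process. A minimal supervisor sketch, assuming the worker is an ordinary script launched by command line (`worker_argv` and `run_with_restarts` are hypothetical names; the worker itself would be your lmdeploy captioning script, ideally resuming from a checkpoint of already-captioned images):

```python
import subprocess
import sys

def run_with_restarts(worker_argv, max_restarts=3):
    """Run a worker process, relaunching it whenever it exits non-zero.

    worker_argv is the command line of the captioning worker. A crashed
    (e.g. OOM-killed) worker is simply started again; the worker is
    responsible for skipping work it has already completed.
    """
    for restart in range(max_restarts + 1):
        result = subprocess.run(worker_argv)
        if result.returncode == 0:
            return restart  # number of restarts that were needed
    raise RuntimeError(f"worker kept failing after {max_restarts} restarts")

# Example usage (the script path is a placeholder):
# run_with_restarts([sys.executable, "caption_images.py"])
```

Pairing this with a smaller batch size after each restart would also bound the OOM risk at the cost of some throughput.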
Motivation
When Turbomind throws an error, it cannot be caught from Python, so the caller cannot recover and continue running.
Related resources
No response
Additional context
No response