CUDA out of memory #2807
@zkdfbb, can you share some details about your workflow/pipeline? Are you simply instantiating your model, passing it into QuantizationSimModel, computing encodings, and then using qsim.model(...) to run evaluation (a sketch of that typical flow follows below)? It would also help if you could provide some additional metrics.
Regarding the ONNX warning message: we have not yet looked into how to silence these warnings. We can mark it as a to-do item to improve the user experience.
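For context, the flow being asked about typically looks like the sketch below. This is a minimal illustration assuming the standard aimet_torch 1.x API; the toy model, input shape, and single-batch calibration are placeholders, not the reporter's actual code.

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

# Toy stand-in for the real network -- replace with your own model.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).cuda().eval()
dummy_input = torch.rand(1, 3, 224, 224).cuda()

sim = QuantizationSimModel(model, dummy_input=dummy_input)

# Calibration callback AIMET uses to compute quantization encodings.
def forward_pass(sim_model, _):
    with torch.no_grad():
        sim_model(dummy_input)  # in practice, loop over calibration batches

sim.compute_encodings(forward_pass, forward_pass_callback_args=None)

# Evaluation then runs through the quantization-simulated model.
with torch.no_grad():
    output = sim.model(dummy_input)
```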
@quic-klhsieh, you can try to reproduce with the code below. I have solved the problem: I used @torch.no_grad() to decorate the function that includes the model forward pass and post-processing. But I think there is still a problem with quantization-aware training: GPU memory grows so fast that it quickly becomes unusable.
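The reporter's original repro snippet is not preserved here. As a hedged illustration of the fix they describe, decorating the evaluation routine with torch.no_grad() keeps autograd from retaining intermediate activations; `evaluate` and the argmax post-processing are placeholder names, not the reporter's code:

```python
import torch

# Decorating the whole evaluation routine disables autograd graph
# construction, so intermediate activations are freed as soon as each
# layer finishes instead of being retained for a backward pass.
@torch.no_grad()
def evaluate(sim_model, data_loader):
    sim_model.eval()
    results = []
    for images in data_loader:
        outputs = sim_model(images.cuda())
        results.append(outputs.argmax(dim=1).cpu())  # placeholder post-processing
    return results
```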
@zkdfbb Glad to hear torch.no_grad() worked for you. Thank you for the code snippet; we can take a look at why the memory continues to increase with each iteration.
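Not AIMET-specific, but one common reason training-loop memory grows each iteration is accumulating loss tensors that still reference the autograd graph. A minimal sketch, reusing the `sim` object from the earlier snippet and assuming placeholder `train_loader`, criterion, and optimizer:

```python
import torch

# Placeholder training components -- substitute your own.
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(sim.model.parameters(), lr=1e-3)

total_loss = 0.0
for images, targets in train_loader:  # placeholder data loader
    optimizer.zero_grad()
    loss = criterion(sim.model(images.cuda()), targets.cuda())
    loss.backward()
    optimizer.step()
    # Accumulate a Python float: `total_loss += loss` would keep every
    # iteration's autograd graph (and its GPU tensors) alive.
    total_loss += loss.item()
```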
Hello, I want to evaluate my model after wrapping it in QuantizationSimModel, but I encountered a CUDA out of memory error. Normally, evaluating the model requires 7 GB of VRAM, but after quantization even 80 GB is not enough. How can I solve this problem?
My environment is as follows:
python: 3.8.10
pytorch: 2.2.0
aimet: 1.30.0
Another problem: when I export the quantized model to ONNX, there is a warning message:
[W shape_type_inference.cpp:1973] Warning: The shape inference of aimet_torch::CustomMarker type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
It seems to have no impact on the exported ONNX model, but I would still like to ask whether this message can be eliminated.
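One possible workaround (an assumption, not an AIMET feature): this warning is printed by PyTorch's C++ exporter, so Python-level `warnings` filters will not catch it, but redirecting the process-level stderr file descriptor around the export call does. Note that this also hides any genuine errors printed while the redirect is active.

```python
import contextlib
import os
import sys

@contextlib.contextmanager
def suppress_stderr():
    # Redirect the OS-level stderr file descriptor; Python's `warnings`
    # filters cannot silence messages printed by PyTorch's C++ code.
    stderr_fd = sys.stderr.fileno()
    saved_fd = os.dup(stderr_fd)
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, stderr_fd)
        yield
    finally:
        os.dup2(saved_fd, stderr_fd)
        os.close(devnull_fd)
        os.close(saved_fd)

# Hypothetical usage around the export call:
# with suppress_stderr():
#     sim.export('./export', 'quantized_model', dummy_input=dummy_input.cpu())
```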