How to release GPU memory after catching OutOfMemory error? #88

Open
WatanoK10 opened this issue Aug 2, 2023 · 0 comments

@WatanoK10

Hi everyone.
I have a question regarding the use of this model.

I ran text2img with the following code:

from kandinsky2 import get_kandinsky2
model = get_kandinsky2('cuda', task_type='text2img', model_version='2.1', use_flash_attention=False)
images = model.generate_text2img(
    "cat 4k", 
    num_steps=100,
    batch_size=2, 
    guidance_scale=4,
    h=400, w=600,
    sampler='p_sampler', 
    prior_cf_scale=4,
    prior_steps="5"
)

and got this error:

OutOfMemoryError                          Traceback (most recent call last)
Cell In[4], line 1
----> 1 images = model.generate_text2img(
      2     "cat 4k", 
      3     num_steps=100,
      4     batch_size=2, 
      5     guidance_scale=4,
      6     h=400, w=600,
      7     sampler='p_sampler', 
      8     prior_cf_scale=4,
      9     prior_steps="5"
     10 )

File ~/src/kandinsky/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/src/kandinsky/lib/python3.10/site-packages/kandinsky2/kandinsky2_1_model.py:341, in Kandinsky2_1.generate_text2img(self, prompt, num_steps, batch_size, guidance_scale, h, w, sampler, prior_cf_scale, prior_steps, negative_prior_prompt, negative_decoder_prompt)
    338     config["diffusion_config"]["timestep_respacing"] = str(num_steps)
    339 diffusion = create_gaussian_diffusion(**config["diffusion_config"])
--> 341 return self.generate_img(
    342     prompt=prompt,
...
     66 norm_f = self.norm_layer(f)
---> 67 new_f = norm_f * self.conv_y(zq) + self.conv_b(zq)
     68 return new_f

OutOfMemoryError: CUDA out of memory. Tried to allocate 280.00 MiB (GPU 0; 7.79 GiB total capacity; 7.31 GiB already allocated; 206.69 MiB free; 7.40 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

At this moment, GPU memory usage was 6890 MiB / 8192 MiB, as reported by nvidia-smi:

$ nvidia-smi
Wed Aug  2 23:11:21 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060 Ti     Off | 00000000:07:00.0 Off |                  N/A |
|  0%   35C    P8              10W / 200W |   6890MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1435481      C                            ~/bin/python3     6884MiB |
+---------------------------------------------------------------------------------------+
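
As a side note, the error message suggests setting max_split_size_mb. If I understand the PyTorch memory-management docs correctly, this goes through the PYTORCH_CUDA_ALLOC_CONF environment variable and has to be set before the CUDA allocator is initialized. Something like the sketch below (128 is just an example value, not a recommendation, and I have not verified that it helps with this model):

import os

# Must be set before the first CUDA allocation, so ideally
# before importing torch at all
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch
from kandinsky2 import get_kandinsky2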

After deleting the model and images variables, I get the same error when I redefine the model, and nvidia-smi shows no change in GPU memory usage. It seems the GPU memory is not released even after the variables are deleted, so I have to restart the Jupyter kernel to run again.
If you know how to release and re-allocate GPU memory without restarting the kernel, please let me know.
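
For reference, the cleanup I attempted was roughly the sketch below (reconstructed from memory; gc.collect() and torch.cuda.empty_cache() are the standard suggestions I am aware of, but I am not sure they are enough after an OOM raised inside generate_text2img):

import gc
import torch

# Drop the Python references to the pipeline (and to any images
# kept from an earlier successful run)
del model

gc.collect()              # run the garbage collector first
torch.cuda.empty_cache()  # release unused cached blocks back to the driver

# Check what PyTorch itself still holds on the GPU
print(torch.cuda.memory_allocated() // 2**20, "MiB allocated")
print(torch.cuda.memory_reserved() // 2**20, "MiB reserved")

One thing I have not ruled out: in Jupyter/IPython, the stored exception traceback (sys.last_traceback) and the output history can keep references to tensors alive, in which case del alone would not free anything. Could that be what is happening here?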

Finally, I want to express my gratitude to the development team and everyone else for their hard work and dedication.

Kind regards.

WatanoK10 changed the title from "How to free GPU memory after catching OutOfMemory error?" to "How to release GPU memory after catching OutOfMemory error?" on Aug 2, 2023