
Meta-Llama-3-70B-Instruct running out of memory on 8 A100-40GB #183

Open
whatdhack opened this issue May 3, 2024 · 2 comments

@whatdhack

Describe the bug

Out of memory. Tried to allocate X.XX GiB .....

Minimal reproducible example

Any A100-40GB system with 8 GPUs should reproduce this, I guess:

python example_chat_completion.py
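
(For context, the 70B model normally has to be launched through torchrun with model parallelism 8 rather than plain python. The invocation below follows the README's pattern as I recall it; the checkpoint and tokenizer paths are placeholder assumptions:)

```
torchrun --nproc_per_node 8 example_chat_completion.py \
    --ckpt_dir Meta-Llama-3-70B-Instruct/ \
    --tokenizer_path Meta-Llama-3-70B-Instruct/tokenizer.model \
    --max_seq_len 512 --max_batch_size 6
```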

Output

```
Out of memory. Tried to allocate X.XX GiB .....
```

Runtime Environment

  • Model: Meta-Llama-3-70B-Instruct
  • Using via huggingface?: no
  • OS: Linux
  • GPU VRAM: 40 GB
  • Number of GPUs: 8
  • GPU Make: Nvidia

Additional context
Is there a way to reduce the memory requirement? The most obvious trick, reducing the batch size, did not prevent the OOM.
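
For rough orientation, here is a back-of-the-envelope estimate of per-GPU memory at MP=8 in bf16, using the published 70B shape (80 layers, 8 KV heads, head_dim 128). This is an assumption-laden sketch, not a measurement:

```python
# Rough per-GPU memory for Llama-3-70B at model parallelism 8, bf16.
# Shape constants are from the published 70B config; everything else
# is an assumption.
params = 70e9                     # total parameters
bytes_per_param = 2               # bf16
mp = 8                            # model-parallel ranks

weights_per_gpu = params * bytes_per_param / mp / 2**30
print(f"weights per GPU: {weights_per_gpu:.1f} GiB")    # ~16.3 GiB

# KV cache: 80 layers, 8 KV heads (GQA) split across ranks, head_dim 128,
# K and V tensors, bf16, at the README's example generation settings.
layers, kv_heads, head_dim = 80, 8, 128
max_seq_len, max_batch_size = 512, 6
kv_bytes = (layers * (kv_heads / mp) * head_dim * 2 * bytes_per_param
            * max_seq_len * max_batch_size)
print(f"KV cache per GPU: {kv_bytes / 2**30:.2f} GiB")  # ~0.12 GiB
```

If these numbers are right, the ~16 GiB of weights per GPU nominally fits in 40 GB and the KV cache is tiny at these settings, which would explain why shrinking the batch does not help; presumably the peak happens during checkpoint loading and model initialization rather than steady-state decoding, though that is speculation.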

@whatdhack
Author

What is the best way to adapt the 70B model's 8 checkpoint shards (sized for A100-80GB/H100) to, say, 16 A100-40GB GPUs?
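
For what it's worth, one offline approach is to split each MP=8 shard in two along its model-parallel dimension. The sketch below is hypothetical, not something this repo ships, and assumes Meta's layer naming: column-parallel weights (wq/wk/wv/w1/w3/output) are sharded along dim 0, row-parallel and embedding weights (wo/w2/tok_embeddings) along dim 1. Big caveat: the 70B model has only 8 KV heads, and as far as I can tell Meta's attention code divides n_kv_heads by the model-parallel size, so MP=16 would additionally require replicating KV heads; the thread linked in the reply below is the better starting point.

```python
# reshard_mp8_to_mp16.py -- hypothetical sketch, not part of this repo.
import os
import torch

def shard_dim(name: str):
    """Dim along which a tensor is model-parallel-sharded, or None if replicated."""
    if any(k in name for k in ("wq.", "wk.", "wv.", "w1.", "w3.", "output.")):
        return 0  # ColumnParallelLinear: output features split across ranks
    if any(k in name for k in ("wo.", "w2.", "tok_embeddings.")):
        return 1  # RowParallelLinear / ParallelEmbedding: input dim split
    return None   # norms, rope freqs, etc. are replicated on every rank

os.makedirs("mp16", exist_ok=True)
for i in range(8):
    shard = torch.load(f"consolidated.{i:02d}.pth", map_location="cpu")
    halves = [{}, {}]
    for name, t in shard.items():
        d = shard_dim(name)
        if d is None:
            halves[0][name], halves[1][name] = t, t.clone()
        else:
            a, b = t.chunk(2, dim=d)  # naive split; see KV-head caveat above
            halves[0][name], halves[1][name] = a.clone(), b.clone()
    torch.save(halves[0], f"mp16/consolidated.{2*i:02d}.pth")
    torch.save(halves[1], f"mp16/consolidated.{2*i+1:02d}.pth")
```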

@subramen
Contributor

Please see this thread: #157 (comment)
