Support Dynamo-level caching #125958
Labels
module: startup-tracing-compile
Compilation mechanism or time spent in (re)compilation, tracing, startup
oncall: pt2
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 The feature, motivation and pitch
`torch.compile` can take on the order of seconds to compile a decently sized model such as Llama2 7B with an `aot_autograd`-enabled backend. Note that this only includes the `dynamo` + `aot_autograd` time; it does not include the backend compiler (e.g. inductor) compilation time. It would be ideal if dynamo could cache the `torch.compile` result to speed up development.

We (PyTorch/XLA) are trying to integrate with vLLM. @WoosukKwon reports that in vLLM's warm-up phase, it needs to pre-compile ~30 different input shape combinations. PyTorch/XLA does not support dynamic shapes today, so `torch.compile` keeps recompiling the model code, which slows down development (@WoosukKwon needs to wait 10 minutes before warm-up finishes). PyTorch/XLA already caches the XLA compilation, but `torch.compile` itself is pretty expensive; a minimal repro sketch of the pattern follows.
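A minimal sketch of the recompilation cost being described (the toy `nn.Linear` model, sizes, and batch list here are placeholders, not the actual vLLM/Llama2 setup): with `dynamic=False`, every new input shape triggers a fresh dynamo trace, which is exactly the work a dynamo-level cache could amortize across runs.

```python
# Sketch: each new batch size pays the full torch.compile cost again.
# Assumptions: CPU-only toy model standing in for a real workload.
import time
import torch

model = torch.nn.Linear(1024, 1024)

# dynamic=False mirrors a backend without dynamic-shape support:
# every new input shape specializes the graph and retraces.
compiled = torch.compile(model, dynamic=False)

for batch in (1, 2, 4, 8):  # stand-in for vLLM's ~30 shape combinations
    x = torch.randn(batch, 1024)
    start = time.perf_counter()
    compiled(x)  # first call per shape pays the full compile cost
    print(f"batch={batch}: {time.perf_counter() - start:.3f}s")
```

On a second run of the same script, all of this tracing is paid again from scratch, which is the gap a persistent dynamo-level cache would close.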
Alternatives
Reduce `torch.compile` time for a model where only the batch dimension changes.
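For contrast (not an option for PyTorch/XLA today, since it lacks dynamic-shape support), backends that do handle dynamic shapes can sidestep batch-dimension recompiles by marking that dimension dynamic. A sketch, again with a placeholder toy model:

```python
# Sketch: compile once for a dynamic batch dimension, reuse across shapes.
import torch
import torch._dynamo

model = torch.nn.Linear(1024, 1024)
compiled = torch.compile(model)

for batch in (1, 2, 4, 8):
    x = torch.randn(batch, 1024)
    torch._dynamo.mark_dynamic(x, 0)  # treat dim 0 (batch) as dynamic
    compiled(x)  # traced once; later batch sizes reuse the dynamic graph
```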
Additional context
cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang @yanboliang