
[Inductor] Flex attention supports dynamic shape #125994

Closed
wants to merge 6 commits

Conversation


@yanboliang (Contributor) commented on May 11, 2024:

static shapes perf

| Type    |   Speedup |   batch_size |   num_heads |   q_seq_len |   k_seq_len |   head_dim | score_mod   | dtype          |
|---------|-----------|--------------|-------------|-------------|-------------|------------|-------------|----------------|
| Average |     0.692 |              |             |             |             |            |             |                |
| Max     |     0.855 |           16 |          16 |        4096 |        4096 |         64 | head_bias   | torch.bfloat16 |
| Min     |     0.419 |            8 |          16 |         512 |         512 |        256 | noop        | torch.bfloat16 |

dynamic shapes perf

| Type    |   Speedup |   batch_size |   num_heads |   q_seq_len |   k_seq_len |   head_dim | score_mod     | dtype          |
|---------|-----------|--------------|-------------|-------------|-------------|------------|---------------|----------------|
| Average |     0.670 |              |             |             |             |            |               |                |
| Max     |     0.864 |           16 |          16 |        4096 |        4096 |         64 | relative_bias | torch.bfloat16 |
| Min     |     0.376 |            8 |          16 |         512 |         512 |        256 | relative_bias | torch.bfloat16 |
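
To make concrete what these rows measure, here is a minimal sketch of compiling flex attention with dynamic shapes. It is illustrative only: the `torch.nn.attention.flex_attention.flex_attention` import path reflects the later public API rather than the private one this PR targeted, and `relative_bias` is a simplified stand-in for the benchmarked score_mod.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention  # later public API


def relative_bias(score, b, h, q_idx, kv_idx):
    # Bias each attention score by the relative distance between positions.
    return score + (q_idx - kv_idx)


# dynamic=True requests symbolic shapes up front, which is what the
# "dynamic shapes perf" rows above exercise.
compiled_flex = torch.compile(flex_attention, dynamic=True)

B, H, D = 8, 16, 64
for seq_len in (512, 1024):
    q, k, v = (
        torch.randn(B, H, seq_len, D, device="cuda", dtype=torch.bfloat16)
        for _ in range(3)
    )
    out = compiled_flex(q, k, v, score_mod=relative_bias)
    print(seq_len, tuple(out.shape))
```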

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang


pytorch-bot bot commented May 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125994

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit 5014543 with merge base d7fe3c4:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@yanboliang marked this pull request as ready for review on May 11, 2024 05:51
@yanboliang added the topic: not user facing label on May 11, 2024
@Chillee (Contributor) left a comment:


Should run some benchmarks too.

Review threads (resolved):
- test/inductor/test_flex_attention.py (outdated)
- test/inductor/test_flex_attention.py
- torch/_inductor/kernel/flex_attention.py
@yanboliang (Contributor, Author) commented:

> Should run some benchmarks too.

Yea, benchmarking is on the way.

@yanboliang requested a review from Chillee on May 14, 2024 17:10
@yanboliang added the ciflow/trunk label (trigger trunk jobs on your pull request) on May 14, 2024
```
@@ -98,7 +99,7 @@ def generate_inputs(
    return query, key, value


def run_single_experiment(config: ExperimentConfig) -> ExperimentResults:
```
A reviewer (Contributor) asked:

Above in this file is

`torch._dynamo.config.automatic_dynamic_shapes = False`

Does compile ignore this if `dynamic=True`?

@yanboliang (Contributor, Author) replied:

Yes, `dynamic=True` forces dynamic shapes.
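
A small sketch of the distinction, not taken from the PR: `automatic_dynamic_shapes` only controls whether Dynamo promotes sizes to symbolic after it observes a recompile for a new shape, while `dynamic=True` requests symbolic shapes from the first compile, so the benchmark's config setting does not suppress dynamic compilation.

```python
import torch

# Disable the "make it dynamic after the first recompile" heuristic,
# as the benchmark script does.
torch._dynamo.config.automatic_dynamic_shapes = False


def f(x):
    return x.sin() + x.size(0)


# dynamic=True asks for symbolic shapes up front, so both calls below can
# share one dynamic-shape graph despite the config flag above.
compiled = torch.compile(f, dynamic=True)
print(compiled(torch.randn(4)).shape)
print(compiled(torch.randn(8)).shape)
```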

```
@@ -126,6 +126,19 @@ def score_mod(score, b, h, m, n):


class TestTemplatedSDPA(InductorTestCase):
    def _check_equal(self, golden_out, ref_out, compiled_out, dtype):
```
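
The helper body is not shown in this excerpt. A plausible sketch of such a check, an assumption rather than the PR's actual code, compares the compiled output's error against the eager reference's error, both measured relative to a higher-precision golden run:

```python
import torch


def check_equal_sketch(golden_out, ref_out, compiled_out, fudge_factor=1.1):
    # Measure both errors against the float64 "golden" output.
    golden = golden_out.to(torch.float64)
    compiled_error = (golden - compiled_out.to(torch.float64)).abs().max()
    ref_error = (golden - ref_out.to(torch.float64)).abs().max()
    # The compiled kernel may be at most slightly less accurate than the
    # eager reference run in the same dtype.
    assert compiled_error <= fudge_factor * ref_error, (
        f"compiled error {compiled_error:.3e} exceeds "
        f"{fudge_factor}x reference error {ref_error:.3e}"
    )
```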

```
@@ -617,3 +617,7 @@ def is_from_defaults(source: Source):
    if isinstance(source, ChainedSource):
        return is_from_defaults(source.base)
    return False


def is_cell_contents(source: Source):
```
A reviewer (Contributor) asked:

What is this doing, out of curiosity?

@yanboliang (Contributor, Author) replied:

This is part of the heuristic rules that determine whether we should wrap an int as a SymInt. Here we are saying that if the value comes from a cell closure, we do not make it dynamic, since cell closures are usually constant. We define these heuristics based on the value's Source.
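
For illustration, a minimal sketch of what such a source-based heuristic might look like, following the `is_from_defaults` pattern from the diff above; the `AttrSource` check and the `cell_contents` member name are assumptions, since the function body is not visible in this excerpt:

```python
from torch._dynamo.source import AttrSource, ChainedSource, Source


def is_cell_contents_sketch(source: Source) -> bool:
    # Hypothetical: a closure-cell read is modeled as an attribute access
    # named "cell_contents" on the cell object.
    if isinstance(source, AttrSource) and source.member == "cell_contents":
        return True
    # Unwrap chained sources the same way is_from_defaults does.
    if isinstance(source, ChainedSource):
        return is_cell_contents_sketch(source.base)
    return False
```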

@yanboliang (Contributor, Author) commented:

@pytorchbot merge -f "No space left on device"

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort; instead, consider -i/--ignore-current to continue the merge while ignoring current failures, which allows currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@yanboliang deleted the flex-dyn branch on May 15, 2024 04:43
ZelboK pushed a commit to ZelboK/pytorch that referenced this pull request May 19, 2024

Pull Request resolved: pytorch#125994
Approved by: https://github.com/Chillee