Flaky test_memory_format with nn.BatchNorm2d when running with inductor #125967
Skipping the test in the context of #125967 until the issue is root caused and fixed properly. Pull Request resolved: #125970 Approved by: https://github.com/clee2000
Marking high priority as it is easily reproducible
I'll blindly suspect it's due to RNG state. I tried to repro, but this command:
just shows:
What's the correct way to run a disabled test?
Oh, I skipped the test in #125970 to keep trunk sane. Please revert my change before trying to reproduce it; otherwise it shows up as a skipped test.
I at least figured out why test order matters here: it's the dynamo compilation cache. If we run the BatchNorm1d test first, we already have a few compiled functions in the dynamo cache. Later, when we test BatchNorm2d, the cache limit is reached, so we fall back to eager and bypass the issue. To verify this, I added torch._dynamo.reset() at the beginning of the test; now I can repro the issue even if BatchNorm2d is tested after BatchNorm1d.
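The cache-limit fallback described above can be sketched without torch at all. This is a toy, stdlib-only model (the names `toy_compile`, `CACHE_LIMIT`, and the length-based "shape" key are all illustrative, not dynamo's real machinery): the first few distinct input shapes get a cached "compiled" entry, and once the limit is hit, new shapes silently fall back to "eager" — which is exactly how an earlier test's cache entries can mask a compiled-path bug in a later test.

```python
from functools import wraps

CACHE_LIMIT = 2  # stand-in for torch._dynamo.config.cache_size_limit

def toy_compile(fn):
    """Toy model of dynamo's cache-limit fallback: the first CACHE_LIMIT
    distinct argument shapes get a 'compiled' cache entry; after that,
    new shapes fall back to 'eager', bypassing any compiled-path bug."""
    cache = {}

    @wraps(fn)
    def wrapper(x):
        key = len(x)  # crude stand-in for a shape guard
        if key not in cache and len(cache) >= CACHE_LIMIT:
            return ("eager", fn(x))      # cache limit hit: eager fallback
        cache.setdefault(key, fn)        # pretend we compiled a specialization
        return ("compiled", cache[key](x))

    wrapper.cache = cache
    return wrapper

@toy_compile
def total(xs):
    return sum(xs)

print(total([1]))        # → ('compiled', 1)
print(total([1, 2]))     # → ('compiled', 3)   second cache entry
print(total([1, 2, 3]))  # → ('eager', 6)      cache limit reached
total.cache.clear()      # analogue of torch._dynamo.reset()
print(total([1, 2, 3]))  # → ('compiled', 6)   recompiles after the reset
```

Clearing the cache mid-run, as with `torch._dynamo.reset()`, is what exposes the compiled path again regardless of which shapes ran earlier.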
Fix #125967. The test actually fails for empty 4D or 5D tensors when checking the memory format. I'm not exactly sure which recent inductor change caused the failure, but it may not be that important to maintain strides for an empty tensor. (?) I just skip the check for empty tensors. [ghstack-poisoned]
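The logic of that fix can be illustrated with plain shape/stride tuples (the helper names here are hypothetical, not the actual test code): compare strides against the expected channels-last layout, but wave empty tensors through, since a tensor with zero elements has no observable layout.

```python
def channels_last_strides_2d(shape):
    """Expected NCHW strides for a channels-last (NHWC-packed) 4D tensor."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)

def check_memory_format(shape, stride):
    """Sketch of the fix: skip the stride comparison for empty tensors,
    whose strides carry no observable layout information."""
    if 0 in shape:  # empty 4D tensor: nothing meaningful to check
        return True
    return tuple(stride) == channels_last_strides_2d(shape)

# A non-empty channels-last tensor passes the check...
print(check_memory_format((2, 3, 8, 8), (192, 1, 24, 3)))  # → True
# ...and an empty tensor passes regardless of its strides.
print(check_memory_format((0, 3, 8, 8), (192, 64, 8, 1)))  # → True
```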
In #125967, we found test results depend on test order. The root cause is that earlier tests populate the dynamo cache and affect later tests. This PR clears the dynamo cache before each unit test so we get more deterministic unit-test results. Pull Request resolved: #126586 Approved by: https://github.com/jansel
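The per-test reset that the PR adds follows the standard unittest `setUp` pattern. A minimal stdlib-only sketch (a plain dict stands in for the global dynamo cache; in the real PR the reset is `torch._dynamo.reset()`):

```python
import unittest

class CacheResettingTestCase(unittest.TestCase):
    cache = {}  # stand-in for the global dynamo compilation cache

    def setUp(self):
        # Analogue of calling torch._dynamo.reset() before each test:
        # every test starts with a cold cache, so its result no longer
        # depends on which tests ran earlier in the same process.
        type(self).cache.clear()

class ExampleTest(CacheResettingTestCase):
    def test_a_populates_cache(self):
        self.cache["compiled_fn"] = object()  # simulate a compilation
        self.assertEqual(len(self.cache), 1)

    def test_b_starts_cold(self):
        # Without the setUp reset, test_a's entry would leak in here.
        self.assertEqual(len(self.cache), 0)

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
)
```

This is also why the BatchNorm2d failure became reproducible in isolation: with a cold cache every run, the compiled path is exercised rather than the eager fallback.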
This test started to fail in trunk for nn.BatchNorm2d recently. I think it's another example of #125239, where the order of the tests matters. On devgpu, running the test alone fails the same way it fails on CI:
But if I run it after nn.BatchNorm1d, it passes:
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @seemethere @malfet @pytorch/pytorch-dev-infra @bdhirsh @anijain2305 @chauhang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @muchulee8 @ColinPeppler @amjames @desertfire