Fix test runner --fail-fast test flake #51783

JoeDupuis · 2024-05-11T02:24:32Z

I am working on a couple of flakes that have been bugging me for a while.
This one is testing that --fail-fast forces an early return and doesn't run all the tests when running tests in parallel.

Example of a failed test run

I've ran the test in a loop to test the variability.
Running the current version 50 times, I got these results:
{"3"=>1, "5"=>36, "4"=>8, "6"=>5}

The keys are the number of tests that ran and the value is the number of times we got that result in the outer loop (I ran the test 50 times). I've inserted the script I've used to get these numbers at the end.

A short sleep eliminates all variability:
{"4"=>50}

Sleeping in a test is not ideal. It is slow and it is not a true fix, but in this particular case I believe it could make sense:
1- The test is already not perfect. It seems prone to race condition / flake
2- Benchmark show negligible impact on test performance
3- A true fix would be quite complex
4- It's not a new pattern in the code base (Rails)

Since we eliminated variability, we can lower down the number of tests from 10 to 4. Making it, in some instances, even faster than the original test!

I ran a benchmark with 4 versions:
1- "Current test" (10 tests with no sleep)
2- "With sleep" (10 tests with sleep)
3- "Short" (4 tests)
4- "Short with sleep" (4 tests with sleep)

Here are the results:

ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]
Warming up --------------------------------------
             Current     1.000 i/100ms
          with sleep     1.000 i/100ms
               short     1.000 i/100ms
    short with sleep     1.000 i/100ms
Calculating -------------------------------------
             Current      1.032 (± 0.0%) i/s -      6.000 in   5.812659s
          with sleep      1.018 (± 0.0%) i/s -      6.000 in   5.897112s
               short      1.033 (± 0.0%) i/s -      6.000 in   5.807960s
    short with sleep      1.031 (± 0.0%) i/s -      6.000 in   5.822803s

Comparison:
               short:        1.0 i/s
             Current:        1.0 i/s - 1.00x  slower
    short with sleep:        1.0 i/s - 1.00x  slower
          with sleep:        1.0 i/s - 1.01x  slower

Here are the results on variability:

Current test 50 times
Time:
  0.018374   0.016918  43.072536 ( 49.452285)
count:
{"3"=>1, "5"=>36, "4"=>8, "6"=>5}
------------------------------------------------------------
With sleep 50 times
Time:
  0.024689   0.018993  43.850807 ( 53.411969)
count:
{"4"=>50}
------------------------------------------------------------
Short 50 times
Time:
  0.015890   0.014887  42.154732 ( 48.472620)
count:
{"3"=>47, "2"=>3}
------------------------------------------------------------
Short with sleep 50 times
Time:
  0.021356   0.013760  41.751833 ( 48.164752)
count:
{"2"=>50}

Benchmark Script

Here's the (rough) benchmark script I used. It's a test in the same file as the original test.

diff --git a/railties/test/application/test_runner_test.rb b/railties/test/application/test_runner_test.rb
index f30bdb2890..80565899a2 100644
--- a/railties/test/application/test_runner_test.rb
+++ b/railties/test/application/test_runner_test.rb
@@ -802,6 +802,133 @@ def test_verify_fail_fast
       assert_operator matches[1].to_i, :<, 11
     end
 
+
+    def test_flake_fix
+      require 'benchmark/ips'
+      exercise_parallelization_regardless_of_machine_core_count(with: :processes)
+
+      app_file "test/unit/parallel_test_current.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          10.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      app_file "test/unit/parallel_test_sleep.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          10.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              sleep 0.1
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      app_file "test/unit/parallel_test_sleep_short.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          4.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              sleep 0.1
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      app_file "test/unit/parallel_test_short.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          4.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      sample_size = 50
+      count = {}
+      test_output = ""
+
+
+      run = ->(filename, sample_size) do
+        count = {}
+        test_output = ""
+        Benchmark.measure {
+          sample_size.times do
+            capture(:stderr) do
+              test_output = run_test_command("#{filename} --fail-fast --seed 31992", stderr: true)
+            end
+            matches = test_output.match(/(\d+) runs, (\d+) assertions, (\d+) failures/)
+            count[matches[1]] = count[matches[1]].to_i + 1
+          end
+        }
+      end
+
+      puts "Current test #{sample_size} times"
+      time = run.call("test/unit/parallel_test_current.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+      puts '------------------------------------------------------------'
+
+      puts "With sleep #{sample_size} times"
+      time = run.call("test/unit/parallel_test_sleep.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+      puts '------------------------------------------------------------'
+
+      puts "Short #{sample_size} times"
+      time = run.call("test/unit/parallel_test_short.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+      puts '------------------------------------------------------------'
+
+      puts "Short with sleep #{sample_size} times"
+      time = run.call("test/unit/parallel_test_sleep_short.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+
+      Benchmark.ips do |x|
+        x.report("Current") { run.call("test/unit/parallel_test_current.rb", 1) }
+        x.report("with sleep") { run.call("test/unit/parallel_test_current.rb", 1) }
+        x.report("short") { run.call("test/unit/parallel_test_short.rb", 1)  }
+        x.report("short with sleep") { run.call("test/unit/parallel_test_sleep_short.rb", 1)  }
+        x.compare!
+      end
+    end
+
+
     def test_run_in_parallel_with_processes
       exercise_parallelization_regardless_of_machine_core_count(with: :processes)

Checklist

Before submitting the PR make sure the following are checked:

This Pull Request is related to one change. Unrelated changes should be opened in separate PRs.
Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
Tests are added or updated if you fix a bug or add a feature.
CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

Without the sleep, the test has high variability in the amount of test ran before stopping. With the sleep, the test is pretty stable with negligeable impact on run time.

Fix test runner --fail-fast test flake

Fix test runner --fail-fast test flake

a63baa5

Without the sleep, the test has high variability in the amount of test ran before stopping. With the sleep, the test is pretty stable with negligeable impact on run time.

rails-bot bot added the railties label May 11, 2024

JoeDupuis mentioned this pull request May 13, 2024

Add active_record.postgresql_adapter_decode_dates config #51763

Merged

4 tasks

rafaelfranca merged commit b56ee17 into rails:main May 15, 2024
4 checks passed

rafaelfranca added a commit that referenced this pull request May 15, 2024

Merge pull request #51783 from JoeDupuis/fix-fail-fast-flake

b9d505c

Fix test runner --fail-fast test flake

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix test runner --fail-fast test flake #51783

Fix test runner --fail-fast test flake #51783

JoeDupuis commented May 11, 2024 •

edited

Fix test runner --fail-fast test flake #51783

Fix test runner --fail-fast test flake #51783

Conversation

JoeDupuis commented May 11, 2024 • edited

Benchmark Script

Checklist

JoeDupuis commented May 11, 2024 •

edited