Optimize Hash#transform_{keys,values} #14502

Open · wants to merge 1 commit into base: master
Conversation

jgaskins (Contributor) commented Apr 17, 2024

Preallocating a hash of the correct size is faster when the source hash has more than 4 entries, while starting from an empty hash is faster for 4 or fewer. The two approaches would be much closer, except that Hash#upsert allocates an initial capacity of 4 when the hash is empty, whereas Hash#initialize sets the floor for initial_capacity at 8 entries.

This PR gets the best of both worlds by keeping the current approach when the source hash has <= 4 keys and using a fully preallocated hash when it has more.
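
To illustrate the approach, here is a minimal sketch (the method name and exact shape are mine, not the actual diff):

class Hash(K, V)
  def transform_keys_hybrid(& : K, V -> K2) : Hash(K2, V) forall K2
    # <= 4 keys: keep the lazy empty-hash path; otherwise preallocate the full size.
    copy = size <= 4 ? Hash(K2, V).new : Hash(K2, V).new(initial_capacity: size)
    each do |key, value|
      copy[yield(key, value)] = value
    end
    copy
  end
end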

Benchmark code
require "benchmark"

{
  1,
  4,
  5,
  10,
  100,
  1_000,
}.each do |size|
  hash = Array
    .new(size) { |i| {i.to_s, i.to_s} }
    .to_h

  unless hash.transform_keys(&.itself) == hash.transform_keys_optimized(&.itself)
    raise "Oops! #{hash.transform_keys_optimized(&.itself).pretty_inspect}"
  end

  puts
  puts
  puts "### #{size}"
  puts "Keys"
  Benchmark.ips do |x|
    x.report "baseline" { hash.transform_keys(&.itself) }
    x.report "optimized" { hash.transform_keys_optimized(&.itself) }
  end
  puts
  puts "Values"
  Benchmark.ips do |x|
    x.report "baseline" { hash.transform_values(&.itself) }
    x.report "optimized" { hash.transform_values_optimized(&.itself) }
  end
end

class Hash(K, V)
  # Like #transform_keys, but preallocates the result hash with the source's size.
  def transform_keys_optimized(& : K, V -> K2) : Hash(K2, V) forall K2
    hash = Hash(K2, V).new(initial_capacity: size)
    each do |key, value|
      hash[yield(key, value)] = value
    end
    hash
  end

  # Like #transform_values, but preallocates the result hash with the source's size.
  def transform_values_optimized(& : V, K -> V2) : Hash(K, V2) forall V2
    hash = Hash(K, V2).new(initial_capacity: size)
    each do |key, value|
      hash[key] = yield(value, key)
    end
    hash
  end
end
### 1
Keys
 baseline  28.39M ( 35.23ns) (± 1.34%)  176B/op        fastest
optimized  20.34M ( 49.18ns) (± 1.41%)  272B/op   1.40× slower

Values
 baseline  29.40M ( 34.01ns) (± 1.61%)  176B/op        fastest
optimized  20.30M ( 49.27ns) (± 1.96%)  272B/op   1.45× slower


### 4
Keys
 baseline  17.18M ( 58.19ns) (± 2.63%)  176B/op        fastest
optimized  13.56M ( 73.76ns) (± 1.19%)  272B/op   1.27× slower

Values
 baseline  17.19M ( 58.17ns) (± 2.61%)  176B/op        fastest
optimized  13.56M ( 73.74ns) (± 1.16%)  272B/op   1.27× slower


### 5
Keys
 baseline  10.14M ( 98.62ns) (± 1.09%)  384B/op   1.23× slower
optimized  12.45M ( 80.34ns) (± 1.00%)  272B/op        fastest

Values
 baseline  10.07M ( 99.34ns) (± 4.05%)  384B/op   1.23× slower
optimized  12.40M ( 80.66ns) (± 3.20%)  272B/op        fastest


### 10
Keys
 baseline   5.04M (198.33ns) (± 4.11%)  832B/op   1.28× slower
optimized   6.46M (154.82ns) (± 0.96%)  512B/op        fastest

Values
 baseline   5.08M (196.95ns) (± 2.39%)  832B/op   1.27× slower
optimized   6.43M (155.58ns) (± 3.46%)  512B/op        fastest


### 100
Keys
 baseline 560.97k (  1.78µs) (± 1.02%)  7.09kB/op   1.35× slower
optimized 757.56k (  1.32µs) (± 1.02%)  3.34kB/op        fastest

Values
 baseline 562.41k (  1.78µs) (± 1.02%)  7.09kB/op   1.34× slower
optimized 754.39k (  1.33µs) (± 2.92%)  3.34kB/op        fastest


### 1000
Keys
 baseline  49.88k ( 20.05µs) (± 1.27%)  56.4kB/op   1.47× slower
optimized  73.53k ( 13.60µs) (± 1.89%)  28.1kB/op        fastest

Values
 baseline  49.67k ( 20.13µs) (± 1.23%)  56.4kB/op   1.47× slower
optimized  73.05k ( 13.69µs) (± 3.42%)  28.1kB/op        fastest

each do |key, value|
  hash[yield(key, value)] = value
end
hash
Review comment (Contributor):

Could we merge the branches? Like:

copy = size > 4 ? Hash(K2, V).new(capacity) : Hash(K2, V).new
each_with_object(copy) { ... }

Maybe 4 could be a constant instead of a magic number? For example Hash::INITIAL_CAPACITY or Hash::MINIMAL_CAPACITY.
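
For concreteness, a sketch of that merged version (illustrative only: it uses initial_capacity: size rather than the internal capacity, and a hypothetical constant along the lines suggested above):

class Hash(K, V)
  MINIMAL_CAPACITY = 4 # hypothetical constant replacing the magic number

  def transform_keys(& : K, V -> K2) : Hash(K2, V) forall K2
    copy = size > MINIMAL_CAPACITY ? Hash(K2, V).new(initial_capacity: size) : Hash(K2, V).new
    each_with_object(copy) do |(key, value), memo|
      memo[yield(key, value)] = value
    end
  end
end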

Review comment (Member):

It can be even simpler: Hash(K2, V).new(size < 4 ? 0 : size)

straight-shoota (Member) commented Apr 18, 2024

Initializing the transformed hash with the current size seems like a good idea. We could also consider using the current capacity in order to make it a copy with identical properties, just with transformed values/keys.

But I'm not sure about optimizing the capacity for small hashes. The default minimum capacity for a hash has its reasons. It's not highly optimized for the use case of very small hashes, but that case isn't that common and the overhead is minimal.
Overall it will be less efficient if more items are added afterwards. I suppose an argument could be made that transformed hashes are less likely to receive new additions. But that's speculative, and surely there are many use cases where it doesn't hold.

So my suggestion would be to drop the special handling for small hashes and just allocate the transformed hash with an initial capacity.
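
In code, that suggestion is essentially the transform_keys_optimized / transform_values_optimized variants from the benchmark above, applied unconditionally (a sketch, not the final diff):

class Hash(K, V)
  def transform_keys(& : K, V -> K2) : Hash(K2, V) forall K2
    # Always preallocate the result with the source's size; no small-hash special case.
    copy = Hash(K2, V).new(initial_capacity: size)
    each do |key, value|
      copy[yield(key, value)] = value
    end
    copy
  end
end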

ysbaddaden (Contributor) commented:

Let's do some archeology:

In #8017, Hash#initialize called #malloc_entries(initial_capacity / 2), which correlated with the other #malloc_entries(4), since the initial capacity is supposed to be 8.

But #8182 patched #initialize to call #malloc_entries(initial_capacity) without changing the other #malloc_entries(4), so they no longer correlate.

Said differently: magic numbers 😮‍💨

All that to say: magic numbers really are as bad as we say they are. I agree with @straight-shoota's statement above: let's just always use the capacity, as the other calls to #malloc_entries do.
