
db/config.cc: increment components_memory_reclaim_threshold config default #18611

Open · wants to merge 1 commit into base: master
Conversation

lkshminarayanan (Member)
Incremented the components_memory_reclaim_threshold config's default value to 0.8, as the previous value was too strict and caused unnecessary eviction in otherwise healthy clusters.

Fixes #18607
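As an illustration of what a threshold like this gates, here is a minimal sketch (not Scylla's actual implementation; the function name and the comparison values are hypothetical) of a reclaim check driven by the tracked components' share of total memory:

```python
# Hypothetical sketch of a components_memory_reclaim_threshold-style
# check; NOT Scylla's actual implementation.

def should_reclaim(components_bytes: int,
                   total_bytes: int,
                   threshold: float) -> bool:
    """Trigger reclaim once the tracked components exceed the given
    fraction of total memory."""
    return components_bytes / total_bytes > threshold

# A strict threshold fires on modest usage; a relaxed 0.8 threshold
# leaves the same workload alone (values are illustrative).
assert should_reclaim(15, 100, threshold=0.1) is True
assert should_reclaim(15, 100, threshold=0.8) is False
```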

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
@lkshminarayanan self-assigned this May 10, 2024
@lkshminarayanan added the backport/5.2 and backport/5.4 labels (issues to backport to those branches once fixed) May 10, 2024
@scylladb-promoter (Contributor)

🟢 CI State: SUCCESS

✅ - Build
✅ - Unit Tests Custom
The following new/updated tests ran 100 times for each mode:
🔹 boost/sstable_datafile_test
✅ - Container Test
✅ - dtest
✅ - dtest with topology changes
✅ - Unit Tests

Build Details:

  • Duration: 2 hr 38 min
  • Builder: spider3.cloudius-systems.com

@michoecho (Contributor) left a comment


@denesb No, what are you doing? 0.8 effectively disables the entire mechanism.

The system gradually adjusts memtable flushing speed to keep memtable memory usage under 0.5 (and stops accepting writes at the extreme), and doesn't even start flushing until it reaches ~~0.25~~ 0.15.

Bloom filter memory usage isn't going to reach 0.8 because the system is all but guaranteed to OOM at ~~0.75~~ 0.85, and that's even without counting all the non-cache, non-memtable memory usage.

If you want to prevent OOM crashes with this feature, the threshold must be way under 0.85 (with enough margin to cover all non-LSA, non-bloom memory usage). If you want to guarantee that, then it must be somewhere under (0.5 - margin).

For example, take the last case of bloom explosion from a few days ago (https://github.com/scylladb/scylla-enterprise/issues/4156). The affected node started choking on bad_allocs at bloom fraction of 0.6, and crashed (due to bad_alloc-induced flush failure) at 0.65.

Edit: I misremembered the defaults — we start flushing at 0.15, not at 0.25. Which means that 0.8 might be enough to prevent some OOMs, but that's cutting it very close.
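The budget arithmetic in the comment above can be checked numerically. The memtable and bloom figures below are taken from the comment; the non-LSA, non-bloom margin is an assumed illustrative value:

```python
# Back-of-the-envelope check of the memory-budget argument.
# MEMTABLE_* and BLOOM_THRESHOLD come from the comment above;
# OTHER_USAGE_MARGIN is an assumed illustrative figure.

MEMTABLE_FLUSH_START = 0.15   # flushing begins here
MEMTABLE_HARD_CAP    = 0.50   # writes stop being accepted here
BLOOM_THRESHOLD      = 0.80   # proposed reclaim threshold
OTHER_USAGE_MARGIN   = 0.10   # assumed non-LSA, non-bloom usage

# Worst case before reclaim would kick in exceeds total memory:
worst_case = MEMTABLE_HARD_CAP + BLOOM_THRESHOLD + OTHER_USAGE_MARGIN
print(f"worst-case fraction: {worst_case:.2f}")  # prints 1.40 -> OOM risk

# A guaranteed-safe threshold must satisfy
#   threshold + memtable_cap + margin <= 1.0,
# i.e. it must sit under (0.5 - margin), as the comment says:
safe_threshold = 1.0 - MEMTABLE_HARD_CAP - OTHER_USAGE_MARGIN
print(f"guaranteed-safe threshold: {safe_threshold:.2f}")  # prints 0.40
```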

@denesb (Contributor) commented May 13, 2024

> If you want to prevent OOM crashes with this feature, the threshold must be way under 0.75 (with enough margin to cover all non-LSA, non-bloom memory usage). If you want to guarantee that, then it must be somewhere under (0.5 - margin).
>
> For example, take the last case of bloom explosion from a few days ago (scylladb/scylla-enterprise#4156). The affected node started choking on bad_allocs at bloom fraction of 0.6, and crashed (due to bad_alloc-induced flush failure) at 0.65.

0.6 is way too strict. In another customer case, the cluster is healthy and steady at a 0.6 BF memory usage ratio, and forcing it below that makes IO and latencies jump and results in an escalation. So we have to use the max() of all use-cases as the default and lower it on deployments where the default is already problematic.

@mykaul added this to the 5.4.7 milestone May 13, 2024
@michoecho (Contributor)

> 0.6 is way too strict. In another customer case, the cluster is healthy and steady at 0.6 BF memory usage ratio

Steady at 0.6? Which customer case was that?

@denesb (Contributor) commented May 13, 2024

> > 0.6 is way too strict. In another customer case, the cluster is healthy and steady at 0.6 BF memory usage ratio
>
> Steady at 0.6? Which customer case was that?

https://github.com/scylladb/scylla-enterprise/issues/4183
And let's stop discussing customer issues here, before we accidentally name-drop one of them. These cases are very much relevant here, but customer issues still should not be discussed in the OSS repository.

@mykaul (Contributor) commented May 21, 2024

@avikivity - per our discussion on Sunday - please review.

@avikivity (Member)

The goal of the feature was to prevent OOM. If we use the max() of all clusters, it will never prevent OOM since different clusters have other non-LSA components (prepared statements, cached queriers, running ops, background writes, repair with its large buffers, group 0 holding all the tablets metadata).

We can increase it to 0.2, but not beyond.

@denesb (Contributor) commented May 21, 2024

> The goal of the feature was to prevent OOM. If we use the max() of all clusters, it will never prevent OOM since different clusters have other non-LSA components (prepared statements, cached queriers, running ops, background writes, repair with its large buffers, group 0 holding all the tablets metadata).
>
> We can increase it to 0.2, but not beyond.

If we make the default too conservative, the field is just going to disable it on all clusters, and then we aren't preventing OOMs either.
I have sent an email to rnd-int to discuss this in its wider context; let's continue the discussion there.

@mykaul (Contributor) commented May 23, 2024

@avikivity - please help move this forward.

Labels: area/memory footprint, backport/5.2, backport/5.4, backport/6.0, status/regression

Successfully merging this pull request may close these issues.

bloom-filter: default value for components_memory_reclaim_threshold is too strict
6 participants