Segment with heavy deletes not picked for merge in TieredMergePolicy #13226

Open
khushbr opened this issue Mar 27, 2024 · 2 comments
khushbr commented Mar 27, 2024

Description


We have a cluster running Lucene v8.7.0, configured with TieredMergePolicy. We are seeing peculiar behavior: segments with heavy deletes are not picked up by background merges, nor when a force merge with expunge deletes is invoked.

As the segment info below shows, the affected segments are all close to the maxMergedSegmentBytes value of 5GB, and their segDelPct is ~99.9%, far above the 20% threshold defined by deletesPctAllowed.

index     shard prirep segment generation docs.count docs.deleted    size size.memory committed searchable version compound
index-1   0     p      _37pfj     5398399         10            0  20.9kb           0 true      false      8.7.0   true
index-1   0     r      _1pzoc     2892252         37            0  26.1kb           0 true      false      8.7.0   true
...
index-1   0     p      _2hr8l     4187685         40     28834069   4.8gb       10408 true      true       8.7.0   false
index-1   0     p      _1voeo     3157584          0     29902767   4.9gb       10520 true      true       8.7.0   false
index-1   0     r      _1wtuc     3211284          0     29948777   4.9gb       10520 true      true       8.7.0   false
index-1   0     r      _2jchf     4261875         40     30082926     5gb       10584 true      true       8.7.0   false
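
For reference, the per-segment delete percentage can be worked out directly from the listing. A minimal Python sketch, assuming a segment's total doc count is docs.count + docs.deleted; the helper name deleted_pct is mine, and the four rows are copied from the listing above:

```python
# (segment, docs.count, docs.deleted) copied from the segment listing
SEGMENTS = [
    ("_2hr8l", 40, 28834069),
    ("_1voeo", 0, 29902767),
    ("_1wtuc", 0, 29948777),
    ("_2jchf", 40, 30082926),
]
DELETES_PCT_ALLOWED = 20.0  # threshold quoted in this report


def deleted_pct(live_docs, deleted_docs):
    """Percentage of a segment's docs that are deleted (assumes total = live + deleted)."""
    total = live_docs + deleted_docs
    return 100.0 * deleted_docs / total if total else 0.0


for name, live, deleted in SEGMENTS:
    pct = deleted_pct(live, deleted)
    print(f"{name}: {pct:.4f}% deleted, over threshold: {pct > DELETES_PCT_ALLOWED}")
```

Every one of these rows comes out above 99.9%, so the delete percentage alone does not explain why the segments are skipped.
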
  1. In an attempt to influence the merge score for these segments, I increased reclaim_deletes_weight to an extremely high value (50), but the segment was still skipped, with a score of ~2.22 and a skew of 0.713.
[2024-03-20T15:46:36,015][TRACE][o.e.i.e.E.MP ]: Lucene Merge Thread #403832] MP:   maybe=_1wtuc(8.7.0):C29948777/29948777:[diagnostics={os=Linux, java.version=11.0.17, os.arch=amd64, java.runtime.version=11.0.17+9-LTS, source=merge, ... :softDel=12100433 :id=i9yc9l5c6qvt26u9srmz8umo score=2.220052984489907 skew=0.713 nonDelRatio=1.000 tooLarge=false size=7083.170 MB

  2. I also tried increasing and decreasing the max_merged_segment threshold. Decreasing it to 3GB resulted in segment _1pa38 being picked for merge, but the deletes were not expunged after the merge finished.
[2024-03-27T08:33:42,666][TRACE][o.e.i.e.E.MS] [refresh][T#1] MS:     launch new thread
..
[2024-03-27T08:33:42,667][TRACE][o.e.i.e.E.IW]: Lucene Merge Thread #145644] IW: now apply deletes for 10 merging segments
...
2024-03-27T08:33:42,667][TRACE][o.e.i.e.E.IW ]: Lucene Merge Thread #145644] IW: now merge ...  index=_1pa38(8.7.0):C60121906:[diagnostics={os=Linux, java.version=11.0.17, os.arch=amd64, java.runtime.version=11.0.17+9-LTS, source=merge, os.version=5.10.149-133.644.amzn2.x86_64, java.vendor=Amazon.com Inc., java.vm.version=11.0.17+9-LTS, lucene.version=8.7.0, mergeMaxNumSegments=40, mergeFactor=10, timestamp=1711309532883}]:[attributes={Lucene87StoredFieldsFormat.mode=BEST_SPEED}]:fieldInfosGen=1886:dvGen=1886 :softDel=60117820 :id=4v1w9rn7oj6c0d78gy0t6lih8...
[2024-03-27T08:33:42,848][TRACE][o.e.i.e.E.IW]: Lucene Merge Thread #145644] IW: merge codec=Lucene87 maxDoc=6985; merged segment has no vectors; norms; docValues; prox; freqs; points; 0.2 sec to merge segment [1.19 MB, 6.64 MB/sec]

[2024-03-27T08:33:44,699][TRACE][o.e.i.e.E.IW]: Lucene Merge Thread #145644] IW: commitMerge: ... index=_1pa38(8.7.0):C60121906:[diagnostics={os=Linux, java.version=11.0.17, os.arch=amd64, java.runtime.version=11.0.17+9-LTS, source=merge, os.version=5.10.149-133.644.amzn2.x86_64, java.vendor=Amazon.com Inc., java.vm.version=11.0.17+9-LTS, lucene.version=8.7.0, mergeMaxNumSegments=40, mergeFactor=10, timestamp=1711309532883}]:[attributes={Lucene87StoredFieldsFormat.mode=BEST_SPEED}]:fieldInfosGen=1886:dvGen=1886 :softDel=60117820 :id=4v1w9rn7oj6c0d78gy0t6lih8
...
[2024-03-27T08:33:44,700][TRACE][o.e.i.e.E.IFD]: Lucene Merge Thread #145644] IFD: now checkpoint "_1pa38(8.7.0):C60121906:[diagnostics={os=Linux, java.version=11.0.17, os.arch=amd64, java.runtime.version=11.0.17+9-LTS, source=merge, os.version=5.10.149-133.644.amzn2.x86_64, java.vendor=Amazon.com Inc., java.vm.version=11.0.17+9-LTS, lucene.version=8.7.0, mergeMaxNumSegments=40, mergeFactor=10, timestamp=1711309532883}]:[attributes={Lucene87StoredFieldsFormat.mode=BEST_SPEED}]:fieldInfosGen=1886:dvGen=1886 :softDel=60117820 :id=4v1w9rn7oj6c0d78gy0t6lih8 ...
[2024-03-27T08:33:44,701][TRACE][o.e.i.e.E.IW]: Lucene Merge Thread #145644] IW: after commitMerge: _1pa38(8.7.0):C60121906:[diagnostics={os=Linux, java.version=11.0.17...

Version and environment details

No response

@vigyasharma
Contributor

TieredMergePolicy prefers merges that have less skew across segment sizes, a smaller total size, and a higher number of expunged deletes. Each merge here is a set of segments that will be merged into a single segment (eventually this becomes a OneMerge object). To rank the candidates, the policy assigns a merge score to each merge, and lower scores are preferred for merging.


[2024-03-20T15:46:36,015][TRACE][o.e.i.e.E.MP ]: Lucene Merge Thread #403832] MP:   maybe=_1wtuc(8.7.0):C29948777/29948777:[diagnostics={os=Linux, java.version=11.0.17, os.arch=amd64, java.runtime.version=11.0.17+9-LTS, source=merge, ... :softDel=12100433 :id=i9yc9l5c6qvt26u9srmz8umo score=2.220052984489907 skew=0.713 nonDelRatio=1.000 tooLarge=false size=7083.170 MB

From the log above, _1wtuc seems to have a high skew value (skew ranges from 1/mergeFactor = 0.1 (best) to 1 (worst)), but what stands out is the high nonDelRatio of 1.000.
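
As a rough illustration of that range, here is a Python sketch that treats skew as the largest segment's share of the candidate merge's total size. This is a simplification I am assuming for illustration: it ignores floor-segment-size rounding and any special handling of merges that hit the max merged segment size, and the sizes below are made up.

```python
def skew(segment_sizes):
    """Largest segment's share of the merge's total size (simplified model)."""
    total = sum(segment_sizes)
    if total == 0:
        raise ValueError("merge has no bytes")
    return max(segment_sizes) / total


# Ten equally sized segments: the best case, 1/mergeFactor with mergeFactor = 10.
balanced = skew([5.0] * 10)

# One large segment merged with nine tiny ones (hypothetical sizes in MB).
lopsided = skew([4900.0] + [20.0] * 9)

print(f"balanced: {balanced:.3f}, lopsided: {lopsided:.3f}")
```

Under this model, a skew of 0.713 would mean a single segment accounts for most of the bytes in the candidate merge.
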

nonDelRatio is calculated as totalBytesAfterMerge / totalBytesBeforeMerge, and gives a sense of how many deletes the merge would expunge. A high value (1 being the highest) indicates that the merge will not reclaim any deletes!

The value for totalBytesAfterMerge comes from summing up the post-merge size of each segment, which is computed by prorating the size of expunged deletes: segmentSize * (1 - reclaimableDeletes/maxDoc). The number of reclaimable deletes is fetched from numDeletesToMerge() in the merge policy, which can be overridden by implementations like SoftDeletesRetentionMergePolicy to retain soft-deleted documents in the segment after the merge.
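
The proration can be sketched in a few lines of Python. This only models the formula as described here; the function names and the 4.9 GB byte size are illustrative assumptions, and the reclaimable counts are hypothetical stand-ins for whatever numDeletesToMerge() would return.

```python
def size_after_merge(segment_bytes, reclaimable_deletes, max_doc):
    """Prorated post-merge size: segmentSize * (1 - reclaimableDeletes / maxDoc)."""
    if max_doc == 0:
        return 0.0
    return segment_bytes * (1.0 - reclaimable_deletes / max_doc)


def non_del_ratio(segments):
    """totalBytesAfterMerge / totalBytesBeforeMerge for (bytes, reclaimable, max_doc) tuples."""
    before = sum(seg[0] for seg in segments)
    after = sum(size_after_merge(*seg) for seg in segments)
    return after / before


SEGMENT_BYTES = 4.9 * 1024**3  # roughly the size shown for _1wtuc (assumed)
MAX_DOC = 29948777             # doc count from the _1wtuc log line

# Merge policy retains every delete: nothing is reclaimable.
retained_all = non_del_ratio([(SEGMENT_BYTES, 0, MAX_DOC)])

# Every delete is reclaimable: the segment shrinks to nothing.
reclaim_all = non_del_ratio([(SEGMENT_BYTES, MAX_DOC, MAX_DOC)])

print(retained_all, reclaim_all)
```

A ratio of 1 therefore corresponds to zero reclaimable deletes being reported for the segments in the candidate merge.
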

It is likely that for this segment, even though it has a high number of deletes, SoftDeletesRetentionMergePolicy is retaining all of them, causing nonDelRatio to be 1. It would help to look at your SoftDeletesRetentionMergePolicy implementation.
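
To illustrate why a nonDelRatio of 1 makes any delete weighting irrelevant, here is a Python sketch of the score as I recall it from TieredMergePolicy in 8.x: skew, times the post-merge byte size raised to 0.05, times nonDelRatio squared. The exponents are from memory and should be checked against the source, and delete_exponent is a hypothetical knob for illustration, not a Lucene setting. The inputs are the values from the logged line.

```python
MB = 1024 * 1024


def merge_score(skew, bytes_after_merge, non_del_ratio, delete_exponent=2.0):
    """Lower is better. Exponents 0.05 and 2.0 are assumed, not confirmed in this thread."""
    score = skew                                # strongly favor less skew
    score *= bytes_after_merge ** 0.05          # gently favor smaller merges
    score *= non_del_ratio ** delete_exponent   # favor merges that reclaim deletes
    return score


# Values taken from the logged candidate: skew=0.713 nonDelRatio=1.000 size=7083.170 MB
logged = dict(skew=0.713, bytes_after_merge=7083.170 * MB, non_del_ratio=1.0)

base = merge_score(**logged)
heavy = merge_score(**logged, delete_exponent=50.0)

print(f"default exponent: {base:.2f}, exponent 50: {heavy:.2f}")
```

With these inputs the sketch gives about 2.22, in line with the logged score, and the result is identical for both exponents because 1 raised to any power is still 1. Under this model, no amount of weighting toward deletes can lower the score while nonDelRatio stays at 1.
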

...

As a side note, is the log line above truncated? Going by C29948777/29948777 and :softDel=12100433, the number of pending deletes in the segment is 29948777 (same as total docs), while the number of soft deletes is 12100433 (only about 40% of total pending deletes). Even if all of the soft deletes are retained by the merge policy, there should still be about 60% of the deletes that the merge can reclaim. I wonder if some info, like details of other segments in the merge, got truncated from the log line.

@vigyasharma
Contributor

For the segment _1pa38, can you also share its details from before setting max_merged_segment to 3gb, i.e. when it was not getting picked up for merge? The difference can help explain why that change helped.
