-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 6581: LTS - Throttle SLTS on slow or unstable LTS #7300
base: issue-6581-LTS-throttle
Are you sure you want to change the base?
Issue 6581: LTS - Throttle SLTS on slow or unstable LTS #7300
Conversation
Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
… part 1 Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
… Simplify logic, remove TimerUtils.java Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
… Add mock tests. Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
… Fix doc comments. Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
… Fix doc comments. Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
…e-6581-LTS-throttle-storage-writer
…orageHealthTracker. Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
…duled task. Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
…e-6581-LTS-throttle-storage-writer
Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
…e-6581-LTS-throttle-storage-writer
…e-6581-LTS-throttle-storage-writer
…e-6581-LTS-throttle-storage-writer Signed-off-by: Sachin Joshi <sachin.joshi@emc.com> # Conflicts: # segmentstore/storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/ChunkedSegmentStorage.java # segmentstore/storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/SystemJournal.java # segmentstore/storage/src/test/java/io/pravega/segmentstore/storage/chunklayer/ChunkedSegmentStorageTests.java
…e-6581-LTS-throttle-storage-writer
…e-6581-LTS-throttle-storage-writer
…e-6581-LTS-throttle-storage-writer
…ub.com/sachin-j-joshi/pravega into issue-6581-LTS-throttle-storage-writer-for-merge Signed-off-by: Sachin Joshi <sachin.joshi@emc.com> # Conflicts: # segmentstore/server/src/main/java/io/pravega/segmentstore/server/containers/StreamSegmentContainer.java # segmentstore/storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/ChunkedSegmentStorage.java # segmentstore/storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/ChunkedSegmentStorageConfig.java
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## issue-6581-LTS-throttle #7300 +/- ##
==========================================================
Coverage ? 86.00%
Complexity ? 16385
==========================================================
Files ? 1045
Lines ? 61187
Branches ? 6224
==========================================================
Hits ? 52626
Misses ? 5354
Partials ? 3207 ☔ View full report in Codecov by Sentry. |
Thanks @sachin-j-joshi , one question before getting into the review: has this been tested with performance test to check for potential regressions and/or for the behavior of the system inducing LTS unavailability on purpose? |
@RaulGracia We do plan to validate this change with perf and load regression test. (That is the reason this PR is against the feature branch so that we can continue improve it before merging to master) |
...e/storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/StorageHealthTracker.java
Show resolved
Hide resolved
.../storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/ChunkedSegmentStorage.java
Show resolved
Hide resolved
...re/storage/src/main/java/io/pravega/segmentstore/storage/chunklayer/ChunkStorageMetrics.java
Show resolved
Hide resolved
Signed-off-by: Sachin Joshi <sachin.joshi@emc.com>
366138f
to
d37b635
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100ms is far too low for most of the backends we support. (Because 100ms an expected p50 for many LTSs). Rather than making this configured can we make this adaptive by tracking the latency of successful requests? Or failing that, could we make the default value determined by which backend store is being used so that we can set reasonable defaults.
[ PR against feature branch]
Signed-off-by: Sachin Joshi sachin.joshi@emc.com
Change log description
LTS - Throttle major SLTS activities on slow or unavailable LTS.
Purpose of the change
Fixes #6581
What the code does
New
StorageHealthTracker
that tracks health ofChunkedSegmentStorage
and reports storage health status periodically.Based on the
how is throttle calculated
SLOW
- when % late requests in current iteration is above min throttle threshold, throttling is applied on the next iteration and each updating request is throttled by duration proportional to % late requests.DEGRADED
- when % late requests are above max threshold then ALL update requests are rejected withStorageUnavailableException
UNAVAILABLE
- when underlying storage itself indicate that storage is too busy or unavailable (Eg http 503)StorageUnavailableException
, theStorageHealthTracker
also throttles any requests until there is at least one successful request. This throttle duration is linearly proportional to number of iteration where storage was unavailable. As soon as there is a successful non-late request this throttle becomes inactive.When LTS is slow or degraded performance.
When LTS is unavailable.
Only following modifying operations are throttled.
How to verify it
All tests should pass