Instrument upsert points, optimization and flushing operations #4158

ffuugoo · 2024-05-02T14:02:00Z

This PR adds tracing instrumentation to upsert points operations, optimizer and flush workers.

The intended use is to enable these additional logs during chaos testing, so that we can debug consistency failures.

tracing filtering is a clusterfuck, so I've added a custom inner = true field to all added spans and events, so that we can turn them on/off easily (and independently of other log messages).

All added spans and events are disabled by default (so they would not spam the console; this logging is extensive).
They can be enabled with the following filters:

[{inner}]=info - enable all added instrumentation
[upsert_points/insert]=info,[upsert_points/update]=info,[upsert_points/move]=info - enable instrumentation related to upsert points operations
[optimize]=info - enable instrumentation related to optimization
[flush_all]=info,[flush]=info - enable instrumentation related to flushing
E.g.: QDRANT__LOG_LEVEL=error,[upsert_points/insert]=info,[optimize]=info,[flush]=info

All Submissions:

Contributions should target the dev branch. Did you create your branch from dev?
Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

Does your submission pass tests?
Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
Have you checked your code using cargo clippy --all --all-features command?

Changes to Core Features:

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your core changes, as applicable?
Have you successfully run tests with your changes locally?


ffuugoo · 2024-05-02T15:10:44Z

Example of new instrumentation


xzfc · 2024-05-03T03:52:35Z

lib/collection/src/collection_manager/holders/proxy_segment.rs

+    fn tmp_path(&self) -> PathBuf {
+        self.write_segment.get().read().data_path()
+    }


The result of this function is not stored anywhere and is used immediately. Would it make sense to return a non-owning type here to avoid clones? E.g. something like this: fn tmp_path(&self) -> MappedRwLockReadGuard<'_, Path>. It would require a similar change for fn data_path().
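A minimal std-only sketch of the non-owning idea suggested above. The names `Segment`, `Proxy`, and `PathGuard` are hypothetical stand-ins, not Qdrant types, and the sketch uses a small wrapper around `std::sync::RwLockReadGuard` instead of parking_lot's `MappedRwLockReadGuard`. The wrapper keeps the read lock alive and dereferences to `Path`, so no `PathBuf` is cloned.

```rust
use std::ops::Deref;
use std::path::{Path, PathBuf};
use std::sync::{RwLock, RwLockReadGuard};

/// Hypothetical stand-in for a segment that owns its data path.
struct Segment {
    data_path: PathBuf,
}

/// Holds the read lock and exposes only the path, without cloning it.
struct PathGuard<'a>(RwLockReadGuard<'a, Segment>);

impl Deref for PathGuard<'_> {
    type Target = Path;

    fn deref(&self) -> &Path {
        &self.0.data_path
    }
}

/// Hypothetical stand-in for a proxy that wraps a locked write segment.
struct Proxy {
    write_segment: RwLock<Segment>,
}

impl Proxy {
    /// Returns a borrowed view of the path; the lock is released when the guard drops.
    fn tmp_path(&self) -> PathGuard<'_> {
        PathGuard(self.write_segment.read().unwrap())
    }

    /// Example of immediate use: the guard lives only for this expression.
    fn tmp_path_len(&self) -> usize {
        self.tmp_path().as_os_str().len()
    }
}
```

The trade-off is that the read lock stays held for as long as the guard is alive, so this only fits call sites that use the path immediately, as described above.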

Yeah, I think I'll do just that. Initially this was made for a debug session, and I didn't care to refactor SegmentEntry::data_path, so I just rolled with PathBufs everywhere.

But if we're going to merge this, I don't want to have allocating calls in the upsert_points path. I've tried to implement a check with tracing for whether a particular span/event would be logged, and if not, avoid calling SegmentEntry::id... but neither of my approaches has worked so far. :/

So, having a tracing::info!("{}", something()) is not enough to avoid calling something() when it is disabled?

So, I haven't strictly tested all of this, but:

for message format arguments, it might be enough

but I'm not using SegmentEntry::id as a format argument, I use it as a "field"

and tracing allows using fields for filtering

so, I assume that fields always have to be evaluated (so that they can be used for filtering)

maybe, except when the span/event is disabled by the plain log-level

but I explicitly use fields to filter tracing data in this PR

so I assume segment.id = segment.id() in the instrumentation is always evaluated

To make it worse:

I tried a few approaches to pre-filter instrumentation

e.g., query "will this span/event be enabled given it has this log-level and these fields"

and don't call SegmentEntry::id if it's disabled

but in all cases tracing always reported instrumentation as enabled

even though instrumentation was not logged, i.e., it was filtered out

so, I'd assume that, with how the PR is structured right now, tracing evaluates all expressions first

and filtering is applied later, once everything is already evaluated
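To illustrate the goal discussed above (skipping an expensive field expression when instrumentation is off), here is a std-only sketch. It does not use or reproduce the `tracing` crate's internals: `INSTRUMENTATION_ENABLED`, `trace_field!`, `segment_id`, and `upsert_point` are hypothetical names, and the global flag stands in for whatever "is this span/event enabled?" check ends up working.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical global switch standing in for "is this span/event enabled?".
static INSTRUMENTATION_ENABLED: AtomicBool = AtomicBool::new(false);

/// Counts how many times the "expensive" field value was computed.
static ID_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an allocating call such as `SegmentEntry::id`.
fn segment_id() -> String {
    ID_CALLS.fetch_add(1, Ordering::SeqCst);
    String::from("segment-0")
}

/// Evaluates the field expression only when instrumentation is enabled.
/// Because `$value` is expanded inside the `if` branch, it is never
/// executed while the flag is false.
macro_rules! trace_field {
    ($name:literal, $value:expr) => {
        if INSTRUMENTATION_ENABLED.load(Ordering::SeqCst) {
            Some(format!("{}={}", $name, $value))
        } else {
            None
        }
    };
}

/// Hypothetical hot-path operation that records one field.
fn upsert_point() -> Option<String> {
    trace_field!("segment.id", segment_id())
}
```

The key point is that the check must happen before the field expression is evaluated; a filter that runs only after the values are computed cannot save the cost of computing them.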

Partially fixed this. SegmentEntry::id is cheap to call now, though I assume segment.id = %segment.id() in the instrumentation still allocates.


ffuugoo requested review from timvisee and generall May 2, 2024 14:02

github-actions bot mentioned this pull request May 2, 2024

Flaky test segment_builder_test::test_building_cancellation #2723

Open

timvisee reviewed May 2, 2024


github-actions bot mentioned this pull request May 2, 2024

Flaky test index::hnsw_index::tests::test_graph_connectivity::test_graph_connectivity #2875

Open

ffuugoo and others added 3 commits May 2, 2024 17:02

Instrument upsert points, optimization and flushing operations

9ceed88

Simplify instrumentation filtering

55e67e1

Additionally: - Remove *all* trailing commas in `tracing` macro calls

Derive Clone for LockedSegment

ef73378

ffuugoo force-pushed the instrument-upsert-points branch from 341a846 to ef73378 Compare May 2, 2024 15:03

Clarify one of the log messages a tiny bit

8c66102

ffuugoo added 2 commits May 2, 2024 17:44

Avoid calling SegmentEntry::id during upsert operations if tracing …

2354c61

…is disabled

Another approach to avoid calling SegmentEntry::id during upsert op…

4f47a08

…erations if tracing is disabled BTW, neither work :/

github-actions bot mentioned this pull request May 2, 2024

Flaky test hnsw_discover_test::hnsw_discover_precision #2973

Open

xzfc reviewed May 3, 2024


ffuugoo added 2 commits May 3, 2024 11:41

Make SegmentEntry::id cheap to call

ab1e6ba

fixup! Make SegmentEntry::id cheap to call

1017405
