Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument upsert points, optimization and flushing operations #4158

Open
wants to merge 8 commits into
base: dev
Choose a base branch
from

Conversation

ffuugoo
Copy link
Contributor

@ffuugoo ffuugoo commented May 2, 2024

This PR adds tracing instrumentation to upsert points operations, optimizer and flush workers.

Intended use is to enable this additional logs on chaos-testing, so that we can debug consistency failures.

tracing filtering is a clusterfuck, so I've added a custom inner = true field to all added spans and events, so that we can turn them on/off easily (and independently of other log messages).

All added spans and events are disabled by default (so they would not spam the console; this logging is extensive).
They can be enabled with the following filters:

  • [{inner}]=info - enable all added instrumentation
  • [upsert_points/insert]=info,[upsert_points/update]=info,[upsert_points/move]=info - enable instrumentation related to upsert points operations
  • [optimize]=info - enable instrumentation related to optimization
  • [flush_all]=info,[flush]=info - enable instrumentation related to flushing
  • E.g.: QDRANT__LOG_LEVEL=error,[upsert_points/insert]=info,[optimize]=info,[flush]=info

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

@ffuugoo
Copy link
Contributor Author

ffuugoo commented May 2, 2024

Example of new instrumentation Screenshot 2024-05-02 at 17 10 07

Comment on lines 880 to 882
fn tmp_path(&self) -> PathBuf {
self.write_segment.get().read().data_path()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of this function is not stored anywhere and is used immediately. Would it make sense to return a non-owning type here to avoid clones? E.g. something like this: fn tmp_path(&self) -> MappedRwLockReadGuard<'_, Path<'_>>. It would require a similar change for fn data_path().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I'll make just that. Initially this was made for debug session, and I didn't care to refactor SegmentEntry::data_path, so I just rolled with PathBufs everywhere.

But if we gonna merge this, I don't want to have allocating calls in the upsert_points path. I've tried to implement a check with tracing if particular span/event would be logged, and if not, avoid calling SegmentEntry::id... but neither of my approaches worked, so far. :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, having a tracing::info!("{}", something()) is not enough to not call the something() if it is disabled?

Copy link
Contributor Author

@ffuugoo ffuugoo May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I haven't strictly tested all of this, but:

  • for message format arguments, it might be enough
  • but I'm not using SegmentEntry::id as a format argument, I use it as a "field"
  • and tracing allows using fields for filtering
  • so, I assume, that fields have to always be evaluated (so that they can be used as for filtering)
    • maybe, except when the span/event is disabled by the plain log-level
    • but I explicitly use fields to filter tracing data in this PR
    • so I assume segment.id = segment.id() in the instrumentation is always evaluated

To make it worse:

  • I tried a few approaches to pre-filter instrumentation
  • e.g., query "will this span/event be enabled given it has this log-level and these fields"
    • and don't call SegmentEntry::id if it's disabled
  • but in all cases tracing always reported instrumentation as enabled
    • even though instrumentation was not logged, i.e., it was filtered out
  • so, I'd assume, that with how PR is structured right now, tracing evaluates all expressions first
    • and filtering applied later, once everything is already evaluated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially fixed this. SegmentEntry::id is cheap to call now, though, I assume segment.id = %segment.id() in instrumentation still allocates.

lib/segment/src/entry/entry_point.rs Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants