Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added OpenTelemetry instrumentation to Ghost backend #20144

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

cmraible
Copy link
Contributor

@cmraible cmraible commented May 7, 2024

This PR adds OpenTelemetry instrumentation to Ghost's backend code, which allows us to view traces similar to what we see in Sentry Performance locally.

It also refactors the ConnectionPoolInstrumentation (that was seeming to cause CPU utilization issues) —instead of creating so many logs and metrics and trying to pipe them into elastic, this class now simply creates spans in OpenTelemetry for the acquire and query stage of the query, so you can see both if there is contention in the pool and if the query itself is taking a long time. The new version of this instrumentation lives at ghost/core/core/shared/OpenTelemetryKnexTracing.js.

Note: there is likely some opportunity to combine this file with ghost/core/core/shared/SentryKnexTracingIntegration.js since there is a lot of overlap in functionality, but for now this seems to work well.

OpenTelemetry is enabled if NODE_ENV === 'development' or if it is explicitly enabled via config with opentelemetry:enabled.

It also adds a Jaeger container to Ghost's docker-compose file for viewing the traces. There's no setup required (beyond running yarn docker:reset to pickup the changes in the docker-compose file the first time — but this will also reset your DB so be careful). This will launch the Jaeger container, and you can view the UI to see the traces at http://localhost:16686/search.

Example trace in Jaeger:
CleanShot 2024-05-22 at 07 00 26@2x

Todo:

  • Combine OpenTelemetryKnexTracing.js and SentryKnexTracingIntegration.js if it makes sense
  • Add more attributes to the tags as needed
  • Add tests for unhappy paths and make sure any potential errors are caught
  • Check if CPU usage is affected by this

@cmraible cmraible force-pushed the opentelemetry-wip branch 6 times, most recently from 7573224 to 9ba1396 Compare May 7, 2024 06:42
Copy link

codecov bot commented May 7, 2024

Codecov Report

Attention: Patch coverage is 94.21053% with 11 lines in your changes are missing coverage. Please review.

Project coverage is 73.29%. Comparing base (b6195d2) to head (6845204).
Report is 184 commits behind head on main.

Files Patch % Lines
ghost/core/core/shared/OpenTelemetryKnexTracing.js 92.48% 10 Missing ⚠️
ghost/core/core/shared/instrumentation.js 97.72% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #20144      +/-   ##
==========================================
+ Coverage   73.02%   73.29%   +0.27%     
==========================================
  Files        1264     1265       +1     
  Lines       73780    74314     +534     
  Branches     9825     9868      +43     
==========================================
+ Hits        53876    54467     +591     
+ Misses      19103    19047      -56     
+ Partials      801      800       -1     
Flag Coverage Δ
admin-tests 42.07% <ø> (+0.28%) ⬆️
e2e-tests 81.84% <94.21%> (+0.22%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does @opentelemetry/instrumentation-knex not work in place of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It picks up the queries, but it doesn't include spans for the acquire piece. That said, this is pretty ugly and doesn't seem to scale well, so I think we might just have to accept what we get with @opentelemetry/instrumentation-knex as good enough — leaning towards tearing this out at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants