Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ServerMetrics to collect metrics related server #5627

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

seongbeenkim
Copy link

@seongbeenkim seongbeenkim commented Apr 22, 2024

Motivation:

Add ServerMetrics to collect metrics related server. #4992

Modifications:

  • Add ServerMetrics.increasePendingHttp1Requests()
    in Http1RequestDecoder#channelRead before fireChannelRead()
  • Add ServerMetrics.increasePendingHttp2Requests()
    in Http2RequestDecoder#onHeadersRead before fireChannelRead()
  • Add ServerMetrics.decreasePendingHttp1Requests(), ServerMetrics.decreasePendingHttp2Requests(), ServerMetrics.increaseActiveHttp1WebSocketRequests(), ServerMetrics.increaseActiveHttp1Requests() and ServerMetrics.increaseActiveHttp2Requests()
    in HttpServerHandler#serve0 before service.serve(reqCtx, req)
  • Add ServerMetrics.decreaseActiveHttp1WebSocketRequests(), ServerMetrics.decreaseActiveHttp1Requests() and ServerMetrics.decreaseActiveHttp2Requests()
    in AbstractHttpResponseHandler#tryComplete
  • Add ServerMetrics.increaseActiveConnectionsAndGet() and ServerMetrics.decreaseActiveConnections()
    in ConnectionLimitingHandler#channelRead

Result:

@CLAassistant
Copy link

CLAassistant commented Apr 22, 2024

CLA assistant check
All committers have signed the CLA.

@minwoox minwoox added this to the 1.29.0 milestone Apr 22, 2024
Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, @seongbeenkim! Left some suggestions. 😉

@@ -574,14 +575,19 @@ private ChannelFuture doStart(ServerPort port) {
}

private void setupServerMetrics() {
final MeterRegistry meterRegistry = config().meterRegistry();
final MeterRegistry meterRegistry = config.meterRegistry();
final ServerMetrics serverMetrics = config.serverMetrics();
final GracefulShutdownSupport gracefulShutdownSupport = this.gracefulShutdownSupport;
assert gracefulShutdownSupport != null;

meterRegistry.gauge("armeria.server.pending.responses", gracefulShutdownSupport,
GracefulShutdownSupport::pendingResponses);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seongbeenkim is going to add the metric to the ServerMetrics in the follow-up PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) Could "armeria.server.pending.responses" be deprecated?
I guess ServerMetric. activeRequests() == GracefulShutdownSupport.pendingResponses().

Copy link
Member

@minwoox minwoox Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pendingResponses does not include the transient service requests, meanwhile activeRequests include them.
If we make activeRequests do not contain the transient service requests, we can deprecate pendingResponses. But I wasn't sure if it was okay. 🤔

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! Left some minor comments

@@ -44,13 +50,18 @@ void testExceedMaxNumConnections() {

@Test
void testMaxNumConnectionsRange() {
final ConnectionLimitingHandler handler = new ConnectionLimitingHandler(Integer.MAX_VALUE);
final Server server = Server.builder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) Couldn't we just create a ServerMetrics instance and pass it instead of creating a Server?
Ditto for the other test case above.

e.g.

final ServerMetrics serverMetrics = new ServerMetrics();
val handler = new ConnectionLimitingHandler(Integer.MAX_VALUE, serverMetrics);
...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can create ServerMetrics!!
i think creating ServerMetrics is a better way for the test code!!
i will apply the review thank you!! :)

*/
public final class ServerMetrics {

private final LongAdder pendingRequests = new LongAdder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be used

Suggested change
private final LongAdder pendingRequests = new LongAdder();

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right.. i forgot to delete it after some modifications..
i will remove the field!! thank you!!

@@ -823,6 +831,8 @@ static String toString(
buf.append(absoluteUriTransformer);
buf.append(", unhandledExceptionsReportIntervalMillis: ");
buf.append(unhandledExceptionsReportIntervalMillis);
buf.append(", serverMetrics: ");
buf.append(serverMetrics);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also implement toString() for serverMetrics if we want to do this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i will implement toString() method!! thank you!!

@@ -82,6 +85,14 @@ void disconnectWhenFinished() {
}

final boolean tryComplete(@Nullable Throwable cause) {
if (req.isHttp1WebSocket()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned decreaseActive*Requests will never have a chance to be called.
e.g. If a request is received but a connection is immediately closed, will decreaseActive*Requests always be called?

I'm wondering if the future where the request/response is completed is a more fail-safe location. What do you think @minwoox ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good suggestion. I prefer the location. 👍

@minwoox
Copy link
Member

minwoox commented May 21, 2024

Let me resolve the conflict. 😉

@jrhee17
Copy link
Contributor

jrhee17 commented May 29, 2024

Gentle ping 🙏

@minwoox
Copy link
Member

minwoox commented May 30, 2024

I've pushed commits because we are about to release the next version.
There are also some failing tests that will be resolved after #5680 gets merged.

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seongbeenkim Thanks for the all your work! 😄

* Returns the number of all pending requests.
*/
public long pendingRequests() {
return pendingHttp1Requests.longValue() + pendingHttp2Requests.longValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
return pendingHttp1Requests.longValue() + pendingHttp2Requests.longValue();
return pendingHttp1Requests() + pendingHttp2Requests();

Comment on lines 70 to 72
return activeHttp1WebSocketRequests.longValue() +
activeHttp1Requests.longValue() +
activeHttp2Requests.longValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

Suggested change
return activeHttp1WebSocketRequests.longValue() +
activeHttp1Requests.longValue() +
activeHttp2Requests.longValue();
return activeHttp1WebSocketRequests() +
activeHttp1Requests() +
activeHttp2Requests();

ServerMetrics::activeConnections);
meterRegistry.gauge("armeria.server.pending.requests", serverMetrics,
ServerMetrics::pendingRequests);
meterRegistry.gauge("armeria.server.active.requests", serverMetrics,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we implement MeterBinder to provide a way to easily export the values as metrics? For example:

final class CertificateMetrics implements MeterBinder {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good suggestion. 👍

meterRegistry.gauge("armeria.server.pending.requests", serverMetrics,
ServerMetrics::pendingRequests);
meterRegistry.gauge("armeria.server.active.requests", serverMetrics,
ServerMetrics::activeRequests);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we tag SessionProtocol to the metric?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All addressed. PTAL. 😉

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a useful feature. ServerMetric may be used to implement a graceful shutdown logic.
Thanks, @seongbeenkim and @minwoox.

@@ -574,14 +575,19 @@ private ChannelFuture doStart(ServerPort port) {
}

private void setupServerMetrics() {
final MeterRegistry meterRegistry = config().meterRegistry();
final MeterRegistry meterRegistry = config.meterRegistry();
final ServerMetrics serverMetrics = config.serverMetrics();
final GracefulShutdownSupport gracefulShutdownSupport = this.gracefulShutdownSupport;
assert gracefulShutdownSupport != null;

meterRegistry.gauge("armeria.server.pending.responses", gracefulShutdownSupport,
GracefulShutdownSupport::pendingResponses);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) Could "armeria.server.pending.responses" be deprecated?
I guess ServerMetric. activeRequests() == GracefulShutdownSupport.pendingResponses().

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall 👍 Pushed a small commit which cleans up dead code and tests.
Also left a comment which I'm not how to handle at the moment 🤔

@@ -767,6 +766,14 @@ private void handleRequestOrResponseComplete() {
requestOrResponseComplete = true;
return;
}
if (req.isHttp1WebSocket()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've realized that this won't be reached if the request hits the following lines:

I'm not sure what the best way to handle this is at the moment 😅 Any ideas are welcome

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I think we need to call req.abort() for the early returned requests.
Otherwise, ByteBuf in the requests could be leaked.

We may decrease the increased metric in .abort() when req.init() isn't invoked.

Copy link
Member

@minwoox minwoox Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for identifying these locations. 👍
I think we can proceed as follows:

  • Increase the pending requests in HttpRequestDecoders regardless of whether requests are aggregating or streaming.
  • Decrease the pending requests in the locations identified by @jrhee17.
  • Decrease the pending requests and increase the active requests right before invoking service.serve().

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me 👍

if (routingContext.sessionProtocol().isMultiplex()) {
serverMetrics.increaseActiveHttp2Requests();
} else {
serverMetrics.increaseActiveHttp1Requests();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on why we need to update the metrics in this class? Wouldn't it be better calling these in Http[12]RequestDecoder?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid making this change in the serve0 method:
0a77e61#diff-8c8fd516b9814fa10a52ccad8660a3383cc0862c1e0cc6f91b0a456eb11a3641R458-R476

However, I think we don't have any other choice but to make the change there due to the comment:
#5627 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let's wait for @seongbeenkim to address the comments, then. 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the commit. PTAL. 😉
41ee0f8

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increasing pending requests in Http[12]RequestDecoder, decreasing pending requests and increasing active requests in HttpServerHandler look good to me.

This isn't just because of the comment #5627 (comment)
but also because it simplifies understanding and managing the flow of metrics.

From my review, all classes implement DecodedHttpRequest are only created in Http[12]RequestDecoder.
if we manage the metrics within Http[12]RequestDecoder and HttpServerHandler,
we could clearly observe that pending requests increase according to protocol in Http[12]RequestDecoder just before handling requests, and pending requests decreas and active requests increase in HttpServerHandler, regardless of the specific DecodedHttpRequest implementation.

However, if we manage the metrics within each implemented DecodedHttpRequest,
we couldn't see clearly when to decrease or increase pending and active requests because each class handles the metrics differently. This complexity makes it challenging to address issues when metrics need to be managed amidst the processing of requests, as noted in the comment.

Since all class implement DecodedHttpRequest are created through Http[12]RequestDecoder and are simply treated as DecodedHttpRequest in lots of classes includingHttpServerHandler,
so i think handling the metrics directly through DecodedHttpRequest seems more straightforward than basing it on whether the requests are aggregated or streaming.

Copy link
Member

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank very much, @seongbeenkim and @minwoox! 🙇

Comment on lines +264 to +265
System.nanoTime(), SystemInfo.currentTimeMicros(), true, false
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can revert this cosmetic change?

Suggested change
System.nanoTime(), SystemInfo.currentTimeMicros(), true, false
);
System.nanoTime(), SystemInfo.currentTimeMicros(), true, false);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose the server-side metrics programmatically
6 participants