
Emitter AMQP protocol stuck after MQ broker restart #40592

Open
bkalas opened this issue May 13, 2024 · 9 comments

bkalas commented May 13, 2024

Describe the bug

We are using Emitters to send messages to a remote MQ broker, normally without any @OnOverflow annotation:
@Inject @Channel(RESPONSE_QUEUE) Emitter<String> emitter;
...
public void emit(Message<String> toEmit) {
    this.emitter.send(toEmit);
    ...
A few times we observed that, after the remote broker was restarted, emitting messages stopped working.
For the first x (~100) messages we did not get any error (I guess x = the default buffer size),
then we started to get this error:
java.lang.IllegalStateException: SRMSG00034: Insufficient downstream requests to emit item
This could only be resolved by restarting the Quarkus application.
The health report for the channel showed an OK status the whole time.

We also use a lot of consumers (@Incoming) of AMQP messages; these reconnected successfully and continued to work.

When I try to use, for example,
@OnOverflow(FAIL, bufferSize = 100)
the channel at least disconnected after the error and started reporting KO in the health check, but here bufferSize seems to be ignored, so it is not usable.
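
For reference, a minimal sketch of the emitter declared with the FAIL strategy (class and channel names are placeholders; the javax.* packages shown match Quarkus 2.x, Quarkus 3.x uses jakarta.*):

```java
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;

import org.eclipse.microprofile.reactive.messaging.Channel;
import org.eclipse.microprofile.reactive.messaging.Emitter;
import org.eclipse.microprofile.reactive.messaging.Message;
import org.eclipse.microprofile.reactive.messaging.OnOverflow;

@ApplicationScoped
public class ResponseEmitter {

    static final String RESPONSE_QUEUE = "response-queue"; // placeholder channel name

    // With the FAIL strategy, send() should start throwing once the buffer of
    // unacknowledged messages is full, instead of buffering silently.
    @Inject
    @Channel(RESPONSE_QUEUE)
    @OnOverflow(value = OnOverflow.Strategy.FAIL, bufferSize = 100)
    Emitter<String> emitter;

    public void emit(Message<String> toEmit) {
        this.emitter.send(toEmit);
    }
}
```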

Expected behavior

The emitting channel must be resilient against remote broker restarts and must continue to emit messages.

Actual behavior

Sometimes restarts of the remote broker were handled correctly, but not always.

How to Reproduce?

  1. An AMQ broker must be installed (in our case Red Hat AMQ 7.11).
  2. A Quarkus app with an Emitter sending to some queue on the remote broker.
  3. Emit messages at some interval and restart the broker (see the sketch below).
  4. For me, ~6 restarts of the broker were always enough to reproduce the issue: after one of these restarts, no new messages were emitted.
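
A minimal sketch of the kind of periodic producer that can drive step 3 (assumes the quarkus-scheduler extension; class name, channel name, and interval are illustrative):

```java
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;

import org.eclipse.microprofile.reactive.messaging.Channel;
import org.eclipse.microprofile.reactive.messaging.Emitter;

import io.quarkus.scheduler.Scheduled;

@ApplicationScoped
public class PeriodicProducer {

    @Inject
    @Channel("response-queue") // placeholder channel, mapped to a queue on the remote broker
    Emitter<String> emitter;

    // Emit one message per second while the broker is restarted a few times;
    // the observed failure is that send() eventually throws SRMSG00034 and never recovers.
    @Scheduled(every = "1s")
    void emitOne() {
        emitter.send("ping " + System.currentTimeMillis());
    }
}
```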

Output of uname -a or ver

No response

Output of java -version

17

Quarkus version or git rev

2.16.12.Final

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response


quarkus-bot bot commented May 13, 2024

/cc @cescoffier (reactive-messaging), @ozangunalp (reactive-messaging)

@ozangunalp
Contributor

Looks like an issue with the send-retry mechanism on the AMQP connector. It doesn't reconnect the client and keeps retrying with the previous (unconnected) sender.

I have a fix in mind but I need to be able to reproduce this scenario (in the test environment) to test it.

@MikkoKauhanen

Hi,

If it helps at all, I created a quick and dirty test to try to reproduce the issue, which you can find here: link to the repo. This test uses Quarkus 3.10.1.

We have the same problem in our services, which use Quarkus version 3.5.3. In our emitters we use @OnOverflow(OnOverflow.Strategy.UNBOUNDED_BUFFER) as the strategy.
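
For reference, a minimal sketch of that kind of emitter declaration (class and channel names are placeholders; Quarkus 3.x uses the jakarta.* packages):

```java
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

import org.eclipse.microprofile.reactive.messaging.Channel;
import org.eclipse.microprofile.reactive.messaging.Emitter;
import org.eclipse.microprofile.reactive.messaging.OnOverflow;

@ApplicationScoped
public class UnboundedEmitter {

    // UNBOUNDED_BUFFER queues outgoing messages without limit while downstream
    // (the AMQP sender) cannot keep up, e.g. while the broker is down.
    @Inject
    @Channel("outgoing-events") // placeholder channel name
    @OnOverflow(OnOverflow.Strategy.UNBOUNDED_BUFFER)
    Emitter<String> emitter;

    public void send(String payload) {
        emitter.send(payload);
    }
}
```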

Here is also a link to my message about my findings related to this issue: link to comment

@ozangunalp
Contributor

@MikkoKauhanen Thanks for this, I'll check it later today or at the beginning of next week.


bkalas commented Jun 4, 2024

@ozangunalp Hi, anything new?

@MikkoKauhanen

Hi, just wanted to mention that we have now found two additional issues related to MQ restarts.

If there are messages created by the emitter to be produced to the broker during the MQ disconnect, then after the broker is back up there are a lot of connections established by the application to the broker. These connections seem to stay alive, but I don't know whether the application ever uses them for producing messages anymore.

We noticed this in our dev environment, where we use the micro instance type of Amazon MQ (maximumConnections = 300): we exceeded maximumConnections and our services' health checks started to fail.

We also saw that the number of connections to the broker increased a lot even when no messages were being produced. This turned out to be caused by the AMQPConnector readiness health check, which tries to make a connection through AmqpCreditBasedSender.isConnected().

@cescoffier
Member

> We also saw that the number of connections to the broker increased a lot even when no messages were being produced. This turned out to be caused by the AMQPConnector readiness health check, which tries to make a connection through AmqpCreditBasedSender.isConnected().

That's expected, no? The readiness check verifies that the broker is reachable, so we need to establish connections. However, it should be only one connection.

@MikkoKauhanen

> > We also saw that the number of connections to the broker increased a lot even when no messages were being produced. This turned out to be caused by the AMQPConnector readiness health check, which tries to make a connection through AmqpCreditBasedSender.isConnected().
>
> That's expected, no? The readiness check verifies that the broker is reachable, so we need to establish connections. However, it should be only one connection.

Yes, I can understand that a connection needs to be established. But it seems that one new connection is created for each /q/health endpoint call made while the message broker is disconnected/down.

[screenshot: connections]

@MikkoKauhanen

Hey,

I updated the demo project to include three tests that try to reproduce the issues I am aware of:

  1. The connection count increases due to messages emitted while the broker is not available.
  2. The connection count increases due to health checks performed while the broker is not available.
  3. The producer stops producing messages after connection issues with the broker.

Link to repo
