Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: MuxBearerClosed socket closed when reading data #1644

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

perseus-algol
Copy link

@perseus-algol perseus-algol commented Apr 20, 2024

Closes #1632.

client.abort() causes following error at cardano node cardano.node.LocalErrorPolicy:Error:182 IP LocalAddress "/socket@58" ErrorPolicyUnhandledApplicationException (MuxError MuxBearerClosed "<socket: 58> closed when reading data, waiting on next header True"). If remove this call, it works fine.

When making this edit, I was guided by:

  • Sebastian's comment in Mithril Signer Local Error Policy : Error 182 - MuxError #1632 - "client should send MsgRelease and MsgDone after the query"
  • Ouroboros network spec, section 3.12 Local State Query mini-protocol, Figure 3.8: State machine of the Local State Query mini-protocol.
  • Examples from Pallas library, in examples used same type of Client - NodeClient, containing multiplexer inside. And there is no abort call at the end of example. Although in this example we see only client.send_release() and no send_done as how is it done in Mithril and as Sebastian advises.

…hen reading data, waiting on next header True""
@jpraynaud
Copy link
Member

@falcucci what is the purpose of the client.abort() in the PallasChainObserver?

Is it safe to remove it or do we need to replace with a different command to gracefully close the connection to the Cardano node?

@falcucci
Copy link
Collaborator

falcucci commented Apr 25, 2024

It remotely shutdown the current task so it might happen instantaneously and interrupt a parallel task that is running or even somehow breakes the message sent by the server if there is something idle to be sent like a block, pending response or even a ping pong message.

I think it's reasonable to replace the approach to close the plexer client (if it's really needed at our use case) to avoid hanged connections only

Maybe we could guarantee we abort the tasks when and if the mithril instance is killed, so it would be enough.

If this isnt a risk (hanged connections) this pr looks good to me.

@perseus-algol
Copy link
Author

perseus-algol commented Apr 25, 2024

I have a suspicion that by calling abort we interrupt tick method in the middle of message sending/receiving, which can cause socket closed error on server side. But at the same we still need somehow to terminate spawned async processes of muxer and demuxer.

I'll made some experiments tomorrow regarding graceful shutdown of muxer and demuxer.

@ch1bo
Copy link
Member

ch1bo commented May 7, 2024

@scarmuega Please see aboves comment on the tick insight.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mithril Signer Local Error Policy : Error 182 - MuxError
4 participants