Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCPSocket_PollStatus crash on Linux #154

Open
rotoglup opened this issue Feb 8, 2019 · 12 comments
Open

TCPSocket_PollStatus crash on Linux #154

rotoglup opened this issue Feb 8, 2019 · 12 comments

Comments

@rotoglup
Copy link

rotoglup commented Feb 8, 2019

While trying to use Remotery (@ 589ded3) on my Linux app, I get a crash every time the browser connects on the app.

I managed to trace the issue to TCPSocket_PollStatus which corrupts the stack.

It seems that the crash is provoked by the select call, which is known to cause memory corruptions when the given fd is larger than 1024... see https://sigquit.wordpress.com/2009/12/24/stack-corruption-improper-use-of-fd_set/

I've put an assert(tcp_socket->socket+1<FD_SETSIZE); before the selectcall, which fires right before the crash occurs.

@rotoglup
Copy link
Author

rotoglup commented Feb 8, 2019

FWIW, I've had some success using poll instead of select, here's my function :

static SocketStatus TCPSocket_PollStatus(TCPSocket* tcp_socket)
{
    SocketStatus status;
    struct pollfd fds[1];

    status.can_read = RMT_FALSE;
    status.can_write = RMT_FALSE;
    status.error_state = RMT_ERROR_NONE;

    assert(tcp_socket != NULL);
    if (tcp_socket->socket == INVALID_SOCKET)
    {
        status.error_state = RMT_ERROR_SOCKET_INVALID_POLL;
        return status;
    }

    // Set read/write/error markers for the socket
    fds[0].fd = tcp_socket->socket;
    fds[0].events = POLLIN | POLLPRI | POLLOUT | POLLERR;

    // Poll socket status without blocking
    if (poll(fds, 1, 0) == SOCKET_ERROR)
    {
        status.error_state = RMT_ERROR_SOCKET_SELECT_FAIL;
        return status;
    }

    status.can_read = (fds[0].revents & (POLLIN | POLLPRI)) != 0 ? RMT_TRUE : RMT_FALSE;
    status.can_write = (fds[0].revents & POLLOUT) != 0 ? RMT_TRUE : RMT_FALSE;
    status.error_state = (fds[0].revents & POLLERR) != 0 ? RMT_ERROR_SOCKET_POLL_ERRORS : RMT_ERROR_NONE;
    return status;
}

@dwilliamson
Copy link
Collaborator

Thanks, reading up on all this now.

@dwilliamson
Copy link
Collaborator

This certainly is a cross-platform minefield. https://daniel.haxx.se/docs/poll-vs-select.html

@rotoglup
Copy link
Author

rotoglup commented Feb 10, 2019 via email

@dwilliamson
Copy link
Collaborator

I mean, this is a pretty crazy old API limitation, almost funny. I'm not sure I want to introduce poll as I'd still need a select fallback for the systems that don't have it and then they might suffer the issue!

How's about trying to trick it? Could you try negatively offsetting the address of fd_read etc. So:

int byte_index = socket / 8;
int bit_index = socket & 7;
FD_ZERO(&fd_read);
FD_SET(bit_index, &fd_read);
select(socket + 1, (char*)&fd_read - byte_index, ...
can_read = FD_ISSET(bit_index, &fd_read);

@dwilliamson
Copy link
Collaborator

Just looked at the implementation of fd_set and realised it's not actually a "bit array" but a SOCKET array (in winsock2.h). It looks like you can't rely on any implementation patterns :/

@rotoglup
Copy link
Author

Would the only sensible thing to do be to check on FD_SETSIZE ? At least to prevent crashing. This would need to change TCPSocket_PollStatus and WebSocketHandshake (to add the error code) :

if (FD_SETSIZE <= tcp_socket->socket) {
    status.error_state = RMT_ERROR_SOCKET_SELECT_FAIL;
    return status;
}

@dwilliamson
Copy link
Collaborator

I think that would help and needs adding, with obvious error reporting to the user. But it's not the solution.

The irony in all this is that it's fine on Windows as fd_set is an array of SOCKET entries as opposed to a bit array so there's no chance of this kind of overflow.

So... we may need an #ifdef branch where poll is used on all platforms other than Windows.

@rotoglup
Copy link
Author

I'm traditionnally a Windows developper, I'm a newcomer in *NIX land, so I'm not versed enough to help on the poll reliability !
Maybe allow users to control the poll activation would be nice too - just in case.

@dwilliamson
Copy link
Collaborator

It's in wide use as a replacement and was introduced in 1986 vs 1983 but POSIX-like systems sometimes muck it up (e.g. see https://daniel.haxx.se/blog/2016/10/11/poll-on-mac-10-12-is-broken/). Remotery is used on so many varying platforms now that something this core is quite nerve-wracking to change.

@rotoglup
Copy link
Author

rotoglup commented Feb 11, 2019 via email

@dwilliamson
Copy link
Collaborator

dwilliamson commented Feb 11, 2019

Good idea. Maybe worth doing it the other way round: use select on Windows and poll elsewhere. If that fails to compile for somebody can introduce a user define that forces use of select.

Motivation being poll should work on the majority of use-cases. Only a few exceptions may fail to work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants