New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle overflow in rd_buf_write_remains #4689
base: master
Are you sure you want to change the base?
Conversation
a1622dd
to
66fa00b
Compare
Hi @emasab, Could you please review and approve this PR? It's critical for us as it addresses a blocking issue with the confluent-kafka-dotnet library update. Your approval is eagerly awaited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Anchit, the source of the problem comes from a bug that happens previously, check the comment.
tests/0011-produce_batch.c
Outdated
@@ -106,7 +115,14 @@ static void test_single_partition(void) { | |||
TEST_SAY("test_single_partition: Created kafka instance %s\n", | |||
rd_kafka_name(rk)); | |||
|
|||
rkt = rd_kafka_topic_new(rk, test_mk_topic_name("0011", 0), topic_conf); | |||
topicSuffix = (char *)malloc(topicSuffixLength + 1 * sizeof(char)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change can be removed as it's not necessary for reproducing the issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need topicSuffix
because with the combination of both clientID
and topicSuffix
we were able to see this border condition in the segment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had verified it happens with just that client id, because the segment is removed when the varint is changed without reducing the erased bytes
tests/0011-produce_batch.c
Outdated
@@ -178,6 +194,9 @@ static void test_single_partition(void) { | |||
TEST_SAY("Destroying kafka instance %s\n", rd_kafka_name(rk)); | |||
rd_kafka_destroy(rk); | |||
|
|||
free(clientId); | |||
free(topicSuffix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove topicSuffix
There also a problem in My proposed solution is to add a |
Makes sense. I'll create a separate PR to handle that. |
@anchitj better to do it here, as it's basically the same thing that is missing |
After the implementation of KIP-320, TopicCnt was sent as a varint in the metadata request. This change caused an overflow issue due to the use of
rd_buf_erase
to erase the remaining bytes. The overflow occurred in the calculationrbuf->rbuf_size - (rbuf->rbuf_len + rbuf->rbuf_erased)
becauserbuf->rbuf_erased
was non-zero when the buffer was almost full, leading to an overflow condition.As a result, for certain configurations that caused the buffer to be nearly full, the client was unable to send metadata requests, and it also caused crashes in the .NET client.