Consumer not receiving messages when power off and restart, consumer's ack floor is ahead of stream's last sequence #5412

Open
stitchcula opened this issue May 13, 2024 · 12 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@stitchcula commented May 13, 2024

Observed behavior

When the server is unexpectedly powered off and then restarted, all consumers are unable to self-recover and fail to receive messages. The log is as follows:

[1022] 2024/05/11 16:31:07.334599 [INF] Starting nats-server
[1022] 2024/05/11 16:31:07.382891 [INF]   Version:  2.10.14
[1022] 2024/05/11 16:31:07.383039 [INF]   Git:      [31af767]
[1022] 2024/05/11 16:31:07.383055 [DBG]   Go build: go1.21.9
[1022] 2024/05/11 16:31:07.383112 [INF]   Name:     NCFGS7EGBHTKZ46NGOLKN365O3RGONUATABFDU3MDRJOOWGBOEKH3LQ4
[1022] 2024/05/11 16:31:07.383150 [INF]   Node:     jAsr7q4d
[1022] 2024/05/11 16:31:07.383172 [INF]   ID:       NCFGS7EGBHTKZ46NGOLKN365O3RGONUATABFDU3MDRJOOWGBOEKH3LQ4
[1022] 2024/05/11 16:31:07.383248 [WRN] Plaintext passwords detected, use nkeys or bcrypt
[1022] 2024/05/11 16:31:07.383363 [INF] Using configuration file: /etc/nats/simpleJetStream.conf
[1022] 2024/05/11 16:31:07.383430 [WRN] Maximum payloads over 8.00 MB are generally discouraged and could lead to poor performance
[1022] 2024/05/11 16:31:07.513725 [INF] Starting http monitor on 0.0.0.0:8422
[1022] 2024/05/11 16:31:07.514385 [INF] Starting JetStream
[1022] 2024/05/11 16:31:07.553176 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[1022] 2024/05/11 16:31:07.553260 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[1022] 2024/05/11 16:31:07.553283 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[1022] 2024/05/11 16:31:07.553303 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[1022] 2024/05/11 16:31:07.553323 [INF] 
[1022] 2024/05/11 16:31:07.553342 [INF]          https://docs.nats.io/jetstream
[1022] 2024/05/11 16:31:07.553361 [INF] 
[1022] 2024/05/11 16:31:07.553381 [INF] ---------------- JETSTREAM ----------------
[1022] 2024/05/11 16:31:07.553433 [INF]   Max Memory:      3.73 GB
[1022] 2024/05/11 16:31:07.553464 [INF]   Max Storage:     93.13 GB
[1022] 2024/05/11 16:31:07.553488 [INF]   Store Directory: "/opt/data/nats/jetstream"
[1022] 2024/05/11 16:31:07.553510 [INF] -------------------------------------------
[1022] 2024/05/11 16:31:08.304405 [INF]   Starting restore for stream '$G > XANOLE_OUTPUT_VALUES'
[1022] 2024/05/11 16:31:08.403004 [WRN] Filestore [XANOLE_OUTPUT_VALUES] Stream state detected prior state, could not locate msg block 5054
[1022] 2024/05/11 16:31:08.405046 [INF]   Restored 0 messages for stream '$G > XANOLE_OUTPUT_VALUES' in 101ms
[1022] 2024/05/11 16:31:11.701798 [INF]   Recovering 2 consumers for stream - '$G > XANOLE_OUTPUT_VALUES'
[1022] 2024/05/11 16:31:13.597179 [WRN] Detected consumer '$G > XANOLE_OUTPUT_VALUES > consumer_default' ack floor 14084 is ahead of stream's last sequence 0
[1022] 2024/05/11 16:31:13.597335 [WRN] Detected consumer '$G > XANOLE_OUTPUT_VALUES > consumer_401899c7-a201-42b4-9953-e59d4b427d43' ack floor 14084 is ahead of stream's last sequence 0

After the restart, the situation described in this issue reproduces almost 100% of the time: the consumer's ack floor is greater than the stream's last sequence, leaving all consumers unable to consume.

Note that:

[WRN] Filestore [XANOLE_OUTPUT_VALUES] Stream state detected prior state, could not locate msg block 5054
...
[WRN] Detected consumer '$G > XANOLE_OUTPUT_VALUES > consumer_default' ack floor 14084 is ahead of stream's last sequence 0

Block 5054 appears to be the last blk index recorded in index.db, but the corresponding block cannot be found on disk. This inconsistency may be caused by index.db and the blk files not being synced to disk atomically. When func (fs *fileStore) recoverFullState() (rerr error) in server/filestore.go cannot find the corresponding [index].blk, it falls back to setting the stream's LastSeq to 0. The consumer's metadata is persisted independently, so when func (o *consumer) checkStateForInterestStream() error in server/consumer.go reads it and finds the recorded AckFloor higher than the stream's actual LastSeq, it reports the warning above.
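
For illustration, here is a minimal sketch of the shape of that check; the types and the function name below are simplified stand-ins, not the actual nats-server code:

package ackcheck

import "fmt"

// streamState and consumerState are illustrative stand-ins for the recovered
// state described above, not the real filestore structures.
type streamState struct {
    LastSeq uint64 // falls back to 0 when the recorded msg block is missing
}

type consumerState struct {
    AckFloorStreamSeq uint64 // persisted independently of the stream state
}

// checkAckFloor mirrors the consistency check: if recovery lost the stream's
// tail, the consumer's ack floor can point past the stream's last sequence,
// and the consumer stalls with the warning seen in the log above.
func checkAckFloor(cs consumerState, ss streamState) error {
    if cs.AckFloorStreamSeq > ss.LastSeq {
        return fmt.Errorf("consumer ack floor %d is ahead of stream's last sequence %d",
            cs.AckFloorStreamSeq, ss.LastSeq)
    }
    return nil
}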

By the way, I've noticed that the filestore's persistence and recovery mechanisms vary slightly across server versions.

I tested various sync intervals, including sync: always, but none of them seems to have much effect:

jetstream {
    store_dir: /userdata/nats
    max_mem: 100M
    max_file: 500M
    sync: always # 1s to 2min
}

The related issue is #4566. I believe this is a server-side problem and should not be circumvented by client-side operations like deleting and re-adding consumers.

This issue is beyond my ability to fix quickly. I earnestly seek assistance. @derekcollison @bruth

If you need more information or logs, please feel free to contact me, and I will provide them as soon as possible.

Expected behavior

After the server unexpectedly powers off and restarts, all consumers should be able to self-recover and messages should be consumed normally.
Some degree of data loss is acceptable.

Server and client version

I conducted power-off tests on the following server versions, and all exhibited the issue:

  • v2.10.6
  • v2.10.11
  • v2.10.14
  • v2.10.15
  • v2.11.0-preview.2
  • main (#31b155b)

All client versions have been cross-tested:

  • v1.30.0
  • v1.30.2
  • v1.34.1

Host environment

I have used multiple servers for testing, with various file systems (xfs, ext4) and types of disks (nano flash, SSD, mechanical hard disk), to rule out randomness.

For example:

Linux XXX 5.10.110 #3 SMP Tue Oct 24 15:37:17 CST 2023 aarch64 GNU/Linux, ext4, nano flash
Linux XXX 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 GNU/Linux, xfs, SSD
Linux XXX 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 GNU/Linux, xfs, HDD

Steps to reproduce

  • Use a single-node NATS JetStream instance running on a standalone server. The server configuration is as follows:
jetstream {
    store_dir: /userdata/nats
    max_mem: 100M
    max_file: 500M
    sync: always # 1s to 2min
}
  • Create the stream and consumer (a rough client-side sketch of this step is shown after this list). The consumer configuration is as follows:
Information for Consumer XANOLE_OUTPUT_VALUES > consumer_default created 2024-05-11T16:03:32+08:00

Configuration:

                    Name: consumer_default
        Delivery Subject: _INBOX.YOhDaj4BWhBtA4ueOQxq0x
          Filter Subject: xanole.outputs.xxx>
          Deliver Policy: All
     Deliver Queue Group: consumer_default
              Ack Policy: Explicit
                Ack Wait: 30.00s
           Replay Policy: Instant
         Max Ack Pending: 1,000
            Flow Control: false

State:

  Last Delivered Message: Consumer sequence: 1,074,833 Stream sequence: 1,074,913 Last delivery: 423µs ago
    Acknowledgment Floor: Consumer sequence: 1,074,832 Stream sequence: 1,074,912 Last Ack: 2ms ago
        Outstanding Acks: 1 out of maximum 1,000
    Redelivered Messages: 0
    Unprocessed Messages: 0
         Active Interest: Active using Queue Group xhouse_inputs_agents_default
  • After a period of normal production and consumption, cut the server's power supply directly and then restart it; the issue reproduces almost 100% of the time.

  • Using reboot or kill -9 does not seem to reproduce the issue.
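
For reference, a rough nats.go sketch of how the stream and the push consumer above could be created; the stream name and consumer settings mirror the info block, while the subjects and the use of a durable name are assumptions (the report redacts the exact filter subject):

package main

import (
    "time"

    "github.com/nats-io/nats.go"
)

func main() {
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        panic(err)
    }
    defer nc.Drain()

    js, err := nc.JetStream()
    if err != nil {
        panic(err)
    }

    // File-backed stream; the subject is a placeholder standing in for the
    // redacted subjects in the report.
    if _, err := js.AddStream(&nats.StreamConfig{
        Name:     "XANOLE_OUTPUT_VALUES",
        Subjects: []string{"xanole.outputs.>"},
        Storage:  nats.FileStorage,
    }); err != nil {
        panic(err)
    }

    // Push consumer with explicit acks and a deliver queue group, matching
    // the consumer info above; treating it as a durable is an assumption.
    if _, err := js.AddConsumer("XANOLE_OUTPUT_VALUES", &nats.ConsumerConfig{
        Durable:        "consumer_default",
        DeliverSubject: nats.NewInbox(),
        DeliverGroup:   "consumer_default",
        FilterSubject:  "xanole.outputs.>", // placeholder for the redacted filter
        DeliverPolicy:  nats.DeliverAllPolicy,
        AckPolicy:      nats.AckExplicitPolicy,
        AckWait:        30 * time.Second,
        ReplayPolicy:   nats.ReplayInstantPolicy,
        MaxAckPending:  1000,
    }); err != nil {
        panic(err)
    }
}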

@stitchcula added the defect (Suspected defect such as a bug or regression) label on May 13, 2024
@derekcollison (Member)

After cutting the power, could you snapshot the store directory and share it with us privately, after the reboot but before you restart the nats-server?

Also, what size payloads have you set the system to use? I see it is complaining that they are over 8 MB, which we strongly discourage.

@derekcollison (Member)

Or provide more information: how many messages are inbound to the WQ, how big are they, and how are the consumers operating? Does the stream usually hold messages, or do the consumers usually consume messages quickly and keep the count very low or at zero?

@bfoxstudio (Contributor)

Same problem in our KV tests when the container is restarted by draining the Docker node or rebooting the VM; see #5205.

@yuzhou-nj

I have encountered this problem too. The consumer's sequence exceeds the stream's, so I cannot get new messages from the stream until the stream's sequence catches up with the consumer's.

@derekcollison (Member)

We are trying to reproduce this, so any and all information on how you are interacting with the stream is helpful.

How fast are messages being sent into the system?
How fast are they being consumed?

In general, is the stream near empty or at empty most of the time?

@katrinwab

In our case, 1 msg per second was sent.
The cluster has 5 nodes.

We drained the nodes holding the meta leader and the stream leader, waited 2 minutes, then activated the nodes again.
After all that, the consumer didn't see all messages (some messages were being pulled, but jetstream_consumer_num_pending kept increasing).

@katrinwab

We also have problems when we drain a non-leader node.
Afterwards, jetstream_stream_last_seq on the drained node is less than on the non-drained nodes, and that difference increases over time.

@wallyqs (Member) commented May 14, 2024

Thanks everyone for sharing the symptoms from this issue; we're currently looking for a reproducible example that trips this condition.
@stitchcula I see in the logs that you have a warning about maximum payloads over 8.00 MB. Do you have an idea of the size of the messages you are sending, or whether persisted messages are close to 8 MB?

@bfoxstudio (Contributor)

> Thanks everyone for sharing the symptoms from this issue; we're currently looking for a reproducible example that trips this condition. @stitchcula I see in the logs that you have a warning about maximum payloads over 8.00 MB. Do you have an idea of the size of the messages you are sending, or whether persisted messages are close to 8 MB?

We have standard 1 MB payloads. We tested versions from master back to 2.10.5, and whenever the Docker server is restarted, the consumer problem and the last-sequence problem reproduce.

@stitchcula (Author) commented May 15, 2024

In my testing, the payload size is around 50 KB, with a consumption rate of 100 messages per second. The consumers usually consume messages pretty quickly and keep the number of pending messages very low.

To avoid incidents in production, I temporarily forked a branch and made an aggressive fix: if consumer initialization finds the ack floor higher than the stream's lastSeq, it forcibly resets the ack floor to lastSeq.

This allows the consumer to consume normally, but data loss occurs. I believe the fundamental issue is the stream's persistence mechanism: it should be improved so that a power outage does not lose the cursor.
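
The workaround amounts to something like the sketch below; this is only an illustration of the idea, not the actual patch from my fork:

// clampAckFloor illustrates the aggressive fix: if recovery left the
// consumer's ack floor past the stream's last sequence, pull it back to
// lastSeq so delivery can resume, accepting that the cursor position (and
// therefore some data) is lost.
func clampAckFloor(ackFloorStreamSeq, streamLastSeq uint64) uint64 {
    if ackFloorStreamSeq > streamLastSeq {
        return streamLastSeq
    }
    return ackFloorStreamSeq
}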

@stitchcula (Author)

> snapshot the store directory and share with us privately after reboot but before you restart the nats-server

I have already taken a snapshot of the store directory after rebooting but before restarting the nats-server. How can I share it with you?

Thanks for the hard work!

@wallyqs (Member) commented May 15, 2024

@stitchcula can you send it to wally@nats.io, if possible?
