Fate of Data When durability.enabled=false #533
-
Data is spread across DRAM and solid-state disks. If durability is disabled but a key-value pair currently in DRAM needs to be evicted to disk to free up space in DRAM, does that mean the key would be removed from the "cached updates"? If so, the application has no idea what is persisted on disk and what lives only in DRAM. And if the data always fits in DRAM, is no data ever resident on disk?
Replies: 3 comments
-
Hi Victor, I'll add a bit, and I'm sure my colleagues will add more tomorrow, and maybe even correct me :).

HSE has an in-memory component (c0) and an on-media component (cN). In addition to those two, it has a write-ahead log (WAL). Note that writing to the WAL and writing to cN are not the same thing. When you disable durability, you are disabling the WAL.

In addition to toggling the WAL, we also let the user set the durability interval, which controls the amount of time between flushes to the WAL. Say your app has durability enabled with a sync interval of 5s. In the event of a crash, you can lose up to 5 seconds of data, because HSE flushes data to the WAL every 5 seconds so that it can replay that data after a crash. As data gets ingested from c0 to cN, the WAL is garbage collected.

In the case that you describe, you could potentially lose all your new data in the event of a crash. It would probably be advisable to call Of course, if your app is operating on read-only data, you should mark the
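To make the interval behaviour concrete, here is a toy model in Python. It is only a sketch of the idea described above and not HSE's implementation or API: the class `ToyStore`, its methods, and the explicit `tick(now)` clock are all hypothetical names invented for illustration.

```python
class ToyStore:
    """Toy model of c0 + WAL + cN. Hypothetical; not HSE code."""

    def __init__(self, wal_enabled=True, flush_interval=5.0):
        self.wal_enabled = wal_enabled
        self.flush_interval = flush_interval
        self.c0 = {}          # in-memory updates not yet ingested into cN
        self.wal_buffer = []  # WAL records still in memory (lost on crash)
        self.wal = []         # WAL records flushed to media (survive a crash)
        self.cn = {}          # on-media key-value data
        self.last_flush = 0.0

    def put(self, key, value):
        self.c0[key] = value
        if self.wal_enabled:
            self.wal_buffer.append((key, value))

    def tick(self, now):
        # Flush buffered WAL records once the durability interval has elapsed.
        if self.wal_enabled and now - self.last_flush >= self.flush_interval:
            self.wal.extend(self.wal_buffer)
            self.wal_buffer.clear()
            self.last_flush = now

    def ingest(self):
        # Move c0 into cN; the matching WAL records are no longer needed.
        self.cn.update(self.c0)
        self.c0.clear()
        self.wal.clear()
        self.wal_buffer.clear()

    def crash_and_recover(self):
        # After a crash only cN and the flushed part of the WAL remain.
        recovered = dict(self.cn)
        for key, value in self.wal:
            recovered[key] = value
        return recovered


def simulate(wal_enabled):
    store = ToyStore(wal_enabled=wal_enabled, flush_interval=5.0)
    store.put("a", 1)
    store.put("b", 2)
    store.tick(now=5.0)   # interval elapsed: "a" and "b" reach the WAL
    store.put("c", 3)     # written after the last flush
    return store.crash_and_recover()


if __name__ == "__main__":
    print(simulate(wal_enabled=True))
    print(simulate(wal_enabled=False))
```

In this model, the run with the WAL enabled loses only the update made after the last flush, while the run with it disabled loses every update that was not yet ingested into cN.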
-
Cached KV pairs are never evicted from DRAM before being written to media, regardless of whether journaling is enabled or disabled. Furthermore, even with journaling disabled, the data on media is guaranteed to be consistent. That is, updates performed in a transaction are always all-or-nothing.

There are multiple benefits to enabling journaling: 1) you can configure a flush interval to bound the amount of data lost in the event of a power loss or other failure without the app having to explicitly call sync(), and 2) performance is often better because we have more flexibility in when and how we ingest data (into cN) that has been written to the journal.

In short, we recommend keeping the journal enabled unless something specific about your application makes it advantageous not to.
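The two guarantees above can be pictured with another small Python toy model. This is a sketch under stated assumptions, not HSE's implementation: `ToyKvdb`, `commit`, `crash`, and this `sync` method are hypothetical stand-ins, and the real `sync()` call mentioned in this reply is not shown in the thread.

```python
class ToyKvdb:
    """Toy model with journaling disabled. Hypothetical; not HSE code."""

    def __init__(self):
        self.c0 = {}     # cached updates held in DRAM
        self.media = {}  # data persisted on media

    def commit(self, updates):
        # All updates of one transaction enter the cache together.
        self.c0.update(updates)

    def get(self, key):
        # Cached data stays readable until it has been written to media.
        if key in self.c0:
            return self.c0[key]
        return self.media.get(key)

    def sync(self):
        # Build the new on-media state first, then switch to it in one step,
        # so media never holds part of a transaction. Only then drop the cache.
        new_media = {**self.media, **self.c0}
        self.media = new_media
        self.c0 = {}

    def crash(self):
        # A crash discards whatever was only in DRAM.
        self.c0 = {}


def demo():
    db = ToyKvdb()
    db.commit({"a": 1, "b": 2})   # transaction 1
    db.sync()                     # the app bounds its loss explicitly
    db.commit({"a": 10, "c": 3})  # transaction 2, never synced
    db.crash()
    return db.media


if __name__ == "__main__":
    print(demo())
```

After the simulated crash, media holds all of transaction 1 and none of transaction 2, which is the all-or-nothing behaviour described above. Without a journal, the explicit `sync()` is the only thing in this model that limits how much is lost.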