storage: alter default cluster setting for sstable compression algorithm #123953
Open
Tracked by
#123950
Labels
A-storage
Relating to our storage engine (Pebble) on-disk storage.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-storage
Storage Team
nicktrav added the C-enhancement, A-storage, and T-storage labels on May 10, 2024
4 tasks
nicktrav added a commit to nicktrav/cockroach that referenced this issue on May 10, 2024

In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used as the codec for SSTables. Initial testing has shown this algorithm to provide a better compression ratio than snappy, at the cost of a minor increase in CPU. This unlocks benefits such as an improved node density, as more data can reside in each store.

Alter the default SSTable compression algorithm to `zstd`.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression algorithm was changed from snappy to zstd. The latter has been shown to improve a store's compression ratio at the cost of a minor CPU increase. Existing clusters can opt into this new compression algorithm by setting `storage.sstable.compression_algorithm` to `zstd`. Existing SSTables compressed with the snappy compression algorithm will NOT be actively re-written. Instead, SSTables are re-written over time with the new compression algorithm as existing SSTables are compacted.
nicktrav added a commit to nicktrav/cockroach that referenced this issue on May 15, 2024

Currently, there exists a single cluster setting, `storage.sstable.compression_algorithm`, that controls the compression algorithm used by the cluster when writing SSTables. This cluster setting currently defaults to `snappy`.

There are various end destinations and use cases for SSTables: those that reside in a Pebble store (the most common), those sent over the wire via `AddSSTable`, and those generated as part of a backup for storage in S3/GCS. Each of these destinations and use cases benefits from being able to independently alter the compression algorithm.

In addition to the existing cluster setting, add two backup-specific cluster settings:

- `storage.sstable.compression_algorithm_backup_storage`: applies to SSTs generated as part of a backup, where SSTs will reside in blob storage. These require compression, but not at the cost of additional CPU at generation time. Defaults to `snappy`, the existing algorithm used.
- `storage.sstable.compression_algorithm_backup_transport`: applies to SSTs generated as part of a backup that are sent, immediately iterated, and then discarded (i.e. they are never persisted). Such SSTs typically have larger block sizes and benefit from compression. Defaults to `snappy`, the existing algorithm used.

While this change introduces new cluster settings that allow independently altering the compression algorithm for different types of SSTs, the existing compression behavior is unchanged.

Touches cockroachdb#123953.

Release note (general change): Adds two new cluster settings, `storage.sstable.compression_algorithm_backup_storage` and `storage.sstable.compression_algorithm_backup_transport`, that, in addition to the existing cluster setting `storage.sstable.compression_algorithm`, can be used to alter the compression algorithm used for various types of SSTs.
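As a rough illustration of the split described in the commit message above, the following Go sketch maps each kind of SST to the cluster setting that would govern its compression. Only the three setting names come from the commit message; the `sstKind` type, its constants, and the `settingName` helper are hypothetical and are not taken from the CockroachDB code base.

```go
package main

import "fmt"

// sstKind is a hypothetical label for where an SSTable is headed.
type sstKind int

const (
	kindEngine          sstKind = iota // local Pebble store, or sent via AddSSTable
	kindBackupStorage                  // backup SSTs persisted in blob storage
	kindBackupTransport                // backup SSTs that are sent, iterated, then discarded
)

// settingName returns the cluster setting that governs compression for the
// given kind of SST. The setting names are the ones listed in the commit
// message; the mapping function itself is an assumption for illustration.
func settingName(k sstKind) string {
	switch k {
	case kindBackupStorage:
		return "storage.sstable.compression_algorithm_backup_storage"
	case kindBackupTransport:
		return "storage.sstable.compression_algorithm_backup_transport"
	default:
		return "storage.sstable.compression_algorithm"
	}
}

func main() {
	for _, k := range []sstKind{kindEngine, kindBackupStorage, kindBackupTransport} {
		fmt.Println(settingName(k))
	}
}
```

Keeping the backup settings separate means the default for engine SSTs can change later without altering what backups produce.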
nicktrav added a commit to nicktrav/cockroach that referenced this issue on May 15, 2024

In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used as the codec for SSTables. Initial testing has shown this algorithm to provide a better compression ratio than snappy, at the cost of a minor increase in CPU. This unlocks benefits such as an improved node density, as more data can reside in each store.

Alter the default SSTable compression algorithm to `zstd`. Note that this change only applies to SSTs written directly into a local Pebble store, or generated to send over the wire for ingestion into a remote store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left unchanged.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression algorithm was changed from snappy to zstd. The latter has been shown to improve a store's compression ratio at the cost of a minor CPU increase. Existing clusters can opt into this new compression algorithm by setting `storage.sstable.compression_algorithm` to `zstd`. Existing SSTables compressed with the snappy compression algorithm will NOT be actively re-written. Instead, SSTables are re-written over time with the new compression algorithm as existing SSTables are compacted.
nicktrav added a commit to nicktrav/cockroach that referenced this issue on May 18, 2024

Fix an issue where the compression cluster setting is being set on a copy of the per-level configuration, rather than the configuration that is ultimately passed to Pebble.

Touches cockroachdb#123953.

Release note: None.
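The commit message above does not show the offending code, so the following Go sketch only illustrates one common way this class of bug arises: a `range` loop yields a copy of each slice element, so writes to the loop variable never reach the slice that is later handed to the engine. The `levelOptions` type and both helper functions are hypothetical stand-ins, not the actual CockroachDB or Pebble code.

```go
package main

import "fmt"

// levelOptions is a hypothetical stand-in for a per-level engine configuration.
type levelOptions struct {
	Compression string
}

// setOnCopy shows the buggy pattern: l is a copy of each element, so the
// assignment is lost when the iteration ends.
func setOnCopy(levels []levelOptions, algo string) {
	for _, l := range levels {
		l.Compression = algo
		_ = l // the modified copy is discarded here
	}
}

// setInPlace indexes into the slice, so the write lands on the configuration
// that the caller will actually use.
func setInPlace(levels []levelOptions, algo string) {
	for i := range levels {
		levels[i].Compression = algo
	}
}

func main() {
	levels := []levelOptions{{Compression: "snappy"}, {Compression: "snappy"}}

	setOnCopy(levels, "zstd")
	fmt.Println(levels[0].Compression) // still the old value

	setInPlace(levels, "zstd")
	fmt.Println(levels[0].Compression) // now the new value
}
```

Indexing into the slice (or iterating over pointers) is the usual remedy when the elements are structs stored by value.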
craig bot pushed a commit that referenced this issue on May 21, 2024

124388: storage: fix setting of compression algorithm r=aadityasondhi a=nicktrav

Fix an issue where the compression cluster setting is being set on a copy of the per-level configuration, rather than the configuration that is ultimately passed to Pebble.

Touches #123953.

Release note: None.

Epic: CRDB-37583

Co-authored-by: Nick Travers <travers@cockroachlabs.com>
craig bot pushed a commit that referenced this issue on May 21, 2024

124388: storage: fix setting of compression algorithm r=RaduBerinde a=nicktrav

Fix an issue where the compression cluster setting is being set on a copy of the per-level configuration, rather than the configuration that is ultimately passed to Pebble.

Touches #123953.

Release note: None.

Epic: CRDB-37583

Co-authored-by: Nick Travers <travers@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this issue on May 21, 2024

Fix an issue where the compression cluster setting is being set on a copy of the per-level configuration, rather than the configuration that is ultimately passed to Pebble.

Touches #123953.

Release note: None.
Set `storage.sstable.compression_algorithm` to `zstd`.

There are likely some tests that will need to be altered to take into account SSTable size after being compressed with zstd, rather than snappy.
Jira issue: CRDB-38625