Added WAL compression checksum (#10319)

Summary:
Enabled the zstd checksum flag in StreamingCompress so that WAL (de)compression is protected by a checksum per compression frame.
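
For context, here is a minimal standalone sketch of the zstd streaming calls this change builds on (buffering and error handling simplified; illustrative only, not the RocksDB `StreamingCompress` wrapper):

```cpp
#include <cassert>
#include <vector>

#include <zstd.h>

int main() {
  ZSTD_CCtx* cctx = ZSTD_createCCtx();
  assert(cctx != nullptr);
  // With the checksum flag set, zstd appends a 32-bit checksum of the
  // uncompressed content to every frame it produces.
  ZSTD_CCtx_setParameter(cctx, ZSTD_c_checksumFlag, 1);

  const char src[] = "bytes that would be one WAL compression unit";
  std::vector<char> dst(ZSTD_compressBound(sizeof(src)));
  ZSTD_inBuffer input = {src, sizeof(src), 0};
  ZSTD_outBuffer output = {dst.data(), dst.size(), 0};
  // ZSTD_e_end flushes all pending data and closes the frame, which is
  // the point at which the checksum is written.
  size_t remaining = ZSTD_compressStream2(cctx, &output, &input, ZSTD_e_end);
  assert(!ZSTD_isError(remaining) && remaining == 0);
  // output.pos now holds the compressed size, checksum included.
  ZSTD_freeCCtx(cctx);
  return 0;
}
```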

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10319

Test Plan:
- `make check`
- WAL perf: average ops/sec over 10 runs is 161226 before this PR and 159635 after (~1% drop), measured with:
```
sudo TEST_TMPDIR=/dev/shm/memtable_write ./db_bench_checksum -benchmarks=fillseq -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=10 -wal_compression=zstd
```

Reviewed By: ajkr

Differential Revision: D37673311

Pulled By: cbi42

fbshipit-source-id: 9f34a3bfc2a82e5c80b1ec63bb339a7465108ec9
Branch: main
Changyu Bi, committed by Facebook GitHub Bot
commit 5f9fe7f21e (parent 86c2d0a95d)

Files changed: HISTORY.md (7), db/db_wal_test.cc (7), util/compression.h (2)

HISTORY.md:

```diff
@@ -4,10 +4,10 @@
 * Mempurge option flag `experimental_mempurge_threshold` is now a ColumnFamilyOptions and can now be dynamically configured using `SetOptions()`.
 * Support backward iteration when `ReadOptions::iter_start_ts` is set.
 * Provide support for ReadOptions.async_io with direct_io to improve Seek latency by using async IO to parallelize child iterator seek and doing asynchronous prefetching on sequential scans.
 * Added support for blob caching in order to cache frequently used blobs for BlobDB.
 * User can configure the new ColumnFamilyOptions `blob_cache` to enable/disable blob caching.
 * Either sharing the backend cache with the block cache or using a completely separate cache is supported.
 * A new abstraction interface called `BlobSource` for blob read logic gives all users access to blobs, whether they are in the blob cache, secondary cache, or (remote) storage. Blobs can be potentially read both while handling user reads (`Get`, `MultiGet`, or iterator) and during compaction (while dealing with compaction filters, Merges, or garbage collection) but eventually all blob reads go through `Version::GetBlob` or, for MultiGet, `Version::MultiGetBlob` (and then get dispatched to the interface -- `BlobSource`).
 ### Public API changes
 * Add metadata related structs and functions in C API, including
@@ -28,6 +28,7 @@
 ## Behavior Change
 * In leveled compaction with dynamic levelling, level multiplier is not anymore adjusted due to oversized L0. Instead, compaction score is adjusted by increasing size level target by adding incoming bytes from upper levels. This would deprioritize compactions from upper levels if more data from L0 is coming. This is to fix some unnecessary full stalling due to drastic change of level targets, while not wasting write bandwidth for compaction while writes are overloaded.
 * For track_and_verify_wals_in_manifest, revert to the original behavior before #10087: syncing of live WAL file is not tracked, and we track only the synced sizes of **closed** WALs. (PR #10330).
+* WAL compression now computes/verifies checksum during compression/decompression.
 ## 7.4.0 (06/19/2022)
 ### Bug Fixes
```

db/db_wal_test.cc:

```diff
@@ -1449,7 +1449,7 @@ TEST_P(DBWALTestWithParams, kAbsoluteConsistency) {
   // fill with new date
   RecoveryTestHelper::FillData(this, &options);
   // corrupt the wal
-  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .3,
+  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .33,
                                  /*len%=*/.1, wal_file_id, trunc);
   // verify
   options.wal_recovery_mode = WALRecoveryMode::kAbsoluteConsistency;
@@ -1602,7 +1602,10 @@ TEST_P(DBWALTestWithParams, kPointInTimeRecovery) {
   const size_t row_count = RecoveryTestHelper::FillData(this, &options);
   // Corrupt the wal
-  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .3,
+  // The offset here was 0.3 which cuts off right at the end of a
+  // valid fragment after wal zstd compression checksum is enabled,
+  // so changed the value to 0.33.
+  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .33,
                                  /*len%=*/.1, wal_file_id, trunc);
   // Verify
```

util/compression.h:

```diff
@@ -1733,6 +1733,8 @@ class ZSTDStreamingCompress final : public StreamingCompress {
                         max_output_len) {
 #ifdef ZSTD_STREAMING
   cctx_ = ZSTD_createCCtx();
+  // Each compressed frame will have a checksum
+  ZSTD_CCtx_setParameter(cctx_, ZSTD_c_checksumFlag, 1);
   assert(cctx_ != nullptr);
   input_buffer_ = {/*src=*/nullptr, /*size=*/0, /*pos=*/0};
 #endif
```
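
Only the compression side needs a code change because zstd's streaming decompressor validates a frame checksum automatically whenever one is present; a mismatch surfaces as an error return from `ZSTD_decompressStream`. A standalone sketch of that behavior, assuming the input is whole frames (`DecompressAndVerify` is a hypothetical helper for illustration, not the RocksDB uncompression path):

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

#include <zstd.h>

// Returns true iff the input decompressed cleanly. A frame whose checksum
// does not match its content makes ZSTD_decompressStream fail with
// ZSTD_error_checksum_wrong; no explicit verification call is needed.
bool DecompressAndVerify(const void* compressed, size_t compressed_size) {
  ZSTD_DCtx* dctx = ZSTD_createDCtx();
  assert(dctx != nullptr);
  std::vector<char> scratch(ZSTD_DStreamOutSize());
  ZSTD_inBuffer input = {compressed, compressed_size, 0};
  bool ok = true;
  while (input.pos < input.size) {
    // Fresh output window each pass; decompressed bytes land in scratch.
    ZSTD_outBuffer output = {scratch.data(), scratch.size(), 0};
    size_t ret = ZSTD_decompressStream(dctx, &output, &input);
    if (ZSTD_isError(ret)) {
      // e.g. "Restored data doesn't match checksum" on a corrupted frame.
      std::fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(ret));
      ok = false;
      break;
    }
  }
  ZSTD_freeDCtx(dctx);
  return ok;
}
```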
