From 5f9fe7f21ed7c538355ed66fc3d044dff30f2247 Mon Sep 17 00:00:00 2001
From: Changyu Bi
Date: Wed, 13 Jul 2022 15:29:20 -0700
Subject: [PATCH] Added WAL compression checksum (#10319)

Summary:
Enabled zstd checksum flag in StreamingCompress so that WAL (de)compression is protected by a checksum per compression frame.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10319

Test Plan:
- `make check`
- WAL perf: average ops/sec over 10 runs is 161226 pre PR and 159635 post PR (1% drop).
```
sudo TEST_TMPDIR=/dev/shm/memtable_write ./db_bench_checksum -benchmarks=fillseq -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=10 -wal_compression=zstd
```

Reviewed By: ajkr

Differential Revision: D37673311

Pulled By: cbi42

fbshipit-source-id: 9f34a3bfc2a82e5c80b1ec63bb339a7465108ec9
---
 HISTORY.md         | 7 ++++---
 db/db_wal_test.cc  | 7 +++++--
 util/compression.h | 2 ++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/HISTORY.md b/HISTORY.md
index 1840dc7ac..a91faba55 100644
--- a/HISTORY.md
+++ b/HISTORY.md
@@ -4,10 +4,10 @@
 * Mempurge option flag `experimental_mempurge_threshold` is now a ColumnFamilyOptions and can now be dynamically configured using `SetOptions()`.
 * Support backward iteration when `ReadOptions::iter_start_ts` is set.
 * Provide support for ReadOptions.async_io with direct_io to improve Seek latency by using async IO to parallelize child iterator seek and doing asynchronous prefetching on sequential scans.
-* Added support for blob caching in order to cache frequently used blobs for BlobDB. 
+* Added support for blob caching in order to cache frequently used blobs for BlobDB.
   * User can configure the new ColumnFamilyOptions `blob_cache` to enable/disable blob caching.
-  * Either sharing the backend cache with the block cache or using a completely separate cache is supported. 
-  * A new abstraction interface called `BlobSource` for blob read logic gives all users access to blobs, whether they are in the blob cache, secondary cache, or (remote) storage. Blobs can be potentially read both while handling user reads (`Get`, `MultiGet`, or iterator) and during compaction (while dealing with compaction filters, Merges, or garbage collection) but eventually all blob reads go through `Version::GetBlob` or, for MultiGet, `Version::MultiGetBlob` (and then get dispatched to the interface -- `BlobSource`). 
+  * Either sharing the backend cache with the block cache or using a completely separate cache is supported.
+  * A new abstraction interface called `BlobSource` for blob read logic gives all users access to blobs, whether they are in the blob cache, secondary cache, or (remote) storage. Blobs can be potentially read both while handling user reads (`Get`, `MultiGet`, or iterator) and during compaction (while dealing with compaction filters, Merges, or garbage collection) but eventually all blob reads go through `Version::GetBlob` or, for MultiGet, `Version::MultiGetBlob` (and then get dispatched to the interface -- `BlobSource`).
 
 ### Public API changes
 * Add metadata related structs and functions in C API, including
@@ -28,6 +28,7 @@
 ## Behavior Change
 * In leveled compaction with dynamic levelling, level multiplier is not anymore adjusted due to oversized L0. Instead, compaction score is adjusted by increasing size level target by adding incoming bytes from upper levels. This would deprioritize compactions from upper levels if more data from L0 is coming. This is to fix some unnecessary full stalling due to drastic change of level targets, while not wasting write bandwidth for compaction while writes are overloaded.
 * For track_and_verify_wals_in_manifest, revert to the original behavior before #10087: syncing of live WAL file is not tracked, and we track only the synced sizes of **closed** WALs. (PR #10330).
+* WAL compression now computes/verifies checksum during compression/decompression.
 
 ## 7.4.0 (06/19/2022)
 ### Bug Fixes
diff --git a/db/db_wal_test.cc b/db/db_wal_test.cc
index 54451ff47..96b5d4f91 100644
--- a/db/db_wal_test.cc
+++ b/db/db_wal_test.cc
@@ -1449,7 +1449,7 @@ TEST_P(DBWALTestWithParams, kAbsoluteConsistency) {
   // fill with new date
   RecoveryTestHelper::FillData(this, &options);
   // corrupt the wal
-  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .3,
+  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .33,
                                  /*len%=*/.1, wal_file_id, trunc);
   // verify
   options.wal_recovery_mode = WALRecoveryMode::kAbsoluteConsistency;
@@ -1602,7 +1602,10 @@ TEST_P(DBWALTestWithParams, kPointInTimeRecovery) {
   const size_t row_count = RecoveryTestHelper::FillData(this, &options);
 
   // Corrupt the wal
-  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .3,
+  // The offset here was 0.3 which cuts off right at the end of a
+  // valid fragment after wal zstd compression checksum is enabled,
+  // so changed the value to 0.33.
+  RecoveryTestHelper::CorruptWAL(this, options, corrupt_offset * .33,
                                  /*len%=*/.1, wal_file_id, trunc);
 
   // Verify
diff --git a/util/compression.h b/util/compression.h
index d310ca09d..4fccbdb00 100644
--- a/util/compression.h
+++ b/util/compression.h
@@ -1733,6 +1733,8 @@ class ZSTDStreamingCompress final : public StreamingCompress {
                           max_output_len) {
 #ifdef ZSTD_STREAMING
     cctx_ = ZSTD_createCCtx();
+    // Each compressed frame will have a checksum
+    ZSTD_CCtx_setParameter(cctx_, ZSTD_c_checksumFlag, 1);
     assert(cctx_ != nullptr);
     input_buffer_ = {/*src=*/nullptr, /*size=*/0, /*pos=*/0};
 #endif
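
Illustrative note (not part of the patch): the sketch below is a minimal standalone C++ program showing what enabling `ZSTD_c_checksumFlag` does, assuming zstd >= 1.4.0; the payload and buffer sizes are made up for the example. With the flag set, each compressed frame carries a checksum of the uncompressed content, and `ZSTD_decompressStream` verifies it at the end of the frame, returning an error code instead of handing back silently corrupted bytes.

```
// Hypothetical standalone demo (not RocksDB code): per-frame zstd checksum,
// as enabled by ZSTD_c_checksumFlag in ZSTDStreamingCompress above.
#include <zstd.h>

#include <cstdio>
#include <string>
#include <vector>

int main() {
  const std::string payload(8 * 1024, 'w');  // stand-in for a WAL fragment

  // Compression side: enable the per-frame content checksum.
  ZSTD_CCtx* cctx = ZSTD_createCCtx();
  ZSTD_CCtx_setParameter(cctx, ZSTD_c_checksumFlag, 1);

  std::vector<char> compressed(ZSTD_compressBound(payload.size()));
  ZSTD_inBuffer in = {payload.data(), payload.size(), 0};
  ZSTD_outBuffer out = {compressed.data(), compressed.size(), 0};
  // ZSTD_e_end flushes the frame epilogue, which now includes the checksum.
  size_t remaining = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_end);
  ZSTD_freeCCtx(cctx);
  if (ZSTD_isError(remaining) || remaining != 0) {
    std::fprintf(stderr, "compression failed\n");
    return 1;
  }
  compressed.resize(out.pos);

  // Decompression side: zstd recomputes the checksum over the reconstructed
  // content and returns an error code if it does not match the stored value.
  ZSTD_DCtx* dctx = ZSTD_createDCtx();
  std::vector<char> roundtrip(payload.size());
  ZSTD_inBuffer din = {compressed.data(), compressed.size(), 0};
  ZSTD_outBuffer dout = {roundtrip.data(), roundtrip.size(), 0};
  size_t ret = ZSTD_decompressStream(dctx, &dout, &din);
  ZSTD_freeDCtx(dctx);

  if (ZSTD_isError(ret)) {
    std::fprintf(stderr, "decompression failed: %s\n", ZSTD_getErrorName(ret));
    return 1;
  }
  std::printf("round-trip ok, %zu bytes\n", dout.pos);
  return 0;
}
```

Because the verification happens inside zstd per compression frame, a corrupted WAL fragment surfaces as a decompression error during recovery, which is the behavior documented by the HISTORY.md line added in this patch.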