Ribbon long-term support, starting level support (#8198)

Summary:
Since the Ribbon filter schema seems good (compatible back to
6.15.0), this change commits to long term support of the SST schema,
even though we expect the API for enabling Ribbon to change (still
called NewExperimentalRibbonFilterPolicy).

This also adds support for "hybrid" configuration in which some levels
use Bloom (higher levels, lower numbered) for speed and the rest use
Ribbon (lower levels, higher numbered) for memory space efficiency.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8198

Test Plan: unit test added, crash test support

Reviewed By: jay-zhuang

Differential Revision: D27831232

Pulled By: pdillinger

fbshipit-source-id: 90e528677689474d293ed6710b42ba89fbd5b5ab
main
Peter Dillinger 3 years ago committed by Facebook GitHub Bot
parent 90e245697f
commit 10196d7edc
  1. 2
      HISTORY.md
  2. 2
      db_stress_tool/db_stress_common.h
  3. 4
      db_stress_tool/db_stress_gflags.cc
  4. 9
      db_stress_tool/db_stress_test_base.cc
  5. 34
      include/rocksdb/filter_policy.h
  6. 9
      options/options_test.cc
  7. 75
      table/block_based/filter_policy.cc
  8. 49
      table/block_based/filter_policy_internal.h
  9. 2
      tools/db_crashtest.py
  10. 45
      util/bloom_test.cc

@ -28,6 +28,8 @@
* Added BackupEngine support for integrated BlobDB, with blob files shared between backups when table files are shared. Because of current limitations, blob files always use the kLegacyCrc32cAndFileSize naming scheme, and incremental backups must read and checksum all blob files in a DB, even for files that are already backed up.
* Added an optional output parameter to BackupEngine::CreateNewBackup(WithMetadata) to return the BackupID of the new backup.
* Added BackupEngine::GetBackupInfo / GetLatestBackupInfo for querying individual backups.
* Made the Ribbon filter a long-term supported feature in terms of the SST schema(compatible with version >= 6.15.0) though the API for enabling it is expected to change.
* Added hybrid configuration of Ribbon filter and Bloom filter where some LSM levels use Ribbon for memory space efficiency and some use Bloom for speed. See NewExperimentalRibbonFilterPolicy. This also changes the default behavior of NewExperimentalRibbonFilterPolicy to use Bloom on level 0 and Ribbon on later levels.
## 6.19.0 (03/21/2021)
### Bug Fixes

@ -144,7 +144,7 @@ DECLARE_bool(enable_write_thread_adaptive_yield);
DECLARE_int32(reopen);
DECLARE_double(bloom_bits);
DECLARE_bool(use_block_based_filter);
DECLARE_bool(use_ribbon_filter);
DECLARE_int32(ribbon_starting_level);
DECLARE_bool(partition_filters);
DECLARE_bool(optimize_filters_for_memory);
DECLARE_int32(index_type);

@ -410,8 +410,8 @@ DEFINE_bool(use_block_based_filter, false,
"use block based filter"
"instead of full filter for block based table");
DEFINE_bool(use_ribbon_filter, false,
"Use Ribbon filter instead of Bloom filter");
DEFINE_int32(ribbon_starting_level, false,
"First level to use Ribbon filter instead of Bloom");
DEFINE_bool(partition_filters, false,
"use partitioned filters "

@ -26,11 +26,12 @@ StressTest::StressTest()
compressed_cache_(NewLRUCache(FLAGS_compressed_cache_size)),
filter_policy_(
FLAGS_bloom_bits >= 0
? FLAGS_use_ribbon_filter
? NewExperimentalRibbonFilterPolicy(FLAGS_bloom_bits)
? FLAGS_ribbon_starting_level < FLAGS_num_levels
? NewExperimentalRibbonFilterPolicy(
FLAGS_bloom_bits, FLAGS_ribbon_starting_level)
: FLAGS_use_block_based_filter
? NewBloomFilterPolicy(FLAGS_bloom_bits, true)
: NewBloomFilterPolicy(FLAGS_bloom_bits, false)
? NewBloomFilterPolicy(FLAGS_bloom_bits, true)
: NewBloomFilterPolicy(FLAGS_bloom_bits, false)
: nullptr),
db_(nullptr),
#ifndef ROCKSDB_LITE

@ -216,23 +216,33 @@ class FilterPolicy {
extern const FilterPolicy* NewBloomFilterPolicy(
double bits_per_key, bool use_block_based_builder = false);
// An EXPERIMENTAL new Bloom alternative that saves about 30% space
// compared to Bloom filters, with about 3-4x construction time and
// similar query times. For example, if you pass in 10 for
// An new Bloom alternative that saves about 30% space compared to
// Bloom filters, with about 3-4x construction time and similar
// query times. For example, if you pass in 10 for
// bloom_equivalent_bits_per_key, you'll get the same 0.95% FP rate
// as Bloom filter but only using about 7 bits per key. (This
// way of configuring the new filter is considered experimental
// and/or transitional, so is expected to go away.)
// and/or transitional, so is expected to be replaced with a new API.
// The constructed filters will be given long-term support.)
//
// Ribbon filters are ignored by previous versions of RocksDB, as if
// no filter was used.
// The space savings of Ribbon filters makes sense for lower (higher
// numbered; larger; longer-lived) levels of LSM, whereas the speed of
// Bloom filters make sense for highest levels of LSM. Setting
// ribbon_starting_level allows for this design. For example,
// ribbon_starting_level=1 means that Bloom filters will be used in
// level 0, including flushes, and Ribbon filters elsewhere.
// ribbon_starting_level=0 means (almost) always use Ribbon.
//
// Note: this policy can generate Bloom filters in some cases.
// For very small filters (well under 1KB), Bloom fallback is by
// design, as the current Ribbon schema is not optimized to save vs.
// Bloom for such small filters. Other cases of Bloom fallback should
// be exceptional and log an appropriate warning.
// Ribbon filters are compatible with RocksDB >= 6.15.0. Earlier
// versions reading the data will behave as if no filter was used
// (degraded performance until compaction rebuilds filters).
//
// Note: even with ribbon_starting_level=0, this policy can generate
// Bloom filters in some cases. For very small filters (well under 1KB),
// Bloom fallback is by design, as the current Ribbon schema is not
// optimized to save vs. Bloom for such small filters. Other cases of
// Bloom fallback should be exceptional and log an appropriate warning.
extern const FilterPolicy* NewExperimentalRibbonFilterPolicy(
double bloom_equivalent_bits_per_key);
double bloom_equivalent_bits_per_key, int ribbon_starting_level = 1);
} // namespace ROCKSDB_NAMESPACE

@ -940,6 +940,15 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
&new_opt));
ASSERT_TRUE(new_opt.filter_policy != nullptr);
bfp = dynamic_cast<const BloomFilterPolicy*>(new_opt.filter_policy.get());
// Not a BloomFilterPolicy
EXPECT_FALSE(bfp);
ASSERT_OK(GetBlockBasedTableOptionsFromString(
config_options, table_opt, "filter_policy=experimental_ribbon:5.678:0;",
&new_opt));
ASSERT_TRUE(new_opt.filter_policy != nullptr);
bfp = dynamic_cast<const BloomFilterPolicy*>(new_opt.filter_policy.get());
// Pure Ribbon configuration is (oddly) BloomFilterPolicy
EXPECT_EQ(bfp->GetMillibitsPerKey(), 5678);
EXPECT_EQ(bfp->GetMode(), BloomFilterPolicy::kStandard128Ribbon);

@ -23,6 +23,7 @@
#include "util/hash.h"
#include "util/ribbon_config.h"
#include "util/ribbon_impl.h"
#include "util/string_util.h"
namespace ROCKSDB_NAMESPACE {
@ -1054,7 +1055,7 @@ BloomFilterPolicy::BloomFilterPolicy(double bits_per_key, Mode mode)
BloomFilterPolicy::~BloomFilterPolicy() {}
const char* BloomFilterPolicy::Name() const {
const char* BuiltinFilterPolicy::Name() const {
return "rocksdb.BuiltinBloomFilter";
}
@ -1087,8 +1088,8 @@ void BloomFilterPolicy::CreateFilter(const Slice* keys, int n,
}
}
bool BloomFilterPolicy::KeyMayMatch(const Slice& key,
const Slice& bloom_filter) const {
bool BuiltinFilterPolicy::KeyMayMatch(const Slice& key,
const Slice& bloom_filter) const {
const size_t len = bloom_filter.size();
if (len < 2 || len > 0xffffffffU) {
return false;
@ -1110,7 +1111,7 @@ bool BloomFilterPolicy::KeyMayMatch(const Slice& key,
array);
}
FilterBitsBuilder* BloomFilterPolicy::GetFilterBitsBuilder() const {
FilterBitsBuilder* BuiltinFilterPolicy::GetFilterBitsBuilder() const {
// This code path should no longer be used, for the built-in
// BloomFilterPolicy. Internal to RocksDB and outside
// BloomFilterPolicy, only get a FilterBitsBuilder with
@ -1184,7 +1185,7 @@ FilterBitsBuilder* BloomFilterPolicy::GetBuilderFromContext(
// Read metadata to determine what kind of FilterBitsReader is needed
// and return a new one.
FilterBitsReader* BloomFilterPolicy::GetFilterBitsReader(
FilterBitsReader* BuiltinFilterPolicy::GetFilterBitsReader(
const Slice& contents) const {
uint32_t len_with_meta = static_cast<uint32_t>(contents.size());
if (len_with_meta <= kMetadataLen) {
@ -1265,7 +1266,7 @@ FilterBitsReader* BloomFilterPolicy::GetFilterBitsReader(
log2_cache_line_size);
}
FilterBitsReader* BloomFilterPolicy::GetRibbonBitsReader(
FilterBitsReader* BuiltinFilterPolicy::GetRibbonBitsReader(
const Slice& contents) const {
uint32_t len_with_meta = static_cast<uint32_t>(contents.size());
uint32_t len = len_with_meta - kMetadataLen;
@ -1289,7 +1290,7 @@ FilterBitsReader* BloomFilterPolicy::GetRibbonBitsReader(
}
// For newer Bloom filter implementations
FilterBitsReader* BloomFilterPolicy::GetBloomBitsReader(
FilterBitsReader* BuiltinFilterPolicy::GetBloomBitsReader(
const Slice& contents) const {
uint32_t len_with_meta = static_cast<uint32_t>(contents.size());
uint32_t len = len_with_meta - kMetadataLen;
@ -1362,10 +1363,50 @@ const FilterPolicy* NewBloomFilterPolicy(double bits_per_key,
return new BloomFilterPolicy(bits_per_key, m);
}
// Chooses between two filter policies based on LSM level
class LevelThresholdFilterPolicy : public BuiltinFilterPolicy {
public:
LevelThresholdFilterPolicy(std::unique_ptr<const FilterPolicy>&& a,
std::unique_ptr<const FilterPolicy>&& b,
int starting_level_for_b)
: policy_a_(std::move(a)),
policy_b_(std::move(b)),
starting_level_for_b_(starting_level_for_b) {
assert(starting_level_for_b_ >= 0);
}
// Deprecated block-based filter only
void CreateFilter(const Slice* keys, int n, std::string* dst) const override {
policy_a_->CreateFilter(keys, n, dst);
}
FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext& context) const override {
if (context.level_at_creation >= starting_level_for_b_) {
return policy_b_->GetBuilderWithContext(context);
} else {
return policy_a_->GetBuilderWithContext(context);
}
}
private:
const std::unique_ptr<const FilterPolicy> policy_a_;
const std::unique_ptr<const FilterPolicy> policy_b_;
int starting_level_for_b_;
};
extern const FilterPolicy* NewExperimentalRibbonFilterPolicy(
double bloom_equivalent_bits_per_key) {
return new BloomFilterPolicy(bloom_equivalent_bits_per_key,
BloomFilterPolicy::kStandard128Ribbon);
double bloom_equivalent_bits_per_key, int ribbon_starting_level) {
std::unique_ptr<const FilterPolicy> ribbon_only{new BloomFilterPolicy(
bloom_equivalent_bits_per_key, BloomFilterPolicy::kStandard128Ribbon)};
if (ribbon_starting_level > 0) {
std::unique_ptr<const FilterPolicy> bloom_only{new BloomFilterPolicy(
bloom_equivalent_bits_per_key, BloomFilterPolicy::kFastLocalBloom)};
return new LevelThresholdFilterPolicy(
std::move(bloom_only), std::move(ribbon_only), ribbon_starting_level);
} else {
return ribbon_only.release();
}
}
FilterBuildingContext::FilterBuildingContext(
@ -1396,10 +1437,18 @@ Status FilterPolicy::CreateFromString(
NewBloomFilterPolicy(bits_per_key, use_block_based_builder));
}
} else if (value.compare(0, kExpRibbonName.size(), kExpRibbonName) == 0) {
size_t pos = value.find(':', kExpRibbonName.size());
int ribbon_starting_level;
if (pos == std::string::npos) {
pos = value.size();
ribbon_starting_level = 1;
} else {
ribbon_starting_level = ParseInt(trim(value.substr(pos + 1)));
}
double bloom_equivalent_bits_per_key =
ParseDouble(trim(value.substr(kExpRibbonName.size())));
policy->reset(
NewExperimentalRibbonFilterPolicy(bloom_equivalent_bits_per_key));
ParseDouble(trim(value.substr(kExpRibbonName.size(), pos)));
policy->reset(NewExperimentalRibbonFilterPolicy(
bloom_equivalent_bits_per_key, ribbon_starting_level));
} else {
return Status::NotFound("Invalid filter policy name ", value);
#else

@ -38,10 +38,38 @@ class BuiltinFilterBitsBuilder : public FilterBitsBuilder {
virtual double EstimatedFpRate(size_t num_entries, size_t bytes) = 0;
};
// Abstract base class for RocksDB built-in filter policies.
// This class is considered internal API and subject to change.
class BuiltinFilterPolicy : public FilterPolicy {
public:
// Shared name because any built-in policy can read filters from
// any other
const char* Name() const override;
// Deprecated block-based filter only
bool KeyMayMatch(const Slice& key, const Slice& bloom_filter) const override;
// Old API
FilterBitsBuilder* GetFilterBitsBuilder() const override;
// Read metadata to determine what kind of FilterBitsReader is needed
// and return a new one. This must successfully process any filter data
// generated by a built-in FilterBitsBuilder, regardless of the impl
// chosen for this BloomFilterPolicy. Not compatible with CreateFilter.
FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override;
private:
// For newer Bloom filter implementation(s)
FilterBitsReader* GetBloomBitsReader(const Slice& contents) const;
// For Ribbon filter implementation(s)
FilterBitsReader* GetRibbonBitsReader(const Slice& contents) const;
};
// RocksDB built-in filter policy for Bloom or Bloom-like filters.
// This class is considered internal API and subject to change.
// See NewBloomFilterPolicy.
class BloomFilterPolicy : public FilterPolicy {
class BloomFilterPolicy : public BuiltinFilterPolicy {
public:
// An internal marker for operating modes of BloomFilterPolicy, in terms
// of selecting an implementation. This makes it easier for tests to track
@ -88,16 +116,9 @@ class BloomFilterPolicy : public FilterPolicy {
~BloomFilterPolicy() override;
const char* Name() const override;
// Deprecated block-based filter only
void CreateFilter(const Slice* keys, int n, std::string* dst) const override;
// Deprecated block-based filter only
bool KeyMayMatch(const Slice& key, const Slice& bloom_filter) const override;
FilterBitsBuilder* GetFilterBitsBuilder() const override;
// To use this function, call GetBuilderFromContext().
//
// Neither the context nor any objects therein should be saved beyond
@ -110,12 +131,6 @@ class BloomFilterPolicy : public FilterPolicy {
// (An internal convenience function to save boilerplate.)
static FilterBitsBuilder* GetBuilderFromContext(const FilterBuildingContext&);
// Read metadata to determine what kind of FilterBitsReader is needed
// and return a new one. This must successfully process any filter data
// generated by a built-in FilterBitsBuilder, regardless of the impl
// chosen for this BloomFilterPolicy. Not compatible with CreateFilter.
FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override;
// Essentially for testing only: configured millibits/key
int GetMillibitsPerKey() const { return millibits_per_key_; }
// Essentially for testing only: legacy whole bits/key
@ -157,12 +172,6 @@ class BloomFilterPolicy : public FilterPolicy {
// Sum over all generated filters f:
// (predicted_fp_rate(f) - predicted_fp_rate(f|o_f_f_m=false)) * 2^32
mutable std::atomic<int64_t> aggregate_rounding_balance_;
// For newer Bloom filter implementation(s)
FilterBitsReader* GetBloomBitsReader(const Slice& contents) const;
// For Ribbon filter implementation(s)
FilterBitsReader* GetRibbonBitsReader(const Slice& contents) const;
};
} // namespace ROCKSDB_NAMESPACE

@ -102,7 +102,7 @@ default_params = {
"mock_direct_io": False,
"use_full_merge_v1": lambda: random.randint(0, 1),
"use_merge": lambda: random.randint(0, 1),
"use_ribbon_filter": lambda: random.randint(0, 1),
"ribbon_starting_level": lambda: random.randint(0, 10),
"verify_checksum": 1,
"write_buffer_size": 4 * 1024 * 1024,
"writepercent": 35,

@ -1195,6 +1195,51 @@ INSTANTIATE_TEST_CASE_P(Full, FullBloomTest,
BloomFilterPolicy::kFastLocalBloom,
BloomFilterPolicy::kStandard128Ribbon));
static double GetEffectiveBitsPerKey(FilterBitsBuilder* builder) {
union {
uint64_t key_value;
char key_bytes[8];
};
const unsigned kNumKeys = 1000;
Slice key_slice{key_bytes, 8};
for (key_value = 0; key_value < kNumKeys; ++key_value) {
builder->AddKey(key_slice);
}
std::unique_ptr<const char[]> buf;
auto filter = builder->Finish(&buf);
return filter.size() * /*bits per byte*/ 8 / (1.0 * kNumKeys);
}
TEST(RibbonTest, RibbonTestLevelThreshold) {
BlockBasedTableOptions opts;
FilterBuildingContext ctx(opts);
// A few settings
for (int ribbon_starting_level : {0, 1, 10}) {
std::unique_ptr<const FilterPolicy> policy{
NewExperimentalRibbonFilterPolicy(8, ribbon_starting_level)};
// Claim to be generating filter for this level
ctx.level_at_creation = ribbon_starting_level;
std::unique_ptr<FilterBitsBuilder> builder{
policy->GetBuilderWithContext(ctx)};
// Must be Ribbon (more space efficient than 8 bits per key)
ASSERT_LT(GetEffectiveBitsPerKey(builder.get()), 7.5);
if (ribbon_starting_level > 0) {
// Claim to be generating filter for this level
ctx.level_at_creation = ribbon_starting_level - 1;
builder.reset(policy->GetBuilderWithContext(ctx));
// Must be Bloom (~ 8 bits per key)
ASSERT_GT(GetEffectiveBitsPerKey(builder.get()), 7.5);
}
}
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

Loading…
Cancel
Save