Allow fractional bits/key in BloomFilterPolicy (#6092)

Summary:
There's no technological impediment to allowing the Bloom
filter bits/key to be non-integer (fractional/decimal) values, and it
provides finer control over the memory vs. accuracy trade-off. This is
especially handy in using the format_version=5 Bloom filter in place
of the old one, because bits_per_key=9.55 provides the same accuracy as
the old bits_per_key=10.

This change not only requires refining the logic for choosing the best
num_probes for a given bits/key setting, it revealed a flaw in that logic.
As bits/key gets higher, the best num_probes for a cache-local Bloom
filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a
standard Bloom filter. For example, at 16 bits per key, the best
num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%).
This change fixes and refines that logic (for the format_version=5
Bloom filter only, just in case) based on empirical tests to find
accuracy inflection points between each num_probes.

Although bits_per_key is now specified as a double, the new Bloom
filter converts/rounds this to "millibits / key" for predictable/precise
internal computations. Just in case of unforeseen compatibility
issues, we round to the nearest whole number bits / key for the
legacy Bloom filter, so as not to unlock new behaviors for it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092

Test Plan: unit tests included

Differential Revision: D18711313

Pulled By: pdillinger

fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a
main
Peter Dillinger 5 years ago committed by Facebook Github Bot
parent 72daa92d3a
commit 57f3032285
  1. 3
      HISTORY.md
  2. 10
      include/rocksdb/filter_policy.h
  3. 4
      java/rocksjni/filter.cc
  4. 10
      java/src/main/java/org/rocksdb/BloomFilter.java
  5. 23
      options/options_test.cc
  6. 4
      table/block_based/block_based_table_factory.cc
  7. 66
      table/block_based/filter_policy.cc
  8. 19
      table/block_based/filter_policy_internal.h
  9. 52
      util/bloom_impl.h
  10. 145
      util/bloom_test.cc
  11. 2
      util/filter_bench.cc

@ -6,6 +6,7 @@
* With FIFO compaction style, options.periodic_compaction_seconds will have the same meaning as options.ttl. Whichever stricter will be used. With the default options.periodic_compaction_seconds value with options.ttl's default of 0, RocksDB will give a default of 30 days. * With FIFO compaction style, options.periodic_compaction_seconds will have the same meaning as options.ttl. Whichever stricter will be used. With the default options.periodic_compaction_seconds value with options.ttl's default of 0, RocksDB will give a default of 30 days.
* Added an API GetCreationTimeOfOldestFile(uint64_t* creation_time) to get the file_creation_time of the oldest SST file in the DB. * Added an API GetCreationTimeOfOldestFile(uint64_t* creation_time) to get the file_creation_time of the oldest SST file in the DB.
* An unlikely usage of FilterPolicy is no longer supported. Calling GetFilterBitsBuilder() on the FilterPolicy returned by NewBloomFilterPolicy will now cause an assertion violation in debug builds, because RocksDB has internally migrated to a more elaborate interface that is expected to evolve further. Custom implementations of FilterPolicy should work as before, except those wrapping the return of NewBloomFilterPolicy, which will require a new override of a protected function in FilterPolicy. * An unlikely usage of FilterPolicy is no longer supported. Calling GetFilterBitsBuilder() on the FilterPolicy returned by NewBloomFilterPolicy will now cause an assertion violation in debug builds, because RocksDB has internally migrated to a more elaborate interface that is expected to evolve further. Custom implementations of FilterPolicy should work as before, except those wrapping the return of NewBloomFilterPolicy, which will require a new override of a protected function in FilterPolicy.
* NewBloomFilterPolicy now takes bits_per_key as a double instead of an int. This permits finer control over the memory vs. accuracy trade-off in the new Bloom filter implementation and should not change source code compatibility.
* The option BackupableDBOptions::max_valid_backups_to_open is now only used when opening BackupEngineReadOnly. When opening a read/write BackupEngine, anything but the default value logs a warning and is treated as the default. This change ensures that backup deletion has proper accounting of shared files to ensure they are deleted when no longer referenced by a backup. * The option BackupableDBOptions::max_valid_backups_to_open is now only used when opening BackupEngineReadOnly. When opening a read/write BackupEngine, anything but the default value logs a warning and is treated as the default. This change ensures that backup deletion has proper accounting of shared files to ensure they are deleted when no longer referenced by a backup.
### Default Option Changes ### Default Option Changes
@ -16,7 +17,7 @@
* Universal compaction to support options.periodic_compaction_seconds. A full compaction will be triggered if any file is over the threshold. * Universal compaction to support options.periodic_compaction_seconds. A full compaction will be triggered if any file is over the threshold.
* `GetLiveFilesMetaData` and `GetColumnFamilyMetaData` now expose the file number of SST files as well as the oldest blob file referenced by each SST. * `GetLiveFilesMetaData` and `GetColumnFamilyMetaData` now expose the file number of SST files as well as the oldest blob file referenced by each SST.
* A batched MultiGet API (DB::MultiGet()) that supports retrieving keys from multiple column families. * A batched MultiGet API (DB::MultiGet()) that supports retrieving keys from multiple column families.
* Full and partitioned filters in the block-based table use an improved Bloom filter implementation, enabled with format_version 5 (or above) because previous releases cannot read this filter. This replacement is faster and more accurate, especially for high bits per key or millions of keys in a single (full) filter. For example, the new Bloom filter has a lower false positive rate at 16 bits per key than the old one at 100 bits per key. * Full and partitioned filters in the block-based table use an improved Bloom filter implementation, enabled with format_version 5 (or above) because previous releases cannot read this filter. This replacement is faster and more accurate, especially for high bits per key or millions of keys in a single (full) filter. For example, the new Bloom filter has the same false postive rate at 9.55 bits per key as the old one at 10 bits per key, and a lower false positive rate at 16 bits per key than the old one at 100 bits per key.
* Added AVX2 instructions to USE_SSE builds to accelerate the new Bloom filter and XXH3-based hash function on compatible x86_64 platforms (Haswell and later, ~2014). * Added AVX2 instructions to USE_SSE builds to accelerate the new Bloom filter and XXH3-based hash function on compatible x86_64 platforms (Haswell and later, ~2014).
* Support options.ttl with options.max_open_files = -1. File's oldest ancester time will be written to manifest. If it is availalbe, this information will be used instead of creation_time in table properties. * Support options.ttl with options.max_open_files = -1. File's oldest ancester time will be written to manifest. If it is availalbe, this information will be used instead of creation_time in table properties.
* Setting options.ttl for universal compaction now has the same meaning as setting periodic_compaction_seconds. * Setting options.ttl for universal compaction now has the same meaning as setting periodic_compaction_seconds.

@ -151,8 +151,12 @@ class FilterPolicy {
// Return a new filter policy that uses a bloom filter with approximately // Return a new filter policy that uses a bloom filter with approximately
// the specified number of bits per key. // the specified number of bits per key.
// //
// bits_per_key: bits per key in bloom filter. A good value for bits_per_key // bits_per_key: average bits allocated per key in bloom filter. A good
// is 10, which yields a filter with ~ 1% false positive rate. // choice is 9.9, which yields a filter with ~ 1% false positive rate.
// When format_version < 5, the value will be rounded to the nearest
// integer. Recommend using no more than three decimal digits after the
// decimal point, as in 6.667.
//
// use_block_based_builder: use deprecated block based filter (true) rather // use_block_based_builder: use deprecated block based filter (true) rather
// than full or partitioned filter (false). // than full or partitioned filter (false).
// //
@ -167,5 +171,5 @@ class FilterPolicy {
// FilterPolicy (like NewBloomFilterPolicy) that does not ignore // FilterPolicy (like NewBloomFilterPolicy) that does not ignore
// trailing spaces in keys. // trailing spaces in keys.
extern const FilterPolicy* NewBloomFilterPolicy( extern const FilterPolicy* NewBloomFilterPolicy(
int bits_per_key, bool use_block_based_builder = false); double bits_per_key, bool use_block_based_builder = false);
} // namespace rocksdb } // namespace rocksdb

@ -19,10 +19,10 @@
/* /*
* Class: org_rocksdb_BloomFilter * Class: org_rocksdb_BloomFilter
* Method: createBloomFilter * Method: createBloomFilter
* Signature: (IZ)J * Signature: (DZ)J
*/ */
jlong Java_org_rocksdb_BloomFilter_createNewBloomFilter( jlong Java_org_rocksdb_BloomFilter_createNewBloomFilter(
JNIEnv* /*env*/, jclass /*jcls*/, jint bits_per_key, JNIEnv* /*env*/, jclass /*jcls*/, jdouble bits_per_key,
jboolean use_block_base_builder) { jboolean use_block_base_builder) {
auto* sptr_filter = new std::shared_ptr<const rocksdb::FilterPolicy>( auto* sptr_filter = new std::shared_ptr<const rocksdb::FilterPolicy>(
rocksdb::NewBloomFilterPolicy(bits_per_key, use_block_base_builder)); rocksdb::NewBloomFilterPolicy(bits_per_key, use_block_base_builder));

@ -20,7 +20,7 @@ package org.rocksdb;
*/ */
public class BloomFilter extends Filter { public class BloomFilter extends Filter {
private static final int DEFAULT_BITS_PER_KEY = 10; private static final double DEFAULT_BITS_PER_KEY = 10.0;
private static final boolean DEFAULT_MODE = true; private static final boolean DEFAULT_MODE = true;
/** /**
@ -39,7 +39,7 @@ public class BloomFilter extends Filter {
* *
* <p> * <p>
* bits_per_key: bits per key in bloom filter. A good value for bits_per_key * bits_per_key: bits per key in bloom filter. A good value for bits_per_key
* is 10, which yields a filter with ~ 1% false positive rate. * is 9.9, which yields a filter with ~ 1% false positive rate.
* </p> * </p>
* <p> * <p>
* Callers must delete the result after any database that is using the * Callers must delete the result after any database that is using the
@ -47,7 +47,7 @@ public class BloomFilter extends Filter {
* *
* @param bitsPerKey number of bits to use * @param bitsPerKey number of bits to use
*/ */
public BloomFilter(final int bitsPerKey) { public BloomFilter(final double bitsPerKey) {
this(bitsPerKey, DEFAULT_MODE); this(bitsPerKey, DEFAULT_MODE);
} }
@ -70,10 +70,10 @@ public class BloomFilter extends Filter {
* @param bitsPerKey number of bits to use * @param bitsPerKey number of bits to use
* @param useBlockBasedMode use block based mode or full filter mode * @param useBlockBasedMode use block based mode or full filter mode
*/ */
public BloomFilter(final int bitsPerKey, final boolean useBlockBasedMode) { public BloomFilter(final double bitsPerKey, final boolean useBlockBasedMode) {
super(createNewBloomFilter(bitsPerKey, useBlockBasedMode)); super(createNewBloomFilter(bitsPerKey, useBlockBasedMode));
} }
private native static long createNewBloomFilter(final int bitsKeyKey, private native static long createNewBloomFilter(final double bitsKeyKey,
final boolean useBlockBasedMode); final boolean useBlockBasedMode);
} }

@ -23,6 +23,7 @@
#include "rocksdb/memtablerep.h" #include "rocksdb/memtablerep.h"
#include "rocksdb/utilities/leveldb_options.h" #include "rocksdb/utilities/leveldb_options.h"
#include "rocksdb/utilities/object_registry.h" #include "rocksdb/utilities/object_registry.h"
#include "table/block_based/filter_policy_internal.h"
#include "test_util/testharness.h" #include "test_util/testharness.h"
#include "test_util/testutil.h" #include "test_util/testutil.h"
#include "util/random.h" #include "util/random.h"
@ -515,13 +516,15 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
BlockBasedTableOptions table_opt; BlockBasedTableOptions table_opt;
BlockBasedTableOptions new_opt; BlockBasedTableOptions new_opt;
// make sure default values are overwritten by something else // make sure default values are overwritten by something else
ASSERT_OK(GetBlockBasedTableOptionsFromString(table_opt, ASSERT_OK(GetBlockBasedTableOptionsFromString(
"cache_index_and_filter_blocks=1;index_type=kHashSearch;" table_opt,
"checksum=kxxHash;hash_index_allow_collision=1;no_block_cache=1;" "cache_index_and_filter_blocks=1;index_type=kHashSearch;"
"block_cache=1M;block_cache_compressed=1k;block_size=1024;" "checksum=kxxHash;hash_index_allow_collision=1;no_block_cache=1;"
"block_size_deviation=8;block_restart_interval=4;" "block_cache=1M;block_cache_compressed=1k;block_size=1024;"
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;", "block_size_deviation=8;block_restart_interval=4;"
&new_opt)); "format_version=5;whole_key_filtering=1;"
"filter_policy=bloomfilter:4.567:false;",
&new_opt));
ASSERT_TRUE(new_opt.cache_index_and_filter_blocks); ASSERT_TRUE(new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(new_opt.index_type, BlockBasedTableOptions::kHashSearch); ASSERT_EQ(new_opt.index_type, BlockBasedTableOptions::kHashSearch);
ASSERT_EQ(new_opt.checksum, ChecksumType::kxxHash); ASSERT_EQ(new_opt.checksum, ChecksumType::kxxHash);
@ -534,7 +537,13 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
ASSERT_EQ(new_opt.block_size, 1024UL); ASSERT_EQ(new_opt.block_size, 1024UL);
ASSERT_EQ(new_opt.block_size_deviation, 8); ASSERT_EQ(new_opt.block_size_deviation, 8);
ASSERT_EQ(new_opt.block_restart_interval, 4); ASSERT_EQ(new_opt.block_restart_interval, 4);
ASSERT_EQ(new_opt.format_version, 5U);
ASSERT_EQ(new_opt.whole_key_filtering, true);
ASSERT_TRUE(new_opt.filter_policy != nullptr); ASSERT_TRUE(new_opt.filter_policy != nullptr);
const BloomFilterPolicy& bfp =
dynamic_cast<const BloomFilterPolicy&>(*new_opt.filter_policy);
EXPECT_EQ(bfp.GetMillibitsPerKey(), 4567);
EXPECT_EQ(bfp.GetWholeBitsPerKey(), 5);
// unknown option // unknown option
ASSERT_NOK(GetBlockBasedTableOptionsFromString(table_opt, ASSERT_NOK(GetBlockBasedTableOptionsFromString(table_opt,

@ -510,8 +510,8 @@ std::string ParseBlockBasedTableOption(const std::string& name,
if (pos == std::string::npos) { if (pos == std::string::npos) {
return "Invalid filter policy config, missing bits_per_key"; return "Invalid filter policy config, missing bits_per_key";
} }
int bits_per_key = double bits_per_key =
ParseInt(trim(value.substr(kName.size(), pos - kName.size()))); ParseDouble(trim(value.substr(kName.size(), pos - kName.size())));
bool use_block_based_builder = bool use_block_based_builder =
ParseBoolean("use_block_based_builder", trim(value.substr(pos + 1))); ParseBoolean("use_block_based_builder", trim(value.substr(pos + 1)));
new_options->filter_policy.reset( new_options->filter_policy.reset(

@ -27,9 +27,10 @@ namespace {
// See description in FastLocalBloomImpl // See description in FastLocalBloomImpl
class FastLocalBloomBitsBuilder : public BuiltinFilterBitsBuilder { class FastLocalBloomBitsBuilder : public BuiltinFilterBitsBuilder {
public: public:
FastLocalBloomBitsBuilder(const int bits_per_key, const int num_probes) FastLocalBloomBitsBuilder(const int millibits_per_key)
: bits_per_key_(bits_per_key), num_probes_(num_probes) { : millibits_per_key_(millibits_per_key),
assert(bits_per_key_); num_probes_(FastLocalBloomImpl::ChooseNumProbes(millibits_per_key_)) {
assert(millibits_per_key >= 1000);
} }
// No Copy allowed // No Copy allowed
@ -77,14 +78,15 @@ class FastLocalBloomBitsBuilder : public BuiltinFilterBitsBuilder {
int CalculateNumEntry(const uint32_t bytes) override { int CalculateNumEntry(const uint32_t bytes) override {
uint32_t bytes_no_meta = bytes >= 5u ? bytes - 5u : 0; uint32_t bytes_no_meta = bytes >= 5u ? bytes - 5u : 0;
return static_cast<int>(uint64_t{8} * bytes_no_meta / bits_per_key_); return static_cast<int>(uint64_t{8000} * bytes_no_meta /
millibits_per_key_);
} }
uint32_t CalculateSpace(const int num_entry) override { uint32_t CalculateSpace(const int num_entry) override {
uint32_t num_cache_lines = 0; uint32_t num_cache_lines = 0;
if (bits_per_key_ > 0 && num_entry > 0) { if (millibits_per_key_ > 0 && num_entry > 0) {
num_cache_lines = static_cast<uint32_t>( num_cache_lines = static_cast<uint32_t>(
(int64_t{num_entry} * bits_per_key_ + 511) / 512); (int64_t{num_entry} * millibits_per_key_ + 511999) / 512000);
} }
return num_cache_lines * 64 + /*metadata*/ 5; return num_cache_lines * 64 + /*metadata*/ 5;
} }
@ -136,7 +138,7 @@ class FastLocalBloomBitsBuilder : public BuiltinFilterBitsBuilder {
} }
} }
int bits_per_key_; int millibits_per_key_;
int num_probes_; int num_probes_;
std::vector<uint64_t> hash_entries_; std::vector<uint64_t> hash_entries_;
}; };
@ -187,7 +189,7 @@ using LegacyBloomImpl = LegacyLocalityBloomImpl</*ExtraRotates*/ false>;
class LegacyBloomBitsBuilder : public BuiltinFilterBitsBuilder { class LegacyBloomBitsBuilder : public BuiltinFilterBitsBuilder {
public: public:
explicit LegacyBloomBitsBuilder(const int bits_per_key, const int num_probes); explicit LegacyBloomBitsBuilder(const int bits_per_key);
// No Copy allowed // No Copy allowed
LegacyBloomBitsBuilder(const LegacyBloomBitsBuilder&) = delete; LegacyBloomBitsBuilder(const LegacyBloomBitsBuilder&) = delete;
@ -208,7 +210,6 @@ class LegacyBloomBitsBuilder : public BuiltinFilterBitsBuilder {
} }
private: private:
friend class FullFilterBlockTest_DuplicateEntries_Test;
int bits_per_key_; int bits_per_key_;
int num_probes_; int num_probes_;
std::vector<uint32_t> hash_entries_; std::vector<uint32_t> hash_entries_;
@ -228,9 +229,9 @@ class LegacyBloomBitsBuilder : public BuiltinFilterBitsBuilder {
void AddHash(uint32_t h, char* data, uint32_t num_lines, uint32_t total_bits); void AddHash(uint32_t h, char* data, uint32_t num_lines, uint32_t total_bits);
}; };
LegacyBloomBitsBuilder::LegacyBloomBitsBuilder(const int bits_per_key, LegacyBloomBitsBuilder::LegacyBloomBitsBuilder(const int bits_per_key)
const int num_probes) : bits_per_key_(bits_per_key),
: bits_per_key_(bits_per_key), num_probes_(num_probes) { num_probes_(LegacyNoLocalityBloomImpl::ChooseNumProbes(bits_per_key_)) {
assert(bits_per_key_); assert(bits_per_key_);
} }
@ -412,12 +413,24 @@ const std::vector<BloomFilterPolicy::Mode> BloomFilterPolicy::kAllUserModes = {
kAuto, kAuto,
}; };
BloomFilterPolicy::BloomFilterPolicy(int bits_per_key, Mode mode) BloomFilterPolicy::BloomFilterPolicy(double bits_per_key, Mode mode)
: bits_per_key_(bits_per_key), mode_(mode) { : mode_(mode) {
// We intentionally round down to reduce probing cost a little bit // Sanitize bits_per_key
num_probes_ = static_cast<int>(bits_per_key_ * 0.69); // 0.69 =~ ln(2) if (bits_per_key < 1.0) {
if (num_probes_ < 1) num_probes_ = 1; bits_per_key = 1.0;
if (num_probes_ > 30) num_probes_ = 30; } else if (!(bits_per_key < 100.0)) { // including NaN
bits_per_key = 100.0;
}
// Includes a nudge toward rounding up, to ensure on all platforms
// that doubles specified with three decimal digits after the decimal
// point are interpreted accurately.
millibits_per_key_ = static_cast<int>(bits_per_key * 1000.0 + 0.500001);
// For better or worse, this is a rounding up of a nudged rounding up,
// e.g. 7.4999999999999 will round up to 8, but that provides more
// predictability against small arithmetic errors in floating point.
whole_bits_per_key_ = (millibits_per_key_ + 500) / 1000;
} }
BloomFilterPolicy::~BloomFilterPolicy() {} BloomFilterPolicy::~BloomFilterPolicy() {}
@ -433,7 +446,7 @@ void BloomFilterPolicy::CreateFilter(const Slice* keys, int n,
assert(mode_ == kDeprecatedBlock); assert(mode_ == kDeprecatedBlock);
// Compute bloom filter size (in both bits and bytes) // Compute bloom filter size (in both bits and bytes)
uint32_t bits = static_cast<uint32_t>(n * bits_per_key_); uint32_t bits = static_cast<uint32_t>(n * whole_bits_per_key_);
// For small n, we can see a very high false positive rate. Fix it // For small n, we can see a very high false positive rate. Fix it
// by enforcing a minimum bloom filter length. // by enforcing a minimum bloom filter length.
@ -442,12 +455,15 @@ void BloomFilterPolicy::CreateFilter(const Slice* keys, int n,
uint32_t bytes = (bits + 7) / 8; uint32_t bytes = (bits + 7) / 8;
bits = bytes * 8; bits = bytes * 8;
int num_probes =
LegacyNoLocalityBloomImpl::ChooseNumProbes(whole_bits_per_key_);
const size_t init_size = dst->size(); const size_t init_size = dst->size();
dst->resize(init_size + bytes, 0); dst->resize(init_size + bytes, 0);
dst->push_back(static_cast<char>(num_probes_)); // Remember # of probes dst->push_back(static_cast<char>(num_probes)); // Remember # of probes
char* array = &(*dst)[init_size]; char* array = &(*dst)[init_size];
for (int i = 0; i < n; i++) { for (int i = 0; i < n; i++) {
LegacyNoLocalityBloomImpl::AddHash(BloomHash(keys[i]), bits, num_probes_, LegacyNoLocalityBloomImpl::AddHash(BloomHash(keys[i]), bits, num_probes,
array); array);
} }
} }
@ -470,7 +486,7 @@ bool BloomFilterPolicy::KeyMayMatch(const Slice& key,
// Consider it a match. // Consider it a match.
return true; return true;
} }
// NB: using k not num_probes_ // NB: using stored k not num_probes for whole_bits_per_key_
return LegacyNoLocalityBloomImpl::HashMayMatch(BloomHash(key), bits, k, return LegacyNoLocalityBloomImpl::HashMayMatch(BloomHash(key), bits, k,
array); array);
} }
@ -504,9 +520,9 @@ FilterBitsBuilder* BloomFilterPolicy::GetFilterBitsBuilderInternal(
case kDeprecatedBlock: case kDeprecatedBlock:
return nullptr; return nullptr;
case kFastLocalBloom: case kFastLocalBloom:
return new FastLocalBloomBitsBuilder(bits_per_key_, num_probes_); return new FastLocalBloomBitsBuilder(millibits_per_key_);
case kLegacyBloom: case kLegacyBloom:
return new LegacyBloomBitsBuilder(bits_per_key_, num_probes_); return new LegacyBloomBitsBuilder(whole_bits_per_key_);
} }
} }
assert(false); assert(false);
@ -649,7 +665,7 @@ FilterBitsReader* BloomFilterPolicy::GetBloomBitsReader(
return new AlwaysTrueFilter(); return new AlwaysTrueFilter();
} }
const FilterPolicy* NewBloomFilterPolicy(int bits_per_key, const FilterPolicy* NewBloomFilterPolicy(double bits_per_key,
bool use_block_based_builder) { bool use_block_based_builder) {
BloomFilterPolicy::Mode m; BloomFilterPolicy::Mode m;
if (use_block_based_builder) { if (use_block_based_builder) {

@ -91,7 +91,7 @@ class BloomFilterPolicy : public FilterPolicy {
// tests should prefer using NewBloomFilterPolicy (user-exposed). // tests should prefer using NewBloomFilterPolicy (user-exposed).
static const std::vector<Mode> kAllUserModes; static const std::vector<Mode> kAllUserModes;
explicit BloomFilterPolicy(int bits_per_key, Mode mode); explicit BloomFilterPolicy(double bits_per_key, Mode mode);
~BloomFilterPolicy() override; ~BloomFilterPolicy() override;
@ -111,6 +111,11 @@ class BloomFilterPolicy : public FilterPolicy {
// chosen for this BloomFilterPolicy. Not compatible with CreateFilter. // chosen for this BloomFilterPolicy. Not compatible with CreateFilter.
FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override; FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override;
// Essentially for testing only: configured millibits/key
int GetMillibitsPerKey() const { return millibits_per_key_; }
// Essentially for testing only: legacy whole bits/key
int GetWholeBitsPerKey() const { return whole_bits_per_key_; }
protected: protected:
// To use this function, call FilterBuildingContext::GetBuilder(). // To use this function, call FilterBuildingContext::GetBuilder().
// //
@ -120,8 +125,16 @@ class BloomFilterPolicy : public FilterPolicy {
const FilterBuildingContext&) const override; const FilterBuildingContext&) const override;
private: private:
int bits_per_key_; // Newer filters support fractional bits per key. For predictable behavior
int num_probes_; // of 0.001-precision values across floating point implementations, we
// round to thousandths of a bit (on average) per key.
int millibits_per_key_;
// Older filters round to whole number bits per key. (There *should* be no
// compatibility issue with fractional bits per key, but preserving old
// behavior with format_version < 5 just in case.)
int whole_bits_per_key_;
// Selected mode (a specific implementation or way of selecting an // Selected mode (a specific implementation or way of selecting an
// implementation) for building new SST filters. // implementation) for building new SST filters.
Mode mode_; Mode mode_;

@ -72,6 +72,50 @@ namespace rocksdb {
// //
class FastLocalBloomImpl { class FastLocalBloomImpl {
public: public:
static inline int ChooseNumProbes(int millibits_per_key) {
// Since this implementation can (with AVX2) make up to 8 probes
// for the same cost, we pick the most accurate num_probes, based
// on actual tests of the implementation. Note that for higher
// bits/key, the best choice for cache-local Bloom can be notably
// smaller than standard bloom, e.g. 9 instead of 11 @ 16 b/k.
if (millibits_per_key <= 2080) {
return 1;
} else if (millibits_per_key <= 3580) {
return 2;
} else if (millibits_per_key <= 5100) {
return 3;
} else if (millibits_per_key <= 6640) {
return 4;
} else if (millibits_per_key <= 8300) {
return 5;
} else if (millibits_per_key <= 10070) {
return 6;
} else if (millibits_per_key <= 11720) {
return 7;
} else if (millibits_per_key <= 14001) {
// Would be something like <= 13800 but sacrificing *slightly* for
// more settings using <= 8 probes.
return 8;
} else if (millibits_per_key <= 16050) {
return 9;
} else if (millibits_per_key <= 18300) {
return 10;
} else if (millibits_per_key <= 22001) {
return 11;
} else if (millibits_per_key <= 25501) {
return 12;
} else if (millibits_per_key > 50000) {
// Top out at 24 probes (three sets of 8)
return 24;
} else {
// Roughly optimal choices for remaining range
// e.g.
// 28000 -> 12, 28001 -> 13
// 50000 -> 23, 50001 -> 24
return (millibits_per_key - 1) / 2000 - 1;
}
}
static inline void AddHash(uint32_t h1, uint32_t h2, uint32_t len_bytes, static inline void AddHash(uint32_t h1, uint32_t h2, uint32_t len_bytes,
int num_probes, char *data) { int num_probes, char *data) {
uint32_t bytes_to_cache_line = fastrange32(len_bytes >> 6, h1) << 6; uint32_t bytes_to_cache_line = fastrange32(len_bytes >> 6, h1) << 6;
@ -228,6 +272,14 @@ class FastLocalBloomImpl {
// //
class LegacyNoLocalityBloomImpl { class LegacyNoLocalityBloomImpl {
public: public:
static inline int ChooseNumProbes(int bits_per_key) {
// We intentionally round down to reduce probing cost a little bit
int num_probes = static_cast<int>(bits_per_key * 0.69); // 0.69 =~ ln(2)
if (num_probes < 1) num_probes = 1;
if (num_probes > 30) num_probes = 30;
return num_probes;
}
static inline void AddHash(uint32_t h, uint32_t total_bits, int num_probes, static inline void AddHash(uint32_t h, uint32_t total_bits, int num_probes,
char *data) { char *data) {
const uint32_t delta = (h >> 17) | (h << 15); // Rotate right 17 bits const uint32_t delta = (h >> 17) | (h << 15); // Rotate right 17 bits

@ -16,6 +16,7 @@ int main() {
#else #else
#include <array> #include <array>
#include <cmath>
#include <vector> #include <vector>
#include "logging/logging.h" #include "logging/logging.h"
@ -69,7 +70,7 @@ class BlockBasedBloomTest : public testing::Test {
filter_.clear(); filter_.clear();
} }
void ResetPolicy(int bits_per_key) { void ResetPolicy(double bits_per_key) {
policy_.reset(new BloomFilterPolicy(bits_per_key, policy_.reset(new BloomFilterPolicy(bits_per_key,
BloomFilterPolicy::kDeprecatedBlock)); BloomFilterPolicy::kDeprecatedBlock));
Reset(); Reset();
@ -229,6 +230,22 @@ TEST_F(BlockBasedBloomTest, Schema) {
Build(); Build();
ASSERT_EQ(BloomHash(FilterData()), 3057004015U); ASSERT_EQ(BloomHash(FilterData()), 3057004015U);
// With new fractional bits_per_key, check that we are rounding to
// whole bits per key for old Bloom filters.
ResetPolicy(9.5); // Treated as 10
for (int key = 1; key < 88; key++) {
Add(Key(key, buffer));
}
Build();
ASSERT_EQ(BloomHash(FilterData()), /*SAME*/ 3057004015U);
ResetPolicy(10.499); // Treated as 10
for (int key = 1; key < 88; key++) {
Add(Key(key, buffer));
}
Build();
ASSERT_EQ(BloomHash(FilterData()), /*SAME*/ 3057004015U);
ResetPolicy(); ResetPolicy();
} }
@ -250,7 +267,12 @@ class FullBloomTest : public testing::TestWithParam<BloomFilterPolicy::Mode> {
BuiltinFilterBitsBuilder* GetBuiltinFilterBitsBuilder() { BuiltinFilterBitsBuilder* GetBuiltinFilterBitsBuilder() {
// Throws on bad cast // Throws on bad cast
return &dynamic_cast<BuiltinFilterBitsBuilder&>(*bits_builder_.get()); return &dynamic_cast<BuiltinFilterBitsBuilder&>(*bits_builder_);
}
const BloomFilterPolicy* GetBloomFilterPolicy() {
// Throws on bad cast
return &dynamic_cast<const BloomFilterPolicy&>(*policy_);
} }
void Reset() { void Reset() {
@ -260,7 +282,7 @@ class FullBloomTest : public testing::TestWithParam<BloomFilterPolicy::Mode> {
filter_size_ = 0; filter_size_ = 0;
} }
void ResetPolicy(int bits_per_key) { void ResetPolicy(double bits_per_key) {
policy_.reset(new BloomFilterPolicy(bits_per_key, GetParam())); policy_.reset(new BloomFilterPolicy(bits_per_key, GetParam()));
Reset(); Reset();
} }
@ -366,14 +388,57 @@ class FullBloomTest : public testing::TestWithParam<BloomFilterPolicy::Mode> {
}; };
TEST_P(FullBloomTest, FilterSize) { TEST_P(FullBloomTest, FilterSize) {
auto bits_builder = GetBuiltinFilterBitsBuilder(); // In addition to checking the consistency of space computation, we are
for (int n = 1; n < 100; n++) { // checking that denoted and computed doubles are interpreted as expected
auto space = bits_builder->CalculateSpace(n); // as bits_per_key values.
auto n2 = bits_builder->CalculateNumEntry(space); bool some_computed_less_than_denoted = false;
ASSERT_GE(n2, n); // Note: enforced minimum is 1 bit per key (1000 millibits), and enforced
auto space2 = bits_builder->CalculateSpace(n2); // maximum is 100 bits per key (100000 millibits).
ASSERT_EQ(space, space2); for (auto bpk :
std::vector<std::pair<double, int> >{{-HUGE_VAL, 1000},
{-INFINITY, 1000},
{0.0, 1000},
{1.234, 1234},
{3.456, 3456},
{9.5, 9500},
{10.0, 10000},
{10.499, 10499},
{21.345, 21345},
{99.999, 99999},
{1234.0, 100000},
{HUGE_VAL, 100000},
{INFINITY, 100000},
{NAN, 100000}}) {
ResetPolicy(bpk.first);
auto bfp = GetBloomFilterPolicy();
EXPECT_EQ(bpk.second, bfp->GetMillibitsPerKey());
EXPECT_EQ((bpk.second + 500) / 1000, bfp->GetWholeBitsPerKey());
double computed = bpk.first;
// This transforms e.g. 9.5 -> 9.499999999999998, which we still
// round to 10 for whole bits per key.
computed += 0.5;
computed /= 1234567.0;
computed *= 1234567.0;
computed -= 0.5;
some_computed_less_than_denoted |= (computed < bpk.first);
ResetPolicy(computed);
bfp = GetBloomFilterPolicy();
EXPECT_EQ(bpk.second, bfp->GetMillibitsPerKey());
EXPECT_EQ((bpk.second + 500) / 1000, bfp->GetWholeBitsPerKey());
auto bits_builder = GetBuiltinFilterBitsBuilder();
for (int n = 1; n < 100; n++) {
auto space = bits_builder->CalculateSpace(n);
auto n2 = bits_builder->CalculateNumEntry(space);
EXPECT_GE(n2, n);
auto space2 = bits_builder->CalculateSpace(n2);
EXPECT_EQ(space, space2);
}
} }
// Check that the compiler hasn't optimized our computation into nothing
EXPECT_TRUE(some_computed_less_than_denoted);
ResetPolicy();
} }
TEST_P(FullBloomTest, FullEmptyFilter) { TEST_P(FullBloomTest, FullEmptyFilter) {
@ -546,32 +611,43 @@ TEST_P(FullBloomTest, Schema) {
EXPECT_EQ("34,74,130,236,643,882,962,1015,1035,1110", FirstFPs(10)); EXPECT_EQ("34,74,130,236,643,882,962,1015,1035,1110", FirstFPs(10));
} }
ResetPolicy(14); // num_probes = 9 // This used to be 9 probes, but 8 is a better choice for speed,
// especially with SIMD groups of 8 probes, with essentially no
// change in FP rate.
// FP rate @ 9 probes, old Bloom: 0.4321%
// FP rate @ 9 probes, new Bloom: 0.1846%
// FP rate @ 8 probes, new Bloom: 0.1843%
ResetPolicy(14); // num_probes = 8 (new), 9 (old)
for (int key = 0; key < 2087; key++) { for (int key = 0; key < 2087; key++) {
Add(Key(key, buffer)); Add(Key(key, buffer));
} }
Build(); Build();
EXPECT_EQ(GetNumProbesFromFilterData(), 9); EXPECT_EQ(GetNumProbesFromFilterData(), SelectByImpl(9, 8));
EXPECT_EQ( EXPECT_EQ(
BloomHash(FilterData()), BloomHash(FilterData()),
SelectByImpl(SelectByCacheLineSize(178861123, 379087593, 2574136516U), SelectByImpl(SelectByCacheLineSize(178861123, 379087593, 2574136516U),
3129678118U)); 3709876890U));
if (GetParam() == BloomFilterPolicy::kFastLocalBloom) { if (GetParam() == BloomFilterPolicy::kFastLocalBloom) {
EXPECT_EQ("130,989,2002,3225,3543,4522,4863,5256,5277", FirstFPs(9)); EXPECT_EQ("130,240,522,565,989,2002,2526,3147,3543", FirstFPs(9));
} }
ResetPolicy(16); // num_probes = 11 // This used to be 11 probes, but 9 is a better choice for speed
// AND accuracy.
// FP rate @ 11 probes, old Bloom: 0.3571%
// FP rate @ 11 probes, new Bloom: 0.0884%
// FP rate @ 9 probes, new Bloom: 0.0843%
ResetPolicy(16); // num_probes = 9 (new), 11 (old)
for (int key = 0; key < 2087; key++) { for (int key = 0; key < 2087; key++) {
Add(Key(key, buffer)); Add(Key(key, buffer));
} }
Build(); Build();
EXPECT_EQ(GetNumProbesFromFilterData(), 11); EXPECT_EQ(GetNumProbesFromFilterData(), SelectByImpl(11, 9));
EXPECT_EQ( EXPECT_EQ(
BloomHash(FilterData()), BloomHash(FilterData()),
SelectByImpl(SelectByCacheLineSize(1129406313, 3049154394U, 1727750964), SelectByImpl(SelectByCacheLineSize(1129406313, 3049154394U, 1727750964),
1262483504)); 1087138490));
if (GetParam() == BloomFilterPolicy::kFastLocalBloom) { if (GetParam() == BloomFilterPolicy::kFastLocalBloom) {
EXPECT_EQ("240,945,2660,3299,4031,4282,5173,6197,8715", FirstFPs(9)); EXPECT_EQ("3299,3611,3916,6620,7822,8079,8482,8942,10167", FirstFPs(9));
} }
ResetPolicy(10); // num_probes = 6, but different memory ratio vs. 9 ResetPolicy(10); // num_probes = 6, but different memory ratio vs. 9
@ -616,6 +692,39 @@ TEST_P(FullBloomTest, Schema) {
EXPECT_EQ("16,126,133,422,466,472,813,1002,1035,1159", FirstFPs(10)); EXPECT_EQ("16,126,133,422,466,472,813,1002,1035,1159", FirstFPs(10));
} }
// With new fractional bits_per_key, check that we are rounding to
// whole bits per key for old Bloom filters but fractional for
// new Bloom filter.
ResetPolicy(9.5);
for (int key = 1; key < 2088; key++) {
Add(Key(key, buffer));
}
Build();
EXPECT_EQ(GetNumProbesFromFilterData(), 6);
EXPECT_EQ(BloomHash(FilterData()),
SelectByImpl(/*SAME*/ SelectByCacheLineSize(2885052954U, 769447944,
4175124908U),
/*CHANGED*/ 3166884174U));
if (GetParam() == BloomFilterPolicy::kFastLocalBloom) {
EXPECT_EQ(/*CHANGED*/ "126,156,367,444,458,791,813,976,1015,1035",
FirstFPs(10));
}
ResetPolicy(10.499);
for (int key = 1; key < 2088; key++) {
Add(Key(key, buffer));
}
Build();
EXPECT_EQ(GetNumProbesFromFilterData(), SelectByImpl(6, 7));
EXPECT_EQ(BloomHash(FilterData()),
SelectByImpl(/*SAME*/ SelectByCacheLineSize(2885052954U, 769447944,
4175124908U),
/*CHANGED*/ 4098502778U));
if (GetParam() == BloomFilterPolicy::kFastLocalBloom) {
EXPECT_EQ(/*CHANGED*/ "16,236,240,472,1015,1045,1111,1409,1465,1612",
FirstFPs(10));
}
ResetPolicy(); ResetPolicy();
} }

@ -52,7 +52,7 @@ DEFINE_uint32(vary_key_size_log2_interval, 5,
DEFINE_uint32(batch_size, 8, "Number of keys to group in each batch"); DEFINE_uint32(batch_size, 8, "Number of keys to group in each batch");
DEFINE_uint32(bits_per_key, 10, "Bits per key setting for filters"); DEFINE_double(bits_per_key, 10.0, "Bits per key setting for filters");
DEFINE_double(m_queries, 200, "Millions of queries for each test mode"); DEFINE_double(m_queries, 200, "Millions of queries for each test mode");

Loading…
Cancel
Save