Allow applying `CompactionFilter` outside of compaction (#8243)

Summary:
From HISTORY.md release note:

- Allow `CompactionFilter`s to apply in more table file creation scenarios such as flush and recovery. For compatibility, `CompactionFilter`s by default apply during compaction. Users can customize this behavior by overriding `CompactionFilterFactory::ShouldFilterTableFileCreation()`.
- Removed unused structure `CompactionFilterContext`

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8243

Test Plan: added unit tests

Reviewed By: pdillinger

Differential Revision: D28088089

Pulled By: ajkr

fbshipit-source-id: 0799be7908e3b39fea09fc3f1ab00e13ad817fae
main
Andrew Kryczka 4 years ago committed by Facebook GitHub Bot
parent 242ac6c17c
commit a639c02f8e
  1. 4
      HISTORY.md
  2. 32
      db/builder.cc
  3. 7
      db/compaction/compaction.cc
  4. 14
      db/compaction/compaction_iterator.cc
  5. 2
      db/compaction/compaction_iterator.h
  6. 136
      db/db_compaction_filter_test.cc
  7. 96
      include/rocksdb/compaction_filter.h
  8. 1
      include/rocksdb/listener.h
  9. 28
      include/rocksdb/options.h

@ -12,13 +12,15 @@
### New Features ### New Features
* Add new option allow_stall passed during instance creation of WriteBufferManager. When allow_stall is set, WriteBufferManager will stall all writers shared across multiple DBs and columns if memory usage goes beyond specified WriteBufferManager::buffer_size (soft limit). Stall will be cleared when memory is freed after flush and memory usage goes down below buffer_size. * Add new option allow_stall passed during instance creation of WriteBufferManager. When allow_stall is set, WriteBufferManager will stall all writers shared across multiple DBs and columns if memory usage goes beyond specified WriteBufferManager::buffer_size (soft limit). Stall will be cleared when memory is freed after flush and memory usage goes down below buffer_size.
* Allow `CompactionFilter`s to apply in more table file creation scenarios such as flush and recovery. For compatibility, `CompactionFilter`s by default apply during compaction. Users can customize this behavior by overriding `CompactionFilterFactory::ShouldFilterTableFileCreation()`.
* Added more fields to FilterBuildingContext with LSM details, for custom filter policies that vary behavior based on where they are in the LSM-tree. * Added more fields to FilterBuildingContext with LSM details, for custom filter policies that vary behavior based on where they are in the LSM-tree.
### Performace Improvements ### Performance Improvements
* BlockPrefetcher is used by iterators to prefetch data if they anticipate more data to be used in future. It is enabled implicitly by rocksdb. Added change to take in account read pattern if reads are sequential. This would disable prefetching for random reads in MultiGet and iterators as readahead_size is increased exponential doing large prefetches. * BlockPrefetcher is used by iterators to prefetch data if they anticipate more data to be used in future. It is enabled implicitly by rocksdb. Added change to take in account read pattern if reads are sequential. This would disable prefetching for random reads in MultiGet and iterators as readahead_size is increased exponential doing large prefetches.
### Public API change ### Public API change
* Removed a parameter from TableFactory::NewTableBuilder, which should not be called by user code because TableBuilder is not a public API. * Removed a parameter from TableFactory::NewTableBuilder, which should not be called by user code because TableBuilder is not a public API.
* Removed unused structure `CompactionFilterContext`.
* The `skip_filters` parameter to SstFileWriter is now considered deprecated. Use `BlockBasedTableOptions::filter_policy` to control generation of filters. * The `skip_filters` parameter to SstFileWriter is now considered deprecated. Use `BlockBasedTableOptions::filter_policy` to control generation of filters.
* ClockCache is known to have bugs that could lead to crash or corruption, so should not be used until fixed. Use NewLRUCache instead. * ClockCache is known to have bugs that could lead to crash or corruption, so should not be used until fixed. Use NewLRUCache instead.
* Deprecated backupable_db.h and BackupableDBOptions in favor of new versions with appropriate names: backup_engine.h and BackupEngineOptions. Old API compatibility is preserved. * Deprecated backupable_db.h and BackupableDBOptions in favor of new versions with appropriate names: backup_engine.h and BackupEngineOptions. Old API compatibility is preserved.

@ -109,6 +109,26 @@ Status BuildTable(
TableProperties tp; TableProperties tp;
if (iter->Valid() || !range_del_agg->IsEmpty()) { if (iter->Valid() || !range_del_agg->IsEmpty()) {
std::unique_ptr<CompactionFilter> compaction_filter;
if (ioptions.compaction_filter_factory != nullptr &&
ioptions.compaction_filter_factory->ShouldFilterTableFileCreation(
tboptions.reason)) {
CompactionFilter::Context context;
context.is_full_compaction = false;
context.is_manual_compaction = false;
context.column_family_id = tboptions.column_family_id;
context.reason = tboptions.reason;
compaction_filter =
ioptions.compaction_filter_factory->CreateCompactionFilter(context);
if (compaction_filter != nullptr &&
!compaction_filter->IgnoreSnapshots()) {
s.PermitUncheckedError();
return Status::NotSupported(
"CompactionFilter::IgnoreSnapshots() = false is not supported "
"anymore.");
}
}
TableBuilder* builder; TableBuilder* builder;
std::unique_ptr<WritableFileWriter> file_writer; std::unique_ptr<WritableFileWriter> file_writer;
{ {
@ -143,11 +163,11 @@ Status BuildTable(
builder = NewTableBuilder(tboptions, file_writer.get()); builder = NewTableBuilder(tboptions, file_writer.get());
} }
MergeHelper merge(env, tboptions.internal_comparator.user_comparator(), MergeHelper merge(
ioptions.merge_operator.get(), nullptr, ioptions.logger, env, tboptions.internal_comparator.user_comparator(),
ioptions.merge_operator.get(), compaction_filter.get(), ioptions.logger,
true /* internal key corruption is not ok */, true /* internal key corruption is not ok */,
snapshots.empty() ? 0 : snapshots.back(), snapshots.empty() ? 0 : snapshots.back(), snapshot_checker);
snapshot_checker);
std::unique_ptr<BlobFileBuilder> blob_file_builder( std::unique_ptr<BlobFileBuilder> blob_file_builder(
(mutable_cf_options.enable_blob_files && blob_file_additions) (mutable_cf_options.enable_blob_files && blob_file_additions)
@ -165,8 +185,8 @@ Status BuildTable(
snapshot_checker, env, ShouldReportDetailedTime(env, ioptions.stats), snapshot_checker, env, ShouldReportDetailedTime(env, ioptions.stats),
true /* internal key corruption is not ok */, range_del_agg.get(), true /* internal key corruption is not ok */, range_del_agg.get(),
blob_file_builder.get(), ioptions.allow_data_in_errors, blob_file_builder.get(), ioptions.allow_data_in_errors,
/*compaction=*/nullptr, /*compaction=*/nullptr, compaction_filter.get(),
/*compaction_filter=*/nullptr, /*shutting_down=*/nullptr, /*shutting_down=*/nullptr,
/*preserve_deletes_seqnum=*/0, /*manual_compaction_paused=*/nullptr, /*preserve_deletes_seqnum=*/0, /*manual_compaction_paused=*/nullptr,
db_options.info_log, full_history_ts_low); db_options.info_log, full_history_ts_low);

@ -526,10 +526,17 @@ std::unique_ptr<CompactionFilter> Compaction::CreateCompactionFilter() const {
return nullptr; return nullptr;
} }
if (!cfd_->ioptions()
->compaction_filter_factory->ShouldFilterTableFileCreation(
TableFileCreationReason::kCompaction)) {
return nullptr;
}
CompactionFilter::Context context; CompactionFilter::Context context;
context.is_full_compaction = is_full_compaction_; context.is_full_compaction = is_full_compaction_;
context.is_manual_compaction = is_manual_compaction_; context.is_manual_compaction = is_manual_compaction_;
context.column_family_id = cfd_->GetID(); context.column_family_id = cfd_->GetID();
context.reason = TableFileCreationReason::kCompaction;
return cfd_->ioptions()->compaction_filter_factory->CreateCompactionFilter( return cfd_->ioptions()->compaction_filter_factory->CreateCompactionFilter(
context); context);
} }

@ -100,8 +100,8 @@ CompactionIterator::CompactionIterator(
blob_garbage_collection_cutoff_file_number_( blob_garbage_collection_cutoff_file_number_(
ComputeBlobGarbageCollectionCutoffFileNumber(compaction_.get())), ComputeBlobGarbageCollectionCutoffFileNumber(compaction_.get())),
current_key_committed_(false), current_key_committed_(false),
cmp_with_history_ts_low_(0) { cmp_with_history_ts_low_(0),
assert(compaction_filter_ == nullptr || compaction_ != nullptr); level_(compaction_ == nullptr ? 0 : compaction_->level()) {
assert(snapshots_ != nullptr); assert(snapshots_ != nullptr);
bottommost_level_ = compaction_ == nullptr bottommost_level_ = compaction_ == nullptr
? false ? false
@ -232,7 +232,7 @@ bool CompactionIterator::InvokeFilterIfNeeded(bool* need_skip,
if (kTypeBlobIndex == ikey_.type) { if (kTypeBlobIndex == ikey_.type) {
blob_value_.Reset(); blob_value_.Reset();
filter = compaction_filter_->FilterBlobByKey( filter = compaction_filter_->FilterBlobByKey(
compaction_->level(), filter_key, &compaction_filter_value_, level_, filter_key, &compaction_filter_value_,
compaction_filter_skip_until_.rep()); compaction_filter_skip_until_.rep());
if (CompactionFilter::Decision::kUndetermined == filter && if (CompactionFilter::Decision::kUndetermined == filter &&
!compaction_filter_->IsStackedBlobDbInternalCompactionFilter()) { !compaction_filter_->IsStackedBlobDbInternalCompactionFilter()) {
@ -251,6 +251,12 @@ bool CompactionIterator::InvokeFilterIfNeeded(bool* need_skip,
valid_ = false; valid_ = false;
return false; return false;
} }
if (compaction_ == nullptr) {
status_ =
Status::Corruption("Unexpected blob index outside of compaction");
valid_ = false;
return false;
}
const Version* const version = compaction_->input_version(); const Version* const version = compaction_->input_version();
assert(version); assert(version);
@ -271,7 +277,7 @@ bool CompactionIterator::InvokeFilterIfNeeded(bool* need_skip,
} }
if (CompactionFilter::Decision::kUndetermined == filter) { if (CompactionFilter::Decision::kUndetermined == filter) {
filter = compaction_filter_->FilterV2( filter = compaction_filter_->FilterV2(
compaction_->level(), filter_key, value_type, level_, filter_key, value_type,
blob_value_.empty() ? value_ : blob_value_, &compaction_filter_value_, blob_value_.empty() ? value_ : blob_value_, &compaction_filter_value_,
compaction_filter_skip_until_.rep()); compaction_filter_skip_until_.rep());
} }

@ -340,6 +340,8 @@ class CompactionIterator {
// Saved result of ucmp->CompareTimestamp(current_ts_, *full_history_ts_low_) // Saved result of ucmp->CompareTimestamp(current_ts_, *full_history_ts_low_)
int cmp_with_history_ts_low_; int cmp_with_history_ts_low_;
const int level_;
bool IsShuttingDown() { bool IsShuttingDown() {
// This is a best-effort facility, so memory_order_relaxed is sufficient. // This is a best-effort facility, so memory_order_relaxed is sufficient.
return shutting_down_ && shutting_down_->load(std::memory_order_relaxed); return shutting_down_ && shutting_down_->load(std::memory_order_relaxed);

@ -82,6 +82,11 @@ class DeleteFilter : public CompactionFilter {
return true; return true;
} }
bool FilterMergeOperand(int /*level*/, const Slice& /*key*/,
const Slice& /*operand*/) const override {
return true;
}
const char* Name() const override { return "DeleteFilter"; } const char* Name() const override { return "DeleteFilter"; }
}; };
@ -190,18 +195,36 @@ class KeepFilterFactory : public CompactionFilterFactory {
bool compaction_filter_created_; bool compaction_filter_created_;
}; };
// This filter factory is configured with a `TableFileCreationReason`. Only
// table files created for that reason will undergo filtering. This
// configurability makes it useful to tests for filtering non-compaction table
// files, such as "CompactionFilterFlush" and "CompactionFilterRecovery".
class DeleteFilterFactory : public CompactionFilterFactory { class DeleteFilterFactory : public CompactionFilterFactory {
public: public:
explicit DeleteFilterFactory(TableFileCreationReason reason)
: reason_(reason) {}
std::unique_ptr<CompactionFilter> CreateCompactionFilter( std::unique_ptr<CompactionFilter> CreateCompactionFilter(
const CompactionFilter::Context& context) override { const CompactionFilter::Context& context) override {
if (context.is_manual_compaction) { EXPECT_EQ(reason_, context.reason);
return std::unique_ptr<CompactionFilter>(new DeleteFilter()); if (context.reason == TableFileCreationReason::kCompaction &&
} else { !context.is_manual_compaction) {
// Table files created by automatic compaction do not undergo filtering.
// Presumably some tests rely on this.
return std::unique_ptr<CompactionFilter>(nullptr); return std::unique_ptr<CompactionFilter>(nullptr);
} }
return std::unique_ptr<CompactionFilter>(new DeleteFilter());
}
bool ShouldFilterTableFileCreation(
TableFileCreationReason reason) const override {
return reason_ == reason;
} }
const char* Name() const override { return "DeleteFilterFactory"; } const char* Name() const override { return "DeleteFilterFactory"; }
private:
const TableFileCreationReason reason_;
}; };
// Delete Filter Factory which ignores snapshots // Delete Filter Factory which ignores snapshots
@ -349,7 +372,8 @@ TEST_F(DBTestCompactionFilter, CompactionFilter) {
// create a new database with the compaction // create a new database with the compaction
// filter in such a way that it deletes all keys // filter in such a way that it deletes all keys
options.compaction_filter_factory = std::make_shared<DeleteFilterFactory>(); options.compaction_filter_factory = std::make_shared<DeleteFilterFactory>(
TableFileCreationReason::kCompaction);
options.create_if_missing = true; options.create_if_missing = true;
DestroyAndReopen(options); DestroyAndReopen(options);
CreateAndReopenWithCF({"pikachu"}, options); CreateAndReopenWithCF({"pikachu"}, options);
@ -421,7 +445,8 @@ TEST_F(DBTestCompactionFilter, CompactionFilter) {
// entries in VersionEdit, but none of the 'AddFile's. // entries in VersionEdit, but none of the 'AddFile's.
TEST_F(DBTestCompactionFilter, CompactionFilterDeletesAll) { TEST_F(DBTestCompactionFilter, CompactionFilterDeletesAll) {
Options options = CurrentOptions(); Options options = CurrentOptions();
options.compaction_filter_factory = std::make_shared<DeleteFilterFactory>(); options.compaction_filter_factory = std::make_shared<DeleteFilterFactory>(
TableFileCreationReason::kCompaction);
options.disable_auto_compactions = true; options.disable_auto_compactions = true;
options.create_if_missing = true; options.create_if_missing = true;
DestroyAndReopen(options); DestroyAndReopen(options);
@ -450,6 +475,64 @@ TEST_F(DBTestCompactionFilter, CompactionFilterDeletesAll) {
} }
#endif // ROCKSDB_LITE #endif // ROCKSDB_LITE
TEST_F(DBTestCompactionFilter, CompactionFilterFlush) {
// Tests a `CompactionFilterFactory` that filters when table file is created
// by flush.
Options options = CurrentOptions();
options.compaction_filter_factory =
std::make_shared<DeleteFilterFactory>(TableFileCreationReason::kFlush);
options.merge_operator = MergeOperators::CreateStringAppendOperator();
Reopen(options);
// Puts and Merges are purged in flush.
ASSERT_OK(Put("a", "v"));
ASSERT_OK(Merge("b", "v"));
ASSERT_OK(Flush());
ASSERT_EQ("NOT_FOUND", Get("a"));
ASSERT_EQ("NOT_FOUND", Get("b"));
// However, Puts and Merges are preserved by recovery.
ASSERT_OK(Put("a", "v"));
ASSERT_OK(Merge("b", "v"));
Reopen(options);
ASSERT_EQ("v", Get("a"));
ASSERT_EQ("v", Get("b"));
// Likewise, compaction does not apply filtering.
ASSERT_OK(dbfull()->CompactRange(CompactRangeOptions(), nullptr, nullptr));
ASSERT_EQ("v", Get("a"));
ASSERT_EQ("v", Get("b"));
}
TEST_F(DBTestCompactionFilter, CompactionFilterRecovery) {
// Tests a `CompactionFilterFactory` that filters when table file is created
// by recovery.
Options options = CurrentOptions();
options.compaction_filter_factory =
std::make_shared<DeleteFilterFactory>(TableFileCreationReason::kRecovery);
options.merge_operator = MergeOperators::CreateStringAppendOperator();
Reopen(options);
// Puts and Merges are purged in recovery.
ASSERT_OK(Put("a", "v"));
ASSERT_OK(Merge("b", "v"));
Reopen(options);
ASSERT_EQ("NOT_FOUND", Get("a"));
ASSERT_EQ("NOT_FOUND", Get("b"));
// However, Puts and Merges are preserved by flush.
ASSERT_OK(Put("a", "v"));
ASSERT_OK(Merge("b", "v"));
ASSERT_OK(Flush());
ASSERT_EQ("v", Get("a"));
ASSERT_EQ("v", Get("b"));
// Likewise, compaction does not apply filtering.
ASSERT_OK(dbfull()->CompactRange(CompactRangeOptions(), nullptr, nullptr));
ASSERT_EQ("v", Get("a"));
ASSERT_EQ("v", Get("b"));
}
TEST_P(DBTestCompactionFilterWithCompactParam, TEST_P(DBTestCompactionFilterWithCompactParam,
CompactionFilterWithValueChange) { CompactionFilterWithValueChange) {
Options options = CurrentOptions(); Options options = CurrentOptions();
@ -842,6 +925,49 @@ TEST_F(DBTestCompactionFilter, IgnoreSnapshotsFalse) {
delete options.compaction_filter; delete options.compaction_filter;
} }
class TestNotSupportedFilterFactory : public CompactionFilterFactory {
public:
explicit TestNotSupportedFilterFactory(TableFileCreationReason reason)
: reason_(reason) {}
bool ShouldFilterTableFileCreation(
TableFileCreationReason reason) const override {
return reason_ == reason;
}
std::unique_ptr<CompactionFilter> CreateCompactionFilter(
const CompactionFilter::Context& /* context */) override {
return std::unique_ptr<CompactionFilter>(new TestNotSupportedFilter());
}
const char* Name() const override { return "TestNotSupportedFilterFactory"; }
private:
const TableFileCreationReason reason_;
};
TEST_F(DBTestCompactionFilter, IgnoreSnapshotsFalseDuringFlush) {
Options options = CurrentOptions();
options.compaction_filter_factory =
std::make_shared<TestNotSupportedFilterFactory>(
TableFileCreationReason::kFlush);
Reopen(options);
ASSERT_OK(Put("a", "v10"));
ASSERT_TRUE(Flush().IsNotSupported());
}
TEST_F(DBTestCompactionFilter, IgnoreSnapshotsFalseRecovery) {
Options options = CurrentOptions();
options.compaction_filter_factory =
std::make_shared<TestNotSupportedFilterFactory>(
TableFileCreationReason::kRecovery);
Reopen(options);
ASSERT_OK(Put("a", "v10"));
ASSERT_TRUE(TryReopen(options).IsNotSupported());
}
} // namespace ROCKSDB_NAMESPACE } // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) { int main(int argc, char** argv) {

@ -14,23 +14,15 @@
#include <vector> #include <vector>
#include "rocksdb/rocksdb_namespace.h" #include "rocksdb/rocksdb_namespace.h"
#include "rocksdb/types.h"
namespace ROCKSDB_NAMESPACE { namespace ROCKSDB_NAMESPACE {
class Slice; class Slice;
class SliceTransform; class SliceTransform;
// Context information of a compaction run // CompactionFilter allows an application to modify/delete a key-value during
struct CompactionFilterContext { // table file creation.
// Does this compaction run include all data files
bool is_full_compaction;
// Is this compaction requested by the client (true),
// or is it occurring as an automatic compaction process
bool is_manual_compaction;
};
// CompactionFilter allows an application to modify/delete a key-value at
// the time of compaction.
class CompactionFilter { class CompactionFilter {
public: public:
@ -52,31 +44,33 @@ class CompactionFilter {
enum class BlobDecision { kKeep, kChangeValue, kCorruption, kIOError }; enum class BlobDecision { kKeep, kChangeValue, kCorruption, kIOError };
// Context information of a compaction run // Context information for a table file creation.
struct Context { struct Context {
// Does this compaction run include all data files // Whether this table file is created as part of a compaction including all
// table files.
bool is_full_compaction; bool is_full_compaction;
// Is this compaction requested by the client (true), // Whether this table file is created as part of a compaction requested by
// or is it occurring as an automatic compaction process // the client.
bool is_manual_compaction; bool is_manual_compaction;
// Which column family this compaction is for. // The column family that will contain the created table file.
uint32_t column_family_id; uint32_t column_family_id;
// Reason this table file is being created.
TableFileCreationReason reason;
}; };
virtual ~CompactionFilter() {} virtual ~CompactionFilter() {}
// The compaction process invokes this // The table file creation process invokes this method before adding a kv to
// method for kv that is being compacted. A return value // the table file. A return value of false indicates that the kv should be
// of false indicates that the kv should be preserved in the // preserved in the new table file and a return value of true indicates
// output of this compaction run and a return value of true // that this key-value should be removed from the new table file. The
// indicates that this key-value should be removed from the // application can inspect the existing value of the key and make decision
// output of the compaction. The application can inspect // based on it.
// the existing value of the key and make decision based on it.
// //
// Key-Values that are results of merge operation during compaction are not // Key-Values that are results of merge operation during table file creation
// passed into this function. Currently, when you have a mix of Put()s and // are not passed into this function. Currently, when you have a mix of Put()s
// Merge()s on a same key, we only guarantee to process the merge operands // and Merge()s on a same key, we only guarantee to process the merge operands
// through the compaction filters. Put()s might be processed, or might not. // through the `CompactionFilter`s. Put()s might be processed, or might not.
// //
// When the value is to be preserved, the application has the option // When the value is to be preserved, the application has the option
// to modify the existing_value and pass it back through new_value. // to modify the existing_value and pass it back through new_value.
@ -85,8 +79,9 @@ class CompactionFilter {
// Note that RocksDB snapshots (i.e. call GetSnapshot() API on a // Note that RocksDB snapshots (i.e. call GetSnapshot() API on a
// DB* object) will not guarantee to preserve the state of the DB with // DB* object) will not guarantee to preserve the state of the DB with
// CompactionFilter. Data seen from a snapshot might disappear after a // CompactionFilter. Data seen from a snapshot might disappear after a
// compaction finishes. If you use snapshots, think twice about whether you // table file created with a `CompactionFilter` is installed. If you use
// want to use compaction filter and whether you are using it in a safe way. // snapshots, think twice about whether you want to use `CompactionFilter` and
// whether you are using it in a safe way.
// //
// If multithreaded compaction is being used *and* a single CompactionFilter // If multithreaded compaction is being used *and* a single CompactionFilter
// instance was supplied via Options::compaction_filter, this method may be // instance was supplied via Options::compaction_filter, this method may be
@ -94,7 +89,7 @@ class CompactionFilter {
// that the call is thread-safe. // that the call is thread-safe.
// //
// If the CompactionFilter was created by a factory, then it will only ever // If the CompactionFilter was created by a factory, then it will only ever
// be used by a single thread that is doing the compaction run, and this // be used by a single thread that is doing the table file creation, and this
// call does not need to be thread-safe. However, multiple filters may be // call does not need to be thread-safe. However, multiple filters may be
// in existence and operating concurrently. // in existence and operating concurrently.
virtual bool Filter(int /*level*/, const Slice& /*key*/, virtual bool Filter(int /*level*/, const Slice& /*key*/,
@ -104,9 +99,9 @@ class CompactionFilter {
return false; return false;
} }
// The compaction process invokes this method on every merge operand. If this // The table file creation process invokes this method on every merge operand.
// method returns true, the merge operand will be ignored and not written out // If this method returns true, the merge operand will be ignored and not
// in the compaction output // written out in the new table file.
// //
// Note: If you are using a TransactionDB, it is not recommended to implement // Note: If you are using a TransactionDB, it is not recommended to implement
// FilterMergeOperand(). If a Merge operation is filtered out, TransactionDB // FilterMergeOperand(). If a Merge operation is filtered out, TransactionDB
@ -143,13 +138,14 @@ class CompactionFilter {
// snapshot - beware if you're using TransactionDB or // snapshot - beware if you're using TransactionDB or
// DB::GetSnapshot(). // DB::GetSnapshot().
// - If value for a key was overwritten or merged into (multiple Put()s // - If value for a key was overwritten or merged into (multiple Put()s
// or Merge()s), and compaction filter skips this key with // or Merge()s), and `CompactionFilter` skips this key with
// kRemoveAndSkipUntil, it's possible that it will remove only // kRemoveAndSkipUntil, it's possible that it will remove only
// the new value, exposing the old value that was supposed to be // the new value, exposing the old value that was supposed to be
// overwritten. // overwritten.
// - Doesn't work with PlainTableFactory in prefix mode. // - Doesn't work with PlainTableFactory in prefix mode.
// - If you use kRemoveAndSkipUntil, consider also reducing // - If you use kRemoveAndSkipUntil for table files created by
// compaction_readahead_size option. // compaction, consider also reducing compaction_readahead_size
// option.
// //
// Should never return kUndetermined. // Should never return kUndetermined.
// Note: If you are using a TransactionDB, it is not recommended to filter // Note: If you are using a TransactionDB, it is not recommended to filter
@ -189,13 +185,13 @@ class CompactionFilter {
} }
// This function is deprecated. Snapshots will always be ignored for // This function is deprecated. Snapshots will always be ignored for
// compaction filters, because we realized that not ignoring snapshots doesn't // `CompactionFilter`s, because we realized that not ignoring snapshots
// provide the guarantee we initially thought it would provide. Repeatable // doesn't provide the guarantee we initially thought it would provide.
// reads will not be guaranteed anyway. If you override the function and // Repeatable reads will not be guaranteed anyway. If you override the
// returns false, we will fail the compaction. // function and returns false, we will fail the table file creation.
virtual bool IgnoreSnapshots() const { return true; } virtual bool IgnoreSnapshots() const { return true; }
// Returns a name that identifies this compaction filter. // Returns a name that identifies this `CompactionFilter`.
// The name will be printed to LOG file on start up for diagnosis. // The name will be printed to LOG file on start up for diagnosis.
virtual const char* Name() const = 0; virtual const char* Name() const = 0;
@ -214,16 +210,28 @@ class CompactionFilter {
} }
}; };
// Each compaction will create a new CompactionFilter allowing the // Each thread of work involving creating table files will create a new
// application to know about different compactions // `CompactionFilter` according to `ShouldFilterTableFileCreation()`. This
// allows the application to know about the different ongoing threads of work
// and makes it unnecessary for `CompactionFilter` to provide thread-safety.
class CompactionFilterFactory { class CompactionFilterFactory {
public: public:
virtual ~CompactionFilterFactory() {} virtual ~CompactionFilterFactory() {}
// Returns whether a thread creating table files for the specified `reason`
// should invoke `CreateCompactionFilter()` and pass KVs through the returned
// filter.
virtual bool ShouldFilterTableFileCreation(
TableFileCreationReason reason) const {
// For backward compatibility, default implementation only applies
// `CompactionFilter` to files generated by compaction.
return reason == TableFileCreationReason::kCompaction;
}
virtual std::unique_ptr<CompactionFilter> CreateCompactionFilter( virtual std::unique_ptr<CompactionFilter> CreateCompactionFilter(
const CompactionFilter::Context& context) = 0; const CompactionFilter::Context& context) = 0;
// Returns a name that identifies this compaction filter factory. // Returns a name that identifies this `CompactionFilter` factory.
virtual const char* Name() const = 0; virtual const char* Name() const = 0;
}; };

@ -16,6 +16,7 @@
#include "rocksdb/compression_type.h" #include "rocksdb/compression_type.h"
#include "rocksdb/status.h" #include "rocksdb/status.h"
#include "rocksdb/table_properties.h" #include "rocksdb/table_properties.h"
#include "rocksdb/types.h"
namespace ROCKSDB_NAMESPACE { namespace ROCKSDB_NAMESPACE {

@ -128,9 +128,10 @@ struct ColumnFamilyOptions : public AdvancedColumnFamilyOptions {
// Allows an application to modify/delete a key-value during background // Allows an application to modify/delete a key-value during background
// compaction. // compaction.
// //
// If the client requires a new compaction filter to be used for different // If the client requires a new `CompactionFilter` to be used for different
// compaction runs, it can specify compaction_filter_factory instead of this // compaction runs and/or requires a `CompactionFilter` for table file
// option. The client should specify only one of the two. // creations outside of compaction, it can specify compaction_filter_factory
// instead of this option. The client should specify only one of the two.
// compaction_filter takes precedence over compaction_filter_factory if // compaction_filter takes precedence over compaction_filter_factory if
// client specifies both. // client specifies both.
// //
@ -141,12 +142,21 @@ struct ColumnFamilyOptions : public AdvancedColumnFamilyOptions {
// Default: nullptr // Default: nullptr
const CompactionFilter* compaction_filter = nullptr; const CompactionFilter* compaction_filter = nullptr;
// This is a factory that provides compaction filter objects which allow // This is a factory that provides `CompactionFilter` objects which allow
// an application to modify/delete a key-value during background compaction. // an application to modify/delete a key-value during table file creation.
// //
// A new filter will be created on each compaction run. If multithreaded // Unlike the `compaction_filter` option, which is used when compaction
// compaction is being used, each created CompactionFilter will only be used // creates a table file, this factory allows using a `CompactionFilter` when a
// from a single thread and so does not need to be thread-safe. // table file is created for various reasons. The factory can decide what
// `TableFileCreationReason`s use a `CompactionFilter`. For compatibility, by
// default the decision is to use a `CompactionFilter` for
// `TableFileCreationReason::kCompaction` only.
//
// Each thread of work involving creating table files will create a new
// `CompactionFilter` when it will be used according to the above
// `TableFileCreationReason`-based decision. This allows the application to
// know about the different ongoing threads of work and makes it unnecessary
// for `CompactionFilter` to provide thread-safety.
// //
// Default: nullptr // Default: nullptr
std::shared_ptr<CompactionFilterFactory> compaction_filter_factory = nullptr; std::shared_ptr<CompactionFilterFactory> compaction_filter_factory = nullptr;

Loading…
Cancel
Save