Recommit "Avoid adding tombstones of the same file to RangeDelAggregator multiple times"

Summary:
The origin commit #3635  will hurt performance for users who aren't using range deletions, because unneeded std::set operations, so it was reverted by commit 44653c7b7a. (see #3672)

To fix this, move the set to  and add a check in , i.e., file will be added only if  is non-nullptr.

The db_bench command which find the performance regression:
> ./db_bench --benchmarks=fillrandom,seekrandomwhilewriting --threads=1 --num=1000000 --reads=150000 --key_size=66 > --value_size=1262 --statistics=0 --compression_ratio=0.5 --histogram=1 --seek_nexts=1 --stats_per_interval=1 > --stats_interval_seconds=600 --max_background_flushes=4 --num_multi_db=1 --max_background_compactions=16 --seed=1522388277 > -write_buffer_size=1048576 --level0_file_num_compaction_trigger=10000 --compression_type=none

Before and after the modification, I re-run this command on the machine, the results of are as follows:

  **fillrandom**
 Table | P50 | P75 | P99 | P99.9 | P99.99 |
  ---- | --- | --- | --- | ----- | ------ |
 before commit | 5.92 | 8.57 | 19.63 | 980.97 | 12196.00 |
 after commit  | 5.91 | 8.55 | 19.34 | 965.56 | 13513.56 |

 **seekrandomwhilewriting**
  Table | P50 | P75 | P99 | P99.9 | P99.99 |
   ---- | --- | --- | --- | ----- | ------ |
 before commit | 1418.62 | 1867.01 | 3823.28 | 4980.99 | 9240.00 |
 after commit  | 1450.54 | 1880.61 | 3962.87 | 5429.60 | 7542.86 |
Closes https://github.com/facebook/rocksdb/pull/3800

Differential Revision: D7874245

Pulled By: ajkr

fbshipit-source-id: 2e8bec781b3f7399246babd66395c88619534a17
main
LingBin 7 years ago committed by Facebook Github Bot
parent 4c5a3232e4
commit 72942ad7a4
  1. 7
      db/range_del_aggregator.cc
  2. 3
      db/range_del_aggregator.h
  3. 16
      db/table_cache.cc

@ -536,4 +536,11 @@ bool RangeDelAggregator::IsEmpty() {
return true;
}
bool RangeDelAggregator::AddFile(uint64_t file_number) {
if (rep_ == nullptr) {
return true;
}
return rep_->added_files_.emplace(file_number).second;
}
} // namespace rocksdb

@ -6,6 +6,7 @@
#pragma once
#include <map>
#include <set>
#include <string>
#include <vector>
@ -140,6 +141,7 @@ class RangeDelAggregator {
CompactionIterationStats* range_del_out_stats = nullptr,
bool bottommost_level = false);
bool IsEmpty();
bool AddFile(uint64_t file_number);
private:
// Maps tombstone user start key -> tombstone object
@ -166,6 +168,7 @@ class RangeDelAggregator {
struct Rep {
StripeMap stripe_map_;
PinnedIteratorsManager pinned_iters_mgr_;
std::set<uint64_t> added_files_;
};
// Initializes rep_ lazily. This aggregator object is constructed for every
// read, so expensive members should only be created when necessary, i.e.,

@ -247,13 +247,15 @@ InternalIterator* TableCache::NewIterator(
}
}
if (s.ok() && range_del_agg != nullptr && !options.ignore_range_deletions) {
std::unique_ptr<InternalIterator> range_del_iter(
table_reader->NewRangeTombstoneIterator(options));
if (range_del_iter != nullptr) {
s = range_del_iter->status();
}
if (s.ok()) {
s = range_del_agg->AddTombstones(std::move(range_del_iter));
if (range_del_agg->AddFile(fd.GetNumber())) {
std::unique_ptr<InternalIterator> range_del_iter(
table_reader->NewRangeTombstoneIterator(options));
if (range_del_iter != nullptr) {
s = range_del_iter->status();
}
if (s.ok()) {
s = range_del_agg->AddTombstones(std::move(range_del_iter));
}
}
}

Loading…
Cancel
Save