Fix CompactFiles by adding all necessary files

Summary:
The compact files API had a bug where some overlapping files
are not added. These are files which overlap with files which were
added to the compaction input files, but not to the original set of
input files. This happens only when there are more than two levels
involved in the compaction. An example will illustrate this better.

Level 2 has 1 input file 1.sst which spans [20,30].

Level 3 has added file  2.sst which spans [10,25]

Level 4 has file 3.sst which spans [35,40] and
        input file 4.sst which spans [46,50].

The existing code would not add 3.sst to the set of input_files because
it only becomes an overlapping file in level 4 and it wasn't one in
level 3.

When installing the results of the compaction, 3.sst would overlap with
output file from the compact files and result in the assertion in
version_set.cc:1130

 // Must not overlap
   assert(level <= 0 || level_files->empty() ||
            internal_comparator_->Compare(
                (*level_files)[level_files->size() - 1]->largest, f->smallest) <
                0);
This change now adds overlapping files from the current level to the set
of input files also so that we don't hit the assertion above.

Test Plan:
d=/tmp/j; rm -rf $d; seq 1000 | parallel --gnu --eta
'd=/tmp/j/d-{}; mkdir -p $d; TEST_TMPDIR=$d ./db_compaction_test
--gtest_filter=*CompactilesOnLevel* --gtest_also_run_disabled_tests >&
'$d'/log-{}'

Reviewers: igor, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D43437
main
Venkatesh Radhakrishnan 10 years ago
parent 87df6295dd
commit 20b244fcca
  1. 7
      db/compaction_picker.cc
  2. 4
      db/db_compaction_test.cc

@ -710,7 +710,12 @@ Status CompactionPicker::SanitizeCompactionInputFilesForAllLevels(
aggregated_file_meta.largestkey = largestkey;
// For all lower levels, include all overlapping files.
for (int m = l + 1; m <= output_level; ++m) {
// We need to add overlapping files from the current level too because even
// if there no input_files in level l, we would still need to add files
// which overlap with the range containing the input_files in levels 0 to l
// Level 0 doesn't need to be handled this way because files are sorted by
// time and not by key
for (int m = std::max(l, 1); m <= output_level; ++m) {
for (auto& next_lv_file : levels[m].files) {
if (HaveOverlappingKeyRanges(
comparator, aggregated_file_meta, next_lv_file)) {

@ -1180,8 +1180,8 @@ TEST_F(DBCompactionTest, FilesDeletedAfterCompaction) {
} while (ChangeCompactOptions());
}
// TODO(t6534343) -- Don't run two level 0 CompactFiles concurrently
TEST_F(DBCompactionTest, DISABLED_CompactFilesOnLevelCompaction) {
// Check level comapction with compact files
TEST_F(DBCompactionTest, CompactFilesOnLevelCompaction) {
const int kTestKeySize = 16;
const int kTestValueSize = 984;
const int kEntrySize = kTestKeySize + kTestValueSize;

Loading…
Cancel
Save