Summary:
Introduce DeleteScheduler that allow enforcing a rate limit on file deletion
Instead of deleting files immediately, files are moved to trash directory and deleted in a background thread that apply sleep penalty between deletes if needed.
I have updated PurgeObsoleteFiles and PurgeObsoleteWALFiles to use the delete_scheduler instead of env_->DeleteFile
Test Plan:
added delete_scheduler_test
existing unit tests
Reviewers: kradhakrishnan, anthony, rven, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D43221
Summary:
Fixed RocksJava test failure of shouldSetTestCappedPrefixExtractor
by adding the missing native implementation of
useCappedPrefixExtractor.
Test Plan:
make jclean
make rocksdbjava -j32
make jtest
Reviewers: igor, anthony, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43551
Summary:
MyRocks is using jemalloc latest version, not 3.6.0.
Combining multiple versions (3.6.0 in RocksDB and latest in MyRocks)
broke some features -- for example, getting SIGSEGV when heap profiling
was enabled.
This diff switches to use jemalloc latest, if
env variable ROCKSDB_FBCODE_BUILD_WITH_481=1 was set.
My understanding is this env was used by MyRocks only so it would be
safe to change.
Test Plan: building MyRocks then verified jemalloc heap profiling worked
Reviewers: igor, rven, yhchiang, jtolmer, maykov, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D43479
Summary:
Merge pull request #665 by adamretter
Exposes BackupEngine from C++ to the Java API. Previously only BackupableDB was available
Test Plan: BackupEngineTest.java
Reviewers: fyrz, igor, ankgup87, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42873
Summary:
Make DBCompactionTest.SkipStatsUpdateTest more stable by
removing flaky but unnecessary assertion on the size of db
as simply checking the random file open count is suffice.
Test Plan: db_compaction_test
Reviewers: igor, anthony, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43533
Summary: In a recent change, crash_test can put data under TEST_TMPDIR. However, the directory is not cleaned before running the test, which may cause unexpected results. Clean it.
Test Plan: Run white and black box crash test against non-existing, or non-empty but not compactible DBs, and make sure it works as expected.
Reviewers: kradhakrishnan, rven, yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D43515
Summary:
UpdateAccumulatedStats() is used to optimize compaction decision
esp. when the number of deletion entries are high, but this function
can slowdown DBOpen esp. in disk environment.
This patch adds DBOptions::skip_sats_update_on_db_open, which skips
UpdateAccumulatedStats() in DB::Open() time when it's set to true.
Test Plan: Add DBCompactionTest.SkipStatsUpdateTest
Reviewers: igor, anthony, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: tnovak, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42843
Summary: Currently, whitebox crash test is not really executed, because the DB is destroyed after each crash. With this fix, in the first half of the time, DB will keep opening the crashed DB and continue from there.
Test Plan: "make whitebox_crash_test" and see the same DB keeps crashing and being reopened.
Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D43503
Summary: Currently crash_test only puts data under /tmp. It is less flexible if we want to cover different file systems or media. Make crash_test to appreciate TEST_TMPDIR so that users can run it against another file system.
Test Plan: Run blackbox_crash_test and whitebox_crash_test with or without TEST_TMPDIR set and make sure DBs are put in the right place
Reviewers: kradhakrishnan, yhchiang, rven, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D43509
Summary:
crash_test now only runs complicated options, multiple column families, prefix hash, frequently changing options, many compaction threads, etc. These options are good to cover new features but we loss coverage in most common use cases. Furthermore, by running only for multiple column families, we are not able to create LSM trees that are large enough to cover some stress cases.
Make half of crash_test runs the simply tests: single column family, default mem table, one compaction thread, no change options.
Test Plan: Run crash_test
Reviewers: rven, yhchiang, IslamAbdelRahman, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D43461
Summary: So I took a look and I used a pointer to TableBuilder. Changed it to a unique_ptr. I think this should work, but I cannot run valgrind correctly on my local machine to test it.
Test Plan: Run valgrind, but it's not working locally. It says I'm executing an unrecognized instruction.
Reviewers: yhchiang
Subscribers: dhruba, sdong
Differential Revision: https://reviews.facebook.net/D43485
Summary: In "sst_dump --show_compression_sizes", a reference of CompressionOptions is kept in TableBuilderOptions, which is destroyed later, causing a memory issue.
Test Plan: Run valgrind against SSTDumpToolTest.CompressedSizes and make sure it is fixed
Reviewers: IslamAbdelRahman, yhchiang, kradhakrishnan, rven
Reviewed By: rven
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D43497
Summary:
This diff adds CompactOnDeletionCollector in utilities/table_properties_collectors,
which applies a sliding window to a sst file and mark this file as need-compaction
when it observe enough deletion entries within the consecutive keys covered by
the sliding window.
Test Plan: compact_on_deletion_collector_test
Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yoshinorim, sdong
Reviewed By: sdong
Subscribers: maykov, dhruba
Differential Revision: https://reviews.facebook.net/D41175
Summary:
The compact files API had a bug where some overlapping files
are not added. These are files which overlap with files which were
added to the compaction input files, but not to the original set of
input files. This happens only when there are more than two levels
involved in the compaction. An example will illustrate this better.
Level 2 has 1 input file 1.sst which spans [20,30].
Level 3 has added file 2.sst which spans [10,25]
Level 4 has file 3.sst which spans [35,40] and
input file 4.sst which spans [46,50].
The existing code would not add 3.sst to the set of input_files because
it only becomes an overlapping file in level 4 and it wasn't one in
level 3.
When installing the results of the compaction, 3.sst would overlap with
output file from the compact files and result in the assertion in
version_set.cc:1130
// Must not overlap
assert(level <= 0 || level_files->empty() ||
internal_comparator_->Compare(
(*level_files)[level_files->size() - 1]->largest, f->smallest) <
0);
This change now adds overlapping files from the current level to the set
of input files also so that we don't hit the assertion above.
Test Plan:
d=/tmp/j; rm -rf $d; seq 1000 | parallel --gnu --eta
'd=/tmp/j/d-{}; mkdir -p $d; TEST_TMPDIR=$d ./db_compaction_test
--gtest_filter=*CompactilesOnLevel* --gtest_also_run_disabled_tests >&
'$d'/log-{}'
Reviewers: igor, yhchiang, sdong
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43437
Summary:
Made SuggestCompactRangeNoTwoLevel0Compactions by forcing
a flush after generating a file and waiting for compaction at the end.
Test Plan: Run SuggestCompactRangeNoTwoLevel0Compactions
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43449
Summary:
As of now compactions involving files from Level 0 and Level 1 are single
threaded because the files in L0, although sorted, are not range partitioned like
the other levels. This means that during L0-L1 compaction each file from L1
needs to be merged with potentially all the files from L0.
This attempt to parallelize the L0-L1 compaction assigns a thread and a
corresponding iterator to each L1 file that then considers only the key range
found in that L1 file and only the L0 files that have those keys (and only the
specific portion of those L0 files in which those keys are found). In this way
the overlap is minimized and potentially eliminated between different iterators
focusing on the same files.
The first step is to restructure the compaction logic to break L0-L1 compactions
into multiple, smaller, sequential compactions. Eventually each of these smaller
jobs will be run simultaneously. Areas to pay extra attention to are
# Correct aggregation of compaction job statistics across multiple threads
# Proper opening/closing of output files (make sure each thread's is unique)
# Keys that span multiple L1 files
# Skewed distributions of keys within L0 files
Test Plan: Make and run db_test (newer version has separate compaction tests) and compaction_job_stats_test
Reviewers: igor, noetzli, anthony, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: MarkCallaghan, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42699
Summary: Now ldb dump_manifest refuses to work if there are 20 levels. Extend the limit to 64.
Test Plan: Run the tool with 20 number of levels
Reviewers: kradhakrishnan, anthony, IslamAbdelRahman, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42879
Summary:
sst_dump_tool contains two instances of `fprintf`s where the `format` argument is not
a string literal. This prevents the code from compiling with some compilers/compiler
options because of the potential security risks associated with printing non-literals.
Test Plan: make all
Reviewers: rven, igor, yhchiang, sdong, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43305
Summary:
There was a bug in table_properties_collector_test that this patch
is fixing: `!backward_mode && !test_int_tbl_prop_collector` in
TestCustomizedTablePropertiesCollector was never true, so the code
in the if-block never got executed. The reason is that the
CustomizedTablePropertiesCollector test was skipping tests with
`!backward_mode_ && !encode_as_internal`. The reason for skipping
the tests is unknown.
Test Plan: make table_properties_collector_test && ./table_properties_collector_test
Reviewers: rven, igor, yhchiang, anthony, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43281
Summary:
Added a new feature to sst_dump_tool.cc to allow a user to see the sizes of the different compression algorithms on an .sst file.
Usage:
./sst_dump --file=<filename> --show_compression_sizes
./sst_dump --file=<filename> --show_compression_sizes --set_block_size=<block_size>
Note: If you do not set a block size, it will default to 16kb
Test Plan: manual test and the write a unit test
Reviewers: IslamAbdelRahman, anthony, yhchiang, rven, kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42963
Summary:
Support RollbackToSavePoint() in WriteBatch and WriteBatchWithIndex. Support for partial transaction rollback is needed for MyRocks.
An alternate implementation of Transaction::RollbackToSavePoint() exists in D40869. However, the other implementation is messier because it is implemented outside of WriteBatch. This implementation is much cleaner and also exposes a potentially useful feature to WriteBatch.
Test Plan: Added unit tests
Reviewers: IslamAbdelRahman, kradhakrishnan, maykov, yoshinorim, hermanlee4, spetrunia, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42723
Summary:
Crash tests are supposed to restart the same DB after crashing, but it is now opening a different DB. Fix it.
It's probably a leftover of https://reviews.facebook.net/D17073
Test Plan: Run the test and make sure the same Db is opened.
Reviewers: kradhakrishnan, rven, igor, IslamAbdelRahman, yhchiang, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D43197
Summary:
For task #7771355, we would like to log the number of corrupt keys
during a compaction. This patch implements and tests the count
as part of CompactionJobStats.
Test Plan: make && make check
Reviewers: rven, igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42921
Summary: Adds the Java build and tests to Travis
Test Plan: Make sure that Travis still runs (does currently)
Reviewers: igor, fyrz, sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D43173
Summary: Fix for universal compaction with trivial move, when the ouput level is 0. The tests where failing. Fixed by allowing normal compaction when output level is 0.
Test Plan: modified test cases run successfully.
Reviewers: sdong, yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: anthony, kradhakrishnan, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42933
Summary:
Whenever a Java class implements equals(), it has to implement hashCode(), otherwise
there might be weird behavior when inserting instances of the class in a hash map for
example. This adds two missing hashCode() implementations and extends tests to test
the hashCode() implementations.
Test Plan: make jtest
Reviewers: rven, igor, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: anthony, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43017
Summary:
While working on https://reviews.facebook.net/D43017 , I realized
that some Java tests are failing due to a deprecated option.
This patch removes the offending tests, adds @Deprecated annotations
to the Java interface and removes the corresponding functions in
rocksjni
Test Plan: make jtest (all tests are passing now)
Reviewers: rven, igor, sdong, anthony, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43035
* std::chrono does not provide enough granularity for microsecs and periodically emits
duplicates
* the bug is manifested in log rotation logic where we get duplicate
log file names and loose previous log content
* msvc does not imlement COW on std::strings adjusted the test to use
refs in the loops as auto does not retain ref info
* adjust auto_log rotation test with Windows specific command to remove
a folder. The test previously worked because we have unix utils installed
in house but this may not be the case for everyone.
Summary: DBCompactionTest.PartialCompactionFailure has a risk that one flush job writes out two mem tables into one file, so that the total files flushed are less than expected. Fix it by writing for flush to finish after every write.
Test Plan: Run the test
Reviewers: IslamAbdelRahman, kradhakrishnan, yhchiang, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42831
Summary:
My latest fix to pragma_error.h caused compilation errors for another internal project. I am now unable to figure out how to get pragma_error working on all platforms and build environments (nor am I able to test any other options).
Seems like the best option is to get rid of this macro. include/utilities has been deprecated for a year now, so lets just deal with a breaking change in 3.13 to remove these files. And I guess we'll have to live with having an extra convenience.h.
Thoughts?
Test Plan: build
Reviewers: igor, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42597
Summary: These tests used to fail if a compaction happened between flushing tables and enumerating them to get properties.
Test Plan: this reports occasional failures without this diff and no failures with it: `for i in {1..10000}; do echo $i; done | parallel --gnu -j100 'TEST_TMPDIR=`TMPDIR=/dev/shm/rockstemp mktemp -d -t` ./db_test --gtest_filter=DBTest.GetUserDefinedTablaProperties >&/dev/null || echo {} failed'`
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42861
Summary: https://reviews.facebook.net/D42321 has left PosixMmapFile in some weird state. This diff removes pending_sync_ that was now unused, fixes indentation and prevents Fsync() from calling both fsync() and fdatasync().
Test Plan: `make -j check`
Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42885
Summary:
I'll just copy internal task summary here:
"
This sequence will cause data loss in the middle after an sync write:
non-sync write key 1
flush triggered, not yet scheduled
sync write key 2
system crash
After rebooting, users might see key 2 but not key 1, which violates the API of sync write.
This can be reproduced using unit test FaultInjectionTest::DISABLED_WriteOptionSyncTest.
One way to fix it is for a sync write, if there is outstanding unsynced log files, we need to syc them too.
"
This diff should be considered together with the next diff D40905; in isolation this fix probably could be a little simpler.
Test Plan: `make check`; added a test for that (DBTest.SyncingPreviousLogs) before noticing FaultInjectionTest.WriteOptionSyncTest (keeping both since mine asserts a bit more); both tests fail without this diff; for D40905 stacked on top of this diff, ran tests with ASAN, TSAN and valgrind
Reviewers: rven, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D40899
Summary:
Fixes T6548822. Added a new function for estimating the size of the live data
as proposed in the task. The value can be accessed through the property
rocksdb.estimate-live-data-size.
Test Plan:
There are two unit tests in version_set_test and a simple test in db_test.
make version_set_test && ./version_set_test;
make db_test && ./db_test gtest_filter=GetProperty
Reviewers: rven, igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41493
Summary: Directly using TMPDIR can cause problems when running tests using parallel option. Fix them.
Test Plan: Run all tests in parallel
Reviewers: kradhakrishnan, yhchiang, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42807