Summary: Seems like the cleanest way to resolve this is to move CompactedDBImpl into db/. CompactedDBImpl should probably live in the same place as DBImplReadonly since Opening the latter could end up instantiating the former. Both DBImplReadonly and CompactedDBImpl inherit from DBImpl access protected members of DBImpl( and the latter access friendly private methods).
Test Plan: unit tests
Reviewers: sdong, yhchiang, kradhakrishnan, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42027
Summary: Remove ColumnFamiliesBatchWithIndexTest from ROCKSDB_LITE since WriteBatchWithIndex is not supported
Test Plan: write_batch_test
Reviewers: yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42033
Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.
Test Plan: none
Reviewers: sdong, yhchiang, anthony, dhruba
Reviewed By: dhruba
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42063
Summary:
In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
* Less code == better icache hit rate, smaller builds, simpler code
* The motivation for adding timeout_hint_us was to work-around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
* Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.
Test Plan: make check
Reviewers: anthony, kradhakrishnan, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: sdong, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41937
Summary:
We were manipulating `const char*` arrays in CompactionJob to
change the sequence number/types of keys. This patch changes
UpdateInternalKey() to use string methods to do the manipulation
and updates all calls accordingly.
Test Plan:
Added test case for UpdateInternalKey() in dbformat_test.
make && make check
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41985
Summary:
Logging, dealing with key prefix batches and updating stats
moved from CompactionJob::Run() into separate functions.
Test Plan: make all && make check
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41919
Summary: This is revert of D40449. I'm not sure why our tests failed before, they seem to be working fine on my devbox.
Test Plan: ran `make check`
Reviewers: anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41955
Summary: Block reduce_levels_test in ROCKSDB_LITE as LDBCommand is not supported
Test Plan:
make reduce_levels_test -j64
OPT=-DROCKSDB_LITE make reduce_levels_test -j64
make check -j64
Reviewers: sdong, igor, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41967
Summary: Block spatial_db_test in ROCKSDB_LITE as SpatialDB is not supported
Test Plan: spatial_db_test
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41991
Summary:
Move global static functions in db_test_util to DBTestBase.
This is to prevent unused function warning when decoupling
db_test.cc into multiple files.
Test Plan: db_test
Reviewers: igor, sdong, anthony, IslamAbdelRahman, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42009
Summary:
Move reusable part of db_test.cc to util/db_test_util.h.
This makes it more possible to partition db_test.cc into
multiple smaller test files.
Also, fixed many old lint errors in db_test.
Test Plan: db_test
Reviewers: igor, anthony, IslamAbdelRahman, sdong, kradhakrishnan
Reviewed By: sdong, kradhakrishnan
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41973
Summary: Adding a main for optimistic_transaction_test that report that it was skipped when using ROCKSDB_LITE
Test Plan: optimistic_transaction_test
Reviewers: yhchiang, sdong, igor, anthony
Reviewed By: anthony
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41979
Summary:
Added new statistics in CompactionJobStats to keep track of
deletion entries and the expiration of those entries. Updated these
fields in compaction_job.cc as compaction took place and wrote a new
test in compaction_job_stats_test.cc to verify accuracy.
Test Plan:
Wrote new test DeletionStatsTest in
compaction_job_stats_test.cc to verify
Reviewers: sdong, igor, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41355
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.
Test Plan: Build it.
Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41961
Summary:
Public API depends on port/port.h which is wrong. Fix it.
Also with gcc 4.8.1 build was broken as MAX_INT32 was not recognized. Fix it by using ::max in linux.
Test Plan: Build it and try to build an external project on top of it.
Reviewers: anthony, yhchiang, kradhakrishnan, igor
Reviewed By: igor
Subscribers: yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41745
Summary: Print whether fast CRC32 is supported in DB info LOG
Test Plan: Run db_bench and see it prints out correctly.
Reviewers: yhchiang, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: MarkCallaghan, yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41733
Summary:
new_table_iterator_nanos is not cleaned in PerfContext::Reset() while new_table_block_iter_nanos is cleaned twice. Fix it.
Also fix a comment.
Test Plan: Build and db_bench with --perf_context to see the value shown.
Reviewers: kradhakrishnan, anthony, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41721
Summary: Add a perf context counter to help users figure out time spent on reading indexes and bloom filter blocks.
Test Plan: Will write a unit test
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41433
Summary: The t/DBTest.DropWrites test still fails under certain gcc version in release unit test.
I unfortunately cannot repro the failure (since the compilers have mapped library which I am not able to map to correctly). I am suspecting the clock skew.
Test Plan: Run make check
Reviewers:
CC: sdong igore
Task ID: #7312624
Blame Rev:
Summary:
The new flag --cache_index_and_filter_blocks sets
BlockBasedTableOptions.cache_index_and_filter_blocks
Test Plan: make db_bench. Working on benchmarks with the new flag.
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41481
1) Crash in env_win.cc that prevented db_test run to completion and some new tests
2) Fix new corruption tests in DBTest by allowing a shared trunction of files. Note that this is generally needed ONLY for tests.
3) Close database so WAL is closed prior to inducing corruption similar to what we did within Corruption tests.
Summary: Change the naming style of getter and setters according to Google C++ style in compaction.h file
Test Plan: Compilation success
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41265
Summary: Currently there is no test in the suite to test the case where
there are multiple WAL files and there is a corruption in one of them. We have
tests for single WAL file corruption scenarios. Added tests to mock
the scenarios for all combinations of recovery modes and corruption in
specified file locations.
Test Plan: Run make check
Reviewers: sdong igor
CC: leveldb@
Task ID: #7501229
Blame Rev:
Summary: Coverage test has been occasionally failing due to this timing check.
Test Plan: run test
Reviewers: yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41367
Summary: This change enables trivial move if all the input files are non onverlapping while doing Universal Compaction.
Test Plan: ./compaction_picker_test and db_test ran successfully with the new testcases.
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40875
Summary:
Fixed a bug in test ThreadStatusSingleCompaction where
SyncPoint traces are not cleared before the test begins
its second iteration.
Test Plan: db_test
Reviewers: sdong, anthony, IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41337
Summary:
Remove assert(current_ == CurrentReverse()) in MergingIterator::Prev()
because it is possible to have some keys larger than the seek-key
inserted between Seek() and SeekToLast(), which makes current_ not
equal to CurrentReverse().
Test Plan: db_stress
Reviewers: igor, sdong, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41331
Summary: Update HISTORY.md for Listener
Test Plan: no code change
Reviewers: igor, sdong, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41325
Summary:
This diff reverts the following two previous diffs related to
DBIter::FindPrevUserKey(), which makes db_stress unstable.
We should bake a better fix for this.
* "Fix a comparison in DBIter::FindPrevUserKey()"
ec70fea4c4.
* "Fixed endless loop in DBIter::FindPrevUserKey()"
acee2b08a2.
Test Plan: db_stress
Reviewers: anthony, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41301
Summary:
This patch adds three test cases for ExpandWhileOverlapping()
to the compaction_picker_test test suite.
ExpandWhileOverlapping() only has an effect if the comparison
function for the internal keys allows for overlapping user
keys in different SST files on the same level. Thus, this
patch adds a comparator based on sequence numbers to
compaction_picker_test for the new test cases.
Test Plan:
- make compaction_picker_test && ./compaction_picker_test
-> All tests pass
- Replace body of ExpandWhileOverlapping() with `return true`
-> Compile and run ./compaction_picker_test as before
-> New tests fail
Reviewers: sdong, yhchiang, rven, anthony, IslamAbdelRahman, kradhakrishnan, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41277
Summary:
Two issues:
* the input keys to the compaction don't include sequence number.
* sequence number is set to max(seq_num), but it should be set to max(seq_num)+1, because the condition here is strictly-larger (i.e. we will only zero-out sequence number if the DB's sequence number is strictly greater than the key's sequence number): https://github.com/facebook/rocksdb/blob/master/db/compaction_job.cc#L830
Test Plan: make compaction_job_test && ./compaction_job_test
Reviewers: sdong, lovro
Reviewed By: lovro
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41247
Summary:
While profiling compaction in our service I noticed a lot of CPU (~15% of compaction) being spent in MergingIterator and key comparison. Looking at the code I found MergingIterator was (understandably) using std::priority_queue for the multiway merge.
Keys in our dataset include sequence numbers that increase with time. Adjacent keys in an L0 file are very likely to be adjacent in the full database. Consequently, compaction will often pick a chunk of rows from the same L0 file before switching to another one. It would be great to avoid the O(log K) operation per row while compacting.
This diff replaces std::priority_queue with a custom binary heap implementation. It has a "replace top" operation that is cheap when the new top is the same as the old one (i.e. the priority of the top entry is decreased but it still stays on top).
Test Plan:
make check
To test the effect on performance, I generated databases with data patterns that mimic what I describe in the summary (rows have a mostly increasing sequence number). I see a 10-15% CPU decrease for compaction (and a matching throughput improvement on tmpfs). The exact improvement depends on the number of L0 files and the amount of locality. Performance on randomly distributed keys seems on par with the old code.
Reviewers: kailiu, sdong, igor
Reviewed By: igor
Subscribers: yoshinorim, dhruba, tnovak
Differential Revision: https://reviews.facebook.net/D29133