Summary:
MergeUntil was not reporting a success when merging an operand with
a Value/Deletion despite the comments in MergeHelper and CompactionJob
indicating otherwise. This lead to operands being written to the compaction
output unnecessarily:
M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M2 M3 M4 M5 (before the diff)
M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M4 M5 (after the diff)
In addition, the code handling Values/Deletion was basically identical.
This patch unifies the code. Finally, this patch also adds testing for
merge_helper.
Test Plan: make && make check
Reviewers: sdong, rven, yhchiang, tnovak, igor
Reviewed By: igor
Subscribers: tnovak, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42351
Summary:
Previous run may leave some SST files with higher file numbers than manifest indicates.
Compaction or flush may start to run while DB::Open() is still going on. SST file garbage collection may happen interleaving with compaction or flush, and overwrite files generated by compaction of flushes after they are generated. This might cause data loss. This possibility of interleaving is recently introduced.
Fix it by not allowing compaction or flush to be scheduled before DB::Open() finishes.
Test Plan: Add a unit test. This verification will have a chance to fail without the fix but doesn't fix without the fix.
Reviewers: kradhakrishnan, anthony, yhchiang, IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42399
Summary:
Fixes T7697334. Adds a simple test to check whether CompactionJob deals
with corrupted keys correctly. Right now, we preserve corrupted keys.
Note: depending on the type of corruption and options like comparators,
CompactionJob fails. This test just checks whether corrupted keys that
do not fail CompactionJob are preserved.
Test Plan:
`make compaction_job_test && ./compaction_job_test` -> Tests pass.
Add `input->Next(); continue;` in CompactionJob::ProcessKeyValueCompaction()
inside then-branch of `!ParseInternalKey(key, &ikey)` -> Tests fail.
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42237
Summary: Moved convenience.h out of utilities to remove a dependency on utilities in db.
Test Plan: unit tests. Also compiled a link to the old location to verify the _Pragma works.
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42201
Summary:
Trvial move in universal compaction was failing when trying to move files from levels other than 0.
This was because the DeleteFile while trivially moving, was only deleting files of level 0 which caused duplication of same file in different levels.
This is fixed by passing the right level as argument in the call of DeleteFile while doing trivial move.
Test Plan: ./db_test ran successfully with the new test cases.
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42135
Summary: gcc-4.9-glibc-2.20 complains about uninitialized variable.
db/compaction_picker.cc: In member function 'bool
rocksdb::CompactionPicker::IsInputNonOverlapping(rocksdb::Compaction*)':
db/compaction_picker.cc:1174:17: error:
'prev.rocksdb::{anonymous}::InputFileInfo::f' may be used uninitialized in this
function [-Werror=maybe-uninitialized]
InputFileInfo prev, curr, next;
Test Plan: pmake on local environment
Reviewers: sdong igor
CC: leveldb@
Task ID: #
Blame Rev:
Summary:
Remove --help entry for readhot.
Update read_random_exp_range flag description: The distribution is num *
exp(-r), not num * exp(r).
Test Plan: Run ./db_bench --help
Reviewers: sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42303
Summary:
Before, writing key/value pairs out to files was done inside
ProcessKeyValueCompaction(). To make ProcessKeyValueCompaction()
more understandable, this patch moves the writing part to a separate
function. This is intended to be a stepping stone for additional
changes.
Test Plan: make && make check
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42243
Summary: I didn't know this can work :)
Test Plan: none
Reviewers: sdong, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42081
Summary:
Fixes two minor issues in CompactionJob.
CompactionJob::Run() dereferences log_buffer_ without a check, so
this patch adds an assert in the constructor where log_buffer_
is assigned. compaction_job_stats_ can be null but
ProcessKeyValueCompaction was dereferencing it without a check.
Test Plan: make && make check
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42231
Summary: Move UniversalCompaction related db-tests to db_universal_compaction_test.cc
Test Plan:
db_test
db_universal_compaction_test
Reviewers: igor, sdong, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42225
Summary: Make TransactionLogIterator related tests from db_test.cc to db_log_iter_test.cc
Test Plan:
db_test
db_log_iter_test
Reviewers: sdong, IslamAbdelRahman, igor, anthony
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42045
Summary: Seems like the cleanest way to resolve this is to move CompactedDBImpl into db/. CompactedDBImpl should probably live in the same place as DBImplReadonly since Opening the latter could end up instantiating the former. Both DBImplReadonly and CompactedDBImpl inherit from DBImpl access protected members of DBImpl( and the latter access friendly private methods).
Test Plan: unit tests
Reviewers: sdong, yhchiang, kradhakrishnan, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42027
Summary: Remove ColumnFamiliesBatchWithIndexTest from ROCKSDB_LITE since WriteBatchWithIndex is not supported
Test Plan: write_batch_test
Reviewers: yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42033
Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.
Test Plan: none
Reviewers: sdong, yhchiang, anthony, dhruba
Reviewed By: dhruba
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D42063
Summary:
In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
* Less code == better icache hit rate, smaller builds, simpler code
* The motivation for adding timeout_hint_us was to work-around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
* Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.
Test Plan: make check
Reviewers: anthony, kradhakrishnan, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: sdong, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41937
Summary:
We were manipulating `const char*` arrays in CompactionJob to
change the sequence number/types of keys. This patch changes
UpdateInternalKey() to use string methods to do the manipulation
and updates all calls accordingly.
Test Plan:
Added test case for UpdateInternalKey() in dbformat_test.
make && make check
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41985
Summary:
Logging, dealing with key prefix batches and updating stats
moved from CompactionJob::Run() into separate functions.
Test Plan: make all && make check
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41919
Summary: Block reduce_levels_test in ROCKSDB_LITE as LDBCommand is not supported
Test Plan:
make reduce_levels_test -j64
OPT=-DROCKSDB_LITE make reduce_levels_test -j64
make check -j64
Reviewers: sdong, igor, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41967
Summary:
Move global static functions in db_test_util to DBTestBase.
This is to prevent unused function warning when decoupling
db_test.cc into multiple files.
Test Plan: db_test
Reviewers: igor, sdong, anthony, IslamAbdelRahman, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D42009
Summary:
Move reusable part of db_test.cc to util/db_test_util.h.
This makes it more possible to partition db_test.cc into
multiple smaller test files.
Also, fixed many old lint errors in db_test.
Test Plan: db_test
Reviewers: igor, anthony, IslamAbdelRahman, sdong, kradhakrishnan
Reviewed By: sdong, kradhakrishnan
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41973
Summary:
Added new statistics in CompactionJobStats to keep track of
deletion entries and the expiration of those entries. Updated these
fields in compaction_job.cc as compaction took place and wrote a new
test in compaction_job_stats_test.cc to verify accuracy.
Test Plan:
Wrote new test DeletionStatsTest in
compaction_job_stats_test.cc to verify
Reviewers: sdong, igor, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41355
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.
Test Plan: Build it.
Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41961
Summary:
Public API depends on port/port.h which is wrong. Fix it.
Also with gcc 4.8.1 build was broken as MAX_INT32 was not recognized. Fix it by using ::max in linux.
Test Plan: Build it and try to build an external project on top of it.
Reviewers: anthony, yhchiang, kradhakrishnan, igor
Reviewed By: igor
Subscribers: yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41745
Summary: Print whether fast CRC32 is supported in DB info LOG
Test Plan: Run db_bench and see it prints out correctly.
Reviewers: yhchiang, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: MarkCallaghan, yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41733
Summary: Add a perf context counter to help users figure out time spent on reading indexes and bloom filter blocks.
Test Plan: Will write a unit test
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41433
Summary: The t/DBTest.DropWrites test still fails under certain gcc version in release unit test.
I unfortunately cannot repro the failure (since the compilers have mapped library which I am not able to map to correctly). I am suspecting the clock skew.
Test Plan: Run make check
Reviewers:
CC: sdong igore
Task ID: #7312624
Blame Rev:
Summary:
The new flag --cache_index_and_filter_blocks sets
BlockBasedTableOptions.cache_index_and_filter_blocks
Test Plan: make db_bench. Working on benchmarks with the new flag.
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D41481
1) Crash in env_win.cc that prevented db_test run to completion and some new tests
2) Fix new corruption tests in DBTest by allowing a shared trunction of files. Note that this is generally needed ONLY for tests.
3) Close database so WAL is closed prior to inducing corruption similar to what we did within Corruption tests.
Summary: Change the naming style of getter and setters according to Google C++ style in compaction.h file
Test Plan: Compilation success
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41265
Summary: Currently there is no test in the suite to test the case where
there are multiple WAL files and there is a corruption in one of them. We have
tests for single WAL file corruption scenarios. Added tests to mock
the scenarios for all combinations of recovery modes and corruption in
specified file locations.
Test Plan: Run make check
Reviewers: sdong igor
CC: leveldb@
Task ID: #7501229
Blame Rev:
Summary: Coverage test has been occasionally failing due to this timing check.
Test Plan: run test
Reviewers: yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41367
Summary: This change enables trivial move if all the input files are non onverlapping while doing Universal Compaction.
Test Plan: ./compaction_picker_test and db_test ran successfully with the new testcases.
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40875
Summary:
Fixed a bug in test ThreadStatusSingleCompaction where
SyncPoint traces are not cleared before the test begins
its second iteration.
Test Plan: db_test
Reviewers: sdong, anthony, IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41337
Summary:
This diff reverts the following two previous diffs related to
DBIter::FindPrevUserKey(), which makes db_stress unstable.
We should bake a better fix for this.
* "Fix a comparison in DBIter::FindPrevUserKey()"
ec70fea4c4.
* "Fixed endless loop in DBIter::FindPrevUserKey()"
acee2b08a2.
Test Plan: db_stress
Reviewers: anthony, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41301
Summary:
This patch adds three test cases for ExpandWhileOverlapping()
to the compaction_picker_test test suite.
ExpandWhileOverlapping() only has an effect if the comparison
function for the internal keys allows for overlapping user
keys in different SST files on the same level. Thus, this
patch adds a comparator based on sequence numbers to
compaction_picker_test for the new test cases.
Test Plan:
- make compaction_picker_test && ./compaction_picker_test
-> All tests pass
- Replace body of ExpandWhileOverlapping() with `return true`
-> Compile and run ./compaction_picker_test as before
-> New tests fail
Reviewers: sdong, yhchiang, rven, anthony, IslamAbdelRahman, kradhakrishnan, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41277
Summary:
Two issues:
* the input keys to the compaction don't include sequence number.
* sequence number is set to max(seq_num), but it should be set to max(seq_num)+1, because the condition here is strictly-larger (i.e. we will only zero-out sequence number if the DB's sequence number is strictly greater than the key's sequence number): https://github.com/facebook/rocksdb/blob/master/db/compaction_job.cc#L830
Test Plan: make compaction_job_test && ./compaction_job_test
Reviewers: sdong, lovro
Reviewed By: lovro
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41247
Summary:
This fixes the following scenario we've hit:
- we reached max_total_wal_size, created a new wal and scheduled flushing all memtables corresponding to the old one,
- before the last of these flushes started its column family was dropped; the last background flush call was a no-op; no one removed the old wal from alive_logs_,
- hours have passed and no flushes happened even though lots of data was written; data is written to different column families, compactions are disabled; old column families are dropped before memtable grows big enough to trigger a flush; the old wal still sits in alive_logs_ preventing max_total_wal_size limit from kicking in,
- a few more hours pass and we run out disk space because of one huge .log file.
Test Plan: `make check`; backported the new test, checked that it fails without this diff
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D40893
Summary: Make RocksDb build and run on Windows to be functionally
complete and performant. All existing test cases run with no
regressions. Performance numbers are in the pull-request.
Test plan: make all of the existing unit tests pass, obtain perf numbers.
Co-authored-by: Praveen Rao praveensinghrao@outlook.com
Co-authored-by: Sherlock Huang baihan.huang@gmail.com
Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
Summary: Try to allocate LevelFileIteratorState and LevelFileNumIterator from DB iterator's arena, instead of calling malloc and free.
Test Plan: valgrind check
Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40929
Summary: We have a race in the way test works. We avoided the race by adding the
wait to the counter. I thought 1s was eternity, but that is not true in some
scenarios. Increasing the timeout to 10s and adding warnings.
Also, adding nosleep to avoid the case where the wakeup thread is waiting behind
the sleeping thread for scheduling.
Test Plan: Run make check
Reviewers: siying igorcanadi
CC: leveldb@
Task ID: #7312624
Blame Rev:
Summary:
When seek target is a merge key (`kTypeMerge`), `DBIter::FindNextUserEntry()`
advances the underlying iterator _past_ the current key (`saved_key_`); see
`MergeValuesNewToOld()`. However, `FindPrevUserKey()` assumes that `iter_`
points to an entry with the same user key as `saved_key_`. As a result,
`it->Seek(key) && it->Prev()` can cause the iterator to be positioned at the
_next_, instead of the previous, entry (new test, written by @lovro, reproduces
the bug).
This diff changes `FindPrevUserKey()` to also skip keys that are _greater_ than
`saved_key_`.
Test Plan: db_test
Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba, lovro
Differential Revision: https://reviews.facebook.net/D40791