Summary:
ColumnFamilyOptions::compaction_options_fifo and all its sub-fields can be set dynamically now.
Some of the ways in which the fifo compaction options can be set are:
- `SetOptions({{"compaction_options_fifo", "{max_table_files_size=1024}"}})`
- `SetOptions({{"compaction_options_fifo", "{ttl=600;}"}})`
- `SetOptions({{"compaction_options_fifo", "{max_table_files_size=1024;ttl=600;}"}})`
- `SetOptions({{"compaction_options_fifo", "{max_table_files_size=51;ttl=49;allow_compaction=true;}"}})`
Most of the code has been made generic enough so that it could be reused later to make universal options (and other such nested defined-types) dynamic with very few lines of parsing/serializing code changes.
Introduced a few new functions like `ParseStruct`, `SerializeStruct` and `GetStringFromStruct`.
The duplicate code in `GetStringFromDBOptions` and `GetStringFromColumnFamilyOptions` has been moved into `GetStringFromStruct`. So they become just simple wrappers now.
Closes https://github.com/facebook/rocksdb/pull/3006
Differential Revision: D6058619
Pulled By: sagar0
fbshipit-source-id: 1e8f78b3374ca5249bb4f3be8a6d3bb4cbc52f92
Summary:
Allow users to rate limit background work based on read bytes, written bytes, or sum of read and written bytes. Support these by changing the RateLimiter API, so no additional options were needed.
Closes https://github.com/facebook/rocksdb/pull/2433
Differential Revision: D5216946
Pulled By: ajkr
fbshipit-source-id: aec57a8357dbb4bfde2003261094d786d94f724e
Summary:
First cut for early review; there are few conceptual points to answer and some code structure issues.
For conceptual points -
- restriction-wise, we're going to disallow ingest_behind if (use_seqno_zero_out=true || disable_auto_compaction=false), the user is responsible to properly open and close DB with required params
- we wanted to ingest into reserved bottom most level. Should we fail fast if bottom level isn't empty, or should we attempt to ingest if file fits there key-ranges-wise?
- Modifying AssignLevelForIngestedFile seems the place we we'd handle that.
On code structure - going to refactor GenerateAndAddExternalFile call in the test class to allow passing instance of IngestionOptions, that's just going to incur lots of changes at callsites.
Closes https://github.com/facebook/rocksdb/pull/2144
Differential Revision: D4873732
Pulled By: lightmark
fbshipit-source-id: 81cb698106b68ef8797f564453651d50900e153a
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090
Differential Revision: D4833681
Pulled By: siying
fbshipit-source-id: 2fd8bef
Summary:
…action
The two options, min_partial_merge_operands and verify_checksums_in_compaction, are not seldom used. Remove them to reduce the total number of options. Also remove them from Java and C interface.
Closes https://github.com/facebook/rocksdb/pull/1902
Differential Revision: D4601219
Pulled By: siying
fbshipit-source-id: aad4cb2
Summary:
Remove disableDataSync, and another similarly named disable_data_sync options.
This is being done to simplify options, and also because the performance gains of this feature can be achieved by other methods.
Closes https://github.com/facebook/rocksdb/pull/1859
Differential Revision: D4541292
Pulled By: sagar0
fbshipit-source-id: 5b3a6ca
Summary:
- Made RangeDelAggregator's InternalKeyComparator member a reference-to-const so we don't need to copy-construct it. Also added InternalKeyComparator to ImmutableCFOptions so we don't need to construct one for each DBIter.
- Made MemTable::NewRangeTombstoneIterator and the table readers' NewRangeTombstoneIterator() functions return nullptr instead of NewEmptyInternalIterator to avoid the allocation. Updated callers accordingly.
Closes https://github.com/facebook/rocksdb/pull/1548
Differential Revision: D4208169
Pulled By: ajkr
fbshipit-source-id: 2fd65cf
Summary:
Currently our skip-list have an optimization to speedup sequential
inserts from a single stream, by remembering the last insert position.
We extend the idea to support sequential inserts from multiple streams,
and even tolerate small reordering wihtin each stream.
This PR is the interface part adding the following:
- Add `memtable_insert_prefix_extractor` to allow specifying prefix for each key.
- Add `InsertWithHint()` interface to memtable, to allow underlying
implementation to return a hint of insert position, which can be later
pass back to optimize inserts.
- Memtable will maintain a map from prefix to hints and pass the hint
via `InsertWithHint()` if `memtable_insert_prefix_extractor` is non-null.
Closes https://github.com/facebook/rocksdb/pull/1419
Differential Revision: D4079367
Pulled By: yiwu-arbug
fbshipit-source-id: 3555326
Summary:
change ioptions.comparator to user_comparator instread of internal_comparator.
Also change Comparator* to InternalKeyComparator* to make its type explicitly.
Test Plan: make all check -j64
Reviewers: andrewkr, sdong, yiwu
Reviewed By: yiwu
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D65121
Summary:
We always run consistency checks when compiling in debug mode
allow users to set Options::force_consistency_checks to true to be able to run such checks even when compiling in release mode
Test Plan:
make check -j64
make release
Reviewers: lightmark, sdong, yiwu
Reviewed By: yiwu
Subscribers: hermanlee4, andrewkr, yoshinorim, jkedgar, dhruba
Differential Revision: https://reviews.facebook.net/D64701
Summary:
I saw this exception thrown because sometimes we may resize with -ve value
if we have empty max_bytes_for_level_multiplier_additional vector
Test Plan: run the tests
Reviewers: yiwu
Reviewed By: yiwu
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D64791
Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is barely a placeholder for now. I'll start to move options to MutableDBOptions in following diffs.
Test Plan:
make all check
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D64065
Summary:
* Change constructor of MutableCFOptions to depends only on ColumnFamilyOptions.
* Move `max_subcompactions`, `compaction_options_fifo` and `compaction_pri` to ImmutableCFOptions to make it clear that they are immutable.
Test Plan: existing unit tests.
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D63945
Summary: There's no reference to ImmutableCFOptions elsewhere in /include/rocksdb. ImmutableCFOptions was introduced in this commit (5665e5e285) but later its reference in /include/rocksdb/table.h is removed.
Test Plan:
make all check
Reviewers: IslamAbdelRahman, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: yhchiang, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D63177
Summary: To reduce number of options, merge source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes.
Test Plan: Add two new unit tests. Run all existing tests, including jtest.
Reviewers: yhchiang, igor, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59829
Summary: Extend the option memtable_prefix_bloom_huge_page_tlb_size from just putting memtable bloom filter to huge page to memtable itself too.
Test Plan: Run all existing tests.
Reviewers: IslamAbdelRahman, yhchiang, andrewkr
Reviewed By: andrewkr
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60513
Summary: filter_deltes is not a frequently used feature. Remove it.
Test Plan: Run all test suites.
Reviewers: igor, yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59427
Summary:
memtable_prefix_bloom_probes is not a critical option. Remove it to reduce number of options.
It's easier for users to make mistakes with memtable_prefix_bloom_bits, turn it to memtable_prefix_bloom_bits_ratio
Test Plan: Run all existing tests
Reviewers: yhchiang, igor, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: gunnarku, yoshinorim, MarkCallaghan, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59199
Summary: Deprecate options.soft_rate_limit, which is hard to tune, with options.soft_pending_compaction_bytes_limit, which would trigger the slowdown if estimated pending compaction bytes exceeds the threshold. The hope is to make it more striaght-forward to tune.
Test Plan: Modify DBTest.SoftLimit to cover options.soft_pending_compaction_bytes_limit instead; run all unit tests.
Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, igor, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D51117
Summary: Add an option to stop writes if compaction lefts behind. If estimated pending compaction bytes is more than threshold specified by options.hard_pending_compaction_bytes_liimt, writes will stop until compactions are cleared to under the threshold.
Test Plan: Add unit test DBTest.HardLimit
Reviewers: rven, kradhakrishnan, anthony, IslamAbdelRahman, yhchiang, igor
Reviewed By: igor
Subscribers: MarkCallaghan, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D45999
Summary:
When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.
We have a special case that when you set max_background_flushes to 0, we
schedule the flush to execute on the compaction thread.
Test Plan: make check (there might be some unit tests that depend on this behavior)
Reviewers: IslamAbdelRahman, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D41931
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.
Test Plan: Build it.
Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41961
Summary:
Public API depends on port/port.h which is wrong. Fix it.
Also with gcc 4.8.1 build was broken as MAX_INT32 was not recognized. Fix it by using ::max in linux.
Test Plan: Build it and try to build an external project on top of it.
Reviewers: anthony, yhchiang, kradhakrishnan, igor
Reviewed By: igor
Subscribers: yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41745
Summary: Make RocksDb build and run on Windows to be functionally
complete and performant. All existing test cases run with no
regressions. Performance numbers are in the pull-request.
Test plan: make all of the existing unit tests pass, obtain perf numbers.
Co-authored-by: Praveen Rao praveensinghrao@outlook.com
Co-authored-by: Sherlock Huang baihan.huang@gmail.com
Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
Summary:
When having fixed max_bytes_for_level_base, the ratio of size of largest level and the second one can range from 0 to the multiplier. This makes LSM tree frequently irregular and unpredictable. It can also cause poor space amplification in some cases.
In this improvement (proposed by Igor Kabiljo), we introduce a parameter option.level_compaction_use_dynamic_max_bytes. When turning it on, RocksDB is free to pick a level base in the range of (options.max_bytes_for_level_base/options.max_bytes_for_level_multiplier, options.max_bytes_for_level_base] so that real level ratios are close to options.max_bytes_for_level_multiplier.
Test Plan: New unit tests and pass tests suites including valgrind.
Reviewers: MarkCallaghan, rven, yhchiang, igor, ikabiljo
Reviewed By: ikabiljo
Subscribers: yoshinorim, ikabiljo, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31437
Summary:
The very last reference happens in DBImpl::GetOptions()
I built with both DBImpl::GetOptions() and ColumnFamilyData::options() commented out
Test Plan: make all check
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D29073
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.
This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.
Test Plan: compiles
Reviewers: ljin, rven, yhchiang, sdong
Reviewed By: yhchiang
Subscribers: bobbaldwin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28689
Summary: as title
Test Plan: unit test
Reviewers: sdong, yhchiang, rven, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D25347
Summary: as title
Test Plan: make release
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24903
Summary:
make compaction related options changeable. Most of changes are tedious,
following the same convention: grabs MutableCFOptions at the beginning
of compaction under mutex, then pass it throughout the job and register
it in SuperVersion at the end.
Test Plan: make all check
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23349