rocksdb

Commit Graph

Author	SHA1	Message	Date
Andrew Kryczka	23593171c4	minor improvements to db_stress Summary: fix some things that made this command hard to use from CLI: - use default values for `target_file_size_base` and `max_bytes_for_level_base`. previously we were using small values for these but default value of `write_buffer_size`, which led to enormous number of L1 files. - failure message for `value_size_mult` too big. previously there was just an assert, so in non-debug mode it'd overrun the value buffer and crash mysteriously. - only print verification success if there's no failure. before it'd print both in the failure case. - support `memtable_prefix_bloom_size_ratio` - support `num_bottom_pri_threads` (universal compaction) Closes https://github.com/facebook/rocksdb/pull/2741 Differential Revision: D5629495 Pulled By: ajkr fbshipit-source-id: ddad97d6d4ba0884e7c0f933b0a359712514fc1d	8 years ago
Andrew Kryczka	af012c0f83	fix deleterange with memtable prefix bloom Summary: the range delete tombstones in memtable should be added to the aggregator even when the memtable's prefix bloom filter tells us the lookup key's not there. This bug could cause data to temporarily reappear until the memtable containing range deletions is flushed. Reported in #2743. Closes https://github.com/facebook/rocksdb/pull/2745 Differential Revision: D5639007 Pulled By: ajkr fbshipit-source-id: 04fc6facb6f978340a3f639536f4ca7c0d73dfc9	8 years ago
Andrew Kryczka	1c8dbe2aa2	update scores after picking universal compaction Summary: We forgot to recompute compaction scores after picking a universal compaction like we do in level compaction (`a34b2e388e/db/compaction_picker.cc (L691-L695)`). This leads to a fairness issue where we waste compactions on CFs/DB instances that don't need it while others can starve. Previously, `ccecf3f4fb` fixed the issue for the read-amp-based compaction case; this PR avoids the issue earlier and also for size-ratio-based compactions. Closes https://github.com/facebook/rocksdb/pull/2688 Differential Revision: D5566191 Pulled By: ajkr fbshipit-source-id: 010bccb2a107f6a76f3d3022b90aadce5cc48feb	8 years ago
Maysam Yabandeh	eb6425303e	Update WritePrepared with the pseudo code Summary: Implement the main body of WritePrepared pseudo code. This includes PrepareInternal and CommitInternal, as well as AddCommitted which updates the commit map. It also provides an IsInSnapshot method that could later be called from the read path to decide if a version is in the read snapshot or should otherwise be skipped. This patch lacks unit tests and does not attempt to offer an efficient implementation. The idea is to have the API specified so that we can work on related tasks in parallel. Closes https://github.com/facebook/rocksdb/pull/2713 Differential Revision: D5640021 Pulled By: maysamyabandeh fbshipit-source-id: bfa7a05e8d8498811fab714ce4b9c21530514e1c	8 years ago
Sagar Vemuri	132306fbf0	Remove PartialMerge implementation from Cassandra merge operator Summary: `PartialMergeMulti` implementation is enough for Cassandra, and `PartialMerge` is not required. Implementing both will just duplicate the code. As per https://github.com/facebook/rocksdb/blob/master/include/rocksdb/merge_operator.h#L130-L135 : ``` // The default implementation of PartialMergeMulti will use this function // as a helper, for backward compatibility. Any successor class of // MergeOperator should either implement PartialMerge or PartialMergeMulti, // although implementing PartialMergeMulti is suggested as it is in general // more effective to merge multiple operands at a time instead of two // operands at a time. ``` Closes https://github.com/facebook/rocksdb/pull/2737 Reviewed By: scv119 Differential Revision: D5633073 Pulled By: sagar0 fbshipit-source-id: ef4fa102c22fec6a0175ed12f5c44c15afe3c8ca	8 years ago
Siying Dong	71598cdc75	Fix false removal of tombstone issue in FIFO and kCompactionStyleNone Summary: Similar to the bug fixed by https://github.com/facebook/rocksdb/pull/2726, FIFO with compaction and kCompactionStyleNone during user customized CompactFiles() with output level to be 0 can suffer from the same problem. Fix it by leveraging the bottommost_level_ flag. Closes https://github.com/facebook/rocksdb/pull/2735 Differential Revision: D5626906 Pulled By: siying fbshipit-source-id: 2b148d0461c61dbd986d74655e384419ae442158	8 years ago
lxcode	3204a4f64b	Fix missing stdlib include required for abort() Summary: If ROCKSDB_LITE is defined, a call to abort() is introduced. This call requires stdlib.h. Build log of unpatched 5.7.1: http://beefy9.nyi.freebsd.org/data/110amd64-default/447974/logs/rocksdb-lite-5.7.1.log Closes https://github.com/facebook/rocksdb/pull/2744 Reviewed By: yiwu-arbug Differential Revision: D5632372 Pulled By: lxcode fbshipit-source-id: b2a8e692bf14ccf1f875f3a00463e87bba310a2b	8 years ago
Andrew Kryczka	7aa96db7a2	db_stress rolling active window Summary: Support a window of `active_width` keys that rolls through `[0, max_key)` over the duration of the test. Operations only affect keys inside the window. This gives us the ability to detect L0->L0 deletion bug (#2722). Closes https://github.com/facebook/rocksdb/pull/2739 Differential Revision: D5628555 Pulled By: ajkr fbshipit-source-id: 9cb2d8f4ab1a7c73f7797b8e19f7094970ea8749	8 years ago
Neal Poole	dfa6c23c4b	Update RocksDBCommonHelper to use escapeshellarg Summary: Most of the data used here in shell commands is not generated directly from user input, but some data (e.g., from environment variables) may have been externally influenced. It is good practice to escape this data before using it in a shell command. Originally D4800264 but we never quite got it merged. Reviewed By: yiwu-arbug Differential Revision: D5595052 fbshipit-source-id: c09d8b47fe35fc6a47afb4933ccad9d56ca8d7be	8 years ago
yiwu-arbug	e367774d19	Overload new[] to properly align LRUCacheShard Summary: Also verify it fixes gcc7 compile failure #2672 (see also #2699) Closes https://github.com/facebook/rocksdb/pull/2732 Differential Revision: D5620348 Pulled By: yiwu-arbug fbshipit-source-id: 87db657ab734f23b1bfaaa9db9b9956d10eaef59	8 years ago
Yi Wu	ad42d2fcbb	Remove residual arcanist_util directory	8 years ago
Nikhil Benesch	279296f4d8	properly set C[XX]FLAGS during CMake configure-time checks Summary: Some compilers require `-std=c++11` for the `cstdint` header to be available. We already have logic to add `-std=c++11` to `CXXFLAGS` when the compiler is not MSVC; simply reorder CMakeLists.txt so that logic happens before the calls to `CHECK_CXX_SOURCE_COMPILES`. Additionally add a missing `set(CMAKE_REQUIRED_FLAGS, ...)` before a call to `CHECK_C_SOURCE_COMPILES`. Closes https://github.com/facebook/rocksdb/pull/2535 Differential Revision: D5384244 Pulled By: yiwu-arbug fbshipit-source-id: 2dbae4297c5d8ab4636e08b1457ffb2d3e37aef4	8 years ago
Nikhil Benesch	c5f0c6cc66	compile with correct flags to determine SSE4.2 support Summary: With some compilers, `-std=c++11` is necessary for <cstdint> to be available. Pass this flag via $PLATFORM_CXXFLAGS. Fixes #2488. Closes https://github.com/facebook/rocksdb/pull/2545 Differential Revision: D5620610 Pulled By: yiwu-arbug fbshipit-source-id: 2f975b8c1ad52e283e677d9a33543abd064f13ce	8 years ago
Jay	185ade4c0c	cmake: support more compression type Summary: This pr enables linking all the supported compression libraries via cmake. Closes https://github.com/facebook/rocksdb/pull/2552 Differential Revision: D5620607 Pulled By: yiwu-arbug fbshipit-source-id: b6949181f305bfdf04a98f898c92fd0caba0c45a	8 years ago
Andrew Gallagher	5449c0990b	rocksdb: make buildable on aarch64 Summary: - Remove default arch-specified flags. - Move non-default arch-specific flags to arch-specific param. Reviewed By: yiwu-arbug Differential Revision: D5597499 fbshipit-source-id: c53108ac39c73ac36893d3fd9aaf3b5e3080f1ae	8 years ago
Adam Retter	a144a9782d	Fix for CMakeLists.txt on Windows for RocksJava Summary: Closes https://github.com/facebook/rocksdb/pull/2730 Differential Revision: D5619256 Pulled By: ajkr fbshipit-source-id: c80d697eeceab91964259132e58f5cd2219efb93	8 years ago
Andrew Kryczka	acf935e40f	fix deletion dropping in intra-L0 Summary: `KeyNotExistsBeyondOutputLevel` didn't consider L0 files' key-ranges. So if a key only was covered by older L0 files' key-ranges, we would incorrectly drop deletions of that key. This PR just skips the deletion-dropping optimization when output level is L0. Closes https://github.com/facebook/rocksdb/pull/2726 Differential Revision: D5617286 Pulled By: ajkr fbshipit-source-id: 4bff1396b06d49a828ba4542f249191052915bce	8 years ago
Andrew Kryczka	8254e9b57c	make sst_dump compression size command consistent Summary: - like other subcommands, reporting compression sizes should be specified with the `--command` CLI arg. - also added `--compression_types` arg as it's useful to restrict the types of compression used, at least in my dictionary compression experiments. Closes https://github.com/facebook/rocksdb/pull/2706 Differential Revision: D5589520 Pulled By: ajkr fbshipit-source-id: 305bb4ebcc95eecc8a85523cd3b1050619c9ddc5	8 years ago
Andrew Kryczka	74f18c1301	db_bench support for non-uniform column family ops Summary: Previously we could only select the CF on which to operate uniformly at random. This is a limitation, e.g., when testing universal compaction as all CFs would need to run full compaction at roughly the same time, which isn't realistic. This PR allows the user to specify the probability distribution for selecting CFs via the `--column_family_distribution` argument. Closes https://github.com/facebook/rocksdb/pull/2677 Differential Revision: D5544436 Pulled By: ajkr fbshipit-source-id: 478d56260995236ae90895ce5bd51f38882e185a	8 years ago
Andrew Kryczka	5de98f2d50	approximate histogram stats to save cpu Summary: sounds like we're willing to tradeoff minor inaccuracy in stats for speed. start with histogram stats. ticker stats will be harder (and, IMO, we shouldn't change them in this manner) as many test cases rely on them being exactly correct. Closes https://github.com/facebook/rocksdb/pull/2720 Differential Revision: D5607884 Pulled By: ajkr fbshipit-source-id: 1b754cda35ea6b252d1fdd5aa3cfb58866506372	8 years ago
yiwu-arbug	3f5888430a	Fix c_test ASAN failure Summary: Fix c_test missing deletion of write batch pointer. Closes https://github.com/facebook/rocksdb/pull/2725 Differential Revision: D5613866 Pulled By: yiwu-arbug fbshipit-source-id: bf3f59a6812178577c9c25bae558ef36414a1f51	8 years ago
yiwu-arbug	e5a1b727c0	Fix blob DB transaction usage while GC Summary: During GC, blob DB uses an optimistic transaction to delete or replace the index entry in LSM, to guarantee correctness if there's a normal write writing to the same key. However, the previous implementation doesn't call SetSnapshot() nor use GetForUpdate() of the transaction API; instead it does its own sequence number checking before beginning the transaction. A normal write can sneak in after the sequence number check and overwrite the key, and the GC will delete or relocate the old version of the key by mistake. Update the code to properly use GetForUpdate() to check the existing index entry. After the patch the sequence number stored with each blob record is useless, so I'm considering removing the sequence number from the blob record in another patch. Closes https://github.com/facebook/rocksdb/pull/2703 Differential Revision: D5589178 Pulled By: yiwu-arbug fbshipit-source-id: 8dc960cd5f4e61b36024ba7c32d05584ce149c24	8 years ago
Andrew Kryczka	6f051e0c71	fix corruption_test valgrind Summary: Closes https://github.com/facebook/rocksdb/pull/2724 Differential Revision: D5613416 Pulled By: ajkr fbshipit-source-id: ed55fb66ab1b41dfdfe765fe3264a1c87a8acb00	8 years ago
Kent767	ac098a4626	expose set_skip_stats_update_on_db_open to C bindings Summary: It would be super helpful to not have to recompile rocksdb to get this performance tweak for mechanical disks. I have signed the CLA. Closes https://github.com/facebook/rocksdb/pull/2718 Differential Revision: D5606994 Pulled By: yiwu-arbug fbshipit-source-id: c05e92bad0d03bd38211af1e1ced0d0d1e02f634	8 years ago
Siying Dong	666a005f9b	Support prefetch last 512KB with direct I/O in block based file reader Summary: Right now, if direct I/O is enabled, prefetching the last 512KB cannot be applied, except for compaction inputs or when readahead is enabled for iterators. This can create a lot of I/O for HDD cases. To solve the problem, the 512KB is prefetched in block based table if direct I/O is enabled. The prefetched buffer is passed in together with the random access file reader, so that we try to read from the buffer before reading from the file. This can be extended in the future to support flexible user iterator readahead too. Closes https://github.com/facebook/rocksdb/pull/2708 Differential Revision: D5593091 Pulled By: siying fbshipit-source-id: ee36ff6d8af11c312a2622272b21957a7b5c81e7	8 years ago
yiwu-arbug	ad77ee0ea0	Revert "Makefile: correct faligned-new test" Summary: This reverts #2699 to fix the clang build. Closes https://github.com/facebook/rocksdb/pull/2723 Differential Revision: D5610207 Pulled By: yiwu-arbug fbshipit-source-id: 6857f4556d6d18f17b74cf81fa936d1dc0bd364c	8 years ago
Siying Dong	b87ee6f773	Use more keys per lock in daily TSAN crash test Summary: TSAN reports an error when we grab too many locks at the same time. In the TSAN crash test, make one shard key cover 2^22 keys so that not many locks will be held at the same time. Closes https://github.com/facebook/rocksdb/pull/2719 Differential Revision: D5609035 Pulled By: siying fbshipit-source-id: 930e5d63fff92dbc193dc154c4c615efbdf06c6a	8 years ago
Stanislav Tkach	25df24254b	Add column families related functions (C API) Summary: (#2564) Closes https://github.com/facebook/rocksdb/pull/2669 Differential Revision: D5594151 Pulled By: yiwu-arbug fbshipit-source-id: 67ae9446342f3323d6ecad8e811f4158da194270	8 years ago
Daniel Black	64f8484356	block_cache_tier: fix gcc-7 warnings Summary: Error was: utilities/persistent_cache/block_cache_tier.cc: In instantiation of ‘void rocksdb::Add(std::map<std::__cxx11::basic_string<char>, double>*, const string&, const T&) [with T = double; std::__cxx11::string = std::__cxx11::basic_string<char>]’: utilities/persistent_cache/block_cache_tier.cc:147:40: required from here utilities/persistent_cache/block_cache_tier.cc:141:23: error: type qualifiers ignored on cast result type [-Werror=ignored-qualifiers] stats->insert({key, static_cast<const double>(t)}); fixing like #2562 Closes https://github.com/facebook/rocksdb/pull/2603 Differential Revision: D5600910 Pulled By: yiwu-arbug fbshipit-source-id: 891a5ec7e451d2dec6ad1b6b7fac545657f87363	8 years ago
Oleksandr Anyshchenko	0cecf8155b	Write batch for `TransactionDB` in C API Summary: Closes https://github.com/facebook/rocksdb/pull/2655 Differential Revision: D5600858 Pulled By: yiwu-arbug fbshipit-source-id: cf52f9104e348438bf168dc6bf7af3837faf12ef	8 years ago
FireMail	6a9de43477	Windows.h macro call fix Summary: - moved the max call for numeric limits into parentheses so that max won't be expanded as a macro when including <Windows.h> Closes https://github.com/facebook/rocksdb/pull/2709 Differential Revision: D5600773 Pulled By: yiwu-arbug fbshipit-source-id: fd28b6f7c10ddce21bad4030f2db06f965bb08da	8 years ago
jimmyway	23c7d13540	fix comment Summary: Signed-off-by: tang.jin <tang.jin@istuary.com> Closes https://github.com/facebook/rocksdb/pull/2644 Differential Revision: D5600861 Pulled By: yiwu-arbug fbshipit-source-id: 9516636cb6e77b09fe0ebef78953adf4b7e88cc8	8 years ago
Daniel Black	1fbad84b69	Makefile: correct faligned-new test Summary: Commit `4f81ab38bf` has the test wrong. clang doesn't support a -dumpversion option. By lucky coincidence clang/gcc --version both place a version number at the same output location when --version is passed. Example output (1st line only). $ clang --version clang version 3.9.1 (tags/RELEASE_391/final) $ gcc --version gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1) During the test of the compiler we ensure that a minimum version is met as Makefile doesn't support patterns. Also xcode9 doesn't seem affected by https://github.com/facebook/rocksdb/issues/2672 and also doesn't have "clang" as the first part of its output so the fix implemented here also is Apple clang friendly. $ clang --version Apple LLVM version 9.0.0 (clang-900.0.31) Signed-off-by: Daniel Black <daniel.black@au.ibm.com> Closes https://github.com/facebook/rocksdb/pull/2699 Differential Revision: D5600818 Pulled By: yiwu-arbug fbshipit-source-id: 3b0f2751becb53c1c35468bf29f3f828e7cf2c2a	8 years ago
Aaron G	7848f0b24c	add VerifyChecksum() to db.h Summary: We need a tool to check any sst file corruption in the db. It will check all the sst files in current version and read all the blocks (data, meta, index) with checksum verification. If any verification fails, the function will return non-OK status. Closes https://github.com/facebook/rocksdb/pull/2498 Differential Revision: D5324269 Pulled By: lightmark fbshipit-source-id: 6f8a272008b722402a772acfc804524c9d1a483b	8 years ago
Andrew Kryczka	47ed3bfc3b	fix WinEnv assertions Summary: Closes https://github.com/facebook/rocksdb/pull/2702 Differential Revision: D5585389 Pulled By: ajkr fbshipit-source-id: cb54041eb481d0d759c440f82a8a2c5b34534173	8 years ago
Chang Liu	d97a72d63f	Try to repair db with wal_dir option, avoid leak some WAL files Summary: We should search wal_dir in the Repairer::FindFiles function, and avoid using LogFileName(dbname, number) to get the WAL file's name, which yields a wrong WAL filename, as follows: ``` [WARN] [/home/liuchang/Workspace/rocksdb/db/repair.cc:310] Log #3: ignoring conversion error: IO error: While opening a file for sequentially reading: /tmp/rocksdbtest-1000/repair_test/000003.log: No such file or directory ``` I have added a new test case to repair_test.cc, which tries to repair the db with all WAL options. Signed-off-by: Chang Liu <liuchang0812@gmail.com> Closes https://github.com/facebook/rocksdb/pull/2692 Differential Revision: D5575888 Pulled By: ajkr fbshipit-source-id: 5b93e9f85cddc01663ccecd87631fa723ac466a3	8 years ago
James Page	36375de76f	gcc-7/i386: markup intentional fallthroughs Summary: Markup i386 code paths resolving compilation failure under i386 with gcc-7. Signed-off-by: James Page <james.page@ubuntu.com> Closes https://github.com/facebook/rocksdb/pull/2700 Differential Revision: D5583047 Pulled By: maysamyabandeh fbshipit-source-id: fe31bcfeaf7cd2d3f51b55f5ae0b3b0cb3788fbc	8 years ago
Maysam Yabandeh	bdc056f8aa	Refactor PessimisticTransaction Summary: This patch splits Commit and Prepare into lock-related logic and db-write-related logic. It moves lock-related logic to PessimisticTransaction to be reused by all child classes and moves the existing impl of db-write-related logic to PrepareInternal, CommitSingleInternal, and CommitInternal in WriteCommittedTxnImpl. Closes https://github.com/facebook/rocksdb/pull/2691 Differential Revision: D5569464 Pulled By: maysamyabandeh fbshipit-source-id: d1b8698e69801a4126c7bc211745d05c636f5325	8 years ago
Maysam Yabandeh	a9a4e89c38	Fix valgrind complaint about initialization Summary: Closes https://github.com/facebook/rocksdb/pull/2697 Differential Revision: D5573894 Pulled By: maysamyabandeh fbshipit-source-id: 8fc03ea8ea6f3f3bc0f68b64cf90243a70562dc4	8 years ago
janlzlabs	4ca11b4b07	Update USERS.md Summary: I'd like to propose adding my company as a RocksDB user Closes https://github.com/facebook/rocksdb/pull/2694 Differential Revision: D5572113 Pulled By: ajkr fbshipit-source-id: 646143b955e3efddee56691cce912d7badaa6e8b	8 years ago
Maysam Yabandeh	c9804e007a	Refactor TransactionDBImpl Summary: This opens space for the new implementations of TransactionDBImpl such as WritePreparedTxnDBImpl that has a different policy of how to write to DB. Closes https://github.com/facebook/rocksdb/pull/2689 Differential Revision: D5568918 Pulled By: maysamyabandeh fbshipit-source-id: f7eac866e175daf3793ae79da108f65cc7dc7b25	8 years ago
Sagar Vemuri	20dc5e74f2	Optimize range-delete aggregator call in merge helper. Summary: In the condition: ``` if (range_del_agg != nullptr && range_del_agg->ShouldDelete( iter->key(), RangeDelAggregator::RangePositioningMode::kForwardTraversal) && filter != CompactionFilter::Decision::kRemoveAndSkipUntil) { ... } ``` it could be possible that all the work done in `range_del_agg->ShouldDelete` is wasted due to not having the right `filter` value later on. Instead, check `filter` value before even calling `range_del_agg->ShouldDelete`, which is a much more involved function. Closes https://github.com/facebook/rocksdb/pull/2690 Differential Revision: D5568931 Pulled By: sagar0 fbshipit-source-id: 17512d52360425c7ae9de7675383f5d7bc3dad58	8 years ago
Yi Wu	0d4a2b7330	Avoid blob db call Sync() while writing Summary: The FsyncFiles background job calls Fsync() periodically for blob files. However, it can access WritableFileWriter concurrently with a Put() or Write(), and WritableFileWriter does not support concurrent access. This leads to the WritableFileWriter buffer being flushed with the same content twice, and the blob file ends up corrupted. Fix by simply letting FsyncFiles hold write_mutex_. Closes https://github.com/facebook/rocksdb/pull/2685 Differential Revision: D5561908 Pulled By: yiwu-arbug fbshipit-source-id: f0bb5bcab0e05694e053b8c49eab43640721e872	8 years ago
Maysam Yabandeh	627c9f1abb	Don't add -ljemalloc when DISABLE_JEMALLOC is set Summary: fixes #2555 Closes https://github.com/facebook/rocksdb/pull/2684 Differential Revision: D5560527 Pulled By: maysamyabandeh fbshipit-source-id: 6e1d874ae0b4e699a77203d9d52d0bb8f59013b0	8 years ago
Andrew Kryczka	dce6d5a838	db_bench background work thread pool size arguments Summary: The background thread pools' sizes weren't easily configurable by `max_background_compactions` and `max_background_flushes` in multi-instance setups. Introduced separate arguments for their sizes. Closes https://github.com/facebook/rocksdb/pull/2680 Differential Revision: D5550675 Pulled By: ajkr fbshipit-source-id: bab5f0a7bc5db63bb084d0c10facbe437096367d	8 years ago
Cholerae Hu	4f81ab38bf	Makefile: fix for GCC 7+ and clang 4+ Summary: maysamyabandeh IslamAbdelRahman PTAL Fix https://github.com/facebook/rocksdb/issues/2672 Signed-off-by: Cholerae Hu <huyingqian@pingcap.com> Closes https://github.com/facebook/rocksdb/pull/2681 Differential Revision: D5561515 Pulled By: ajkr fbshipit-source-id: 676187802ebd8a87a6c051bb565818a1bf89d0a9	8 years ago
Yi Wu	92afe830f9	Update all blob db TTL and timestamps to uint64_t Summary: The current blob db implementation uses a mix of int32_t, uint32_t and uint64_t for TTL and expiration. Update all timestamps to uint64_t for consistency. Closes https://github.com/facebook/rocksdb/pull/2683 Differential Revision: D5557103 Pulled By: yiwu-arbug fbshipit-source-id: e4eab2691629a755e614e8cf1eed9c3a681d0c42	8 years ago
Alan Somers	5883a1ae24	Fix /bin/bash shebangs Summary: "/bin/bash" is a Linuxism. "/usr/bin/env bash" is portable. Closes https://github.com/facebook/rocksdb/pull/2646 Differential Revision: D5556259 Pulled By: ajkr fbshipit-source-id: cbffd38ecdbfffb2438969ec007ab345ed893ccb	8 years ago
Andrew Kryczka	cc01985db0	Introduce bottom-pri thread pool for large universal compactions Summary: When we had a single thread pool for compactions, a thread could be busy for a long time (minutes) executing a compaction involving the bottom level. In multi-instance setups, the entire thread pool could be consumed by such bottom-level compactions. Then, top-level compactions (e.g., a few L0 files) would be blocked for a long time ("head-of-line blocking"). Such top-level compactions are critical to prevent compaction stalls as they can quickly reduce number of L0 files / sorted runs. This diff introduces a bottom-priority queue for universal compactions including the bottom level. This alleviates the head-of-line blocking situation for fast, top-level compactions. - Added `Env::Priority::BOTTOM` thread pool. This feature is only enabled if user explicitly configures it to have a positive number of threads. - Changed `ThreadPoolImpl`'s default thread limit from one to zero. This change is invisible to users as we call `IncBackgroundThreadsIfNeeded` on the low-pri/high-pri pools during `DB::Open` with values of at least one. It is necessary, though, for bottom-pri to start with zero threads so the feature is disabled by default. - Separated `ManualCompaction` into two parts in `PrepickedCompaction`. `PrepickedCompaction` is used for any compaction that's picked outside of its execution thread, either manual or automatic. - Forward universal compactions involving last level to the bottom pool (worker thread's entry point is `BGWorkBottomCompaction`). - Track `bg_bottom_compaction_scheduled_` so we can wait for bottom-level compactions to finish. We don't count them against the background jobs limits. So users of this feature will get an extra compaction for free. Closes https://github.com/facebook/rocksdb/pull/2580 Differential Revision: D5422916 Pulled By: ajkr fbshipit-source-id: a74bd11f1ea4933df3739b16808bb21fcd512333	8 years ago
Yi Wu	0b814ba92d	Allow concurrent writes to blob db Summary: I'm going with the brute-force solution, just letting Put() and Write() hold a mutex before writing. May improve concurrent writing with finer granularity locking later. Closes https://github.com/facebook/rocksdb/pull/2682 Differential Revision: D5552690 Pulled By: yiwu-arbug fbshipit-source-id: 039abd675b5d274a7af6428198d1733cafecef4c	8 years ago

6417 Commits (23593171c42e88ea1c6d288dd1ab6f2b65bdbbe1)