rocksdb

Commit Graph

Author	SHA1	Message	Date
mkosieradzki	a12479819d	Improved transactions support in C API Summary: Solves #2632 Added OptimisticTransactionDB to the C API. Added missing merge operations to Transaction. Added missing get_for_update operation to transaction If required I will create tests for this another day. Closes https://github.com/facebook/rocksdb/pull/2633 Differential Revision: D5600906 Pulled By: yiwu-arbug fbshipit-source-id: da23e4484433d8f59d471f778ff2ae210e3fe4eb	8 years ago
Andrew Kryczka	8ace1f79b5	add counter for deletion dropping optimization Summary: add this counter stat to track usage of deletion-dropping optimization. if usage is low, we can delete it to prevent bugs like #2726. Closes https://github.com/facebook/rocksdb/pull/2761 Differential Revision: D5665421 Pulled By: ajkr fbshipit-source-id: 881befa2d199838dac88709e7b376a43d304e3d4	8 years ago
Andrew Kryczka	ed0a4c93ef	perf_context measure user bytes read Summary: With this PR, we can measure read-amp for queries where perf_context is enabled as follows: ``` SetPerfLevel(kEnableCount); Get(1, "foo"); double read_amp = static_cast<double>(get_perf_context()->block_read_byte / get_perf_context()->get_read_bytes); SetPerfLevel(kDisable); ``` Our internal infra enables perf_context for a sampling of queries. So we'll be able to compute the read-amp for the sample set, which can give us a good estimate of read-amp. Closes https://github.com/facebook/rocksdb/pull/2749 Differential Revision: D5647240 Pulled By: ajkr fbshipit-source-id: ad73550b06990cf040cc4528fa885360f308ec12	8 years ago
Sagar Vemuri	9a44b4c32c	Allow merge operator to be called even with a single operand Summary: Added a function `MergeOperator::DoesAllowSingleMergeOperand()` to allow invoking a merge operator even with a single merge operand, if overriden. This is needed for Cassandra-on-RocksDB work. All Cassandra writes are through merges and this will allow a single merge-value to be updated in the merge-operator invoked via a compaction, if needed, due to an expired TTL. Closes https://github.com/facebook/rocksdb/pull/2721 Differential Revision: D5608706 Pulled By: sagar0 fbshipit-source-id: f299f9f91c4d1ac26e48bd5906e122c1c5e5f3fc	8 years ago
follitude	ac8fb77afd	fix some misspellings Summary: PTAL ajkr Closes https://github.com/facebook/rocksdb/pull/2750 Differential Revision: D5648052 Pulled By: ajkr fbshipit-source-id: 7cd1ddd61364d5a55a10fdd293fa74b2bf89dd98	8 years ago
Andrew Kryczka	af012c0f83	fix deleterange with memtable prefix bloom Summary: the range delete tombstones in memtable should be added to the aggregator even when the memtable's prefix bloom filter tells us the lookup key's not there. This bug could cause data to temporarily reappear until the memtable containing range deletions is flushed. Reported in #2743. Closes https://github.com/facebook/rocksdb/pull/2745 Differential Revision: D5639007 Pulled By: ajkr fbshipit-source-id: 04fc6facb6f978340a3f639536f4ca7c0d73dfc9	8 years ago
Andrew Kryczka	1c8dbe2aa2	update scores after picking universal compaction Summary: We forgot to recompute compaction scores after picking a universal compaction like we do in level compaction (`a34b2e388e/db/compaction_picker.cc (L691-L695)`). This leads to a fairness issue where we waste compactions on CFs/DB instances that don't need it while others can starve. Previously, `ccecf3f4fb` fixed the issue for the read-amp-based compaction case; this PR avoids the issue earlier and also for size-ratio-based compactions. Closes https://github.com/facebook/rocksdb/pull/2688 Differential Revision: D5566191 Pulled By: ajkr fbshipit-source-id: 010bccb2a107f6a76f3d3022b90aadce5cc48feb	8 years ago
Maysam Yabandeh	eb6425303e	Update WritePrepared with the pseudo code Summary: Implement the main body of WritePrepared pseudo code. This includes PrepareInternal and CommitInternal, as well as AddCommitted which updates the commit map. It also provides a IsInSnapshot method that could be later called form the read path to decide if a version is in the read snapshot or it should other be skipped. This patch lacks unit tests and does not attempt to offer an efficient implementation. The idea is that to have the API specified so that we can work on related tasks in parallel. Closes https://github.com/facebook/rocksdb/pull/2713 Differential Revision: D5640021 Pulled By: maysamyabandeh fbshipit-source-id: bfa7a05e8d8498811fab714ce4b9c21530514e1c	8 years ago
Siying Dong	71598cdc75	Fix false removal of tombstone issue in FIFO and kCompactionStyleNone Summary: Similar to the bug fixed by https://github.com/facebook/rocksdb/pull/2726, FIFO with compaction and kCompactionStyleNone during user customized CompactFiles() with output level to be 0 can suffer from the same problem. Fix it by leveraging the bottommost_level_ flag. Closes https://github.com/facebook/rocksdb/pull/2735 Differential Revision: D5626906 Pulled By: siying fbshipit-source-id: 2b148d0461c61dbd986d74655e384419ae442158	8 years ago
Andrew Kryczka	acf935e40f	fix deletion dropping in intra-L0 Summary: `KeyNotExistsBeyondOutputLevel` didn't consider L0 files' key-ranges. So if a key only was covered by older L0 files' key-ranges, we would incorrectly drop deletions of that key. This PR just skips the deletion-dropping optimization when output level is L0. Closes https://github.com/facebook/rocksdb/pull/2726 Differential Revision: D5617286 Pulled By: ajkr fbshipit-source-id: 4bff1396b06d49a828ba4542f249191052915bce	8 years ago
yiwu-arbug	3f5888430a	Fix c_test ASAN failure Summary: Fix c_test missing deletion of write batch pointer. Closes https://github.com/facebook/rocksdb/pull/2725 Differential Revision: D5613866 Pulled By: yiwu-arbug fbshipit-source-id: bf3f59a6812178577c9c25bae558ef36414a1f51	8 years ago
Andrew Kryczka	6f051e0c71	fix corruption_test valgrind Summary: Closes https://github.com/facebook/rocksdb/pull/2724 Differential Revision: D5613416 Pulled By: ajkr fbshipit-source-id: ed55fb66ab1b41dfdfe765fe3264a1c87a8acb00	8 years ago
Kent767	ac098a4626	expose set_skip_stats_update_on_db_open to C bindings Summary: It would be super helpful to not have to recompile rocksdb to get this performance tweak for mechanical disks. I have signed the CLA. Closes https://github.com/facebook/rocksdb/pull/2718 Differential Revision: D5606994 Pulled By: yiwu-arbug fbshipit-source-id: c05e92bad0d03bd38211af1e1ced0d0d1e02f634	8 years ago
Siying Dong	666a005f9b	Support prefetch last 512KB with direct I/O in block based file reader Summary: Right now, if direct I/O is enabled, prefetching the last 512KB cannot be applied, except compaction inputs or readahead is enabled for iterators. This can create a lot of I/O for HDD cases. To solve the problem, the 512KB is prefetched in block based table if direct I/O is enabled. The prefetched buffer is passed in totegher with random access file reader, so that we try to read from the buffer before reading from the file. This can be extended in the future to support flexible user iterator readahead too. Closes https://github.com/facebook/rocksdb/pull/2708 Differential Revision: D5593091 Pulled By: siying fbshipit-source-id: ee36ff6d8af11c312a2622272b21957a7b5c81e7	8 years ago
Stanislav Tkach	25df24254b	Add column families related functions (C API) Summary: (#2564) Closes https://github.com/facebook/rocksdb/pull/2669 Differential Revision: D5594151 Pulled By: yiwu-arbug fbshipit-source-id: 67ae9446342f3323d6ecad8e811f4158da194270	8 years ago
Oleksandr Anyshchenko	0cecf8155b	Write batch for `TransactionDB` in C API Summary: Closes https://github.com/facebook/rocksdb/pull/2655 Differential Revision: D5600858 Pulled By: yiwu-arbug fbshipit-source-id: cf52f9104e348438bf168dc6bf7af3837faf12ef	8 years ago
jimmyway	23c7d13540	fix comment Summary: Signed-off-by: tang.jin <tang.jin@istuary.com> Closes https://github.com/facebook/rocksdb/pull/2644 Differential Revision: D5600861 Pulled By: yiwu-arbug fbshipit-source-id: 9516636cb6e77b09fe0ebef78953adf4b7e88cc8	8 years ago
Aaron G	7848f0b24c	add VerifyChecksum() to db.h Summary: We need a tool to check any sst file corruption in the db. It will check all the sst files in current version and read all the blocks (data, meta, index) with checksum verification. If any verification fails, the function will return non-OK status. Closes https://github.com/facebook/rocksdb/pull/2498 Differential Revision: D5324269 Pulled By: lightmark fbshipit-source-id: 6f8a272008b722402a772acfc804524c9d1a483b	8 years ago
Chang Liu	d97a72d63f	Try to repair db with wal_dir option, avoid leak some WAL files Summary: We should search wal_dir in Repairer::FindFiles function, and avoid use LogFileNmae(dbname, number) to get WAL file's name, which will get a wrong WAL filename. as following: ``` [WARN] [/home/liuchang/Workspace/rocksdb/db/repair.cc:310] Log #3: ignoring conversion error: IO error: While opening a file for sequentially reading: /tmp/rocksdbtest-1000/repair_test/000003.log: No such file or directory ``` I have added a new test case to repair_test.cc, which try to repair db with all WAL options. Signed-off-by: Chang Liu <liuchang0812@gmail.com> Closes https://github.com/facebook/rocksdb/pull/2692 Differential Revision: D5575888 Pulled By: ajkr fbshipit-source-id: 5b93e9f85cddc01663ccecd87631fa723ac466a3	8 years ago
Maysam Yabandeh	bdc056f8aa	Refactor PessimisticTransaction Summary: This patch splits Commit and Prepare into lock-related logic and db-write-related logic. It moves lock-related logic to PessimisticTransaction to be reused by all children classes and movies the existing impl of db-write-related to PrepareInternal, CommitSingleInternal, and CommitInternal in WriteCommittedTxnImpl. Closes https://github.com/facebook/rocksdb/pull/2691 Differential Revision: D5569464 Pulled By: maysamyabandeh fbshipit-source-id: d1b8698e69801a4126c7bc211745d05c636f5325	8 years ago
Sagar Vemuri	20dc5e74f2	Optimize range-delete aggregator call in merge helper. Summary: In the condition: ``` if (range_del_agg != nullptr && range_del_agg->ShouldDelete( iter->key(), RangeDelAggregator::RangePositioningMode::kForwardTraversal) && filter != CompactionFilter::Decision::kRemoveAndSkipUntil) { ... } ``` it could be possible that all the work done in `range_del_agg->ShouldDelete` is wasted due to not having the right `filter` value later on. Instead, check `filter` value before even calling `range_del_agg->ShouldDelete`, which is a much more involved function. Closes https://github.com/facebook/rocksdb/pull/2690 Differential Revision: D5568931 Pulled By: sagar0 fbshipit-source-id: 17512d52360425c7ae9de7675383f5d7bc3dad58	8 years ago
Andrew Kryczka	cc01985db0	Introduce bottom-pri thread pool for large universal compactions Summary: When we had a single thread pool for compactions, a thread could be busy for a long time (minutes) executing a compaction involving the bottom level. In multi-instance setups, the entire thread pool could be consumed by such bottom-level compactions. Then, top-level compactions (e.g., a few L0 files) would be blocked for a long time ("head-of-line blocking"). Such top-level compactions are critical to prevent compaction stalls as they can quickly reduce number of L0 files / sorted runs. This diff introduces a bottom-priority queue for universal compactions including the bottom level. This alleviates the head-of-line blocking situation for fast, top-level compactions. - Added `Env::Priority::BOTTOM` thread pool. This feature is only enabled if user explicitly configures it to have a positive number of threads. - Changed `ThreadPoolImpl`'s default thread limit from one to zero. This change is invisible to users as we call `IncBackgroundThreadsIfNeeded` on the low-pri/high-pri pools during `DB::Open` with values of at least one. It is necessary, though, for bottom-pri to start with zero threads so the feature is disabled by default. - Separated `ManualCompaction` into two parts in `PrepickedCompaction`. `PrepickedCompaction` is used for any compaction that's picked outside of its execution thread, either manual or automatic. - Forward universal compactions involving last level to the bottom pool (worker thread's entry point is `BGWorkBottomCompaction`). - Track `bg_bottom_compaction_scheduled_` so we can wait for bottom-level compactions to finish. We don't count them against the background jobs limits. So users of this feature will get an extra compaction for free. Closes https://github.com/facebook/rocksdb/pull/2580 Differential Revision: D5422916 Pulled By: ajkr fbshipit-source-id: a74bd11f1ea4933df3739b16808bb21fcd512333	8 years ago
Maysam Yabandeh	58410aee44	Fix the overflow bug in AwaitState Summary: https://github.com/facebook/rocksdb/issues/2559 reports an overflow in AwaitState. nbronson has debugged the issue and presented the fix, which is applied to this patch. Moreover this patch adds more comments to clarify the logic in AwaitState. I tried with both 16 and 64 threads on update benchmark. The fix lowers cpu usage by 1.6 but also lowers the throughput by 1.6 and 2% respectively. Apparently the bug had favored using the spinning more often. Benchmarks: TEST_TMPDIR=/dev/shm/tmpdb time ./db_bench --benchmarks="fillrandom" --threads=16 --num=2000000 TEST_TMPDIR=/dev/shm/tmpdb time ./db_bench --use_existing_db=1 --benchmarks="updaterandom[X3]" --threads=16 --num=2000000 TEST_TMPDIR=/dev/shm/tmpdb time ./db_bench --use_existing_db=1 --benchmarks="updaterandom[X3]" --threads=64 --num=200000 Results $ cat update-16t-bug.txt \| tail -4 updaterandom [AVG 3 runs] : 234117 ops/sec; 51.8 MB/sec updaterandom [MEDIAN 3 runs] : 233581 ops/sec; 51.7 MB/sec 3896.42user 1539.12system 6:50.61elapsed 1323%CPU (0avgtext+0avgdata 331308maxresident)k 0inputs+0outputs (0major+1281001minor)pagefaults 0swaps $ cat update-16t-fixed.txt \| tail -4 updaterandom [AVG 3 runs] : 230364 ops/sec; 51.0 MB/sec updaterandom [MEDIAN 3 runs] : 226169 ops/sec; 50.0 MB/sec 3865.46user 1568.32system 6:57.63elapsed 1301%CPU (0avgtext+0avgdata 315012maxresident)k 0inputs+0outputs (0major+1342568minor)pagefaults 0swaps $ cat update-64t-bug.txt \| tail -4 updaterandom [AVG 3 runs] : 261878 ops/sec; 57.9 MB/sec updaterandom [MEDIAN 3 runs] : 262859 ops/sec; 58.2 MB/sec 926.27user 578.06system 2:27.46elapsed 1020%CPU (0avgtext+0avgdata 475480maxresident)k 0inputs+0outputs (0major+1058728minor)pagefaults 0swaps $ cat update-64t-fixed.txt \| tail -4 updaterandom [AVG 3 runs] : 256699 ops/sec; 56.8 MB/sec updaterandom [MEDIAN 3 runs] : 256380 ops/sec; 56.7 MB/sec 933.47user 575.37system 2:30.41elapsed 1003%CPU (0avgtext+0avgdata 482340maxresident)k 0inputs+0outputs (0major+1078557minor)pagefaults 0swaps Closes https://github.com/facebook/rocksdb/pull/2679 Differential Revision: D5553732 Pulled By: maysamyabandeh fbshipit-source-id: 98b72dc3a8e0f22ea29d4f7c7790af10c369c5bb	8 years ago
Maysam Yabandeh	c3d5c4d38a	Refactor TransactionImpl Summary: This patch refactors TransactionImpl by separating the logic for pessimistic concurrency control from the implementation of how to write the data to rocksdb. The existing implementation is named WriteCommittedTxnImpl as it writes committed data to the db. A template named WritePreparedTxnImpl is also added which will be later completed to provide a an alternative implementation. Closes https://github.com/facebook/rocksdb/pull/2676 Differential Revision: D5549998 Pulled By: maysamyabandeh fbshipit-source-id: 16298e86b43ca4849324c1f35c731913c6d17bec	8 years ago
奏之章	3218edc573	Fix universal compaction bug Summary: this value ``` Compaction::is_trivial_move_ ``` uninitialized . under universal compaction , we enable ``` CompactionOptionsUniversal::allow_trivial_move ``` , `9b11d4345a/db/compaction.cc (L245)` here is a disastrous bug , some sst trivial move to target level without overlap check ... THEN , DATABASE DAMAGED , WE GOT A LEVEL WITH OVERLAP ! Closes https://github.com/facebook/rocksdb/pull/2634 Differential Revision: D5530722 Pulled By: siying fbshipit-source-id: 425ab55bca5967110377d634258360bcf88c200e	8 years ago
Andrew Kryczka	6a36b3a7b9	fix db get/write stats Summary: we were passing `record_read_stats` (a bool) as the `hist_type` argument, which meant we were updating either `rocksdb.db.get.micros` (`hist_type == 0`) or `rocksdb.db.write.micros` (`hist_type == 1`) with wrong data. Closes https://github.com/facebook/rocksdb/pull/2666 Differential Revision: D5520384 Pulled By: ajkr fbshipit-source-id: 2f7c956aec32f8b58c5c18845ac478e0230c9516	8 years ago
Siying Dong	21696ba502	Replace dynamic_cast<> Summary: Replace dynamic_cast<> so that users can choose to build with RTTI off, so that they can save several bytes per object, and get tiny more memory available. Some nontrivial changes: 1. Add Comparator::GetRootComparator() to get around the internal comparator hack 2. Add the two experiemental functions to DB 3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string 4. Since 3 is done, move the parsing option functions for table factory to table factory files too, to be symmetric. Closes https://github.com/facebook/rocksdb/pull/2645 Differential Revision: D5502723 Pulled By: siying fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c	8 years ago
Sagar Vemuri	ac748c57ed	Fix FIFO Compaction with TTL tests Summary: - FIFOCompactionWithTTLTest was flaky when run in parallel earlier, and hence it was disabled. Fixed it now. - Also, faking sleep now instead of really sleeping to make tests more realistic by using TTLs like 1 hour and 1 day. Closes https://github.com/facebook/rocksdb/pull/2650 Differential Revision: D5506038 Pulled By: sagar0 fbshipit-source-id: deb429a527f045e3e2c5138b547c3e8ac8586aa2	8 years ago
Andrew Kryczka	710411aea6	fix asan/valgrind for TableCache cleanup Summary: Breaking commit: `d12691b86f` In the above commit, I moved the `TableCache` cleanup logic from `Version` destructor into `PurgeObsoleteFiles`. I missed cleaning up `TableCache` entries for the current `Version` during DB destruction. This PR adds that logic to `VersionSet` destructor. One unfortunate side effect is now we're potentially deleting `TableReader`s after `column_family_set_.reset()`, which means we can't call `BlockBasedTableReader::Close` a second time as the block cache might already be destroyed. Closes https://github.com/facebook/rocksdb/pull/2662 Differential Revision: D5515108 Pulled By: ajkr fbshipit-source-id: 2cb820e19aa813e0d258d17f76b2d7b6b7ee0b18	8 years ago
Siying Dong	fca4d6da17	Build fewer tests in Travis platform_dependent tests Summary: platform_dependent tests in Travis now builds all tests, which is not needed. Only build those tests we need to run. Closes https://github.com/facebook/rocksdb/pull/2647 Differential Revision: D5513954 Pulled By: siying fbshipit-source-id: 4d540b146124e70dd25586c47939d19f93655b0a	8 years ago
Aaron Gao	8f553d3c52	remove unnecessary internal_comparator param in newIterator Summary: solved https://github.com/facebook/rocksdb/issues/2604 Closes https://github.com/facebook/rocksdb/pull/2648 Differential Revision: D5504875 Pulled By: lightmark fbshipit-source-id: c14bb62ccbdc9e7bda9cd914cae4ea0765d882ee	8 years ago
Andrew Kryczka	d12691b86f	move TableCache::EraseHandle outside of db mutex Summary: Post-compaction work holds onto db mutex for the longest time (found by tracing lock acquires/releases with LTTng and correlating timestamps with our info log). Further experimentation showed `TableCache::EraseHandle` is responsible for ~86% of time mutex is held. We can just release the handle outside the db mutex. Closes https://github.com/facebook/rocksdb/pull/2654 Differential Revision: D5507126 Pulled By: ajkr fbshipit-source-id: 703c01ddf2aea16bc0f9e33c08935d78aa6b781d	8 years ago
Siying Dong	e7697b8ce8	Fix LITE unit tests Summary: Closes https://github.com/facebook/rocksdb/pull/2649 Differential Revision: D5505778 Pulled By: siying fbshipit-source-id: 7e935603ede3d958ea087ed6b8cfc4121e8797bc	8 years ago
Siying Dong	c281b44829	Revert "CRC32 Power Optimization Changes" Summary: This reverts commit `2289d38115`. Closes https://github.com/facebook/rocksdb/pull/2652 Differential Revision: D5506163 Pulled By: siying fbshipit-source-id: 105e31dd9d99090453a6b9f32c165206cd3affa3	8 years ago
Sagar Vemuri	9980de262c	Fix FIFO compaction picker test Summary: A FIFO compaction picker test is accidentally testing against an instance of level compaction picker. Closes https://github.com/facebook/rocksdb/pull/2641 Differential Revision: D5495390 Pulled By: sagar0 fbshipit-source-id: 301962736f629b1c499570fb504cdbe66bacb46f	8 years ago
Kamalalochana Subbaiah	2289d38115	CRC32 Power Optimization Changes Summary: Support for PowerPC Architecture Detecting AltiVec Support Closes https://github.com/facebook/rocksdb/pull/2353 Differential Revision: D5210948 Pulled By: siying fbshipit-source-id: 859a8c063d37697addd89ba2b8a14e5efd5d24bf	8 years ago
Maysam Yabandeh	30b58cf71a	Remove the orphan assert on !need_log_sync Summary: We initially had disabled support for write_options.sync when concurrent_prepare_ is set. We later added this support but the statement that asserts this combination is not used was left there. This patch cleans it up. Closes https://github.com/facebook/rocksdb/pull/2642 Differential Revision: D5496101 Pulled By: maysamyabandeh fbshipit-source-id: becbc503446f2a51bee24cc861958c090c724ec2	8 years ago
Yi Wu	fe1a5559f3	Fix flaky write_callback_test Summary: The test is failing occasionally on the assert: `ASSERT_TRUE(writer->state == WriteThread::State::STATE_INIT)`. This is because the test don't make the leader wait for long enough before updating state for its followers. The patch move the update to `threads_waiting` to the end of `WriteThread::JoinBatchGroup:Wait` callback to avoid this happening. Also adding `WriteThread::JoinBatchGroup:Start` and have each thread wait there while another thread is linking to the linked-list. This is to make the check of `is_leader` more deterministic. Also changing two while-loops of `compare_exchange_strong` to plain `fetch_add`, to make it look cleaner. Closes https://github.com/facebook/rocksdb/pull/2640 Differential Revision: D5491525 Pulled By: yiwu-arbug fbshipit-source-id: 6e897f122082bd6f98e6d51b31a25e5fd0a3fb82	8 years ago
Islam AbdelRahman	ea8ad4f678	Fix compaction div by zero logging Summary: We will divide by zero if `stats.micros` is zero, just add a simple check This happens sometimes during running tests and UBSAN complains Closes https://github.com/facebook/rocksdb/pull/2631 Differential Revision: D5481455 Pulled By: IslamAbdelRahman fbshipit-source-id: 69aa24e64e21de15d9e2b8009adf01675fcc6598	8 years ago
kapitan-k	34112aeffd	Added db paths to c Summary: Closes https://github.com/facebook/rocksdb/pull/2613 Differential Revision: D5476064 Pulled By: sagar0 fbshipit-source-id: 6b30a9eacb93a945bbe499eafb90565fa9f1798b	8 years ago
Daniel Black	1d8aa2961c	Gcc 7 ParsedInternalKey replace memset with clear function. Summary: I haven't looked to see if a class variable inside a loop like this is always initialised. Closes https://github.com/facebook/rocksdb/pull/2602 Differential Revision: D5475937 Pulled By: IslamAbdelRahman fbshipit-source-id: 8570b308f9a4b49e2a56ccc9e9b84d7c46568c15	8 years ago
Siying Dong	e67b35c076	Add Iterator::Refresh() Summary: Add and implement Iterator::Refresh(). When this function is called, if the super version doesn't change, update the sequence number of the iterator to the latest one and invalidate the iterator. If the super version changed, recreated the whole iterator. This can help users reuse the iterator more easily. Closes https://github.com/facebook/rocksdb/pull/2621 Differential Revision: D5464500 Pulled By: siying fbshipit-source-id: f548bd35e85c1efca2ea69273802f6704eba6ba9	8 years ago
Andrew Kryczka	a34b2e388e	Fix caching of compaction picker's next index Summary: The previous implementation of caching `file_size` index made no sense. It only remembered the original span of locked files starting from beginning of `file_size`. We should remember the index after all compactions that have been considered but rejected. This will reduce the work we do while holding the db mutex. Closes https://github.com/facebook/rocksdb/pull/2624 Differential Revision: D5468152 Pulled By: ajkr fbshipit-source-id: ab92a4bffe76f9f174d861bb5812b974d1013400	8 years ago
Sagar Vemuri	72502cf227	Revert "comment out unused parameters" Summary: This reverts the previous commit `1d7048c598`, which broke the build. Did a `git revert 1d7048c`. Closes https://github.com/facebook/rocksdb/pull/2627 Differential Revision: D5476473 Pulled By: sagar0 fbshipit-source-id: 4756ff5c0dfc88c17eceb00e02c36176de728d06	8 years ago
Victor Gao	1d7048c598	comment out unused parameters Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually. Reviewed By: igorsugak Differential Revision: D5454343 fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2	8 years ago
Andrew Kryczka	a22b9cc6fe	overlapping endpoint fixes in level compaction picker Summary: This diff addresses two problems. Both problems cause us to miss scheduling desirable compactions. One side effect is compaction picking can spam logs, as there's no delay after failed attempts to pick compactions. 1. If a compaction pulled in a locked input-level file due to user-key overlap, we would not consider picking another file from the same input level. 2. If a compaction pulled in a locked output-level file due to user-key overlap, we would not consider picking any other compaction on any level. The code changes are dependent, which is why I solved both problems in a single diff. - Moved input-level `ExpandInputsToCleanCut` into the loop inside `PickFileToCompact`. This gives two benefits: (1) if it fails, we will try the next-largest file on the same input level; (2) we get the fully-expanded input-level key-range with which we can check for pending compactions in output level. - Added another call to `ExpandInputsToCleanCut` inside `PickFileToCompact`'s to check for compaction conflicts in output level. - Deleted call to `IsRangeInCompaction` in `PickFileToCompact`, as `ExpandInputsToCleanCut` also correctly handles the case where original output-level files (i.e., ones not pulled in due to user-key overlap) are pending compaction. Closes https://github.com/facebook/rocksdb/pull/2615 Differential Revision: D5454643 Pulled By: ajkr fbshipit-source-id: ea3fb5477d83e97148951af3fd4558d2039e9872	8 years ago
Andrew Kryczka	ffd2a2eefd	delete ExpandInputsToCleanCut failure log Summary: I decided not even to keep it as an INFO-level log as it is too normal for compactions to be skipped due to locked input files. Removing logging here makes us consistent with how we treat locked files that weren't pulled in due to overlap. We may want some error handling on line 422, which should never happen when called by `LevelCompactionBuilder::PickCompaction`, as `SetupInitialFiles` skips compactions where overlap causes the output level to pull in locked files. Closes https://github.com/facebook/rocksdb/pull/2617 Differential Revision: D5458502 Pulled By: ajkr fbshipit-source-id: c2e5f867c0a77c1812ce4242ab3e085b3eee0bae	8 years ago
Maysam Yabandeh	36651d14ee	Moving static AdaptationContext to outside function Summary: Moving static AdaptationContext to outside function to bypass tsan's false report with static initializers. It is because with optimization enabled std::atomic is simplified to as a simple read with no locks. The existing lock produced by static initializer is __cxa_guard_acquire which is apparently not understood by tsan as it is different from normal locks (__gthrw_pthread_mutex_lock). This is a known problem with tsan: https://stackoverflow.com/questions/27464190/gccs-tsan-reports-a-data-race-with-a-thread-safe-static-local https://stackoverflow.com/questions/42062557/c-multithreading-is-initialization-of-a-local-static-lambda-thread-safe A workaround that I tried was to move the static variable outside the function. It is not a good coding practice since it gives global visibility to variable but it is a hackish workaround until g++ tsan is improved. Closes https://github.com/facebook/rocksdb/pull/2598 Differential Revision: D5445281 Pulled By: yiwu-arbug fbshipit-source-id: 6142bd934eb5852d8fd7ce027af593ba697ed41d	8 years ago
Siying Dong	ae28634e9f	Remove some left-over BSD headers Summary: Closes https://github.com/facebook/rocksdb/pull/2608 Differential Revision: D5444797 Pulled By: siying fbshipit-source-id: 690581d03f37822e059a16085088e8e2d8a45016	8 years ago
Sushma Devendrappa	0655b58582	enable PinnableSlice for RowCache Summary: This patch enables using PinnableSlice for RowCache, changes include not releasing the cache handle immediately after lookup in TableCache::Get, instead pass a Cleanble function which does Cache::RleaseHandle. Closes https://github.com/facebook/rocksdb/pull/2492 Differential Revision: D5316216 Pulled By: maysamyabandeh fbshipit-source-id: d2a684bd7e4ba73772f762e58a82b5f4fbd5d362	8 years ago

... 14 15 16 17 18 ...

3618 Commits (ad52626cf4fd53b1549c4d04ea4c4dae9e4441d9)