rocksdb

Commit Graph

Author	SHA1	Message	Date
Zhongyi Xie	f95a5b2464	Avoid unnecessary big for-loop when reporting ticker stats stored in GetContext (#3490 ) Summary: Currently in `Version::Get` when reporting ticker stats stored in `GetContext`, there is a big for-loop through all `Ticker` which adds unnecessary cost to overall CPU usage. We can optimize by storing only ticker values that are used in `Get()` calls in a new struct `GetContextStats` since only a small fraction of all tickers are used in `Get()` calls. For comparison, with the new approach we only need to visit 17 values while old approach will require visiting 100+ `Ticker` Pull Request resolved: https://github.com/facebook/rocksdb/pull/3490 Differential Revision: D6969154 Pulled By: miasantreble fbshipit-source-id: fc27072965a3a94125a3e6883d20dafcf5b84029	7 years ago
Zhichao Cao	6811fb0658	Fixed the db_bench MergeRandom only access CF_default (#4155) Summary: While running tracing and analysis, I found that the MergeRandom benchmark in db_bench only accesses the default column family even when -num_column_families is set to a value > 1. Changes: use db_with_cfh as the DB to randomly select the column family on which to execute the Merge operation if -num_column_families is > 1. Tested with make asan_check and verified in tracing. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4155 Differential Revision: D8907888 Pulled By: zhichao-cao fbshipit-source-id: 2b4bc8fe0e99c8f262f5be6b986c7025d62cf850	7 years ago
Siying Dong	a5e851e113	Reformatting some recent changes (#4161 ) Summary: Lint is not happy with some new code recently committed. Format them. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4161 Differential Revision: D8940582 Pulled By: siying fbshipit-source-id: c9b43b1ef8c88b5e923911058b44eb77234b36b7	7 years ago
Siying Dong	8425c8bd4d	BlockBasedTableReader: automatically adjust tail prefetch size (#4156 ) Summary: Right now we use one hard-coded prefetch size to prefetch data from the tail of the SST files. However, this may introduce a waste for some use cases, while not efficient for others. Introduce a way to adjust this prefetch size by tracking 32 recent times, and pick a value with which the wasted read is less than 10% Pull Request resolved: https://github.com/facebook/rocksdb/pull/4156 Differential Revision: D8916847 Pulled By: siying fbshipit-source-id: 8413f9eb3987e0033ed0bd910f83fc2eeaaf5758	7 years ago
Andrew Kryczka	ab35505e21	Write properties metablock last in block-based tables (#4158 ) Summary: The properties meta-block should come at the end since we always need to read it when opening a file, unlike index/filter/other meta-blocks, which are sometimes read depending on the user's configuration. This ordering will allow us to (in a future PR) do a small readahead on the end of the file to read properties and meta-index blocks with one I/O. The bulk of this PR is a refactoring of the `BlockBasedTableBuilder::Finish` function. It was previously too large with inconsistent error handling, which made it difficult to change. So I broke it up into one function per meta-block write, and tried to make error handling consistent within those functions. Then reordering the metablocks was trivial -- just reorder the calls to these helper functions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4158 Differential Revision: D8921705 Pulled By: ajkr fbshipit-source-id: 96c9cc3182eb1adf11af46adab79dbeba7b12fcc	7 years ago
Yanqin Jin	2736752b33	Fix a bug in MANIFEST group commit (#4157 ) Summary: PR #3944 introduces group commit of `VersionEdit` in MANIFEST. The implementation has a bug. When updating the log file number of each column family, we must consider only `VersionEdit`s that operate on the same column family. Otherwise, a column family may accidentally set its log file number higher than actual value, indicating that log files with smaller file number will be ignored, thus causing some updates to be lost. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4157 Differential Revision: D8916650 Pulled By: riversand963 fbshipit-source-id: 8f456cf688f17bf35ad87b38e30e899aa162f201	7 years ago
Andrew Kryczka	b5613227a9	Smaller tail readahead when not reading index/filters (#4159 ) Summary: In all cases during `BlockBasedTable::Open`, we issue at least three read requests to the file's tail: (1) footer, (2) metaindex block, and (3) properties block. Depending on the config, we may also read other metablocks like filter and index. This PR issues smaller readahead when we expect to do only the three necessary reads mentioned above. Then, 4KB should be enough (ignoring the case where there are lots of user-defined properties). We can keep doing 512KB readahead when additional reads are expected. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4159 Differential Revision: D8924002 Pulled By: ajkr fbshipit-source-id: cfc713275de4d05ce11f18571f1d72e27ccd3356	7 years ago
Dmitri Smirnov	78ab11cd71	Return new operator for Status allocations for Windows (#4128) Summary: Windows requires new/delete for memory allocations to be overridden. Refactor to be less intrusive. Differential Revision: D8878047 Pulled By: siying fbshipit-source-id: 35f2b5fec2f88ea48c9be926539c6469060aab36	7 years ago
Sagar Vemuri	f3801528c1	Disable DBFlushTest.SyncFail and DBTest.GroupCommitTest on Travis (#4154 ) Summary: I am temporarily disabling DBFlushTest.SyncFail and DBTest.GroupCommitTest tests on Travis until we figure out the root-cause. These tests will still continue to run locally though. I haven't been able to reproduce these failures locally so far (even on a [local Travis environment](https://docs.travis-ci.com/user/common-build-problems/#Troubleshooting-Locally-in-a-Docker-Image) ). These tests are failing way too frequently causing everyone to wonder why their PR failed on travis, and waste time in debugging. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4154 Differential Revision: D8907258 Pulled By: sagar0 fbshipit-source-id: f40068b16e9245fb3791b6a4796435d1ce1ed205	7 years ago
Pooja Malik	1857576e03	db_bench support for OPTIONS+bloom and nicer output for perf_context (#4153 ) Summary: Adding the string "PERF_CONTEXT:" before the perf_context stats are printed. Setting the filter policy if it's a block based table even when options are being loaded from the provided FLAGS_options_file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4153 Differential Revision: D8905517 Pulled By: poojam23 fbshipit-source-id: 5956ed7882d39ec8ae654d5dadeb88727a36f0dd	7 years ago
Tomas Kolda	80afa84903	Windows JNI build fixes (#4015 ) Summary: Fixing compilation, unsatisfied link exceptions (updated list of files that needs to be linked) and warnings for Windows build. ```C++ //MSVC 2015 does not support dynamic arrays like: rocksdb::Slice key_parts[jkey_parts_len]; //I have converted to: std::vector<rocksdb::Slice> key_parts; ``` Also reusing `free_key_parts` that does the same as `free_key_value_parts` that was removed. Java elapsedTime unit test increase of sleep to 2 ms. Otherwise it was failing. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4015 Differential Revision: D8558215 Pulled By: sagar0 fbshipit-source-id: d3c34f846343f9218424da2402a2bd367bbd0aa2	7 years ago
Siying Dong	4bb1e239b5	Cap concurrent arena's shard block size to 128KB (#4147) Summary: Users sometimes see their memtable size far smaller than expected. They have probably hit fragmentation of shard blocks. Cap their size anyway to reduce the impact of the problem. 128KB is conservative so I don't imagine it can cause any performance problem. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4147 Differential Revision: D8886706 Pulled By: siying fbshipit-source-id: 8528a2a4196aa4457274522e2565fd3ff28f621e	7 years ago
Yanqin Jin	79f009f22e	Release 5.15. (#4148 ) Summary: Cut 5.15.fb Pull Request resolved: https://github.com/facebook/rocksdb/pull/4148 Differential Revision: D8886802 Pulled By: riversand963 fbshipit-source-id: 6b6427ce97f5b323a7eebf92458fda8b24b0cece	7 years ago
Siying Dong	37e0fdc824	DBSSTTest.DeleteSchedulerMultipleDBPaths data race (#4146 ) Summary: Fix a minor data race in DBSSTTest.DeleteSchedulerMultipleDBPaths reported by TSAN Pull Request resolved: https://github.com/facebook/rocksdb/pull/4146 Differential Revision: D8880945 Pulled By: siying fbshipit-source-id: 25c632f685757735c59ad4ff26b2f346a443a446	7 years ago
Yi Wu	d538ebdff0	Fix write get stuck when pipelined write is enabled (#4143) Summary: Fix the issue where, when pipelined write is enabled, writers can get stuck indefinitely and be unable to finish the write. It can be shown with the following example: Assume there are 4 writers W1, W2, W3, W4 (W1 is the first, W4 is the last). T1: All writers are pending in the WAL writer queue: WAL writer queue: W1, W2, W3, W4 memtable writer queue: empty T2: W1 finishes its WAL write and moves to the memtable writer queue: WAL writer queue: W2, W3, W4 memtable writer queue: W1 T3: W2 and W3 finish their WAL write as a batch group. W2 enters ExitAsBatchGroupLeader and moves the group to the memtable writer queue, but has not yet woken up the next leader. WAL writer queue: W4 memtable writer queue: W1, W2, W3 T4: W1, W2, W3 finish the memtable write as a batch group. Note that W2 is still in the previous ExitAsBatchGroupLeader, although W1 has done the memtable write for W2. WAL writer queue: W4 memtable writer queue: empty T5: The thread corresponding to W3 creates another writer W3' with the same address as W3. WAL writer queue: W4, W3' memtable writer queue: empty T6: W2 continues with ExitAsBatchGroupLeader. Because the address of W3' is the same as that of W3, the last writer in its group, it thinks there are no pending writers, so it resets newest_writer_ to null, emptying the queue. W4 and W3' are removed from the queue and will never be woken up. The issue has existed since pipelined write was introduced in 5.5.0. Closes #3704 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4143 Differential Revision: D8871599 Pulled By: yiwu-arbug fbshipit-source-id: 3502674e51066a954a0660257e24ac588f815e2a	7 years ago
Siying Dong	ddc07b40fc	Remove managed iterator Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4124 Differential Revision: D8829910 Pulled By: siying fbshipit-source-id: f3e952ccf3a631071a5d77c48e327046f8abb560	7 years ago
Siying Dong	995fcf7573	Pending output file number should be released after bulkload failure (#4145) Summary: If bulkload failed because of an input error, the pending output file number wasn't released. This bug can prevent all future files with a number larger than the current one from being deleted, even after they are compacted. This commit fixes the bug. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4145 Differential Revision: D8877900 Pulled By: siying fbshipit-source-id: 080be92a23d43305ca1e13fe1c06eb4cd0b01466	7 years ago
Fenggang Wu	5a59ce4149	Coding.h: Added Fixed16 support (#4142 ) Summary: Added Get Put Encode Decode support for Fixed16 (uint16_t). Unit test added in `coding_test.cc` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4142 Differential Revision: D8873516 Pulled By: fgwu fbshipit-source-id: 331913e0a9a8fe9c95606a08e856e953477d64d3	7 years ago
Sagar Vemuri	fb768a4289	Dump mutable FIFO and Universal compaction options (#4140 ) Summary: We forgot to dump FIFO and Universal compaction options to the LOG when any option was dynamically changed via `SetOptions` API. Now added those options also to `MutableCFOptions::Dump`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4140 Differential Revision: D8865634 Pulled By: sagar0 fbshipit-source-id: 05a93e26ab8e72fec6249acccd09b0eb3e1ef0ac	7 years ago
Maysam Yabandeh	b55da012f6	Refactor IndexBlockIter (#4141 ) Summary: Refactor IndexBlockIter to reduce conditional branches on key_includes_seq_. IndexBlockIter::Prev is also separated from DataBlockIter::Prev, not to cache the prev entries as they are of less importance when iterating over the index block. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4141 Differential Revision: D8866437 Pulled By: maysamyabandeh fbshipit-source-id: fdac76880426fc2be7d3c6354c09ab98f6657d4b	7 years ago
Sagar Vemuri	991120fa10	Allow ttl to be changed dynamically (#4133 ) Summary: Allow ttl to be changed dynamically. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4133 Differential Revision: D8845440 Pulled By: sagar0 fbshipit-source-id: c8c87ae643b3a8c4123e4c037c4645efc094a2d3	7 years ago
Siying Dong	8f06b4fa01	Separate some IndexBlockIter logic from BlockIter (#4136 ) Summary: Some logic only related to IndexBlockIter is separated from BlockIter to IndexBlockIter. This is done by writing an exclusive Seek() and SeekForPrev() for DataBlockIter, and all metadata block iter and tombstone block iter now use data block iter. Dealing with the BinarySeek() sharing problem by passing in the comparator to use. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4136 Reviewed By: maysamyabandeh Differential Revision: D8859673 Pulled By: siying fbshipit-source-id: 703e5e6824b82b7cbf4721f3594b94127797ca9e	7 years ago
Nathan VanBenschoten	ef7815b803	Support range deletion tombstones in IngestExternalFile SSTs (#3778 ) Summary: Fixes #3391. This change adds a `DeleteRange` method to `SstFileWriter` and adds support for ingesting SSTs with range deletion tombstones. This is important for applications that need to atomically ingest SSTs while clearing out any existing keys in a given key range. Pull Request resolved: https://github.com/facebook/rocksdb/pull/3778 Differential Revision: D8821836 Pulled By: anand1976 fbshipit-source-id: ca7786c1947ff129afa703dab011d524c7883844	7 years ago
Zhongyi Xie	91d7c03cdc	Exclude time waiting for rate limiter from rocksdb.sst.read.micros (#4102 ) Summary: Our "rocksdb.sst.read.micros" stat includes time spent waiting for rate limiter. It probably only affects people rate limiting compaction reads, which is fairly rare. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4102 Differential Revision: D8848506 Pulled By: miasantreble fbshipit-source-id: 01258ac5ae56e4eee372978cfc9143a6869f8bfc	7 years ago
Peter Mattis	90fc40690a	Relax VersionStorageInfo::GetOverlappingInputs check (#4050) Summary: Do not consider the range tombstone sentinel key as causing 2 adjacent sstables in a level to overlap. When a range tombstone's end key is the largest key in an sstable, the sstable's end key is set to a "sentinel" value that is the smallest key in the next sstable with a sequence number of kMaxSequenceNumber. This "sentinel" is guaranteed to not overlap in internal-key space with the next sstable. Unfortunately, GetOverlappingFiles uses user-keys to determine overlap and was thus considering 2 adjacent sstables in a level to overlap if they were separated by this sentinel key. This in turn would cause compactions to be larger than necessary. Note that this conflicts with https://github.com/facebook/rocksdb/pull/2769 and causes `DBRangeDelTest.CompactionTreatsSplitInputLevelDeletionAtomically` to fail. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4050 Differential Revision: D8844423 Pulled By: ajkr fbshipit-source-id: df3f9f1db8f4cff2bff77376b98b83c2ae1d155b	7 years ago
Yanqin Jin	21171615c1	Reduce execution time of IngestFileWithGlobalSeqnoRandomized (#4131 ) Summary: Make `ExternalSSTFileTest.IngestFileWithGlobalSeqnoRandomized` run faster. `make format` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4131 Differential Revision: D8839952 Pulled By: riversand963 fbshipit-source-id: 4a7e842fde1cde4dc902e928a1cf511322578521	7 years ago
Maysam Yabandeh	8581a93a6b	Per-thread unique test db names (#4135 ) Summary: The patch makes sure that two parallel test threads will operate on different db paths. This enables using open source tools such as gtest-parallel to run the tests of a file in parallel. Example: ``` ~/gtest-parallel/gtest-parallel ./table_test``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4135 Differential Revision: D8846653 Pulled By: maysamyabandeh fbshipit-source-id: 799bad1abb260e3d346bcb680d2ae207a852ba84	7 years ago
Zhongyi Xie	23b76252c8	db_bench: enable setting cache_size when loading options file Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4118 Differential Revision: D8845554 Pulled By: miasantreble fbshipit-source-id: 13bd3c1259a7c30bad762a413fe3bb24eea650ba	7 years ago
Fosco Marotto	8527012bb6	Converted db/merge_test.cc to use gtest (#4114 ) Summary: Picked up a task to convert this to use the gtest framework. It can't be this simple, can it? It works, but should all the std::cout be removed? ``` [$] ~/git/rocksdb [gft !]: ./merge_test [==========] Running 2 tests from 1 test case. [----------] Global test environment set-up. [----------] 2 tests from MergeTest [ RUN ] MergeTest.MergeDbTest Test read-modify-write counters... a: 3 1 2 a: 3 b: 1225 3 Compaction started ... Compaction ended a: 3 b: 1225 Test merge-based counters... a: 3 1 2 a: 3 b: 1225 3 Test merge in memtable... a: 3 1 2 a: 3 b: 1225 3 Test Partial-Merge Test merge-operator not set after reopen [ OK ] MergeTest.MergeDbTest (93 ms) [ RUN ] MergeTest.MergeDbTtlTest Opening database with TTL Test read-modify-write counters... a: 3 1 2 a: 3 b: 1225 3 Compaction started ... Compaction ended a: 3 b: 1225 Test merge-based counters... a: 3 1 2 a: 3 b: 1225 3 Test merge in memtable... Opening database with TTL a: 3 1 2 a: 3 b: 1225 3 Test Partial-Merge Opening database with TTL Opening database with TTL Opening database with TTL Opening database with TTL Test merge-operator not set after reopen [ OK ] MergeTest.MergeDbTtlTest (97 ms) [----------] 2 tests from MergeTest (190 ms total) [----------] Global test environment tear-down [==========] 2 tests from 1 test case ran. (190 ms total) [ PASSED ] 2 tests. ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4114 Differential Revision: D8822886 Pulled By: gfosco fbshipit-source-id: c299d008e883c3bb911d2b357a2e9e4423f8e91a	7 years ago
Maysam Yabandeh	537a233941	Exclude StackableDB from transaction stress tests (#4132 ) Summary: The transactions are currently tested with and without using StackableDB. This is mostly to check that the code path is consistent with stackable db as well. Slow, stress tests however do not benefit from being run again with StackableDB. The patch excludes StackableDB from such tests. On a single core it reduced the runtime of transaction_test from 199s to 135s. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4132 Differential Revision: D8841655 Pulled By: maysamyabandeh fbshipit-source-id: 7b9aaba2673b542b195439dfb306cef26bd63b19	7 years ago
Anand Ananthabhotla	e3eba52a5d	Re-enable kUniversalSubcompactions option_config (#4125 ) Summary: 1. Move kUniversalSubcompactions up before kEnd in db_test_util.h, so tests that cycle through all the option_configs include this 2. Skip kUniversalSubcompactions wherever kUniversalCompaction and kUniversalCompactionMultilevel are skipped Related to #3935 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4125 Differential Revision: D8828637 Pulled By: anand1976 fbshipit-source-id: 650dee15fd27d85281cf9bb4ca8ab460e04cac6f	7 years ago
Tamir Duberstein	7bee48bdbd	Add GCC 8 to Travis (#3433 ) Summary: - Avoid `strdup` to use jemalloc on Windows - Use `size_t` for consistency - Add GCC 8 to Travis - Add CMAKE_BUILD_TYPE=Release to Travis Pull Request resolved: https://github.com/facebook/rocksdb/pull/3433 Differential Revision: D6837948 Pulled By: sagar0 fbshipit-source-id: b8543c3a4da9cd07ee9a33f9f4623188e233261f	7 years ago
Zhongyi Xie	de98fd88e3	Support compaction filter in db_bench (#4106 ) Summary: Right now there is no support for enabling compaction filter in db_bench, we should add support for that to facilitate testing of compaction filter. This PR adds a compaction filter called KeepFilter and make `Filter` always returns false, essentially a noop compaction filter. This will allow us to test compaction filter code path without having to support arbitrary compaction filters Pull Request resolved: https://github.com/facebook/rocksdb/pull/4106 Differential Revision: D8828517 Pulled By: miasantreble fbshipit-source-id: 9ad76d04103eaa9d00da98334b4a39e542d26c41	7 years ago
Andrew Kryczka	97fe23fc5c	Fix unsigned int flag in db_bench (#4129 ) Summary: `DEFINE_uint32` was unavailable on some platforms, e.g., https://travis-ci.org/facebook/rocksdb/jobs/403352902. Use `DEFINE_uint64` instead which should work as it's used many times elsewhere in this file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4129 Differential Revision: D8830311 Pulled By: ajkr fbshipit-source-id: b4fc90ba3f50e649c070ce8069c68e530d731f05	7 years ago
Yanqin Jin	520bbb1774	Disable EnvPosixTest.RunImmediately, add EnvPosixTest.RunEventually. (#4126 ) Summary: The original `EnvPosixTest.RunImmediately` assumes that after scheduling a background thread, the thread is guaranteed to complete after 0.1 second. I do not know about any non-real-time OS/runtime providing this guarantee. Nor does C++11 standard say anything about this in the documentation of `std::thread`. In fact, we have observed this test failure multiple times on appveyor, and we haven't been able to reproduce the failure deterministically. Therefore, I disable this test for now until we know for sure how it used to fail. Instead, I add another test `EnvPosixTest.RunEventually` that checks that a thread will be scheduled eventually. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4126 Differential Revision: D8827086 Pulled By: riversand963 fbshipit-source-id: abc5cb655f90d50b791493da5eeb3716885dfe93	7 years ago
Yanqin Jin	90ebf1a257	Reduce execution time of a test. (#4127 ) Summary: Reduce the number of key ranges in `ExternalSSTFileTest.OverlappingRanges` so that the test completes in shorter time to avoid timeouts. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4127 Differential Revision: D8827851 Pulled By: riversand963 fbshipit-source-id: a16387b0cc92a7c872b1c50f0cfbadc463afc9db	7 years ago
Maysam Yabandeh	d4ad32d7bd	Refactor BlockIter (#4121 ) Summary: BlockIter is getting crowded including details that specific only to either index or data blocks. The patch moves down such details to DataBlockIter and IndexBlockIter, both inheriting from BlockIter. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4121 Differential Revision: D8816832 Pulled By: maysamyabandeh fbshipit-source-id: d492e74155c11d8a0c1c85cd7ee33d24c7456197	7 years ago
Andrew Kryczka	63904434eb	db_bench periodically dump stats to info log (#4109 ) Summary: give control of how often stats are printed, including jemalloc stats if enabled. Previously the default was 10 minutes so we'd only see updated stats for very long benchmark runs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4109 Differential Revision: D8796444 Pulled By: ajkr fbshipit-source-id: fd7902fe3f105fae89322c4ab63316bba4a2b15e	7 years ago
Yanqin Jin	dbeaa0d397	Reduce #iterations to shorten execution time. (#4123 ) Summary: Reduce #iterations from 5000 to 1000 so that `ExternalSSTFileTest.CompactDuringAddFileRandom` can finish faster. On the one hand, 5000 iterations does not seem to improve the quality of unit test in comparison with 1000. On the other hand, long running tests should belong to stress tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4123 Differential Revision: D8822514 Pulled By: riversand963 fbshipit-source-id: 0f439b8d5ccd9a4aed84638f8bac16382de17245	7 years ago
Nikhil Benesch	5f3088d565	Range deletion performance improvements + cleanup (#4014 ) Summary: This fixes the same performance issue that #3992 fixes but with much more invasive cleanup. I'm more excited about this PR because it paves the way for fixing another problem we uncovered at Cockroach where range deletion tombstones can cause massive compactions. For example, suppose L4 contains deletions from [a, c) and [x, z) and no other keys, and L5 is entirely empty. L6, however, is full of data. When compacting L4 -> L5, we'll end up with one file that spans, massively, from [a, z). When we go to compact L5 -> L6, we'll have to rewrite all of L6! If, instead of range deletions in L4, we had keys a, b, x, y, and z, RocksDB would have been smart enough to create two files in L5: one for a and b and another for x, y, and z. With the changes in this PR, it will be possible to adjust the compaction logic to split tombstones/start new output files when they would span too many files in the grandparent level. ajkr please take a look when you have a minute! Pull Request resolved: https://github.com/facebook/rocksdb/pull/4014 Differential Revision: D8773253 Pulled By: ajkr fbshipit-source-id: ec62fa85f648fdebe1380b83ed997f9baec35677	7 years ago
Fosco Marotto	121e321549	Update docs/Gemfile.lock for nokogiri cve (#4116 ) Summary: Per GitHub warning Pull Request resolved: https://github.com/facebook/rocksdb/pull/4116 Differential Revision: D8812291 Pulled By: gfosco fbshipit-source-id: 3c55adc4ac737e4be077ddf29322c8961018d67c	7 years ago
Siying Dong	a61ff876a1	Remove two CI tests (#4110 ) Summary: Two CI tests never pass because of the environment problem. Delete them. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4110 Differential Revision: D8805713 Pulled By: siying fbshipit-source-id: 6eb4813dc2094ee2045ec8ede7fe8967d546d6e8	7 years ago
Anand Ananthabhotla	1ea83c5de9	Reduce runtime of compact_on_deletion_collector_test (#4117 ) Summary: This test routinely exceeds the FB contbuild test timeout of 10 minutes, due to the large number of iterations. The large number (mainly due to 100 randomly selected window sizes) does not seem to add any value. Reduce it to allow the test to finish in < 10 mins. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4117 Differential Revision: D8815646 Pulled By: anand1976 fbshipit-source-id: 260690d24f444767ad93b039dec3ae8b9cdd1843	7 years ago
Siying Dong	35b38a232c	Update comments of WriteBatchWithIndex Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4113 Differential Revision: D8814172 Pulled By: siying fbshipit-source-id: cabc31db2c74803af9b2f99329155a1086eb1b22	7 years ago
Nikhil Benesch	5cd8240b86	Test range deletions with more configurations (#4021 ) Summary: Run the basic range deletion tests against the standard set of configurations. This testing exposed that files with hash indexes and partitioned indexes were not handling the case where the file contained only range deletions--i.e., where the index was empty. Additionally file a TODO about the fact that range deletions are broken when allow_mmap_reads = true is set. /cc ajkr nvanbenschoten Best viewed with ?w=1: https://github.com/facebook/rocksdb/pull/4021/files?w=1 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4021 Differential Revision: D8811860 Pulled By: ajkr fbshipit-source-id: 3cc07e6d6210a2a00b932866481b3d5c59775343	7 years ago
Nicolas Pépin-Perreault	cfee7fb51a	Allow storing metadata with backups for Java API (#4111 ) Summary: Exposes BackupEngine::CreateNewBackupWithMetadata and BackupInfo metadata to the Java API. Full disclaimer, I'm not familiar with JNI stuff, so I might have forgotten something (hopefully no memory leaks!). I also tried to find contributing guidelines but didn't see any, but I hope the PR style is consistent with the rest of the code base. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4111 Differential Revision: D8811180 Pulled By: ajkr fbshipit-source-id: e38b3e396c7574328c2a1a0e55acc8d092b6a569	7 years ago
Sagar Vemuri	1c912196de	Remove external tracking of AlignedBuffer's size (#4105 ) Summary: Remove external tracking of AlignedBuffer's size in `ReadaheadRandomAccessFile` and `FilePrefetchBuffer`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4105 Differential Revision: D8805724 Pulled By: sagar0 fbshipit-source-id: d61d8c203c7c500e3f36e912132d7852026ed023	7 years ago
Yanqin Jin	331cb63641	SetOptions Backup Race Condition (#4108 ) Summary: Prior to this PR, there was a race condition between `DBImpl::SetOptions` and `BackupEngine::CreateNewBackup`, as illustrated below. ``` Time thread 1 thread 2 \| CreateNewBackup -> GetLiveFiles \| SetOptions -> RenameTempFileToOptionsFile \| SetOptions -> RenameTempFileToOptionsFile \| SetOptions -> RenameTempFileToOptionsFile // unlink oldest OPTIONS file \| copy the oldest OPTIONS // IO error! V ``` Proposed fix is to check the value of `DBImpl::disable_obsolete_files_deletion_` before calling `DeleteObsoleteOptionsFiles`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4108 Differential Revision: D8796360 Pulled By: riversand963 fbshipit-source-id: 02045317f793ea4c7d4400a5bf333b8502fa3e82	7 years ago
Sagar Vemuri	440621aab8	Fix Copying of data between buffers in FilePrefetchBuffer (#4100 ) Summary: Copy data between buffers inside FilePrefetchBuffer only when chunk length is greater than 0. Otherwise AlignedBuffer was accessing memory out of its range causing crashes. Removing the tracking of buffer length outside of `AlignedBuffer`, i.e. in `FilePrefetchBuffer` and `ReadaheadRandomAccessFile`, will follow in a separate PR, as it is not the root cause of the crash reported in #4051. (`FilePrefetchBuffer` itself has been this way from its inception, and `ReadaheadRandomAccessFile` was updated to add the buffer length at some point). Comprehensive tests for `FilePrefetchBuffer` also to follow in a separate PR. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4100 Differential Revision: D8792590 Pulled By: sagar0 fbshipit-source-id: 3578f45761cf6884243e767f749db4016ccc93e1	7 years ago
Siying Dong	926f3a78a6	In delete scheduler, before ftruncate file for slow delete, check whether there is other hard links (#4093) Summary: Right now slow deletion with ftruncate doesn't work well with checkpoints because it ruins hard-linked files in checkpoints. To fix it, check that the file has no other hard links before truncating it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4093 Differential Revision: D8730360 Pulled By: siying fbshipit-source-id: 756eea5bce8a87b9a2ea3a5bfa190b2cab6f75df	7 years ago
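
The Fixed16 entry in the log above (5a59ce4149) mentions Get/Put/Encode/Decode support for `uint16_t`. As a rough illustration of what such helpers usually look like, here is a minimal sketch that assumes a little-endian, two-byte layout. The function names (`EncodeFixed16Sketch` and so on) are hypothetical and are not taken from RocksDB's `coding.h`; the actual signatures and byte-order handling there may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// NOTE: hypothetical helper names for illustration only; this is not
// RocksDB's coding.h. Assumes a little-endian on-disk layout.

// Write `value` into the two bytes at `buf`, least significant byte first.
inline void EncodeFixed16Sketch(char* buf, uint16_t value) {
  buf[0] = static_cast<char>(value & 0xff);
  buf[1] = static_cast<char>((value >> 8) & 0xff);
}

// Read a little-endian 16-bit value from the two bytes at `ptr`.
inline uint16_t DecodeFixed16Sketch(const char* ptr) {
  const unsigned lo = static_cast<unsigned char>(ptr[0]);
  const unsigned hi = static_cast<unsigned char>(ptr[1]);
  return static_cast<uint16_t>(lo | (hi << 8));
}

// Append the two-byte encoding of `value` to `dst`.
inline void PutFixed16Sketch(std::string* dst, uint16_t value) {
  char buf[sizeof(uint16_t)];
  EncodeFixed16Sketch(buf, value);
  dst->append(buf, sizeof(buf));
}

// Decode a value starting at `*offset` in `src` and advance the offset.
// Returns false (leaving the outputs untouched) if fewer than two bytes remain.
inline bool GetFixed16Sketch(const std::string& src, std::size_t* offset,
                             uint16_t* value) {
  if (*offset > src.size() || src.size() - *offset < sizeof(uint16_t)) {
    return false;
  }
  *value = DecodeFixed16Sketch(src.data() + *offset);
  *offset += sizeof(uint16_t);
  return true;
}
```

Encoding byte by byte rather than with `memcpy` keeps the result independent of the host's endianness, at the cost of a couple of extra shifts.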



7300 Commits (f95a5b2464d0a80e8badfbacda06e567e60cc79a)
