rocksdb

Commit Graph

Author	SHA1	Message	Date
Yanqin Jin	fa52376117	Move RADOS support to separate repo (#9206 ) Summary: This PR moves RADOS support from RocksDB repo to a separate repo. The new (temporary?) repo in this PR serves as an example before we finalize the decision on where and who to host RADOS support. At this point, people can start from the example repo and fork. The goal is to include this commit in RocksDB 7.0 release. Reference: https://github.com/ajkr/dedupfs by ajkr Pull Request resolved: https://github.com/facebook/rocksdb/pull/9206 Test Plan: Follow instructions in https://github.com/riversand963/rocksdb-rados-env/blob/main/README.md and build test binary `env_librados_test` and run it. Also, make check Reviewed By: ajkr Differential Revision: D33751690 Pulled By: riversand963 fbshipit-source-id: 30466c62afa9e4619847a48567ed158e62835e35	4 years ago
Yanqin Jin	5d30668cab	Remove tools/rdb from main repo (#9399 ) Summary: This PR is one proposal to resolve https://github.com/facebook/rocksdb/issues/9382. Looking at the code, I can't think of a reason why rdb is an internal component of RocksDB: it does not require any header files NOT in `include/rocksdb`. It's a better idea to host it somewhere else. Plus, rdb requires python2 which is not supported any more. No fixes or improvements will be made, even for potential security bugs (https://www.python.org/doc/sunset-python-2/). Pull Request resolved: https://github.com/facebook/rocksdb/pull/9399 Test Plan: make check Reviewed By: ajkr Differential Revision: D33641965 Pulled By: riversand963 fbshipit-source-id: 2a6a74693e5de36834f355e41d6865db206af48b	4 years ago
Yanqin Jin	50135c1bf3	Move HDFS support to separate repo (#9170 ) Summary: This PR moves HDFS support from RocksDB repo to a separate repo. The new (temporary?) repo in this PR serves as an example before we finalize the decision on where and who to host hdfs support. At this point, people can start from the example repo and fork. Java/JNI is not included yet, and needs to be done later if necessary. The goal is to include this commit in RocksDB 7.0 release. Reference: https://github.com/ajkr/dedupfs by ajkr Pull Request resolved: https://github.com/facebook/rocksdb/pull/9170 Test Plan: Follow the instructions in https://github.com/riversand963/rocksdb-hdfs-env/blob/master/README.md. Build and run db_bench and db_stress. make check Reviewed By: ajkr Differential Revision: D33751662 Pulled By: riversand963 fbshipit-source-id: 22b4db7f31762ed417a20239f5a08dcd1696244f	4 years ago
sdong	1cecd22de9	Increase wait time within EnvPosixTestWithParam.RunMany (#9413 ) Summary: We see: [ RUN ] ChrootEnvWithDirectIO/EnvPosixTestWithParam.RunMany/0 env/env_test.cc:464: Failure Expected equality of these values: 4 cur Which is: 0 The suspicion is that the wait time is not long enough. Increase the wait time to 10s and allow an earlier check. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9413 Test Plan: Run the test Reviewed By: riversand963 Differential Revision: D33697715 fbshipit-source-id: 3d71715562a8cceb694b773276dd9e4e451a18bc	4 years ago
anand76	e8f116deab	Update version to 6.29.0 (#9418 ) Summary: Update version for 6.29 release Pull Request resolved: https://github.com/facebook/rocksdb/pull/9418 Reviewed By: riversand963 Differential Revision: D33721048 Pulled By: anand1976 fbshipit-source-id: e73602ee1c829c2e47ce6e181bca4db7cb663979	4 years ago
sdong	a750b8a3a3	Remove VS2017 from Appveyor CI (#9417 ) Summary: It appears that VS2017 is covered in CircleCI so we don't need it in Appveyor. Also, currently Appveyor has some problem with installing VS2017. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9417 Test Plan: Watch Appveyor run. Reviewed By: riversand963 Differential Revision: D33719364 fbshipit-source-id: 7f31bf056eeaf487b372881f85d134dc0fe5832a	4 years ago
Peter Dillinger	e7ac7363b4	Add to HISTORY and minor loose ends from #9294 , #9254 (#9386 ) Summary: Loose ends relate to mmap on 32-bit systems. (Testing is more complicated when the feature was completely disabled on 32-bit.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/9386 Test Plan: CI Reviewed By: ajkr Differential Revision: D33590715 Pulled By: pdillinger fbshipit-source-id: f2637036a538a552200adee65b6765fce8cae27b	4 years ago
Peter Dillinger	fc9d4071f0	Fast path for detecting unchanged prefix_extractor (#9407 ) Summary: Fixes a major performance regression in 6.26, where extra CPU is spent in SliceTransform::AsString when reads involve a prefix_extractor (Get, MultiGet, Seek). Common case performance is now better than 6.25. This change creates a "fast path" for verifying that the current prefix extractor is unchanged and compatible with what was used to generate a table file. This fast path detects the common case by pointer comparison on the current prefix_extractor and a "known good" prefix extractor (if applicable) that is saved at the time the table reader is opened. The "known good" prefix extractor is saved as another shared_ptr copy (in an existing field, however) to ensure the pointer is not recycled. When the prefix_extractor has changed to a different instance but same compatible configuration (rare, odd), performance is still a regression compared to 6.25, but this is likely acceptable because of the oddity of such a case. The performance of incompatible prefix_extractor is essentially unchanged. Also fixed a minor case (ForwardIterator) where a prefix_extractor could be used via a raw pointer after being freed as a shared_ptr, if replaced via SetOptions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9407 Test Plan: ## Performance Populate DB with `TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12` Running head-to-head comparisons simultaneously with `TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -use_existing_db -readonly -benchmarks=seekrandom -num=10000000 -duration=20 -disable_wal=1 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12` Below each is compared by ops/sec vs. baseline which is version 6.25 (multiple baseline runs because of variable machine load) v6.26: 4833 vs. 6698 (<- major regression!) v6.27: 4737 vs. 6397 (still) New: 6704 vs. 6461 (better than baseline in common case) Disabled fastpath: 4843 vs. 6389 (e.g. if prefix extractor instance changes but is still compatible) Changed prefix size (no usable filter) in new: 787 vs. 5927 Changed prefix size (no usable filter) in new & baseline: 773 vs. 784 Reviewed By: mrambacher Differential Revision: D33677812 Pulled By: pdillinger fbshipit-source-id: 571d9711c461fb97f957378a061b7e7dbc4d6a76	4 years ago
Jay Zhuang	7711f8cbb4	Remove pyenv installation and use deps from S3 (#9406 ) Summary: * remove the pyenv installation step, which is not needed (it takes 3 minutes to install for every job and fails from time to time) * downloading the compression libs fails from time to time; uploaded the libs to S3 and download them from there for CI, which should be more stable. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9406 Test Plan: CI Reviewed By: riversand963 Differential Revision: D33700158 Pulled By: jay-zhuang fbshipit-source-id: be7b172d7cd059c9d7b3139fd7a34f8070460e31	4 years ago
Peter Dillinger	8064a3ac31	Fix flaky EventListenerTest.DisableBGCompaction (#9400 ) Summary: Wasn't able to easily reproduce error, but easy to see a race condition between TestFlushListener::OnFlushCompleted and DBTestBase::Close(), which frees CF handles before closing DB. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9400 Test Plan: CI etc. Reviewed By: riversand963 Differential Revision: D33645134 Pulled By: pdillinger fbshipit-source-id: d0ec914cc43c9e14f53da633876b95b61995138d	4 years ago
Jay Zhuang	cd50078ae0	Update circleci xcode version (#9405 ) Summary: xcode 11.3.1 is deprecated https://circleci.com/docs/2.0/testing-ios/ , jobs are failing: ``` failed to create host: Image xcode:11.3.0 is not supported ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/9405 Test Plan: CI Reviewed By: ajkr, hx235 Differential Revision: D33674462 Pulled By: jay-zhuang fbshipit-source-id: 85dd27aad84d26eaaa5c5375015344182b2c50b9	4 years ago
Brian Chen	93a0e9f3fa	Mark destructors as override (#9404 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9404 It is better practice to mark destructors as override. Without this change there can be issues building with -Wsuggest-destructor-override. Reviewed By: riversand963 Differential Revision: D33671992 fbshipit-source-id: 75b0c15010cbab5fbc071c150fef1dc85d5d9d96	4 years ago
Peter Dillinger	ffe1e4b820	Make some FilterPolicy deprecations more clear (#9403 ) Summary: The old block-based filter has been deprecated for years, but this makes that more clear by marking the functions specific to it and logging a warning when the feature is used. It is deprecated because of performance. In that old design, you have to binary search through the full SST index before a bloom filter query, which is much more expensive than a bloom query itself. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9403 Test Plan: Used db_bench with and without -use_block_based_filter, running at the same time TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom,readrandom -num=10000000 -duration=20 -disable_wal=1 -write_buffer_size=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 No significant difference in construction time but 3x slower readrandom with -use_block_based_filter: readrandom : 100.517 micros/op 9948 ops/sec; 1.1 MB/s vs. readrandom : 33.368 micros/op 29968 ops/sec; 3.3 MB/s Also saw deprecation message (just once) in LOG only with -use_block_based_filter Reviewed By: ajkr Differential Revision: D33673202 Pulled By: pdillinger fbshipit-source-id: 99f6f0eff619408d9e5f7ef546954ed0be6c7a5b	4 years ago
Andrew Kryczka	875bfd75a0	Add API warning for `Iterator::Refresh()` with range tombstones (#9398 ) Summary: Need this until we properly return an error or fix the combination. Reported in https://github.com/facebook/rocksdb/issues/9255. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9398 Reviewed By: riversand963 Differential Revision: D33641396 Pulled By: ajkr fbshipit-source-id: 9fe804108f7b93912f5b9c7252ac49acedc4f805	4 years ago
Hui Xiao	f61df25cc2	Add missing comment to RateLimiter::Request() (#9392 ) Summary: Context/Summary: There are two `RateLimiter::Request()` in public header. One of them is missing some comment that the other one has. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9392 Test Plan: rely on CI test Reviewed By: pdillinger Differential Revision: D33623609 Pulled By: hx235 fbshipit-source-id: 42dc06308ff0bcf5ee7ef67e0b1c0172fc239b20	4 years ago
Yanqin Jin	1a8e9f0e07	Use fcntl(F_FULLFSYNC) on OS X (#9356 ) Summary: Closing https://github.com/facebook/rocksdb/issues/5954 fsync/fdatasync on Linux: ``` (fsync/fdatasync) includes writing through or flushing a disk cache if present. ``` However, on OS X and iOS: ``` (fsync) will flush all data from the host to the drive (i.e. the "permanent storage device"), the drive itself may not physically write the data to the platters for quite some time and it may be written in an out-of-order sequence. ``` Solution is to use `fcntl(F_FULLFSYNC)` on OS X so that we get the same persistence guarantee. According to OSX man page, ``` The F_FULLFSYNC fcntl asks the drive to flush all buffered data to permanent storage. ``` This suggests that it will be no faster than `fsync` on Linux, since Linux, according to its man page, ``` writing through or flushing a disk cache if present ``` It means Linux may not flush all data from disk cache. This is similar to bug reports/fixes in: - golang: https://github.com/golang/go/issues/26650 - leveldb: `296de8d5b8`. Not sure if we should fallback to fsync since we break persistence contract. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9356 Reviewed By: jay-zhuang Differential Revision: D33417416 Pulled By: riversand963 fbshipit-source-id: 475548ff9c5eaccde325e0f6842694271cbc8cb7	4 years ago
Peter Dillinger	5576ded762	Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363 ) Summary: In response to https://github.com/facebook/rocksdb/issues/9354, this PR adds a way for users to "opt out" of extra checks that can impact peak write performance, which currently only includes force_consistency_checks. I considered including some other options but did not see a db_bench performance difference. Also clarify in comment for force_consistency_checks that it can "slow down saturated writing." Pull Request resolved: https://github.com/facebook/rocksdb/pull/9363 Test Plan: basic coverage in unit tests Using my perf test in https://github.com/facebook/rocksdb/issues/9354 comment, I see force_consistency_checks=true -> 725360 ops/s force_consistency_checks=false -> 783072 ops/s Reviewed By: mrambacher Differential Revision: D33636559 Pulled By: pdillinger fbshipit-source-id: 25bfd006f4844675e7669b342817dd4c6a641e84	4 years ago
Peter Dillinger	288dfd0ba5	README: De-list slack channel, list Google group (#9387 ) Summary: We are phasing out the slack channel, but keeping the Google Group email list. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9387 Test Plan: no code Reviewed By: riversand963 Differential Revision: D33591265 Pulled By: pdillinger fbshipit-source-id: 48e45a74753d05611db2c8f4efc4de16a1f50e70	4 years ago
Fabrice Fontaine	53c8f739fd	build_tools/build_detect_platform: fix C++ tests (#6479 ) Summary: Replace `-o /dev/null` by `-o test.o` when testing for C++ features such as -faligned-new otherwise tests will fail with some bugged binutils (https://sourceware.org/bugzilla/show_bug.cgi?id=19526): ``` output/host/bin/xtensa-buildroot-linux-uclibc-g++ -faligned-new -x c++ - -o /dev/null <<EOF struct alignas(1024) t {int a;}; int main() {} EOF /home/fabrice/buildroot/output/host/lib/gcc/xtensa-buildroot-linux-uclibc/8.3.0/../../../../xtensa-buildroot-linux-uclibc/bin/ld: final link failed: file truncated ``` Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com> Pull Request resolved: https://github.com/facebook/rocksdb/pull/6479 Reviewed By: ajkr Differential Revision: D33574136 Pulled By: riversand963 fbshipit-source-id: 12b48658b17e36013042c98219b89ddf71161d3c	4 years ago
Sergei Petrunia	c9042db619	Range Locking: add support for escalation barriers (#9290 ) Summary: Range Locking supports Lock Escalation. Lock Escalation is invoked when lock memory is nearly exhausted, and it reduces the amount of memory used by joining adjacent locks. Bridging the gap between certain locks has adverse effects. For example, in MyRocks it is not a good idea to bridge the gap between locks in different indexes, as that causes the lock to cover large portions of indexes, or even entire indexes. Resolve this by introducing an Escalation Barrier. The escalation process will call the user-provided barrier callback function: bool(const Endpoint& a, const Endpoint& b) If the function returns true, there's a barrier between a and b and Lock Escalation will not try to bridge the gap between a and b. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9290 Reviewed By: akankshamahajan15 Differential Revision: D33486753 Pulled By: riversand963 fbshipit-source-id: f97910b67aba0579ea1d35f523ca6863d3dd018e	4 years ago
Si Ke	93b1de4f45	Enable db_test running in Centos 32 bit OS and Alpine 32 bit OS (#9294 ) Summary: Closes https://github.com/facebook/rocksdb/issues/9271 Pull Request resolved: https://github.com/facebook/rocksdb/pull/9294 Reviewed By: riversand963, hx235 Differential Revision: D33586002 Pulled By: pdillinger fbshipit-source-id: 3d1a2fa71023e108613ff03dbd37a5f954fc4920	4 years ago
Eric Thérond	5602b1d3d9	Add support for Apple Silicon to RocksJava (#9254 ) Summary: Fixes facebook/rocksdb#7720 Updated Makefile with flags to define target architecture when compiling/linking, and added goal `rocksdbjavastaticosxub` to build a OS X Universal Binary native library. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9254 Reviewed By: mrambacher Differential Revision: D33551160 Pulled By: pdillinger fbshipit-source-id: 9ce9962e03aacf55014545a6cdf638b5b14b8fa9	4 years ago
Yanqin Jin	d247230aec	Add check for using namespace (#9383 ) Summary: As title. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9383 Test Plan: manually add `using namespace` to a file, and run `make check-sources`. Then, remove `using namespace`, and run `make check-sources` Reviewed By: ajkr Differential Revision: D33551706 Pulled By: riversand963 fbshipit-source-id: 1bb8304f38434da7de0656882e62e77673155725	4 years ago
zhuchong0329	5f2b661f54	FlushMemTable return ok but memtable does not synchronize flush (#8173 ) Summary: Fix https://github.com/facebook/rocksdb/issues/8046 : FlushMemTable return ok but memtable does not synchronize flush. The way to fix it is to expose RecoveryError. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8173 Reviewed By: ajkr Differential Revision: D31674552 Pulled By: jay-zhuang fbshipit-source-id: 9d16b69ba12a196bb429332ec8224754de97773d	4 years ago
Yanqin Jin	0376869f05	Remove using namespace (#9369 ) Summary: As title. This is part of an fb-internal task. First, remove all `using namespace` statements if applicable. Next, utilize multiple build platforms and see if anything is broken. Should anything become broken, fix the compilation errors with as little extra change as possible. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9369 Test Plan: internal build and make check make clean && make static_lib && cd examples && make all Reviewed By: pdillinger Differential Revision: D33517260 Pulled By: riversand963 fbshipit-source-id: 3fc4ce6402a073421dfd9a9b2d1c79441dca7a40	4 years ago
Yanqin Jin	21e71d1c73	Fix compilation error when building static_lib (#9377 ) Summary: With memkind installed, either on a non-fb machine or using `ROCKSDB_NO_FBCODE=1`. ``` ROCKSDB_NO_FBCODE=1 make static_lib ``` Compilation failed due to unused variable warning treated as error. To bypass this, we need to disable warning-as-error, which is not ideal. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9377 Test Plan: Repeat the above command, and rely on CI. Reviewed By: ajkr Differential Revision: D33543343 Pulled By: riversand963 fbshipit-source-id: 9a2790b38c00b8696c7910287f4ae5a9b394341d	4 years ago
Niklas Fiekas	f8bdd5797f	Take compression level_values as const pointer (#9376 ) Summary: Compatible change, more natural (especially in generated Rust bindings), no risk that the API will ever need mutable access because it has to make a copy anyway. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9376 Reviewed By: ajkr Differential Revision: D33541435 Pulled By: pdillinger fbshipit-source-id: 15c512a0d70b6e8694fa99d598b7d022751c1e59	4 years ago
Jay Zhuang	9c6fb26033	Fix clang13 build error (#9374 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9374 Test Plan: Add CI for clang13 build Reviewed By: riversand963 Differential Revision: D33522867 Pulled By: jay-zhuang fbshipit-source-id: 642756825cf0b51e35861fb847ebaee4611b76ca	4 years ago
mrambacher	1973fcba11	Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362 ) Summary: In order to support old-style regex function registration, restored the original "Register<T>(string, Factory)" method using regular expressions. The PatternEntry methods were left in place but renamed to AddFactory. The goal is to allow for the deprecation of the original regex Registry method in an upcoming release. Added modes to the PatternEntry kMatchZeroOrMore and kMatchAtLeastOne to match * or +, respectively (kMatchAtLeastOne was the original behavior). Pull Request resolved: https://github.com/facebook/rocksdb/pull/9362 Reviewed By: pdillinger Differential Revision: D33432562 Pulled By: mrambacher fbshipit-source-id: ed88ab3f9a2ad0d525c7bd1692873f9bb3209d02	4 years ago
Jay Zhuang	6bab278291	Fix flaky SimCacheTest.SimCacheLogging (#9373 ) Summary: The random string may contain the string we're checking, e.g.: ``` ADD - 206FBC78E96BC4C6A2DDDDC0AD5D1ADD - 111 ``` Only check the line starts-with "ADD -". Pull Request resolved: https://github.com/facebook/rocksdb/pull/9373 Test Plan: `gtest-parallel ./sim_cache_test --gtest_filter=SimCacheTest.SimCacheLogging -r 1000` Reviewed By: riversand963 Differential Revision: D33519574 Pulled By: jay-zhuang fbshipit-source-id: d0c1c9b0b489246d292e7da4133030edaa748099	4 years ago
Yanqin Jin	55a2105258	Make RocksDB codebase compatible with newer compilers like clang-12 (#9370 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9370 GCC and newer clang, e.g. clang-12 treat `std::unique_ptr` slightly differently. For the following code ``` #include <iostream> #include <memory> #include <type_traits> struct A { std::unique_ptr<int> m1; }; int main() { std::cout << std::boolalpha; std::cout << std::is_standard_layout<A>::value << '\n'; return 0; } ``` GCC11(C++20) (tested on https://en.cppreference.com/w/cpp/types/is_standard_layout) will print "true", while newer clang, e.g. clang-12 will print "false". This breaks the usage of `offsetof()` on structs with non-static members of type `std::unique_ptr`. Fixing this by replacing the builtin `offsetof` with a trick documented at https://gist.github.com/graphitemaster/494f21190bb2c63c5516. Reviewed By: jay-zhuang Differential Revision: D33420840 fbshipit-source-id: 02bde281dfa28809bec787ad0f7019e85dd9c607	4 years ago
jsteemann	255aefb628	Add filename to several Corruption messages (#9239 ) Summary: This change adds the filename of the offending file to several places that produce Status objects with code `kCorruption`. This is not an attempt to have every Corruption message in the codebase extended with the filename, but it is a start. The motivation for the change was to quickly diagnose which file is corrupted when a large database is opened and there is no option to copy it offsite for analysis, run strace or install the ldb tool. In the particular case in question, the error message improved from a mere ``` Corruption: checksum mismatch ``` to ``` Corruption: checksum mismatch in file /path/to/db/engine-rocksdb/MANIFEST-000171 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/9239 Reviewed By: jay-zhuang Differential Revision: D33237742 Pulled By: riversand963 fbshipit-source-id: bd42559cfbf786a0a674d091671d1a2bf07bdd31	4 years ago
Youngjae Lee	3dfee770c6	Remove obsolete function declaration (#8724 ) Summary: Function `Version::UpdateFilesByCompactionPri()` is never called and not implemented. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8724 Reviewed By: ajkr Differential Revision: D30643943 Pulled By: riversand963 fbshipit-source-id: 174b2d9a2a42e286222909a035cc74a7b5602335	4 years ago
Hui Xiao	9110685e8c	Release cache reservation of hash entries of the fall-back Ribbon Filter earlier (#9345 ) Summary: Note: rebase on and merge after https://github.com/facebook/rocksdb/pull/9349, as part of https://github.com/facebook/rocksdb/pull/9342 Context: https://github.com/facebook/rocksdb/pull/9073 charged the hash entries' memory in block cache with `CacheReservationHandle`. However, in the edge case where Ribbon Filter falls back to Bloom Filter and swaps its hash entries to the embedded bloom filter object, the handles associated with those entries are not swapped and thus not released as soon as those entries are cleared during Bloom Filter's finish process. Although this is a minor issue since RocksDB internally calls `FilterBitsBuilder->Reset()` right after `FilterBitsBuilder->Finish()` on the main path, which releases all the cache reservations related to both the Ribbon Filter and its embedded Bloom Filter, it is still worth fixing to avoid confusion. Summary: - Swapped the `CacheReservationHandle` associated with the hash entries on Ribbon Filter's fallback Pull Request resolved: https://github.com/facebook/rocksdb/pull/9345 Test Plan: - Added a unit test to verify the number of cache reservations after clearing hash entries, which failed before the change and now succeeds Reviewed By: pdillinger Differential Revision: D33377225 Pulled By: hx235 fbshipit-source-id: 7487f4c40dfb6ee7928232021f93ef2c5329cffa	4 years ago
Hui Xiao	f62efb9d35	Clarify Options::rate_limiter api (#9361 ) Summary: Context/Summary: I believe we also rate-limit read rate using the rate limiter passed into db options, e.g, https://github.com/facebook/rocksdb/blob/6.27.fb/file/random_access_file_reader.cc#L159 Pull Request resolved: https://github.com/facebook/rocksdb/pull/9361 Test Plan: Existing tests Reviewed By: jay-zhuang Differential Revision: D33420803 Pulled By: hx235 fbshipit-source-id: 0ef3c4d0aaacb9bee9a5d2caceddfc76588c8949	4 years ago
Hui Xiao	fb0a76a9e2	Always check previous conditionally unchecked status due to shortcut evaluation in BlockBasedTableBuilder::WriteIndexBlock (#9349 ) Summary: Note: part of https://github.com/facebook/rocksdb/pull/9342 Context/Summary: Due to shortcut evaluation in `ok() && s.IsIncomplete()`, status `s` remains unchecked if `ok()==false`, which is the case in https://app.circleci.com/pipelines/github/facebook/rocksdb/10718/workflows/429f7ad4-6b9a-446b-b9b3-710d51b90409/jobs/265508 revealed by the change in the corresponding PR https://github.com/facebook/rocksdb/pull/9342. As suggested by reviewers, separation and clarification of status checking for partitioned index building from general table building status is added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9349 Test Plan: - The newly added if-else code is an equivalent translation of the existing logic plus always checking the conditionally unchecked status so relying on existing tests should be fine - https://github.com/facebook/rocksdb/pull/9342's `[build-linux-shared_lib-alt_namespace-status_checked](https://app.circleci.com/pipelines/github/facebook/rocksdb/10721/workflows/a200efe0-d545-4075-8c42-26dd3dc00f27/jobs/265625)` test should now pass after rebasing on this change Reviewed By: pdillinger Differential Revision: D33377223 Pulled By: hx235 fbshipit-source-id: cb81da9709ae9185e9cea89776e3012e915d6ef9	4 years ago
Yanqin Jin	b2e53ab2d8	Add checking for `DB::DestroyColumnFamilyHandle()` (#9347 ) Summary: Closing https://github.com/facebook/rocksdb/issues/5006 Calling `DB::DestroyColumnFamilyHandle(column_family)` with `column_family` being the return value of `DB::DefaultColumnFamily()` will return `Status::InvalidArgument()`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9347 Test Plan: make check Reviewed By: akankshamahajan15 Differential Revision: D33369675 Pulled By: riversand963 fbshipit-source-id: a8266a4daddf2b7a773c2dc7f3eb9a4adfb6b6dd	4 years ago
Andrew Kryczka	6892f19b11	Test correctness with WAL disabled in non-txn blackbox crash tests (#9338 ) Summary: Recently we added the ability to verify some prefix of operations are recovered (AKA no "hole" in the recovered data) (https://github.com/facebook/rocksdb/issues/8966). Besides testing unsynced data loss scenarios, it is also useful to test WAL disabled use cases, where unflushed writes are expected to be lost. Note RocksDB only offers the prefix-recovery guarantee to WAL-disabled use cases that use atomic flush, so crash test always enables atomic flush when WAL is disabled. To verify WAL-disabled crash-recovery correctness globally, i.e., also in whitebox and blackbox transaction tests, it is possible but requires further changes. I added TODOs in db_crashtest.py. Depends on https://github.com/facebook/rocksdb/issues/9305. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9338 Test Plan: Running all crash tests and many instances of blackbox. Sandcastle links are in Phabricator diff test plan. Reviewed By: riversand963 Differential Revision: D33345333 Pulled By: ajkr fbshipit-source-id: f56dd7d2e5a78d59301bf4fc3fedb980eb31e0ce	4 years ago
Andrew Kryczka	b860a42158	Recover to exact latest seqno of data committed to MANIFEST (#9305 ) Summary: The LastSequence field in the MANIFEST file is the baseline seqno for a recovered DB. Recovering WAL entries might cause the recovered DB's seqno to advance above this baseline, but the recovered DB will never use a smaller seqno. Before this PR, we were writing the DB's seqno at the time of LogAndApply() as the LastSequence value. This works in the sense that it is a large enough baseline for the recovered DB that it'll never overwrite any records in existing SST files. At the same time, it's arbitrarily larger than what's needed. This behavior comes from LevelDB, where there was no tracking of largest seqno in an SST file. Now we know the largest seqno of newly written SST files, so we can write an exact value in LastSequence that actually reflects the largest seqno in any file referred to by the MANIFEST. This is primarily useful for correctness testing with unsynced data loss, where the recovered DB's seqno needs to indicate what records were recovered. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9305 Test Plan: - https://github.com/facebook/rocksdb/issues/9338 adds crash-recovery correctness testing coverage for WAL disabled use cases - https://github.com/facebook/rocksdb/issues/9357 will extend that testing to cover file ingestion - Added assertion at end of LogAndApply() for `VersionSet::descriptor_last_sequence_` consistency with files - Manually tested upgrade/downgrade compatibility with a custom crash test that randomly picks between a `db_stress` built with and without this PR (for old code it must run with `-disable_wal=0`) Reviewed By: riversand963 Differential Revision: D33182770 Pulled By: ajkr fbshipit-source-id: 0bfafaf685f347cc8cb0e1d62e0186340a738f7d	4 years ago
mrambacher	fe31dc53ca	Make the Env class Customizable (#9293 ) Summary: Allows the Env to have options (Configurable) and loads like other Customizable classes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9293 Reviewed By: pdillinger, zhichao-cao Differential Revision: D33181591 Pulled By: mrambacher fbshipit-source-id: 55e823886c654d214eda9eedd45ccdc54dac14d7	4 years ago
Yanqin Jin	677d2b4a8f	Fix a bug in C-binding causing iterator to return incorrect result (#9343 ) Summary: Fixes https://github.com/facebook/rocksdb/issues/9339 When writing SST file, the name, computed as `prefix_extractor->GetId()` will be written to the properties block. When the SST is opened again in the future, `CreateFromString()` will take the name as argument and try to create a prefix extractor object. Without this fix, the C API will pass a `Wrapper` pointer to the underlying DB's `prefix_extractor`. `Wrapper::GetId()`, in this case, will be missing the prefix length component, causing a prefix extractor of length 0 to be silently created and used. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9343 Test Plan: ``` make c_test ./c_test ``` Reviewed By: mrambacher Differential Revision: D33355549 Pulled By: riversand963 fbshipit-source-id: c92c3acd8be262c3bff8794b4229e42b9ee31203	4 years ago
sdong	a931bacf5d	Improve SimulatedHybridFileSystem (#9301 ) Summary: Several improvements to SimulatedHybridFileSystem: (1) Allow a mode where all I/Os to all files simulate HDD. This can be enabled in db_bench using -simulate_hdd (2) Latency calculation is slightly more accurate (3) Allow to simulate more than one HDD spindles. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9301 Test Plan: Run db_bench and observe the results are reasonable. Reviewed By: jay-zhuang Differential Revision: D33141662 fbshipit-source-id: b736e58c4ba910d06899cc9ccec79b628275f4fa	4 years ago
mrambacher	1c39b7952b	Remove/Reduce use of Regex in ObjectRegistry/Library (#9264 ) Summary: Added new ObjectLibrary::Entry classes to replace/reduce the use of Regex. For simple factories that only do name matching, there are "StringEntry" and "AltStringEntry" classes. For classes that use some semblance of regular expressions, there is a PatternEntry class that can match a name and prefixes. There is also a class for Customizable::IndividualId format matches. Added tests for the new derivative classes and got all unit tests to pass. Resolves https://github.com/facebook/rocksdb/issues/9225. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9264 Reviewed By: pdillinger Differential Revision: D33062001 Pulled By: mrambacher fbshipit-source-id: c2d2143bd2d38bdf522705c8280c35381b135c03	4 years ago
mrambacher	0a563ae278	Change GTEST_SKIP to BYPASS for MemoryAllocatorTest (#9340 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9340 Reviewed By: riversand963 Differential Revision: D33344152 Pulled By: mrambacher fbshipit-source-id: 283637625b86c33497571c5f52cac3ddf910b6f3	4 years ago
Peter Dillinger	26a238f5b7	New blog post for Ribbon filter (#8992 ) Summary: new blog post for Ribbon filter Pull Request resolved: https://github.com/facebook/rocksdb/pull/8992 Test Plan: markdown render in GitHub, Pages on my fork Reviewed By: jay-zhuang Differential Revision: D33342496 Pulled By: pdillinger fbshipit-source-id: a0a7c19100abdf8755f8a618eb4dead755dfddae	4 years ago
Andrew Kryczka	aa2b3bf675	Added `TraceOptions::preserve_write_order` (#9334 ) Summary: This option causes trace records to be written in the serialized write thread. That way, the write records in the trace must follow the same order as writes that are logged to WAL and writes that are applied to the DB. By default I left it disabled to match existing behavior. I enabled it in `db_stress`, though, as that use case requires that the order of write records in the trace match the order in the WAL. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9334 Test Plan: - See if below unsynced data loss crash test can run for 24h straight. It used to crash after a few hours when reaching an unlucky trace ordering. ``` DEBUG_LEVEL=0 TEST_TMPDIR=/dev/shm /usr/local/bin/python3 -u tools/db_crashtest.py blackbox --interval=10 --max_key=100000 --write_buffer_size=524288 --target_file_size_base=524288 --max_bytes_for_level_base=2097152 --value_size_mult=33 --sync_fault_injection=1 --test_batches_snapshots=0 --duration=86400 ``` Reviewed By: zhichao-cao Differential Revision: D33301990 Pulled By: ajkr fbshipit-source-id: 82d97559727adb4462a7af69758449c8725b22d3	4 years ago
Andrew Kryczka	2ee20a669d	Extend trace filtering to more operation types (#9335 ) Summary: - Extended trace filtering to cover `MultiGet()`, `Seek()`, and `SeekForPrev()`. Now all user ops that can be traced support filtering. - Enabled the new filter masks in `db_stress` since it only cares to trace writes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9335 Test Plan: - trace-heavy `db_stress` command reduced 30% elapsed time (79.21 -> 55.47 seconds) Benchmark command: ``` $ /usr/bin/time ./db_stress -ops_per_thread=100000 -sync_fault_injection=1 --db=/dev/shm/rocksdb_stress_db/ --expected_values_dir=/dev/shm/rocksdb_stress_expected/ --clear_column_family_one_in=0 ``` - replay-heavy `db_stress` command reduced 12.4% elapsed time (23.69 -> 20.75 seconds) Setup command: ``` $ ./db_stress -ops_per_thread=100000000 -sync_fault_injection=1 -db=/dev/shm/rocksdb_stress_db/ -expected_values_dir=/dev/shm/rocksdb_stress_expected --clear_column_family_one_in=0 & sleep 120; pkill -9 db_stress ``` Benchmark command: ``` $ /usr/bin/time ./db_stress -ops_per_thread=1 -reopen=0 -expected_values_dir=/dev/shm/rocksdb_stress_expected/ -db=/dev/shm/rocksdb_stress_db/ --clear_column_family_one_in=0 --destroy_db_initially=0 ``` Reviewed By: zhichao-cao Differential Revision: D33304580 Pulled By: ajkr fbshipit-source-id: 0df10f87c1fc506e9484b6b42cea2ef96c7ecd65	4 years ago
slk	2e5f764294	Make IncreaseFullHistoryTsLow to a public API (#9221 ) Summary: As discussed in https://github.com/facebook/rocksdb/issues/9210, full_history_ts_low is currently a member of CompactRangeOptions, which means a CF's fullHistoryTsLow is advanced only when users submit a CompactRange request. However, users may want to advance the fullHistoryTsLow without an immediate compaction. This merge makes IncreaseFullHistoryTsLow a public API so users can advance each CF's fullHistoryTsLow separately. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9221 Reviewed By: akankshamahajan15 Differential Revision: D33201106 Pulled By: riversand963 fbshipit-source-id: 9cb1d013ba93260f72e16353e693ffee167b47ee	4 years ago
Andrew Kryczka	538d2365e9	Fix race condition in BackupEngineTest.ChangeManifestDuringBackupCreation (#9327 ) Summary: The failure looked like this: ``` utilities/backupable/backupable_db_test.cc:3161: Failure Value of: db_chroot_env_->FileExists(prev_manifest_path).IsNotFound() Actual: false Expected: true ``` The failure could be coerced consistently with the following patch: ``` diff --git a/db/db_impl/db_impl_compaction_flush.cc b/db/db_impl/db_impl_compaction_flush.cc index 80410f671..637636791 100644 --- a/db/db_impl/db_impl_compaction_flush.cc +++ b/db/db_impl/db_impl_compaction_flush.cc @@ -2772,6 +2772,8 @@ void DBImpl::BackgroundCallFlush(Env::Priority thread_pri) { if (job_context.HaveSomethingToClean() || job_context.HaveSomethingToDelete() || !log_buffer.IsEmpty()) { mutex_.Unlock(); + bg_cv_.SignalAll(); + sleep(1); TEST_SYNC_POINT("DBImpl::BackgroundCallFlush:FilesFound"); // Have to flush the info logs before bg_flush_scheduled_-- // because if bg_flush_scheduled_ becomes 0 and the lock is ``` The cause was a familiar problem: a manual flush/compaction may return before the files it obsoleted are removed. The solution is just to wait for "scheduled" work to complete, which covers all phases, including cleanup. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9327 Test Plan: after this PR, even the above patch to coerce the bug cannot cause the test to fail. Reviewed By: riversand963 Differential Revision: D33252208 Pulled By: ajkr fbshipit-source-id: 720a7eaca58c7247d221911fffe3d5e1dbf581e9	4 years ago
Sergei Petrunia	1b076e82db	Expose locktree's wait count in RangeLockManagerHandle::Counters (#9289 ) Summary: locktree is a module providing Range Locking. It has a counter for the number of times a lock acquisition request was blocked by an existing conflicting lock and had to wait for it to be released. Expose this counter in RangeLockManagerHandle::Counters::lock_wait_count. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9289 Reviewed By: jay-zhuang Differential Revision: D33079182 Pulled By: riversand963 fbshipit-source-id: 25b1a362d9da247536ab5007bd15900b319f139e	4 years ago
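
Note on b860a42158 (#9305): the LastSequence change described in that entry can be pictured with a small, self-contained sketch. This is not RocksDB code. `SstFileInfo`, `ExactLastSequence`, and `RecoveredSeqno` are hypothetical names introduced only for illustration, and the sketch assumes that each file's largest seqno is known and that WAL replay can only move the recovered seqno upward from the MANIFEST baseline, as the commit message states.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-SST-file metadata (not the RocksDB type).
struct SstFileInfo {
  uint64_t largest_seqno;  // largest sequence number stored in this file
};

// Exact baseline: the largest seqno found in any file the MANIFEST refers to.
// Returns 0 when no files are referenced.
uint64_t ExactLastSequence(const std::vector<SstFileInfo>& files) {
  uint64_t last = 0;
  for (const auto& f : files) {
    last = std::max(last, f.largest_seqno);
  }
  return last;
}

// Recovery: start from the MANIFEST baseline and let replayed WAL entries
// advance it. The result is never smaller than the baseline.
uint64_t RecoveredSeqno(uint64_t manifest_last_sequence,
                        const std::vector<uint64_t>& wal_seqnos) {
  uint64_t seqno = manifest_last_sequence;
  for (uint64_t s : wal_seqnos) {
    seqno = std::max(seqno, s);
  }
  return seqno;
}
```

With an exact baseline and no WAL entries to replay, `RecoveredSeqno(ExactLastSequence(files), {})` equals the largest seqno actually present in the files, which is the property the commit message says matters for correctness testing with unsynced data loss.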


10649 Commits (fa523761176fd69e45aaa26e8e13e2e163177456)