rocksdb

Commit Graph

Author	SHA1	Message	Date
Yanqin Jin	c53db172a1	Fix TestIterate for HashSkipList in db_stress (#5942 ) Summary: Since SeekForPrev (used by Prev) is not supported by HashSkipList when prefix is used, we disable it when stress testing HashSkipList. - Change the default memtablerep to skip list. - Avoid Prev() when memtablerep is HashSkipList and prefix is used. Test Plan (on devserver): ``` $make db_stress $./db_stress -ops_per_thread=10000 -reopen=1 -destroy_db_initially=true -column_families=1 -threads=1 -column_families=1 -memtablerep=prefix_hash $# or simply $./db_stress $./db_stress -memtablerep=prefix_hash ``` Results must print "Verification successful". Pull Request resolved: https://github.com/facebook/rocksdb/pull/5942 Differential Revision: D18017062 Pulled By: riversand963 fbshipit-source-id: af867e59aa9e6f533143c984d7d529febf232fd7	6 years ago
Peter Dillinger	5f8f2fda0e	Refactor / clean up / optimize FullFilterBitsReader (#5941) Summary: FullFilterBitsReader, after creating in BloomFilterPolicy, was responsible for decoding metadata bits. This meant that FullFilterBitsReader::MayMatch had some metadata checks in order to implement "always true" or "always false" functionality in the case of inconsistent or trivial metadata. This made for ugly mixing-of-concerns code and probably had some runtime cost. It also didn't really support plugging in alternative filter implementations with extensions to the existing metadata schema. BloomFilterPolicy::GetFilterBitsReader is now (exclusively) responsible for decoding filter metadata bits and constructing appropriate instances deriving from FilterBitsReader. "Always false" and "always true" derived classes allow FullFilterBitsReader not to be concerned with handling of trivial or inconsistent metadata. This also makes for easy expansion to alternative filter implementations in new, alternative derived classes. This change makes calls to FilterBitsReader::MayMatch necessarily virtual because there's now more than one built-in implementation. Compared with the previous implementation's extra 'if' checks in MayMatch, there's no consistent performance difference, measured by (an older revision of) filter_bench (differences here seem to be within noise): Inside queries... - Dry run (407) ns/op: 35.9996 + Dry run (407) ns/op: 35.2034 - Single filter ns/op: 47.5483 + Single filter ns/op: 47.4034 - Batched, prepared ns/op: 43.1559 + Batched, prepared ns/op: 42.2923 ... - Random filter ns/op: 150.697 + Random filter ns/op: 149.403 ---------------------------- Outside queries... - Dry run (980) ns/op: 34.6114 + Dry run (980) ns/op: 34.0405 - Single filter ns/op: 56.8326 + Single filter ns/op: 55.8414 - Batched, prepared ns/op: 48.2346 + Batched, prepared ns/op: 47.5667 - Random filter ns/op: 155.377 + Random filter ns/op: 153.942 Average FP rate %: 1.1386 Also, the FullFilterBitsReader ctor was responsible for a surprising amount of CPU in production, due in part to inefficient determination of the CACHE_LINE_SIZE used to construct the filter being read. The overwhelming common case (same as my CACHE_LINE_SIZE) is now substantially optimized, as shown with filter_bench with -new_reader_every=1 (old option - see below) (repeatable result): Inside queries... - Dry run (453) ns/op: 118.799 + Dry run (453) ns/op: 105.869 - Single filter ns/op: 82.5831 + Single filter ns/op: 74.2509 ... - Random filter ns/op: 224.936 + Random filter ns/op: 194.833 ---------------------------- Outside queries... - Dry run (aa1) ns/op: 118.503 + Dry run (aa1) ns/op: 104.925 - Single filter ns/op: 90.3023 + Single filter ns/op: 83.425 ... - Random filter ns/op: 220.455 + Random filter ns/op: 175.7 Average FP rate %: 1.13886 However PR#5936 has/will reclaim most of this cost. After that PR, the optimization of this code path is likely negligible, but nonetheless it's clear we aren't making performance any worse. Also fixed inadequate check of consistency between filter data size and num_lines. (Unit test updated.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/5941 Test Plan: previously added unit tests FullBloomTest.CorruptFilters and FullBloomTest.RawSchema Differential Revision: D18018353 Pulled By: pdillinger fbshipit-source-id: 8e04c2b4a7d93223f49a237fd52ef2483929ed9c	6 years ago
Peter Dillinger	fe464bca5c	Fix PlainTableReader not to crash sst_dump (#5940 ) Summary: Plain table SSTs could crash sst_dump because of a bug in PlainTableReader that can leave table_properties_ as null. Even if it was intended not to keep the table properties in some cases, they were leaked on the offending code path. Steps to reproduce: $ db_bench --benchmarks=fillrandom --num=2000000 --use_plain_table --prefix-size=12 $ sst_dump --file=0000xx.sst --show_properties from [] to [] Process /dev/shm/dbbench/000014.sst Sst file format: plain table Raw user collected properties ------------------------------ Segmentation fault (core dumped) Also added missing unit testing of plain table full_scan_mode, and an assertion in NewIterator to check for regression. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5940 Test Plan: new unit test, manual, make check Differential Revision: D18018145 Pulled By: pdillinger fbshipit-source-id: 4310c755e824c4cd6f3f86a3abc20dfa417c5e07	6 years ago
Zhichao Cao	526e3b9763	Enable trace_replay with multi-threads (#5934) Summary: In the current trace replay, all queries are serialized and issued by a single thread, which may not closely simulate the original application's query pattern. This PR implements multi-threaded replay. Users can set the number of threads used to replay the trace. The queries generated from the trace records are scheduled in the thread pool's job queue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5934 Test Plan: test with make check and real trace replay. Differential Revision: D17998098 Pulled By: zhichao-cao fbshipit-source-id: 87eecf6f7c17a9dc9d7ab29dd2af74f6f60212c8	6 years ago
Levi Tamasi	69bd8a2859	Update HISTORY.md with recent BlobDB adjacent changes Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5939 Differential Revision: D18009096 Pulled By: ltamasi fbshipit-source-id: 032a48a302f9da38aecf4055b5a8d4e1dffd9dc7	6 years ago
Yanqin Jin	e60cc0925c	Expose db stress tests (#5937 ) Summary: expose db stress test by providing db_stress_tool.h in public header. This PR does the following: - adds a new header, db_stress_tool.h, in include/rocksdb/ - renames db_stress.cc to db_stress_tool.cc - adds a db_stress.cc which simply invokes a test function. - update Makefile accordingly. Test Plan (dev server): ``` make db_stress ./db_stress ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5937 Differential Revision: D17997647 Pulled By: riversand963 fbshipit-source-id: 1a8d9994f89ce198935566756947c518f0052410	6 years ago
Levi Tamasi	fdc1cb43a6	Support decoding blob indexes in sst_dump (#5926 ) Summary: The patch adds a new command line parameter --decode_blob_index to sst_dump. If this switch is specified, sst_dump prints blob indexes in a human readable format, printing the blob file number, offset, size, and expiration (if applicable) for blob references, and the blob value (and expiration) for inlined blobs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5926 Test Plan: Used db_bench's BlobDB mode to generate SST files containing blob references with and without expiration, as well as inlined blobs with and without expiration (note: the latter are stored as plain values), and confirmed sst_dump correctly prints all four types of records. Differential Revision: D17939077 Pulled By: ltamasi fbshipit-source-id: edc5f58fee94ba35f6699c6a042d5758f5b3963d	6 years ago
Yi Wu	1f9d7c0f54	Fix OnFlushCompleted fired before flush result write to MANIFEST (#5908) Summary: When there are concurrent flush jobs on the same CF, `OnFlushCompleted` can be called before the flush result is installed to the LSM. Fix the issue by passing `FlushJobInfo` through `MemTable`, so that the thread that commits the flush result can fetch the `FlushJobInfo` and fire `OnFlushCompleted` on behalf of the thread that actually wrote the SST. Fix https://github.com/facebook/rocksdb/issues/5892 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5908 Test Plan: Add new test. The test will fail without the fix. Differential Revision: D17916144 Pulled By: riversand963 fbshipit-source-id: e18df67d9533b5baee52ae3605026cdeb05cbe10	6 years ago
Maysam Yabandeh	2c9e9f2a59	Update HISTORY for SeekForPrev bug fix (#5925 ) Summary: Update history for the bug fix in https://github.com/facebook/rocksdb/pull/5907 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5925 Differential Revision: D17952605 Pulled By: maysamyabandeh fbshipit-source-id: 609afcbb2e4087f9153822c4d11193a75a7b0e7a	6 years ago
Yanqin Jin	5ef27dea33	Fix clang analyzer error (#5924 ) Summary: Without this PR, clang analyzer complains. ``` $USE_CLANG=1 make analyze db/compaction/compaction_job_test.cc:161:20: warning: The left operand of '==' is a garbage value if (key.type == kTypeBlobIndex) { ~~~~~~~~ ^ 1 warning generated. ``` Test Plan (on devserver) ``` $USE_CLANG=1 make analyze ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5924 Differential Revision: D17923226 Pulled By: riversand963 fbshipit-source-id: 9d1eb769b5e0de7cb3d89dc90d1cfa895db7fdc8	6 years ago
Levi Tamasi	78b28d80b0	Support non-TTL Puts for BlobDB in db_bench (#5921 ) Summary: Currently, db_bench only supports PutWithTTL operations for BlobDB but not regular Puts. The patch adds support for regular (non-TTL) Puts and also changes the default for blob_db_max_ttl_range to zero, which corresponds to no TTL. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5921 Test Plan: make check ./db_bench -benchmarks=fillrandom -statistics -stats_interval_seconds=1 -duration=90 -num=500000 -use_blob_db=1 -blob_db_file_size=1000000 -target_file_size_base=1000000 (issues Put operations with no TTL) ./db_bench -benchmarks=fillrandom -statistics -stats_interval_seconds=1 -duration=90 -num=500000 -use_blob_db=1 -blob_db_file_size=1000000 -target_file_size_base=1000000 -blob_db_max_ttl_range=86400 (issues PutWithTTL operations with random TTLs in the [0, blob_db_max_ttl_range) interval, as before) Differential Revision: D17919798 Pulled By: ltamasi fbshipit-source-id: b946c3522b836b92b4c157ffbad24f92ba2b0a16	6 years ago
Peter Dillinger	93edd51c4a	bloom_test.cc: include <array> (#5920 ) Summary: Fix build failure on some platforms, reported in issue https://github.com/facebook/rocksdb/issues/5914 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5920 Test Plan: make bloom_test && ./bloom_test Differential Revision: D17918328 Pulled By: pdillinger fbshipit-source-id: b822004d4442de0171db2aeff433677783f7b94e	6 years ago
Levi Tamasi	5f025ea832	BlobDB GC: add SST <-> oldest blob file referenced mapping (#5903 ) Summary: This is groundwork for adding garbage collection support to BlobDB. The patch adds logic that keeps track of the oldest blob file referred to by each SST file. The oldest blob file is identified during flush/ compaction (similarly to how the range of keys covered by the SST is identified), and persisted in the manifest as a custom field of the new file edit record. Blob indexes with TTL are ignored for the purposes of identifying the oldest blob file (since such blob files are cleaned up by the TTL logic in BlobDB). Pull Request resolved: https://github.com/facebook/rocksdb/pull/5903 Test Plan: Added new unit tests; also ran db_bench in BlobDB mode, inspected the manifest using ldb, and confirmed (by scanning the SST files using sst_dump) that the value of the oldest blob file number field matches the contents of the file for each SST. Differential Revision: D17859997 Pulled By: ltamasi fbshipit-source-id: 21662c137c6259a6af70446faaf3a9912c550e90	6 years ago
Levi Tamasi	a59dc843a4	Move blob_index.h to db/ (#5919 ) Summary: Extracted from PR https://github.com/facebook/rocksdb/issues/5903 for technical reasons. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5919 Test Plan: make check Differential Revision: D17910132 Pulled By: ltamasi fbshipit-source-id: 6ecbb8d6e84b2a1d1f28575ad48ac3cc65833eb5	6 years ago
Yanqin Jin	231fffd07c	Add Env::SanitizeEnvOptions (#5885) Summary: Add Env::SanitizeEnvOptions to allow underlying environments to properly configure env options. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5885 Test Plan: ``` make check ``` Differential Revision: D17910327 Pulled By: riversand963 fbshipit-source-id: 86a1ac616e485742c35c4a9cc9f1227c529fc00f	6 years ago
Maysam Yabandeh	a6e615a7ba	Enable partitioned index/filter in stress tests (#5918 ) Summary: This is the 3rd attempt after the revert of https://github.com/facebook/rocksdb/issues/4020 and https://github.com/facebook/rocksdb/issues/5895 The last bug is fixed https://github.com/facebook/rocksdb/pull/5907 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5918 Test Plan: ``` make -j32 crash_test ``` Differential Revision: D17909489 Pulled By: maysamyabandeh fbshipit-source-id: 7dfb8cf998c2d295c86465dd21734593d277887e	6 years ago
Yanqin Jin	6febfd8451	OnTableFileCreationCompleted use "(nil)" for empty file during flush (#5905 ) Summary: Compaction can call OnTableFileCreationCompleted(). If file is empty, "(nil)" is used as the file name. Do the same for flush. Test plan (dev server): ``` make all make check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5905 Differential Revision: D17883285 Pulled By: riversand963 fbshipit-source-id: 6565884adbb00e8023d88b17dfb3b6eb92220b59	6 years ago
Maysam Yabandeh	4e729f9095	Fix SeekForPrev bug with Partitioned Filters and Prefix (#5907 ) Summary: Partition Filters make use of a top-level index to find the partition that might have the bloom hash of the key. The index is with internal key format (before format version 3). Each partition contains the i) blooms of the keys in that range ii) bloom of prefixes of keys in that range, iii) the bloom of the prefix of the last key in the previous partition. When ::SeekForPrev(key), we first perform a prefix bloom test on the SST file. The partition however is identified using the full internal key, rather than the prefix key. The reason is to be compatible with the internal key format of the top-level index. This creates a corner case. Example: - SST k, Partition N: P1K1, P1K2 - SST k, top-level index: P1K2 - SST k+1, Partition 1: P2K1, P3K1 - SST k+1 top-level index: P3K1 When SeekForPrev(P1K3), it should point us to P1K2. However SST k top-level index would reject P1K3 since it is out of range. One possible fix would be to search with the prefix P1 (instead of full internal key P1K3) however the details of properly comparing prefix with full internal key might get complicated. The fix we apply in this PR is to look into the last partition anyway even if the key is out of range. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5907 Differential Revision: D17889918 Pulled By: maysamyabandeh fbshipit-source-id: 169fd7b3c71dbc08808eae5a8340611ebe5bdc1e	6 years ago
Andrew Kryczka	b00761eea6	Fix block cache ID uniqueness for Windows builds (#5844 ) Summary: Since we do not evict a file's blocks from block cache before that file is deleted, we require a file's cache ID prefix is both unique and non-reusable. However, the Windows functionality we were relying on only guaranteed uniqueness. That meant a newly created file could be assigned the same cache ID prefix as a deleted file. If the newly created file had block offsets matching the deleted file, full cache keys could be exactly the same, resulting in obsolete data blocks returned from cache when trying to read from the new file. We noticed this when running on FAT32 where compaction was writing out of order keys due to reading obsolete blocks from its input files. The functionality is documented as behaving the same on NTFS, although I wasn't able to repro it there. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5844 Test Plan: we had a reliable repro of out-of-order keys on FAT32 that was fixed by this change Differential Revision: D17752442 fbshipit-source-id: 95d983f9196cf415f269e19293b97341edbf7e00	6 years ago
Yanqin Jin	bc8b05cb77	Revert "Enable partitioned index/filter in stress tests (#5895 )" (#5904 ) Summary: This reverts commit `2f4e288143`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5904 Differential Revision: D17871282 Pulled By: riversand963 fbshipit-source-id: d210725f8f3b26d8eac25892094da09d9694337e	6 years ago
Yanqin Jin	ddb62d1f29	Remove a webhook due to potential security concern (#5902 ) Summary: As title. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5902 Differential Revision: D17858150 Pulled By: riversand963 fbshipit-source-id: db2cd8a756faf7b9751b2651a22e1b29ca9fecec	6 years ago
Adam Retter	1e9c8d42a0	Fix the rocksjava release Vagrant build on CentOS (#5901 ) Summary: Closes https://github.com/facebook/rocksdb/issues/5873 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5901 Differential Revision: D17869585 fbshipit-source-id: 559472486f1d3ac80c0c7df6c421c4b612b9b7f9	6 years ago
Vijay Nadimpalli	4c49e38f15	MultiGet batching in memtable (#5818) Summary: RocksDB has a MultiGet() API that implements batched key lookup for higher performance (https://github.com/facebook/rocksdb/blob/master/include/rocksdb/db.h#L468). Currently, batching is implemented in BlockBasedTableReader::MultiGet() for SST file lookups. One of the ways it improves performance is by pipelining bloom filter lookups (by prefetching required cachelines for all the keys in the batch, and then doing the probe) and thus hiding the cache miss latency. The same concept can be extended to the memtable as well. This PR involves implementing a pipelined bloom filter lookup in DynamicBloom, and implementing MemTable::MultiGet() that can leverage it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5818 Test Plan: Existing tests Performance Test: Ran the command below, which fills up the memtable, makes sure there are no flushes, and then calls multiget. Ran it on master and on the new change and saw at least 1% performance improvement across all the test runs I did. Sometimes the improvement was up to 5%. TEST_TMPDIR=/data/users/$USER/benchmarks/feature/ numactl -C 10 ./db_bench -benchmarks="fillseq,multireadrandom" -num=600000 -compression_type="none" -level_compaction_dynamic_level_bytes -write_buffer_size=200000000 -target_file_size_base=200000000 -max_bytes_for_level_base=16777216 -reads=90000 -threads=1 -compression_type=none -cache_size=4194304000 -batch_size=32 -disable_auto_compactions=true -bloom_bits=10 -cache_index_and_filter_blocks=true -pin_l0_filter_and_index_blocks_in_cache=true -multiread_batched=true -multiread_stride=4 -statistics -memtable_whole_key_filtering=true -memtable_bloom_size_ratio=10 Differential Revision: D17578869 Pulled By: vjnadimpalli fbshipit-source-id: 23dc651d9bf49db11d22375bf435708875a1f192	6 years ago
anand76	80ad996b35	Make the db_stress reopen loop in OperateDb() more robust (#5893) Summary: The loop in OperateDb() is getting quite complicated with the introduction of multiple key operations such as MultiGet and Reseeks. This is resulting in a number of corner cases that hang db_stress due to synchronization problems during reopen (i.e., when the -reopen=<> option is specified). This PR makes it more robust by ensuring all db_stress threads vote to reopen the DB the exact same number of times. Most of the changes in this diff are due to indentation. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5893 Test Plan: Run crash test Differential Revision: D17823827 Pulled By: anand1976 fbshipit-source-id: ec893829f611ac7cac4057c0d3d99f9ffb6a6dd9	6 years ago
katherine	5b123813f8	Remove deprecated RocksDBCommonHelper and cont_integration.sh (#5889 ) Summary: As titled. RocksDBCommonHelper contains references to legacy APIs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5889 Differential Revision: D17783179 fbshipit-source-id: dcde82a73a311bfa3300ad69189b3a32727134d1	6 years ago
Peter Dillinger	90e285efde	Fix some implicit conversions in filter_bench (#5894 ) Summary: Fixed some spots where converting size_t or uint_fast32_t to uint32_t. Wrapped mt19937 in a new Random32 class to avoid future such traps. NB: I tried using Random32::Uniform (std::uniform_int_distribution) in filter_bench instead of fastrange, but that more than doubled the dry run time! So I added fastrange as Random32::Uniformish. ;) Pull Request resolved: https://github.com/facebook/rocksdb/pull/5894 Test Plan: USE_CLANG=1 build, and manual re-run filter_bench Differential Revision: D17825131 Pulled By: pdillinger fbshipit-source-id: 68feee333b5f8193c084ded760e3d6679b405ecd	6 years ago
Yanqin Jin	167cdc9f17	Support custom env in sst_dump (#5845 ) Summary: This PR allows for the creation of custom env when using sst_dump. If the user does not set options.env or set options.env to nullptr, then sst_dump will automatically try to create a custom env depending on the path to the sst file or db directory. In order to use this feature, the user must call ObjectRegistry::Register() beforehand. Test Plan (on devserver): ``` $make all && make check ``` All tests must pass to ensure this change does not break anything. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5845 Differential Revision: D17678038 Pulled By: riversand963 fbshipit-source-id: 58ecb4b3f75246d52b07c4c924a63ee61c1ee626	6 years ago
Maysam Yabandeh	2f4e288143	Enable partitioned index/filter in stress tests (#5895 ) Summary: This is the 2nd attempt after the revert of https://github.com/facebook/rocksdb/pull/4020 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5895 Test Plan: ``` ./tools/db_crashtest.py blackbox --simple --interval=10 --max_key=10000000 ``` Differential Revision: D17822137 Pulled By: maysamyabandeh fbshipit-source-id: 3d148c0d8cc129080410ff859c04b544223c8ea3	6 years ago
Tomas Kolda	e3a93c9ee1	Fix crash when background task fails (#5879 ) Summary: Fixing crash. Full story in issue: https://github.com/facebook/rocksdb/issues/5878 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5879 Differential Revision: D17812299 Pulled By: anand1976 fbshipit-source-id: 14e5a4fc502ade974583da9692d0ed6e5014613a	6 years ago
Peter Dillinger	46ca51d430	filter_bench - a prelim tool for SST filter benchmarking (#5825 ) Summary: Example: using the tool before and after PR https://github.com/facebook/rocksdb/issues/5784 shows that the refactoring, presumed performance-neutral, actually sped up SST filters by about 3% to 8% (repeatable result): Before: - Dry run ns/op: 22.4725 - Single filter ns/op: 51.1078 - Random filter ns/op: 120.133 After: + Dry run ns/op: 22.2301 + Single filter run ns/op: 47.4313 + Random filter ns/op: 115.9 Only tests filters for the block-based table (full filters and partitioned filters - same implementation; not block-based filters), which seems to be the recommended format/implementation. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5825 Differential Revision: D17804987 Pulled By: pdillinger fbshipit-source-id: 0f18a9c254c57f7866030d03e7fa4ba503bac3c5	6 years ago
Yanqin Jin	457bcfde02	Let TestEnv and FaultInjectEnv use Env of choice (#5886 ) Summary: Instead of hard coding Env::Default in TestEnv and a few other places, use the DBTestBase::env_ that has been deduced from the constructor. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5886 Test Plan: ``` make check ``` Differential Revision: D17773029 Pulled By: riversand963 fbshipit-source-id: 7ce4e5175a487e9d281ea2c3aae3c41bffd44629	6 years ago
lokeshgupta0912	9905101c8c	Replaced some words (#5877 ) Summary: improved Vocabulary Pull Request resolved: https://github.com/facebook/rocksdb/pull/5877 Differential Revision: D17753217 Pulled By: anand1976 fbshipit-source-id: f255418534297e537a2735f0a0546c724b8f7c70	6 years ago
jsteemann	da3b2840cb	save a few redundant container lookups (#5875 ) Summary: This PR eliminates repeated lookups in associative or ordered containers when a single lookup suffices. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5875 Differential Revision: D17753172 Pulled By: anand1976 fbshipit-source-id: 796b02b760082521d8c42a1cb65a76bf0e6c1b8e	6 years ago
anand76	19a97dd139	Fix data block upper bound checking for iterator reseek case (#5883 ) Summary: When an iterator reseek happens with the user specifying a new iterate_upper_bound in ReadOptions, and the new seek position is at the end of the same data block, the Seek() ends up using a stale value of data_block_within_upper_bound_ and may return incorrect results. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5883 Test Plan: Added a new test case DBIteratorTest.IterReseekNewUpperBound. Verified that it failed due to the assertion failure without the fix, and passes with the fix. Differential Revision: D17752740 Pulled By: anand1976 fbshipit-source-id: f9b635ff5d6aeb0e1bef102cf8b2f900efd378e3	6 years ago
Peter Dillinger	9f54446525	Fix type in shift operation in bloom_test (#5882 ) Summary: Broken type for shift in PR#5834. Fixing code means fixing expected values in test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5882 Test Plan: thisisthetest Differential Revision: D17746136 Pulled By: pdillinger fbshipit-source-id: d3c456ed30b433d55fcab6fc7d836940fe3b46b8	6 years ago
anand76	cca87d7722	Fix reopen voting logic in db_stress to prevent hangs (#5876 ) Summary: When multiple operations are performed in a db_stress thread in one loop iteration, the reopen voting logic needs to take that into account. It was doing that for MultiGet, but a new option was introduced recently to do multiple iterator seeks per iteration, which broke it again. Fix the logic to be more robust and agnostic of the type of operation performed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5876 Test Plan: Run db_stress Differential Revision: D17733590 Pulled By: anand1976 fbshipit-source-id: 787f01abefa1e83bba43e0b4f4abb26699b2089e	6 years ago
Peter Dillinger	9e4913ce9d	Add FullBloomTest.CorruptFilters,RawSchema (#5834 ) Summary: There was significant untested logic in FullFilterBitsReader in the handling of serialized Bloom filter bits that cannot be generated by FullFilterBitsBuilder in the current compilation. These now test many of those corner-case behaviors, including bad metadata or filters created with different cache line size than the current compiled-in value. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5834 Test Plan: thisisthetest Differential Revision: D17726372 Pulled By: pdillinger fbshipit-source-id: fb7b8003b5a8e6fb4666fe95206128f3d5835fc7	6 years ago
sdong	d783af1857	Fix a timer bug in MergingIterator::Seek() caused by #5871 (#5874) Summary: Conflict resolution in `846e05005d` (Revert "Merging iterator to avoid child iterator reseek for some cases") left a timer misplaced. Fix it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5874 Test Plan: See it build. Differential Revision: D17705073 fbshipit-source-id: 9bd3a8dc4901ac33c2c6fc5b1091ffbc56a8529f	6 years ago
Yanqin Jin	9f31df8679	Fix compilation error (#5872 ) Summary: Without this fix, compiler complains. ``` $ROCKSDB_NO_FBCODE=1 USE_CLANG=1 make ldb table/block_based/full_filter_block.cc: In constructor ‘rocksdb::FullFilterBlockBuilder::FullFilterBlockBuilder(const rocksdb::SliceTransform, bool, rocksdb::FilterBitsBuilder)’: table/block_based/full_filter_block.cc:20:43: error: declaration of ‘prefix_extractor’ shadows a member of 'this' [-Werror=shadow] FilterBitsBuilder* filter_bits_builder) ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5872 Test Plan: ``` $ROCKSDB_NO_FBCODE=1 make all ``` Differential Revision: D17690058 Pulled By: riversand963 fbshipit-source-id: 19e3d9bd86e1123847095240e73d30da5d66240e	6 years ago
sdong	846e05005d	Revert "Merging iterator to avoid child iterator reseek for some cases (#5286)" (#5871) Summary: This reverts commit `9fad3e21eb`. Iterator verification in stress tests sometimes fails with the assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely linked to https://github.com/facebook/rocksdb/pull/5286 together with https://github.com/facebook/rocksdb/pull/5468, as the former PR causes some child iterators' seeks to be skipped, so that the upper bound condition fails to be updated there. Strictly speaking, the former PR was merged before the latter one, but the latter one feels like a more important improvement, so I chose to revert the former one for now. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5871 Differential Revision: D17689196 fbshipit-source-id: 4ded5be68f67bee2782d31a29cb72ea68f59dd8c	6 years ago
sdong	503a756e42	Fix clang analyze warning in db_stress (#5870 ) Summary: Recent changes trigger clang analyze warning. Fix it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5870 Test Plan: "USE_CLANG=1 TEST_TMPDIR=/dev/shm/rocksdb OPT=-g make -j60 analyze" and make sure it passes. Differential Revision: D17682533 fbshipit-source-id: 02716f2a24572550a22db4bbe9b54d4872dfae32	6 years ago
Jay Zhuang	51413e0a85	Fix a compile error (#5864 ) Summary: ``` tools/block_cache_analyzer/block_cache_trace_analyzer.cc:653:48: error: implicit conversion loses integer precision: 'uint64_t' (aka 'unsigned long long') to 'std::__1::linear_congruential_engine<unsigned int, 48271, 0, 2147483647>::result_type' (aka 'unsigned int') [-Werror,-Wshorten-64-to-32] std::default_random_engine rand_engine(env_->NowMicros()); ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5864 Differential Revision: D17668962 fbshipit-source-id: e08fa58b2a78a8dd8b334862b5714208f696b8ab	6 years ago
sdong	69c4ccb970	Fix three more db_stress bugs (#5867 ) Summary: Two more bug fixes in db_stress: 1. this is to complete the fix of the regression bug causing overflowing when supporting FLAGS_prefix_size = -1. 2. Fix regression bug in compare iterator itself: (1) when creating control iterator, which used the same read option as the normal iterator by mistake; (2) the logic of comparing has some problems. Fix them. (3) disable validation for lower bound now, which generated some wildly different results. Disabling it to make normal tests pass while investigating it. 3. Cleaning up snapshots in verification failure cases. Memory is leaked otherwise. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5867 Test Plan: Run "make crash_test" for a while and see at least 1 is fixed. Differential Revision: D17671712 fbshipit-source-id: 011f98ea1a72aef23e19ff28656830c78699b402	6 years ago
Yanqin Jin	643df920d8	Explicitly declare atomic flush incompatible with pipelined write (#5860 ) Summary: Atomic flush is incompatible with pipelined write. At least now. If pipelined write is enabled, a thread performing write can exit the write thread and start inserting into memtables. Consequently a thread performing flush will enter write thread and race with memtable insertion by the former. This will cause undefined result in terms of data persistence. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5860 Test Plan: ``` $make all && make check ``` Differential Revision: D17638944 Pulled By: riversand963 fbshipit-source-id: abc578dc49a5dbe41bc5adcecf448f8e042a6d49	6 years ago
sdong	5cd8aaf75f	db_stress: fix run time error when prefix_size = -1 (#5862 ) Summary: When prefix_size = -1, stress test crashes with run time error because of overflow. Fix it by not using -1 but 7 in prefix scan mode. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5862 Test Plan: Run python -u tools/db_crashtest.py --simple whitebox --random_kill_odd \ 888887 --compression_type=zstd and see it doesn't crash. Differential Revision: D17642313 fbshipit-source-id: f029e7651498c905af1b1bee6d310ae50cdcda41	6 years ago
sdong	679a45d0cb	crash_test to do some verification for prefix extractor and iterator bounds. (#5846 ) Summary: For now, crash_test is not able to report any failure for the logic related to iterator upper, lower bounds or iterators, or reseek. These are features prone to errors. Improve db_stress in several ways: (1) For each iterator run, reseek up to 3 times. (2) For every iterator, create control iterator with upper or lower bound, with total order seek. Compare the results with the iterator. (3) Make simple crash test to avoid prefix size to have more coverage. (4) make prefix_size = 0 a valid size and -1 to indicate disabling prefix extractor. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5846 Test Plan: Manually hack the code to create wrong results and see they are caught by the tool. Differential Revision: D17631760 fbshipit-source-id: acd460a177bd2124a5ffd7fff490702dba63030b	6 years ago
Chen, You	51185592fd	Add unordered write option rocksjava (#5839 ) Summary: Add unordered_write option api and related ut to rocksjava Pull Request resolved: https://github.com/facebook/rocksdb/pull/5839 Differential Revision: D17604446 Pulled By: maysamyabandeh fbshipit-source-id: c6b07e85ca9d5e3a92973ddb6ab2bc079e53c9c1	6 years ago
Yanqin Jin	ae45835703	Add TryCatchUpWithPrimary to StackableDB (#5855 ) Summary: as title. Test Plan (on devserver): ``` $make all && make check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5855 Differential Revision: D17615125 Pulled By: riversand963 fbshipit-source-id: bd6ed8cf59eafff41f0d1fc044f39e8f3573172a	6 years ago
sdong	76e951dbb1	Add a unit test to reproduce a corruption bug (#5851 ) Summary: This is a bug that occasionally shows up in crash test, and this unit test is to reproduce it. The bug is as follows: 1. Database has multiple CFs. 2. Between one DB restart, the last log file is corrupted in the middle (not the tail) 3. During restart, DB crashes between flushes between two CFs. The DB will fail to be opened again with error "SST file is ahead of WALs" Pull Request resolved: https://github.com/facebook/rocksdb/pull/5851 Test Plan: Run the test itself. Differential Revision: D17614721 fbshipit-source-id: 1b0abce49b203a76a039e38e76bc940429975f20	6 years ago
Maysam Yabandeh	6652c94f59	Fix a bug in format_version 3 + partition filters + prefix search (#5835 ) Summary: Partitioned filters make use of a top-level index to find the partition in which the filter resides. The top-level index has a key per partition. The key is guaranteed to be larger than or equal to any key in that partition. When used with format_version 3, which excludes the sequence number from index keys, the separator key in the index could be equal to the prefix of the keys in the next partition. In this way, when searching for the key, the top-level index will lead us to the previous partition, which has no key with that prefix. The prefix bloom test thus returns false, although the prefix exists in the bloom of the next partition. The patch fixes that by a hack: It always adds the prefix of the first key of the next partition to the bloom of the current partition. In this way, in the corner cases that the index will lead us to the previous partition, we still can find the bloom filter there. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5835 Differential Revision: D17513585 Pulled By: maysamyabandeh fbshipit-source-id: e2d1ff26c759e6e03875c4d57f4228316ecf50e9	6 years ago
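
The corner case described in the last commit above (#5835) can be modelled in a few lines. This is only an illustrative sketch, not RocksDB code: plain Python sets stand in for the per-partition Bloom filters, a sorted list stands in for the top-level index, and the names `PREFIX_LEN`, `build_filters` and `prefix_may_match` are hypothetical. It assumes a fixed 3-byte prefix and a separator key `"abc"` that happens to equal the prefix of the keys in the next partition.

```python
from bisect import bisect_left

PREFIX_LEN = 3  # assumption: a fixed-length prefix extractor of 3 bytes


def prefix(key):
    """Return the prefix the (hypothetical) extractor would produce."""
    return key[:PREFIX_LEN]


def build_filters(partitions, add_next_prefix):
    """Build one prefix set per partition; a set stands in for a Bloom filter.

    With add_next_prefix=True, the prefix of the first key of the next
    partition is also added to the current partition, as the commit describes.
    """
    filters = [{prefix(k) for k in keys} for keys in partitions]
    if add_next_prefix:
        for i in range(len(partitions) - 1):
            filters[i].add(prefix(partitions[i + 1][0]))
    return filters


def prefix_may_match(separators, filters, target_prefix):
    """Pick the first partition whose separator is >= the target, then test it."""
    i = bisect_left(separators, target_prefix)
    if i == len(filters):
        return False
    return target_prefix in filters[i]


# Two partitions; every separator is >= all keys in its own partition.
partitions = [["aaa1", "aab2"], ["abc1", "abc2"]]
# "abc" separates "aab2" from "abc1" and equals the next partition's prefix.
separators = ["abc", "abc2"]

without_fix = prefix_may_match(separators, build_filters(partitions, False), "abc")
with_fix = prefix_may_match(separators, build_filters(partitions, True), "abc")
print(without_fix, with_fix)
```

In this model the lookup for prefix `"abc"` lands on the first partition in both cases; only the variant that copies the next partition's first prefix into the current filter reports a possible match.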

8604 Commits (77d5ba78879ab90d93a5ff2373c0be3ff8153d5d)