rocksdb

Commit Graph

Author	SHA1	Message	Date
sdong	2d1d51d385	db_stress: deep clean directory before checkpoint (#7039 ) Summary: We see crash test occassionally fails with "A checkpoint operation failed with: Invalid argument: Directory exists". The suspicious is that the directory fails to be deleted because some trash files. Deep clean the directory after a DestroyDB() call. Also add more debugging printf in case it fails. Also, preserve the DB if verification fails. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7039 Test Plan: Run db_stress with low --checkpoint_one_in value Reviewed By: riversand963 Differential Revision: D22271694 fbshipit-source-id: 6a9b2abb664fc69a4dc666741df4f6b23703cd6d	4 years ago
Peter Dillinger	5b2bbacb6f	Minimize memory internal fragmentation for Bloom filters (#6427 ) Summary: New experimental option BBTO::optimize_filters_for_memory builds filters that maximize their use of "usable size" from malloc_usable_size, which is also used to compute block cache charges. Rather than always "rounding up," we track state in the BloomFilterPolicy object to mix essentially "rounding down" and "rounding up" so that the average FP rate of all generated filters is the same as without the option. (YMMV as heavily accessed filters might be unluckily lower accuracy.) Thus, the option near-minimizes what the block cache considers as "memory used" for a given target Bloom filter false positive rate and Bloom filter implementation. There are no forward or backward compatibility issues with this change, though it only works on the format_version=5 Bloom filter. With Jemalloc, we see about 10% reduction in memory footprint (and block cache charge) for Bloom filters, but 1-2% increase in storage footprint, due to encoding efficiency losses (FP rate is non-linear with bits/key). Why not weighted random round up/down rather than state tracking? By only requiring malloc_usable_size, we don't actually know what the next larger and next smaller usable sizes for the allocator are. We pick a requested size, accept and use whatever usable size it has, and use the difference to inform our next choice. This allows us to narrow in on the right balance without tracking/predicting usable sizes. Why not weight history of generated filter false positive rates by number of keys? This could lead to excess skew in small filters after generating a large filter. Results from filter_bench with jemalloc (irrelevant details omitted): (normal keys/filter, but high variance) $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9 Build avg ns/key: 29.6278 Number of filters: 5516 Total size (MB): 200.046 Reported total allocated memory (MB): 220.597 Reported internal fragmentation: 10.2732% Bits/key stored: 10.0097 Average FP rate %: 0.965228 $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory Build avg ns/key: 30.5104 Number of filters: 5464 Total size (MB): 200.015 Reported total allocated memory (MB): 200.322 Reported internal fragmentation: 0.153709% Bits/key stored: 10.1011 Average FP rate %: 0.966313 (very few keys / filter, optimization not as effective due to ~59 byte internal fragmentation in blocked Bloom filter representation) $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9 Build avg ns/key: 29.5649 Number of filters: 162950 Total size (MB): 200.001 Reported total allocated memory (MB): 224.624 Reported internal fragmentation: 12.3117% Bits/key stored: 10.2951 Average FP rate %: 0.821534 $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory Build avg ns/key: 31.8057 Number of filters: 159849 Total size (MB): 200 Reported total allocated memory (MB): 208.846 Reported internal fragmentation: 4.42297% Bits/key stored: 10.4948 Average FP rate %: 0.811006 (high keys/filter) $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9 Build avg ns/key: 29.7017 Number of filters: 164 Total size (MB): 200.352 Reported total allocated memory (MB): 221.5 Reported internal fragmentation: 10.5552% Bits/key stored: 10.0003 Average FP rate %: 0.969358 $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory Build avg ns/key: 30.7131 Number of filters: 160 Total size (MB): 200.928 Reported total allocated memory (MB): 200.938 Reported internal fragmentation: 0.00448054% Bits/key stored: 10.1852 Average FP rate %: 0.963387 And from db_bench (block cache) with jemalloc: $ ./db_bench -db=/dev/shm/dbbench.no_optimize -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false $ ./db_bench -db=/dev/shm/dbbench -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -optimize_filters_for_memory -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false $ (for FILE in /dev/shm/dbbench.no_optimize/.sst; do ./sst_dump --file=$FILE --show_properties \| grep 'filter block' ; done) \| awk '{ t += $4; } END { print t; }' 17063835 $ (for FILE in /dev/shm/dbbench/.sst; do ./sst_dump --file=$FILE --show_properties \| grep 'filter block' ; done) \| awk '{ t += $4; } END { print t; }' 17430747 $ #^ 2.1% additional filter storage $ ./db_bench -db=/dev/shm/dbbench.no_optimize -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000 rocksdb.block.cache.index.add COUNT : 33 rocksdb.block.cache.index.bytes.insert COUNT : 8440400 rocksdb.block.cache.filter.add COUNT : 33 rocksdb.block.cache.filter.bytes.insert COUNT : 21087528 rocksdb.bloom.filter.useful COUNT : 4963889 rocksdb.bloom.filter.full.positive COUNT : 1214081 rocksdb.bloom.filter.full.true.positive COUNT : 1161999 $ #^ 1.04 % observed FP rate $ ./db_bench -db=/dev/shm/dbbench -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -optimize_filters_for_memory -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000 rocksdb.block.cache.index.add COUNT : 33 rocksdb.block.cache.index.bytes.insert COUNT : 8448592 rocksdb.block.cache.filter.add COUNT : 33 rocksdb.block.cache.filter.bytes.insert COUNT : 18220328 rocksdb.bloom.filter.useful COUNT : 5360933 rocksdb.bloom.filter.full.positive COUNT : 1321315 rocksdb.bloom.filter.full.true.positive COUNT : 1262999 $ #^ 1.08 % observed FP rate, 13.6% less memory usage for filters (Due to specific key density, this example tends to generate filters that are "worse than average" for internal fragmentation. "Better than average" cases can show little or no improvement.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/6427 Test Plan: unit test added, 'make check' with gcc, clang and valgrind Reviewed By: siying Differential Revision: D22124374 Pulled By: pdillinger fbshipit-source-id: f3e3aa152f9043ddf4fae25799e76341d0d8714e	4 years ago
Peter Dillinger	88b4210701	Remove racially charged terms "whitelist" and "blacklist" (#7008 ) Summary: We don't need them. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7008 Test Plan: "make check" and ensure "make crash_test" starts Reviewed By: ajkr Differential Revision: D22143838 Pulled By: pdillinger fbshipit-source-id: 72c8e16603abc59f4954e304466bc4dc1f58f94e	4 years ago
Andrew Kryczka	775dc623ad	add `CompactionFilter` to stress/crash tests (#6988 ) Summary: Added a `CompactionFilter` that is aware of the stress test's expected state. It only drops key versions that are already covered according to the expected state. It is incompatible with snapshots (same as all `CompactionFilter`s), so disables all snapshot-related features when used in the crash test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6988 Test Plan: running a minified blackbox crash test ``` $ TEST_TMPDIR=/dev/shm python tools/db_crashtest.py blackbox --max_key=1000000 -write_buffer_size=1048576 -max_bytes_for_level_base=4194304 -target_file_size_base=1048576 -value_size_mult=33 --interval=10 --duration=3600 ``` Reviewed By: anand1976 Differential Revision: D22072888 Pulled By: ajkr fbshipit-source-id: 727b9d7a90d5eab18be0ec6cd5a810712ac13320	4 years ago
Yanqin Jin	15d9f28da5	Add stress test for best-efforts recovery (#6819 ) Summary: Add crash test for the case of best-efforts recovery. After a certain amount of time, we kill the db_stress process, randomly delete some certain table files and restart db_stress. Given the randomness of file deletion, it is difficult to verify against a reference for data correctness. Therefore, we just check that the db can restart successfully. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6819 Test Plan: ``` ./db_stress -best_efforts_recovery=true -disable_wal=1 -reopen=0 ./db_stress -best_efforts_recovery=true -disable_wal=0 -skip_verifydb=1 -verify_db_one_in=0 -continuous_verification_interval=0 make crash_test_with_best_efforts_recovery ``` Reviewed By: anand1976 Differential Revision: D21436753 Pulled By: riversand963 fbshipit-source-id: 0b3605c922a16c37ed17d5ab6682ca4240e47926	4 years ago
Ziyue Yang	e619a20e93	Add an option for parallel compression in for db_stress (#6722 ) Summary: This commit adds an `compression_parallel_threads` option in db_stress. It also fixes the naming of parallel compression option in db_bench to keep it aligned with others. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6722 Reviewed By: pdillinger Differential Revision: D21091385 fbshipit-source-id: c9ba8c4e5cc327ff9e6094a6dc6a15fcff70f100	5 years ago
sdong	73523baeb1	crash_test to cover options.avoid_flush_during_recovery (#6712 ) Summary: Options.avoid_flush_during_recovery is uncovered in crash_test. Add the coverage with a chance of 1/8, as it is a less frequently used options. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6712 Test Plan: Run crash_test and see the option can be used or not used by chance. Reviewed By: ltamasi Differential Revision: D21056566 fbshipit-source-id: c3b1521517cfc204786e6ef8c6acd7fffda64793	5 years ago
Yueh-Hsuan Chiang	5801af4646	Add env_fault_injection argument to db_stress (#6687 ) Summary: Add env_fault_injection argument to db_stress. When enabled, FaultInjectionTestEnv will be used instead. Currently this option does not support running with other env setting. This will allow us to later manually produce error when running db_crashtest. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6687 Test Plan: make db_stress -j32 ./db_stress --env_fault_injection ./db_stress --env_fault_injection --hdfs // expect error message Reviewed By: ajkr Differential Revision: D21014683 Pulled By: yhchiang fbshipit-source-id: 0724aeac37efd57adb72a37defe6dbd3bfa8106a	5 years ago
anand76	5c19a441c4	Fault injection in db_stress (#6538 ) Summary: This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538 Test Plan: crash_test make check Reviewed By: riversand963 Differential Revision: D20714347 Pulled By: anand1976 fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7	5 years ago
Levi Tamasi	217ce20021	Remove GetSortedWalFiles/GetCurrentWalFile from the crash test (#6491 ) Summary: Currently, `db_stress` tests a randomly picked one of `GetLiveFiles`, `GetSortedWalFiles`, and `GetCurrentWalFile` with a 1/N chance when the command line parameter `get_live_files_and_wal_files_one_in` is specified. The problem is that `GetSortedWalFiles` and `GetCurrentWalFile` are unreliable in the sense that they can return errors if another thread removes a WAL file while they are executing (which is a perfectly plausible and legitimate scenario). The patch splits this command line parameter into three (one for each API), and changes the crash test script so that only `GetLiveFiles` is tested during our continuous crash test runs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6491 Test Plan: ``` make check python tools/db_crashtest.py whitebox ``` Reviewed By: siying Differential Revision: D20312200 Pulled By: ltamasi fbshipit-source-id: e7c3481eddfe3bd3d5349476e34abc9eee5b7dc8	5 years ago
Yanqin Jin	58918d4ccc	Use correct Env for DestroyDB in stress test (#6539 ) Summary: When using custom Env, trying to call DestroyDB() with default Options will fail. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6539 Test Plan: ./db_stress Differential Revision: D20476204 Pulled By: riversand963 fbshipit-source-id: 612c6754660cc9b5bb3e9c2dbb2f6ecd7f648797	5 years ago
Andrew Kryczka	f52db84650	support SstFileManager in db_stress (#6454 ) Summary: Add some flags for configuring an SstFileManager. An SstFileManager is only created when one or more of these flags are set. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6454 Test Plan: - ran it a while: ``` $ python ./tools/db_crashtest.py blackbox --simple -max_key=100000 -write_buffer_size=131072 -target_file_size_base=131072 -max_bytes_for_level_base=524288 -value_size_mult=33 --interval=10 -max_background_compactions=4 -max_background_flushes=2 -sst_file_manager_bytes_per_sec=1048576 ``` - verified with strace the SstFileManager is behaving as configured: ``` $ strace -fp `pidof db_stress` -e ftruncate,unlink ... [pid 3074805] ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>, 67423) = 0 [pid 3074805] ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>, 51039) = 0 [pid 3074805] ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>, 34655) = 0 [pid 3074805] ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>, 18271) = 0 [pid 3074805] ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>, 1887) = 0 [pid 3074805] unlink("/tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash") = 0 ... ``` Differential Revision: D20103315 Pulled By: ajkr fbshipit-source-id: b3e1092747157459d244b047947a979b85c98f48	5 years ago
sdong	fdf882ded2	Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433 ) Summary: When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433 Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag. Differential Revision: D19977691 fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e	5 years ago
Maysam Yabandeh	01ab882ba3	Fix release warning for unused bg_canceled Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6357 Differential Revision: D19670931 Pulled By: maysamyabandeh fbshipit-source-id: d528c4c7f9450f1f38b9d2a36e0d5d0865b39be9	5 years ago
Maysam Yabandeh	2243030bc5	Cancel bg jobs before deleting WritePrepared DB in stress tests (#6355 ) Summary: Background jobs in WritePrepared DB might access the db via a snapshot checker callback. The stress tests therefore should cancel background jobs before deleting the db in ::Reopen. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6355 Differential Revision: D19664132 Pulled By: maysamyabandeh fbshipit-source-id: 6060a830e8aad0015c10448286ad37c8a346ac01	5 years ago
Burton Li	c9a5e48762	fix build warnnings on MSVC (#6309 ) Summary: Fix build warnings on MSVC. siying Pull Request resolved: https://github.com/facebook/rocksdb/pull/6309 Differential Revision: D19455012 Pulled By: ltamasi fbshipit-source-id: 940739f2c92de60e47cc2bed8dd7f921459545a9	5 years ago
sdong	8f2bee6747	Add ReadOptions.auto_prefix_mode (#6314 ) Summary: Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6314 Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run. Differential Revision: D19458717 fbshipit-source-id: 51c1bcc5cdd826c2469af201979a39600e779bce	5 years ago
sdong	1244abef66	Stress Test: relax prefix iterator check condition (#6269 ) Summary: Right now, when validating prefix iterator, if control iterator is invalidate but prefix iterator shows value, we determine it as a test failure. However, this fails to consider the case where a file or memtable containing a tombstone is filtered out by a prefix bloom filter. The fix is to relax the check in this case. If we are out of prefix range, then ignore the check. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6269 Test Plan: Run crash_test for a short while and it still passes. Differential Revision: D19317594 fbshipit-source-id: b964a1cdc1df5efe439d4b32f8023e1fbc8598c1	5 years ago
Maysam Yabandeh	f4a378be3e	Print out non-ok DB::Open status in db_stress (#6272 ) Summary: The crash test is failing with non-ok status after TransactionDB::Open. This patch adds more debugging information. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6272 Differential Revision: D19314527 Pulled By: maysamyabandeh fbshipit-source-id: d45ecb0f2144e052fb4b5fdd483150440991a3b4	5 years ago
Maysam Yabandeh	7c98d71567	Print before AddErrors in stress tests (#6256 ) Summary: Stress tests count number of errors and report them at the end. Not all the cases are accompanied with a log line which makes debugging difficult. The patch adds a log line to the remaining cases. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6256 Differential Revision: D19268785 Pulled By: maysamyabandeh fbshipit-source-id: bdabcaa5c5c7edcb4ce4f25e38fd8a3fd9c7700b	5 years ago
Maysam Yabandeh	48a678b7c9	Prevent an incompatible combination of options (#6254 ) Summary: allow_concurrent_memtable_write is incompatible with non-zero max_successive_merges. Although we check this at runtime, we currently don't prevent the user from setting this combination in options. This has led to stress tests to fail with this combination is tried in ::SetOptions. The patch fixes that. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6254 Differential Revision: D19265819 Pulled By: maysamyabandeh fbshipit-source-id: 47f2e2dc26fe0972c7152f4da15dadb9703f1179	5 years ago
anand76	d4da412864	Add Transaction::MultiGet to db_stress (#6227 ) Summary: Call Transaction::MultiGet from TestMultiGet() in db_stress. We add some Puts/Merges/Deletes into the transaction in order to exercise MultiGetFromBatchAndDB. There is no data verification on read, just check status. There is currently no read data verification in db_stress as it requires synchronization with writes. It needs to be tackled separately. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6227 Test Plan: make crash_test_with_txn Differential Revision: D19204611 Pulled By: anand1976 fbshipit-source-id: 770d0e30d002e88626c264c58103f1d709bb060c	5 years ago
sdong	79cc8dc29b	db_stress: cover approximate size (#6213 ) Summary: db_stress to execute DB::GetApproximateSizes() with randomized keys and options. Return value is not validated but error will be reported. Two ways to generate the range keys: (1) two random keys; (2) a small range. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6213 Test Plan: (1) run "make crash_test" for a while; (2) hack the code to ingest some errors to see it is reported. Differential Revision: D19204665 fbshipit-source-id: 652db36f13bcb5a3bd8fe4a10c0aa22a77a0bce2	5 years ago
sdong	338c149b92	crash_test to cover bottommost compression and some other changes (#6215 ) Summary: Several improvements to crash_test/stress_test: (1) Stress_test to support an parameter of bottommost compression (2) Rename those FLAGS_* variables that are not gflags to avoid confusion (3) Crash_test to randomly generate compression type for bottommost compression with half the chance. (4) Stress_test to sanitize unsupported compression type to snappy, so that crash_test to cover all possible compression types and people don't need to worry about they don't support all comrpession types in their environment. (5) In crash_test, when generating db_stress command, sort arguments in alphabeta order, so that it is easier to find value for a specific argument. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6215 Test Plan: Run "make crash_test" for a while and see the botommost option shown in LOG files. Differential Revision: D19171255 fbshipit-source-id: d7001e246c4ff9ee5760776eea0be97738650735	5 years ago
sdong	e55c2b3f0b	db_stress: improvements in TestIterator (#6166 ) Summary: 1. Cover SeekToFirst() and SeekToLast(). 2. Try to record the history of iterator operations. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6166 Test Plan: Do some manual changes in the code to cover the failure cases and see the error printing is correct and SeekToFirst() and SeekToLast() sometimes show up. Differential Revision: D19047079 fbshipit-source-id: 1ed616f919fe4d32c0a021fc37932a7bd3063bcd	5 years ago
Zhichao Cao	f89dea4fec	db_stress: Added the verification for GetLiveFiles, GetSortedWalFiles and GetCurrentWalFile (#6224 ) Summary: Add the verification in operateDB to verify GetLiveFiles, GetSortedWalFiles and GetCurrentWalFile. The test will be called every 1 out of N, N is decided by get_live_files_and_wal_files_one_i, whose default is 1000000. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6224 Test Plan: pass db_stress default run. Differential Revision: D19183358 Pulled By: zhichao-cao fbshipit-source-id: 20073cf72ede77a3e0d3cf5f28304f1f605d2b1a	5 years ago
Levi Tamasi	786c3d45ed	Support BlobDB in db_stress (#6230 ) Summary: The patch adds support for BlobDB to `db_stress`. Note that BlobDB currently does not support (amongst other features) Column Families or the `SingleDelete` API, so for now, those should be disabled on the command line when running `db_stress` in BlobDB mode (using `-column_families=1` and `-nooverwritepercent=0`, respectively). Also, some BlobDB features that do not go well with the verification logic in `db_stress` like TTL and FIFO eviction are not supported currently. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6230 Test Plan: ``` ./db_stress -max_key=100000 -use_blob_db -column_families=1 -nooverwritepercent=0 -reopen=1 -blob_db_file_size=1000000 -target_file_size_base=1000000 -blob_db_enable_gc -blob_db_gc_cutoff=0.1 -blob_db_min_blob_size=10 -blob_db_bytes_per_sync=16384 ``` Differential Revision: D19191476 Pulled By: ltamasi fbshipit-source-id: 35840452af8c5e6095249c7fd9a53a119a0985fc	5 years ago
Yanqin Jin	670a916d01	Add more verification to db_stress (#6173 ) Summary: Currently, db_stress performs verification by calling `VerifyDb()` at the end of test and optionally before tests start. In case of corruption or incorrect result, it will be too late. This PR adds more verification in two ways. 1. For cf consistency test, each test thread takes a snapshot and verifies every N ops. N is configurable via `-verify_db_one_in`. This option is not supported in other stress tests. 2. For cf consistency test, we use another background thread in which a secondary instance periodically tails the primary (interval is configurable). We verify the secondary. Once an error is detected, we terminate the test and report. This does not affect other stress tests. Test plan (devserver) ``` $./db_stress -test_cf_consistency -verify_db_one_in=0 -ops_per_thread=100000 -continuous_verification_interval=100 $./db_stress -test_cf_consistency -verify_db_one_in=1000 -ops_per_thread=10000 -continuous_verification_interval=0 $make crash_test ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6173 Differential Revision: D19047367 Pulled By: riversand963 fbshipit-source-id: aeed584ad71f9310c111445f34975e5ab47a0615	5 years ago
Peter Dillinger	873331fe49	Refactor pulling out parts of StressTest::OperateDb (#6195 ) Summary: Complete some refactoring called for in https://github.com/facebook/rocksdb/issues/6148. Somehow I got some 'make format' in here for files I didn't change, but that should be OK. (I'm not sure why "hide whitespace changes" doesn't seem to help in review.) Not addressed in this PR: some operations simply print to stdout rather than failing on discovering a bad status or inconsistency. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6195 Differential Revision: D19131067 Pulled By: pdillinger fbshipit-source-id: 4f416e6b792023989ef119f385fe122426cb825d	5 years ago
anand76	2afea29762	Add VerifyChecksum() to db_stress (#6203 ) Summary: Add an option to db_stress, verify_checksum_one_in, to call DB::VerifyChecksum() once every N ops. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6203 Differential Revision: D19145753 Pulled By: anand1976 fbshipit-source-id: d09edf21f309ad53aa40dd25b7a563d50665fd8b	5 years ago
Maysam Yabandeh	68d5d82d1f	Wait for CancelAllBackgroundWork before Close in db stress (#6191 ) Summary: In https://github.com/facebook/rocksdb/issues/6174 we fixed the stress test to respect the CancelAllBackgroundWork + Close order for WritePrepared transactions. The fix missed to take into account that some invocation of CancelAllBackgroundWork are with wait=false parameter which essentially breaks the order. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6191 Differential Revision: D19102709 Pulled By: maysamyabandeh fbshipit-source-id: f4e7b5fdae47ff1c1ac284ba1cf67d5d3f3d03eb	5 years ago
sdong	bcc372c0c3	Add some new options to crash_test (#6176 ) Summary: Several options are trivially added to crash test and random values are picked. Made simple test run non-dynamic level and normal test run dynamic level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6176 Test Plan: Run crash_test and watch the printing Differential Revision: D19053955 fbshipit-source-id: 958cb43c968541ebd87ed4d91e778bd1d40e7502	5 years ago
sdong	35126dd874	db_stress: preserve all historic manifest files (#6142 ) Summary: compaction history is stored in manifest files. Preserve all of them in db_stress would help debugging. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6142 Test Plan: Run db_stress and observe that manifest files are preserved. Run whole crash_test and see how DB directory looks like. Differential Revision: D19047026 fbshipit-source-id: f83c3e0bb5332b1b4768be5dcee56a24f9b760a9	5 years ago
Maysam Yabandeh	4b97812da8	Add long-running snapshots to stress tests (#6171 ) Summary: Current implementation holds on to 10% of snapshots for 10x longer, and 1% of snapshots 100x longer. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6171 Test Plan: ``` make -j32 crash_test Differential Revision: D19038399 Pulled By: maysamyabandeh fbshipit-source-id: 75da2dbb5c47a0b3f37d299b8719e392b73b42c0	5 years ago
Maysam Yabandeh	349bd3ed82	CancelAllBackgroundWork before Close in db stress (#6174 ) Summary: Close asserts that there is no unreleased snapshots. For WritePrepared transaction, this means that the background work that holds on a snapshot must be canceled first. Update the stress tests to respect the sequence. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6174 Test Plan: ``` make -j32 crash_test Differential Revision: D19057322 Pulled By: maysamyabandeh fbshipit-source-id: c9e9e24f779bbfb0ab72c2717e34576c01bc6362	5 years ago
Peter Dillinger	58d46d1915	Add useful idioms to Random API (OneInOpt, PercentTrue) (#6154 ) Summary: And clean up related code, especially in stress test. (More clean up of db_stress_test_base.cc coming after this.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/6154 Test Plan: make check, make blackbox_crash_test for a bit Differential Revision: D18938180 Pulled By: pdillinger fbshipit-source-id: 524d27621b8dbb25f6dff40f1081e7c00630357e	5 years ago
Maysam Yabandeh	fec7302a9d	Enable unordered_write in stress tests (#6164 ) Summary: With WritePrepared transactions configured with two_write_queues, unordered_write will offer the same guarantees as vanilla rocksdb and thus can be enabled in stress tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6164 Test Plan: ``` make -j32 crash_test_with_txn Differential Revision: D18991899 Pulled By: maysamyabandeh fbshipit-source-id: eece5e96b4169b67d7931e5c0afca88540a113e1	5 years ago
Maysam Yabandeh	8613ee2e94	Enable all txn write policies in crash test (#6158 ) Summary: Currently the default txn write policy in crash tests is WRITE_PREPARED. The patch randomly picks the write policy at the start of the crash test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6158 Test Plan: ``` make -j32 crash_test_with_txn ``` Differential Revision: D18946307 Pulled By: maysamyabandeh fbshipit-source-id: f77d7a94f99a08791ef9626da153d284bf521950	5 years ago
Yanqin Jin	383f5071f0	Add SyncWAL to db_stress (#6149 ) Summary: Add SyncWAL to db_stress. Specify with `-sync_wal_one_in=N` so that it will be called once every N operations on average. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6149 Test Plan: ``` $make db_stress $./db_stress -sync_wal_one_in=100 -ops_per_thread=100000 ``` Differential Revision: D18922529 Pulled By: riversand963 fbshipit-source-id: 4c0b8cb8fa21852722cffd957deddf688f12ea56	5 years ago
sdong	7a99162a74	db_stress: sometimes call CancelAllBackgroundWork() and Close() before closing DB (#6141 ) Summary: CancelAllBackgroundWork() and Close() are frequently used features but we don't cover it in stress test. Simply execute them before closing the DB with 1/2 chance. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6141 Test Plan: Run "db_stress". Differential Revision: D18900861 fbshipit-source-id: 49b46ccfae120d0f9de3e0543b82fb6d715949d0	5 years ago
Peter Dillinger	a653857178	Add PauseBackgroundWork() to db_stress (#6148 ) Summary: Worker thread will occasionally call PauseBackgroundWork(), briefly sleep (to avoid stalling itself) and then call ContinueBackgroundWork(). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6148 Test Plan: some running of 'make blackbox_crash_test' with temporary printf output to confirm code occasionally reached. Differential Revision: D18913886 Pulled By: pdillinger fbshipit-source-id: ae9356a803390929f3165dfb6a00194692ba92be	5 years ago
sdong	14c38baca0	db_stress: sometimes validate compact range data (#6140 ) Summary: Right now, in db_stress, compact range is simply executed without any immediate data validation. Add a simply validation which compares hash for all keys within the compact range to stay the same against the same snapshot before and after the compaction. Also, randomly tune most knobs of CompactRangeOptions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6140 Test Plan: Run db_stress with "--compact_range_one_in=2000 --compact_range_width=100000000" for a while. Manually ingest some hacky code and observe the error path. Differential Revision: D18900230 fbshipit-source-id: d96e75bc8c38dd5ec702571ffe7cf5f4ea93ee10	5 years ago
Peter Dillinger	6380df5e10	Vary bloom_bits in db_crashtest (#6103 ) Summary: Especially with non-integral bits/key now supported, db_crashtest should vary the bloom_bits configuration. The probabilities look like this: 1/2 chance of a uniform int from 0 to 19. This includes overall 1/40 chance of 0 which disables the bloom filter. 1/2 chance of a float from a lognormal distribution with a median of 10. This always produces positive values but with a decent chance of < 1 (overall ~1/40) or > 100 (overall ~1/40), the enforced/coerced implementation limits. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6103 Test Plan: start 'make blackbox_crash_test' several times and look at configuration output Differential Revision: D18734877 Pulled By: pdillinger fbshipit-source-id: 4a38cb057d3b3fc1327f93199f65b9a9ffbd7316	5 years ago
sdong	7d79b32618	Break db_stress_tool.cc to a list of source files (#6134 ) Summary: db_stress_tool.cc now is a giant file. In order to main it easier to improve and maintain, break it down to multiple source files. Most classes are turned into their own files. Separate .h and .cc files are created for gflag definiations. Another .h and .cc files are created for some common functions. Some test execution logic that is only loosely related to class StressTest is moved to db_stress_driver.h and db_stress_driver.cc. All the files are located under db_stress_tool/. The directory name is created as such because if we end it with either stress or test, .gitignore will ignore any file under it and makes it prone to issues in developements. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6134 Test Plan: Build under GCC7 with and without LITE on using GNU Make. Build with GCC 4.8. Build with cmake with -DWITH_TOOL=1 Differential Revision: D18876064 fbshipit-source-id: b25d0a7451840f31ac0f5ebb0068785f783fdf7d	5 years ago

1 2 3 4

194 Commits (943247b76ec9a537fe84a275088ed266ee4a8c24)