Summary:
ROCKSDB_SUPPORT_THREAD_LOCAL definition has been removed.
`__thread`(#define) has been replaced with `thread_local`(C++ keyword) across the code base.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10015
Reviewed By: siying
Differential Revision: D36485491
Pulled By: pdillinger
fbshipit-source-id: 6522d212514ee190b90b4e2750c80c7e34013c78
Summary:
When compaction filter determines that a key should be removed, it updates the internal key's type
to `Delete`. If this internal key is preserved in current compaction but seen by a later compaction
together with `SingleDelete`, it will cause compaction iterator to return Corruption.
To fix the issue, compaction filter should return more information in addition to the intention of removing
a key. Therefore, we add a new `kRemoveWithSingleDelete` to `CompactionFilter::Decision`. Seeing
`kRemoveWithSingleDelete`, compaction iterator will update the op type of the internal key to `kTypeSingleDelete`.
In addition, I updated db_stress_shared_state.[cc|h] so that `no_overwrite_ids_` becomes `const`. It is easier to
reason about thread-safety if accessed from multiple threads. This information is passed to `PrepareTxnDBOptions()`
when calling from `Open()` so that we can set up the rollback deletion type callback for transactions.
Finally, disable compaction filter for multiops_txn because the key removal logic of `DbStressCompactionFilter` does
not quite work with `MultiOpsTxnsStressTest`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9929
Test Plan:
make check
make crash_test
make crash_test_with_txn
Reviewed By: anand1976
Differential Revision: D36069678
Pulled By: riversand963
fbshipit-source-id: cedd2f1ba958af59ad3916f1ba6f424307955f92
Summary:
This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538
Test Plan:
crash_test
make check
Reviewed By: riversand963
Differential Revision: D20714347
Pulled By: anand1976
fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
Differential Revision: D19977691
fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
Summary:
Currently, db_stress performs verification by calling `VerifyDb()` at the end of test and optionally before tests start. In case of corruption or incorrect result, it will be too late. This PR adds more verification in two ways.
1. For cf consistency test, each test thread takes a snapshot and verifies every N ops. N is configurable via `-verify_db_one_in`. This option is not supported in other stress tests.
2. For cf consistency test, we use another background thread in which a secondary instance periodically tails the primary (interval is configurable). We verify the secondary. Once an error is detected, we terminate the test and report. This does not affect other stress tests.
Test plan (devserver)
```
$./db_stress -test_cf_consistency -verify_db_one_in=0 -ops_per_thread=100000 -continuous_verification_interval=100
$./db_stress -test_cf_consistency -verify_db_one_in=1000 -ops_per_thread=10000 -continuous_verification_interval=0
$make crash_test
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6173
Differential Revision: D19047367
Pulled By: riversand963
fbshipit-source-id: aeed584ad71f9310c111445f34975e5ab47a0615
Summary:
db_stress_tool.cc now is a giant file. In order to main it easier to improve and maintain, break it down to multiple source files.
Most classes are turned into their own files. Separate .h and .cc files are created for gflag definiations. Another .h and .cc files are created for some common functions. Some test execution logic that is only loosely related to class StressTest is moved to db_stress_driver.h and db_stress_driver.cc. All the files are located under db_stress_tool/. The directory name is created as such because if we end it with either stress or test, .gitignore will ignore any file under it and makes it prone to issues in developements.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6134
Test Plan: Build under GCC7 with and without LITE on using GNU Make. Build with GCC 4.8. Build with cmake with -DWITH_TOOL=1
Differential Revision: D18876064
fbshipit-source-id: b25d0a7451840f31ac0f5ebb0068785f783fdf7d
Summary:
expose db stress test by providing db_stress_tool.h in public header.
This PR does the following:
- adds a new header, db_stress_tool.h, in include/rocksdb/
- renames db_stress.cc to db_stress_tool.cc
- adds a db_stress.cc which simply invokes a test function.
- update Makefile accordingly.
Test Plan (dev server):
```
make db_stress
./db_stress
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5937
Differential Revision: D17997647
Pulled By: riversand963
fbshipit-source-id: 1a8d9994f89ce198935566756947c518f0052410
Summary:
The loop in OperateDb() is getting quite complicated with the introduction of multiple key operations such as MultiGet and Reseeks. This is resulting in a number of corner cases that hangs db_stress due to synchronization problems during reopen (i.e when -reopen=<> option is specified). This PR makes it more robust by ensuring all db_stress threads vote to reopen the DB the exact same number of times.
Most of the changes in this diff are due to indentation.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5893
Test Plan: Run crash test
Differential Revision: D17823827
Pulled By: anand1976
fbshipit-source-id: ec893829f611ac7cac4057c0d3d99f9ffb6a6dd9
Summary:
This is the 2nd attempt after the revert of https://github.com/facebook/rocksdb/pull/4020
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5895
Test Plan:
```
./tools/db_crashtest.py blackbox --simple --interval=10 --max_key=10000000
```
Differential Revision: D17822137
Pulled By: maysamyabandeh
fbshipit-source-id: 3d148c0d8cc129080410ff859c04b544223c8ea3
Summary:
When multiple operations are performed in a db_stress thread in one loop
iteration, the reopen voting logic needs to take that into account. It
was doing that for MultiGet, but a new option was introduced recently to
do multiple iterator seeks per iteration, which broke it again. Fix the
logic to be more robust and agnostic of the type of operation performed.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5876
Test Plan: Run db_stress
Differential Revision: D17733590
Pulled By: anand1976
fbshipit-source-id: 787f01abefa1e83bba43e0b4f4abb26699b2089e
Summary:
Two more bug fixes in db_stress:
1. this is to complete the fix of the regression bug causing overflowing when supporting FLAGS_prefix_size = -1.
2. Fix regression bug in compare iterator itself:
(1) when creating control iterator, which used the same read option as the normal iterator by mistake; (2) the logic of comparing has some problems. Fix them.
(3) disable validation for lower bound now, which generated some wildly different results. Disabling it to make normal tests pass while investigating it.
3. Cleaning up snapshots in verification failure cases. Memory is leaked otherwise.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5867
Test Plan: Run "make crash_test" for a while and see at least 1 is fixed.
Differential Revision: D17671712
fbshipit-source-id: 011f98ea1a72aef23e19ff28656830c78699b402
Summary:
When prefix_size = -1, stress test crashes with run time error because of overflow. Fix it by not using -1 but 7 in prefix scan mode.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5862
Test Plan:
Run
python -u tools/db_crashtest.py --simple whitebox --random_kill_odd \
888887 --compression_type=zstd
and see it doesn't crash.
Differential Revision: D17642313
fbshipit-source-id: f029e7651498c905af1b1bee6d310ae50cdcda41
Summary:
For now, crash_test is not able to report any failure for the logic related to iterator upper, lower bounds or iterators, or reseek. These are features prone to errors. Improve db_stress in several ways:
(1) For each iterator run, reseek up to 3 times.
(2) For every iterator, create control iterator with upper or lower bound, with total order seek. Compare the results with the iterator.
(3) Make simple crash test to avoid prefix size to have more coverage.
(4) make prefix_size = 0 a valid size and -1 to indicate disabling prefix extractor.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5846
Test Plan: Manually hack the code to create wrong results and see they are caught by the tool.
Differential Revision: D17631760
fbshipit-source-id: acd460a177bd2124a5ffd7fff490702dba63030b
Summary:
Further apply formatter to more recent commits.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5830
Test Plan: Run all existing tests.
Differential Revision: D17488031
fbshipit-source-id: 137458fd94d56dd271b8b40c522b03036943a2ab
Summary:
Some recent commits might not have passed through the formatter. I formatted recent 45 commits. The script hangs for more commits so I stopped there.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5827
Test Plan: Run all existing tests.
Differential Revision: D17483727
fbshipit-source-id: af23113ee63015d8a43d89a3bc2c1056189afe8f
Summary:
- In `db_stress`, support choosing index type and whether to enable filter partitioning, and randomly set those options in crash test
- When partitioned filter is enabled by crash test, force partitioned index to also be enabled since it's a prerequisite
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4020
Test Plan:
currently this is blocked on fixing the bug that crash test caught:
```
$ TEST_TMPDIR=/data/compaction_bench python ./tools/db_crashtest.py blackbox --simple --interval=10 --max_key=10000000
...
Verification failed for column family 0 key 937501: Value not found: NotFound:
Crash-recovery verification failed :(
```
Differential Revision: D8508683
Pulled By: maysamyabandeh
fbshipit-source-id: 0337e5d0558bcef26b1f3699f47265a2c1e99629
Summary:
Covering periodic compaction and compaction TTL can help us expose potential issues. Add it there.
Randomly select value for these two options.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5741
Test Plan: Run crash_test and see the perameters generated.
Differential Revision: D17059515
fbshipit-source-id: 8213974846a0b6a22fc13be705825c9054d1d097
Summary:
MyRocks currently sets `max_write_buffer_number_to_maintain` in order to maintain enough history for transaction conflict checking. The effectiveness of this approach depends on the size of memtables. When memtables are small, it may not keep enough history; when memtables are large, this may consume too much memory.
We are proposing a new way to configure memtable list history: by limiting the memory usage of immutable memtables. The new option is `max_write_buffer_size_to_maintain` and it will take precedence over the old `max_write_buffer_number_to_maintain` if they are both set to non-zero values. The new option accounts for the total memory usage of flushed immutable memtables and mutable memtable. When the total usage exceeds the limit, RocksDB may start dropping immutable memtables (which is also called trimming history), starting from the oldest one.
The semantics of the old option actually works both as an upper bound and lower bound. History trimming will start if number of immutable memtables exceeds the limit, but it will never go below (limit-1) due to history trimming.
In order the mimic the behavior with the new option, history trimming will stop if dropping the next immutable memtable causes the total memory usage go below the size limit. For example, assuming the size limit is set to 64MB, and there are 3 immutable memtables with sizes of 20, 30, 30. Although the total memory usage is 80MB > 64MB, dropping the oldest memtable will reduce the memory usage to 60MB < 64MB, so in this case no memtable will be dropped.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5022
Differential Revision: D14394062
Pulled By: miasantreble
fbshipit-source-id: 60457a509c6af89d0993f988c9b5c2aa9e45f5c5
Summary:
AtomicFlushStressTest is a powerful test, but right now we only run it for atomic_flush=true + disable_wal=true. We further extend it to the case where atomic_flush=false + disable_wal = false. All the workload generation and validation can stay the same.
Atomic flush crash test is also changed to switch between the two test scenarios. It makes the name "atomic flush crash test" out of sync from what it really does. We leave it as it is to avoid troubles with continous test set-up.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5729
Test Plan: Run "CRASH_TEST_KILL_ODD=188 TEST_TMPDIR=/dev/shm/ USE_CLANG=1 make whitebox_crash_test_with_atomic_flush", observe the settings used and see it passed.
Differential Revision: D16969791
fbshipit-source-id: 56e37487000ae631e31b0100acd7bdc441c04163
Summary:
Atomic flush test started to fail after https://github.com/facebook/rocksdb/issues/5099. Then https://github.com/facebook/rocksdb/issues/5278 provided a fix after
which the same error occurred much less frequently. However it still occur
occasionally. Not sure what the root cause is. This PR disables the feature of
snapshot list refresh, and we should keep an eye on the failure in the future.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5581
Differential Revision: D16295985
Pulled By: riversand963
fbshipit-source-id: c9e62e65133c52c21b07097de359632ca62571e4
Summary:
When atomic flush stress test fails, we print internal keys within the range with mismatched key/values for all column families.
Test plan (on devserver)
Manually hack the code to randomly insert wrong data. Run the test.
```
$make clean && COMPILE_WITH_TSAN=1 make -j32 db_stress
$./db_stress -test_atomic_flush=true -ops_per_thread=10000
```
Check that proper error messages are printed, as follows:
```
2019/07/08-17:40:14 Starting verification
Verification failed
Latest Sequence Number: 190903
[default] 000000000000050B => 56290000525350515E5F5C5D5A5B5859
[3] 0000000000000533 => EE100000EAEBE8E9E6E7E4E5E2E3E0E1FEFFFCFDFAFBF8F9
Internal keys in CF 'default', [000000000000050B, 0000000000000533] (max 8)
key 000000000000050B seq 139920 type 1
key 0000000000000533 seq 0 type 1
Internal keys in CF '3', [000000000000050B, 0000000000000533] (max 8)
key 0000000000000533 seq 0 type 1
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5549
Differential Revision: D16158709
Pulled By: riversand963
fbshipit-source-id: f07fa87763f87b3bd908da03c956709c6456bcab
Summary:
Print out some more information when db_tress fails with verification failures to help debugging problems.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5543
Test Plan:
Manually ingest some failures and observe the outputs are like this:
Verification failed
[default] 0000000000199A5A => 7C3D000078797A7B74757677707172736C6D6E6F68696A6B
[6] 000000000019C8BD => 65380000616063626D6C6F6E69686B6A
internal keys in default CF [0000000000199A5A, 000000000019C8BD] (max 8)
key 0000000000199A5A seq 179246 type 1
key 000000000019C8BD seq 163970 type 1
Lastest Sequence Number: 292234
Differential Revision: D16153717
fbshipit-source-id: b33fa50a828c190cbf8249a37955432044f92daf
Summary:
When using `PRIu64` type of printf specifier, current code base does the following:
```
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS
#endif
#include <inttypes.h>
```
However, this can be simplified to
```
#include <cinttypes>
```
as long as flag `-std=c++11` is used.
This should solve issues like https://github.com/facebook/rocksdb/issues/5159
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5402
Differential Revision: D15701195
Pulled By: miasantreble
fbshipit-source-id: 6dac0a05f52aadb55e9728038599d3d2e4b59d03
Summary:
Many logging related source files are under util/. It will be more structured if they are together.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5387
Differential Revision: D15579036
Pulled By: siying
fbshipit-source-id: 3850134ed50b8c0bb40a0c8ae1f184fa4081303f
Summary:
When the --reopen option is non-zero, the DB is reopened after every ops_per_thread/(reopen+1) ops, with the check being done after every op. With MultiGet, we might do multiple ops in one iteration, which broke the logic that checked when to synchronize among the threads and reopen the DB. This PR fixes that logic.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5374
Differential Revision: D15559780
Pulled By: anand1976
fbshipit-source-id: ee6563a68045df7f367eca3cbc2500d3e26359ef
Summary:
There are too many types of files under util/. Some test related files don't belong to there or just are just loosely related. Mo
ve them to a new directory test_util/, so that util/ is cleaner.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377
Differential Revision: D15551366
Pulled By: siying
fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
Summary:
This PR has two fixes for crash test failures -
1. Fix a bug in TestMultiGet() in db_stress that was passing list of key to MultiGet() in the wrong order, thus ensuring that actual values don't match expected values
2. Remove an incorrect assertion in FilePickerMultiGet::GetNextFileInLevelWithKeys() that checks that files in a level are in sorted order. This is not true with MultiGet(), especially if there are duplicate keys and we may have to go back one file for the next key. Furthermore, this assertion makes more sense when a new version is created, rather than at lookup time
Test -
asan_crash and ubsan_crash tests
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5301
Differential Revision: D15337383
Pulled By: anand1976
fbshipit-source-id: 35092cb15bbc1700e5e823cbe07bfa62f1e9e6c6
Summary:
This PR fixes three memory issues found by ASAN
* in db_stress, the key vector for MultiGet is created using `emplace_back` which could potentially invalidates references to the underlying storage (vector<string>) due to auto resizing. Fix by calling reserve in advance.
* Similar issue in construction of GetContext autovector in version_set.cc
* In multiget_context.h use T[] specialization for unique_ptr that holds a char array
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5279
Differential Revision: D15202893
Pulled By: miasantreble
fbshipit-source-id: 14cc2cda0ed64d29f2a1e264a6bfdaa4294ee75d
Summary:
The new option will pick a batch size randomly in the range 1-64. It will then space the keys in the batch by random intervals.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5264
Differential Revision: D15175522
Pulled By: anand1976
fbshipit-source-id: c16baa69d0f1ff4cf53c55c813ddd82c8aeb58fc
Summary:
At least one of the meta-block loading functions (`ReadRangeDelBlock`)
uses the same block reading function (`NewDataBlockIterator`) as data
block reads, which means it uses the dictionary handle. However, the
dictionary handle was uninitialized while reading meta-blocks, causing
readers to receive an error. This situation was only noticed when
`cache_index_and_filter_blocks=true`.
This PR initializes the handle to null while reading meta-blocks to
prevent the error. It also adds support to `db_stress` /
`db_crashtest.py` for `cache_index_and_filter_blocks`.
Fixes#5263.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5267
Differential Revision: D15149264
Pulled By: maysamyabandeh
fbshipit-source-id: 991d38a306c62db5976778bfb050fa3cd4a0671b
Summary:
Fix some hdfs-related code so that it can compile and run 'db_stress'
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5122
Differential Revision: D14675495
Pulled By: riversand963
fbshipit-source-id: cac280479efcf5451982558947eac1732e8bc45a
Summary:
Since this feature affects the WAL behavior, it seems important our crash-recovery tests cover it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5070
Differential Revision: D14470085
Pulled By: miasantreble
fbshipit-source-id: 9b9682a718a926d57d055e0a5ec867efbd2eb9c1
Summary:
MyRocks calls `GetForUpdate` on `INSERT`, for unique key check, and in almost all cases GetForUpdate returns empty result. For such cases, whole key bloom filter is helpful.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4985
Differential Revision: D14118257
Pulled By: miasantreble
fbshipit-source-id: d35cb7109c62fd5ad541a26968e3a3e16d3e85ea
Summary:
as titled.
We can remove the assersion because we do not perform verification in
AtomicFlushStressTest::TestCheckpoint for similar reasons to TestGet, TestPut,
etc.
Therefore, we override TestCheckpoint in AtomicFlushStressTest so that the
assertion `rand_column_families.size() == rand_keys.size()' is removed, and we
do not verify the DB in this function.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4846
Differential Revision: D13583377
Pulled By: riversand963
fbshipit-source-id: 03647f3da67e27a397413fd666e3bb43003bf596
Summary:
Updated stress test will support testing of db in read-only mode.
The user has to make sure that only read/scan operations are enabled.
This PR relies on #4681.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4690
Differential Revision: D13102741
Pulled By: riversand963
fbshipit-source-id: f5a36b34db187fe12dd355f7eda161f99d6c75e4
Summary:
- To be consistent with the accounting of other optypes in `TableProperties`, we should count range tombstones in `TableProperties::num_entries` and `TableProperties::num_deletions`.
- Updated assertions in stress test's `OnTableFileCreated` handler to accept files with range tombstones only.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4841
Differential Revision: D13568424
Pulled By: ajkr
fbshipit-source-id: 0139d7806494eda20ece67ec460d2458dbbf6026
Summary:
Separate flag for enabling option from flag for enabling dedicated atomic stress test. I have found setting the former without setting the latter can detect different problems.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4781
Differential Revision: D13463211
Pulled By: ajkr
fbshipit-source-id: 054f777885b2dc7d5ea99faafa21d6537eee45fd
Summary:
When iterator becomes invalid, there are two possibilities.
First, all data in the column family have been scanned and there is nothing
more to scan.
Second, an underlying error has occurred, causing `status()` to be !ok.
Therefore, we need to check for both cases when `!iter->Valid()`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4648
Differential Revision: D12959601
Pulled By: riversand963
fbshipit-source-id: 49c9382c9ea9e78f2e2b6f3708f0670b822ca8dd
Summary:
Ran the following commands to recursively change all the files under RocksDB:
```
find . -type f -name "*.cc" -exec sed -i 's/ unique_ptr/ std::unique_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/<unique_ptr/<std::unique_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/ shared_ptr/ std::shared_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/<shared_ptr/<std::shared_ptr/g' {} +
```
Running `make format` updated some formatting on the files touched.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4638
Differential Revision: D12934992
Pulled By: sagar0
fbshipit-source-id: 45a15d23c230cdd64c08f9c0243e5183934338a8