Summary:
If WAL dir is different from the DB dir, we should still honor the SstFileManager deletion rate limit for SST files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8967
Test Plan: Add a new unit test in db_sst_test
Reviewed By: pdillinger
Differential Revision: D31220116
Pulled By: anand1976
fbshipit-source-id: bcde8a53a7d728e15e597fb5d07ee86c1b38bd28
Summary:
Add comments for MultiGetBlob() that input argument `offsets` must be
sorted. In addition, add assertion for this condition in debug build.
Repeat the same for RandomAccessFileReader::MultiRead().
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8972
Test Plan: make check
Reviewed By: pdillinger
Differential Revision: D31253205
Pulled By: riversand963
fbshipit-source-id: 98758229b8052f3aeb319d5584026b4de2d220a2
Summary:
Fill in some missing info; fix some incorrect info.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8963
Test Plan: comments only
Reviewed By: mrambacher
Differential Revision: D31211183
Pulled By: pdillinger
fbshipit-source-id: 783ff6673791c01d44c3ed92d4398c64ae5a5005
Summary:
This header file was including everything and the kitchen sink when it did not need to. This resulted in many places including this header when they needed other pieces instead.
Cleaned up this header to only include what was needed and fixed up the remaining code to include what was now missing.
Hopefully, this sort of code hygiene cleanup will speed up the builds...
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8930
Reviewed By: pdillinger
Differential Revision: D31142788
Pulled By: mrambacher
fbshipit-source-id: 6b45de3f300750c79f751f6227dece9cfd44085d
Summary:
This is a precursor refactoring to enable an upcoming feature: persistence failure correctness testing.
- Changed `--expected_values_path` to `--expected_values_dir` and migrated "db_crashtest.py" to use the new flag. For persistence failure correctness testing there are multiple possible correct states since unsynced data is allowed to be dropped. Making it possible to restore all these possible correct states will eventually involve files containing snapshots of expected values and DB trace files.
- The expected values directory is managed by an `ExpectedStateManager` instance. Managing expected state files is separated out of `SharedState` to prevent `SharedState` from becoming too complex when the new files and features (snapshotting, tracing, and restoring) are introduced.
- Migrated expected values file access/management out of `SharedState` into a separate class called `ExpectedState`. This is not exposed directly to the test but rather the `ExpectedState` for the latest values file is accessed via a pass-through API on `ExpectedStateManager`. This forces the test to always access the single latest `ExpectedState`.
- Changed the initialization of the latest expected values file to use a tempfile followed by rename, and also add cleanup logic for possible stranded tempfiles.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8913
Test Plan:
run in several ways; try to make sure it's not obviously broken.
- crashtest blackbox without TEST_TMPDIR
```
$ python3 tools/db_crashtest.py blackbox --simple --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --max_key=100000 --value_size_mult=33 --compression_type=none --duration=120 --interval=10 --compression_type=none --blob_compression_type=none
```
- crashtest blackbox with TEST_TMPDIR
```
$ TEST_TMPDIR=/dev/shm python3 tools/db_crashtest.py blackbox --simple --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --max_key=100000 --value_size_mult=33 --compression_type=none --duration=120 --interval=10 --compression_type=none --blob_compression_type=none
```
- crashtest whitebox with TEST_TMPDIR
```
$ TEST_TMPDIR=/dev/shm python3 tools/db_crashtest.py whitebox --simple --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --max_key=100000 --value_size_mult=33 --compression_type=none --duration=120 --interval=10 --compression_type=none --blob_compression_type=none --random_kill_odd=88887
```
- db_stress without expected_values_dir
```
$ ./db_stress --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --max_key=100000 --value_size_mult=33 --compression_type=none --ops_per_thread=10000 --clear_column_family_one_in=0 --destroy_db_initially=true
```
- db_stress with expected_values_dir and manual corruption
```
$ ./db_stress --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --max_key=100000 --value_size_mult=33 --compression_type=none --ops_per_thread=10000 --clear_column_family_one_in=0 --destroy_db_initially=true --expected_values_dir=./
// modify one byte in "./LATEST.state"
$ ./db_stress --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --max_key=100000 --value_size_mult=33 --compression_type=none --ops_per_thread=10000 --clear_column_family_one_in=0 --destroy_db_initially=false --expected_values_dir=./
...
Verification failed for column family 0 key 0000000000000000 (0): Value not found: NotFound:
...
```
Reviewed By: riversand963
Differential Revision: D30921951
Pulled By: ajkr
fbshipit-source-id: babfe218062e55d018c9b046536c0289fb78f41c
Summary:
Context:
Exposing the level of the sst file (i.e, table) where it is created in `TablePropertiesCollectorFactory::Context` allows users of `TablePropertiesCollectorFactory` to customize some implementation details of `TablePropertiesCollectorFactory` and `TablePropertiesCollector` based on the level of creation. For example, `TablePropertiesCollector::NeedCompact()` can return different values based on level of creation.
- Declared an extra field `level_at_creation` in `TablePropertiesCollectorFactory::Context`
- Allowed `level_at_creation` to be passed in as an argument in `IntTblPropCollectorFactory::CreateIntTblPropCollector()` and `UserKeyTablePropertiesCollectorFactory::CreateIntTblPropCollector()`, the latter of which is an internal wrapper of user's passed-in `TablePropertiesCollectorFactory::CreateTablePropertiesCollector()` used in table-building process
- Called `IntTblPropCollectorFactory::CreateIntTblPropCollector()` with `level_at_creation` passed into both `BlockBasedTableBuilder` and `PlainTableBuilder`
- `PlainTableBuilder` previously did not capture `level_at_creation` from `TableBuilderOptions` in `PlainTableFactory`. In order for it to call the method with this parameter, this PR also made `PlainTableBuilder` capture `level_at_creation` as a required parameter
- Called `IntTblPropCollectorFactory::CreateIntTblPropCollector()` with `level_at_creation` its overridden functions in its derived classes, including `RegularKeysStartWithAFactory::CreateIntTblPropCollector()` in `table_properties_collector_test.cc`, `SstFileWriterPropertiesCollectorFactory::CreateIntTblPropCollector()` in `sst_file_writer_collectors.h`
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8919
Test Plan:
- Passed the added assertion for `context.level_at_creation`
- Passed existing tests
- Run `Make` to make sure adding a required parameter to `PlainTableBuilder`'s constructor does not break anything
Reviewed By: anand1976
Differential Revision: D30951729
Pulled By: hx235
fbshipit-source-id: c4a0173b0d9344a4cf47e1b987d759c1c73cb474
Summary:
Add a paranoid check where in case FileSystem layer doesn't fill the buffer but returns succeed, checksum is unlikely to match even if buffer contains a previous block. The byte modified is not useful anyway, so it isn't expect to change any behavior.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8955
Test Plan: See existing CI to pass.
Reviewed By: pdillinger
Differential Revision: D31183966
fbshipit-source-id: dcc4de429e18131873f783b90d3be55d7eb44a1f
Summary:
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.11.4 to 1.12.5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/sparklemotion/nokogiri/releases">nokogiri's releases</a>.</em></p>
<blockquote>
<h2>1.12.5 / 2021-09-27</h2>
<h3>Security</h3>
<p>[JRuby] Address CVE-2021-41098 (<a href="https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-2rr5-8q37-2w7h">GHSA-2rr5-8q37-2w7h</a>).</p>
<p>In Nokogiri v1.12.4 and earlier, on JRuby only, the SAX parsers resolve external entities (XXE) by default. This fix turns off entity-resolution-by-default in the JRuby SAX parsers to match the CRuby SAX parsers' behavior.</p>
<p>CRuby users are not affected by this CVE.</p>
<h3>Fixed</h3>
<ul>
<li>[CRuby] <code>Document#to_xhtml</code> properly serializes self-closing tags in libxml > 2.9.10. A behavior change introduced in libxml 2.9.11 resulted in emitting start and and tags (e.g., <code><br></br></code>) instead of a self-closing tag (e.g., <code><br/></code>) in previous Nokogiri versions. [<a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2324">https://github.com/facebook/rocksdb/issues/2324</a>]</li>
</ul>
<hr />
<p>SHA256 checksums:</p>
<pre><code>36bfa3a07aced069b3f3c9b39d9fb62cb0728d284d02b079404cd55780beaeff nokogiri-1.12.5-arm64-darwin.gem
16b1a9ddbb70a9c998462912a5972097cbc79c3e01eb373906886ef8a469f589 nokogiri-1.12.5-java.gem
218dcc6edd1b49cc6244b5f88afb978739bb2f3f166c271557fe5f51e4bc713c nokogiri-1.12.5-x64-mingw32.gem
e33bb919d64c16d931a5f26dc880969e587d225cfa97e6b56e790fb52179f527 nokogiri-1.12.5-x86-linux.gem
e13c2ed011b8346fbd589e96fe3542d763158bc2c7ad0f4f55f6d801afd1d9ff nokogiri-1.12.5-x86-mingw32.gem
1ed64f7db7c1414b87fce28029f2a10128611d2037e0871ba298d00f9a00edd6 nokogiri-1.12.5-x86_64-darwin.gem
0868c8d0a147904d4dedaaa05af5f06656f2d3c67e4432601718559bf69d6cea nokogiri-1.12.5-x86_64-linux.gem
2b20905942acc580697c8c496d0d1672ab617facb9d30d156b3c7676e67902ec nokogiri-1.12.5.gem
</code></pre>
<h2>1.12.4 / 2021-08-29</h2>
<h3>Notable fix: Namespace inheritance</h3>
<p>Namespace behavior when reparenting nodes has historically been poorly specified and the behavior diverged between CRuby and JRuby. As a result, making this behavior consistent in v1.12.0 introduced a breaking change.</p>
<p>This patch release reverts the Builder behavior present in v1.12.0..v1.12.3 but keeps the Document behavior. This release also introduces a Document attribute to allow affected users to easily change this behavior for their legacy code without invasive changes.</p>
<h4>Compensating Feature in XML::Document</h4>
<p>This release of Nokogiri introduces a new <code>Document</code> boolean attribute, <code>namespace_inheritance</code>, which controls whether children should inherit a namespace when they are reparented. <code>Nokogiri::XML:Document</code> defaults this attribute to <code>false</code> meaning "do not inherit," thereby making explicit the behavior change introduced in v1.12.0.</p>
<p>CRuby users who desire the pre-v1.12.0 behavior may set <code>document.namespace_inheritance = true</code> before reparenting nodes.</p>
<p>See <a href="https://nokogiri.org/rdoc/Nokogiri/XML/Document.html#namespace_inheritance-instance_method">https://nokogiri.org/rdoc/Nokogiri/XML/Document.html#namespace_inheritance-instance_method</a> for example usage.</p>
<h4>Fix for XML::Builder</h4>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/sparklemotion/nokogiri/blob/main/CHANGELOG.md">nokogiri's changelog</a>.</em></p>
<blockquote>
<h2>1.12.5 / 2021-09-27</h2>
<h3>Security</h3>
<p>[JRuby] Address CVE-2021-41098 (<a href="https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-2rr5-8q37-2w7h">GHSA-2rr5-8q37-2w7h</a>).</p>
<p>In Nokogiri v1.12.4 and earlier, on JRuby only, the SAX parsers resolve external entities (XXE) by default. This fix turns off entity-resolution-by-default in the JRuby SAX parsers to match the CRuby SAX parsers' behavior.</p>
<p>CRuby users are not affected by this CVE.</p>
<h3>Fixed</h3>
<ul>
<li>[CRuby] <code>Document#to_xhtml</code> properly serializes self-closing tags in libxml > 2.9.10. A behavior change introduced in libxml 2.9.11 resulted in emitting start and and tags (e.g., <code><br></br></code>) instead of a self-closing tag (e.g., <code><br/></code>) in previous Nokogiri versions. [<a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2324">https://github.com/facebook/rocksdb/issues/2324</a>]</li>
</ul>
<h2>1.12.4 / 2021-08-29</h2>
<h3>Notable fix: Namespace inheritance</h3>
<p>Namespace behavior when reparenting nodes has historically been poorly specified and the behavior diverged between CRuby and JRuby. As a result, making this behavior consistent in v1.12.0 introduced a breaking change.</p>
<p>This patch release reverts the Builder behavior present in v1.12.0..v1.12.3 but keeps the Document behavior. This release also introduces a Document attribute to allow affected users to easily change this behavior for their legacy code without invasive changes.</p>
<h4>Compensating Feature in XML::Document</h4>
<p>This release of Nokogiri introduces a new <code>Document</code> boolean attribute, <code>namespace_inheritance</code>, which controls whether children should inherit a namespace when they are reparented. <code>Nokogiri::XML:Document</code> defaults this attribute to <code>false</code> meaning "do not inherit," thereby making explicit the behavior change introduced in v1.12.0.</p>
<p>CRuby users who desire the pre-v1.12.0 behavior may set <code>document.namespace_inheritance = true</code> before reparenting nodes.</p>
<p>See <a href="https://nokogiri.org/rdoc/Nokogiri/XML/Document.html#namespace_inheritance-instance_method">https://nokogiri.org/rdoc/Nokogiri/XML/Document.html#namespace_inheritance-instance_method</a> for example usage.</p>
<h4>Fix for XML::Builder</h4>
<p>However, recognizing that we want <code>Builder</code>-created children to inherit namespaces, Builder now will set <code>namespace_inheritance=true</code> on the underlying document for both JRuby and CRuby. This means that, on CRuby, the pre-v1.12.0 behavior is restored.</p>
<p>Users who want to turn this behavior off may pass a keyword argument to the Builder constructor like so:</p>
<pre lang="ruby"><code>Nokogiri::XML::Builder.new(namespace_inheritance: false)
</code></pre>
<p>See <a href="https://nokogiri.org/rdoc/Nokogiri/XML/Builder.html#label-Namespace+inheritance">https://nokogiri.org/rdoc/Nokogiri/XML/Builder.html#label-Namespace+inheritance</a> for example usage.</p>
<h4>Downstream gem maintainers</h4>
<p>Note that any downstream gems may want to specifically omit Nokogiri v1.12.0--v1.12.3 from their dependency specification if they rely on child namespace inheritance:</p>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="47f6a461fd"><code>47f6a46</code></a> version bump to v1.12.5</li>
<li><a href="2a0ac88518"><code>2a0ac88</code></a> update CHANGELOG</li>
<li><a href="6b6063782c"><code>6b60637</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2329">https://github.com/facebook/rocksdb/issues/2329</a> from sparklemotion/flavorjones-GHSA-2rr5-8q37-2w7h_1...</li>
<li><a href="4bd943cae3"><code>4bd943c</code></a> fix(jruby): SAX parser uses an entity resolver</li>
<li><a href="f943ee4108"><code>f943ee4</code></a> refactor(jruby): handle errors more consistently</li>
<li><a href="2790122748"><code>2790122</code></a> format: test files</li>
<li><a href="01e1618f75"><code>01e1618</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2327">https://github.com/facebook/rocksdb/issues/2327</a> from sparklemotion/2324-xhtml-self-closing-tags_v1.12.x</li>
<li><a href="a0180c72c5"><code>a0180c7</code></a> fix: HTML4::Document.to_xhtml self-closing tags</li>
<li><a href="564ac17873"><code>564ac17</code></a> release v1.12.4</li>
<li><a href="4d5754baed"><code>4d5754b</code></a> backport <a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2320">https://github.com/facebook/rocksdb/issues/2320</a></li>
<li>Additional commits viewable in <a href="https://github.com/sparklemotion/nokogiri/compare/v1.11.4...v1.12.5">compare view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=nokogiri&package-manager=bundler&previous-version=1.11.4&new-version=1.12.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `dependabot rebase` will rebase this PR
- `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `dependabot merge` will merge this PR after your CI passes on it
- `dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `dependabot cancel merge` will cancel a previously requested merge and block automerging
- `dependabot reopen` will reopen this PR if it is closed
- `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
- `dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
- `dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
- `dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
- `dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/facebook/rocksdb/network/alerts).
</details>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8965
Reviewed By: akankshamahajan15
Differential Revision: D31217632
Pulled By: ltamasi
fbshipit-source-id: c98c5a42f29eb45164a266edd91569737595ab2a
Summary:
Added support for SingleDelete for user-defined timestamps. Users can now Get and Iterate over keys deleted with SingleDelete. It also includes changes in CompactionIterator which preserves the same user key with different timestamps, unless the timestamp is below a certain threshold full_history_ts_low.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8921
Test Plan: Added new unit tests
Reviewed By: riversand963
Differential Revision: D31098191
Pulled By: akankshamahajan15
fbshipit-source-id: 78a59ef4b4884ae324fcd10f56e62a27d5ee2f49
Summary:
The origin error message of uncompressing block is confusing, which may result from either build support or data corruption.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8934
Reviewed By: ltamasi
Differential Revision: D31112588
Pulled By: pdillinger
fbshipit-source-id: 1cbf2d4fbcb0ef376cf942246d06f48cb603f852
Summary:
Made SliceTransform into a Customizable class.
Would be nice to write a test that stored and used a custom transform in an SST table.
There are a set of tests (DBBlockFliterTest.PrefixExtractor*, SamePrefixTest.InDomainTest, PrefixTest.PrefixAndWholeKeyTest that run the same with or without a SliceTransform/PrefixFilter. Is this expected?
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8641
Reviewed By: zhichao-cao
Differential Revision: D31142793
Pulled By: mrambacher
fbshipit-source-id: bb08672fccbfdc263dcae21f25a62307e1facda1
Summary:
`RandomAccessFileReader::MultiRead()` tries to merge requests in direct IO, assuming input IO requests are
sorted by offsets.
Add a test in direct IO mode.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8953
Test Plan: make check
Reviewed By: ltamasi
Differential Revision: D31183546
Pulled By: riversand963
fbshipit-source-id: 5d043ec68e2daa47a3149066150afd41ee3d73e6
Summary:
For now, disable it since the below command indicates it can cause a
failure. Running that command with `-experimental_mempurge_threshold=0`
has been running successfully for several minutes, whereas before it
failed in seconds.
```
$ while rm -rf /dev/shm/single_stress && ./db_stress --clear_column_family_one_in=0 --column_families=1 --db=/dev/shm/single_stress --experimental_mempurge_threshold=5.493146827397074 --flush_one_in=10000 --reopen=0 --write_buffer_size=262144 --value_size_mult=33 --max_write_buffer_number=3 -ops_per_thread=10000; do : ; done
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8958
Reviewed By: ltamasi
Differential Revision: D31187059
Pulled By: ajkr
fbshipit-source-id: 04d5bfb4fcc4f5b66233e691427dfd940c67037f
Summary:
The cyclic dependency was:
- `StressTest::OperateDb()` locks the mutex for key 'k'
- `StressTest::OperateDb()` calls a function like `PauseBackgroundWork()`, which waits for pending compaction to complete.
- The pending compaction reaches key `k` and `DbStressCompactionFilter::FilterV2()` calls `Lock()` on that key's mutex, which hangs forever.
The cycle can be broken by using a new function, `port::Mutex::TryLock()`, which returns immediately upon failure to acquire a lock. In that case `DbStressCompactionFilter::FilterV2()` can just decide to keep the key.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8956
Reviewed By: riversand963
Differential Revision: D31183718
Pulled By: ajkr
fbshipit-source-id: 329e4a31ce43085af174cf367ef560b5a04399c5
Summary:
For internal build enviroment only. Developer could run the
microbenchmark without `ROCKSDB_NO_FBCODE=1`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8954
Test Plan: `$ make microbench` on dev server
Reviewed By: ajkr
Differential Revision: D31163717
Pulled By: jay-zhuang
fbshipit-source-id: 1ff59f660ca05afd0fd5c7c7dcdfd831ac365462
Summary:
Right now FaultInjectionTestFS::InjectThreadSpecificReadError() might try to corrupt return bytes, but these bytes might be from mmapped files, which would cause segfault. Instead FaultInjectionTestFS::InjectThreadSpecificReadError() should never corrupt data unless it is in caller's buffer.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8952
Test Plan: See db_stress still runs and make sure in a test run failurs are still injected in non-mmap cases.
Reviewed By: ajkr, ltamasi
Differential Revision: D31147318
fbshipit-source-id: 9484a64ff2aaa36685557203f449286e694e65f9
Summary:
Context:
After more discussion, a fix in https://github.com/facebook/rocksdb/issues/8938 might turn out to be too restrictive for the case where `GetTotalPendingRequests` might be invoked on RateLimiter classes that does not support the recently added API `RateLimiter::GetTotalPendingRequests` (https://github.com/facebook/rocksdb/issues/8890) due to the `assert(false)` in https://github.com/facebook/rocksdb/issues/8938. Furthermore, sentinel value like `-1` proposed in https://github.com/facebook/rocksdb/issues/8938 is easy to be ignored and unchecked. Therefore we decided to adopt `Status::NotSupported()`, which is also a convention of adding new API to public header in RocksDB.
- Changed return value type of `RateLimiter::GetTotalPendingRequests` in related declaration/definition
- Passed in pointer argument to hold the output instead of returning it as before
- Adapted to the changes above in calling `RateLimiter::GetTotalPendingRequests` in test
- Minor improvement to `TEST_F(RateLimiterTest, GetTotalPendingRequests)`: added failure message for assertion and replaced repetitive statements with a loop
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8950
Reviewed By: ajkr, pdillinger
Differential Revision: D31128450
Pulled By: hx235
fbshipit-source-id: 282ac9c4f3dacaa0aec6d0a993161f77ad47a040
Summary:
There is a corner case when using WriteUnprepared transactions when
`WriteUnpreparedTxn::Get` returns `Status::TryAgain` instead of
propagating the result of `GetFromBatchAndDB`. The patch adds
`PermitUncheckedError` to make the `ASSERT_STATUS_CHECKED` build pass in
this case as well.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8947
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D31125422
Pulled By: ltamasi
fbshipit-source-id: 42de51dcfa9384e032244c2b4d3f40e9a4111194
Summary:
Context/Summary:
https://github.com/facebook/rocksdb/pull/8890 added a public API `RateLimiter::GetTotalPendingRequest()` but mistakenly marked it as pure virtual, forcing RateLimiter's derived classes to implement this function and breaking backward compatibility.
This PR makes `RateLimiter::GetTotalPendingRequest()` as non-pure virtual method by providing a trivial implementation in rate_limiter.h
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8938
Test Plan: Passing existing tests
Reviewed By: pdillinger
Differential Revision: D31100661
Pulled By: hx235
fbshipit-source-id: 06eff1005156a6e5a881e393b2c5b2ad706897d8
Summary:
Several improvements to MultiRead:
1. Fix a bug in stress test which causes false positive when both MultiRead() return and individual read request have failure injected.
2. Add two more types of fault that should be handled: empty read results and checksum mismatch
3. Add a message indicating which type of fault is injected
4. Increase the failure rate
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8937
Reviewed By: anand1976
Differential Revision: D31085930
fbshipit-source-id: 3a04994a3cadebf9a64d25e1fe12b14b7a272fba
Summary:
Right now, if underlying read returns fewer bytes than asked for, RandomAccessFileReader::MultiRead() still returns those in the buffer to upper layer. This can be a surprise to upper layer.
This is unlikely to cause incorrect data. To cause incorrect data, checksum checking in upper layer should pass with short reads, whose chance is low.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8941
Test Plan: Run stress tests for a while
Reviewed By: anand1976
Differential Revision: D31085780
fbshipit-source-id: 999adf2d6c2712f1323d14bb68b678df59969973
Summary:
In FileChecksumTestHelper::VerifyEachFileChecksum(), we query the file list, and then for each file in the list verify the checksum. However, compaction can delete those files in the mean time and cause failures. To prevent it from happening, disable file deletion during the validation.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8942
Test Plan: Run exsiting test and see it doesn't fail.
Reviewed By: pdillinger
Differential Revision: D31086488
fbshipit-source-id: 554608f36d2dd3bf0a20dfc4039c68bd8533d7f8
Summary:
Updates a few remaining functions that should have been updated
from Status -> IOStatus, and adds to HISTORY for the overall change
including https://github.com/facebook/rocksdb/issues/8820.
This change is for inclusion in version 6.25.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8940
Test Plan: CI
Reviewed By: zhichao-cao
Differential Revision: D31085029
Pulled By: pdillinger
fbshipit-source-id: 91557c6a39ef1d90357d4f4dcd79af0645d87c7b
Summary:
Made SystemClock into a Customizable class, complete with CreateFromString.
Cleaned up some of the existing SystemClock implementations that were redundant (NoSleep was the same as the internal one for MockEnv).
Changed MockEnv construction to allow Clock to be passed to the Memory/MockFileSystem.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8636
Reviewed By: zhichao-cao
Differential Revision: D30483360
Pulled By: mrambacher
fbshipit-source-id: cd0e3a876c39f8c98fe13374c06e8edbd5b9f2a1
Summary:
In case of IO uring bugs, we need to provide a way for users to turn it off.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8931
Test Plan: Manually run db_bench with/without the option and verify the behavior
Reviewed By: pdillinger
Differential Revision: D31040252
Pulled By: anand1976
fbshipit-source-id: 56f2537d6ac8488c9e126296d8190ad9e0158f70
Summary:
Add support for fallback to local compaction, the user can
return `CompactionServiceJobStatus::kUseLocal` to instruct RocksDB to
run the compaction locally instead of waiting for the remote compaction
result.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8709
Test Plan: unittest
Reviewed By: ajkr
Differential Revision: D30560163
Pulled By: jay-zhuang
fbshipit-source-id: 65d8905a4a1bc185a68daa120997f21d3198dbe1
Summary:
As title. The reason is that after loading customized options, the env is not set back to the correct one. Fix it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8929
Test Plan: Manually validate in an environment where the command failed.
Reviewed By: riversand963
Differential Revision: D31026931
fbshipit-source-id: c25dc788bf80ed5bf4b24922c442781943bcd65b
Summary:
Because even 32-bit systems can have large files
This is a "change" that I don't want intermingled with an upcoming refactoring.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8926
Test Plan: CI
Reviewed By: zhichao-cao
Differential Revision: D31020974
Pulled By: pdillinger
fbshipit-source-id: ca9eb4510697df6f1f55e37b37730b88b1809a92
Summary:
- Fixed a bug in `RateLimiterTest.GeneratePriorityIterationOrder` that the callbacks in this test were not called starting from `i = 1`. Fix by increasing `rate_bytes_per_sec` and requested bytes.
- The bug is due to the previous `rate_bytes_per_sec` was set too small, resulting in `refill_bytes_per_period` less than `kMinRefillBytesPerPeriod`. Hence the actual `refill_bytes_per_period` was equal to `kMinRefillBytesPerPeriod` due to the logic [here](https://github.com/facebook/rocksdb/blob/main/util/rate_limiter.cc#L302-L303) and it ended up being greater than the previously set requested bytes. Therefore starting from `i = 1`, `RefillBytesAndGrantRequests()` and `GeneratePriorityIterationOrder` won't be called and the test callbacks was not triggered to execute the assertion.
- Added internal flag to assert callbacks are called in `RateLimiterTest.GeneratePriorityIterationOrder` to prevent any future changes defeat the purpose of the test [as suggested](https://github.com/facebook/rocksdb/pull/8890#discussion_r704915134)
- Increased `rate_bytes_per_sec` and bytes of each request in `RateLimiterTest.GetTotalBytesThrough`, `RateLimiterTest.GetTotalRequests`, `RateLimiterTest.GetTotalPendingRequests` to trigger the "long path" of execution (i.e, the one trigger RefillBytesAndGrantRequests()) to increase test coverage
- This increased the running time of the three tests, see test plan for time difference running locally
- Cleared up sync point effects after each test by calling `SyncPoint::GetInstance()->DisableProcessing();` and `SyncPoint::GetInstance()->ClearAllCallBacks();` in `~RateLimiterTest()` [as suggested](https://github.com/facebook/rocksdb/pull/8595/files#r697534279)
- It's fine to call these two methods even when `EnableProcessing()` or `SetCallBack()` is not called in the test or is already cleaned up. In those cases, calling these two functions in destructor is effectively no-op.
- This will allow cleaning up sync point effects of previous test even when the previous test failed in assertion.
- Added missing `SyncPoint::GetInstance()->DisableProcessing();` and `SyncPoint::GetInstance()->ClearCallBacks(..);` in existing tests for completeness
- Called `SyncPoint::GetInstance()->DisableProcessing();` and `SyncPoint::GetInstance()->ClearCallBacks(..);` in loop in `RateLimiterTest.GeneratePriorityIterationOrder` for completeness
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8904
Test Plan:
- Passing existing tests
- To verify the 1st change, run `RateLimiterTest.GeneratePriorityIterationOrder` with assertions of callbacks are indeed called under original `rate_bytes_per_sec` and request byte and under updated `rate_bytes_per_sec` and request byte. The former will fail the assertion while the latter succeeds.
- Here is the increased test time due to the 3rd change mentioned above in the summary. The relevant 3 tests mentioned in total increase the test time by 6s (~6000/33848 = 17.7% of the original total test time), which IMO is acceptable for better test coverage through running the "long path".
- current (run on branch rate_limiter_ut_improve locally)
[ RUN ] RateLimiterTest.GetTotalBytesThrough
[ OK ] RateLimiterTest.GetTotalBytesThrough (3000 ms)
[ RUN ] RateLimiterTest.GetTotalRequests
[ OK ] RateLimiterTest.GetTotalRequests (3001 ms)
[ RUN ] RateLimiterTest.GetTotalPendingRequests
[ OK ] RateLimiterTest.GetTotalPendingRequests (0 ms)
...
[----------] 10 tests from RateLimiterTest (43349 ms total)
[----------] Global test environment tear-down
[==========] 10 tests from 1 test case ran. (43349 ms total)
[ PASSED ] 10 tests.
- previous (run on branch main locally)
[ RUN ] RateLimiterTest.GetTotalBytesThrough
[ OK ] RateLimiterTest.GetTotalBytesThrough (0 ms)
[ RUN ] RateLimiterTest.GetTotalRequests
[ OK ] RateLimiterTest.GetTotalRequests (0 ms)
[ RUN ] RateLimiterTest.GetTotalPendingRequests
[ OK ] RateLimiterTest.GetTotalPendingRequests (0 ms)
...
[----------] 10 tests from RateLimiterTest (33848 ms total)
[----------] Global test environment tear-down
[==========] 10 tests from 1 test case ran. (33848 ms total)
[ PASSED ] 10 tests.
Reviewed By: ajkr
Differential Revision: D30872544
Pulled By: hx235
fbshipit-source-id: ff894f5c1a4bef70e8e407d53b00be45f776b3e4
Summary:
This keeps the implementations/API backward compatible. Implementations of Statistics will need to override this method (and be registered with the ObjectRegistry) in order to be created via CreateFromString.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8918
Reviewed By: pdillinger
Differential Revision: D30958916
Pulled By: mrambacher
fbshipit-source-id: 75b99a84e9e11fda2a9e8eff9ee1ef69a17517b2
Summary:
1. Extend FlushJobInfo and CompactionJobInfo with information about the blob files generated by flush/compaction jobs. This PR add two structures BlobFileInfo and BlobFileGarbageInfo that contains the required information of blob files.
2. Notify the creation and deletion of blob files through OnBlobFileCreationStarted, OnBlobFileCreated, and OnBlobFileDeleted.
3. Test OnFile*Finish operations notifications with Blob Files.
4. Log the blob file creation/deletion events through EventLogger in Log file.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8675
Test Plan: Add new unit tests in listener_test
Reviewed By: ltamasi
Differential Revision: D30412613
Pulled By: akankshamahajan15
fbshipit-source-id: ca51b63c6e8c8d0485a38c503572bc5a82bd5d07
Summary:
Right now, the failure injection test for MultiGet() is not sufficient. Improve it with TestFSRandomAccessFile::MultiRead() injecting failures.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8925
Test Plan: Run crash test locally for a while.
Reviewed By: anand1976
Differential Revision: D31000529
fbshipit-source-id: 439c7e02cf7440ac5af82deb609e202abdca3e1f
Summary:
Add compaction priority information in RemoteCompaction, which
can be used to schedule high priority job first.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8707
Test Plan: unittest
Reviewed By: ajkr
Differential Revision: D30548401
Pulled By: jay-zhuang
fbshipit-source-id: b30446511fb31b4583c49edd8565d496cf013a34
Summary:
One contrun name is incorrect, which mixed error reporting with another one. Fix it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8924
Reviewed By: ltamasi
Differential Revision: D30999477
fbshipit-source-id: 46a04b2e4b48f755181aa9a47c353d91f1128469
Summary:
Test did not consider that slower deletion rate only kicks in
after a file is deleted
Fixes https://github.com/facebook/rocksdb/issues/7546
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8917
Test Plan:
no longer reproduces using
buck test mode/dev //internal_repo_rocksdb/repo:db_sst_test -- --exact 'internal_repo_rocksdb/repo:db_sst_test - DBWALTestWithParam/DBWALTestWithParam.WALTrashCleanupOnOpen/0' --jobs 40 --stress-runs 600 --record-results
Reviewed By: siying
Differential Revision: D30949127
Pulled By: pdillinger
fbshipit-source-id: 5d0607f8f548071b07410fe8f532b4618cd225e5
Summary:
kFlushOnly currently means "always" except in the case of
remote compaction. This makes it flushes only.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8750
Test Plan: test updated
Reviewed By: akankshamahajan15
Differential Revision: D30968034
Pulled By: pdillinger
fbshipit-source-id: 5dbd24dde18852a0e937a540995fba9bfbe89037
Summary:
In order to populate the IOStatus up to the higher level, replace some of the Status to IOStatus.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8820
Test Plan: make check
Reviewed By: pdillinger
Differential Revision: D30967215
Pulled By: zhichao-cao
fbshipit-source-id: ccf9d5cfbd9d3de047c464aaa85f9fa43b474903
Summary:
Potential bugs in the IO uring implementation can cause bad data to be returned in the completion queue. Add some checks in the PosixRandomAccessFile::MultiRead completion handling code to catch such errors and fail the entire MultiRead. Also log some diagnostic messages and stack trace.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8894
Reviewed By: siying, pdillinger
Differential Revision: D30826982
Pulled By: anand1976
fbshipit-source-id: af91815ac760e095d6cc0466cf8bd5c10167fd15
Summary:
ArenaWrappedDBIter::db_iter_ should never be nullptr. However, when debugging a segfault, it's hard to distinguish it is not initialized (not possible) and other corruption. Add this nullptr to help distinguish the case.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8889
Test Plan: Run existing unit tests.
Reviewed By: pdillinger
Differential Revision: D30814756
fbshipit-source-id: 4b1f36896a33dc203d4f1f424ded9554927d61ba
Summary:
After https://github.com/facebook/rocksdb/issues/8725, keys added to `WriteBatch` may be timestamp-suffixed, while `WriteBatch` has no awareness of the timestamp size. Therefore, `WriteBatch` can no longer calculate timestamp checksum separately from the rest of the key's checksum in all cases.
This PR changes the definition of key in KV checksum to include the timestamp suffix. That way we do not need to worry about where the timestamp begins within the key. I believe the only practical effect of this change is now `AssignTimestamp()` requires recomputing the whole key checksum (`UpdateK()`) rather than just the timestamp portion (`UpdateT()`).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8914
Test Plan:
run stress command that used to fail
```
$ ./db_stress --batch_protection_bytes_per_key=8 -clear_column_family_one_in=0 -test_batches_snapshots=1
```
Reviewed By: riversand963
Differential Revision: D30925715
Pulled By: ajkr
fbshipit-source-id: c143f7ccb46c0efb390ad57ef415c250d754deff