rocksdb

Commit Graph

Author	SHA1	Message	Date
Hui Xiao	629605d645	Move prefetching responsibility to page cache for compaction read under non directIO usecase (#11631 ) Summary: Context/Summary As titled. The benefit of doing so is to explicitly call readahead() instead of relying page cache behavior for compaction read when we know that we most likely need readahead as compaction read is sequential read . Test Extended the existing UT to cover compaction read case Pull Request resolved: https://github.com/facebook/rocksdb/pull/11631 Reviewed By: ajkr Differential Revision: D47681437 Pulled By: hx235 fbshipit-source-id: 78792f64985c4dc44aa8f2a9c41ab3e8bbc0bc90	2 years ago
Changyu Bi	df082c8d1d	Deprecate option `periodic_compaction_seconds` for FIFO compaction (#11550 ) Summary: both options `ttl` and `periodic_compaction_seconds` have the same meaning for FIFO compaction, which is redundant and can be confusing to use. For example, setting TTL to 0 does not disable TTL: user needs to also set periodic_compaction_seconds to 0. Another example is that dynamically setting `periodic_compaction_seconds` (surprisingly) has no effect on TTL compaction. This is because FIFO compaction picker internally only looks at value of `ttl`. The value of `ttl` is in `SanitizeOptions()` which take into account the value of `periodic_compaction_seconds`, but dynamically setting an option does not invoke this method. This PR clarifies the usage of both options for FIFO compaction: only `ttl` should be used, `periodic_compaction_seconds` will not have any effect on FIFO compaction. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11550 Test Plan: - updated existing unit test `DBOptionsTest.SanitizeFIFOPeriodicCompaction` - checked existing values of both options in feature matrix: https://fburl.com/daiquery/xxd0gs9w. All current uses cases either have `periodic_compaction_seconds = 0` or have `periodic_compaction_seconds > ttl`, so should not cause change of behavior. Reviewed By: ajkr Differential Revision: D46902959 Pulled By: cbi42 fbshipit-source-id: a9ede235b276783b4906aaec443551fa62ceff4c	2 years ago
akankshamahajan	94c247bff8	Update HISTORY.md for branch cut for 8.4.fb (#11565 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11565 Reviewed By: jowlyzhang, cbi42 Differential Revision: D47027788 Pulled By: akankshamahajan15 fbshipit-source-id: e5e8db2eb21f8aa68fe072f0e1b63b83ba7beb9f	2 years ago
Changyu Bi	bc04ec85db	Make option `level_compaction_dynamic_level_bytes` true by default (#11525 ) Summary: after https://github.com/facebook/rocksdb/issues/11321 and https://github.com/facebook/rocksdb/issues/11340 (both included in RocksDB v8.2), migration from `level_compaction_dynamic_level_bytes=false` to `level_compaction_dynamic_level_bytes=true` is automatic by RocksDB and requires no manual compaction from user. Making the option true by default as it has several advantages: 1. better space amplification guarantee (a more stable LSM shape). 2. compaction is more adaptive to write traffic. 3. automatic draining of unneeded levels. Wiki is updated with more detail: https://github.com/facebook/rocksdb/wiki/Leveled-Compaction#option-level_compaction_dynamic_level_bytes-and-levels-target-size. The PR mostly contains fixes for unit tests as they assumed `level_compaction_dynamic_level_bytes=false`. Most notable change is commit `f742be330c` and `b1928e42b3` which override the default option in DBTestBase to still set `level_compaction_dynamic_level_bytes=false` by default. This helps to reduce the change needed for unit tests. I think this default option override in unit tests is okay since the behavior of `level_compaction_dynamic_level_bytes=true` is tested by explicitly setting this option. Also, `level_compaction_dynamic_level_bytes=false` may be more desired in unit tests as it makes it easier to create a desired LSM shape. Comment for option `level_compaction_dynamic_level_bytes` is updated to reflect this change and change made in https://github.com/facebook/rocksdb/issues/10057. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11525 Test Plan: `make -j32 J=32 check` several times to try to catch flaky tests due to this option change. Reviewed By: ajkr Differential Revision: D46654256 Pulled By: cbi42 fbshipit-source-id: 6b5827dae124f6f1fdc8cca2ac6f6fcd878830e1	2 years ago
Changyu Bi	15e8a843d9	Do not include last level in compaction when `allow_ingest_behind=true` (#11489 ) Summary: when a DB is configured with `allow_ingest_behind = true`, the last level should be reserved for ingested files and these files should not be included in any compaction. Currently, a major compaction can compact these files to smaller levels. This can cause future files to be rejected for ingest behind (see `ExternalSstFileIngestionJob::CheckLevelForIngestedBehindFile()`). This PR fixes the issue such that files in the last level is not included in any compaction. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11489 Test Plan: * Updated unit test `ExternalSSTFileTest.IngestBehind` to test that last level is not included in manual and auto-compaction. Reviewed By: ajkr Differential Revision: D46455711 Pulled By: cbi42 fbshipit-source-id: 5e2142c2a709ef932ad797897795021c06c4ac8c	2 years ago
Changyu Bi	4aa52d89cf	Drop range tombstone during non-bottommost compaction (#11459 ) Summary: Similar to point tombstones, we can drop a range tombstone during compaction when we know its range does not exist in any higher level. This PR adds this optimization. Some existing test in db_range_del_test is fixed to work under this optimization. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11459 Test Plan: * Add unit test `DBRangeDelTest, NonBottommostCompactionDropRangetombstone`. * Ran crash test that issues range deletion for a few hours: `python3 tools/db_crashtest.py blackbox --simple --write_buffer_size=1048576 --delrangepercent=10 --writepercent=31 --readpercent=40` Reviewed By: ajkr Differential Revision: D46007904 Pulled By: cbi42 fbshipit-source-id: 3f37205b6778b7d55ed106369ca41b0632a6d0fd	2 years ago
Changyu Bi	e95cc1217d	`CompactRange()` always compacts to bottommost level for leveled compaction (#11468 ) Summary: currently for leveled compaction, the max output level of a call to `CompactRange()` is pre-computed before compacting each level. This max output level is the max level whose key range overlaps with the manual compaction key range. However, during manual compaction, files in the max output level may be compacted down further by some background compaction. When this background compaction is a trivial move, there is a race condition and the manual compaction may not be able to compact all keys in the specified key range. This PR updates `CompactRange()` to always compact to the bottommost level to make this race condition more unlikely (it can still happen, see more in comment here: `796f58f42a/db/db_impl/db_impl_compaction_flush.cc (L1180C29-L1184)`). This PR also changes the behavior of CompactRange() when `bottommost_level_compaction=kIfHaveCompactionFilter` (the default option). The old behavior is that, if a compaction filter is provided, CompactRange() always does an intra-level compaction at the final output level for all files in the manual compaction key range. The only exception when `first_overlapped_level = 0` and `max_overlapped_level = 0`. It’s awkward to maintain the same behavior after this PR since we do not compute max_overlapped_level anymore. So the new behavior is similar to kForceOptimized: always does intra-level compaction at the bottommost level, but not including new files generated during this manual compaction. Several unit tests are updated to work with this new manual compaction behavior. Pull Request resolved: https://github.com/facebook/rocksdb/pull/11468 Test Plan: Add new unit tests `DBCompactionTest.ManualCompactionCompactAllKeysInRange*` Reviewed By: ajkr Differential Revision: D46079619 Pulled By: cbi42 fbshipit-source-id: 19d844ba4ec8dc1a0b8af5d2f36ff15820c6e76f	2 years ago
Peter Dillinger	8848ec92dd	Better management of unreleased HISTORY (#11481 ) Summary: See new NOTE in HISTORY.md and unreleased_history/README.txt Pull Request resolved: https://github.com/facebook/rocksdb/pull/11481 Test Plan: some manual testing on my CentOS 8 system Reviewed By: jaykorean Differential Revision: D46233342 Pulled By: pdillinger fbshipit-source-id: daf59cf3dc907f450b469090dcc481a30a7d7c0d	2 years ago

8 Commits (9cc0986ae2e652c8d121fcebdd027ac281849e2a)