Summary:
Currently, if blob files are enabled (i.e. `enable_blob_files` is true), large values are extracted both during flush/recovery (when SST files are written into level 0 of the LSM tree) and during compaction into any LSM tree level. For certain use cases that have a mix of short-lived and long-lived values, it might make sense to support extracting large values only during compactions whose output level is greater than or equal to a specified LSM tree level (e.g. compactions into L1/L2/... or above). This could reduce the space amplification caused by large values that are turned into garbage shortly after being written at the price of some write amplification incurred by long-lived values whose extraction to blob files is delayed.
In order to achieve this, we would like to do the following:
- Add a new configuration option `blob_file_starting_level` (default: 0) to `AdvancedColumnFamilyOptions` (and `MutableCFOptions` and extend the related logic)
- Instantiate `BlobFileBuilder` in `BuildTable` (used during flush and recovery, where the LSM tree level is L0) and `CompactionJob` iff `enable_blob_files` is set and the LSM tree level is `>= blob_file_starting_level`
- Add unit tests for the new functionality, and add the new option to our stress tests (`db_stress` and `db_crashtest.py` )
- Add the new option to our benchmarking tool `db_bench` and the BlobDB benchmark script `run_blob_bench.sh`
- Add the new option to the `ldb` tool (see https://github.com/facebook/rocksdb/wiki/Administration-and-Data-Access-Tool)
- Ideally extend the C and Java bindings with the new option
- Update the BlobDB wiki to document the new option.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10077
Reviewed By: ltamasi
Differential Revision: D36884156
Pulled By: gangliao
fbshipit-source-id: 942bab025f04633edca8564ed64791cb5e31627d
Summary:
The patch adds a new BlobDB benchmarking script called `run_blob_bench.sh`.
It is a thin wrapper around `benchmark.sh` (similarly to `run_flash_bench.sh`):
it actually calls `benchmark.sh` a number of times, cycling through six workloads,
two write-only ones (bulk load and overwrite), two read/write ones (point lookups
while writing, range scans while writing), and two read-only ones (point lookups
and range scans).
Note: this is a simpler/cleaned up/reworked version of the script used to produce the
benchmark results in http://rocksdb.org/blog/2021/05/26/integrated-blob-db.html .
The new version takes advantage of several recent `benchmark.sh` improvements
like the ability to pass in arbitrary `db_bench` options or the possibility of using a
job ID.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9015
Test Plan: Ran the script manually with different parameter combinations.
Reviewed By: riversand963
Differential Revision: D31555277
Pulled By: ltamasi
fbshipit-source-id: 0e151b2f7b2cf6f66ed7f95455571492ad7ea87f