Summary:
Follow-up to https://github.com/facebook/rocksdb/issues/8426 .
The patch adds a new kind of `InternalIterator` that wraps another one and
passes each key-value encountered to `BlobGarbageMeter` as inflow.
This iterator will be used as an input iterator for compactions when the input
SSTs reference blob files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8443
Test Plan: `make check`
Reviewed By: jay-zhuang
Differential Revision: D29311987
Pulled By: ltamasi
fbshipit-source-id: b4493b4c0c0c2e3c2ecc33c8969a5ef02de5d9d8
Summary:
This is part of an alternative approach to https://github.com/facebook/rocksdb/issues/8316.
Unlike that approach, this one relies on key-values getting processed one by one
during compaction, and does not involve persistence.
Specifically, the patch adds a class `BlobGarbageMeter` that can track the number
and total size of blobs in a (sub)compaction's input and output on a per-blob file
basis. This information can then be used to compute the amount of additional
garbage generated by the compaction for any given blob file by subtracting the
"outflow" from the "inflow."
Note: this patch only adds `BlobGarbageMeter` and associated unit tests. I plan to
hook up this class to the input and output of `CompactionIterator` in a subsequent PR.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8426
Test Plan: `make check`
Reviewed By: jay-zhuang
Differential Revision: D29242250
Pulled By: ltamasi
fbshipit-source-id: 597e50ad556540e413a50e804ba15bc044d809bb
Summary:
- `c_test` fails because `rocksdb_compact_range()` swallows a `Status`.
- `env_test` fails because `ReadRequest`s to `MultiRead()` do not have their `Status`es checked.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8430
Test Plan: `ASSERT_STATUS_CHECKED=1 make -j48 check`
Reviewed By: jay-zhuang
Differential Revision: D29257473
Pulled By: ajkr
fbshipit-source-id: e02127f971703744be7de85f0a028e4664c79577
Summary:
Logically, subcompactions process a key range [start, end); however, the way
this is currently implemented is that the `CompactionIterator` for any given
subcompaction keeps processing key-values until it actually outputs a key that
is out of range, which is then discarded. Instead of doing this, the patch
introduces a new type of internal iterator called `ClippingIterator` which wraps
another internal iterator and "clips" its range of key-values so that any KVs
returned are strictly in the [start, end) interval. This does eliminate a (minor)
inefficiency by stopping processing in subcompactions exactly at the limit;
however, the main motivation is related to BlobDB: namely, we need this to be
able to measure the amount of garbage generated by a subcompaction
precisely and prevent off-by-one errors.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8327
Test Plan: `make check`
Reviewed By: siying
Differential Revision: D28761541
Pulled By: ltamasi
fbshipit-source-id: ee0e7229f04edabbc7bed5adb51771fbdc287f69
Summary:
This PR adds a ```-secondary_cache_uri``` option to the cache_bench and db_bench tools to allow the user to specify a custom secondary cache URI. The object registry is used to create an instance of the ```SecondaryCache``` object of the type specified in the URI.
The main cache_bench code is packaged into a separate library, similar to db_bench.
An example invocation of db_bench with a secondary cache URI -
```db_bench --env_uri=ws://ws.flash_sandbox.vll1_2/ -db=anand/nvm_cache_2 -use_existing_db=true -benchmarks=readrandom -num=30000000 -key_size=32 -value_size=256 -use_direct_reads=true -cache_size=67108864 -cache_index_and_filter_blocks=true -secondary_cache_uri='cachelibwrapper://filename=/home/anand76/nvm_cache/cache_file;size=2147483648;regionSize=16777216;admPolicy=random;admProbability=1.0;volatileSize=8388608;bktPower=20;lockPower=12' -partition_index_and_filters=true -duration=1800```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8312
Reviewed By: zhichao-cao
Differential Revision: D28544325
Pulled By: anand1976
fbshipit-source-id: 8f209b9af900c459dc42daa7a610d5f00176eeed
Summary:
When WriteBufferManager is shared across DBs and column families
to maintain memory usage under a limit, OOMs have been observed when flush cannot
finish but writes continuously insert to memtables.
In order to avoid OOMs, when memory usage goes beyond buffer_limit_ and DBs tries to write,
this change will stall incoming writers until flush is completed and memory_usage
drops.
Design: Stall condition: When total memory usage exceeds WriteBufferManager::buffer_size_
(memory_usage() >= buffer_size_) WriterBufferManager::ShouldStall() returns true.
DBImpl first block incoming/future writers by calling write_thread_.BeginWriteStall()
(which adds dummy stall object to the writer's queue).
Then DB is blocked on a state State::Blocked (current write doesn't go
through). WBStallInterface object maintained by every DB instance is added to the queue of
WriteBufferManager.
If multiple DBs tries to write during this stall, they will also be
blocked when check WriteBufferManager::ShouldStall() returns true.
End Stall condition: When flush is finished and memory usage goes down, stall will end only if memory
waiting to be flushed is less than buffer_size/2. This lower limit will give time for flush
to complete and avoid continous stalling if memory usage remains close to buffer_size.
WriterBufferManager::EndWriteStall() is called,
which removes all instances from its queue and signal them to continue.
Their state is changed to State::Running and they are unblocked. DBImpl
then signal all incoming writers of that DB to continue by calling
write_thread_.EndWriteStall() (which removes dummy stall object from the
queue).
DB instance creates WBMStallInterface which is an interface to block and
signal DBs during stall.
When DB needs to be blocked or signalled by WriteBufferManager,
state_for_wbm_ state is changed accordingly (RUNNING or BLOCKED).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7898
Test Plan: Added a new test db/db_write_buffer_manager_test.cc
Reviewed By: anand1976
Differential Revision: D26093227
Pulled By: akankshamahajan15
fbshipit-source-id: 2bbd982a3fb7033f6de6153aa92a221249861aae
Summary:
- Fixes the makefile to do the right thing when invoking multiple targets (e.g. make shared_lib install-shared).
- Fixes the building of db_stress in shared lib mode.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8195
Reviewed By: pdillinger
Differential Revision: D27803452
Pulled By: mrambacher
fbshipit-source-id: 7c285d267770a359eb47f25855affdf58687e0e4
Summary:
The previous version of ZStd doesn't build correctly with Make 3.82. Updating it resolves the issue.
jay-zhuang This also needs to be cherry-picked to:
1. 6.17.fb
2. 6.18.fb
3. 6.19.fb
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8155
Reviewed By: riversand963
Differential Revision: D27596460
Pulled By: ajkr
fbshipit-source-id: ac8492245e6273f54efcc1587346a797a91c9441
Summary:
New tests should by default be expected to be parallelizeable
and passing with ASSERT_STATUS_CHECKED. Thus, I'm changing those two
lists to exclusions rather than inclusions.
For the set of exclusions, I only listed things that currently failed
for me when attempting not to exclude, or had some other documented
reason. This marks many more tests as "parallel," which will potentially
cause some failures from self-interference, but we can address those as
they are discovered.
Also changed CircleCI ASC test to be parallelized; the easy way to do
that is to exclude building tests that don't pass ASC, which is now a
small set.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8146
Test Plan: Watch CI, etc.
Reviewed By: riversand963
Differential Revision: D27542782
Pulled By: pdillinger
fbshipit-source-id: bdd74bcd912a963ee33f3fc0d2cad2567dc7740f
Summary:
At least under MacOS, some things were excluded from the build (like Snappy) because the compilation flags were not passed in correctly. This PR does a few things:
- Passes the EXTRA_CXX/LDFLAGS into build_detect_platform. This means that if some tool (like TBB for example) is not installed in a standard place, it could still be detected by build_detect_platform. In this case, the developer would invoke: "EXTRA_CXXFLAGS=<path to TBB include> EXTRA_LDFLAGS=<path to TBB library> make", and the build script would find the tools in the extra location.
- Changes the compilation tests to use PLATFORM_CXXFLAGS. This change causes the EXTRA_FLAGS passed in to the script to be included in the compilation check. Additionally, flags set by the script itself (like --std=c++11) will be used during the checks.
Validated that the make_platform.mk file generated on Linux does not change with this change. On my MacOS machine, the SNAPPY libraries are now available (they were not before as they required --std=c++11 to build).
I also verified that I can build against TBB installed on my Mac by passing in the EXTRA CXX and LD FLAGS to the location in which TBB is installed.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8111
Reviewed By: jay-zhuang
Differential Revision: D27353516
Pulled By: mrambacher
fbshipit-source-id: b6b378c96dbf678bab1479556dcbcb49c47e807d
Summary:
If the platform is ppc64 and the libc is not GNU libc, then we exclude the range_tree from compilation.
See https://jira.percona.com/browse/PS-7559
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8070
Reviewed By: jay-zhuang
Differential Revision: D27246004
Pulled By: mrambacher
fbshipit-source-id: 59d8433242ce7ce608988341becb4f83312445f5
Summary:
Because build_version.cc is dependent on the library objects (to force a re-generation of it), the library objects would be built in order to satisfy this rule. Because there is a build_version.d file, it would need generated and included.
Change the ALL_DEPS/FILES to not include build_version.cc (meaning no .d file for it, which is okay since it is generated). Also changed the rule on whether or not to generate DEP files to skip tags.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8097
Reviewed By: ajkr
Differential Revision: D27299815
Pulled By: mrambacher
fbshipit-source-id: 1efbe8a56d062f57ae13b6c2944ad3faf775087e
Summary:
Add some basic test for user-defined timestamp to db_stress. Currently,
read with timestamp always tries to read using the current timestamp.
Due to the per-key timestamp-sequence ordering constraint, we only add timestamp-
related tests to the `NonBatchedOpsStressTest` since this test serializes accesses
to the same key and uses a file to cross-check data correctness.
The timestamp feature is not supported in a number of components, e.g. Merge, SingleDelete,
DeleteRange, CompactionFilter, Readonly instance, secondary instance, SST file ingestion, transaction,
etc. Therefore, db_stress should exit if user enables both timestamp and these features at the same
time. The (currently) incompatible features can be found in
`CheckAndSetOptionsForUserTimestamp`.
This PR also fixes a bug triggered when timestamp is enabled together with
`index_type=kBinarySearchWithFirstKey`. This bug fix will also be in another separate PR
with more unit tests coverage. Fixing it here because I do not want to exclude the index type
from crash test.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8061
Test Plan: make crash_test_with_ts
Reviewed By: jay-zhuang
Differential Revision: D27056282
Pulled By: riversand963
fbshipit-source-id: c3e00ad1023fdb9ebbdf9601ec18270c5e2925a9
Summary:
Allow applications to implement a custom compaction filter and pass it to BlobDB.
The compaction filter's custom logic can operate on blobs.
To do so, application needs to subclass `CompactionFilter` abstract class and implement `FilterV2()` method.
Optionally, a method called `ShouldFilterBlobByKey()` can be implemented if application's custom logic rely solely
on the key to make a decision without reading the blob, thus saving extra IO. Examples can be found in
db/blob/db_blob_compaction_test.cc.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7974
Test Plan: make check
Reviewed By: ltamasi
Differential Revision: D26509280
Pulled By: riversand963
fbshipit-source-id: 59f9ae5614c4359de32f4f2b16684193cc537b39
Summary:
Extend VerifyFileChecksums API to verify blob files in case of
use_file_checksum.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7979
Test Plan: New unit test db_blob_corruption_test
Reviewed By: ltamasi
Differential Revision: D26534040
Pulled By: akankshamahajan15
fbshipit-source-id: 7dc5951a3df9d265ea1265e0122b43c966856ade
Summary:
I noticed tests frequently timing out on CircleCI when I submit a PR. I did some investigation and found the SeqAdvanceConcurrentTest suite (OneWriteQueue, TwoWriteQueues) tests were all taking a long time to complete (30 tests each taking at least 15K ms).
This PR adds those test to the "slow reg" list in order to move them earlier in the execution sequence so that they are not the "long tail".
For completeness, other tests that were also slow are:
NumLevels/DBTestUniversalCompaction.UniversalCompactionTrivialMoveTest : 12 tests all taking 12K+ ms
ReadSequentialFileTest with ReadaheadSize: 8 tests all 12K+ ms
WriteUnpreparedTransactionTest.RecoveryTest : 2 tests at 22K+ ms
DBBasicTest.EmptyFlush: 1 test at 35K+ ms
RateLimiterTest.Rate: 1 test at 23K+ ms
BackupableDBTest.ShareTableFilesWithChecksumsTransition: 1 test at 16K+ ms
MulitThreadedDBTest.MultitThreaded: 78 tests at 10K+ ms
TransactionStressTest.DeadlockStress: 7 tests at 11K+ ms
DBBasicTestDeadline.IteratorDeadline: 3 tests at 10K+ ms
No effort was made to determine why the tests were slow.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7973
Reviewed By: jay-zhuang
Differential Revision: D26519130
Pulled By: mrambacher
fbshipit-source-id: 11555c9115acc207e45e210a7fc7f879170a3853
Summary:
Added support for detecting plugins linked in the "plugin/" directory and building them from our Makefile in a standardized way. See "plugin/README.md" for details. An example of a plugin that can be built in this way can be found in https://github.com/ajkr/dedupfs.
There will be more to do in terms of making this process more convenient and adding support for CMake.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7918
Test Plan: my own plugin (https://github.com/ajkr/dedupfs) and also heard this patch worked with ZenFS.
Reviewed By: pdillinger
Differential Revision: D26189969
Pulled By: ajkr
fbshipit-source-id: 6624d4357d0ffbaedb42f0d12a3fcb737c78f758
Summary:
Recent Github actions of format checking fail due to invalid location
from where clang-format-diff.py is downloaded. Update the path to point
to a stable, archived location.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7944
Test Plan: manually check the result of Github action.
Reviewed By: ltamasi
Differential Revision: D26345066
Pulled By: riversand963
fbshipit-source-id: 2b1a58c2e59c2f1eb11202d321d2ea002cb0917e
Summary:
(Fixes a regression introduced in the build_version generation PR https://github.com/facebook/rocksdb/issues/7866 )
In the Makefile case, needed to ignore stderr on the tag (everywhere else was fine).
In the CMAKE case, no GIT implies "changes" so that we use the system date rather than the empty GIT date.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7916
Test Plan: Built in a tree that did not contain the ".git" directory. Validated that no errors appeared during the build process and that the build version date was not empty.
Reviewed By: jay-zhuang
Differential Revision: D26169203
Pulled By: mrambacher
fbshipit-source-id: 3288a23b48d97efed5e5b38c9aefb3ef1153fa16
Summary:
This PR adds the foundation classes for key-value integrity protection and the first use case: protecting live updates from the source buffers added to `WriteBatch` through the destination buffer in `MemTable`. The width of the protection info is not yet configurable -- only eight bytes per key is supported. This PR allows users to enable protection by constructing `WriteBatch` with `protection_bytes_per_key == 8`. It does not yet expose a way for users to get integrity protection via other write APIs (e.g., `Put()`, `Merge()`, `Delete()`, etc.).
The foundation classes (`ProtectionInfo.*`) embed the coverage info in their type, and provide `Protect.*()` and `Strip.*()` functions to navigate between types with different coverage. For making bytes per key configurable (for powers of two up to eight) in the future, these classes are templated on the unsigned integer type used to store the protection info. That integer contains the XOR'd result of hashes with independent seeds for all covered fields. For integer fields, the hash is computed on the raw unadjusted bytes, so the result is endian-dependent. The most significant bytes are truncated when the hash value (8 bytes) is wider than the protection integer.
When `WriteBatch` is constructed with `protection_bytes_per_key == 8`, we hold a `ProtectionInfoKVOTC` (i.e., one that covers key, value, optype aka `ValueType`, timestamp, and CF ID) for each entry added to the batch. The protection info is generated from the original buffers passed by the user, as well as the original metadata generated internally. When writing to memtable, each entry is transformed to a `ProtectionInfoKVOTS` (i.e., dropping coverage of CF ID and adding coverage of sequence number), since at that point we know the sequence number, and have already selected a memtable corresponding to a particular CF. This protection info is verified once the entry is encoded in the `MemTable` buffer.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7748
Test Plan:
- an integration test to verify a wide variety of single-byte changes to the encoded `MemTable` buffer are caught
- add to stress/crash test to verify it works in variety of configs/operations without intentional corruption
- [deferred] unit tests for `ProtectionInfo.*` classes for edge cases like KV swap, `SliceParts` and `Slice` APIs are interchangeable, etc.
Reviewed By: pdillinger
Differential Revision: D25754492
Pulled By: ajkr
fbshipit-source-id: e481bac6c03c2ab268be41359730f1ceb9964866
Summary:
Closes https://github.com/facebook/rocksdb/issues/7035
Changed how build_version.cc was generated:
- Included the GIT tag/branch in the build_version file
- Changed the "Build Date" to be:
- If the GIT branch is "clean" (no changes), the date of the last git commit
- If the branch is not clean, the current date
- Added APIs to access the "build information", rather than accessing the strings directly.
The build_version.cc file is now regenerated whenever the library objects are rebuilt.
Verified that the built files remain the same size across builds on a "clean build" and the same information is reported by sst_dump --version
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7866
Reviewed By: pdillinger
Differential Revision: D26086565
Pulled By: mrambacher
fbshipit-source-id: 6fcbe47f6033989d5cf26a0ccb6dfdd9dd239d7f
Summary:
This fixes an issue introduced in https://github.com/facebook/rocksdb/pull/7769 that caused many errors about missing compression libraries to be displayed during compilation, although compilation actually succeeded. This PR fixes the compilation so the compression libraries are only introduced where strictly needed.
It likely needs to be merged into the same branches as https://github.com/facebook/rocksdb/pull/7769 which I think are:
1. master
2. 6.15.fb
3. 6.16.fb
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7803
Reviewed By: ramvadiv
Differential Revision: D25733743
Pulled By: pdillinger
fbshipit-source-id: 6c04f6864b2ff4a345841d791a89b19e0e3f5bf7
Summary:
Added "no-elide-constructors to the ASSERT_STATUS_CHECK builds. This flag gives more errors/warnings for some of the Status checks where an inner class checks a Status and later returns it. In this case, without the elide check on, the returned status may not have been checked in the caller, thereby bypassing the checked code.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7798
Reviewed By: jay-zhuang
Differential Revision: D25680451
Pulled By: pdillinger
fbshipit-source-id: c3f14ed9e2a13f0a8c54d839d5fb4d1fc1e93917
Summary:
Range Locking - an implementation based on the locktree library
- Add a RangeTreeLockManager and RangeTreeLockTracker which implement
range locking using the locktree library.
- Point locks are handled as locks on single-point ranges.
- Add a unit test: range_locking_test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7506
Reviewed By: akankshamahajan15
Differential Revision: D25320703
Pulled By: cheng-chang
fbshipit-source-id: f86347384b42ba2b0257d67eca0f45f806b69da7
Summary:
* Fixes a Java test compilation issue on macOS
* Cleans up CircleCI RocksDBJava build config
* Adds CircleCI for RocksDBJava on MacOS
* Ensures backwards compatibility with older macOS via CircleCI
* Fixes RocksJava static builds ordering
* Adds missing RocksJava static builds to CircleCI for Mac and Linux
* Improves parallelism in RocksJava builds
* Reduces the size of the machines used for RocksJava CircleCI as they don't need to be so large (Saves credits)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7769
Reviewed By: akankshamahajan15
Differential Revision: D25601293
Pulled By: pdillinger
fbshipit-source-id: 0a0bb9906f65438fe143487d78e37e1947364d08
Summary:
Expands on https://github.com/facebook/rocksdb/pull/7016 so that when `PORTABLE=1` is set the dependencies for RocksJava static target will also be built with backwards compatibility for MacOS as far back as 10.12 (i.e. 2016).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7683
Reviewed By: ajkr
Differential Revision: D25034164
Pulled By: pdillinger
fbshipit-source-id: dc9e51828869ed9ec336a8a86683e4d0bfe04f27
Summary:
The Customizable class is an extension of the Configurable class and allows instances to be created by a name/ID. Classes that extend customizable can define their Type (e.g. "TableFactory", "Cache") and a method to instantiate them (TableFactory::CreateFromString). Customizable objects can be registered with the ObjectRegistry and created dynamically.
Future PRs will make more types of objects extend Customizable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6590
Reviewed By: cheng-chang
Differential Revision: D24841553
Pulled By: zhichao-cao
fbshipit-source-id: d0c2132bd932e971cbfe2c908ca2e5db30c5e155
Summary:
Since the hashes should not be persisted in output_validator
nor mock_env.
Also updated NPHash64 to use 64-bit seed, and comments.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7632
Test Plan:
make check, and new build setting that enables modification
to NPHash64, to check for behavior depending on specific values. Added
that setting to one of the CircleCI configurations.
Reviewed By: jay-zhuang
Differential Revision: D24833780
Pulled By: pdillinger
fbshipit-source-id: 02a57652ccf1ac105fbca79e77875bb7bf7c071f
Summary:
This is intended as the first commit toward a near-optimal alternative to static Bloom filters for SSTs. Stephan Walzer and I have agreed upon the name "Ribbon" for a PHSF based on his linear system construction in "Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications" ("SGauss") and my much faster "on the fly" algorithm for gaussian elimination (or for this linear system, "banding"), which can be faster than peeling while also more compact and flexible. See util/ribbon_alg.h for more detailed introduction and background. RIBBON = Rapid Incremental Boolean Banding ON-the-fly
This commit just adds generic (templatized) core algorithms and a basic unit test showing some features, including the ability to construct structures within 2.5% space overhead vs. information theoretic lower bound. (Compare to cache-local Bloom filter's ~50% space overhead -> ~30% reduction anticipated.) This commit does not include the storage scheme necessary to make queries fast, especially for filter queries, nor fractional "result bits", but there is some description already and those implementations will come soon. Nor does this commit add FilterPolicy support, for use in SST files, but that will also come soon.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7491
Reviewed By: jay-zhuang
Differential Revision: D24517954
Pulled By: pdillinger
fbshipit-source-id: 0119ee597e250d7e0edd38ada2ba50d755606fa7
Summary:
In order to be able to introduce more locking protocols, we need to abstract out the locking subsystem in TransactionDB into a set of interfaces.
PR https://github.com/facebook/rocksdb/pull/7013 introduces interface `LockTracker`. This PR is a follow up to take the first step to abstract out a `LockManager` interface.
Further modifications to the interface may be needed when introducing the first implementation of range lock. But the idea here is to put the range lock implementation based on range tree under the `utilities/transactions/lock/range/range_tree`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7532
Test Plan: point_lock_manager_test
Reviewed By: ajkr
Differential Revision: D24238731
Pulled By: cheng-chang
fbshipit-source-id: 2a9458cd8b3fb008d9529dbc4d3b28c24631f463
Summary:
The patch adds blob file support to the `Get` API by extending `Version` so that
whenever a blob reference is read from a file, the blob is retrieved from the corresponding
blob file and passed back to the caller. (This is assuming the blob reference is valid
and the blob file is actually part of the given `Version`.) It also introduces a cache
of `BlobFileReader`s called `BlobFileCache` that enables sharing `BlobFileReader`s
between callers. `BlobFileCache` uses the same backing cache as `TableCache`, so
`max_open_files` (if specified) limits the total number of open (table + blob) files.
TODO: proactively open/cache blob files and pin the cache handles of the readers in the
metadata objects similarly to what `VersionBuilder::LoadTableHandlers` does for
table files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7540
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D24260219
Pulled By: ltamasi
fbshipit-source-id: a8a2a4f11d3d04d6082201b52184bc4d7b0857ba