Summary:
After https://github.com/facebook/rocksdb/issues/11301, I wasn't sure whether I had regressed block cache tracing with MultiGet. Demo PR https://github.com/facebook/rocksdb/issues/11330 shows the flawed state of tracing MultiGet before my change, and based on the unit test, there was essentially no change in tracing behavior with https://github.com/facebook/rocksdb/issues/11301. This change is to leave that code and behavior better than I found it.
This change is not intended to change any production behaviors except when block cache tracing is active, though might improve general read path efficiency by disabling some related tracking when such tracing is disabled.
More detail on production code:
* Refactoring to consolidate the construction of BlockCacheTraceRecord, and other related functionality, in block-based table reader, though it's somewhat awkward to preserve an optimization to avoid copying Slices into temporary strings in BlockCacheLookupContext.
* Accurately track cache hits and misses (etc.) for each data block accessed by a MultiGet(). (Previously reported hits as misses.)
* Reduced repeated checking of `block_cache_tracer_` state (by creating lookup_context only when active) for efficiency and to reduce the risk of corner case bugs where tracing is enabled or disabled for different parts of a read op. (See a TODO below)
* Improved estimate calculation for num_keys_in_block (see code comment)
Possible follow-up:
* `XXX:` use_cache=true means double cache query? (possible double-query of block cache when allow_mmap_reads=true)
* `TODO:` need more than one lookup_context here to track individual filter and index partition hits and misses
* `TODO:` optimize more state checks of `block_cache_tracer_` down to `lookup_context != nullptr`
* Pre-existing `XXX:` There appear to be 'break' statements above that bypass this writing of the block cache trace record
* Expand test coverage (see below)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11339
Test Plan:
* Added a basic unit test for block cache tracing MultiGet, for now just covering one data block with two keys.
* Added HitMissCountingCache to independently verify that the actual block cache trace and expected block cache trace also agree with the actual number of cache hits / misses (nothing missing or mislabeled). For now only used with MultiGet test.
* Better testing of num_keys_in_block, for now just with MultiGet
* Misc improvements to table_test to improve clarity, such as making it clear that certain keys are auto-inserted at the start of every test.
Performance test:
Testing multireadrandom as in https://github.com/facebook/rocksdb/issues/11301, except averaging over distinct runs rather than [-X30] which doesn't seem to sufficiently reset after each run to work as an independent test run.
Base with revert of 11301: 3148926 ops/sec
Base: 3019146 ops/sec
New: 2999529 ops/sec
Possibly a tiny MultiGet CPU regression with this change. We are now always allocating an additional vector for the LookupContexts. I'm still contemplating options to try to correct the regression in https://github.com/facebook/rocksdb/issues/11301.
Testing readrandom:
Base with revert of 11301: 2311988
Base: 2281726
New: 2299722
Possibly a tiny Get CPU improvement with this change. We are now avoiding some unnecessary LookupContext population.
Reviewed By: akankshamahajan15
Differential Revision: D44557845
Pulled By: pdillinger
fbshipit-source-id: b841691799d2a48fb59cc8880dc7cbb1e107ae3d