Summary:
Background. One of the core risks of chosing HyperClockCache is ending up with degraded performance if estimated_entry_charge is very significantly wrong. Too low leads to under-utilized hash table, which wastes a bit of (tracked) memory and likely increases access times due to larger working set size (more TLB misses). Too high leads to fully populated hash table (at some limit with reasonable lookup performance) and not being able to cache as many objects as the memory limit would allow. In either case, performance degradation is graceful/continuous but can be quite significant. For example, cutting block size in half without updating estimated_entry_charge could lead to a large portion of configured block cache memory (up to roughly 1/3) going unused.
Fix. This change adds a mechanism through which the DB periodically probes the block cache(s) for "problems" to report, and adds diagnostics to the HyperClockCache for bad estimated_entry_charge. The periodic probing is currently done with DumpStats / stats_dump_period_sec, and diagnostics reported to info_log (normally LOG file).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10965
Test Plan:
unit test included. Doesn't cover all the implemented subtleties of reporting, but ensures basics of when to report or not.
Also manual testing with db_bench. Create db with
```
./db_bench --benchmarks=fillrandom,flush --num=3000000 --disable_wal=1
```
Use and check LOG file for HyperClockCache for various block sizes (used as estimated_entry_charge)
```
./db_bench --use_existing_db --benchmarks=readrandom --num=3000000 --duration=20 --stats_dump_period_sec=8 --cache_type=hyper_clock_cache -block_size=XXXX
```
Seeing warnings / errors or not as expected.
Reviewed By: anand1976
Differential Revision: D41406932
Pulled By: pdillinger
fbshipit-source-id: 4ca56162b73017e4b9cec2cad74466f49c27a0a7
main
Peter Dillinger2 years agocommitted byFacebook GitHub Bot
* Add stats for ReadAsync time spent and async read errors.
* Basic support for the wide-column data model is now available. Wide-column entities can be stored using the `PutEntity` API, and retrieved using `GetEntity` and the new `columns` API of iterator. For compatibility, the classic APIs `Get` and `MultiGet`, as well as iterator's `value` API return the value of the anonymous default column of wide-column entities; also, `GetEntity` and iterator's `columns` return any plain key-values in the form of an entity which only has the anonymous default column. `Merge` (and `GetMergeOperands`) currently also apply to the default column; any other columns of entities are unaffected by `Merge` operations. Note that some features like compaction filters, transactions, user-defined timestamps, and the SST file writer do not yet support wide-column entities; also, there is currently no `MultiGet`-like API to retrieve multiple entities at once. We plan to gradually close the above gaps and also implement new features like column-level operations (e.g. updating or querying only certain columns of an entity).
* Marked HyperClockCache as a production-ready alternative to LRUCache for the block cache. HyperClockCache greatly improves hot-path CPU efficiency under high parallel load or high contention, with some documented caveats and limitations. As much as 4.5x higher ops/sec vs. LRUCache has been seen in db_bench under high parallel load.
* Add periodic diagnostics to info_log (LOG file) for HyperClockCache block cache if performance is degraded by bad `estimated_entry_charge` option.
### Public API Changes
* Marked `block_cache_compressed` as a deprecated feature. Use SecondaryCache instead.