Summary:
Top-level index in partitioned index/filter blocks are small and could be pinned in memory. So far we use that by cache_index_and_filter_blocks to false. This however make it difficult to keep account of the total memory usage. This patch introduces pin_top_level_index_and_filter which in combination with cache_index_and_filter_blocks=true keeps the top-level index in cache and yet pinned them to avoid cache misses and also cache lookup overhead.
Closes https://github.com/facebook/rocksdb/pull/4037
Differential Revision: D8596218
Pulled By: maysamyabandeh
fbshipit-source-id: 3a5f7f9ca6b4b525b03ff6bd82354881ae974ad2
main
Maysam Yabandeh6 years agocommitted byFacebook Github Bot
* Changes the format of index blocks by storing the key in their raw form rather than converting them to InternalKey. This saves 8 bytes per index key. The feature is backward compatbile but not forward compatible. It is disabled by default unless format_version 3 or above is used.
* Improve the performance of iterators doing long range scans by using readahead, when using direct IO.
* pin_top_level_index_and_filter (default true) in BlockBasedTableOptions can be used in combination with cache_index_and_filter_blocks to prefetch and pin the top-level index of partitioned index and filter blocks in cache. It has no impact when cache_index_and_filter_blocks is false.
### Bug Fixes
* fix deadlock with enable_pipelined_write=true and max_successive_merges > 0