Summary:
This is a conceptually simple change, but it touches many files to
pass the allocator through function calls.
We introduce CacheAllocator, which can be used by clients to configure
custom allocator for cache blocks. Our motivation is to hook this up
with folly's `JemallocNodumpAllocator`
(f43ce6d686/folly/experimental/JemallocNodumpAllocator.h),
but there are many other possible use cases.
Additionally, this commit cleans up memory allocation in
`util/compression.h`, making sure that all allocations are wrapped in a
unique_ptr as soon as possible.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4437
Differential Revision: D10132814
Pulled By: yiwu-arbug
fbshipit-source-id: be1343a4b69f6048df127939fea9bbc96969f564
Summary:
Reverting is needed to unblock a user building against master, who is blocked for multiple days due to a thread-safety issue in `GetEmptyDict`. We haven't been able to fix it quickly, so reverting.
Simply ran `git revert 6c40806e51a89386d2b066fddf73d3fd03a36f65`. There were no merge conflicts.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4347
Differential Revision: D9668365
Pulled By: ajkr
fbshipit-source-id: 0c56334f0a23cf5ee0233d4e4679eae6709739cd
Summary:
In RocksDB, for a given SST file, all data blocks are compressed with the same dictionary. When we compress a block using the dictionary's raw bytes, the compression library first has to digest the dictionary to get it into a usable form. This digestion work is redundant and ideally should be done once per file.
ZSTD offers APIs for the caller to create and reuse a digested dictionary object (`ZSTD_CDict`). In this PR, we call `ZSTD_createCDict` once per file to digest the raw bytes. Then we use `ZSTD_compress_usingCDict` to compress each data block using the pre-digested dictionary. Once the file's created `ZSTD_freeCDict` releases the resources held by the digested dictionary.
There are a couple other changes included in this PR:
- Changed the parameter object for (un)compression functions from `CompressionContext`/`UncompressionContext` to `CompressionInfo`/`UncompressionInfo`. This avoids the previous pattern, where `CompressionContext`/`UncompressionContext` had to be mutated before calling a (un)compression function depending on whether dictionary should be used. I felt that mutation was error-prone so eliminated it.
- Added support for digested uncompression dictionaries (`ZSTD_DDict`) as well. However, this PR does not support reusing them across uncompression calls for the same file. That work is deferred to a later PR when we will store the `ZSTD_DDict` objects in block cache.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4251
Differential Revision: D9257078
Pulled By: ajkr
fbshipit-source-id: 21b8cb6bbdd48e459f1c62343780ab66c0a64438
Summary:
Suppress two CLANG analyze warnings. They don't seem to be real bugs
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4291
Differential Revision: D9407333
Pulled By: siying
fbshipit-source-id: 2ed63d88fa0b217fdccb1572d7508467c2203dc8
Summary:
This was caught by crash test, and the following is a simple way to reproduce it and verify the fix.
One way to trigger this code path is to use the following configuration:
- Compress SST file
- Enable direct IO and prefetch buffer
- Do NOT use compressed block cache
Closes https://github.com/facebook/rocksdb/pull/4096
Differential Revision: D8742009
Pulled By: riversand963
fbshipit-source-id: f13381078bbb0dce92f60bd313a78ab602bcacd2
Summary:
https://github.com/facebook/rocksdb/pull/3881 fixed a bug where PinnableSlice pin mmap files which could be deleted with background compaction. This is however a non-issue for ReadOnlyDB when there is no compaction running and max_open_files is -1. This patch reenables the pinning feature for that case.
Closes https://github.com/facebook/rocksdb/pull/4053
Differential Revision: D8662546
Pulled By: maysamyabandeh
fbshipit-source-id: 402962602eb0f644e17822748332999c3af029fd
Summary:
PR https://github.com/facebook/rocksdb/pull/3838 made some changes that triggers lint warnings.
Run `make format` to fix formatting as suggested by siying .
Also piggyback two changes:
1) fix singleton destruction order for windows and posix env
2) fix two clang warnings
Closes https://github.com/facebook/rocksdb/pull/3954
Differential Revision: D8272041
Pulled By: miasantreble
fbshipit-source-id: 7c4fd12bd17aac13534520de0c733328aa3c6c9f
Summary:
Windows does not have LD_PRELOAD mechanism to override all memory allocation functions and ZSTD makes use of C-tuntime calloc. During flushes and compactions default system allocator fragments and the system slows down considerably.
For builds with jemalloc we employ an advanced ZSTD context creation API that re-directs memory allocation to jemalloc. To reduce the cost of context creation on each block we cache ZSTD context within the block based table builder while a new SST file is being built, this will help all platform builds including those w/o jemalloc. This avoids system allocator fragmentation and improves the performance.
The change does not address random reads and currently on Windows reads with ZSTD regress as compared with SNAPPY compression.
Closes https://github.com/facebook/rocksdb/pull/3838
Differential Revision: D8229794
Pulled By: miasantreble
fbshipit-source-id: 719b622ab7bf4109819bc44f45ec66f0dd3ee80d
Summary:
Divide ReadBlockContents() to multiple sub-functions. Maintaining the input and intermediate data in a new class BlockFetcher.
I hope in general it makes the code easier to maintain.
Another motivation to do it is to clearly divide the logic before file reading and after file reading. The refactor will help us evaluate how can we make I/O async in the future.
Closes https://github.com/facebook/rocksdb/pull/3244
Differential Revision: D6520983
Pulled By: siying
fbshipit-source-id: 338d90bc0338472d46be7a7682028dc9114b12e9