Summary:
In PR https://github.com/facebook/rocksdb/issues/7523 , checksum handoff is introduced in RocksDB for WAL, Manifest, and SST files. When user enable checksum handoff for a certain type of file, before the data is written to the lower layer storage system, we calculate the checksum (crc32c) of each piece of data and pass the checksum down with the data, such that data verification can be down by the lower layer storage system if it has the capability. However, it cannot cover the whole lifetime of the data in the memory and also it potentially introduces extra checksum calculation overhead.
In this PR, we introduce a new interface in WritableFileWriter::Append, which allows the caller be able to pass the data and the checksum (crc32c) together. In this way, WritableFileWriter can directly use the pass-in checksum (crc32c) to generate the checksum of data being passed down to the storage system. It saves the calculation overhead and achieves higher protection coverage. When a new checksum is added with the data, we use Crc32cCombine https://github.com/facebook/rocksdb/issues/8305 to combine the existing checksum and the new checksum. To avoid the segmenting of data by rate-limiter before it is stored, rate-limiter is called enough times to accumulate enough credits for a certain write. This design only support Manifest and WAL which use log_writer in the current stage.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8412
Test Plan: make check, add new testing cases.
Reviewed By: anand1976
Differential Revision: D29151545
Pulled By: zhichao-cao
fbshipit-source-id: 75e2278c5126cfd58393c67b1efd18dcc7a30772
Summary:
Implement a function to generate the crc32c of two combined strings. Suppose we have the string 1 (s1) with crc32c checksum crc32c_1 and string 2 (s2) with crc32c checksum crc32c_2, the new string is s1+s2 and its checksum is crc32c_new=Crc32cCombine(crc32c_1, crc32c_2, s2.size).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8305
Test Plan: make check, added new testing case
Reviewed By: pdillinger
Differential Revision: D28651665
Pulled By: zhichao-cao
fbshipit-source-id: c84116108388f11a81f6a217b49f99c70d4ffacf
Summary:
To build on FreeBSD, arch_ppc_probe needs to be adapted to FreeBSD.
Since FreeBSD uses elf_aux_info as an getauxval equivalent, use it and include necessary headers:
- machine/cpu.h for PPC_FEATURE2_HAS_VEC_CRYPTO,
- sys/auxv.h for elf_aux_info,
- sys/elf_common.h for AT_HWCAP2.
elf_aux_info isn't checked for being available, because it's available since FreeBSD 12.0. rocksdb assumes using Clang on FreeBSD, but powerpc* platforms switch to Clang only since 13.0.
This patch makes rocksdb build on FreeBSD on powerpc64 and powerpc64le platforms.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7732
Reviewed By: ltamasi
Differential Revision: D25399194
Pulled By: pdillinger
fbshipit-source-id: 9c905147d75f98cd2557dd2f86a940b8e6c5afcd
Summary:
Closes - https://github.com/facebook/rocksdb/issues/7710
I tested this on an Apple DTK (Developer Transition Kit) with an Apple A12Z Bionic CPU and macOS Big Sur (11.0.1).
Previously the arm64 specific CRC optimisations were limited to Linux only OS... Well now Apple Silicon is also arm64 but runs macOS ;-)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7714
Reviewed By: ltamasi
Differential Revision: D25287349
Pulled By: pdillinger
fbshipit-source-id: 639b168bf0ac2652907531e9604936ac4974b577
Summary:
Closes https://github.com/facebook/rocksdb/issues/7691
The optimised CRC code for PPC64le which was originally imported in https://github.com/facebook/rocksdb/pull/2353 is not compatible with Clang 11. It looks like the code most likely originated from https://github.com/antonblanchard/crc32-vpmsum.
The code relied on a GCC header file `ppc-asm.h` which is not available in Clang.
To solve this, I have taken the same approach as the the upstream project from which the CRC code came ffc8018efc (diff-ec3e62c56fbcddeb07230f2a4673c1abd7f0f1cc8e48a2aa560056cfc1b25d60) and simply imported a copy of the GCC header file into our code-base which will be used when Clang is the compiler on pcc64le.
**NOTE**: The new file `util/ppc-asm.h` may have licensing implications which I guess need to be approved by RocksDB/Facebook before this is merged
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7713
Reviewed By: jay-zhuang
Differential Revision: D25222645
Pulled By: pdillinger
fbshipit-source-id: e3fec9136f26ce1eb7a027048bcf77a6cb3c769c
Summary:
Issue:https://github.com/facebook/rocksdb/issues/7042
No PMULL runtime check will lead to SIGILL on a Raspberry pi 4.
Leverage 'getauxval' to get Hardware-Cap to detect whether target
platform does support PMULL or not in runtime.
Consider the condition that the target platform does support crc32 but not support PMULL.
In this condition, the code should leverage the crc32 instruction
rather than skip all hardware crc32 instruction.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7233
Reviewed By: jay-zhuang
Differential Revision: D23790116
fbshipit-source-id: a3ebd821fbd4a38dd2f59064adbb7c3013ee8140
Summary:
Based on https://github.com/facebook/rocksdb/issues/6648 (CLA Signed), but heavily modified / extended:
* Implicit capture of this via [=] deprecated in C++20, and [=,this] not standard before C++20 -> now using explicit capture lists
* Implicit copy operator deprecated in gcc 9 -> add explicit '= default' definition
* std::random_shuffle deprecated in C++17 and removed in C++20 -> migrated to a replacement in RocksDB random.h API
* Add the ability to build with different std version though -DCMAKE_CXX_STANDARD=11/14/17/20 on the cmake command line
* Minimal rebuild flag of MSVC is deprecated and is forbidden with /std:c++latest (C++20)
* Added MSVC 2019 C++11 & MSVC 2019 C++20 in AppVeyor
* Added GCC 9 C++11 & GCC9 C++20 in Travis
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6697
Test Plan: make check and CI
Reviewed By: cheng-chang
Differential Revision: D21020318
Pulled By: pdillinger
fbshipit-source-id: 12311be5dbd8675a0e2c817f7ec50fa11c18ab91
Summary:
Check for sys/auxv.h and getauxval before using them as they are not
always available (for example on uclibc)
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6359
Differential Revision: D20239797
fbshipit-source-id: 175a098094d81545628c2372e7c388e70a32fd48
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
Differential Revision: D19977691
fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
Summary:
When 'HAVE_ARM64_CRC' is set, the blew methods:
- bool rocksdb::crc32c::isSSE42()
- bool rocksdb::crc32c::isPCLMULQDQ()
are defined but not used, the unused-function is raised
when do rocksdb build.
This patch try to cleanup these warnings by add ifndef,
if it build under the HAVE_ARM64_CRC, we will not define
`isSSE42` and `isPCLMULQDQ`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5565
Differential Revision: D16233654
fbshipit-source-id: c32a9dda7465dbf65f9ccafef159124db92cdffd
Summary:
- Avoid `strdup` to use jemalloc on Windows
- Use `size_t` for consistency
- Add GCC 8 to Travis
- Add CMAKE_BUILD_TYPE=Release to Travis
Pull Request resolved: https://github.com/facebook/rocksdb/pull/3433
Differential Revision: D6837948
Pulled By: sagar0
fbshipit-source-id: b8543c3a4da9cd07ee9a33f9f4623188e233261f
Summary:
Now we suppress alignment UBSAN error as a whole. Suppressing 3-way CRC and murmurhash feels a better idea than turning off alignment check as a whole.
Closes https://github.com/facebook/rocksdb/pull/3495
Differential Revision: D6971273
Pulled By: siying
fbshipit-source-id: 080b59fed6df494b9f622ef7cb5d42d39e6a8cdf
Summary:
**# Summary**
RocksDB uses SSE crc32 intrinsics to calculate the crc32 values but it does it in single way fashion (not pipelined on single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics, then use pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead on its own, this algorithm will show perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB call crc32c on is over 4KB. Initial db_bench show tremendous CPU gain.
This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag NO_THREEWAY_CRC32C. If user compiles the code with NO_THREEWAY_CRC32C=1 then the old SSE Crc32c algorithm would be used. If the server does not have SSE4.2 at the run time the slow way (Non SSE) will be used.
**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.
Before this change the CRC32 value computation takes about 11.5% of total CPU cost, and with the new 3-way algorithm it reduced to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.
1) ReadRandom in db_bench overall metrics
PER RUN
Algorithm | run | micros/op | ops/sec |Throughput (MB/s)
3-way | 1 | 4.143 | 241387 | 26.7
3-way | 2 | 3.775 | 264872 | 29.3
3-way | 3 | 4.116 | 242929 | 26.9
FastCrc32c|1 | 4.037 | 247727 | 27.4
FastCrc32c|2 | 4.648 | 215166 | 23.8
FastCrc32c|3 | 4.352 | 229799 | 25.4
AVG
Algorithm | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
3-way | 4.01 | 249,729 | 27.63
FastCrc32c | 4.35 | 230,897 | 25.53
2) Crc32c computation CPU cost (inclusive samples percentage)
PER RUN
Implementation | run | TotalSamples | Crc32c percentage
3-way | 1 | 4,572,250,000 | 4.37%
3-way | 2 | 3,779,250,000 | 4.62%
3-way | 3 | 4,129,500,000 | 4.48%
FastCrc32c | 1 | 4,663,500,000 | 11.24%
FastCrc32c | 2 | 4,047,500,000 | 12.34%
FastCrc32c | 3 | 4,366,750,000 | 11.68%
**# Test Plan**
make -j64 corruption_test && ./corruption_test
By default it uses 3-way SSE algorithm
NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test
make clean && DEBUG_LEVEL=0 make -j64 db_bench
make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173
Differential Revision: D6330882
Pulled By: yingsu00
fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
Summary:
it turns out that, with older GCC shipped from centos7, the SSE42
intrinsics are not available even with "target" specified. so we
need to pass "-msse42" for checking compiler's sse4.2 support and
for building crc32c.cc which uses sse4.2 intrinsics for crc32.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Closes https://github.com/facebook/rocksdb/pull/2950
Differential Revision: D6032298
Pulled By: siying
fbshipit-source-id: 124c946321043661b3fb0a70b6cdf4c9c5126ab4
Summary:
if we enable SSE42 globally when compiling the tree for preparing a
portable binary, which could be running on CPU w/o SSE42 instructions
even the GCC on the building host is able to emit SSE42 code, this leads
to illegal instruction errors on machines not supporting SSE42. to solve
this problem, crc32 detects the supported instruction at runtime, and
selects the supported CRC32 implementation according to the result of
`cpuid`. but intrinics like "_mm_crc32_u64()" will not be available
unless the "target" machine is appropriately specified in the command
line, like "-msse42", or using the "target" attribute.
we could pass "-msse42" only when compiling crc32c.cc, and allow the
compiler to generate the SSE42 instructions, but we are still at the
risk of executing illegal instructions on machines does not support
SSE42 if the compiler emits code that is not guarded by our runtime
detection. and we need to do the change in both Makefile and CMakefile.
or, we can use GCC's "target" attribute to enable the machine specific
instructions on certain function. in this way, we have finer grained
control of the used "target". and no need to change the makefiles. so
we don't need to duplicate the changes on both makefile and cmake as
the previous approach.
this problem surfaces when preparing a package for GNU/Linux distribution,
and we only applies to optimization for SSE42, so using a feature
only available on GCC/Clang is not that formidable.
Closes https://github.com/facebook/rocksdb/pull/2807
Differential Revision: D5786084
Pulled By: siying
fbshipit-source-id: bca5c0f877b8d6fb55f58f8f122254a26422843d
Summary:
We've had a couple CockroachDB users fail to build RocksDB on exotic platforms, so I figured I'd try my hand at solving these issues upstream. The problems stem from a) `USE_SSE=1` being too aggressive about turning on SSE4.2, even on toolchains that don't support SSE4.2 and b) RocksDB attempting to detect support for thread-local storage based on OS, even though it can vary by compiler on the same OS.
See the individual commit messages for details. Regarding SSE support, this PR should change virtually nothing for non-CMake based builds. `make`, `PORTABLE=1 make`, `USE_SSE=1 make`, and `PORTABLE=1 USE_SSE=1 make` function exactly as before, except that SSE support will be automatically disabled when a simple SSE4.2-using test program fails to compile, as it does on OpenBSD. (OpenBSD's ports GCC supports SSE4.2, but its binutils do not, so `__SSE_4_2__` is defined but an SSE4.2-using program will fail to assemble.) A warning is emitted in this case. The CMake build is modified to support the same set of options, except that `USE_SSE` is spelled `FORCE_SSE42` because `USE_SSE` is rather useless now that we can automatically detect SSE support, and I figure changing options in the CMake build is less disruptive than changing the non-CMake build.
I've tested these changes on all the platforms I can get my hands on (macOS, Windows MSVC, Windows MinGW, and OpenBSD) and it all works splendidly. Let me know if there's anything you object to—I obviously don't mean to break any of your build pipelines in the process of fixing ours downstream.
Closes https://github.com/facebook/rocksdb/pull/2199
Differential Revision: D5054042
Pulled By: yiwu-arbug
fbshipit-source-id: 938e1fc665c049c02ae15698e1409155b8e72171
Summary:
Currently the fast crc32 path is not enabled on Windows. I am trying to enable it here, hopefully, with the minimum impact to the existing code structure.
Closes https://github.com/facebook/rocksdb/pull/2033
Differential Revision: D4770635
Pulled By: siying
fbshipit-source-id: 676f8b8
Summary: Print whether fast CRC32 is supported in DB info LOG
Test Plan: Run db_bench and see it prints out correctly.
Reviewers: yhchiang, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: MarkCallaghan, yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41733
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.
This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.
Test Plan: compiles
Reviewers: ljin, rven, yhchiang, sdong
Reviewed By: yhchiang
Subscribers: bobbaldwin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28689
Summary: Otherwise, if we compile on machine with SSE4.2 support and run it on machine without the support, we will fail.
Test Plan: compiles, verified that isSse42() gets called.
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17505
Summary:
I had to make number of changes to the code and Makefile:
* Add `make lib`, that will create static library without debug info. We need this to avoid growing binary too much. Currently it's 14MB.
* Remove cpuinfo() function and use __SSE4_2__ macro. We actually used the macro as part of Fast_CRC32() function.
As a result, I also accidentally fixed this issue: https://www.facebook.com/groups/rocksdb.dev/permalink/549700778461774/?stream_ref=2
* Remove __thread locals in OS_MACOSX
Test Plan: `make lib PLATFORM=IOS`
Reviewers: ljin, haobo, dhruba, sdong
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17475
Disassembling the Extend function shows something that looks
much more healthy now. The SSE 4.2 instructions are right
there in the body of the function.
Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz
Before:
crc32c: 1.305 micros/op 766260 ops/sec; 2993.2 MB/s (4K per op)
After:
crc32c: 0.442 micros/op 2263843 ops/sec; 8843.1 MB/s (4K per op)
Summary:
Change namespace from leveldb to rocksdb. This allows a single
application to link in open-source leveldb code as well as
rocksdb code into the same process.
Test Plan: compile rocksdb
Reviewers: emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13287
Summary:
Scripted and removed all trailing spaces and converted all tabs to
spaces.
Also fixed other lint errors.
All lint errors from this point of time should be taken seriously.
Test Plan: make all check
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7059
Summary:
This speeds up CRC computation significantly on
hardware that supports it. Enabled via -msse4.
Note: the binary won't be usable on older CPUs
that don't support the instruction.
Test Plan: crc32c_test
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D3201
Summary:
Some code reorganization in-preparation for replacing with a hardware
instruction.
* Use u64 for some of the key types
* Use an ALIGN macro so code is easier to read
Test Plan: crc32c_test
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D3135
- Replace raw slice comparison with a call to user comparator.
Added test for custom comparators.
- Fix end of namespace comments.
- Fixed bug in picking inputs for a level-0 compaction.
When finding overlapping files, the covered range may expand
as files are added to the input set. We now correctly expand
the range when this happens instead of continuing to use the
old range. For example, suppose L0 contains files with the
following ranges:
F1: a .. d
F2: c .. g
F3: f .. j
and the initial compaction target is F3. We used to search
for range f..j which yielded {F2,F3}. However we now expand
the range as soon as another file is added. In this case,
when F2 is added, we expand the range to c..j and restart the
search. That picks up file F1 as well.
This change fixes a bug related to deleted keys showing up
incorrectly after a compaction as described in Issue 44.
(Sync with upstream @25072954)