rocksdb

fork of https://github.com/oxigraph/rocksdb and https://github.com/facebook/rocksdb for nextgraph and oxigraph

History

Peter Dillinger 459969e993 Simplify detection of x86 CPU features (#11419)

Summary:
Background - runtime detection of certain x86 CPU features was added for optimizing CRC32c checksums, where performance is dramatically affected by the availability of certain CPU instructions and code using intrinsics for those instructions. And Java builds with native library try to be broadly compatible but performant.

What has changed is that CRC32c is no longer the most efficient checksum on contemporary x86_64 hardware, nor the default checksum. XXH3 is generally faster and not as dramatically impacted by the availability of certain CPU instructions. For example, on my Skylake system using db_bench (similar on an older Skylake system without AVX512):

PORTABLE=1 empty USE_SSE  : xxh3->8 GB/s  crc32c->0.8 GB/s (no SSE4.2 nor AVX2 instructions)
PORTABLE=1 USE_SSE=1      : xxh3->19 GB/s crc32c->16 GB/s (with SSE4.2 and AVX2)
PORTABLE=0 USE_SSE ignored: xxh3->28 GB/s crc32c->16 GB/s (also some AVX512)

Testing a ~10 year old system, with SSE4.2 but without AVX2, crc32c is a similar speed to the new systems but xxh3 is only about half that speed, also 8 GB/s like the non-AVX2 compile above. Given that xxh3 has specific optimization for AVX2, I think we can infer that crc32c is only fastest for that ~2008-2013 period when SSE4.2 was included but not AVX2. And given that xxh3 is only about 2x slower on these systems (not like >10x slower for unoptimized crc32c), I don't think we need to invest too much in optimally adapting to these old cases.

x86 hardware that doesn't support fast CRC32c is now extremely rare, so requiring a custom build to support such hardware is fine IMHO.

This change does two related things:
* Remove runtime CPU detection for optimizing CRC32c on x86. Maintaining this code is non-zero work, and compiling special code that doesn't work on the configured target instruction set for code generation is always dubious. (On the one hand we have to ensure the CRC32c code uses SSE4.2 but on the other hand we have to ensure nothing else does.)
* Detect CPU features in source code, not in build scripts. Although there are some hypothetical advantages to detecting in build scripts (compiler generality), RocksDB supports at least three build systems: make, cmake, and buck. It's not practical to support feature detection on all three, and we have suffered from missed optimization opportunities by relying on missing or incomplete detection in cmake and buck. We also depend on some components like xxhash that do source code detection anyway.

In more detail:
* `HAVE_SSE42`, `HAVE_AVX2`, and `HAVE_PCLMUL` replaced by standard macros `__SSE4_2__`, `__AVX2__`, and `__PCLMUL__`.
* MSVC does not provide high fidelity defines for SSE, PCLMUL, or POPCNT, but we can infer those from `__AVX__` or `__AVX2__` in a compatibility header. In rare cases of false negative or false positive feature detection, a build engineer should be able to set defines to work around the issue.
* `__POPCNT__` is another standard define, but we happen to only need it on MSVC, where it is set by that compatibility header, or can be set by the build engineer.
* `PORTABLE` can be set to a CPU type, e.g. "haswell", to compile for that CPU type.
* `USE_SSE` is deprecated, now equivalent to PORTABLE=haswell, which roughly approximates its old behavior.

Notably, this change should enable more builds to use the AVX2-optimized Bloom filter implementation.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11419

Test Plan: existing tests, CI

Manual performance tests after the change match the before above (none expected with make build).

We also see AVX2 optimized Bloom filter code enabled when expected, by injecting a compiler error. (Performance difference is not big on my current CPU.)

Reviewed By: ajkr
Differential Revision: D45489041
Pulled By: pdillinger
fbshipit-source-id: 60ceb0dd2aa3b365c99ed08a8b2a087a9abb6a70

2 years ago
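The source-level detection described in the commit message can be illustrated with a minimal sketch. It assumes a GCC, Clang, or MSVC toolchain and checks only the standard compiler defines named above (`__SSE4_2__`, `__AVX2__`, `__PCLMUL__`, `__AVX__`). The names `SKETCH_INFERRED_FROM_AVX`, `kHasSse42`, `kHasPclmul`, `kHasAvx2`, and `DescribeTargetFeatures` are hypothetical, chosen for illustration; they are not RocksDB identifiers, and this is not the actual compatibility header.

```cpp
#include <string>

// Hypothetical sketch: all names below are illustrative, not RocksDB's.

// MSVC has no dedicated defines for SSE4.2 or PCLMUL, so infer them from
// __AVX__ / __AVX2__, following the approach the commit message describes.
#if defined(_MSC_VER) && (defined(__AVX__) || defined(__AVX2__))
#define SKETCH_INFERRED_FROM_AVX 1
#else
#define SKETCH_INFERRED_FROM_AVX 0
#endif

// Each constant reflects what the compiler was told to target at build
// time; nothing here probes the CPU at runtime.
#if defined(__SSE4_2__) || SKETCH_INFERRED_FROM_AVX
constexpr bool kHasSse42 = true;
#else
constexpr bool kHasSse42 = false;
#endif

#if defined(__PCLMUL__) || SKETCH_INFERRED_FROM_AVX
constexpr bool kHasPclmul = true;
#else
constexpr bool kHasPclmul = false;
#endif

#if defined(__AVX2__)
constexpr bool kHasAvx2 = true;
#else
constexpr bool kHasAvx2 = false;
#endif

// Returns a space-separated summary of the features enabled for this
// translation unit, or "baseline" if none of them are.
inline std::string DescribeTargetFeatures() {
  std::string out;
  if (kHasSse42) out += "sse4.2 ";
  if (kHasPclmul) out += "pclmul ";
  if (kHasAvx2) out += "avx2 ";
  return out.empty() ? std::string("baseline") : out;
}
```

With GCC or Clang, compiling the same file with and without a flag such as `-march=haswell` should change what `DescribeTargetFeatures()` reports, since the result depends only on the target instruction set chosen at compile time.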
..
ubuntu20_image	Add OpenSSL to docker image (#10741)	3 years ago
amalgamate.py	Enable BLACK for internal_repo_rocksdb (#10710)	3 years ago
benchmark_log_tool.py	Fix lint issues after enable BLACK (#10717)	3 years ago
build_detect_platform	Simplify detection of x86 CPU features (#11419)	2 years ago
check-sources.sh	Fix bad include (#10213)	3 years ago
dependencies_platform010.sh	Upgrade development environment. (#9843)	4 years ago
dockerbuild.sh	Add copyright headers per FB open-source checkup tool. (#5199)	7 years ago
error_filter.py	Enable BLACK for internal_repo_rocksdb (#10710)	3 years ago
fb_compile_mongo.sh	Add copyright headers per FB open-source checkup tool. (#5199)	7 years ago
fbcode_config.sh	Simplify detection of x86 CPU features (#11419)	2 years ago
fbcode_config_platform010.sh	Simplify detection of x86 CPU features (#11419)	2 years ago
format-diff.sh	Remove Travis CI (#10407)	3 years ago
gnu_parallel	Print stack traces on frozen tests in CI (#10828)	3 years ago
make_package.sh	Use standard variables for installing/uninstalling with make (#7187)	5 years ago
ps_with_stack	Print stack traces on frozen tests in CI (#10828)	3 years ago
regression_build_test.sh	Backport an internal change to regression_build_test.sh (#11319)	3 years ago
run_ci_db_test.ps1	Fix remaining uses of "backupable" (#9792)	4 years ago
setup_centos7.sh	Adding new build script for CentOS 7 (#6617)	6 years ago
update_dependencies.sh	Remove platform009 and default to platform010 (#11333)	3 years ago
version.sh	Add copyright headers per FB open-source checkup tool. (#5199)	7 years ago