# Copyright (c) 2011 The LevelDB Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file. See the AUTHORS file for names of contributors.
# Inherit some settings from environment variables, if available
#-----------------------------------------------
BASH_EXISTS := $(shell which bash)
SHELL := $(shell which bash)
include common.mk
CLEAN_FILES = # deliberately empty, so we can append below.
CFLAGS += ${EXTRA_CFLAGS}
CXXFLAGS += ${EXTRA_CXXFLAGS}
LDFLAGS += $(EXTRA_LDFLAGS)
MACHINE ?= $(shell uname -m)
ARFLAGS = ${EXTRA_ARFLAGS} rs
STRIPFLAGS = -S -x
# Transform parallel LOG output into something more readable.
perl_command = perl -n \
-e '@a=split("\t",$$_,-1); $$t=$$a[8];' \
-e '$$t =~ /.*if\s\[\[\s"(.*?\.[\w\/]+)/ and $$t=$$1;' \
-e '$$t =~ s,^\./,,;' \
-e '$$t =~ s, >.*,,; chomp $$t;' \
-e '$$t =~ /.*--gtest_filter=(.*?\.[\w\/]+)/ and $$t=$$1;' \
-e 'printf "%7.3f %s %s\n", $$a[3], $$a[6] == 0 ? "PASS" : "FAIL", $$t'
quoted_perl_command = $(subst ','\'',$(perl_command))
# DEBUG_LEVEL can have three values:
# * DEBUG_LEVEL=2; this is the ultimate debug mode. It will compile rocksdb
# without any optimizations. To compile with level 2, issue `make dbg`
# * DEBUG_LEVEL=1; debug level 1 enables all assertions and debug code, but
# compiles rocksdb with -O2 optimizations. This is the default debug level.
# `make all` or `make <binary_target>` compile RocksDB with debug level 1.
# We use this debug level when developing RocksDB.
# * DEBUG_LEVEL=0; this is the debug level we use for release. If you're
# running rocksdb in production you most definitely want to compile RocksDB
# with debug level 0. To compile with level 0, run `make shared_lib`,
# `make install-shared`, `make static_lib`, `make install-static` or
# `make install`
# Set the default DEBUG_LEVEL to 1
DEBUG_LEVEL ?= 1
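# Example invocations (a sketch, assuming GNU make; the targets are the ones
# named in the comment above):
#   make dbg          # DEBUG_LEVEL=2, no optimizations
#   make all          # DEBUG_LEVEL=1, assertions enabled, -O2
#   make static_lib   # DEBUG_LEVEL=0, release build
# Because the default is assigned with `?=`, a value supplied in the
# environment or on the command line (e.g. `make DEBUG_LEVEL=0 <binary_target>`)
# takes precedence over it.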
# OBJ_DIR is where the object files reside. Default to the current directory
OBJ_DIR ?= .
# Check the MAKECMDGOALS to set the DEBUG_LEVEL and LIB_MODE appropriately
ifneq ($(filter clean release install, $(MAKECMDGOALS)),)
DEBUG_LEVEL = 0
endif
ifneq ($(filter dbg, $(MAKECMDGOALS)),)
DEBUG_LEVEL = 2
else ifneq ($(filter shared_lib install-shared, $(MAKECMDGOALS)),)
DEBUG_LEVEL = 0
LIB_MODE = shared
else ifneq ($(filter static_lib install-static, $(MAKECMDGOALS)),)
DEBUG_LEVEL = 0
LIB_MODE = static
else ifneq ($(filter jtest rocksdbjava%, $(MAKECMDGOALS)),)
OBJ_DIR = jl
LIB_MODE = shared
ifneq ($(findstring rocksdbjavastatic, $(MAKECMDGOALS)),)
OBJ_DIR = jls
ifneq ($(DEBUG_LEVEL),2)
DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),rocksdbjavastaticpublish)
DEBUG_LEVEL = 0
endif
endif
endif
# LIB_MODE says whether or not to use/build "shared" or "static" libraries.
# Mode "static" means to link against static libraries (.a)
# Mode "shared" means to link against shared libraries (.so, .sl, .dylib, etc)
#
ifeq ($(DEBUG_LEVEL), 0)
# For optimized, set the default LIB_MODE to static for code size/efficiency
LIB_MODE ?= static
else
# For debug, set the default LIB_MODE to shared for efficient `make check` etc.
LIB_MODE ?= shared
endif
$(info $$DEBUG_LEVEL is $(DEBUG_LEVEL), $$LIB_MODE is $(LIB_MODE))
# Figure out optimize level.
ifneq ($(DEBUG_LEVEL), 2)
OPTIMIZE_LEVEL ?= -O2
endif
# `OPTIMIZE_LEVEL` is empty when the user does not set it and `DEBUG_LEVEL=2`.
# In that case, the compiler default (`-O0` for gcc and clang) will be used.
OPT += $(OPTIMIZE_LEVEL)
# compile with -O2 if debug level is not 2
ifneq ($(DEBUG_LEVEL), 2)
OPT += -fno-omit-frame-pointer
# Skip for archs that don't support -momit-leaf-frame-pointer
ifeq (,$(shell $(CXX) -fsyntax-only -momit-leaf-frame-pointer -xc /dev/null 2>&1))
OPT += -momit-leaf-frame-pointer
endif
endif
ifeq (,$(shell $(CXX) -fsyntax-only -maltivec -xc /dev/null 2>&1))
CXXFLAGS += -DHAS_ALTIVEC
CFLAGS += -DHAS_ALTIVEC
HAS_ALTIVEC = 1
endif
ifeq (,$(shell $(CXX) -fsyntax-only -mcpu=power8 -xc /dev/null 2>&1))
CXXFLAGS += -DHAVE_POWER8
CFLAGS += -DHAVE_POWER8
HAVE_POWER8 = 1
endif
# if we're compiling for shared libraries, add the shared flags
ifeq ($(LIB_MODE),shared)
CXXFLAGS += $(PLATFORM_SHARED_CFLAGS) -DROCKSDB_DLL
CFLAGS += $(PLATFORM_SHARED_CFLAGS) -DROCKSDB_DLL
endif
GIT_COMMAND ?= git
ifeq ($(USE_COROUTINES), 1)
USE_FOLLY = 1
# glog/logging.h requires HAVE_CXX11_ATOMIC
OPT += -DUSE_COROUTINES -DHAVE_CXX11_ATOMIC
ROCKSDB_CXX_STANDARD = c++2a
USE_RTTI = 1
ifneq ($(USE_CLANG), 1)
ROCKSDB_CXX_STANDARD = c++20
PLATFORM_CXXFLAGS += -fcoroutines
endif
endif
# if we're compiling for release, compile without debug code (-DNDEBUG)
ifeq ($(DEBUG_LEVEL),0)
OPT += -DNDEBUG
ifneq ($(USE_RTTI), 1)
CXXFLAGS += -fno-rtti
else
CXXFLAGS += -DROCKSDB_USE_RTTI
endif
else
ifneq ($(USE_RTTI), 0)
CXXFLAGS += -DROCKSDB_USE_RTTI
else
CXXFLAGS += -fno-rtti
endif
ifdef ASSERT_STATUS_CHECKED
# For ASC, turn off constructor elision, preventing the case where a constructor returned
# by a method may pass the ASC check if the status is checked in the inner method. Forcing
# the copy constructor to be invoked disables the optimization and will cause the calling method
# to check the status in order to prevent an error from being raised.
PLATFORM_CXXFLAGS += -fno-elide-constructors
ifeq ($(filter -DROCKSDB_ASSERT_STATUS_CHECKED,$(OPT)),)
OPT += -DROCKSDB_ASSERT_STATUS_CHECKED
endif
endif
$(warning Warning: Compiling in debug mode. Don't use the resulting binary in production)
endif
# `USE_LTO=1` enables link-time optimizations. Among other things, this enables
# more devirtualization opportunities and inlining across translation units.
# This can save significant overhead introduced by RocksDB's pluggable
# interfaces/internal abstractions, like in the iterator hierarchy. It works
# better when combined with profile-guided optimizations (not currently
# supported natively in Makefile).
ifeq ($(USE_LTO), 1)
CXXFLAGS += -flto
LDFLAGS += -flto -fuse-linker-plugin
endif
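# Example (a sketch, assuming a compiler and linker with LTO support;
# `static_lib` is one of the release targets listed near the top of this file):
#   make USE_LTO=1 static_lib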
# `COERCE_CONTEXT_SWITCH=1` will inject spurious wakeups and
# random-length sleeps or context switches at critical
# points (e.g., before acquiring the db mutex) in RocksDB.
# In this way, it coerces as many execution orders as possible in the hope of
# exposing a problematic execution order.
COERCE_CONTEXT_SWITCH ?= 0
ifeq ($(COERCE_CONTEXT_SWITCH), 1)
OPT += -DCOERCE_CONTEXT_SWITCH
endif
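# Example (a sketch; `check` is the test target referred to elsewhere in
# this file, and the default above leaves the injection disabled):
#   make COERCE_CONTEXT_SWITCH=1 check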
#-----------------------------------------------
include src.mk
AM_DEFAULT_VERBOSITY ?= 0
AM_V_GEN = $(am__v_GEN_$(V))
am__v_GEN_ = $(am__v_GEN_$(AM_DEFAULT_VERBOSITY))
am__v_GEN_0 = @echo " GEN " $@;
am__v_GEN_1 =
AM_V_at = $(am__v_at_$(V))
am__v_at_ = $(am__v_at_$(AM_DEFAULT_VERBOSITY))
am__v_at_0 = @
am__v_at_1 =
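# Usage sketch for the verbosity switches (assuming GNU make; each AM_V_*
# variable expands to a short tag such as "CC" unless V=1 is given):
#   make all        # abbreviated output, one summary line per command
#   make V=0 all    # also abbreviated
#   make V=1 all    # print each full command as it is executed
# Setting AM_DEFAULT_VERBOSITY=1 in the environment makes full output the
# default; V=0 then restores the abbreviated form.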
AM_V_CC = $(am__v_CC_$(V))
am__v_CC_ = $(am__v_CC_$(AM_DEFAULT_VERBOSITY))
am__v_CC_0 = @echo " CC " $@;
am__v_CC_1 =
AM_V_CCLD = $(am__v_CCLD_$(V))
am__v_CCLD_ = $(am__v_CCLD_$(AM_DEFAULT_VERBOSITY))
ifneq ($(SKIP_LINK), 1)
am__v_CCLD_0 = @echo " CCLD " $@;
am__v_CCLD_1 =
else
am__v_CCLD_0 = @echo " !CCLD " $@; true skip
am__v_CCLD_1 = true skip
endif
AM_V_AR = $(am__v_AR_$(V))
am__v_AR_ = $(am__v_AR_$(AM_DEFAULT_VERBOSITY))
am__v_AR_0 = @echo " AR " $@;
am__v_AR_1 =
AM_LINK = $(AM_V_CCLD) $(CXX) -L. $(patsubst lib%.a, -l%, $(patsubst lib%.$(PLATFORM_SHARED_EXT), -l%, $^)) $(EXEC_LDFLAGS) -o $@ $(LDFLAGS) $(COVERAGEFLAGS)
AM_SHARE = $(AM_V_CCLD) $(CXX) $(PLATFORM_SHARED_LDFLAGS) $@ -L. $(patsubst lib%.$(PLATFORM_SHARED_EXT), -l%, $^) $(EXEC_LDFLAGS) $(LDFLAGS) -o $@
# Detect what platform we're building on.
# Export some common variables that might have been passed as Make variables
# instead of environment variables.
dummy := $(shell (export ROCKSDB_ROOT="$(CURDIR)"; \
                  export CXXFLAGS="$(EXTRA_CXXFLAGS)"; \
                  export LDFLAGS="$(EXTRA_LDFLAGS)"; \
                  export COMPILE_WITH_ASAN="$(COMPILE_WITH_ASAN)"; \
                  export COMPILE_WITH_TSAN="$(COMPILE_WITH_TSAN)"; \
                  export COMPILE_WITH_UBSAN="$(COMPILE_WITH_UBSAN)"; \
                  export PORTABLE="$(PORTABLE)"; \
                  export ROCKSDB_NO_FBCODE="$(ROCKSDB_NO_FBCODE)"; \
                  export USE_CLANG="$(USE_CLANG)"; \
                  export LIB_MODE="$(LIB_MODE)"; \
Multi file concurrency in MultiGet using coroutines and async IO (#9968)
Summary:
This PR implements a coroutine version of batched MultiGet in order to concurrently read from multiple SST files in a level using async IO, thus reducing the latency of the MultiGet. The API from the user perspective is still synchronous and single threaded, with the RocksDB part of the processing happening in the context of the caller's thread. In Version::MultiGet, the decision is made whether to call synchronous or coroutine code.
A good way to review this PR is to review the first 4 commits in order - de773b3, 70c2f70, 10b50e1, and 377a597 - before reviewing the rest.
TODO:
1. Figure out how to build it in CircleCI (requires some dependencies to be installed)
2. Do some stress testing with coroutines enabled
No regression in synchronous MultiGet between this branch and main -
```
./db_bench -use_existing_db=true --db=/data/mysql/rocksdb/prefix_scan -benchmarks="readseq,multireadrandom" -key_size=32 -value_size=512 -num=5000000 -batch_size=64 -multiread_batched=true -use_direct_reads=false -duration=60 -ops_between_duration_checks=1 -readonly=true -adaptive_readahead=true -threads=16 -cache_size=10485760000 -async_io=false -multiread_stride=40000 -statistics
```
Branch - ```multireadrandom : 4.025 micros/op 3975111 ops/sec 60.001 seconds 238509056 operations; 2062.3 MB/s (14767808 of 14767808 found)```
Main - ```multireadrandom : 3.987 micros/op 4013216 ops/sec 60.001 seconds 240795392 operations; 2082.1 MB/s (15231040 of 15231040 found)```
More benchmarks in various scenarios are given below. The measurements were taken with ```async_io=false``` (no coroutines) and ```async_io=true``` (use coroutines). For an IO bound workload (with every key requiring an IO), the coroutines version shows a clear benefit, being ~2.6X faster. For CPU bound workloads, the coroutines version has ~6-15% higher CPU utilization, depending on how many keys overlap an SST file.
1. Single thread IO bound workload on remote storage with sparse MultiGet batch keys (~1 key overlap/file) -
No coroutines - ```multireadrandom : 831.774 micros/op 1202 ops/sec 60.001 seconds 72136 operations; 0.6 MB/s (72136 of 72136 found)```
Using coroutines - ```multireadrandom : 318.742 micros/op 3137 ops/sec 60.003 seconds 188248 operations; 1.6 MB/s (188248 of 188248 found)```
2. Single thread CPU bound workload (all data cached) with ~1 key overlap/file -
No coroutines - ```multireadrandom : 4.127 micros/op 242322 ops/sec 60.000 seconds 14539384 operations; 125.7 MB/s (14539384 of 14539384 found)```
Using coroutines - ```multireadrandom : 4.741 micros/op 210935 ops/sec 60.000 seconds 12656176 operations; 109.4 MB/s (12656176 of 12656176 found)```
3. Single thread CPU bound workload with ~2 key overlap/file -
No coroutines - ```multireadrandom : 3.717 micros/op 269000 ops/sec 60.000 seconds 16140024 operations; 139.6 MB/s (16140024 of 16140024 found)```
Using coroutines - ```multireadrandom : 4.146 micros/op 241204 ops/sec 60.000 seconds 14472296 operations; 125.1 MB/s (14472296 of 14472296 found)```
4. CPU bound multi-threaded (16 threads) with ~4 key overlap/file -
No coroutines - ```multireadrandom : 4.534 micros/op 3528792 ops/sec 60.000 seconds 211728728 operations; 1830.7 MB/s (12737024 of 12737024 found) ```
Using coroutines - ```multireadrandom : 4.872 micros/op 3283812 ops/sec 60.000 seconds 197030096 operations; 1703.6 MB/s (12548032 of 12548032 found) ```
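The ratios quoted above can be re-derived from the micros/op figures. The following is only a sketch: the helper name ratio is hypothetical, and using per-op latency as a stand-in for CPU cost in the CPU bound cases is an assumption.

```shell
# Hypothetical helper: ratio of two per-op latencies, rounded to two decimals.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# 1. IO bound: no-coroutine latency over coroutine latency (speedup).
ratio 831.774 318.742
# 2-4. CPU bound: coroutine latency over no-coroutine latency (overhead).
ratio 4.741 4.127
ratio 4.146 3.717
ratio 4.872 4.534
```

This prints 2.61, 1.15, 1.12 and 1.07, in line with the ~2.6X speedup and the ~6-15% overhead range stated in the summary.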
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9968
Reviewed By: akankshamahajan15
Differential Revision: D36348563
Pulled By: anand1976
fbshipit-source-id: c0ce85a505fd26ebfbb09786cbd7f25202038696
3 years ago
                  export ROCKSDB_CXX_STANDARD="$(ROCKSDB_CXX_STANDARD)"; \
                  export USE_FOLLY="$(USE_FOLLY)"; \
                  "$(CURDIR)/build_tools/build_detect_platform" "$(CURDIR)/make_config.mk"))
# this file is generated by the previous line to set build flags and sources
include make_config.mk
Major CircleCI/Linux fixes / tweaks / enhancements (#7078)
Summary:
Primarily, this change adds a way to work around a bug limiting the effective output (and therefore debuggability) of the Linux builds using parallel make. We would get
make[1]: write error: stdout
probably due to a kernel bug, apparently affecting both available ubuntu 16 machine images (maybe not affecting docker images, less horsepower). https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1814393
Now in the CircleCI config, make output on Ubuntu is piped through a custom 'cat' that ignores EAGAIN errors, which seems to fix the problem.
Significant other changes:
* Add another linux build that combines
* LIB_MODE=shared, to ensure this works with compile and unit test execution
* Alternative rocksdb namespace, to ensure this works (not rely on Travis)
* ASSERT_STATUS_CHECKED=1, but with building all unit tests and running those expected to pass with it
* Run release build with and without gflags. (Was running only without, ignoring large swaths of code in a normal release build! Two regressions in this build, only with gflags, in the last week not caught by CI!)
* Use gflags with unity and LITE build, as typical case.
Debuggability improvements:
* Use V=1 to show commands being executed (thanks to EAGAIN work-around)
* Print kernel version and compiler versions as part of V=1 output from Makefile
Cosmetic other changes:
* Put more commands on one line, for less clutter in CircleCI output pages
* Remove redundant "all" in "make all check" and put make command options before targets
* Change some recursive "make clean" into dependency on "clean," toward minimizing unnecessary overhead (detect platform, build version, etc.) of extra recursive makes
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7078
Reviewed By: siying
Differential Revision: D22391647
Pulled By: pdillinger
fbshipit-source-id: d446fccf5a8c568b37dc8748621c8a5c546fe135
5 years ago
ROCKSDB_PLUGIN_MKS = $(foreach plugin, $(ROCKSDB_PLUGINS), plugin/$(plugin)/*.mk)
include $(ROCKSDB_PLUGIN_MKS)
ROCKSDB_PLUGIN_PROTO = ROCKSDB_NAMESPACE::ObjectLibrary\&, const std::string\&
ROCKSDB_PLUGIN_SOURCES = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach source, $($(p)_SOURCES), plugin/$(p)/$(source)))
ROCKSDB_PLUGIN_HEADERS = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach header, $($(p)_HEADERS), plugin/$(p)/$(header)))
ROCKSDB_PLUGIN_LIBS = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach lib, $($(p)_LIBS), -l$(lib)))
ROCKSDB_PLUGIN_W_FUNCS = $(foreach p, $(ROCKSDB_PLUGINS), $(if $($(p)_FUNC), $(p)))
ROCKSDB_PLUGIN_EXTERNS = $(foreach p, $(ROCKSDB_PLUGIN_W_FUNCS), int $($(p)_FUNC)($(ROCKSDB_PLUGIN_PROTO));)
ROCKSDB_PLUGIN_BUILTINS = $(foreach p, $(ROCKSDB_PLUGIN_W_FUNCS), {\"$(p)\"\, $($(p)_FUNC)}\,)
ROCKSDB_PLUGIN_LDFLAGS = $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_LDFLAGS))
ROCKSDB_PLUGIN_PKGCONFIG_REQUIRES = $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_PKGCONFIG_REQUIRES))
ROCKSDB_PLUGIN_TESTS = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach test, $($(p)_TESTS), plugin/$(p)/$(test)))
CXXFLAGS += $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_CXXFLAGS))
PLATFORM_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
# Patch up the link flags for JNI from the plugins
JAVA_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
JAVA_STATIC_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
# Patch up the list of java native sources with files from the plugins
ROCKSDB_PLUGIN_JNI_NATIVE_SOURCES = $(foreach plugin, $(ROCKSDB_PLUGINS), $(foreach source, $($(plugin)_JNI_NATIVE_SOURCES), plugin/$(plugin)/$(source)))
ALL_JNI_NATIVE_SOURCES = $(JNI_NATIVE_SOURCES) $(ROCKSDB_PLUGIN_JNI_NATIVE_SOURCES)
ROCKSDB_PLUGIN_JNI_CXX_INCLUDEFLAGS = $(foreach plugin, $(ROCKSDB_PLUGINS), -I./plugin/$(plugin))
ifneq ($(strip $(ROCKSDB_PLUGIN_PKGCONFIG_REQUIRES)),)
LDFLAGS := $(LDFLAGS) $(shell pkg-config --libs $(ROCKSDB_PLUGIN_PKGCONFIG_REQUIRES))
ifneq ($(.SHELLSTATUS),0)
$(error pkg-config failed)
endif
CXXFLAGS := $(CXXFLAGS) $(shell pkg-config --cflags $(ROCKSDB_PLUGIN_PKGCONFIG_REQUIRES))
ifneq ($(.SHELLSTATUS),0)
$(error pkg-config failed)
endif
endif
CXXFLAGS += $(ARCHFLAG)
ifeq (,$(shell $(CXX) -fsyntax-only -march=armv8-a+crc+crypto -xc /dev/null 2>&1))
ifneq ($(PLATFORM), OS_MACOSX)
CXXFLAGS += -march=armv8-a+crc+crypto
CFLAGS += -march=armv8-a+crc+crypto
ARMCRC_SOURCE = 1
endif
endif
export JAVAC_ARGS
CLEAN_FILES += make_config.mk rocksdb.pc
ifeq ($(V), 1)
$(info $(shell uname -a))
$(info $(shell $(CC) --version))
$(info $(shell $(CXX) --version))
endif
missing_make_config_paths := $(shell \
  grep "\./\S*\|/\S*" -o $(CURDIR)/make_config.mk | \
  while read path; \
    do [ -e $$path ] || echo $$path; \
  done | sort | uniq | grep -v "/DOES/NOT/EXIST")
$(foreach path, $(missing_make_config_paths), \
  $(warning Warning: $(path) does not exist))
ifeq ($(PLATFORM), OS_AIX)
# no debug info
else ifneq ($(PLATFORM), IOS)
CFLAGS += -g
CXXFLAGS += -g
else
# no debug info for IOS, that will make our library big
OPT += -DNDEBUG
endif
ifeq ($(PLATFORM), OS_AIX)
ARFLAGS = -X64 rs
STRIPFLAGS = -X64 -x
endif
ifeq ($(PLATFORM), OS_SOLARIS)
PLATFORM_CXXFLAGS += -D _GLIBCXX_USE_C99
endif
ifeq ($(LIB_MODE), shared)
# So that binaries are executable from build location, in addition to install location
EXEC_LDFLAGS += -Wl,-rpath -Wl,'$$ORIGIN'
endif
ifeq ($(PLATFORM), OS_MACOSX)
ifeq ($(ARCHFLAG), -arch arm64)
ifneq ($(MACHINE), arm64)
# If we're building on a non-arm64 machine but targeting arm64 Mac, we need to disable
# linking with jemalloc (as it won't be arm64-compatible) and remove some other options
# set during platform detection
DISABLE_JEMALLOC = 1
Simplify detection of x86 CPU features (#11419)
Summary:
**Background** - runtime detection of certain x86 CPU features was added for optimizing CRC32c checksums, where performance is dramatically affected by the availability of certain CPU instructions and code using intrinsics for those instructions. And Java builds with native library try to be broadly compatible but performant.
What has changed is that CRC32c is no longer the most efficient checksum on contemporary x86_64 hardware, nor the default checksum. XXH3 is generally faster and not as dramatically impacted by the availability of certain CPU instructions. For example, on my Skylake system using db_bench (similar on an older Skylake system without AVX512):
PORTABLE=1 empty USE_SSE : xxh3->8 GB/s crc32c->0.8 GB/s (no SSE4.2 nor AVX2 instructions)
PORTABLE=1 USE_SSE=1 : xxh3->19 GB/s crc32c->16 GB/s (with SSE4.2 and AVX2)
PORTABLE=0 USE_SSE ignored: xxh3->28 GB/s crc32c->16 GB/s (also some AVX512)
Testing a ~10 year old system, with SSE4.2 but without AVX2, crc32c is a similar speed to the new systems but xxh3 is only about half that speed, also 8GB/s like the non-AVX2 compile above. Given that xxh3 has specific optimization for AVX2, I think we can infer that crc32c is only fastest for that ~2008-2013 period when SSE4.2 was included but not AVX2. And given that xxh3 is only about 2x slower on these systems (not like >10x slower for unoptimized crc32c), I don't think we need to invest too much in optimally adapting to these old cases.
x86 hardware that doesn't support fast CRC32c is now extremely rare, so requiring a custom build to support such hardware is fine IMHO.
**This change** does two related things:
* Remove runtime CPU detection for optimizing CRC32c on x86. Maintaining this code is non-zero work, and compiling special code that doesn't work on the configured target instruction set for code generation is always dubious. (On the one hand we have to ensure the CRC32c code uses SSE4.2 but on the other hand we have to ensure nothing else does.)
* Detect CPU features in source code, not in build scripts. Although there are some hypothetical advantages to detecting in build scripts (compiler generality), RocksDB supports at least three build systems: make, cmake, and buck. It's not practical to support feature detection on all three, and we have suffered from missed optimization opportunities by relying on missing or incomplete detection in cmake and buck. We also depend on some components like xxhash that do source code detection anyway.
**In more detail:**
* `HAVE_SSE42`, `HAVE_AVX2`, and `HAVE_PCLMUL` replaced by standard macros `__SSE4_2__`, `__AVX2__`, and `__PCLMUL__`.
* MSVC does not provide high fidelity defines for SSE, PCLMUL, or POPCNT, but we can infer those from `__AVX__` or `__AVX2__` in a compatibility header. In rare cases of false negative or false positive feature detection, a build engineer should be able to set defines to work around the issue.
* `__POPCNT__` is another standard define, but we happen to only need it on MSVC, where it is set by that compatibility header, or can be set by the build engineer.
* `PORTABLE` can be set to a CPU type, e.g. "haswell", to compile for that CPU type.
* `USE_SSE` is deprecated, now equivalent to PORTABLE=haswell, which roughly approximates its old behavior.
Notably, this change should enable more builds to use the AVX2-optimized Bloom filter implementation.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11419
Test Plan:
existing tests, CI
Manual performance tests after the change match the before above (none expected with make build).
We also see AVX2 optimized Bloom filter code enabled when expected, by injecting a compiler error. (Performance difference is not big on my current CPU.)
Reviewed By: ajkr
Differential Revision: D45489041
Pulled By: pdillinger
fbshipit-source-id: 60ceb0dd2aa3b365c99ed08a8b2a087a9abb6a70
2 years ago
PLATFORM_CCFLAGS := $(filter-out -march=native, $(PLATFORM_CCFLAGS))
PLATFORM_CXXFLAGS := $(filter-out -march=native, $(PLATFORM_CXXFLAGS))
endif
endif
endif
# ASAN doesn't work well with jemalloc. If we're compiling with ASAN, we should use regular malloc.
ifdef COMPILE_WITH_ASAN
DISABLE_JEMALLOC = 1
ASAN_OPTIONS ?= detect_stack_use_after_return=1
export ASAN_OPTIONS
EXEC_LDFLAGS += -fsanitize=address
PLATFORM_CCFLAGS += -fsanitize=address
PLATFORM_CXXFLAGS += -fsanitize=address
Improve / clean up meta block code & integrity (#9163)
Summary:
* Checksums are now checked on meta blocks unless specifically
suppressed or not applicable (e.g. plain table). (Was other way around.)
This means a number of cases that were not checking checksums now are,
including direct read TableProperties in Version::GetTableProperties
(fixed in meta_blocks ReadTableProperties), reading any block from
PersistentCache (fixed in BlockFetcher), read TableProperties in
SstFileDumper (ldb/sst_dump/BackupEngine) before table reader open,
maybe more.
* For that to work, I moved the global_seqno+TableProperties checksum
logic to the shared table/ code, because that is used by many utilities
such as SstFileDumper.
* Also for that to work, we have to know when we're dealing with a block
that has a checksum (trailer), so added that capability to Footer based
on magic number, and from there BlockFetcher.
* Knowledge of trailer presence has also fixed a problem where other
table formats were reading blocks including bytes for a nonexistent
trailer--and awkwardly kind-of not using them, e.g. no shared code
checking checksums. (BlockFetcher compression type was populated
incorrectly.) Now we only read what is needed.
* Minimized code duplication and differing/incompatible/awkward
abstractions in meta_blocks.{cc,h} (e.g. SeekTo in metaindex block
without parsing block handle)
* Moved some meta block handling code from table_properties*.*
* Moved some code specific to block-based table from shared table/ code
to BlockBasedTable class. The checksum stuff means we can't completely
separate it, but things that don't need to be in shared table/ code
should not be.
* Use unique_ptr rather than raw ptr in more places. (Note: you can
std::move from unique_ptr to shared_ptr.)
Without enhancements to GetPropertiesOfAllTablesTest (see below),
net reduction of roughly 100 lines of code.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9163
Test Plan:
existing tests and
* Enhanced DBTablePropertiesTest.GetPropertiesOfAllTablesTest to verify that
checksums are now checked on direct read of table properties by TableCache
(new test would fail before this change)
* Also enhanced DBTablePropertiesTest.GetPropertiesOfAllTablesTest to test
putting table properties under old meta name
* Also generally enhanced that same test to actually test what it was
supposed to be testing already, by kicking things out of table cache when
we don't want them there.
Reviewed By: ajkr, mrambacher
Differential Revision: D32514757
Pulled By: pdillinger
fbshipit-source-id: 507964b9311d186ae8d1131182290cbd97a99fa9
3 years ago
ifeq ($(LIB_MODE), shared)
ifdef USE_CLANG
# Fix false ODR violation; see https://github.com/google/sanitizers/issues/1017
EXEC_LDFLAGS += -mllvm -asan-use-private-alias=1
PLATFORM_CXXFLAGS += -mllvm -asan-use-private-alias=1
endif
endif
endif
# TSAN doesn't work well with jemalloc. If we're compiling with TSAN, we should use regular malloc.
ifdef COMPILE_WITH_TSAN
DISABLE_JEMALLOC = 1
EXEC_LDFLAGS += -fsanitize=thread
Fix TSAN failures in DistributedMutex tests (#5684)
Summary:
TSAN was not able to correctly instrument atomic bts and btr instructions, so
when TSAN is enabled implement those with std::atomic::fetch_or and
std::atomic::fetch_and. Also disable tests that fail on TSAN with false
positives (we know these are false positives because this other verifiably
correct program fails with the same TSAN error <link>)
```
make clean
TEST_TMPDIR=/dev/shm/rocksdb OPT=-g COMPILE_WITH_TSAN=1 make J=1 -j56 folly_synchronization_distributed_mutex_test
```
This is the code that fails with the same false positive under TSAN
```
namespace {
class ExceptionWithConstructionTrack : public std::exception {
 public:
  explicit ExceptionWithConstructionTrack(int id)
      : id_{folly::to<std::string>(id)}, constructionTrack_{id} {}
  const char* what() const noexcept override {
    return id_.c_str();
  }
 private:
  std::string id_;
  TestConstruction constructionTrack_;
};
template <typename Storage, typename Atomic>
void transferCurrentException(Storage& storage, Atomic& produced) {
  assert(std::current_exception());
  new (&storage) std::exception_ptr(std::current_exception());
  produced->store(true, std::memory_order_release);
}
void concurrentExceptionPropagationStress(
    int numThreads,
    std::chrono::milliseconds milliseconds) {
  auto&& stop = std::atomic<bool>{false};
  auto&& exceptions = std::vector<std::aligned_storage<48, 8>::type>{};
  auto&& produced = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumed = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumers = std::vector<std::thread>{};
  for (auto i = 0; i < numThreads; ++i) {
    produced.emplace_back(new std::atomic<bool>{false});
    consumed.emplace_back(new std::atomic<bool>{false});
    exceptions.push_back({});
  }
  auto producer = std::thread{[&]() {
    auto counter = std::vector<int>(numThreads, 0);
    for (auto i = 0; true; i = ((i + 1) % numThreads)) {
      try {
        throw ExceptionWithConstructionTrack{counter.at(i)++};
      } catch (...) {
        transferCurrentException(exceptions.at(i), produced.at(i));
      }
      while (!consumed.at(i)->load(std::memory_order_acquire)) {
        if (stop.load(std::memory_order_acquire)) {
          return;
        }
      }
      consumed.at(i)->store(false, std::memory_order_release);
    }
  }};
  for (auto i = 0; i < numThreads; ++i) {
    consumers.emplace_back([&, i]() {
      auto counter = 0;
      while (true) {
        while (!produced.at(i)->load(std::memory_order_acquire)) {
          if (stop.load(std::memory_order_acquire)) {
            return;
          }
        }
        produced.at(i)->store(false, std::memory_order_release);
        try {
          auto storage = &exceptions.at(i);
          auto exc = folly::launder(
              reinterpret_cast<std::exception_ptr*>(storage));
          auto copy = std::move(*exc);
          exc->std::exception_ptr::~exception_ptr();
          std::rethrow_exception(std::move(copy));
        } catch (std::exception& exc) {
          auto value = std::stoi(exc.what());
          EXPECT_EQ(value, counter++);
        }
        consumed.at(i)->store(true, std::memory_order_release);
      }
    });
  }
  std::this_thread::sleep_for(milliseconds);
  stop.store(true);
  producer.join();
  for (auto& thread : consumers) {
    thread.join();
  }
}
} // namespace
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5684
Differential Revision: D16746077
Pulled By: miasantreble
fbshipit-source-id: 8af88dcf9161c05daec1a76290f577918638f79d
5 years ago
PLATFORM_CCFLAGS += -fsanitize=thread -fPIC -DFOLLY_SANITIZE_THREAD
PLATFORM_CXXFLAGS += -fsanitize=thread -fPIC -DFOLLY_SANITIZE_THREAD
# Turn off -pg when enabling TSAN testing, because that induces
# a link failure. TODO: find the root cause
PROFILING_FLAGS =
# LUA is not supported under TSAN
LUA_PATH =
# Limit keys for crash test under TSAN to avoid error:
# "ThreadSanitizer: DenseSlabAllocator overflow. Dying."
CRASH_TEST_EXT_ARGS += --max_key=1000000
endif
# AIX doesn't work with -pg
ifeq ($(PLATFORM), OS_AIX)
PROFILING_FLAGS =
endif
# UBSAN doesn't work well with jemalloc. If we're compiling with UBSAN, we should use regular malloc.
ifdef COMPILE_WITH_UBSAN
DISABLE_JEMALLOC = 1
# Suppress alignment warning because murmurhash relies on casting unaligned
# memory to integer. Fixing it may cause performance regression. 3-way crc32
# relies on it too, although it can be rewritten to eliminate with minimal
# performance regression.
EXEC_LDFLAGS += -fsanitize=undefined -fno-sanitize-recover=all
PLATFORM_CCFLAGS += -fsanitize=undefined -fno-sanitize-recover=all -DROCKSDB_UBSAN_RUN
PLATFORM_CXXFLAGS += -fsanitize=undefined -fno-sanitize-recover=all -DROCKSDB_UBSAN_RUN
endif
ifdef ROCKSDB_VALGRIND_RUN
PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN
PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN
endif
ifdef ROCKSDB_FULL_VALGRIND_RUN
# Some tests are slow when run under valgrind and are only run when
# explicitly requested via the ROCKSDB_FULL_VALGRIND_RUN compiler flag.
PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN -DROCKSDB_FULL_VALGRIND_RUN
PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN -DROCKSDB_FULL_VALGRIND_RUN
endif
ifndef DISABLE_JEMALLOC
ifdef JEMALLOC
PLATFORM_CXXFLAGS += -DROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE
PLATFORM_CCFLAGS += -DROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE
Meta-internal folly integration with F14FastMap (#9546)
Summary:
Especially after updating to C++17, I don't see a compelling case for
*requiring* any folly components in RocksDB. I was able to purge the existing
hard dependencies, and it can be quite difficult to strip out non-trivial components
from folly for use in RocksDB. (The prospect of doing that on F14 has changed
my mind on the best approach here.)
But this change creates an optional integration where we can plug in
components from folly at compile time, starting here with F14FastMap to replace
std::unordered_map when possible (probably no public APIs for example). I have
replaced the biggest CPU users of std::unordered_map with compile-time
pluggable UnorderedMap which will use F14FastMap when USE_FOLLY is set.
USE_FOLLY is always set in the Meta-internal buck build, and a simulation of
that is in the Makefile for public CI testing. A full folly build is not needed, but
checking out the full folly repo is much simpler for getting the dependency,
and anything else we might want to optionally integrate in the future.
Some picky details:
* I don't think the distributed mutex stuff is actually used, so it was easy to remove.
* I implemented an alternative to `folly::constexpr_log2` (which is much easier
in C++17 than C++11) so that I could pull out the hard dependencies on
`ConstexprMath.h`
* I had to add noexcept move constructors/operators to some types to make
F14's complainUnlessNothrowMoveAndDestroy check happy, and I added a
macro to make that easier in some common cases.
* Updated Meta-internal buck build to use folly F14Map (always)
No updates to HISTORY.md nor INSTALL.md as this is not (yet?) considered a
production integration for open source users.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9546
Test Plan:
CircleCI tests updated so that a couple of them use folly.
Most internal unit & stress/crash tests updated to use Meta-internal latest folly.
(Note: they should probably use buck but they currently use Makefile.)
Example performance improvement: when filter partitions are pinned in cache,
they are tracked by PartitionedFilterBlockReader::filter_map_ and we can build
a test that exercises that heavily. Build DB with
```
TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -partition_index_and_filters
```
and test with (simultaneous runs with & without folly, ~20 times each to see
convergence)
```
TEST_TMPDIR=/dev/shm/rocksdb ./db_bench_folly -readonly -use_existing_db -benchmarks=readrandom -num=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -partition_index_and_filters -duration=40 -pin_l0_filter_and_index_blocks_in_cache
```
Average ops/s no folly: 26229.2
Average ops/s with folly: 26853.3 (+2.4%)
Reviewed By: ajkr
Differential Revision: D34181736
Pulled By: pdillinger
fbshipit-source-id: ffa6ad5104c2880321d8a1aa7187e00ab0d02e94
ifeq ($(USE_FOLLY),1)
PLATFORM_CXXFLAGS += -DUSE_JEMALLOC
PLATFORM_CCFLAGS += -DUSE_JEMALLOC
endif
ifeq ($(USE_FOLLY_LITE),1)
PLATFORM_CXXFLAGS += -DUSE_JEMALLOC
PLATFORM_CCFLAGS += -DUSE_JEMALLOC
endif
endif
ifdef WITH_JEMALLOC_FLAG
PLATFORM_LDFLAGS += -ljemalloc
JAVA_LDFLAGS += -ljemalloc
endif
EXEC_LDFLAGS := $(JEMALLOC_LIB) $(EXEC_LDFLAGS)
PLATFORM_CXXFLAGS += $(JEMALLOC_INCLUDE)
PLATFORM_CCFLAGS += $(JEMALLOC_INCLUDE)
endif
ifndef USE_FOLLY
USE_FOLLY = 0
endif
ifndef GTEST_THROW_ON_FAILURE
export GTEST_THROW_ON_FAILURE = 1
endif
ifndef GTEST_HAS_EXCEPTIONS
export GTEST_HAS_EXCEPTIONS = 1
endif
GTEST_DIR = third-party/gtest-1.8.1/fused-src
# AIX: pre-defined system headers are surrounded by an extern "C" block
ifeq ($(PLATFORM), OS_AIX)
PLATFORM_CCFLAGS += -I$(GTEST_DIR)
PLATFORM_CXXFLAGS += -I$(GTEST_DIR)
else
PLATFORM_CCFLAGS += -isystem $(GTEST_DIR)
PLATFORM_CXXFLAGS += -isystem $(GTEST_DIR)
endif
# This provides a Makefile simulation of a Meta-internal folly integration.
# It is not validated for general use.
#
# USE_FOLLY links the build targets with libfolly.a. The latter could be
# built using 'make build_folly', or built externally and specified in
# the CXXFLAGS and EXTRA_LDFLAGS env variables. The build_detect_platform
# script tries to detect if an external folly dependency has been specified.
# If not, it exports FOLLY_PATH to the path of the installed Folly and
# dependency libraries.
#
# USE_FOLLY_LITE cherry picks source files from Folly to include in the
# RocksDB library. It's faster and has fewer dependencies on 3rd party
# libraries, but with limited functionality. For example, coroutine
# functionality is not available.
ifeq ($(USE_FOLLY),1)
ifeq ($(USE_FOLLY_LITE),1)
$(error Please specify only one of USE_FOLLY and USE_FOLLY_LITE)
endif
ifneq ($(strip $(FOLLY_PATH)),)
BOOST_PATH = $(shell (ls -d $(FOLLY_PATH)/../boost*))
DBL_CONV_PATH = $(shell (ls -d $(FOLLY_PATH)/../double-conversion*))
GFLAGS_PATH = $(shell (ls -d $(FOLLY_PATH)/../gflags*))
GLOG_PATH = $(shell (ls -d $(FOLLY_PATH)/../glog*))
LIBEVENT_PATH = $(shell (ls -d $(FOLLY_PATH)/../libevent*))
XZ_PATH = $(shell (ls -d $(FOLLY_PATH)/../xz*))
LIBSODIUM_PATH = $(shell (ls -d $(FOLLY_PATH)/../libsodium*))
FMT_PATH = $(shell (ls -d $(FOLLY_PATH)/../fmt*))
# For some reason, glog and fmt libraries are under either lib or lib64
GLOG_LIB_PATH = $(shell (ls -d $(GLOG_PATH)/lib*))
FMT_LIB_PATH = $(shell (ls -d $(FMT_PATH)/lib*))
# AIX: pre-defined system headers are surrounded by an extern "C" block
ifeq ($(PLATFORM), OS_AIX)
PLATFORM_CCFLAGS += -I$(BOOST_PATH)/include -I$(DBL_CONV_PATH)/include -I$(GLOG_PATH)/include -I$(LIBEVENT_PATH)/include -I$(XZ_PATH)/include -I$(LIBSODIUM_PATH)/include -I$(FOLLY_PATH)/include -I$(FMT_PATH)/include
PLATFORM_CXXFLAGS += -I$(BOOST_PATH)/include -I$(DBL_CONV_PATH)/include -I$(GLOG_PATH)/include -I$(LIBEVENT_PATH)/include -I$(XZ_PATH)/include -I$(LIBSODIUM_PATH)/include -I$(FOLLY_PATH)/include -I$(FMT_PATH)/include
else
PLATFORM_CCFLAGS += -isystem $(BOOST_PATH)/include -isystem $(DBL_CONV_PATH)/include -isystem $(GLOG_PATH)/include -isystem $(LIBEVENT_PATH)/include -isystem $(XZ_PATH)/include -isystem $(LIBSODIUM_PATH)/include -isystem $(FOLLY_PATH)/include -isystem $(FMT_PATH)/include
PLATFORM_CXXFLAGS += -isystem $(BOOST_PATH)/include -isystem $(DBL_CONV_PATH)/include -isystem $(GLOG_PATH)/include -isystem $(LIBEVENT_PATH)/include -isystem $(XZ_PATH)/include -isystem $(LIBSODIUM_PATH)/include -isystem $(FOLLY_PATH)/include -isystem $(FMT_PATH)/include
endif
# Add -ldl at the end as gcc resolves a symbol in a library by searching only in libraries specified later
# in the command line
PLATFORM_LDFLAGS += $(FOLLY_PATH)/lib/libfolly.a $(BOOST_PATH)/lib/libboost_context.a $(BOOST_PATH)/lib/libboost_filesystem.a $(BOOST_PATH)/lib/libboost_atomic.a $(BOOST_PATH)/lib/libboost_program_options.a $(BOOST_PATH)/lib/libboost_regex.a $(BOOST_PATH)/lib/libboost_system.a $(BOOST_PATH)/lib/libboost_thread.a $(DBL_CONV_PATH)/lib/libdouble-conversion.a $(FMT_LIB_PATH)/libfmt.a $(GLOG_LIB_PATH)/libglog.so $(GFLAGS_PATH)/lib/libgflags.so.2.2 $(LIBEVENT_PATH)/lib/libevent-2.1.so -ldl
PLATFORM_LDFLAGS += -Wl,-rpath=$(GFLAGS_PATH)/lib -Wl,-rpath=$(GLOG_LIB_PATH) -Wl,-rpath=$(LIBEVENT_PATH)/lib -Wl,-rpath=$(LIBSODIUM_PATH)/lib -Wl,-rpath=$(LIBEVENT_PATH)/lib
endif
PLATFORM_CCFLAGS += -DUSE_FOLLY -DFOLLY_NO_CONFIG
PLATFORM_CXXFLAGS += -DUSE_FOLLY -DFOLLY_NO_CONFIG
endif
ifeq ($(USE_FOLLY_LITE),1)
# Path to the Folly source code and include files
FOLLY_DIR = ./third-party/folly
# AIX: pre-defined system headers are surrounded by an extern "C" block
ifeq ($(PLATFORM), OS_AIX)
PLATFORM_CCFLAGS += -I$(FOLLY_DIR)
PLATFORM_CXXFLAGS += -I$(FOLLY_DIR)
else
PLATFORM_CCFLAGS += -isystem $(FOLLY_DIR)
PLATFORM_CXXFLAGS += -isystem $(FOLLY_DIR)
endif
PLATFORM_CCFLAGS += -DUSE_FOLLY -DFOLLY_NO_CONFIG
PLATFORM_CXXFLAGS += -DUSE_FOLLY -DFOLLY_NO_CONFIG
# TODO: fix linking with fbcode compiler config
PLATFORM_LDFLAGS += -lglog
endif
ifdef TEST_CACHE_LINE_SIZE
PLATFORM_CCFLAGS += -DTEST_CACHE_LINE_SIZE=$(TEST_CACHE_LINE_SIZE)
PLATFORM_CXXFLAGS += -DTEST_CACHE_LINE_SIZE=$(TEST_CACHE_LINE_SIZE)
endif
ifdef TEST_UINT128_COMPAT
PLATFORM_CCFLAGS += -DTEST_UINT128_COMPAT=1
PLATFORM_CXXFLAGS += -DTEST_UINT128_COMPAT=1
endif
ifdef ROCKSDB_MODIFY_NPHASH
PLATFORM_CCFLAGS += -DROCKSDB_MODIFY_NPHASH=1
PLATFORM_CXXFLAGS += -DROCKSDB_MODIFY_NPHASH=1
endif
# This (the first rule) must depend on "all".
default: all
WARNING_FLAGS = -W -Wextra -Wall -Wsign-compare -Wshadow \
  -Wunused-parameter
ifeq (,$(filter amd64, $(MACHINE)))
C_WARNING_FLAGS = -Wstrict-prototypes
endif
ifdef USE_CLANG
# Used by some teams in Facebook
WARNING_FLAGS += -Wshift-sign-overflow -Wambiguous-reversed-operator
endif
ifeq ($(PLATFORM), OS_OPENBSD)
WARNING_FLAGS += -Wno-unused-lambda-capture
endif
ifndef DISABLE_WARNING_AS_ERROR
WARNING_FLAGS += -Werror
endif
ifdef LUA_PATH
ifndef LUA_INCLUDE
LUA_INCLUDE = $(LUA_PATH)/include
endif
LUA_INCLUDE_FILE = $(LUA_INCLUDE)/lualib.h
ifeq ("$(wildcard $(LUA_INCLUDE_FILE))", "")
# LUA_INCLUDE_FILE does not exist
$(error Cannot find lualib.h under $(LUA_INCLUDE). Try to specify both LUA_PATH and LUA_INCLUDE manually)
endif
LUA_FLAGS = -I$(LUA_INCLUDE) -DLUA -DLUA_COMPAT_ALL
CFLAGS += $(LUA_FLAGS)
CXXFLAGS += $(LUA_FLAGS)
ifndef LUA_LIB
LUA_LIB = $(LUA_PATH)/lib/liblua.a
endif
ifeq ("$(wildcard $(LUA_LIB))", "") # LUA_LIB does not exist
$(error $(LUA_LIB) does not exist. Try to specify both LUA_PATH and LUA_LIB manually)
endif
EXEC_LDFLAGS += $(LUA_LIB)
endif
Port 3 way SSE4.2 crc32c implementation from Folly
Summary:
**# Summary**
RocksDB uses SSE crc32 intrinsics to calculate crc32 values, but it does so in a single-way fashion (not pipelined on a single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics and then uses the pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead of its own, this algorithm shows perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB calls crc32c on are over 4KB. Initial db_bench runs show a tremendous CPU gain.
This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag, NO_THREEWAY_CRC32C. If the user compiles the code with NO_THREEWAY_CRC32C=1, the old SSE Crc32c algorithm is used. If the server does not have SSE4.2 at run time, the slow (non-SSE) way is used.
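The selection order described above (runtime SSE4.2 check first, then the compile-time tag) can be sketched as follows. This is an illustration only, not the code from this change: the `crc32c_sketch` namespace, `Impl`, `ChooseImpl`, and `PortableCrc32c` are hypothetical names, the CPU capability is passed in as a plain `bool` instead of being detected, and the bitwise loop merely stands in for the "slow way" so the sketch has something runnable to check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace crc32c_sketch {  // hypothetical namespace, for illustration only

enum class Impl { kThreeWaySse42, kSingleWaySse42, kPortable };

// Runtime capability decides whether SSE4.2 can be used at all; the
// NO_THREEWAY_CRC32C compile-time tag then picks between the two SSE paths.
inline Impl ChooseImpl(bool cpu_has_sse42) {
  if (!cpu_has_sse42) {
    return Impl::kPortable;
  }
#ifdef NO_THREEWAY_CRC32C
  return Impl::kSingleWaySse42;
#else
  return Impl::kThreeWaySse42;
#endif
}

// Bit-at-a-time CRC32C (Castagnoli, reflected polynomial 0x82F63B78).
// Slow, but portable and easy to verify.
inline std::uint32_t PortableCrc32c(const char* data, std::size_t n) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < n; ++i) {
    crc ^= static_cast<unsigned char>(data[i]);
    for (int bit = 0; bit < 8; ++bit) {
      // Subtracting from 0u yields an all-ones mask when the low bit is set.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// The standard CRC32C check value for the ASCII string "123456789".
inline void SelfCheck() {
  assert(PortableCrc32c("123456789", 9) == 0xE3069283u);
}

}  // namespace crc32c_sketch
```

Any faster implementation, 3-way or single-way, has to return the same values as the portable loop, so a check like `SelfCheck()` is a cheap guard when switching between them.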
**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.
Before this change, the CRC32 value computation took about 11.5% of total CPU cost; with the new 3-way algorithm it is reduced to around 4.5%. The overall throughput also improved from 25.53 MB/s to 27.63 MB/s.
1) ReadRandom in db_bench overall metrics
PER RUN
Algorithm  | run | micros/op | ops/sec | Throughput (MB/s)
3-way      | 1   | 4.143     | 241387  | 26.7
3-way      | 2   | 3.775     | 264872  | 29.3
3-way      | 3   | 4.116     | 242929  | 26.9
FastCrc32c | 1   | 4.037     | 247727  | 27.4
FastCrc32c | 2   | 4.648     | 215166  | 23.8
FastCrc32c | 3   | 4.352     | 229799  | 25.4
AVG
Algorithm  | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
3-way      | 4.01                 | 249,729            | 27.63
FastCrc32c | 4.35                 | 230,897            | 25.53
2) Crc32c computation CPU cost (inclusive samples percentage)
PER RUN
Implementation | run | TotalSamples  | Crc32c percentage
3-way          | 1   | 4,572,250,000 | 4.37%
3-way          | 2   | 3,779,250,000 | 4.62%
3-way          | 3   | 4,129,500,000 | 4.48%
FastCrc32c     | 1   | 4,663,500,000 | 11.24%
FastCrc32c     | 2   | 4,047,500,000 | 12.34%
FastCrc32c     | 3   | 4,366,750,000 | 11.68%
**# Test Plan**
make -j64 corruption_test && ./corruption_test
By default it uses 3-way SSE algorithm
NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test
make clean && DEBUG_LEVEL=0 make -j64 db_bench
make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173
Differential Revision: D6330882
Pulled By: yingsu00
fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
ifeq ($(NO_THREEWAY_CRC32C), 1)
CXXFLAGS += -DNO_THREEWAY_CRC32C
endif
CFLAGS += $(C_WARNING_FLAGS) $(WARNING_FLAGS) -I. -I./include $(PLATFORM_CCFLAGS) $(OPT)
CXXFLAGS += $(WARNING_FLAGS) -I. -I./include $(PLATFORM_CXXFLAGS) $(OPT) -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers
Use -Wno-invalid-offsetof instead of dangerous offset_of hack (#9563)
Summary:
After https://github.com/facebook/rocksdb/issues/9515 added a unique_ptr to Status, we see some
warnings-as-error in some internal builds like this:
```
stderr: rocksdb/src/db/compaction/compaction_job.cc:2839:7: error:
offset of on non-standard-layout type 'struct CompactionServiceResult'
[-Werror,-Winvalid-offsetof]
{offsetof(struct CompactionServiceResult, status),
^ ~~~~~~
```
I see three potential solutions to resolving this:
* Expand our use of an idiom that works around the warning (see offset_of
functions removed in this change, inspired by
https://gist.github.com/graphitemaster/494f21190bb2c63c5516) However,
this construction is invoking undefined behavior that assumes consistent
layout with no compiler-introduced indirection. A compiler incompatible
with our assumptions will likely compile the code and exhibit undefined
behavior.
* Migrate to something in place of offset, like a function mapping
CompactionServiceResult* to Status* (for the `status` field). This might
be required in the long term.
* **Selected:** Use our new C++17 dependency to use offsetof in a well-defined way
when the compiler allows it. From a comment on
https://gist.github.com/graphitemaster/494f21190bb2c63c5516:
> A final note: in C++17, offsetof is conditionally supported, which
> means that you can use it on any type (not just standard layout
> types) and the compiler will error if it can't compile it correctly.
> That appears to be the best option if you can live with C++17 and
> don't need constexpr support.
The C++17 semantics are confirmed on
https://en.cppreference.com/w/cpp/types/offsetof, so we can suppress the
warning as long as we accept that we might run into a compiler that
rejects the code, and at that point we will find a solution, such as
the more intrusive "migrate" solution above.
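As a rough illustration of the selected option next to the "migrate" alternative, here is a minimal sketch. `FakeStatus`, `FakeResult`, `StatusAt`, and `StatusOf` are hypothetical stand-ins, not the RocksDB types, and the pragma plays the role of -Wno-invalid-offsetof for this snippet only. It assumes a compiler that accepts `offsetof` on this type, which C++17 leaves conditionally supported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Same effect as -Wno-invalid-offsetof, limited to this translation unit.
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic ignored "-Winvalid-offsetof"
#endif

namespace offsetof_sketch {  // hypothetical namespace, for illustration only

struct FakeStatus {
  int code = 0;
};

// Mixing public and private data members makes this type
// non-standard-layout, which is what triggers -Winvalid-offsetof.
struct FakeResult {
  std::string output_path;
  FakeStatus status;
  int secret() const { return secret_; }

 private:
  int secret_ = 7;
};
static_assert(!std::is_standard_layout<FakeResult>::value,
              "the sketch needs a non-standard-layout type");

// Selected approach: reach the field through its byte offset.
inline FakeStatus* StatusAt(FakeResult* r) {
  const std::size_t off = offsetof(FakeResult, status);
  return reinterpret_cast<FakeStatus*>(reinterpret_cast<char*>(r) + off);
}

// "Migrate" alternative: a plain function mapping the struct to its field.
inline FakeStatus* StatusOf(FakeResult* r) { return &r->status; }

// Both routes should land on the same object.
inline void SelfCheck() {
  FakeResult r;
  r.status.code = 42;
  assert(StatusAt(&r) == StatusOf(&r));
  assert(StatusAt(&r)->code == 42);
}

}  // namespace offsetof_sketch
```

The function-based accessor needs no warning suppression at all, which is why it remains the fallback if some compiler rejects the `offsetof` form.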
Although this is currently only showing in our buck build, it will
surely show up also with make and cmake, so I have updated those
configurations as well.
Also in the buck build, -Wno-expansion-to-defined does not appear to be
needed anymore (both current compiler configurations) so I
removed it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9563
Test Plan: Tried out buck builds with both current compiler configurations
Reviewed By: riversand963
Differential Revision: D34220931
Pulled By: pdillinger
fbshipit-source-id: d39436008259bd1eaaa87c77be69fb2a5b559e1f
# Allow offsetof to work on non-standard layout types. Some compiler could
# completely reject our usage of offsetof, but we will solve that when it
# happens.
CXXFLAGS += -Wno-invalid-offsetof
LDFLAGS += $(PLATFORM_LDFLAGS)
LIB_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(LIB_SOURCES))
LIB_OBJECTS += $(patsubst %.cc, $(OBJ_DIR)/%.o, $(ROCKSDB_PLUGIN_SOURCES))
ifeq ($(HAVE_POWER8),1)
LIB_OBJECTS += $(patsubst %.c, $(OBJ_DIR)/%.o, $(LIB_SOURCES_C))
LIB_OBJECTS += $(patsubst %.S, $(OBJ_DIR)/%.o, $(LIB_SOURCES_ASM))
endif
ifeq ($(USE_FOLLY_LITE),1)
LIB_OBJECTS += $(patsubst %.cpp, $(OBJ_DIR)/%.o, $(FOLLY_SOURCES))
endif
# range_tree is not compatible with non GNU libc on ppc64
# see https://jira.percona.com/browse/PS-7559
ifneq ($(PPC_LIBC_IS_GNU),0)
LIB_OBJECTS += $(patsubst %.cc, $(OBJ_DIR)/%.o, $(RANGE_TREE_SOURCES))
endif
GTEST = $(OBJ_DIR)/$(GTEST_DIR)/gtest/gtest-all.o
TESTUTIL = $(OBJ_DIR)/test_util/testutil.o
TESTHARNESS = $(OBJ_DIR)/test_util/testharness.o $(TESTUTIL) $(GTEST)
VALGRIND_ERROR = 2
VALGRIND_VER := $(join $(VALGRIND_VER),valgrind)
VALGRIND_OPTS = --error-exitcode=$(VALGRIND_ERROR) --leak-check=full
# Not yet supported: --show-leak-kinds=definite,possible,reachable --errors-for-leak-kinds=definite,possible,reachable
TEST_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(TEST_LIB_SOURCES) $(MOCK_LIB_SOURCES)) $(GTEST)
BENCH_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(BENCH_LIB_SOURCES))
CACHE_BENCH_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(CACHE_BENCH_LIB_SOURCES))
TOOL_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(TOOL_LIB_SOURCES))
ANALYZE_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(ANALYZER_LIB_SOURCES))
STRESS_OBJECTS = $(patsubst %.cc, $(OBJ_DIR)/%.o, $(STRESS_LIB_SOURCES))
# Exclude build_version.cc -- a generated source file -- from all sources. Not needed for dependencies
ALL_SOURCES = $(filter-out util/build_version.cc, $(LIB_SOURCES)) $(TEST_LIB_SOURCES) $(MOCK_LIB_SOURCES) $(GTEST_DIR)/gtest/gtest-all.cc
ALL_SOURCES += $(TOOL_LIB_SOURCES) $(BENCH_LIB_SOURCES) $(CACHE_BENCH_LIB_SOURCES) $(ANALYZER_LIB_SOURCES) $(STRESS_LIB_SOURCES)
ALL_SOURCES += $(TEST_MAIN_SOURCES) $(TOOL_MAIN_SOURCES) $(BENCH_MAIN_SOURCES)
ALL_SOURCES += $(ROCKSDB_PLUGIN_SOURCES) $(ROCKSDB_PLUGIN_TESTS)
PLUGIN_TESTS = $(patsubst %.cc, %, $(notdir $(ROCKSDB_PLUGIN_TESTS)))
TESTS = $(patsubst %.cc, %, $(notdir $(TEST_MAIN_SOURCES)))
TESTS += $(patsubst %.c, %, $(notdir $(TEST_MAIN_SOURCES_C)))
TESTS += $(PLUGIN_TESTS)
# `make check-headers` to verify that each header file includes its own
# dependencies
ifneq ($(filter check-headers, $(MAKECMDGOALS)),)
# TODO: add/support JNI headers
DEV_HEADER_DIRS := $(sort include/ $(dir $(ALL_SOURCES)))
# Some headers like in port/ are platform-specific
DEV_HEADERS := $(shell $(FIND) $(DEV_HEADER_DIRS) -type f -name '*.h' | grep -E -v 'port/|plugin/|lua/|range_tree/')
else
DEV_HEADERS :=
endif
HEADER_OK_FILES = $(patsubst %.h, %.h.ok, $(DEV_HEADERS))
AM_V_CCH = $(am__v_CCH_$(V))
am__v_CCH_ = $(am__v_CCH_$(AM_DEFAULT_VERBOSITY))
am__v_CCH_0 = @echo " CC.h " $<;
am__v_CCH_1 =
%.h.ok: %.h # .h.ok not actually created, so re-checked on each invocation
# -DROCKSDB_NAMESPACE=42 ensures the namespace header is included
	$(AM_V_CCH) echo '#include "$<"' | $(CXX) $(CXXFLAGS) -DROCKSDB_NAMESPACE=42 -x c++ -c - -o /dev/null

check-headers: $(HEADER_OK_FILES)
# options_settable_test doesn't pass with UBSAN as we use hack in the test
ifdef ASSERT_STATUS_CHECKED
# TODO: finish fixing all tests to pass this check
TESTS_FAILING_ASC = \
	c_test \
	env_test \
	range_locking_test \
	testutil_test
# Since we have very few ASC exclusions left, excluding them from
# the build is the most convenient way to exclude them from testing
TESTS := $(filter-out $(TESTS_FAILING_ASC),$(TESTS))
endif
ROCKSDBTESTS_SUBSET ?= $(TESTS)
# c_test - doesn't use gtest
# env_test - suspicious use of test::TmpDir
# deletefile_test - serial because it generates giant temporary files in
# its various tests. Parallel can fill up your /dev/shm
# db_bloom_filter_test - serial because excessive space usage by instances
# of DBFilterConstructionReserveMemoryTestWithParam can fill up /dev/shm
NON_PARALLEL_TEST = \
	c_test \
	env_test \
	deletefile_test \
	db_bloom_filter_test \
	$(PLUGIN_TESTS)
PARALLEL_TEST = $(filter-out $(NON_PARALLEL_TEST), $(TESTS))
# Not necessarily well thought out or up-to-date, but matches old list
TESTS_PLATFORM_DEPENDENT := \
db_basic_test \
db_blob_basic_test \
db_encryption_test \
external_sst_file_basic_test \
auto_roll_logger_test \
bloom_test \
dynamic_bloom_test \
c_test \
checkpoint_test \
crc32c_test \
coding_test \
inlineskiplist_test \
env_basic_test \
env_test \
env_logger_test \
io_posix_test \
hash_test \
random_test \
ribbon_test \
thread_local_test \
work_queue_test \
rate_limiter_test \
perf_context_test \
iostats_context_test
# Sort ROCKSDBTESTS_SUBSET for filtering, except db_test is special (expensive)
# so is placed first (out-of-order)
ROCKSDBTESTS_SUBSET := $(filter db_test, $(ROCKSDBTESTS_SUBSET)) $(sort $(filter-out db_test, $(ROCKSDBTESTS_SUBSET)))
ifdef ROCKSDBTESTS_START
ROCKSDBTESTS_SUBSET := $(shell echo $(ROCKSDBTESTS_SUBSET) | sed 's/^.*$(ROCKSDBTESTS_START)/$(ROCKSDBTESTS_START)/')
endif
ifdef ROCKSDBTESTS_END
ROCKSDBTESTS_SUBSET := $(shell echo $(ROCKSDBTESTS_SUBSET) | sed 's/$(ROCKSDBTESTS_END).*//')
endif
ifeq ($(ROCKSDBTESTS_PLATFORM_DEPENDENT),only)
ROCKSDBTESTS_SUBSET := $(filter $(TESTS_PLATFORM_DEPENDENT), $(ROCKSDBTESTS_SUBSET))
else ifeq ($(ROCKSDBTESTS_PLATFORM_DEPENDENT),exclude)
ROCKSDBTESTS_SUBSET := $(filter-out $(TESTS_PLATFORM_DEPENDENT), $(ROCKSDBTESTS_SUBSET))
endif
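# Usage sketch for the subset controls above. The test names and the target
# are illustrative picks from this file, not a prescribed workflow.
# ROCKSDBTESTS_START is inclusive; ROCKSDBTESTS_END is exclusive, because the
# sed expression deletes the END name and everything after it:
#
#   make ROCKSDBTESTS_START=coding_test ROCKSDBTESTS_END=env_test all_but_some_tests
#
# Both values are spliced into the sed patterns unescaped, so they are matched
# as regular expressions against the space-separated, sorted test list.
# To keep only the platform-dependent tests instead:
#
#   make ROCKSDBTESTS_PLATFORM_DEPENDENT=only all_but_some_tests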
# bench_tool_analyzer main is in bench_tool_analyzer_tool, or this would be simpler...
TOOLS = $(patsubst %.cc, %, $(notdir $(patsubst %_tool.cc, %.cc, $(TOOLS_MAIN_SOURCES))))
TEST_LIBS = \
librocksdb_env_basic_test.a
# TODO: add back forward_iterator_bench, after making it build in all environments.
BENCHMARKS = $(patsubst %.cc, %, $(notdir $(BENCH_MAIN_SOURCES)))
MICROBENCHS = $(patsubst %.cc, %, $(notdir $(MICROBENCH_SOURCES)))
# if user didn't config LIBNAME, set the default
ifeq ($(LIBNAME),)
LIBNAME = librocksdb
# we should only run rocksdb in production with DEBUG_LEVEL 0
ifneq ($(DEBUG_LEVEL),0)
LIBDEBUG = _debug
endif
endif
STATIC_LIBRARY = ${LIBNAME}$(LIBDEBUG).a
STATIC_TEST_LIBRARY = ${LIBNAME}_test$(LIBDEBUG).a
STATIC_TOOLS_LIBRARY = ${LIBNAME}_tools$(LIBDEBUG).a
STATIC_STRESS_LIBRARY = ${LIBNAME}_stress$(LIBDEBUG).a
ALL_STATIC_LIBS = $(STATIC_LIBRARY) $(STATIC_TEST_LIBRARY) $(STATIC_TOOLS_LIBRARY) $(STATIC_STRESS_LIBRARY)
SHARED_TEST_LIBRARY = ${LIBNAME}_test$(LIBDEBUG).$(PLATFORM_SHARED_EXT)
SHARED_TOOLS_LIBRARY = ${LIBNAME}_tools$(LIBDEBUG).$(PLATFORM_SHARED_EXT)
SHARED_STRESS_LIBRARY = ${LIBNAME}_stress$(LIBDEBUG).$(PLATFORM_SHARED_EXT)
ALL_SHARED_LIBS = $(SHARED1) $(SHARED2) $(SHARED3) $(SHARED4) $(SHARED_TEST_LIBRARY) $(SHARED_TOOLS_LIBRARY) $(SHARED_STRESS_LIBRARY)
ifeq ($(LIB_MODE),shared)
LIBRARY = $(SHARED1)
TEST_LIBRARY = $(SHARED_TEST_LIBRARY)
TOOLS_LIBRARY = $(SHARED_TOOLS_LIBRARY)
STRESS_LIBRARY = $(SHARED_STRESS_LIBRARY)
CLOUD_LIBRARY = $(SHARED_CLOUD_LIBRARY)
else
LIBRARY = $(STATIC_LIBRARY)
TEST_LIBRARY = $(STATIC_TEST_LIBRARY)
TOOLS_LIBRARY = $(STATIC_TOOLS_LIBRARY)
endif
STRESS_LIBRARY = $(STATIC_STRESS_LIBRARY)
ROCKSDB_MAJOR = $(shell grep -E "ROCKSDB_MAJOR.[0-9]" include/rocksdb/version.h | cut -d ' ' -f 3)
ROCKSDB_MINOR = $(shell grep -E "ROCKSDB_MINOR.[0-9]" include/rocksdb/version.h | cut -d ' ' -f 3)
ROCKSDB_PATCH = $(shell grep -E "ROCKSDB_PATCH.[0-9]" include/rocksdb/version.h | cut -d ' ' -f 3)
# If NO_UPDATE_BUILD_VERSION is set we don't update util/build_version.cc, but
# the file needs to already exist or else the build will fail
ifndef NO_UPDATE_BUILD_VERSION
# By default, use the current date-time as the date. If there are no changes,
# we will use the last commit date instead.
build_date := $(shell date "+%Y-%m-%d %T")
ifdef FORCE_GIT_SHA
git_sha := $(FORCE_GIT_SHA)
git_mod := 1
git_date := $(build_date)
else
git_sha := $(shell git rev-parse HEAD 2>/dev/null)
git_tag := $(shell git symbolic-ref -q --short HEAD 2> /dev/null || git describe --tags --exact-match 2>/dev/null)
git_mod := $(shell git diff-index HEAD --quiet 2>/dev/null; echo $$?)
git_date := $(shell git log -1 --date=format:"%Y-%m-%d %T" --format="%ad" 2>/dev/null)
endif
gen_build_version = sed -e s/@GIT_SHA@/$(git_sha)/ -e s:@GIT_TAG@:"$(git_tag)": -e s/@GIT_MOD@/"$(git_mod)"/ -e s/@BUILD_DATE@/"$(build_date)"/ -e s/@GIT_DATE@/"$(git_date)"/ -e s/@ROCKSDB_PLUGIN_BUILTINS@/'$(ROCKSDB_PLUGIN_BUILTINS)'/ -e s/@ROCKSDB_PLUGIN_EXTERNS@/"$(ROCKSDB_PLUGIN_EXTERNS)"/ util/build_version.cc.in
# Record the version of the source that we are compiling.
# We keep a record of the git revision in this file. It is then built
# as a regular source file as part of the compilation process.
# One can run "strings executable_filename | grep _build_" to find
# the version of the source that we used to build the executable file.
util/build_version.cc: $(filter-out $(OBJ_DIR)/util/build_version.o, $(LIB_OBJECTS)) util/build_version.cc.in
	$(AM_V_GEN)rm -f $@-t
	$(AM_V_at)$(gen_build_version) > $@
endif
CLEAN_FILES += util/build_version.cc
default: all
#-----------------------------------------------
# Create platform independent shared libraries.
#-----------------------------------------------
ifneq ($(PLATFORM_SHARED_EXT),)
ifneq ($(PLATFORM_SHARED_VERSIONED),true)
SHARED1 = ${LIBNAME}$(LIBDEBUG).$(PLATFORM_SHARED_EXT)
SHARED2 = $(SHARED1)
SHARED3 = $(SHARED1)
SHARED4 = $(SHARED1)
SHARED = $(SHARED1)
else
SHARED_MAJOR = $(ROCKSDB_MAJOR)
SHARED_MINOR = $(ROCKSDB_MINOR)
SHARED_PATCH = $(ROCKSDB_PATCH)
SHARED1 = ${LIBNAME}.$(PLATFORM_SHARED_EXT)
ifeq ($(PLATFORM),OS_MACOSX)
SHARED_OSX = $(LIBNAME)$(LIBDEBUG).$(SHARED_MAJOR)
SHARED2 = $(SHARED_OSX).$(PLATFORM_SHARED_EXT)
SHARED3 = $(SHARED_OSX).$(SHARED_MINOR).$(PLATFORM_SHARED_EXT)
SHARED4 = $(SHARED_OSX).$(SHARED_MINOR).$(SHARED_PATCH).$(PLATFORM_SHARED_EXT)
else
SHARED2 = $(SHARED1).$(SHARED_MAJOR)
SHARED3 = $(SHARED1).$(SHARED_MAJOR).$(SHARED_MINOR)
SHARED4 = $(SHARED1).$(SHARED_MAJOR).$(SHARED_MINOR).$(SHARED_PATCH)
endif # MACOSX
SHARED = $(SHARED1) $(SHARED2) $(SHARED3) $(SHARED4)
$(SHARED1): $(SHARED4) $(SHARED2)
	ln -fs $(SHARED4) $(SHARED1)
$(SHARED2): $(SHARED4) $(SHARED3)
	ln -fs $(SHARED4) $(SHARED2)
$(SHARED3): $(SHARED4)
	ln -fs $(SHARED4) $(SHARED3)

endif # PLATFORM_SHARED_VERSIONED
$(SHARED4): $(LIB_OBJECTS)
	$(AM_V_CCLD) $(CXX) $(PLATFORM_SHARED_LDFLAGS)$(SHARED3) $(LIB_OBJECTS) $(LDFLAGS) -o $@
endif # PLATFORM_SHARED_EXT
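
# Worked example of the versioned names above. The values are hypothetical:
# LIBNAME=librocksdb, PLATFORM_SHARED_EXT=so, version X.Y.Z, on a platform
# other than OS_MACOSX:
#
#   SHARED1 = librocksdb.so        (symlink to SHARED4)
#   SHARED2 = librocksdb.so.X      (symlink to SHARED4)
#   SHARED3 = librocksdb.so.X.Y    (symlink to SHARED4)
#   SHARED4 = librocksdb.so.X.Y.Z  (the file actually linked)
#
# On OS_MACOSX the version precedes the extension. Assuming an empty LIBDEBUG
# and PLATFORM_SHARED_EXT=dylib, SHARED4 would be librocksdb.X.Y.Z.dylib.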
.PHONY: check clean coverage ldb_tests package dbg gen-pc build_size \
	release tags tags0 valgrind_check format static_lib shared_lib all \
	rocksdbjavastatic rocksdbjava install install-static install-shared \
	uninstall analyze tools tools_lib check-headers checkout_folly

all: $(LIBRARY) $(BENCHMARKS) tools tools_lib test_libs $(TESTS)

all_but_some_tests: $(LIBRARY) $(BENCHMARKS) tools tools_lib test_libs $(ROCKSDBTESTS_SUBSET)

static_lib: $(STATIC_LIBRARY)

shared_lib: $(SHARED)

stress_lib: $(STRESS_LIBRARY)

tools: $(TOOLS)

tools_lib: $(TOOLS_LIBRARY)

test_libs: $(TEST_LIBS)

benchmarks: $(BENCHMARKS)

microbench: $(MICROBENCHS)

run_microbench: $(MICROBENCHS)
	for t in $(MICROBENCHS); do echo "===== Running benchmark $$t (`date`)"; ./$$t || exit 1; done;

dbg: $(LIBRARY) $(BENCHMARKS) tools $(TESTS)
# creates library and programs
release: clean
	LIB_MODE=$(LIB_MODE) DEBUG_LEVEL=0 $(MAKE) $(LIBRARY) tools db_bench

coverage: clean
	COVERAGEFLAGS="-fprofile-arcs -ftest-coverage" LDFLAGS+="-lgcov" $(MAKE) J=1 all check
	cd coverage && ./coverage_test.sh
# Delete intermediate files
	$(FIND) . -type f \( -name "*.gcda" -o -name "*.gcno" \) -exec rm -f {} \;

# Run all tests in parallel, accumulating per-test logs in t/log-*.
#
# Each t/run-* file is a tiny generated Bourne shell script that invokes one of
# the sub-tests. Why use a file for this? Because that makes the invocation of
# parallel below simpler, which in turn makes the parsing of parallel's
# LOG simpler (the latter is for live monitoring as parallel
# tests run).
#
# Test names are extracted by running tests with --gtest_list_tests.
# This filter removes the "#"-introduced comments, and expands to
# fully-qualified names by changing input like this:
#
# DBTest.
# Empty
# WriteEmptyBatch
# MultiThreaded/MultiThreadedDBTest.
# MultiThreaded/0 # GetParam() = 0
# MultiThreaded/1 # GetParam() = 1
#
# into this:
#
# DBTest.Empty
# DBTest.WriteEmptyBatch
# MultiThreaded/MultiThreadedDBTest.MultiThreaded/0
# MultiThreaded/MultiThreadedDBTest.MultiThreaded/1
#
parallel_tests = $(patsubst %,parallel_%,$(PARALLEL_TEST))
.PHONY: gen_parallel_tests $(parallel_tests)
$(parallel_tests):
	$(AM_V_at)TEST_BINARY=$(patsubst parallel_%,%,$@); \
	TEST_NAMES=` \
	  (./$$TEST_BINARY --gtest_list_tests || echo "  $${TEST_BINARY}__list_tests_failure") \
	  | awk '/^[^ ]/ { prefix = $$1 } /^[ ]/ { print prefix $$1 }'`; \
	echo "  Generating parallel test scripts for $$TEST_BINARY"; \
	for TEST_NAME in $$TEST_NAMES; do \
		TEST_SCRIPT=t/run-$$TEST_BINARY-$${TEST_NAME//\//-}; \
		printf '%s\n' \
		  '#!/bin/sh' \
		  "d=\$(TEST_TMPDIR)$$TEST_SCRIPT" \
		  'mkdir -p $$d' \
		  "TEST_TMPDIR=\$$d $(DRIVER) ./$$TEST_BINARY --gtest_filter=$$TEST_NAME" \
		> $$TEST_SCRIPT; \
		chmod a=rx $$TEST_SCRIPT; \
	done

gen_parallel_tests:
	$(AM_V_at)mkdir -p t
	$(AM_V_at)$(FIND) t -type f -name 'run-*' -exec rm -f {} \;
	$(MAKE) $(parallel_tests)

# Reorder input lines (which are one per test) so that the
# longest-running tests appear first in the output.
# Do this by prefixing each selected name with its duration,
# sort the resulting names, and remove the leading numbers.
# FIXME: the "100" we prepend is a fake time, for now.
# FIXME: squirrel away timings from each run and use them
# (when present) on subsequent runs to order these tests.
#
# Without this reordering, these two tests would happen to start only
# after almost all other tests had completed, thus adding 100 seconds
# to the duration of parallel "make check". That's the difference
# between 4 minutes (old) and 2m20s (new).
#
# 152.120 PASS t/DBTest.FileCreationRandomFailure
# 107.816 PASS t/DBTest.EncodeDecompressedBlockSizeTest
#
slow_test_regexp = \
	^.*MySQLStyleTransactionTest.*$$|^.*SnapshotConcurrentAccessTest.*$$|^.*SeqAdvanceConcurrentTest.*$$|^t/run-table_test-HarnessTest.Randomized$$|^t/run-db_test-.*(?:FileCreationRandomFailure|EncodeDecompressedBlockSizeTest)$$|^.*RecoverFromCorruptedWALWithoutFlush$$
prioritize_long_running_tests = \
perl -pe 's,($(slow_test_regexp)),100 $$1,' \
| sort -k1,1gr \
| sed 's/^[.0-9]* //'
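
# Illustration of the pipeline above on hypothetical input (the first script
# name is made up for the example; the second matches slow_test_regexp).
# Given these two lines on stdin:
#
#   t/run-db_test-DBTest.Empty
#   t/run-db_test-DBTest.FileCreationRandomFailure
#
# the perl step rewrites only the second line to
#   100 t/run-db_test-DBTest.FileCreationRandomFailure
# then "sort -k1,1gr" moves the line with the numeric prefix to the top, and
# sed strips the "100 " prefix again, so the output is:
#
#   t/run-db_test-DBTest.FileCreationRandomFailure
#   t/run-db_test-DBTest.Empty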
# "make check" uses
# Run with "make J=1 check" to disable parallelism in "make check".
# Run with "make J=200% check" to run two parallel jobs per core.
# The default is to run one job per core (J=100%).
# See "man parallel" for its "-j ..." option.
J ?= 100%
# Use this regexp to select the subset of tests whose names match.
tests-regexp = .
EXCLUDE_TESTS_REGEX ?= "^$$"

ifeq ($(PRINT_PARALLEL_OUTPUTS), 1)
  parallel_redir =
else ifeq ($(QUIET_PARALLEL_TESTS), 1)
  parallel_redir = >& t/$(test_log_prefix)log-{/}
else
# Default: print failure output only, as it happens
# Note: gnu_parallel --eta is now always used, but has been modified to provide
# only infrequent updates when not connected to a terminal. (CircleCI will
# kill a job if no output for 10min.)
  parallel_redir = >& t/$(test_log_prefix)log-{/} || bash -c "cat t/$(test_log_prefix)log-{/}; exit $$?"
endif
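The default `parallel_redir` idiom (capture everything in a log, print the log only on failure, keep the failing exit status) can be sketched as a standalone helper. This is an illustration under assumptions: `run_quietly` is a hypothetical name, and the portable `> file 2>&1` is used in place of bash's `>&`.

```shell
#!/bin/sh
# run_quietly LOGFILE COMMAND [ARGS...]
# Hypothetical helper mirroring the default parallel_redir behaviour.
run_quietly() {
  log=$1
  shift
  # "$?" is expanded by this shell before the inner sh starts, so the
  # inner "exit" re-raises the status of the command that just failed.
  "$@" > "$log" 2>&1 || sh -c "cat '$log'; exit $?"
}

demo_dir=$(mktemp -d)
# A passing command prints nothing; its output stays in the log file.
run_quietly "$demo_dir/log-pass" sh -c 'echo all good'
# A failing command has its log replayed and its exit status preserved.
run_quietly "$demo_dir/log-fail" sh -c 'echo boom; exit 3' \
  || echo "exit status: $?"
rm -rf "$demo_dir"
```

Expanding `$?` in the outer shell is what makes this work: by the time the inner shell runs, `cat` would otherwise have overwritten the status.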
.PHONY: check_0
check_0:
	printf '%s\n' '' \
	  'To monitor subtest <duration,pass/fail,name>,' \
	  '  run "make watch-log" in a separate window' ''; \
	{ \
	  printf './%s\n' $(filter-out $(PARALLEL_TEST),$(TESTS)); \
	  find t -name 'run-*' -print; \
	} \
	  | $(prioritize_long_running_tests) \
	  | grep -E '$(tests-regexp)' \
	  | grep -E -v '$(EXCLUDE_TESTS_REGEX)' \
	  | build_tools/gnu_parallel -j$(J) --plain --joblog=LOG --eta --gnu \
	    --tmpdir=$(TEST_TMPDIR) '{} $(parallel_redir)' ; \
	parallel_retcode=$$? ; \
	awk '{ if ($$7 != 0 || $$8 != 0) { if ($$7 == "Exitval") { h = $$0; } else { if (!f) print h; print; f = 1 } } } END { if(f) exit 1; }' < LOG ; \
	awk_retcode=$$?; \
	if [ $$parallel_retcode -ne 0 ] || [ $$awk_retcode -ne 0 ] ; then exit 1 ; fi
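The awk step in this rule scans the GNU parallel job log, whose columns are Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command. A rough standalone sketch of that filter follows; `report_failures` is a hypothetical name, and the log rows are made-up sample input, not real results.

```shell
#!/bin/sh
# Print the joblog header plus every row whose Exitval ($7) or Signal ($8)
# is nonzero, and exit 1 if any such row was seen. Same program as in the
# rule, with make's $$ written as $.
report_failures() {
  awk '{ if ($7 != 0 || $8 != 0) {
           if ($7 == "Exitval") { h = $0 }
           else { if (!f) print h; print; f = 1 }
         } }
       END { if (f) exit 1 }'
}

# Made-up joblog: one passing job and one failing job.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  Seq Host Starttime JobRuntime Send Receive Exitval Signal Command \
  1 : 0.000 1.000 0 0 0 0 ./hypothetical_passing_test \
  2 : 0.000 2.000 0 0 1 0 ./hypothetical_failing_test \
  | report_failures || echo "at least one job failed"
```

The header row passes the first test only because the string "Exitval" is not equal to 0; it is remembered and printed once, just before the first failing row.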
valgrind-exclude-regexp = InlineSkipTest.ConcurrentInsert|TransactionStressTest.DeadlockStress|DBCompactionTest.SuggestCompactRangeNoTwoLevel0Compactions|BackupableDBTest.RateLimiting|DBTest.CloseSpeedup|DBTest.ThreadStatusFlush|DBTest.RateLimitingTest|DBTest.EncodeDecompressedBlockSizeTest|FaultInjectionTest.UninstalledCompaction|HarnessTest.Randomized|ExternalSSTFileTest.CompactDuringAddFileRandom|ExternalSSTFileTest.IngestFileWithGlobalSeqnoRandomized|MySQLStyleTransactionTest.TransactionStressTest
.PHONY: valgrind_check_0
valgrind_check_0: test_log_prefix := valgrind_
valgrind_check_0:
	printf '%s\n' '' \
	  'To monitor subtest <duration,pass/fail,name>,' \
	  '  run "make watch-log" in a separate window' ''; \
	{ \
	  printf './%s\n' $(filter-out $(PARALLEL_TEST) %skiplist_test options_settable_test, $(TESTS)); \
	  find t -name 'run-*' -print; \
	} \
	  | $(prioritize_long_running_tests) \
	  | grep -E '$(tests-regexp)' \
	  | grep -E -v '$(valgrind-exclude-regexp)' \
	  | build_tools/gnu_parallel -j$(J) --plain --joblog=LOG --eta --gnu \
	    --tmpdir=$(TEST_TMPDIR) \
	    '(if [[ "{}" == "./"* ]] ; then $(DRIVER) {}; else {}; fi) \
	    $(parallel_redir)'
CLEAN_FILES += t LOG $(TEST_TMPDIR)
# When running parallel "make check", you can monitor its progress
# from another window.
# Run "make watch-log" to show the duration,PASS/FAIL,name of parallel
# tests as they are being run. We sort them so that longer-running ones
# appear at the top of the list and any failing tests remain at the top
# regardless of their duration. As with any use of "watch", hit ^C to
# interrupt.
watch-log:
	$(WATCH) --interval=0 'sort -k7,7nr -k4,4gr LOG|$(quoted_perl_command)'
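The ordering that the comment describes comes from the two sort keys: column 7 of the job log (Exitval) in descending numeric order, then column 4 (JobRuntime) in descending general-numeric order. Here is a small sketch of just that sort stage, with a hypothetical function name and made-up log rows; the `$(quoted_perl_command)` formatting step is not reproduced.

```shell
#!/bin/sh
# Failing jobs (nonzero Exitval) first, then longest-running first.
sort_joblog() {
  sort -k7,7nr -k4,4gr
}

# Made-up joblog rows for illustration only.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  Seq Host Starttime JobRuntime Send Receive Exitval Signal Command \
  1 : 0.000 5.500 0 0 0 0 ./hypothetical_fast_test \
  2 : 0.000 42.000 0 0 0 0 ./hypothetical_slow_test \
  3 : 0.000 1.250 0 0 1 0 ./hypothetical_failing_test \
  | sort_joblog
```

In this sample the failing test is listed first even though it has the shortest runtime, followed by the slow test and then the fast one.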
dump-log:
	bash -c '$(quoted_perl_command)' < LOG

# If J != 1 and GNU parallel is installed, run the tests in parallel,
# via the check_0 rule above. Otherwise, run them sequentially.
check: all
	$(MAKE) gen_parallel_tests
	$(AM_V_GEN)if test "$(J)" != 1 \
	    && (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
	        grep -q 'GNU Parallel'; \
	then \
	    $(MAKE) T="$$t" check_0; \
	else \
	    for t in $(TESTS); do \
	      echo "===== Running $$t (`date`)"; ./$$t || exit 1; done; \
	fi
	rm -rf $(TEST_TMPDIR)
ifneq ($(PLATFORM), OS_AIX)
	$(PYTHON) tools/check_all_python.py
ifndef ASSERT_STATUS_CHECKED # not yet working with these tests
	$(PYTHON) tools/ldb_test.py
	sh tools/rocksdb_dump_test.sh
endif
endif
ifndef SKIP_FORMAT_BUCK_CHECKS
	$(MAKE) check-format
	$(MAKE) check-buck-targets
	$(MAKE) check-sources
endif
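The decision at the top of the `check` recipe (parallel only when J is not 1 and the bundled tool identifies itself as GNU Parallel) can be sketched in isolation. `choose_mode` and its arguments are hypothetical names introduced for this example; the real recipe hard-codes `build_tools/gnu_parallel`.

```shell
#!/bin/sh
# choose_mode JOBS PARALLEL_CMD
# Prints "parallel" when JOBS is not 1 and PARALLEL_CMD's --help output
# mentions "GNU Parallel"; prints "sequential" otherwise.
choose_mode() {
  jobs=$1
  parallel_cmd=$2
  if test "$jobs" != 1 \
      && ("$parallel_cmd" --gnu --help 2>/dev/null) | grep -q 'GNU Parallel'
  then
    echo parallel
  else
    echo sequential
  fi
}

# J=1 always selects the sequential loop, whatever tool is installed.
choose_mode 1 build_tools/gnu_parallel
# A missing tool also falls back to the sequential loop.
choose_mode '100%' /nonexistent/gnu_parallel
```

Because `|` binds tighter than `&&`, the help output is only probed when the J test has already succeeded.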
# TODO add ldb_tests
check_some: $(ROCKSDBTESTS_SUBSET)
	for t in $(ROCKSDBTESTS_SUBSET); do echo "===== Running $$t (`date`)"; ./$$t || exit 1; done

run 'make check's rules (and even subtests) in parallel
Summary:
When GNU parallel is available, "make check" tests are now run in parallel.
When /dev/shm is usable, we tell those tests to create temporary files therein.
Now, the longest-running single test, db_test, (which is composed of hundreds of sub-tests)
is no longer run sequentially: instead, each of its sub-tests is run independently, and can
be parallelized along with all other tests. To make that process easier, this change
creates a temporary directory, "t/", in which it puts a small script for each of those
subtests. The output from each parallel-run test is now saved in t/log-TEST_NAME.
When GNU parallel is not available, we run the tests in sequence, just as before.
If GNU parallel is available and you don't like the default of running one subtest
per core, you can invoke "make J=1 check" to run only one test at a time.
Beware: this will take a long time, and it starts with the two longest-running tests, so you
will wait for a long time before seeing any results. Instead, if you want to use fewer resources
but still see useful progress, try "make J=60% check". That will attempt to ensure that 60% of
the cores are occupied by test runs.
To watch progress of individual tests (duration, success (PASS-or-FAIL), name), run "make watch-log"
in the same directory from another window. That will start with something like this:
and when complete should show numbers/names like this:
Every 0.1s: sort -k7,7nr -k4,4gr LOG|perl -n -e '@a=split("\t",$_,-1); $t=$a[8]; $t =~ s,^\./,,;' -e '$t =~ s, >.*,,; chomp $t;' -e '$t =~ /.*--gtest_filter=... Wed Apr 1 10:51:42 2015
152.221 PASS t/DBTest.FileCreationRandomFailure
109.280 PASS t/DBTest.EncodeDecompressedBlockSizeTest
82.315 PASS reduce_levels_test
77.812 PASS t/DBTest.CompactionFilterWithValueChange
73.236 PASS backupable_db_test
63.428 PASS deletefile_test
57.248 PASS table_test
55.665 PASS prefix_test
49.816 PASS t/DBTest.RateLimitingTest
...
Test Plan:
Timings (measured so as to exclude compile and link times):
With this change, all tests complete in 2m40s on a system for which nproc prints 32.
Prior to this this change, "make check" would take 24.5 minutes on that same system.
Here are durations (in seconds) of the longest-running subtests:
152.435 PASS t/DBTest.FileCreationRandomFailure
107.070 PASS t/DBTest.EncodeDecompressedBlockSizeTest
81.391 PASS ./reduce_levels_test
71.587 PASS ./backupable_db_test
61.746 PASS ./deletefile_test
57.960 PASS ./table_test
55.230 PASS ./prefix_test
54.060 PASS t/DBTest.CompactionFilterWithValueChange
48.873 PASS t/DBTest.RateLimitingTest
47.569 PASS ./fault_injection_test
46.593 PASS t/DBTest.Randomized
42.662 PASS t/DBTest.CompactionFilter
31.793 PASS t/DBTest.SparseMerge
30.612 PASS t/DBTest.CompactionFilterV2
25.891 PASS t/DBTest.GroupCommitTest
23.863 PASS t/DBTest.DynamicLevelMaxBytesBase
22.976 PASS ./rate_limiter_test
18.942 PASS t/DBTest.OptimizeFiltersForHits
16.851 PASS ./env_test
15.399 PASS t/DBTest.CompactionFilterV2WithValueChange
14.827 PASS t/DBTest.CompactionFilterV2NULLPrefix
Reviewers: igor, sdong, rven, yhchiang, igor.sugak
Reviewed By: igor.sugak
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D35379
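The timing lists above use a simple "seconds STATUS name" layout. As a minimal sketch (not part of the change itself), lines in that layout can be parsed and ranked by duration, and the overall speedup follows from the two wall-clock figures quoted in the Test Plan. The function names `parse_timings` and `speedup` are illustrative assumptions, not names from the Makefile.

```python
from typing import List, Tuple


def parse_timings(text: str) -> List[Tuple[float, str, str]]:
    """Parse 'seconds STATUS name' lines and sort them, slowest first."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip headers, "..." markers, and blank lines
        try:
            seconds = float(parts[0])
        except ValueError:
            continue  # first column is not a duration
        rows.append((seconds, parts[1], parts[2]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the old wall-clock time to the new one."""
    return before_seconds / after_seconds


sample = """\
57.960 PASS ./table_test
152.435 PASS t/DBTest.FileCreationRandomFailure
81.391 PASS ./reduce_levels_test
"""
for seconds, status, name in parse_timings(sample):
    print(f"{seconds:8.3f} {status} {name}")

# 24.5 minutes before the change versus 2m40s after it.
print(speedup(24.5 * 60, 2 * 60 + 40))
```

With the figures quoted above, the ratio works out to a little over 9x on the 32-core machine.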
.PHONY: ldb_tests
ldb_tests: ldb
	$(PYTHON) tools/ldb_test.py

include crash_test.mk
Add user-defined timestamps to db_stress (#8061)
Summary:
Add some basic tests for user-defined timestamps to db_stress. Currently,
read with timestamp always tries to read using the current timestamp.
Due to the per-key timestamp-sequence ordering constraint, we only add timestamp-
related tests to the `NonBatchedOpsStressTest` since this test serializes accesses
to the same key and uses a file to cross-check data correctness.
The timestamp feature is not supported in a number of components, e.g. Merge, SingleDelete,
DeleteRange, CompactionFilter, Readonly instance, secondary instance, SST file ingestion, transaction,
etc. Therefore, db_stress should exit if user enables both timestamp and these features at the same
time. The (currently) incompatible features can be found in
`CheckAndSetOptionsForUserTimestamp`.
This PR also fixes a bug triggered when timestamp is enabled together with
`index_type=kBinarySearchWithFirstKey`. This bug fix will also be in another separate PR
with more unit test coverage. Fixing it here because I do not want to exclude the index type
from crash test.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8061
Test Plan: make crash_test_with_ts
Reviewed By: jay-zhuang
Differential Revision: D27056282
Pulled By: riversand963
fbshipit-source-id: c3e00ad1023fdb9ebbdf9601ec18270c5e2925a9
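The "exit if timestamp is combined with an unsupported feature" rule described above can be sketched as a small validation step. This is only an illustration of the idea: the real check lives in `CheckAndSetOptionsForUserTimestamp` inside db_stress, and every option name below is a hypothetical stand-in for one of the features the summary lists.

```python
from typing import Dict, List

# Hypothetical flag names, one per feature the summary lists as incompatible
# with user-defined timestamps. They are NOT the real db_stress flag names.
INCOMPATIBLE_WITH_TIMESTAMP = (
    "use_merge",
    "use_single_delete",
    "use_delete_range",
    "use_compaction_filter",
    "read_only",
    "secondary_instance",
    "ingest_external_file",
    "use_txn",
)


def conflicting_features(options: Dict[str, bool]) -> List[str]:
    """Return the enabled features that clash with user-defined timestamps."""
    if not options.get("user_timestamp", False):
        return []  # nothing to check when timestamps are off
    return [name for name in INCOMPATIBLE_WITH_TIMESTAMP
            if options.get(name, False)]


def check_options(options: Dict[str, bool]) -> None:
    """Exit early, as the summary describes, when a clash is detected."""
    conflicts = conflicting_features(options)
    if conflicts:
        raise SystemExit(
            "user timestamp cannot be combined with: " + ", ".join(conflicts))


# Timestamps alone are accepted.
check_options({"user_timestamp": True})
```

Keeping the list of incompatible features in one place makes it easy to shrink as more components gain timestamp support.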
Major CircleCI/Linux fixes / tweaks / enhancements (#7078)
Summary:
Primarily, this change adds a way to work around a bug limiting the effective output (and therefore debuggability) of the Linux builds using parallel make. We would get
make[1]: write error: stdout
probably due to a kernel bug, apparently affecting both available ubuntu 16 machine images (maybe not affecting docker images, less horsepower). https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1814393
Now in the CircleCI config, make output on Ubuntu is piped through a custom 'cat' that ignores EAGAIN errors, which seems to fix the problem.
Significant other changes:
* Add another linux build that combines
* LIB_MODE=shared, to ensure this works with compile and unit test execution
* Alternative rocksdb namespace, to ensure this works (not rely on Travis)
* ASSERT_STATUS_CHECKED=1, but with building all unit tests and running those expected to pass with it
* Run release build with and without gflags. (Previously it ran only without, which ignores large swaths of code in a normal release build! Two regressions in this build, only with gflags, were not caught by CI in the last week!)
* Use gflags with unity and LITE build, as typical case.
Debuggability improvements:
* Use V=1 to show commands being executed (thanks to EAGAIN work-around)
* Print kernel version and compiler versions as part of V=1 output from Makefile
Cosmetic other changes:
* Put more commands on one line, for less clutter in CircleCI output pages
* Remove redundant "all" in "make all check" and put make command options before targets
* Change some recursive "make clean" into dependency on "clean," toward minimizing unnecessary overhead (detect platform, build version, etc.) of extra recursive makes
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7078
Reviewed By: siying
Differential Revision: D22391647
Pulled By: pdillinger
fbshipit-source-id: d446fccf5a8c568b37dc8748621c8a5c546fe135
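The 'cat' that ignores EAGAIN, mentioned in the summary above, amounts to a write loop that retries instead of failing when the output is temporarily unable to accept data. The actual CircleCI helper is not shown in this file, so the following is a minimal sketch of the technique under that assumption; `write_all` and its parameters are illustrative names.

```python
import time
from typing import Callable


def write_all(write: Callable[[bytes], int], data: bytes,
              retry_delay: float = 0.01) -> int:
    """Write all of `data`, retrying whenever the writer reports EAGAIN.

    `write` takes a bytes-like chunk and returns how many bytes it accepted
    (the contract of os.write). Python surfaces EAGAIN/EWOULDBLOCK as
    BlockingIOError. Returns the number of retries that were needed.
    """
    view = memoryview(data)
    retries = 0
    while len(view) > 0:
        try:
            written = write(view)
        except BlockingIOError:
            # The pipe is momentarily full: wait briefly, then try again
            # instead of aborting with "write error".
            retries += 1
            time.sleep(retry_delay)
            continue
        view = view[written:]  # drop the part that was accepted
    return retries
```

A filter built on this would read chunks from standard input and pass each one to `write_all` with a writer such as `lambda chunk: os.write(1, chunk)`.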
asan_check: clean
	COMPILE_WITH_ASAN=1 $(MAKE) check -j32
	$(MAKE) clean
asan_crash_test: clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test
	$(MAKE) clean

whitebox_asan_crash_test: clean
	COMPILE_WITH_ASAN=1 $(MAKE) whitebox_crash_test
	$(MAKE) clean

blackbox_asan_crash_test: clean
	COMPILE_WITH_ASAN=1 $(MAKE) blackbox_crash_test
	$(MAKE) clean
asan_crash_test_with_atomic_flush: clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test_with_atomic_flush
	$(MAKE) clean
asan_crash_test_with_txn: clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test_with_txn
	$(MAKE) clean
asan_crash_test_with_best_efforts_recovery: clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test_with_best_efforts_recovery
	$(MAKE) clean
ubsan_check: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) check -j32
	$(MAKE) clean
ubsan_crash_test: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test
	$(MAKE) clean

whitebox_ubsan_crash_test: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) whitebox_crash_test
	$(MAKE) clean

blackbox_ubsan_crash_test: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) blackbox_crash_test
	$(MAKE) clean
ubsan_crash_test_with_atomic_flush: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test_with_atomic_flush
	$(MAKE) clean
ubsan_crash_test_with_txn: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test_with_txn
	$(MAKE) clean
ubsan_crash_test_with_best_efforts_recovery: clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test_with_best_efforts_recovery
	$(MAKE) clean

full_valgrind_test:
	ROCKSDB_FULL_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check

full_valgrind_test_some:
	ROCKSDB_FULL_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check_some

valgrind_test:
	ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check

valgrind_test_some:
	ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check_some

valgrind_check: $(TESTS)
	$(MAKE) DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" gen_parallel_tests
	$(AM_V_GEN)if test "$(J)" != 1 \
	    && (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
	        grep -q 'GNU Parallel'; \
	then \
	  $(MAKE) \
	  DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" valgrind_check_0; \
	else \
	  for t in $(filter-out %skiplist_test options_settable_test,$(TESTS)); do \
	    $(VALGRIND_VER) $(VALGRIND_OPTS) ./$$t; \
	    ret_code=$$?; \
	    if [ $$ret_code -ne 0 ]; then \
	      exit $$ret_code; \
	    fi; \
	  done; \
	fi

valgrind_check_some: $(ROCKSDBTESTS_SUBSET)
	for t in $(ROCKSDBTESTS_SUBSET); do \
	  $(VALGRIND_VER) $(VALGRIND_OPTS) ./$$t; \
	  ret_code=$$?; \
	  if [ $$ret_code -ne 0 ]; then \
	    exit $$ret_code; \
	  fi; \
	done
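The sequential fallback in the valgrind rules above runs each test in turn and stops at the first non-zero exit status, propagating that status. The same control flow can be sketched outside of make as follows; `run_until_failure` is an illustrative name, and the demo commands are stand-ins rather than real RocksDB test binaries.

```python
import subprocess
import sys
from typing import List, Sequence


def run_until_failure(commands: Sequence[List[str]]) -> int:
    """Run commands in order; return the first non-zero exit code, else 0."""
    for argv in commands:
        ret_code = subprocess.call(argv)
        if ret_code != 0:
            # Mirror `exit $$ret_code`: later commands are not run.
            return ret_code
    return 0


# Stand-in commands: one that succeeds and one that exits with status 3.
ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(run_until_failure([ok, bad, ok]))
```

Stopping early keeps a long valgrind run from burying the first failure under the output of later tests.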
test_names = \
  ./db_test --gtest_list_tests \
    | perl -n \
      -e 's/ *\#.*//;' \
      -e '/^(\s*)(\S+)/; !$$1 and do {$$p=$$2; break};' \
      -e 'print qq! $$p$$2!'
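The `test_names` pipeline above turns the two-level output of `--gtest_list_tests` (an unindented suite line such as `DBTest.` followed by indented test names) into flat `Suite.Test` names. The sketch below models that apparent intent in Python for readability; it is an assumption about the intended result and does not try to reproduce every detail of the perl one-liner.

```python
import re
from typing import List


def gtest_test_names(listing: str) -> List[str]:
    """Flatten `--gtest_list_tests` output into 'Suite.Test' names."""
    names = []
    suite = ""
    for raw in listing.splitlines():
        line = re.sub(r" *#.*", "", raw)       # drop trailing "# ..." comments
        match = re.match(r"^(\s*)(\S+)", line)
        if not match:
            continue                           # blank line
        indent, token = match.groups()
        if not indent:
            suite = token                      # e.g. "DBTest." (keeps the dot)
        else:
            names.append(suite + token)        # e.g. "DBTest.Empty"
    return names


listing = """\
DBTest.
  Empty
  ReadWrite  # GetParam() = 1
EnvTest.
  Basic
"""
print(gtest_test_names(listing))
```

Each resulting name can then be passed to `--gtest_filter=` so that a single sub-test runs on its own.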
analyze: clean
	USE_CLANG=1 $(MAKE) analyze_incremental

analyze_incremental:
	$(CLANG_SCAN_BUILD) --use-analyzer=$(CLANG_ANALYZER) \
		--use-c++=$(CXX) --use-cc=$(CC) --status-bugs \
		-o $(CURDIR)/scan_build_report \
		$(MAKE) SKIP_LINK=1 dbg
CLEAN_FILES += unity.cc
unity.cc: Makefile util/build_version.cc.in
	rm -f $@ $@-t
	$(AM_V_at)$(gen_build_version) > util/build_version.cc
	for source_file in $(LIB_SOURCES); do \
		echo "#include \"$$source_file\"" >> $@-t; \
	done
	chmod a=r $@-t
	mv $@-t $@
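The `unity.cc` rule above builds the file under a temporary name (`$@-t`), marks it read-only, and only then renames it into place, so a failed or interrupted run never leaves a half-written `unity.cc` behind. A minimal Python sketch of the same write-then-rename pattern follows; `write_unity_file` is an illustrative name and the source paths in the demo are made up.

```python
import os
import tempfile
from typing import Iterable


def write_unity_file(target: str, sources: Iterable[str]) -> None:
    """Generate `target` as a list of #include lines via a temporary file."""
    tmp = target + "-t"
    for path in (target, tmp):          # like `rm -f $@ $@-t`
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
    with open(tmp, "w") as out:
        for source in sources:
            out.write(f'#include "{source}"\n')
    os.chmod(tmp, 0o444)                # like `chmod a=r $@-t`
    os.replace(tmp, target)             # like `mv $@-t $@`


# Demo in a scratch directory with made-up source names.
scratch = tempfile.mkdtemp()
unity = os.path.join(scratch, "unity.cc")
write_unity_file(unity, ["db/example_a.cc", "util/example_b.cc"])
with open(unity) as handle:
    print(handle.read(), end="")
```

Making the generated file read-only is a small guard against editing it by hand instead of editing the sources it includes.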
unity.a: $(OBJ_DIR)/unity.o
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $(OBJ_DIR)/unity.o

# try compiling db_test with unity
unity_test: $(OBJ_DIR)/db/db_basic_test.o $(OBJ_DIR)/db/db_test_util.o $(TEST_OBJECTS) $(TOOL_OBJECTS) unity.a
	$(AM_LINK)
	./unity_test

rocksdb.h rocksdb.cc: build_tools/amalgamate.py Makefile $(LIB_SOURCES) unity.cc
	build_tools/amalgamate.py -I. -i./include unity.cc -x include/rocksdb/c.h -H rocksdb.h -o rocksdb.cc

clean: clean-ext-libraries-all clean-rocks clean-rocksjava

clean-not-downloaded: clean-ext-libraries-bin clean-rocks clean-not-downloaded-rocksjava

clean-rocks:
# Not practical to exactly match all versions/variants in naming (e.g. debug or not)
	rm -f ${LIBNAME}*.so* ${LIBNAME}*.a
	rm -f $(BENCHMARKS) $(TOOLS) $(TESTS) $(PARALLEL_TEST) $(MICROBENCHS)
	rm -rf $(CLEAN_FILES) ios-x86 ios-arm scan_build_report
	$(FIND) . -name "*.[oda]" -exec rm -f {} \;
	$(FIND) . -type f \( -name "*.gcda" -o -name "*.gcno" \) -exec rm -f {} \;

clean-rocksjava: clean-rocks
	rm -rf jl jls
	cd java && $(MAKE) clean

clean-not-downloaded-rocksjava:
	cd java && $(MAKE) clean-not-downloaded

clean-ext-libraries-all:
	rm -rf bzip2* snappy* zlib* lz4* zstd*

clean-ext-libraries-bin:
	find . -maxdepth 1 -type d \( -name bzip2\* -or -name snappy\* -or -name zlib\* -or -name lz4\* -or -name zstd\* \) -prune -exec rm -rf {} \;

tags:
	ctags -R .
	cscope -b `$(FIND) . -name '*.cc'` `$(FIND) . -name '*.h'` `$(FIND) . -name '*.c'`
	ctags -e -R -o etags *

tags0:
	ctags -R .
	cscope -b `$(FIND) . -name '*.cc' -and ! -name '*_test.cc'` \
		  `$(FIND) . -name '*.c' -and ! -name '*_test.c'` \
		  `$(FIND) . -name '*.h' -and ! -name '*_test.h'`
	ctags -e -R -o etags *

format:
	build_tools/format-diff.sh

check-format:
	build_tools/format-diff.sh -c

check-buck-targets:
	buckifier/check_buck_targets.sh

check-sources:
	build_tools/check-sources.sh
Package generation for Ubuntu and CentOS
Summary:
I put together a script to assist in the generation of deb's and
rpm's. I've tested that this works on ubuntu via vagrant. I've included the
Vagrantfile here, but I can remove it if it's not useful. The package.sh
script should work on any ubuntu or centos machine, I just added a bit of
logic in there to allow a base Ubuntu or Centos machine to be able to build
RocksDB from scratch.
Example output on Ubuntu 14.04:
```
root@vagrant-ubuntu-trusty-64:/vagrant# ./tools/package.sh
[+] g++-4.7 is already installed. skipping.
[+] libgflags-dev is already installed. skipping.
[+] ruby-all-dev is already installed. skipping.
[+] fpm is already installed. skipping.
Created package {:path=>"rocksdb_3.5_amd64.deb"}
root@vagrant-ubuntu-trusty-64:/vagrant# dpkg --info rocksdb_3.5_amd64.deb
new debian package, version 2.0.
size 17392022 bytes: control archive=1518 bytes.
275 bytes, 11 lines control
2911 bytes, 38 lines md5sums
Package: rocksdb
Version: 3.5
License: BSD
Vendor: Facebook
Architecture: amd64
Maintainer: rocksdb@fb.com
Installed-Size: 83358
Section: default
Priority: extra
Homepage: http://rocksdb.org/
Description: RocksDB is an embeddable persistent key-value store for fast storage.
```
Example output on CentOS 6.5:
```
[root@localhost vagrant]# rpm -qip rocksdb-3.5-1.x86_64.rpm
Name : rocksdb Relocations: /usr
Version : 3.5 Vendor: Facebook
Release : 1 Build Date: Mon 29 Sep 2014 01:26:11 AM UTC
Install Date: (not installed) Build Host: localhost
Group : default Source RPM: rocksdb-3.5-1.src.rpm
Size : 96231106 License: BSD
Signature : (none)
Packager : rocksdb@fb.com
URL : http://rocksdb.org/
Summary : RocksDB is an embeddable persistent key-value store for fast storage.
Description :
RocksDB is an embeddable persistent key-value store for fast storage.
```
Test Plan:
How this gets used is really up to the RocksDB core team. If you
want to actually get this into mainline, you might have to change `make
install` such that it installs the RocksDB shared object file as well, which
would require you to link against gflags (maybe?) and that would require some
potential modifications to the script here (basically add a depends on that
package).
Currently, this will install the headers and a pre-compiled statically linked
object file. If that's what you want out of life, then this requires no
modifications.
Reviewers: ljin, yhchiang, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D24141
package:
	bash build_tools/make_package.sh $(SHARED_MAJOR).$(SHARED_MINOR)
# ---------------------------------------------------------------------------
# Unit tests and tools
# ---------------------------------------------------------------------------
$(STATIC_LIBRARY): $(LIB_OBJECTS)
	$(AM_V_AR)rm -f $@ $(SHARED1) $(SHARED2) $(SHARED3) $(SHARED4)
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $(LIB_OBJECTS)
$(STATIC_TEST_LIBRARY): $(TEST_OBJECTS)
	$(AM_V_AR)rm -f $@ $(SHARED_TEST_LIBRARY)
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^
$(STATIC_TOOLS_LIBRARY): $(TOOL_OBJECTS)
	$(AM_V_AR)rm -f $@ $(SHARED_TOOLS_LIBRARY)
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^
$(STATIC_STRESS_LIBRARY): $(ANALYZE_OBJECTS) $(STRESS_OBJECTS) $(TESTUTIL)
	$(AM_V_AR)rm -f $@ $(SHARED_STRESS_LIBRARY)
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^
$(SHARED_TEST_LIBRARY): $(TEST_OBJECTS) $(SHARED1)
	$(AM_V_AR)rm -f $@ $(STATIC_TEST_LIBRARY)
	$(AM_SHARE)
$(SHARED_TOOLS_LIBRARY): $(TOOL_OBJECTS) $(SHARED1)
	$(AM_V_AR)rm -f $@ $(STATIC_TOOLS_LIBRARY)
	$(AM_SHARE)
$(SHARED_STRESS_LIBRARY): $(ANALYZE_OBJECTS) $(STRESS_OBJECTS) $(TESTUTIL) $(SHARED_TOOLS_LIBRARY) $(SHARED1)
	$(AM_V_AR)rm -f $@ $(STATIC_STRESS_LIBRARY)
	$(AM_SHARE)
librocksdb_env_basic_test.a: $(OBJ_DIR)/env/env_basic_test.o $(LIB_OBJECTS) $(TESTHARNESS)
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^
db_bench: $(OBJ_DIR)/tools/db_bench.o $(BENCH_OBJECTS) $(TESTUTIL) $(LIBRARY)
	$(AM_LINK)
trace_analyzer: $(OBJ_DIR)/tools/trace_analyzer.o $(ANALYZE_OBJECTS) $(TOOLS_LIBRARY) $(LIBRARY)
	$(AM_LINK)
block_cache_trace_analyzer: $(OBJ_DIR)/tools/block_cache_analyzer/block_cache_trace_analyzer_tool.o $(ANALYZE_OBJECTS) $(TOOLS_LIBRARY) $(LIBRARY)
Support computing miss ratio curves using sim_cache. (#5449)
Summary:
This PR adds a BlockCacheTraceSimulator that reports the miss ratios given different cache configurations. A cache configuration contains "cache_name,num_shard_bits,cache_capacities". For example, "lru, 1, 1K, 2K, 4M, 4G".
When we replay the trace, we also perform lookups and inserts on the simulated caches.
In the end, it reports the miss ratio for each tuple <cache_name, num_shard_bits, cache_capacity> in an output file.
This PR also adds a main source block_cache_trace_analyzer so that we can run the analyzer from the command line.
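The cache configuration format described above ("cache_name,num_shard_bits,cache_capacities") could be parsed roughly as follows. This is only a sketch, not the analyzer's actual parser: `CacheConfig`, `ParseCacheConfig`, and the helpers are hypothetical names, and it assumes that the K/M/G suffixes denote powers of 1024.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical holder for one parsed configuration line; not a RocksDB type.
struct CacheConfig {
  std::string cache_name;
  uint32_t num_shard_bits = 0;
  std::vector<uint64_t> capacities_bytes;
};

// Strips leading and trailing whitespace from a token.
inline std::string Trim(const std::string& s) {
  size_t b = 0, e = s.size();
  while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
  while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
  return s.substr(b, e - b);
}

// Converts "1K", "4M", "4G" or a plain number into bytes.
// Assumption: K, M and G mean 2^10, 2^20 and 2^30.
inline uint64_t ParseCapacity(const std::string& token) {
  if (token.empty()) throw std::invalid_argument("empty capacity");
  uint64_t multiplier = 1;
  std::string digits = token;
  switch (std::toupper(static_cast<unsigned char>(token.back()))) {
    case 'K': multiplier = 1ull << 10; digits.pop_back(); break;
    case 'M': multiplier = 1ull << 20; digits.pop_back(); break;
    case 'G': multiplier = 1ull << 30; digits.pop_back(); break;
    default: break;
  }
  return std::stoull(digits) * multiplier;
}

// Splits "lru, 1, 1K, 2K, 4M, 4G" into name, shard bits and capacities.
inline CacheConfig ParseCacheConfig(const std::string& line) {
  std::vector<std::string> fields;
  std::stringstream ss(line);
  std::string field;
  while (std::getline(ss, field, ',')) fields.push_back(Trim(field));
  if (fields.size() < 3) {
    throw std::invalid_argument("expected name, shard bits and a capacity");
  }
  CacheConfig config;
  config.cache_name = fields[0];
  config.num_shard_bits = static_cast<uint32_t>(std::stoul(fields[1]));
  for (size_t i = 2; i < fields.size(); ++i) {
    config.capacities_bytes.push_back(ParseCapacity(fields[i]));
  }
  return config;
}
```

Each capacity in the result corresponds to one <cache_name, num_shard_bits, cache_capacity> tuple for which a miss ratio would be reported.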
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5449
Test Plan:
Added tests for block_cache_trace_analyzer.
COMPILE_WITH_ASAN=1 make check -j32.
Differential Revision: D15797073
Pulled By: HaoyuHuang
fbshipit-source-id: aef0c5c2e7938f3e8b6a10d4a6a50e6928ecf408
	$(AM_LINK)
cache_bench: $(OBJ_DIR)/cache/cache_bench.o $(CACHE_BENCH_OBJECTS) $(LIBRARY)
	$(AM_LINK)
persistent_cache_bench: $(OBJ_DIR)/utilities/persistent_cache/persistent_cache_bench.o $(LIBRARY)
	$(AM_LINK)
memtablerep_bench: $(OBJ_DIR)/memtable/memtablerep_bench.o $(LIBRARY)
	$(AM_LINK)
Memtablerep Benchmark
Summary:
Create a benchmark for testing memtablereps. This diff is a bit rough, but it should do the trick until other bootcampers can clean it up.
Addressing comments
Removed the mutexes
Changed ReadWriteBenchmark to fix number of reads and count the number of writes we can perform in that time.
Test Plan:
Run it.
Below runs pass
./memtablerep_bench --benchmarks fillrandom,readrandom --memtablerep skiplist
./memtablerep_bench --benchmarks fillseq,readseq --memtablerep skiplist
./memtablerep_bench --benchmarks readwrite,seqreadwrite --memtablerep skiplist --num_operations 200 --num_threads 5
./memtablerep_bench --benchmarks fillrandom,readrandom --memtablerep hashskiplist
./memtablerep_bench --benchmarks fillseq,readseq --memtablerep hashskiplist
--num_scans 2
./memtablerep_bench --benchmarks fillseq,readseq --memtablerep vector
Reviewers: jpaton, ikabiljo, sdong
Reviewed By: sdong
Subscribers: dhruba, ameyag
Differential Revision: https://reviews.facebook.net/D22683
filter_bench: $(OBJ_DIR)/util/filter_bench.o $(LIBRARY)
	$(AM_LINK)
db_stress: $(OBJ_DIR)/db_stress_tool/db_stress.o $(STRESS_LIBRARY) $(TOOLS_LIBRARY) $(LIBRARY)
	$(AM_LINK)
write_stress: $(OBJ_DIR)/tools/write_stress.o $(LIBRARY)
Write stress test
Summary:
The goal of this diff is to create a simple stress test with focus on catching:
* bugs in compaction/flush processes, especially the ones that cause assertion errors
* bugs in the code that deletes obsolete files
There are two parts of the test:
* write_stress, a binary that writes to the database
* write_stress_runner.py, a script that invokes and kills write_stress
Here are some interesting parts of write_stress:
* Runs with very high concurrency of compactions and flushes (32 threads total) and tries to create a huge amount of small files
* The keys written to the database are not uniformly distributed -- there is a 3-character prefix that mutates occasionally (in prefix mutator thread), in such a way that the first character mutates slower than second, which mutates slower than third character. That way, the compaction stress tests some interesting compaction features like trivial moves and bottommost level calculation
* There is a thread that creates an iterator, holds it for a couple of seconds and then iterates over all keys. This is supposed to test RocksDB's ability to keep the files alive when there are references to them.
* Some writes trigger WAL sync. This is stress testing our WAL sync code.
* At the end of the run, we make sure that we didn't leak any of the sst files
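One way to get a 3-character prefix whose first character changes more slowly than the second, which in turn changes more slowly than the third, is an odometer-style counter. The sketch below is an illustration under that assumption only; `PrefixMutator` is a hypothetical class, and the real prefix mutator thread in write_stress may well pick characters differently (for example, randomly).

```cpp
#include <array>
#include <string>

// Hypothetical odometer-style prefix: the last character advances on every
// mutation, the middle one every 26 mutations, the first every 26 * 26.
class PrefixMutator {
 public:
  // Returns the current 3-character prefix.
  std::string Prefix() const {
    return std::string(chars_.begin(), chars_.end());
  }

  // Advances the prefix by one step, carrying from right to left.
  void Mutate() {
    for (int i = 2; i >= 0; --i) {
      if (chars_[i] != 'z') {
        ++chars_[i];
        return;
      }
      chars_[i] = 'a';  // wrap around and carry into the slower position
    }
  }

 private:
  std::array<char, 3> chars_{{'a', 'a', 'a'}};
};
```

Keys written with such a prefix cluster in a slowly moving range, which is the kind of skew that lets compaction exercise trivial moves and bottommost-level handling.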
write_stress_runner.py changes the mode in which we run write_stress and also kills and restarts it. There are some interesting characteristics:
* At the beginning we divide the full test runtime into smaller parts -- shorter runtimes (a couple of seconds) and longer runtimes (100 or 1000 seconds)
* The first time we run write_stress, we destroy the old DB. Every next time during the test, we use the same DB.
* We can run in kill mode or clean-restart mode. Kill mode kills the write_stress violently.
* We can run in a mode where delete_obsolete_files_with_fullscan is true or false
* We can run with low_open_files mode turned on or off. When it's turned on, we configure table cache to only hold a couple of files -- that way we need to reopen files every time we access them.
Another goal was to create a stress test without a lot of parameters. So tools/write_stress_runner.py should only take one parameter -- runtime_sec and it should figure out everything else on its own.
In a separate diff, I'll add this new test to our nightly legocastle runs.
Test Plan:
The goal of this test was to retroactively catch the following bugs: D33045, D48201, D46899, D42399. I failed to reproduce D48201, but all others have been caught!
When I reverted https://reviews.facebook.net/D33045:
./write_stress --runtime_sec=200 --low_open_files_mode=true
Iterator statuts not OK: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/089166.sst: No such file or directory
When I reverted https://reviews.facebook.net/D42399:
python tools/write_stress_runner.py --runtime_sec=5000
Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1
Running write_stress, will kill after 2 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
Running write_stress, will kill after 7 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
Running write_stress, will kill after 8 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --low_open_files_mode=true
Write to DB failed: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/019250.sst: No such file or directory
ERROR: write_stress died with exitcode=-6
When I reverted https://reviews.facebook.net/D46899:
python tools/write_stress_runner.py --runtime_sec=1000
runtime: 1000
Going to execute write stress for [3, 3, 100, 3, 2, 100, 1, 788]
Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --low_open_files_mode=true
Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --destroy_db=false --delete_obsolete_files_with_fullscan=true
Running write_stress, will kill after 100 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
write_stress: db/db_impl.cc:2070: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
ERROR: write_stress died with exitcode=-6
Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, sdong, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49533
	$(AM_LINK)
db_sanity_test: $(OBJ_DIR)/tools/db_sanity_test.o $(LIBRARY)
	$(AM_LINK)
db_repl_stress: $(OBJ_DIR)/tools/db_repl_stress.o $(LIBRARY)
	$(AM_LINK)
define MakeTestRule
$(notdir $(1:%.cc=%)): $(1:%.cc=$$(OBJ_DIR)/%.o) $$(TEST_LIBRARY) $$(LIBRARY)
	$$(AM_LINK)
endef
# For each PLUGIN test, create a rule to generate the test executable
$(foreach test, $(ROCKSDB_PLUGIN_TESTS), $(eval $(call MakeTestRule, $(test))))
arena_test: $(OBJ_DIR)/memory/arena_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
memory_allocator_test: memory/memory_allocator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
autovector_test: $(OBJ_DIR)/util/autovector_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
column_family_test: $(OBJ_DIR)/db/column_family_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
table_properties_collector_test: $(OBJ_DIR)/db/table_properties_collector_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
bloom_test: $(OBJ_DIR)/util/bloom_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
dynamic_bloom_test: $(OBJ_DIR)/util/dynamic_bloom_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
c_test: $(OBJ_DIR)/db/c_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cache_test: $(OBJ_DIR)/cache/cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
coding_test: $(OBJ_DIR)/util/coding_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
hash_test: $(OBJ_DIR)/util/hash_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
random_test: $(OBJ_DIR)/util/random_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
ribbon_test: $(OBJ_DIR)/util/ribbon_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
option_change_migration_test: $(OBJ_DIR)/utilities/option_change_migration/option_change_migration_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
agg_merge_test: $(OBJ_DIR)/utilities/agg_merge/agg_merge_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
stringappend_test: $(OBJ_DIR)/utilities/merge_operators/string_append/stringappend_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cassandra_format_test: $(OBJ_DIR)/utilities/cassandra/cassandra_format_test.o $(OBJ_DIR)/utilities/cassandra/test_utils.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cassandra_functional_test: $(OBJ_DIR)/utilities/cassandra/cassandra_functional_test.o $(OBJ_DIR)/utilities/cassandra/test_utils.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cassandra_row_merge_test: $(OBJ_DIR)/utilities/cassandra/cassandra_row_merge_test.o $(OBJ_DIR)/utilities/cassandra/test_utils.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cassandra_serialize_test: $(OBJ_DIR)/utilities/cassandra/cassandra_serialize_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
hash_table_test: $(OBJ_DIR)/utilities/persistent_cache/hash_table_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
histogram_test: $(OBJ_DIR)/monitoring/histogram_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
thread_local_test: $(OBJ_DIR)/util/thread_local_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
work_queue_test: $(OBJ_DIR)/util/work_queue_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
corruption_test: $(OBJ_DIR)/db/corruption_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
crc32c_test: $(OBJ_DIR)/util/crc32c_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
slice_test: $(OBJ_DIR)/util/slice_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
slice_transform_test: $(OBJ_DIR)/util/slice_transform_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_basic_test: $(OBJ_DIR)/db/db_basic_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_blob_basic_test: $(OBJ_DIR)/db/blob/db_blob_basic_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_blob_compaction_test: $(OBJ_DIR)/db/blob/db_blob_compaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_readonly_with_timestamp_test: $(OBJ_DIR)/db/db_readonly_with_timestamp_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_wide_basic_test: $(OBJ_DIR)/db/wide/db_wide_basic_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_with_timestamp_basic_test: $(OBJ_DIR)/db/db_with_timestamp_basic_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_with_timestamp_compaction_test: db/db_with_timestamp_compaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_encryption_test: $(OBJ_DIR)/db/db_encryption_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_test: $(OBJ_DIR)/db/db_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_test2: $(OBJ_DIR)/db/db_test2.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_logical_block_size_cache_test: $(OBJ_DIR)/db/db_logical_block_size_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_blob_index_test: $(OBJ_DIR)/db/blob/db_blob_index_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_block_cache_test: $(OBJ_DIR)/db/db_block_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_bloom_filter_test: $(OBJ_DIR)/db/db_bloom_filter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_log_iter_test: $(OBJ_DIR)/db/db_log_iter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_compaction_filter_test: $(OBJ_DIR)/db/db_compaction_filter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_compaction_test: $(OBJ_DIR)/db/db_compaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_clip_test: $(OBJ_DIR)/db/db_clip_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_dynamic_level_test: $(OBJ_DIR)/db/db_dynamic_level_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_flush_test: $(OBJ_DIR)/db/db_flush_test.o $(TEST_LIBRARY) $(LIBRARY)
Fix flush not being committed while writing manifest
Summary:
Fix flush not being committed while writing manifest, which is a recent bug introduced by D60075.
The issue:
# Options.max_background_flushes > 1
# Background thread A picks up a flush job, flushes, then commits to the manifest. (Note that the mutex is released before writing the manifest.)
# Background thread B picks up another flush job and flushes. When it gets to `MemTableList::InstallMemtableFlushResults`, it notices that another thread is committing, so it quits.
# After the first commit, thread A doesn't double-check whether there are more flush results that need to be committed, leaving the second flush uncommitted.
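The shape of the fix implied by the steps above is that the committing thread keeps re-checking for newly queued results before it gives up the committer role. The following is a simplified sketch of that pattern, not the actual `MemTableList` code: `FlushResultInstaller`, its members, and the manifest-write hook are all hypothetical, and integers stand in for flush results.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical model of "one thread commits, the others hand off and quit".
class FlushResultInstaller {
 public:
  // Test hook invoked while the (simulated) manifest write is in progress.
  void SetManifestWriteHook(std::function<void()> hook) {
    hook_ = std::move(hook);
  }

  // Called by a background thread once its flush has finished.
  void Install(int flush_id) {
    std::unique_lock<std::mutex> lock(mu_);
    pending_.push_back(flush_id);
    if (commit_in_progress_) {
      return;  // thread B: another thread is committing, so quit
    }
    commit_in_progress_ = true;
    // Thread A: loop until nothing is pending. Without this re-check, a
    // result queued during the manifest write would stay uncommitted.
    while (!pending_.empty()) {
      std::vector<int> batch(pending_.begin(), pending_.end());
      pending_.clear();
      lock.unlock();  // the mutex is released while writing the manifest
      WriteManifest(batch);
      lock.lock();
      committed_.insert(committed_.end(), batch.begin(), batch.end());
    }
    commit_in_progress_ = false;
  }

  // Returns a snapshot of the committed flush ids, in commit order.
  std::vector<int> Committed() {
    std::lock_guard<std::mutex> guard(mu_);
    return committed_;
  }

 private:
  // Placeholder for slow manifest I/O; only runs the optional hook.
  void WriteManifest(const std::vector<int>& /*batch*/) {
    if (hook_) hook_();
  }

  std::mutex mu_;
  bool commit_in_progress_ = false;
  std::deque<int> pending_;
  std::vector<int> committed_;
  std::function<void()> hook_;
};
```

If the `while` loop were a single pass, a result installed while the mutex was released would sit in `pending_` until some later flush happened to commit it.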
Test Plan: run the test. Also verify that the new test hits a deadlock without the fix.
Reviewers: sdong, igor, lightmark
Reviewed By: lightmark
Subscribers: andrewkr, omegaga, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60969
	$(AM_LINK)
db_inplace_update_test: $(OBJ_DIR)/db/db_inplace_update_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_iterator_test: $(OBJ_DIR)/db/db_iterator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Integrity protection for live updates to WriteBatch (#7748)
Summary:
This PR adds the foundation classes for key-value integrity protection and the first use case: protecting live updates from the source buffers added to `WriteBatch` through the destination buffer in `MemTable`. The width of the protection info is not yet configurable -- only eight bytes per key is supported. This PR allows users to enable protection by constructing `WriteBatch` with `protection_bytes_per_key == 8`. It does not yet expose a way for users to get integrity protection via other write APIs (e.g., `Put()`, `Merge()`, `Delete()`, etc.).
The foundation classes (`ProtectionInfo.*`) embed the coverage info in their type, and provide `Protect.*()` and `Strip.*()` functions to navigate between types with different coverage. For making bytes per key configurable (for powers of two up to eight) in the future, these classes are templated on the unsigned integer type used to store the protection info. That integer contains the XOR'd result of hashes with independent seeds for all covered fields. For integer fields, the hash is computed on the raw unadjusted bytes, so the result is endian-dependent. The most significant bytes are truncated when the hash value (8 bytes) is wider than the protection integer.
When `WriteBatch` is constructed with `protection_bytes_per_key == 8`, we hold a `ProtectionInfoKVOTC` (i.e., one that covers key, value, optype aka `ValueType`, timestamp, and CF ID) for each entry added to the batch. The protection info is generated from the original buffers passed by the user, as well as the original metadata generated internally. When writing to memtable, each entry is transformed to a `ProtectionInfoKVOTS` (i.e., dropping coverage of CF ID and adding coverage of sequence number), since at that point we know the sequence number, and have already selected a memtable corresponding to a particular CF. This protection info is verified once the entry is encoded in the `MemTable` buffer.
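The idea of XOR-ing per-field hashes with independent seeds, and of moving between coverage sets by adding or dropping a field, can be illustrated with the sketch below. It is not the `ProtectionInfo.*` implementation: the function names, the seed constants, and the FNV-1a stand-in hash are all assumptions made for this example.

```cpp
#include <cstdint>
#include <string>

// Stand-in 64-bit hash (FNV-1a with the seed mixed into the basis).
// Assumption: RocksDB uses its own hash function; this one is illustrative.
inline uint64_t SeededHash(const std::string& data, uint64_t seed) {
  uint64_t h = 14695981039346656037ull ^ seed;
  for (unsigned char c : data) {
    h ^= c;
    h *= 1099511628211ull;
  }
  return h;
}

// Hypothetical independent seeds, one per covered field.
constexpr uint64_t kSeedKey = 1;
constexpr uint64_t kSeedValue = 2;
constexpr uint64_t kSeedCfId = 3;
constexpr uint64_t kSeedSeqno = 4;

// Raw, unadjusted bytes of an integer field (so the result is
// endian-dependent, as described above).
template <typename Int>
std::string RawBytes(Int v) {
  return std::string(reinterpret_cast<const char*>(&v), sizeof(v));
}

// Hash of one field, truncated to the protection integer T. The cast keeps
// the least significant bytes, i.e. the most significant ones are dropped.
template <typename T>
T FieldChecksum(const std::string& bytes, uint64_t seed) {
  return static_cast<T>(SeededHash(bytes, seed));
}

// Protection info covering key and value.
template <typename T>
T ProtectKV(const std::string& key, const std::string& value) {
  return static_cast<T>(FieldChecksum<T>(key, kSeedKey) ^
                        FieldChecksum<T>(value, kSeedValue));
}

// XOR is its own inverse, so the same operation adds coverage of a field
// ("protect") or removes it ("strip").
template <typename T>
T ToggleField(T info, const std::string& bytes, uint64_t seed) {
  return static_cast<T>(info ^ FieldChecksum<T>(bytes, seed));
}
```

In this model, dropping CF ID coverage and adding sequence number coverage is two `ToggleField` calls, which gives the same value as computing key, value, and sequence number coverage from scratch.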
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7748
Test Plan:
- an integration test to verify a wide variety of single-byte changes to the encoded `MemTable` buffer are caught
- add to stress/crash test to verify it works in a variety of configs/operations without intentional corruption
- [deferred] unit tests for `ProtectionInfo.*` classes for edge cases like KV swap, `SliceParts` and `Slice` APIs are interchangeable, etc.
Reviewed By: pdillinger
Differential Revision: D25754492
Pulled By: ajkr
fbshipit-source-id: e481bac6c03c2ab268be41359730f1ceb9964866
db_kv_checksum_test: $(OBJ_DIR)/db/db_kv_checksum_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_memtable_test: $(OBJ_DIR)/db/db_memtable_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_merge_operator_test: $(OBJ_DIR)/db/db_merge_operator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_merge_operand_test: $(OBJ_DIR)/db/db_merge_operand_test.o $(TEST_LIBRARY) $(LIBRARY)
New API to get all merge operands for a Key (#5604)
Summary:
This is a new API added to db.h to allow for fetching all merge operands associated with a Key. The main motivation for this API is to support performance-sensitive use cases where a full online merge is not necessary. Example use-cases:
1. Update subset of columns and read subset of columns -
Imagine a SQL table in which a row is encoded as a K/V pair (as is done in MyRocks). If there are many columns and users update only one of them, we can use the merge operator to reduce write amplification. When users read only one or two columns in the read query, this feature can avoid a full merge of the whole row and save some CPU.
2. Updating very few attributes in a value which is a JSON-like document -
Updating one attribute can be done efficiently using the merge operator, while reading back one attribute can be done more efficiently if we don't need to do a full merge.
----------------------------------------------------------------------------------------------------
API :
Status GetMergeOperands(
const ReadOptions& options, ColumnFamilyHandle* column_family,
const Slice& key, PinnableSlice* merge_operands,
GetMergeOperandsOptions* get_merge_operands_options,
int* number_of_operands)
Example usage :
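int size = 100;
int number_of_operands = 0;
std::vector<PinnableSlice> values(size);
GetMergeOperandsOptions merge_operands_info;
// Tell the API how many operands the caller-allocated array can hold.
merge_operands_info.expected_max_number_of_operands = size;
db_->GetMergeOperands(ReadOptions(), db_->DefaultColumnFamily(), "k1", values.data(), &merge_operands_info, &number_of_operands);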
Description :
Returns all the merge operands corresponding to the key. If the number of merge operands in DB is greater than merge_operands_options.expected_max_number_of_operands no merge operands are returned and status is Incomplete. Merge operands returned are in the order of insertion.
merge_operands-> Points to an array of at least merge_operands_options.expected_max_number_of_operands elements; the caller is responsible for allocating it. If the status returned is Incomplete, then number_of_operands will contain the total number of merge operands found in DB for key.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5604
Test Plan:
Added unit test and perf test in db_bench that can be run using the command:
./db_bench -benchmarks=getmergeoperands --merge_operator=sortlist
Differential Revision: D16657366
Pulled By: vjnadimpalli
fbshipit-source-id: 0faadd752351745224ee12d4ae9ef3cb529951bf
	$(AM_LINK)
Fix TSAN failures in DistributedMutex tests (#5684)
Summary:
TSAN was not able to correctly instrument atomic bts and btr instructions, so
when TSAN is enabled, implement those with std::atomic::fetch_or and
std::atomic::fetch_and. Also disable tests that fail on TSAN with false
positives (we know these are false positives because this other verifiably
correct program fails with the same TSAN error <link>)
```
make clean
TEST_TMPDIR=/dev/shm/rocksdb OPT=-g COMPILE_WITH_TSAN=1 make J=1 -j56 folly_synchronization_distributed_mutex_test
```
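As a rough illustration of the substitution described in the summary, a bit test-and-set and test-and-reset can be expressed with `std::atomic::fetch_or` and `std::atomic::fetch_and`, which TSAN understands. The helper names `AtomicBitSet` and `AtomicBitReset` are hypothetical and are not the names used in the folly-derived code.

```cpp
#include <atomic>
#include <cstdint>

// Sets bit `bit` in `word` and returns whether it was set before, which is
// the information a `lock bts` instruction reports through the carry flag.
inline bool AtomicBitSet(
    std::atomic<uint64_t>& word, unsigned bit,
    std::memory_order order = std::memory_order_seq_cst) {
  const uint64_t mask = uint64_t{1} << bit;
  return (word.fetch_or(mask, order) & mask) != 0;
}

// Clears bit `bit` in `word` and returns whether it was set before, the
// portable counterpart of `lock btr`.
inline bool AtomicBitReset(
    std::atomic<uint64_t>& word, unsigned bit,
    std::memory_order order = std::memory_order_seq_cst) {
  const uint64_t mask = uint64_t{1} << bit;
  return (word.fetch_and(~mask, order) & mask) != 0;
}
```

Both helpers are single read-modify-write operations, so they keep the atomicity of the instruction-based versions while staying visible to the sanitizer.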
This is the code that fails with the same false positive under TSAN
```
namespace {
class ExceptionWithConstructionTrack : public std::exception {
 public:
  explicit ExceptionWithConstructionTrack(int id)
      : id_{folly::to<std::string>(id)}, constructionTrack_{id} {}

  const char* what() const noexcept override {
    return id_.c_str();
  }

 private:
  std::string id_;
  TestConstruction constructionTrack_;
};

template <typename Storage, typename Atomic>
void transferCurrentException(Storage& storage, Atomic& produced) {
  assert(std::current_exception());
  new (&storage) std::exception_ptr(std::current_exception());
  produced->store(true, std::memory_order_release);
}

void concurrentExceptionPropagationStress(
    int numThreads,
    std::chrono::milliseconds milliseconds) {
  auto&& stop = std::atomic<bool>{false};
  auto&& exceptions = std::vector<std::aligned_storage<48, 8>::type>{};
  auto&& produced = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumed = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumers = std::vector<std::thread>{};
  for (auto i = 0; i < numThreads; ++i) {
    produced.emplace_back(new std::atomic<bool>{false});
    consumed.emplace_back(new std::atomic<bool>{false});
    exceptions.push_back({});
  }

  auto producer = std::thread{[&]() {
    auto counter = std::vector<int>(numThreads, 0);
    for (auto i = 0; true; i = ((i + 1) % numThreads)) {
      try {
        throw ExceptionWithConstructionTrack{counter.at(i)++};
      } catch (...) {
        transferCurrentException(exceptions.at(i), produced.at(i));
      }

      while (!consumed.at(i)->load(std::memory_order_acquire)) {
        if (stop.load(std::memory_order_acquire)) {
          return;
        }
      }
      consumed.at(i)->store(false, std::memory_order_release);
    }
  }};

  for (auto i = 0; i < numThreads; ++i) {
    consumers.emplace_back([&, i]() {
      auto counter = 0;
      while (true) {
        while (!produced.at(i)->load(std::memory_order_acquire)) {
          if (stop.load(std::memory_order_acquire)) {
            return;
          }
        }
        produced.at(i)->store(false, std::memory_order_release);

        try {
          auto storage = &exceptions.at(i);
          auto exc = folly::launder(
              reinterpret_cast<std::exception_ptr*>(storage));
          auto copy = std::move(*exc);
          exc->std::exception_ptr::~exception_ptr();
          std::rethrow_exception(std::move(copy));
        } catch (std::exception& exc) {
          auto value = std::stoi(exc.what());
          EXPECT_EQ(value, counter++);
        }

        consumed.at(i)->store(true, std::memory_order_release);
      }
    });
  }

  std::this_thread::sleep_for(milliseconds);
  stop.store(true);
  producer.join();
  for (auto& thread : consumers) {
    thread.join();
  }
}
} // namespace
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5684
Differential Revision: D16746077
Pulled By: miasantreble
fbshipit-source-id: 8af88dcf9161c05daec1a76290f577918638f79d
db_options_test: $(OBJ_DIR)/db/db_options_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_range_del_test: $(OBJ_DIR)/db/db_range_del_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_rate_limiter_test: $(OBJ_DIR)/db/db_rate_limiter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_sst_test: $(OBJ_DIR)/db/db_sst_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_statistics_test: $(OBJ_DIR)/db/db_statistics_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_write_test: $(OBJ_DIR)/db/db_write_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
error_handler_fs_test: $(OBJ_DIR)/db/error_handler_fs_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
external_sst_file_basic_test: $(OBJ_DIR)/db/external_sst_file_basic_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
external_sst_file_test: $(OBJ_DIR)/db/external_sst_file_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
import_column_family_test: $(OBJ_DIR)/db/import_column_family_test.o $(TEST_LIBRARY) $(LIBRARY)
Export Import sst files (#5495)
Summary:
Refresh of the earlier change here - https://github.com/facebook/rocksdb/issues/5135
This is a review request for code change needed for - https://github.com/facebook/rocksdb/issues/3469
"Add support for taking snapshot of a column family and creating column family from a given CF snapshot"
We have an implementation for this that we have been testing internally. We have two new APIs that together provide this functionality.
(1) ExportColumnFamily() - This API is modelled after CreateCheckpoint() as below.
// Exports all live SST files of a specified Column Family onto export_dir,
// returning SST files information in metadata.
// - SST files will be created as hard links when the directory specified
// is in the same partition as the db directory, copied otherwise.
// - export_dir should not already exist and will be created by this API.
// - Always triggers a flush.
virtual Status ExportColumnFamily(ColumnFamilyHandle* handle,
const std::string& export_dir,
ExportImportFilesMetaData** metadata);
Internally, the API will DisableFileDeletions(), GetColumnFamilyMetaData(), Parse through
metadata, creating links/copies of all the sst files, EnableFileDeletions() and complete the call by
returning the list of file metadata.
(2) CreateColumnFamilyWithImport() - This API is modeled after IngestExternalFile(), but invoked only during a CF creation as below.
// CreateColumnFamilyWithImport() will create a new column family with
// column_family_name and import external SST files specified in metadata into
// this column family.
// (1) External SST files can be created using SstFileWriter.
// (2) External SST files can be exported from a particular column family in
// an existing DB.
// Option in import_options specifies whether the external files are copied or
// moved (default is copy). When option specifies copy, managing files at
// external_file_path is caller's responsibility. When option specifies a
// move, the call ensures that the specified files at external_file_path are
// deleted on successful return and files are not modified on any error
// return.
// On error return, column family handle returned will be nullptr.
// ColumnFamily will be present on successful return and will not be present
// on error return. ColumnFamily may be present on any crash during this call.
virtual Status CreateColumnFamilyWithImport(
const ColumnFamilyOptions& options, const std::string& column_family_name,
const ImportColumnFamilyOptions& import_options,
const ExportImportFilesMetaData& metadata,
ColumnFamilyHandle** handle);
Internally, this API creates a new CF, parses all the sst files and adds them to the specified column family, at the same level and with the same sequence number as in the metadata. It also performs safety checks with respect to overlaps between the sst files being imported.
If the incoming sequence number is higher than the current local sequence number, the local sequence
number is updated to reflect this.
Note, as the sst files are being moved across Column Families, the Column Family name in an sst file
will no longer match the actual column family on the destination DB. The API does not modify the Column
Family name or id in the sst files being imported.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5495
Differential Revision: D16018881
fbshipit-source-id: 9ae2251025d5916d35a9fc4ea4d6707f6be16ff9
	$(AM_LINK)
db_tailing_iter_test: $(OBJ_DIR)/db/db_tailing_iter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_iter_test: $(OBJ_DIR)/db/db_iter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_iter_stress_test: $(OBJ_DIR)/db/db_iter_stress_test.o $(TEST_LIBRARY) $(LIBRARY)
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs
Summary:
Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are:
* If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted.
* When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not.
However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours).
This PR changes the convention to:
* If status() is not ok, Valid() always returns false.
* Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.)
This does sacrifice the two use cases listed above, but siying said it's ok.
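The new convention can be illustrated with a toy model. This is only a minimal sketch, not RocksDB code: `ToyIterator`, `InjectError()` and `status_ok()` are hypothetical names, and a `std::map` stands in for the underlying data.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy iterator (not RocksDB's Iterator class).
class ToyIterator {
 public:
  explicit ToyIterator(const std::map<std::string, std::string>* data)
      : data_(data), pos_(data->end()) {}

  // Any seek clears a previous error before repositioning.
  void Seek(const std::string& target) {
    ok_ = true;
    pos_ = data_->lower_bound(target);
  }

  // Simulates a failed read, e.g. a corrupted block (assumption for the demo).
  void InjectError() { ok_ = false; }

  // A non-ok status always implies an invalid iterator.
  bool Valid() const { return ok_ && pos_ != data_->end(); }
  bool status_ok() const { return ok_; }
  const std::string& key() const { return pos_->first; }

 private:
  const std::map<std::string, std::string>* data_;
  std::map<std::string, std::string>::const_iterator pos_;
  bool ok_ = true;
};
```

With this shape a caller can loop on `Valid()` alone and check the status once after the loop, since an error can no longer hide behind a still-valid position.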
Overview of the changes:
* A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario.
* Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed.
* A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator.
To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types:
Iterators that didn't need changes:
* status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name in anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator.
* Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator.
Iterators with changes (see inline comments for details):
* DBIter - an overhaul:
  - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back.
  - It had a few code paths silently discarding subiterator's status. The stress test caught a few.
  - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example.
  - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption.
  - It used to not reset status on seek for some types of errors.
  - Some simplifications and better comments.
  - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer.
* MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status.
* ForwardIterator - changed to the new convention, also slightly simplified.
* ForwardLevelIterator - fixed some bugs and simplified.
* LevelIterator - simplified.
* TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_.
* BlockBasedTableIterator - minor changes.
* BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid.
* PlainTableIterator - some seeks used to not reset status.
* CuckooTableIterator - tiny code cleanup.
* ManagedIterator - fixed some bugs.
* BaseDeltaIterator - changed to the new convention and fixed a bug.
* BlobDBIterator - seeks used to not reset status.
* KeyConvertingIterator - some small change.
Closes https://github.com/facebook/rocksdb/pull/3810
Differential Revision: D7888019
Pulled By: al13n321
fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
	$(AM_LINK)
db_universal_compaction_test: $(OBJ_DIR)/db/db_universal_compaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_wal_test: $(OBJ_DIR)/db/db_wal_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_io_failure_test: $(OBJ_DIR)/db/db_io_failure_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_properties_test: $(OBJ_DIR)/db/db_properties_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
db_table_properties_test: $(OBJ_DIR)/db/db_table_properties_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
log_write_bench: $(OBJ_DIR)/util/log_write_bench.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK) $(PROFILING_FLAGS)
seqno_time_test: $(OBJ_DIR)/db/seqno_time_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
plain_table_db_test: $(OBJ_DIR)/db/plain_table_db_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
comparator_db_test: $(OBJ_DIR)/db/comparator_db_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
table_reader_bench: $(OBJ_DIR)/table/table_reader_bench.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK) $(PROFILING_FLAGS)
perf_context_test: $(OBJ_DIR)/db/perf_context_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
prefix_test: $(OBJ_DIR)/db/prefix_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
backup_engine_test: $(OBJ_DIR)/utilities/backup/backup_engine_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
[RocksDB] BackupableDB
Summary:
In this diff I present BackupableDB v1. You can easily use it to back up your DB and it will do incremental snapshots for you.
Let's first describe how you would use BackupableDB. It inherits the StackableDB interface, so you can easily construct it with your DB object -- it will add a method RollTheSnapshot() to the DB object. When you call RollTheSnapshot(), the current snapshot of the DB will be stored in the backup dir. To restore, you can just call RestoreDBFromBackup() on a BackupableDB (which is a static method) and it will restore all files from the backup dir. In the next version, it will even support automatic backups every X minutes.
There are multiple things you can configure:
1. backup_env and db_env can be different, which is awesome because then you can easily backup to HDFS or wherever you feel like.
2. sync - if true, it *guarantees* backup consistency on machine reboot
3. number of snapshots to keep - this will keep the last N snapshots around if you want, for some reason, to be able to restore from an earlier snapshot. All backups are done incrementally - if we already have 00010.sst, we will not copy it again. *IMPORTANT* -- This is based on the assumption that 00010.sst never changes - two files named 00010.sst from the same DB will always be exactly the same. Is this true? I always copy manifest, current and log files.
4. You can decide if you want to flush the memtables before you backup, or you're fine with backing up the log files -- either way, you get a complete and consistent view of the database at a time of backup.
5. More things you can find in BackupableDBOptions
Here is the directory structure I use:
backup_dir/CURRENT_SNAPSHOT - just 4 bytes holding the latest snapshot
           0, 1, 2, ... - files containing serialized version of each snapshot - containing a list of files
           files/*.sst - sst files shared between snapshots - if one snapshot references 00010.sst and another one needs to backup it from the DB, it will just reference the same file
           files/0/, 1/, 2/, ... - snapshot directories containing private snapshot files - current, manifest and log files
All the files are ref counted and deleted immediately when they go out of scope.
Some other stuff in this diff:
1. Added GetEnv() method to the DB. Discussed with @haobo and we agreed that it seems like the right thing to do.
2. Fixed StackableDB interface. The way it was set up before, I was not able to implement BackupableDB.
Test Plan:
I have a unittest, but please don't look at this yet. I just hacked it up to help me with debugging. I will write a lot of good tests and update the diff.
Also, `make asan_check`
Reviewers: dhruba, haobo, emayanke
Reviewed By: dhruba
CC: leveldb, haobo
Differential Revision: https://reviews.facebook.net/D14295
checkpoint_test: $(OBJ_DIR)/utilities/checkpoint/checkpoint_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cache_simulator_test: $(OBJ_DIR)/utilities/simulator_cache/cache_simulator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
sim_cache_test: $(OBJ_DIR)/utilities/simulator_cache/sim_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
env_mirror_test: $(OBJ_DIR)/utilities/env_mirror_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
env_timed_test: $(OBJ_DIR)/utilities/env_timed_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
object_registry_test: $(OBJ_DIR)/utilities/object_registry_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
ttl_test: $(OBJ_DIR)/utilities/ttl/ttl_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Timestamp and TTL Wrapper for rocksdb
Summary:
When opened with DBTimestamp::Open call, timestamps are prepended to and stripped from the value during subsequent Put and Get calls respectively. The Timestamp is used to discard values in Get and custom compaction filter which have exceeded their TTL which is specified during Open.
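The prepend-and-strip scheme described above might look roughly like the following. This is a sketch under assumptions, not the code from this diff: the helper names, the 4-byte host-endian timestamp, and the "expired when age exceeds ttl" rule are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Assumed width of the timestamp prefix (hypothetical).
const std::size_t kTsLen = sizeof(int32_t);

// Put path: store the write time in front of the user value.
std::string PrependTimestamp(const std::string& value, int32_t now) {
  std::string out(kTsLen, '\0');
  std::memcpy(&out[0], &now, kTsLen);
  out += value;
  return out;
}

// Get path: strip the prefix and return false if the entry is malformed
// or older than ttl seconds.
bool StripIfFresh(const std::string& stored, int32_t now, int32_t ttl,
                  std::string* value) {
  if (stored.size() < kTsLen) return false;
  int32_t written = 0;
  std::memcpy(&written, stored.data(), kTsLen);
  if (now - written > ttl) return false;
  *value = stored.substr(kTsLen);
  return true;
}
```

A compaction filter could apply the same freshness test to decide which entries to drop.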
Have made a temporary change to Makefile to let us test with the temporary file TestTime.cc. Have also changed the private members of db_impl.h to protected to let them be inherited by the new class DBTimestamp
Test Plan: make db_timestamp; TestTime.cc (will not check it in) shows how to use the APIs currently, but I will write unit-tests shortly
Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
Reviewed By: vamsi
CC: zshao, xjin, vkrest, MarkCallaghan
Differential Revision: https://reviews.facebook.net/D10311
write_batch_with_index_test: $(OBJ_DIR)/utilities/write_batch_with_index/write_batch_with_index_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
flush_job_test: $(OBJ_DIR)/db/flush_job_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Support for SingleDelete()
Summary:
This patch fixes #7460559. It introduces SingleDelete as a new database
operation. This operation can be used to delete keys that were never
overwritten (no put following another put of the same key). If an overwritten
key is single deleted the behavior is undefined. Single deletion of a
non-existent key has no effect but multiple consecutive single deletions are
not allowed (see limitations).
In contrast to the conventional Delete() operation, the deletion entry is
removed along with the value when the two are lined up in a compaction. Note:
The semantics are similar to @igor's prototype that allowed this
behavior at the granularity of a column family (
https://reviews.facebook.net/D42093 ). This new patch, however, is more
aggressive when it comes to removing tombstones: It removes the SingleDelete
together with the value whenever there is no snapshot between them while the
older patch only did this when the sequence number of the deletion was older
than the earliest snapshot.
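The removal rule can be sketched for the versions of a single key. This is a simplified model, not the Compaction Iterator itself: `Entry`, `SnapshotBetween` and `CompactKey` are hypothetical names, and a snapshot is assumed to see every entry whose sequence number is less than or equal to its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Type { kPut, kSingleDelete };

struct Entry {
  uint64_t seq;
  Type type;
};

// True if some snapshot still sees the older entry but not the newer one.
bool SnapshotBetween(const std::vector<uint64_t>& snapshots, uint64_t older,
                     uint64_t newer) {
  for (uint64_t s : snapshots) {
    if (s >= older && s < newer) return true;
  }
  return false;
}

// entries: all versions of one key, newest first.
std::vector<Entry> CompactKey(const std::vector<Entry>& entries,
                              const std::vector<uint64_t>& snapshots) {
  std::vector<Entry> out;
  for (std::size_t i = 0; i < entries.size(); ++i) {
    const Entry& e = entries[i];
    if (e.type == Type::kSingleDelete && i + 1 < entries.size() &&
        entries[i + 1].type == Type::kPut &&
        !SnapshotBetween(snapshots, entries[i + 1].seq, e.seq)) {
      ++i;  // Drop the tombstone together with the value it covers.
      continue;
    }
    out.push_back(e);
  }
  return out;
}
```

Under the older rule mentioned above, the pair would have been kept whenever any snapshot was older than the deletion; here only a snapshot that falls between the two entries keeps them alive.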
Most of the complex additions are in the Compaction Iterator, all other changes
should be relatively straightforward. The patch also includes basic support for
single deletions in db_stress and db_bench.
Limitations:
- Not compatible with cuckoo hash tables
- Single deletions cannot be used in combination with merges and normal
deletions on the same key (other keys are not affected by this)
- Consecutive single deletions are currently not allowed (an older version of
this patch supported this, so it could be resurrected if needed)
Test Plan: make all check
Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor
Reviewed By: igor
Subscribers: maykov, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43179
compaction_iterator_test: $(OBJ_DIR)/db/compaction/compaction_iterator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
compaction_job_test: $(OBJ_DIR)/db/compaction/compaction_job_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
compaction_job_stats_test: $(OBJ_DIR)/db/compaction/compaction_job_stats_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
compaction_service_test: $(OBJ_DIR)/db/compaction/compaction_service_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
compact_on_deletion_collector_test: $(OBJ_DIR)/utilities/table_properties_collectors/compact_on_deletion_collector_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
wal_manager_test: $(OBJ_DIR)/db/wal_manager_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Define WAL related classes to be used in VersionEdit and VersionSet (#7164)
Summary:
`WalAddition`, `WalDeletion` are defined in `wal_version.h` and used in `VersionEdit`.
`WalAddition` is used to represent events of creating a new WAL (no size, just log number), or closing a WAL (with size).
`WalDeletion` is used to represent events of deleting or archiving a WAL, it means the WAL is no longer alive (won't be replayed during recovery).
`WalSet` is the set of alive WALs kept in `VersionSet`.
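As a rough illustration of how these three pieces relate, the toy class below applies creation, closing, and deletion events to a set of alive WALs. It is only a sketch: `ToyWalSet`, its method names, and the `kSizeUnknown` sentinel are hypothetical and do not mirror the real classes in `wal_version.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Sentinel for a WAL that is still open (hypothetical).
const uint64_t kSizeUnknown = UINT64_MAX;

class ToyWalSet {
 public:
  // WalAddition without a size: a new WAL was created.
  void AddWal(uint64_t log_number) { wals_[log_number] = kSizeUnknown; }

  // WalAddition with a size: the WAL was closed.
  void CloseWal(uint64_t log_number, uint64_t size) {
    wals_[log_number] = size;
  }

  // WalDeletion: the WAL was deleted or archived and is no longer alive.
  void DeleteWal(uint64_t log_number) { wals_.erase(log_number); }

  bool IsAlive(uint64_t log_number) const {
    return wals_.count(log_number) > 0;
  }
  std::size_t NumAlive() const { return wals_.size(); }

 private:
  // log number -> size, or kSizeUnknown while the WAL is still open.
  std::map<uint64_t, uint64_t> wals_;
};
```

Each WAL passes through at most these three events over its lifetime.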
1. Why use `WalDeletion` instead of relying on `MinLogNumber` to identify outdated WALs
On recovery, we can compute `MinLogNumber()` based on the log numbers kept in MANIFEST; any log with number < MinLogNumber can be ignored. So it seems that we don't need to persist `WalDeletion` to MANIFEST, since we can ignore the WALs based on MinLogNumber.
But `MinLogNumber()` is actually a lower bound; it does not mean that logs starting from MinLogNumber must exist. This is because in a corner case, when a column family is empty and never flushed, its log number is set to the largest log number, but not persisted in MANIFEST. So let's say there are 2 column families. When creating the DB, the first WAL has log number 1, so it's persisted to MANIFEST for both column families. Then CF 0 is empty and never flushed, CF 1 is updated and flushed, so a new WAL with log number 2 is created and persisted to MANIFEST for CF 1. But CF 0's log number in MANIFEST is still 1. So on recovery, MinLogNumber is 1, but since log 1 only contains data for CF 1, and CF 1 is flushed, log 1 might have already been deleted from disk.
We can make `MinLogNumber()` be exactly the minimum log number that must exist, by persisting the most recent log number for empty column families that are not flushed. But if there are N such column families, then every time a new WAL is created, we need to add N records to MANIFEST.
In the current design, a record is persisted to MANIFEST only when a WAL is created, closed, or deleted/archived, so the number of WAL-related records is bounded by 3x the number of WALs.
2. Why keep `WalSet` in `VersionSet` instead of applying the `VersionEdit`s to `VersionStorageInfo`
`VersionEdit`s are originally designed to track the addition and deletion of SST files. The SST files are related to column families, each column family has a list of `Version`s, and each `Version` keeps the set of active SST files in `VersionStorageInfo`.
But WALs are a DB-level concept; they are not bound to specific column families. So logically it does not make sense to store WALs in a column family's `Version`s.
Also, `Version`'s purpose is to keep reference to SST / blob files, so that they are not deleted until there is no version referencing them. But a WAL is deleted regardless of version references.
So we keep the WALs in `VersionSet` for the purpose of writing out the DB state's snapshot when creating new MANIFESTs.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7164
Test Plan:
make version_edit_test && ./version_edit_test
make wal_edit_test && ./wal_edit_test
Reviewed By: ltamasi
Differential Revision: D22677936
Pulled By: cheng-chang
fbshipit-source-id: 5a3b6890140e572ffd79eb37e6e4c3c32361a859
wal_edit_test: $(OBJ_DIR)/db/wal_edit_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
dbformat_test: $(OBJ_DIR)/db/dbformat_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
env_basic_test: $(OBJ_DIR)/env/env_basic_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
env_test: $(OBJ_DIR)/env/env_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
io_posix_test: $(OBJ_DIR)/env/io_posix_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
fault_injection_test: $(OBJ_DIR)/db/fault_injection_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
rate_limiter_test: $(OBJ_DIR)/util/rate_limiter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
generic rate limiter
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
delete_scheduler_test: $(OBJ_DIR)/file/delete_scheduler_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
filename_test: $(OBJ_DIR)/db/filename_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
random_access_file_reader_test: $(OBJ_DIR)/file/random_access_file_reader_test.o $(TEST_LIBRARY) $(LIBRARY)
Support direct IO in RandomAccessFileReader::MultiRead (#6446)
Summary:
By supporting direct IO in RandomAccessFileReader::MultiRead, the benefits of parallel IO (IO uring) and direct IO can be combined.
In direct IO mode, read requests are aligned and merged together before being issued to RandomAccessFile::MultiRead, so blocks in the original requests might share the same underlying buffer. The shared buffers are returned in `aligned_bufs`, which is a new parameter of the `MultiRead` API.
For example, suppose alignment requirement for direct IO is 4KB, one request is (offset: 1KB, len: 1KB), another request is (offset: 3KB, len: 1KB), then since they all belong to page (offset: 0, len: 4KB), `MultiRead` only reads the page with direct IO into a buffer on heap, and returns 2 Slices referencing regions in that same buffer. See `random_access_file_reader_test.cc` for more examples.
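The align-and-merge step in this example can be sketched as follows. This is an illustration under assumptions rather than the actual `MultiRead` code: `Range` and `AlignAndMerge` are hypothetical names, and merging ranges that merely touch (not just overlap) is a choice made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Range {
  uint64_t offset;
  uint64_t len;
};

// Expands each request to alignment boundaries, then merges aligned ranges
// that overlap or touch. Returns the ranges that would be read with direct IO.
std::vector<Range> AlignAndMerge(std::vector<Range> reqs, uint64_t alignment) {
  std::sort(reqs.begin(), reqs.end(), [](const Range& a, const Range& b) {
    return a.offset < b.offset;
  });
  std::vector<Range> out;
  for (const Range& r : reqs) {
    // Round the start down and the end up to a multiple of alignment.
    uint64_t start = r.offset / alignment * alignment;
    uint64_t end = (r.offset + r.len + alignment - 1) / alignment * alignment;
    if (!out.empty() && start <= out.back().offset + out.back().len) {
      uint64_t prev_end = out.back().offset + out.back().len;
      out.back().len = std::max(prev_end, end) - out.back().offset;
    } else {
      out.push_back(Range{start, end - start});
    }
  }
  return out;
}
```

For the two 1KB requests above with a 4KB alignment, this yields a single range (offset: 0, len: 4KB); each original request then maps to a sub-slice of that one buffer.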
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6446
Test Plan: Added a new test `random_access_file_reader_test.cc`.
Reviewed By: anand1976
Differential Revision: D20097518
Pulled By: cheng-chang
fbshipit-source-id: ca48a8faf9c3af146465c102ef6b266a363e78d1
	$(AM_LINK)
file_reader_writer_test: $(OBJ_DIR)/util/file_reader_writer_test.o $(TEST_LIBRARY) $(LIBRARY)
RangeSync not to sync last 1MB of the file
Summary:
From others' investigation:
"sync_file_range() behavior highly depends on kernel version and filesystem.
xfs does neighbor page flushing outside of the specified ranges. For example, sync_file_range(fd, 8192, 16384) does not only trigger flushing page #3 to #4, but also flushing many more dirty pages (i.e. up to page#16)... Ranges of the sync_file_range() should be far enough from write() offset (at least 1MB)."
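One way to read the title is that the writer keeps a 1MB gap between the range it syncs and the current write offset. The helper below is a hedged sketch of that idea only: `RangeSyncEnd` and `kKeepBack` are hypothetical names, and details such as page alignment in the real writer are not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Bytes at the tail of the file that are left unsynced (from the 1MB advice).
const uint64_t kKeepBack = 1024 * 1024;

// Returns the offset up to which a range sync may be issued, given how many
// bytes have been written and how far previous range syncs have reached.
uint64_t RangeSyncEnd(uint64_t written, uint64_t last_synced) {
  if (written <= kKeepBack) return last_synced;
  uint64_t end = written - kKeepBack;
  return end > last_synced ? end : last_synced;
}
```

If the returned offset equals `last_synced`, there is nothing new to sync yet.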
Test Plan: make all check
Reviewers: igor, rven, kradhakrishnan, yhchiang, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: yoshinorim, MarkCallaghan, sumeet, domas, dhruba, leveldb, ljin
Differential Revision: https://reviews.facebook.net/D15807
	$(AM_LINK)
block_based_table_reader_test: table/block_based/block_based_table_reader_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
full_filter_block_test: $(OBJ_DIR)/table/block_based/full_filter_block_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
partitioned_filter_block_test: $(OBJ_DIR)/table/block_based/partitioned_filter_block_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
log_test: $(OBJ_DIR)/db/log_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cleanable_test: $(OBJ_DIR)/table/cleanable_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
table_test: $(OBJ_DIR)/table/table_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
block_fetcher_test: table/block_fetcher_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
block_test: $(OBJ_DIR)/table/block_based/block_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
data_block_hash_index_test: $(OBJ_DIR)/table/block_based/data_block_hash_index_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
inlineskiplist_test: $(OBJ_DIR)/memtable/inlineskiplist_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
skiplist_test: $(OBJ_DIR)/memtable/skiplist_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
write_buffer_manager_test: $(OBJ_DIR)/memtable/write_buffer_manager_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
version_edit_test: $(OBJ_DIR)/db/version_edit_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
version_set_test: $(OBJ_DIR)/db/version_set_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
compaction_picker_test: $(OBJ_DIR)/db/compaction/compaction_picker_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
version_builder_test: $(OBJ_DIR)/db/version_builder_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
file_indexer_test: $(OBJ_DIR)/db/file_indexer_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
hints for narrowing down FindFile range and avoiding checking irrelevant L0 files
Summary:
The file tree structure in Version is prebuilt and the range of each file is known.
On the Get() code path, we do binary search in FindFile() by comparing
target key with each file's largest key and also check the range for each L0 file.
With some pre-calculated knowledge, each key comparison that has been done can serve
as a hint to narrow down further searches:
(1) If a key falls within a L0 file's range, we can safely skip the next
file if its range does not overlap with the current one.
(2) If a key falls within a file's range in level L0 - Ln-1, we should only
need to binary search in the next level for files that overlap with the current one.
(1) will be able to skip some files depending on the key distribution.
(2) can greatly reduce the range of binary search, especially for bottom
levels, given that one file most likely only overlaps with N files from
the level below (where N is max_bytes_for_level_multiplier). So on level
L, we will only look at ~N files instead of N^L files.
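Hint (2) can be sketched as a bounded binary search. This is a simplified illustration, not the file indexer added by this diff: integer keys stand in for real keys, and `FileRange`, `FindFile` and `OverlapHint` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct FileRange {
  int smallest;
  int largest;
};

// First index in [lo, hi) whose largest key is >= key; returns hi if none.
// Files in a level are assumed sorted and non-overlapping.
std::size_t FindFile(const std::vector<FileRange>& files, int key,
                     std::size_t lo, std::size_t hi) {
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (files[mid].largest < key) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Precomputed hint: the index range [first, last) of next-level files that
// overlap one upper-level file.
std::pair<std::size_t, std::size_t> OverlapHint(
    const FileRange& upper, const std::vector<FileRange>& next) {
  std::size_t first = FindFile(next, upper.smallest, 0, next.size());
  std::size_t last = first;
  while (last < next.size() && next[last].smallest <= upper.largest) ++last;
  return std::make_pair(first, last);
}
```

A lookup that found its key inside the upper-level file's range then calls `FindFile` on the next level with the hinted bounds instead of `0` and `next.size()`.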
Some initial results: measured with 500M key DB, when write is light (10k/s = 1.2M/s), this
improves QPS ~7% on top of blocked bloom. When write is heavier (80k/s =
9.6M/s), it gives us ~13% improvement.
Test Plan: make all check
Reviewers: haobo, igor, dhruba, sdong, yhchiang
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17205
reduce_levels_test: $(OBJ_DIR)/tools/reduce_levels_test.o $(TOOLS_LIBRARY) $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
write_batch_test: $(OBJ_DIR)/db/write_batch_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
write_controller_test: $(OBJ_DIR)/db/write_controller_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define a DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
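The push model can be pictured with a small sketch. Everything here is hypothetical and only meant to show where the work happens: `ToyWriteController`, `OnEpochChange`, and the L0-file-count triggers are assumptions, not the interface introduced by this diff.

```cpp
#include <cassert>

enum class WriteMode { kNormal, kDelayed, kStopped };

class ToyWriteController {
 public:
  // Called only when a flush or compaction finishes (an epoch change).
  // The decision is pushed here once and cached.
  void OnEpochChange(int l0_files, int slowdown_trigger, int stop_trigger) {
    if (l0_files >= stop_trigger) {
      mode_ = WriteMode::kStopped;
    } else if (l0_files >= slowdown_trigger) {
      mode_ = WriteMode::kDelayed;
    } else {
      mode_ = WriteMode::kNormal;
    }
  }

  // Called on every write: a cheap read of the cached decision, with no
  // loop over column families.
  WriteMode mode() const { return mode_; }

 private:
  WriteMode mode_ = WriteMode::kNormal;
};
```

The per-write cost is a single read, whatever the number of column families.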
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
merge_helper_test: $(OBJ_DIR)/db/merge_helper_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
memory_test: $(OBJ_DIR)/utilities/memory/memory_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
merge_test: $(OBJ_DIR)/db/merge_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
merger_test: $(OBJ_DIR)/table/merger_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
util_merge_operators_test: $(OBJ_DIR)/utilities/util_merge_operators_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
options_file_test: $(OBJ_DIR)/db/options_file_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
deletefile_test: $(OBJ_DIR)/db/deletefile_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
obsolete_files_test: $(OBJ_DIR)/db/obsolete_files_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
rocksdb_dump: $(OBJ_DIR)/tools/dump/rocksdb_dump.o $(LIBRARY)
	$(AM_LINK)
rocksdb_undump: $(OBJ_DIR)/tools/dump/rocksdb_undump.o $(LIBRARY)
	$(AM_LINK)
cuckoo_table_builder_test: $(OBJ_DIR)/table/cuckoo/cuckoo_table_builder_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cuckoo_table_reader_test: $(OBJ_DIR)/table/cuckoo/cuckoo_table_reader_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
cuckoo_table_db_test: $(OBJ_DIR)/db/cuckoo_table_db_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
listener_test: $(OBJ_DIR)/db/listener_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
CompactFiles, EventListener and GetDatabaseMetaData
Summary:
This diff adds three sets of APIs to RocksDB.
= GetColumnFamilyMetaData =
* This API allows users to obtain the current state of a RocksDB instance for one column family.
* See GetColumnFamilyMetaData in include/rocksdb/db.h
= EventListener =
* A virtual class that allows users to implement a set of
call-back functions which will be called when specific
events of a RocksDB instance happen.
* To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners
= CompactFiles =
* CompactFiles API inputs a set of file numbers and an output level, and RocksDB
will try to compact those files into the specified level.
= Example =
* Example code can be found in example/compact_files_example.cc, which implements
a simple external compactor using EventListener, GetColumnFamilyMetaData, and
CompactFiles API.
Test Plan:
listener_test
compactor_test
example/compact_files_example
export ROCKSDB_TESTS=CompactFiles
db_test
export ROCKSDB_TESTS=MetaData
db_test
Reviewers: ljin, igor, rven, sdong
Reviewed By: sdong
Subscribers: MarkCallaghan, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D24705
thread_list_test: $(OBJ_DIR)/util/thread_list_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
compact_files_test: $(OBJ_DIR)/db/compact_files_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
configurable_test: options/configurable_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
customizable_test: options/customizable_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
options_test: $(OBJ_DIR)/options/options_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
options_settable_test: $(OBJ_DIR)/options/options_settable_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
options_util_test: $(OBJ_DIR)/utilities/options/options_util_test.o $(TEST_LIBRARY) $(LIBRARY)
Add OptionsUtil::LoadOptionsFromFile() API
Summary:
This patch adds OptionsUtil::LoadOptionsFromFile() and
OptionsUtil::LoadLatestOptionsFromDB(), which allow developers
to construct DBOptions and ColumnFamilyOptions from a RocksDB
options file. Note that most pointer-typed options such as
merge_operator will not be constructed.
With this API, developers no longer need to remember all the
options in order to reopen an existing rocksdb instance like
the following:
DBOptions db_options;
std::vector<std::string> cf_names;
std::vector<ColumnFamilyOptions> cf_opts;
// Load primitive-typed options from an existing DB
OptionsUtil::LoadLatestOptionsFromDB(
dbname, &db_options, &cf_names, &cf_opts);
// Initialize necessary pointer-typed options
cf_opts[0].merge_operator.reset(new MyMergeOperator());
...
// Construct the vector of ColumnFamilyDescriptor
std::vector<ColumnFamilyDescriptor> cf_descs;
for (size_t i = 0; i < cf_opts.size(); ++i) {
cf_descs.emplace_back(cf_names[i], cf_opts[i]);
}
// Open the DB
DB* db = nullptr;
std::vector<ColumnFamilyHandle*> cf_handles;
auto s = DB::Open(db_options, dbname, cf_descs,
                  &cf_handles, &db);
Test Plan:
Augment existing tests in column_family_test
options_test
db_test
Reviewers: igor, IslamAbdelRahman, sdong, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D49095
9 years ago
	$(AM_LINK)

db_bench_tool_test: $(OBJ_DIR)/tools/db_bench_tool_test.o $(BENCH_OBJECTS) $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

trace_analyzer_test: $(OBJ_DIR)/tools/trace_analyzer_test.o $(ANALYZE_OBJECTS) $(TOOLS_LIBRARY) $(TEST_LIBRARY) $(LIBRARY)
RocksDB Trace Analyzer (#4091)
Summary:
A trace-analysis framework for RocksDB.
After collecting a trace with the tool from [PR #3837](https://github.com/facebook/rocksdb/pull/3837), users can use the Trace Analyzer to interpret, analyze, and characterize the collected workload.
**Input:**
1. trace file
2. Whole key space file
**Statistics:**
1. Access count of each operation (Get, Put, Delete, SingleDelete, DeleteRange, Merge) in each column family.
2. Key hotness (access count) of each key
3. Key space separation based on given prefix
4. Key size distribution
5. Value size distribution if applicable
6. Top K accessed keys
7. QPS statistics including the average QPS and peak QPS
8. Top K accessed prefix
9. Query correlation analysis, which outputs the number of occurrences of X after Y and the corresponding average time intervals
**Output:**
1. key access heat map (either in the accessed key space or whole key space)
2. trace sequence file (interprets the raw trace file into a line-based text file for future use)
3. Time series (the key space ID and its access time)
4. Key access count distribution
5. Key size distribution
6. Value size distribution (in each interval)
7. whole key space separation by the prefix
8. Accessed key space separation by the prefix
9. QPS of each operation and each column family
10. Top K QPS and their accessed prefix range
**Test:**
1. Added the unit test of analyzing Get, Put, Delete, SingleDelete, DeleteRange, Merge
2. Generated a trace and analyzed it
**Implemented but not tested (due to the limitation of trace_replay):**
1. Analyzing Iterator operations, including Seek() and SeekForPrev()
2. Analyzing the number of keys found by Get
**Future Work:**
1. Support execution-time analysis of each request
2. Support analysis of cache hits and block reads for Get
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4091
Differential Revision: D9256157
Pulled By: zhichao-cao
fbshipit-source-id: f0ceacb7eedbc43a3eee6e85b76087d7832a8fe6
6 years ago
	$(AM_LINK)

event_logger_test: $(OBJ_DIR)/logging/event_logger_test.o $(TEST_LIBRARY) $(LIBRARY)
EventLogger
Summary:
Here's my proposal for making our LOGs easier to read by machines.
The idea is to dump all events as JSON objects. JSON is easy to read by humans, but more importantly, it's easy to read by machines. That way, we can parse this, load into SQLite/mongo and then query or visualize.
I started with table_create and table_delete events, but if everybody agrees, I'll continue by adding more events (flush, compaction, etc.).
Test Plan:
Ran db_bench. Observed:
2015/01/15-14:13:25.788019 1105ef000 EVENT_LOG_v1 {"time_micros": 1421360005788015, "event": "table_file_creation", "file_number": 12, "file_size": 1909699}
2015/01/15-14:13:25.956500 110740000 EVENT_LOG_v1 {"time_micros": 1421360005956498, "event": "table_file_deletion", "file_number": 12}
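Since the point of the format is machine readability, here is a minimal sketch of how a consumer might pull the JSON object out of such a LOG line before handing it to a JSON parser. `ExtractEventJson` is a hypothetical helper, not part of RocksDB, and it assumes only what the sample lines above show: the payload is the text starting at the first `{` after the `EVENT_LOG_v1` marker.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a RocksDB API): copies the JSON payload that
// follows the "EVENT_LOG_v1" marker into *json and returns true.
// Returns false, leaving *json untouched, for lines without an event.
bool ExtractEventJson(const std::string& line, std::string* json) {
  static const std::string kMarker = "EVENT_LOG_v1";
  const std::size_t marker_pos = line.find(kMarker);
  if (marker_pos == std::string::npos) {
    return false;  // ordinary log line
  }
  // Assumption: the payload begins at the first '{' after the marker.
  const std::size_t json_pos = line.find('{', marker_pos + kMarker.size());
  if (json_pos == std::string::npos) {
    return false;  // marker present but no JSON object follows
  }
  *json = line.substr(json_pos);
  return true;
}
```

The extracted string can then be loaded into whatever store is used for querying; lines for which the helper returns false are regular log output and can be skipped.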
Reviewers: yhchiang, rven, dhruba, MarkCallaghan, lgalanis, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31647
10 years ago
	$(AM_LINK)

timer_queue_test: $(OBJ_DIR)/util/timer_queue_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

sst_dump_test: $(OBJ_DIR)/tools/sst_dump_test.o $(TOOLS_LIBRARY) $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

optimistic_transaction_test: $(OBJ_DIR)/utilities/transactions/optimistic_transaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

mock_env_test: $(OBJ_DIR)/env/mock_env_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

manual_compaction_test: $(OBJ_DIR)/db/manual_compaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

filelock_test: $(OBJ_DIR)/util/filelock_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

auto_roll_logger_test: $(OBJ_DIR)/logging/auto_roll_logger_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

env_logger_test: $(OBJ_DIR)/logging/env_logger_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

memtable_list_test: $(OBJ_DIR)/db/memtable_list_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

write_callback_test: $(OBJ_DIR)/db/write_callback_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

heap_test: $(OBJ_DIR)/util/heap_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

point_lock_manager_test: utilities/transactions/lock/point/point_lock_manager_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

transaction_test: $(OBJ_DIR)/utilities/transactions/transaction_test.o $(TEST_LIBRARY) $(LIBRARY)
Pessimistic Transactions
Summary:
Initial implementation of Pessimistic Transactions. This diff contains the api changes discussed in D38913. This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
MyRocks folks: please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint(). After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex. We can then decide which route is preferable.
Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
Test Plan: Unit tests, db_bench parallel testing.
Reviewers: igor, rven, sdong, yhchiang, yoshinorim
Reviewed By: sdong
Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40869
10 years ago
	$(AM_LINK)
Support user-defined timestamps in write-committed txns (#9629)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9629
Pessimistic transactions use pessimistic concurrency control, i.e. locking. Keys are
locked upon first operation that writes the key or has the intention of writing. For example,
`PessimisticTransaction::Put()`, `PessimisticTransaction::Delete()`,
`PessimisticTransaction::SingleDelete()` will write to or delete a key, while
`PessimisticTransaction::GetForUpdate()` is used by application to indicate
to RocksDB that the transaction has the intention of performing write operation later
in the same transaction.
Pessimistic transactions support two-phase commit (2PC). A transaction can be
`Prepare()`'ed and then `Commit()`'ed. The prepare phase is similar to a promise: once
`Prepare()` succeeds, the transaction has acquired the necessary resources to commit.
The resources include locks, persistence of WAL, etc.
Write-committed transaction is the default pessimistic transaction implementation. In
RocksDB write-committed transaction, `Prepare()` will write data to the WAL as a prepare
section. `Commit()` will write a commit marker to the WAL and then write data to the
memtables. While writing to the memtables, different keys in the transaction's write batch
will be assigned different sequence numbers in ascending order.
Until commit/rollback, the transaction holds locks on the keys so that no other transaction
can write to the same keys. Furthermore, the keys' sequence numbers represent the order
in which they are committed and should be made visible. This is convenient for us to
implement support for user-defined timestamps.
Since column families with and without timestamps can co-exist in the same database,
a transaction may or may not involve timestamps. Based on this observation, we add two
optional members to each `PessimisticTransaction`, `read_timestamp_` and
`commit_timestamp_`. If no key in the transaction's write batch has a timestamp, then
setting these two variables does not have any effect. For the rest of this commit, we discuss
only the cases when these two variables are meaningful.
`read_timestamp_` is used mainly for validation, and should be set before the first call to
`GetForUpdate()`. Otherwise, the latter will return non-ok status. `GetForUpdate()` calls
`TryLock()` that can verify if another transaction has written the same key since
`read_timestamp_` till this call to `GetForUpdate()`. If another transaction has indeed
written the same key, then validation fails, and RocksDB allows this transaction to
refine `read_timestamp_` by increasing it. Note that a transaction can still use `Get()`
with a different timestamp to read, but the result of the read should not be used to
determine data that will be written later.
`commit_timestamp_` must be set after finishing writing and before transaction commit.
This applies to both 2PC and non-2PC cases. In the case of 2PC, it's usually set after
prepare phase succeeds.
We currently require that the commit timestamp be chosen after all keys are locked. This
means we disallow the `TransactionDB`-level APIs if user-defined timestamp is used
by the transaction. Specifically, calling `PessimisticTransactionDB::Put()`,
`PessimisticTransactionDB::Delete()`, `PessimisticTransactionDB::SingleDelete()`,
etc. will return non-ok status because they specify timestamps before locking the keys.
Users are also prompted to use the `Transaction` APIs when they receive the non-ok status.
Reviewed By: ltamasi
Differential Revision: D31822445
fbshipit-source-id: b82abf8e230216dc89cc519564a588224a88fd43
3 years ago
write_committed_transaction_ts_test: $(OBJ_DIR)/utilities/transactions/write_committed_transaction_ts_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

write_prepared_transaction_test: $(OBJ_DIR)/utilities/transactions/write_prepared_transaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

write_unprepared_transaction_test: $(OBJ_DIR)/utilities/transactions/write_unprepared_transaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Snapshots with user-specified timestamps (#9879)
Summary:
In RocksDB, keys are associated with (internal) sequence numbers which denote when the keys are written
to the database. Sequence numbers in different RocksDB instances are unrelated, thus not comparable.
It is nice if we can associate sequence numbers with their corresponding actual timestamps. One thing we can
do is to support user-defined timestamp, which allows the applications to specify the format of custom timestamps
and encode a timestamp with each key. More details can be found at https://github.com/facebook/rocksdb/wiki/User-defined-Timestamp-%28Experimental%29.
This PR provides a different but complementary approach. We can associate rocksdb snapshots (defined in
https://github.com/facebook/rocksdb/blob/7.2.fb/include/rocksdb/snapshot.h#L20) with **user-specified** timestamps.
Since a snapshot is essentially an object representing a sequence number, this PR establishes a bi-directional mapping between sequence numbers and timestamps.
In the past, snapshots were usually taken by readers. The current super-version is grabbed, and a `rocksdb::Snapshot`
object is created with the last published sequence number of the super-version. You can see that the reader actually
has no good idea of what timestamp to assign to this snapshot, because by the time the `GetSnapshot()` is called,
an arbitrarily long period of time may have already elapsed since the last write, which is when the last published
sequence number is written.
This observation motivates the creation of "timestamped" snapshots on the write path. Currently, this functionality is
exposed only to the layer of `TransactionDB`. Applications can tell RocksDB to create a snapshot when a transaction
commits, effectively associating the last sequence number with a timestamp. It is also assumed that the application will
ensure that any two timestamped snapshots satisfy the following:
```
snapshot1.seq < snapshot2.seq iff. snapshot1.ts < snapshot2.ts
```
If the application can guarantee that when a reader takes a timestamped snapshot, there are no active writes going on
in the database, then we also allow the user to use a new API `TransactionDB::CreateTimestampedSnapshot()` to create
a snapshot with associated timestamp.
Code example
```cpp
// Create a timestamped snapshot when committing transaction.
txn->SetCommitTimestamp(100);
txn->SetSnapshotOnNextOperation();
txn->Commit();
// A wrapper API for convenience
Status Transaction::CommitAndTryCreateSnapshot(
std::shared_ptr<TransactionNotifier> notifier,
TxnTimestamp ts,
std::shared_ptr<const Snapshot>* ret);
// Create a timestamped snapshot if caller guarantees no concurrent writes
std::pair<Status, std::shared_ptr<const Snapshot>> snapshot = txn_db->CreateTimestampedSnapshot(100);
```
The snapshots created in this way will be managed by RocksDB with ref-counting and potentially shared with
other readers. We provide the following APIs for readers to retrieve a snapshot given a timestamp.
```cpp
// Return the timestamped snapshot corresponding to given timestamp. If ts is
// kMaxTxnTimestamp, then we return the latest timestamped snapshot if present.
// Otherwise, we return the snapshot whose timestamp is equal to `ts`. If no
// such snapshot exists, then we return null.
std::shared_ptr<const Snapshot> TransactionDB::GetTimestampedSnapshot(TxnTimestamp ts) const;
// Return the latest timestamped snapshot if present.
std::shared_ptr<const Snapshot> TransactionDB::GetLatestTimestampedSnapshot() const;
```
We also provide two additional APIs for stats collection and reporting purposes.
```cpp
Status TransactionDB::GetAllTimestampedSnapshots(
std::vector<std::shared_ptr<const Snapshot>>& snapshots) const;
// Return timestamped snapshots whose timestamps fall in [ts_lb, ts_ub) and store them in `snapshots`.
Status TransactionDB::GetTimestampedSnapshots(
TxnTimestamp ts_lb,
TxnTimestamp ts_ub,
std::vector<std::shared_ptr<const Snapshot>>& snapshots) const;
```
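To make the ordering invariant and the half-open range lookup concrete, here is a toy model of a timestamp-to-sequence-number mapping. It is a sketch and not RocksDB's implementation: `ToySnapshotRegistry`, `Add`, and `Range` are hypothetical names, snapshots are reduced to plain sequence numbers, and the registry only accepts snapshots that are strictly newer than everything stored, which is a simplification that is sufficient to preserve `seq1 < seq2 iff ts1 < ts2`.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy model, not the RocksDB implementation: maps a timestamp to the
// sequence number of the snapshot taken at that timestamp.
class ToySnapshotRegistry {
 public:
  // Adds a snapshot. Returns false if the new (seq, ts) pair is not strictly
  // newer than the newest stored snapshot in both dimensions, since that
  // could break the "seq1 < seq2 iff ts1 < ts2" ordering.
  bool Add(uint64_t seq, uint64_t ts) {
    if (!by_ts_.empty()) {
      const auto& newest = *by_ts_.rbegin();
      if (ts <= newest.first || seq <= newest.second) {
        return false;
      }
    }
    by_ts_.emplace(ts, seq);
    return true;
  }

  // Sequence numbers of snapshots whose timestamps fall in [ts_lb, ts_ub).
  std::vector<uint64_t> Range(uint64_t ts_lb, uint64_t ts_ub) const {
    std::vector<uint64_t> seqs;
    for (auto it = by_ts_.lower_bound(ts_lb);
         it != by_ts_.end() && it->first < ts_ub; ++it) {
      seqs.push_back(it->second);
    }
    return seqs;
  }

 private:
  std::map<uint64_t, uint64_t> by_ts_;  // timestamp -> sequence number
};
```

Because the map is ordered by timestamp and sequence numbers grow with it, a range query returns snapshots in commit order without any extra sorting.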
To prevent the number of timestamped snapshots from growing infinitely, we provide the following API to release
timestamped snapshots whose timestamps are older than or equal to a given threshold.
```cpp
void TransactionDB::ReleaseTimestampedSnapshotsOlderThan(TxnTimestamp ts);
```
Before shutdown, RocksDB will release all timestamped snapshots.
Comparison with user-defined timestamp and how they can be combined:
User-defined timestamp persists every key with a timestamp, while timestamped snapshots maintain a volatile
mapping between snapshots (sequence numbers) and timestamps.
Different internal keys with the same user key but different timestamps will be treated as different by compaction,
thus a newer version will not hide older versions (with smaller timestamps) unless they are eligible for garbage collection.
In contrast, taking a timestamped snapshot at a certain sequence number and timestamp prevents all the keys visible in
this snapshot from being dropped by compaction. Here, visible means (seq < snapshot and most recent).
The timestamped snapshot supports the semantics of reading at an exact point in time.
Timestamped snapshots can also be used with user-defined timestamp.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9879
Test Plan:
```
make check
TEST_TMPDIR=/dev/shm make crash_test_with_txn
```
Reviewed By: siying
Differential Revision: D35783919
Pulled By: riversand963
fbshipit-source-id: 586ad905e169189e19d3bfc0cb0177a7239d1bd4
3 years ago
timestamped_snapshot_test: $(OBJ_DIR)/utilities/transactions/timestamped_snapshot_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

tiered_compaction_test: $(OBJ_DIR)/db/compaction/tiered_compaction_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

sst_dump: $(OBJ_DIR)/tools/sst_dump.o $(TOOLS_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_dump: $(OBJ_DIR)/tools/blob_dump.o $(TOOLS_LIBRARY) $(LIBRARY)
	$(AM_LINK)

repair_test: $(OBJ_DIR)/db/repair_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

ldb_cmd_test: $(OBJ_DIR)/tools/ldb_cmd_test.o $(TOOLS_LIBRARY) $(TEST_LIBRARY) $(LIBRARY)
Remove ldb HexToString method's usage of sscanf
Summary:
Fix hex2String performance issues by removing sscanf dependency.
Also fixed some edge case handling (odd length, bad input).
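As an illustration of the approach (not the code from this diff), a hex decoder can avoid `sscanf` by converting each digit with plain character arithmetic. The sketch below assumes the input is a bare run of hex digits with no prefix; `HexDigitValue` and `DecodeHex` are hypothetical names. It rejects the two edge cases mentioned above, odd length and non-hex characters.

```cpp
#include <string>

// Returns the value of one hex digit, or -1 if `c` is not a hex digit.
int HexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper: decodes `hex` into raw bytes stored in *out.
// Returns false, leaving *out untouched, on odd length or bad input.
bool DecodeHex(const std::string& hex, std::string* out) {
  if (hex.size() % 2 != 0) return false;  // odd length
  std::string result;
  result.reserve(hex.size() / 2);
  for (std::string::size_type i = 0; i < hex.size(); i += 2) {
    const int hi = HexDigitValue(hex[i]);
    const int lo = HexDigitValue(hex[i + 1]);
    if (hi < 0 || lo < 0) return false;  // non-hex character
    result.push_back(static_cast<char>((hi << 4) | lo));
  }
  *out = result;
  return true;
}
```

Each byte costs two table-free comparisons chains and one shift, with no format-string parsing per character pair.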
Test Plan: Created a test file which called old and new implementation, and validated results are the same. I'll paste results in the phabricator diff.
Reviewers: igor, rven, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
Reviewed By: sdong
Subscribers: thatsafunnyname, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D46785
9 years ago
	$(AM_LINK)

ldb: $(OBJ_DIR)/tools/ldb.o $(TOOLS_LIBRARY) $(LIBRARY)
	$(AM_LINK)

iostats_context_test: $(OBJ_DIR)/monitoring/iostats_context_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_V_CCLD)$(CXX) $^ $(EXEC_LDFLAGS) -o $@ $(LDFLAGS)

persistent_cache_test: $(OBJ_DIR)/utilities/persistent_cache/persistent_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

statistics_test: $(OBJ_DIR)/monitoring/statistics_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

stats_history_test: $(OBJ_DIR)/monitoring/stats_history_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

compressed_secondary_cache_test: $(OBJ_DIR)/cache/compressed_secondary_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

lru_cache_test: $(OBJ_DIR)/cache/lru_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

range_del_aggregator_test: $(OBJ_DIR)/db/range_del_aggregator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

range_del_aggregator_bench: $(OBJ_DIR)/db/range_del_aggregator_bench.o $(LIBRARY)
	$(AM_LINK)

blob_db_test: $(OBJ_DIR)/utilities/blob_db/blob_db_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

repeatable_thread_test: $(OBJ_DIR)/util/repeatable_thread_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

range_locking_test: utilities/transactions/lock/range/range_locking_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

range_tombstone_fragmenter_test: $(OBJ_DIR)/db/range_tombstone_fragmenter_test.o $(TEST_LIBRARY) $(LIBRARY)
Use only "local" range tombstones during Get (#4449)
Summary:
Previously, range tombstones were accumulated from every level, which
was necessary if a range tombstone in a higher level covered a key in a lower
level. However, RangeDelAggregator::AddTombstones's complexity is based on
the number of tombstones that are currently stored in it, which is wasteful in
the Get case, where we only need to know the highest sequence number of range
tombstones that cover the key from higher levels, and compute the highest covering
sequence number at the current level. This change introduces this optimization, and
removes the use of RangeDelAggregator from the Get path.
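The quantity that Get needs can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `ToyRangeTombstone`, `MaxCoveringSeq`, and `IsDeleted` are hypothetical names, keys are compared bytewise as `std::string`, and a tombstone is taken to cover the half-open range `[start, end)`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Illustrative type, not a RocksDB class: deletes keys in [start, end)
// that were written before sequence number `seq`.
struct ToyRangeTombstone {
  std::string start;
  std::string end;
  uint64_t seq;
};

// Returns the highest sequence number among the tombstones that cover
// `key`, or 0 if no tombstone covers it.
uint64_t MaxCoveringSeq(const std::vector<ToyRangeTombstone>& tombstones,
                        const std::string& key) {
  uint64_t max_seq = 0;
  for (const auto& t : tombstones) {
    if (t.start <= key && key < t.end && t.seq > max_seq) {
      max_seq = t.seq;
    }
  }
  return max_seq;
}

// A point entry is hidden when a covering tombstone is newer than it.
bool IsDeleted(uint64_t key_seq, uint64_t max_covering_seq) {
  return key_seq < max_covering_seq;
}
```

A lookup that walks the levels from newest to oldest only has to carry this single number forward and take the maximum with each level's local result, instead of accumulating every tombstone it has seen.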
In the benchmark results, the following command was used to initialize the database:
```
./db_bench -db=/dev/shm/5k-rts -use_existing_db=false -benchmarks=filluniquerandom -write_buffer_size=1048576 -compression_type=lz4 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -value_size=112 -key_size=16 -block_size=4096 -level_compaction_dynamic_level_bytes=true -num=5000000 -max_background_jobs=12 -benchmark_write_rate_limit=20971520 -range_tombstone_width=100 -writes_per_range_tombstone=100 -max_num_range_tombstones=50000 -bloom_bits=8
```
...and the following command was used to measure read throughput:
```
./db_bench -db=/dev/shm/5k-rts/ -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=5000000 -reads=100000 -threads=32
```
The filluniquerandom command was only run once, and the resulting database was used
to measure read performance before and after the PR. Both binaries were compiled with
`DEBUG_LEVEL=0`.
Readrandom results before PR:
```
readrandom : 4.544 micros/op 220090 ops/sec; 16.9 MB/s (63103 of 100000 found)
```
Readrandom results after PR:
```
readrandom : 11.147 micros/op 89707 ops/sec; 6.9 MB/s (63103 of 100000 found)
```
So it's actually slower right now, but this PR paves the way for future optimizations (see #4493).
----
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4449
Differential Revision: D10370575
Pulled By: abhimadan
fbshipit-source-id: 9a2e152be1ef36969055c0e9eb4beb0d96c11f4d
6 years ago
	$(AM_LINK)

sst_file_reader_test: $(OBJ_DIR)/table/sst_file_reader_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

db_secondary_test: $(OBJ_DIR)/db/db_secondary_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

block_cache_tracer_test: $(OBJ_DIR)/trace_replay/block_cache_tracer_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

block_cache_trace_analyzer_test: $(OBJ_DIR)/tools/block_cache_analyzer/block_cache_trace_analyzer_test.o $(OBJ_DIR)/tools/block_cache_analyzer/block_cache_trace_analyzer.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

defer_test: $(OBJ_DIR)/util/defer_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_counting_iterator_test: $(OBJ_DIR)/db/blob/blob_counting_iterator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_file_addition_test: $(OBJ_DIR)/db/blob/blob_file_addition_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_file_builder_test: $(OBJ_DIR)/db/blob/blob_file_builder_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_file_cache_test: $(OBJ_DIR)/db/blob/blob_file_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_file_garbage_test: $(OBJ_DIR)/db/blob/blob_file_garbage_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Introduce a blob file reader class (#7461)
Summary:
The patch adds a class called `BlobFileReader` that can be used to retrieve blobs
using the information available in blob references (e.g. blob file number, offset, and
size). This will come in handy when implementing blob support for `Get`, `MultiGet`,
and iterators, and also for compaction/garbage collection.
When a `BlobFileReader` object is created (using the factory method `Create`),
it first checks whether the specified file is potentially valid by comparing the file
size against the combined size of the blob file header and footer (files smaller than
the threshold are considered malformed). Then, it opens the file, and reads and verifies
the header and footer. The verification involves magic number/CRC checks
as well as checking for unexpected header/footer fields, e.g. incorrect column family ID
or TTL blob files.
Blobs can be retrieved using `GetBlob`. `GetBlob` validates the offset and compression
type passed by the caller (because of the presence of the header and footer, the
specified offset cannot be too close to the start/end of the file; also, the compression type
has to match the one in the blob file header), and retrieves and potentially verifies and
uncompresses the blob. In particular, when `ReadOptions::verify_checksums` is set,
`BlobFileReader` reads the blob record header as well (as opposed to just the blob itself)
and verifies the key/value size, the key itself, as well as the CRC of the blob record header
and the key/value pair.
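The two size-based checks described above can be sketched as simple bounds tests. This is a simplified illustration, not `BlobFileReader`'s code: `ToyBlobLayout`, `FileSizeLooksValid`, and `BlobOffsetLooksValid` are hypothetical names, the header and footer sizes are passed in as parameters rather than taken from the blob file format, and the per-record header and key that precede a blob in a real file are ignored.

```cpp
#include <cstdint>

// Hypothetical layout parameters; the real header and footer sizes are
// defined by the blob file format and are not taken from the text above.
struct ToyBlobLayout {
  uint64_t header_size;
  uint64_t footer_size;
};

// A file smaller than header + footer cannot be a well-formed blob file.
bool FileSizeLooksValid(const ToyBlobLayout& layout, uint64_t file_size) {
  return file_size >= layout.header_size + layout.footer_size;
}

// A blob of `size` bytes at `offset` must lie entirely between the file
// header and the file footer.
bool BlobOffsetLooksValid(const ToyBlobLayout& layout, uint64_t file_size,
                          uint64_t offset, uint64_t size) {
  if (!FileSizeLooksValid(layout, file_size)) return false;
  if (offset < layout.header_size) return false;  // overlaps the header
  const uint64_t payload_end = file_size - layout.footer_size;
  // Written as two comparisons to avoid overflow in offset + size.
  return offset <= payload_end && size <= payload_end - offset;
}
```

Both checks are cheap and run before any I/O on the blob itself, so a bad reference is rejected without reading from the file.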
In addition, the patch exposes the compression type from `BlobIndex` (both using an
accessor and via `DebugString`), and adds a blob file read latency histogram to
`InternalStats` that can be used with `BlobFileReader`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7461
Test Plan: `make check`
Reviewed By: riversand963
Differential Revision: D23999219
Pulled By: ltamasi
fbshipit-source-id: deb6b1160d251258b308d5156e2ec063c3e12e5e
4 years ago
blob_file_reader_test: $(OBJ_DIR)/db/blob/blob_file_reader_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_source_test: $(OBJ_DIR)/db/blob/blob_source_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

blob_garbage_meter_test: $(OBJ_DIR)/db/blob/blob_garbage_meter_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

timer_test: $(OBJ_DIR)/util/timer_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

periodic_task_scheduler_test: $(OBJ_DIR)/db/periodic_task_scheduler_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

testutil_test: $(OBJ_DIR)/test_util/testutil_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

io_tracer_test: $(OBJ_DIR)/trace_replay/io_tracer_test.o $(OBJ_DIR)/trace_replay/io_tracer.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
Provide support for IOTracing for ReadAsync API (#9833)
Summary:
Same as title
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9833
Test Plan:
Add unit test and manually check the output of tracing logs
For fixed readahead_size it logs as:
```
Access Time : 193352113447923 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 15075 , IO Status: OK, Length: 12288, Offset: 659456
Access Time : 193352113465232 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 14425 , IO Status: OK, Length: 12288, Offset: 671744
Access Time : 193352113481539 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 13062 , IO Status: OK, Length: 12288, Offset: 684032
Access Time : 193352113497692 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 13649 , IO Status: OK, Length: 12288, Offset: 696320
Access Time : 193352113520043 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 19384 , IO Status: OK, Length: 12288, Offset: 708608
Access Time : 193352113538401 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 15406 , IO Status: OK, Length: 12288, Offset: 720896
Access Time : 193352113554855 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 13670 , IO Status: OK, Length: 12288, Offset: 733184
Access Time : 193352113571624 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 13855 , IO Status: OK, Length: 12288, Offset: 745472
Access Time : 193352113587924 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 13953 , IO Status: OK, Length: 12288, Offset: 757760
Access Time : 193352113603285 , File Name: 000026.sst , File Operation: Prefetch , Latency: 59 , IO Status: Not implemented: Prefetch not supported, Length: 8868, Offset: 898349
```
For implicit readahead:
```
Access Time : 193351865156587 , File Name: 000026.sst , File Operation: Prefetch , Latency: 48 , IO Status: Not implemented: Prefetch not supported, Length: 12266, Offset: 391174
Access Time : 193351865160354 , File Name: 000026.sst , File Operation: Prefetch , Latency: 51 , IO Status: Not implemented: Prefetch not supported, Length: 12266, Offset: 395248
Access Time : 193351865164253 , File Name: 000026.sst , File Operation: Prefetch , Latency: 49 , IO Status: Not implemented: Prefetch not supported, Length: 12266, Offset: 399322
Access Time : 193351865165461 , File Name: 000026.sst , File Operation: ReadAsync , Latency: 222871 , IO Status: OK, Length: 135168, Offset: 401408
```
Reviewed By: anand1976
Differential Revision: D35601634
Pulled By: akankshamahajan15
fbshipit-source-id: 5a4f32a850af878efa0767bd5706380152a1f26e
3 years ago
prefetch_test: $(OBJ_DIR)/file/prefetch_test.o $(OBJ_DIR)/tools/io_tracer_parser_tool.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

io_tracer_parser_test: $(OBJ_DIR)/tools/io_tracer_parser_test.o $(OBJ_DIR)/tools/io_tracer_parser_tool.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

io_tracer_parser: $(OBJ_DIR)/tools/io_tracer_parser.o $(TOOLS_LIBRARY) $(LIBRARY)
	$(AM_LINK)

db_blob_corruption_test: $(OBJ_DIR)/db/blob/db_blob_corruption_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

db_write_buffer_manager_test: $(OBJ_DIR)/db/db_write_buffer_manager_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

clipping_iterator_test: $(OBJ_DIR)/db/compaction/clipping_iterator_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

ribbon_bench: $(OBJ_DIR)/microbench/ribbon_bench.o $(LIBRARY)
	$(AM_LINK)

db_basic_bench: $(OBJ_DIR)/microbench/db_basic_bench.o $(LIBRARY)
	$(AM_LINK)

cache_reservation_manager_test: $(OBJ_DIR)/cache/cache_reservation_manager_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)

wide_column_serialization_test: $(OBJ_DIR)/db/wide/wide_column_serialization_test.o $(TEST_LIBRARY) $(LIBRARY)
	$(AM_LINK)
#-------------------------------------------------
# make install related stuff
PREFIX ?= /usr/local
LIBDIR ?= $(PREFIX)/lib
INSTALL_LIBDIR = $(DESTDIR)$(LIBDIR)

uninstall:
	rm -rf $(DESTDIR)$(PREFIX)/include/rocksdb \
	  $(INSTALL_LIBDIR)/$(LIBRARY) \
	  $(INSTALL_LIBDIR)/$(SHARED4) \
	  $(INSTALL_LIBDIR)/$(SHARED3) \
	  $(INSTALL_LIBDIR)/$(SHARED2) \
	  $(INSTALL_LIBDIR)/$(SHARED1) \
	  $(INSTALL_LIBDIR)/pkgconfig/rocksdb.pc

install-headers: gen-pc
	install -d $(INSTALL_LIBDIR)
	install -d $(INSTALL_LIBDIR)/pkgconfig
	for header_dir in `$(FIND) "include/rocksdb" -type d`; do \
		install -d $(DESTDIR)/$(PREFIX)/$$header_dir; \
	done
	for header in `$(FIND) "include/rocksdb" -type f -name *.h`; do \
		install -C -m 644 $$header $(DESTDIR)/$(PREFIX)/$$header; \
	done
	for header in $(ROCKSDB_PLUGIN_HEADERS); do \
		install -d $(DESTDIR)/$(PREFIX)/include/rocksdb/`dirname $$header`; \
		install -C -m 644 $$header $(DESTDIR)/$(PREFIX)/include/rocksdb/$$header; \
	done
	install -C -m 644 rocksdb.pc $(INSTALL_LIBDIR)/pkgconfig/rocksdb.pc

install-static: install-headers $(LIBRARY)
	install -d $(INSTALL_LIBDIR)
	install -C -m 755 $(LIBRARY) $(INSTALL_LIBDIR)

install-shared: install-headers $(SHARED4)
	install -d $(INSTALL_LIBDIR)
	install -C -m 755 $(SHARED4) $(INSTALL_LIBDIR)
	ln -fs $(SHARED4) $(INSTALL_LIBDIR)/$(SHARED3)
	ln -fs $(SHARED4) $(INSTALL_LIBDIR)/$(SHARED2)
	ln -fs $(SHARED4) $(INSTALL_LIBDIR)/$(SHARED1)

# install static by default + install shared if it exists
install: install-static
	[ -e $(SHARED4) ] && $(MAKE) install-shared || :

# Generate the pkg-config file
gen-pc:
	-echo 'prefix=$(PREFIX)' > rocksdb.pc
	-echo 'exec_prefix=$${prefix}' >> rocksdb.pc
	-echo 'includedir=$${prefix}/include' >> rocksdb.pc
	-echo 'libdir=$(LIBDIR)' >> rocksdb.pc
	-echo '' >> rocksdb.pc
	-echo 'Name: rocksdb' >> rocksdb.pc
	-echo 'Description: An embeddable persistent key-value store for fast storage' >> rocksdb.pc
	-echo Version: $(shell ./build_tools/version.sh full) >> rocksdb.pc
	-echo 'Libs: -L$${libdir} $(EXEC_LDFLAGS) -lrocksdb' >> rocksdb.pc
	-echo 'Libs.private: $(PLATFORM_LDFLAGS)' >> rocksdb.pc
	-echo 'Cflags: -I$${includedir} $(PLATFORM_CXXFLAGS)' >> rocksdb.pc
	-echo 'Requires: $(subst ",,$(ROCKSDB_PLUGIN_PKGCONFIG_REQUIRES))' >> rocksdb.pc
#-------------------------------------------------
Add a jni library for rocksdb which supports Open, Get, Put, and Close.
Summary:
This diff contains a simple jni library for rocksdb which supports open, get,
put, and close using default options (including Options, ReadOptions, and
WriteOptions). In the usual case, Java developers can use the c++ rocksdb
library in a way similar to the following:
    RocksDB db = RocksDB.open(path_to_db);
    ...
    db.put("hello".getBytes(), "world".getBytes());
    byte[] value = db.get("hello".getBytes());
    ...
    db.close();
Specifically, this diff has the following major classes:
* RocksDB: a Java wrapper class which forwards the operations
  from the java side to the c++ rocksdb library.
* RocksDBException: encapsulates the error of an operation.
  This exception type is used to describe an internal error from
  the c++ rocksdb library.
This diff also includes simple java sample code calling the c++ rocksdb library.
To build the rocksdb jni library, simply run make jni; make jtest will try to
build and run the sample code.
Note that if rocksdb is not built with the default glibc that Java uses,
java will try to load the wrong glibc at run time. As a result,
the sample code might not work properly.
Test Plan:
* make jni
* make jtest
Reviewers: haobo, dhruba, sdong, igor, ljin
Reviewed By: dhruba
CC: leveldb, xjin
Differential Revision: https://reviews.facebook.net/D17109
11 years ago
# ---------------------------------------------------------------------------
# Jni stuff
# ---------------------------------------------------------------------------
JAVA_INCLUDE = -I$(JAVA_HOME)/include/ -I$(JAVA_HOME)/include/linux
ifeq ($(PLATFORM), OS_SOLARIS)
  ARCH := $(shell isainfo -b)
else ifeq ($(PLATFORM), OS_OPENBSD)
  ifneq (,$(filter amd64 ppc64 ppc64le s390x arm64 aarch64 sparc64 loongarch64, $(MACHINE)))
    ARCH := 64
  else
    ARCH := 32
  endif
else
  ARCH := $(shell getconf LONG_BIT)
endif

ifeq ($(shell ldd /usr/bin/env 2>/dev/null | grep -q musl; echo $$?),0)
  JNI_LIBC = musl
# GNU LibC (or glibc) is so pervasive we can assume it is the default
# else
#  JNI_LIBC = glibc
endif

ifneq ($(origin JNI_LIBC), undefined)
  JNI_LIBC_POSTFIX = -$(JNI_LIBC)
endif

ifeq (,$(ROCKSDBJNILIB))
ifneq (,$(filter ppc% s390x arm64 aarch64 sparc64 loongarch64, $(MACHINE)))
  ROCKSDBJNILIB = librocksdbjni-linux-$(MACHINE)$(JNI_LIBC_POSTFIX).so
else
  ROCKSDBJNILIB = librocksdbjni-linux$(ARCH)$(JNI_LIBC_POSTFIX).so
endif
endif
ROCKSDB_JAVA_VERSION ?= $(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)
ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-linux$(ARCH)$(JNI_LIBC_POSTFIX).jar
ROCKSDB_JAR_ALL = rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar
ROCKSDB_JAVADOCS_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-javadoc.jar
ROCKSDB_SOURCES_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-sources.jar
SHA256_CMD = sha256sum
ZLIB_VER ?= 1.2.13
ZLIB_SHA256 ?= b3a24de97a8fdbc835b9833169501030b8977031bcb54b3b3ac13740f846ab30
ZLIB_DOWNLOAD_BASE ?= http://zlib.net
BZIP2_VER ?= 1.0.8
BZIP2_SHA256 ?= ab5a03176ee106d3f0fa90e381da478ddae405918153cca248e682cd0c4a2269
BZIP2_DOWNLOAD_BASE ?= http://sourceware.org/pub/bzip2
SNAPPY_VER ?= 1.1.8
SNAPPY_SHA256 ?= 16b677f07832a612b0836178db7f374e414f94657c138e6993cbfc5dcc58651f
SNAPPY_DOWNLOAD_BASE ?= https://github.com/google/snappy/archive
LZ4_VER ?= 1.9.3
LZ4_SHA256 ?= 030644df4611007ff7dc962d981f390361e6c97a34e5cbc393ddfbe019ffe2c1
LZ4_DOWNLOAD_BASE ?= https://github.com/lz4/lz4/archive
ZSTD_VER ?= 1.4.9
ZSTD_SHA256 ?= acf714d98e3db7b876e5b540cbf6dee298f60eb3c0723104f6d3f065cd60d6a8
ZSTD_DOWNLOAD_BASE ?= https://github.com/facebook/zstd/archive
CURL_SSL_OPTS ?= --tlsv1
ifeq ($(PLATFORM), OS_MACOSX)
ifeq (,$(findstring librocksdbjni-osx,$(ROCKSDBJNILIB)))
ifeq ($(MACHINE),arm64)
  ROCKSDBJNILIB = librocksdbjni-osx-arm64.jnilib
else ifeq ($(MACHINE),x86_64)
  ROCKSDBJNILIB = librocksdbjni-osx-x86_64.jnilib
else
  ROCKSDBJNILIB = librocksdbjni-osx.jnilib
endif
endif
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-osx.jar
  SHA256_CMD = openssl sha256 -r
ifneq ("$(wildcard $(JAVA_HOME)/include/darwin)","")
  JAVA_INCLUDE = -I$(JAVA_HOME)/include -I $(JAVA_HOME)/include/darwin
else
  JAVA_INCLUDE = -I/System/Library/Frameworks/JavaVM.framework/Headers/
endif
endif

ifeq ($(PLATFORM), OS_FREEBSD)
  JAVA_INCLUDE = -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/freebsd
  ROCKSDBJNILIB = librocksdbjni-freebsd$(ARCH).so
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-freebsd$(ARCH).jar
endif

ifeq ($(PLATFORM), OS_SOLARIS)
  ROCKSDBJNILIB = librocksdbjni-solaris$(ARCH).so
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-solaris$(ARCH).jar
  JAVA_INCLUDE = -I$(JAVA_HOME)/include/ -I$(JAVA_HOME)/include/solaris
  SHA256_CMD = digest -a sha256
endif

ifeq ($(PLATFORM), OS_AIX)
  JAVA_INCLUDE = -I$(JAVA_HOME)/include/ -I$(JAVA_HOME)/include/aix
  ROCKSDBJNILIB = librocksdbjni-aix.so
  EXTRACT_SOURCES = gunzip < TAR_GZ | tar xvf -
  SNAPPY_MAKE_TARGET = libsnappy.la
endif

ifeq ($(PLATFORM), OS_OPENBSD)
  JAVA_INCLUDE = -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/openbsd
  ROCKSDBJNILIB = librocksdbjni-openbsd$(ARCH).so
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-openbsd$(ARCH).jar
endif

export SHA256_CMD
zlib-$(ZLIB_VER).tar.gz:
	curl --fail --output zlib-$(ZLIB_VER).tar.gz --location ${ZLIB_DOWNLOAD_BASE}/zlib-$(ZLIB_VER).tar.gz
	ZLIB_SHA256_ACTUAL=`$(SHA256_CMD) zlib-$(ZLIB_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(ZLIB_SHA256)" != "$$ZLIB_SHA256_ACTUAL" ]; then \
		echo zlib-$(ZLIB_VER).tar.gz checksum mismatch, expected=\"$(ZLIB_SHA256)\" actual=\"$$ZLIB_SHA256_ACTUAL\"; \
		exit 1; \
	fi

libz.a: zlib-$(ZLIB_VER).tar.gz
	-rm -rf zlib-$(ZLIB_VER)
	tar xvzf zlib-$(ZLIB_VER).tar.gz
	if [ -n "$(ARCHFLAG)" ]; then \
		cd zlib-$(ZLIB_VER) && CFLAGS='-fPIC ${JAVA_STATIC_DEPS_CCFLAGS} ${EXTRA_CFLAGS}' LDFLAGS='${JAVA_STATIC_DEPS_LDFLAGS} ${EXTRA_LDFLAGS}' ./configure --static --archs="$(ARCHFLAG)" && $(MAKE); \
	else \
		cd zlib-$(ZLIB_VER) && CFLAGS='-fPIC ${JAVA_STATIC_DEPS_CCFLAGS} ${EXTRA_CFLAGS}' LDFLAGS='${JAVA_STATIC_DEPS_LDFLAGS} ${EXTRA_LDFLAGS}' ./configure --static && $(MAKE); \
	fi
	cp zlib-$(ZLIB_VER)/libz.a .

bzip2-$(BZIP2_VER).tar.gz:
	curl --fail --output bzip2-$(BZIP2_VER).tar.gz --location ${CURL_SSL_OPTS} ${BZIP2_DOWNLOAD_BASE}/bzip2-$(BZIP2_VER).tar.gz
	BZIP2_SHA256_ACTUAL=`$(SHA256_CMD) bzip2-$(BZIP2_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(BZIP2_SHA256)" != "$$BZIP2_SHA256_ACTUAL" ]; then \
		echo bzip2-$(BZIP2_VER).tar.gz checksum mismatch, expected=\"$(BZIP2_SHA256)\" actual=\"$$BZIP2_SHA256_ACTUAL\"; \
		exit 1; \
	fi

libbz2.a: bzip2-$(BZIP2_VER).tar.gz
	-rm -rf bzip2-$(BZIP2_VER)
	tar xvzf bzip2-$(BZIP2_VER).tar.gz
	cd bzip2-$(BZIP2_VER) && $(MAKE) CFLAGS='-fPIC -O2 -g -D_FILE_OFFSET_BITS=64 $(ARCHFLAG) ${JAVA_STATIC_DEPS_CCFLAGS} ${EXTRA_CFLAGS}' LDFLAGS='${JAVA_STATIC_DEPS_LDFLAGS} ${EXTRA_LDFLAGS}' AR='ar ${EXTRA_ARFLAGS}' libbz2.a
	cp bzip2-$(BZIP2_VER)/libbz2.a .

snappy-$(SNAPPY_VER).tar.gz:
	curl --fail --output snappy-$(SNAPPY_VER).tar.gz --location ${CURL_SSL_OPTS} ${SNAPPY_DOWNLOAD_BASE}/$(SNAPPY_VER).tar.gz
	SNAPPY_SHA256_ACTUAL=`$(SHA256_CMD) snappy-$(SNAPPY_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(SNAPPY_SHA256)" != "$$SNAPPY_SHA256_ACTUAL" ]; then \
		echo snappy-$(SNAPPY_VER).tar.gz checksum mismatch, expected=\"$(SNAPPY_SHA256)\" actual=\"$$SNAPPY_SHA256_ACTUAL\"; \
		exit 1; \
	fi

libsnappy.a: snappy-$(SNAPPY_VER).tar.gz
	-rm -rf snappy-$(SNAPPY_VER)
	tar xvzf snappy-$(SNAPPY_VER).tar.gz
	mkdir snappy-$(SNAPPY_VER)/build
	cd snappy-$(SNAPPY_VER)/build && CFLAGS='$(ARCHFLAG) ${JAVA_STATIC_DEPS_CCFLAGS} ${EXTRA_CFLAGS}' CXXFLAGS='$(ARCHFLAG) ${JAVA_STATIC_DEPS_CXXFLAGS} ${EXTRA_CXXFLAGS}' LDFLAGS='${JAVA_STATIC_DEPS_LDFLAGS} ${EXTRA_LDFLAGS}' cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON ${PLATFORM_CMAKE_FLAGS} .. && $(MAKE) ${SNAPPY_MAKE_TARGET}
	cp snappy-$(SNAPPY_VER)/build/libsnappy.a .

lz4-$(LZ4_VER).tar.gz:
	curl --fail --output lz4-$(LZ4_VER).tar.gz --location ${CURL_SSL_OPTS} ${LZ4_DOWNLOAD_BASE}/v$(LZ4_VER).tar.gz
	LZ4_SHA256_ACTUAL=`$(SHA256_CMD) lz4-$(LZ4_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(LZ4_SHA256)" != "$$LZ4_SHA256_ACTUAL" ]; then \
		echo lz4-$(LZ4_VER).tar.gz checksum mismatch, expected=\"$(LZ4_SHA256)\" actual=\"$$LZ4_SHA256_ACTUAL\"; \
		exit 1; \
	fi

liblz4.a: lz4-$(LZ4_VER).tar.gz
	-rm -rf lz4-$(LZ4_VER)
	tar xvzf lz4-$(LZ4_VER).tar.gz
	cd lz4-$(LZ4_VER)/lib && $(MAKE) CFLAGS='-fPIC -O2 $(ARCHFLAG) ${JAVA_STATIC_DEPS_CCFLAGS} ${EXTRA_CFLAGS}' LDFLAGS='${JAVA_STATIC_DEPS_LDFLAGS} ${EXTRA_LDFLAGS}' all
	cp lz4-$(LZ4_VER)/lib/liblz4.a .

zstd-$(ZSTD_VER).tar.gz:
	curl --fail --output zstd-$(ZSTD_VER).tar.gz --location ${CURL_SSL_OPTS} ${ZSTD_DOWNLOAD_BASE}/v$(ZSTD_VER).tar.gz
	ZSTD_SHA256_ACTUAL=`$(SHA256_CMD) zstd-$(ZSTD_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(ZSTD_SHA256)" != "$$ZSTD_SHA256_ACTUAL" ]; then \
		echo zstd-$(ZSTD_VER).tar.gz checksum mismatch, expected=\"$(ZSTD_SHA256)\" actual=\"$$ZSTD_SHA256_ACTUAL\"; \
		exit 1; \
	fi

libzstd.a: zstd-$(ZSTD_VER).tar.gz
	-rm -rf zstd-$(ZSTD_VER)
	tar xvzf zstd-$(ZSTD_VER).tar.gz
	cd zstd-$(ZSTD_VER)/lib && DESTDIR=. PREFIX= $(MAKE) CFLAGS='-fPIC -O2 $(ARCHFLAG) ${JAVA_STATIC_DEPS_CCFLAGS} ${EXTRA_CFLAGS}' LDFLAGS='${JAVA_STATIC_DEPS_LDFLAGS} ${EXTRA_LDFLAGS}' libzstd.a
	cp zstd-$(ZSTD_VER)/lib/libzstd.a .
# A version of each $(LIB_OBJECTS) compiled with -fPIC and a fixed set of static compression libraries
ifneq ($(ROCKSDB_JAVA_NO_COMPRESSION), 1)
JAVA_COMPRESSIONS = libz.a libbz2.a libsnappy.a liblz4.a libzstd.a
endif

JAVA_STATIC_FLAGS = -DZLIB -DBZIP2 -DSNAPPY -DLZ4 -DZSTD
JAVA_STATIC_INCLUDES = -I./zlib-$(ZLIB_VER) -I./bzip2-$(BZIP2_VER) -I./snappy-$(SNAPPY_VER) -I./snappy-$(SNAPPY_VER)/build -I./lz4-$(LZ4_VER)/lib -I./zstd-$(ZSTD_VER)/lib -I./zstd-$(ZSTD_VER)/lib/dictBuilder

ifneq ($(findstring rocksdbjavastatic, $(filter-out rocksdbjavastatic_deps, $(MAKECMDGOALS))),)
CXXFLAGS += $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES)
CFLAGS += $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES)
endif

rocksdbjavastatic:
ifeq ($(JAVA_HOME),)
	$(error JAVA_HOME is not set)
endif
	$(MAKE) rocksdbjavastatic_deps
	$(MAKE) rocksdbjavastatic_libobjects
	$(MAKE) rocksdbjavastatic_javalib
	$(MAKE) rocksdbjava_jar

rocksdbjavastaticosx: rocksdbjavastaticosx_archs
	cd java; $(JAR_CMD) -cf target/$(ROCKSDB_JAR) HISTORY*.md
	cd java/target; $(JAR_CMD) -uf $(ROCKSDB_JAR) librocksdbjni-osx-x86_64.jnilib librocksdbjni-osx-arm64.jnilib
	cd java/target/classes; $(JAR_CMD) -uf ../$(ROCKSDB_JAR) org/rocksdb/*.class org/rocksdb/util/*.class
	openssl sha1 java/target/$(ROCKSDB_JAR) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAR).sha1

rocksdbjavastaticosx_ub: rocksdbjavastaticosx_archs
	cd java/target; lipo -create -output librocksdbjni-osx.jnilib librocksdbjni-osx-x86_64.jnilib librocksdbjni-osx-arm64.jnilib
	cd java; $(JAR_CMD) -cf target/$(ROCKSDB_JAR) HISTORY*.md
	cd java/target; $(JAR_CMD) -uf $(ROCKSDB_JAR) librocksdbjni-osx.jnilib
	cd java/target/classes; $(JAR_CMD) -uf ../$(ROCKSDB_JAR) org/rocksdb/*.class org/rocksdb/util/*.class
	openssl sha1 java/target/$(ROCKSDB_JAR) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAR).sha1

rocksdbjavastaticosx_archs:
	$(MAKE) rocksdbjavastaticosx_arch_x86_64
	$(MAKE) rocksdbjavastaticosx_arch_arm64

rocksdbjavastaticosx_arch_%:
ifeq ($(JAVA_HOME),)
	$(error JAVA_HOME is not set)
endif
	$(MAKE) clean-ext-libraries-bin
	$(MAKE) clean-rocks
	ARCHFLAG="-arch $*" $(MAKE) rocksdbjavastatic_deps
	ARCHFLAG="-arch $*" $(MAKE) rocksdbjavastatic_libobjects
	ARCHFLAG="-arch $*" ROCKSDBJNILIB="librocksdbjni-osx-$*.jnilib" $(MAKE) rocksdbjavastatic_javalib

ifeq ($(JAR_CMD),)
ifneq ($(JAVA_HOME),)
JAR_CMD := $(JAVA_HOME)/bin/jar
else
JAR_CMD := jar
endif
endif
rocksdbjavastatic_javalib:
	cd java; $(MAKE) javalib
	rm -f java/target/$(ROCKSDBJNILIB)
	$(CXX) $(CXXFLAGS) -I./java/. $(JAVA_INCLUDE) -shared -fPIC \
	  -o ./java/target/$(ROCKSDBJNILIB) $(ALL_JNI_NATIVE_SOURCES) \
	  $(LIB_OBJECTS) $(COVERAGEFLAGS) \
	  $(JAVA_COMPRESSIONS) $(JAVA_STATIC_LDFLAGS)
	cd java/target; if [ "$(DEBUG_LEVEL)" == "0" ]; then \
		strip $(STRIPFLAGS) $(ROCKSDBJNILIB); \
	fi

rocksdbjava_jar:
	cd java; $(JAR_CMD) -cf target/$(ROCKSDB_JAR) HISTORY*.md
	cd java/target; $(JAR_CMD) -uf $(ROCKSDB_JAR) $(ROCKSDBJNILIB)
	cd java/target/classes; $(JAR_CMD) -uf ../$(ROCKSDB_JAR) org/rocksdb/*.class org/rocksdb/util/*.class
	openssl sha1 java/target/$(ROCKSDB_JAR) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAR).sha1

rocksdbjava_javadocs_jar:
	cd java/target/apidocs; $(JAR_CMD) -cf ../$(ROCKSDB_JAVADOCS_JAR) *
	openssl sha1 java/target/$(ROCKSDB_JAVADOCS_JAR) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAVADOCS_JAR).sha1

rocksdbjava_sources_jar:
	cd java/src/main/java; $(JAR_CMD) -cf ../../../target/$(ROCKSDB_SOURCES_JAR) org
	openssl sha1 java/target/$(ROCKSDB_SOURCES_JAR) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_SOURCES_JAR).sha1

rocksdbjavastatic_deps: $(JAVA_COMPRESSIONS)

rocksdbjavastatic_libobjects: $(LIB_OBJECTS)

rocksdbjavastaticrelease: rocksdbjavastaticosx rocksdbjava_javadocs_jar rocksdbjava_sources_jar
	cd java/crossbuild && (vagrant destroy -f || true) && vagrant up linux32 && vagrant halt linux32 && vagrant up linux64 && vagrant halt linux64 && vagrant up linux64-musl && vagrant halt linux64-musl
	cd java; $(JAR_CMD) -cf target/$(ROCKSDB_JAR_ALL) HISTORY*.md
	cd java/target; $(JAR_CMD) -uf $(ROCKSDB_JAR_ALL) librocksdbjni-*.so librocksdbjni-*.jnilib
	cd java/target/classes; $(JAR_CMD) -uf ../$(ROCKSDB_JAR_ALL) org/rocksdb/*.class org/rocksdb/util/*.class
	openssl sha1 java/target/$(ROCKSDB_JAR_ALL) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAR_ALL).sha1

rocksdbjavastaticreleasedocker: rocksdbjavastaticosx rocksdbjavastaticdockerx86 rocksdbjavastaticdockerx86_64 rocksdbjavastaticdockerx86musl rocksdbjavastaticdockerx86_64musl rocksdbjava_javadocs_jar rocksdbjava_sources_jar
	cd java; $(JAR_CMD) -cf target/$(ROCKSDB_JAR_ALL) HISTORY*.md
	cd java/target; $(JAR_CMD) -uf $(ROCKSDB_JAR_ALL) librocksdbjni-*.so librocksdbjni-*.jnilib
	cd java/target/classes; $(JAR_CMD) -uf ../$(ROCKSDB_JAR_ALL) org/rocksdb/*.class org/rocksdb/util/*.class
	openssl sha1 java/target/$(ROCKSDB_JAR_ALL) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAR_ALL).sha1

rocksdbjavastaticdockerx86:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x86-be --platform linux/386 --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos6_x86-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh

rocksdbjavastaticdockerx86_64:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x64-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos6_x64-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh

rocksdbjavastaticdockerppc64le:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_ppc64le-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos7_ppc64le-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh

rocksdbjavastaticdockerarm64v8:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_arm64v8-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos7_arm64v8-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh

rocksdbjavastaticdockers390x:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_s390x-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:ubuntu18_s390x-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh

rocksdbjavastaticdockerx86musl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x86-musl-be --platform linux/386 --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_x86-be /rocksdb-host/java/crossbuild/docker-build-linux-alpine.sh

rocksdbjavastaticdockerx86_64musl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x64-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_x64-be /rocksdb-host/java/crossbuild/docker-build-linux-alpine.sh

rocksdbjavastaticdockerppc64lemusl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_ppc64le-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_ppc64le-be /rocksdb-host/java/crossbuild/docker-build-linux-alpine.sh

rocksdbjavastaticdockerarm64v8musl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_arm64v8-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_arm64v8-be /rocksdb-host/java/crossbuild/docker-build-linux-alpine.sh

rocksdbjavastaticdockers390xmusl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_s390x-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_s390x-be /rocksdb-host/java/crossbuild/docker-build-linux-alpine.sh

rocksdbjavastaticpublish: rocksdbjavastaticrelease rocksdbjavastaticpublishcentral

rocksdbjavastaticpublishdocker: rocksdbjavastaticreleasedocker rocksdbjavastaticpublishcentral

ROCKSDB_JAVA_RELEASE_CLASSIFIERS = javadoc sources linux64 linux32 linux64-musl linux32-musl osx win64

rocksdbjavastaticpublishcentral: rocksdbjavageneratepom
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/pom.xml -Dfile=java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar
	$(foreach classifier, $(ROCKSDB_JAVA_RELEASE_CLASSIFIERS), mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/pom.xml -Dfile=java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar -Dclassifier=$(classifier);)

rocksdbjavageneratepom:
	cd java; cat pom.xml.template | sed 's/\$${ROCKSDB_JAVA_VERSION}/$(ROCKSDB_JAVA_VERSION)/' > pom.xml

rocksdbjavastaticnexusbundlejar: rocksdbjavageneratepom
	openssl sha1 -r java/pom.xml | awk '{ print $$1 }' > java/target/pom.xml.sha1
	openssl sha1 -r java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar | awk '{ print $$1 }' > java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar.sha1
	$(foreach classifier, $(ROCKSDB_JAVA_RELEASE_CLASSIFIERS), openssl sha1 -r java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar | awk '{ print $$1 }' > java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar.sha1;)
	gpg --yes --output java/target/pom.xml.asc -ab java/pom.xml
	gpg --yes -ab java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar
	$(foreach classifier, $(ROCKSDB_JAVA_RELEASE_CLASSIFIERS), gpg --yes -ab java/target/rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar;)
	$(JAR_CMD) cvf java/target/nexus-bundle-rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar -C java pom.xml -C java/target pom.xml.sha1 -C java/target pom.xml.asc -C java/target rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar -C java/target rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar.sha1 -C java/target rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar.asc
	$(foreach classifier, $(ROCKSDB_JAVA_RELEASE_CLASSIFIERS), $(JAR_CMD) uf java/target/nexus-bundle-rocksdbjni-$(ROCKSDB_JAVA_VERSION).jar -C java/target rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar -C java/target rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar.sha1 -C java/target rocksdbjni-$(ROCKSDB_JAVA_VERSION)-$(classifier).jar.asc;)

# A version of each $(LIBOBJECTS) compiled with -fPIC
jl/%.o: %.cc
	$(AM_V_CC)mkdir -p $(@D) && $(CXX) $(CXXFLAGS) -fPIC -c $< -o $@ $(COVERAGEFLAGS)

rocksdbjava: $(LIB_OBJECTS)
ifeq ($(JAVA_HOME),)
	$(error JAVA_HOME is not set)
endif
	$(AM_V_GEN)cd java; $(MAKE) javalib;
	$(AM_V_at)rm -f ./java/target/$(ROCKSDBJNILIB)
	$(AM_V_at)$(CXX) $(CXXFLAGS) -I./java/. -I./java/rocksjni $(JAVA_INCLUDE) $(ROCKSDB_PLUGIN_JNI_CXX_INCLUDEFLAGS) -shared -fPIC -o ./java/target/$(ROCKSDBJNILIB) $(ALL_JNI_NATIVE_SOURCES) $(LIB_OBJECTS) $(JAVA_LDFLAGS) $(COVERAGEFLAGS)
	$(AM_V_at)cd java; $(JAR_CMD) -cf target/$(ROCKSDB_JAR) HISTORY*.md
	$(AM_V_at)cd java/target; $(JAR_CMD) -uf $(ROCKSDB_JAR) $(ROCKSDBJNILIB)
	$(AM_V_at)cd java/target/classes; $(JAR_CMD) -uf ../$(ROCKSDB_JAR) org/rocksdb/*.class org/rocksdb/util/*.class
	$(AM_V_at)openssl sha1 java/target/$(ROCKSDB_JAR) | sed 's/.*= \([0-9a-f]*\)/\1/' > java/target/$(ROCKSDB_JAR).sha1

jclean:
	cd java; $(MAKE) clean;

jtest_compile: rocksdbjava
	cd java; $(MAKE) java_test

jtest_run:
	cd java; $(MAKE) run_test

jtest: rocksdbjava
	cd java; $(MAKE) sample test

jdb_bench:
	cd java; $(MAKE) db_bench;

commit_prereq:
	echo "TODO: bring this back using parts of old precommit_checker.py and rocksdb-lego-determinator"
	false # J=$(J) build_tools/precommit_checker.py unit clang_unit release clang_release tsan asan ubsan lite unit_non_shm
	# $(MAKE) clean && $(MAKE) jclean && $(MAKE) rocksdbjava;
A build option to run through all check-in requirements.
Summary: Make it easier for people to run all the tests.
Test Plan: Run it.
Reviewers: rven, yhchiang, igor, MarkCallaghan, IslamAbdelRahman, igor.sugak, anthony, kradhakrishnan, meyering
Reviewed By: meyering
Subscribers: meyering, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D35319
10 years ago
Meta-internal folly integration with F14FastMap (#9546)
Summary:
Especially after updating to C++17, I don't see a compelling case for
*requiring* any folly components in RocksDB. I was able to purge the existing
hard dependencies, and it can be quite difficult to strip out non-trivial components
from folly for use in RocksDB. (The prospect of doing that on F14 has changed
my mind on the best approach here.)
But this change creates an optional integration where we can plug in
components from folly at compile time, starting here with F14FastMap to replace
std::unordered_map when possible (probably no public APIs for example). I have
replaced the biggest CPU users of std::unordered_map with compile-time
pluggable UnorderedMap which will use F14FastMap when USE_FOLLY is set.
USE_FOLLY is always set in the Meta-internal buck build, and a simulation of
that is in the Makefile for public CI testing. A full folly build is not needed, but
checking out the full folly repo is much simpler for getting the dependency,
and anything else we might want to optionally integrate in the future.
Some picky details:
* I don't think the distributed mutex stuff is actually used, so it was easy to remove.
* I implemented an alternative to `folly::constexpr_log2` (which is much easier
in C++17 than C++11) so that I could pull out the hard dependencies on
`ConstexprMath.h`
* I had to add noexcept move constructors/operators to some types to make
F14's complainUnlessNothrowMoveAndDestroy check happy, and I added a
macro to make that easier in some common cases.
* Updated Meta-internal buck build to use folly F14Map (always)
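Two of the details above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `ConstexprFloorLog2` and `PinnedEntry` are hypothetical names, not the helper or the types touched by this change, and the loop-based form relies on the relaxed constexpr rules available since C++14 (so also under C++17). The second half shows the kind of nothrow-move property a container check such as F14's could require of an element type.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a constexpr log2 helper. Returns floor(log2(v))
// for v >= 1; returns 0 for v == 0 as an arbitrary choice for this sketch.
constexpr int ConstexprFloorLog2(std::uint64_t v) {
  int rv = 0;
  while (v > 1) {
    v >>= 1;
    ++rv;
  }
  return rv;
}

static_assert(ConstexprFloorLog2(1) == 0, "log2(1) is 0");
static_assert(ConstexprFloorLog2(8) == 3, "log2(8) is 3");
static_assert(ConstexprFloorLog2(1000) == 9, "floor(log2(1000)) is 9");

// Hypothetical element type with explicitly noexcept move operations.
struct PinnedEntry {
  PinnedEntry() = default;
  PinnedEntry(PinnedEntry&&) noexcept = default;
  PinnedEntry& operator=(PinnedEntry&&) noexcept = default;
  std::uint64_t offset = 0;
};

static_assert(std::is_nothrow_move_constructible<PinnedEntry>::value,
              "move construction must not throw");
static_assert(std::is_nothrow_move_assignable<PinnedEntry>::value,
              "move assignment must not throw");
```

Because everything is checked with `static_assert`, a mistake shows up as a compile error rather than at run time.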
No updates to HISTORY.md nor INSTALL.md as this is not (yet?) considered a
production integration for open source users.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9546
Test Plan:
CircleCI tests updated so that a couple of them use folly.
Most internal unit & stress/crash tests updated to use Meta-internal latest folly.
(Note: they should probably use buck but they currently use Makefile.)
Example performance improvement: when filter partitions are pinned in cache,
they are tracked by PartitionedFilterBlockReader::filter_map_ and we can build
a test that exercises that heavily. Build DB with
```
TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -partition_index_and_filters
```
and test with (simultaneous runs with & without folly, ~20 times each to see
convergence)
```
TEST_TMPDIR=/dev/shm/rocksdb ./db_bench_folly -readonly -use_existing_db -benchmarks=readrandom -num=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -partition_index_and_filters -duration=40 -pin_l0_filter_and_index_blocks_in_cache
```
Average ops/s no folly: 26229.2
Average ops/s with folly: 26853.3 (+2.4%)
Reviewed By: ajkr
Differential Revision: D34181736
Pulled By: pdillinger
fbshipit-source-id: ffa6ad5104c2880321d8a1aa7187e00ab0d02e94
3 years ago
# For public CI runs, checkout folly in a way that can build with RocksDB.
# This is mostly intended as a test-only simulation of Meta-internal folly
# integration.
checkout_folly:
	if [ -e third-party/folly ]; then \
	  cd third-party/folly && ${GIT_COMMAND} fetch origin; \
	else \
	  cd third-party && ${GIT_COMMAND} clone https://github.com/facebook/folly.git; \
	fi
	@# Pin to a particular version for public CI, so that PR authors don't
	@# need to worry about folly breaking our integration. Update periodically
	cd third-party/folly && git reset --hard beacd86d63cd71c904632262e6c36f60874d78ba
	@# A hack to remove boost dependency.
	@# NOTE: this hack is only needed if building using USE_FOLLY_LITE
	perl -pi -e 's/^(#include <boost)/\/\/$$1/' third-party/folly/folly/functional/Invoke.h
	@# NOTE: this hack is required for clang in some cases
	perl -pi -e 's/int rv = syscall/int rv = (int)syscall/' third-party/folly/folly/detail/Futex.cpp
	@# NOTE: this hack is required for gcc in some cases
	perl -pi -e 's/(__has_include.<experimental.memory_resource>.)/__cpp_rtti && $$1/' third-party/folly/folly/memory/MemoryResource.h
Simplify detection of x86 CPU features (#11419)
Summary:
**Background** - runtime detection of certain x86 CPU features was added for optimizing CRC32c checksums, where performance is dramatically affected by the availability of certain CPU instructions and code using intrinsics for those instructions. And Java builds with native library try to be broadly compatible but performant.
What has changed is that CRC32c is no longer the most efficient checksum on contemporary x86_64 hardware, nor the default checksum. XXH3 is generally faster and not as dramatically impacted by the availability of certain CPU instructions. For example, on my Skylake system using db_bench (similar on an older Skylake system without AVX512):
PORTABLE=1 empty USE_SSE : xxh3->8 GB/s crc32c->0.8 GB/s (no SSE4.2 nor AVX2 instructions)
PORTABLE=1 USE_SSE=1 : xxh3->19 GB/s crc32c->16 GB/s (with SSE4.2 and AVX2)
PORTABLE=0 USE_SSE ignored: xxh3->28 GB/s crc32c->16 GB/s (also some AVX512)
Testing a ~10 year old system, with SSE4.2 but without AVX2, crc32c is a similar speed to the new systems but xxh3 is only about half that speed, also 8 GB/s like the non-AVX2 compile above. Given that xxh3 has specific optimization for AVX2, I think we can infer that crc32c is only fastest for that ~2008-2013 period when SSE4.2 was included but not AVX2. And given that xxh3 is only about 2x slower on these systems (not like >10x slower for unoptimized crc32c), I don't think we need to invest too much in optimally adapting to these old cases.
x86 hardware that doesn't support fast CRC32c is now extremely rare, so requiring a custom build to support such hardware is fine IMHO.
**This change** does two related things:
* Remove runtime CPU detection for optimizing CRC32c on x86. Maintaining this code is non-zero work, and compiling special code that doesn't work on the configured target instruction set for code generation is always dubious. (On the one hand we have to ensure the CRC32c code uses SSE4.2 but on the other hand we have to ensure nothing else does.)
* Detect CPU features in source code, not in build scripts. Although there are some hypothetical advantages to detecting in build scripts (compiler generality), RocksDB supports at least three build systems: make, cmake, and buck. It's not practical to support feature detection on all three, and we have suffered from missed optimization opportunities by relying on missing or incomplete detection in cmake and buck. We also depend on some components like xxhash that do source code detection anyway.
**In more detail:**
* `HAVE_SSE42`, `HAVE_AVX2`, and `HAVE_PCLMUL` replaced by standard macros `__SSE4_2__`, `__AVX2__`, and `__PCLMUL__`.
* MSVC does not provide high fidelity defines for SSE, PCLMUL, or POPCNT, but we can infer those from `__AVX__` or `__AVX2__` in a compatibility header. In rare cases of false negative or false positive feature detection, a build engineer should be able to set defines to work around the issue.
* `__POPCNT__` is another standard define, but we happen to only need it on MSVC, where it is set by that compatibility header, or can be set by the build engineer.
* `PORTABLE` can be set to a CPU type, e.g. "haswell", to compile for that CPU type.
* `USE_SSE` is deprecated, now equivalent to PORTABLE=haswell, which roughly approximates its old behavior.
Notably, this change should enable more builds to use the AVX2-optimized Bloom filter implementation.
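As a rough sketch of what detecting CPU features in source code looks like, the snippet below branches on the compiler-provided macros named above instead of build-script defines. The constant and function names (`kHasSse42`, `kHasAvx2`, `Crc32cKernelName`, `BloomKernelName`) and the returned labels are hypothetical and only for illustration; they are not RocksDB identifiers.

```cpp
// Compile-time feature flags derived from standard compiler macros.
// With GCC/Clang these are set by flags such as -msse4.2, -mavx2,
// or -march=haswell; no build-script detection is involved.
#if defined(__SSE4_2__)
constexpr bool kHasSse42 = true;
#else
constexpr bool kHasSse42 = false;
#endif

#if defined(__AVX2__)
constexpr bool kHasAvx2 = true;
#else
constexpr bool kHasAvx2 = false;
#endif

// Hypothetical selectors: report which code path a build would compile in.
constexpr const char* Crc32cKernelName() {
  return kHasSse42 ? "sse4.2" : "portable";
}

constexpr const char* BloomKernelName() {
  return kHasAvx2 ? "avx2" : "portable";
}
```

Compiling the same file with and without `-march=haswell` changes which labels these functions return, which is the effect the `PORTABLE=haswell` setting described above is meant to have on the real code paths.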
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11419
Test Plan:
existing tests, CI
Manual performance tests after the change match the before above (none expected with make build).
We also see AVX2 optimized Bloom filter code enabled when expected, by injecting a compiler error. (Performance difference is not big on my current CPU.)
Reviewed By: ajkr
Differential Revision: D45489041
Pulled By: pdillinger
fbshipit-source-id: 60ceb0dd2aa3b365c99ed08a8b2a087a9abb6a70
2 years ago
CXX_M_FLAGS = $(filter -m%, $(CXXFLAGS))
build_folly:
	FOLLY_INST_PATH=`cd third-party/folly; $(PYTHON) build/fbcode_builder/getdeps.py show-inst-dir`; \
	if [ "$$FOLLY_INST_PATH" ]; then \
	  rm -rf $${FOLLY_INST_PATH}/../../*; \
	else \
	  echo "Please run checkout_folly first"; \
	  false; \
	fi
	# Restore the original version of Invoke.h with boost dependency
	cd third-party/folly && ${GIT_COMMAND} checkout folly/functional/Invoke.h
	cd third-party/folly && \
	  CXXFLAGS=" $(CXX_M_FLAGS) -DHAVE_CXX11_ATOMIC " $(PYTHON) build/fbcode_builder/getdeps.py build --no-tests
# ---------------------------------------------------------------------------
# Build size testing
# ---------------------------------------------------------------------------
REPORT_BUILD_STATISTIC ?= echo STATISTIC:
build_size:
	# === normal build, static ===
	$(MAKE) clean
	$(MAKE) static_lib
	$(REPORT_BUILD_STATISTIC) rocksdb.build_size.static_lib $$(stat --printf="%s" librocksdb.a)
	strip librocksdb.a
	$(REPORT_BUILD_STATISTIC) rocksdb.build_size.static_lib_stripped $$(stat --printf="%s" librocksdb.a)
	# === normal build, shared ===
	$(MAKE) clean
	$(MAKE) shared_lib
	$(REPORT_BUILD_STATISTIC) rocksdb.build_size.shared_lib $$(stat --printf="%s" `readlink -f librocksdb.so`)
	strip `readlink -f librocksdb.so`
	$(REPORT_BUILD_STATISTIC) rocksdb.build_size.shared_lib_stripped $$(stat --printf="%s" `readlink -f librocksdb.so`)
# ---------------------------------------------------------------------------
# Platform-specific compilation
# ---------------------------------------------------------------------------
ifeq ($(PLATFORM), IOS)
# For iOS, create universal object files to be used on both the simulator and
# a device.
XCODEROOT = $(shell xcode-select -print-path)
PLATFORMSROOT = $(XCODEROOT)/Platforms
SIMULATORROOT = $(PLATFORMSROOT)/iPhoneSimulator.platform/Developer
DEVICEROOT = $(PLATFORMSROOT)/iPhoneOS.platform/Developer
IOSVERSION = $(shell defaults read $(PLATFORMSROOT)/iPhoneOS.platform/version CFBundleShortVersionString)
.cc.o:
	mkdir -p ios-x86/$(dir $@)
	$(CXX) $(CXXFLAGS) -isysroot $(SIMULATORROOT)/SDKs/iPhoneSimulator$(IOSVERSION).sdk -arch i686 -arch x86_64 -c $< -o ios-x86/$@
	mkdir -p ios-arm/$(dir $@)
	xcrun -sdk iphoneos $(CXX) $(CXXFLAGS) -isysroot $(DEVICEROOT)/SDKs/iPhoneOS$(IOSVERSION).sdk -arch armv6 -arch armv7 -arch armv7s -arch arm64 -c $< -o ios-arm/$@
	lipo ios-x86/$@ ios-arm/$@ -create -output $@
.c.o:
	mkdir -p ios-x86/$(dir $@)
	$(CC) $(CFLAGS) -isysroot $(SIMULATORROOT)/SDKs/iPhoneSimulator$(IOSVERSION).sdk -arch i686 -arch x86_64 -c $< -o ios-x86/$@
	mkdir -p ios-arm/$(dir $@)
	xcrun -sdk iphoneos $(CC) $(CFLAGS) -isysroot $(DEVICEROOT)/SDKs/iPhoneOS$(IOSVERSION).sdk -arch armv6 -arch armv7 -arch armv7s -arch arm64 -c $< -o ios-arm/$@
	lipo ios-x86/$@ ios-arm/$@ -create -output $@
else
ifeq ($(HAVE_POWER8),1)
$(OBJ_DIR)/util/crc32c_ppc.o: util/crc32c_ppc.c
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
$(OBJ_DIR)/util/crc32c_ppc_asm.o: util/crc32c_ppc_asm.S
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
endif
$(OBJ_DIR)/%.o: %.cc
	$(AM_V_CC) mkdir -p $(@D) && $(CXX) $(CXXFLAGS) -c $< -o $@ $(COVERAGEFLAGS)
$(OBJ_DIR)/%.o: %.cpp
	$(AM_V_CC) mkdir -p $(@D) && $(CXX) $(CXXFLAGS) -c $< -o $@ $(COVERAGEFLAGS)
$(OBJ_DIR)/%.o: %.c
build: make "make" output readable by default
Summary:
With this change, make now prints a summary line for each
compiler and linker invocation, e.g.,:
CC db/builder.o
CC db/c.o
CC db/column_family.o
To see full commands, insert "V=1" into your make command.
E.g., run "make V=1 all" if you want it to print each command
in its full glory.
$^ is GNU make's abbreviation for the prerequisites of the current target.
These AM_V_... variables expand to some very short string like "CC" or
"LD", by default, so that the output of "make" is readable. If/when you
want more details, just build with "make V=1 ...", and make will print
each full command as it is executed. If you prefer to see the noise
all the time, and only want to optionally see the abbreviated output,
set AM_DEFAULT_VERBOSITY=1 in your environment, and then build with
V=0 to see the abbreviated command indicators.
Test Plan:
invoke make a few different ways and observe:
make clean; make # abbreviated
make clean; make V=0 # also abbreviated
make clean; make V=1 # full detail
Reviewers: sdong, ljin, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D33579
10 years ago
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
endif
# ---------------------------------------------------------------------------
#  Source files dependencies detection
# ---------------------------------------------------------------------------
# If skip dependencies is ON, skip including the dep files
ifneq ($(SKIP_DEPENDS), 1)
DEPFILES = $(patsubst %.cc, $(OBJ_DIR)/%.cc.d, $(ALL_SOURCES))
DEPFILES += $(patsubst %.c, $(OBJ_DIR)/%.c.d, $(LIB_SOURCES_C) $(TEST_MAIN_SOURCES_C))
ifeq ($(USE_FOLLY_LITE),1)
DEPFILES += $(patsubst %.cpp, $(OBJ_DIR)/%.cpp.d, $(FOLLY_SOURCES))
endif
endif
# Add proper dependency support so changing a .h file forces a .cc file to
# rebuild.
# The .d file indicates .cc file's dependencies on .h files. We generate such
# dependency by g++'s -MM option, whose output is a make dependency rule.
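# Illustrative sketch (db/builder.cc, db/builder.h and OBJ_DIR=obj are assumed
# values, not taken from this file): the three -MT options used in the rules
# below name the .d file itself plus both spellings of the object file as
# targets, so the generated obj/db/builder.cc.d would start roughly like
#   obj/db/builder.cc.d db/builder.o obj/db/builder.o: db/builder.cc db/builder.h
# followed by the remaining headers. Editing any listed header then makes both
# the object file and the .d file out of date.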
$(OBJ_DIR)/%.cc.d: %.cc
	@mkdir -p $(@D) && $(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.cc=.o)' -MT'$(<:%.cc=$(OBJ_DIR)/%.o)' \
	  "$<" -o '$@'
$(OBJ_DIR)/%.cpp.d: %.cpp
	@mkdir -p $(@D) && $(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.cpp=.o)' -MT'$(<:%.cpp=$(OBJ_DIR)/%.o)' \
	  "$<" -o '$@'
ifeq ($(HAVE_POWER8),1)
DEPFILES_C = $(patsubst %.c, $(OBJ_DIR)/%.c.d, $(LIB_SOURCES_C))
DEPFILES_ASM = $(patsubst %.S, $(OBJ_DIR)/%.S.d, $(LIB_SOURCES_ASM))
$(OBJ_DIR)/%.c.d: %.c
	@$(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.c=.o)' "$<" -o '$@'
$(OBJ_DIR)/%.S.d: %.S
	@$(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.S=.o)' "$<" -o '$@'
$(DEPFILES_C): %.c.d
$(DEPFILES_ASM): %.S.d
depend: $(DEPFILES) $(DEPFILES_C) $(DEPFILES_ASM)
else
depend: $(DEPFILES)
endif
build_subset_tests: $(ROCKSDBTESTS_SUBSET)
	$(AM_V_GEN) if [ -n "$${ROCKSDBTESTS_SUBSET_TESTS_TO_FILE}" ]; then echo "$(ROCKSDBTESTS_SUBSET)" > "$${ROCKSDBTESTS_SUBSET_TESTS_TO_FILE}"; else echo "$(ROCKSDBTESTS_SUBSET)"; fi
list_all_tests:
	echo "$(ROCKSDBTESTS_SUBSET)"
# Remove the rules for which dependencies should not be generated and see if any are left.
#If so, include the dependencies; if not, do not include the dependency files
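# For example, assuming ordinary command lines: "make clean" leaves
# ROCKS_DEP_RULES empty, so the dependency files are not included, while
# "make clean db_test" leaves "db_test" in it, so they are included.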
ROCKS_DEP_RULES = $(filter-out clean format check-format check-buck-targets check-headers check-sources jclean jtest package analyze tags rocksdbjavastatic% unity.% unity_test checkout_folly, $(MAKECMDGOALS))
ifneq ("$(ROCKS_DEP_RULES)", "")
-include $(DEPFILES)
endif