# Copyright (c) 2011 The LevelDB Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file. See the AUTHORS file for names of contributors.
# Inherit some settings from environment variables, if available
#-----------------------------------------------
BASH_EXISTS := $(shell which bash)
SHELL := $(shell which bash)
PYTHON ?= $(shell which python)
CLEAN_FILES = # deliberately empty, so we can append below.
CFLAGS += ${EXTRA_CFLAGS}
CXXFLAGS += ${EXTRA_CXXFLAGS}
LDFLAGS += $(EXTRA_LDFLAGS)
MACHINE ?= $(shell uname -m)
ARFLAGS = ${EXTRA_ARFLAGS} rs
STRIPFLAGS = -S -x
# Transform parallel LOG output into something more readable.
perl_command = perl -n \
  -e '@a=split("\t",$$_,-1); $$t=$$a[8];' \
  -e '$$t =~ /.*if\s\[\[\s"(.*?\.[\w\/]+)/ and $$t=$$1;' \
  -e '$$t =~ s,^\./,,;' \
  -e '$$t =~ s, >.*,,; chomp $$t;' \
  -e '$$t =~ /.*--gtest_filter=(.*?\.[\w\/]+)/ and $$t=$$1;' \
  -e 'printf "%7.3f %s %s\n", $$a[3], $$a[6] == 0 ? "PASS" : "FAIL", $$t'
quoted_perl_command = $(subst ','\'',$(perl_command))
# DEBUG_LEVEL can have three values:
# * DEBUG_LEVEL=2; this is the ultimate debug mode. It will compile rocksdb
# without any optimizations. To compile with level 2, issue `make dbg`
# * DEBUG_LEVEL=1; debug level 1 enables all assertions and debug code, but
# compiles rocksdb with -O2 optimizations. this is the default debug level.
# `make all` or `make <binary_target>` compile RocksDB with debug level 1.
# We use this debug level when developing RocksDB.
# * DEBUG_LEVEL=0; this is the debug level we use for release. If you're
# running rocksdb in production you most definitely want to compile RocksDB
# with debug level 0. To compile with level 0, run `make shared_lib`,
# `make install-shared`, `make static_lib`, `make install-static` or
# `make install`
# Set the default DEBUG_LEVEL to 1
DEBUG_LEVEL ?= 1
ifeq ($(MAKECMDGOALS),dbg)
  DEBUG_LEVEL = 2
endif
ifeq ($(MAKECMDGOALS),clean)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),release)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),shared_lib)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),install-shared)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),static_lib)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),install-static)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),install)
  DEBUG_LEVEL = 0
endif
ifeq ($(MAKECMDGOALS),rocksdbjavastatic)
  ifneq ($(DEBUG_LEVEL),2)
    DEBUG_LEVEL = 0
  endif
endif
ifeq ($(MAKECMDGOALS),rocksdbjavastaticrelease)
  ifneq ($(DEBUG_LEVEL),2)
    DEBUG_LEVEL = 0
  endif
endif
ifeq ($(MAKECMDGOALS),rocksdbjavastaticreleasedocker)
  ifneq ($(DEBUG_LEVEL),2)
    DEBUG_LEVEL = 0
  endif
endif
ifeq ($(MAKECMDGOALS),rocksdbjavastaticpublish)
  DEBUG_LEVEL = 0
endif
$(info $$DEBUG_LEVEL is ${DEBUG_LEVEL})
# Lite build flag.
LITE ?= 0
ifeq ($(LITE),0)
  ifneq ($(filter -DROCKSDB_LITE,$(OPT)),)
    # Be backward compatible and support older format where OPT=-DROCKSDB_LITE is
    # specified instead of LITE=1 on the command line.
    LITE = 1
  endif
else ifeq ($(LITE),1)
  ifeq ($(filter -DROCKSDB_LITE,$(OPT)),)
    OPT += -DROCKSDB_LITE
  endif
endif
# Figure out optimize level.
ifneq ($(DEBUG_LEVEL),2)
  ifeq ($(LITE),0)
    OPT += -O2
  else
    OPT += -Os
  endif
endif
# compile with -O2 if debug level is not 2
ifneq ($(DEBUG_LEVEL),2)
  OPT += -fno-omit-frame-pointer
  # Skip for archs that don't support -momit-leaf-frame-pointer
  ifeq (,$(shell $(CXX) -fsyntax-only -momit-leaf-frame-pointer -xc /dev/null 2>&1))
    OPT += -momit-leaf-frame-pointer
  endif
endif
ifeq (,$(shell $(CXX) -fsyntax-only -maltivec -xc /dev/null 2>&1))
  CXXFLAGS += -DHAS_ALTIVEC
  CFLAGS += -DHAS_ALTIVEC
  HAS_ALTIVEC = 1
endif
ifeq (,$(shell $(CXX) -fsyntax-only -mcpu=power8 -xc /dev/null 2>&1))
  CXXFLAGS += -DHAVE_POWER8
  CFLAGS += -DHAVE_POWER8
  HAVE_POWER8 = 1
endif
ifeq (,$(shell $(CXX) -fsyntax-only -march=armv8-a+crc+crypto -xc /dev/null 2>&1))
  CXXFLAGS += -march=armv8-a+crc+crypto
  CFLAGS += -march=armv8-a+crc+crypto
  ARMCRC_SOURCE = 1
endif
# if we're compiling for release, compile without debug code (-DNDEBUG)
ifeq ($(DEBUG_LEVEL),0)
  OPT += -DNDEBUG
  ifneq ($(USE_RTTI),1)
    CXXFLAGS += -fno-rtti
  else
    CXXFLAGS += -DROCKSDB_USE_RTTI
  endif
else
  ifneq ($(USE_RTTI),0)
    CXXFLAGS += -DROCKSDB_USE_RTTI
  else
    CXXFLAGS += -fno-rtti
  endif
  $(warning Warning: Compiling in debug mode. Don't use the resulting binary in production)
endif
#-----------------------------------------------
include src.mk
AM_DEFAULT_VERBOSITY = 0
AM_V_GEN = $(am__v_GEN_$(V))
am__v_GEN_ = $(am__v_GEN_$(AM_DEFAULT_VERBOSITY))
am__v_GEN_0 = @echo " GEN " $@;
am__v_GEN_1 =
AM_V_at = $(am__v_at_$(V))
am__v_at_ = $(am__v_at_$(AM_DEFAULT_VERBOSITY))
am__v_at_0 = @
am__v_at_1 =
AM_V_CC = $(am__v_CC_$(V))
am__v_CC_ = $(am__v_CC_$(AM_DEFAULT_VERBOSITY))
am__v_CC_0 = @echo " CC " $@;
am__v_CC_1 =
CCLD = $(CC)
LINK = $(CCLD) $(AM_CFLAGS) $(CFLAGS) $(AM_LDFLAGS) $(LDFLAGS) -o $@
AM_V_CCLD = $(am__v_CCLD_$(V))
am__v_CCLD_ = $(am__v_CCLD_$(AM_DEFAULT_VERBOSITY))
am__v_CCLD_0 = @echo " CCLD " $@;
am__v_CCLD_1 =
AM_V_AR = $(am__v_AR_$(V))
am__v_AR_ = $(am__v_AR_$(AM_DEFAULT_VERBOSITY))
am__v_AR_0 = @echo " AR " $@;
am__v_AR_1 =
ifdef ROCKSDB_USE_LIBRADOS
  LIB_SOURCES += utilities/env_librados.cc
  LDFLAGS += -lrados
endif
AM_LINK = $(AM_V_CCLD)$(CXX) $^ $(EXEC_LDFLAGS) -o $@ $(LDFLAGS) $(COVERAGEFLAGS)
# detect what platform we're building on
dummy := $(shell (export ROCKSDB_ROOT="$(CURDIR)"; export PORTABLE="$(PORTABLE)"; "$(CURDIR)/build_tools/build_detect_platform" "$(CURDIR)/make_config.mk"))
# this file is generated by the previous line to set build flags and sources
include make_config.mk
export JAVAC_ARGS
CLEAN_FILES += make_config.mk
missing_make_config_paths := $(shell \
  grep "\./\S*\|/\S*" -o $(CURDIR)/make_config.mk | \
  while read path; \
    do [ -e $$path ] || echo $$path; \
  done | sort | uniq)
$(foreach path, $(missing_make_config_paths), \
  $(warning Warning: $(path) does not exist))
ifeq ($(PLATFORM),OS_AIX)
  # no debug info
else ifneq ($(PLATFORM),IOS)
  CFLAGS += -g
  CXXFLAGS += -g
else
  # no debug info for IOS, that will make our library big
  OPT += -DNDEBUG
endif
ifeq ($(PLATFORM),OS_AIX)
  ARFLAGS = -X64 rs
  STRIPFLAGS = -X64 -x
endif
ifeq ($(PLATFORM),OS_SOLARIS)
  PLATFORM_CXXFLAGS += -D _GLIBCXX_USE_C99
endif
ifneq ($(filter -DROCKSDB_LITE,$(OPT)),)
  # found
  CFLAGS += -fno-exceptions
  CXXFLAGS += -fno-exceptions
  # LUA is not supported under ROCKSDB_LITE
  LUA_PATH =
endif
# ASAN doesn't work well with jemalloc. If we're compiling with ASAN, we should use regular malloc.
ifdef COMPILE_WITH_ASAN
  DISABLE_JEMALLOC = 1
  EXEC_LDFLAGS += -fsanitize=address
  PLATFORM_CCFLAGS += -fsanitize=address
  PLATFORM_CXXFLAGS += -fsanitize=address
endif
# TSAN doesn't work well with jemalloc. If we're compiling with TSAN, we should use regular malloc.
ifdef COMPILE_WITH_TSAN
  DISABLE_JEMALLOC = 1
  EXEC_LDFLAGS += -fsanitize=thread
Fix TSAN failures in DistributedMutex tests (#5684)
Summary:
TSAN was not able to correctly instrument atomic bts and btr instructions, so
when TSAN is enabled implement those with std::atomic::fetch_or and
std::atomic::fetch_and. Also disable tests that fail on TSAN with false
negatives (we know these are false negatives because this other verifiably
correct program fails with the same TSAN error <link>)
```
make clean
TEST_TMPDIR=/dev/shm/rocksdb OPT=-g COMPILE_WITH_TSAN=1 make J=1 -j56 folly_synchronization_distributed_mutex_test
```
This is the code that fails with the same false-negative with TSAN
```
namespace {
class ExceptionWithConstructionTrack : public std::exception {
 public:
  explicit ExceptionWithConstructionTrack(int id)
      : id_{folly::to<std::string>(id)}, constructionTrack_{id} {}
  const char* what() const noexcept override {
    return id_.c_str();
  }
 private:
  std::string id_;
  TestConstruction constructionTrack_;
};
template <typename Storage, typename Atomic>
void transferCurrentException(Storage& storage, Atomic& produced) {
  assert(std::current_exception());
  new (&storage) std::exception_ptr(std::current_exception());
  produced->store(true, std::memory_order_release);
}
void concurrentExceptionPropagationStress(
    int numThreads,
    std::chrono::milliseconds milliseconds) {
  auto&& stop = std::atomic<bool>{false};
  auto&& exceptions = std::vector<std::aligned_storage<48, 8>::type>{};
  auto&& produced = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumed = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumers = std::vector<std::thread>{};
  for (auto i = 0; i < numThreads; ++i) {
    produced.emplace_back(new std::atomic<bool>{false});
    consumed.emplace_back(new std::atomic<bool>{false});
    exceptions.push_back({});
  }
  auto producer = std::thread{[&]() {
    auto counter = std::vector<int>(numThreads, 0);
    for (auto i = 0; true; i = ((i + 1) % numThreads)) {
      try {
        throw ExceptionWithConstructionTrack{counter.at(i)++};
      } catch (...) {
        transferCurrentException(exceptions.at(i), produced.at(i));
      }
      while (!consumed.at(i)->load(std::memory_order_acquire)) {
        if (stop.load(std::memory_order_acquire)) {
          return;
        }
      }
      consumed.at(i)->store(false, std::memory_order_release);
    }
  }};
  for (auto i = 0; i < numThreads; ++i) {
    consumers.emplace_back([&, i]() {
      auto counter = 0;
      while (true) {
        while (!produced.at(i)->load(std::memory_order_acquire)) {
          if (stop.load(std::memory_order_acquire)) {
            return;
          }
        }
        produced.at(i)->store(false, std::memory_order_release);
        try {
          auto storage = &exceptions.at(i);
          auto exc = folly::launder(
              reinterpret_cast<std::exception_ptr*>(storage));
          auto copy = std::move(*exc);
          exc->std::exception_ptr::~exception_ptr();
          std::rethrow_exception(std::move(copy));
        } catch (std::exception& exc) {
          auto value = std::stoi(exc.what());
          EXPECT_EQ(value, counter++);
        }
        consumed.at(i)->store(true, std::memory_order_release);
      }
    });
  }
  std::this_thread::sleep_for(milliseconds);
  stop.store(true);
  producer.join();
  for (auto& thread : consumers) {
    thread.join();
  }
}
} // namespace
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5684
Differential Revision: D16746077
Pulled By: miasantreble
fbshipit-source-id: 8af88dcf9161c05daec1a76290f577918638f79d
PLATFORM_CCFLAGS += -fsanitize=thread -fPIC -DFOLLY_SANITIZE_THREAD
PLATFORM_CXXFLAGS += -fsanitize=thread -fPIC -DFOLLY_SANITIZE_THREAD
# Turn off -pg when enabling TSAN testing, because that induces
# a link failure. TODO: find the root cause
PROFILING_FLAGS =
# LUA is not supported under TSAN
LUA_PATH =
# Limit keys for crash test under TSAN to avoid error:
# "ThreadSanitizer: DenseSlabAllocator overflow. Dying."
CRASH_TEST_EXT_ARGS += --max_key=1000000
endif
# AIX doesn't work with -pg
ifeq ($(PLATFORM), OS_AIX)
PROFILING_FLAGS =
endif
# UBSAN doesn't work well with jemalloc. If we're compiling with UBSAN, we should use regular malloc.
ifdef COMPILE_WITH_UBSAN
DISABLE_JEMALLOC = 1
# Suppress alignment warning because murmurhash relies on casting unaligned
# memory to integer. Fixing it may cause performance regression. 3-way crc32
# relies on it too, although it can be rewritten to eliminate with minimal
# performance regression.
EXEC_LDFLAGS += -fsanitize=undefined -fno-sanitize-recover=all
PLATFORM_CCFLAGS += -fsanitize=undefined -fno-sanitize-recover=all -DROCKSDB_UBSAN_RUN
PLATFORM_CXXFLAGS += -fsanitize=undefined -fno-sanitize-recover=all -DROCKSDB_UBSAN_RUN
endif
ifdef ROCKSDB_VALGRIND_RUN
PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN
PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN
endif
ifndef DISABLE_JEMALLOC
ifdef JEMALLOC
PLATFORM_CXXFLAGS += -DROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE
PLATFORM_CCFLAGS += -DROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE
endif
ifdef WITH_JEMALLOC_FLAG
PLATFORM_LDFLAGS += -ljemalloc
JAVA_LDFLAGS += -ljemalloc
endif
EXEC_LDFLAGS := $(JEMALLOC_LIB) $(EXEC_LDFLAGS)
PLATFORM_CXXFLAGS += $(JEMALLOC_INCLUDE)
PLATFORM_CCFLAGS += $(JEMALLOC_INCLUDE)
endif
ifndef USE_FOLLY_DISTRIBUTED_MUTEX
USE_FOLLY_DISTRIBUTED_MUTEX = 0
endif
export GTEST_THROW_ON_FAILURE = 1
export GTEST_HAS_EXCEPTIONS = 1
GTEST_DIR = ./third-party/gtest-1.8.1/fused-src
# AIX: pre-defined system headers are surrounded by an extern "C" block
ifeq ($(PLATFORM), OS_AIX)
PLATFORM_CCFLAGS += -I$(GTEST_DIR)
PLATFORM_CXXFLAGS += -I$(GTEST_DIR)
else
PLATFORM_CCFLAGS += -isystem $(GTEST_DIR)
PLATFORM_CXXFLAGS += -isystem $(GTEST_DIR)
endif
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX), 1)
FOLLY_DIR = ./third-party/folly
# AIX: pre-defined system headers are surrounded by an extern "C" block
ifeq ($(PLATFORM), OS_AIX)
PLATFORM_CCFLAGS += -I$(FOLLY_DIR)
PLATFORM_CXXFLAGS += -I$(FOLLY_DIR)
else
PLATFORM_CCFLAGS += -isystem $(FOLLY_DIR)
PLATFORM_CXXFLAGS += -isystem $(FOLLY_DIR)
endif
endif
ifdef TEST_CACHE_LINE_SIZE
PLATFORM_CCFLAGS += -DTEST_CACHE_LINE_SIZE=$(TEST_CACHE_LINE_SIZE)
PLATFORM_CXXFLAGS += -DTEST_CACHE_LINE_SIZE=$(TEST_CACHE_LINE_SIZE)
endif
# This (the first rule) must depend on "all".
default: all
WARNING_FLAGS = -W -Wextra -Wall -Wsign-compare -Wshadow \
-Wunused-parameter
ifeq ($(PLATFORM), OS_OPENBSD)
WARNING_FLAGS += -Wno-unused-lambda-capture
endif
ifndef DISABLE_WARNING_AS_ERROR
WARNING_FLAGS += -Werror
endif
ifdef LUA_PATH
ifndef LUA_INCLUDE
LUA_INCLUDE = $(LUA_PATH)/include
endif
LUA_INCLUDE_FILE = $(LUA_INCLUDE)/lualib.h
ifeq ("$(wildcard $(LUA_INCLUDE_FILE))", "")
# LUA_INCLUDE_FILE does not exist
$(error Cannot find lualib.h under $(LUA_INCLUDE). Try to specify both LUA_PATH and LUA_INCLUDE manually)
endif
LUA_FLAGS = -I$(LUA_INCLUDE) -DLUA -DLUA_COMPAT_ALL
CFLAGS += $(LUA_FLAGS)
CXXFLAGS += $(LUA_FLAGS)
ifndef LUA_LIB
LUA_LIB = $(LUA_PATH)/lib/liblua.a
endif
ifeq ("$(wildcard $(LUA_LIB))", "") # LUA_LIB does not exist
$(error $(LUA_LIB) does not exist. Try to specify both LUA_PATH and LUA_LIB manually)
endif
EXEC_LDFLAGS += $(LUA_LIB)
endif
Port 3 way SSE4.2 crc32c implementation from Folly
Summary:
**# Summary**
RocksDB uses SSE crc32 intrinsics to calculate crc32 values, but it does so in a single-way fashion (not pipelined on a single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics and then uses the pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead of its own, this algorithm shows perf gains only on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB calls crc32c on are over 4KB. Initial db_bench runs show a substantial CPU gain.
This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag, NO_THREEWAY_CRC32C. If the user compiles the code with NO_THREEWAY_CRC32C=1, the old SSE Crc32c algorithm is used. If the server does not have SSE4.2 at run time, the slow (non-SSE) path is used.
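For reference, all three paths described above compute the same CRC-32C (Castagnoli) checksum and differ only in speed. The following is a minimal bitwise sketch of that checksum, not the code added by this change: the name `Crc32cPortable` is hypothetical, and a table-driven or SSE4.2 implementation would replace the inner loop in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C (Castagnoli) using the reflected polynomial 0x82F63B78.
// Hypothetical helper for illustration only; it processes one bit per
// iteration and is far slower than any SSE4.2-based path.
inline uint32_t Crc32cPortable(const void* data, std::size_t n) {
  assert(data != nullptr || n == 0);
  const unsigned char* p = static_cast<const unsigned char*>(data);
  uint32_t crc = 0xFFFFFFFFu;  // standard initial value
  for (std::size_t i = 0; i < n; ++i) {
    crc ^= p[i];
    for (int bit = 0; bit < 8; ++bit) {
      // If the low bit is set, shift and fold in the polynomial;
      // otherwise just shift. The mask is all-ones or all-zeros.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;  // standard final inversion
}
```

Calling `Crc32cPortable("123456789", 9)` yields the well-known CRC-32C check value `0xE3069283`, which is a convenient way to cross-check a faster implementation against this one.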
**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.
Before this change, the CRC32 computation took about 11.5% of total CPU cost; with the new 3-way algorithm it drops to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.
1) ReadRandom in db_bench overall metrics
PER RUN
Algorithm  | run | micros/op | ops/sec | Throughput (MB/s)
3-way      | 1   | 4.143     | 241387  | 26.7
3-way      | 2   | 3.775     | 264872  | 29.3
3-way      | 3   | 4.116     | 242929  | 26.9
FastCrc32c | 1   | 4.037     | 247727  | 27.4
FastCrc32c | 2   | 4.648     | 215166  | 23.8
FastCrc32c | 3   | 4.352     | 229799  | 25.4
AVG
Algorithm  | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
3-way      | 4.01                 | 249,729            | 27.63
FastCrc32c | 4.35                 | 230,897            | 25.53
2) Crc32c computation CPU cost (inclusive samples percentage)
PER RUN
Implementation | run | TotalSamples  | Crc32c percentage
3-way          | 1   | 4,572,250,000 | 4.37%
3-way          | 2   | 3,779,250,000 | 4.62%
3-way          | 3   | 4,129,500,000 | 4.48%
FastCrc32c     | 1   | 4,663,500,000 | 11.24%
FastCrc32c     | 2   | 4,047,500,000 | 12.34%
FastCrc32c     | 3   | 4,366,750,000 | 11.68%
**# Test Plan**
make -j64 corruption_test && ./corruption_test
By default it uses 3-way SSE algorithm
NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test
make clean && DEBUG_LEVEL=0 make -j64 db_bench
make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173
Differential Revision: D6330882
Pulled By: yingsu00
fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
ifeq ($(NO_THREEWAY_CRC32C), 1)
CXXFLAGS += -DNO_THREEWAY_CRC32C
endif
CFLAGS += $(WARNING_FLAGS) -I. -I./include $(PLATFORM_CCFLAGS) $(OPT)
CXXFLAGS += $(WARNING_FLAGS) -I. -I./include $(PLATFORM_CXXFLAGS) $(OPT) -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers
LDFLAGS += $(PLATFORM_LDFLAGS)
# If NO_UPDATE_BUILD_VERSION is set we don't update util/build_version.cc, but
# the file needs to already exist or else the build will fail
ifndef NO_UPDATE_BUILD_VERSION
build: do not relink every single binary just for a timestamp
Summary:
Prior to this change, "make check" would always waste a lot of
time relinking 60+ binaries. With this change, it does that
only when the generated file, util/build_version.cc, changes,
and that happens only when the date changes or when the
current git SHA changes.
This change makes some other improvements: before, there was no
rule to build a deleted util/build_version.cc. If it was somehow
removed, any attempt to link a program would fail.
There is no longer any need for the separate file,
build_tools/build_detect_version. Its functionality is
now in the Makefile.
* Makefile (DEPFILES): Don't filter-out util/build_version.cc.
No need, and besides, removing that dependency was wrong.
(date, git_sha, gen_build_version): New helper variables.
(util/build_version.cc): New rule, to create this file
and update it only if it would contain new information.
* build_tools/build_detect_version: Remove file.
* db/db_impl.cc: Now, print only date (not the time).
* util/build_version.h (rocksdb_build_compile_time): Remove
declaration. No longer used.
Test Plan:
- Run "make check" twice, and note that the second time no linking is performed.
- Remove util/build_version.cc and ensure that any "make"
command regenerates it before doing anything else.
- Run this: strings librocksdb.a|grep _build_.
That prints output including the following:
rocksdb_build_git_date:2015-02-19
rocksdb_build_git_sha:2.8.fb-1792-g3cb6cc0
Reviewers: ljin, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D33591
date := $(shell date +%F)
ifdef FORCE_GIT_SHA
git_sha := $(FORCE_GIT_SHA)
else
git_sha := $(shell git rev-parse HEAD 2>/dev/null)
endif
gen_build_version = sed -e s/@@GIT_SHA@@/$(git_sha)/ -e s/@@GIT_DATE_TIME@@/$(date)/ util/build_version.cc.in
# Record the version of the source that we are compiling.
# We keep a record of the git revision in this file. It is then built
# as a regular source file as part of the compilation process.
# One can run "strings executable_filename | grep _build_" to find
# the version of the source that we used to build the executable file.
build: always attempt to update util/build_version.cc
Summary:
This fixes two bugs: "make clean" would never remove the generated
file, util/build_version.cc, and, since D33591, that file would be
regenerated only if it were absent.
* Makefile (clean): Remove the generated file.
(util/build_version.cc): Depend on the no-prereq FORCE target,
so that this target's rules are always run.
Since this is a generated file, make it read-only.
Also, be sure to remove the temporary file when it is the same
as the original.
Test Plan:
Ensure that we attempt regeneration every time.
Make it empty with an up-to-date time stamp and demonstrate
that it is rebuilt with the expected content:
$ : > util/build_version.cc
$ make util/build_version.o
GEN util/build_version.cc
GEN util/build_version.d
GEN util/build_version.cc
CC util/build_version.o
$ cat util/build_version.cc
#include "build_version.h"
const char* rocksdb_build_git_sha = "rocksdb_build_git_sha:v3.10-2-gb30e72a";
const char* rocksdb_build_git_date = "rocksdb_build_git_date:2015-03-27";
const char* rocksdb_build_compile_date = __DATE__;
Reviewers: igor.sugak, sdong, ljin, igor, rven
Reviewed By: rven
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D36087
FORCE:
util/build_version.cc: FORCE
	$(AM_V_GEN)rm -f $@-t
	$(AM_V_at)$(gen_build_version) > $@-t
	$(AM_V_at)if test -f $@; then \
	  cmp -s $@-t $@ && rm -f $@-t || mv -f $@-t $@; \
	else mv -f $@-t $@; fi
endif
LIBOBJECTS = $(LIB_SOURCES:.cc=.o)
ifeq ($(HAVE_POWER8), 1)
LIB_CC_OBJECTS = $(LIB_SOURCES:.cc=.o)
LIBOBJECTS += $(LIB_SOURCES_C:.c=.o)
LIBOBJECTS += $(LIB_SOURCES_ASM:.S=.o)
else
LIB_CC_OBJECTS = $(LIB_SOURCES:.cc=.o)
endif
LIBOBJECTS += $(TOOL_LIB_SOURCES:.cc=.o)
MOCKOBJECTS = $(MOCK_LIB_SOURCES:.cc=.o)
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX), 1)
FOLLYOBJECTS = $(FOLLY_SOURCES:.cpp=.o)
endif
GTEST = $(GTEST_DIR)/gtest/gtest-all.o
TESTUTIL = ./test_util/testutil.o
TESTHARNESS = ./test_util/testharness.o $(TESTUTIL) $(MOCKOBJECTS) $(GTEST)
VALGRIND_ERROR = 2
VALGRIND_VER := $(join $(VALGRIND_VER),valgrind)
VALGRIND_OPTS = --error-exitcode=$(VALGRIND_ERROR) --leak-check=full
BENCHTOOLOBJECTS = $(BENCH_LIB_SOURCES:.cc=.o) $(LIBOBJECTS) $(TESTUTIL)
ANALYZETOOLOBJECTS = $(ANALYZER_LIB_SOURCES:.cc=.o)
STRESSTOOLOBJECTS = $(STRESS_LIB_SOURCES:.cc=.o) $(LIBOBJECTS) $(TESTUTIL)
EXPOBJECTS = $(LIBOBJECTS) $(TESTUTIL)
TESTS = \
db_basic_test \
db_with_timestamp_basic_test \
db_encryption_test \
db_test2 \
external_sst_file_basic_test \
auto_roll_logger_test \
bloom_test \
dynamic_bloom_test \
c_test \
checkpoint_test \
crc32c_test \
coding_test \
inlineskiplist_test \
env_basic_test \
env_test \
env_logger_test \
io_posix_test \
hash_test \
random_test \
thread_local_test \
work_queue_test \
rate_limiter_test \
perf_context_test \
iostats_context_test \
db_wal_test \
db_block_cache_test \
db_test \
db_logical_block_size_cache_test \
db_blob_index_test \
db_iter_test \
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs
Summary:
Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are:
* If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted.
* When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not.
However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours).
This PR changes the convention to:
* If status() is not ok, Valid() always returns false.
* Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.)
This does sacrifice the two use cases listed above, but siying said it's ok.
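The two rules of the new convention can be sketched with a toy iterator. This is an illustrative model only, not RocksDB code: `ToyStatus`, `ToyIterator`, and the `bad_index` error-injection parameter are all hypothetical names, and the keys are assumed to be sorted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status object: ok, or an error with a message.
class ToyStatus {
 public:
  ToyStatus() : ok_(true) {}
  explicit ToyStatus(std::string msg) : ok_(false), msg_(std::move(msg)) {}
  bool ok() const { return ok_; }

 private:
  bool ok_;
  std::string msg_;
};

// Toy iterator over sorted ints. Positioning on `bad_index` simulates a
// read error (for example, a corrupted block).
class ToyIterator {
 public:
  ToyIterator(std::vector<int> keys, std::size_t bad_index)
      : keys_(std::move(keys)), bad_index_(bad_index), pos_(keys_.size()) {}

  // Rule 1: a non-ok status always implies Valid() == false.
  bool Valid() const { return status_.ok() && pos_ < keys_.size(); }
  ToyStatus status() const { return status_; }
  int key() const {
    assert(Valid());
    return keys_[pos_];
  }

  // Rule 2: every seek operation resets the status before repositioning.
  void SeekToFirst() {
    status_ = ToyStatus();
    MoveTo(0);
  }
  void Seek(int target) {
    status_ = ToyStatus();
    std::size_t i = 0;
    while (i < keys_.size() && keys_[i] < target) ++i;
    MoveTo(i);
  }
  void Next() {
    assert(Valid());
    MoveTo(pos_ + 1);
  }

 private:
  void MoveTo(std::size_t i) {
    pos_ = i;
    if (pos_ < keys_.size() && pos_ == bad_index_) {
      status_ = ToyStatus("simulated corruption");
    }
  }

  std::vector<int> keys_;
  std::size_t bad_index_;
  std::size_t pos_;
  ToyStatus status_;
};
```

With keys `{10, 20, 30}` and `bad_index` 1, `SeekToFirst()` followed by `Next()` leaves the iterator invalid with a non-ok status, and a later `Seek(30)` clears that status and positions on 30. A caller therefore distinguishes "end of data" from "error" by checking `status()` once `Valid()` returns false.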
Overview of the changes:
* A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario.
* Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed.
* A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator.
To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types:
Iterators that didn't need changes:
* status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name in anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator.
* Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator.
Iterators with changes (see inline comments for details):
* DBIter - an overhaul:
- It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back.
- It had a few code paths silently discarding subiterator's status. The stress test caught a few.
- The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example.
- Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption.
- It used to not reset status on seek for some types of errors.
- Some simplifications and better comments.
- Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer.
* MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status.
* ForwardIterator - changed to the new convention, also slightly simplified.
* ForwardLevelIterator - fixed some bugs and simplified.
* LevelIterator - simplified.
* TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_.
* BlockBasedTableIterator - minor changes.
* BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid.
* PlainTableIterator - some seeks used to not reset status.
* CuckooTableIterator - tiny code cleanup.
* ManagedIterator - fixed some bugs.
* BaseDeltaIterator - changed to the new convention and fixed a bug.
* BlobDBIterator - seeks used to not reset status.
* KeyConvertingIterator - some small change.
Closes https://github.com/facebook/rocksdb/pull/3810
Differential Revision: D7888019
Pulled By: al13n321
fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
db_iter_stress_test \
db_log_iter_test \
db_bloom_filter_test \
db_compaction_filter_test \
db_compaction_test \
db_dynamic_level_test \
Fix flush not being committed while writing manifest
Summary:
Fix flush not being committed while writing the manifest, a recent bug introduced by D60075.
The issue:
# Options.max_background_flushes > 1
# Background thread A picks up a flush job, flushes, then commits to the manifest. (Note that the mutex is released before writing the manifest.)
# Background thread B picks up another flush job and flushes. When it gets to `MemTableList::InstallMemtableFlushResults`, it notices another thread is committing, so it quits.
# After the first commit, thread A doesn't double-check whether there are more flush results to commit, leaving the second flush uncommitted.
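The sequence above can be modeled deterministically without real threads. The sketch below is a toy model under stated assumptions, not the actual `MemTableList` code: `ToyFlushResultList` and its members are hypothetical, and the `write_manifest` callback stands in for the window in which the mutex is released, so a "second thread" is simulated by calling back into the list from inside it.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy model of installing flush results. All names are hypothetical and no
// real locking is performed.
class ToyFlushResultList {
 public:
  explicit ToyFlushResultList(bool recheck_after_commit)
      : recheck_after_commit_(recheck_after_commit) {}

  // `write_manifest` models the manifest write done with the mutex released;
  // other "threads" may install results while it runs.
  void InstallFlushResult(int flush_id,
                          const std::function<void()>& write_manifest) {
    assert(write_manifest);
    pending_.push_back(flush_id);
    if (commit_in_progress_) {
      return;  // another committer is active; rely on it to pick this up
    }
    commit_in_progress_ = true;
    do {
      std::vector<int> batch;
      batch.swap(pending_);
      write_manifest();
      committed_.insert(committed_.end(), batch.begin(), batch.end());
      // The fix: check again for results that arrived during the write.
    } while (recheck_after_commit_ && !pending_.empty());
    commit_in_progress_ = false;
  }

  const std::vector<int>& committed() const { return committed_; }
  const std::vector<int>& pending() const { return pending_; }

 private:
  const bool recheck_after_commit_;
  bool commit_in_progress_ = false;
  std::vector<int> pending_;
  std::vector<int> committed_;
};
```

If flush 2 is installed from inside the callback of flush 1, a list constructed with `recheck_after_commit == false` ends with flush 2 still pending, which mirrors the bug. With `true`, the committer loops once more and both flushes are committed.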
Test Plan: run the test. Also verify that the new test hits a deadlock without the fix.
Reviewers: sdong, igor, lightmark
Reviewed By: lightmark
Subscribers: andrewkr, omegaga, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60969
db_flush_test \
db_inplace_update_test \
db_iterator_test \
db_memtable_test \
db_merge_operator_test \
New API to get all merge operands for a Key (#5604)
Summary:
This is a new API added to db.h to allow for fetching all merge operands associated with a Key. The main motivation for this API is to support use cases where doing a full online merge is not necessary as it is performance sensitive. Example use-cases:
1. Update subset of columns and read subset of columns -
Imagine a SQL Table, a row is encoded as a K/V pair (as it is done in MyRocks). If there are many columns and users only updated one of them, we can use merge operator to reduce write amplification. While users only read one or two columns in the read query, this feature can avoid a full merging of the whole row, and save some CPU.
2. Updating very few attributes in a value which is a JSON-like document -
Updating one attribute can be done efficiently using merge operator, while reading back one attribute can be done more efficiently if we don't need to do a full merge.
----------------------------------------------------------------------------------------------------
API :
Status GetMergeOperands(
const ReadOptions& options, ColumnFamilyHandle* column_family,
const Slice& key, PinnableSlice* merge_operands,
GetMergeOperandsOptions* get_merge_operands_options,
int* number_of_operands)
Example usage :
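int size = 100;
int number_of_operands = 0;
std::vector<PinnableSlice> values(size);
GetMergeOperandsOptions merge_operands_info;
// The caller states how many operands the array can hold.
merge_operands_info.expected_max_number_of_operands = size;
// The options are passed by pointer, matching the signature above.
db_->GetMergeOperands(ReadOptions(), db_->DefaultColumnFamily(), "k1", values.data(), &merge_operands_info, &number_of_operands);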
Description :
Returns all the merge operands corresponding to the key. If the number of merge operands in the DB is greater than merge_operands_options.expected_max_number_of_operands, no merge operands are returned and the status is Incomplete. Merge operands are returned in insertion order.
merge_operands points to an array of at least merge_operands_options.expected_max_number_of_operands elements, which the caller is responsible for allocating. If the status returned is Incomplete, number_of_operands will contain the total number of merge operands found in the DB for the key.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5604
Test Plan:
Added unit test and perf test in db_bench that can be run using the command:
./db_bench -benchmarks=getmergeoperands --merge_operator=sortlist
Differential Revision: D16657366
Pulled By: vjnadimpalli
fbshipit-source-id: 0faadd752351745224ee12d4ae9ef3cb529951bf
db_merge_operand_test \
db_options_test \
db_range_del_test \
db_secondary_test \
db_sst_test \
db_tailing_iter_test \
db_io_failure_test \
db_properties_test \
db_table_properties_test \
db_statistics_test \
db_write_test \
error_handler_fs_test \
autovector_test \
blob_db_test \
cleanable_test \
column_family_test \
table_properties_collector_test \
arena_test \
Provide an allocator for new memory type to be used with RocksDB block cache (#6214)
Summary:
New memory technologies are being developed by various hardware vendors (Intel DCPMM is one such technology currently available). These new memory types require different libraries for allocation and management (such as PMDK and memkind). The high capacities available make it possible to provision large caches (up to several TBs in size), beyond what is achievable with DRAM.
The new allocator provided in this PR uses the memkind library to allocate memory on different media.
**Performance**
We tested the new allocator using db_bench.
- For each test, we vary the size of the block cache (relative to the size of the uncompressed data in the database).
- The database is filled sequentially. Throughput is then measured with a readrandom benchmark.
- We use a uniform distribution as a worst-case scenario.
The plot shows throughput (ops/s) relative to a configuration with no block cache and default allocator.
For all tests, p99 latency is below 500 us.
![image](https://user-images.githubusercontent.com/26400080/71108594-42479100-2178-11ea-8231-8a775bbc92db.png)
**Changes**
- Add MemkindKmemAllocator
- Add --use_cache_memkind_kmem_allocator db_bench option (to create an LRU block cache with the new allocator)
- Add detection of memkind library with KMEM DAX support
- Add test for MemkindKmemAllocator
**Minimum Requirements**
- kernel 5.3.12
- ndctl v67 - https://github.com/pmem/ndctl
- memkind v1.10.0 - https://github.com/memkind/memkind
**Memory Configuration**
The allocator uses the MEMKIND_DAX_KMEM memory kind. Follow the instructions on [memkind’s GitHub page](https://github.com/memkind/memkind) to set up NVDIMM memory accordingly.
Note on memory allocation with NVDIMM memory exposed as system memory.
- The MemkindKmemAllocator will only allocate from NVDIMM memory (using memkind_malloc with MEMKIND_DAX_KMEM kind).
- The default allocator is not restricted to RAM by default. Based on NUMA node latency, the kernel should allocate from local RAM preferentially, but it’s a kernel decision. numactl --preferred/--membind can be used to allocate preferentially/exclusively from the local RAM node.
**Usage**
When creating an LRU cache, pass a MemkindKmemAllocator object as argument.
For example (replace capacity with the desired value in bytes):
```
#include "rocksdb/cache.h"
#include "memory/memkind_kmem_allocator.h"
NewLRUCache(
capacity /*size_t*/,
6 /*cache_numshardbits*/,
false /*strict_capacity_limit*/,
0.0 /*high_pri_pool_ratio (a double, not a bool)*/,
std::make_shared<MemkindKmemAllocator>());
```
Refer to [RocksDB’s block cache documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache) to assign the LRU cache as block cache for a database.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6214
Reviewed By: cheng-chang
Differential Revision: D19292435
fbshipit-source-id: 7202f47b769e7722b539c86c2ffd669f64d7b4e1
memkind_kmem_allocator_test \
block_test \
data_block_hash_index_test \
cache_test \
corruption_test \
slice_test \
slice_transform_test \
dbformat_test \
fault_injection_test \
filelock_test \
filename_test \
Support direct IO in RandomAccessFileReader::MultiRead (#6446)
Summary:
By supporting direct IO in RandomAccessFileReader::MultiRead, the benefits of parallel IO (IO uring) and direct IO can be combined.
In direct IO mode, read requests are aligned and merged before being issued to RandomAccessFile::MultiRead, so blocks in the original requests might share the same underlying buffer. The shared buffers are returned in `aligned_bufs`, a new parameter of the `MultiRead` API.
For example, suppose the alignment requirement for direct IO is 4KB, one request is (offset: 1KB, len: 1KB), and another is (offset: 3KB, len: 1KB). Since both fall within the page (offset: 0, len: 4KB), `MultiRead` reads only that page with direct IO into a heap buffer and returns 2 Slices referencing regions of that same buffer. See `random_access_file_reader_test.cc` for more examples.
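The align-then-merge step in this example can be sketched as follows. This is a minimal illustration, not the actual `MultiRead` implementation: `ReadRequest` and `AlignAndMerge` are hypothetical names, and merging ranges that merely touch (rather than overlap) is an assumption made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical request type: a byte range within a file.
struct ReadRequest {
  uint64_t offset;
  std::size_t len;
};

// Expands every request to alignment boundaries, then merges aligned ranges
// that overlap or touch. Returns the ranges that would actually be read.
std::vector<ReadRequest> AlignAndMerge(std::vector<ReadRequest> reqs,
                                       uint64_t alignment) {
  assert(alignment > 0);
  for (auto& r : reqs) {
    const uint64_t start = r.offset - r.offset % alignment;  // round down
    uint64_t end = r.offset + r.len;
    end = (end + alignment - 1) / alignment * alignment;      // round up
    r.offset = start;
    r.len = static_cast<std::size_t>(end - start);
  }
  std::sort(reqs.begin(), reqs.end(),
            [](const ReadRequest& a, const ReadRequest& b) {
              return a.offset < b.offset;
            });
  std::vector<ReadRequest> merged;
  for (const auto& r : reqs) {
    if (!merged.empty() &&
        r.offset <= merged.back().offset + merged.back().len) {
      // Extend the previous range to cover this one.
      const uint64_t end = std::max(merged.back().offset + merged.back().len,
                                    r.offset + r.len);
      merged.back().len = static_cast<std::size_t>(end - merged.back().offset);
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

For the two requests in the example, `AlignAndMerge({{1024, 1024}, {3072, 1024}}, 4096)` returns a single range (offset: 0, len: 4096). Each original request is then served as a slice at its own offset within that one buffer.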
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6446
Test Plan: Added a new test `random_access_file_reader_test.cc`.
Reviewed By: anand1976
Differential Revision: D20097518
Pulled By: cheng-chang
fbshipit-source-id: ca48a8faf9c3af146465c102ef6b266a363e78d1
random_access_file_reader_test \
file_reader_writer_test \
block_based_filter_block_test \
full_filter_block_test \
partitioned_filter_block_test \
hash_table_test \
histogram_test \
log_test \
manual_compaction_test \
mock_env_test \
memtable_list_test \
merge_helper_test \
memory_test \
merge_test \
merger_test \
util_merge_operators_test \
options_file_test \
reduce_levels_test \
plain_table_db_test \
comparator_db_test \
external_sst_file_test \
Export Import sst files (#5495)
Summary:
Refresh of the earlier change here - https://github.com/facebook/rocksdb/issues/5135
This is a review request for code change needed for - https://github.com/facebook/rocksdb/issues/3469
"Add support for taking snapshot of a column family and creating column family from a given CF snapshot"
We have an implementation for this that we have been testing internally. We have two new APIs that together provide this functionality.
(1) ExportColumnFamily() - This API is modelled after CreateCheckpoint() as below.
// Exports all live SST files of a specified Column Family onto export_dir,
// returning SST files information in metadata.
// - SST files will be created as hard links when the directory specified
// is in the same partition as the db directory, copied otherwise.
// - export_dir should not already exist and will be created by this API.
// - Always triggers a flush.
virtual Status ExportColumnFamily(ColumnFamilyHandle* handle,
const std::string& export_dir,
ExportImportFilesMetaData** metadata);
Internally, the API will DisableFileDeletions(), GetColumnFamilyMetaData(), parse through the
metadata, create links/copies of all the sst files, EnableFileDeletions(), and complete the call by
returning the list of file metadata.
(2) CreateColumnFamilyWithImport() - This API is modeled after IngestExternalFile(), but invoked only during a CF creation as below.
// CreateColumnFamilyWithImport() will create a new column family with
// column_family_name and import external SST files specified in metadata into
// this column family.
// (1) External SST files can be created using SstFileWriter.
// (2) External SST files can be exported from a particular column family in
// an existing DB.
// Option in import_options specifies whether the external files are copied or
// moved (default is copy). When option specifies copy, managing files at
// external_file_path is caller's responsibility. When option specifies a
// move, the call ensures that the specified files at external_file_path are
// deleted on successful return and files are not modified on any error
// return.
// On error return, column family handle returned will be nullptr.
// ColumnFamily will be present on successful return and will not be present
// on error return. ColumnFamily may be present on any crash during this call.
virtual Status CreateColumnFamilyWithImport(
const ColumnFamilyOptions& options, const std::string& column_family_name,
const ImportColumnFamilyOptions& import_options,
const ExportImportFilesMetaData& metadata,
ColumnFamilyHandle** handle);
Internally, this API creates a new CF, parses all the sst files, and adds them to the specified column family at the same level and with the same sequence number as in the metadata. It also performs safety checks with respect to overlaps between the sst files being imported.
If incoming sequence number is higher than current local sequence number, local sequence
number is updated to reflect this.
Note: as the sst files are being moved across Column Families, the Column Family name in an sst file
will no longer match the actual column family on destination DB. The API does not modify Column
Family name or id in the sst files being imported.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5495
Differential Revision: D16018881
fbshipit-source-id: 9ae2251025d5916d35a9fc4ea4d6707f6be16ff9
import_column_family_test \
prefix_test \
skiplist_test \
write_buffer_manager_test \
stringappend_test \
cassandra_format_test \
cassandra_functional_test \
cassandra_row_merge_test \
cassandra_serialize_test \
ttl_test \
backupable_db_test \
cache_simulator_test \
sim_cache_test \
version_edit_test \
version_set_test \
compaction_picker_test \
version_builder_test \
file_indexer_test \
write_batch_test \
write_batch_with_index_test \
write_controller_test\
deletefile_test \
obsolete_files_test \
table_test \
delete_scheduler_test \
options_test \
options_settable_test \
Add OptionsUtil::LoadOptionsFromFile() API
Summary:
This patch adds OptionsUtil::LoadOptionsFromFile() and
OptionsUtil::LoadLatestOptionsFromDB(), which allow developers
to construct DBOptions and ColumnFamilyOptions from a RocksDB
options file. Note that most pointer-typed options such as
merge_operator will not be constructed.
With this API, developers no longer need to remember all the
options in order to reopen an existing rocksdb instance like
the following:
DBOptions db_options;
std::vector<std::string> cf_names;
std::vector<ColumnFamilyOptions> cf_opts;
// Load primitive-typed options from an existing DB
OptionsUtil::LoadLatestOptionsFromDB(
dbname, &db_options, &cf_names, &cf_opts);
// Initialize necessary pointer-typed options
cf_opts[0].merge_operator.reset(new MyMergeOperator());
...
// Construct the vector of ColumnFamilyDescriptor
std::vector<ColumnFamilyDescriptor> cf_descs;
for (size_t i = 0; i < cf_opts.size(); ++i) {
cf_descs.emplace_back(cf_names[i], cf_opts[i]);
}
// Open the DB
DB* db = nullptr;
std::vector<ColumnFamilyHandle*> cf_handles;
auto s = DB::Open(db_options, dbname, cf_descs,
&cf_handles, &db);
Test Plan:
Augment existing tests in column_family_test
options_test
db_test
Reviewers: igor, IslamAbdelRahman, sdong, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D49095
options_util_test \
event_logger_test \
timer_queue_test \
cuckoo_table_builder_test \
cuckoo_table_reader_test \
cuckoo_table_db_test \
flush_job_test \
wal_manager_test \
listener_test \
Support for SingleDelete()
Summary:
This patch fixes #7460559. It introduces SingleDelete as a new database
operation. This operation can be used to delete keys that were never
overwritten (no put following another put of the same key). If an overwritten
key is single deleted the behavior is undefined. Single deletion of a
non-existent key has no effect but multiple consecutive single deletions are
not allowed (see limitations).
In contrast to the conventional Delete() operation, the deletion entry is
removed along with the value when the two are lined up in a compaction. Note:
The semantics are similar to @igor's prototype, which enabled this
behavior at the granularity of a column family (
https://reviews.facebook.net/D42093 ). This new patch, however, is more
aggressive when it comes to removing tombstones: It removes the SingleDelete
together with the value whenever there is no snapshot between them while the
older patch only did this when the sequence number of the deletion was older
than the earliest snapshot.
Most of the complex additions are in the Compaction Iterator, all other changes
should be relatively straightforward. The patch also includes basic support for
single deletions in db_stress and db_bench.
Limitations:
- Not compatible with cuckoo hash tables
- Single deletions cannot be used in combination with merges and normal
deletions on the same key (other keys are not affected by this)
- Consecutive single deletions are currently not allowed (an older version of
this patch supported this, so it could be resurrected if needed)
Test Plan: make all check
Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor
Reviewed By: igor
Subscribers: maykov, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43179
compaction_iterator_test \
compaction_job_test \
thread_list_test \
sst_dump_test \
compact_files_test \
optimistic_transaction_test \
write_callback_test \
heap_test \
compact_on_deletion_collector_test \
Pessimistic Transactions
Summary:
Initial implementation of Pessimistic Transactions. This diff contains the api changes discussed in D38913. This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
MyRocks folks: please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint(). After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex. We can then decide which route is preferable.
Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
Test Plan: Unit tests, db_bench parallel testing.
Reviewers: igor, rven, sdong, yhchiang, yoshinorim
Reviewed By: sdong
Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40869
compaction_job_stats_test \
option_change_migration_test \
Remove ldb HexToString method's usage of sscanf
Summary:
Fix hex2String performance issues by removing sscanf dependency.
Also fixed some edge case handling (odd length, bad input).
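A hex decoder that avoids sscanf can be sketched as below. This is an illustrative sketch only, not the ldb implementation: `HexDigitValue` and `DecodeHex` are hypothetical names, and any prefix handling the real tool performs is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the value of one hex digit, or -1 if `c` is not a hex digit.
inline int HexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decodes a string of hex digits such as "6B6579" into raw bytes ("key").
// Returns false and leaves `*out` untouched when the input has an odd number
// of digits or contains a non-hex character.
bool DecodeHex(const std::string& hex, std::string* out) {
  assert(out != nullptr);
  if (hex.size() % 2 != 0) return false;  // odd length
  std::string result;
  result.reserve(hex.size() / 2);
  for (size_t i = 0; i < hex.size(); i += 2) {
    const int hi = HexDigitValue(hex[i]);
    const int lo = HexDigitValue(hex[i + 1]);
    if (hi < 0 || lo < 0) return false;  // bad input
    result.push_back(static_cast<char>((hi << 4) | lo));
  }
  *out = result;
  return true;
}
```

Looking up each digit directly avoids parsing a format string for every byte, which is the kind of overhead the summary attributes to the sscanf-based version.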
Test Plan: Created a test file which called old and new implementation, and validated results are the same. I'll paste results in the phabricator diff.
Reviewers: igor, rven, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
Reviewed By: sdong
Subscribers: thatsafunnyname, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D46785
transaction_test \
transaction_lock_mgr_test \
ldb_cmd_test \
persistent_cache_test \
statistics_test \
stats_history_test \
lru_cache_test \
object_registry_test \
repair_test \
env_timed_test \
write_prepared_transaction_test \
write_unprepared_transaction_test \
db_universal_compaction_test \
RocksDB Trace Analyzer (#4091)
Summary:
A framework of trace analyzing for RocksDB
After collecting the trace with the tool from [PR #3837](https://github.com/facebook/rocksdb/pull/3837), users can use the Trace Analyzer to interpret, analyze, and characterize the collected workload.
**Input:**
1. trace file
2. Whole keys space file
**Statistics:**
1. Access count of each operation (Get, Put, Delete, SingleDelete, DeleteRange, Merge) in each column family.
2. Key hotness (access count) of each one
3. Key space separation based on given prefix
4. Key size distribution
5. Value size distribution if applicable
6. Top K accessed keys
7. QPS statistics including the average QPS and peak QPS
8. Top K accessed prefix
9. Query correlation analysis: outputs the number of X after Y and the corresponding average time intervals
**Output:**
1. key access heat map (either in the accessed key space or whole key space)
2. trace sequence file (interprets the raw trace file into a line-based text file for future use)
3. Time series (the key space ID and its access time)
4. Key access count distribution
5. Key size distribution
6. Value size distribution (in each interval)
7. whole key space separation by the prefix
8. Accessed key space separation by the prefix
9. QPS of each operation and each column family
10. Top K QPS and their accessed prefix range
**Test:**
1. Added the unit test of analyzing Get, Put, Delete, SingleDelete, DeleteRange, Merge
2. Generated the trace and analyze the trace
**Implemented but not tested (due to the limitation of trace_replay):**
1. Analyzing Iterator, supporting Seek() and SeekForPrev() analyzing
2. Analyzing the number of Key found by Get
**Future Work:**
1. Support execution time analysis of each request
2. Support cache hit situation and block read situation of Get
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4091
Differential Revision: D9256157
Pulled By: zhichao-cao
fbshipit-source-id: f0ceacb7eedbc43a3eee6e85b76087d7832a8fe6
trace_analyzer_test \
repeatable_thread_test \
Use only "local" range tombstones during Get (#4449)
Summary:
Previously, range tombstones were accumulated from every level, which
was necessary if a range tombstone in a higher level covered a key in a lower
level. However, RangeDelAggregator::AddTombstones's complexity is based on
the number of tombstones that are currently stored in it, which is wasteful in
the Get case, where we only need to know the highest sequence number of range
tombstones that cover the key from higher levels, and compute the highest covering
sequence number at the current level. This change introduces this optimization, and
removes the use of RangeDelAggregator from the Get path.
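The idea of tracking only the highest covering sequence number can be sketched as below. This is a simplified illustration, not the code in this PR: `RangeTombstone` and `MaxCoveringTombstoneSeq` are hypothetical names, keys are compared bytewise, and each tombstone is assumed to cover the half-open range [start_key, end_key).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical tombstone type: deletes keys in [start_key, end_key) that were
// written with a sequence number lower than `seq`.
struct RangeTombstone {
  std::string start_key;
  std::string end_key;
  uint64_t seq;
};

// Returns the highest sequence number among the tombstones of the current
// level that cover `user_key`, or `max_so_far` (the value carried over from
// higher levels) if that is larger.
uint64_t MaxCoveringTombstoneSeq(const std::vector<RangeTombstone>& tombstones,
                                 const std::string& user_key,
                                 uint64_t max_so_far = 0) {
  uint64_t result = max_so_far;
  for (const RangeTombstone& t : tombstones) {
    if (t.start_key <= user_key && user_key < t.end_key) {
      result = std::max(result, t.seq);
    }
  }
  return result;
}
```

Under these assumptions, a Get only needs to carry one integer from level to level: a point entry for the key is treated as deleted when its sequence number is lower than the running maximum.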
In the benchmark results, the following command was used to initialize the database:
```
./db_bench -db=/dev/shm/5k-rts -use_existing_db=false -benchmarks=filluniquerandom -write_buffer_size=1048576 -compression_type=lz4 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -value_size=112 -key_size=16 -block_size=4096 -level_compaction_dynamic_level_bytes=true -num=5000000 -max_background_jobs=12 -benchmark_write_rate_limit=20971520 -range_tombstone_width=100 -writes_per_range_tombstone=100 -max_num_range_tombstones=50000 -bloom_bits=8
```
...and the following command was used to measure read throughput:
```
./db_bench -db=/dev/shm/5k-rts/ -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=5000000 -reads=100000 -threads=32
```
The filluniquerandom command was only run once, and the resulting database was used
to measure read performance before and after the PR. Both binaries were compiled with
`DEBUG_LEVEL=0`.
Readrandom results before PR:
```
readrandom : 4.544 micros/op 220090 ops/sec; 16.9 MB/s (63103 of 100000 found)
```
Readrandom results after PR:
```
readrandom : 11.147 micros/op 89707 ops/sec; 6.9 MB/s (63103 of 100000 found)
```
So it's actually slower right now, but this PR paves the way for future optimizations (see #4493).
----
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4449
Differential Revision: D10370575
Pulled By: abhimadan
fbshipit-source-id: 9a2e152be1ef36969055c0e9eb4beb0d96c11f4d
range_tombstone_fragmenter_test \
range_del_aggregator_test \
sst_file_reader_test \
db_secondary_test \
block_cache_tracer_test \
block_cache_trace_analyzer_test \
defer_test \
blob_file_addition_test \
blob_file_garbage_test \
timer_test \
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
TESTS += folly_synchronization_distributed_mutex_test
endif
PARALLEL_TEST = \
backupable_db_test \
db_bloom_filter_test \
db_compaction_filter_test \
db_compaction_test \
db_merge_operator_test \
db_sst_test \
db_test \
db_universal_compaction_test \
db_wal_test \
external_sst_file_test \
import_column_family_test \
fault_injection_test \
file_reader_writer_test \
inlineskiplist_test \
manual_compaction_test \
persistent_cache_test \
table_test \
transaction_test \
transaction_lock_mgr_test \
write_prepared_transaction_test \
write_unprepared_transaction_test \
# options_settable_test doesn't pass with UBSAN as we use hack in the test
ifdef COMPILE_WITH_UBSAN
TESTS := $(shell echo $(TESTS) | sed 's/\boptions_settable_test\b//g')
endif
SUBSET := $(TESTS)
ifdef ROCKSDBTESTS_START
SUBSET := $(shell echo $(SUBSET) | sed 's/^.*$(ROCKSDBTESTS_START)/$(ROCKSDBTESTS_START)/')
endif
ifdef ROCKSDBTESTS_END
SUBSET := $(shell echo $(SUBSET) | sed 's/$(ROCKSDBTESTS_END).*//')
endif
TOOLS = \
sst_dump \
db_sanity_test \
db_stress \
Write stress test
Summary:
The goal of this diff is to create a simple stress test with focus on catching:
* bugs in compaction/flush processes, especially the ones that cause assertion errors
* bugs in the code that deletes obsolete files
There are two parts of the test:
* write_stress, a binary that writes to the database
* write_stress_runner.py, a script that invokes and kills write_stress
Here are some interesting parts of write_stress:
* Runs with very high concurrency of compactions and flushes (32 threads total) and tries to create a huge amount of small files
* The keys written to the database are not uniformly distributed -- there is a 3-character prefix that mutates occasionally (in prefix mutator thread), in such a way that the first character mutates slower than second, which mutates slower than third character. That way, the compaction stress tests some interesting compaction features like trivial moves and bottommost level calculation
* There is a thread that creates an iterator, holds it for a couple of seconds, and then iterates over all keys. This is supposed to test RocksDB's ability to keep the files alive when there are references to them.
* Some writes trigger WAL sync. This is stress testing our WAL sync code.
* At the end of the run, we make sure that we didn't leak any of the sst files
write_stress_runner.py changes the mode in which we run write_stress and also kills and restarts it. There are some interesting characteristics:
* At the beginning we divide the full test runtime into smaller parts: shorter runtimes (a couple of seconds) and longer runtimes (100 or 1000 seconds)
* The first time we run write_stress, we destroy the old DB. Every next time during the test, we use the same DB.
* We can run in kill mode or clean-restart mode. Kill mode kills the write_stress violently.
* We can run in mode where delete_obsolete_files_with_fullscan is true or false
* We can run with low_open_files mode turned on or off. When it's turned on, we configure table cache to only hold a couple of files -- that way we need to reopen files every time we access them.
Another goal was to create a stress test without a lot of parameters. So tools/write_stress_runner.py should only take one parameter -- runtime_sec and it should figure out everything else on its own.
In a separate diff, I'll add this new test to our nightly legocastle runs.
Test Plan:
The goal of this test was to retroactively catch the following bugs: D33045, D48201, D46899, D42399. I failed to reproduce D48201, but all others have been caught!
When I reverted https://reviews.facebook.net/D33045:
./write_stress --runtime_sec=200 --low_open_files_mode=true
Iterator statuts not OK: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/089166.sst: No such file or directory
When I reverted https://reviews.facebook.net/D42399:
python tools/write_stress_runner.py --runtime_sec=5000
Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1
Running write_stress, will kill after 2 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
Running write_stress, will kill after 7 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
Running write_stress, will kill after 8 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --low_open_files_mode=true
Write to DB failed: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/019250.sst: No such file or directory
ERROR: write_stress died with exitcode=-6
When I reverted https://reviews.facebook.net/D46899:
python tools/write_stress_runner.py --runtime_sec=1000
runtime: 1000
Going to execute write stress for [3, 3, 100, 3, 2, 100, 1, 788]
Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --low_open_files_mode=true
Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --destroy_db=false --delete_obsolete_files_with_fullscan=true
Running write_stress, will kill after 100 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
write_stress: db/db_impl.cc:2070: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
ERROR: write_stress died with exitcode=-6
Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, sdong, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49533
write_stress \
ldb \
db_repl_stress \
rocksdb_dump \
rocksdb_undump \
blob_dump \
trace_analyzer \
Support computing miss ratio curves using sim_cache. (#5449)
Summary:
This PR adds a BlockCacheTraceSimulator that reports the miss ratios given different cache configurations. A cache configuration contains "cache_name,num_shard_bits,cache_capacities". For example, "lru, 1, 1K, 2K, 4M, 4G".
When we replay the trace, we also perform lookups and inserts on the simulated caches.
In the end, it reports the miss ratio for each tuple <cache_name, num_shard_bits, cache_capacity> in an output file.
This PR also adds a main source file, block_cache_trace_analyzer, so that we can run the analyzer from the command line.
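Replaying a trace against a simulated cache to obtain a miss ratio can be sketched as below. This is a toy illustration rather than the BlockCacheTraceSimulator itself: `LruMissRatio` is a hypothetical name, the cache is a single unsharded LRU, and capacity is counted in entries instead of bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <vector>

// Replays `trace` (a sequence of block keys) against an LRU cache that holds
// at most `capacity` entries and returns misses / accesses.
double LruMissRatio(const std::vector<uint64_t>& trace, size_t capacity) {
  if (trace.empty()) return 0.0;
  std::list<uint64_t> lru;  // front = most recently used
  std::unordered_map<uint64_t, std::list<uint64_t>::iterator> index;
  size_t misses = 0;
  for (uint64_t key : trace) {
    auto it = index.find(key);
    if (it != index.end()) {
      // Hit: move the entry to the front of the LRU list.
      lru.splice(lru.begin(), lru, it->second);
      continue;
    }
    ++misses;
    if (capacity == 0) continue;  // nothing can be cached
    if (lru.size() == capacity) {
      // Cache full: evict the least recently used entry.
      index.erase(lru.back());
      lru.pop_back();
    }
    lru.push_front(key);
    index[key] = lru.begin();
  }
  return static_cast<double>(misses) / static_cast<double>(trace.size());
}
```

Calling the function once per capacity value on the same trace yields the points of a miss ratio curve.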
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5449
Test Plan:
Added tests for block_cache_trace_analyzer.
COMPILE_WITH_ASAN=1 make check -j32.
Differential Revision: D15797073
Pulled By: HaoyuHuang
fbshipit-source-id: aef0c5c2e7938f3e8b6a10d4a6a50e6928ecf408
block_cache_trace_analyzer \
TEST_LIBS = \
librocksdb_env_basic_test.a
# TODO: add back forward_iterator_bench, after making it build in all environments.
BENCHMARKS = db_bench table_reader_bench cache_bench memtablerep_bench filter_bench persistent_cache_bench range_del_aggregator_bench
# if user didn't config LIBNAME, set the default
ifeq ($(LIBNAME),)
# we should only run rocksdb in production with DEBUG_LEVEL 0
ifeq ($(DEBUG_LEVEL),0)
LIBNAME = librocksdb
else
LIBNAME = librocksdb_debug
endif
endif
LIBRARY = ${LIBNAME}.a
TOOLS_LIBRARY = ${LIBNAME}_tools.a
STRESS_LIBRARY = ${LIBNAME}_stress.a
ROCKSDB_MAJOR = $(shell egrep "ROCKSDB_MAJOR.[0-9]" include/rocksdb/version.h | cut -d ' ' -f 3)
ROCKSDB_MINOR = $(shell egrep "ROCKSDB_MINOR.[0-9]" include/rocksdb/version.h | cut -d ' ' -f 3)
ROCKSDB_PATCH = $(shell egrep "ROCKSDB_PATCH.[0-9]" include/rocksdb/version.h | cut -d ' ' -f 3)
Package generation for Ubuntu and CentOS
Summary:
I put together a script to assist in the generation of debs and
rpms. I've tested that this works on Ubuntu via Vagrant. I've included the
Vagrantfile here, but I can remove it if it's not useful. The package.sh
script should work on any Ubuntu or CentOS machine; I just added a bit of
logic in there to allow a base Ubuntu or CentOS machine to build
RocksDB from scratch.
Example output on Ubuntu 14.04:
```
root@vagrant-ubuntu-trusty-64:/vagrant# ./tools/package.sh
[+] g++-4.7 is already installed. skipping.
[+] libgflags-dev is already installed. skipping.
[+] ruby-all-dev is already installed. skipping.
[+] fpm is already installed. skipping.
Created package {:path=>"rocksdb_3.5_amd64.deb"}
root@vagrant-ubuntu-trusty-64:/vagrant# dpkg --info rocksdb_3.5_amd64.deb
new debian package, version 2.0.
size 17392022 bytes: control archive=1518 bytes.
275 bytes, 11 lines control
2911 bytes, 38 lines md5sums
Package: rocksdb
Version: 3.5
License: BSD
Vendor: Facebook
Architecture: amd64
Maintainer: rocksdb@fb.com
Installed-Size: 83358
Section: default
Priority: extra
Homepage: http://rocksdb.org/
Description: RocksDB is an embeddable persistent key-value store for fast storage.
```
Example output on CentOS 6.5:
```
[root@localhost vagrant]# rpm -qip rocksdb-3.5-1.x86_64.rpm
Name : rocksdb Relocations: /usr
Version : 3.5 Vendor: Facebook
Release : 1 Build Date: Mon 29 Sep 2014 01:26:11 AM UTC
Install Date: (not installed) Build Host: localhost
Group : default Source RPM: rocksdb-3.5-1.src.rpm
Size : 96231106 License: BSD
Signature : (none)
Packager : rocksdb@fb.com
URL : http://rocksdb.org/
Summary : RocksDB is an embeddable persistent key-value store for fast storage.
Description :
RocksDB is an embeddable persistent key-value store for fast storage.
```
Test Plan:
How this gets used is really up to the RocksDB core team. If you
want to actually get this into mainline, you might have to change `make
install` such that it installs the RocksDB shared object file as well, which
would require you to link against gflags (maybe?) and that would require some
potential modifications to the script here (basically add a dependency on that
package).
Currently, this will install the headers and a pre-compiled statically linked
object file. If that's what you want out of life, then this requires no
modifications.
Reviewers: ljin, yhchiang, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D24141
default : all
#-----------------------------------------------
# Create platform independent shared libraries.
#-----------------------------------------------
ifneq ($(PLATFORM_SHARED_EXT),)
ifneq ($(PLATFORM_SHARED_VERSIONED),true)
SHARED1 = ${LIBNAME}.$(PLATFORM_SHARED_EXT)
SHARED2 = $(SHARED1)
SHARED3 = $(SHARED1)
SHARED4 = $(SHARED1)
SHARED = $(SHARED1)
else
SHARED_MAJOR = $(ROCKSDB_MAJOR)
SHARED_MINOR = $(ROCKSDB_MINOR)
SHARED_PATCH = $(ROCKSDB_PATCH)
SHARED1 = ${LIBNAME}.$(PLATFORM_SHARED_EXT)
ifeq ($(PLATFORM),OS_MACOSX)
SHARED_OSX = $(LIBNAME).$(SHARED_MAJOR)
SHARED2 = $(SHARED_OSX).$(PLATFORM_SHARED_EXT)
SHARED3 = $(SHARED_OSX).$(SHARED_MINOR).$(PLATFORM_SHARED_EXT)
SHARED4 = $(SHARED_OSX).$(SHARED_MINOR).$(SHARED_PATCH).$(PLATFORM_SHARED_EXT)
else
SHARED2 = $(SHARED1).$(SHARED_MAJOR)
SHARED3 = $(SHARED1).$(SHARED_MAJOR).$(SHARED_MINOR)
SHARED4 = $(SHARED1).$(SHARED_MAJOR).$(SHARED_MINOR).$(SHARED_PATCH)
endif
SHARED = $(SHARED1) $(SHARED2) $(SHARED3) $(SHARED4)
$(SHARED1): $(SHARED4)
	ln -fs $(SHARED4) $(SHARED1)
$(SHARED2): $(SHARED4)
	ln -fs $(SHARED4) $(SHARED2)
$(SHARED3): $(SHARED4)
	ln -fs $(SHARED4) $(SHARED3)
endif
ifeq ($(HAVE_POWER8),1)
SHARED_C_OBJECTS = $(LIB_SOURCES_C:.c=.o)
SHARED_ASM_OBJECTS = $(LIB_SOURCES_ASM:.S=.o)
SHARED_C_LIBOBJECTS = $(patsubst %.o,shared-objects/%.o,$(SHARED_C_OBJECTS))
SHARED_ASM_LIBOBJECTS = $(patsubst %.o,shared-objects/%.o,$(SHARED_ASM_OBJECTS))
shared_libobjects = $(patsubst %,shared-objects/%,$(LIB_CC_OBJECTS))
else
shared_libobjects = $(patsubst %,shared-objects/%,$(LIBOBJECTS))
endif
CLEAN_FILES += shared-objects
shared_all_libobjects = $(shared_libobjects)
ifeq ($(HAVE_POWER8),1)
shared-ppc-objects = $(SHARED_C_LIBOBJECTS) $(SHARED_ASM_LIBOBJECTS)
shared-objects/util/crc32c_ppc.o: util/crc32c_ppc.c
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
shared-objects/util/crc32c_ppc_asm.o: util/crc32c_ppc_asm.S
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
endif
$(shared_libobjects): shared-objects/%.o: %.cc
	$(AM_V_CC) mkdir -p $(@D) && $(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) -c $< -o $@
ifeq ($(HAVE_POWER8),1)
shared_all_libobjects = $(shared_libobjects) $(shared-ppc-objects)
endif
$(SHARED4): $(shared_all_libobjects)
	$(CXX) $(PLATFORM_SHARED_LDFLAGS)$(SHARED3) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) $(shared_all_libobjects) $(LDFLAGS) -o $@
endif # PLATFORM_SHARED_EXT
.PHONY: blackbox_crash_test check clean coverage crash_test ldb_tests package \
	release tags tags0 valgrind_check whitebox_crash_test format static_lib shared_lib all \
	dbg rocksdbjavastatic rocksdbjava install install-static install-shared uninstall \
	analyze tools tools_lib \
	blackbox_crash_test_with_atomic_flush whitebox_crash_test_with_atomic_flush \
	blackbox_crash_test_with_txn whitebox_crash_test_with_txn

all: $(LIBRARY) $(BENCHMARKS) tools tools_lib test_libs $(TESTS)

all_but_some_tests: $(LIBRARY) $(BENCHMARKS) tools tools_lib test_libs $(SUBSET)

static_lib: $(LIBRARY)

shared_lib: $(SHARED)

stress_lib: $(STRESS_LIBRARY)

tools: $(TOOLS)

tools_lib: $(TOOLS_LIBRARY)

test_libs: $(TEST_LIBS)

dbg: $(LIBRARY) $(BENCHMARKS) tools $(TESTS)

# creates static library and programs
release:
	$(MAKE) clean
	DEBUG_LEVEL=0 $(MAKE) static_lib tools db_bench

coverage:
	$(MAKE) clean
	COVERAGEFLAGS="-fprofile-arcs -ftest-coverage" LDFLAGS+="-lgcov" $(MAKE) J=1 all check
	cd coverage && ./coverage_test.sh
        # Delete intermediate files
	$(FIND) . -type f -regex ".*\.\(\(gcda\)\|\(gcno\)\)" -exec rm {} \;

ifneq (,$(filter check parallel_check,$(MAKECMDGOALS)),)
# Use /dev/shm if it has the sticky bit set (otherwise, /tmp),
# and create a randomly-named rocksdb.XXXX directory therein.
# We'll use that directory in the "make check" rules.
ifeq ($(TMPD),)
TMPDIR := $(shell echo $${TMPDIR:-/tmp})
TMPD := $(shell f=/dev/shm; test -k $$f || f=$(TMPDIR); \
	perl -le 'use File::Temp "tempdir";' \
	-e 'print tempdir("'$$f'/rocksdb.XXXX", CLEANUP => 0)')
endif
endif
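The temporary-directory selection above can be approximated in plain shell. This is a minimal sketch, not the Makefile's own code: it swaps the Perl `File::Temp` call for `mktemp -d`, and the function name `pick_test_tmpdir` and its optional argument are hypothetical, added only so that the fallback path is easy to exercise.

```shell
#!/bin/sh
# Sketch: choose a base directory for test temp files, then create a
# uniquely named rocksdb.XXXXXX directory inside it.
# pick_test_tmpdir is a hypothetical name; mktemp stands in for the
# File::Temp call used in the Makefile.
pick_test_tmpdir() {
  base=${1:-/dev/shm}
  # "test -k" succeeds only if the path exists and has its sticky bit set.
  test -k "$base" || base=${TMPDIR:-/tmp}
  # Print the path of the newly created directory.
  mktemp -d "$base/rocksdb.XXXXXX"
}
```

Calling `pick_test_tmpdir` with no argument tries `/dev/shm` first; passing a path that lacks the sticky bit shows the fallback to `$TMPDIR` (or `/tmp` when that is unset).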
# Run all tests in parallel, accumulating per-test logs in t/log-*.
#
# Each t/run-* file is a tiny generated Bourne shell script that invokes one of
# the sub-tests. Why use a file for this? Because that makes the invocation of
# parallel below simpler, which in turn makes the parsing of parallel's
# LOG simpler (the latter is for live monitoring as parallel
# tests run).
#
# Test names are extracted by running tests with --gtest_list_tests.
# This filter removes the "#"-introduced comments, and expands to
# fully-qualified names by changing input like this:
#
#   DBTest.
#     Empty
#     WriteEmptyBatch
#   MultiThreaded/MultiThreadedDBTest.
#     MultiThreaded/0  # GetParam() = 0
#     MultiThreaded/1  # GetParam() = 1
#
# into this:
#
#   DBTest.Empty
#   DBTest.WriteEmptyBatch
#   MultiThreaded/MultiThreadedDBTest.MultiThreaded/0
#   MultiThreaded/MultiThreadedDBTest.MultiThreaded/1
#

parallel_tests = $(patsubst %,parallel_%,$(PARALLEL_TEST))
.PHONY: gen_parallel_tests $(parallel_tests)
$(parallel_tests): $(PARALLEL_TEST)
	$(AM_V_at)TEST_BINARY=$(patsubst parallel_%,%,$@); \
	TEST_NAMES=` \
		./$$TEST_BINARY --gtest_list_tests \
		| perl -n \
			-e 's/ *\#.*//;' \
			-e '/^(\s*)(\S+)/; !$$1 and do {$$p=$$2; break};' \
			-e 'print qq! $$p$$2!'`; \
	for TEST_NAME in $$TEST_NAMES; do \
		TEST_SCRIPT=t/run-$$TEST_BINARY-$${TEST_NAME//\//-}; \
		echo "  GEN     " $$TEST_SCRIPT; \
		printf '%s\n' \
			'#!/bin/sh' \
			"d=\$(TMPD)$$TEST_SCRIPT" \
			'mkdir -p $$d' \
			"TEST_TMPDIR=\$$d $(DRIVER) ./$$TEST_BINARY --gtest_filter=$$TEST_NAME" \
		> $$TEST_SCRIPT; \
		chmod a=rx $$TEST_SCRIPT; \
	done

gen_parallel_tests:
	$(AM_V_at)mkdir -p t
	$(AM_V_at)rm -f t/run-*
	$(MAKE) $(parallel_tests)
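The name expansion described in the comment above `parallel_tests` can be tried in isolation. The following is a minimal sketch that reimplements the idea with `sed` and `awk`; it is not the Perl filter the Makefile uses, and the function name `expand_gtest_names` is hypothetical.

```shell
#!/bin/sh
# Sketch: turn --gtest_list_tests style output into fully-qualified names.
# expand_gtest_names is a hypothetical helper, not part of the Makefile.
expand_gtest_names() {
  # Drop "#"-introduced comments together with the spaces before them.
  sed 's/ *#.*//' | awk '
    /^[^ ]/ { suite = $1; next }  # unindented line: a suite such as "DBTest."
    NF      { print suite $1 }    # indented, non-empty line: one test name
  '
}
```

Feeding it the sample input from the comment should yield one `Suite.Test` name per line, which is the form that `--gtest_filter=` expects.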
# Reorder input lines (which are one per test) so that the
# longest-running tests appear first in the output.
# Do this by prefixing each selected name with its duration,
# sort the resulting names, and remove the leading numbers.
# FIXME: the "100" we prepend is a fake time, for now.
# FIXME: squirrel away timings from each run and use them
# (when present) on subsequent runs to order these tests.
#
# Without this reordering, these two tests would happen to start only
# after almost all other tests had completed, thus adding 100 seconds
# to the duration of parallel "make check". That's the difference
# between 4 minutes (old) and 2m20s (new).
#
# 152.120 PASS t/DBTest.FileCreationRandomFailure
# 107.816 PASS t/DBTest.EncodeDecompressedBlockSizeTest
#
slow_test_regexp = \
	^.*SnapshotConcurrentAccessTest.*$$|^t/run-table_test-HarnessTest.Randomized$$|^t/run-db_test-.*(?:FileCreationRandomFailure|EncodeDecompressedBlockSizeTest)$$|^.*RecoverFromCorruptedWALWithoutFlush$$
prioritize_long_running_tests = \
perl -pe 's,($(slow_test_regexp)),100 $$1,' \
| sort -k1,1gr \
| sed 's/^[.0-9]* //'
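The prefix, sort, and strip steps of `prioritize_long_running_tests` can be exercised on their own. This is a minimal sketch under stated assumptions: it uses `sed -E` in place of the Perl substitution, the pattern in `slow` is a shortened, hypothetical stand-in for `slow_test_regexp`, and the function name `prioritize` is made up for the example.

```shell
#!/bin/sh
# Shortened, hypothetical stand-in for slow_test_regexp.
slow='FileCreationRandomFailure|EncodeDecompressedBlockSizeTest'

# Sketch: move known-slow tests to the front of a one-name-per-line list.
prioritize() {
  # 1. Prefix each slow test name with the fake duration "100".
  # 2. Sort on that first field, numerically, largest first; names
  #    without a numeric prefix end up after the prefixed ones.
  # 3. Strip the numeric prefix again.
  sed -E "s,^(.*($slow))\$,100 \\1," \
    | sort -k1,1gr \
    | sed 's/^[.0-9]* //'
}
```

For example, piping `./env_test`, `t/run-db_test-DBTest.FileCreationRandomFailure`, and `./table_test` through `prioritize` should list the `FileCreationRandomFailure` script first; the relative order of the remaining names depends on the locale's collation.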
# "make check" passes J to GNU parallel as its "-j" argument.
# Run with "make J=1 check" to disable parallelism in "make check".
# Run with "make J=200% check" to run two parallel jobs per core.
# The default is to run one job per core (J=100%).
# See "man parallel" for its "-j ..." option.
J ?= 100%
# Use this regexp to select the subset of tests whose names match.
tests-regexp = .
.PHONY: check_0
check_0:
	$(AM_V_GEN)export TEST_TMPDIR=$(TMPD); \
	printf '%s\n' '' \
	  'To monitor subtest <duration,pass/fail,name>,' \
	  '  run "make watch-log" in a separate window' ''; \
	test -t 1 && eta=--eta || eta=; \
	{ \
		printf './%s\n' $(filter-out $(PARALLEL_TEST),$(TESTS)); \
		find t -name 'run-*' -print; \
	} \
	  | $(prioritize_long_running_tests) \
	  | grep -E '$(tests-regexp)' \
	  | build_tools/gnu_parallel -j$(J) --plain --joblog=LOG $$eta --gnu '{} >& t/log-{/}'

valgrind-blacklist-regexp = InlineSkipTest.ConcurrentInsert|TransactionStressTest.DeadlockStress|DBCompactionTest.SuggestCompactRangeNoTwoLevel0Compactions|BackupableDBTest.RateLimiting|DBTest.CloseSpeedup|DBTest.ThreadStatusFlush|DBTest.RateLimitingTest|DBTest.EncodeDecompressedBlockSizeTest|FaultInjectionTest.UninstalledCompaction|HarnessTest.Randomized|ExternalSSTFileTest.CompactDuringAddFileRandom|ExternalSSTFileTest.IngestFileWithGlobalSeqnoRandomized|MySQLStyleTransactionTest.TransactionStressTest

.PHONY: valgrind_check_0
valgrind_check_0:
	$(AM_V_GEN)export TEST_TMPDIR=$(TMPD); \
	printf '%s\n' '' \
	  'To monitor subtest <duration,pass/fail,name>,' \
	  '  run "make watch-log" in a separate window' ''; \
	test -t 1 && eta=--eta || eta=; \
	{ \
		printf './%s\n' $(filter-out $(PARALLEL_TEST) %skiplist_test options_settable_test, $(TESTS)); \
		find t -name 'run-*' -print; \
	} \
	  | $(prioritize_long_running_tests) \
	  | grep -E '$(tests-regexp)' \
	  | grep -E -v '$(valgrind-blacklist-regexp)' \
	  | build_tools/gnu_parallel -j$(J) --plain --joblog=LOG $$eta --gnu \
	  '(if [[ "{}" == "./"* ]] ; then $(DRIVER) {}; else {}; fi) ' \
	  '>& t/valgrind_log-{/}'
CLEAN_FILES += t LOG $(TMPD)

# When running parallel "make check", you can monitor its progress
# from another window.
# Run "make watch-log" to show the duration,PASS/FAIL,name of parallel
# tests as they are being run. We sort them so that longer-running ones
# appear at the top of the list and any failing tests remain at the top
# regardless of their duration. As with any use of "watch", hit ^C to
# interrupt.
watch-log:
	$(WATCH) --interval=0 'sort -k7,7nr -k4,4gr LOG|$(quoted_perl_command)'

# If J != 1 and GNU parallel is installed, run the tests in parallel,
# via the check_0 rule above. Otherwise, run them sequentially.
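# Usage sketch, based on the invocations described in the change that
# introduced parallel "make check" (adjust J to suit your machine):
#   make check          # parallel when GNU parallel is available
#   make J=1 check      # run one test at a time (the sequential branch below)
#   make J=60% check    # aim to keep about 60% of the cores busy
#   make watch-log      # from another window: follow per-test progress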
check: all
	$(MAKE) gen_parallel_tests
	$(AM_V_GEN)if test "$(J)" != 1 \
	    && (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
	        grep -q 'GNU Parallel'; \
	then \
	    $(MAKE) T="$$t" TMPD=$(TMPD) check_0; \
	else \
	    for t in $(TESTS); do \
	      echo "===== Running $$t (`date`)"; ./$$t || exit 1; done; \
	fi
	rm -rf $(TMPD)
ifneq ($(PLATFORM), OS_AIX)
	$(PYTHON) tools/check_all_python.py
ifeq ($(filter -DROCKSDB_LITE,$(OPT)),)
	$(PYTHON) tools/ldb_test.py
	sh tools/rocksdb_dump_test.sh
endif
endif

# TODO add ldb_tests
check_some: $(SUBSET)
	for t in $(SUBSET); do echo "===== Running $$t (`date`)"; ./$$t || exit 1; done

.PHONY: ldb_tests
ldb_tests: ldb
	$(PYTHON) tools/ldb_test.py

crash_test: whitebox_crash_test blackbox_crash_test

crash_test_with_atomic_flush: whitebox_crash_test_with_atomic_flush blackbox_crash_test_with_atomic_flush

crash_test_with_txn: whitebox_crash_test_with_txn blackbox_crash_test_with_txn

blackbox_crash_test: db_stress
	$(PYTHON) -u tools/db_crashtest.py --simple blackbox $(CRASH_TEST_EXT_ARGS)
	$(PYTHON) -u tools/db_crashtest.py blackbox $(CRASH_TEST_EXT_ARGS)

blackbox_crash_test_with_atomic_flush: db_stress
	$(PYTHON) -u tools/db_crashtest.py --cf_consistency blackbox $(CRASH_TEST_EXT_ARGS)

blackbox_crash_test_with_txn: db_stress
	$(PYTHON) -u tools/db_crashtest.py --txn blackbox $(CRASH_TEST_EXT_ARGS)

ifeq ($(CRASH_TEST_KILL_ODD),)
  CRASH_TEST_KILL_ODD=888887
endif

whitebox_crash_test: db_stress
	$(PYTHON) -u tools/db_crashtest.py --simple whitebox --random_kill_odd \
		$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
	$(PYTHON) -u tools/db_crashtest.py whitebox --random_kill_odd \
		$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)

whitebox_crash_test_with_atomic_flush: db_stress
	$(PYTHON) -u tools/db_crashtest.py --cf_consistency whitebox --random_kill_odd \
		$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)

whitebox_crash_test_with_txn: db_stress
	$(PYTHON) -u tools/db_crashtest.py --txn whitebox --random_kill_odd \
		$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)

asan_check:
	$(MAKE) clean
	COMPILE_WITH_ASAN=1 $(MAKE) check -j32
	$(MAKE) clean

asan_crash_test:
	$(MAKE) clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test
	$(MAKE) clean

asan_crash_test_with_atomic_flush:
	$(MAKE) clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test_with_atomic_flush
	$(MAKE) clean

asan_crash_test_with_txn:
	$(MAKE) clean
	COMPILE_WITH_ASAN=1 $(MAKE) crash_test_with_txn
	$(MAKE) clean

ubsan_check:
	$(MAKE) clean
	COMPILE_WITH_UBSAN=1 $(MAKE) check -j32
	$(MAKE) clean

ubsan_crash_test:
	$(MAKE) clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test
	$(MAKE) clean

ubsan_crash_test_with_atomic_flush:
	$(MAKE) clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test_with_atomic_flush
	$(MAKE) clean

ubsan_crash_test_with_txn:
	$(MAKE) clean
	COMPILE_WITH_UBSAN=1 $(MAKE) crash_test_with_txn
	$(MAKE) clean

valgrind_test:
	ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check

valgrind_check: $(TESTS)
	$(MAKE) DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" gen_parallel_tests
	$(AM_V_GEN)if test "$(J)" != 1 \
	    && (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
	        grep -q 'GNU Parallel'; \
	then \
	    $(MAKE) TMPD=$(TMPD) \
	    DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" valgrind_check_0; \
	else \
	    for t in $(filter-out %skiplist_test options_settable_test,$(TESTS)); do \
	      $(VALGRIND_VER) $(VALGRIND_OPTS) ./$$t; \
	      ret_code=$$?; \
	      if [ $$ret_code -ne 0 ]; then \
	        exit $$ret_code; \
	      fi; \
	    done; \
	fi

ifneq ($(PAR_TEST),)
# Run each test in $(PAR_TEST) as $(J) parallel jobs; each job's stdout and
# stderr are appended to $(TMPD)/rdb-N/log-N.
parloop:
	ret_bad=0; \
	for t in $(PAR_TEST); do \
	  echo "===== Running $$t in parallel $(NUM_PAR) (`date`)"; \
	  if [ $(db_test) -eq 1 ]; then \
	    seq $(J) | v="$$t" build_tools/gnu_parallel --gnu --plain 's=$(TMPD)/rdb-{}; export TEST_TMPDIR=$$s;' \
	      'timeout 2m ./db_test --gtest_filter=$$v >> $$s/log-{} 2>&1'; \
	  else \
	    seq $(J) | v="./$$t" build_tools/gnu_parallel --gnu --plain 's=$(TMPD)/rdb-{};' \
	      'export TEST_TMPDIR=$$s; timeout 10m $$v >> $$s/log-{} 2>&1'; \
	  fi; \
	  ret_code=$$?; \
	  if [ $$ret_code -ne 0 ]; then \
	    ret_bad=$$ret_code; \
	    echo $$t exited with $$ret_code; \
	  fi; \
	done; \
	exit $$ret_bad;
endif

test_names = \
./db_test --gtest_list_tests \
| perl -n \
-e 's/ *\#.*//;' \
-e '/^(\s*)(\S+)/; !$$1 and do {$$p=$$2; break};' \
-e 'print qq! $$p$$2!'

parallel_check: $(TESTS)
	$(AM_V_GEN)if test "$(J)" -gt 1 \
	    && (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
	        grep -q 'GNU Parallel'; \
	then \
	    echo Running in parallel $(J); \
	else \
	    echo "Need to have GNU Parallel and J > 1"; exit 1; \
	fi; \
	ret_bad=0; \
	echo $(J); \
	echo Test Dir: $(TMPD); \
	seq $(J) | build_tools/gnu_parallel --gnu --plain 's=$(TMPD)/rdb-{}; rm -rf $$s; mkdir $$s'; \
	$(MAKE) PAR_TEST="$(shell $(test_names))" TMPD=$(TMPD) \
	    J=$(J) db_test=1 parloop; \
	$(MAKE) PAR_TEST="$(filter-out db_test, $(TESTS))" \
	    TMPD=$(TMPD) J=$(J) db_test=0 parloop;

analyze: clean
	USE_CLANG=1 $(MAKE) analyze_incremental

analyze_incremental:
	$(CLANG_SCAN_BUILD) --use-analyzer=$(CLANG_ANALYZER) \
		--use-c++=$(CXX) --use-cc=$(CC) --status-bugs \
		-o $(CURDIR)/scan_build_report \
		$(MAKE) dbg

CLEAN_FILES += unity.cc
unity.cc: Makefile
	rm -f $@ $@-t
	for source_file in $(LIB_SOURCES); do \
		echo "#include \"$$source_file\"" >> $@-t; \
	done
	chmod a=r $@-t
	mv $@-t $@

unity.a: unity.o
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ unity.o

TOOLLIBOBJECTS = $(TOOL_LIB_SOURCES:.cc=.o)
# try compiling db_test with unity
unity_test: db/db_test.o db/db_test_util.o $(TESTHARNESS) $(TOOLLIBOBJECTS) unity.a
	$(AM_LINK)
	./unity_test

rocksdb.h rocksdb.cc: build_tools/amalgamate.py Makefile $(LIB_SOURCES) unity.cc
	build_tools/amalgamate.py -I. -i./include unity.cc -x include/rocksdb/c.h -H rocksdb.h -o rocksdb.cc

clean: clean-ext-libraries-all clean-rocks clean-rocksjava

clean-not-downloaded: clean-ext-libraries-bin clean-rocks clean-not-downloaded-rocksjava

clean-rocks:
	rm -f $(BENCHMARKS) $(TOOLS) $(TESTS) $(PARALLEL_TEST) $(LIBRARY) $(SHARED)
	rm -rf $(CLEAN_FILES) ios-x86 ios-arm scan_build_report
	$(FIND) . -name "*.[oda]" -exec rm -f {} \;
	$(FIND) . -type f -regex ".*\.\(\(gcda\)\|\(gcno\)\)" -exec rm {} \;

clean-rocksjava:
	cd java && $(MAKE) clean

clean-not-downloaded-rocksjava:
	cd java && $(MAKE) clean-not-downloaded

clean-ext-libraries-all:
	rm -rf bzip2* snappy* zlib* lz4* zstd*

clean-ext-libraries-bin:
	find . -maxdepth 1 -type d \( -name bzip2\* -or -name snappy\* -or -name zlib\* -or -name lz4\* -or -name zstd\* \) -prune -exec rm -rf {} \;

tags:
	ctags -R .
	cscope -b `$(FIND) . -name '*.cc'` `$(FIND) . -name '*.h'` `$(FIND) . -name '*.c'`
	ctags -e -R -o etags *

tags0:
	ctags -R .
	cscope -b `$(FIND) . -name '*.cc' -and ! -name '*_test.cc'` \
		  `$(FIND) . -name '*.c' -and ! -name '*_test.c'` \
		  `$(FIND) . -name '*.h' -and ! -name '*_test.h'`
	ctags -e -R -o etags *

format:
	build_tools/format-diff.sh

Package generation for Ubuntu and CentOS
Summary:
I put together a script to assist in the generation of deb's and
rpm's. I've tested that this works on ubuntu via vagrant. I've included the
Vagrantfile here, but I can remove it if it's not useful. The package.sh
script should work on any ubuntu or centos machine, I just added a bit of
logic in there to allow a base Ubuntu or Centos machine to be able to build
RocksDB from scratch.
Example output on Ubuntu 14.04:
```
root@vagrant-ubuntu-trusty-64:/vagrant# ./tools/package.sh
[+] g++-4.7 is already installed. skipping.
[+] libgflags-dev is already installed. skipping.
[+] ruby-all-dev is already installed. skipping.
[+] fpm is already installed. skipping.
Created package {:path=>"rocksdb_3.5_amd64.deb"}
root@vagrant-ubuntu-trusty-64:/vagrant# dpkg --info rocksdb_3.5_amd64.deb
new debian package, version 2.0.
size 17392022 bytes: control archive=1518 bytes.
275 bytes, 11 lines control
2911 bytes, 38 lines md5sums
Package: rocksdb
Version: 3.5
License: BSD
Vendor: Facebook
Architecture: amd64
Maintainer: rocksdb@fb.com
Installed-Size: 83358
Section: default
Priority: extra
Homepage: http://rocksdb.org/
Description: RocksDB is an embeddable persistent key-value store for fast storage.
```
Example output on CentOS 6.5:
```
[root@localhost vagrant]# rpm -qip rocksdb-3.5-1.x86_64.rpm
Name : rocksdb Relocations: /usr
Version : 3.5 Vendor: Facebook
Release : 1 Build Date: Mon 29 Sep 2014 01:26:11 AM UTC
Install Date: (not installed) Build Host: localhost
Group : default Source RPM: rocksdb-3.5-1.src.rpm
Size : 96231106 License: BSD
Signature : (none)
Packager : rocksdb@fb.com
URL : http://rocksdb.org/
Summary : RocksDB is an embeddable persistent key-value store for fast storage.
Description :
RocksDB is an embeddable persistent key-value store for fast storage.
```
Test Plan:
How this gets used is really up to the RocksDB core team. If you
want to actually get this into mainline, you might have to change `make
install` such that it installs the RocksDB shared object file as well, which
would require you to link against gflags (maybe?) and that would require some
potential modifications to the script here (basically add a depends on that
package).
Currently, this will install the headers and a pre-compiled statically linked
object file. If that's what you want out of life, then this requires no
modifications.
Reviewers: ljin, yhchiang, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D24141
package:
	bash build_tools/make_package.sh $(SHARED_MAJOR).$(SHARED_MINOR)

Package generation for Ubuntu and CentOS
Summary:
I put together a script to assist in the generation of deb's and
rpm's. I've tested that this works on ubuntu via vagrant. I've included the
Vagrantfile here, but I can remove it if it's not useful. The package.sh
script should work on any ubuntu or centos machine, I just added a bit of
logic in there to allow a base Ubuntu or Centos machine to be able to build
RocksDB from scratch.
Example output on Ubuntu 14.04:
```
root@vagrant-ubuntu-trusty-64:/vagrant# ./tools/package.sh
[+] g++-4.7 is already installed. skipping.
[+] libgflags-dev is already installed. skipping.
[+] ruby-all-dev is already installed. skipping.
[+] fpm is already installed. skipping.
Created package {:path=>"rocksdb_3.5_amd64.deb"}
root@vagrant-ubuntu-trusty-64:/vagrant# dpkg --info rocksdb_3.5_amd64.deb
new debian package, version 2.0.
size 17392022 bytes: control archive=1518 bytes.
275 bytes, 11 lines control
2911 bytes, 38 lines md5sums
Package: rocksdb
Version: 3.5
License: BSD
Vendor: Facebook
Architecture: amd64
Maintainer: rocksdb@fb.com
Installed-Size: 83358
Section: default
Priority: extra
Homepage: http://rocksdb.org/
Description: RocksDB is an embeddable persistent key-value store for fast storage.
```
Example output on CentOS 6.5:
```
[root@localhost vagrant]# rpm -qip rocksdb-3.5-1.x86_64.rpm
Name : rocksdb Relocations: /usr
Version : 3.5 Vendor: Facebook
Release : 1 Build Date: Mon 29 Sep 2014 01:26:11 AM UTC
Install Date: (not installed) Build Host: localhost
Group : default Source RPM: rocksdb-3.5-1.src.rpm
Size : 96231106 License: BSD
Signature : (none)
Packager : rocksdb@fb.com
URL : http://rocksdb.org/
Summary : RocksDB is an embeddable persistent key-value store for fast storage.
Description :
RocksDB is an embeddable persistent key-value store for fast storage.
```
Test Plan:
How this gets used is really up to the RocksDB core team. If you
want to actually get this into mainline, you might have to change `make
install` such that it installs the RocksDB shared object file as well, which
would require you to link against gflags (maybe?) and that would require some
potential modifications to the script here (basically add a depends on that
package).
Currently, this will install the headers and a pre-compiled statically linked
object file. If that's what you want out of life, then this requires no
modifications.
Reviewers: ljin, yhchiang, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D24141
# ---------------------------------------------------------------------------
# Unit tests and tools
# ---------------------------------------------------------------------------
$(LIBRARY): $(LIBOBJECTS)
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $(LIBOBJECTS)

$(TOOLS_LIBRARY): $(BENCH_LIB_SOURCES:.cc=.o) $(TOOL_LIB_SOURCES:.cc=.o) $(LIB_SOURCES:.cc=.o) $(TESTUTIL) $(ANALYZER_LIB_SOURCES:.cc=.o)
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^

$(STRESS_LIBRARY): $(LIB_SOURCES:.cc=.o) $(TESTUTIL) $(ANALYZER_LIB_SOURCES:.cc=.o) $(STRESS_LIB_SOURCES:.cc=.o)
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^

librocksdb_env_basic_test.a: env/env_basic_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_V_AR)rm -f $@
	$(AM_V_at)$(AR) $(ARFLAGS) $@ $^

db_bench: tools/db_bench.o $(BENCHTOOLOBJECTS)
	$(AM_LINK)

trace_analyzer: tools/trace_analyzer.o $(ANALYZETOOLOBJECTS) $(LIBOBJECTS)
	$(AM_LINK)
Block cache simulator: Add pysim to simulate caches using reinforcement learning. (#5610)
Summary:
This PR implements cache eviction using reinforcement learning. It includes two implementations:
1. An implementation of Thompson Sampling for the Bernoulli Bandit [1].
2. An implementation of LinUCB with disjoint linear models [2].
The idea is that a cache uses multiple eviction policies, e.g., MRU, LRU, and LFU. The cache learns which eviction policy is the best and uses it upon a cache miss.
Thompson Sampling is contextless and does not include any features.
LinUCB includes features such as level, block type, caller, column family id to decide which eviction policy to use.
[1] Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, and Zheng Wen. 2018. A Tutorial on Thompson Sampling. Found. Trends Mach. Learn. 11, 1 (July 2018), 1-96. DOI: https://doi.org/10.1561/2200000070
[2] Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web (WWW '10). ACM, New York, NY, USA, 661-670. DOI=http://dx.doi.org/10.1145/1772690.1772758
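The Thompson Sampling variant described above can be sketched as follows. This is a minimal illustration and not the pysim implementation: the class name, the Beta(1, 1) prior, the `simulate` helper, and the idea of rewarding a policy with 1 or 0 after each decision are all assumptions made for the example.

```python
import random


class ThompsonSamplingPolicySelector:
    """Bernoulli-bandit selector over eviction policies (hypothetical sketch)."""

    def __init__(self, policies, seed=None):
        self.policies = list(policies)
        # Beta(1, 1) prior per policy: alpha tracks rewards, beta tracks penalties.
        self.alpha = {p: 1.0 for p in self.policies}
        self.beta = {p: 1.0 for p in self.policies}
        self.rng = random.Random(seed)

    def choose(self):
        # Draw one sample per policy from its posterior and pick the largest.
        samples = {
            p: self.rng.betavariate(self.alpha[p], self.beta[p])
            for p in self.policies
        }
        return max(samples, key=samples.get)

    def update(self, policy, reward):
        # reward is truthy for a good outcome and falsy otherwise (assumed signal).
        if reward:
            self.alpha[policy] += 1.0
        else:
            self.beta[policy] += 1.0


def simulate(true_reward_prob, rounds, seed=0):
    """Runs the selector against made-up per-policy reward probabilities."""
    selector = ThompsonSamplingPolicySelector(true_reward_prob, seed=seed)
    for _ in range(rounds):
        policy = selector.choose()
        reward = selector.rng.random() < true_reward_prob[policy]
        selector.update(policy, reward)
    return selector
```

Because the selector is contextless, it keeps only two counters per policy. A contextual approach such as LinUCB would additionally take a feature vector (level, block type, caller, column family id) as input to `choose`.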
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5610
Differential Revision: D16435067
Pulled By: HaoyuHuang
fbshipit-source-id: 6549239ae14115c01cb1e70548af9e46d8dc21bb
block_cache_trace_analyzer: tools/block_cache_analyzer/block_cache_trace_analyzer_tool.o $(ANALYZETOOLOBJECTS) $(LIBOBJECTS)
Support computing miss ratio curves using sim_cache. (#5449)
Summary:
This PR adds a BlockCacheTraceSimulator that reports the miss ratios given different cache configurations. A cache configuration contains "cache_name,num_shard_bits,cache_capacities". For example, "lru, 1, 1K, 2K, 4M, 4G".
When we replay the trace, we also perform lookups and inserts on the simulated caches.
In the end, it reports the miss ratio for each tuple <cache_name, num_shard_bits, cache_capacity> in an output file.
This PR also adds a main source file, block_cache_trace_analyzer, so that we can run the analyzer from the command line.
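The replay-and-report idea can be illustrated with a small, self-contained sketch. It is not the BlockCacheTraceSimulator code: the function and class names are hypothetical, the K/M/G suffixes are assumed to be powers of 1024, the trace is assumed to be a list of (block key, block size) pairs, and num_shard_bits is parsed but ignored because the sketch simulates a single shard.

```python
from collections import OrderedDict

# Assumption: capacity suffixes are binary multiples.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_cache_config(config):
    """Parses "cache_name,num_shard_bits,cache_capacities" into its parts."""
    fields = [f.strip() for f in config.split(",")]
    name, num_shard_bits = fields[0], int(fields[1])
    capacities = []
    for field in fields[2:]:
        unit = _UNITS.get(field[-1].upper())
        capacities.append(int(field[:-1]) * unit if unit else int(field))
    return name, num_shard_bits, capacities


class SimulatedLRUCache:
    """Single-shard LRU cache that only tracks keys, sizes, and miss counts."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.usage = 0
        self.blocks = OrderedDict()  # key -> size, oldest entry first
        self.lookups = 0
        self.misses = 0

    def access(self, key, size):
        # A lookup, followed by an insert when the lookup misses.
        self.lookups += 1
        if key in self.blocks:
            self.blocks.move_to_end(key)
            return
        self.misses += 1
        if size > self.capacity:
            return  # the block can never fit, so it is not inserted
        self.blocks[key] = size
        self.usage += size
        while self.usage > self.capacity:
            _, evicted_size = self.blocks.popitem(last=False)
            self.usage -= evicted_size

    def miss_ratio(self):
        return self.misses / self.lookups if self.lookups else 0.0


def miss_ratios(config, trace):
    """Returns {(cache_name, num_shard_bits, capacity): miss ratio}."""
    name, num_shard_bits, capacities = parse_cache_config(config)
    result = {}
    for capacity in capacities:
        cache = SimulatedLRUCache(capacity)
        for key, size in trace:
            cache.access(key, size)
        result[(name, num_shard_bits, capacity)] = cache.miss_ratio()
    return result
```

Plotting the resulting miss ratios against capacity gives a miss ratio curve for the replayed trace.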
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5449
Test Plan:
Added tests for block_cache_trace_analyzer.
COMPILE_WITH_ASAN=1 make check -j32.
Differential Revision: D15797073
Pulled By: HaoyuHuang
fbshipit-source-id: aef0c5c2e7938f3e8b6a10d4a6a50e6928ecf408
	$(AM_LINK)

ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
folly_synchronization_distributed_mutex_test: $(LIBOBJECTS) $(TESTHARNESS) $(FOLLYOBJECTS) third-party/folly/folly/synchronization/test/DistributedMutexTest.o
	$(AM_LINK)
endif

cache_bench: cache/cache_bench.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

persistent_cache_bench: utilities/persistent_cache/persistent_cache_bench.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

memtablerep_bench: memtable/memtablerep_bench.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)
Memtablerep Benchmark
Summary:
Create a benchmark for testing memtablereps. This diff is a bit rough, but it should do the trick until other bootcampers can clean it up.
Addressing comments
Removed the mutexes
Changed ReadWriteBenchmark to fix number of reads and count the number of writes we can perform in that time.
Test Plan:
Run it.
Below runs pass
./memtablerep_bench --benchmarks fillrandom,readrandom --memtablerep skiplist
./memtablerep_bench --benchmarks fillseq,readseq --memtablerep skiplist
./memtablerep_bench --benchmarks readwrite,seqreadwrite --memtablerep skiplist --num_operations 200 --num_threads 5
./memtablerep_bench --benchmarks fillrandom,readrandom --memtablerep hashskiplist
./memtablerep_bench --benchmarks fillseq,readseq --memtablerep hashskiplist --num_scans 2
./memtablerep_bench --benchmarks fillseq,readseq --memtablerep vector
Reviewers: jpaton, ikabiljo, sdong
Reviewed By: sdong
Subscribers: dhruba, ameyag
Differential Revision: https://reviews.facebook.net/D22683
filter_bench: util/filter_bench.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

db_stress: db_stress_tool/db_stress.o $(STRESSTOOLOBJECTS)
	$(AM_LINK)
Write stress test
Summary:
The goal of this diff is to create a simple stress test with focus on catching:
* bugs in compaction/flush processes, especially the ones that cause assertion errors
* bugs in the code that deletes obsolete files
There are two parts of the test:
* write_stress, a binary that writes to the database
* write_stress_runner.py, a script that invokes and kills write_stress
Here are some interesting parts of write_stress:
* Runs with very high concurrency of compactions and flushes (32 threads total) and tries to create a huge number of small files
* The keys written to the database are not uniformly distributed -- there is a 3-character prefix that mutates occasionally (in the prefix mutator thread), in such a way that the first character mutates slower than the second, which mutates slower than the third. That way, the test stresses some interesting compaction features like trivial moves and bottommost level calculation
* There is a thread that creates an iterator, holds it for a couple of seconds and then iterates over all keys. This is supposed to test RocksDB's ability to keep the files alive when there are references to them.
* Some writes trigger WAL sync. This is stress testing our WAL sync code.
* At the end of the run, we make sure that we didn't leak any of the sst files
write_stress_runner.py changes the mode in which we run write_stress and also kills and restarts it. There are some interesting characteristics:
* At the beginning we divide the full test runtime into smaller parts -- shorter runtimes (a couple of seconds) and longer runtimes (100 or 1000 seconds)
* The first time we run write_stress, we destroy the old DB. On every subsequent run during the test, we reuse the same DB.
* We can run in kill mode or clean-restart mode. Kill mode kills the write_stress violently.
* We can run in a mode where delete_obsolete_files_with_fullscan is true or false
* We can run with low_open_files mode turned on or off. When it's turned on, we configure the table cache to hold only a couple of files -- that way we need to reopen files every time we access them.
Another goal was to create a stress test without a lot of parameters. So tools/write_stress_runner.py should take only one parameter, runtime_sec, and it should figure out everything else on its own.
In a separate diff, I'll add this new test to our nightly legocastle runs.
Test Plan:
The goal of this test was to retroactively catch the following bugs: D33045, D48201, D46899, D42399. I failed to reproduce D48201, but all others have been caught!
When I reverted https://reviews.facebook.net/D33045:
./write_stress --runtime_sec=200 --low_open_files_mode=true
Iterator statuts not OK: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/089166.sst: No such file or directory
When I reverted https://reviews.facebook.net/D42399:
python tools/write_stress_runner.py --runtime_sec=5000
Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1
Running write_stress, will kill after 2 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
Running write_stress, will kill after 7 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
Running write_stress, will kill after 8 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --low_open_files_mode=true
Write to DB failed: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/019250.sst: No such file or directory
ERROR: write_stress died with exitcode=-6
When I reverted https://reviews.facebook.net/D46899:
python tools/write_stress_runner.py --runtime_sec=1000
runtime: 1000
Going to execute write stress for [3, 3, 100, 3, 2, 100, 1, 788]
Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --low_open_files_mode=true
Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --destroy_db=false --delete_obsolete_files_with_fullscan=true
Running write_stress, will kill after 100 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
write_stress: db/db_impl.cc:2070: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
ERROR: write_stress died with exitcode=-6
Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, sdong, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49533
write_stress: tools/write_stress.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

db_sanity_test: tools/db_sanity_test.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

db_repl_stress: tools/db_repl_stress.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

arena_test: memory/arena_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Provide an allocator for new memory type to be used with RocksDB block cache (#6214)
Summary:
New memory technologies are being developed by various hardware vendors (Intel DCPMM is one such technology currently available). These new memory types require different libraries for allocation and management (such as PMDK and memkind). The high capacities available make it possible to provision large caches (up to several TBs in size), beyond what is achievable with DRAM.
The new allocator provided in this PR uses the memkind library to allocate memory on different media.
**Performance**
We tested the new allocator using db_bench.
- For each test, we vary the size of the block cache (relative to the size of the uncompressed data in the database).
- The database is filled sequentially. Throughput is then measured with a readrandom benchmark.
- We use a uniform distribution as a worst-case scenario.
The plot shows throughput (ops/s) relative to a configuration with no block cache and default allocator.
For all tests, p99 latency is below 500 us.
![image](https://user-images.githubusercontent.com/26400080/71108594-42479100-2178-11ea-8231-8a775bbc92db.png)
**Changes**
- Add MemkindKmemAllocator
- Add --use_cache_memkind_kmem_allocator db_bench option (to create an LRU block cache with the new allocator)
- Add detection of memkind library with KMEM DAX support
- Add test for MemkindKmemAllocator
**Minimum Requirements**
- kernel 5.3.12
- ndctl v67 - https://github.com/pmem/ndctl
- memkind v1.10.0 - https://github.com/memkind/memkind
**Memory Configuration**
The allocator uses the MEMKIND_DAX_KMEM memory kind. Follow the instructions on [memkind’s GitHub page](https://github.com/memkind/memkind) to set up NVDIMM memory accordingly.
Note on memory allocation with NVDIMM memory exposed as system memory.
- The MemkindKmemAllocator will only allocate from NVDIMM memory (using memkind_malloc with MEMKIND_DAX_KMEM kind).
- The default allocator is not restricted to RAM by default. Based on NUMA node latency, the kernel should allocate from local RAM preferentially, but it’s a kernel decision. numactl --preferred/--membind can be used to allocate preferentially/exclusively from the local RAM node.
**Usage**
When creating an LRU cache, pass a MemkindKmemAllocator object as argument.
For example (replace capacity with the desired value in bytes):
```
#include "rocksdb/cache.h"
#include "memory/memkind_kmem_allocator.h"

NewLRUCache(
    capacity /*size_t*/,
    6 /*num_shard_bits*/,
    false /*strict_capacity_limit*/,
    0.0 /*high_pri_pool_ratio*/,
    std::make_shared<MemkindKmemAllocator>());
```
Refer to [RocksDB’s block cache documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache) to assign the LRU cache as block cache for a database.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6214
Reviewed By: cheng-chang
Differential Revision: D19292435
fbshipit-source-id: 7202f47b769e7722b539c86c2ffd669f64d7b4e1
memkind_kmem_allocator_test: memory/memkind_kmem_allocator_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

autovector_test: util/autovector_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

column_family_test: db/column_family_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

table_properties_collector_test: db/table_properties_collector_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

bloom_test: util/bloom_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

dynamic_bloom_test: util/dynamic_bloom_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

c_test: db/c_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cache_test: cache/cache_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

coding_test: util/coding_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

hash_test: util/hash_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

random_test: util/random_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

option_change_migration_test: utilities/option_change_migration/option_change_migration_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

stringappend_test: utilities/merge_operators/string_append/stringappend_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cassandra_format_test: utilities/cassandra/cassandra_format_test.o utilities/cassandra/test_utils.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cassandra_functional_test: utilities/cassandra/cassandra_functional_test.o utilities/cassandra/test_utils.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cassandra_row_merge_test: utilities/cassandra/cassandra_row_merge_test.o utilities/cassandra/test_utils.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cassandra_serialize_test: utilities/cassandra/cassandra_serialize_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

hash_table_test: utilities/persistent_cache/hash_table_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

histogram_test: monitoring/histogram_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
thread_local_test: util/thread_local_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

work_queue_test: util/work_queue_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

corruption_test: db/corruption_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

crc32c_test: util/crc32c_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

slice_test: util/slice_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

slice_transform_test: util/slice_transform_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_basic_test: db/db_basic_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_with_timestamp_basic_test: db/db_with_timestamp_basic_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_encryption_test: db/db_encryption_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_test: db/db_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_test2: db/db_test2.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_logical_block_size_cache_test: db/db_logical_block_size_cache_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_blob_index_test: db/blob/db_blob_index_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_block_cache_test: db/db_block_cache_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_bloom_filter_test: db/db_bloom_filter_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_log_iter_test: db/db_log_iter_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_compaction_filter_test: db/db_compaction_filter_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_compaction_test: db/db_compaction_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_dynamic_level_test: db/db_dynamic_level_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Fix flush not being committed while writing manifest
Summary:
Fix flush not being committed while writing manifest, which is a recent bug introduced by D60075.
The issue:
1. Options.max_background_flushes > 1.
2. Background thread A picks up a flush job, flushes, then commits to the manifest. (Note that the mutex is released before writing the manifest.)
3. Background thread B picks up another flush job and flushes. When it gets to `MemTableList::InstallMemtableFlushResults`, it notices that another thread is committing, so it quits.
4. After the first commit, thread A doesn't double-check whether more flush results need to be committed, leaving the second flush uncommitted.
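The race above can be modelled without threads. The sketch below is a hypothetical, single-threaded model and not the MemTableList code: the class, the `recheck` flag, and the `on_manifest_write` callback (which stands in for whatever another thread does while the mutex is released) are assumptions made for illustration.

```python
class MemTableListModel:
    """Toy model of installing flush results with one committer at a time."""

    def __init__(self):
        self.pending = []      # flush results not yet written to the manifest
        self.committed = []    # flush results recorded in the manifest
        self.commit_in_progress = False

    def install_flush_result(self, result, recheck=True, on_manifest_write=None):
        self.pending.append(result)
        if self.commit_in_progress:
            # Another thread is committing; leave the result for it and quit.
            return
        self.commit_in_progress = True
        while self.pending:
            batch, self.pending = self.pending, []
            # The mutex is released here, so other threads may add results.
            if on_manifest_write is not None:
                on_manifest_write()
            self.committed.extend(batch)
            if not recheck:
                break  # models the bug: results added meanwhile are never looked at
        self.commit_in_progress = False


def simulate_flush_race(recheck):
    """Thread B finishes its flush while thread A is writing the manifest."""
    model = MemTableListModel()
    fired = []

    def thread_b():
        if not fired:
            fired.append(True)
            model.install_flush_result("flush-B")

    model.install_flush_result("flush-A", recheck=recheck,
                               on_manifest_write=thread_b)
    return model.committed, model.pending
```

With `recheck=False` the model leaves the second flush result pending, which is the symptom described in step 4. With `recheck=True` the committing caller loops until nothing is pending.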
Test Plan: run the test. Also verify that the new test hits a deadlock without the fix.
Reviewers: sdong, igor, lightmark
Reviewed By: lightmark
Subscribers: andrewkr, omegaga, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60969
db_flush_test: db/db_flush_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_inplace_update_test: db/db_inplace_update_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_iterator_test: db/db_iterator_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_memtable_test: db/db_memtable_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_merge_operator_test: db/db_merge_operator_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
New API to get all merge operands for a Key (#5604)
Summary:
This is a new API added to db.h to allow for fetching all merge operands associated with a Key. The main motivation for this API is to support use cases where doing a full online merge is not necessary as it is performance sensitive. Example use-cases:
1. Update subset of columns and read subset of columns -
Imagine a SQL Table, a row is encoded as a K/V pair (as it is done in MyRocks). If there are many columns and users only updated one of them, we can use merge operator to reduce write amplification. While users only read one or two columns in the read query, this feature can avoid a full merging of the whole row, and save some CPU.
2. Updating very few attributes in a value which is a JSON-like document -
Updating one attribute can be done efficiently using merge operator, while reading back one attribute can be done more efficiently if we don't need to do a full merge.
----------------------------------------------------------------------------------------------------
API:
Status GetMergeOperands(
    const ReadOptions& options, ColumnFamilyHandle* column_family,
    const Slice& key, PinnableSlice* merge_operands,
    GetMergeOperandsOptions* get_merge_operands_options,
    int* number_of_operands)
Example usage:
int size = 100;
int number_of_operands = 0;
std::vector<PinnableSlice> values(size);
GetMergeOperandsOptions merge_operands_info;
merge_operands_info.expected_max_number_of_operands = size;
db_->GetMergeOperands(ReadOptions(), db_->DefaultColumnFamily(), "k1", values.data(), &merge_operands_info, &number_of_operands);
Description:
Returns all the merge operands corresponding to the key. If the number of merge operands in the DB is greater than get_merge_operands_options->expected_max_number_of_operands, no merge operands are returned and the status is Incomplete. Merge operands are returned in the order of insertion.
merge_operands: points to an array of at least get_merge_operands_options->expected_max_number_of_operands elements; the caller is responsible for allocating it. If the status returned is Incomplete, number_of_operands will contain the total number of merge operands found in the DB for the key.
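The contract in the description can be modelled in a few lines. This is a language-neutral sketch of the stated semantics written in Python, not the C++ implementation: the function names, the string status values, and the retry helper are hypothetical.

```python
def get_merge_operands(stored_operands, expected_max_number_of_operands):
    """Models the described contract.

    Returns (status, operands, number_of_operands). stored_operands is the
    list of operands for one key, in insertion order.
    """
    number_of_operands = len(stored_operands)
    if number_of_operands > expected_max_number_of_operands:
        # Too many operands for the caller's buffer: return none of them,
        # but report how many exist so the caller can size a retry.
        return "Incomplete", [], number_of_operands
    return "OK", list(stored_operands), number_of_operands


def get_merge_operands_with_retry(stored_operands, initial_max):
    """Caller-side pattern: retry once with the reported operand count."""
    status, operands, count = get_merge_operands(stored_operands, initial_max)
    if status == "Incomplete":
        status, operands, count = get_merge_operands(stored_operands, count)
    return status, operands, count
```

The retry helper shows why reporting the total count on Incomplete is useful: a caller can allocate exactly that many slots and call again.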
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5604
Test Plan:
Added unit test and perf test in db_bench that can be run using the command:
./db_bench -benchmarks=getmergeoperands --merge_operator=sortlist
Differential Revision: D16657366
Pulled By: vjnadimpalli
fbshipit-source-id: 0faadd752351745224ee12d4ae9ef3cb529951bf
db_merge_operand_test: db/db_merge_operand_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Fix TSAN failures in DistributedMutex tests (#5684)
Summary:
TSAN was not able to correctly instrument atomic bts and btr instructions, so
when TSAN is enabled, implement those with std::atomic::fetch_or and
std::atomic::fetch_and. Also disable tests that fail on TSAN with false
positives (we know these are false positives because this other verifiably
correct program fails with the same TSAN error <link>)
```
make clean
TEST_TMPDIR=/dev/shm/rocksdb OPT=-g COMPILE_WITH_TSAN=1 make J=1 -j56 folly_synchronization_distributed_mutex_test
```
This is the code that fails with the same false positive under TSAN
```
namespace {
class ExceptionWithConstructionTrack : public std::exception {
 public:
  explicit ExceptionWithConstructionTrack(int id)
      : id_{folly::to<std::string>(id)}, constructionTrack_{id} {}

  const char* what() const noexcept override {
    return id_.c_str();
  }

 private:
  std::string id_;
  TestConstruction constructionTrack_;
};

template <typename Storage, typename Atomic>
void transferCurrentException(Storage& storage, Atomic& produced) {
  assert(std::current_exception());
  new (&storage) std::exception_ptr(std::current_exception());
  produced->store(true, std::memory_order_release);
}

void concurrentExceptionPropagationStress(
    int numThreads,
    std::chrono::milliseconds milliseconds) {
  auto&& stop = std::atomic<bool>{false};
  auto&& exceptions = std::vector<std::aligned_storage<48, 8>::type>{};
  auto&& produced = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumed = std::vector<std::unique_ptr<std::atomic<bool>>>{};
  auto&& consumers = std::vector<std::thread>{};
  for (auto i = 0; i < numThreads; ++i) {
    produced.emplace_back(new std::atomic<bool>{false});
    consumed.emplace_back(new std::atomic<bool>{false});
    exceptions.push_back({});
  }

  auto producer = std::thread{[&]() {
    auto counter = std::vector<int>(numThreads, 0);
    for (auto i = 0; true; i = ((i + 1) % numThreads)) {
      try {
        throw ExceptionWithConstructionTrack{counter.at(i)++};
      } catch (...) {
        transferCurrentException(exceptions.at(i), produced.at(i));
      }

      while (!consumed.at(i)->load(std::memory_order_acquire)) {
        if (stop.load(std::memory_order_acquire)) {
          return;
        }
      }
      consumed.at(i)->store(false, std::memory_order_release);
    }
  }};

  for (auto i = 0; i < numThreads; ++i) {
    consumers.emplace_back([&, i]() {
      auto counter = 0;
      while (true) {
        while (!produced.at(i)->load(std::memory_order_acquire)) {
          if (stop.load(std::memory_order_acquire)) {
            return;
          }
        }
        produced.at(i)->store(false, std::memory_order_release);

        try {
          auto storage = &exceptions.at(i);
          auto exc = folly::launder(
              reinterpret_cast<std::exception_ptr*>(storage));
          auto copy = std::move(*exc);
          exc->std::exception_ptr::~exception_ptr();
          std::rethrow_exception(std::move(copy));
        } catch (std::exception& exc) {
          auto value = std::stoi(exc.what());
          EXPECT_EQ(value, counter++);
        }
        consumed.at(i)->store(true, std::memory_order_release);
      }
    });
  }

  std::this_thread::sleep_for(milliseconds);
  stop.store(true);
  producer.join();
  for (auto& thread : consumers) {
    thread.join();
  }
}
} // namespace
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5684
Differential Revision: D16746077
Pulled By: miasantreble
fbshipit-source-id: 8af88dcf9161c05daec1a76290f577918638f79d
db_options_test: db/db_options_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_range_del_test: db/db_range_del_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_sst_test: db/db_sst_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_statistics_test: db/db_statistics_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_write_test: db/db_write_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

error_handler_fs_test: db/error_handler_fs_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

external_sst_file_basic_test: db/external_sst_file_basic_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

external_sst_file_test: db/external_sst_file_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Export Import sst files (#5495)
Summary:
Refresh of the earlier change here - https://github.com/facebook/rocksdb/issues/5135
This is a review request for code change needed for - https://github.com/facebook/rocksdb/issues/3469
"Add support for taking snapshot of a column family and creating column family from a given CF snapshot"
We have an implementation for this that we have been testing internally. We have two new APIs that together provide this functionality.
(1) ExportColumnFamily() - This API is modelled after CreateCheckpoint() as below.
// Exports all live SST files of a specified Column Family onto export_dir,
// returning SST files information in metadata.
// - SST files will be created as hard links when the directory specified
// is in the same partition as the db directory, copied otherwise.
// - export_dir should not already exist and will be created by this API.
// - Always triggers a flush.
virtual Status ExportColumnFamily(ColumnFamilyHandle* handle,
                                  const std::string& export_dir,
                                  ExportImportFilesMetaData** metadata);
Internally, the API calls DisableFileDeletions() and GetColumnFamilyMetaData(), parses the
metadata, creates links/copies of all the sst files, calls EnableFileDeletions(), and completes the call by
returning the list of file metadata.
(2) CreateColumnFamilyWithImport() - This API is modeled after IngestExternalFile(), but invoked only during a CF creation as below.
// CreateColumnFamilyWithImport() will create a new column family with
// column_family_name and import external SST files specified in metadata into
// this column family.
// (1) External SST files can be created using SstFileWriter.
// (2) External SST files can be exported from a particular column family in
// an existing DB.
// Option in import_options specifies whether the external files are copied or
// moved (default is copy). When option specifies copy, managing files at
// external_file_path is caller's responsibility. When option specifies a
// move, the call ensures that the specified files at external_file_path are
// deleted on successful return and files are not modified on any error
// return.
// On error return, column family handle returned will be nullptr.
// ColumnFamily will be present on successful return and will not be present
// on error return. ColumnFamily may be present on any crash during this call.
virtual Status CreateColumnFamilyWithImport(
    const ColumnFamilyOptions& options, const std::string& column_family_name,
    const ImportColumnFamilyOptions& import_options,
    const ExportImportFilesMetaData& metadata,
    ColumnFamilyHandle** handle);
Internally, this API creates a new CF, parses all the sst files and adds them to the specified column family, at the same level and with the same sequence number as in the metadata. It also performs safety checks with respect to overlaps between the sst files being imported.
If the incoming sequence number is higher than the current local sequence number, the local sequence
number is updated to reflect this.
Note: as the sst files are being moved across Column Families, the Column Family name in the sst file
will no longer match the actual column family on the destination DB. The API does not modify the Column
Family name or id in the sst files being imported.
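The hard-link-or-copy behaviour described for ExportColumnFamily() can be sketched with the standard library. This is an illustration only: `export_files` is a hypothetical helper, and it attempts a hard link and falls back to a copy on failure rather than checking up front whether the two directories share a partition, which is an assumption of the sketch and not something the commit states.

```python
import os
import shutil


def export_files(db_dir, file_names, export_dir):
    """Exports files into a fresh export_dir, linking where possible.

    Returns a list of (file_name, "link" | "copy") pairs.
    """
    # export_dir must not already exist; it is created by this call.
    if os.path.exists(export_dir):
        raise FileExistsError(export_dir)
    os.makedirs(export_dir)

    exported = []
    for name in file_names:
        src = os.path.join(db_dir, name)
        dst = os.path.join(export_dir, name)
        try:
            os.link(src, dst)        # cheap when both paths share a filesystem
            method = "link"
        except OSError:
            shutil.copy2(src, dst)   # fallback, e.g. across partitions
            method = "copy"
        exported.append((name, method))
    return exported
```

A real export would also need to prevent the source files from being deleted while they are linked or copied, which is the role DisableFileDeletions() and EnableFileDeletions() play in the description above.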
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5495
Differential Revision: D16018881
fbshipit-source-id: 9ae2251025d5916d35a9fc4ea4d6707f6be16ff9
import_column_family_test: db/import_column_family_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_tailing_iter_test: db/db_tailing_iter_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_iter_test: db/db_iter_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs
Summary:
Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are:
* If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted.
* When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not.
However, this behavior wasn't documented well (and until recently the wiki on GitHub had misleading, incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours).
This PR changes the convention to:
* If status() is not ok, Valid() always returns false.
* Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.)
This does sacrifice the two use cases listed above, but siying said it's ok.
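A minimal sketch of what the new convention means for callers. It uses a hypothetical `ToyIterator` (not a RocksDB class; `status_ok()` and `fail_at` are invented for illustration) that obeys the two rules above: a non-ok status forces `Valid()` to false, and any seek resets the status. Under those rules a plain loop stops on error, and a single status check after the loop tells "end of data" apart from "failure".

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an iterator that follows the new convention.
// It only models Valid(), status and seek semantics; it is not the RocksDB API.
class ToyIterator {
 public:
  // fail_at: index at which a read "fails"; keys.size() or more means never.
  ToyIterator(std::vector<std::string> keys, std::size_t fail_at)
      : keys_(std::move(keys)), fail_at_(fail_at) {}

  void SeekToFirst() {
    ok_ = true;  // rule 2: any seek resets the status
    pos_ = 0;
    CheckPosition();
  }
  void Next() {
    if (!Valid()) return;
    ++pos_;
    CheckPosition();
  }
  // Rule 1: Valid() is never true while the status is non-ok.
  bool Valid() const { return ok_ && pos_ < keys_.size(); }
  bool status_ok() const { return ok_; }
  const std::string& key() const { return keys_[pos_]; }

 private:
  // Simulates an I/O or corruption error at one fixed position.
  void CheckPosition() {
    if (pos_ == fail_at_ && pos_ < keys_.size()) ok_ = false;
  }
  std::vector<std::string> keys_;
  std::size_t fail_at_;
  std::size_t pos_ = 0;
  bool ok_ = true;
};

// Counts keys; returns -1 if iteration stopped because of an error.
int CountKeys(ToyIterator& it) {
  int n = 0;
  for (it.SeekToFirst(); it.Valid(); it.Next()) ++n;
  return it.status_ok() ? n : -1;
}
```

The caller never has to check the status inside the loop body, which is the practical benefit of the convention.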
Overview of the changes:
* A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario.
* Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed.
* A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator.
To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types:
Iterators that didn't need changes:
* status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name, in anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator.
* Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator.
Iterators with changes (see inline comments for details):
* DBIter - an overhaul:
- It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back.
- It had a few code paths silently discarding subiterator's status. The stress test caught a few.
- The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example.
- Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption.
- It used to not reset status on seek for some types of errors.
- Some simplifications and better comments.
- Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer.
* MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status.
* ForwardIterator - changed to the new convention, also slightly simplified.
* ForwardLevelIterator - fixed some bugs and simplified.
* LevelIterator - simplified.
* TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_.
* BlockBasedTableIterator - minor changes.
* BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid.
* PlainTableIterator - some seeks used to not reset status.
* CuckooTableIterator - tiny code cleanup.
* ManagedIterator - fixed some bugs.
* BaseDeltaIterator - changed to the new convention and fixed a bug.
* BlobDBIterator - seeks used to not reset status.
* KeyConvertingIterator - some small change.
Closes https://github.com/facebook/rocksdb/pull/3810
Differential Revision: D7888019
Pulled By: al13n321
fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
db_iter_stress_test: db/db_iter_stress_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
db_universal_compaction_test: db/db_universal_compaction_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
db_wal_test: db/db_wal_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
db_io_failure_test: db/db_io_failure_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
db_properties_test: db/db_properties_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
db_table_properties_test: db/db_table_properties_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
log_write_bench: util/log_write_bench.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK) $(PROFILING_FLAGS)
plain_table_db_test: db/plain_table_db_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
comparator_db_test: db/comparator_db_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
table_reader_bench: table/table_reader_bench.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK) $(PROFILING_FLAGS)
perf_context_test: db/perf_context_test.o $(LIBOBJECTS) $(TESTHARNESS)
build: make "make" output readable by default
Summary:
With this change, make now prints a summary line for each
compiler and linker invocation, e.g.,:
CC db/builder.o
CC db/c.o
CC db/column_family.o
To see full commands, insert "V=1" into your make command.
E.g., run "make V=1 all" if you want it to print each command
in its full glory.
$^ is GNU make's abbreviation for the prerequisites of the current target.
These AM_V_... variables expand to some very short string like "CC" or
"LD", by default, so that the output of "make" is readable. If/when you
want more details, just build with "make V=1 ...", and make will print
each full command as it is executed. If you prefer to see the noise
all the time, and only want to optionally see the abbreviated output,
set AM_DEFAULT_VERBOSITY=1 in your environment, and then build with
V=0 to see the abbreviated command indicators.
Test Plan:
invoke make a few different ways and observe:
make clean; make # abbreviated
make clean; make V=0 # also abbreviated
make clean; make V=1 # full detail
Reviewers: sdong, ljin, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D33579
10 years ago
	$(AM_V_CCLD) $(CXX) $^ $(EXEC_LDFLAGS) -o $@ $(LDFLAGS)
prefix_test: db/prefix_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_V_CCLD) $(CXX) $^ $(EXEC_LDFLAGS) -o $@ $(LDFLAGS)
[RocksDB] BackupableDB
Summary:
In this diff I present BackupableDB v1. You can easily use it to back up your DB, and it will do incremental snapshots for you.
Let's first describe how you would use BackupableDB. It inherits the StackableDB interface, so you can easily construct it with your DB object -- it will add a method RollTheSnapshot() to the DB object. When you call RollTheSnapshot(), the current snapshot of the DB will be stored in the backup dir. To restore, you can just call RestoreDBFromBackup() on a BackupableDB (which is a static method) and it will restore all files from the backup dir. In the next version, it will even support automatic backups every X minutes.
There are multiple things you can configure:
1. backup_env and db_env can be different, which is awesome because then you can easily backup to HDFS or wherever you feel like.
2. sync - if true, it *guarantees* backup consistency on machine reboot
3. number of snapshots to keep - this will keep the last N snapshots around if you want, for some reason, to be able to restore from an earlier snapshot. All backups are done in an incremental fashion - if we already have 00010.sst, we will not copy it again. *IMPORTANT* -- This is based on the assumption that 00010.sst never changes - two files named 00010.sst from the same DB will always be exactly the same. Is this true? I always copy manifest, current and log files.
4. You can decide if you want to flush the memtables before you backup, or you're fine with backing up the log files -- either way, you get a complete and consistent view of the database at a time of backup.
5. More things you can find in BackupableDBOptions
Here is the directory structure I use:
backup_dir/CURRENT_SNAPSHOT - just 4 bytes holding the latest snapshot
0, 1, 2, ... - files containing serialized version of each snapshot - containing a list of files
files/*.sst - sst files shared between snapshots - if one snapshot references 00010.sst and another one needs to backup it from the DB, it will just reference the same file
files/ 0/, 1/, 2/, ... - snapshot directories containing private snapshot files - current, manifest and log files
All the files are ref counted and deleted immediately when they go out of scope.
Some other stuff in this diff:
1. Added GetEnv() method to the DB. Discussed with @haobo and we agreed that it seems like the right thing to do.
2. Fixed StackableDB interface. The way it was set up before, I was not able to implement BackupableDB.
Test Plan:
I have a unittest, but please don't look at this yet. I just hacked it up to help me with debugging. I will write a lot of good tests and update the diff.
Also, `make asan_check`
Reviewers: dhruba, haobo, emayanke
Reviewed By: dhruba
CC: leveldb, haobo
Differential Revision: https://reviews.facebook.net/D14295
11 years ago
backupable_db_test: utilities/backupable/backupable_db_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
checkpoint_test: utilities/checkpoint/checkpoint_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
cache_simulator_test: utilities/simulator_cache/cache_simulator_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
sim_cache_test: utilities/simulator_cache/sim_cache_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
env_mirror_test: utilities/env_mirror_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
env_timed_test: utilities/env_timed_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Add EnvLibrados - RocksDB Env of RADOS (#1222)
EnvLibrados is a customized RocksDB Env that uses RADOS as the backend file system of RocksDB. It overrides all file-system-related APIs of the default Env. The easiest way to use it is as follows:
std::string db_name = "test_db";
std::string config_path = "path/to/ceph/config";
DB* db;
Options options;
options.env = new EnvLibrados(db_name, config_path);  // options.env is a pointer
Status s = DB::Open(options, kDBPath, &db);
EnvLibrados will then forward all file read/write operations to the RADOS cluster specified by config_path. The default pool is db_name+"_pool".
There are some options that users can set for EnvLibrados.
- write_buffer_size. This is the maximum buffer size for WritableFile. Once the buffer reaches this size, EnvLibrados syncs the buffer content to RADOS, then clears the buffer.
- db_pool. Rather than using default pool, users could set their own db pool name
- wal_dir. The dir for WAL files. Because RocksDB only has 2-level structure (dir_name/file_name), the format of wal_dir is "/dir_name"(CAN'T be "/dir1/dir2"). Default wal_dir is "/wal".
- wal_pool. Corresponding pool name for WAL files. Default value is db_name+"_wal_pool"
An example of setting these options looks like the following:
db_name = "test_db";
db_pool = db_name+"_pool";
wal_dir = "/wal";
wal_pool = db_name+"_wal_pool";
write_buffer_size = 1 << 20;
env_ = new EnvLibrados(db_name, config, db_pool, wal_dir, wal_pool, write_buffer_size);
DB* db;
Options options;
options.env = env_;
// The last level dir name should match the dir name in prefix_pool_map
options.wal_dir = "/tmp/wal";
// open DB
Status s = DB::Open(options, kDBPath, &db);
Librados is required to compile EnvLibrados. Use "$ make LIBRADOS=1" to compile RocksDB with it. If you only want to compile the EnvLibrados test, run "$ make env_librados_test LIBRADOS=1". To run env_librados_test, you need a running RADOS cluster with the configuration file located at "../ceph/src/ceph.conf" relative to "rocksdb/".
9 years ago
ifdef ROCKSDB_USE_LIBRADOS
env_librados_test: utilities/env_librados_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_V_CCLD) $(CXX) $^ $(EXEC_LDFLAGS) -o $@ $(LDFLAGS) $(COVERAGEFLAGS)
endif
object_registry_test: utilities/object_registry_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Timestamp and TTL Wrapper for rocksdb
Summary:
When opened with DBTimestamp::Open call, timestamps are prepended to and stripped from the value during subsequent Put and Get calls respectively. The Timestamp is used to discard values in Get and custom compaction filter which have exceeded their TTL which is specified during Open.
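The prepend-on-Put, strip-on-Get idea can be sketched as follows. This is an illustration only: the 8-byte host-endian layout, the names `WrapValue` and `UnwrapValue`, and the "expired when age is strictly greater than the TTL" rule are assumptions, not the encoding or API used by the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Assumed layout for illustration: 8-byte timestamp (seconds, host byte
// order) followed by the user value.
constexpr std::size_t kTsLen = sizeof(std::int64_t);

// What a Put would store: timestamp prepended to the user value.
std::string WrapValue(const std::string& value, std::int64_t now) {
  std::string out(kTsLen, '\0');
  std::memcpy(&out[0], &now, kTsLen);
  out += value;
  return out;
}

// What a Get would return: strips the timestamp and hides expired values.
// Returns false if the stored bytes are malformed or older than `ttl`.
bool UnwrapValue(const std::string& stored, std::int64_t now,
                 std::int64_t ttl, std::string* value) {
  if (stored.size() < kTsLen) return false;  // no room for a timestamp
  std::int64_t written = 0;
  std::memcpy(&written, stored.data(), kTsLen);
  if (now - written > ttl) return false;  // exceeded its TTL
  *value = stored.substr(kTsLen);
  return true;
}
```

A compaction filter could apply the same expiry test to drop stale entries physically, while Get only hides them.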
Have made a temporary change to Makefile to let us test with the temporary file TestTime.cc. Have also changed the private members of db_impl.h to protected to let them be inherited by the new class DBTimestamp
Test Plan: make db_timestamp; TestTime.cc(will not check it in) shows how to use the apis currently, but I will write unit-tests shortly
Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
Reviewed By: vamsi
CC: zshao, xjin, vkrest, MarkCallaghan
Differential Revision: https://reviews.facebook.net/D10311
12 years ago
ttl_test: utilities/ttl/ttl_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
write_batch_with_index_test: utilities/write_batch_with_index/write_batch_with_index_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
flush_job_test: db/flush_job_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Support for SingleDelete()
Summary:
This patch fixes #7460559. It introduces SingleDelete as a new database
operation. This operation can be used to delete keys that were never
overwritten (no put following another put of the same key). If an overwritten
key is single deleted the behavior is undefined. Single deletion of a
non-existent key has no effect but multiple consecutive single deletions are
not allowed (see limitations).
In contrast to the conventional Delete() operation, the deletion entry is
removed along with the value when the two are lined up in a compaction. Note:
The semantics are similar to @igor's prototype that allowed to have this
behavior on the granularity of a column family (
https://reviews.facebook.net/D42093 ). This new patch, however, is more
aggressive when it comes to removing tombstones: It removes the SingleDelete
together with the value whenever there is no snapshot between them while the
older patch only did this when the sequence number of the deletion was older
than the earliest snapshot.
Most of the complex additions are in the Compaction Iterator, all other changes
should be relatively straightforward. The patch also includes basic support for
single deletions in db_stress and db_bench.
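The tombstone-removal rule described above can be sketched for a single key. This is a deliberately simplified model, not the Compaction Iterator: `Entry`, `SnapshotBetween` and `CompactKey` are hypothetical names, and it assumes a snapshot at sequence number s sees every entry with a sequence number less than or equal to s.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class EntryType { kPut, kSingleDelete };

struct Entry {
  std::uint64_t seq;
  EntryType type;
};

// True if some snapshot can see the put (put_seq <= s) but not the
// single deletion (s < del_seq); in that case the put must survive.
bool SnapshotBetween(const std::vector<std::uint64_t>& snapshots,
                     std::uint64_t put_seq, std::uint64_t del_seq) {
  for (std::uint64_t s : snapshots) {
    if (s >= put_seq && s < del_seq) return true;
  }
  return false;
}

// `entries` holds all versions of ONE key, newest first.
// Returns the versions that survive compaction.
std::vector<Entry> CompactKey(const std::vector<Entry>& entries,
                              const std::vector<std::uint64_t>& snapshots) {
  std::vector<Entry> out;
  for (std::size_t i = 0; i < entries.size(); ++i) {
    const bool lined_up = entries[i].type == EntryType::kSingleDelete &&
                          i + 1 < entries.size() &&
                          entries[i + 1].type == EntryType::kPut;
    if (lined_up &&
        !SnapshotBetween(snapshots, entries[i + 1].seq, entries[i].seq)) {
      ++i;  // skip the put too: tombstone and value vanish together
      continue;
    }
    out.push_back(entries[i]);
  }
  return out;
}
```

Unlike a regular Delete() tombstone, which is kept until it reaches the bottom of the tree, the pair here disappears as soon as both entries meet and no snapshot separates them.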
Limitations:
- Not compatible with cuckoo hash tables
- Single deletions cannot be used in combination with merges and normal
deletions on the same key (other keys are not affected by this)
- Consecutive single deletions are currently not allowed (an older version of
this patch supported this, so it could be resurrected if needed)
Test Plan: make all check
Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor
Reviewed By: igor
Subscribers: maykov, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43179
9 years ago
compaction_iterator_test: db/compaction/compaction_iterator_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
compaction_job_test: db/compaction/compaction_job_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
compaction_job_stats_test: db/compaction/compaction_job_stats_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
compact_on_deletion_collector_test: utilities/table_properties_collectors/compact_on_deletion_collector_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
wal_manager_test: db/wal_manager_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
dbformat_test: db/dbformat_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
env_basic_test: env/env_basic_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
env_test: env/env_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
io_posix_test: env/io_posix_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
fault_injection_test: db/fault_injection_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
rate_limiter_test: util/rate_limiter_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
generic rate limiter
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
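To make the idea of a shared limiter concrete, here is a small token-bucket sketch. The token-bucket scheme, the class name `ToyRateLimiter`, and the explicit `now_us` clock parameter are all assumptions chosen for illustration; the text above does not say which algorithm the patch uses.

```cpp
#include <algorithm>
#include <cstdint>
#include <mutex>

// Hypothetical token-bucket limiter; one instance can be shared by many
// threads (and, in principle, by several DB instances).
class ToyRateLimiter {
 public:
  ToyRateLimiter(std::int64_t bytes_per_sec, std::int64_t burst_bytes)
      : rate_(bytes_per_sec), burst_(burst_bytes), available_(burst_bytes) {}

  // Tries to take `bytes` of budget at time `now_us` (microseconds).
  // Returns false if the caller has to wait and retry later.
  bool TryRequest(std::int64_t bytes, std::int64_t now_us) {
    std::lock_guard<std::mutex> lock(mu_);
    Refill(now_us);
    if (bytes > available_) return false;
    available_ -= bytes;
    return true;
  }

 private:
  // Adds the budget earned since the last refill, capped at the burst size.
  // Integer division drops fractional bytes; acceptable for a sketch.
  void Refill(std::int64_t now_us) {
    if (now_us <= last_us_) return;
    const std::int64_t added = (now_us - last_us_) * rate_ / 1000000;
    available_ = std::min(burst_, available_ + added);
    last_us_ = now_us;
  }

  const std::int64_t rate_;
  const std::int64_t burst_;
  std::int64_t available_;
  std::int64_t last_us_ = 0;
  std::mutex mu_;
};
```

Passing the clock in explicitly keeps the sketch deterministic; a real limiter would read the time itself and block the caller instead of returning false.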
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
11 years ago
delete_scheduler_test: file/delete_scheduler_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
filename_test: db/filename_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Support direct IO in RandomAccessFileReader::MultiRead (#6446)
Summary:
By supporting direct IO in RandomAccessFileReader::MultiRead, the benefits of parallel IO (IO uring) and direct IO can be combined.
In direct IO mode, read requests are aligned and merged together before being issued to RandomAccessFile::MultiRead, so blocks in the original requests might share the same underlying buffer, the shared buffers are returned in `aligned_bufs`, which is a new parameter of the `MultiRead` API.
For example, suppose alignment requirement for direct IO is 4KB, one request is (offset: 1KB, len: 1KB), another request is (offset: 3KB, len: 1KB), then since they all belong to page (offset: 0, len: 4KB), `MultiRead` only reads the page with direct IO into a buffer on heap, and returns 2 Slices referencing regions in that same buffer. See `random_access_file_reader_test.cc` for more examples.
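The align-then-merge step in the example above can be sketched like this. `Range`, `Align` and `AlignAndMerge` are hypothetical names, the alignment is assumed to be a power of two, and merging ranges that merely touch (rather than overlap) is an assumption of this sketch rather than something the text states.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Range {
  std::uint64_t offset;
  std::uint64_t len;
};

// Expands r outward to multiples of `alignment` (a power of two).
Range Align(const Range& r, std::uint64_t alignment) {
  const std::uint64_t start = r.offset & ~(alignment - 1);
  const std::uint64_t end =
      (r.offset + r.len + alignment - 1) & ~(alignment - 1);
  return Range{start, end - start};
}

// Aligns every request, then merges aligned ranges that overlap or touch,
// so each page is read from the device at most once.
std::vector<Range> AlignAndMerge(std::vector<Range> reqs,
                                 std::uint64_t alignment) {
  for (Range& r : reqs) r = Align(r, alignment);
  std::sort(reqs.begin(), reqs.end(),
            [](const Range& a, const Range& b) { return a.offset < b.offset; });
  std::vector<Range> merged;
  for (const Range& r : reqs) {
    if (!merged.empty() &&
        r.offset <= merged.back().offset + merged.back().len) {
      const std::uint64_t end = std::max(
          merged.back().offset + merged.back().len, r.offset + r.len);
      merged.back().len = end - merged.back().offset;
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

With a 4KB alignment, the two requests from the example, (1KB, 1KB) and (3KB, 1KB), both align to the page (0, 4KB) and collapse into one read; the caller would then hand back two slices pointing into that single buffer.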
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6446
Test Plan: Added a new test `random_access_file_reader_test.cc`.
Reviewed By: anand1976
Differential Revision: D20097518
Pulled By: cheng-chang
fbshipit-source-id: ca48a8faf9c3af146465c102ef6b266a363e78d1
5 years ago
random_access_file_reader_test: file/random_access_file_reader_test.o $(LIBOBJECTS) $(TESTHARNESS) $(TESTUTIL)
	$(AM_LINK)
RangeSync not to sync last 1MB of the file
Summary:
From others' investigation:
"sync_file_range() behavior highly depends on kernel version and filesystem.
xfs does neighbor page flushing outside of the specified ranges. For example, sync_file_range(fd, 8192, 16384) does not only trigger flushing page #3 to #4, but also flushing many more dirty pages (i.e. up to page#16)... Ranges of the sync_file_range() should be far enough from write() offset (at least 1MB)."
Test Plan: make all check
Reviewers: igor, rven, kradhakrishnan, yhchiang, IslamAbdelRahman, anthony
Reviewed By: anthony
Subscribers: yoshinorim, MarkCallaghan, sumeet, domas, dhruba, leveldb, ljin
Differential Revision: https://reviews.facebook.net/D15807
10 years ago
file_reader_writer_test: util/file_reader_writer_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
block_based_filter_block_test: table/block_based/block_based_filter_block_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
full_filter_block_test: table/block_based/full_filter_block_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
partitioned_filter_block_test: table/block_based/partitioned_filter_block_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
log_test: db/log_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
cleanable_test: table/cleanable_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
table_test: table/table_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
block_test: table/block_based/block_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
data_block_hash_index_test: table/block_based/data_block_hash_index_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
inlineskiplist_test: memtable/inlineskiplist_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
skiplist_test: memtable/skiplist_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
write_buffer_manager_test: memtable/write_buffer_manager_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
version_edit_test: db/version_edit_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
version_set_test: db/version_set_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
compaction_picker_test: db/compaction/compaction_picker_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
version_builder_test: db/version_builder_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
file_indexer_test: db/file_indexer_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
hints for narrowing down FindFile range and avoiding checking irrelevant L0 files
Summary:
The file tree structure in Version is prebuilt and the range of each file is known.
On the Get() code path, we do binary search in FindFile() by comparing
target key with each file's largest key and also check the range for each L0 file.
With some pre-calculated knowledge, each key comparison that has been done can serve
as a hint to narrow down further searches:
(1) If a key falls within a L0 file's range, we can safely skip the next
file if its range does not overlap with the current one.
(2) If a key falls within a file's range in level L0 - Ln-1, we should only
need to binary search in the next level for files that overlap with the current one.
(1) will be able to skip some files depending on the key distribution.
(2) can greatly reduce the range of binary search, especially for bottom
levels, given that one file most likely only overlaps with N files from
the level below (where N is max_bytes_for_level_multiplier). So on level
L, we will only look at ~N files instead of N^L files.
Some initial results: measured with 500M key DB, when write is light (10k/s = 1.2M/s), this
improves QPS ~7% on top of blocked bloom. When write is heavier (80k/s =
9.6M/s), it gives us ~13% improvement.
Test Plan: make all check
Reviewers: haobo, igor, dhruba, sdong, yhchiang
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17205
11 years ago
reduce_levels_test: tools/reduce_levels_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
write_batch_test: db/write_batch_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define a DB epoch as a period where there are no flushes and compactions (i.e. a new epoch is started when a flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
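The push model can be sketched as a controller whose mode is recomputed only on epoch changes and merely read on the write path. Everything specific here is hypothetical: the class `ToyWriteController`, the `ColumnFamilyState` input, and the L0-file-count thresholds are invented for illustration and are not the criteria used by the patch.

```cpp
#include <atomic>
#include <vector>

enum class WriteMode { kProceed, kDelay, kStop };

// Hypothetical per-column-family input and thresholds, for illustration only.
struct ColumnFamilyState {
  int l0_files;
};
constexpr int kDelayTrigger = 8;
constexpr int kStopTrigger = 12;

class ToyWriteController {
 public:
  // Push side: called only when a flush or compaction finishes, i.e. at an
  // epoch change. This is the only place that loops over column families.
  void OnEpochChange(const std::vector<ColumnFamilyState>& cfs) {
    WriteMode mode = WriteMode::kProceed;
    for (const ColumnFamilyState& cf : cfs) {
      if (cf.l0_files >= kStopTrigger) {
        mode = WriteMode::kStop;
        break;
      }
      if (cf.l0_files >= kDelayTrigger) mode = WriteMode::kDelay;
    }
    mode_.store(mode, std::memory_order_release);
  }

  // Write path: a single atomic load, independent of the number of
  // column families.
  WriteMode mode() const { return mode_.load(std::memory_order_acquire); }

 private:
  std::atomic<WriteMode> mode_{WriteMode::kProceed};
};
```

The per-write cost no longer grows with the number of column families, which is the overhead the pull model paid on every write.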
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
11 years ago
write_controller_test: db/write_controller_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
merge_helper_test: db/merge_helper_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

memory_test: utilities/memory/memory_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

merge_test: db/merge_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

merger_test: table/merger_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

util_merge_operators_test: utilities/util_merge_operators_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

options_file_test: db/options_file_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

deletefile_test: db/deletefile_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

obsolete_files_test: db/obsolete_files_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

rocksdb_dump: tools/dump/rocksdb_dump.o $(LIBOBJECTS)
	$(AM_LINK)

rocksdb_undump: tools/dump/rocksdb_undump.o $(LIBOBJECTS)
	$(AM_LINK)

cuckoo_table_builder_test: table/cuckoo/cuckoo_table_builder_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cuckoo_table_reader_test: table/cuckoo/cuckoo_table_reader_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

cuckoo_table_db_test: db/cuckoo_table_db_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

listener_test: db/listener_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
CompactFiles, EventListener and GetDatabaseMetaData
Summary:
This diff adds three sets of APIs to RocksDB.
= GetColumnFamilyMetaData =
* This API allows users to obtain the current state of a RocksDB instance on one column family.
* See GetColumnFamilyMetaData in include/rocksdb/db.h
= EventListener =
* A virtual class that allows users to implement a set of
call-back functions which will be called when specific
events of a RocksDB instance happen.
* To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners
= CompactFiles =
* CompactFiles API inputs a set of file numbers and an output level, and RocksDB
will try to compact those files into the specified level.
= Example =
* Example code can be found in example/compact_files_example.cc, which implements
a simple external compactor using EventListener, GetColumnFamilyMetaData, and
CompactFiles API.
Test Plan:
listener_test
compactor_test
example/compact_files_example
export ROCKSDB_TESTS=CompactFiles
db_test
export ROCKSDB_TESTS=MetaData
db_test
Reviewers: ljin, igor, rven, sdong
Reviewed By: sdong
Subscribers: MarkCallaghan, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D24705
10 years ago
thread_list_test: util/thread_list_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

compact_files_test: db/compact_files_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

options_test: options/options_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

options_settable_test: options/options_settable_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Add OptionsUtil::LoadOptionsFromFile() API
Summary:
This patch adds OptionsUtil::LoadOptionsFromFile() and
OptionsUtil::LoadLatestOptionsFromDB(), which allow developers
to construct DBOptions and ColumnFamilyOptions from a RocksDB
options file. Note that most pointer-typed options such as
merge_operator will not be constructed.
With this API, developers no longer need to remember all the
options in order to reopen an existing rocksdb instance like
the following:
  DBOptions db_options;
  std::vector<std::string> cf_names;
  std::vector<ColumnFamilyOptions> cf_opts;

  // Load primitive-typed options from an existing DB
  OptionsUtil::LoadLatestOptionsFromDB(
      dbname, &db_options, &cf_names, &cf_opts);

  // Initialize necessary pointer-typed options
  cf_opts[0].merge_operator.reset(new MyMergeOperator());
  ...

  // Construct the vector of ColumnFamilyDescriptor
  std::vector<ColumnFamilyDescriptor> cf_descs;
  for (size_t i = 0; i < cf_opts.size(); ++i) {
    cf_descs.emplace_back(cf_names[i], cf_opts[i]);
  }

  // Open the DB
  DB* db = nullptr;
  std::vector<ColumnFamilyHandle*> cf_handles;
  auto s = DB::Open(db_options, dbname, cf_descs,
                    &cf_handles, &db);
Test Plan:
Augment existing tests in column_family_test
options_test
db_test
Reviewers: igor, IslamAbdelRahman, sdong, anthony
Reviewed By: anthony
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D49095
9 years ago
options_util_test: utilities/options/options_util_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_bench_tool_test: tools/db_bench_tool_test.o $(BENCHTOOLOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

trace_analyzer_test: tools/trace_analyzer_test.o $(LIBOBJECTS) $(ANALYZETOOLOBJECTS) $(TESTHARNESS)
RocksDB Trace Analyzer (#4091)
Summary:
A framework for analyzing RocksDB traces.
After collecting a trace with the tool from [PR #3837](https://github.com/facebook/rocksdb/pull/3837), users can use the Trace Analyzer to interpret, analyze, and characterize the collected workload.
**Input:**
1. trace file
2. Whole keys space file
**Statistics:**
1. Access count of each operation (Get, Put, Delete, SingleDelete, DeleteRange, Merge) in each column family.
2. Key hotness (access count) of each one
3. Key space separation based on given prefix
4. Key size distribution
5. Value size distribution if applicable
6. Top K accessed keys
7. QPS statistics including the average QPS and peak QPS
8. Top K accessed prefix
9. Query correlation analysis: outputs the number of occurrences of X after Y and the corresponding average time
intervals
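As an illustration of statistics 2 and 6 (key hotness and top K accessed keys), here is a small sketch. `TopKAccessedKeys` and `KeyCount` are hypothetical names, and the input is assumed to be the already-decoded sequence of accessed keys rather than the raw trace file.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using KeyCount = std::pair<std::string, std::size_t>;

// Counts how often each key is accessed and returns the k hottest keys,
// ordered by access count (descending), ties broken by key (ascending).
std::vector<KeyCount> TopKAccessedKeys(
    const std::vector<std::string>& accessed_keys, std::size_t k) {
  std::unordered_map<std::string, std::size_t> counts;
  for (const std::string& key : accessed_keys) {
    ++counts[key];
  }
  std::vector<KeyCount> ranked(counts.begin(), counts.end());
  std::sort(ranked.begin(), ranked.end(),
            [](const KeyCount& a, const KeyCount& b) {
              if (a.second != b.second) return a.second > b.second;
              return a.first < b.first;
            });
  if (ranked.size() > k) {
    ranked.resize(k);
  }
  return ranked;
}
```

Sorting every distinct key is the simplest approach; for very large key spaces a bounded heap of size k would avoid the full sort.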
**Output:**
1. key access heat map (either in the accessed key space or whole key space)
2. Trace sequence file (interprets the raw trace file into a line-based text file for future use)
3. Time series (the key space ID and its access time)
4. Key access count distribution
5. Key size distribution
6. Value size distribution (in each intervals)
7. whole key space separation by the prefix
8. Accessed key space separation by the prefix
9. QPS of each operation and each column family
10. Top K QPS and their accessed prefix range
**Test:**
1. Added the unit test of analyzing Get, Put, Delete, SingleDelete, DeleteRange, Merge
2. Generated the trace and analyze the trace
**Implemented but not tested (due to the limitation of trace_replay):**
1. Analyzing Iterator, supporting Seek() and SeekForPrev() analysis
2. Analyzing the number of keys found by Get
**Future Work:**
1. Support analyzing the execution time of each request
2. Support analyzing the cache hit and block read behavior of Get
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4091
Differential Revision: D9256157
Pulled By: zhichao-cao
fbshipit-source-id: f0ceacb7eedbc43a3eee6e85b76087d7832a8fe6
7 years ago
	$(AM_LINK)

event_logger_test: logging/event_logger_test.o $(LIBOBJECTS) $(TESTHARNESS)
EventLogger
Summary:
Here's my proposal for making our LOGs easier to read by machines.
The idea is to dump all events as JSON objects. JSON is easy to read by humans, but more importantly, it's easy to read by machines. That way, we can parse this, load into SQLite/mongo and then query or visualize.
I started with table_create and table_delete events, but if everybody agrees, I'll continue by adding more events (flush/compaction/etc etc)
Test Plan:
Ran db_bench. Observed:
2015/01/15-14:13:25.788019 1105ef000 EVENT_LOG_v1 {"time_micros": 1421360005788015, "event": "table_file_creation", "file_number": 12, "file_size": 1909699}
2015/01/15-14:13:25.956500 110740000 EVENT_LOG_v1 {"time_micros": 1421360005956498, "event": "table_file_deletion", "file_number": 12}
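To make the log format above concrete, here is a minimal sketch of how one such JSON object could be assembled. `FormatEvent` is a hypothetical helper, not the EventLogger API, and it assumes integer-valued fields and names that need no JSON escaping.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Builds one JSON object in the style of the observed EVENT_LOG_v1 lines.
// Assumption: event and field names contain no characters that require
// JSON escaping, and every extra field is an unsigned integer.
std::string FormatEvent(
    std::uint64_t time_micros, const std::string& event,
    const std::vector<std::pair<std::string, std::uint64_t>>& fields) {
  std::string out = "{\"time_micros\": " + std::to_string(time_micros) +
                    ", \"event\": \"" + event + "\"";
  for (const auto& field : fields) {
    out += ", \"" + field.first + "\": " + std::to_string(field.second);
  }
  out += "}";
  return out;
}
```

Because each event is a single self-describing object on one line, a consumer can parse the log line by line without tracking any surrounding context.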
Reviewers: yhchiang, rven, dhruba, MarkCallaghan, lgalanis, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31647
10 years ago
	$(AM_LINK)

timer_queue_test: util/timer_queue_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

sst_dump_test: tools/sst_dump_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

optimistic_transaction_test: utilities/transactions/optimistic_transaction_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

mock_env_test: env/mock_env_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

manual_compaction_test: db/manual_compaction_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

filelock_test: util/filelock_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

auto_roll_logger_test: logging/auto_roll_logger_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

env_logger_test: logging/env_logger_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

memtable_list_test: db/memtable_list_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

write_callback_test: db/write_callback_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

heap_test: util/heap_test.o $(GTEST)
	$(AM_LINK)

transaction_lock_mgr_test: utilities/transactions/transaction_lock_mgr_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Pessimistic Transactions
Summary:
Initial implementation of Pessimistic Transactions. This diff contains the api changes discussed in D38913. This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
MyRocks folks: please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint(). After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex. We can then decide which route is preferable.
Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
Test Plan: Unit tests, db_bench parallel testing.
Reviewers: igor, rven, sdong, yhchiang, yoshinorim
Reviewed By: sdong
Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40869
10 years ago
transaction_test: utilities/transactions/transaction_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

write_prepared_transaction_test: utilities/transactions/write_prepared_transaction_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

write_unprepared_transaction_test: utilities/transactions/write_unprepared_transaction_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

sst_dump: tools/sst_dump.o $(LIBOBJECTS)
	$(AM_LINK)

blob_dump: tools/blob_dump.o $(LIBOBJECTS)
	$(AM_LINK)

repair_test: db/repair_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

ldb_cmd_test: tools/ldb_cmd_test.o $(LIBOBJECTS) $(TESTHARNESS)
Remove ldb HexToString method's usage of sscanf
Summary:
Fix hex2String performance issues by removing sscanf dependency.
Also fixed some edge case handling (odd length, bad input).
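A sketch of hex decoding without sscanf, under stated assumptions: `DecodeHex` and `HexDigitValue` are hypothetical names, not the ldb functions, and rejecting odd-length or malformed input is only one possible way to handle the edge cases mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns 0..15 for a hexadecimal digit, or -1 for any other character.
int HexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decodes a string such as "48656c6c6f" into raw bytes using a direct
// per-character lookup. Returns false (leaving *out untouched) on
// odd-length input or on any non-hex character.
bool DecodeHex(const std::string& hex, std::string* out) {
  assert(out != nullptr);
  if (hex.size() % 2 != 0) {
    return false;
  }
  std::string result;
  result.reserve(hex.size() / 2);
  for (std::size_t i = 0; i < hex.size(); i += 2) {
    const int hi = HexDigitValue(hex[i]);
    const int lo = HexDigitValue(hex[i + 1]);
    if (hi < 0 || lo < 0) {
      return false;
    }
    result.push_back(static_cast<char>((hi << 4) | lo));
  }
  *out = result;
  return true;
}
```

The per-character lookup avoids parsing a format string for every byte, which is where an sscanf-based loop spends much of its time.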
Test Plan: Created a test file which called old and new implementation, and validated results are the same. I'll paste results in the phabricator diff.
Reviewers: igor, rven, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
Reviewed By: sdong
Subscribers: thatsafunnyname, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D46785
9 years ago
	$(AM_LINK)

ldb: tools/ldb.o $(LIBOBJECTS)
	$(AM_LINK)

iostats_context_test: monitoring/iostats_context_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_V_CCLD)$(CXX) $^ $(EXEC_LDFLAGS) -o $@ $(LDFLAGS)

persistent_cache_test: utilities/persistent_cache/persistent_cache_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

statistics_test: monitoring/statistics_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

stats_history_test: monitoring/stats_history_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

lru_cache_test: cache/lru_cache_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

range_del_aggregator_test: db/range_del_aggregator_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

range_del_aggregator_bench: db/range_del_aggregator_bench.o $(LIBOBJECTS) $(TESTUTIL)
	$(AM_LINK)

blob_db_test: utilities/blob_db/blob_db_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

repeatable_thread_test: util/repeatable_thread_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Use only "local" range tombstones during Get (#4449)
Summary:
Previously, range tombstones were accumulated from every level, which
was necessary if a range tombstone in a higher level covered a key in a lower
level. However, RangeDelAggregator::AddTombstones's complexity is based on
the number of tombstones that are currently stored in it, which is wasteful in
the Get case, where we only need to know the highest sequence number of range
tombstones that cover the key from higher levels, and compute the highest covering
sequence number at the current level. This change introduces this optimization, and
removes the use of RangeDelAggregator from the Get path.
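The idea can be sketched as follows, assuming a simplified tombstone representation. `RangeTombstone`, `MaxCoveringSeq`, and `IsCovered` are hypothetical names, and the linear scan only stands in for whatever lookup structure the real code uses. Instead of accumulating tombstones, Get carries a single number from level to level: the highest sequence number of any tombstone covering the key so far.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified range tombstone covering keys in [start, end).
struct RangeTombstone {
  std::string start;
  std::string end;
  std::uint64_t seq;
};

// Combines the maximum covering sequence number carried down from higher
// levels with the tombstones local to the current level.
std::uint64_t MaxCoveringSeq(const std::vector<RangeTombstone>& local,
                             const std::string& key,
                             std::uint64_t max_from_higher_levels) {
  std::uint64_t max_seq = max_from_higher_levels;
  for (const RangeTombstone& t : local) {
    if (t.start <= key && key < t.end) {
      max_seq = std::max(max_seq, t.seq);
    }
  }
  return max_seq;
}

// A key version is hidden if a covering tombstone is newer than it.
bool IsCovered(std::uint64_t key_seq, std::uint64_t max_covering_seq) {
  return key_seq < max_covering_seq;
}
```

Only one integer crosses level boundaries, so the per-level cost no longer depends on how many tombstones were seen in earlier levels.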
In the benchmark results, the following command was used to initialize the database:
```
./db_bench -db=/dev/shm/5k-rts -use_existing_db=false -benchmarks=filluniquerandom -write_buffer_size=1048576 -compression_type=lz4 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -value_size=112 -key_size=16 -block_size=4096 -level_compaction_dynamic_level_bytes=true -num=5000000 -max_background_jobs=12 -benchmark_write_rate_limit=20971520 -range_tombstone_width=100 -writes_per_range_tombstone=100 -max_num_range_tombstones=50000 -bloom_bits=8
```
...and the following command was used to measure read throughput:
```
./db_bench -db=/dev/shm/5k-rts/ -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=5000000 -reads=100000 -threads=32
```
The filluniquerandom command was only run once, and the resulting database was used
to measure read performance before and after the PR. Both binaries were compiled with
`DEBUG_LEVEL=0`.
Readrandom results before PR:
```
readrandom : 4.544 micros/op 220090 ops/sec; 16.9 MB/s (63103 of 100000 found)
```
Readrandom results after PR:
```
readrandom : 11.147 micros/op 89707 ops/sec; 6.9 MB/s (63103 of 100000 found)
```
So it's actually slower right now, but this PR paves the way for future optimizations (see #4493).
----
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4449
Differential Revision: D10370575
Pulled By: abhimadan
fbshipit-source-id: 9a2e152be1ef36969055c0e9eb4beb0d96c11f4d
6 years ago
range_tombstone_fragmenter_test: db/range_tombstone_fragmenter_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

sst_file_reader_test: table/sst_file_reader_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

db_secondary_test: db/db_impl/db_secondary_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

block_cache_tracer_test: trace_replay/block_cache_tracer_test.o trace_replay/block_cache_tracer.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
Block cache simulator: Add pysim to simulate caches using reinforcement learning. (#5610)
Summary:
This PR implements cache eviction using reinforcement learning. It includes two implementations:
1. An implementation of Thompson Sampling for the Bernoulli Bandit [1].
2. An implementation of LinUCB with disjoint linear models [2].
The idea is that a cache uses multiple eviction policies, e.g., MRU, LRU, and LFU. The cache learns which eviction policy is the best and uses it upon a cache miss.
Thompson Sampling is contextless and does not include any features.
LinUCB includes features such as level, block type, caller, column family id to decide which eviction policy to use.
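For readers unfamiliar with Thompson Sampling for the Bernoulli bandit, here is a compact sketch in C++ (the PR itself adds a Python simulator, so this is not its code). `ThompsonPolicySelector` is a hypothetical name, and treating a later cache hit as the reward for a policy is an assumption made for illustration. Each policy keeps a Beta posterior, and a Beta draw is built from two Gamma draws.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Thompson Sampling over eviction policies with Bernoulli rewards.
// Each policy starts from a uniform Beta(1, 1) prior.
class ThompsonPolicySelector {
 public:
  ThompsonPolicySelector(std::size_t num_policies, unsigned seed)
      : successes_(num_policies, 1.0),
        failures_(num_policies, 1.0),
        rng_(seed) {
    assert(num_policies > 0);
  }

  // Draws one sample per policy from its posterior and picks the largest.
  std::size_t Select() {
    std::size_t best = 0;
    double best_sample = -1.0;
    for (std::size_t i = 0; i < successes_.size(); ++i) {
      const double sample = SampleBeta(successes_[i], failures_[i]);
      if (sample > best_sample) {
        best_sample = sample;
        best = i;
      }
    }
    return best;
  }

  // Records the observed reward (true = success) for the chosen policy.
  void Update(std::size_t policy, bool success) {
    assert(policy < successes_.size());
    if (success) {
      successes_[policy] += 1.0;
    } else {
      failures_[policy] += 1.0;
    }
  }

 private:
  // If X ~ Gamma(a, 1) and Y ~ Gamma(b, 1), then X / (X + Y) ~ Beta(a, b).
  double SampleBeta(double a, double b) {
    std::gamma_distribution<double> gamma_a(a, 1.0);
    std::gamma_distribution<double> gamma_b(b, 1.0);
    const double x = gamma_a(rng_);
    const double y = gamma_b(rng_);
    return x / (x + y);
  }

  std::vector<double> successes_;
  std::vector<double> failures_;
  std::mt19937 rng_;
};
```

Because selection is a random draw from each posterior, a policy with little evidence still gets tried occasionally, while a policy with a long record of successes is chosen almost every time.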
[1] Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, and Zheng Wen. 2018. A Tutorial on Thompson Sampling. Found. Trends Mach. Learn. 11, 1 (July 2018), 1-96. DOI: https://doi.org/10.1561/2200000070
[2] Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web (WWW '10). ACM, New York, NY, USA, 661-670. DOI=http://dx.doi.org/10.1145/1772690.1772758
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5610
Differential Revision: D16435067
Pulled By: HaoyuHuang
fbshipit-source-id: 6549239ae14115c01cb1e70548af9e46d8dc21bb
6 years ago
block_cache_trace_analyzer_test: tools/block_cache_analyzer/block_cache_trace_analyzer_test.o tools/block_cache_analyzer/block_cache_trace_analyzer.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

defer_test: util/defer_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

blob_file_addition_test: db/blob/blob_file_addition_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

blob_file_garbage_test: db/blob/blob_file_garbage_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)

timer_test: util/timer_test.o $(LIBOBJECTS) $(TESTHARNESS)
	$(AM_LINK)
#-------------------------------------------------
# make install related stuff
INSTALL_PATH ?= /usr/local
uninstall:
	rm -rf $(INSTALL_PATH)/include/rocksdb \
	  $(INSTALL_PATH)/lib/$(LIBRARY) \
	  $(INSTALL_PATH)/lib/$(SHARED4) \
	  $(INSTALL_PATH)/lib/$(SHARED3) \
	  $(INSTALL_PATH)/lib/$(SHARED2) \
	  $(INSTALL_PATH)/lib/$(SHARED1)

install-headers:
	install -d $(INSTALL_PATH)/lib
	for header_dir in `$(FIND) "include/rocksdb" -type d`; do \
	  install -d $(INSTALL_PATH)/$$header_dir; \
	done
	for header in `$(FIND) "include/rocksdb" -type f -name *.h`; do \
	  install -C -m 644 $$header $(INSTALL_PATH)/$$header; \
	done

install-static: install-headers $(LIBRARY)
	install -C -m 755 $(LIBRARY) $(INSTALL_PATH)/lib

install-shared: install-headers $(SHARED4)
	install -C -m 755 $(SHARED4) $(INSTALL_PATH)/lib && \
	  ln -fs $(SHARED4) $(INSTALL_PATH)/lib/$(SHARED3) && \
	  ln -fs $(SHARED4) $(INSTALL_PATH)/lib/$(SHARED2) && \
	  ln -fs $(SHARED4) $(INSTALL_PATH)/lib/$(SHARED1)

# install static by default + install shared if it exists
install: install-static
	[ -e $(SHARED4) ] && $(MAKE) install-shared || :
#-------------------------------------------------
Add a jni library for rocksdb which supports Open, Get, Put, and Close.
Summary:
This diff contains a simple jni library for rocksdb which supports open, get,
put and close using default options (including Options, ReadOptions, and
WriteOptions). In the usual case, Java developers can use the c++ rocksdb
library in a way similar to the following:
    RocksDB db = RocksDB.open(path_to_db);
    ...
    db.put("hello".getBytes(), "world".getBytes());
    byte[] value = db.get("hello".getBytes());
    ...
    db.close();
Specifically, this diff has the following major classes:
* RocksDB: a Java wrapper class which forwards the operations
  from the java side to the c++ rocksdb library.
* RocksDBException: encapsulates the error of an operation.
  This exception type is used to describe an internal error from
  the c++ rocksdb library.
This diff also includes simple java sample code calling the c++ rocksdb library.
To build the rocksdb jni library, simply run make jni, and make jtest will try to
build and run the sample code.
Note that if the rocksdb is not built with the default glibc that Java uses,
java will try to load the wrong glibc during the run time. As a result,
the sample code might not work properly during the run time.
Test Plan:
* make jni
* make jtest
Reviewers: haobo, dhruba, sdong, igor, ljin
Reviewed By: dhruba
CC: leveldb, xjin
Differential Revision: https://reviews.facebook.net/D17109
11 years ago
# ---------------------------------------------------------------------------
# Jni stuff
# ---------------------------------------------------------------------------
JAVA_INCLUDE = -I$(JAVA_HOME)/include/ -I$(JAVA_HOME)/include/linux
ifeq ($(PLATFORM), OS_SOLARIS)
  ARCH := $(shell isainfo -b)
else ifeq ($(PLATFORM), OS_OPENBSD)
  ifneq (,$(filter amd64 ppc64 ppc64le arm64 aarch64 sparc64, $(MACHINE)))
    ARCH := 64
  else
    ARCH := 32
  endif
else
  ARCH := $(shell getconf LONG_BIT)
endif

ifeq ($(shell ldd /usr/bin/env 2>/dev/null | grep -q musl; echo $$?),0)
  JNI_LIBC = musl
# GNU LibC (or glibc) is so pervasive we can assume it is the default
# else
#   JNI_LIBC = glibc
endif

ifneq ($(origin JNI_LIBC), undefined)
  JNI_LIBC_POSTFIX = -$(JNI_LIBC)
endif

ifneq (,$(filter ppc% arm64 aarch64 sparc64, $(MACHINE)))
  ROCKSDBJNILIB = librocksdbjni-linux-$(MACHINE)$(JNI_LIBC_POSTFIX).so
else
  ROCKSDBJNILIB = librocksdbjni-linux$(ARCH)$(JNI_LIBC_POSTFIX).so
endif
ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-linux$(ARCH)$(JNI_LIBC_POSTFIX).jar
ROCKSDB_JAR_ALL = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH).jar
ROCKSDB_JAVADOCS_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-javadoc.jar
ROCKSDB_SOURCES_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-sources.jar
SHA256_CMD = sha256sum

ZLIB_VER ?= 1.2.11
ZLIB_SHA256 ?= c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1
ZLIB_DOWNLOAD_BASE ?= http://zlib.net
BZIP2_VER ?= 1.0.6
BZIP2_SHA256 ?= a2848f34fcd5d6cf47def00461fcb528a0484d8edef8208d6d2e2909dc61d9cd
BZIP2_DOWNLOAD_BASE ?= https://downloads.sourceforge.net/project/bzip2
SNAPPY_VER ?= 1.1.8
SNAPPY_SHA256 ?= 16b677f07832a612b0836178db7f374e414f94657c138e6993cbfc5dcc58651f
SNAPPY_DOWNLOAD_BASE ?= https://github.com/google/snappy/archive
LZ4_VER ?= 1.9.2
LZ4_SHA256 ?= 658ba6191fa44c92280d4aa2c271b0f4fbc0e34d249578dd05e50e76d0e5efcc
LZ4_DOWNLOAD_BASE ?= https://github.com/lz4/lz4/archive
ZSTD_VER ?= 1.4.4
ZSTD_SHA256 ?= a364f5162c7d1a455cc915e8e3cf5f4bd8b75d09bc0f53965b0c9ca1383c52c8
ZSTD_DOWNLOAD_BASE ?= https://github.com/facebook/zstd/archive
CURL_SSL_OPTS ?= --tlsv1
ifeq ($(PLATFORM), OS_MACOSX)
  ROCKSDBJNILIB = librocksdbjni-osx.jnilib
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-osx.jar
  SHA256_CMD = openssl sha256 -r
  ifneq ("$(wildcard $(JAVA_HOME)/include/darwin)","")
    JAVA_INCLUDE = -I$(JAVA_HOME)/include -I $(JAVA_HOME)/include/darwin
  else
    JAVA_INCLUDE = -I/System/Library/Frameworks/JavaVM.framework/Headers/
  endif
endif
ifeq ($(PLATFORM), OS_FREEBSD)
  JAVA_INCLUDE = -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/freebsd
  ROCKSDBJNILIB = librocksdbjni-freebsd$(ARCH).so
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-freebsd$(ARCH).jar
endif
ifeq ($(PLATFORM), OS_SOLARIS)
  ROCKSDBJNILIB = librocksdbjni-solaris$(ARCH).so
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-solaris$(ARCH).jar
  JAVA_INCLUDE = -I$(JAVA_HOME)/include/ -I$(JAVA_HOME)/include/solaris
  SHA256_CMD = digest -a sha256
endif
ifeq ($(PLATFORM), OS_AIX)
  JAVA_INCLUDE = -I$(JAVA_HOME)/include/ -I$(JAVA_HOME)/include/aix
  ROCKSDBJNILIB = librocksdbjni-aix.so
  EXTRACT_SOURCES = gunzip < TAR_GZ | tar xvf -
  SNAPPY_MAKE_TARGET = libsnappy.la
endif
ifeq ($(PLATFORM), OS_OPENBSD)
  JAVA_INCLUDE = -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/openbsd
  ROCKSDBJNILIB = librocksdbjni-openbsd$(ARCH).so
  ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-openbsd$(ARCH).jar
endif
libz.a:
	-rm -rf zlib-$(ZLIB_VER)
ifeq (,$(wildcard ./zlib-$(ZLIB_VER).tar.gz))
	curl --fail --output zlib-$(ZLIB_VER).tar.gz --location ${ZLIB_DOWNLOAD_BASE}/zlib-$(ZLIB_VER).tar.gz
endif
	ZLIB_SHA256_ACTUAL=`$(SHA256_CMD) zlib-$(ZLIB_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(ZLIB_SHA256)" != "$$ZLIB_SHA256_ACTUAL" ]; then \
	  echo zlib-$(ZLIB_VER).tar.gz checksum mismatch, expected=\"$(ZLIB_SHA256)\" actual=\"$$ZLIB_SHA256_ACTUAL\"; \
	  exit 1; \
	fi
	tar xvzf zlib-$(ZLIB_VER).tar.gz
	cd zlib-$(ZLIB_VER) && CFLAGS='-fPIC ${EXTRA_CFLAGS}' LDFLAGS='${EXTRA_LDFLAGS}' ./configure --static && $(MAKE)
	cp zlib-$(ZLIB_VER)/libz.a .
libbz2.a:
	-rm -rf bzip2-$(BZIP2_VER)
ifeq (,$(wildcard ./bzip2-$(BZIP2_VER).tar.gz))
	curl --fail --output bzip2-$(BZIP2_VER).tar.gz --location ${CURL_SSL_OPTS} ${BZIP2_DOWNLOAD_BASE}/bzip2-$(BZIP2_VER).tar.gz
endif
	BZIP2_SHA256_ACTUAL=`$(SHA256_CMD) bzip2-$(BZIP2_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(BZIP2_SHA256)" != "$$BZIP2_SHA256_ACTUAL" ]; then \
	  echo bzip2-$(BZIP2_VER).tar.gz checksum mismatch, expected=\"$(BZIP2_SHA256)\" actual=\"$$BZIP2_SHA256_ACTUAL\"; \
	  exit 1; \
	fi
	tar xvzf bzip2-$(BZIP2_VER).tar.gz
	cd bzip2-$(BZIP2_VER) && $(MAKE) CFLAGS='-fPIC -O2 -g -D_FILE_OFFSET_BITS=64 ${EXTRA_CFLAGS}' AR='ar ${EXTRA_ARFLAGS}'
	cp bzip2-$(BZIP2_VER)/libbz2.a .
libsnappy.a:
	-rm -rf snappy-$(SNAPPY_VER)
ifeq (,$(wildcard ./snappy-$(SNAPPY_VER).tar.gz))
	curl --fail --output snappy-$(SNAPPY_VER).tar.gz --location ${CURL_SSL_OPTS} ${SNAPPY_DOWNLOAD_BASE}/$(SNAPPY_VER).tar.gz
endif
	SNAPPY_SHA256_ACTUAL=`$(SHA256_CMD) snappy-$(SNAPPY_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(SNAPPY_SHA256)" != "$$SNAPPY_SHA256_ACTUAL" ]; then \
	  echo snappy-$(SNAPPY_VER).tar.gz checksum mismatch, expected=\"$(SNAPPY_SHA256)\" actual=\"$$SNAPPY_SHA256_ACTUAL\"; \
	  exit 1; \
	fi
	tar xvzf snappy-$(SNAPPY_VER).tar.gz
	mkdir snappy-$(SNAPPY_VER)/build
	cd snappy-$(SNAPPY_VER)/build && CFLAGS='${EXTRA_CFLAGS}' CXXFLAGS='${EXTRA_CXXFLAGS}' LDFLAGS='${EXTRA_LDFLAGS}' cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON .. && $(MAKE) ${SNAPPY_MAKE_TARGET}
	cp snappy-$(SNAPPY_VER)/build/libsnappy.a .
liblz4.a:
	-rm -rf lz4-$(LZ4_VER)
ifeq (,$(wildcard ./lz4-$(LZ4_VER).tar.gz))
	curl --fail --output lz4-$(LZ4_VER).tar.gz --location ${CURL_SSL_OPTS} ${LZ4_DOWNLOAD_BASE}/v$(LZ4_VER).tar.gz
endif
	LZ4_SHA256_ACTUAL=`$(SHA256_CMD) lz4-$(LZ4_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(LZ4_SHA256)" != "$$LZ4_SHA256_ACTUAL" ]; then \
	  echo lz4-$(LZ4_VER).tar.gz checksum mismatch, expected=\"$(LZ4_SHA256)\" actual=\"$$LZ4_SHA256_ACTUAL\"; \
	  exit 1; \
	fi
	tar xvzf lz4-$(LZ4_VER).tar.gz
	cd lz4-$(LZ4_VER)/lib && $(MAKE) CFLAGS='-fPIC -O2 ${EXTRA_CFLAGS}' all
	cp lz4-$(LZ4_VER)/lib/liblz4.a .
libzstd.a:
	-rm -rf zstd-$(ZSTD_VER)
ifeq (,$(wildcard ./zstd-$(ZSTD_VER).tar.gz))
	curl --fail --output zstd-$(ZSTD_VER).tar.gz --location ${CURL_SSL_OPTS} ${ZSTD_DOWNLOAD_BASE}/v$(ZSTD_VER).tar.gz
endif
	ZSTD_SHA256_ACTUAL=`$(SHA256_CMD) zstd-$(ZSTD_VER).tar.gz | cut -d ' ' -f 1`; \
	if [ "$(ZSTD_SHA256)" != "$$ZSTD_SHA256_ACTUAL" ]; then \
		echo zstd-$(ZSTD_VER).tar.gz checksum mismatch, expected=\"$(ZSTD_SHA256)\" actual=\"$$ZSTD_SHA256_ACTUAL\"; \
		exit 1; \
	fi
	tar xvzf zstd-$(ZSTD_VER).tar.gz
	cd zstd-$(ZSTD_VER)/lib && DESTDIR=. PREFIX= $(MAKE) CFLAGS='-fPIC -O2 ${EXTRA_CFLAGS}' install
	cp zstd-$(ZSTD_VER)/lib/libzstd.a .
# A version of each $(LIBOBJECTS) compiled with -fPIC and a fixed set of static compression libraries
java_static_libobjects = $(patsubst %,jls/%,$(LIB_CC_OBJECTS))
CLEAN_FILES += jls
java_static_all_libobjects = $(java_static_libobjects)
ifneq ($(ROCKSDB_JAVA_NO_COMPRESSION), 1)
JAVA_COMPRESSIONS = libz.a libbz2.a libsnappy.a liblz4.a libzstd.a
endif
JAVA_STATIC_FLAGS = -DZLIB -DBZIP2 -DSNAPPY -DLZ4 -DZSTD
JAVA_STATIC_INCLUDES = -I./zlib-$(ZLIB_VER) -I./bzip2-$(BZIP2_VER) -I./snappy-$(SNAPPY_VER) -I./lz4-$(LZ4_VER)/lib -I./zstd-$(ZSTD_VER)/lib/include
ifeq ($(HAVE_POWER8),1)
JAVA_STATIC_C_LIBOBJECTS = $(patsubst %.c.o,jls/%.c.o,$(LIB_SOURCES_C:.c=.o))
JAVA_STATIC_ASM_LIBOBJECTS = $(patsubst %.S.o,jls/%.S.o,$(LIB_SOURCES_ASM:.S=.o))
java_static_ppc_libobjects = $(JAVA_STATIC_C_LIBOBJECTS) $(JAVA_STATIC_ASM_LIBOBJECTS)
jls/util/crc32c_ppc.o: util/crc32c_ppc.c
	$(AM_V_CC) $(CC) $(CFLAGS) $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES) -c $< -o $@
jls/util/crc32c_ppc_asm.o: util/crc32c_ppc_asm.S
	$(AM_V_CC) $(CC) $(CFLAGS) $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES) -c $< -o $@
java_static_all_libobjects += $(java_static_ppc_libobjects)
endif
$(java_static_libobjects): jls/%.o: %.cc $(JAVA_COMPRESSIONS)
	$(AM_V_CC) mkdir -p $(@D) && $(CXX) $(CXXFLAGS) $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES) -fPIC -c $< -o $@ $(COVERAGEFLAGS)
rocksdbjavastatic: $(java_static_all_libobjects)
	cd java; $(MAKE) javalib;
	rm -f ./java/target/$(ROCKSDBJNILIB)
	$(CXX) $(CXXFLAGS) -I./java/. $(JAVA_INCLUDE) -shared -fPIC \
	  -o ./java/target/$(ROCKSDBJNILIB) $(JNI_NATIVE_SOURCES) \
	  $(java_static_all_libobjects) $(COVERAGEFLAGS) \
	  $(JAVA_COMPRESSIONS) $(JAVA_STATIC_LDFLAGS)
	cd java/target; if [ "$(DEBUG_LEVEL)" == "0" ]; then \
		strip $(STRIPFLAGS) $(ROCKSDBJNILIB); \
	fi
	cd java; jar -cf target/$(ROCKSDB_JAR) HISTORY*.md
	cd java/target; jar -uf $(ROCKSDB_JAR) $(ROCKSDBJNILIB)
	cd java/target/classes; jar -uf ../$(ROCKSDB_JAR) org/rocksdb/*.class org/rocksdb/util/*.class
	cd java/target/apidocs; jar -cf ../$(ROCKSDB_JAVADOCS_JAR) *
	cd java/src/main/java; jar -cf ../../../target/$(ROCKSDB_SOURCES_JAR) org
rocksdbjavastaticrelease: rocksdbjavastatic
	cd java/crossbuild && (vagrant destroy -f || true) && vagrant up linux32 && vagrant halt linux32 && vagrant up linux64 && vagrant halt linux64 && vagrant up linux64-musl && vagrant halt linux64-musl
	cd java; jar -cf target/$(ROCKSDB_JAR_ALL) HISTORY*.md
	cd java/target; jar -uf $(ROCKSDB_JAR_ALL) librocksdbjni-*.so librocksdbjni-*.jnilib
	cd java/target/classes; jar -uf ../$(ROCKSDB_JAR_ALL) org/rocksdb/*.class org/rocksdb/util/*.class
rocksdbjavastaticreleasedocker: rocksdbjavastatic rocksdbjavastaticdockerx86 rocksdbjavastaticdockerx86_64 rocksdbjavastaticdockerx86musl rocksdbjavastaticdockerx86_64musl
	cd java; jar -cf target/$(ROCKSDB_JAR_ALL) HISTORY*.md
	cd java/target; jar -uf $(ROCKSDB_JAR_ALL) librocksdbjni-*.so librocksdbjni-*.jnilib
	cd java/target/classes; jar -uf ../$(ROCKSDB_JAR_ALL) org/rocksdb/*.class org/rocksdb/util/*.class
rocksdbjavastaticdockerx86:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x86-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos6_x86-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerx86_64:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x64-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos6_x64-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerppc64le:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_ppc64le-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos7_ppc64le-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerarm64v8:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_arm64v8-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:centos7_arm64v8-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerx86musl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x86-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_x86-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerx86_64musl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_x64-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_x64-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerppc64lemusl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_ppc64le-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_ppc64le-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticdockerarm64v8musl:
	mkdir -p java/target
	docker run --rm --name rocksdb_linux_arm64v8-musl-be --attach stdin --attach stdout --attach stderr --volume $(HOME)/.m2:/root/.m2:ro --volume `pwd`:/rocksdb-host:ro --volume /rocksdb-local-build --volume `pwd`/java/target:/rocksdb-java-target --env DEBUG_LEVEL=$(DEBUG_LEVEL) evolvedbinary/rocksjava:alpine3_arm64v8-be /rocksdb-host/java/crossbuild/docker-build-linux-centos.sh
rocksdbjavastaticpublish: rocksdbjavastaticrelease rocksdbjavastaticpublishcentral
rocksdbjavastaticpublishdocker: rocksdbjavastaticreleasedocker rocksdbjavastaticpublishcentral
rocksdbjavastaticpublishcentral:
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-javadoc.jar -Dclassifier=javadoc
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-sources.jar -Dclassifier=sources
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-linux64.jar -Dclassifier=linux64
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-linux32.jar -Dclassifier=linux32
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-linux64-musl.jar -Dclassifier=linux64-musl
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-linux32-musl.jar -Dclassifier=linux32-musl
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-osx.jar -Dclassifier=osx
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-win64.jar -Dclassifier=win64
	mvn gpg:sign-and-deploy-file -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=java/rocksjni.pom -Dfile=java/target/rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH).jar
# A version of each $(LIBOBJECTS) compiled with -fPIC
ifeq ($(HAVE_POWER8),1)
JAVA_CC_OBJECTS = $(SHARED_CC_OBJECTS)
JAVA_C_OBJECTS = $(SHARED_C_OBJECTS)
JAVA_ASM_OBJECTS = $(SHARED_ASM_OBJECTS)
JAVA_C_LIBOBJECTS = $(patsubst %.c.o,jl/%.c.o,$(JAVA_C_OBJECTS))
JAVA_ASM_LIBOBJECTS = $(patsubst %.S.o,jl/%.S.o,$(JAVA_ASM_OBJECTS))
endif
java_libobjects = $(patsubst %,jl/%,$(LIB_CC_OBJECTS))
CLEAN_FILES += jl
java_all_libobjects = $(java_libobjects)
ifeq ($(HAVE_POWER8),1)
java_ppc_libobjects = $(JAVA_C_LIBOBJECTS) $(JAVA_ASM_LIBOBJECTS)
jl/crc32c_ppc.o: util/crc32c_ppc.c
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
jl/crc32c_ppc_asm.o: util/crc32c_ppc_asm.S
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
java_all_libobjects += $(java_ppc_libobjects)
endif
$(java_libobjects): jl/%.o: %.cc
	$(AM_V_CC) mkdir -p $(@D) && $(CXX) $(CXXFLAGS) -fPIC -c $< -o $@ $(COVERAGEFLAGS)
rocksdbjava: $(java_all_libobjects)
	$(AM_V_GEN) cd java; $(MAKE) javalib;
	$(AM_V_at) rm -f ./java/target/$(ROCKSDBJNILIB)
	$(AM_V_at) $(CXX) $(CXXFLAGS) -I./java/. $(JAVA_INCLUDE) -shared -fPIC -o ./java/target/$(ROCKSDBJNILIB) $(JNI_NATIVE_SOURCES) $(java_all_libobjects) $(JAVA_LDFLAGS) $(COVERAGEFLAGS)
	$(AM_V_at) cd java; jar -cf target/$(ROCKSDB_JAR) HISTORY*.md
	$(AM_V_at) cd java/target; jar -uf $(ROCKSDB_JAR) $(ROCKSDBJNILIB)
	$(AM_V_at) cd java/target/classes; jar -uf ../$(ROCKSDB_JAR) org/rocksdb/*.class org/rocksdb/util/*.class
jclean:
	cd java; $(MAKE) clean;
jtest_compile: rocksdbjava
	cd java; $(MAKE) java_test
jtest_run:
	cd java; $(MAKE) run_test
jtest: rocksdbjava
	cd java; $(MAKE) sample; $(MAKE) test;
	$(PYTHON) tools/check_all_python.py # TODO peterd: find a better place for this check in CI targets
jdb_bench:
	cd java; $(MAKE) db_bench;
commit_prereq: build_tools/rocksdb-lego-determinator \
               build_tools/precommit_checker.py
	J=$(J) build_tools/precommit_checker.py unit unit_481 clang_unit release release_481 clang_release tsan asan ubsan lite unit_non_shm
	$(MAKE) clean && $(MAKE) jclean && $(MAKE) rocksdbjava;
# ---------------------------------------------------------------------------
# Platform-specific compilation
# ---------------------------------------------------------------------------
ifeq ($(PLATFORM), IOS)
# For iOS, create universal object files to be used on both the simulator and
# a device.
XCODEROOT = $(shell xcode-select -print-path)
PLATFORMSROOT = $(XCODEROOT)/Platforms
SIMULATORROOT = $(PLATFORMSROOT)/iPhoneSimulator.platform/Developer
DEVICEROOT = $(PLATFORMSROOT)/iPhoneOS.platform/Developer
IOSVERSION = $(shell defaults read $(PLATFORMSROOT)/iPhoneOS.platform/version CFBundleShortVersionString)
.cc.o:
	mkdir -p ios-x86/$(dir $@)
	$(CXX) $(CXXFLAGS) -isysroot $(SIMULATORROOT)/SDKs/iPhoneSimulator$(IOSVERSION).sdk -arch i686 -arch x86_64 -c $< -o ios-x86/$@
	mkdir -p ios-arm/$(dir $@)
	xcrun -sdk iphoneos $(CXX) $(CXXFLAGS) -isysroot $(DEVICEROOT)/SDKs/iPhoneOS$(IOSVERSION).sdk -arch armv6 -arch armv7 -arch armv7s -arch arm64 -c $< -o ios-arm/$@
	lipo ios-x86/$@ ios-arm/$@ -create -output $@
.c.o:
	mkdir -p ios-x86/$(dir $@)
	$(CC) $(CFLAGS) -isysroot $(SIMULATORROOT)/SDKs/iPhoneSimulator$(IOSVERSION).sdk -arch i686 -arch x86_64 -c $< -o ios-x86/$@
	mkdir -p ios-arm/$(dir $@)
	xcrun -sdk iphoneos $(CC) $(CFLAGS) -isysroot $(DEVICEROOT)/SDKs/iPhoneOS$(IOSVERSION).sdk -arch armv6 -arch armv7 -arch armv7s -arch arm64 -c $< -o ios-arm/$@
	lipo ios-x86/$@ ios-arm/$@ -create -output $@
else
ifeq ($(HAVE_POWER8),1)
util/crc32c_ppc.o: util/crc32c_ppc.c
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
util/crc32c_ppc_asm.o: util/crc32c_ppc_asm.S
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
endif
.cc.o:
	$(AM_V_CC) $(CXX) $(CXXFLAGS) -c $< -o $@ $(COVERAGEFLAGS)
.cpp.o:
	$(AM_V_CC) $(CXX) $(CXXFLAGS) -c $< -o $@ $(COVERAGEFLAGS)
.c.o:
	$(AM_V_CC) $(CC) $(CFLAGS) -c $< -o $@
endif
# ---------------------------------------------------------------------------
# Source files dependencies detection
# ---------------------------------------------------------------------------
all_sources = $(LIB_SOURCES) $(MAIN_SOURCES) $(MOCK_LIB_SOURCES) $(TOOL_LIB_SOURCES) $(BENCH_LIB_SOURCES) $(TEST_LIB_SOURCES) $(ANALYZER_LIB_SOURCES) $(STRESS_LIB_SOURCES)
DEPFILES = $(all_sources:.cc=.cc.d)
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
DEPFILES += $(FOLLY_SOURCES:.cpp=.cpp.d)
endif
# Add proper dependency support so changing a .h file forces a .cc file to
# rebuild.
# The .d file indicates .cc file's dependencies on .h files. We generate such
# dependency by g++'s -MM option, whose output is a make dependency rule.
%.cc.d: %.cc
	@$(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.cc=.o)' "$<" -o '$@'
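# Illustration only, with hypothetical file names: for db/builder.cc, the
# two -MT options above should make the generated db/builder.cc.d contain a
# rule of roughly this shape, so that a header change rebuilds both the
# object file and the .d file itself:
#   db/builder.cc.d db/builder.o: db/builder.cc db/builder.h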
%.cpp.d: %.cpp
	@$(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.cpp=.o)' "$<" -o '$@'
ifeq ($(HAVE_POWER8),1)
DEPFILES_C = $(LIB_SOURCES_C:.c=.c.d)
DEPFILES_ASM = $(LIB_SOURCES_ASM:.S=.S.d)
%.c.d: %.c
	@$(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.c=.o)' "$<" -o '$@'
%.S.d: %.S
	@$(CXX) $(CXXFLAGS) $(PLATFORM_SHARED_CFLAGS) \
	  -MM -MT'$@' -MT'$(<:.S=.o)' "$<" -o '$@'
$(DEPFILES_C): %.c.d
$(DEPFILES_ASM): %.S.d
depend: $(DEPFILES) $(DEPFILES_C) $(DEPFILES_ASM)
else
depend: $(DEPFILES)
endif
# if the make goal is either "clean" or "format", we shouldn't
# try to import the *.d files.
# TODO(kailiu) The unfamiliarity of Make's conditions leads to the ugly
# working solution.
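# One possible flatter form, sketched as a comment only (an assumption, not
# what this Makefile does): GNU make's $(filter) can test every goal in a
# single conditional. NO_DEP_GOALS is a hypothetical variable name. Unlike
# the exact string comparisons below, this would also skip the include when
# one of these goals appears alongside others (e.g. "make clean all").
#   NO_DEP_GOALS := clean format jclean jtest package analyze
#   ifeq (,$(filter $(NO_DEP_GOALS),$(MAKECMDGOALS)))
#   -include $(DEPFILES)
#   endif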
ifneq ($(MAKECMDGOALS),clean)
ifneq ($(MAKECMDGOALS),format)
ifneq ($(MAKECMDGOALS),jclean)
ifneq ($(MAKECMDGOALS),jtest)
ifneq ($(MAKECMDGOALS),package)
ifneq ($(MAKECMDGOALS),analyze)
-include $(DEPFILES)
endif
endif
endif
endif
Package generation for Ubuntu and CentOS
Summary:
I put together a script to assist in the generation of debs and
rpms. I've tested that this works on ubuntu via vagrant. I've included the
Vagrantfile here, but I can remove it if it's not useful. The package.sh
script should work on any ubuntu or centos machine; I just added a bit of
logic in there to allow a base Ubuntu or CentOS machine to build
RocksDB from scratch.
Example output on Ubuntu 14.04:
```
root@vagrant-ubuntu-trusty-64:/vagrant# ./tools/package.sh
[+] g++-4.7 is already installed. skipping.
[+] libgflags-dev is already installed. skipping.
[+] ruby-all-dev is already installed. skipping.
[+] fpm is already installed. skipping.
Created package {:path=>"rocksdb_3.5_amd64.deb"}
root@vagrant-ubuntu-trusty-64:/vagrant# dpkg --info rocksdb_3.5_amd64.deb
new debian package, version 2.0.
size 17392022 bytes: control archive=1518 bytes.
275 bytes, 11 lines control
2911 bytes, 38 lines md5sums
Package: rocksdb
Version: 3.5
License: BSD
Vendor: Facebook
Architecture: amd64
Maintainer: rocksdb@fb.com
Installed-Size: 83358
Section: default
Priority: extra
Homepage: http://rocksdb.org/
Description: RocksDB is an embeddable persistent key-value store for fast storage.
```
Example output on CentOS 6.5:
```
[root@localhost vagrant]# rpm -qip rocksdb-3.5-1.x86_64.rpm
Name : rocksdb Relocations: /usr
Version : 3.5 Vendor: Facebook
Release : 1 Build Date: Mon 29 Sep 2014 01:26:11 AM UTC
Install Date: (not installed) Build Host: localhost
Group : default Source RPM: rocksdb-3.5-1.src.rpm
Size : 96231106 License: BSD
Signature : (none)
Packager : rocksdb@fb.com
URL : http://rocksdb.org/
Summary : RocksDB is an embeddable persistent key-value store for fast storage.
Description :
RocksDB is an embeddable persistent key-value store for fast storage.
```
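The "[+] ... is already installed. skipping." lines in the Ubuntu output suggest that the script checks for each build dependency before installing it. A minimal sketch of that kind of check follows. It is not the actual contents of tools/package.sh: the function names `is_installed` and `ensure_installed` are hypothetical, it assumes `dpkg-query` is present on Ubuntu and `rpm` on CentOS, and it only reports a missing package rather than installing it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the real tools/package.sh.

# Return 0 if the named package is installed, using whichever
# package database tool is present on this machine.
is_installed() {
  local pkg="$1"
  if command -v dpkg-query >/dev/null 2>&1; then
    # dpkg-query prints "install ok installed" for an installed package.
    dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"
  elif command -v rpm >/dev/null 2>&1; then
    rpm -q "$pkg" >/dev/null 2>&1
  else
    # Neither tool found: treat the package as not installed.
    return 1
  fi
}

# Print a skip message when the package is present; otherwise report
# that it is missing (this sketch performs no installation).
ensure_installed() {
  local pkg="$1"
  if is_installed "$pkg"; then
    echo "[+] $pkg is already installed. skipping."
  else
    echo "[+] $pkg is missing. would install."
  fi
}
```

Calling `ensure_installed libgflags-dev` prints one of the two messages, depending on the machine. A real script would replace the second branch with the distribution's install command.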
Test Plan:
How this gets used is really up to the RocksDB core team. If you
want to actually get this into mainline, you might have to change `make
install` such that it installs the RocksDB shared object file as well, which
would require you to link against gflags (maybe?) and that would require some
potential modifications to the script here (basically add a depends on that
package).
Currently, this will install the headers and a pre-compiled statically linked
object file. If that's what you want out of life, then this requires no
modifications.
Reviewers: ljin, yhchiang, igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D24141
10 years ago
endif
endif