Summary:
Major changes in this PR:
* Implement CassandraCompactionFilter to remove expired columns and rows (if all column expired)
* Move cassandra related code from utilities/merge_operators/cassandra to utilities/cassandra/*
* Switch to use shared_ptr<> from uniqu_ptr for Column membership management in RowValue. Since columns do have multiple owners in Merge and GC process, use shared_ptr helps make RowValue immutable.
* Rename cassandra_merge_test to cassandra_functional_test and add two TTL compaction related tests there.
Closes https://github.com/facebook/rocksdb/pull/2588
Differential Revision: D5430010
Pulled By: wpc
fbshipit-source-id: 9566c21e06de17491d486a68c70f52d501f27687
Summary:
Instead of ignoring UBSan checks, fix the negative shifts in
Hash(). Also add test to make sure the hash values are stable over
time. The values were computed before this change, so the test also
verifies the correctness of the change.
Closes https://github.com/facebook/rocksdb/pull/2546
Differential Revision: D5386369
Pulled By: yiwu-arbug
fbshipit-source-id: 6de4b44461a544d6222cc5d72d8cda2c0373d17e
Summary:
1. The buckifier script assume each test "foo" comes with a .cc file of the same name (i.e. foo.cc). Update cassandra tests to follow this pattern so that the buckifier script can recognize them.
2. add blob_db_test
Closes https://github.com/facebook/rocksdb/pull/2506
Differential Revision: D5331517
Pulled By: yiwu-arbug
fbshipit-source-id: 86f3eba471fc621186ab44cbd073b6162cde8e57
Summary:
This PR adds support for encrypting data stored by RocksDB when written to disk.
It adds an `EncryptedEnv` override of the `Env` class with matching overrides for sequential&random access files.
The encryption itself is done through a configurable `EncryptionProvider`. This class creates is asked to create `BlockAccessCipherStream` for a file. This is where the actual encryption/decryption is being done.
Currently there is a Counter mode implementation of `BlockAccessCipherStream` with a `ROT13` block cipher (NOTE the `ROT13` is for demo purposes only!!).
The Counter operation mode uses an initial counter & random initialization vector (IV).
Both are created randomly for each file and stored in a 4K (default size) block that is prefixed to that file. The `EncryptedEnv` implementation is such that clients of the `Env` class do not see this prefix (nor data, nor in filesize).
The largest part of the prefix block is also encrypted, and there is room left for implementation specific settings/values/keys in there.
To test the encryption, the `DBTestBase` class has been extended to consider a new environment variable called `ENCRYPTED_ENV`. If set, the test will setup a encrypted instance of the `Env` class to use for all tests.
Typically you would run it like this:
```
ENCRYPTED_ENV=1 make check_some
```
There is also an added test that checks that some data inserted into the database is or is not "visible" on disk. With `ENCRYPTED_ENV` active it must not find plain text strings, with `ENCRYPTED_ENV` unset, it must find the plain text strings.
Closes https://github.com/facebook/rocksdb/pull/2424
Differential Revision: D5322178
Pulled By: sdwilsh
fbshipit-source-id: 253b0a9c2c498cc98f580df7f2623cbf7678a27f
Summary:
Improve write buffer manager in several ways:
1. Size is tracked when arena block is allocated, rather than every allocation, so that it can better track actual memory usage and the tracking overhead is slightly lower.
2. We start to trigger memtable flush when 7/8 of the memory cap hits, instead of 100%, and make 100% much harder to hit.
3. Allow a cache object to be passed into buffer manager and the size allocated by memtable can be costed there. This can help users have one single memory cap across block cache and memtable.
Closes https://github.com/facebook/rocksdb/pull/2350
Differential Revision: D5110648
Pulled By: siying
fbshipit-source-id: b4238113094bf22574001e446b5d88523ba00017
Summary:
Blob db rely on base db returning sequence number through write batch after DB::Write(). However after recent changes to the write path, DB::Writ()e no longer return sequence number in some cases. Fixing it by have WriteBatchInternal::InsertInto() always encode sequence number into write batch.
Stacking on #2375.
Closes https://github.com/facebook/rocksdb/pull/2385
Differential Revision: D5148358
Pulled By: yiwu-arbug
fbshipit-source-id: 8bda0aa07b9334ed03ed381548b39d167dc20c33
Summary:
Re-enable blob_db_test with some update:
* Commented out delay at the end of GC tests. Will update the logic later with sync point to properly trigger GC.
* Added some helper functions.
Also update make files to include blob_dump tool.
Closes https://github.com/facebook/rocksdb/pull/2375
Differential Revision: D5133793
Pulled By: yiwu-arbug
fbshipit-source-id: 95470b26d0c1f9592ba4b7637e027fdd263f425c
Summary:
zstd files are downloaded and used as part of JNI build, but are left behind even after doing a `make clean`. This PR updates the `clean` target to remove these zstd files as well.
Closes https://github.com/facebook/rocksdb/pull/2365
Differential Revision: D5123537
Pulled By: sagar0
fbshipit-source-id: a8f355da5ba961aa89d5852e35751ffc35de03ea
Summary:
added ctags -e to the tags target in the makefile. It creates an etags file suitable for emacs.
Closes https://github.com/facebook/rocksdb/pull/2193
Differential Revision: D4983535
Pulled By: siying
fbshipit-source-id: 1077ef0676025b8109df37433572533c9e8fe86e
Summary:
-pic seems to be not working in gcc-5 and it is curently broken. Remove it to fix the build.
Closes https://github.com/facebook/rocksdb/pull/2320
Differential Revision: D5082775
Pulled By: siying
fbshipit-source-id: 5055f987353f1417643a394e7ce05905670410a4
Summary:
snprintf is in <stdio.h> and not in namespace std.
Closes https://github.com/facebook/rocksdb/pull/2287
Reviewed By: anirbanr-fb
Differential Revision: D5054752
Pulled By: yiwu-arbug
fbshipit-source-id: 356807ec38f3c7d95951cdb41f31a3d3ae0714d4
Summary:
As an alternative to Vagrant, we can now also use Docker to cross-build RocksDB. The advantages are:
1. The Docker images are fixed; they include all the latest updates and build tools.
2. The Vagrant image, required scripts that ran for every build that would update CentOS and install the buildtools. This lead to slow repeatable builds, we don't need to do this with Docker as they are already in the provided images.
The Docker images I have used have their Docker build files here: https://github.com/evolvedbinary/docker-rocksjava and the images themselves are available from Docker hub: https://hub.docker.com/r/evolvedbinary/rocksjava/
I have added the following targets to the `Makefile`:
1. `rocksdbjavastaticreleasedocker` this uses Docker to perform the cross-builds. It is basically the Docker version of the existing Vagrant `rocksdbjavastaticrelease` target.
2. `rocksdbjavastaticpublishdocker` delegates to `rocksdbjavastaticreleasedocker` and then `rocksdbjavastaticpublishcentral` to upload the artiacts to Maven Central. Equivalent to the existing Vagrant target: `rocksdbjavastaticpublish`
Closes https://github.com/facebook/rocksdb/pull/2278
Differential Revision: D5048206
Pulled By: yiwu-arbug
fbshipit-source-id: 78fa96ef9d966fe09638ed01de282cd4e31961a9
Summary:
When building rocksdb in fbcode using `make`, util/build_version.cc is always updated (gitignore/hgignore doesn't apply because the file is already checked into fbcode). To use the rocksdb makefile from our own makefile, I would like an option to prevent the metadata update, which is of no value for us.
Closes https://github.com/facebook/rocksdb/pull/2264
Differential Revision: D5037846
Pulled By: siying
fbshipit-source-id: 9fa005725c5ecb31d9cbe2e738cbee209591f08a
Summary:
I'm going to add more DB tests for statistics as currently we have very few. I started a file dedicated to this purpose and moved the existing stats-specific tests there.
Closes https://github.com/facebook/rocksdb/pull/2211
Differential Revision: D4951558
Pulled By: ajkr
fbshipit-source-id: 05d11c35079c40ecabdfd2cf5556ccb761f694a4
Summary:
Replacement of #2147
The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194
Differential Revision: D4929799
Pulled By: siying
fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
Summary:
Previously, the shared library (make shared_lib) was built with only one
compile line, compiling all .cc files and linking the shared library in
one step. That step would often take 10+ minutes on one machine, and
could not take advantage of multiple CPUs (it's only one invocation of
the compiler).
This commit changes the shared_lib build to compile .o files
individually (placing the resulting .o files in the directory
shared-objects) and then link them into the shared library at the end,
similarly to how the java static build (jls) does it.
Tested by making sure that both static and shared libraries work, and by
making sure that "make clean" cleans up the shared-objects directory.
Closes https://github.com/facebook/rocksdb/pull/2165
Differential Revision: D4897121
Pulled By: yiwu-arbug
fbshipit-source-id: 9811e043d1c01e10503593f3489d186c786ee7d7
Summary:
In some CI test environment, compression libraries can't be successfully built. It still helps to build RocksDB there. Provide such an option to skip to download and build compression libraries.
Closes https://github.com/facebook/rocksdb/pull/2135
Differential Revision: D4872617
Pulled By: siying
fbshipit-source-id: bb21ac373bc62a2528cdf1ca4547e05fcae86214
Summary:
siying this is a resubmission of #2081 with the 4th commit fixed. From that commit message:
> Note that the previous use of quotes in PLATFORM_{CC,CXX}FLAGS was
incorrect and caused GCC to produce the incorrect define:
>
> #define ROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE 1
>
> This was the cause of the Linux build failure on the previous version
of this change.
I've tested this locally, and the Linux build succeeds now.
Closes https://github.com/facebook/rocksdb/pull/2097
Differential Revision: D4839964
Pulled By: siying
fbshipit-source-id: cc51322
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090
Differential Revision: D4833681
Pulled By: siying
fbshipit-source-id: 2fd8bef
Summary:
I've needed Env timing measurements a few times now, so finally built something for it.
Closes https://github.com/facebook/rocksdb/pull/2073
Differential Revision: D4811231
Pulled By: ajkr
fbshipit-source-id: 218a249
Summary:
It is confusing to have auto_roll_logger to stay under db/, which has nothing to do with database. Move filename together as it is a dependency.
Closes https://github.com/facebook/rocksdb/pull/2080
Differential Revision: D4821141
Pulled By: siying
fbshipit-source-id: ca7d768
Summary:
Right now, building rocksdbjava in PowerPC is broken due to JNI library name. I figured it out that "uname -m" and java's os.arch matches in PowerPC architecture. I made use of this advantage to fix the issue. More info can found from this issue --> https://github.com/facebook/rocksdb/issues/1317
Closes https://github.com/facebook/rocksdb/pull/2040
Differential Revision: D4779967
Pulled By: siying
fbshipit-source-id: 259f939
Summary:
After we have db_basic_test and external_sst_file_basic_test, we don't need to run db_test and external_sst_file_test in Travis's MAC OS run anymore. Move it out.
Closes https://github.com/facebook/rocksdb/pull/1940
Differential Revision: D4659361
Pulled By: siying
fbshipit-source-id: e64e291
Summary:
Separate the platform dependent tests from external_sst_file_test. Only those tests need to run on platforms like OSX
Closes https://github.com/facebook/rocksdb/pull/1923
Differential Revision: D4622461
Pulled By: siying
fbshipit-source-id: d2d6f04
Summary:
Separate a smal subset of tests in DBTest to DBBasicTest. Tests in DBTest don't have to run in CI tests on platforms like OSX, as long as they are covered by Linux.
Closes https://github.com/facebook/rocksdb/pull/1924
Differential Revision: D4616702
Pulled By: siying
fbshipit-source-id: 13e6549
Summary:
Travis is short of OSX resource. Try to move platform independent test suites out of OSX
Closes https://github.com/facebook/rocksdb/pull/1922
Differential Revision: D4616070
Pulled By: siying
fbshipit-source-id: 786342c
Summary:
valgrind tests always timeout with parallel run. Black list some slowest ones. It is better to run fewer tests than always have the tests timeout.
Closes https://github.com/facebook/rocksdb/pull/1908
Differential Revision: D4607875
Pulled By: siying
fbshipit-source-id: 7062664
Summary:
The previous version of zlib is no longer available. I have also updated the versions of the other static libraries and added checkum checks for the downloads; This is related to https://github.com/facebook/rocksdb/issues/1769
Closes https://github.com/facebook/rocksdb/pull/1863
Differential Revision: D4550742
Pulled By: yiwu-arbug
fbshipit-source-id: 4414150
Summary:
The Env registration framework supports registering client Envs and selecting which one to instantiate according to a text field. This enabled things like adding the -env_uri argument to db_bench, so the same binary could be reused with different Envs just by changing CLI config.
Now this problem has come up again in a non-Env context, as I want to instantiate a client Statistics implementation from db_bench, which is configured entirely via text parameters. Also, in the future we may wish to use it for deserializing client objects when loading OPTIONS file.
This diff generalizes the Env registration logic to work with arbitrary types.
- Generalized registration and instantiation code by templating them
- The entire implementation is in a header file as that's Google style guide's recommendation for template definitions
- Pattern match with std::regex_match rather than checking prefix, which was the previous behavior
- Rename functions/files to be non-Env-specific
Closes https://github.com/facebook/rocksdb/pull/1776
Differential Revision: D4421933
Pulled By: ajkr
fbshipit-source-id: 34647d1
Summary:
Cleanable objects will perform the registered cleanups when
they are destructed. We however rather to delay this cleaning like when
we are gathering the merge operands. Current approach is to create the
Cleanable object on heap (instead of on stack) and delay deleting it.
By allowing Cleanables to delegate their cleanups to another cleanable
object we can delay the cleaning without however the need to craete the
cleanable object on heap and keeping it around. This patch applies this
technique for the cleanups of BlockIter and shows improved performance
for some in-memory benchmarks:
+1.8% for merge worklaod, +6.4% for non-merge workload when the merge
operator is specified.
https://our.intern.facebook.com/intern/tasks?t=15168163
Non-merge benchmark:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench --benchmarks=fillrandom
--num=1000000 -value_size=100 -compression_type=none
Reading random with no merge operator specified:
TEST_TMPDIR=/dev/shm/v100nocomp/ ./db_bench
--benchmarks="read
Closes https://github.com/facebook/rocksdb/pull/1711
Differential Revision: D4361163
Pulled By: maysamyabandeh
fbshipit-source-id: 9801e07
Summary:
Added a tombstone-collapsing mode to RangeDelAggregator, which eliminates overlap in the TombstoneMap. In this mode, we can check whether a tombstone covers a user key using upper_bound() (i.e., binary search). However, the tradeoff is the overhead to add tombstones is now higher, so at first I've only enabled it for range scans (compaction/flush/user iterators), where we expect a high number of calls to ShouldDelete() for the same tombstones. Point queries like Get() will still use the linear scan approach.
Also in this diff I changed RangeDelAggregator's TombstoneMap to use multimap with user keys instead of map with internal keys. Callers sometimes provided ParsedInternalKey directly, from which it would've required string copying to derive an internal key Slice with which we could search the map.
Closes https://github.com/facebook/rocksdb/pull/1614
Differential Revision: D4270397
Pulled By: ajkr
fbshipit-source-id: 93092c7
Summary:
Iterator should be in corrupted status if merge operator return false.
Also add test to make sure if max_successive_merges is hit during write,
data will not be lost.
Closes https://github.com/facebook/rocksdb/pull/1665
Differential Revision: D4322695
Pulled By: yiwu-arbug
fbshipit-source-id: b327b05
Summary:
currently when running a portable build we have to do the following
PORTABLE=1 make ...
this commit adds support for the following
make PORTABLE=1 ...
this might be seem subtle but it makes PORTABLE like all other
makefile args and simplifies invocation from numerous build systems
including things like ExternalProject_Add in cmake.
Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
Closes https://github.com/facebook/rocksdb/pull/1643
Differential Revision: D4315870
Pulled By: yiwu-arbug
fbshipit-source-id: ee43755
Summary:
The second variable "SHELL" simply tells make explicitly which shell to use, instead of allowing it to default to "/bin/sh", which may or may not be Bash.
However, simply defining the second variable by itself causes make to throw an error concerning a circular definition, as it would be attempting to use the "shell" command while simultaneously trying to set which shell to use. Thus, the first variable "BASH_EXISTS" is defined such that make already knows about "/path/to/bash" before trying to use it to set "SHELL".
A more technically correct solution would be to edit the makefile itself to make it compatible with non-bash shells (see the original Issue discussion for details). However, as it seems very few of the people working on this project were building with non-bash shells, I figured this solution would be good enough.
Closes https://github.com/facebook/rocksdb/pull/1631
Differential Revision: D4295689
Pulled By: yiwu-arbug
fbshipit-source-id: e4f9532