Summary:
In our testing cluster, we found large amount tombstone has been promoted to kValue type from kMerge after reaching the top level of compaction. Since we used to only collecting tombstone in merge operator, those tombstones can never be collected.
This PR addresses the issue by adding a GC step in compaction filter, which is only for kValue type records. Since those record already reached the top of compaction (no earlier data exists) we can safely remove them in compaction filter without worrying old data appears.
This PR also removes an old optimization in cassandra merge operator for single merge operands. We need to do GC even on a single operand, so the optimation does not make sense anymore.
Closes https://github.com/facebook/rocksdb/pull/2855
Reviewed By: sagar0
Differential Revision: D5806445
Pulled By: wpc
fbshipit-source-id: 6eb25629d4ce917eb5e8b489f64a6aa78c7d270b
Summary:
Plumbed ReadOptions::iterate_upper_bound through JNI.
Made the following design choices:
* Used Slice instead of AbstractSlice due to the anticipated usecase (key / key prefix). Can change this if anyone disagrees.
* Used Slice instead of raw byte[] which seemed cleaner but necessitated the package-private handle-based Slice constructor. Followed WriteBatch as an example.
* We need a copy constructor for ReadOptions, as we create one base ReadOptions for a particular usecase and clone -> change the iterate_upper_bound on each slice operation. Shallow copy seemed cleanest.
* Hold a reference to the upper bound slice on ReadOptions, in contrast to Snapshot.
Signed a Facebook CLA this morning.
Closes https://github.com/facebook/rocksdb/pull/2872
Differential Revision: D5824446
Pulled By: sagar0
fbshipit-source-id: 74fc51313a10a81ecd348625e2a50ca5b7766888
Summary:
As discussed in #2742 , this pull-requests brings the iterator's [SeekForPrev()](https://github.com/facebook/rocksdb/wiki/SeekForPrev) functionality to the java-api. It affects all locations in the code where previously only Seek() was supported.
All code changes are essentially a copy & paste of the already existing implementations for Seek().
**Please Note**: the changes to the C++ code were applied without fully understanding its effect, so please take a closer look. However, since Seek() and SeekForPrev() provide exactly the same signature, I do not expect any mistake here.
The java-tests are extended by new tests for the additional functionality.
Compilation (`make rocksdbjavastatic`) and test (`java/make test`) run without errors.
Closes https://github.com/facebook/rocksdb/pull/2747
Differential Revision: D5721011
Pulled By: sagar0
fbshipit-source-id: c1f951cddc321592c70dd2d32bc04892f3f119f8
Summary:
Remove cassandra tombstone when reaching the max compaction level (full merge). if all columns collected key will be removed in next compaction via compaction filter
Closes https://github.com/facebook/rocksdb/pull/2791
Reviewed By: sagar0
Differential Revision: D5722465
Pulled By: wpc
fbshipit-source-id: 61e9898a5686551653a16383255aeaab3197e65e
Summary:
I might have missed these while doing some recent cassandra code reviews.
Closes https://github.com/facebook/rocksdb/pull/2663
Differential Revision: D5520138
Pulled By: sagar0
fbshipit-source-id: 340930afe9efe03c75f535a1da1f89bd3e53c1f9
Summary:
I haven't looked to see if a class variable inside a loop like this is always initialised.
Closes https://github.com/facebook/rocksdb/pull/2602
Differential Revision: D5475937
Pulled By: IslamAbdelRahman
fbshipit-source-id: 8570b308f9a4b49e2a56ccc9e9b84d7c46568c15
Summary:
Major changes in this PR:
* Implement CassandraCompactionFilter to remove expired columns and rows (if all column expired)
* Move cassandra related code from utilities/merge_operators/cassandra to utilities/cassandra/*
* Switch to use shared_ptr<> from uniqu_ptr for Column membership management in RowValue. Since columns do have multiple owners in Merge and GC process, use shared_ptr helps make RowValue immutable.
* Rename cassandra_merge_test to cassandra_functional_test and add two TTL compaction related tests there.
Closes https://github.com/facebook/rocksdb/pull/2588
Differential Revision: D5430010
Pulled By: wpc
fbshipit-source-id: 9566c21e06de17491d486a68c70f52d501f27687
Summary:
Adding SSTFileWriter's newly introduced put, merge and delete apis to the Java api. The C++ APIs were first introduced in #2361.
Add is deprecated in favor of Put.
Merge is especially needed to support streaming for Cassandra-on-RocksDB work in https://issues.apache.org/jira/browse/CASSANDRA-13476.
Closes https://github.com/facebook/rocksdb/pull/2392
Differential Revision: D5165091
Pulled By: sagar0
fbshipit-source-id: 6f0ad396a7cbd2e27ca63e702584784dd72acaab
Summary:
Previously sst_file_writer only supports kTypeValue, we need kTypeMerge and kTypeDeletion also as user requested.
Closes https://github.com/facebook/rocksdb/pull/2361
Differential Revision: D5139402
Pulled By: lightmark
fbshipit-source-id: 092a60756d01692539d817a3765ebfd58a8d7f88
Summary:
Previously the Java implementation of `RocksDB#addFile` was both incomplete and not inline with the C++ API.
Rather than fix it, as I see that `rocksdb::DB::AddFile` is now deprecated in favour of `rocksdb::DB::IngestExternalFile`, I have removed the old broken implementation and implemented `RocksDB#ingestExternalFile`.
Closes https://github.com/facebook/rocksdb/issues/2261
Closes https://github.com/facebook/rocksdb/pull/2291
Differential Revision: D5061264
Pulled By: sagar0
fbshipit-source-id: 85df0899fa1b1fc3535175cac4f52353511d4104
Summary:
Replacement of #2147
The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194
Differential Revision: D4929799
Pulled By: siying
fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
Summary:
Replace Options::use_direct_writes with Options::use_direct_io_for_flush_and_compaction
Now if Options::use_direct_io_for_flush_and_compaction = true, we will enable direct io for both reads and writes for flush and compaction job. Whereas Options::use_direct_reads controls user reads like iterator and Get().
Closes https://github.com/facebook/rocksdb/pull/2117
Differential Revision: D4860912
Pulled By: lightmark
fbshipit-source-id: d93575a8a5e780cf7e40797287edc425ee648c19
Summary:
This is an effort to club all string related utility functions into one common place, in string_util, so that it is easier for everyone to know what string processing functions are available. Right now they seem to be spread out across multiple modules, like logging and options_helper.
Check the sub-commits for easier reviewing.
Closes https://github.com/facebook/rocksdb/pull/2094
Differential Revision: D4837730
Pulled By: sagar0
fbshipit-source-id: 344278a
Summary:
While running `make jtest` using IBM Java, it fails at compactRangeToLevel with the below error.
```
Run: org.rocksdb.RocksDBTest testing now -> compactRangeToLevel
JVMJNCK056E JNI error in ReleaseByteArrayElements: Got memory 0x00003FFF94AA8908 from object 0x00000000000C7F78, releasing from 0x00000000000C7F68
JVMJNCK077E Error detected in org/rocksdb/RocksDB.compactRange0(J[BI[BIZII)V
JVMJNCK024E JNI error detected. Aborting.
JVMJNCK025I Use -Xcheck:jni:nonfatal to continue running when errors are detected.
Fatal error: JNI error
Makefile:205: recipe for target 'run_test' failed
make[1]: *** [run_test] Error 87
make[1]: Leaving directory '/home/ubuntu/rocksdb/java'
Makefile:1542: recipe for target 'jtest' failed
make: *** [jtest] Error 2
```
After checking the code, it is vivid that we are messing up the `ReleaseByteArrayElements` args in `rocksdb_compactrange_helper`.
```
.................
1959 s = db->CompactRange(compact_options, &begin_slice, &end_slice);
1960 }
Closes https://github.com/facebook/rocksdb/pull/2060
Differential Revision: D4831427
Pulled By: yiwu-arbug
fbshipit-source-id: dd02037
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090
Differential Revision: D4833681
Pulled By: siying
fbshipit-source-id: 2fd8bef
Summary:
This adds almost all missing options to RocksJava
Closes https://github.com/facebook/rocksdb/pull/2039
Differential Revision: D4779991
Pulled By: siying
fbshipit-source-id: 4a1bf28
Summary:
I have manually audited the entire RocksJava code base.
Sorry for the large pull-request, I have broken it down into many small atomic commits though.
My initial intention was to fix the warnings that appear when running RocksJava on Java 8 with `-Xcheck:jni`, for example when running `make jtest` you would see many errors similar to:
```
WARNING in native method: JNI call made without checking exceptions when required to from CallObjectMethod
WARNING in native method: JNI call made without checking exceptions when required to from CallVoidMethod
WARNING in native method: JNI call made without checking exceptions when required to from CallStaticVoidMethod
...
```
A few of those warnings still remain, however they seem to come directly from the JVM and are not directly related to RocksJava; I am in contact with the OpenJDK hostpot-dev mailing list about these - http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-February/025981.html.
As a result of fixing these, I realised we were not r
Closes https://github.com/facebook/rocksdb/pull/1890
Differential Revision: D4591758
Pulled By: siying
fbshipit-source-id: 7f7fdf4
Summary:
…action
The two options, min_partial_merge_operands and verify_checksums_in_compaction, are not seldom used. Remove them to reduce the total number of options. Also remove them from Java and C interface.
Closes https://github.com/facebook/rocksdb/pull/1902
Differential Revision: D4601219
Pulled By: siying
fbshipit-source-id: aad4cb2
Summary:
Remove disableDataSync, and another similarly named disable_data_sync options.
This is being done to simplify options, and also because the performance gains of this feature can be achieved by other methods.
Closes https://github.com/facebook/rocksdb/pull/1859
Differential Revision: D4541292
Pulled By: sagar0
fbshipit-source-id: 5b3a6ca
Summary:
Fixes compile error:
In file included from ./util/statistics.h:17:0,
from ./util/stop_watch.h:8,
from ./util/perf_step_timer.h:9,
from ./util/iostats_context_imp.h:8,
from ./util/posix_logger.h:27,
from ./port/util_logger.h:18,
from ./db/auto_roll_logger.h:15,
from db/auto_roll_logger.cc:6:
./util/thread_local.h:65:16: error: 'function' in namespace 'std' does not name a template type
typedef std::function<void(void*, void*)> FoldFunc;
Closes https://github.com/facebook/rocksdb/pull/1656
Differential Revision: D4318702
Pulled By: yiwu-arbug
fbshipit-source-id: 8c5d17a
Summary:
Needed for working with `get` after `merge` on a WBWI.
Closes https://github.com/facebook/rocksdb/pull/1093
Differential Revision: D4137978
Pulled By: yhchiang
fbshipit-source-id: e18d50d
Summary:
I am not sure if this is the best way to fix this?
Closes https://github.com/facebook/rocksdb/pull/1452
Differential Revision: D4109338
Pulled By: yiwu-arbug
fbshipit-source-id: ca40809
Summary:
Changes in the diff
API changes:
- Introduce IngestExternalFile to replace AddFile (I think this make the API more clear)
- Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file)
- Deprecate AddFile() API
Logic changes:
- If our file overlap with the memtable we will flush the memtable
- We will find the first level in the LSM tree that our file key range overlap with the keys in it
- We will find the lowest level in the LSM tree above the the level we found in step 2 that our file can fit in and ingest our file in it
- We will assign a global sequence number to our new file
- Remove AddFile restrictions by using global sequence numbers
Other changes:
- Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
Test Plan:
unit tests (still need to add more)
addfile_stress (https://reviews.facebook.net/D65037)
Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
Reviewed By: sdong
Subscribers: jkedgar, hcz, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D65061
Summary:
- Deprecated RateLimiterConfig and GenericRateLimiterConfig
- Introduced RateLimiter
It is now possible to use all C++ related methods also in RocksJava.
A noteable method is setBytesPerSecond which can change the allowed
number of bytes per second at runtime.
Test Plan:
make rocksdbjava
make jtest
Reviewers: adamretter, yhchiang, ankgup87
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D35715
Summary: jvalue shadows a global name in <jni.h>. Rename it to jval to fix java build.
Test Plan:
JAVA_HOME=/usr/local/jdk-7u10-64 make rocksdbjava -j64
Reviewers: adamretter, yhchiang, IslamAbdelRahman
Reviewed By: yhchiang, IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D63981
Summary:
* Change constructor of MutableCFOptions to depends only on ColumnFamilyOptions.
* Move `max_subcompactions`, `compaction_options_fifo` and `compaction_pri` to ImmutableCFOptions to make it clear that they are immutable.
Test Plan: existing unit tests.
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D63945
Summary: There's no reference to ImmutableCFOptions elsewhere in /include/rocksdb. ImmutableCFOptions was introduced in this commit (5665e5e285) but later its reference in /include/rocksdb/table.h is removed.
Test Plan:
make all check
Reviewers: IslamAbdelRahman, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: yhchiang, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D63177
Summary: To reduce number of options, merge source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes.
Test Plan: Add two new unit tests. Run all existing tests, including jtest.
Reviewers: yhchiang, igor, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59829
* Rename RocksDB#remove -> RocksDB#delete to match C++ API; Added deprecated versions of RocksDB#remove for backwards compatibility.
* Add missing experimental feature RocksDB#singleDelete