Summary:
This adds support for recovering WriteUnprepared transactions through the following changes:
- The information in `RecoveredTransaction` is extended so that it can reference multiple batches.
- `MarkBeginPrepare` is extended with a bool indicating whether it is an unprepared begin, and this is passed down to `InsertRecoveredTransaction` to indicate whether the current transaction is prepared or not.
- `WriteUnpreparedTxnDB::Initialize` is overridden so that it will rollback unprepared transactions from the recovered transactions. This can be done without updating the prepare heap/commit map, because this is before the DB has finished initializing, and after writing the rollback batch, those data structures should not contain information about the rolled back transaction anyway.
Commit/Rollback of live transactions is still unimplemented and will come later.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4078
Differential Revision: D8703382
Pulled By: lth
fbshipit-source-id: 7e0aada6c23bd39299f1f20d6c060492e0e6b60a
Summary:
It seems that compilation has been made stricter about unused args.
Closes https://github.com/facebook/rocksdb/pull/4080
Differential Revision: D8712049
Pulled By: sagar0
fbshipit-source-id: 984af1982638af3568aac1a167f565f4741badee
Summary:
…ression
For `CompressionType` we have options `compression` and `bottommost_compression`. Thus, to make the compression options consitent with the compression type when bottommost_compression is enabled, we add the bottommost_compression_opts
Closes https://github.com/facebook/rocksdb/pull/3985
Reviewed By: riversand963
Differential Revision: D8385911
Pulled By: zhichao-cao
fbshipit-source-id: 07bc533dd61bcf1cef5927d8d62901c13d38d5fc
Summary:
`SetByteArrayRegion` does not have const source buffer thus compilation error. I have made that same as in other JNI files (const_cast). It was missing for new transaction functionality added recently.
Closes https://github.com/facebook/rocksdb/pull/4013
Differential Revision: D8493290
Pulled By: sagar0
fbshipit-source-id: 14afedf365b111121bd11e68a8d546a1cae68b26
Summary:
Adding support for zLinux on s390x architecture in JNI.
Closes https://github.com/facebook/rocksdb/pull/4009
Differential Revision: D8483750
Pulled By: siying
fbshipit-source-id: e681657c27e7a28f1731e08e8570382de5deff44
Summary:
1. Add a new ticker stat rocksdb.number.multiget.keys.found to track the
number of keys successfully read
2. Update rocksdb.memtable.hit/miss in DBImpl::MultiGet(). It was being done in
DBImpl::GetImpl(), but not MultiGet
Closes https://github.com/facebook/rocksdb/pull/3730
Differential Revision: D7677364
Pulled By: anand1976
fbshipit-source-id: af22bd0ef8ddc5cf2b4244b0a024e539fe48bca5
Summary:
This PR comments out the rest of the unused arguments which allow us to turn on the -Wunused-parameter flag. This is the second part of a codemod relating to https://github.com/facebook/rocksdb/pull/3557.
Closes https://github.com/facebook/rocksdb/pull/3662
Differential Revision: D7426121
Pulled By: Dayvedde
fbshipit-source-id: 223994923b42bd4953eb016a0129e47560f7e352
Summary:
Changes to support sharing block cache using the Java API.
Previously DB instances could share the block cache only when the same Options instance is passed to all the DB instances. But now, with this change, it is possible to explicitly create a cache and pass it to multiple options instances, to share the block cache.
Implementing this for [Rocksandra](https://github.com/instagram/cassandra/tree/rocks_3.0), but this feature has been requested by many java api users over the years.
Closes https://github.com/facebook/rocksdb/pull/3623
Differential Revision: D7305794
Pulled By: sagar0
fbshipit-source-id: 03e4e8ed7aeee6f88bada4a8365d4279ede2ad71
Summary:
I modified the Makefile so that we can compile rocksdb on OpenBSD.
The instructions for building have been added to INSTALL.md.
The whole compilation process works fine like this on OpenBSD-current
Closes https://github.com/facebook/rocksdb/pull/3617
Differential Revision: D7323754
Pulled By: siying
fbshipit-source-id: 990037d1cc69138d22f85bd77ef4dc8c1ba9edea
Summary:
This changes the console output when the RocksJava tests are run. It makes spotting the errors and failures much easier; perviously the output was malformed with results like "ERun" where the "E" represented an error in the preceding test.
Closes https://github.com/facebook/rocksdb/pull/3621
Differential Revision: D7306172
Pulled By: sagar0
fbshipit-source-id: 3fa6f6e1ca6c6ea7ceef55a23ca81903716132b7
Summary:
This is an abstraction for working with custom Comparators implemented in native C++ code from Java. Native code must directly extend `rocksdb::Comparator`. When the native code comparator is compiled into the RocksDB codebase, you can then create a Java Class, and JNI stub to wrap it.
Useful if the C++/JNI barrier overhead is too much for your applications comparator performance.
An example is provided in `java/rocksjni/native_comparator_wrapper_test.cc` and `java/src/main/java/org/rocksdb/NativeComparatorWrapperTest.java`.
Closes https://github.com/facebook/rocksdb/pull/3334
Differential Revision: D7172605
Pulled By: miasantreble
fbshipit-source-id: e24b7eb267a3bcb6afa214e0379a1d5e8a2ceabe
Summary:
Add Java-side copy constructors for:
- Options
- DBOptions
- ColumnFamilyOptions
- WriteOptions
along with unit tests to assert the copy worked.
NOTE: Unit tests are failing in travis but it looks like a global timeout issue. These tests pass.
Closes https://github.com/facebook/rocksdb/pull/3450
Differential Revision: D6874425
Pulled By: sagar0
fbshipit-source-id: 5bde68ea5b5225e071faea2628bf8bbf10bd65ab
Summary: Grandfather in super old lint issues to make a clean slate for moving forward that allows us to have stronger enforcement on new issues.
Reviewed By: yiwu-arbug
Differential Revision: D6821806
fbshipit-source-id: 22797d31ec58e9eb0255d3b66fedfcfcb0dc127c
Summary:
Add a new histogram stat called rocksdb.db.flush.micros for memtable
flush
Closes https://github.com/facebook/rocksdb/pull/3269
Differential Revision: D6559496
Pulled By: anand1976
fbshipit-source-id: f5c771ba2568630458751795e8c37a493ff9b14d
Summary:
This diff adds a new ticker stat, NUMBER_ITER_SKIP, to count the
number of internal keys skipped during iteration. Keys can be skipped
due to deletes, or lower sequence number, or higher sequence number
than the one requested.
Also, fix the issue when StatisticsData is naturally aligned on cacheline boundary,
padding becomes a zero size array, which the Windows compiler doesn't
like. So add a cacheline worth of padding in that case to keep it happy.
We cannot conditionally add padding as gcc doesn't allow using sizeof
in preprocessor directives.
Closes https://github.com/facebook/rocksdb/pull/3177
Differential Revision: D6353897
Pulled By: anand1976
fbshipit-source-id: 441d5a09af9c4e22e7355242dfc0c7b27aa0a6c2
Summary:
This options was introduced in the C++ API in #1953 .
Closes https://github.com/facebook/rocksdb/pull/3064
Differential Revision: D6139010
Pulled By: sagar0
fbshipit-source-id: 164de11d539d174cf3afe7cd40e667049f44b0bc
Summary:
Java's `Status.SubCode` was out of sync with `include/rocksdb/status.h:SubCode`.
When running out of disc space this led to an `IllegalArgumentException` because of an invalid status code, rather than just returning the corresponding status code without an exception.
I added the missing status codes.
By this, we keep the behaviour of throwing an `IllegalArgumentException` in case of newly added status codes that are defined in C but not in Java.
We could think of an alternative strategy: add in Java another code "UnknownCode" which acts as a catch-all for all those status codes that are not yet mirrored from C to Java. This approach would never throw an exception but simply return a non-OK status-code.
I think the current approach of throwing an Exception in case of a C/Java inconsistency is fine, but if you have some opinion on the alternative strategy, then feel free to comment here.
Closes https://github.com/facebook/rocksdb/pull/3050
Differential Revision: D6129682
Pulled By: sagar0
fbshipit-source-id: f2bf44caad650837cffdcb1f93eb793b43580c66
Summary:
Adding OptionsUtil java class and options_util.cc to java/CMakeLists.txt, which were missed accidentally when they were introduced in #2898.
Closes https://github.com/facebook/rocksdb/pull/2985
Differential Revision: D6015878
Pulled By: sagar0
fbshipit-source-id: 1abbd46db4aebad1e07ea53523eacbdcb12823e1
Summary:
This PR also includes some cleanup, bugfixes and refactoring of the Java API. However these are really pre-cursors on the road to CompactionFilterFactory support.
Closes https://github.com/facebook/rocksdb/pull/1241
Differential Revision: D6012778
Pulled By: sagar0
fbshipit-source-id: 0774465940ee99001a78906e4fed4ef57068ad5c
Summary:
Now that RocksDB supports conditional merging during point lookups (introduced in #2923), Cassandra value merge operator can be updated to pass in a limit. The limit needs to be passed in from the Cassandra code.
Closes https://github.com/facebook/rocksdb/pull/2947
Differential Revision: D5938454
Pulled By: sagar0
fbshipit-source-id: d64a72d53170d8cf202b53bd648475c3952f7d7f
Summary:
This enables us to crossbuild pcc64le RocksJava binaries with a suitably old version of glibc (2.17) on CentOS 7.
Closes https://github.com/facebook/rocksdb/pull/2491
Differential Revision: D5955301
Pulled By: sagar0
fbshipit-source-id: 69ef9746f1dc30ffde4063dc764583d8c7ae937e
Summary:
Problem:
During RocksJava performance testing we found that the rocksdb jni library is not built with jemalloc; instead it was getting built with the default glibc malloc. We saw quite a bit of memory bloat due to this.
Addressed this by installing jemalloc-devel package in the vm that we use to build release jars.
Closes https://github.com/facebook/rocksdb/pull/2916
Differential Revision: D5887018
Pulled By: sagar0
fbshipit-source-id: ace0b5d60234b3a30dcd5d39633e7827a5982a50
Summary:
This option was introduced in the C++ API in RocksDB 5.6 in bb01c1880c . Now, exposing it through RocksJava API.
Closes https://github.com/facebook/rocksdb/pull/2908
Differential Revision: D5864224
Pulled By: sagar0
fbshipit-source-id: 140aa55dcf74b14e4d11219d996735c7fdddf513
Summary:
In our testing cluster, we found large amount tombstone has been promoted to kValue type from kMerge after reaching the top level of compaction. Since we used to only collecting tombstone in merge operator, those tombstones can never be collected.
This PR addresses the issue by adding a GC step in compaction filter, which is only for kValue type records. Since those record already reached the top of compaction (no earlier data exists) we can safely remove them in compaction filter without worrying old data appears.
This PR also removes an old optimization in cassandra merge operator for single merge operands. We need to do GC even on a single operand, so the optimation does not make sense anymore.
Closes https://github.com/facebook/rocksdb/pull/2855
Reviewed By: sagar0
Differential Revision: D5806445
Pulled By: wpc
fbshipit-source-id: 6eb25629d4ce917eb5e8b489f64a6aa78c7d270b
Summary:
Plumbed ReadOptions::iterate_upper_bound through JNI.
Made the following design choices:
* Used Slice instead of AbstractSlice due to the anticipated usecase (key / key prefix). Can change this if anyone disagrees.
* Used Slice instead of raw byte[] which seemed cleaner but necessitated the package-private handle-based Slice constructor. Followed WriteBatch as an example.
* We need a copy constructor for ReadOptions, as we create one base ReadOptions for a particular usecase and clone -> change the iterate_upper_bound on each slice operation. Shallow copy seemed cleanest.
* Hold a reference to the upper bound slice on ReadOptions, in contrast to Snapshot.
Signed a Facebook CLA this morning.
Closes https://github.com/facebook/rocksdb/pull/2872
Differential Revision: D5824446
Pulled By: sagar0
fbshipit-source-id: 74fc51313a10a81ecd348625e2a50ca5b7766888
Summary:
As discussed in #2742 , this pull-requests brings the iterator's [SeekForPrev()](https://github.com/facebook/rocksdb/wiki/SeekForPrev) functionality to the java-api. It affects all locations in the code where previously only Seek() was supported.
All code changes are essentially a copy & paste of the already existing implementations for Seek().
**Please Note**: the changes to the C++ code were applied without fully understanding its effect, so please take a closer look. However, since Seek() and SeekForPrev() provide exactly the same signature, I do not expect any mistake here.
The java-tests are extended by new tests for the additional functionality.
Compilation (`make rocksdbjavastatic`) and test (`java/make test`) run without errors.
Closes https://github.com/facebook/rocksdb/pull/2747
Differential Revision: D5721011
Pulled By: sagar0
fbshipit-source-id: c1f951cddc321592c70dd2d32bc04892f3f119f8
Summary:
Remove cassandra tombstone when reaching the max compaction level (full merge). if all columns collected key will be removed in next compaction via compaction filter
Closes https://github.com/facebook/rocksdb/pull/2791
Reviewed By: sagar0
Differential Revision: D5722465
Pulled By: wpc
fbshipit-source-id: 61e9898a5686551653a16383255aeaab3197e65e
Summary:
I observed while doing a `make jtest` that the java sample was broken, due to the changes in #2551 .
Closes https://github.com/facebook/rocksdb/pull/2674
Differential Revision: D5539807
Pulled By: sagar0
fbshipit-source-id: 2c7e9d84778099dfa1c611996b444efe3c9fd466
Summary:
I might have missed these while doing some recent cassandra code reviews.
Closes https://github.com/facebook/rocksdb/pull/2663
Differential Revision: D5520138
Pulled By: sagar0
fbshipit-source-id: 340930afe9efe03c75f535a1da1f89bd3e53c1f9
Summary:
I haven't looked to see if a class variable inside a loop like this is always initialised.
Closes https://github.com/facebook/rocksdb/pull/2602
Differential Revision: D5475937
Pulled By: IslamAbdelRahman
fbshipit-source-id: 8570b308f9a4b49e2a56ccc9e9b84d7c46568c15
Summary:
Major changes in this PR:
* Implement CassandraCompactionFilter to remove expired columns and rows (if all column expired)
* Move cassandra related code from utilities/merge_operators/cassandra to utilities/cassandra/*
* Switch to use shared_ptr<> from uniqu_ptr for Column membership management in RowValue. Since columns do have multiple owners in Merge and GC process, use shared_ptr helps make RowValue immutable.
* Rename cassandra_merge_test to cassandra_functional_test and add two TTL compaction related tests there.
Closes https://github.com/facebook/rocksdb/pull/2588
Differential Revision: D5430010
Pulled By: wpc
fbshipit-source-id: 9566c21e06de17491d486a68c70f52d501f27687