rocksdb

Commit Graph

Author	SHA1	Message	Date
Alan Paxton	c1ec0b28eb	java / jni io_uring support (#9224 ) Summary: Existing multiGet() in java calls multi_get_helper() which then calls DB::std::vector MultiGet(). This doesn't take advantage of io_uring. This change adds another JNI level method that runs a parallel code path using the DB::void MultiGet(), using ByteBuffers at the JNI level. We call it multiGetDirect(). In addition to using the io_uring path, this code internally returns pinned slices which we can copy out of into our direct byte buffers; this should reduce the overall number of copies in the code path to/from Java. Some jmh benchmark runs (100k keys, 1000 key multiGet) suggest that for value sizes > 1k, we see about a 20% performance improvement, although performance is slightly reduced for small value sizes, there's a little bit more overhead in the JNI methods. Closes https://github.com/facebook/rocksdb/issues/8407 Pull Request resolved: https://github.com/facebook/rocksdb/pull/9224 Reviewed By: mrambacher Differential Revision: D32951754 Pulled By: jay-zhuang fbshipit-source-id: 1f70df7334be2b6c42a9c8f92725f67c71631690	3 years ago
Adam Retter	6477075f2c	JMH microbenchmarks for RocksJava (#6241 ) Summary: This is the start of some JMH microbenchmarks for RocksJava. Such benchmarks can help us decide on performance improvements of the Java API. At the moment, I have only added benchmarks for various Comparator options, as that is one of the first areas where I want to improve performance. I plan to expand this to many more tests. Details of how to compile and run the benchmarks are in the `README.md`. A run of these on a XEON 3.5 GHz 4vCPU (QEMU Virtual CPU version 2.5+) / 8GB RAM KVM with Ubuntu 18.04, OpenJDK 1.8.0_232, and gcc 8.3.0 produced the following: ``` # Run complete. Total time: 01:43:17 REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell. Benchmark (comparatorName) Mode Cnt Score Error Units ComparatorBenchmarks.put native_bytewise thrpt 25 122373.920 ± 2200.538 ops/s ComparatorBenchmarks.put java_bytewise_adaptive_mutex thrpt 25 17388.201 ± 1444.006 ops/s ComparatorBenchmarks.put java_bytewise_non-adaptive_mutex thrpt 25 16887.150 ± 1632.204 ops/s ComparatorBenchmarks.put java_direct_bytewise_adaptive_mutex thrpt 25 15644.572 ± 1791.189 ops/s ComparatorBenchmarks.put java_direct_bytewise_non-adaptive_mutex thrpt 25 14869.601 ± 2252.135 ops/s ComparatorBenchmarks.put native_reverse_bytewise thrpt 25 116528.735 ± 4168.797 ops/s ComparatorBenchmarks.put java_reverse_bytewise_adaptive_mutex thrpt 25 10651.975 ± 545.998 ops/s ComparatorBenchmarks.put java_reverse_bytewise_non-adaptive_mutex thrpt 25 10514.224 ± 930.069 ops/s ``` Indicating a ~7x difference between comparators implemented natively (C++) and those implemented in Java. Let's see if we can't improve on that in the near future... Pull Request resolved: https://github.com/facebook/rocksdb/pull/6241 Differential Revision: D19290410 Pulled By: pdillinger fbshipit-source-id: 25d44bf3a31de265502ed0c5d8a28cf4c7cb9c0b	5 years ago

Author

SHA1

Message

Date

Alan Paxton

c1ec0b28eb

java / jni io_uring support (#9224 )

Summary:
Existing multiGet() in java calls multi_get_helper() which then calls DB::std::vector MultiGet(). This doesn't take advantage of io_uring.

This change adds another JNI level method that runs a parallel code path using the DB::void MultiGet(), using ByteBuffers at the JNI level. We call it multiGetDirect(). In addition to using the io_uring path, this code internally returns pinned slices which we can copy out of into our direct byte buffers; this should reduce the overall number of copies in the code path to/from Java. Some jmh benchmark runs (100k keys, 1000 key multiGet) suggest that for value sizes > 1k, we see about a 20% performance improvement, although performance is slightly reduced for small value sizes, there's a little bit more overhead in the JNI methods.

Closes https://github.com/facebook/rocksdb/issues/8407

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9224

Reviewed By: mrambacher

Differential Revision: D32951754

Pulled By: jay-zhuang

fbshipit-source-id: 1f70df7334be2b6c42a9c8f92725f67c71631690

3 years ago

Adam Retter

6477075f2c

JMH microbenchmarks for RocksJava (#6241 )

Summary:
This is the start of some JMH microbenchmarks for RocksJava.

Such benchmarks can help us decide on performance improvements of the Java API.

At the moment, I have only added benchmarks for various Comparator options, as that is one of the first areas where I want to improve performance. I plan to expand this to many more tests.

Details of how to compile and run the benchmarks are in the `README.md`.

A run of these on a XEON 3.5 GHz 4vCPU (QEMU Virtual CPU version 2.5+) / 8GB RAM KVM with Ubuntu 18.04, OpenJDK 1.8.0_232, and gcc 8.3.0 produced the following:

```
# Run complete. Total time: 01:43:17

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                         (comparatorName)   Mode  Cnt       Score       Error  Units
ComparatorBenchmarks.put                           native_bytewise thrpt   25   122373.920 ±  2200.538  ops/s
ComparatorBenchmarks.put              java_bytewise_adaptive_mutex thrpt   25    17388.201 ±  1444.006  ops/s
ComparatorBenchmarks.put          java_bytewise_non-adaptive_mutex thrpt   25    16887.150 ±  1632.204  ops/s
ComparatorBenchmarks.put       java_direct_bytewise_adaptive_mutex thrpt   25    15644.572 ±  1791.189  ops/s
ComparatorBenchmarks.put   java_direct_bytewise_non-adaptive_mutex thrpt   25    14869.601 ±  2252.135  ops/s
ComparatorBenchmarks.put                   native_reverse_bytewise thrpt   25   116528.735 ±  4168.797  ops/s
ComparatorBenchmarks.put      java_reverse_bytewise_adaptive_mutex thrpt   25    10651.975 ±   545.998  ops/s
ComparatorBenchmarks.put  java_reverse_bytewise_non-adaptive_mutex thrpt   25    10514.224 ±   930.069  ops/s
```

Indicating a ~7x difference between comparators implemented natively (C++) and those implemented in Java. Let's see if we can't improve on that in the near future...
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6241

Differential Revision: D19290410

Pulled By: pdillinger

fbshipit-source-id: 25d44bf3a31de265502ed0c5d8a28cf4c7cb9c0b

5 years ago

2 Commits (18cb731f27746cc65742d5965da24fb8231e413a)