Summary:
**Context:**
The existing stat rocksdb.sst.read.micros does not reflect each of compaction and flush cases but aggregate them, which is not so helpful for us to understand IO read behavior of each of them.
**Summary**
- Update `StopWatch` and `RandomAccessFileReader` to record `rocksdb.sst.read.micros` and `rocksdb.file.{flush/compaction}.read.micros`
- Fixed the default histogram in `RandomAccessFileReader`
- New field `ReadOptions/IOOptions::io_activity`; Pass `ReadOptions` through paths under db open, flush and compaction to where we can prepare `IOOptions` and pass it to `RandomAccessFileReader`
- Use `thread_status_util` for assertion in `DbStressFSWrapper` for continuous testing on we are passing correct `io_activity` under db open, flush and compaction
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11288
Test Plan:
- **Stress test**
- **Db bench 1: rocksdb.sst.read.micros COUNT ≈ sum of rocksdb.file.read.flush.micros's and rocksdb.file.read.compaction.micros's.** (without blob)
- May not be exactly the same due to `HistogramStat::Add` only guarantees atomic not accuracy across threads.
```
./db_bench -db=/dev/shm/testdb/ -statistics=true -benchmarks="fillseq" -key_size=32 -value_size=512 -num=50000 -write_buffer_size=655 -target_file_size_base=655 -disable_auto_compactions=false -compression_type=none -bloom_bits=3 (-use_plain_table=1 -prefix_size=10)
```
```
// BlockBasedTable
rocksdb.sst.read.micros P50 : 2.009374 P95 : 4.968548 P99 : 8.110362 P100 : 43.000000 COUNT : 40456 SUM : 114805
rocksdb.file.read.flush.micros P50 : 1.871841 P95 : 3.872407 P99 : 5.540541 P100 : 43.000000 COUNT : 2250 SUM : 6116
rocksdb.file.read.compaction.micros P50 : 2.023109 P95 : 5.029149 P99 : 8.196910 P100 : 26.000000 COUNT : 38206 SUM : 108689
// PlainTable
Does not apply
```
- **Db bench 2: performance**
**Read**
SETUP: db with 900 files
```
./db_bench -db=/dev/shm/testdb/ -benchmarks="fillseq" -key_size=32 -value_size=512 -num=50000 -write_buffer_size=655 -disable_auto_compactions=true -target_file_size_base=655 -compression_type=none
```run till convergence
```
./db_bench -seed=1678564177044286 -use_existing_db=true -db=/dev/shm/testdb -benchmarks=readrandom[-X60] -statistics=true -num=1000000 -disable_auto_compactions=true -compression_type=none -bloom_bits=3
```
Pre-change
`readrandom [AVG 60 runs] : 21568 (± 248) ops/sec`
Post-change (no regression, -0.3%)
`readrandom [AVG 60 runs] : 21486 (± 236) ops/sec`
**Compaction/Flush**run till convergence
```
./db_bench -db=/dev/shm/testdb2/ -seed=1678564177044286 -benchmarks="fillseq[-X60]" -key_size=32 -value_size=512 -num=50000 -write_buffer_size=655 -disable_auto_compactions=false -target_file_size_base=655 -compression_type=none
rocksdb.sst.read.micros COUNT : 33820
rocksdb.sst.read.flush.micros COUNT : 1800
rocksdb.sst.read.compaction.micros COUNT : 32020
```
Pre-change
`fillseq [AVG 46 runs] : 1391 (± 214) ops/sec; 0.7 (± 0.1) MB/sec`
Post-change (no regression, ~-0.4%)
`fillseq [AVG 46 runs] : 1385 (± 216) ops/sec; 0.7 (± 0.1) MB/sec`
Reviewed By: ajkr
Differential Revision: D44007011
Pulled By: hx235
fbshipit-source-id: a54c89e4846dfc9a135389edf3f3eedfea257132
Summary:
The primary purpose of the FactoryFunc was to support LITE mode where the ObjectRegistry was not available. With the removal of LITE mode, the function was no longer required.
Note that the MergeOperator had some private classes defined in header files. To gain access to their constructors (and name methods), the class definitions were moved into header files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11203
Reviewed By: cbi42
Differential Revision: D43160255
Pulled By: pdillinger
fbshipit-source-id: f3a465fd5d1a7049b73ecf31e4b8c3762f6dae6c
Summary:
We haven't been actively mantaining RocksDB LITE recently and the size must have been gone up significantly. We are removing the support.
Most of changes were done through following comments:
unifdef -m -UROCKSDB_LITE `git grep -l ROCKSDB_LITE | egrep '[.](cc|h)'`
by Peter Dillinger. Others changes were manually applied to build scripts, CircleCI manifests, ROCKSDB_LITE is used in an expression and file db_stress_test_base.cc.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11147
Test Plan: See CI
Reviewed By: pdillinger
Differential Revision: D42796341
fbshipit-source-id: 4920e15fc2060c2cd2221330a6d0e5e65d4b7fe2
Summary:
In order to support old-style regex function registration, restored the original "Register<T>(string, Factory)" method using regular expressions. The PatternEntry methods were left in place but renamed to AddFactory. The goal is to allow for the deprecation of the original regex Registry method in an upcoming release.
Added modes to the PatternEntry kMatchZeroOrMore and kMatchAtLeastOne to match * or +, respectively (kMatchAtLeastOne was the original behavior).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9362
Reviewed By: pdillinger
Differential Revision: D33432562
Pulled By: mrambacher
fbshipit-source-id: ed88ab3f9a2ad0d525c7bd1692873f9bb3209d02
Summary:
Added a few classes in the same class hierarchy to remove code duplication and
refactor the logic of reading and processing MANIFEST files.
New classes are as follows.
```
class VersionEditHandlerBase;
class ListColumnFamiliesHandler : VersionEditHandlerBase;
class FileChecksumRetriever : VersionEditHandlerBase;
class DumpManifestHandler : VersionEditHandler;
```
Classes that already existed before this PR are as follows.
```
class VersionEditHandler : VersionEditHandlerBase;
```
With these classes, refactored functions: `VersionSet::Recover()`,
`VersionSet::ListColumnFamilies()`, `VersionSet::DumpManifest()`,
`GetFileChecksumFromManifest()`.
Test Plan (devserver):
```
make check
COMPILE_WITH_ASAN=1 make check
```
These refactored code, especially recovery-related logic, will be tested intensively by
all existing unit tests and stress tests. For example, run
```
make crash_test
```
Verified 3 successful runs on devserver.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6581
Reviewed By: ajkr
Differential Revision: D20616217
Pulled By: riversand963
fbshipit-source-id: 048c7743aa4be2623ccd0cc3e61c0027e604e78b
Summary:
This is a "real" fix for the issue worked around in https://github.com/facebook/rocksdb/issues/7294.
To get DB checksum info for live files, we now read the manifest file
that will become part of the checkpoint/backup. This requires a little
extra handling in taking a custom checkpoint, including only reading the
manifest file up to the size prescribed by the checkpoint.
This moves GetFileChecksumsFromManifest from backup code to
file_checksum_helper.{h,cc} and removes apparently unnecessary checking
related to column families.
Updated HISTORY.md and warned potential future users of
DB::GetLiveFilesChecksumInfo()
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7309
Test Plan: updated unit test, before and after
Reviewed By: ajkr
Differential Revision: D23311994
Pulled By: pdillinger
fbshipit-source-id: 741e30a2dc1830e8208f7648fcc8c5f000d4e2d5
Summary:
Add NewFileChecksumGenCrc32cFactory to file checksum public interface such that applications can use the build in crc32 checksum factory.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6688
Test Plan: pass make asan_check
Reviewed By: riversand963
Differential Revision: D21006859
Pulled By: zhichao-cao
fbshipit-source-id: ea8a45196a8b77c310728ab05f6cc0f49f3baef0
Summary:
In the current implementation, sst file checksum is calculated by a shared checksum function object, which may make some checksum function hard to be applied here such as SHA1. In this implementation, each sst file will have its own checksum generator obejct, created by FileChecksumGenFactory. User needs to implement its own FilechecksumGenerator and Factory to plugin the in checksum calculation method.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6600
Test Plan: tested with make asan_check
Reviewed By: riversand963
Differential Revision: D20717670
Pulled By: zhichao-cao
fbshipit-source-id: 2a74c1c280ac11a07a1980185b43b671acaa71c6
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
Differential Revision: D19977691
fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
Summary:
In the current code base, RocksDB generate the checksum for each block and verify the checksum at usage. Current PR enable SST file checksum. After a SST file is generated by Flush or Compaction, RocksDB generate the SST file checksum and store the checksum value and checksum method name in the vs_info and MANIFEST as part for the FileMetadata.
Added the enable_sst_file_checksum to Options to enable or disable file checksum. Added sst_file_checksum to Options such that user can plugin their own SST file checksum calculate method via overriding the SstFileChecksum class. The checksum information inlcuding uint32_t checksum value and a checksum name (string). A new tool is added to LDB such that user can dump out a list of file checksum information from MANIFEST. If user enables the file checksum but does not provide the sst_file_checksum instance, RocksDB will use the default crc32checksum implemented in table/sst_file_checksum_crc32c.h
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6216
Test Plan: Added the testing case in table_test and ldb_cmd_test to verify checksum is correct in different level. Pass make asan_check.
Differential Revision: D19171461
Pulled By: zhichao-cao
fbshipit-source-id: b2e53479eefc5bb0437189eaa1941670e5ba8b87