Summary:
Add a new function in Listener to let the caller know when rocksdb
is stalling writes.
Closes https://github.com/facebook/rocksdb/pull/2897
Differential Revision: D5860124
Pulled By: schischi
fbshipit-source-id: ee791606169aa64f772c86f817cebf02624e05e1
Summary:
Add kTypeBlobIndex value type, which will be used by blob db only, to insert a (key, blob_offset) KV pair. The purpose is to
1. Make it possible to open existing rocksdb instance as blob db. Existing value will be of kTypeIndex type, while value inserted by blob db will be of kTypeBlobIndex.
2. Make rocksdb able to detect if the db contains value written by blob db, if so return error.
3. Make it possible to have blob db optionally store value in SST file (with kTypeValue type) or as a blob value (with kTypeBlobIndex type).
The root db (DBImpl) basically pretended kTypeBlobIndex are normal value on write. On Get if is_blob is provided, return whether the value read is of kTypeBlobIndex type, or return Status::NotSupported() status if is_blob is not provided. On scan allow_blob flag is pass and if the flag is true, return wether the value is of kTypeBlobIndex type via iter->IsBlob().
Changes on blob db side will be in a separate patch.
Closes https://github.com/facebook/rocksdb/pull/2886
Differential Revision: D5838431
Pulled By: yiwu-arbug
fbshipit-source-id: 3c5306c62bc13bb11abc03422ec5cbcea1203cca
Summary:
This reverts the previous commit 1d7048c598, which broke the build.
Did a `git revert 1d7048c`.
Closes https://github.com/facebook/rocksdb/pull/2627
Differential Revision: D5476473
Pulled By: sagar0
fbshipit-source-id: 4756ff5c0dfc88c17eceb00e02c36176de728d06
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.
Reviewed By: igorsugak
Differential Revision: D5454343
fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
Summary:
Introducing FIFO compactions with TTL.
FIFO compaction is based on size only which makes it tricky to enable in production as use cases can have organic growth. A user requested an option to drop files based on the time of their creation instead of the total size.
To address that request:
- Added a new TTL option to FIFO compaction options.
- Updated FIFO compaction score to take TTL into consideration.
- Added a new table property, creation_time, to keep track of when the SST file is created.
- Creation_time is set as below:
- On Flush: Set to the time of flush.
- On Compaction: Set to the max creation_time of all the files involved in the compaction.
- On Repair and Recovery: Set to the time of repair/recovery.
- Old files created prior to this code change will have a creation_time of 0.
- FIFO compaction with TTL is enabled when ttl > 0. All files older than ttl will be deleted during compaction. i.e. `if (file.creation_time < (current_time - ttl)) then delete(file)`. This will enable cases where you might want to delete all files older than, say, 1 day.
- FIFO compaction will fall back to the prior way of deleting files based on size if:
- the creation_time of all files involved in compaction is 0.
- the total size (of all SST files combined) does not drop below `compaction_options_fifo.max_table_files_size` even if the files older than ttl are deleted.
This feature is not supported if max_open_files != -1 or with table formats other than Block-based.
**Test Plan:**
Added tests.
**Benchmark results:**
Base: FIFO with max size: 100MB ::
```
svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100
readwhilewriting : 1.924 micros/op 519858 ops/sec; 13.6 MB/s (1176277 of 5000000 found)
```
With TTL (a low one for testing) ::
```
svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 --fifo_compaction_ttl=20
readwhilewriting : 1.902 micros/op 525817 ops/sec; 13.7 MB/s (1185057 of 5000000 found)
```
Example Log lines:
```
2017/06/26-15:17:24.609249 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609177) [db/compaction_picker.cc:1471] [default] FIFO compaction: picking file 40 with creation time 1498515423 for deletion
2017/06/26-15:17:24.609255 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609234) [db/db_impl_compaction_flush.cc:1541] [default] Deleted 1 files
...
2017/06/26-15:17:25.553185 7fd5a61a5800 [DEBUG] [db/db_impl_files.cc:309] [JOB 0] Delete /dev/shm/dbbench/000040.sst type=2 #40 -- OK
2017/06/26-15:17:25.553205 7fd5a61a5800 EVENT_LOG_v1 {"time_micros": 1498515445553199, "job": 0, "event": "table_file_deletion", "file_number": 40}
```
SST Files remaining in the dbbench dir, after db_bench execution completed:
```
svemuri@dev15905 ~/rocksdb (fifo-compaction) $ ls -l /dev/shm//dbbench/*.sst
-rw-r--r--. 1 svemuri users 30749887 Jun 26 15:17 /dev/shm//dbbench/000042.sst
-rw-r--r--. 1 svemuri users 30768779 Jun 26 15:17 /dev/shm//dbbench/000044.sst
-rw-r--r--. 1 svemuri users 30757481 Jun 26 15:17 /dev/shm//dbbench/000046.sst
```
Closes https://github.com/facebook/rocksdb/pull/2480
Differential Revision: D5305116
Pulled By: sagar0
fbshipit-source-id: 3e5cfcf5dd07ed2211b5b37492eb235b45139174
Summary:
Some users want to prevent rocksdb from entering read-only mode in certain error cases. This diff gives them a callback, `OnBackgroundError`, that they can use to achieve it.
- call `OnBackgroundError` every time we consider setting `bg_error_`. Use its result to assign `bg_error_` but not to change the function's return status.
- classified calls using `BackgroundErrorReason` to give the callback some info about where the error happened
- renamed `ParanoidCheck` to something more specific so we can provide a clear `BackgroundErrorReason`
- unit tests for the most common cases: flush or compaction errors
Closes https://github.com/facebook/rocksdb/pull/2477
Differential Revision: D5300190
Pulled By: ajkr
fbshipit-source-id: a0ea4564249719b83428e3f4c6ca2c49e366e9b3
Summary:
Allow an option for users to do some compaction in FIFO compaction, to pay some write amplification for fewer number of files.
Closes https://github.com/facebook/rocksdb/pull/2163
Differential Revision: D4895953
Pulled By: siying
fbshipit-source-id: a1ab608dd0627211f3e1f588a2e97159646e1231
Summary: add ColumnFamilyHandleDeletionStarted listener which can be called when user deletes handler.
Test Plan: ./listener_test
Reviewers: yiwu, IslamAbdelRahman, sdong, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60717
Summary: char is not signed in some platforms. Having negative values confuse those compilers.
Test Plan: Run all existing tests.
Reviewers: andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59619
* Create a callback for memtable becoming immutable
Create a callback for memtable becoming immutable
Create a callback for memtable becoming immutable
moved notification outside the lock
Move sealed notification to unlocked portion of SwitchMemtable
* fix lite build
Summary:
Add a new option that can be used to set a specific compression algorithm for bottommost level.
This option will only affect levels larger than base level.
I have also updated CompactionJobInfo to include the compression algorithm used in compaction
Test Plan:
added new unittest
existing unittests
Reviewers: andrewkr, yhchiang, sdong
Reviewed By: sdong
Subscribers: lightmark, andrewkr, dhruba, yoshinorim
Differential Revision: https://reviews.facebook.net/D57669
Summary: Added EventListener::OnTableFileCreationStarted. EventListener::OnTableFileCreated will be called on failure case. User can check creation status via TableFileCreationInfo::status.
Test Plan: unit test.
Reviewers: dhruba, yhchiang, ott, sdong
Reviewed By: sdong
Subscribers: sdong, kradhakrishnan, IslamAbdelRahman, andrewkr, yhchiang, leveldb, ott, dhruba
Differential Revision: https://reviews.facebook.net/D56337
Summary:
Add CompactionReason to CompactionJobInfo
This will allow users to understand why compaction started which will help options tuning
Test Plan:
added new tests
make check -j64
Reviewers: yhchiang, anthony, kradhakrishnan, sdong, rven
Reviewed By: rven
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D51975
Summary: Run "make format" for some recent commits.
Test Plan: Build and run tests
Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49707
Summary: It would be nice to have and access to table properties in compaction callbacks. In MyRocks project, it will make possible to update optimizer statistics online.
Test Plan: ran the unit test. Ran myrocks with the new way of collecting stats.
Reviewers: igor, rven, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D48267
Summary:
Adding largest sequence number to FlushJobInfo
and passing flushed file metadata to NotifyOnFlushCompleted which include alot of other values that we may want to expose in FlushJobInfo
Test Plan: make check
Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D39927
Summary:
EventListener::OnFlushCompleted() now passes a structure instead
of a list of parameters. This minimizes the API change in the
future.
Test Plan:
listener_test
compact_files_test
example/compact_files_example
Reviewers: kradhakrishnan, sdong, IslamAbdelRahman, rven, igor
Reviewed By: rven, igor
Subscribers: IslamAbdelRahman, rven, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D39543
Summary:
Add EventListener::OnTableFileDeletion(), which will be
called when a table file is deleted.
Test Plan: Extend three existing tests in db_test to verify the deleted files.
Reviewers: rven, anthony, kradhakrishnan, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D38931
Summary:
Allow EventListener::OnCompactionCompleted to return CompactionJobStats,
which contains useful information about a compaction.
Example CompactionJobStats returned by OnCompactionCompleted():
smallest_output_key_prefix 05000000
largest_output_key_prefix 06990000
elapsed_time 42419
num_input_records 300
num_input_files 3
num_input_files_at_output_level 2
num_output_records 200
num_output_files 1
actual_bytes_input 167200
actual_bytes_output 110688
total_input_raw_key_bytes 5400
total_input_raw_value_bytes 300000
num_records_replaced 100
is_manual_compaction 1
Test Plan: Developed a mega test in db_test which covers 20 variables in CompactionJobStats.
Reviewers: rven, igor, anthony, sdong
Reviewed By: sdong
Subscribers: tnovak, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D38463
Summary: Fixed ROCKSDB_LITE compile error due to the missing of TableFileCreationInfo
Test Plan: make OPT=-DROCKSDB_LITE shared_lib -j32
Reviewers: sdong, igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D39405
Summary:
Add EventListener::OnTableFileCreated(), which will be called
when a table file is created. This patch is part of the
EventLogger and EventListener integration.
Test Plan: Augment existing test in db/listener_test.cc
Reviewers: anthony, kradhakrishnan, rven, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D38865
Summary:
Fixed a bug in EventListener::OnCompactionCompleted() that returns
incorrect list of input / output file names.
Test Plan: Extend existing test in listener_test.cc
Reviewers: sdong, rven, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D38349
Summary: This adds a listener for compactions, and gives some useful statistics on each compaction pass.
Test Plan: Unit tests.
Reviewers: sdong, igor, rven, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D31641
Summary:
This diff adds three sets of APIs to RocksDB.
= GetColumnFamilyMetaData =
* This APIs allow users to obtain the current state of a RocksDB instance on one column family.
* See GetColumnFamilyMetaData in include/rocksdb/db.h
= EventListener =
* A virtual class that allows users to implement a set of
call-back functions which will be called when specific
events of a RocksDB instance happens.
* To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners
= CompactFiles =
* CompactFiles API inputs a set of file numbers and an output level, and RocksDB
will try to compact those files into the specified level.
= Example =
* Example code can be found in example/compact_files_example.cc, which implements
a simple external compactor using EventListener, GetColumnFamilyMetaData, and
CompactFiles API.
Test Plan:
listener_test
compactor_test
example/compact_files_example
export ROCKSDB_TESTS=CompactFiles
db_test
export ROCKSDB_TESTS=MetaData
db_test
Reviewers: ljin, igor, rven, sdong
Reviewed By: sdong
Subscribers: MarkCallaghan, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D24705