Summary: Currently, even if statistics is not enabled, StopWatch only for the stats still gets the time of the day, which is wasteful. This patch adds a new option to StopWatch to disable this get in this case.
Test Plan: make all check
Reviewers: dhruba, haobo, igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14703
Conflicts:
db/db_impl.cc
Summary:
Implement a mem table, in which keys are hashed based on prefixes. In each bucket, entries are organized in a sorted linked list. It has the same thread safety guarantee as skip list.
The motivation is to optimize memory usage for the case that prefix hashing is primary way of seeking to the entry. Compared to hash skip list implementation, this implementation is more memory efficient, but inside each bucket, search is always linear. The target scenario is that there are only very limited number of records in each hash bucket.
Test Plan: Add a test case in db_test
Reviewers: haobo, kailiu, dhruba
Reviewed By: haobo
CC: igor, nkg-, leveldb
Differential Revision: https://reviews.facebook.net/D14979
Summary:
This diff introduces some features that were requested by two internal customers:
* Ability for backups not to share table files, because we can't guarantee that equal filename means equal content accross replicas
* Ability for two threads to call EnableFileDeletions() and DisableFileDeletions()
* Ability to stop backup from another thread and not slow down the DB close
* Copy the files to the temporary folder first and then atomically rename
Test Plan: Added some tests to backupable_db_test
Reviewers: dhruba, sanketh, muthu, sdong, haobo
Reviewed By: haobo
CC: leveldb, sanketh, muthu
Differential Revision: https://reviews.facebook.net/D14769
Summary:
Added readwhilewriting benchmark to our regression tests.
Changed block cache shards from 16 to 64, as Mark found that cache mutex contention is a big bottleneck.
Test Plan: Ran it.
Reviewers: dhruba, haobo, MarkCallaghan, xjin
Reviewed By: MarkCallaghan
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15075
Summary: Currently, even if statistics is not enabled, StopWatch only for the stats still gets the time of the day, which is wasteful. This patch adds a new option to StopWatch to disable this get in this case.
Test Plan: make all check
Reviewers: dhruba, haobo, igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14703
Summary: Use two vectors for different types of memory allocation.
Test Plan: run all unit tests.
Reviewers: haobo, sdong
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15027
Summary:
I have added three new value types:
* kTypeColumnFamilyDeletion
* kTypeColumnFamilyValue
* kTypeColumnFamilyMerge
which include column family Varint32 before the data (value, deletion and merge). These values are used only in WAL (not in memtables yet).
This endeavour required changing some WriteBatch internals.
Test Plan: Added a unittest
Reviewers: dhruba, haobo, sdong, kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15045
Summary:
Code was always compressing L0 files written by a memtable flush
when compression was enabled. Now this is done when
min_level_to_compress=0 for leveled compaction and when
universal_compaction_size_percent=-1 for universal compaction.
Task ID: #3416472
Blame Rev:
Test Plan:
ran db_bench with compression options
Revert Plan:
Database Impact:
Memcache Impact:
Other Notes:
EImportant:
- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -
Reviewers: dhruba, igor, sdong
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14757
Summary:
In addition to implementing OpenWithColumnFamilies, this diff also includes some minor changes:
* Changed all column family names from Slice() to std::string. The performance of column family name handling is not critical, and it's more convenient and cleaner to have names as std::strings
* Implemented ColumnFamilyOptions(const Options&) and DBOptions(const Options&)
* Added ColumnFamilyOptions to VersionSet::ColumnFamilyData. ColumnFamilyOptions are specified on OpenWithColumnFamilies() and CreateColumnFamily()
I will keep the diff in the Phabricator for a day or two and will push to the branch then. Feel free to comment even after the diff has been pushed.
Test Plan: Added a simple unit test
Reviewers: dhruba, haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15033
This seems out of place as it's the only time RocksDB prints to stdout in the
normal course of operations. Thread IDs can still be retrieved from the LOG
file: cut -d ' ' -f2 LOG | sort | uniq | egrep -x '[0-9a-f]+'
Summary:
The way DBImpl::TEST_CompactRange() throttles down the number of bg compactions
can cause it to deadlock when CompactRange() is called concurrently from
multiple threads. Imagine a following scenario with only two threads
(max_background_compactions is 10 and bg_compaction_scheduled_ is initially 0):
1. Thread #1 increments bg_compaction_scheduled_ (to LargeNumber), sets
bg_compaction_scheduled_ to 9 (newvalue), schedules the compaction
(bg_compaction_scheduled_ is now 10) and waits for it to complete.
2. Thread #2 calls TEST_CompactRange(), increments bg_compaction_scheduled_
(now LargeNumber + 10) and waits on a cv for bg_compaction_scheduled_ to
drop to LargeNumber.
3. BG thread completes the first manual compaction, decrements
bg_compaction_scheduled_ and wakes up all threads waiting on bg_cv_.
Thread #1 runs, increments bg_compaction_scheduled_ by LargeNumber again
(now 2*LargeNumber + 9). Since that's more than LargeNumber + newvalue,
thread #2 also goes to sleep (waiting on bg_cv_), without resetting
bg_compaction_scheduled_.
This diff attempts to address the problem by introducing a new counter
bg_manual_only_ (when positive, MaybeScheduleFlushOrCompaction() will only
schedule manual compactions).
Test Plan:
I could pretty much consistently reproduce the deadlock with a program that
calls CompactRange(nullptr, nullptr) immediately after Write() from multiple
threads. This no longer happens with this patch.
Tests (make check) pass.
Reviewers: dhruba, igor, sdong, haobo
Reviewed By: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14799
Summary:
This diff provides basic implementations of CreateColumnFamily(), DropColumnFamily() and ListColumnFamilies(). It builds on top of https://reviews.facebook.net/D14733
It also includes a bug fix for DBImplReadOnly, where Get implementation would be redirected to DBImpl instead of DBImplReadOnly.
Test Plan: Added unit test
Reviewers: dhruba, haobo, kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15021
Summary: this diff only replace the cases when we need to frequently create vector with small amount of entries. This diff doesn't aim to improve performance of a specific area, but more like a small scale test for the autovector and see how it works in real life.
Test Plan:
make check
I also ran the performance tests, however there is no performance gain/loss. All performance numbers are pretty much the same before/after the change.
Reviewers: dhruba, haobo, sdong, igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14985
Summary: I made some cleanup while reading the source code in `db`. Most changes are about style, naming or C++ 11 new features.
Test Plan: ran `make check`
Reviewers: haobo, dhruba, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15009
Summary:
The rule file is forked from that in Facebook's repo.
I'll add format file for now and team members can tune the rules later.
In this patch, I made only two changes in order to be consistent with existing coding style
`SpacesBeforeTrailingComments: 2`
`ColumnLimit: 80`
Test Plan: N/A
Reviewers: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15015
Summary: Added a script that prepares the repo for facebook's new rocksdb release, which will automatically do some necessary work to make sure this repo is ready for 3rdparty release.
Test Plan:
Run this script and observed:
* new version was created (both in local and remote repo) as a git tag.
* build_version.cc was updated
* build_detect_platform was changed so that it won't create any new change.
Reviewers: haobo, dhruba, sdong, igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15003
Summary:
<This diff is for Column Family branch>
Added fields in manifest file to support adding and deleting column families.
Pretty simple change, each version edit record can be:
1. add column family
2. drop column family
3. add and delete N files from a single column family (compactions and flushes will generate such records)
Test Plan: make check works, the code is backward compatible
Reviewers: dhruba, haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14733
Summary:
We don't want two threads to clash if they concurrently call DisableFileDeletions() and EnableFileDeletions(). I'm adding a counter that will enable file deletions only after all DisableFileDeletions() calls have been negated with EnableFileDeletions().
However, we also don't want to break the old behavior, so I added a parameter force to EnableFileDeletions(). If force is true, we will still enable file deletions after every call to EnableFileDeletions(), which is what is happening now.
Test Plan: make check
Reviewers: dhruba, haobo, sanketh
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14781
Summary: I'm not sure what's the purpose of encoding file number to a new buffer for looking up the table cache. It seems to be unnecessary to me. With this patch, we point the lookup key to the address of the int64 of the file number.
Test Plan: make all check
Reviewers: dhruba, haobo, igor, kailiu
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14811
Summary:
In some places we have NotFound status created with empty message, but it doesn't avoid a malloc. With this patch, the malloc is avoided for that case.
The motivation of it is that I found in db_bench readrandom test when all keys are not existing, about 4% of the total running time is spent on malloc of Status, plus a similar amount of CPU spent on free of them, which is not necessary.
Test Plan: make all check
Reviewers: dhruba, haobo, igor
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14691