You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
rocksdb/db/db_iter_test.cc

3271 lines
112 KiB

// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#include <string>
#include <vector>
#include <algorithm>
#include <utility>
#include "db/db_iter.h"
#include "db/dbformat.h"
#include "rocksdb/comparator.h"
#include "rocksdb/options.h"
#include "rocksdb/perf_context.h"
#include "rocksdb/slice.h"
#include "rocksdb/statistics.h"
#include "table/iterator_wrapper.h"
#include "table/merging_iterator.h"
#include "test_util/sync_point.h"
#include "test_util/testharness.h"
#include "util/string_util.h"
#include "utilities/merge_operators.h"
namespace ROCKSDB_NAMESPACE {
static uint64_t TestGetTickerCount(const Options& options,
Tickers ticker_type) {
return options.statistics->getTickerCount(ticker_type);
}
class TestIterator : public InternalIterator {
public:
explicit TestIterator(const Comparator* comparator)
: initialized_(false),
valid_(false),
sequence_number_(0),
iter_(0),
cmp(comparator) {
data_.reserve(16);
}
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
void AddPut(std::string argkey, std::string argvalue) {
Add(argkey, kTypeValue, argvalue);
}
void AddDeletion(std::string argkey) {
Add(argkey, kTypeDeletion, std::string());
}
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
void AddSingleDeletion(std::string argkey) {
Add(argkey, kTypeSingleDeletion, std::string());
}
void AddMerge(std::string argkey, std::string argvalue) {
Add(argkey, kTypeMerge, argvalue);
}
void Add(std::string argkey, ValueType type, std::string argvalue) {
Add(argkey, type, argvalue, sequence_number_++);
}
void Add(std::string argkey, ValueType type, std::string argvalue,
size_t seq_num, bool update_iter = false) {
valid_ = true;
ParsedInternalKey internal_key(argkey, seq_num, type);
data_.push_back(
std::pair<std::string, std::string>(std::string(), argvalue));
AppendInternalKey(&data_.back().first, internal_key);
if (update_iter && valid_ && cmp.Compare(data_.back().first, key()) < 0) {
// insert a key smaller than current key
Finish();
// data_[iter_] is not anymore the current element of the iterator.
// Increment it to reposition it to the right position.
iter_++;
}
}
// should be called before operations with iterator
void Finish() {
initialized_ = true;
std::sort(data_.begin(), data_.end(),
[this](std::pair<std::string, std::string> a,
std::pair<std::string, std::string> b) {
return (cmp.Compare(a.first, b.first) < 0);
});
}
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
// Removes the key from the set of keys over which this iterator iterates.
// Not to be confused with AddDeletion().
// If the iterator is currently positioned on this key, the deletion will
// apply next time the iterator moves.
// Used for simulating ForwardIterator updating to a new version that doesn't
// have some of the keys (e.g. after compaction with a filter).
void Vanish(std::string _key) {
if (valid_ && data_[iter_].first == _key) {
delete_current_ = true;
return;
}
for (auto it = data_.begin(); it != data_.end(); ++it) {
ParsedInternalKey ikey;
Status pik_status =
ParseInternalKey(it->first, &ikey, true /* log_err_key */);
pik_status.PermitUncheckedError();
assert(pik_status.ok());
if (!pik_status.ok() || ikey.user_key != _key) {
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
continue;
}
if (valid_ && data_.begin() + iter_ > it) {
--iter_;
}
data_.erase(it);
return;
}
assert(false);
}
// Number of operations done on this iterator since construction.
size_t steps() const { return steps_; }
bool Valid() const override {
assert(initialized_);
return valid_;
}
void SeekToFirst() override {
assert(initialized_);
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
++steps_;
DeleteCurrentIfNeeded();
valid_ = (data_.size() > 0);
iter_ = 0;
}
void SeekToLast() override {
assert(initialized_);
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
++steps_;
DeleteCurrentIfNeeded();
valid_ = (data_.size() > 0);
iter_ = data_.size() - 1;
}
void Seek(const Slice& target) override {
assert(initialized_);
SeekToFirst();
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
++steps_;
if (!valid_) {
return;
}
while (iter_ < data_.size() &&
(cmp.Compare(data_[iter_].first, target) < 0)) {
++iter_;
}
if (iter_ == data_.size()) {
valid_ = false;
}
}
void SeekForPrev(const Slice& target) override {
assert(initialized_);
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
DeleteCurrentIfNeeded();
SeekForPrevImpl(target, &cmp);
}
void Next() override {
assert(initialized_);
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
assert(valid_);
assert(iter_ < data_.size());
++steps_;
if (delete_current_) {
DeleteCurrentIfNeeded();
} else {
++iter_;
}
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
valid_ = iter_ < data_.size();
}
void Prev() override {
assert(initialized_);
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
assert(valid_);
assert(iter_ < data_.size());
++steps_;
DeleteCurrentIfNeeded();
if (iter_ == 0) {
valid_ = false;
} else {
--iter_;
}
}
Slice key() const override {
assert(initialized_);
return data_[iter_].first;
}
Slice value() const override {
assert(initialized_);
return data_[iter_].second;
}
Status status() const override {
assert(initialized_);
return Status::OK();
}
bool IsKeyPinned() const override { return true; }
bool IsValuePinned() const override { return true; }
Introduce FullMergeV2 (eliminate memcpy from merge operators) Summary: This diff update the code to pin the merge operator operands while the merge operation is done, so that we can eliminate the memcpy cost, to do that we need a new public API for FullMerge that replace the std::deque<std::string> with std::vector<Slice> This diff is stacked on top of D56493 and D56511 In this diff we - Update FullMergeV2 arguments to be encapsulated in MergeOperationInput and MergeOperationOutput which will make it easier to add new arguments in the future - Replace std::deque<std::string> with std::vector<Slice> to pass operands - Replace MergeContext std::deque with std::vector (based on a simple benchmark I ran https://gist.github.com/IslamAbdelRahman/78fc86c9ab9f52b1df791e58943fb187) - Allow FullMergeV2 output to be an existing operand ``` [Everything in Memtable | 10K operands | 10 KB each | 1 operand per key] DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=10000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000 [FullMergeV2] readseq : 0.607 micros/op 1648235 ops/sec; 16121.2 MB/s readseq : 0.478 micros/op 2091546 ops/sec; 20457.2 MB/s readseq : 0.252 micros/op 3972081 ops/sec; 38850.5 MB/s readseq : 0.237 micros/op 4218328 ops/sec; 41259.0 MB/s readseq : 0.247 micros/op 4043927 ops/sec; 39553.2 MB/s [master] readseq : 3.935 micros/op 254140 ops/sec; 2485.7 MB/s readseq : 3.722 micros/op 268657 ops/sec; 2627.7 MB/s readseq : 3.149 micros/op 317605 ops/sec; 3106.5 MB/s readseq : 3.125 micros/op 320024 ops/sec; 3130.1 MB/s readseq : 4.075 micros/op 245374 ops/sec; 2400.0 MB/s ``` ``` [Everything in Memtable | 10K operands | 10 KB each | 10 operand per key] DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=1000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000 [FullMergeV2] readseq : 3.472 micros/op 288018 ops/sec; 2817.1 MB/s readseq : 2.304 micros/op 434027 ops/sec; 4245.2 MB/s readseq : 1.163 micros/op 859845 ops/sec; 8410.0 MB/s readseq : 1.192 micros/op 838926 ops/sec; 8205.4 MB/s readseq : 1.250 micros/op 800000 ops/sec; 7824.7 MB/s [master] readseq : 24.025 micros/op 41623 ops/sec; 407.1 MB/s readseq : 18.489 micros/op 54086 ops/sec; 529.0 MB/s readseq : 18.693 micros/op 53495 ops/sec; 523.2 MB/s readseq : 23.621 micros/op 42335 ops/sec; 414.1 MB/s readseq : 18.775 micros/op 53262 ops/sec; 521.0 MB/s ``` ``` [Everything in Block cache | 10K operands | 10 KB each | 1 operand per key] [FullMergeV2] $ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions readseq : 14.741 micros/op 67837 ops/sec; 663.5 MB/s readseq : 1.029 micros/op 971446 ops/sec; 9501.6 MB/s readseq : 0.974 micros/op 1026229 ops/sec; 10037.4 MB/s readseq : 0.965 micros/op 1036080 ops/sec; 10133.8 MB/s readseq : 0.943 micros/op 1060657 ops/sec; 10374.2 MB/s [master] readseq : 16.735 micros/op 59755 ops/sec; 584.5 MB/s readseq : 3.029 micros/op 330151 ops/sec; 3229.2 MB/s readseq : 3.136 micros/op 318883 ops/sec; 3119.0 MB/s readseq : 3.065 micros/op 326245 ops/sec; 3191.0 MB/s readseq : 3.014 micros/op 331813 ops/sec; 3245.4 MB/s ``` ``` [Everything in Block cache | 10K operands | 10 KB each | 10 operand per key] DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10-operands-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions [FullMergeV2] readseq : 24.325 micros/op 41109 ops/sec; 402.1 MB/s readseq : 1.470 micros/op 680272 ops/sec; 6653.7 MB/s readseq : 1.231 micros/op 812347 ops/sec; 7945.5 MB/s readseq : 1.091 micros/op 916590 ops/sec; 8965.1 MB/s readseq : 1.109 micros/op 901713 ops/sec; 8819.6 MB/s [master] readseq : 27.257 micros/op 36687 ops/sec; 358.8 MB/s readseq : 4.443 micros/op 225073 ops/sec; 2201.4 MB/s readseq : 5.830 micros/op 171526 ops/sec; 1677.7 MB/s readseq : 4.173 micros/op 239635 ops/sec; 2343.8 MB/s readseq : 4.150 micros/op 240963 ops/sec; 2356.8 MB/s ``` Test Plan: COMPILE_WITH_ASAN=1 make check -j64 Reviewers: yhchiang, andrewkr, sdong Reviewed By: sdong Subscribers: lovro, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D57075
8 years ago
private:
bool initialized_;
bool valid_;
size_t sequence_number_;
size_t iter_;
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
size_t steps_ = 0;
InternalKeyComparator cmp;
std::vector<std::pair<std::string, std::string>> data_;
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
bool delete_current_ = false;
void DeleteCurrentIfNeeded() {
if (!delete_current_) {
return;
}
data_.erase(data_.begin() + iter_);
delete_current_ = false;
}
};
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
class DBIteratorTest : public testing::Test {
public:
Env* env_;
DBIteratorTest() : env_(Env::Default()) {}
};
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIteratorPrevNext) {
Options options;
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddDeletion("a");
internal_iter->AddDeletion("a");
internal_iter->AddDeletion("a");
internal_iter->AddDeletion("a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->Finish();
ReadOptions ro;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
}
// Test to check the SeekToLast() with iterate_upper_bound not set
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
ReadOptions ro;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
}
// Test to check the SeekToLast() with iterate_upper_bound set
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("d", "val_d");
internal_iter->AddPut("e", "val_e");
internal_iter->AddPut("f", "val_f");
internal_iter->Finish();
Slice prefix("d");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
}
// Test to check the SeekToLast() iterate_upper_bound set to a key that
// is not Put yet
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("d", "val_d");
internal_iter->Finish();
Slice prefix("z");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
}
// Test to check the SeekToLast() with iterate_upper_bound set to the
// first key
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("b", "val_b");
internal_iter->Finish();
Slice prefix("a");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
// Test case to check SeekToLast with iterate_upper_bound set
// (same key put may times - SeekToLast should start with the
// maximum sequence id of the upper bound)
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
Slice prefix("c");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 7 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
SetPerfLevel(kEnableCount);
ASSERT_TRUE(GetPerfLevel() == kEnableCount);
get_perf_context()->Reset();
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(static_cast<int>(get_perf_context()->internal_key_skipped_count), 1);
ASSERT_EQ(db_iter->key().ToString(), "b");
SetPerfLevel(kDisable);
}
// Test to check the SeekToLast() with the iterate_upper_bound set
// (Checking the value of the key which has sequence ids greater than
// and less that the iterator's sequence id)
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a1");
internal_iter->AddPut("a", "val_a2");
internal_iter->AddPut("b", "val_b1");
internal_iter->AddPut("c", "val_c1");
internal_iter->AddPut("c", "val_c2");
internal_iter->AddPut("c", "val_c3");
internal_iter->AddPut("b", "val_b2");
internal_iter->AddPut("d", "val_d1");
internal_iter->Finish();
Slice prefix("c");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 4 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b1");
}
// Test to check the SeekToLast() with the iterate_upper_bound set to the
// key that is deleted
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
Slice prefix("a");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
// Test to check the SeekToLast() with the iterate_upper_bound set
// (Deletion cases)
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
Slice prefix("c");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
}
// Test to check the SeekToLast() with iterate_upper_bound set
// (Deletion cases - Lot of internal keys after the upper_bound
// is deleted)
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddDeletion("c");
internal_iter->AddDeletion("d");
internal_iter->AddDeletion("e");
internal_iter->AddDeletion("f");
internal_iter->AddDeletion("g");
internal_iter->AddDeletion("h");
internal_iter->Finish();
Slice prefix("c");
ReadOptions ro;
ro.iterate_upper_bound = &prefix;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 7 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
SetPerfLevel(kEnableCount);
ASSERT_TRUE(GetPerfLevel() == kEnableCount);
get_perf_context()->Reset();
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(static_cast<int>(get_perf_context()->internal_delete_skipped_count), 0);
ASSERT_EQ(db_iter->key().ToString(), "b");
SetPerfLevel(kDisable);
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddDeletion("a");
internal_iter->AddDeletion("a");
internal_iter->AddDeletion("a");
internal_iter->AddDeletion("a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->Finish();
ReadOptions ro;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->Finish();
ReadOptions ro;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b");
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
ReadOptions ro;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val_b");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
}
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIteratorEmpty) {
Options options;
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
ReadOptions ro;
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 0 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 0 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIteratorUseSkipCountSkips) {
ReadOptions ro;
Options options;
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (size_t i = 0; i < 200; ++i) {
internal_iter->AddPut("a", "a");
internal_iter->AddPut("b", "b");
internal_iter->AddPut("c", "c");
}
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
2 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "c");
ASSERT_EQ(TestGetTickerCount(options, NUMBER_OF_RESEEKS_IN_ITERATION), 1u);
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "b");
ASSERT_EQ(TestGetTickerCount(options, NUMBER_OF_RESEEKS_IN_ITERATION), 2u);
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "a");
ASSERT_EQ(TestGetTickerCount(options, NUMBER_OF_RESEEKS_IN_ITERATION), 3u);
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
ASSERT_EQ(TestGetTickerCount(options, NUMBER_OF_RESEEKS_IN_ITERATION), 3u);
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIteratorUseSkip) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
{
for (size_t i = 0; i < 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("b", "merge_1");
internal_iter->AddMerge("a", "merge_2");
for (size_t k = 0; k < 200; ++k) {
internal_iter->AddPut("c", ToString(k));
}
internal_iter->Finish();
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, i + 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), ToString(i));
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_2");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
}
{
for (size_t i = 0; i < 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("b", "merge_1");
internal_iter->AddMerge("a", "merge_2");
for (size_t k = 0; k < 200; ++k) {
internal_iter->AddDeletion("c");
}
internal_iter->AddPut("c", "200");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, i + 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_2");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("b", "merge_1");
internal_iter->AddMerge("a", "merge_2");
for (size_t i = 0; i < 200; ++i) {
internal_iter->AddDeletion("c");
}
internal_iter->AddPut("c", "200");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 202 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "200");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_2");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
}
{
for (size_t i = 0; i < 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (size_t k = 0; k < 200; ++k) {
internal_iter->AddDeletion("c");
}
internal_iter->AddPut("c", "200");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, i /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
db_iter->SeekToFirst();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (size_t i = 0; i < 200; ++i) {
internal_iter->AddDeletion("c");
}
internal_iter->AddPut("c", "200");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 200 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "200");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "200");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
{
for (size_t i = 0; i < 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("b", "merge_1");
internal_iter->AddMerge("a", "merge_2");
for (size_t k = 0; k < 200; ++k) {
internal_iter->AddPut("d", ToString(k));
}
for (size_t k = 0; k < 200; ++k) {
internal_iter->AddPut("c", ToString(k));
}
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, i + 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), ToString(i));
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_2");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
}
{
for (size_t i = 0; i < 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("b", "b");
internal_iter->AddMerge("a", "a");
for (size_t k = 0; k < 200; ++k) {
internal_iter->AddMerge("c", ToString(k));
}
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, i + 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
std::string merge_result = "0";
for (size_t j = 1; j <= i; ++j) {
merge_result += "," + ToString(j);
}
ASSERT_EQ(db_iter->value().ToString(), merge_result);
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "b");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "a");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
}
}
TEST_F(DBIteratorTest, DBIteratorSkipInternalKeys) {
Options options;
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
ReadOptions ro;
// Basic test case ... Make sure explicityly passing the default value works.
// Skipping internal keys is disabled by default, when the value is 0.
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddPut("c", "val_c");
internal_iter->AddDeletion("c");
internal_iter->AddPut("d", "val_d");
internal_iter->Finish();
ro.max_skippable_internal_keys = 0;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "val_d");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().ok());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "val_d");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_OK(db_iter->status());
}
// Test to make sure that the request will *not* fail as incomplete if
// num_internal_keys_skipped is *equal* to max_skippable_internal_keys
// threshold. (It will fail as incomplete only when the threshold is
// exceeded.)
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
ro.max_skippable_internal_keys = 2;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().ok());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Prev();
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().ok());
}
// Fail the request as incomplete when num_internal_keys_skipped >
// max_skippable_internal_keys
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
ro.max_skippable_internal_keys = 2;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
}
// Test that the num_internal_keys_skipped counter resets after a successful
// read.
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddDeletion("d");
internal_iter->AddDeletion("d");
internal_iter->AddDeletion("d");
internal_iter->AddPut("e", "val_e");
internal_iter->Finish();
ro.max_skippable_internal_keys = 2;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Next(); // num_internal_keys_skipped counter resets here.
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
}
// Test that the num_internal_keys_skipped counter resets after a successful
// read.
// Reverse direction
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddDeletion("d");
internal_iter->AddDeletion("d");
internal_iter->AddPut("e", "val_e");
internal_iter->Finish();
ro.max_skippable_internal_keys = 2;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "e");
ASSERT_EQ(db_iter->value().ToString(), "val_e");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Prev(); // num_internal_keys_skipped counter resets here.
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
}
// Test that skipping separate keys is handled
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("c");
internal_iter->AddDeletion("d");
internal_iter->AddPut("e", "val_e");
internal_iter->Finish();
ro.max_skippable_internal_keys = 2;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "e");
ASSERT_EQ(db_iter->value().ToString(), "val_e");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
}
// Test if alternating puts and deletes of the same key are handled correctly.
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddPut("b", "val_b");
internal_iter->AddDeletion("b");
internal_iter->AddPut("c", "val_c");
internal_iter->AddDeletion("c");
internal_iter->AddPut("d", "val_d");
internal_iter->AddDeletion("d");
internal_iter->AddPut("e", "val_e");
internal_iter->Finish();
ro.max_skippable_internal_keys = 2;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "e");
ASSERT_EQ(db_iter->value().ToString(), "val_e");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
}
// Test for large number of skippable internal keys with *default*
// max_sequential_skip_in_iterations.
{
for (size_t i = 1; i <= 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
for (size_t j = 1; j <= i; ++j) {
internal_iter->AddPut("b", "val_b");
internal_iter->AddDeletion("b");
}
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
ro.max_skippable_internal_keys = i;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 2 * i + 1 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
if ((options.max_sequential_skip_in_iterations + 1) >=
ro.max_skippable_internal_keys) {
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
} else {
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
}
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Prev();
if ((options.max_sequential_skip_in_iterations + 1) >=
ro.max_skippable_internal_keys) {
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
} else {
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
}
}
}
// Test for large number of skippable internal keys with a *non-default*
// max_sequential_skip_in_iterations.
{
for (size_t i = 1; i <= 200; ++i) {
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
for (size_t j = 1; j <= i; ++j) {
internal_iter->AddPut("b", "val_b");
internal_iter->AddDeletion("b");
}
internal_iter->AddPut("c", "val_c");
internal_iter->Finish();
options.max_sequential_skip_in_iterations = 1000;
ro.max_skippable_internal_keys = i;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 2 * i + 1 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "val_a");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "val_c");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
ASSERT_TRUE(db_iter->status().IsIncomplete());
}
}
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator1) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "0");
internal_iter->AddPut("b", "0");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("a", "1");
internal_iter->AddMerge("b", "2");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
1 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "0");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
db_iter->Next();
ASSERT_FALSE(db_iter->Valid());
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator2) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "0");
internal_iter->AddPut("b", "0");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("a", "1");
internal_iter->AddMerge("b", "2");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
0 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "0");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator3) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "0");
internal_iter->AddPut("b", "0");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("a", "1");
internal_iter->AddMerge("b", "2");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
2 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "0");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator4) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "0");
internal_iter->AddPut("b", "0");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("a", "1");
internal_iter->AddMerge("b", "2");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
4 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "0,1");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "2");
db_iter->Next();
ASSERT_TRUE(!db_iter->Valid());
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator5) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 0 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 1 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1,merge_2");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1,merge_2,merge_3");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 3 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "put_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 4 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "put_1,merge_4");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 5 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "put_1,merge_4,merge_5");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddPut("a", "put_1");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 6 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "put_1,merge_4,merge_5,merge_6");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
// put, singledelete, merge
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "val_a");
internal_iter->AddSingleDeletion("a");
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddPut("b", "val_b");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 10 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->Seek("b");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
}
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator6) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 0 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 1 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1,merge_2");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1,merge_2,merge_3");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 3 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 4 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_4");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 5 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_4,merge_5");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("a", "merge_3");
internal_iter->AddDeletion("a");
internal_iter->AddMerge("a", "merge_4");
internal_iter->AddMerge("a", "merge_5");
internal_iter->AddMerge("a", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 6 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_4,merge_5,merge_6");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator7) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
ImmutableOptions ioptions = ImmutableOptions(options);
MutableCFOptions mutable_cf_options = MutableCFOptions(options);
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 0 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 2 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "val,merge_2");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 4 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 5 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "merge_4");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 6 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "merge_4,merge_5");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 7 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "merge_4,merge_5");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 9 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "merge_4,merge_5");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_6,merge_7");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 13 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "merge_4,merge_5");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(),
"merge_6,merge_7,merge_8,merge_9,merge_10,merge_11");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddPut("b", "val");
internal_iter->AddMerge("b", "merge_2");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("c", "merge_4");
internal_iter->AddMerge("c", "merge_5");
internal_iter->AddDeletion("b");
internal_iter->AddMerge("b", "merge_6");
internal_iter->AddMerge("b", "merge_7");
internal_iter->AddMerge("b", "merge_8");
internal_iter->AddMerge("b", "merge_9");
internal_iter->AddMerge("b", "merge_10");
internal_iter->AddMerge("b", "merge_11");
internal_iter->AddDeletion("c");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ioptions, mutable_cf_options, BytewiseComparator(),
internal_iter, nullptr /* version */, 14 /* sequence */,
options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(),
"merge_6,merge_7,merge_8,merge_9,merge_10,merge_11");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1");
db_iter->Prev();
ASSERT_TRUE(!db_iter->Valid());
}
}
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
TEST_F(DBIteratorTest, DBIterator8) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddDeletion("a");
internal_iter->AddPut("a", "0");
internal_iter->AddPut("b", "0");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "0");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "0");
}
// TODO(3.13): fix the issue of Seek() then Prev() which might not necessary
// return the biggest element smaller than the seek key.
TEST_F(DBIteratorTest, DBIterator9) {
ReadOptions ro;
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
{
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddMerge("a", "merge_1");
internal_iter->AddMerge("a", "merge_2");
internal_iter->AddMerge("b", "merge_3");
internal_iter->AddMerge("b", "merge_4");
internal_iter->AddMerge("d", "merge_5");
internal_iter->AddMerge("d", "merge_6");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3,merge_4");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "merge_5,merge_6");
db_iter->Seek("b");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3,merge_4");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "merge_1,merge_2");
db_iter->SeekForPrev("b");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3,merge_4");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "merge_5,merge_6");
db_iter->Seek("c");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "merge_5,merge_6");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3,merge_4");
db_iter->SeekForPrev("c");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "merge_3,merge_4");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "merge_5,merge_6");
}
}
// TODO(3.13): fix the issue of Seek() then Prev() which might not necessary
// return the biggest element smaller than the seek key.
TEST_F(DBIteratorTest, DBIterator10) {
ReadOptions ro;
Options options;
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "1");
internal_iter->AddPut("b", "2");
internal_iter->AddPut("c", "3");
internal_iter->AddPut("d", "4");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->Seek("c");
ASSERT_TRUE(db_iter->Valid());
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "2");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "3");
db_iter->SeekForPrev("c");
ASSERT_TRUE(db_iter->Valid());
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "d");
ASSERT_EQ(db_iter->value().ToString(), "4");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "3");
}
TEST_F(DBIteratorTest, SeekToLastOccurrenceSeq0) {
ReadOptions ro;
Options options;
options.merge_operator = nullptr;
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "1");
internal_iter->AddPut("b", "2");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, 0 /* force seek */, nullptr /* read_callback */));
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "1");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "2");
db_iter->Next();
ASSERT_FALSE(db_iter->Valid());
}
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
TEST_F(DBIteratorTest, DBIterator11) {
ReadOptions ro;
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
Options options;
options.merge_operator = MergeOperators::CreateFromStringId("stringappend");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "0");
internal_iter->AddPut("b", "0");
internal_iter->AddSingleDeletion("b");
internal_iter->AddMerge("a", "1");
internal_iter->AddMerge("b", "2");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
1 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
db_iter->SeekToFirst();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "0");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
db_iter->Next();
ASSERT_FALSE(db_iter->Valid());
}
TEST_F(DBIteratorTest, DBIterator12) {
ReadOptions ro;
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
Options options;
options.merge_operator = nullptr;
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "1");
internal_iter->AddPut("b", "2");
internal_iter->AddPut("c", "3");
internal_iter->AddSingleDeletion("b");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, 0 /* force seek */, nullptr /* read_callback */));
Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179
9 years ago
db_iter->SeekToLast();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "c");
ASSERT_EQ(db_iter->value().ToString(), "3");
db_iter->Prev();
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "1");
db_iter->Prev();
ASSERT_FALSE(db_iter->Valid());
}
TEST_F(DBIteratorTest, DBIterator13) {
ReadOptions ro;
Options options;
options.merge_operator = nullptr;
std::string key;
key.resize(9);
key.assign(9, static_cast<char>(0));
key[0] = 'b';
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut(key, "0");
internal_iter->AddPut(key, "1");
internal_iter->AddPut(key, "2");
internal_iter->AddPut(key, "3");
internal_iter->AddPut(key, "4");
internal_iter->AddPut(key, "5");
internal_iter->AddPut(key, "6");
internal_iter->AddPut(key, "7");
internal_iter->AddPut(key, "8");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
2 /* sequence */, 3 /* max_sequential_skip_in_iterations */,
nullptr /* read_callback */));
db_iter->Seek("b");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), key);
ASSERT_EQ(db_iter->value().ToString(), "2");
}
TEST_F(DBIteratorTest, DBIterator14) {
ReadOptions ro;
Options options;
options.merge_operator = nullptr;
std::string key("b");
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("b", "0");
internal_iter->AddPut("b", "1");
internal_iter->AddPut("b", "2");
internal_iter->AddPut("b", "3");
internal_iter->AddPut("a", "4");
internal_iter->AddPut("a", "5");
internal_iter->AddPut("a", "6");
internal_iter->AddPut("c", "7");
internal_iter->AddPut("c", "8");
internal_iter->AddPut("c", "9");
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
4 /* sequence */, 1 /* max_sequential_skip_in_iterations */,
nullptr /* read_callback */));
db_iter->Seek("b");
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(db_iter->key().ToString(), "b");
ASSERT_EQ(db_iter->value().ToString(), "3");
db_iter->SeekToFirst();
ASSERT_EQ(db_iter->key().ToString(), "a");
ASSERT_EQ(db_iter->value().ToString(), "4");
}
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
TEST_F(DBIteratorTest, DBIteratorTestDifferentialSnapshots) {
{ // test that KVs earlier that iter_start_seqnum are filtered out
ReadOptions ro;
ro.iter_start_seqnum=5;
Options options;
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (size_t i = 0; i < 10; ++i) {
internal_iter->AddPut(std::to_string(i), std::to_string(i) + "a");
internal_iter->AddPut(std::to_string(i), std::to_string(i) + "b");
internal_iter->AddPut(std::to_string(i), std::to_string(i) + "c");
}
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
13 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
// Expecting InternalKeys in [5,8] range with correct type
int seqnums[4] = {5,8,11,13};
std::string user_keys[4] = {"1","2","3","4"};
std::string values[4] = {"1c", "2c", "3c", "4b"};
int i = 0;
for (db_iter->SeekToFirst(); db_iter->Valid(); db_iter->Next()) {
ParsedInternalKey fkey;
ASSERT_OK(
ParseInternalKey(db_iter->key(), &fkey, true /* log_err_key */));
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
ASSERT_EQ(user_keys[i], fkey.user_key.ToString());
ASSERT_EQ(kTypeValue, fkey.type);
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
ASSERT_EQ(seqnums[i], fkey.sequence);
ASSERT_EQ(values[i], db_iter->value().ToString());
i++;
}
ASSERT_EQ(i, 4);
}
{ // Test that deletes are returned correctly as internal KVs
ReadOptions ro;
ro.iter_start_seqnum=5;
Options options;
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (size_t i = 0; i < 10; ++i) {
internal_iter->AddPut(std::to_string(i), std::to_string(i) + "a");
internal_iter->AddPut(std::to_string(i), std::to_string(i) + "b");
internal_iter->AddDeletion(std::to_string(i));
}
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
13 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
// Expecting InternalKeys in [5,8] range with correct type
int seqnums[4] = {5,8,11,13};
ValueType key_types[4] = {kTypeDeletion, kTypeDeletion, kTypeDeletion,
kTypeValue};
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
std::string user_keys[4] = {"1","2","3","4"};
std::string values[4] = {"", "", "", "4b"};
int i = 0;
for (db_iter->SeekToFirst(); db_iter->Valid(); db_iter->Next()) {
ParsedInternalKey fkey;
ASSERT_OK(
ParseInternalKey(db_iter->key(), &fkey, true /* log_err_key */));
Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
7 years ago
ASSERT_EQ(user_keys[i], fkey.user_key.ToString());
ASSERT_EQ(key_types[i], fkey.type);
ASSERT_EQ(seqnums[i], fkey.sequence);
ASSERT_EQ(values[i], db_iter->value().ToString());
i++;
}
ASSERT_EQ(i, 4);
}
}
class DBIterWithMergeIterTest : public testing::Test {
public:
DBIterWithMergeIterTest()
: env_(Env::Default()), icomp_(BytewiseComparator()) {
options_.merge_operator = nullptr;
internal_iter1_ = new TestIterator(BytewiseComparator());
internal_iter1_->Add("a", kTypeValue, "1", 3u);
internal_iter1_->Add("f", kTypeValue, "2", 5u);
internal_iter1_->Add("g", kTypeValue, "3", 7u);
internal_iter1_->Finish();
internal_iter2_ = new TestIterator(BytewiseComparator());
internal_iter2_->Add("a", kTypeValue, "4", 6u);
internal_iter2_->Add("b", kTypeValue, "5", 1u);
internal_iter2_->Add("c", kTypeValue, "6", 2u);
internal_iter2_->Add("d", kTypeValue, "7", 3u);
internal_iter2_->Finish();
std::vector<InternalIterator*> child_iters;
child_iters.push_back(internal_iter1_);
child_iters.push_back(internal_iter2_);
InternalKeyComparator icomp(BytewiseComparator());
InternalIterator* merge_iter =
NewMergingIterator(&icomp_, &child_iters[0], 2u);
db_iter_.reset(NewDBIterator(
env_, ro_, ImmutableOptions(options_), MutableCFOptions(options_),
BytewiseComparator(), merge_iter, nullptr /* version */,
8 /* read data earlier than seqId 8 */,
3 /* max iterators before reseek */, nullptr /* read_callback */));
}
Env* env_;
ReadOptions ro_;
Options options_;
TestIterator* internal_iter1_;
TestIterator* internal_iter2_;
InternalKeyComparator icomp_;
Iterator* merge_iter_;
std::unique_ptr<Iterator> db_iter_;
};
TEST_F(DBIterWithMergeIterTest, InnerMergeIterator1) {
db_iter_->SeekToFirst();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
db_iter_->Next();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Next();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Next();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Next();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Next();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
db_iter_->Next();
ASSERT_FALSE(db_iter_->Valid());
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIterator2) {
// Test Prev() when one child iterator is at its end.
db_iter_->SeekForPrev("g");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace1) {
// Test Prev() when one child iterator is at its end but more rows
// are added.
db_iter_->Seek("f");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
// Test call back inserts a key in the end of the mem table after
// MergeIterator::Prev() realized the mem table iterator is at its end
// and before an SeekToLast() is called.
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev",
[&](void* /*arg*/) { internal_iter2_->Add("z", kTypeValue, "7", 12u); });
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace2) {
// Test Prev() when one child iterator is at its end but more rows
// are added.
db_iter_->Seek("f");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
// Test call back inserts entries for update a key in the end of the
// mem table after MergeIterator::Prev() realized the mem tableiterator is at
// its end and before an SeekToLast() is called.
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* /*arg*/) {
internal_iter2_->Add("z", kTypeValue, "7", 12u);
internal_iter2_->Add("z", kTypeValue, "7", 11u);
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace3) {
// Test Prev() when one child iterator is at its end but more rows
// are added and max_skipped is triggered.
db_iter_->Seek("f");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
// Test call back inserts entries for update a key in the end of the
// mem table after MergeIterator::Prev() realized the mem table iterator is at
// its end and before an SeekToLast() is called.
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* /*arg*/) {
internal_iter2_->Add("z", kTypeValue, "7", 16u, true);
internal_iter2_->Add("z", kTypeValue, "7", 15u, true);
internal_iter2_->Add("z", kTypeValue, "7", 14u, true);
internal_iter2_->Add("z", kTypeValue, "7", 13u, true);
internal_iter2_->Add("z", kTypeValue, "7", 12u, true);
internal_iter2_->Add("z", kTypeValue, "7", 11u, true);
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace4) {
// Test Prev() when one child iterator has more rows inserted
// between Seek() and Prev() when changing directions.
internal_iter2_->Add("z", kTypeValue, "9", 4u);
db_iter_->Seek("g");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
// Test call back inserts entries for update a key before "z" in
// mem table after MergeIterator::Prev() calls mem table iterator's
// Seek() and before calling Prev()
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* arg) {
IteratorWrapper* it = reinterpret_cast<IteratorWrapper*>(arg);
if (it->key().starts_with("z")) {
internal_iter2_->Add("x", kTypeValue, "7", 16u, true);
internal_iter2_->Add("x", kTypeValue, "7", 15u, true);
internal_iter2_->Add("x", kTypeValue, "7", 14u, true);
internal_iter2_->Add("x", kTypeValue, "7", 13u, true);
internal_iter2_->Add("x", kTypeValue, "7", 12u, true);
internal_iter2_->Add("x", kTypeValue, "7", 11u, true);
}
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace5) {
internal_iter2_->Add("z", kTypeValue, "9", 4u);
// Test Prev() when one child iterator has more rows inserted
// between Seek() and Prev() when changing directions.
db_iter_->Seek("g");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
// Test call back inserts entries for update a key before "z" in
// mem table after MergeIterator::Prev() calls mem table iterator's
// Seek() and before calling Prev()
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* arg) {
IteratorWrapper* it = reinterpret_cast<IteratorWrapper*>(arg);
if (it->key().starts_with("z")) {
internal_iter2_->Add("x", kTypeValue, "7", 16u, true);
internal_iter2_->Add("x", kTypeValue, "7", 15u, true);
}
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace6) {
internal_iter2_->Add("z", kTypeValue, "9", 4u);
// Test Prev() when one child iterator has more rows inserted
// between Seek() and Prev() when changing directions.
db_iter_->Seek("g");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
// Test call back inserts an entry for update a key before "z" in
// mem table after MergeIterator::Prev() calls mem table iterator's
// Seek() and before calling Prev()
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* arg) {
IteratorWrapper* it = reinterpret_cast<IteratorWrapper*>(arg);
if (it->key().starts_with("z")) {
internal_iter2_->Add("x", kTypeValue, "7", 16u, true);
}
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace7) {
internal_iter1_->Add("u", kTypeValue, "10", 4u);
internal_iter1_->Add("v", kTypeValue, "11", 4u);
internal_iter1_->Add("w", kTypeValue, "12", 4u);
internal_iter2_->Add("z", kTypeValue, "9", 4u);
// Test Prev() when one child iterator has more rows inserted
// between Seek() and Prev() when changing directions.
db_iter_->Seek("g");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
// Test call back inserts entries for update a key before "z" in
// mem table after MergeIterator::Prev() calls mem table iterator's
// Seek() and before calling Prev()
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* arg) {
IteratorWrapper* it = reinterpret_cast<IteratorWrapper*>(arg);
if (it->key().starts_with("z")) {
internal_iter2_->Add("x", kTypeValue, "7", 16u, true);
internal_iter2_->Add("x", kTypeValue, "7", 15u, true);
internal_iter2_->Add("x", kTypeValue, "7", 14u, true);
internal_iter2_->Add("x", kTypeValue, "7", 13u, true);
internal_iter2_->Add("x", kTypeValue, "7", 12u, true);
internal_iter2_->Add("x", kTypeValue, "7", 11u, true);
}
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "c");
ASSERT_EQ(db_iter_->value().ToString(), "6");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "b");
ASSERT_EQ(db_iter_->value().ToString(), "5");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "a");
ASSERT_EQ(db_iter_->value().ToString(), "4");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIterWithMergeIterTest, InnerMergeIteratorDataRace8) {
// internal_iter1_: a, f, g
// internal_iter2_: a, b, c, d, adding (z)
internal_iter2_->Add("z", kTypeValue, "9", 4u);
// Test Prev() when one child iterator has more rows inserted
// between Seek() and Prev() when changing directions.
db_iter_->Seek("g");
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "g");
ASSERT_EQ(db_iter_->value().ToString(), "3");
// Test call back inserts two keys before "z" in mem table after
// MergeIterator::Prev() calls mem table iterator's Seek() and
// before calling Prev()
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"MergeIterator::Prev:BeforePrev", [&](void* arg) {
IteratorWrapper* it = reinterpret_cast<IteratorWrapper*>(arg);
if (it->key().starts_with("z")) {
internal_iter2_->Add("x", kTypeValue, "7", 16u, true);
internal_iter2_->Add("y", kTypeValue, "7", 17u, true);
}
});
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "f");
ASSERT_EQ(db_iter_->value().ToString(), "2");
db_iter_->Prev();
ASSERT_TRUE(db_iter_->Valid());
ASSERT_EQ(db_iter_->key().ToString(), "d");
ASSERT_EQ(db_iter_->value().ToString(), "7");
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
TEST_F(DBIteratorTest, SeekPrefixTombstones) {
ReadOptions ro;
Options options;
options.prefix_extractor.reset(NewNoopTransform());
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddDeletion("b");
internal_iter->AddDeletion("c");
internal_iter->AddDeletion("d");
internal_iter->AddDeletion("e");
internal_iter->AddDeletion("f");
internal_iter->AddDeletion("g");
internal_iter->Finish();
ro.prefix_same_as_start = true;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
int skipped_keys = 0;
get_perf_context()->Reset();
db_iter->SeekForPrev("z");
skipped_keys =
static_cast<int>(get_perf_context()->internal_key_skipped_count);
ASSERT_EQ(skipped_keys, 0);
get_perf_context()->Reset();
db_iter->Seek("a");
skipped_keys =
static_cast<int>(get_perf_context()->internal_key_skipped_count);
ASSERT_EQ(skipped_keys, 0);
}
TEST_F(DBIteratorTest, SeekToFirstLowerBound) {
const int kNumKeys = 3;
for (int i = 0; i < kNumKeys + 2; ++i) {
// + 2 for two special cases: lower bound before and lower bound after the
// internal iterator's keys
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (int j = 1; j <= kNumKeys; ++j) {
internal_iter->AddPut(std::to_string(j), "val");
}
internal_iter->Finish();
ReadOptions ro;
auto lower_bound_str = std::to_string(i);
Slice lower_bound(lower_bound_str);
ro.iterate_lower_bound = &lower_bound;
Options options;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToFirst();
if (i == kNumKeys + 1) {
// lower bound was beyond the last key
ASSERT_FALSE(db_iter->Valid());
ASSERT_OK(db_iter->status());
} else {
ASSERT_TRUE(db_iter->Valid());
int expected;
if (i == 0) {
// lower bound was before the first key
expected = 1;
} else {
// lower bound was at the ith key
expected = i;
}
ASSERT_EQ(std::to_string(expected), db_iter->key().ToString());
}
}
}
TEST_F(DBIteratorTest, PrevLowerBound) {
const int kNumKeys = 3;
const int kLowerBound = 2;
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (int j = 1; j <= kNumKeys; ++j) {
internal_iter->AddPut(std::to_string(j), "val");
}
internal_iter->Finish();
ReadOptions ro;
auto lower_bound_str = std::to_string(kLowerBound);
Slice lower_bound(lower_bound_str);
ro.iterate_lower_bound = &lower_bound;
Options options;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
db_iter->SeekToLast();
for (int i = kNumKeys; i >= kLowerBound; --i) {
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(std::to_string(i), db_iter->key().ToString());
db_iter->Prev();
}
ASSERT_FALSE(db_iter->Valid());
}
TEST_F(DBIteratorTest, SeekLessLowerBound) {
const int kNumKeys = 3;
const int kLowerBound = 2;
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
for (int j = 1; j <= kNumKeys; ++j) {
internal_iter->AddPut(std::to_string(j), "val");
}
internal_iter->Finish();
ReadOptions ro;
auto lower_bound_str = std::to_string(kLowerBound);
Slice lower_bound(lower_bound_str);
ro.iterate_lower_bound = &lower_bound;
Options options;
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ro, ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
auto before_lower_bound_str = std::to_string(kLowerBound - 1);
Slice before_lower_bound(lower_bound_str);
db_iter->Seek(before_lower_bound);
ASSERT_TRUE(db_iter->Valid());
ASSERT_EQ(lower_bound_str, db_iter->key().ToString());
}
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
TEST_F(DBIteratorTest, ReverseToForwardWithDisappearingKeys) {
Options options;
options.prefix_extractor.reset(NewCappedPrefixTransform(0));
TestIterator* internal_iter = new TestIterator(BytewiseComparator());
internal_iter->AddPut("a", "A");
internal_iter->AddPut("b", "B");
for (int i = 0; i < 100; ++i) {
internal_iter->AddPut("c" + ToString(i), "");
}
internal_iter->Finish();
std::unique_ptr<Iterator> db_iter(NewDBIterator(
env_, ReadOptions(), ImmutableOptions(options), MutableCFOptions(options),
BytewiseComparator(), internal_iter, nullptr /* version */,
10 /* sequence */, options.max_sequential_skip_in_iterations,
nullptr /* read_callback */));
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs Summary: Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are: * If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted. * When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not. However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours). This PR changes the convention to: * If status() is not ok, Valid() always returns false. * Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.) This does sacrifice the two use cases listed above, but siying said it's ok. Overview of the changes: * A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. `DBIteratorTest.NonBlockingIterationBugRepro` explains the scenario. * Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed. * A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator. To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types: Iterators that didn't need changes: * status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator. * Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator. Iterators with changes (see inline comments for details): * DBIter - an overhaul: - It used to silently skip corrupted keys (`FindParseableKey()`), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back. - It had a few code paths silently discarding subiterator's status. The stress test caught a few. - The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example. - Some parts of backwards iteration code path even did things like `assert(iter_->Valid())` after a seek, which is never a safe assumption. - It used to not reset status on seek for some types of errors. - Some simplifications and better comments. - Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer. * MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status. * ForwardIterator - changed to the new convention, also slightly simplified. * ForwardLevelIterator - fixed some bugs and simplified. * LevelIterator - simplified. * TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_. * BlockBasedTableIterator - minor changes. * BlockIter - replaced `SetStatus()` with `Invalidate()` to make sure non-ok BlockIter is always invalid. * PlainTableIterator - some seeks used to not reset status. * CuckooTableIterator - tiny code cleanup. * ManagedIterator - fixed some bugs. * BaseDeltaIterator - changed to the new convention and fixed a bug. * BlobDBIterator - seeks used to not reset status. * KeyConvertingIterator - some small change. Closes https://github.com/facebook/rocksdb/pull/3810 Differential Revision: D7888019 Pulled By: al13n321 fbshipit-source-id: 4aaf6d3421c545d16722a815b2fa2e7912bc851d
7 years ago
db_iter->SeekForPrev("a");
ASSERT_TRUE(db_iter->Valid());
ASSERT_OK(db_iter->status());
ASSERT_EQ("a", db_iter->key().ToString());
internal_iter->Vanish("a");
db_iter->Next();
ASSERT_TRUE(db_iter->Valid());
ASSERT_OK(db_iter->status());
ASSERT_EQ("b", db_iter->key().ToString());
// A (sort of) bug used to cause DBIter to pointlessly drag the internal
// iterator all the way to the end. But this doesn't really matter at the time
// of writing because the only iterator that can see disappearing keys is
// ForwardIterator, which doesn't support SeekForPrev().
EXPECT_LT(internal_iter->steps(), 20);
}
} // namespace ROCKSDB_NAMESPACE
rocksdb: switch to gtest Summary: Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different. In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest. There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides. ```lang=bash % cat ~/transform #!/bin/sh files=$(git ls-files '*test\.cc') for file in $files do if grep -q "rocksdb::test::RunAllTests()" $file then if grep -Eq '^class \w+Test {' $file then perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file perl -pi -e 's/^(TEST)/${1}_F/g' $file fi perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file fi done % sh ~/transform % make format ``` Second iteration of this diff contains only scripted changes. Third iteration contains manual changes to fix last errors and make it compilable. Test Plan: Build and notice no errors. ```lang=bash % USE_CLANG=1 make check -j55 ``` Tests are still testing. Reviewers: meyering, sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D35157
10 years ago
int main(int argc, char** argv) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}