Pessimistic Transactions
Summary:
Initial implementation of Pessimistic Transactions. This diff contains the api changes discussed in D38913. This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
MyRocks folks: please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint(). After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex. We can then decide which route is preferable.
Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
Test Plan: Unit tests, db_bench parallel testing.
Reviewers: igor, rven, sdong, yhchiang, yoshinorim
Reviewed By: sdong
Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40869
10 years ago
// Copyright (c) 2015, Facebook, Inc. All rights reserved.
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree. An additional grant
// of patent rights can be found in the PATENTS file in the same directory.

#pragma once

#ifndef ROCKSDB_LITE

#include <string>
#include <unordered_map>

#include "rocksdb/db.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "rocksdb/types.h"

namespace rocksdb {

using TransactionKeyMap =
    std::unordered_map<uint32_t,
                       std::unordered_map<std::string, SequenceNumber>>;

class DBImpl;
struct SuperVersion;
class WriteBatchWithIndex;

class TransactionUtil {
 public:
  // Verifies there have been no writes to this key in the db since this
  // sequence number.
  //
  // Returns OK on success, BUSY if there is a conflicting write, or other error
  // status for any unexpected errors.
  static Status CheckKeyForConflicts(DBImpl* db_impl,
                                     ColumnFamilyHandle* column_family,
                                     const std::string& key,
                                     SequenceNumber key_seq);

  // For each key,SequenceNumber pair in the TransactionKeyMap, this function
  // will verify there have been no writes to the key in the db since that
  // sequence number.
  //
  // Returns OK on success, BUSY if there is a conflicting write, or other error
  // status for any unexpected errors.
  //
  // REQUIRED: this function should only be called on the write thread or if the
  // mutex is held.
  static Status CheckKeysForConflicts(DBImpl* db_impl,
                                      const TransactionKeyMap& keys);

 private:
  static Status CheckKey(DBImpl* db_impl, SuperVersion* sv,
                         SequenceNumber earliest_seq, SequenceNumber key_seq,
                         const std::string& key);
};

}  // namespace rocksdb

#endif  // ROCKSDB_LITE
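The comments on CheckKeyForConflicts() and CheckKeysForConflicts() describe a sequence-number comparison: a tracked key conflicts if the db has seen a write to it after the sequence number recorded for that key. The sketch below illustrates that idea against a toy in-memory map rather than a real DBImpl. Every name in it (SeqNo, CheckResult, LatestWrites, TrackedKeys, CheckKeySketch, CheckKeysSketch) is hypothetical and not part of RocksDB, and it assumes that "since this sequence number" means a strictly greater sequence number; the actual implementation may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; not RocksDB types.
using SeqNo = uint64_t;
enum class CheckResult { kOk, kBusy };

// Toy "database": column family id -> key -> sequence number of the most
// recent write to that key.
using LatestWrites =
    std::unordered_map<uint32_t, std::unordered_map<std::string, SeqNo>>;

// Same shape as TransactionKeyMap in the header: column family id -> key ->
// sequence number at which the transaction started tracking the key.
using TrackedKeys =
    std::unordered_map<uint32_t, std::unordered_map<std::string, SeqNo>>;

// Returns kBusy if the toy db recorded a write to `key` with a sequence
// number strictly greater than `key_seq` (assumed meaning of "since").
CheckResult CheckKeySketch(const LatestWrites& db, uint32_t cf_id,
                           const std::string& key, SeqNo key_seq) {
  auto cf_it = db.find(cf_id);
  if (cf_it == db.end()) return CheckResult::kOk;  // no writes in this CF
  auto key_it = cf_it->second.find(key);
  if (key_it == cf_it->second.end()) return CheckResult::kOk;  // never written
  return key_it->second > key_seq ? CheckResult::kBusy : CheckResult::kOk;
}

// Checks every tracked (key, sequence number) pair and stops at the first
// conflict.
CheckResult CheckKeysSketch(const LatestWrites& db, const TrackedKeys& keys) {
  for (const auto& cf : keys) {
    for (const auto& kv : cf.second) {
      if (CheckKeySketch(db, cf.first, kv.first, kv.second) ==
          CheckResult::kBusy) {
        return CheckResult::kBusy;
      }
    }
  }
  return CheckResult::kOk;
}
```

In this toy model a key that was last written at or before its tracked sequence number passes the check, while any later write makes the whole batch of tracked keys report a conflict. The real functions also return other error statuses and, for CheckKeysForConflicts(), must run on the write thread or with the mutex held; neither aspect is modeled here.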