rocksdb

Commit Graph

Author	SHA1	Message	Date
Manuel Ung	e03377c7fd	Add lock wait time as a perf context counter Summary: Adds two new counters: `key_lock_wait_count` counts how many times a lock was blocked by another transaction and had to wait, instead of being granted the lock immediately. `key_lock_wait_time` counts the time spent acquiring locks. Closes https://github.com/facebook/rocksdb/pull/3107 Differential Revision: D6217332 Pulled By: lth fbshipit-source-id: 55d4f46da5550c333e523263422fd61d6a46deb9	8 years ago
Yi Wu	be410dede8	Fix PinnableSlice move assignment Summary: After move assignment, we need to re-initialized the moved PinnableSlice. Also update blob_db_impl.cc to not reuse the moved PinnableSlice since it is supposed to be in an undefined state after move. Closes https://github.com/facebook/rocksdb/pull/3127 Differential Revision: D6238585 Pulled By: yiwu-arbug fbshipit-source-id: bd99f2e37406c4f7de160c7dee6a2e8126bc224e	8 years ago
Yi Wu	2581c0a5a1	Blob DB: Fix BlobDBTest::SnapshotAndGarbageCollection asan failure Summary: Fix unreleased snapshot at the end of the test. Closes https://github.com/facebook/rocksdb/pull/3126 Differential Revision: D6232867 Pulled By: yiwu-arbug fbshipit-source-id: 651ca3144fc573ea2ab0ab20f0a752fb4a101d26	8 years ago
Yi Wu	62578d80c1	Blob DB: Add compaction filter to remove expired blob index entries Summary: After adding expiration to blob index in #3066, we are now able to add a compaction filter to cleanup expired blob index entries. Closes https://github.com/facebook/rocksdb/pull/3090 Differential Revision: D6183812 Pulled By: yiwu-arbug fbshipit-source-id: 9cb03267a9702975290e758c9c176a2c03530b83	8 years ago
Yi Wu	7bfa88037e	Blob DB: fix snapshot handling Summary: Blob db will keep blob file if data in the file is visible to an active snapshot. Before this patch it checks whether there is an active snapshot has sequence number greater than the earliest sequence in the file. This is problematic since we take snapshot on every read, if it keep having reads, old blob files will not be cleanup. Change to check if there is an active snapshot falls in the range of [earliest_sequence, obsolete_sequence) where obsolete sequence is 1. if data is relocated to another file by garbage collection, it is the latest sequence at the time garbage collection finish 2. otherwise, it is the latest sequence of the file Closes https://github.com/facebook/rocksdb/pull/3087 Differential Revision: D6182519 Pulled By: yiwu-arbug fbshipit-source-id: cdf4c35281f782eb2a9ad6a87b6727bbdff27a45	8 years ago
Yi Wu	f662f8f0b6	Blob DB: option to enable garbage collection Summary: Add an option to enable/disable auto garbage collection, where we keep counting how many keys have been evicted by either deletion or compaction and decide whether to garbage collect a blob file. Default disable auto garbage collection for now since the whole logic is not fully tested and we plan to make major change to it. Closes https://github.com/facebook/rocksdb/pull/3117 Differential Revision: D6224756 Pulled By: yiwu-arbug fbshipit-source-id: cdf53bdccec96a4580a2b3a342110ad9e8864dfe	8 years ago
Yi Wu	167ba599ec	Blob DB: Fix flaky BlobDBTest::GCExpiredKeyWhileOverwriting test Summary: The test intent to wait until key being overwritten until proceed with garbage collection. It failed to wait for `PutUntil` finally finish. Fixing it. Closes https://github.com/facebook/rocksdb/pull/3116 Differential Revision: D6222833 Pulled By: yiwu-arbug fbshipit-source-id: fa9b57a772b92a66cf250b44e7975c43f62f45c5	8 years ago
Sagar Vemuri	25ac1697b4	Blob DB: Evict oldest blob file when close to blob db size limit Summary: Evict oldest blob file and put it in obsolete_files list when close to blob db size limit. The file will be delete when the `DeleteObsoleteFiles` background job runs next time. For now I set `kEvictOldestFileAtSize` constant, which controls when to evict the oldest file, at 90%. It could be tweaked or made into an option if really needed; I didn't want to expose it as an option pre-maturely as there are already too many :) . Closes https://github.com/facebook/rocksdb/pull/3094 Differential Revision: D6187340 Pulled By: sagar0 fbshipit-source-id: 687f8262101b9301bf964b94025a2fe9d8573421	8 years ago
Maysam Yabandeh	60d83df23d	WritePrepared Txn: Move DB class to its own file Summary: Move WritePreparedTxnDB from pessimistic_transaction_db.h to its own header, write_prepared_txn_db.h Closes https://github.com/facebook/rocksdb/pull/3114 Differential Revision: D6220987 Pulled By: maysamyabandeh fbshipit-source-id: 18893fb4fdc6b809fe117dabb544080f9b4a301b	8 years ago
Maysam Yabandeh	02693f64fc	WritePrepared Txn: ValidateSnapshot Summary: Implements ValidateSnapshot for WritePrepared txns and also adds a unit test to clarify the contract of this function. Closes https://github.com/facebook/rocksdb/pull/3101 Differential Revision: D6199405 Pulled By: maysamyabandeh fbshipit-source-id: ace509934c307ea5d26f4bbac5f836d7c80fd240	8 years ago
Maysam Yabandeh	17731a43a6	WritePrepared Txn: Optimize for recoverable state Summary: GetCommitTimeWriteBatch is currently used to store some state as part of commit in 2PC. In MyRocks it is specifically used to store some data that would be needed only during recovery. So it is not need to be stored in memtable right after each commit. This patch enables an optimization to write the GetCommitTimeWriteBatch only to the WAL. The batch will be written to memtable during recovery when the WAL is replayed. To cover the case when WAL is deleted after memtable flush, the batch is also buffered and written to memtable right before each memtable flush. Closes https://github.com/facebook/rocksdb/pull/3071 Differential Revision: D6148023 Pulled By: maysamyabandeh fbshipit-source-id: 2d09bae5565abe2017c0327421010d5c0d55eaa7	8 years ago
Maysam Yabandeh	c1cf94c787	WritePrepared Txn: sort indexes before batch collapse Summary: The collapse of duplicate keys in write batch needs to sort the indexes of duplicate keys since it only checks the index in the batch with the head of the list of duplicate keys. Closes https://github.com/facebook/rocksdb/pull/3093 Differential Revision: D6186800 Pulled By: maysamyabandeh fbshipit-source-id: abc9ae8c2f1840445a5584f925cf86ecc6f37154	8 years ago
Yi Wu	f6082d1944	Blob DB: cleanup unused options Summary: * cleanup num_concurrent_simple_blobs. We don't do concurrent writes (by taking write_mutex_) so it doesn't make sense to have multiple non TTL files open. We can revisit later when we want to improve writes. * cleanup eviction callback. we don't have plan to use it now. * rename s/open_simple_blob_files_/open_non_ttl_file_/ and s/open_blob_files_/open_ttl_files_/ to avoid confusion. Closes https://github.com/facebook/rocksdb/pull/3088 Differential Revision: D6182598 Pulled By: yiwu-arbug fbshipit-source-id: 99e6f5e01fa66d31309cdb06ce48502464bac6ad	8 years ago
Sagar Vemuri	f5078dde2d	Blob DB: Initialize all fields in Blob Header, Footer and Record structs Summary: Fixing un-itializations caught by valgrind. Closes https://github.com/facebook/rocksdb/pull/3103 Differential Revision: D6200195 Pulled By: sagar0 fbshipit-source-id: bf35a3fb03eb1d308e4c5ce30dee1e345d7b03b3	8 years ago
Yi Wu	3ebb7ba7b9	Blob DB: update blob file format Summary: Changing blob file format and some code cleanup around the change. The change with blob log format are: * Remove timestamp field in blob file header, blob file footer and blob records. The field is not being use and often confuse with expiration field. * Blob file header now come with column family id, which always equal to default column family id. It leaves room for future support of column family. * Compression field in blob file header now is a standalone byte (instead of compact encode with flags field) * Blob file footer now come with its own crc. * Key length now being uint64_t instead of uint32_t * Blob CRC now checksum both key and value (instead of value only). * Some reordering of the fields. The list of cleanups: * Better inline comments in blob_log_format.h * rename ttlrange_t and snrange_t to ExpirationRange and SequenceRange respectively. * simplify blob_db::Reader * Move crc checking logic to inside blob_log_format.cc Closes https://github.com/facebook/rocksdb/pull/3081 Differential Revision: D6171304 Pulled By: yiwu-arbug fbshipit-source-id: e4373e0d39264441b7e2fbd0caba93ddd99ea2af	8 years ago
Yi Wu	5a2a6483dc	Blob DB: Inline small values in base DB Summary: Adding the `min_blob_size` option to allow storing small values in base db (in LSM tree) together with the key. The goal is to improve performance for small values, while taking advantage of blob db's low write amplification for large values. Also adding expiration timestamp to blob index. It will be useful to evict stale blob indexes in base db by adding a compaction filter. I'll work on the compaction filter in future patches. See blob_index.h for the new blob index format. There are 4 cases when writing a new key: * small value w/o TTL: put in base db as normal value (i.e. ValueType::kTypeValue) * small value w/ TTL: put (type, expiration, value) to base db. * large value w/o TTL: write value to blob log and put (type, file, offset, size, compression) to base db. * large value w/TTL: write value to blob log and put (type, expiration, file, offset, size, compression) to base db. Closes https://github.com/facebook/rocksdb/pull/3066 Differential Revision: D6142115 Pulled By: yiwu-arbug fbshipit-source-id: 9526e76e19f0839310a3f5f2a43772a4ad182cd0	8 years ago
Sagar Vemuri	96e3a600ba	Return write error on reaching blob dir size limit Summary: I found that we continue accepting writes even when the blob db goes beyond the configured blob directory size limit. Now, we return an error for writes on reaching `blob_dir_size` limit and if `is_fifo` is set to false. (We cannot just drop any file when `is_fifo` is true.) Deleting the oldest file when `is_fifo` is true will be handled in a later PR. Closes https://github.com/facebook/rocksdb/pull/3060 Differential Revision: D6136156 Pulled By: sagar0 fbshipit-source-id: 2f11cb3f2eedfa94524fbfa2613dd64bfad7a23c	8 years ago
zach shipko	386a57e6ef	Fix build on OpenBSD Summary: A few simple changes to allow RocksDB to be built on OpenBSD. Let me know if any further changes are needed. Closes https://github.com/facebook/rocksdb/pull/3061 Differential Revision: D6138800 Pulled By: ajkr fbshipit-source-id: a13a17b5dc051e6518bd56a8c5efd1d24dd81b0c	8 years ago
Yi Wu	66a2c44ef4	Add DB::Properties::kEstimateOldestKeyTime Summary: With FIFO compaction we would like to get the oldest data time for monitoring. The problem is we don't have timestamp for each key in the DB. As an approximation, we expose the earliest of sst file "creation_time" property. My plan is to override the property with a more accurate value with blob db, where we actually have timestamp. Closes https://github.com/facebook/rocksdb/pull/2842 Differential Revision: D5770600 Pulled By: yiwu-arbug fbshipit-source-id: 03833c8f10bbfbee62f8ea5c0d03c0cafb5d853a	8 years ago
Dmitri Smirnov	d2a65c59e1	Fix unused var warnings in Release mode Summary: MSVC does not support unused attribute at this time. A separate assignment line fixes the issue probably by being counted as usage for MSVC and it no longer complains about unused var. Closes https://github.com/facebook/rocksdb/pull/3048 Differential Revision: D6126272 Pulled By: maysamyabandeh fbshipit-source-id: 4907865db45fd75a39a15725c0695aaa17509c1f	8 years ago
Maysam Yabandeh	63822eb761	Enable two write queues for transactions Summary: Enable concurrent_prepare flag for WritePrepared transactions and extend the existing transaction tests with this config. Closes https://github.com/facebook/rocksdb/pull/3046 Differential Revision: D6106534 Pulled By: maysamyabandeh fbshipit-source-id: 88c8d21d45bc492beb0a131caea84a2ac5e7d38c	8 years ago
Dmitri Smirnov	ebab2e2d42	Enable MSVC W4 with a few exceptions. Fix warnings and bugs Summary: Closes https://github.com/facebook/rocksdb/pull/3018 Differential Revision: D6079011 Pulled By: yiwu-arbug fbshipit-source-id: 988a721e7e7617967859dba71d660fc69f4dff57	8 years ago
Maysam Yabandeh	7e38238981	WritePrepared Txn: Disable GC during recovery Summary: Disables GC during recovery of a WritePrepared txn db to avoid GCing uncommitted key values. Closes https://github.com/facebook/rocksdb/pull/2980 Differential Revision: D6000191 Pulled By: maysamyabandeh fbshipit-source-id: fc4d522c643d24ebf043f811fe4ecd0dd0294675	8 years ago
Yi Wu	eaaef91178	Blob DB: Store blob index as kTypeBlobIndex in base db Summary: Blob db insert blob index to base db as kTypeBlobIndex type, to tell apart values written by plain rocksdb or blob db. This is to make it possible to migrate from existing rocksdb to blob db. Also with the patch blob db garbage collection get away from OptimisticTransaction. Instead it use a custom write callback to achieve similar behavior as OptimisticTransaction. This is because we need to pass the is_blob_index flag to DBImpl::Get but OptimisticTransaction don't support it. Closes https://github.com/facebook/rocksdb/pull/3000 Differential Revision: D6050044 Pulled By: yiwu-arbug fbshipit-source-id: 61dc72ab9977625e75f78cd968e7d8a3976e3632	8 years ago
Yi Wu	0552029b5c	Blob DB: not writing sequence number as blob record footer Summary: Previously each time we write a blob we write blog_record_header + key + value + blob_record_footer to blob log. The footer only contains a sequence and a crc for the sequence number. The sequence number was used in garbage collection to verify the value is recent. After #2703 we moved to use optimistic transaction and no longer use sequence number from the footer. Remove the footer altogether. There's another usage of sequence number and we are keeping it: Each blob log file keep track of sequence number range of keys in it, and use it to check if it is reference by a snapshot, before being deleted. Closes https://github.com/facebook/rocksdb/pull/3005 Differential Revision: D6057585 Pulled By: yiwu-arbug fbshipit-source-id: d6da53c457a316e9723f359a1b47facfc3ffe090	8 years ago
Yi Wu	10ba50e9eb	Blob DB: Move BlobFile definition to a separate file Summary: simply move BlobFile definition from blob_db_impl.h to blob_file.h. Closes https://github.com/facebook/rocksdb/pull/3002 Differential Revision: D6050143 Pulled By: yiwu-arbug fbshipit-source-id: a8fb6e094fe39bdeace6279569834bc65aa64a34	8 years ago
Zhongyi Xie	e2548366e1	add GetLiveFiles and GetLiveFilesMetaData for BlobDB Summary: Closes https://github.com/facebook/rocksdb/pull/2976 Differential Revision: D5994759 Pulled By: miasantreble fbshipit-source-id: 985c31dccb957cb970c302f813cd07a1e8cb6438	8 years ago
Yi Wu	8c392a31d7	WritePrepared Txn: Iterator Summary: On iterator create, take a snapshot, create a ReadCallback and pass the ReadCallback to the underlying DBIter to check if key is committed. Closes https://github.com/facebook/rocksdb/pull/2981 Differential Revision: D6001471 Pulled By: yiwu-arbug fbshipit-source-id: 3565c4cdaf25370ba47008b0e0cb65b31dfe79fe	8 years ago
Yi Wu	17c6325e8a	WritePrepare Txn: Cancel flush/compaction before destruction Summary: On WritePreparedTxnDB destruct there could be running compaction/flush holding a SnapshotChecker, which holds a pointer back to WritePreparedTxnDB. Make sure those jobs finished before destructing WritePreparedTxnDB. This is caught by TransactionTest::SeqAdvanceTest. Closes https://github.com/facebook/rocksdb/pull/2982 Differential Revision: D6002957 Pulled By: yiwu-arbug fbshipit-source-id: f1e70390c9798d1bd7959f5c8e2a1c14100773c3	8 years ago
Maysam Yabandeh	ec6c5383d0	WritePrepared Txn: end-to-end tests Summary: Enable WritePrepared policy for existing transaction tests. Closes https://github.com/facebook/rocksdb/pull/2972 Differential Revision: D5993614 Pulled By: maysamyabandeh fbshipit-source-id: d1eb53e2920c4e2a56434bb001231c98426f3509	8 years ago
Sagar Vemuri	da29eba43b	Enable WAL for blob index Summary: Enabled WAL, during GC, for blob index which is stored on regular RocksDB. Closes https://github.com/facebook/rocksdb/pull/2975 Differential Revision: D5997384 Pulled By: sagar0 fbshipit-source-id: b76c1487d8b5be0e36c55e8d77ffe3d37d63d85b	8 years ago
Yi Wu	d1b74b0c82	WritePrepared Txn: Compaction/Flush Summary: Update Compaction/Flush to support WritePreparedTxnDB: Add SnapshotChecker which is a proxy to query WritePreparedTxnDB::IsInSnapshot. Pass SnapshotChecker to DBImpl on WritePreparedTxnDB open. CompactionIterator use it to check if a key has been committed and if it is visible to a snapshot. In CompactionIterator: * check if key has been committed. If not, output uncommitted keys AS-IS. * use SnapshotChecker to check if key is visible to a snapshot when in need. * do not output key with seq = 0 if the key is not committed. Closes https://github.com/facebook/rocksdb/pull/2926 Differential Revision: D5902907 Pulled By: yiwu-arbug fbshipit-source-id: 945e037fdf0aa652dc5ba0ad879461040baa0320	8 years ago
Yi Wu	cc20ec3689	WritePrepared Txn: Test sequence number 0 is visible Summary: Compaction will output keys with sequence number 0, if it is visible to earliest snapshot. Adding a test to make sure IsInSnapshot() report sequence number 0 is visible to any snapshot. Closes https://github.com/facebook/rocksdb/pull/2974 Differential Revision: D5990665 Pulled By: yiwu-arbug fbshipit-source-id: ef50ebc777ff8ca688771f3ab598c7a609b0b65e	8 years ago
Maysam Yabandeh	4e3c3d8c6a	WritePrepared Txn: duplicate keys Summary: With WriteCommitted, when the write batch has duplicate keys, the txn db simply inserts them to the db with different seq numbers and let the db ignore/merge the duplicate values at the read time. With WritePrepared all the entries of the batch are inserted with the same seq number which prevents us from benefiting from this simple solution. This patch applies a hackish solution to unblock the end-to-end testing. The hack is to be replaced with a proper solution soon. The patch simply detects the duplicate key insertions, and mark the previous one as obsolete. Then before writing to the db it rewrites the batch eliminating the obsolete keys. This would incur a memcpy cost. Furthermore handing duplicate merge would require to do FullMerge instead of simply ignoring the previous value, which is not handled by this patch. Closes https://github.com/facebook/rocksdb/pull/2969 Differential Revision: D5976337 Pulled By: maysamyabandeh fbshipit-source-id: 114e65b66f137d8454ff2d1d782b8c05da95f989	8 years ago
Maysam Yabandeh	283d60761e	fix valgrind leak report in unit test Summary: I cannot locally reproduce the valgrind leak report but based on my code inspection not deleting txn1 might be the reason. ``` ==197848== 2,990 (544 direct, 2,446 indirect) bytes in 1 blocks are definitely lost in loss record 15 of 16 ==197848== at 0x4C2D06F: operator new(unsigned long) (in /usr/local/fbcode/gcc-5-glibc-2.23/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==197848== by 0x7D5B31: rocksdb::WritePreparedTxnDB::BeginTransaction(rocksdb::WriteOptions const&, rocksdb::TransactionOptions const&, rocksdb::Transaction) (pessimistic_transaction_db.cc:173) ==197848== by 0x7D80C1: rocksdb::PessimisticTransactionDB::Initialize(std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<rocksdb::ColumnFamilyHandle, std::allocator<rocksdb::ColumnFamilyHandle> > const&) (pessimistic_transaction_db.cc:115) ==197848== by 0x7DC42F: rocksdb::WritePreparedTxnDB::Initialize(std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<rocksdb::ColumnFamilyHandle, std::allocator<rocksdb::ColumnFamilyHandle> > const&) (pessimistic_transaction_db.cc:151) ==197848== by 0x7D8CA0: rocksdb::TransactionDB::WrapDB(rocksdb::DB, rocksdb::TransactionDBOptions const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<rocksdb::ColumnFamilyHandle, std::allocator<rocksdb::ColumnFamilyHandle> > const&, rocksdb::TransactionDB*) (pessimistic_transaction_db.cc:275) ==197848== by 0x7D9F26: rocksdb::TransactionDB::Open(rocksdb::DBOptions const&, rocksdb::TransactionDBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle, std::allocator<rocksdb::ColumnFamilyHandle> >, rocksdb::TransactionDB) (pessimistic_transaction_db.cc:227) ==197848== by 0x7DB349: rocksdb::TransactionDB::Open(rocksdb::Options const&, rocksdb::TransactionDBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::TransactionDB) (pessimistic_transaction_db.cc:198) ==197848== by 0x52ABD2: rocksdb::TransactionTest::ReOpenNoDelete() (transaction_test.h:87) ==197848== by 0x51F7B8: rocksdb::WritePreparedTransactionTest_BasicRecoveryTest_Test::TestBody() (write_prepared_transaction_test.cc:843) ==197848== by 0x857557: HandleSehExceptionsInMethodIfSupported<testing::Test, void> (gtest-all.cc:3824) ==197848== by 0x857557: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const*) (gtest-all.cc:3860) ==197848== by 0x84E7EB: testing::Test::Run() [clone .part.485] (gtest-all.cc:3897) ==197848== by 0x84E9BC: Run (gtest-all.cc:3888) ==197848== by 0x84E9BC: testing::TestInfo::Run() [clone .part.486] (gtest-all.cc:4072) ``` Closes https://github.com/facebook/rocksdb/pull/2963 Differential Revision: D5968856 Pulled By: maysamyabandeh fbshipit-source-id: 2ac512bbcad37dc8eeeffe4f363978913354180c	8 years ago
Maysam Yabandeh	d27258d3a6	WritePrepared Txn: Rollback Summary: Implement the rollback of WritePrepared txns. For each modified value, it reads the value before the txn and write it back. This would cancel out the effect of transaction. It also remove the rolled back txn from prepared heap. Closes https://github.com/facebook/rocksdb/pull/2946 Differential Revision: D5937575 Pulled By: maysamyabandeh fbshipit-source-id: a6d3c47f44db3729f44b287a80f97d08dc4e888d	8 years ago
Sagar Vemuri	bb38cd03a9	Limit number of merge operands in Cassandra merge operator Summary: Now that RocksDB supports conditional merging during point lookups (introduced in #2923), Cassandra value merge operator can be updated to pass in a limit. The limit needs to be passed in from the Cassandra code. Closes https://github.com/facebook/rocksdb/pull/2947 Differential Revision: D5938454 Pulled By: sagar0 fbshipit-source-id: d64a72d53170d8cf202b53bd648475c3952f7d7f	8 years ago
Andrew Kryczka	5df172da2f	fix deletion-triggered compaction in table builder Summary: It was broken when `NotifyCollectTableCollectorsOnFinish` was introduced. That function called `Finish` on each of the `TablePropertiesCollector`s, and `CompactOnDeletionCollector::Finish()` was resetting all its internal state. Then, when we checked whether compaction is necessary, the flag had already been cleared. Fixed above issue by avoiding resetting internal state during `Finish()`. Multiple calls to `Finish()` are allowed, but callers cannot invoke `AddUserKey()` on the collector after any finishes. Closes https://github.com/facebook/rocksdb/pull/2936 Differential Revision: D5918659 Pulled By: ajkr fbshipit-source-id: 4f05e9d80e50ee762ba1e611d8d22620029dca6b	8 years ago
Maysam Yabandeh	385049baf2	WritePrepared Txn: Recovery Summary: Recover txns from the WAL. Also added some unit tests. Closes https://github.com/facebook/rocksdb/pull/2901 Differential Revision: D5859596 Pulled By: maysamyabandeh fbshipit-source-id: 6424967b231388093b4effffe0a3b1b7ec8caeb0	8 years ago
Quinn Jarrell	6a541afcc4	Make bytes_per_sync and wal_bytes_per_sync mutable Summary: SUMMARY Moves the bytes_per_sync and wal_bytes_per_sync options from immutableoptions to mutable options. Also if wal_bytes_per_sync is changed, the wal file and memtables are flushed. TEST PLAN ran make check all passed Two new tests SetBytesPerSync, SetWalBytesPerSync check that after issuing setoptions with a new value for the var, the db options have the new value. Closes https://github.com/facebook/rocksdb/pull/2893 Reviewed By: yiwu-arbug Differential Revision: D5845814 Pulled By: TheRushingWookie fbshipit-source-id: 93b52d779ce623691b546679dcd984a06d2ad1bd	8 years ago
Yi Wu	ec48e5c77f	Add TransactionDB::SingleDelete() Summary: Looks like the API is simply missing. Adding it. Closes https://github.com/facebook/rocksdb/pull/2937 Differential Revision: D5919955 Pulled By: yiwu-arbug fbshipit-source-id: 6e2e9c96c29882b0bb4113d1f8efb72bffc57878	8 years ago
Yi Wu	be97dbb15c	Fix WritePreparedTransactionTest::SeqAdvanceTest ASAN failure Summary: Closes https://github.com/facebook/rocksdb/pull/2922 Differential Revision: D5895310 Pulled By: yiwu-arbug fbshipit-source-id: 52c635a25d22478ec1eca49b6817551202babac2	8 years ago
Yi Wu	1480e6f7cf	Fix TransactionTest::SeqAdvanceTest ASAN failure Summary: The test didn't delete txn before creating a new one. Closes https://github.com/facebook/rocksdb/pull/2913 Differential Revision: D5880236 Pulled By: yiwu-arbug fbshipit-source-id: 7a4fcaada3d86332292754502cd8f4341143bf4f	8 years ago
Pengchao Wang	e4234fbdcf	collecting kValue type tombstone Summary: In our testing cluster, we found large amount tombstone has been promoted to kValue type from kMerge after reaching the top level of compaction. Since we used to only collecting tombstone in merge operator, those tombstones can never be collected. This PR addresses the issue by adding a GC step in compaction filter, which is only for kValue type records. Since those record already reached the top of compaction (no earlier data exists) we can safely remove them in compaction filter without worrying old data appears. This PR also removes an old optimization in cassandra merge operator for single merge operands. We need to do GC even on a single operand, so the optimation does not make sense anymore. Closes https://github.com/facebook/rocksdb/pull/2855 Reviewed By: sagar0 Differential Revision: D5806445 Pulled By: wpc fbshipit-source-id: 6eb25629d4ce917eb5e8b489f64a6aa78c7d270b	8 years ago
Maysam Yabandeh	60beefd6e0	WritePrepared Txn: Advance seq one per batch Summary: By default the seq number in DB is increased once per written key. WritePrepared txns requires the seq to be increased once per the entire batch so that the seq would be used as the prepare timestamp by which the transaction is identified. Also we need to increase seq for the commit marker since it would give a unique id to the commit timestamp of transactions. Two unit tests are added to verify our understanding of how the seq should be increased. The recovery path requires much more work and is left to another patch. Closes https://github.com/facebook/rocksdb/pull/2885 Differential Revision: D5837843 Pulled By: maysamyabandeh fbshipit-source-id: a08960b93d727e1cf438c254d0c2636fb133cc1c	8 years ago
Yi Wu	9a970c81af	Fix WriteBatchWithIndex::GetFromBatchAndDB not allowing StackableDB Summary: Closes https://github.com/facebook/rocksdb/pull/2881 Differential Revision: D5829682 Pulled By: yiwu-arbug fbshipit-source-id: abb8fa14b58cea7c416282f9be19e8b1a7961c6e	8 years ago
Maysam Yabandeh	09713a64b3	WritePrepared Txn: Lock-free CommitMap Summary: We had two proposals for lock-free commit maps. This patch implements the latter one that was simpler. We can later experiment with both proposals. In this impl each entry is an std::atomic of uint64_t, which are accessed via memory_order_acquire/release. In x86_64 arch this is compiled to simple reads and writes from memory. Closes https://github.com/facebook/rocksdb/pull/2861 Differential Revision: D5800724 Pulled By: maysamyabandeh fbshipit-source-id: 41abae9a4a5df050a8eb696c43de11c2770afdda	8 years ago
Amy Xu	5785b1fcb8	Fix naming in InternalKey Summary: - Switched all instances of SetMinPossibleForUserKey and SetMaxPossibleForUserKey in accordance to InternalKeyComparator's comparison logic Closes https://github.com/facebook/rocksdb/pull/2868 Differential Revision: D5804152 Pulled By: axxufb fbshipit-source-id: 80be35e04f2e8abc35cc64abe1fecb03af24e183	8 years ago
Andrew Kryczka	f5148ade10	support opening zero backups during engine init Summary: There are internal users who open BackupEngine for writing new backups only, and they don't care whether old backups can be read or not. The condition `BackupableDBOptions::max_valid_backups_to_open == 0` should be supported (previously in `df74b775e6` I made the mistake of choosing 0 as a special value to disable the limit). Closes https://github.com/facebook/rocksdb/pull/2819 Differential Revision: D5751599 Pulled By: ajkr fbshipit-source-id: e73ac19eb5d756d6b68601eae8e43407ee4f2752	8 years ago
Siying Dong	64b6452e0c	Make InternalKeyComparator final and directly use it in merging iterator Summary: Merging iterator invokes InternalKeyComparator.Compare() frequently to heap merge. By making InternalKeyComparator final and merging iterator to directly use InternalKeyComparator rather than through Iterator interface, we can give compiler a choice to avoid one more virtual function call if possible. I ran readseq benchmark in memory-only use case to make sure the performance at least doesn't regress. I have to disable the final key word in debug build, as a hack test class depends on overriding the class. Closes https://github.com/facebook/rocksdb/pull/2860 Differential Revision: D5800461 Pulled By: siying fbshipit-source-id: ab876f22a09bb5c560740911412336e0e25ccb53	8 years ago

... 4 5 6 7 8 ...

952 Commits (f5576c33173f3ef27fe9ba1d71beeb6f1aa15c6a)