rocksdb

Commit Graph

Author	SHA1	Message	Date
Victor Gao	1d7048c598	comment out unused parameters Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually. Reviewed By: igorsugak Differential Revision: D5454343 fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2	7 years ago
Siying Dong	3c327ac2d0	Change RocksDB License Summary: Closes https://github.com/facebook/rocksdb/pull/2589 Differential Revision: D5431502 Pulled By: siying fbshipit-source-id: 8ebf8c87883daa9daa54b2303d11ce01ab1f6f75	7 years ago
Aaron Gao	7f6c02dda1	using ThreadLocalPtr to hide ROCKSDB_SUPPORT_THREAD_LOCAL from public… Summary: … headers https://github.com/facebook/rocksdb/pull/2199 should not reference RocksDB-specific macros (like ROCKSDB_SUPPORT_THREAD_LOCAL in this case) to public headers, `iostats_context.h` and `perf_context.h`. We shouldn't do that because users have to provide these compiler flags when building their binary with RocksDB. We should hide the thread local global variable inside our implementation and just expose a function api to retrieve these variables. It may break some users for now but good for long term. make check -j64 Closes https://github.com/facebook/rocksdb/pull/2380 Differential Revision: D5177896 Pulled By: lightmark fbshipit-source-id: 6fcdfac57f2e2dcfe60992b7385c5403f6dcb390	8 years ago
Andrew Kryczka	3a8a848a55	account for L0 size in estimated compaction bytes Summary: also changed the `>` in the comparison against `level0_file_num_compaction_trigger` into a `>=` since exactly `level0_file_num_compaction_trigger` can trigger a compaction from L0. Closes https://github.com/facebook/rocksdb/pull/2179 Differential Revision: D4915772 Pulled By: ajkr fbshipit-source-id: e38fec6253de6f9a40e61734615c6670d84038aa	8 years ago
Andrew Kryczka	6cc9aef162	New API for background work in single thread pool Summary: Previously users could set `max_background_flushes=0` to force rocksdb to use a single thread pool for both background flushes and compactions. That'll no longer be possible since I'm going to deprecate `max_background_flushes` and `max_background_compactions` in favor of a single option. This diff introduces a new way to force a single thread pool: when high-pri pool has zero threads, all background jobs will be submitted to low-pri pool. Note the majority of the code change is adding `Env::GetBackgroundThreads()`, which is necessary to check whether the user has provided a zero-sized thread pool. Closes https://github.com/facebook/rocksdb/pull/2204 Differential Revision: D4936256 Pulled By: ajkr fbshipit-source-id: 929a07a0c0705f7766f5339cd013ff74e90d6e01	8 years ago
Yi Wu	9d0a07ed52	Fix rocksdb.estimate-num-keys DB property underflow Summary: rocksdb.estimate-num-keys is compute from `estimate_num_keys - 2 * estimate_num_deletes`. If `2 * estimate_num_deletes > estimate_num_keys` it will underflow. Fixing it. Closes https://github.com/facebook/rocksdb/pull/2348 Differential Revision: D5109272 Pulled By: yiwu-arbug fbshipit-source-id: e1bfb91346a59b7282a282b615002507e9d7c246	8 years ago
Siying Dong	c49d704656	Add DB:ResetStats() Summary: Add a function to allow users to reset internal stats without restarting the DB. Closes https://github.com/facebook/rocksdb/pull/2167 Differential Revision: D4907939 Pulled By: siying fbshipit-source-id: ab2dd85b88aabe9380da7485320a1d460d3e1f68	8 years ago
Siying Dong	8f47a97512	File level histogram should be printed per CF, not per DB Summary: Currently level histogram is only printed out for DB stats and for default CF. This is confusing. Change to print for every CF instead. Closes https://github.com/facebook/rocksdb/pull/2126 Differential Revision: D4865373 Pulled By: siying fbshipit-source-id: 1c853e0ac66e00120ee931cabc9daf69ccc2d577	8 years ago
Maysam Yabandeh	e6725e8c8d	Fix some bugs in MockEnv Summary: Fixing some bugs in MockEnv so it be actually used. Closes https://github.com/facebook/rocksdb/pull/1914 Differential Revision: D4609923 Pulled By: maysamyabandeh fbshipit-source-id: ca25735	8 years ago
Yi Wu	dfb6fe6755	Unified InlineSkipList::Insert algorithm with hinting Summary: This PR is based on nbronson's diff with small modifications to wire it up with existing interface. Comparing to previous version, this approach works better for inserting keys in decreasing order or updating the same key, and impose less restriction to the prefix extractor. ---- Summary from original diff ---- This diff introduces a single InlineSkipList::Insert that unifies the existing sequential insert optimization (prev_), concurrent insertion, and insertion using externally-managed insertion point hints. There's a deep symmetry between insertion hints (cursors) and the concurrent algorithm. In both cases we have partial information from the recent past that is likely but not certain to be accurate. This diff introduces the struct InlineSkipList::Splice, which encodes predecessor and successor information in the same form that was previously only used within a single call to InsertConcurrently. Splice holds information about an insertion point that can be used to levera Closes https://github.com/facebook/rocksdb/pull/1561 Differential Revision: D4217283 Pulled By: yiwu-arbug fbshipit-source-id: 33ee437	8 years ago
Siying Dong	972e3ff295	Enable allow_concurrent_memtable_write and enable_write_thread_adaptive_yield by default Summary: Closes https://github.com/facebook/rocksdb/pull/1496 Differential Revision: D4168080 Pulled By: siying fbshipit-source-id: 056ae62	8 years ago
Yi Wu	1543d5d92e	Report memory usage by memtable insert hints map. Summary: It is hard to measure acutal memory usage by std containers. Even providing a custom allocator will miss count some of the usage. Here we only do a wild guess on its memory usage. Closes https://github.com/facebook/rocksdb/pull/1511 Differential Revision: D4179945 Pulled By: yiwu-arbug fbshipit-source-id: 32ab929	8 years ago
Andrew Kryczka	b9bc7a2aa4	Use skiplist rep for range tombstone memtable Summary: somehow missed committing this update in D62217 Test Plan: make check Reviewers: sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65361	8 years ago
Andrew Kryczka	6009c473c7	Store range tombstones in memtable Summary: - Store range tombstones in a separate MemTableRep instantiated with ColumnFamilyOptions::memtable_factory - MemTable::NewRangeTombstoneIterator() returns a MemTableIterator over the separate MemTableRep - Part of the read path is not implemented yet (i.e., MemTable::Get()) Test Plan: see unit tests Reviewers: wanning Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D62217	8 years ago
Islam AbdelRahman	2a9c97108e	[Flaky Test] Disable DBPropertiesTest.GetProperty Summary: Disable flaky test Test Plan: run it Reviewers: yiwu, andrewkr, kradhakrishnan, yhchiang, lightmark, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D62487	8 years ago
Jay Edgar	efd013d6d8	Miscellaneous performance improvements Summary: I was investigating performance issues in the SstFileWriter and found all of the following: - The SstFileWriter::Add() function created a local InternalKey every time it was called generating a allocation and free each time. Changed to have an InternalKey member variable that can be reset with the new InternalKey::Set() function. - In SstFileWriter::Add() the smallest_key and largest_key values were assigned the result of a ToString() call, but it is simpler to just assign them directly from the user's key. - The Slice class had no move constructor so each time one was returned from a function a new one had to be allocated, the old data copied to the new, and the old one was freed. I added the move constructor which also required a copy constructor and assignment operator. - The BlockBuilder::CurrentSizeEstimate() function calculates the current estimate size, but was being called 2 or 3 times for each key added. I changed the class to maintain a running estimate (equal to the original calculation) so that the function can return an already calculated value. - The code in BlockBuilder::Add() that calculated the shared bytes between the last key and the new key duplicated what Slice::difference_offset does, so I replaced it with the standard function. - BlockBuilder::Add() had code to copy just the changed portion into the last key value (and asserted that it now matched the new key). It is more efficient just to copy the whole new key over. - Moved this same code up into the 'if (use_delta_encoding_)' since the last key value is only needed when delta encoding is on. - FlushBlockBySizePolicy::BlockAlmostFull calculated a standard deviation value each time it was called, but this information would only change if block_size of block_size_deviation changed, so I created a member variable to hold the value to avoid the calculation each time. - Each PutVarint??() function has a buffer and calls std::string::append(). Two or three calls in a row could share a buffer and a single call to std::string::append(). Some of these will be helpful outside of the SstFileWriter. I'm not 100% the addition of the move constructor is appropriate as I wonder why this wasn't done before - maybe because of compiler compatibility? I tried it on gcc 4.8 and 4.9. Test Plan: The changes should not affect the results so the existing tests should all still work and no new tests were added. The value of the changes was seen by manually testing the SstFileWriter class through MyRocks and adding timing code to identify problem areas. Reviewers: sdong, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D59607	8 years ago
Islam AbdelRahman	7c14abf2c7	Improve BytewiseComparatorImpl::FindShortestSeparator Summary: The current implementation find the first different byte and try to increment it, if it cannot it return the original key we can improve this by keep going after the first different byte to find the first non 0xFF byte and increment it After trying this patch on some logdevice sst files I see decrease in there index block size by 8.5% Test Plan: existing tests and updated test Reviewers: yhchiang, andrewkr, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D56241	9 years ago
Andrew Kryczka	73a847ef89	Add per-level compression ratio property Summary: This is needed so we can measure compression ratio improvements achieved by D52287. The property compares raw data size against the total file size for a given level. If the level is empty it should return 0.0. Test Plan: new unit test Reviewers: IslamAbdelRahman, yhchiang, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D56967	9 years ago
Aaron Gao	cc87075d63	No need to limit to 20 files in UpdateAccumulatedStats() if options.max_open_files=-1 Summary: There is a hardcoded constraint in our statistics collection that prevents reading properties from more than 20 SST files. This means our statistics will be very inaccurate for databases with > 20 files since additional files are just ignored. The purpose of constraining the number of files used is to bound the I/O performed during statistics collection, since these statistics need to be recomputed every time the database reopened. However, this constraint doesn't take into account the case where option "max_open_files" is -1. In that case, all the file metadata has already been read, so MaybeInitializeFileMetaData() won't incur any I/O cost. so this diff gets rid of the 20-file constraint in case max_open_files == -1. Test Plan: write into unit test db/db_properties_test.cc - "ValidateSampleNumber". We generate 20 files with 2 rows and 10 files with 1 row. If max_open_files !=-1, the `rocksdb.estimate-num-keys` should be (101 + 102)/20 * 30 = 45. Otherwise, it should be the ground truth, 50. {F1089153} Reviewers: andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D56253	9 years ago
sdong	294bdf9ee2	Change Property name from "rocksdb.current_version_number" to "rocksdb.current-super-version-number" Summary: I realized I again is wrong about the naming convention. Let me change it to the correct one. Test Plan: Run unit tests. Reviewers: IslamAbdelRahman, kradhakrishnan, yhchiang, andrewkr Reviewed By: andrewkr Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D55041	9 years ago
sdong	432f3adf2c	Add DB Property "rocksdb.current_version_number" Summary: Add a DB Property "rocksdb.current_version_number" for users to monitor version changes and stale iterators. Test Plan: Add a unit test. Reviewers: andrewkr, yhchiang, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54927	9 years ago
Baraa Hamodi	21e95811d1	Updated all copyright headers to the new format.	9 years ago
Andrew Kryczka	284aa613a7	Eliminate duplicated property constants Summary: Before this diff, there were duplicated constants to refer to properties (user- facing API had strings and InternalStats had an enum). I noticed these were inconsistent in terms of which constants are provided, names of constants, and documentation of constants. Overall it seemed annoying/error-prone to maintain these duplicated constants. So, this diff gets rid of InternalStats's constants and replaces them with a map keyed on the user-facing constant. The value in that map contains a function pointer to get the property value, so we don't need to do string matching while holding db->mutex_. This approach has a side benefit of making many small handler functions rather than a giant switch-statement. Test Plan: db_properties_test passes, running "make commit-prereq -j32" Reviewers: sdong, yhchiang, kradhakrishnan, IslamAbdelRahman, rven, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D53253	9 years ago
Andrew Kryczka	29289333d0	Add named constants for remaining properties Summary: There were just these two properties that didn't have any named constant. Test Plan: build and below test $ ./db_properties_test --gtest_filter=DBPropertiesTest.NumImmutableMemTable Reviewers: yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D53103	9 years ago
Andrew Kryczka	eceb5cb1b7	Split db_test.cc (part 1: properties) Summary: Moved all the tests that verify property correctness into a separate file. The goal is to reduce compile time and complexity of db_test. I didn't add parallelism for db_properties_test, even though these tests were parallelized in db_test, since the file is small enough that it won't matter. Some of these moves may be controversial since it's hard to say whether the test is "verifying property correctness," or "using properties to verify rocksdb's correctness." I'm interested in any opinions. Test Plan: ran db_properties_test, also waiting on "make commit-prereq -j32" Reviewers: yhchiang, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D52995	9 years ago

1 2

75 Commits (40c2ec6d08c3971af6ef3d0f5c5c27cdf3965ebf)