rocksdb

Commit Graph

Author	SHA1	Message	Date
Aaron Gao	4c9d2b1046	remove #include port/port.h in public header file Summary: break internal build Closes https://github.com/facebook/rocksdb/pull/2336 Differential Revision: D5097089 Pulled By: lightmark fbshipit-source-id: 6996cbadeead21074a41e526ea04659190ee61d8	8 years ago
Nikhil Benesch	11c5d4741a	cross-platform compatibility improvements Summary: We've had a couple CockroachDB users fail to build RocksDB on exotic platforms, so I figured I'd try my hand at solving these issues upstream. The problems stem from a) `USE_SSE=1` being too aggressive about turning on SSE4.2, even on toolchains that don't support SSE4.2 and b) RocksDB attempting to detect support for thread-local storage based on OS, even though it can vary by compiler on the same OS. See the individual commit messages for details. Regarding SSE support, this PR should change virtually nothing for non-CMake based builds. `make`, `PORTABLE=1 make`, `USE_SSE=1 make`, and `PORTABLE=1 USE_SSE=1 make` function exactly as before, except that SSE support will be automatically disabled when a simple SSE4.2-using test program fails to compile, as it does on OpenBSD. (OpenBSD's ports GCC supports SSE4.2, but its binutils do not, so `__SSE_4_2__` is defined but an SSE4.2-using program will fail to assemble.) A warning is emitted in this case. The CMake build is modified to support the same set of options, except that `USE_SSE` is spelled `FORCE_SSE42` because `USE_SSE` is rather useless now that we can automatically detect SSE support, and I figure changing options in the CMake build is less disruptive than changing the non-CMake build. I've tested these changes on all the platforms I can get my hands on (macOS, Windows MSVC, Windows MinGW, and OpenBSD) and it all works splendidly. Let me know if there's anything you object to—I obviously don't mean to break any of your build pipelines in the process of fixing ours downstream. Closes https://github.com/facebook/rocksdb/pull/2199 Differential Revision: D5054042 Pulled By: yiwu-arbug fbshipit-source-id: 938e1fc665c049c02ae15698e1409155b8e72171	8 years ago
Tomas Kolda	c1fbf91b2f	Fixing Solaris Sparc crash due to cached TLS Summary: Workaround for Solaris gcc binary. Program is crashing, because when TLS of perf context that is used twice on same frame, it is damaged thus Segmentation fault. Issue: #2153 Closes https://github.com/facebook/rocksdb/pull/2187 Differential Revision: D4922274 Pulled By: siying fbshipit-source-id: 549105ebce9a8ce08a737f4d6b9f2312ebcde9a8	9 years ago
Andrew Kryczka	e2c6c06366	add TimedEnv Summary: I've needed Env timing measurements a few times now, so finally built something for it. Closes https://github.com/facebook/rocksdb/pull/2073 Differential Revision: D4811231 Pulled By: ajkr fbshipit-source-id: 218a249	9 years ago
Orgad Shaneh	6401a8b76b	Fix build with MinGW Summary: There still are many warnings (most of them about invalid printf format for long long), but it builds if FAIL_ON_WARNINGS is disabled. Closes https://github.com/facebook/rocksdb/pull/2052 Differential Revision: D4807355 Pulled By: siying fbshipit-source-id: ef03786	9 years ago
Mike Kolupaev	236d4c67e9	Less linear search in DBIter::Seek() when keys are overwritten a lot Summary: In one deployment we saw high latencies (presumably from slow iterator operations) and a lot of CPU time reported by perf with this stack: ``` rocksdb::MergingIterator::Next rocksdb::DBIter::FindNextUserEntryInternal rocksdb::DBIter::Seek ``` I think what's happening is: 1. we create a snapshot iterator, 2. we do lots of Put()s for the same key x; this creates lots of entries in memtable, 3. we seek the iterator to a key slightly smaller than x, 4. the seek walks over lots of entries in memtable for key x, skipping them because of high sequence numbers. CC IslamAbdelRahman Closes https://github.com/facebook/rocksdb/pull/1413 Differential Revision: D4083879 Pulled By: IslamAbdelRahman fbshipit-source-id: a83ddae	9 years ago
Aaron Gao	f517d9dd09	Add SeekForPrev() to Iterator Summary: Add new Iterator API, `SeekForPrev`: find the last key that <= target key support prefix_extractor support prefix_same_as_start support upper_bound not supported in iterators without Prev() Also add tests in db_iter_test and db_iterator_test Pass all tests Cheers! Test Plan: make all check -j64 Reviewers: andrewkr, yiwu, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64149	9 years ago
sdong	0930e5e99f	Update comments on include/rocksdb/perf_context.h Summary: Some grammer mistakes in code comments in include/rocksdb/perf_context.h. Also polish it a liitlebit. Test Plan: Not needed Reviewers: IslamAbdelRahman, yhchiang, yiwu, andrewkr Reviewed By: andrewkr Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D56307	10 years ago
sdong	3996770d0b	Add comments to perf_context skip counters Summary: Document the skipped counters in perf context more clearly. Test Plan: Comment only. Reviewers: IslamAbdelRahman, yhchiang, MarkCallaghan Reviewed By: MarkCallaghan Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55833	10 years ago
Baraa Hamodi	21e95811d1	Updated all copyright headers to the new format.	10 years ago
Dmytro Ivchenko	aa5e3b7c04	PerfContext::ToString() add option to exclude zero counters Test Plan: Added unit test to check w/ w/o zeros scenarios Reviewers: yhchiang Reviewed By: yhchiang Subscribers: sdong, dhruba Differential Revision: https://reviews.facebook.net/D52809	10 years ago
Yueh-Hsuan Chiang	29a47cd2b8	Include the time unit in the comment of perf_context timers Summary: Include the time unit in the comment of perf_context timers Test Plan: make perf_context_test Reviewers: igor, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D48663	10 years ago
dyniusz	a065cdb388	bloom hit/miss stats for SST and memtable Summary: hit and miss bloom filter stats for memtable and SST stats added to perf_context struct key matches and prefix matches combined into one stat Test Plan: unit test veryfing the functionality added, see BloomStatsTest in db_test.cc for details Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47859	10 years ago
Andres Noetzli	014fd55adc	Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179	10 years ago
Dmitri Smirnov	d1a457181d	Ensure Windows build w/o port/port.h in public headers - Remove make file defines from public headers and use _WIN32 because it is compiler defined - use __GNUC__ and __clang__ to guard non-portable attributes - add #include "port/port.h" to some new .cc files. - minor changes in CMakeLists to reflect recent changes	10 years ago
Dmitri Smirnov	247690fe38	Ensure Windows build w/o port/port.h in public headers - Remove make file defines from public headers and use _WIN32 because it is compiler defined - use __GNUC__ and __clang__ to guard non-portable attributes - add #include "port/port.h" to some new .cc files. - minor changes in CMakeLists to reflect recent changes	10 years ago
sdong	76d3cd3286	Fix public API dependency on internal codes and dependency on MAX_INT32 Summary: Public API depends on port/port.h which is wrong. Fix it. Also with gcc 4.8.1 build was broken as MAX_INT32 was not recognized. Fix it by using ::max in linux. Test Plan: Build it and try to build an external project on top of it. Reviewers: anthony, yhchiang, kradhakrishnan, igor Reviewed By: igor Subscribers: yoshinorim, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D41745	10 years ago
sdong	a6e38fd170	Fix a uncleaned counter in PerfContext::Reset() Summary: new_table_iterator_nanos is not cleaned in PerfContext::Reset() while new_table_block_iter_nanos is cleaned twice. Fix it. Also fix a comment. Test Plan: Build and db_bench with --perf_context to see the value shown. Reviewers: kradhakrishnan, anthony, yhchiang, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D41721	10 years ago
sdong	041b6f95a2	perf_context: report time spent on reading index and bloom blocks Summary: Add a perf context counter to help users figure out time spent on reading indexes and bloom filter blocks. Test Plan: Will write a unit test Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D41433	10 years ago
Dmitri Smirnov	18285c1e2f	Windows Port from Microsoft Summary: Make RocksDb build and run on Windows to be functionally complete and performant. All existing test cases run with no regressions. Performance numbers are in the pull-request. Test plan: make all of the existing unit tests pass, obtain perf numbers. Co-authored-by: Praveen Rao praveensinghrao@outlook.com Co-authored-by: Sherlock Huang baihan.huang@gmail.com Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com	10 years ago
Mike Kolupaev	ec7a944360	more times in perf_context and iostats_context Summary: We occasionally get write stalls (>1s Write() calls) on HDD under read load. The following timers explain almost all of the stalls: - perf_context.db_mutex_lock_nanos - perf_context.db_condition_wait_nanos - iostats_context.open_time - iostats_context.allocate_time - iostats_context.write_time - iostats_context.range_sync_time - iostats_context.logger_time In my experiments each of these occasionally takes >1s on write path under some workload. There are rare cases when Write() takes long but none of these takes long. Test Plan: Added code to our application to write the listed timings to log for slow writes. They usually add up to almost exactly the time Write() call took. Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: march, dhruba, tnovak Differential Revision: https://reviews.facebook.net/D39177	10 years ago
Anurag Indu	3d1a924ff3	Adding stats for the merge and filter operation Summary: We have addded new stats and perf_context for measuring the merge and filter operation time consumption. We have bounded all the merge operations within the GUARD statment and collected the total time for these operations in the DB. Test Plan: WIP Reviewers: rven, yhchiang, kradhakrishnan, igor, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D34377	11 years ago
Igor Canadi	3cf7f353d9	Instrument memtable seeks Summary: As title Test Plan: compiles Reviewers: sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D34191	11 years ago
sdong	6d6305dd7d	Perf Context to report DB mutex waiting time Summary: Add counters in perf context to allow users to figure out how time spent on waiting for DB mutex Test Plan: Add a test and run it. Reviewers: yhchiang, rven, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D33177	11 years ago
sdong	36de0e5359	Add a function to return current perf level Summary: Add a function to return the perf level. It is to allow a wrapper of DB to increase the perf level and restore the original perf level after finishing the function call. Test Plan: Add a verification in db_test Reviewers: yhchiang, igor, ljin Reviewed By: ljin Subscribers: xjin, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D19551	11 years ago
Lei Jin	92c1eb0291	macros for perf_context Summary: This will allow us to disable them completely for iOS or for better performance Test Plan: will run make all check Reviewers: igor, haobo, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17511	12 years ago
Igor Canadi	bcd1f15b60	Remove -Wno-unused-const-variable	12 years ago
Igor Canadi	51023c3911	Make RocksDB compile for iOS Summary: I had to make number of changes to the code and Makefile: * Add `make lib`, that will create static library without debug info. We need this to avoid growing binary too much. Currently it's 14MB. * Remove cpuinfo() function and use __SSE4_2__ macro. We actually used the macro as part of Fast_CRC32() function. As a result, I also accidentally fixed this issue: https://www.facebook.com/groups/rocksdb.dev/permalink/549700778461774/?stream_ref=2 * Remove __thread locals in OS_MACOSX Test Plan: `make lib PLATFORM=IOS` Reviewers: ljin, haobo, dhruba, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17475	12 years ago
Lei Jin	04298f8c33	output perf_context in db_bench readrandom Summary: Add helper function to print perf context data in db_bench if enabled. I didn't find any code that actually exports perf context data. Not sure if I missed anything Test Plan: ran db_bench Reviewers: haobo, sdong, igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D16575	12 years ago
Siying Dong	3e35aa6412	Revert "Allow users to profile a query and see bottleneck of the query" This reverts commit `3d8ac31d71`.	12 years ago
Siying Dong	b135d01e7b	Allow users to profile a query and see bottleneck of the query Summary: Provide a framework to profile a query in detail to figure out latency bottleneck. Currently, in Get(), Put() and iterators, 2-3 simple timing is used. We can easily add more profile counters to the framework later. Test Plan: Enable this profiling in seveal existing tests. Reviewers: haobo, dhruba, kailiu, emayanke, vamsi, igor CC: leveldb Differential Revision: https://reviews.facebook.net/D14001 Conflicts: table/merger.cc	12 years ago
Siying Dong	3d8ac31d71	Allow users to profile a query and see bottleneck of the query Summary: Provide a framework to profile a query in detail to figure out latency bottleneck. Currently, in Get(), Put() and iterators, 2-3 simple timing is used. We can easily add more profile counters to the framework later. Test Plan: Enable this profiling in seveal existing tests. Reviewers: haobo, dhruba, kailiu, emayanke, vamsi, igor CC: leveldb Differential Revision: https://reviews.facebook.net/D14001	12 years ago
Dhruba Borthakur	31295b0a1b	Add License message to public header files. Summary: Add License message to public header files. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	12 years ago
Haobo Xu	2fb361ad98	[RocksDB] Add perf_context.wal_write_time to track time spent on writing the recovery log. Summary: as title Test Plan: make check; ./perf_context_test Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13629	12 years ago
Dhruba Borthakur	3c37955a2f	Remove obsolete namespace mappings. Summary: The previous release 2.4 had a mapping to alias the older namespace to rocksdb. This mapping is not needed in the new release. Test Plan: make check make release Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13359	12 years ago
Dhruba Borthakur	a143ef9b38	Change namespace from leveldb to rocksdb Summary: Change namespace from leveldb to rocksdb. This allows a single application to link in open-source leveldb code as well as rocksdb code into the same process. Test Plan: compile rocksdb Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13287	12 years ago
Haobo Xu	71046971f0	[RocksDB] Added perf counters to track skipped internal keys during iteration Summary: as title. unit test not polished. this is for a quick live test Test Plan: live Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13221	12 years ago
Haobo Xu	f2f4c8072f	[RocksDB] Added nano second stopwatch and new perf counters to track block read cost Summary: The pupose of this diff is to expose per user-call level precise timing of block read, so that we can answer questions like: a Get() costs me 100ms, is that somehow related to loading blocks from file system, or sth else? We will answer that with EXACTLY how many blocks have been read, how much time was spent on transfering the bytes from os, how much time was spent on checksum verification and how much time was spent on block decompression, just for that one Get. A nano second stopwatch was introduced to track time with higher precision. The cost/precision of the stopwatch is also measured in unit-test. On my dev box, retrieving one time instance costs about 30ns, on average. The deviation of timing results is good enough to track 100ns-1us level events. And the overhead could be safely ignored for 100us level events (10000 instances/s), for example, a viewstate thrift call. Test Plan: perf_context_test, also testing with viewstate shadow traffic. Reviewers: dhruba Reviewed By: dhruba CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D12351	12 years ago
Dhruba Borthakur	1186192ed1	Replace include/leveldb with include/rocksdb. Summary: Replace include/leveldb with include/rocksdb. Test Plan: make clean; make check make clean; make release Differential Revision: https://reviews.facebook.net/D12489	12 years ago
Haobo Xu	d9dd2a1926	[RocksDB] Expose thread local perf counter for low overhead, per call level performance statistics. Summary: As title. No locking/atomic is needed due to thread local. There is also no need to modify the existing client interface, in order to expose related counters. perf_context_test shows a simple example of retrieving the number of user key comparison done for each put and get call. More counters could be added later. Sample output ./perf_context_test 1000000 ==== Test PerfContextTest.KeyComparisonCount Inserting 1000000 key/value pairs ... total user key comparison get: 43446523 total user key comparison put: 8017877 max user key comparison get: 88939 avg user key comparison get:43 Basically, the current skiplist does well on average, but could perform poorly in extreme cases. Test Plan: run perf_context_test <total number of entries to put/get> Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D12225	12 years ago

37 Commits (69ec8356b2bd1550fcb4ac53f65564624691764d)