rocksdb

Commit Graph

Author	SHA1	Message	Date
Andrew Kryczka	941543721d	Bytes read stat for `VerifyChecksum()` and `VerifyFileChecksums()` APIs (#8741 ) Summary: - Clarified some comments on compatibility for adding new ticker stats - Added read I/O stats for `VerifyChecksum()` and `VerifyFileChecksums()` APIs Pull Request resolved: https://github.com/facebook/rocksdb/pull/8741 Test Plan: new unit test Reviewed By: zhichao-cao Differential Revision: D30708578 Pulled By: ajkr fbshipit-source-id: d06b961f7e199ae92c266b683e39870aa8f63449	4 years ago
anand76	f35042ca40	Add a PerfContext counter for secondary cache hits (#8685 ) Summary: Add a PerfContext counter. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8685 Reviewed By: zhichao-cao Differential Revision: D30453957 Pulled By: anand1976 fbshipit-source-id: 42888a3ced240e1c44446d52d3b04adfb01f5665	4 years ago
anand76	add68bd28a	Add a stat to count secondary cache hits (#8666 ) Summary: Add a stat for secondary cache hits. The ```Cache::Lookup``` API had an unused ```stats``` parameter. This PR uses that to pass the pointer to a ```Statistics``` object that ```LRUCache``` uses to record the stat. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8666 Test Plan: Update a unit test in lru_cache_test Reviewed By: zhichao-cao Differential Revision: D30353816 Pulled By: anand1976 fbshipit-source-id: 2046f78b460428877a26ffdd2bb914ae47dfbe77	4 years ago
Drewryz	3b27725245	Fix a minor issue with initializing the test path (#8555 ) Summary: The PerThreadDBPath has already specified a slash. It does not need to be specified when initializing the test path. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8555 Reviewed By: ajkr Differential Revision: D29758399 Pulled By: jay-zhuang fbshipit-source-id: 6d2b878523e3e8580536e2829cb25489844d9011	4 years ago
sherriiiliu	5535d06b9c	Fix stats_history_test failure on Windows (#8520 ) Summary: Fixed a stats_history_test failure on Windows * In StatsHistoryTest.InMemoryStatsHistoryPurging test, the capping memory cost of stats_history_size on Windows increases to 15000 bytes with latest changes Pull Request resolved: https://github.com/facebook/rocksdb/pull/8520 Reviewed By: ajkr Differential Revision: D29734631 Pulled By: mrambacher fbshipit-source-id: 461698fcf22ef06acfb7f7aa86f8415aaffe7f1e	4 years ago
Peter Dillinger	df5dc73bec	Don't hold DB mutex for block cache entry stat scans (#8538 ) Summary: I previously didn't notice the DB mutex was being held during block cache entry stat scans, probably because I primarily checked for read performance regressions, because they require the block cache and are traditionally latency-sensitive. This change does some refactoring to avoid holding DB mutex and to avoid triggering and waiting for a scan in GetProperty("rocksdb.cfstats"). Some tests have to be updated because now the stats collector is populated in the Cache aggressively on DB startup rather than lazily. (I hope to clean up some of this added complexity in the future.) This change also ensures proper treatment of need_out_of_mutex for non-int DB properties. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8538 Test Plan: Added unit test logic that uses sync points to fail if the DB mutex is held during a scan, covering the various ways that a scan might be triggered. Performance test - the known impact to holding the DB mutex is on TransactionDB, and the easiest way to see the impact is to hack the scan code to almost always miss and take an artificially long time scanning. Here I've injected an unconditional 5s sleep at the call to ApplyToAllEntries. Before (hacked): $ TEST_TMPDIR=/dev/shm ./db_bench.base_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 \| egrep 'db.db.write.micros\|micros/op' randomtransaction : 433.219 micros/op 2308 ops/sec; 0.1 MB/s ( transactions:78999 aborts:0) rocksdb.db.write.micros P50 : 16.135883 P95 : 36.622503 P99 : 66.036115 P100 : 5000614.000000 COUNT : 149677 SUM : 8364856 $ TEST_TMPDIR=/dev/shm ./db_bench.base_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 \| egrep 'db.db.write.micros\|micros/op' randomtransaction : 448.802 micros/op 2228 ops/sec; 0.1 MB/s ( transactions:75999 aborts:0) rocksdb.db.write.micros P50 : 16.629221 P95 : 37.320607 P99 : 72.144341 P100 : 5000871.000000 COUNT : 143995 SUM : 13472323 Notice the 5s P100 write time. After (hacked): $ TEST_TMPDIR=/dev/shm ./db_bench.new_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 \| egrep 'db.db.write.micros\|micros/op' randomtransaction : 303.645 micros/op 3293 ops/sec; 0.1 MB/s ( transactions:98999 aborts:0) rocksdb.db.write.micros P50 : 16.061871 P95 : 33.978834 P99 : 60.018017 P100 : 616315.000000 COUNT : 187619 SUM : 4097407 $ TEST_TMPDIR=/dev/shm ./db_bench.new_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 \| egrep 'db.db.write.micros\|micros/op' randomtransaction : 310.383 micros/op 3221 ops/sec; 0.1 MB/s ( transactions:96999 aborts:0) rocksdb.db.write.micros P50 : 16.270026 P95 : 35.786844 P99 : 64.302878 P100 : 603088.000000 COUNT : 183819 SUM : 4095918 P100 write is now ~0.6s. Not good, but it's the same even if I completely bypass all the scanning code: $ TEST_TMPDIR=/dev/shm ./db_bench.new_skip -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 \| egrep 'db.db.write.micros\|micros/op' randomtransaction : 311.365 micros/op 3211 ops/sec; 0.1 MB/s ( transactions:96999 aborts:0) rocksdb.db.write.micros P50 : 16.274362 P95 : 36.221184 P99 : 68.809783 P100 : 649808.000000 COUNT : 183819 SUM : 4156767 $ TEST_TMPDIR=/dev/shm ./db_bench.new_skip -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 \| egrep 'db.db.write.micros\|micros/op' randomtransaction : 308.395 micros/op 3242 ops/sec; 0.1 MB/s ( transactions:97999 aborts:0) rocksdb.db.write.micros P50 : 16.106222 P95 : 37.202403 P99 : 67.081875 P100 : 598091.000000 COUNT : 185714 SUM : 4098832 No substantial difference. Reviewed By: siying Differential Revision: D29738847 Pulled By: pdillinger fbshipit-source-id: 1c5c155f5a1b62e4fea0fd4eeb515a8b7474027b	4 years ago
Baptiste Lemaire	e817bc9628	Added memtable garbage statistics (#8411 ) Summary: Summary: 2 new statistics counters are added to RocksDB: `MEMTABLE_PAYLOAD_BYTES_AT_FLUSH` and `MEMTABLE_GARBAGE_BYTES_AT_FLUSH`. The former tracks how many raw bytes of useful data are present on the memtable at flush time, whereas the latter is tracks how many of these raw bytes are considered garbage, meaning that they ended up not being imported on the SSTables resulting from the flush operations. Unit test: run `make db_flush_test -j$(nproc); ./db_flush_test` to run the unit test. This executable includes 3 tests, that test support and correct stat calculations for workloads with inserts, deletes, and DeleteRanges. The parameters are set such that the workloads are performed on a single memtable, and a single SSTable is created as a result of the flush operation. The flush operation is manually called in the test file. The tests verify that the values of these 2 statistics counters introduced in this PR can be exactly predicted, showing that we have a full understanding of the underlying operations. Performance testing: `./db_bench -statistics -benchmarks=fillrandom -num=10000000` repeated 10 times. Timing done using "date" function in a bash script. _Results_: Original Rocksdb fork: mean 66.6 sec, std 1.18 sec. This feature branch: mean 67.4 sec, std 1.35 sec. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8411 Reviewed By: akankshamahajan15 Differential Revision: D29150629 Pulled By: bjlemaire fbshipit-source-id: 7b3c2e86d50c6aa34fa50fd134282eacb543a5b1	4 years ago
Yanqin Jin	fd00f39f97	Disable IOStatsContext/PerfContext if no thread local (#8117 ) Summary: Before this PR, `get_iostats_context()` will silently return a nullptr if no thread_local support is detected. This can be the result of build_detect_platform's failure to compile the simple code snippet on certain platforms, as reported in https://github.com/facebook/mysql-5.6/issues/904. To be safe, we should fail the compilation if user does not opt out IOStatsContext and ROCKSDB_SUPPORT_THREAD_LOCAL is not defined. If RocksDB relies on c++11, can we just always use thread_local? It turns out there might be performance concerns (https://github.com/facebook/rocksdb/issues/5774), which is beyond the scope of this PR. We can revisit this later. Here, we stick to the original impl. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8117 Reviewed By: ajkr Differential Revision: D27356847 Pulled By: riversand963 fbshipit-source-id: f7d5776842277598d8341b955febb601946801ae	4 years ago
darionyaphet	b2c48a570f	Support cpu_write_nanos and cpu_read_nanos in IOStatsContext (#8149 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8149 Reviewed By: ajkr Differential Revision: D27571017 Pulled By: riversand963 fbshipit-source-id: a73427e907a7cb899debf55d60a2ede726695277	4 years ago
Zhichao Cao	08ec5e7321	Add the statistics and info log for Error handler (#8050 ) Summary: Add statistics and info log for error handler: counters for bg error, bg io error, bg retryable io error, auto resume, auto resume total retry, and auto resume sucess; Histogram for auto resume retry count in each recovery call. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8050 Test Plan: make check and add test to error_handler_fs_test Reviewed By: anand1976 Differential Revision: D26990565 Pulled By: zhichao-cao fbshipit-source-id: 49f71e8ea4e9db8b189943976404205b56ab883f	4 years ago
mrambacher	3dff28cf9b	Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033 ) Summary: For performance purposes, the lower level routines were changed to use a SystemClock* instead of a std::shared_ptr<SystemClock>. The shared ptr has some performance degradation on certain hardware classes. For most of the system, there is no risk of the pointer being deleted/invalid because the shared_ptr will be stored elsewhere. For example, the ImmutableDBOptions stores the Env which has a std::shared_ptr<SystemClock> in it. The SystemClock* within the ImmutableDBOptions is essentially a "short cut" to gain access to this constant resource. There were a few classes (PeriodicWorkScheduler?) where the "short cut" property did not hold. In those cases, the shared pointer was preserved. Using db_bench readrandom perf_level=3 on my EC2 box, this change performed as well or better than 6.17: 6.17: readrandom : 28.046 micros/op 854902 ops/sec; 61.3 MB/s (355999 of 355999 found) 6.18: readrandom : 32.615 micros/op 735306 ops/sec; 52.7 MB/s (290999 of 290999 found) PR: readrandom : 27.500 micros/op 871909 ops/sec; 62.5 MB/s (367999 of 367999 found) (Note that the times for 6.18 are prior to revert of the SystemClock). Pull Request resolved: https://github.com/facebook/rocksdb/pull/8033 Reviewed By: pdillinger Differential Revision: D27014563 Pulled By: mrambacher fbshipit-source-id: ad0459eba03182e454391b5926bf5cdd45657b67	4 years ago
jsteemann	02974c9437	make PerfStepTimer struct smaller by reordering members (#7931 ) Summary: On x86_64, this makes the struct 8 bytes smaller, so creating a PerfStepTimer on the stack will use slightly less stack space. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7931 Reviewed By: jay-zhuang Differential Revision: D26529470 Pulled By: ajkr fbshipit-source-id: bbe2e843167152ffa05a5946f1add6621c9849f7	4 years ago
Yanqin Jin	9fdc9fbeea	Still use SystemClock* instead of shared_ptr in StepPerfTimer (#8006 ) Summary: This is likely a temp fix before we figure out a better way. PerfStepTimer is used intensively in certain benchmarking/testings. https://github.com/facebook/rocksdb/issues/7858 stores a `shared_ptr` to system clock in PerfStepTimer which gets created each time a `PerfStepTimer` object is created. The atomic operations in `shared_ptr` may add overhead in CPU cycles. Therefore, we change it back to a raw `SystemClock*` for now. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8006 Test Plan: make check Reviewed By: pdillinger Differential Revision: D26703560 Pulled By: riversand963 fbshipit-source-id: 519d0769b28da2334bea7d86c848fcc26ee8a17f	4 years ago
sherriiiliu	e017af15c1	Fix testcase failures on windows (#7992 ) Summary: Fixed 5 test case failures found on Windows 10/Windows Server 2016 1. In `flush_job_test`, the DestroyDir function fails in deconstructor because some file handles are still being held by VersionSet. This happens on Windows Server 2016, so need to manually reset versions_ pointer to release all file handles. 2. In `StatsHistoryTest.InMemoryStatsHistoryPurging` test, the capping memory cost of stats_history_size on Windows becomes 14000 bytes with latest changes, not just 13000 bytes. 3. In `SSTDumpToolTest.RawOutput` test, the output file handle is not closed at the end. 4. In `FullBloomTest.OptimizeForMemory` test, ROCKSDB_MALLOC_USABLE_SIZE is undefined on windows so `total_mem` is always equal to `total_size`. The internal memory fragmentation assertion does not apply in this case. 5. In `BlockFetcherTest.FetchAndUncompressCompressedDataBlock` test, XPRESS cannot reach 87.5% compression ratio with original CreateTable method, so I append extra zeros to the string value to enhance compression ratio. Beside, since XPRESS allocates memory internally, thus does not support for custom allocator verification, we will skip the allocator verification for XPRESS Pull Request resolved: https://github.com/facebook/rocksdb/pull/7992 Reviewed By: jay-zhuang Differential Revision: D26615283 Pulled By: ajkr fbshipit-source-id: 3632612f84b99e2b9c77c403b112b6bedf3b125d	4 years ago
mrambacher	12f1137355	Add a SystemClock class to capture the time functions of an Env (#7858 ) Summary: Introduces and uses a SystemClock class to RocksDB. This class contains the time-related functions of an Env and these functions can be redirected from the Env to the SystemClock. Many of the places that used an Env (Timer, PerfStepTimer, RepeatableThread, RateLimiter, WriteController) for time-related functions have been changed to use SystemClock instead. There are likely more places that can be changed, but this is a start to show what can/should be done. Over time it would be nice to migrate most (if not all) of the uses of the time functions from the Env to the SystemClock. There are several Env classes that implement these functions. Most of these have not been converted yet to SystemClock implementations; that will come in a subsequent PR. It would be good to unify many of the Mock Timer implementations, so that they behave similarly and be tested similarly (some override Sleep, some use a MockSleep, etc). Additionally, this change will allow new methods to be introduced to the SystemClock (like https://github.com/facebook/rocksdb/issues/7101 WaitFor) in a consistent manner across a smaller number of classes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7858 Reviewed By: pdillinger Differential Revision: D26006406 Pulled By: mrambacher fbshipit-source-id: ed10a8abbdab7ff2e23d69d85bd25b3e7e899e90	4 years ago
Jay Zhuang	58660bf21a	Use mock time for histogram_test (#7799 ) Summary: `histogram_test` uses real sleep, which depends on the test executing speed, it makes the test unstable. Switching to using a mock time env, it can also increase the test speed (from 10100ms -> 100ms). Pull Request resolved: https://github.com/facebook/rocksdb/pull/7799 Test Plan: run test 10 times, all passed. vs. without fix 3 out 10 test failed: no fix: https://app.circleci.com/pipelines/github/facebook/rocksdb?branch=pull%2F7797 with fix: https://app.circleci.com/pipelines/github/facebook/rocksdb?branch=pull%2F7799 Reviewed By: pdillinger Differential Revision: D25676948 Pulled By: jay-zhuang fbshipit-source-id: 64c273fc299c53283138dbb213386e4b45e8bdc2	4 years ago
mrambacher	55e99688cc	No elide constructors (#7798 ) Summary: Added "no-elide-constructors to the ASSERT_STATUS_CHECK builds. This flag gives more errors/warnings for some of the Status checks where an inner class checks a Status and later returns it. In this case, without the elide check on, the returned status may not have been checked in the caller, thereby bypassing the checked code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7798 Reviewed By: jay-zhuang Differential Revision: D25680451 Pulled By: pdillinger fbshipit-source-id: c3f14ed9e2a13f0a8c54d839d5fb4d1fc1e93917	4 years ago
Yanqin Jin	394210f280	Remove unused includes (#7604 ) Summary: This is a PR generated semi-automatically by an internal tool to remove unused includes and `using` statements. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7604 Test Plan: make check Reviewed By: ajkr Differential Revision: D24579392 Pulled By: riversand963 fbshipit-source-id: c4bfa6c6b08da1de186690d37eb73d8fff45aecd	5 years ago
mrambacher	f35f7f2704	Fix many tests to run with MEM_ENV and ENCRYPTED_ENV; Introduce a MemoryFileSystem class (#7566 ) Summary: This PR does a few things: 1. The MockFileSystem class was split out from the MockEnv. This change would theoretically allow a MockFileSystem to be used by other Environments as well (if we created a means of constructing one). The MockFileSystem implements a FileSystem in its entirety and does not rely on any Wrapper implementation. 2. Make the RocksDB test suite work when MOCK_ENV=1 and ENCRYPTED_ENV=1 are set. To accomplish this, a few things were needed: - The tests that tried to use the "wrong" environment (Env::Default() instead of env_) were updated - The MockFileSystem was changed to support the features it was missing or mishandled (such as recursively deleting files in a directory or supporting renaming of a directory). 3. Updated the test framework to have a ROCKSDB_GTEST_SKIP macro. This can be used to flag tests that are skipped. Currently, this defaults to doing nothing (marks the test as SUCCESS) but will mark the tests as SKIPPED when RocksDB is upgraded to a version of gtest that supports this (gtest-1.10). I have run a full "make check" with MEM_ENV, ENCRYPTED_ENV, both, and neither under both MacOS and RedHat. A few tests were disabled/skipped for the MEM/ENCRYPTED cases. The error_handler_fs_test fails/hangs for MEM_ENV (presumably a timing problem) and I will introduce another PR/issue to track that problem. (I will also push a change to disable those tests soon). There is one more test in DBTest2 that also fails which I need to investigate or skip before this PR is merged. Theoretically, this PR should also allow the test suite to run against an Env loaded from the registry, though I do not have one to try it with currently. Finally, once this is accepted, it would be nice if there was a CircleCI job to run these tests on a checkin so this effort does not become stale. I do not know how to do that, so if someone could write that job, it would be appreciated :) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7566 Reviewed By: zhichao-cao Differential Revision: D24408980 Pulled By: jay-zhuang fbshipit-source-id: 911b1554a4d0da06fd51feca0c090a4abdcb4a5f	5 years ago
Akanksha Mahajan	38d0a365e3	Add Stats for MultiGet (#7366 ) Summary: Add following stats for MultiGet in Histogram to get more insight on MultiGet. 1. Number of index and filter blocks read from file as part of MultiGet request per level. 2. Number of data blocks read from file per level. 3. Number of SST files loaded from file system per level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7366 Reviewed By: anand1976 Differential Revision: D24127040 Pulled By: akankshamahajan15 fbshipit-source-id: e63a003056b833729b277edc0639c08fb432756b	5 years ago
Andrew Kryczka	29ed766193	add Status check enforcement for stats_history_test (#7496 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7496 Reviewed By: zhichao-cao Differential Revision: D24070007 Pulled By: ajkr fbshipit-source-id: 4320413a4d7707774ee23a7e6232714d7ee7a57f	5 years ago
Andrew Kryczka	1e00909730	Periodically flush info log out of application buffer (#7488 ) Summary: This PR schedules a background thread (shared across all DB instances) to flush info log every ten seconds. This improves debuggability in case of RocksDB hanging since it ensures the log messages leading up to the hang will eventually become visible in the log. The bulk of this PR is moving monitoring/stats_dump_scheduler* to db/periodic_work_scheduler* and making the corresponding name changes since now the scheduler handles info log flushing, not just stats dumping. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7488 Reviewed By: riversand963 Differential Revision: D24065165 Pulled By: ajkr fbshipit-source-id: 339c47a0ff43b79fdbd055fbd9fefbb6f9d8d3b5	5 years ago
Yanqin Jin	ab202e8d72	Add a new stats level to exclude tickers (#7329 ) Summary: Currently, application may pass a statistics object to db but later wants to reduce stats tracking overhead by setting stats level to kExceptHistogramOrTimers (the current lowest level). Tickers will still be incremented, causing up to 1% CPU. We can add a new lowest stats level `kExceptTickers` to disable ticker incrementing as well, thus reducing CPU cycles spent on tickers. Test Plan (devserver): ``` make check make clean DEBUG_LEVEL=0 make db_bench ./db_bench -perf_level=1 -stats_level=0 -statistics -benchmarks=fillseq,readrandom -duration=120 ``` Measure CPU util (%) before and after change: CPU util by rocksdb::RecordTick: 1.1 vs (<0.1) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7329 Reviewed By: pdillinger Differential Revision: D23434014 Pulled By: riversand963 fbshipit-source-id: 72ff0f02a192ac476d4b0044b9f37fd4a22ff0d4	5 years ago
Jay Zhuang	187964a039	Add test function MockTimeEnv.SleepForMicroseconds() (#7293 ) Summary: And change the internal time value from seconds to microseconds. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7293 Reviewed By: pdillinger Differential Revision: D23253751 Pulled By: jay-zhuang fbshipit-source-id: 36aa9376b8801b85bd10163173590a17cf4f3a3a	5 years ago
mrambacher	e9befdebbf	Add EnvTestWithParam::OptionsTest to the ASSERT_STATUS_CHECKED passes (#7283 ) Summary: This test uses database functionality and required more extensive work to get it to pass than the other tests. The DB functionality required for this test now passes the check. When it was unclear what the proper behavior was for unchecked status codes, a TODO was added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7283 Reviewed By: akankshamahajan15 Differential Revision: D23251497 Pulled By: ajkr fbshipit-source-id: 52b79629bdafa0a58de8ead1d1d66f141b331523	5 years ago
Jay Zhuang	3e422ce0ca	Fix a timer_test deadlock (#7277 ) Summary: There's a potential deadlock caused by MockTimeEnv time value get to a large number, which causes TimedWait() wait forever. The test misuses the microseconds as seconds, making it more likely to happen. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7277 Reviewed By: pdillinger Differential Revision: D23183873 Pulled By: jay-zhuang fbshipit-source-id: 6fc38ebd40b4125a99551204b271f91a27e70086	5 years ago
sdong	b194c21bba	Whole DBTest to skip fsync (#7274 ) Summary: After https://github.com/facebook/rocksdb/pull/7036, we still see extra DBTest that can timeout when running 10 or 20 in parallel. Expand skip-fsync mode in whole DBTest. Still preserve other tests from doing this mode to be conservative. This commit reinstates https://github.com/facebook/rocksdb/issues/7049, whose un-revert was lost in an automatic infrastructure mis-merge. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7274 Test Plan: Run all existing files. Reviewed By: pdillinger Differential Revision: D23177444 fbshipit-source-id: 1f61690b2ac6333c3b2c87176fef6b2cba086b33	5 years ago
Jay Zhuang	69760b4d05	Introduce a global StatsDumpScheduler for stats dumping (#7223 ) Summary: Have a global StatsDumpScheduler for all DB instance stats dumping, including `DumpStats()` and `PersistStats()`. Before this, there're 2 dedicate threads for every DB instance, one for DumpStats() one for PersistStats(), which could create lots of threads if there're hundreds DB instances. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7223 Reviewed By: riversand963 Differential Revision: D23056737 Pulled By: jay-zhuang fbshipit-source-id: 0faa2311142a73433ebb3317361db7cbf43faeba	5 years ago
Peter Dillinger	6ac1d25fd0	Fix+clean up handling of mock sleeps (#7101 ) Summary: We have a number of tests hanging on MacOS and windows due to mishandling of code for mock sleeps. In addition, the code was in terrible shape because the same variable (addon_time_) would sometimes refer to microseconds and sometimes to seconds. One test even assumed it was nanoseconds but was written to pass anyway. This has been cleaned up so that DB tests generally use a SpecialEnv function to mock sleep, for either some number of microseconds or seconds depending on the function called. But to call one of these, the test must first call SetMockSleep (precondition enforced with assertion), which also turns sleeps in RocksDB into mock sleeps. To also removes accounting for actual clock time, call SetTimeElapseOnlySleepOnReopen, which implies SetMockSleep (on DB re-open). This latter setting only works by applying on DB re-open, otherwise havoc can ensue if Env goes back in time with DB open. More specifics: Removed some unused test classes, and updated comments on the general problem. Fixed DBSSTTest.GetTotalSstFilesSize using a sync point callback instead of mock time. For this we have the only modification to production code, inserting a sync point callback in flush_job.cc, which is not a change to production behavior. Removed unnecessary resetting of mock times to 0 in many tests. RocksDB deals in relative time. Any behaviors relying on absolute date/time are likely a bug. (The above test DBSSTTest.GetTotalSstFilesSize was the only one clearly injecting a specific absolute time for actual testing convenience.) Just in case I misunderstood some test, I put this note in each replacement: // NOTE: Presumed unnecessary and removed: resetting mock time in env Strengthened some tests like MergeTestTime, MergeCompactionTimeTest, and FilterCompactionTimeTest in db_test.cc stats_history_test and blob_db_test are each their own beast, rather deeply dependent on MockTimeEnv. Each gets its own variant of a work-around for TimedWait in a mock time environment. (Reduces redundancy and inconsistency in stats_history_test.) Intended follow-up: Remove TimedWait from the public API of InstrumentedCondVar, and only make that accessible through Env by passing in an InstrumentedCondVar and a deadline. Then the Env implementations mocking time can fix this problem without using sync points. (Test infrastructure using sync points interferes with individual tests' control over sync points.) With that change, we can simplify/consolidate the scattered work-arounds. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7101 Test Plan: make check on Linux and MacOS Reviewed By: zhichao-cao Differential Revision: D23032815 Pulled By: pdillinger fbshipit-source-id: 7f33967ada8b83011fb54e8279365c008bd6610b	5 years ago
Aaron Kabcenell	56ed601df3	Compaction Read/Write Stats by Compaction Type (#7165 ) Summary: Adds compaction statistics (total bytes read and written) for compactions that occur for delete-triggered, periodic, and TTL compaction reasons. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7165 Test Plan: TTL and periodic can be checked by runnning db_bench with the options activated: /db_bench --benchmarks="fillrandom,stats" --statistics --num=10000000 -base_background_compactions=16 -periodic_compaction_seconds=1 ./db_bench --benchmarks="fillrandom,stats" --statistics --num=10000000 -base_background_compactions=16 -fifo_compaction_ttl=1 Setting the time to one second causes non-zero bytes read/written for those compaction reasons. Disabling them or setting them to times longer than the test run length causes the stats to return to zero as expected. Delete-triggered compaction counting is tested in DBTablePropertiesTest.DeletionTriggeredCompactionMarking Reviewed By: ajkr Differential Revision: D22693050 Pulled By: akabcenell fbshipit-source-id: d15cef4d94576f703015c8942d5f0d492f69401d	5 years ago
Jay Zhuang	00de699096	Replace reinterpret_cast with static_cast_with_check (#7067 ) Summary: Replace `reinterpret_cast` with `static_cast_with_check` for `DBImpl` and `ColumnFamilyHandleImpl`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7067 Reviewed By: siying Differential Revision: D22361587 Pulled By: jay-zhuang fbshipit-source-id: dfe9e8f3af39c3d27cc372c55ab9ad905eb0a5a1	5 years ago
Peter Dillinger	52d59e0c93	Revert "Whole DBTest to skip fsync (#7049 )" (#7070 ) Summary: This reverts commit `4f1534bdb0`. This commit caused failures and deadlocks in MultiThreadedDBTest.MultiThreaded/69 and others. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7070 Reviewed By: riversand963 Differential Revision: D22358778 Pulled By: pdillinger fbshipit-source-id: faf8f2cb469a7063a113921c8e9c64a9f7610dac	5 years ago
sdong	4f1534bdb0	Whole DBTest to skip fsync (#7049 ) Summary: After https://github.com/facebook/rocksdb/pull/7036, we still see extra DBTest that can timeout when running 10 or 20 in parallel. Expand skip-fsync mode in whole DBTest. Still preserve other tests from doing this mode to be conservative. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7049 Test Plan: Run all existing files. Reviewed By: pdillinger Differential Revision: D22301700 fbshipit-source-id: f9a9e3b3b26ce640665a47cb8bff33ba0c89b565	5 years ago
Akanksha Mahajan	2677bd5967	Add logs and stats in DeleteScheduler (#6927 ) Summary: Add logs and stats for files marked as trash and files deleted immediately in DeleteScheduler Pull Request resolved: https://github.com/facebook/rocksdb/pull/6927 Test Plan: make check -j64 Reviewed By: riversand963 Differential Revision: D21869068 Pulled By: akankshamahajan15 fbshipit-source-id: e9f673c4fa8049ce648b23c75d742f2f9c6c57a1	5 years ago
Derrick Pallas	5272305437	Fix FilterBench when RTTI=0 (#6732 ) Summary: The dynamic_cast in the filter benchmark causes release mode to fail due to no-rtti. Replace with static_cast_with_check. Signed-off-by: Derrick Pallas <derrick@pallas.us> Addition by peterd: Remove unnecessary 2nd template arg on all static_cast_with_check Pull Request resolved: https://github.com/facebook/rocksdb/pull/6732 Reviewed By: ltamasi Differential Revision: D21304260 Pulled By: pdillinger fbshipit-source-id: 6e8eb437c4ca5a16dbbfa4053d67c4ad55f1608c	5 years ago
Peter Dillinger	249eff0f30	Stats for redundant insertions into block cache (#6681 ) Summary: Since read threads do not coordinate on loading data into block cache, two threads between Lookup and Insert can end up loading and inserting the same data. This is particularly concerning with cache_index_and_filter_blocks since those are hot and more likely to be race targets if ejected from (or not pre-populated in) the cache. Particularly with moves toward disaggregated / network storage, the cost of redundant retrieval might be high, and we should at least have some hard statistics from which we can estimate impact. Example with full filter thrashing "cliff": $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10 ... $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 \| egrep '^rocksdb.block.cache.(.add\|.redundant)' \| grep -v compress \| sort rocksdb.block.cache.add COUNT : 14181 rocksdb.block.cache.add.failures COUNT : 0 rocksdb.block.cache.add.redundant COUNT : 476 rocksdb.block.cache.data.add COUNT : 12749 rocksdb.block.cache.data.add.redundant COUNT : 18 rocksdb.block.cache.filter.add COUNT : 1003 rocksdb.block.cache.filter.add.redundant COUNT : 217 rocksdb.block.cache.index.add COUNT : 429 rocksdb.block.cache.index.add.redundant COUNT : 241 $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 \| egrep '^rocksdb.block.cache.(.add\|.redundant)' \| grep -v compress \| sort rocksdb.block.cache.add COUNT : 1182223 rocksdb.block.cache.add.failures COUNT : 0 rocksdb.block.cache.add.redundant COUNT : 302728 rocksdb.block.cache.data.add COUNT : 31425 rocksdb.block.cache.data.add.redundant COUNT : 12 rocksdb.block.cache.filter.add COUNT : 795455 rocksdb.block.cache.filter.add.redundant COUNT : 130238 rocksdb.block.cache.index.add COUNT : 355343 rocksdb.block.cache.index.add.redundant COUNT : 172478 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681 Test Plan: Some manual testing (above) and unit test covering key metrics is included Reviewed By: ltamasi Differential Revision: D21134113 Pulled By: pdillinger fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87	5 years ago
Andrew Kryczka	e60ea7fe57	fix compiler errors with -DNPERF_CONTEXT (#6642 ) Summary: as titled Pull Request resolved: https://github.com/facebook/rocksdb/pull/6642 Test Plan: ``` $ EXTRA_CXXFLAGS="-DNPERF_CONTEXT" DEBUG_LEVEL=0 make -j48 db_bench ``` Reviewed By: riversand963 Differential Revision: D20842313 Pulled By: ajkr fbshipit-source-id: a830cad312ca681591f06749242279503b101df2	5 years ago
Burton Li	df62cd5b35	Fix msvc debug test failures (#6579 ) Summary: 1. stats_history_test: one slice of stats history is 12526 Bytes, which is greater than original assumption. ![image](https://user-images.githubusercontent.com/17753898/77381970-5a611a80-6d3c-11ea-9d64-59d2e3c04f79.png) 2. table_test: in VerifyBlockAccessTrace function, release trace reader before delete trace file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6579 Reviewed By: siying Differential Revision: D20767373 Pulled By: pdillinger fbshipit-source-id: e8647d665cbe83a3f5429639c6219b50c0912124	5 years ago
sdong	fdf882ded2	Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433 ) Summary: When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433 Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag. Differential Revision: D19977691 fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e	5 years ago
sdong	24c9dce825	Remove include math.h (#6373 ) Summary: We see some odd errors complaining math. However, it doesn't seem that it is needed to be included. Remove the include of math.h. Just removing it from db_bench doesn't seem to break anything. Replacing sqrt from std::sqrt seems to work for histogram.cc Pull Request resolved: https://github.com/facebook/rocksdb/pull/6373 Test Plan: Watch Travis and appveyor to run. Differential Revision: D19730068 fbshipit-source-id: d3ad41defcdd9f51c2da1a3673fb258f5dfacf47	5 years ago
sdong	e8263dbdaa	Apply formatter to recent 200+ commits. (#5830 ) Summary: Further apply formatter to more recent commits. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5830 Test Plan: Run all existing tests. Differential Revision: D17488031 fbshipit-source-id: 137458fd94d56dd271b8b40c522b03036943a2ab	6 years ago
Maysam Yabandeh	6ec6a4a9a4	Remove snap_refresh_nanos option (#5826 ) Summary: The snap_refresh_nanos option didn't bring much benefit. Remove the feature to simplify the code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5826 Differential Revision: D17467147 Pulled By: maysamyabandeh fbshipit-source-id: 4f950b046990d0d1292d7fc04c2ccafaf751c7f0	6 years ago
Shylock Hg	9eb3e1f77d	Use delete to disable automatic generated methods. (#5009 ) Summary: Use delete to disable automatic generated methods instead of private, and put the constructor together for more clear.This modification cause the unused field warning, so add unused attribute to disable this warning. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5009 Differential Revision: D17288733 fbshipit-source-id: 8a767ce096f185f1db01bd28fc88fef1cdd921f3	6 years ago
Wilfried Goesgens	fbab9913e2	upgrade gtest 1.7.0 => 1.8.1 for json result writing Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5332 Differential Revision: D17242232 fbshipit-source-id: c0d4646556a1335e51ac7382b986ca7f6ced7b64	6 years ago
git-hulk	cdb6334e68	MOD: trim last space and comma in perf context and iostat context ToString() Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5755 Differential Revision: D17165190 Pulled By: riversand963 fbshipit-source-id: a3a4633961bfe019bf360f97a4c4d36464e7fa0b	6 years ago
jsteemann	a2e46eae46	fix compiling with `-DNPERF_CONTEXT` (#5704 ) Summary: This was previously broken, as the performance context-related macro signatures in file monitoring/perf_context_imp.h deviated for the case when NPERF_CONTEXT was defined and when it was not. Update the macros for the `-DNPERF_CONTEXT` case, so it compiles. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5704 Differential Revision: D16867746 fbshipit-source-id: 05539724cb1f7955ecc42828365836a677759ad9	6 years ago
Maysam Yabandeh	208556ee13	WritePrepared: fix Get without snapshot (#5664 ) Summary: if read_options.snapshot is not set, ::Get will take the last sequence number after taking a super-version and uses that as the sequence number. Theoretically max_eviceted_seq_ could advance this sequence number. This could lead ::IsInSnapshot that will be invoked by the ReadCallback to notice the absence of the snapshot. In this case, the ReadCallback should have passed a non-value to snap_released so that it could be set by the ::IsInSnapshot. The patch does that, and adds a unit test to verify it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5664 Differential Revision: D16614033 Pulled By: maysamyabandeh fbshipit-source-id: 06fb3fd4aacd75806ed1a1acec7961f5d02486f2	6 years ago
Zhongyi Xie	cfdf2116d3	Exclude StatsHistoryTest.ForceManualFlushStatsCF test from lite mode (#5529 ) Summary: Recent commit `3886dddc3b` introduced a new test which is not compatible with lite mode and breaks contrun test: ``` [ RUN ] StatsHistoryTest.ForceManualFlushStatsCF monitoring/stats_history_test.cc:642: Failure Expected: (cfd_stats->GetLogNumber()) < (cfd_test->GetLogNumber()), actual: 15 vs 15 ``` This PR excludes the test from lite mode to appease the failing test Pull Request resolved: https://github.com/facebook/rocksdb/pull/5529 Differential Revision: D16080892 Pulled By: miasantreble fbshipit-source-id: 2f8a22758f71250cd9f204046404226ddc13b028	6 years ago
Zhongyi Xie	3886dddc3b	force flushing stats CF to avoid holding old logs (#5509 ) Summary: WAL records RocksDB writes to all column families. When user flushes a a column family, the old WAL will not accept new writes but cannot be deleted yet because it may still contain live data for other column families. (See https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log#life-cycle-of-a-wal for detailed explanation) Because of this, if there is a column family that receive very infrequent writes and no manual flush is called for it, it could prevent a lot of WALs from being deleted. PR https://github.com/facebook/rocksdb/pull/5046 introduced persistent stats column family which is a good example of such column families. Depending on the config, it may have long intervals between writes, and user is unaware of it which makes it difficult to call manual flush for it. This PR addresses the problem for persistent stats column family by forcing a flush for persistent stats column family when 1) another column family is flushed 2) persistent stats column family's log number is the smallest among all column families, this way persistent stats column family will keep advancing its log number when necessary, allowing RocksDB to delete old WAL files. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5509 Differential Revision: D16045896 Pulled By: miasantreble fbshipit-source-id: 286837b633e988417f0096ff38384742d3b40ef4	6 years ago
Zhongyi Xie	ddd088c8b9	fix rocksdb lite and clang contrun test failures (#5477 ) Summary: recent commit `671d15cbdd` introduced some test failures: ``` ===== Running stats_history_test [==========] Running 9 tests from 1 test case. [----------] Global test environment set-up. [----------] 9 tests from StatsHistoryTest [ RUN ] StatsHistoryTest.RunStatsDumpPeriodSec monitoring/stats_history_test.cc:63: Failure dbfull()->SetDBOptions({{"stats_dump_period_sec", "0"}}) Not implemented: Not supported in ROCKSDB LITE db/db_options_test.cc:28:11: error: unused variable 'kMicrosInSec' [-Werror,-Wunused-const-variable] const int kMicrosInSec = 1000000; ``` This PR fixes these failures Pull Request resolved: https://github.com/facebook/rocksdb/pull/5477 Differential Revision: D15871814 Pulled By: miasantreble fbshipit-source-id: 0a7023914d2c1784d9d2d3f5bfb47310d4855394	6 years ago

1 2 3

109 Commits (6cca9fab7cda61365b19910fb7b9440cc8bd8ac2)