Summary:
The first parameter of SyncPoint::Process is "const std::string&". The majority, maybe all, of the actual calls to this function use a "const char *". The conversion before entering the function requires a construction of a std::string object on the heap. This std::object is then typically not needed because first use of the string is a rocksdb::Slice which has a less costly conversion of char * to slice.
Example:
We have a load and iterate test. The test loads 10m keys and iterates most via 10 rocksdb::Iterator objects. We used TCMALLOC to gather information about allocation and space usage during iterators.
- Before this PR: test took 32 min 17 sec
- After this PR: test took 1 min 14 sec
The TCMALLOC top object list before this PR:
<pre>
Total: 5105999 objects
5003717 98.0% 98.0% 5009471 98.1% rocksdb::DBIter::MergeValuesNewToOld (inline)
20260 0.4% 98.4% 20260 0.4% std::__cxx11::basic_string::_M_mutate
15214 0.3% 98.7% 15214 0.3% rocksdb::UncompressBlockContentsForCompressionType (inline)
13408 0.3% 99.0% 13408 0.3% std::_Rb_tree::_M_emplace_hint_unique [clone .constprop.416] (inline)
12957 0.3% 99.2% 12957 0.3% std::_Rb_tree::_M_emplace_hint_unique [clone .constprop.405] (inline)
9327 0.2% 99.4% 9327 0.2% std::_Rb_tree::_M_copy (inline)
7691 0.2% 99.5% 9919 0.2% JVM_FindSignal
2859 0.1% 99.6% 2859 0.1% rocksdb::Cleanable::RegisterCleanup
2844 0.1% 99.7% 2844 0.1% std::map::operator[] (inline)
</pre>
The "MergeValuesNewToOld (inline)" objects are the #define wrappers to SyncPoint::Process. We discovered this in a 5.18 rocksdb release. There TCMALLOC was more specific that std::basic_string was being constructed. I believe that was before SyncPoint::Process was declared inline in subsequent releases.
The TCMALLOC top object list after this PR:
<pre>
Total: 104911 objects
45090 43.0% 43.0% 45090 43.0% rocksdb::Cleanable::RegisterCleanup
29995 28.6% 71.6% 29995 28.6% rocksdb::LRUCacheShard::Insert
15229 14.5% 86.1% 15229 14.5% rocksdb::UncompressBlockContentsForCompressionType (inline)
4373 4.2% 90.3% 4551 4.3% JVM_FindSignal
2881 2.7% 93.0% 2881 2.7% rocksdb::::ReadBlockFromFile (inline)
1162 1.1% 94.1% 1176 1.1% rocksdb::BlockFetcher::ReadBlockContents (inline)
1036 1.0% 95.1% 1036 1.0% std::__cxx11::basic_string::_M_mutate
869 0.8% 95.9% 869 0.8% std::vector::_M_realloc_insert (inline)
806 0.8% 96.7% 806 0.8% SnmpAgent::GetVariables (inline)
</pre>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9023
Reviewed By: pdillinger
Differential Revision: D31610907
Pulled By: mrambacher
fbshipit-source-id: 574ff51b639dd46ad253a8e664a575f06b7cc85d
Summary:
Refactor kill point to one single class, rather than several extern variables. The intention was to drop unflushed data before killing to simulate some job, and I tried to a pointer to fault ingestion fs to the killing class, but it ended up with harder than I thought. Perhaps we'll need to do this in another way. But I thought the refactoring itself is good so I send it out.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8241
Test Plan: make release and run crash test for a while.
Reviewed By: anand1976
Differential Revision: D28078486
fbshipit-source-id: f9182c1455f52e6851c13f88a21bade63bcec45f
Summary:
This is a PR generated **semi-automatically** by an internal tool to remove unused includes and `using` statements.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7604
Test Plan: make check
Reviewed By: ajkr
Differential Revision: D24579392
Pulled By: riversand963
fbshipit-source-id: c4bfa6c6b08da1de186690d37eb73d8fff45aecd
Summary:
(1) Skip check on specific key if restoring an old backup
(small minority of cases) because it can fail in those cases. (2) Remove
an old assertion about number of column families and number of keys
passed in, which is broken by atomic flush (cf_consistency) test. Like
other code (for better or worse) assume a single key and iterate over
column families. (3) Apply mock_direct_io to NewSequentialFile so that
db_stress backup works on /dev/shm.
Also add more context to output in case of backup/restore db_stress
failure.
Also a minor fix to BackupEngine to report first failure status in
creating new backup, and drop another clue about the potential
source of a "Backup failed" status.
Reverts "Disable backup/restore stress test (https://github.com/facebook/rocksdb/issues/7350)"
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7357
Test Plan:
Using backup_one_in=10000,
"USE_CLANG=1 make crash_test_with_atomic_flush" for 30+ minutes
"USE_CLANG=1 make blackbox_crash_test" for 30+ minutes
And with use_direct_reads with TEST_TMPDIR=/dev/shm/rocksdb
Reviewed By: riversand963
Differential Revision: D23567244
Pulled By: pdillinger
fbshipit-source-id: e77171c2e8394d173917e36898c02dead1c40b77
Summary:
Cleans up some of the dependencies on test code in the Makefile while building tools:
- Moves the test::RandomString, DBBaseTest::RandomString into Random
- Moves the test::RandomHumanReadableString into Random
- Moves the DestroyDir method into file_utils
- Moves the SetupSyncPointsToMockDirectIO into sync_point.
- Moves the FaultInjection Env and FS classes under env
These changes allow all of the tools to build without dependencies on test_util, thereby simplifying the build dependencies. By moving the FaultInjection code, the dependency in db_stress on different libraries for debug vs release was eliminated.
Tested both release and debug builds via Make and CMake for both static and shared libraries.
More work remains to clean up how the tools are built and remove some unnecessary dependencies. There is also more work that should be done to get the Makefile and CMake to align in their builds -- what is in the libraries and the sizes of the executables are different.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7097
Reviewed By: riversand963
Differential Revision: D22463160
Pulled By: pdillinger
fbshipit-source-id: e19462b53324ab3f0b7c72459dbc73165cc382b2
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
Differential Revision: D19977691
fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
Summary:
There are too many types of files under util/. Some test related files don't belong to there or just are just loosely related. Mo
ve them to a new directory test_util/, so that util/ is cleaner.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377
Differential Revision: D15551366
Pulled By: siying
fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
Summary:
Current `log::Reader` does not perform retry after encountering `EOF`. In the future, we need the log reader to be able to retry tailing the log even after `EOF`.
Current implementation is simple. It does not provide more advanced retry policies. Will address this in the future.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4394
Differential Revision: D9926508
Pulled By: riversand963
fbshipit-source-id: d86d145792a41bd64a72f642a2a08c7b7b5201e1
Summary:
Disable direct reads for log and manifest. Direct reads should not affect sequential_file
Also add kDirectIO for option_config_ in db_test_util
Closes https://github.com/facebook/rocksdb/pull/2337
Differential Revision: D5100261
Pulled By: lightmark
fbshipit-source-id: 0ebfd13b93fa1b8f9acae514ac44f8125a05868b
Summary:
Fixes compile error:
In file included from ./util/statistics.h:17:0,
from ./util/stop_watch.h:8,
from ./util/perf_step_timer.h:9,
from ./util/iostats_context_imp.h:8,
from ./util/posix_logger.h:27,
from ./port/util_logger.h:18,
from ./db/auto_roll_logger.h:15,
from db/auto_roll_logger.cc:6:
./util/thread_local.h:65:16: error: 'function' in namespace 'std' does not name a template type
typedef std::function<void(void*, void*)> FoldFunc;
Closes https://github.com/facebook/rocksdb/pull/1656
Differential Revision: D4318702
Pulled By: yiwu-arbug
fbshipit-source-id: 8c5d17a
Summary:
We were frequently seeing a race between SyncPoint::Process() and
SyncPoint::~SyncPoint() (e.g.,
https://our.intern.facebook.com/intern/sandcastle/log/?instance_id=207289975&step_id=2412725431).
The issue was marked_thread_id_ gets deleted when the main thread is exiting and
simultaneously background threads may access it. We can prevent this race
condition by checking whether sync points are disabled (assuming the test terminates
with them disabled) before attempting to access that member. I do not understand
why accesses to other members (mutex_ and enabled_) are ok but anyways the
test no longer fails tsan.
Test Plan: ran tests
Reviewers: sdong, yhchiang, IslamAbdelRahman, yiwu, wanning
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D62133
Summary: Add markers to sync points. A marked sync point will only be active when it is on the same thread as the marker sync point.
Test Plan: Write a unit test to validate.
Reviewers: sdong, IslamAbdelRahman, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60375
Summary:
In crash test, when coming to each kill point, we start a random class using seed as current second. With this approach, for every second, the random number used is the same. However, in each second, there are multiple kill points with different frequency. It makes it hard to reason about chance of kill point to trigger. With this commit, we use thread local random seed to generate the random number, so that it will take different values per second, hoping it makes chances of killing much easier to reason about.
Also significantly reduce the kill odd to make sure time before kiling is similar as before.
Test Plan: Run white box crash test and see the killing happens as expected and the run time time before killing reasonable.
Reviewers: kradhakrishnan, IslamAbdelRahman, rven, yhchiang, andrewkr, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D52971
Summary: crash_test still has a very low chance to hit some crash point. Have another mode for covering them more likely.
Test Plan: Run crash_test and see db_stress is called with expected prameters.
Reviewers: kradhakrishnan, igor, anthony, rven, IslamAbdelRahman, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49473
Summary:
Give a name for every kill point, and allow users to disable some kill points based on prefixes. The kill points can be passed by db_stress through a command line paramter. This provides a way for users to boost the chance of triggering low frequency kill points
This allow follow up changes in crash test scripts to improve crash test coverage.
Test Plan:
Manually run db_stress with variable values of --kill_random_test and --kill_prefix_blacklist. Like this:
--kill_random_test=2 --kill_prefix_blacklist=Posix,WritableFileWriter::Append,WritableFileWriter::WriteBuffered,WritableFileWriter::Sync
Reviewers: igor, kradhakrishnan, rven, IslamAbdelRahman, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D48735
Summary: We want to keep Env a think layer for better portability. Less platform dependent codes should be moved out of Env. In this patch, I create a wrapper of file readers and writers, and put rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env in the future.
Test Plan: Run all existing unit tests.
Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42321
Summary:
Allow users to give a callback function with parameter using sync point, so more complicated verification can be done in tests.
Use it in DBTest.DynamicLevelCompressionPerLevel2 so that failures will be more easy to debug.
Test Plan: Run all tests. Run DBTest.DynamicLevelCompressionPerLevel2 with valgrind check.
Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D36999
Summary:
When having fixed max_bytes_for_level_base, the ratio of size of largest level and the second one can range from 0 to the multiplier. This makes LSM tree frequently irregular and unpredictable. It can also cause poor space amplification in some cases.
In this improvement (proposed by Igor Kabiljo), we introduce a parameter option.level_compaction_use_dynamic_max_bytes. When turning it on, RocksDB is free to pick a level base in the range of (options.max_bytes_for_level_base/options.max_bytes_for_level_multiplier, options.max_bytes_for_level_base] so that real level ratios are close to options.max_bytes_for_level_multiplier.
Test Plan: New unit tests and pass tests suites including valgrind.
Reviewers: MarkCallaghan, rven, yhchiang, igor, ikabiljo
Reviewed By: ikabiljo
Subscribers: yoshinorim, ikabiljo, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D31437
Summary:
The LoadDependency function does not take a lock when it runs
and it could be modifying data structures while other threads are
accessing it.
Test Plan: Run TSAN.
Reviewers: igor, sdong
Reviewed By: igor, sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D34095
Summary:
We don't really need sync_point.o if we're compiling with NDEBUG.
This diff depends on D17823
Test Plan: compiles
Reviewers: haobo, ljin, sdong
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17829
Summary: This patch fixed a race condition where a log file is moved to archived dir in the middle of GetSortedWalFiles. Without the fix, the log file would be missed in the result, which leads to transaction log iterator gap. A test utility SyncPoint is added to help reproducing the race condition.
Test Plan: TransactionLogIteratorRace; make check
Reviewers: dhruba, ljin
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17121