Summary:
Many logging related source files are under util/. It will be more structured if they are together.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5387
Differential Revision: D15579036
Pulled By: siying
fbshipit-source-id: 3850134ed50b8c0bb40a0c8ae1f184fa4081303f
Summary:
There are too many types of files under util/. Some test related files don't belong to there or just are just loosely related. Mo
ve them to a new directory test_util/, so that util/ is cleaner.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377
Differential Revision: D15551366
Pulled By: siying
fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
Summary:
The recent improvement in https://github.com/facebook/rocksdb/pull/3661 could cause a deadlock: When writing recoverable state, we also commit its sequence number to commit table, which could result into evicting existing commit entry, which could result into advancing max_evicted_seq_, which would need to get snapshots from database, which requires obtaining db mutex. The patch releases db_mutex before calling the callback in WriteRecoverableState to avoid the potential deadlock. It also improves the stress tests to let the issue be manifested in the tests.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5306
Differential Revision: D15341458
Pulled By: maysamyabandeh
fbshipit-source-id: 05dcbed7e21b789fd1e5fd5ee8eea08077162323
Summary:
Annotate all of the logging functions to inform the compiler that these
use printf-style formatting arguments. This allows the compiler to emit
warnings if the format arguments are incorrect.
This also fixes many problems reported now that format string checking
is enabled. Many of these are simply mix-ups in the argument type (e.g,
int vs uint64_t), but in several cases the wrong number of arguments
were being passed in which can cause the code to crash.
The primary motivation for this was to fix the log message in
`DBImpl::SwitchMemtable()` which caused a segfault due to an extra %s
format parameter with no argument supplied.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5089
Differential Revision: D14574795
Pulled By: simpkins
fbshipit-source-id: 0921b03f0743652bf4ae21e414ff54b3bb65422a
Summary:
Enhance ::Insert and ::Verify test functions to add artificial delay between prepare and commit, and take snapshot and reads respectively. A future PR will make use of these to improve stress tests to test against long-running transactions as well as long-running backup jobs. Also randomly sets set_snapshot to false for inserters to skip setting the snapshot in the initialization phase and let the snapshot be taken later explicitly.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4970
Differential Revision: D14031342
Pulled By: maysamyabandeh
fbshipit-source-id: b52b453751f0b25b81b23c48892bc1d152464cab
Summary:
Closes https://github.com/facebook/rocksdb/issues/4462
I'm not sure if you'll be happy with `std::random_device{}`, perhaps you would want to use your rand instance instead. I didn't test to see if your rand instance supports the requirements that `std::shuffle` takes.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4482
Differential Revision: D10325133
Pulled By: yiwu-arbug
fbshipit-source-id: 47b7adaf4bb2b8d64cf090ea6b1b48ef53180581
Summary:
As you know, almost all compilers support "pragma once" keyword instead of using include guards. To be keep consistency between header files, all header files are edited.
Besides this, try to fix some warnings about loss of data.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4339
Differential Revision: D9654990
Pulled By: ajkr
fbshipit-source-id: c2cf3d2d03a599847684bed81378c401920ca848
Summary:
This adds support for writing unprepared batches based on size defined in `TransactionOptions::max_write_batch_size`. This is done by overriding methods that modify data (Put/Delete/SingleDelete/Merge) and checking first if write batch size has exceeded threshold. If so, the write batch is written to DB as an unprepared batch.
Support for Commit/Rollback for unprepared batch is added as well. This has been done by simply extending the WritePrepared Commit/Rollback logic to take care of all unprep_seq numbers either when updating prepare heap, or adding to commit map. For updating the commit map, this logic exists inside `WriteUnpreparedCommitEntryPreReleaseCallback`.
A test change was also made to have transactions unregister themselves when committing without prepare. This is because with write unprepared, there may be unprepared entries (which act similarly to prepared entries) already when a commit is done without prepare.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4104
Differential Revision: D8785717
Pulled By: lth
fbshipit-source-id: c02006e281ec1ce00f628e2a7beec0ee73096a91
Summary:
Rollback was disabled in stress test since there was a concurrency issue in WritePrepared rollback algorithm. The issue is fixed by caching the column family handles in WritePrepared to skip getting them from the db when needed for rollback.
Tested by running transaction stress test under tsan.
Closes https://github.com/facebook/rocksdb/pull/3785
Differential Revision: D7793727
Pulled By: maysamyabandeh
fbshipit-source-id: d81ab6fda0e53186ca69944cfe0712ce4869451e
Summary:
WritePrepared rollback implementation is not ready to be invoked in the middle of workload. This is due the lack of synchronization to obtain the cf handle from db. Temporarily disabling this until the problem with rollback is fixed.
Closes https://github.com/facebook/rocksdb/pull/3772
Differential Revision: D7769041
Pulled By: maysamyabandeh
fbshipit-source-id: 0e3b0ce679bc2afba82e653a40afa3f045722754
Summary:
Currently WritePrepared rolls back a transaction with prepare sequence number prepare_seq by i) write a single rollback batch with rollback_seq, ii) add <rollback_seq, rollback_seq> to commit cache, iii) remove prepare_seq from PrepareHeap.
This is correct assuming that there is no snapshot taken when a transaction is rolled back. This is the case the way MySQL does rollback which is after recovery. Otherwise if max_evicted_seq advances the prepare_seq, the live snapshot might assume data as committed since it does not find them in CommitCache.
The change is to simply add <prepare_seq. rollback_seq> to commit cache before removing prepare_seq from PrepareHeap. In this way if max_evicted_seq advances prpeare_seq, the existing mechanism that we have to check evicted entries against live snapshots will make sure that the live snapshot will not see the data of rolled back transaction.
Closes https://github.com/facebook/rocksdb/pull/3745
Differential Revision: D7696193
Pulled By: maysamyabandeh
fbshipit-source-id: c9a2d46341ddc03554dded1303520a1cab74ef9c
Summary:
This patch addressed several issues.
Portability including db_test std::thread -> port::Thread Cc: @
and %z to ROCKSDB portable macro. Cc: maysamyabandeh
Implement Env::AreFilesSame
Make the implementation of file unique number more robust
Get rid of C-runtime and go directly to Windows API when dealing
with file primitives.
Implement GetSectorSize() and aling unbuffered read on the value if
available.
Adjust Windows Logger for the new interface, implement CloseImpl() Cc: anand1976
Fix test running script issue where $status var was of incorrect scope
so the failures were swallowed and not reported.
DestroyDB() creates a logger and opens a LOG file in the directory
being cleaned up. This holds a lock on the folder and the cleanup is
prevented. This fails one of the checkpoin tests. We observe the same in production.
We close the log file in this change.
Fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose failure where the test
attempts to open a directory with NewRandomAccessFile which does not
work on Windows.
Fix DBTest.SoftLimit as it is dependent on thread timing. CC: yiwu-arbug
Closes https://github.com/facebook/rocksdb/pull/3552
Differential Revision: D7156304
Pulled By: siying
fbshipit-source-id: 43db0a757f1dfceffeb2b7988043156639173f5b
Summary:
This fixes the following warnings when compiled with GCC7:
util/transaction_test_util.cc: In static member function ‘static rocksdb::Status rocksdb::RandomTransactionInserter::DBGet(rocksdb::DB*, rocksdb::Transaction*, rocksdb::ReadOptions&, uint16_t, uint64_t, bool, uint64_t*, std::__cxx11::string*, bool*)’:
util/transaction_test_util.cc:75:8: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
Status RandomTransactionInserter::DBGet(
^~~~~~~~~~~~~~~~~~~~~~~~~
util/transaction_test_util.cc:84:11: note: ‘snprintf’ output between 5 and 6 bytes into a destination of size 5
snprintf(prefix_buf, sizeof(prefix_buf), "%.4u", set_i + 1);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/transaction_test_util.cc: In static member function ‘static rocksdb::Status rocksdb::RandomTransactionInserter::Verify(rocksdb::DB*, uint16_t, uint64_t, bool, rocksdb::Random64*)’:
util/transaction_test_util.cc:245:8: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
Status RandomTransactionInserter::Verify(DB* db, uint16_t num_sets,
^~~~~~~~~~~~~~~~~~~~~~~~~
util/transaction_test_util.cc:268:13: note: ‘snprintf’ output between 5 and 6 bytes into a destination of size 5
snprintf(prefix_buf, sizeof(prefix_buf), "%.4u", set_i + 1);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Closes https://github.com/facebook/rocksdb/pull/3295
Differential Revision: D6609411
Pulled By: maysamyabandeh
fbshipit-source-id: 33f0add471056eb59db2f8bd4366e6dfbb1a187d
Summary:
There were a few places where MSVC's implicit truncation warnings were getting triggered, which was causing the MSVC build to fail due to warnings being treated as errors. This resolves the issues by making the truncations in some places explicit, and by making it so there are no truncations of literals.
Fixes#3239
Supersedes #3259
Closes https://github.com/facebook/rocksdb/pull/3273
Reviewed By: yiwu-arbug
Differential Revision: D6569204
Pulled By: Orvid
fbshipit-source-id: c188cf1cf98d9acb6d94b71875041cc81f8ff088
Summary:
Throughput: 46k tps in our sysbench settings (filling the details later)
The idea is to have the simplest change that gives us a reasonable boost
in 2PC throughput.
Major design changes:
1. The WAL file internal buffer is not flushed after each write. Instead
it is flushed before critical operations (WAL copy via fs) or when
FlushWAL is called by MySQL. Flushing the WAL buffer is also protected
via mutex_.
2. Use two sequence numbers: last seq, and last seq for write. Last seq
is the last visible sequence number for reads. Last seq for write is the
next sequence number that should be used to write to WAL/memtable. This
allows to have a memtable write be in parallel to WAL writes.
3. BatchGroup is not used for writes. This means that we can have
parallel writers which changes a major assumption in the code base. To
accommodate for that i) allow only 1 WriteImpl that intends to write to
memtable via mem_mutex_--which is fine since in 2PC almost all of the memtable writes
come via group commit phase which is serial anyway, ii) make all the
parts in the code base that assumed to be the only writer (via
EnterUnbatched) to also acquire mem_mutex_, iii) stat updates are
protected via a stat_mutex_.
Note: the first commit has the approach figured out but is not clean.
Submitting the PR anyway to get the early feedback on the approach. If
we are ok with the approach I will go ahead with this updates:
0) Rebase with Yi's pipelining changes
1) Currently batching is disabled by default to make sure that it will be
consistent with all unit tests. Will make this optional via a config.
2) A couple of unit tests are disabled. They need to be updated with the
serial commit of 2PC taken into account.
3) Replacing BatchGroup with mem_mutex_ got a bit ugly as it requires
releasing mutex_ beforehand (the same way EnterUnbatched does). This
needs to be cleaned up.
Closes https://github.com/facebook/rocksdb/pull/2345
Differential Revision: D5210732
Pulled By: maysamyabandeh
fbshipit-source-id: 78653bd95a35cd1e831e555e0e57bdfd695355a4
Summary: Refactored db_bench transaction stress tests so that they can be called from unit tests as well.
Test Plan: run new unit test as well as db_bench
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D55203