Summary. A change https://reviews.facebook.net/differential/diff/224721/
Has attempted to move common functionality out of platform dependent
code to a new facility called file_reader_writer.
This includes:
- perf counters
- Buffering
- RateLimiting
However, the change did not attempt to refactor Windows code.
To mitigate, we introduce new quering interfaces such as UseOSBuffer(),
GetRequiredBufferAlignment() and ReaderWriterForward()
for pure forwarding where required.
Introduce WritableFile got a new method Truncate(). This is to communicate
to the file as to how much data it has on close.
- When space is pre-allocated on Linux it is filled with zeros implicitly,
no such thing exist on Windows so we must truncate file on close.
- When operating in unbuffered mode the last page is filled with zeros but we still want to truncate.
Previously, Close() would take care of it but now buffer management is shifted to the wrappers and the file has
no idea about the file true size.
This means that Close() on the wrapper level must always include
Truncate() as well as wrapper __dtor should call Close() and
against double Close().
Move buffered/unbuffered write logic to the wrapper.
Utilize Aligned buffer class.
Adjust tests and implement Truncate() where necessary.
Come up with reasonable defaults for new virtual interfaces.
Forward calls for RandomAccessReadAhead class to avoid double
buffering and locking (double locking in unbuffered mode on WIndows).
Summary: RandomRWFile is not used anywhere in out code base, this patch remove RandomRWFile
Test Plan:
make check -j64
USE_CLANG=1 make all -j64
OPT=-DROCKSDB_LITE make release -j64
Reviewers: sdong, yhchiang, anthony, kradhakrishnan, rven, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D44091
Summary: We want to keep Env a think layer for better portability. Less platform dependent codes should be moved out of Env. In this patch, I create a wrapper of file readers and writers, and put rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env in the future.
Test Plan: Run all existing unit tests.
Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D42321
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.
Test Plan: Build it.
Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D41961
Summary: Make RocksDb build and run on Windows to be functionally
complete and performant. All existing test cases run with no
regressions. Performance numbers are in the pull-request.
Test plan: make all of the existing unit tests pass, obtain perf numbers.
Co-authored-by: Praveen Rao praveensinghrao@outlook.com
Co-authored-by: Sherlock Huang baihan.huang@gmail.com
Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
Summary: It used to be no good (known to me) non-intrusive way to wrap WritableFile - you can't call protected virtual methods of the wrapped pointer to WritableFile. This diff adds a convenience class WritableFileWrapper that makes wrapping WritableFile both possible and easy.
Test Plan: `make clean; make -j release`, `make clean; OPT=-DROCKSDB_LITE make release`, `make clean; USE_CLANG=1 make -j all`.
Reviewers: sdong, yhchiang, rven
Reviewed By: rven
Subscribers: dhruba, tnovak, march
Differential Revision: https://reviews.facebook.net/D39147
Summary:
Without this change, someone on the machine on which
I run "make check" could cause me to overwrite arbitrary
files owned by me, via a symlink attack.
Instead of using a predictable temporary directory and
accepting to use a preexisting one, always create a new
one using mkdtemp. If $TEST_IOCTL_FRIENDLY_TMPDIR is
set and usable, attempt first to find a usable
temporary directory therein. If not, or if unusable,
then try /var/tmp and /tmp. If none of those is usable
abort with a diagnostic.
To do that, I added a new class.
Its constructor finds a suitable directory or aborts,
the sole member prints that directory's name, and the
destructor unlinks what should be an empty directory.
Note that while the code before this did not remove
its temporary directory, there was only one per $UID.
Now, there would be at least one per run or one per
test, depending on implementation, so it is important
to remove them.
Test Plan:
Run this on a fedora rawhide system, where /tmp
is a tmpfs file system, and /var/tmp is ext4.
# This gives a diagnostic that /dev/shm is not suitable
# and ends up using /var/tmp.
TEST_IOCTL_FRIENDLY_TMPDIR=/dev/shm ./env_test
# Uses /var/tmp; same as when envvar not set.
TEST_IOCTL_FRIENDLY_TMPDIR=/var/tmp ./env_test
# Uses /tmp unless it's tmpfs, in which case it gives
# a diagnostic and uses /var/tmp.
TEST_IOCTL_FRIENDLY_TMPDIR=/tmp ./env_test
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37287
Summary:
[NB: this is a prerequisite for the /tmp-abuse-fixing patch]
This avoids spurious test failure on Linux systems
like Fedora for which /tmp is a tmpfs file system.
On a devtmpfs file
system, ioctl(fd, FS_IOC_GETVERSION, &version) returns -1 with
errno == ENOTTTY, indicating that that ioctl is not supported
on such a file system. Do not let this cause test failures, e.g.,
where env_test would assert that file->GetUniqueId(...) > 0.
Before this change, ./env_test would fail these three tests
on a fedora rawhide system:
[ FAILED ] 3 tests, listed below:
[ FAILED ] EnvPosixTest.RandomAccessUniqueID
[ FAILED ] EnvPosixTest.RandomAccessUniqueIDConcurrent
[ FAILED ] EnvPosixTest.RandomAccessUniqueIDDeletes
3 FAILED TESTS
The fix:
When support for that ioctl is lacking, skip each affected test.
Could be improved by noting which sub-tests are being skipped.
Test Plan:
run these on F21 and note that they now pass.
TEST_TMPDIR=/dev/shm/rdb ./env_test
./env_test
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37323
Summary: Check for state of task before deleting it.
Test Plan: Run env_test with TSAN
Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: meyering, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35283
Summary:
Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different.
In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest.
There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides.
```lang=bash
% cat ~/transform
#!/bin/sh
files=$(git ls-files '*test\.cc')
for file in $files
do
if grep -q "rocksdb::test::RunAllTests()" $file
then
if grep -Eq '^class \w+Test {' $file
then
perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file
perl -pi -e 's/^(TEST)/${1}_F/g' $file
fi
perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file
perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file
fi
done
% sh ~/transform
% make format
```
Second iteration of this diff contains only scripted changes.
Third iteration contains manual changes to fix last errors and make it compilable.
Test Plan:
Build and notice no errors.
```lang=bash
% USE_CLANG=1 make check -j55
```
Tests are still testing.
Reviewers: meyering, sdong, rven, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D35157
Summary:
On RocksDB, when there are multiple instances doing
flushes/compactions in the background, the close call takes a long time
because the flushes/compactions need to complete before the database can
shut down. If another instance is using the background threads and the compaction for this instance is in the queue since it has been scheduled, we still cannot shutdown. We now remove the scheduled background tasks which have not yet started running, so that shutdown is speeded up.
Test Plan: DB Test added.
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D33741
Summary:
In some environment such as android, the c++ library does not have
std::to_string. This path adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is not available, and implements std::to_string
in the other case.
Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test
Reviewers: ljin, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D29181
It seems that on some FS we get more blocks than we ask for. This is
already handled when checking the allocated number of blocks, but
after the file is closed it checks for an exact number of blocks,
which fails on my machine.
I changed the test to add one full page to the size, then calculate
the expected number of blocks and check if the actual number of blocks
is less or equal to that.
Summary:
With the patch, thread pool size will be automatically increased if DB's options ask for more parallelism of compactions or flushes.
Too many users have been confused by the API. Change it to make it harder for users to make mistakes
Test Plan: Add two unit tests to cover the function.
Reviewers: yhchiang, rven, igor, MarkCallaghan, ljin
Reviewed By: ljin
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D27555
Summary:
TestMemEnv simulates all Env APIs using in-memory data structures.
We can use it to speed up db_test run, which is now reduced ~7mins when it is
enabled.
We can also add features to simulate power/disk failures in the next
step
TestMemEnv is derived from helper/mem_env
mem_env can not be used for rocksdb since some of its APIs do not give
the same results as env_posix. And its file read/write is not thread safe
Test Plan:
make all -j32
./db_test
./env_mem_test
Reviewers: sdong, yhchiang, rven, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D28035
Summary:
ftruncate does not always free preallocated unused space at the end of file.
In some cases, we pin too much disk space than it should
Test Plan: env_test
Reviewers: sdong, rven, yhchiang, igor
Reviewed By: igor
Subscribers: nkg-, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D25641
Summary: RocksDB already depends on C++11, so we might as well all the goodness that C++11 provides. This means that we don't need AtomicPointer anymore. The less things in port/, the easier it will be to port to other platforms.
Test Plan: make check + careful visual review verifying that NoBarried got memory_order_relaxed, while Acquire/Release methods got memory_order_acquire and memory_order_release
Reviewers: rven, yhchiang, ljin, sdong
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D27543
Summary:
Now the file summary is too small for printing. Enlarge it.
To enable it, allow to pass a size to log buffer.
Test Plan:
Add a unit test.
make all check
Reviewers: ljin, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21723
Summary: To avoid false positive test failures when the file system doesn't support fallocate. In EnvTest.AllocateTest, we first make a simple fallocate call and check the error codes to rule out the possibility that it is not supported. Skip the test if the error code indicates it is not supported.
Test Plan: Run the test and make sure it passes on file systems supporting and not supporting fallocate
Reviewers: yhchiang, ljin, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23667
Summary:
Lots of travis builds are failing because on EnvPosixTest.RandomAccessUniqueID: https://travis-ci.org/facebook/rocksdb/builds/34400833
This is the result of their environment and not because of RocksDB's bug.
Also note that RocksDB works correctly even though UniqueID feature is not present in the system (as it's the case with os x)
Test Plan:
OPT=-DTRAVIS make env_test && ./env_test
Observed that offending tests are not being run
Reviewers: sdong, yhchiang, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22803
Summary:
Add a feature to decrease the number of threads in thread pool.
Also instantly schedule more threads if number of threads is increased.
Here is the way it is implemented: each background thread needs its thread ID. After decreasing number of threads, all threads are woken up. The thread with the largest thread ID will terminate. If there are more threads to terminate, the thread will wake up all threads again.
Another change is made so that when number of threads is increased, more threads are created and all previous excessive threads are woken up to do the work.
Test Plan: Add a unit test.
Reviewers: haobo, dhruba
Reviewed By: haobo
CC: yhchiang, igor, nkg-, leveldb
Differential Revision: https://reviews.facebook.net/D18675
Summary: For some reason, on a subset of our continuous build machines, preallocation is allocating 8 block more than it should be. Let's relax the test a little bit -- now we require the test to allocate *at least* the number of blocks as we told them to.
Test Plan: no
Reviewers: ljin, haobo, sdong
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18141
Summary: XCode for some reason injects `#define DEBUG 1` into our code, which makes compile fail because we use `DEBUG` keyword for other stuff. This diff fixes the issue by renaming `DEBUG` to `DEBUG_LEVEL`.
Test Plan: compiles
Reviewers: dhruba, haobo, sdong, yhchiang, ljin
Reviewed By: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17709
Summary: Compiling for iOS has by default turned on -Wmissing-prototypes, which causes rocksdb to fail compiling. This diff turns on -Wmissing-prototypes in our compile options and cleans up all functions with missing prototypes.
Test Plan: compiles
Reviewers: dhruba, haobo, ljin, sdong
Reviewed By: ljin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17649
Summary: LogBuffer::AddLogToBuffer() uses vsnprintf() in the wrong way, which might cause buffer overflow when log line is too line. Fix it.
Test Plan: Add a unit test to cover most LogBuffer's most logic.
Reviewers: igor, haobo, dhruba
Reviewed By: igor
CC: ljin, yhchiang, leveldb
Differential Revision: https://reviews.facebook.net/D17103
Summary: Add a function to Env so that users can query the waiting queue length of each thread pool
Test Plan: add a test in env_test
Reviewers: haobo
Reviewed By: haobo
CC: dhruba, igor, yhchiang, ljin, nkg-, leveldb
Differential Revision: https://reviews.facebook.net/D16755
Summary:
Blocks allocated with fallocate will take extra space on disk even if they are unused and the file is close.
Now we remove the extra blocks at the end of the file by calling `ftruncate`.
Test Plan: added a test to env_test
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16647
Summary:
With the use of tmpfs or ramfs, unit tests related to GetUniqueID()
failed because of the failure from ioctl, which doesn't work with these
fancy file systems at all.
I fixed this issue and make sure all related tests run on the "regular"
storage (disk or flash).
Test Plan: TEST_TMPDIR=/dev/shm make check -j32
Reviewers: igor, dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16593
Summary: This diff invoves some more complicated issues in the posix environment.
Test Plan: works under mac os. will need to verify dev box.
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14061
Summary:
tmpfs might not support fallocate(). Fix unit test so that this
does not cause a unit test to fail.
Test Plan: ./env_test
Reviewers: emayanke, igor, kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13455
Summary: I have implemented basic simple use case that I need for External Value Store I'm working on. There is a potential for making this prettier by refactoring/combining WritableFile and RandomAccessFile, avoiding some copypasta. However, I decided to implement just the basic functionality, so I can continue working on the other diff.
Test Plan: Added a unittest
Reviewers: dhruba, haobo, kailiu
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13365
Summary:
Change namespace from leveldb to rocksdb. This allows a single
application to link in open-source leveldb code as well as
rocksdb code into the same process.
Test Plan: compile rocksdb
Reviewers: emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13287
Summary:
Added a new api to the Environment that allows clearing out not-needed
pages from the OS cache. This will be helpful when the compressed
block cache replaces the OS cache.
Test Plan: EnvPosixTest.InvalidateCache
Reviewers: haobo
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13041
Summary: move the TwoPools test to the end of thread related tests. Otherwise, the SetBackgroundThreads call would increase the Low pool size and affect the result of other tests.
Test Plan: make env_test; ./env_test
Reviewers: dhruba, emayanke, xjin
Reviewed By: xjin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D12939
Summary:
this is the ground work for separating memtable flush jobs to their own thread pool.
Both SetBackgroundThreads and Schedule take a third parameter Priority to indicate which thread pool they are working on. The names LOW and HIGH are just identifiers for two different thread pools, and does not indicate real difference in 'priority'. We can set number of threads in the pools independently.
The thread pool implementation is refactored.
Test Plan: make check
Reviewers: dhruba, emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D12885