Summary:
crash-recovery verification is failing in the whitebox testing, which may or may not be a valid correctness issue -- need more time to investigate. In the meantime, reverting so we don't mask other failures.
Closes https://github.com/facebook/rocksdb/pull/3786
Differential Revision: D7794516
Pulled By: ajkr
fbshipit-source-id: 28ccdfdb9ec9b3b0fb08c15cbf9d2e282201ff33
Summary:
Previously, our `db_stress` tool held the expected state of the DB in-memory, so after crash-recovery, there was no way to verify data correctness. This PR adds an option, `--expected_values_file`, which specifies a file holding the expected values.
In black-box testing, the `db_stress` process can be killed arbitrarily, so updates to the `--expected_values_file` must be atomic. We achieve this by `mmap`ing the file and relying on `std::atomic<uint32_t>` for atomicity. Actually this doesn't provide a total guarantee on what we want as `std::atomic<uint32_t>` could, in theory, be translated into multiple stores surrounded by a mutex. We can verify our assumption by looking at `std::atomic::is_always_lock_free`.
For the `mmap`'d file, we didn't have an existing way to expose its contents as a raw memory buffer. This PR adds it in the `Env::NewMemoryMappedFileBuffer` function, and `MemoryMappedFileBuffer` class.
`db_crashtest.py` is updated to use an expected values file for black-box testing. On the first iteration (when the DB is created), an empty file is provided as `db_stress` will populate it when it runs. On subsequent iterations, that same filename is provided so `db_stress` can check the data is as expected on startup.
Closes https://github.com/facebook/rocksdb/pull/3629
Differential Revision: D7463144
Pulled By: ajkr
fbshipit-source-id: c8f3e82c93e045a90055e2468316be155633bd8b
Summary:
The test is flaky in our CI but could not be reproduce manually on the same CI host. Disabling it.
Closes https://github.com/facebook/rocksdb/pull/3753
Differential Revision: D7716320
Pulled By: yiwu-arbug
fbshipit-source-id: 6bed3b05880c1d24e8dc86bc970e5181bc98fb45
Summary:
This PR comments out the rest of the unused arguments which allow us to turn on the -Wunused-parameter flag. This is the second part of a codemod relating to https://github.com/facebook/rocksdb/pull/3557.
Closes https://github.com/facebook/rocksdb/pull/3662
Differential Revision: D7426121
Pulled By: Dayvedde
fbshipit-source-id: 223994923b42bd4953eb016a0129e47560f7e352
Summary:
This diff handles cases where compaction causes an ENOSPC error.
This does not handle corner cases where another background job is started while compaction is running, and the other background job triggers ENOSPC, although we do allow the user to provision for these background jobs with SstFileManager::SetCompactionBufferSize.
It also does not handle the case where compaction has finished and some other background job independently triggers ENOSPC.
Usage: Functionality is inside SstFileManager. In particular, users should set SstFileManager::SetMaxAllowedSpaceUsage, which is the reference highwatermark for determining whether to cancel compactions.
Closes https://github.com/facebook/rocksdb/pull/3449
Differential Revision: D7016941
Pulled By: amytai
fbshipit-source-id: 8965ab8dd8b00972e771637a41b4e6c645450445
Summary:
This patch addressed several issues.
Portability including db_test std::thread -> port::Thread Cc: @
and %z to ROCKSDB portable macro. Cc: maysamyabandeh
Implement Env::AreFilesSame
Make the implementation of file unique number more robust
Get rid of C-runtime and go directly to Windows API when dealing
with file primitives.
Implement GetSectorSize() and aling unbuffered read on the value if
available.
Adjust Windows Logger for the new interface, implement CloseImpl() Cc: anand1976
Fix test running script issue where $status var was of incorrect scope
so the failures were swallowed and not reported.
DestroyDB() creates a logger and opens a LOG file in the directory
being cleaned up. This holds a lock on the folder and the cleanup is
prevented. This fails one of the checkpoin tests. We observe the same in production.
We close the log file in this change.
Fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose failure where the test
attempts to open a directory with NewRandomAccessFile which does not
work on Windows.
Fix DBTest.SoftLimit as it is dependent on thread timing. CC: yiwu-arbug
Closes https://github.com/facebook/rocksdb/pull/3552
Differential Revision: D7156304
Pulled By: siying
fbshipit-source-id: 43db0a757f1dfceffeb2b7988043156639173f5b
Summary:
Currently FreeBSD build is broken in master and possibly some previous releases due to unrecognized symbol `O_DIRECT`.
This PR will fix the build on FreeBSD
Closes https://github.com/facebook/rocksdb/pull/3560
Differential Revision: D7148646
Pulled By: miasantreble
fbshipit-source-id: 95b6c3d310fa531267c086b2cd40a5ab1c042b5a
Summary:
The recent Logger::Close() and DBImpl::Close() implementation rely on
calling the CloseImpl() virtual function from the destructor, which will
not work. Refactor the implementation to have a private close helper
function in derived classes that can be called by both CloseImpl() and
the destructor.
Closes https://github.com/facebook/rocksdb/pull/3528
Reviewed By: gfosco
Differential Revision: D7049303
Pulled By: anand1976
fbshipit-source-id: 76a64cbf403209216dfe4864ecf96b5d7f3db9f4
Summary:
Add per-exe execution capability
Add fix parsing of groups/tests
Add timer test exclusion
Fix unit tests
Ifdef threadpool specific tests that do not pass on Vista threadpool.
Remove spurious outout from prefix_test so test case listing works
properly.
Fix not using standard test directories results in file creation errors
in sst_dump_test.
BlobDb fixes:
In C++ end() iterators can not be dereferenced. They are not valid.
When deleting blob_db_ set it to nullptr before any other code executes.
Not fixed:. On Windows you can not delete a file while it is open.
[ RUN ] BlobDBTest.ReadWhileGC
d:\dev\rocksdb\rocksdb\utilities\blob_db\blob_db_test.cc(75): error: DestroyBlobDB(dbname_, options, bdb_options)
IO error: Failed to delete: d:/mnt/db\testrocksdb-17444/blob_db_test/blob_dir/000001.blob: Permission denied
d:\dev\rocksdb\rocksdb\utilities\blob_db\blob_db_test.cc(75): error: DestroyBlobDB(dbname_, options, bdb_options)
IO error: Failed to delete: d:/mnt/db\testrocksdb-17444/blob_db_test/blob_dir/000001.blob: Permission denied
write_batch
Should not call front() if there is a chance the container is empty
Closes https://github.com/facebook/rocksdb/pull/3152
Differential Revision: D6293274
Pulled By: sagar0
fbshipit-source-id: 318c3717c22087fae13b18715dffb24565dbd956
Summary:
A few simple changes to allow RocksDB to be built on OpenBSD. Let me know if any further changes are needed.
Closes https://github.com/facebook/rocksdb/pull/3061
Differential Revision: D6138800
Pulled By: ajkr
fbshipit-source-id: a13a17b5dc051e6518bd56a8c5efd1d24dd81b0c
Summary:
When we had a single thread pool for compactions, a thread could be busy for a long time (minutes) executing a compaction involving the bottom level. In multi-instance setups, the entire thread pool could be consumed by such bottom-level compactions. Then, top-level compactions (e.g., a few L0 files) would be blocked for a long time ("head-of-line blocking"). Such top-level compactions are critical to prevent compaction stalls as they can quickly reduce number of L0 files / sorted runs.
This diff introduces a bottom-priority queue for universal compactions including the bottom level. This alleviates the head-of-line blocking situation for fast, top-level compactions.
- Added `Env::Priority::BOTTOM` thread pool. This feature is only enabled if user explicitly configures it to have a positive number of threads.
- Changed `ThreadPoolImpl`'s default thread limit from one to zero. This change is invisible to users as we call `IncBackgroundThreadsIfNeeded` on the low-pri/high-pri pools during `DB::Open` with values of at least one. It is necessary, though, for bottom-pri to start with zero threads so the feature is disabled by default.
- Separated `ManualCompaction` into two parts in `PrepickedCompaction`. `PrepickedCompaction` is used for any compaction that's picked outside of its execution thread, either manual or automatic.
- Forward universal compactions involving last level to the bottom pool (worker thread's entry point is `BGWorkBottomCompaction`).
- Track `bg_bottom_compaction_scheduled_` so we can wait for bottom-level compactions to finish. We don't count them against the background jobs limits. So users of this feature will get an extra compaction for free.
Closes https://github.com/facebook/rocksdb/pull/2580
Differential Revision: D5422916
Pulled By: ajkr
fbshipit-source-id: a74bd11f1ea4933df3739b16808bb21fcd512333
Summary:
This reverts the previous commit 1d7048c598, which broke the build.
Did a `git revert 1d7048c`.
Closes https://github.com/facebook/rocksdb/pull/2627
Differential Revision: D5476473
Pulled By: sagar0
fbshipit-source-id: 4756ff5c0dfc88c17eceb00e02c36176de728d06
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.
Reviewed By: igorsugak
Differential Revision: D5454343
fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
Summary:
Disable direct reads for log and manifest. Direct reads should not affect sequential_file
Also add kDirectIO for option_config_ in db_test_util
Closes https://github.com/facebook/rocksdb/pull/2337
Differential Revision: D5100261
Pulled By: lightmark
fbshipit-source-id: 0ebfd13b93fa1b8f9acae514ac44f8125a05868b
Summary:
Replacement of #2147
The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194
Differential Revision: D4929799
Pulled By: siying
fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090
Differential Revision: D4833681
Pulled By: siying
fbshipit-source-id: 2fd8bef
Summary:
querying logical sector size from the device instead of hardcoding it for linux platform.
Closes https://github.com/facebook/rocksdb/pull/1875
Differential Revision: D4591946
Pulled By: ajkr
fbshipit-source-id: 4e9805c
Summary:
Some places autodetected. These are the two places that didn't.
closes#1498
Still unsure if the following instances of 4 * 1024 need fixing in:
util/io_posix.h
include/rocksdb/table.h (appears to be blocksize and different)
utilities/persistent_cache/block_cache_tier.cc
utilities/persistent_cache/persistent_cache_test.h
include/rocksdb/env.h
util/env_posix.cc
db/column_family.cc
Closes https://github.com/facebook/rocksdb/pull/1499
Differential Revision: D4593640
Pulled By: yiwu-arbug
fbshipit-source-id: efc48de
Summary:
fix lite bugs
disable direct io in lite mode
Closes https://github.com/facebook/rocksdb/pull/1870
Differential Revision: D4559866
Pulled By: yiwu-arbug
fbshipit-source-id: 3761c51
Summary:
If we don't wait for the threads to finish after each run, the thread queue may not be empty while the next test starts to run, which can cause unexpected behaviors.
Also make some of the relaxed read/write more restrict.
Closes https://github.com/facebook/rocksdb/pull/1590
Reviewed By: AsyncDBConnMarkedDownDBException
Differential Revision: D4245922
Pulled By: AsyncDBConnMarkedDownDBException
fbshipit-source-id: f83b74b
Summary:
also change variable name `direct_io_` to `use_direct_io_` in WritableFile to make it consistent with read path.
Closes https://github.com/facebook/rocksdb/pull/1770
Differential Revision: D4416435
Pulled By: lightmark
fbshipit-source-id: 4143c53
Summary:
direct IO reads refactoring
remove unnecessary classes and unified interfaces
tested with db_bench
need more change for options and ON/OFF for different files.
Since disabled is default, it should be fine now
Closes https://github.com/facebook/rocksdb/pull/1636
Differential Revision: D4307189
Pulled By: lightmark
fbshipit-source-id: 6991e22
Summary:
Enable directIO on WritableFileImpl::Append
with offset being current length of the file.
Enable UniqueID tests on Windows, disable others but
leeting them to compile. Unique tests are valuable to
detect failures on different filesystems and upcoming
ReFS.
Clear output in WinEnv Getchildren.This is different from
previous strategy, do not touch output on failure.
Make sure DBTest.OpenWhenOpen works with windows error message
Closes https://github.com/facebook/rocksdb/pull/1746
Differential Revision: D4385681
Pulled By: IslamAbdelRahman
fbshipit-source-id: c07b702
Summary:
This patch clarifies the contract of PositionedAppend with some unit
tests and also implements it for PosixWritableFile. (Tasks: 14524071)
Closes https://github.com/facebook/rocksdb/pull/1514
Differential Revision: D4204907
Pulled By: maysamyabandeh
fbshipit-source-id: 06eabd2
Summary: EnvPosixTestWithParam.TwoPools relies on explicit sleeping, so it sometimes fail. Fix it.
Test Plan: Run tests with high parallelism many times and make sure the test passes.
Reviewers: yiwu, andrewkr
Reviewed By: andrewkr
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D63417
Summary:
Add Env::RandomRWFile in env.h and implement it for POSIX
RandomRWFile is a file that allow us to read from / write to random offsets in the file
I will implement it for other Envs later after finishing the whole task for AddFile()
Test Plan: unit tests
Reviewers: andrewkr, kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D62433
Summary:
Direct IO checkin breaks Windows build. Fixing the code to work for
Windows.
Test Plan: Run env_test in Windows 10 and make check in Linux
Reviewers: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D59073
Summary:
O_DIRECT is not available in Mac as a flag for open. The fix is to make
use of fctl after the file is opened
Test Plan: Run the tests on mac and Linux
Reviewers: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D58665
Summary:
This patch adds direct IO capability to RocksDB Env.
The direct IO capability is required for persistent cache since NVM is best
accessed as 4K direct IO. SSDs can leverage direct IO for reading.
Direct IO requires the offset and size be sector size aligned, and memory to
be kernel page aligned. Since neither RocksDB/Persistent read cache data
layout is aligned to sector size, the code can accommodate reading unaligned IO size
(or unaligned memory) at the cost of an alloc/copy.
The write code path expects the size and memory to be aligned.
Test Plan: Run RocksDB unit tests
Reviewers: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D57393
Summary:
For testing backups, we needed an Env that is fully isolated from other
Envs on the same machine. Our in-memory Envs (MockEnv and InMemoryEnv) were
insufficient because they don't implement most directory operations.
This diff introduces a new Env, "ChrootEnv", that translates paths such that the
chroot directory appears to be the root directory. This way, multiple Envs can
be isolated in the filesystem by using different chroot directories. Since we
use the filesystem, all directory operations are trivially supported.
Test Plan:
I parameterized the existing EnvPosixTest so it runs tests on ChrootEnv
except the ioctl-related cases.
Reviewers: sdong, lightmark, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D57543
* Musl libc does not provide adaptive mutex. Added feature test for PTHREAD_MUTEX_ADAPTIVE_NP.
* Musl libc does not provide backtrace(3). Added a feature check for backtrace(3).
* Fixed compiler error.
* Musl libc does not implement backtrace(3). Added platform check for libexecinfo.
* Alpine does not appear to support gcc -pg option. By default (gcc has PIE option enabled) it fails with:
gcc: error: -pie and -pg|p|profile are incompatible when linking
When -fno-PIE and -nopie are used it fails with:
/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find gcrt1.o: No such file or directory
Added gcc -pg platform test and output PROFILING_FLAGS accordingly. Replaced pg var in Makefile with PROFILING_FLAGS.
* fix segfault when TEST_IOCTL_FRIENDLY_TMPDIR is undefined and default candidates are not suitable
* use ASSERT_DOUBLE_EQ instead of ASSERT_EQ
* When compiled with ROCKSDB_MALLOC_USABLE_SIZE UniversalCompactionFourPaths and UniversalCompactionSecondPathRatio tests fail due to premature memtable flushes on systems with 16-byte alignment. Arena runs out of block space before GenerateNewFile() completes.
Increased options.write_buffer_size.
Summary:
Previously compilation failed when ROCKSDB_NO_FBCODE=1 because fcntl.h
wasn't included for open().
Related issue: https://github.com/facebook/rocksdb/issues/977
Test Plan:
verified below command works now:
$ make clean && ROCKSDB_NO_FBCODE=1 ROCKSDB_DISABLE_FALLOCATE=1 make -j32 env_test
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D54135
Summary:
Added this new function, which returns filename, size, and modified
timestamp for each file in the provided directory. The default implementation
retrieves the metadata sequentially using existing functions. In the next diff
I'll make HdfsEnv override this function to use libhdfs's bulk get function.
This won't work on windows due to the path separator.
Test Plan:
new unit test
$ ./env_test --gtest_filter=EnvPosixTest.ConsistentChildrenMetadata
Reviewers: yhchiang, sdong
Reviewed By: sdong
Subscribers: IslamAbdelRahman, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D53781
in 64-bit.
Currently, a signed off_t type is being used for the following
interfaces for both offset and the length in bytes:
* `Allocate`
* `RangeSync`
On Linux `off_t` is automatically either 32 or 64-bit depending on
the platform. On Windows it is always a 32-bit signed long which
limits file access and in particular space pre-allocation
to effectively 2 Gb.
Proposal is to replace off_t with uint64_t as a portable type
always access files with 64-bit interfaces.
May need to modify posix code but lack resources to test it.
in 64-bit.
Currently, a signed off_t type is being used for the following
interfaces for both offset and the length in bytes:
* `Allocate`
* `RangeSync`
On Linux `off_t` is automatically either 32 or 64-bit depending on
the platform. On Windows it is always a 32-bit signed long which
limits file access and in particular space pre-allocation
to effectively 2 Gb.
Proposal is to replace off_t with uint64_t as a portable type
always access files with 64-bit interfaces.
May need to modify posix code but lack resources to test it.
Summary: include/posix/io_posix.h should not depend on ROCKSDB_FALLOCATE_PRESENT. Remove it.
Test Plan: Build it with both of ROCKSDB_FALLOCATE_PRESENT defined and not defined.
Reviewers: rven, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49563
Summary: As title
Test Plan: make check
Reviewers: yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D46983