Summary: A better error message. A local change. Did not look at other places where this could be done.
Test Plan: compile
Reviewers: dhruba, MarkCallaghan
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10251
Summary: Simplified level_ptrs by using a std:vector
Test Plan: make check
Reviewers: sheki, emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10245
Summary:
FindObsoleteFiles was slow, holding the single big lock, resulted in bad p99 behavior.
Didn't profile anything, but several things could be improved:
1. VersionSet::AddLiveFiles works with std::set, which is by itself slow (a tree).
You also don't know how many dynamic allocations occur just for building up this tree.
switched to std::vector, also added logic to pre-calculate total size and do just one allocation
2. Don't see why env_->GetChildren() needs to be mutex proteced, moved to PurgeObsoleteFiles where
mutex could be unlocked.
3. switched std::set to std:unordered_set, the conversion from vector is also inside PurgeObsoleteFiles
I have a feeling this should pretty much fix it.
Test Plan: make check; db_stress
Reviewers: dhruba, heyongqiang, MarkCallaghan
Reviewed By: dhruba
CC: leveldb, zshao
Differential Revision: https://reviews.facebook.net/D10197
Summary: Primarily a refactor. Introduced LDBTool interface to which customers can plug in their options and this will create their own version of ldb tool.
Test Plan: made ldb tool and tried it.
Reviewers: dhruba, heyongqiang
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10191
Summary: using unique_ptr to have automatic delete for probableWALfiles in db_impl.cc
Test Plan: make
Reviewers: sheki, dhruba
Reviewed By: sheki
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10083
Summary:
The background compaction threads are never exitted and therefore caused
memory-leaks while running rpcksdb tests. Have changed the PosixEnv destructor to exit and join them and changed the tests likewise
The memory leaked has reduced from 320 bytes to 64 bytes in all the tests. The 64
bytes is relating to
pthread_exit, but still have to figure out why. The stack-trace right now with
table_test.cc = 64 bytes in 1 blocks are possibly lost in loss record 4 of 5
at 0x475D8C: malloc (jemalloc.c:914)
by 0x400D69E: _dl_map_object_deps (dl-deps.c:505)
by 0x4013393: dl_open_worker (dl-open.c:263)
by 0x400F015: _dl_catch_error (dl-error.c:178)
by 0x4013B2B: _dl_open (dl-open.c:569)
by 0x5D3E913: do_dlopen (dl-libc.c:86)
by 0x400F015: _dl_catch_error (dl-error.c:178)
by 0x5D3E9D6: __libc_dlopen_mode (dl-libc.c:47)
by 0x5048BF3: pthread_cancel_init (unwind-forcedunwind.c:53)
by 0x5048DC9: _Unwind_ForcedUnwind (unwind-forcedunwind.c:126)
by 0x5046D9F: __pthread_unwind (unwind.c:130)
by 0x50413A4: pthread_exit (pthreadP.h:289)
Test Plan: make all check
Reviewers: dhruba, sheki, haobo
Reviewed By: dhruba
CC: leveldb, chip
Differential Revision: https://reviews.facebook.net/D9573
Summary: as subject. This is causing problem in adsconv. Ideally, this flags should be set in open. But that is only supported in Linux kernel ≥2.6.23 and glibc ≥2.7.
Test Plan:
db_test
run db_test
Reviewers: dhruba, MarkCallaghan, haobo
Reviewed By: dhruba
CC: leveldb, chip
Differential Revision: https://reviews.facebook.net/D10089
Summary: To know which options the crashtest was run with. Also changed print to sys.stdout.write which is more standard.
Test Plan: python tools/db_crashtest.py
Reviewers: vamsi, akushner, dhruba
Reviewed By: akushner
Differential Revision: https://reviews.facebook.net/D10119
Summary:
The segfault was happening because the program was unable to open a new
sst file (as part of the compaction) because the process ran out of
file descriptors.
The fix is to check the return status of the file creation before taking
any other action.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fabf03f9700 (LWP 29904)]
leveldb::DBImpl::OpenCompactionOutputFile (this=this@entry=0x7fabf9011400, compact=compact@entry=0x7fabf741a2b0) at db/db_impl.cc:1399
1399 db/db_impl.cc: No such file or directory.
(gdb) where
Test Plan: make check
Reviewers: MarkCallaghan, sheki
Reviewed By: MarkCallaghan
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10101
Summary: make crash_test will now invoke the crash_test. Also some cleanup in the db_crashtest.py file
Test Plan: make crash_test
Reviewers: akushner, vamsi, sheki, dhruba
Reviewed By: vamsi
Differential Revision: https://reviews.facebook.net/D9987
Summary:
Transaction Log Iterator did not move to the next file in the series if there was a write batch at the end of the currentFile.
The solution is if the last seq no. of the current file is < RequestedSeqNo. Assume the first seqNo. of the next file has to satisfy the request.
Also major refactoring around the code. Moved opening the logreader to a seperate function, got rid of goto.
Test Plan: added a unit test for it.
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb, emayanke
Differential Revision: https://reviews.facebook.net/D10029
Summary: The crash_test depends on db_stress to work with pre-existing dir
Test Plan: make db_stress; Run db_stress with 'destroy_db_initially=0'
Reviewers: vamsi, dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10041
Summary: For sanity w.r.t. the way we split up the reopens equally among the ops/thread
Test Plan: make db_stress; db_stress --ops_per_thread=10 --reopens=10 => error
Reviewers: vamsi, dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10023
Summary: Renames in the Makefile. It will be used like this in third-party.
Test Plan: make
Reviewers: dhruba, sheki, heyongqiang, haobo
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9633
Summary:
TableAndFile was a struct used earlier to delete the file as we did not have std::unique_ptr in the codebase.
With Chip introducing C++11 hotness like std::unique_ptr we can do away with the struct.
Test Plan: make all check
Reviewers: haobo, heyongqiang
Reviewed By: heyongqiang
CC: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D9975
Summary:
1. The stock LRUCache nukes itself whenever the working set (the total number of entries not released by client at a certain time) is bigger than the cache capacity.
See https://our.dev.facebook.com/intern/tasks/?t=2252281
2. There's a bug in shard calculation leading to segmentation fault when only one shard is needed.
Test Plan: make check
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb, zshao, sheki
Differential Revision: https://reviews.facebook.net/D9927
Summary:
When I run db_crashtest, I am seeing lot of warnings that say db_stress completed
before it was killed. To fix that I made ops per thread a very large value so that it keeps
running until it is killed.
I also set #reopens to 0. Since we are killing the process anyway, the 'simulated crash'
that happens during reopen may not add additional value.
I usually see 10-25K ops happening before the kill. So I increased max_key from 100 to
1000 so that we use more distinct keys.
Test Plan:
Ran a few times.
Revert Plan: OK
Task ID: #
Reviewers: emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9909
Summary:
During recovery, last_updated_manifest number was not set if there were no records in the Write-ahead log.
Now check for the recovered manifest also and set last_updated_manifest file to the max value.
Test Plan: unit test
Reviewers: heyongqiang
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9891
Summary:
1. SetBackgroundThreads was not thread safe
2. queue_size_ does not seem necessary
3. moved condition signal after shared state change. Even though the original
order is in practice ok (because the mutex is still held), it looks fishy
and non-intuitive.
Test Plan: make check
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb, zshao
Differential Revision: https://reviews.facebook.net/D9825
Summary: The script runs and kills the stress test periodically. Default values have been used in the script now. Should I make this a part of the Makefile or automated rocksdb build? The values can be easily changed in the script right now, but should I add some support for variable values or input to the script? I believe the script achieves its objective of unsafe crashes and reopening to expect sanity in the database.
Test Plan: python tools/db_crashtest.py
Reviewers: dhruba, vamsi, MarkCallaghan
Reviewed By: vamsi
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9369
Summary:
If a class owns an object:
- If the object can be null => use a unique_ptr. no delete
- If the object can not be null => don't even need new, let alone delete
- for runtime sized array => use vector, no delete.
Test Plan: make check
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb, zshao, sheki, emayanke, MarkCallaghan
Differential Revision: https://reviews.facebook.net/D9783
Summary:
RocksDB does a binary search to look at the files which might contain the requested sequence number at the call GetUpdatesSince.
There was a bug in the binary search => when the file pointed by the middle index of bsearch was empty/corrupt it needst to resize the vector and update indexes.
This now fixes that.
Test Plan: existing unit tests pass.
Reviewers: heyongqiang, dhruba
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9777
Summary:
If the vector returned by GetUpdatesSince is empty, it is still returned to the
user. This causes it throw an std::range error.
The probable file list is checked and it returns an IOError status instead of OK now.
Test Plan: added a unit test.
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9771
Summary:
Use non mmapd files for Write-Ahead log.
Earlier use of MMaped files. made the log iterator read ahead and miss records.
Now the reader and writer will point to the same physical location.
There is no perf regression :
./db_bench --benchmarks=fillseq --db=/dev/shm/mmap_test --num=$(million 20) --use_existing_db=0 --threads=2
with This diff :
fillseq : 10.756 micros/op 185281 ops/sec; 20.5 MB/s
without this dif :
fillseq : 11.085 micros/op 179676 ops/sec; 19.9 MB/s
Test Plan: unit test included
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9741
Summary:
Earlier Statistics object was a raw pointer. This meant the user had to clear up
the Statistics object after creating the database. In most use cases the database is created in a function and the statistics pointer is out of scope. Hence the statistics object would never be deleted.
Now Using a shared_ptr to manage this.
Want this in before the next release.
Test Plan: make all check.
Reviewers: dhruba, emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9735
Summary: rocksdb uses a single global lock to protect in memory metadata. We should minimize the mutex protected code section to increase the effective parallelism of the program. See https://our.intern.facebook.com/intern/tasks/?t=2218928
Test Plan:
make check
db_bench
Reviewers: dhruba, heyongqiang
CC: zshao, leveldb
Differential Revision: https://reviews.facebook.net/D9705
Summary:
The unit test fails as our solution does not work with MMap'd files.
Disable the failing unit test. Put it back with the next diff which should fix the problem.
Test Plan: db_test
Reviewers: heyongqiang
CC: dhruba
Differential Revision: https://reviews.facebook.net/D9645
Summary:
* Add a method to check if the log reader is at EOF.
* If we know a record has been flushed force the log_reader to believe it is not at EOF, using a new method UnMarkEof().
This does not work with MMpaed files.
Test Plan: added a unit test.
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9567
Summary: This caused compilation problems on some gcc platforms during the third-partyrelease
Test Plan: make
Reviewers: sheki
Reviewed By: sheki
Differential Revision: https://reviews.facebook.net/D9627
Summary: simple sed command to replace NULL in tools directory. Was missed by the previous codemod.
Test Plan: it compiles
Reviewers: emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9621
Summary:
The events that trigger compaction:
* opening the database
* Get -> only if seek compaction is not disabled and other checks are true
* MakeRoomForWrite -> when memtable is full
* BackgroundCall ->
If the background thread is about to do a compaction run, it schedules
a new background task to trigger a possible compaction. This will cause
additional background threads to find and process other compactions that
can run concurrently.
Test Plan: ran db_bench with overwrite and readonly alternatively.
Reviewers: sheki, MarkCallaghan
Reviewed By: sheki
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9579
Summary:
This patch allows an application to specify whether to use bufferedio,
reads-via-mmaps and writes-via-mmaps per database. Earlier, there
was a global static variable that was used to configure this functionality.
The default setting remains the same (and is backward compatible):
1. use bufferedio
2. do not use mmaps for reads
3. use mmap for writes
4. use readaheads for reads needed for compaction
I also added a parameter to db_bench to be able to explicitly specify
whether to do readaheads for compactions or not.
Test Plan: make check
Reviewers: sheki, heyongqiang, MarkCallaghan
Reviewed By: sheki
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9429
Summary: Getting rid of boost in our github codebase which caused problems on third-party
Test Plan: make ldb; python tools/ldb_test.py
Reviewers: sheki, dhruba
Reviewed By: sheki
Differential Revision: https://reviews.facebook.net/D9543
Summary: Was causing error(warning) in third-party saying unused result
Test Plan: make
Reviewers: sheki, dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D9447
Summary: Some comparisons left in log_test.cc and db_test.cc complained by make
Test Plan: make
Reviewers: dhruba, sheki
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D9537
Summary: Makefile had options to ignore sign-comparisons and unused-parameters, which should be there. Also fixed the specific errors in the code-base
Test Plan: make
Reviewers: chip, dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9531
Summary:
Add --benchmarks=levelstats option to report per-level stats (#files, #bytes)
Change readwhilewriting test to report response time for writes but exclude
them from the stats merged by all threads.
Prevent "NaN" in stats output by preventing division by 0.
Remove "o" file I committed by mistake.
Task ID: #
Blame Rev:
Test Plan:
make check
Revert Plan:
Database Impact:
Memcache Impact:
Other Notes:
EImportant:
- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D9513
Summary:
Rocksdb can create 0 sized log files when it is opened and closed without any operations.
The GetUpdatesSince fails currently if there is a log file of size zero.
This diff fixes this. If there is a log file is 0, it is removed form the probable_file_list
Test Plan: unit test
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D9507
Summary:
Instead of checking for number of files in L0. Check for number of files in the requested level.
Bug introduced in D4929 (diff trying to do too many things).
Test Plan: db_test.
Reviewers: dhruba, MarkCallaghan
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D9483
Summary: negation of the condition checked currently had to be checkd actually
Test Plan: make ldb; python ldb_test.py
Reviewers: sheki, dhruba
Reviewed By: sheki
Differential Revision: https://reviews.facebook.net/D9459
Summary: boost functions cause complications while deploying to third-party
Test Plan: make
Reviewers: sheki, dhruba
Reviewed By: sheki
Differential Revision: https://reviews.facebook.net/D9441
Summary:
Add --benchmarks=updaterandom for read-modify-write workloads. This is different
from --benchmarks=readrandomwriterandom in a few ways. First, an "operation" is the
combined time to do the read & write rather than treating them as two ops. Second,
the same key is used for the read & write.
Change RandomGenerator to support rows larger than 1M. That was using "assert"
to fail and assert is compiled-away when -DNDEBUG is used.
Add more options to db_bench
--duration - sets the number of seconds for tests to run. When not set the
operation count continues to be the limit. This is used by random operation
tests.
--use_snapshot - when set GetSnapshot() is called prior to each random read.
This is to measure the overhead from using snapshots.
--get_approx - when set GetApproximateSizes() is called prior to each random
read. This is to measure the overhead for a query optimizer.
Task ID: #
Blame Rev:
Test Plan:
run db_bench
Revert Plan:
Database Impact:
Memcache Impact:
Other Notes:
EImportant:
- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D9267
Summary: Updated TOOL_CHAIN_LIB_BASE to use the third-party version for jemalloc-3.3.1 which contains a bug fix in quarantine.cc. This was detected while debugging valgrind issues with the rocksdb table_test
Test Plan: make table_test;valgrind --leak-check=full ./table_test
Reviewers: dhruba, sheki, vamsi
Reviewed By: sheki
Differential Revision: https://reviews.facebook.net/D9387