rocksdb

Commit Graph

Author	SHA1	Message	Date
Mayank Agarwal	e9b675bd94	Fix memory leak in KeyMayExist test part of db_test Summary: NewBloomFilterPolicy call requires Delete to be called later on Test Plan: make; valgrind ./db_test Reviewers: haobo, dhruba, vamsi Differential Revision: https://reviews.facebook.net/D11667	12 years ago
Mayank Agarwal	2a986919d6	Make rocksdb-deletes faster using bloom filter Summary: Wrote a new function in db_impl.c-CheckKeyMayExist that calls Get but with a new parameter turned on which makes Get return false only if bloom filters can guarantee that key is not in database. Delete calls this function and if the option- deletes_use_filter is turned on and CheckKeyMayExist returns false, the delete will be dropped saving: 1. Put of delete type 2. Space in the db, and 3. Compaction time Test Plan: make all check; will run db_stress and db_bench and enhance unit-test once the basic design gets approved Reviewers: dhruba, haobo, vamsi Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11607	12 years ago
Xing Jin	8a5341ec7d	Newbie code question Summary: This diff is more about my question when reading compaction codes, instead of a normal diff. I don't quite understand the logic here. Test Plan: I didn't do any test. If this is a bug, I will continue doing some test. Reviewers: haobo, dhruba, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11661	12 years ago
Mayank Agarwal	821889e207	Print complete statistics in db_stress Summary: db_stress should also print complete statistics like db_bench. Needed this when I wanted to measure number of delete-IOs dropped due to CheckKeyMayExist to be introduced to rocksdb codebase later- to make deletes in rocksdb faster Test Plan: make db_stress;./db_stress --max_key=100 --ops_per_thread=1000 --statistics=1 Reviewers: sheki, dhruba, vamsi, haobo Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D11655	12 years ago
Haobo Xu	a8d5f8dde2	[RocksDB] Remove old readahead options Summary: As title. Test Plan: make check; db_bench Reviewers: dhruba, MarkCallaghan CC: leveldb Differential Revision: https://reviews.facebook.net/D11643	12 years ago
Haobo Xu	9ba82786ce	[RocksDB] Provide contiguous sequence number even in case of write failure Summary: Replication logic would be simplified if we can guarantee that write sequence number is always contiguous, even if write failure occurs. Dhruba and I looked at the sequence number generation part of the code. It seems fixable. Note that if WAL was successful and insert into memtable was not, we would be in an unfortunate state. The approach in this diff is : IO error is expected and error status will be returned to client, sequence number will not be advanced; In-mem error is not expected and we panic. Test Plan: make check; db_stress Reviewers: dhruba, sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D11439	12 years ago
Haobo Xu	92ca816a60	[RocksDB] Support internal key/value dump for ldb Summary: This diff added a command 'idump' to ldb tool, which dumps the internal key/value pairs. It could be useful for diagnosis and estimating the per user key 'overhead'. Also cleaned up the ldb code a bit where I touched. Test Plan: make check; ldb idump Reviewers: emayanke, sheki, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11517	12 years ago
Mayank Agarwal	d56523c49c	Update rocksdb version Summary: rocksdb-2.0 released to third party Test Plan: visual inspection Reviewers: dhruba, haobo, sheki Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11559	12 years ago
Haobo Xu	71e0f695c1	[RocksDB] Expose count for WriteBatch Summary: As title. Exposed a Count function that returns the number of updates in a batch. Could be handy for replication sequence number check. Test Plan: make check; Reviewers: emayanke, sheki, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11523	12 years ago
Deon Nicholas	34ef873290	Added stringappend_test back into the unit tests. Summary: With the Makefile now updated to correctly update all .o files, this should fix the issues recompiling stringappend_test. This should also fix the "segmentation-fault" that we were getting earlier. Now, stringappend_test should be clean, and I have added it back to the unit-tests. Also made some minor updates to the tests themselves. Test Plan: 1. make clean; make stringappend_test -j 32 (will test it by itself) 2. make clean; make all check -j 32 (to run all unit tests) 3. make clean; make release (test in release mode) 4. valgrind ./stringappend_test (valgrind tests) Reviewers: haobo, jpaton, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11505	12 years ago
Deon Nicholas	6894a50aa7	Updated "make clean" to remove all .o files Summary: The old Makefile did not remove ALL .o and .d files, but rather only those that happened to be in the root folder and one-level deep. This was causing issues when recompiling files in deeper folders. This fix now causes make clean to find ALL .o and .d files via a unix "find" command, and then remove them. Test Plan: make clean; make all -j 32; Reviewers: haobo, jpaton, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11493	12 years ago
Mayank Agarwal	b858da709a	Simplify bucketing logic in ldb-ttl Summary: [start_time, end_time) is what I'm following for the buckets and the whole time-range. Also cleaned up some code in db_ttl.* Not correcting the spacing/indenting convention for util/ldb_cmd.cc in this diff. Test Plan: python ldb_test.py, make ttl_test, Run mcrocksdb-backup tool, Run the ldb tool on 2 mcrocksdb production backups from sigmafio033.prn1 Reviewers: vamsi, haobo Reviewed By: vamsi Differential Revision: https://reviews.facebook.net/D11433	12 years ago
Mayank Agarwal	61f1baaedf	Introducing timeranged scan, timeranged dump in ldb. Also the ability to count in time-batches during Dump Summary: Scan and Dump commands in ldb use iterator. We need to also print timestamp for ttl databases for debugging. For this I create a TtlIterator class pointer in these functions and assign it the value of Iterator pointer which actually points to a TtlIterator object, and access the new function ValueWithTS which can return TS also. Buckets feature for dump command: gives a count of different key-values in the specified time-range distributed across the time-range partitioned according to bucket-size. start_time and end_time are specified in unixtimestamp and bucket in seconds on the user-commandline. Have commented out 3 lines from ldb_test.py so that the test does not break right now. It breaks because timestamp is also printed now and I have to look at wildcards in python to compare properly. Test Plan: python tools/ldb_test.py Reviewers: vamsi, dhruba, haobo, sheki Reviewed By: vamsi CC: leveldb Differential Revision: https://reviews.facebook.net/D11403	13 years ago
Haobo Xu	0f78fad9f5	[RocksDB] add back --mmap_read options to crashtest Summary: As title, now that db_stress supports --mmap_read properly Test Plan: make crash_test Reviewers: vamsi, emayanke, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11391	13 years ago
Haobo Xu	4deaa0d48b	[RocksDB] Minor change to statistics.h Summary: as title, use initialize list so that lines fit in 80 chars. Test Plan: make check; Reviewers: sheki, dhruba Differential Revision: https://reviews.facebook.net/D11385	13 years ago
Haobo Xu	96be2c4ee0	[RocksDB] Add mmap_read option for db_stress Summary: as title, also removed an incorrect assertion Test Plan: make check; db_stress --mmap_read=1; db_stress --mmap_read=0 Reviewers: dhruba, emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D11367	13 years ago
Abhishek Kona	5ef6bb8c37	[rocksdb][refactor] statistic printing code to one place Summary: $title Test Plan: db_bench --statistics=1 Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11373	13 years ago
Jim Paton	09de7a3b6a	Fix Zlib_Compress and Zlib_Uncompress Summary: Zlib_{Compress,Uncompress} did not handle very small input buffers properly. In addition, they did not call inflate/deflate until Z_STREAM_END was returned; it was possible for them to exit when only Z_OK had returned. This diff also fixes a bunch of lint errors. Test Plan: Run make check Reviewers: dhruba, sheki, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11301	13 years ago
Haobo Xu	3cc1af2062	[RocksDB] Option for incremental sync Summary: This diff added an option to control the incremental sync frequency. db_bench has a new flag bytes_per_sync for easy tuning exercise. Test Plan: make check; db_bench Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11295	13 years ago
Abhishek Kona	79f4fd2b62	[Rocksdb] Simplify Printing code in db_bench Summary: simplify the printing code in db_bench use TickersMap and HistogramsNameMap introduced in previous diffs. Test Plan: ./db_bench --statistics=1 and see if all the statistics are printed Reviewers: haobo, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11355	13 years ago
Dhruba Borthakur	6acbe0fc45	Compact multiple memtables before flushing to storage. Summary: Merge multiple memtables in memory before writing it out to a file in L0. There is a new config parameter min_write_buffer_number_to_merge that specifies the number of write buffers that should be merged together to a single file in storage. The system will not flush write buffers to storage unless at least these many buffers have accumulated in memory. The default value of this new parameter is 1, which means that a write buffer will be immediately flushed to disk as soon as it is ready. Test Plan: make check Differential Revision: https://reviews.facebook.net/D11241	13 years ago
Abhishek Kona	f561b3a324	[Rocksdb] Rename one stat key from leveldb to rocksdb	13 years ago
Dhruba Borthakur	836534debd	Enhance dbstress to allow specifying compaction trigger for L0. Summary: Rocksdb allows specifying the number of files in L0 that triggers compactions. Expose this api as a command line parameter for running db_stress. Test Plan: Run test Reviewers: sheki, emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D11343	13 years ago
Abhishek Kona	00124683de	[rocksdb] do not trim range for level0 in manual compaction Summary: https://code.google.com/p/leveldb/issues/detail?can=1&q=178&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary&id=178 Ported the solution as is to RocksDB. Test Plan: moved the unit test as manual_compaction_test Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11331	13 years ago
Abhishek Kona	39ee47fbf4	[Rocksdb] Record WriteBlock Times into a histogram Summary: Add a histogram to track WriteBlock times Test Plan: db_bench and print Reviewers: haobo, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11319	13 years ago
Deon Nicholas	8926b72751	Minor tweaks to StringAppend MergeOperator. Summary: I'm concerned about a random seg-fault that sometimes occurs when running stringappend_test. I will investigate further. First, I am removing stringappend_test from the regular release tests, and making some clean-ups to the code. Test Plan: 1. make stringappend_test 2. ./stringappend_test Reviewers: haobo, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11313	13 years ago
Abhishek Kona	bff718d81c	[Rocksdb] Implement filluniquerandom Summary: Use a bit set to keep track of which random number is generated. Currently only supports single-threaded. All our perf tests are run with threads=1 Copied over bitset implementation from common/datastructures Test Plan: printed the generated keys, and verified all keys were present. Reviewers: MarkCallaghan, haobo, dhruba Reviewed By: MarkCallaghan CC: leveldb Differential Revision: https://reviews.facebook.net/D11247	13 years ago
Deon Nicholas	2a52e1dcb6	Fix db_bench for release build. Test Plan: make release Reviewers: haobo, dhruba, jpaton Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11307	13 years ago
Haobo Xu	1afdf28701	[RocksDB] Compaction Filter Cleanup Summary: This hopefully gives the right semantics to compaction filter. Will write a small wiki to explain the ideas. Test Plan: make check; db_stress Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11121	13 years ago
Abhishek Kona	7a5f71d19a	[Rocksdb] measure table open io in a histogram Summary: Table is setup for compaction using Table::SetupForCompaction. So read block calls can be differentiated b/w Gets/Compaction. Use this and measure times. Test Plan: db_bench --statistics=1 Reviewers: dhruba, haobo Reviewed By: haobo CC: leveldb, MarkCallaghan Differential Revision: https://reviews.facebook.net/D11217	13 years ago
Haobo Xu	0c2a2dd5e8	[RocksDB] Fix build. Removed deprecated option --mmap_read from db_crashtest Summary: As title Test Plan: db_crashtest Reviewers: vamsi, emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D11271	13 years ago
Haobo Xu	778e179046	[RocksDB] Sync file to disk incrementally Summary: During compaction, we sync the output files after they are fully written out. This causes unnecessary blocking of the compaction thread and burstiness of the write traffic. This diff simply asks the OS to sync data incrementally as they are written, in the background. The hope is that, at the final sync, most of the data are already on disk and we would block less on the sync call. Thus, each compaction runs faster and we could use fewer compaction threads to saturate IO. In addition, the write traffic will be smoothed out, hopefully reducing the IO P99 latency too. Some quick tests show 10~20% improvement in per thread compaction throughput. Combined with posix advice on compaction read, just 5 threads are enough to almost saturate the udb flash bandwidth for 800 bytes write only benchmark. What's more promising is that, with saturated IO, iostat shows average wait time is actually smoother and much smaller. For the write-only 800-byte test: Before the change: await oscillates between 10ms and 3ms After the change: await ranges 1-3ms Will test against read-modify-write workload too, see if high read latency P99 could be resolved. Will introduce a parameter to control the sync interval in a follow up diff after cleaning up EnvOptions. Test Plan: make check; db_bench; db_stress Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11115	13 years ago
Deon Nicholas	4985a9f73b	[Rocksdb] [Multiget] Introduced multiget into db_bench Summary: Preliminary! Introduced the --use_multiget=1 and --keys_per_multiget=n flags for db_bench. Also updated and tested the ReadRandom() method to include an option to use multiget. By default, keys_per_multiget=100. Preliminary tests imply that multiget is at least 1.25x faster per key than regular get. Will continue adding Multiget for ReadMissing, ReadHot, RandomWithVerify, ReadRandomWriteRandom; soon. Will also think about ways to better verify benchmarks. Test Plan: 1. make db_bench 2. ./db_bench --benchmarks=fillrandom 3. ./db_bench --benchmarks=readrandom --use_existing_db=1 --use_multiget=1 --threads=4 --keys_per_multiget=100 4. ./db_bench --benchmarks=readrandom --use_existing_db=1 --threads=4 5. Verify ops/sec (and 1000000 of 1000000 keys found) Reviewers: haobo, MarkCallaghan, dhruba Reviewed By: MarkCallaghan CC: leveldb Differential Revision: https://reviews.facebook.net/D11127	13 years ago
Haobo Xu	bdf1085944	[RocksDB] cleanup EnvOptions Summary: This diff simplifies EnvOptions by treating it as POD, similar to Options. - virtual functions are removed and member fields are accessed directly. - StorageOptions is removed. - Options.allow_readahead and Options.allow_readahead_compactions are deprecated. - Unused global variables are removed: useOsBuffer, useFsReadAhead, useMmapRead, useMmapWrite Test Plan: make check; db_stress Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11175	13 years ago
Deon Nicholas	5679107b07	Completed the implementation and test cases for Redis API. Summary: Completed the implementation for the Redis API for Lists. The Redis API uses rocksdb as a backend to persistently store maps from key->list. It supports basic operations for appending, inserting, pushing, popping, and accessing a list, given its key. Test Plan: - Compile with: make redis_test - Test with: ./redis_test - Run all unit tests (for all rocksdb) with: make all check - To use an interactive REDIS client use: ./redis_test -m - To clean the database before use: ./redis_test -m -d Reviewers: haobo, dhruba, zshao Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D10833	13 years ago
Dhruba Borthakur	e673d5d26d	Do not submit multiple simultaneous seek-compaction requests. Summary: The code was such that if multi-threaded-compactions as well as seek compaction are enabled then it submits multiple compaction requests for the same range of keys. This causes extraneous sst-files to accumulate at various levels. Test Plan: I am not able to write a very good unit test for this one but can easily reproduce this bug with 'dbstress' with the following options. batch=1;maxk=100000000;ops=100000000;ro=0;fm=2;bpl=10485760;of=500000; wbn=3; mbc=20; mb=2097152; wbs=4194304; dds=1; sync=0; t=32; bs=16384; cs=1048576; of=500000; ./db_stress --disable_seek_compaction=0 --mmap_read=0 --threads=$t --block_size=$bs --cache_size=$cs --open_files=$of --verify_checksum=1 --db=/data/mysql/leveldb/dbstress.dir --sync=$sync --disable_wal=1 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --target_file_size_multiplier=$fm --max_write_buffer_number=$wbn --max_background_compactions=$mbc --max_bytes_for_level_base=$bpl --reopen=$ro --ops_per_thread=$ops --max_key=$maxk --test_batches_snapshots=$batch Reviewers: leveldb, emayanke Reviewed By: emayanke Differential Revision: https://reviews.facebook.net/D11055	13 years ago
Mayank Agarwal	3c35eda9bd	Make Write API work for TTL databases Summary: Added logic to make another WriteBatch with Timestamps during the Write function execution in TTL class. Also expanded the ttl_test to test for it. Have done nothing for Merge for now. Test Plan: make ttl_test;./ttl_test Reviewers: haobo, vamsi, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D10827	13 years ago
Dhruba Borthakur	1b69f1e584	Fix referring freed memory in earlier commit. Summary: Fix referring freed memory in earlier commit by https://reviews.facebook.net/D11181 Test Plan: make check Reviewers: haobo, sheki Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11193	13 years ago
Abhishek Kona	4a8554d5bb	[Rocksdb] fix wrong assert Summary: the assert was wrong in D11145. Broke build Test Plan: make db_bench run it Reviewers: dhruba, haobo, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11187	13 years ago
Dhruba Borthakur	c5de1b9391	Print name of user comparator in LOG. Summary: The current code prints the name of the InternalKeyComparator in the log file. We would also like to print the name of the user-specified comparator for easier debugging. Test Plan: make check Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D11181	13 years ago
Abhishek Kona	a4913c5170	[rocksdb] names for all metrics provided in statistics.h Summary: Provide a map of histograms and ticker vs strings. Fb303 libraries can use this to provide the mapping. We will not have to duplicate the code during release. Test Plan: db_bench with statistics=1 Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11145	13 years ago
Mayank Agarwal	184343a061	Max_mem_compaction_level can have maximum value of num_levels-1 Summary: Without this files could be written out to a level greater than the maximum level possible and is the source of the segfaults that wormhole was getting. The sequence of steps that was followed: 1. WriteLevel0Table was called when memtable was to be flushed for a file. 2. PickLevelForMemTableOutput was called to determine the level to which this file should be pushed. 3. PickLevelForMemTableOutput returned a wrong result because max_mem_compaction_level was equal to 2 even when num_levels was equal to 0. The fix to re-initialize max_mem_compaction_level based on num_levels passed seems correct. Test Plan: make all check; Also made a dummy file to mimic the wormhole-file behaviour which was causing the segfaults and found that the same segfault occurs without this change and not with this. Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11157	13 years ago
Mayank Agarwal	7a6bd8e975	Modifying options to db_stress when it is run with db_crashtest Summary: These extra options caught some bugs. Will be run via Jenkins now with the crash_test Test Plan: ./make crashtest Reviewers: dhruba, vamsi Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11151	13 years ago
Vamsi Ponnekanti	3bb9449906	[Fix whitebox crash test failure] Summary: I think the check for "error" that I added had caused false alarm. Fixed that. Test Plan: Revert Plan: OK Task ID: # Reviewers: emayanke, dhruba Reviewed By: emayanke Differential Revision: https://reviews.facebook.net/D11139	13 years ago
Abhishek Kona	e982b5a489	[Rocksdb] measure table open io in a histogram Summary: as title Test Plan: db_bench --statistics=1 check for statistic. Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11109	13 years ago
Jim Paton	8ef328ee6a	ctags and cscope support to Makefile Summary: Added a target to Makefile called 'tags' that runs ctags and cscope on all .cc and .h files Test Plan: Run 'make tags'. Then start vim and do :set tags=./tags :cs add cscope.out These commands should give you no error messages. You should then be able to access cscope db and ctags as normal in vim. Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D11103	13 years ago
Vamsi Ponnekanti	5cf7a00bda	[Make most of the changes suggested by Aaron] Summary: $title Test Plan: Revert Plan: OK Task ID: # Reviewers: emayanke, akushner Reviewed By: akushner Differential Revision: https://reviews.facebook.net/D10923	13 years ago
Deon Nicholas	db1f0cddf3	Fixed valgrind errors	13 years ago
Deon Nicholas	d8c7c45ea0	Very basic Multiget and simple test cases. Summary: Implemented the MultiGet operator which takes in a list of keys and returns their associated values. Currently uses std::vector as its container data structure. Otherwise, it works identically to "Get". Test Plan: 1. make db_test ; compile it 2. ./db_test ; test it 3. make all check ; regress / run all tests 4. make release ; (optional) compile with release settings Reviewers: haobo, MarkCallaghan, dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D10875	13 years ago
Abhishek Kona	d91b42ee27	[Rocksdb] Measure all FSYNC/SYNC times Summary: Add stop watches around all sync calls. Test Plan: db_bench check if respective histograms are printed Reviewers: haobo, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11073	13 years ago
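
The time-bucketed count described in commits b858da709a and 61f1baaedf (counts over the half-open range [start_time, end_time), partitioned by a bucket size in seconds) can be illustrated with a small sketch. This is a Python illustration only, not the ldb implementation, which is C++ and is not shown on this page; the function name `bucket_counts` and its parameters are hypothetical, and the choice to report a shorter final bucket is an assumption.

```python
from collections import Counter


def bucket_counts(timestamps, start_time, end_time, bucket_size):
    """Count timestamps per bucket over the half-open range [start_time, end_time).

    Hypothetical helper for illustration; all arguments are integers in
    Unix-timestamp seconds, and bucket_size is the bucket width in seconds.
    """
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")

    counts = Counter()
    for ts in timestamps:
        # Half-open interval: start_time is included, end_time is excluded.
        if start_time <= ts < end_time:
            # Bucket 0 covers [start_time, start_time + bucket_size), and so on.
            counts[(ts - start_time) // bucket_size] += 1

    # Ceiling division, so a shorter final bucket is still reported (assumption).
    num_buckets = -(-(end_time - start_time) // bucket_size)
    return [counts[i] for i in range(num_buckets)]


if __name__ == "__main__":
    # 120 falls in the third bucket [120, 125); 125 would be excluded.
    print(bucket_counts([100, 105, 110, 119, 120], 100, 125, 10))  # [2, 2, 1]
```

Using a half-open range means adjacent ranges such as [100, 120) and [120, 140) never count the same timestamp twice.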

562 Commits (e9b675bd94370ac15cf0ab65ba9beef7a02d904b)