rocksdb

Commit Graph

Author	SHA1	Message	Date
Siying Dong	9edda37027	Universal Compaction to Have a Size Percentage Threshold To Decide Whether to Compress Summary: This patch adds an option for universal compaction to allow us to only compress output files if the files compacted previously did not yet reach a specified ratio, to save CPU costs in some cases. Compression is always skipped for flushing. This is because the size information is not easy to evaluate for the flushing case. We can improve it later. Test Plan: add test DBTest.UniversalCompactionCompressRatio1 and DBTest.UniversalCompactionCompressRatio12 Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13467	11 years ago
Dhruba Borthakur	9cd221094c	Add appropriate LICENSE and Copyright message. Summary: Add appropriate LICENSE and Copyright message. Test Plan: make check Reviewers: CC: Task ID: # Blame Rev:	11 years ago
Siying Dong	073cbfc8f0	Enable background flush thread by default and fix issues related to it Summary: Enable background flush thread in this patch and fix unit tests with: (1) After background flush, schedule a background compaction if condition satisfied; (2) Fix a bug that if universal compaction is enabled and number of levels are set to be 0, compaction will not be automatically triggered (3) Fix unit tests to wait for compaction to finish instead of flush, before checking the compaction results. Test Plan: pass all unit tests Reviewers: haobo, xjin, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13461	11 years ago
Siying Dong	88f2f89068	Change Function names from Compaction->Flush When they really mean Flush Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance. Test Plan: Run test suites. Reviewers: haobo, dhruba, xjin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13431	11 years ago
Dhruba Borthakur	4463b11cad	Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Test Plan: make check Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13311	11 years ago
Dhruba Borthakur	0a9f873f4b	Removed scribe, thrift and java modules. Summary: Removed scribe, thrift and java modules. Test Plan: make release make check Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13293	11 years ago
Dhruba Borthakur	a143ef9b38	Change namespace from leveldb to rocksdb Summary: Change namespace from leveldb to rocksdb. This allows a single application to link in open-source leveldb code as well as rocksdb code into the same process. Test Plan: compile rocksdb Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13287	11 years ago
Mayank Agarwal	854d236361	Add backward compatible option in GetLiveFiles to choose whether to not Flush first Summary: As explained in comments in GetLiveFiles in db.h, this option will cause flush to be skipped in GetLiveFiles because some use-cases use GetSortedWalFiles after GetLiveFiles to generate more complete snapshots. Using GetSortedWalFiles after GetLiveFiles allows us to not Flush in GetLiveFiles first because wals have everything. Note: file deletions will be disabled before calling GLF or GSWF so live logs will not move to archive logs or get deleted. Note: Manifest file is truncated to a proper value in GLF, so it will always replay from the proper wal files on a restart Test Plan: make Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13257	11 years ago
Haobo Xu	fa798e9e28	[Rocksdb] Submit mem table flush job in a different thread pool Summary: As title. This is just a quick hack and not ready for commit. fails a lot of unit test. I will test/debug it directly in ViewState shadow . Test Plan: Try it in shadow test. Reviewers: dhruba, xjin CC: leveldb Differential Revision: https://reviews.facebook.net/D12933	11 years ago
Haobo Xu	0e422308aa	[RocksDB] Remove Log file immediately after memtable flush Summary: As title. The DB log file life cycle is tied up with the memtable it backs. Once the memtable is flushed to sst and committed, we should be able to delete the log file, without holding the mutex. This is part of the bigger change to avoid FindObsoleteFiles at runtime. It deals with log files. sst files will be dealt with later. Test Plan: make check; db_bench Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11709	11 years ago
Xing Jin	42c109cc2e	New ldb command to convert compaction style Summary: Add new command "change_compaction_style" to ldb tool. For universal->level, it shows "nothing to do". For level->universal, it compacts all files into a single one and moves the file to level 0. Also add check for number of files at level 1+ when opening db with universal compaction style. Test Plan: 'make all check'. New unit test for internal conversion function. Also manually test various cmd like: ./ldb change_compaction_style --old_compaction_style=0 --new_compaction_style=1 --db=/tmp/leveldbtest-3088/db_test Reviewers: haobo, dhruba Reviewed By: haobo CC: vamsi, emayanke Differential Revision: https://reviews.facebook.net/D12603	11 years ago
Dhruba Borthakur	fc0c399d2e	Introduced a new flag non_blocking_io in ReadOptions. Summary: If ReadOptions.non_blocking_io is set to true, then KeyMayExists and Iterators will return data that is cached in RAM. If the Iterator needs to do IO from storage to serve the data, then the Iterator.status() will return Status::IsRetry(). Test Plan: Enhanced unit test DBTest.KeyMayExist to detect whether any IOs are issued from storage. Added DBTest.NonBlockingIteration to verify nonblocking Iterations. Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Maniphest Tasks: T63 Differential Revision: https://reviews.facebook.net/D12531	11 years ago
Dhruba Borthakur	1186192ed1	Replace include/leveldb with include/rocksdb. Summary: Replace include/leveldb with include/rocksdb. Test Plan: make clean; make check make clean; make release Differential Revision: https://reviews.facebook.net/D12489	11 years ago
Simha Venkataramaiah	60bf2b7d4a	Add APIs to query SST file metadata and to delete specific SST files Summary: An api to query the level, key ranges, size etc for each SST file and an api to delete a specific file from the db and all associated state in the bookkeeping datastructures. Notes: Editing the manifest version does not release the obsolete files right away. However deleting the file directly will mess up the iterator. We may need a more aggressive/timely file deletion api. I have used std::unique_ptr - will switch to boost:: since this is external. thoughts? Unit test is fragile right now as it expects the compaction at certain levels. Test Plan: unittest Reviewers: dhruba, vamsi, emayanke CC: zshao, leveldb, haobo Task ID: # Blame Rev:	11 years ago
Mayank Agarwal	8a3547d38e	API for getting archived log files Summary: Also expanded class LogFile to have startSequene and FileSize and exposed it publicly Test Plan: make all check Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12087	11 years ago
Jim Paton	1036537c94	Add soft and hard rate limit support Summary: This diff adds support for both soft and hard rate limiting. The following changes are included: 1) Options.rate_limit is renamed to Options.hard_rate_limit. 2) Options.rate_limit_delay_milliseconds is renamed to Options.rate_limit_delay_max_milliseconds. 3) Options.soft_rate_limit is added. 4) If the maximum compaction score is > hard_rate_limit and rate_limit_delay_max_milliseconds == 0, then writes are delayed by 1 ms at a time until the max compaction score falls below hard_rate_limit. 5) If the max compaction score is > soft_rate_limit but <= hard_rate_limit, then writes are delayed by 0-1 ms depending on how close we are to hard_rate_limit. 6) Users can disable 4 by setting hard_rate_limit = 0. They can add a limit to the maximum amount of time waited by setting rate_limit_delay_max_milliseconds > 0. Thus, the old behavior can be preserved by setting soft_rate_limit = 0, which is the default. Test Plan: make -j32 check ./db_stress Reviewers: dhruba, haobo, MarkCallaghan Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12003	11 years ago
Mayank Agarwal	59d0b02f8b	Expand KeyMayExist to return the proper value if it can be found in memory and also check block_cache Summary: Removed KeyMayExistImpl because KeyMayExist demanded Get like semantics now. Removed no_io from memtable and imm because we need the proper value now and shouldn't just stop when we see Merge in memtable. Added checks to block_cache. Updated documentation and unit-test Test Plan: make all check;db_stress for 1 hour Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11853	11 years ago
Jim Paton	9700677a2b	Slow down writes gradually rather than suddenly Summary: Currently, when a certain number of level0 files (level0_slowdown_writes_trigger) are present, RocksDB will slow down each write by 1ms. There is a second limit of level0 files at which RocksDB will stop writes altogether (level0_stop_writes_trigger). This patch enables the user to supply a third parameter specifying the number of files at which Rocks will start slowing down writes (level0_start_slowdown_writes). When this number is reached, Rocks will slow down writes as a quadratic function of level0_slowdown_writes_trigger - num_level0_files. For some workloads, this improves latency and throughput. I will post some stats momentarily in https://our.intern.facebook.com/intern/tasks/?t=2613384. Test Plan: make -j32 check ./db_stress ./db_bench Reviewers: dhruba, haobo, MarkCallaghan, xjin Reviewed By: xjin CC: leveldb, xjin, zshao Differential Revision: https://reviews.facebook.net/D11859	11 years ago
Jim Paton	18afff2e63	Add stall counts to statistics Summary: Previously, statistics are kept on how much time is spent on stalls of different types. This patch adds support for keeping number of stalls of each type. For example, instead of just reporting how many microseconds are spent waiting for memtables to be compacted, it will also report how many times a write stalled for that to occur. Test Plan: make -j32 check ./db_stress # Not really sure what else should be done... Reviewers: dhruba, MarkCallaghan, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11841	11 years ago
Jim Paton	52d7ecfc78	Virtualize SkipList Interface Summary: This diff virtualizes the skiplist interface so that users can provide their own implementation of a backing store for MemTables. Eventually, the backing store will be responsible for its own synchronization, allowing users (and us) to experiment with different lockless implementations. Test Plan: make clean make -j32 check ./db_stress Reviewers: dhruba, emayanke, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11739	11 years ago
Mayank Agarwal	bf66c10b13	Use KeyMayExist for WriteBatch-Deletes Summary: Introduced KeyMayExist checking during writebatch-delete and removed from Outer Delete API because it uses writebatch-delete. Added code to skip getting Table from disk if not already present in table_cache. Some renaming of variables. Introduced KeyMayExistImpl which allows checking since specified sequence number in GetImpl useful to check partially written writebatch. Changed KeyMayExist to not be pure virtual and provided a default implementation. Expanded unit-tests in db_test to check appropriately. Ran db_stress for 1 hour with ./db_stress --max_key=100000 --ops_per_thread=10000000 --delpercent=50 --filter_deletes=1 --statistics=1. Test Plan: db_stress;make check Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D11745	11 years ago
Haobo Xu	9ee68871dc	[RocksDB] Enable manual compaction to move files back to an appropriate level. Summary: As title. This diff added an option reduce_level to CompactRange. When set to true, it will try to move the files back to the minimum level sufficient to hold the data set. Note that the default is set to true now, just to exercise it in all existing tests. Will set the default to false before check-in, for backward compatibility. Test Plan: make check; Reviewers: dhruba, emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D11553	11 years ago
Mayank Agarwal	2a986919d6	Make rocksdb-deletes faster using bloom filter Summary: Wrote a new function in db_impl.c-CheckKeyMayExist that calls Get but with a new parameter turned on which makes Get return false only if bloom filters can guarantee that key is not in database. Delete calls this function and if the option- deletes_use_filter is turned on and CheckKeyMayExist returns false, the delete will be dropped saving: 1. Put of delete type 2. Space in the db,and 3. Compaction time Test Plan: make all check; will run db_stress and db_bench and enhance unit-test once the basic design gets approved Reviewers: dhruba, haobo, vamsi Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11607	11 years ago
Abhishek Kona	5ef6bb8c37	[rocksdb][refactor] statistic printing code to one place Summary: $title Test Plan: db_bench --statistics=1 Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11373	12 years ago
Dhruba Borthakur	6acbe0fc45	Compact multiple memtables before flushing to storage. Summary: Merge multiple memtables in memory before writing them out to a file in L0. There is a new config parameter min_write_buffer_number_to_merge that specifies the number of write buffers that should be merged together to a single file in storage. The system will not flush write buffers to storage unless at least this many buffers have accumulated in memory. The default value of this new parameter is 1, which means that a write buffer will be immediately flushed to disk as soon as it is ready. Test Plan: make check Differential Revision: https://reviews.facebook.net/D11241	12 years ago
Haobo Xu	bdf1085944	[RocksDB] cleanup EnvOptions Summary: This diff simplifies EnvOptions by treating it as POD, similar to Options. - virtual functions are removed and member fields are accessed directly. - StorageOptions is removed. - Options.allow_readahead and Options.allow_readahead_compactions are deprecated. - Unused global variables are removed: useOsBuffer, useFsReadAhead, useMmapRead, useMmapWrite Test Plan: make check; db_stress Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11175	12 years ago
Deon Nicholas	d8c7c45ea0	Very basic Multiget and simple test cases. Summary: Implemented the MultiGet operator which takes in a list of keys and returns their associated values. Currently uses std::vector as its container data structure. Otherwise, it works identically to "Get". Test Plan: 1. make db_test ; compile it 2. ./db_test ; test it 3. make all check ; regress / run all tests 4. make release ; (optional) compile with release settings Reviewers: haobo, MarkCallaghan, dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D10875	12 years ago
Mark Callaghan	d9f538e1a9	Improve output for GetProperty('leveldb.stats') Summary: Display separate values for read, write & total compaction IO. Display compaction amplification and write amplification. Add similar values for the period since the last call to GetProperty. Results since the server started are reported as "cumulative" stats. Results since the last call to GetProperty are reported as "interval" stats. Level Files Size(MB) Time(sec) Read(MB) Write(MB) Rn(MB) Rnp1(MB) Wnew(MB) Amplify Read(MB/s) Write(MB/s) Rn Rnp1 Wnp1 NewW Count Ln-stall ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- 0 7 13 21 0 211 0 0 211 0.0 0.0 10.1 0 0 0 0 113 0.0 1 79 157 88 993 989 198 795 194 9.0 11.3 11.2 106 405 502 97 14 0.0 2 19 36 5 63 63 37 27 36 2.4 12.3 12.2 19 14 32 18 12 0.0 >>>>>>>>>>>>>>>>>>>>>>>>> text below has been is new and/or reformatted Uptime(secs): 122.2 total, 0.9 interval Compaction IO cumulative (GB): 0.21 new, 1.03 read, 1.23 write, 2.26 read+write Compaction IO cumulative (MB/sec): 1.7 new, 8.6 read, 10.3 write, 19.0 read+write Amplification cumulative: 6.0 write, 11.0 compaction Compaction IO interval (MB): 5.59 new, 0.00 read, 5.59 write, 5.59 read+write Compaction IO interval (MB/sec): 6.5 new, 0.0 read, 6.5 write, 6.5 read+write Amplification interval: 1.0 write, 1.0 compaction >>>>>>>>>>>>>>>>>>>>>>>> text above is new and/or reformatted Stalls(secs): 90.574 level0_slowdown, 0.000 level0_numfiles, 10.165 memtable_compaction, 0.000 leveln_slowdown Task ID: # Blame Rev: Test Plan: make check, run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin PUBLIC platform impact section - Bugzilla: # - end platform impact - Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11049	12 years ago
Haobo Xu	2df65c118c	[RocksDB] Dump counters and histogram data periodically with compaction stats Summary: As title Test Plan: make check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D10995	12 years ago
Haobo Xu	ef15b9d178	[RocksDB] Fix MaybeDumpStats Summary: MaybeDumpStats was causing lock problem Test Plan: make check; db_stress Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D10935	12 years ago
Haobo Xu	0e879c93de	[RocksDB] dump leveldb.stats periodically in LOG file. Summary: Added an option stats_dump_period_sec to dump leveldb.stats to LOG periodically for diagnosis. By default, it's set to a very big number 3600 (1 hour). Test Plan: make check; Reviewers: dhruba Reviewed By: dhruba CC: leveldb, zshao Differential Revision: https://reviews.facebook.net/D10761	12 years ago
Abhishek Kona	988c20b9f7	[RocksDB] Clear Archive WAL files Summary: WAL files are moved to archive directory and clear only at DB::Open. Can lead to a lot of space consumption in a Database. Added logic to periodically clear Archive Directory too. Test Plan: make all check + add unit test Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: leveldb Differential Revision: https://reviews.facebook.net/D10617	12 years ago
Haobo Xu	05e8854085	[Rocksdb] Support Merge operation in rocksdb Summary: This diff introduces a new Merge operation into rocksdb. The purpose of this review is mostly getting feedback from the team (everyone please) on the design. Please focus on the four files under include/leveldb/, as they spell the client visible interface change. include/leveldb/db.h include/leveldb/merge_operator.h include/leveldb/options.h include/leveldb/write_batch.h Please go over local/my_test.cc carefully, as it is a concrete use case. Please also review the implementation files to see if the straw man implementation makes sense. Note that the diff does pass all make check and truly supports forward iterator over db and a version of Get that's based on iterator. Future work: - Integration with compaction - A raw Get implementation I am working on a wiki that explains the design and implementation choices, but coding comes just naturally and I think it might be a good idea to share the code earlier. The code is heavily commented. Test Plan: run all local tests Reviewers: dhruba, heyongqiang Reviewed By: dhruba CC: leveldb, zshao, sheki, emayanke, MarkCallaghan Differential Revision: https://reviews.facebook.net/D9651	12 years ago
Haobo Xu	645ff8f231	Let's get rid of delete as much as possible, here are some examples. Summary: If a class owns an object: - If the object can be null => use a unique_ptr. no delete - If the object can not be null => don't even need new, let alone delete - for runtime sized array => use vector, no delete. Test Plan: make check Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: leveldb, zshao, sheki, emayanke, MarkCallaghan Differential Revision: https://reviews.facebook.net/D9783	12 years ago
Dhruba Borthakur	ad96563b79	Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database. Summary: This patch allows an application to specify whether to use bufferedio, reads-via-mmaps and writes-via-mmaps per database. Earlier, there was a global static variable that was used to configure this functionality. The default setting remains the same (and is backward compatible): 1. use bufferedio 2. do not use mmaps for reads 3. use mmap for writes 4. use readaheads for reads needed for compaction I also added a parameter to db_bench to be able to explicitly specify whether to do readaheads for compactions or not. Test Plan: make check Reviewers: sheki, heyongqiang, MarkCallaghan Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D9429	12 years ago
Abhishek Kona	02c459805b	Ignore a zero-sized file while looking for a seq-no in GetUpdatesSince Summary: Rocksdb can create 0 sized log files when it is opened and closed without any operations. The GetUpdatesSince fails currently if there is a log file of size zero. This diff fixes this. If a log file has size 0, it is removed from the probable_file_list Test Plan: unit test Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: leveldb Differential Revision: https://reviews.facebook.net/D9507	12 years ago
Abhishek Kona	d68880a1b9	Do not allow Transaction Log Iterator to fall ahead when writer is writing the same file Summary: Store the last flushed, seq no. in db_impl. Check against it in transaction Log iterator. Do not attempt to read ahead if we do not know if the data is flushed completely. Does not work if flush is disabled. Any ideas on fixing that? * Minor change, iter->Next is called the first time automatically for * the first time. Test Plan: existing test pass. More ideas on testing this? Planning to run some stress test. Reviewers: dhruba, heyongqiang CC: leveldb Differential Revision: https://reviews.facebook.net/D9087	12 years ago
Mark Callaghan	993543d1be	Add rate_delay_limit_milliseconds Summary: This adds the rate_delay_limit_milliseconds option to make the delay configurable in MakeRoomForWrite when the max compaction score is too high. This delay is called the Ln slowdown. This change also counts the Ln slowdown per level to make it possible to see where the stalls occur. From IO-bound performance testing, the Level N stalls occur: * with compression -> at the largest uncompressed level. This makes sense because compaction for compressed levels is much slower. When Lx is uncompressed and Lx+1 is compressed then files pile up at Lx because the (Lx,Lx+1)->Lx+1 compaction process is the first to be slowed by compression. * without compression -> at level 1 Task ID: #1832108 Blame Rev: Test Plan: run with real data, added test Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin PUBLIC platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D9045	12 years ago
Abhishek Kona	959337ed5b	Measure compaction time. Summary: just record time consumed in compaction Test Plan: compile Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D8781	12 years ago
amayank	b2c50f1c3f	Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value Summary: Changed the Get and Scan options with openForReadOnly mode to have access to the memtable. Changed the visibility of NewInternalIterator in db_impl from private to protected so that the derived class db_impl_read_only can call that in its NewIterator function for the scan case. The previous approach which changed the default for flush_on_destroy_ from false to true caused many problems in the unit tests due to empty sst files that it created. All unit tests pass now. Test Plan: make clean; make all check; ldb put and get and scans Reviewers: dhruba, heyongqiang, sheki Reviewed By: dhruba CC: kosievdmerwe, zshao, dilipj, kailiu Differential Revision: https://reviews.facebook.net/D8697	12 years ago
Chip Turner	0b83a83191	Fix poor error on num_levels mismatch and few other minor improvements Summary: Previously, if you opened a db with num_levels set lower than the database, you received the unhelpful message "Corruption: VersionEdit: new-file entry." Now you get a more verbose message describing the issue. Also, fix handling of compression_levels (both the run-over-the-end issue and the memory management of it). Lastly, unique_ptr'ify a couple of minor calls. Test Plan: make check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D8151	12 years ago
Chip Turner	2fdf91a4f8	Fix a number of object lifetime/ownership issues Summary: Replace manual memory management with std::unique_ptr in a number of places; not exhaustive, but this fixes a few leaks with file handles as well as clarifies semantics of the ownership of file handles with log classes. Test Plan: db_stress, make check Reviewers: dhruba Reviewed By: dhruba CC: zshao, leveldb, heyongqiang Differential Revision: https://reviews.facebook.net/D8043	12 years ago
Abhishek Kona	7d5a4383bb	rollover manifest file. Summary: Check in LogAndApply if the file size is more than the limit set in Options. Things to consider : will this be expensive? Test Plan: make all check. Inputs on a new unit test? Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7701	12 years ago
Dhruba Borthakur	f4c2b7cf97	Enhance ReadOnly mode to process all the committed transactions. Summary: Leveldb has an api OpenForReadOnly() that opens the database in readonly mode. This call had an option to not process the transaction log. This patch removes this option and always processes all transactions that had been committed. It has been done in such a way that it does not create/write to any new files in the process. The invariant of "no-writes" to the leveldb data directory is still true. This enhancement allows multiple threads to open the same database in readonly mode and access all transactions that were committed right up to the OpenForReadOnly call. I changed the public API to match the new semantics because there are no users who are currently using this api. Test Plan: make clean check Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7479	12 years ago
Dhruba Borthakur	24fc379273	A public api to fetch the latest transaction id. Summary: Implement an interface to retrieve the most current transaction id from the database. Test Plan: Added unit test. Reviewers: sheki Reviewed By: sheki CC: leveldb Differential Revision: https://reviews.facebook.net/D7269	12 years ago
Abhishek Kona	1c6742e32f	Refactor GetArchivalDirectoryName to filename.h Summary: filename.h has functions to do similar things. Moving code away from db_impl.cc Test Plan: make check Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D7251	12 years ago
Abhishek Kona	8055008909	GetUpdatesSince API to enable replication. Summary: How it works: * GetUpdatesSince takes a SequenceNumber. * A LogFile with the first SequenceNumber nearest and lesser than the requested Sequence Number is found. * Seek in the logFile till the requested SeqNumber is found. * Return an iterator which contains logic to return record's one by one. Test Plan: * Test case included to check the good code path. * Will update with more test-cases. * Feedback required on test-cases. Reviewers: dhruba, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7119	12 years ago
sheki	d4627e6de4	Move WAL files to archive directory, instead of deleting. Summary: Create a directory "archive" in the DB directory. During DeleteObsolteFiles move the WAL files (*.log) to the Archive directory, instead of deleting. Test Plan: Created a DB using DB_Bench. Reopened it. Checked if files move. Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6975	12 years ago
Abhishek Kona	d29f181923	Fix all the lint errors. Summary: Scripted and removed all trailing spaces and converted all tabs to spaces. Also fixed other lint errors. All lint errors from this point of time should be taken seriously. Test Plan: make all check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D7059	12 years ago
Dhruba Borthakur	9a357847eb	Delete non-visible keys during a compaction even in the presence of snapshots. Summary: LevelDB should delete almost-new keys when a long-open snapshot exists. The previous behavior is to keep all versions that were created after the oldest open snapshot. This can lead to database size bloat for high-update workloads when there are long-open snapshots and long-open snapshot will be used for logical backup. By "almost new" I mean that the key was updated more than once after the oldest snapshot. If there were two snapshots with seq numbers s1 and s2 (s1 < s2), and if we find two instances of the same key k1 that lie entirely within s1 and s2 (i.e. s1 < k1 < s2), then the earlier version of k1 can be safely deleted because that version is not visible in any snapshot. Test Plan: unit test attached make clean check Differential Revision: https://reviews.facebook.net/D6999	12 years ago


425 Commits (382277d0fe79ece86b799e3925919cf44c3afb4c)