rocksdb

Commit Graph

Author	SHA1	Message	Date
Dhruba Borthakur	5d16e503a6	Improved CompactionFilter api: pass in a opaque argument to CompactionFilter invocation. Summary: There are applications that operate on multiple leveldb instances. These applications will like to pass in an opaque type for each leveldb instance and this type should be passed back to the application with every invocation of the CompactionFilter api. Test Plan: Enehanced unit test for opaque parameter to CompactionFilter. Reviewers: heyongqiang Reviewed By: heyongqiang CC: MarkCallaghan, sheki, emayanke Differential Revision: https://reviews.facebook.net/D6711	12 years ago
heyongqiang	20d18a89a3	disable size compaction in ldb reduce_levels and added compression and file size parameter to it Summary: disable size compaction in ldb reduce_levels, this will avoid compactions rather than the manual comapction, added --compression=none\|snappy\|zlib\|bzip2 and --file_size= per-file size to ldb reduce_levels command Test Plan: run ldb Reviewers: dhruba, MarkCallaghan Reviewed By: dhruba CC: sheki, emayanke Differential Revision: https://reviews.facebook.net/D6597	12 years ago
Dhruba Borthakur	5273c81483	Ability to invoke application hook for every key during compaction. Summary: There are certain use-cases where the application intends to delete older keys aftre they have expired a certian time period. One option for those applications is to periodically scan the entire database and delete appropriate keys. A better way is to allow the application to hook into the compaction process. This patch allows the application to set a method callback for every key that is being compacted. If this method returns true, then the key is not preserved in the output of the compaction. Test Plan: This is mostly to preview the proposed new public api. Since it is a public api, please do due diligence on reviewing it. I will be writing test cases for this api in mynext version of this patch. Reviewers: MarkCallaghan, heyongqiang Reviewed By: heyongqiang CC: sheki, adsharma Differential Revision: https://reviews.facebook.net/D6285	12 years ago
amayank	854c66b089	Make compression options configurable. These include window-bits, level and strategy for ZlibCompression Summary: Leveldb currently uses windowBits=-14 while using zlib compression.(It was earlier 15). This makes the setting configurable. Related changes here: https://reviews.facebook.net/D6105 Test Plan: make all check Reviewers: dhruba, MarkCallaghan, sheki, heyongqiang Differential Revision: https://reviews.facebook.net/D6393	12 years ago
heyongqiang	3096fa7534	Add two more options: disable block cache and make table cache shard number configuable Summary: as subject Test Plan: run db_bench and db_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6111	12 years ago
Dhruba Borthakur	321dfdc3ae	Allow having different compression algorithms on different levels. Summary: The leveldb API is enhanced to support different compression algorithms at different levels. This adds the option min_level_to_compress to db_bench that specifies the minimum level for which compression should be done when compression is enabled. This can be used to disable compression for levels 0 and 1 which are likely to suffer from stalls because of the CPU load for memtable flushes and (L0,L1) compaction. Level 0 is special as it gets frequent memtable flushes. Level 1 is special as it frequently gets all:all file compactions between it and level 0. But all other levels could be the same. For any level N where N > 1, the rate of sequential IO for that level should be the same. The last level is the exception because it might not be full and because files from it are not read to compact with the next larger level. The same amount of time will be spent doing compaction at any level N excluding N=0, 1 or the last level. By this standard all of those levels should use the same compression. The difference is that the loss (using more disk space) from a faster compression algorithm is less significant for N=2 than for N=3. So we might be willing to trade disk space for faster write rates with no compression for L0 and L1, snappy for L2, zlib for L3. Using a faster compression algorithm for the mid levels also allows us to reclaim some cpu without trading off much loss in disk space overhead. Also note that little is to be gained by compressing levels 0 and 1. For a 4-level tree they account for 10% of the data. For a 5-level tree they account for 1% of the data. With compression enabled: * memtable flush rate is ~18MB/second * (L0,L1) compaction rate is ~30MB/second With compression enabled but min_level_to_compress=2 * memtable flush rate is ~320MB/second * (L0,L1) compaction rate is ~560MB/second This practicaly takes the same code from https://reviews.facebook.net/D6225 but makes the leveldb api more general purpose with a few additional lines of code. Test Plan: make check Differential Revision: https://reviews.facebook.net/D6261	12 years ago
Mark Callaghan	70c42bf05f	Adds DB::GetNextCompaction and then uses that for rate limiting db_bench Summary: Adds a method that returns the score for the next level that most needs compaction. That method is then used by db_bench to rate limit threads. Threads are put to sleep at the end of each stats interval until the score is less than the limit. The limit is set via the --rate_limit=$double option. The specified value must be > 1.0. Also adds the option --stats_per_interval to enable additional metrics reported every stats interval. Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin PUBLIC platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6243	12 years ago
Kai Liu	d50f8eb603	Enable LevelDb to create a new log file if current log file is too large. Summary: Enable LevelDb to create a new log file if current log file is too large. Test Plan: Write a script and manually check the generated info LOG. Task ID: 1803577 Blame Rev: Reviewers: dhruba, heyongqiang Reviewed By: heyongqiang CC: zshao Differential Revision: https://reviews.facebook.net/D6003	12 years ago
Mark Callaghan	51d2adfbeb	Fix broken build. Add stdint.h to get uint64_t Summary: I still get failures from this. Not sure whether there was a fix in progress. Task ID: # Blame Rev: Test Plan: compile Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin PUBLIC platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D6147	12 years ago
Dhruba Borthakur	1ca0584345	This is the mega-patch multi-threaded compaction published in https://reviews.facebook.net/D5997. Summary: This patch allows compaction to occur in multiple background threads concurrently. If a manual compaction is issued, the system falls back to a single-compaction-thread model. This is done to ensure correctess and simplicity of code. When the manual compaction is finished, the system resumes its concurrent-compaction mode automatically. The updates to the manifest are done via group-commit approach. Test Plan: run db_bench	12 years ago
Dhruba Borthakur	aa73538f2a	The deletion of obsolete files should not occur very frequently. Summary: The method DeleteObsolete files is a very costly methind, especially when the number of files in a system is large. It makes a list of all live-files and then scans the directory to compute the diff. By default, this method is executed after every compaction run. This patch makes it such that DeleteObsolete files is never invoked twice within a configured period. Test Plan: run all unit tests Reviewers: heyongqiang, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D6045	12 years ago
heyongqiang	3662c2976a	improve comments about target_file_size_base, target_file_size_multiplier, max_bytes_for_level_base, max_bytes_for_level_multiplier Summary: Summary: as subject Test Plan: compile Reviewers: MarkCallaghan, dhruba Differential Revision: https://reviews.facebook.net/D5499	12 years ago
heyongqiang	a8464ed820	add an option to disable seek compaction Summary: as subject. This diff should be good for benchmarking. will send another diff to make it better in the case the seek compaction is enable. In that coming diff, will not count a seek if the bloomfilter filters. Test Plan: build Reviewers: dhruba, MarkCallaghan Reviewed By: MarkCallaghan Differential Revision: https://reviews.facebook.net/D5481	12 years ago
heyongqiang	0f43aa474e	put log in a seperate dir Summary: added a new option db_log_dir, which points the log dir. Inside that dir, in order to make log names unique, the log file name is prefixed with the leveldb data dir absolute path. Test Plan: db_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D5205	12 years ago
Dhruba Borthakur	fe93631678	Clean up compiler warnings generated by -Wall option. Summary: Clean up compiler warnings generated by -Wall option. make clean all OPT=-Wall This is a pre-requisite before making a new release. Test Plan: compile and run unit tests Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D5019	12 years ago
Dhruba Borthakur	fc20273e73	Introduce a new method Env->Fsync() that issues fsync (instead of fdatasync). Summary: Introduce a new method Env->Fsync() that issues fsync (instead of fdatasync). This is needed for data durability when running on ext3 filesystems. Added options to the benchmark db_bench to generate performance numbers with either fsync or fdatasync enabled. Cleaned up Makefile to build leveldb_shell only when building the thrift leveldb server. Test Plan: build and run benchmark Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D4911	12 years ago
Dhruba Borthakur	e5a7c8e580	Log the open-options to the LOG. Summary: Log the open-options to the LOG. Use options_ instead of options because SanitizeOptions could modify the max_file_open limit. Test Plan: num db_bench Reviewers: heyongqiang Reviewed By: heyongqiang Differential Revision: https://reviews.facebook.net/D4833	12 years ago
heyongqiang	6ba1f17789	adding a scribe logger in leveldb to log leveldb deploy stats Summary: as subject. A new log is written to scribe via thrift client when a new db is opened and when there is a compaction. a new option var scribe_log_db_stats is added. Test Plan: manually checked using command "ptail -time 0 leveldb_deploy_stats" Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D4659	12 years ago
heyongqiang	f16e393658	add more options to db_ben Summary: as subject Test Plan: run db_bench with new options Reviewers: dhruba Differential Revision: https://reviews.facebook.net/D4677	12 years ago
Dhruba Borthakur	c3096afd61	Introduce a new option disableDataSync for opening the database. If this is set to true, then the data written to newly created data files are not sycned to disk, instead depend on the OS to flush dirty data to stable storage. This option is good for bulk Test Plan: manual tests Task ID: # Blame Rev: Differential Revision: https://reviews.facebook.net/D4515	12 years ago
heyongqiang	22ee777f68	add flush interface to DB Summary: as subject. The flush will flush everything in the db. Test Plan: new test in db_test.cc Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D4029	13 years ago
heyongqiang	a347d4ac0d	add disable WAL option Summary: add disable WAL option Test Plan: new testcase in db_test.cc Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D4011	13 years ago
heyongqiang	daa816c4a0	add bzip2 compression Summary: add bzip2 compression Test Plan: testcases in table_test Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D3909	13 years ago
heyongqiang	054a5657f8	add zlib compression Summary: add zlib compression Test Plan: Will add more testcases Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D3873	13 years ago
heyongqiang	4e4b6812ff	Make some variables configurable for each db instance Summary: Make configurable 'targetFileSize', 'targetFileSizeMultiplier', 'maxBytesForLevelBase', 'maxBytesForLevelMultiplier', 'expandedCompactionFactor', 'maxGrandParentOverlapFactor' Test Plan: N/A Reviewers: dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D3801	13 years ago
Dhruba Borthakur	f50ece60c7	Fix table-cache size bug, gather table-cache statistics and prevent readahead done by fs. Summary: Summary: The db_bench test was not using the specified value for the max-file-open. Fixed. The fs readhead is switched off. Gather statistics about the table cache and print it out at the end of the tets run. Test Plan: Revert Plan: Reviewers: adsharma, sc Reviewed By: adsharma Differential Revision: https://reviews.facebook.net/D3441	13 years ago
Dhruba Borthakur	3b86a51cb1	Ability to switch on checksum verification from benchmark. Summary: Task ID: # Blame Rev: Test Plan: Revert Plan: Differential Revision: https://reviews.facebook.net/D3309	13 years ago
Sanjay Ghemawat	85584d497e	Added bloom filter support. In particular, we add a new FilterPolicy class. An instance of this class can be supplied in Options when opening a database. If supplied, the instance is used to generate summaries of keys (e.g., a bloom filter) which are placed in sstables. These summaries are consulted by DB::Get() so we can avoid reading sstable blocks that are guaranteed to not contain the key we are looking for. This change provides one implementation of FilterPolicy based on bloom filters. Other changes: - Updated version number to 1.4. - Some build tweaks. - C binding for CompactRange. - A few more benchmarks: deleteseq, deleterandom, readmissing, seekrandom. - Minor .gitignore update.	13 years ago
Hans Wennborg	36a5f8ed7f	A number of fixes: - Replace raw slice comparison with a call to user comparator. Added test for custom comparators. - Fix end of namespace comments. - Fixed bug in picking inputs for a level-0 compaction. When finding overlapping files, the covered range may expand as files are added to the input set. We now correctly expand the range when this happens instead of continuing to use the old range. For example, suppose L0 contains files with the following ranges: F1: a .. d F2: c .. g F3: f .. j and the initial compaction target is F3. We used to search for range f..j which yielded {F2,F3}. However we now expand the range as soon as another file is added. In this case, when F2 is added, we expand the range to c..j and restart the search. That picks up file F1 as well. This change fixes a bug related to deleted keys showing up incorrectly after a compaction as described in Issue 44. (Sync with upstream @25072954)	13 years ago
Gabor Cselle	299ccedfec	A number of bugfixes: - Added DB::CompactRange() method. Changed manual compaction code so it breaks up compactions of big ranges into smaller compactions. Changed the code that pushes the output of memtable compactions to higher levels to obey the grandparent constraint: i.e., we must never have a single file in level L that overlaps too much data in level L+1 (to avoid very expensive L-1 compactions). Added code to pretty-print internal keys. - Fixed bug where we would not detect overlap with files in level-0 because we were incorrectly using binary search on an array of files with overlapping ranges. Added "leveldb.sstables" property that can be used to dump all of the sstables and ranges that make up the db state. - Removing post_write_snapshot support. Email to leveldb mailing list brought up no users, just confusion from one person about what it meant. - Fixing static_cast char to unsigned on BIG_ENDIAN platforms. Fixes Issue 35 and Issue 36. - Comment clarification to address leveldb Issue 37. - Change license in posix_logger.h to match other files. - A build problem where uint32 was used instead of uint32_t. Sync with upstream @24408625	13 years ago
gabor@google.com	60bd8015f2	Speed up Snappy uncompression, new Logger interface. - Removed one copy of an uncompressed block contents changing the signature of Snappy_Uncompress() so it uncompresses into a flat array instead of a std::string. Speeds up readrandom ~10%. - Instead of a combination of Env/WritableFile, we now have a Logger interface that can be easily overridden applications that want to supply their own logging. - Separated out the gcc and Sun Studio parts of atomic_pointer.h so we can use 'asm', 'volatile' keywords for Sun Studio. git-svn-id: https://leveldb.googlecode.com/svn/trunk@39 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
gabor@google.com	6872ace901	Sun Studio support, and fix for test related memory fixes. - LevelDB patch for Sun Studio Based on a patch submitted by Theo Schlossnagle - thanks! This fixes Issue 17. - Fix a couple of test related memory leaks. git-svn-id: https://leveldb.googlecode.com/svn/trunk@38 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
gabor@google.com	6699c7ebe6	Small tweaks and bugfixes for Issue 18 and 19. Slight tweak to the no-overlap optimization: only push to level 2 to reduce the amount of wasted space when the same small key range is being repeatedly overwritten. Fix for Issue 18: Avoid failure on Windows by avoiding deletion of lock file until the end of DestroyDB(). Fix for Issue 19: Disregard sequence numbers when checking for overlap in sstable ranges. This fixes issue 19: when writing the same key over and over again, we would generate a sequence of sstables that were never merged together since their sequence numbers were disjoint. Don't ignore map/unmap error checks. Miscellaneous fixes for small problems Sanjay found while diagnosing issue/9 and issue/16 (corruption_testr failures). - log::Reader reports the record type when it finds an unexpected type. - log::Reader no longer reports an error when it encounters an expected zero record regardless of the setting of the "checksum" flag. - Added a missing forward declaration. - Documented a side-effects of larger write buffer sizes (longer recovery time). git-svn-id: https://leveldb.googlecode.com/svn/trunk@37 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
dgrogan@chromium.org	ba6dac0e80	@20776309 * env_chromium.cc should not export symbols. * Fix MSVC warnings. * Removed large value support. * Fix broken reference to documentation file git-svn-id: https://leveldb.googlecode.com/svn/trunk@24 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
dgrogan@chromium.org	69c6d38342	reverting disastrous MOE commit, returning to r21 git-svn-id: https://leveldb.googlecode.com/svn/trunk@23 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
dgrogan@chromium.org	b743906eea	Revision created by MOE tool push_codebase.	14 years ago
dgrogan@chromium.org	b409afe968	chmod a-x git-svn-id: https://leveldb.googlecode.com/svn/trunk@21 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
dgrogan@chromium.org	f779e7a5d8	@20602303 . Default file permission is now 755. git-svn-id: https://leveldb.googlecode.com/svn/trunk@20 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
jorlow@chromium.org	9e33808a26	Fix last commit git-svn-id: https://leveldb.googlecode.com/svn/trunk@19 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
jorlow@chromium.org	8303bb1b33	Pull from upstream. git-svn-id: https://leveldb.googlecode.com/svn/trunk@14 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago
jorlow@chromium.org	f67e15e50f	Initial checkin. git-svn-id: https://leveldb.googlecode.com/svn/trunk@2 62dab493-f737-651d-591e-8d6aee1b9529	14 years ago

44 Commits (6c5a4d646a056126057fa25faf1d54adbd4f3c85)