Summary:
Added a function to `RandomAccessFile` to generate a unique ID for that file. Currently only `PosixRandomAccessFile` has this behaviour implemented, and only on Linux.
Changed how key is generated in `Table::BlockReader`.
Added tests to check whether the unique ID is stable, unique and not a prefix of another unique ID. Added tests to see that `Table` uses the cache more efficiently.
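A minimal sketch of one way to get IDs with these properties, not the code in this diff: if the ID is a fixed number of self-delimiting varints, no ID can be a prefix of another, and a block's cache key can be the file ID followed by the block offset. `MakeFileId`, `MakeBlockCacheKey`, and the (device, inode, generation) inputs are assumptions for illustration; the summary does not say what the ID is built from.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends v as a varint: 7 payload bits per byte, high bit set on every
// byte except the last, so each encoded number is self-delimiting.
void AppendVarint64(std::string* dst, uint64_t v) {
  assert(dst != nullptr);
  while (v >= 0x80) {
    dst->push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

// Hypothetical file ID: exactly three varints (inputs are assumed, not
// taken from the diff).
std::string MakeFileId(uint64_t device, uint64_t inode, uint64_t generation) {
  std::string id;
  AppendVarint64(&id, device);
  AppendVarint64(&id, inode);
  AppendVarint64(&id, generation);
  return id;
}

// Hypothetical block-cache key: file ID followed by the block offset.
std::string MakeBlockCacheKey(const std::string& file_id, uint64_t block_offset) {
  std::string key = file_id;
  AppendVarint64(&key, block_offset);
  return key;
}

// True if a is a prefix of b (including a == b).
bool IsPrefixOf(const std::string& a, const std::string& b) {
  return a.size() <= b.size() && b.compare(0, a.size(), a) == 0;
}
```

Because every ID holds the same number of self-delimiting fields, two different files cannot produce keys that collide for any pair of block offsets.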
Test Plan: make check
Reviewers: chip, vamsi, dhruba
Reviewed By: chip
CC: leveldb
Differential Revision: https://reviews.facebook.net/D8145
Summary: fallocate is linux only, so let's protect it with ifdef's
Test Plan: make
Reviewers: sheki, dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D8223
Summary:
Previously, if you opened a db with num_levels set lower than
the number of levels the database already uses, you received the unhelpful message "Corruption:
VersionEdit: new-file entry." Now you get a more verbose message
describing the issue.
Also, fix handling of compression_levels (both the run-over-the-end
issue and the memory management of it).
Lastly, unique_ptr'ify a couple of minor calls.
Test Plan: make check
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D8151
Summary:
We continually rebuilt build_version.c because we put the
current date into it, but that's what __DATE__ already is. This makes
builds faster.
This also fixes an issue with 'make clean FOO' not working properly.
Also tweak the build rules to be more consistent, always have warnings,
and add a 'make release' rule to handle flags for release builds.
Test Plan: make, make clean
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D8139
Summary:
On some filesystems, pre-allocation can be a considerable
amount of space. xfs in our production environment pre-allocates by
1GB, for instance. By using fallocate to inform the kernel of our
expected file sizes, we eliminate this wastage (that isn't recovered
until the file is closed which, in the case of LOG files, can be a
considerable amount of time).
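As a rough sketch of the call involved, guarded so it still builds where fallocate does not exist: the helper name `PreallocateKeepSize` and the use of `FALLOC_FL_KEEP_SIZE` are assumptions for illustration, not necessarily what this diff does.

```cpp
#include <cstdint>
#ifdef __linux__
#include <fcntl.h>         // fallocate (declared when _GNU_SOURCE is set, which g++ does by default)
#include <linux/falloc.h>  // FALLOC_FL_KEEP_SIZE
#endif

// Hypothetical helper: asks the kernel to reserve [offset, offset + len)
// for fd without changing the reported file size.
// Returns true on success, or when the platform has no fallocate at all.
bool PreallocateKeepSize(int fd, uint64_t offset, uint64_t len) {
#ifdef __linux__
  return fallocate(fd, FALLOC_FL_KEEP_SIZE, static_cast<off_t>(offset),
                   static_cast<off_t>(len)) == 0;
#else
  (void)fd;
  (void)offset;
  (void)len;
  return true;  // nothing to do on this platform
#endif
}
```

A caller would typically treat a false return as advisory and carry on writing, since preallocation is only a hint.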
Test Plan:
created an xfs loopback filesystem, mounted with
allocsize=4M, and ran db_stress. LOG file without this change was 4M,
and with it it was 128k then grew to normal size.
Reviewers: dhruba
Reviewed By: dhruba
CC: adsharma, leveldb
Differential Revision: https://reviews.facebook.net/D7953
Summary:
Replace manual memory management with std::unique_ptr in a
number of places; not exhaustive, but this fixes a few leaks with file
handles as well as clarifies semantics of the ownership of file handles
with log classes.
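To illustrate the ownership point, here is a small sketch in which a writer takes its destination file as a std::unique_ptr. `FakeWritableFile` and `LogWriterSketch` are stand-ins invented for this example; they are not the classes touched by this diff.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Stand-in for a file handle; it counts closes so ownership is observable.
class FakeWritableFile {
 public:
  explicit FakeWritableFile(int* close_count) : close_count_(close_count) {}
  ~FakeWritableFile() { ++*close_count_; }
  void Append(const std::string& data) { contents_ += data; }

 private:
  int* close_count_;
  std::string contents_;
};

// Taking a unique_ptr by value states in the signature that the writer
// owns the file and will release it when the writer is destroyed.
class LogWriterSketch {
 public:
  explicit LogWriterSketch(std::unique_ptr<FakeWritableFile> dest)
      : dest_(std::move(dest)) {
    assert(dest_ != nullptr);
  }
  void AddRecord(const std::string& record) { dest_->Append(record); }

 private:
  std::unique_ptr<FakeWritableFile> dest_;
};
```

With this shape there is no separate "who deletes the file" convention to remember: the handle is closed exactly once, when the writer goes away.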
Test Plan: db_stress, make check
Reviewers: dhruba
Reviewed By: dhruba
CC: zshao, leveldb, heyongqiang
Differential Revision: https://reviews.facebook.net/D8043
Summary:
Check in LogAndApply if the file size is more than the limit set in
Options.
Things to consider : will this be expensive?
Test Plan: make all check. Inputs on a new unit test?
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7701
Summary:
clang is an alternate compiler based on llvm. It produces
nicer error messages and finds some bugs that gcc doesn't, such as the
size_t change in this file (which caused some write return values to be
misinterpreted!)
Clang isn't the default; to try it, do "USE_CLANG=1 make" or "export
USE_CLANG=1" then make as normal
Test Plan: "make check" and "USE_CLANG=1 make check"
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D7899
Summary: `~ShardedLRUCache()` was empty despite `init()` allocating memory on the heap. Fixed the leak by freeing memory allocated by `init()`.
Test Plan:
make check
Ran valgrind on db_test before and after patch and saw leaked memory went down
Reviewers: vamsi, dhruba, emayanke, sheki
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7791
Summary:
Changed CreateDir() to CreateDirIfMissing() so a directory that already exists now causes an error.
Fixed CreateDirIfMissing() and added Env.DirExists()
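For reference, a POSIX-only sketch of the two operations. The free functions `PosixDirExists` and `PosixCreateDirIfMissing` are hypothetical names used here for illustration and are not the Env methods themselves.

```cpp
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// True only if path exists and is a directory.
bool PosixDirExists(const std::string& path) {
  struct stat st;
  return stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}

// Succeeds if the directory was created or was already there.
// Fails if something that is not a directory already has that name.
bool PosixCreateDirIfMissing(const std::string& path) {
  if (mkdir(path.c_str(), 0755) == 0) return true;
  return errno == EEXIST && PosixDirExists(path);
}
```

The second check is what separates "directory already present" from "a regular file is in the way", which is the case the db_bench run against a file named "testdb" exercises.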
Test Plan:
make check to test for regressions
Ran the following to test if the error message is not about lock files not existing
./db_bench --db=dir/testdb
After creating a file "testdb", ran the following to see if it failed with sane error message:
./db_bench --db=testdb
Reviewers: dhruba, emayanke, vamsi, sheki
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7707
Summary:
There is a compilation error while using gcc 4.7.1.
util/ldb_cmd.cc:381:3: error: ‘leveldb::ReadOptions::ReadOptions’ names the constructor, not the type
util/ldb_cmd.cc:381:37: error: expected ‘;’ before ‘read_options’
util/ldb_cmd.cc:381:49: error: statement cannot resolve address of overloaded function
Test Plan: make clean check
Reviewers: sheki, emayanke, zshao
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7659
Summary: The queries will come from stdin. One key per line. The output will be in stdout, in the format of "<key> ==> <value>" if found, or "<key>" if not found. "--hex" uses HEX-encoded keys and values in both input and output.
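The line format can be sketched as follows, with a std::map standing in for the database. `RunQueries` and `RunQueriesOnText` are hypothetical helpers for illustration, and the "--hex" handling is left out.

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Reads one key per line from `in`; writes "<key> ==> <value>" when the
// key is found and just "<key>" when it is not.
void RunQueries(const std::map<std::string, std::string>& db,
                std::istream& in, std::ostream& out) {
  std::string key;
  while (std::getline(in, key)) {
    std::map<std::string, std::string>::const_iterator it = db.find(key);
    if (it != db.end()) {
      out << key << " ==> " << it->second << "\n";
    } else {
      out << key << "\n";
    }
  }
}

// Convenience wrapper that runs the same loop over an in-memory string.
std::string RunQueriesOnText(const std::map<std::string, std::string>& db,
                             const std::string& input) {
  std::istringstream in(input);
  std::ostringstream out;
  RunQueries(db, in, out);
  return out.str();
}
```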
Test Plan: ldb query --db=leveldb_db --hex
Reviewers: dhruba, emayanke, sheki
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7617
Summary: We were ignoring additional chars at the end of an arg. This can create confusion, e.g. --disable_wal=0 will act the same as --disable_wal without any warnings.
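The difference between the two behaviours can be sketched like this. Both function names are hypothetical; the actual parsing code in ldb is not shown in this message.

```cpp
#include <string>

// Loose matching: any argument that merely starts with the flag matches,
// so trailing characters are silently ignored.
bool MatchesByPrefix(const std::string& arg, const std::string& flag) {
  return arg.compare(0, flag.size(), flag) == 0;
}

// Strict matching: the whole argument must equal the flag.
bool MatchesExactly(const std::string& arg, const std::string& flag) {
  return arg == flag;
}
```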
Test Plan:
Tried this:
[zshao@dev485 ~/git/rocksdb] ./ldb dump --statsAAA
Failed: Unknown argument:--statsAAA
Reviewers: dhruba, sheki, emayanke
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7635
Summary:
This allows ldb to control the write_buffer_size (which determines the L0 file size) and file_size (which determines the L1 file size). Since the target_file_size_ratio is 1 by default, all other levels will also have the same file size as L1.
As part of the diff, I also cleaned up some unused code and help messages.
Test Plan: ./ldb load --db=/data/users/zshao/test_leveldb --file_size=64000000 --write_buffer_size=32000000 --create_if_missing --input_hex --disable_wal
Reviewers: dhruba, sheki, emayanke
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7569
Summary: This command accepts key-value pairs from stdin in the same format as the "ldb dump" command. This allows us to try out different compression algorithms/block sizes easily.
Test Plan: dump, load, dump, verify the data is the same.
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7443
Summary: The old code was omitting the 0 if the char is less than 16.
Test Plan:
Tried the following program:
#include <stdio.h>
int main() {
  unsigned char c = 1;
  printf("%X\n", c);    /* no zero padding */
  printf("%02X\n", c);  /* always two hex digits */
  return 0;
}
The output is:
1
01
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7437
Summary: This allows us to use ldb to do more experiments like block_size changes.
Test Plan: run it by hand.
Reviewers: dhruba, sheki, emayanke
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7431
Summary:
Suppose you submit 100 background tasks one after another. The first
enqueued task finds that the queue is empty and wakes up one worker thread.
Now suppose that all remaining 99 work items are enqueued; they do not
wake up any worker threads because the queue is already non-empty.
This causes a situation when there are 99 tasks in the task queue but
only one worker thread is processing a task while the remaining
worker threads are waiting.
The fix is to always wake up one worker thread while enqueuing a task.
I also added a check to count the number of elements in the queue
to help in debugging.
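The effect of the two signalling policies can be shown with a single-threaded model. This is a simplification, not the thread pool code: it assumes the whole burst is enqueued before any worker dequeues a task, and `WorkersWokenByBurst` is a hypothetical name.

```cpp
#include <algorithm>

// Counts how many idle workers get woken while `tasks` items are enqueued
// back to back, assuming no worker dequeues anything during the burst.
int WorkersWokenByBurst(int tasks, int idle_workers,
                        bool signal_only_when_queue_empty) {
  int queued = 0;
  int signals = 0;
  for (int i = 0; i < tasks; ++i) {
    // Old policy: signal only if the queue was empty before this enqueue.
    // New policy: signal on every enqueue.
    if (!signal_only_when_queue_empty || queued == 0) ++signals;
    ++queued;
  }
  // Each signal wakes at most one waiting worker.
  return std::min(signals, idle_workers);
}
```

With 100 tasks and 8 idle workers, the old policy wakes one worker and the new policy wakes all eight.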
Test Plan: make clean check.
Reviewers: chip
Reviewed By: chip
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7203
Summary:
Added the following two options:
[--bloom_bits=<int,e.g.:14>]
[--compression_type=<no|snappy|zlib|bzip2>]
These options will be used when ldb opens the leveldb database.
Test Plan: Tried by hand for both success and failure cases. We do need a test framework.
Reviewers: dhruba, emayanke, sheki
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7197
Summary: I had added 1 to indices where I shouldn't have, which overran the array.
Test Plan: make check
Reviewers: sheki, emayanke, vamsi, dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D7227
Summary: Added BitStreamPutInt() and BitStreamGetInt() which take a stream of chars and can write integers of arbitrary bit sizes to that stream at arbitrary positions. There are also convenience versions of these functions that take std::strings and leveldb::Slices.
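A minimal sketch of the idea, using a std::string as the byte stream. The names `PutBitsSketch` and `GetBitsSketch`, their signatures, and the bit ordering (bit 0 is the least significant bit of byte 0) are assumptions for illustration; the real BitStreamPutInt() and BitStreamGetInt() signatures are not shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Writes the low `bits` bits of `value` into `dst` starting at bit
// `bit_offset`, growing `dst` with zero bytes if it is too short.
void PutBitsSketch(std::string* dst, size_t bit_offset, unsigned bits,
                   uint64_t value) {
  assert(dst != nullptr && bits <= 64);
  const size_t end_bit = bit_offset + bits;
  if (dst->size() * 8 < end_bit) dst->resize((end_bit + 7) / 8, '\0');
  for (unsigned i = 0; i < bits; ++i) {
    const size_t pos = bit_offset + i;
    unsigned char byte = static_cast<unsigned char>((*dst)[pos / 8]);
    const unsigned char mask = static_cast<unsigned char>(1u << (pos % 8));
    byte = ((value >> i) & 1) ? static_cast<unsigned char>(byte | mask)
                              : static_cast<unsigned char>(byte & ~mask);
    (*dst)[pos / 8] = static_cast<char>(byte);
  }
}

// Reads `bits` bits starting at bit `bit_offset` and returns them as an
// unsigned integer.
uint64_t GetBitsSketch(const std::string& src, size_t bit_offset,
                       unsigned bits) {
  assert(bits <= 64 && bit_offset + bits <= src.size() * 8);
  uint64_t value = 0;
  for (unsigned i = 0; i < bits; ++i) {
    const size_t pos = bit_offset + i;
    const unsigned char byte = static_cast<unsigned char>(src[pos / 8]);
    if ((byte >> (pos % 8)) & 1) value |= (uint64_t{1} << i);
  }
  return value;
}
```

Writing one bit at a time is slow but keeps the sketch easy to check; fields that share a byte do not disturb each other.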
Test Plan: make check
Reviewers: sheki, vamsi, dhruba, emayanke
Reviewed By: vamsi
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7071
Summary:
Create a directory "archive" in the DB directory.
During DeleteObsoleteFiles move the WAL files (*.log) to the Archive directory,
instead of deleting.
Test Plan: Created a DB using DB_Bench. Reopened it. Checked if files move.
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D6975
Summary:
Scripted and removed all trailing spaces and converted all tabs to
spaces.
Also fixed other lint errors.
All lint errors from this point of time should be taken seriously.
Test Plan: make all check
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7059
Summary:
It would appear our unit tests make use of code from ldb_cmd,
and don't always require a valid database handle. D6855 was not aware
db_ could sometimes be NULL for such commands, and so it broke
reduce_levels_test.
This moves the check elsewhere to (at least) fix the 'ldb dump' case of
segfaulting when it couldn't open a database.
Test Plan: make check
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D6903
Summary:
Link statically against snappy, using the gvfs one for facebook
environments, and the bundled one otherwise.
In addition, fix a few minor segfaults in ldb when it couldn't open the
database, and update .gitignore to include a few other build artifacts.
Test Plan: make check
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D6855
Summary:
The compaction process takes some files from LevelK and
merges them into LevelK+1. The number of files it picks from
LevelK was capped in such a way that the total amount of
data picked does not exceed the maxfilesize of that level.
This essentially meant that only one file from LevelK
is picked for a single compaction.
For bulkloads, we would like to take many files from
LevelK and compact them using a single compaction run.
This patch introduces an option called the 'source_compaction_factor'
(similar to expanded_compaction_factor). It is a multiplier
that is multiplied by the maxfilesize of that level to arrive
at the limit that is used to throttle the number of source
files from LevelK. For bulk loads, set source_compaction_factor
to a very high number so that multiple files from the same
level are picked for compaction in a single compaction.
The default value of source_compaction_factor is 1, so that
we can keep backward compatibility with existing compaction semantics.
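The throttling rule can be sketched as below. This is a simplified model under stated assumptions: `CountSourceFiles` is a hypothetical helper, files are taken in order, and the key-range logic of the real compaction picker is ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many leading files of a level fit under the limit
// max_file_size * source_compaction_factor. At least one file is always
// picked so that a compaction can make progress.
// Note: the product can overflow for extremely large factors; a real
// implementation would need to guard against that.
size_t CountSourceFiles(const std::vector<uint64_t>& file_sizes,
                        uint64_t max_file_size,
                        uint64_t source_compaction_factor) {
  const uint64_t limit = max_file_size * source_compaction_factor;
  uint64_t total = 0;
  size_t picked = 0;
  for (size_t i = 0; i < file_sizes.size(); ++i) {
    if (picked > 0 && total + file_sizes[i] > limit) break;
    total += file_sizes[i];
    ++picked;
  }
  return picked;
}
```

With five 2MB files and a 2MB max file size, a factor of 1 picks one file, a factor of 3 picks three, and a very large factor picks all five.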
Test Plan: make clean check
Reviewers: emayanke, sheki
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D6867
Summary:
This option is needed for fast bulk uploads. The goal is to load
all the data into files in L0 without any interference from
background compactions.
Test Plan: make clean check
Reviewers: sheki
Reviewed By: sheki
CC: leveldb
Differential Revision: https://reviews.facebook.net/D6849
Summary:
StringStream.clear() does not clear the stream. It sets some flags.
Who knew? With that fixed, the same output is no longer printed again and again.
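For anyone hitting the same thing: clear() only resets the stream's error state, while str("") is what empties the buffer. A small helper along these lines does both (the name `ResetStream` is just for this example):

```cpp
#include <sstream>
#include <string>

// Empties the buffered text and resets the error/eof flags.
void ResetStream(std::ostringstream& ss) {
  ss.str(std::string());  // drop the accumulated characters
  ss.clear();             // reset state flags only; does not touch the buffer
}
```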
Test Plan: ran it on a local db
Reviewers: dhruba, emayanke
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D6795
Summary:
There are applications that operate on multiple leveldb instances.
These applications would like to pass in an opaque type for each
leveldb instance and this type should be passed back to the application
with every invocation of the CompactionFilter api.
Test Plan: Enhanced unit test for opaque parameter to CompactionFilter.
Reviewers: heyongqiang
Reviewed By: heyongqiang
CC: MarkCallaghan, sheki, emayanke
Differential Revision: https://reviews.facebook.net/D6711
Summary:
I changed the reduce_num_levels logic to avoid "compactRange()" call if the current number of levels in use (levels that contain files) is smaller than the new num of levels.
And that change breaks the assert in reduce_levels_test
Test Plan: run reduce_levels_test
Reviewers: dhruba, MarkCallaghan
Reviewed By: dhruba
CC: emayanke, sheki
Differential Revision: https://reviews.facebook.net/D6651
Summary:
make clean check OPT=-g fails
leveldb::DBStatistics::getTickerCount(leveldb::Tickers)’:
./db/db_statistics.h:34: error: ‘MAX_NO_TICKERS’ was not declared in this scope
util/ldb_cmd.cc:255: warning: left shift count >= width of type
Test Plan:
make clean check OPT=-g
Summary:
Disable size compaction in ldb reduce_levels; this avoids any compaction other than the manual compaction.
Added --compression=none|snappy|zlib|bzip2 and --file_size= (per-file size) to the ldb reduce_levels command.
Test Plan: run ldb
Reviewers: dhruba, MarkCallaghan
Reviewed By: dhruba
CC: sheki, emayanke
Differential Revision: https://reviews.facebook.net/D6597
Summary:
The default compilation process now uses "-Wall" to compile.
Fix all compilation errors generated by gcc.
Test Plan: make all check
Reviewers: heyongqiang, emayanke, sheki
Reviewed By: heyongqiang
CC: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D6525
Summary:
There are certain use-cases where the application intends to
delete older keys after a certain time period has expired.
One option for those applications is to periodically scan the
entire database and delete appropriate keys.
A better way is to allow the application to hook into the
compaction process. This patch allows the application to set
a method callback for every key that is being compacted. If
this method returns true, then the key is not preserved in
the output of the compaction.
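A toy model of the hook, with a std::map standing in for the compaction input and output. The callback shape `DropFilter` and the function `CompactWithFilter` are hypothetical and are not the proposed public API.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical callback shape: return true to drop the key from the
// compaction output.
typedef std::function<bool(const std::string& key, const std::string& value)>
    DropFilter;

// Copies every entry of `input` to the result unless the filter asks for
// it to be dropped. An empty filter keeps everything.
std::map<std::string, std::string> CompactWithFilter(
    const std::map<std::string, std::string>& input,
    const DropFilter& should_drop) {
  std::map<std::string, std::string> output;
  for (std::map<std::string, std::string>::const_iterator it = input.begin();
       it != input.end(); ++it) {
    if (should_drop && should_drop(it->first, it->second)) continue;
    output.insert(*it);
  }
  return output;
}
```

An application that stores an expiry time in each value could pass a filter that returns true once that time has passed, which removes the need for a separate scan-and-delete pass.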
Test Plan:
This is mostly to preview the proposed new public api.
Since it is a public api, please do due diligence on reviewing it.
I will be writing test cases for this api in my next version of
this patch.
Reviewers: MarkCallaghan, heyongqiang
Reviewed By: heyongqiang
CC: sheki, adsharma
Differential Revision: https://reviews.facebook.net/D6285
Summary: as subject.
Test Plan: manually test it, will add a testcase
Reviewers: dhruba, MarkCallaghan
Differential Revision: https://reviews.facebook.net/D6345
Summary: Leveldb currently uses windowBits=-14 while using zlib compression. (It was earlier 15.) This makes the setting configurable. Related changes here: https://reviews.facebook.net/D6105
Test Plan: make all check
Reviewers: dhruba, MarkCallaghan, sheki, heyongqiang
Differential Revision: https://reviews.facebook.net/D6393
Summary:
as subject
Test Plan:
run db_bench and db_test
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D6111
Summary:
The leveldb API is enhanced to support different compression algorithms at
different levels.
This adds the option min_level_to_compress to db_bench that specifies
the minimum level for which compression should be done when
compression is enabled. This can be used to disable compression for levels
0 and 1 which are likely to suffer from stalls because of the CPU load
for memtable flushes and (L0,L1) compaction. Level 0 is special as it
gets frequent memtable flushes. Level 1 is special as it frequently
gets all:all file compactions between it and level 0. But all other levels
could be the same. For any level N where N > 1, the rate of sequential
IO for that level should be the same. The last level is the
exception because it might not be full and because files from it are
not read to compact with the next larger level.
The same amount of time will be spent doing compaction at any
level N excluding N=0, 1 or the last level. By this standard all
of those levels should use the same compression. The difference is that
the loss (using more disk space) from a faster compression algorithm
is less significant for N=2 than for N=3. So we might be willing to
trade disk space for faster write rates with no compression
for L0 and L1, snappy for L2, zlib for L3. Using a faster compression
algorithm for the mid levels also allows us to reclaim some cpu
without trading off much loss in disk space overhead.
Also note that little is to be gained by compressing levels 0 and 1. For
a 4-level tree they account for 10% of the data. For a 5-level tree they
account for 1% of the data.
With compression enabled:
* memtable flush rate is ~18MB/second
* (L0,L1) compaction rate is ~30MB/second
With compression enabled but min_level_to_compress=2
* memtable flush rate is ~320MB/second
* (L0,L1) compaction rate is ~560MB/second
This practically takes the same code from https://reviews.facebook.net/D6225
but makes the leveldb api more general purpose with a few additional
lines of code.
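As an illustration of how min_level_to_compress could map onto a per-level setting, here is a sketch. The enum `CompressionKind` and the function `CompressionPerLevel` are local stand-ins and not the leveldb API.

```cpp
#include <cstddef>
#include <vector>

// Local stand-in for the library's compression enum.
enum CompressionKind { kNone, kSnappy, kZlib };

// Levels below min_level_to_compress stay uncompressed; the rest use `kind`.
std::vector<CompressionKind> CompressionPerLevel(int num_levels,
                                                 int min_level_to_compress,
                                                 CompressionKind kind) {
  const size_t n = num_levels > 0 ? static_cast<size_t>(num_levels) : 0;
  std::vector<CompressionKind> per_level(n, kNone);
  for (int level = 0; level < num_levels; ++level) {
    if (level >= min_level_to_compress) {
      per_level[static_cast<size_t>(level)] = kind;
    }
  }
  return per_level;
}
```

A per-level vector like this also allows mixed choices such as none for L0 and L1, snappy for L2, and zlib for L3, by overwriting individual entries.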
Test Plan: make check
Differential Revision: https://reviews.facebook.net/D6261
Summary:
Adds a method that returns the score for the next level that most
needs compaction. That method is then used by db_bench to rate limit threads.
Threads are put to sleep at the end of each stats interval until the score
is less than the limit. The limit is set via the --rate_limit=$double option.
The specified value must be > 1.0. Also adds the option --stats_per_interval
to enable additional metrics reported every stats interval.
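The throttling loop can be sketched as follows. `SleepUntilBelowLimit`, the fixed nap length, and the `max_naps` safety cap are assumptions for illustration; db_bench's own loop is not shown in this message.

```cpp
#include <chrono>
#include <thread>

// Sleeps in short naps until get_score() drops below rate_limit, or until
// max_naps naps have been taken. Returns the number of naps.
template <typename ScoreFn>
int SleepUntilBelowLimit(ScoreFn get_score, double rate_limit,
                         std::chrono::milliseconds nap, int max_naps) {
  int naps = 0;
  while (naps < max_naps && get_score() >= rate_limit) {
    std::this_thread::sleep_for(nap);
    ++naps;
  }
  return naps;
}
```

A benchmark thread would call this at the end of each stats interval, passing a callable that returns the current compaction score.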
Test Plan:
run db_bench
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D6243
Summary: Enable LevelDb to create a new log file if current log file is too large.
Test Plan:
Write a script and manually check the generated info LOG.
Task ID: 1803577
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: zshao
Differential Revision: https://reviews.facebook.net/D6003
Summary:
The parameter delete_obsolete_files_period_micros controls the
periodicity of deleting obsolete files. db_bench was reading in
this parameter into a local variable called 'l' but was incorrectly
using another local variable called 'n' while setting it in the
db.options data structure.
This patch also logs the value of delete_obsolete_files_period_micros
in the LOG file at db startup time.
I am hoping that this will improve the overall write throughput drastically.
Test Plan: run db_bench
Reviewers: MarkCallaghan, heyongqiang
Reviewed By: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D6099
published in https://reviews.facebook.net/D5997.
Summary:
This patch allows compaction to occur in multiple background threads
concurrently.
If a manual compaction is issued, the system falls back to a
single-compaction-thread model. This is done to ensure correctness
and simplicity of code. When the manual compaction is finished,
the system resumes its concurrent-compaction mode automatically.
The updates to the manifest are done via group-commit approach.
Test Plan: run db_bench
Summary:
The method DeleteObsoleteFiles is a very costly method, especially
when the number of files in a system is large. It makes a list of
all live-files and then scans the directory to compute the diff.
By default, this method is executed after every compaction run.
This patch makes it such that DeleteObsoleteFiles is never
invoked twice within a configured period.
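The throttle can be pictured as a small gate that remembers when the scan last ran. `ObsoleteFileGate` is a hypothetical class for illustration, and it assumes the caller supplies a monotonically increasing clock in microseconds.

```cpp
#include <cstdint>

// Allows an expensive operation to run at most once per period.
class ObsoleteFileGate {
 public:
  explicit ObsoleteFileGate(uint64_t period_micros)
      : period_micros_(period_micros), last_run_micros_(0), has_run_(false) {}

  // Returns true if the caller should run the scan now, and records the
  // time. Returns false if the previous run was less than one period ago.
  bool ShouldRun(uint64_t now_micros) {
    if (has_run_ && now_micros - last_run_micros_ < period_micros_) {
      return false;
    }
    has_run_ = true;
    last_run_micros_ = now_micros;
    return true;
  }

 private:
  const uint64_t period_micros_;
  uint64_t last_run_micros_;
  bool has_run_;
};
```

The compaction path would then call the directory scan only when ShouldRun() returns true.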
Test Plan: run all unit tests
Reviewers: heyongqiang, MarkCallaghan
Reviewed By: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D6045
Summary:
Each assoc is identified by (id1, assocType). This is the rowkey.
Each row has a read/write rowlock. There is a statically allocated array
of 2000 read/write locks. A rowkey is murmur-hashed to one of the
read/write locks.
assocPut and assocDelete acquire the rowlock in Write mode.
The key-updates are done within the rowlock with an atomic nosync
batch write to leveldb. Then the rowlock is released and
a write-with-sync is done to sync leveldb transaction log.
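A simplified sketch of the lock striping and the write ordering described above. It swaps in std::hash for the murmur hash and plain std::mutex for the read/write locks, and all names (`RowLockIndex`, `RowLockTable`, `AssocPutSketch`) are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>

const size_t kNumRowLocks = 2000;

// Maps a rowkey (id1, assocType) to one of the statically sized locks.
// std::hash is used here in place of murmur hash.
size_t RowLockIndex(uint64_t id1, uint64_t assoc_type) {
  const std::string rowkey =
      std::to_string(id1) + ":" + std::to_string(assoc_type);
  return std::hash<std::string>()(rowkey) % kNumRowLocks;
}

// Fixed array of locks; different rowkeys may share a lock.
class RowLockTable {
 public:
  std::mutex& LockFor(uint64_t id1, uint64_t assoc_type) {
    return locks_[RowLockIndex(id1, assoc_type)];
  }

 private:
  std::mutex locks_[kNumRowLocks];
};

// Ordering from the summary: apply the batch without sync while holding
// the rowlock, release the lock, then sync the transaction log.
template <typename BatchWrite, typename SyncLog>
void AssocPutSketch(RowLockTable& table, uint64_t id1, uint64_t assoc_type,
                    BatchWrite write_nosync, SyncLog sync_log) {
  {
    std::lock_guard<std::mutex> hold(table.LockFor(id1, assoc_type));
    write_nosync();
  }
  sync_log();
}
```

Doing the sync outside the rowlock keeps the lock hold time short, at the cost of a window in which the update is visible but not yet durable.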
Test Plan: added unit test
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5859