Summary:
We are going to expose properties of all tables to end users through "some" db interface.
However, current design doesn't naturally fit for this need, which is because:
1. If a table presents in table cache, we cannot simply return the reference to its table properties, because the table may be destroy after compaction (and we don't want to hold the ref of the version).
2. Copy table properties is OK, but it's slow.
Thus in this diff, I change the table reader's interface to return a shared pointer (for const table properties), instead a const refernce.
Test Plan: `make check` passed
Reviewers: haobo, sdong, dhruba
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15999
Summary:
This diff enables the command line tool `sst_dump` to work for sst files
under plain table format. Changes include:
* In tools/sst_dump.cc:
- add support for plain table format
- display prefix_extractor information when --show_properties is on
* In table/format.cc
- Now the table magic number of a Footer can be later initialized
via ReadFooterFromFile().
* In table/meta_bocks:
- add function ReadTableMagicNumber() that reads the magic number of
the specified file.
Minor fixes:
- remove a duplicate #include in table/table_test.cc
- fix a commentary typo in include/rocksdb/memtablerep.h
- fix lint errors.
Test Plan:
Runs sst_dump with both block-based and plain-table format files with
different arguments, specifically those with --show-properties and --from.
* sample output:
https://reviews.facebook.net/P261
Reviewers: kailiu, sdong, xjin
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15903
Summary:
Rocksdb can now support a uncompressed block cache, or a compressed
block cache or both. Lookups first look for a block in the
uncompressed cache, if it is not found only then it is looked up
in the compressed cache. If it is found in the compressed cache,
then it is uncompressed and inserted into the uncompressed cache.
It is possible that the same block resides in the compressed cache
as well as the uncompressed cache at the same time. Both caches
have their own individual LRU policy.
Test Plan: Unit test case attached.
Reviewers: kailiu, sdong, haobo, leveldb
Reviewed By: haobo
CC: xjin, haobo
Differential Revision: https://reviews.facebook.net/D12675
Summary:
Change namespace from leveldb to rocksdb. This allows a single
application to link in open-source leveldb code as well as
rocksdb code into the same process.
Test Plan: compile rocksdb
Reviewers: emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13287
Summary: The pupose of this diff is to expose per user-call level precise timing of block read, so that we can answer questions like: a Get() costs me 100ms, is that somehow related to loading blocks from file system, or sth else? We will answer that with EXACTLY how many blocks have been read, how much time was spent on transfering the bytes from os, how much time was spent on checksum verification and how much time was spent on block decompression, just for that one Get. A nano second stopwatch was introduced to track time with higher precision. The cost/precision of the stopwatch is also measured in unit-test. On my dev box, retrieving one time instance costs about 30ns, on average. The deviation of timing results is good enough to track 100ns-1us level events. And the overhead could be safely ignored for 100us level events (10000 instances/s), for example, a viewstate thrift call.
Test Plan: perf_context_test, also testing with viewstate shadow traffic.
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb, xjin
Differential Revision: https://reviews.facebook.net/D12351
Summary: Replace include/leveldb with include/rocksdb.
Test Plan:
make clean; make check
make clean; make release
Differential Revision: https://reviews.facebook.net/D12489
Summary: In table.cc, the code section that reads in BlockContent and then put it into a Block, appears at least 4 times. This is too much duplication. BlockReader is much shorter after the change and reads way better. D10077 attempted that for index block read. This is a complete cleanup.
Test Plan: make check; ./db_stress
Reviewers: dhruba, sheki
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D10527
Summary:
`Table::Open()` assumes that `size` correctly describes the size of `file`, added a check that the footer is actually the right size and for good measure added assertions to `Footer::DecodeFrom()`.
This was discovered by running `valgrind ./db_test` and seeing that `Footer::DecodeFrom()` was accessing uninitialized memory.
Test Plan:
make clean check
ran `valgrind ./db_test` and saw DBTest.NoSpace no longer complains about a conditional jump being dependent on uninitialized memory.
Reviewers: dhruba, vamsi, emayanke, sheki
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7815
Summary: It took me almost a day to debug this. :( Although I got to learn the file format as a by-product, this time could be saved if we have better error messages.
Test Plan: gmake clean all; sst_dump --hex --file=000005.sst
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D7551
In particular, we add a new FilterPolicy class. An instance
of this class can be supplied in Options when opening a
database. If supplied, the instance is used to generate
summaries of keys (e.g., a bloom filter) which are placed in
sstables. These summaries are consulted by DB::Get() so we
can avoid reading sstable blocks that are guaranteed to not
contain the key we are looking for.
This change provides one implementation of FilterPolicy
based on bloom filters.
Other changes:
- Updated version number to 1.4.
- Some build tweaks.
- C binding for CompactRange.
- A few more benchmarks: deleteseq, deleterandom, readmissing, seekrandom.
- Minor .gitignore update.
- Replace raw slice comparison with a call to user comparator.
Added test for custom comparators.
- Fix end of namespace comments.
- Fixed bug in picking inputs for a level-0 compaction.
When finding overlapping files, the covered range may expand
as files are added to the input set. We now correctly expand
the range when this happens instead of continuing to use the
old range. For example, suppose L0 contains files with the
following ranges:
F1: a .. d
F2: c .. g
F3: f .. j
and the initial compaction target is F3. We used to search
for range f..j which yielded {F2,F3}. However we now expand
the range as soon as another file is added. In this case,
when F2 is added, we expand the range to c..j and restart the
search. That picks up file F1 as well.
This change fixes a bug related to deleted keys showing up
incorrectly after a compaction as described in Issue 44.
(Sync with upstream @25072954)
- Removed one copy of an uncompressed block contents changing
the signature of Snappy_Uncompress() so it uncompresses into a
flat array instead of a std::string.
Speeds up readrandom ~10%.
- Instead of a combination of Env/WritableFile, we now have a
Logger interface that can be easily overridden applications
that want to supply their own logging.
- Separated out the gcc and Sun Studio parts of atomic_pointer.h
so we can use 'asm', 'volatile' keywords for Sun Studio.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@39 62dab493-f737-651d-591e-8d6aee1b9529
* env_chromium.cc should not export symbols.
* Fix MSVC warnings.
* Removed large value support.
* Fix broken reference to documentation file
git-svn-id: https://leveldb.googlecode.com/svn/trunk@24 62dab493-f737-651d-591e-8d6aee1b9529