rocksdb

Fork of https://github.com/oxigraph/rocksdb and https://github.com/facebook/rocksdb for NextGraph and Oxigraph

History

Andrew Kryczka 62f70f6d14 Reduce scope of compression dictionary to single SST (#4952)

Summary: Our previous approach was to train one compression dictionary per compaction, using the first output SST to train a dictionary, and then applying it on subsequent SSTs in the same compaction. While this was great for minimizing CPU/memory/I/O overhead, it did not achieve good compression ratios in practice. In our most promising potential use case, moderate reductions in a dictionary's scope make a major difference on compression ratio.

So, this PR changes compression dictionary to be scoped per-SST. It accepts the tradeoff during table building to use more memory and CPU. Important changes include:

- The `BlockBasedTableBuilder` has a new state when dictionary compression is in use: `kBuffered`. In that state it accumulates uncompressed data in memory whenever `Add` is called.
- After accumulating target file size bytes or calling `BlockBasedTableBuilder::Finish`, a `BlockBasedTableBuilder` moves to the `kUnbuffered` state. The transition (`EnterUnbuffered()`) involves sampling the buffered data, training a dictionary, and compressing/writing out all buffered data. In the `kUnbuffered` state, a `BlockBasedTableBuilder` behaves the same as before -- blocks are compressed/written out as soon as they fill up.
- Samples are now whole uncompressed data blocks, except the final sample may be a partial data block so we don't breach the user's configured `max_dict_bytes` or `zstd_max_train_bytes`. The dictionary trainer is supposed to work better when we pass it real units of compression. Previously we were passing 64-byte KV samples, which was not realistic.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/4952
Differential Revision: D13967980
Pulled By: ajkr
fbshipit-source-id: 82bea6f7537e1529c7a1a4cdee84585f5949300f

7 years ago
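
The buffering behavior described in the commit message above can be illustrated with a toy state machine. This is a minimal sketch, not RocksDB's implementation: `ToyTableBuilder`, `AddBlock`, `sample_budget`, and `blocks_written` are hypothetical names, the "dictionary" is just the concatenated samples rather than the output of a real trainer, and samples are taken in insertion order as a simplification. Only the state names `kBuffered` and `kUnbuffered` and the transition name `EnterUnbuffered` come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a per-SST builder with two states (hypothetical, not RocksDB code).
class ToyTableBuilder {
 public:
  enum class State { kBuffered, kUnbuffered };

  // target_file_size: buffered bytes that trigger the transition.
  // sample_budget: stand-in for a limit such as max_dict_bytes (assumption).
  ToyTableBuilder(std::size_t target_file_size, std::size_t sample_budget)
      : target_file_size_(target_file_size), sample_budget_(sample_budget) {}

  // Accepts one complete uncompressed data block.
  void AddBlock(const std::string& block) {
    if (state_ == State::kBuffered) {
      // Hold the block in memory until enough data has accumulated.
      buffered_.push_back(block);
      buffered_bytes_ += block.size();
      if (buffered_bytes_ >= target_file_size_) EnterUnbuffered();
    } else {
      // After the transition, blocks are written out immediately.
      WriteBlock(block);
    }
  }

  // Finishing while still buffered forces the transition.
  void Finish() {
    if (state_ == State::kBuffered) EnterUnbuffered();
  }

  State state() const { return state_; }
  const std::string& dictionary() const { return dictionary_; }
  std::size_t blocks_written() const { return blocks_written_; }

 private:
  void EnterUnbuffered() {
    assert(state_ == State::kBuffered);
    // Samples are whole blocks; only the last one may be truncated so the
    // total never exceeds the budget.
    std::string samples;
    for (const std::string& block : buffered_) {
      if (samples.size() >= sample_budget_) break;
      const std::size_t room = sample_budget_ - samples.size();
      samples.append(block, 0, std::min(room, block.size()));
    }
    dictionary_ = samples;  // placeholder for a real dictionary trainer
    // Flush everything that was buffered, then switch state.
    for (const std::string& block : buffered_) WriteBlock(block);
    buffered_.clear();
    buffered_bytes_ = 0;
    state_ = State::kUnbuffered;
  }

  // Placeholder for "compress with the dictionary and write to the file".
  void WriteBlock(const std::string& block) {
    (void)block;
    ++blocks_written_;
  }

  std::size_t target_file_size_;
  std::size_t sample_budget_;
  State state_ = State::kBuffered;
  std::vector<std::string> buffered_;
  std::size_t buffered_bytes_ = 0;
  std::string dictionary_;
  std::size_t blocks_written_ = 0;
};
```

In this sketch, a builder constructed with a target size of 10 bytes and a sample budget of 6 bytes stays in `kBuffered` for the first two 4-byte blocks, transitions on the third, and truncates the second block's sample so the budget is not exceeded. The cost the commit message mentions is visible here as the `buffered_` vector, which holds uncompressed data in memory until the transition.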
advisor	Rules Advisor: some fixes to support fetching stats from ODS (#4223)	7 years ago
dump	fix gflags namespace	8 years ago
rdb	Fix /bin/bash shebangs	8 years ago
CMakeLists.txt	cmake support for linux and osx (#1358)	9 years ago
Dockerfile	adding docker build script and dockerfile	11 years ago
auto_sanity_test.sh	Suppress lint in old files	8 years ago
benchmark.sh	Updated benchmark script (#4134)	7 years ago
benchmark_leveldb.sh	Suppress lint in old files	8 years ago
blob_dump.cc	comment unused parameters to turn on -Wunused-parameter flag	8 years ago
check_format_compatible.sh	Include newer RocksDB versions in compat test (#4634)	7 years ago
db_bench.cc	Change RocksDB License	8 years ago
db_bench_tool.cc	Remove cuckoo hash memtable (#4953)	7 years ago
db_bench_tool_test.cc	Update all unique/shared_ptr instances to be qualified with namespace std (#4638)	7 years ago
db_crashtest.py	Fix `compression_zstd_max_train_bytes` coverage in stress test (#4957)	7 years ago
db_repl_stress.cc	Update all unique/shared_ptr instances to be qualified with namespace std (#4638)	7 years ago
db_sanity_test.cc	Change RocksDB License	8 years ago
db_stress.cc	Free memory after use	7 years ago
dbench_monitor	Fix /bin/bash shebangs	8 years ago
generate_random_db.sh	Fix /bin/bash shebangs	8 years ago
ingest_external_sst.sh	Add compatibility test of SST ingestion (#4310)	7 years ago
ldb.cc	comment unused parameters to turn on -Wunused-parameter flag	8 years ago
ldb_cmd.cc	With ldb --try_load_options and wal_dir doesn't exist, ignore it (#4875)	7 years ago
ldb_cmd_impl.h	Add SST ingestion to ldb (#4205)	7 years ago
ldb_cmd_test.cc	tools: use provided options instead of the default (#4839)	7 years ago
ldb_test.py	Add SST ingestion to ldb (#4205)	7 years ago
ldb_tool.cc	Add SST ingestion to ldb (#4205)	7 years ago
pflag	Fix /bin/bash shebangs	8 years ago
reduce_levels_test.cc	Per-thread unique test db names (#4135)	7 years ago
regression_test.sh	Suppress lint in old files	8 years ago
report_lite_binary_size.sh	Legocastle job to report lite build binary size to scuba	8 years ago
rocksdb_dump_test.sh	Suppress lint in old files	8 years ago
run_flash_bench.sh	Fix /bin/bash shebangs	8 years ago
run_leveldb.sh	Fix /bin/bash shebangs	8 years ago
sample-dump.dmp	First version of rocksdb_dump and rocksdb_undump.	11 years ago
sst_dump.cc	comment unused parameters to turn on -Wunused-parameter flag	8 years ago
sst_dump_test.cc	Reduce scope of compression dictionary to single SST (#4952)	7 years ago
sst_dump_tool.cc	Reduce scope of compression dictionary to single SST (#4952)	7 years ago
sst_dump_tool_imp.h	tools: use provided options instead of the default (#4839)	7 years ago
trace_analyzer.cc	RocksDB Trace Analyzer (#4091)	7 years ago
trace_analyzer_test.cc	Add the unit test of Iterator to trace_analyzer_test (#4282)	7 years ago
trace_analyzer_tool.cc	Add unique key number changing statistics to Trace_analyzer (#4646)	7 years ago
trace_analyzer_tool.h	Add unique key number changing statistics to Trace_analyzer (#4646)	7 years ago
verify_random_db.sh	tools/check_format_compatible.sh to cover forward option reading too (#3994)	7 years ago
write_external_sst.sh	correct mistyped msg. (#4341)	7 years ago
write_stress.cc	Compilation fixes for powerpc build, -Wparentheses-equality error and missing header guards	8 years ago
write_stress_runner.py	Suppress lint in old files	8 years ago