Summary:
There are two versions of FindObsoleteFiles():
* full scan, which is executed every 6 hours (and it's terribly slow)
* no full scan, which is executed every time a background process finishes and iterator is deleted
This diff is optimizing the second case (no full scan). Here's what we do before the diff:
* Get the list of obsolete files (files with ref==0). Some files in obsolete_files set might actually be live.
* Get the list of live files to avoid deleting files that are live.
* Delete files that are in obsolete_files and not in live_files.
After this diff:
* The only files with ref==0 that are still live are files that have been part of move compaction. Don't include moved files in obsolete_files.
* Get the list of obsolete files (which exclude moved files).
* No need to get the list of live files, since all files in obsolete_files need to be deleted.
I'll post the benchmark results, but you can get the feel of it here: https://reviews.facebook.net/D30123
This depends on D30123.
P.S. We should do full scan only in failure scenarios, not every 6 hours. I'll do this in a follow-up diff.
Test Plan:
One new unit test. Made sure that unit test fails if we don't have a `if (!f->moved)` safeguard in ~Version.
make check
Big number of compactions and flushes:
./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0 --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000
Reviewers: yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D30249