Summary: This was reported by our customers in task #4295529. Cause: * MANIFEST file contains a VersionEdit, which contains file entries whose 'smallest' and 'largest' internal keys are empty. String with zero characters. Root cause of corruption was not investigated. We should report corruption when this happens. However, we currently SIGSEGV. Here's what happens: * VersionEdit encodes zero-strings happily and stores them in smallest and largest InternalKeys. InternalKey::Encode() does assert when `rep_.empty()`, but we don't assert in production environemnts. Also, we should never assert as a result of DB corruption. * As part of our ConsistencyCheck, we call GetLiveFilesMetaData() * GetLiveFilesMetadata() calls `file->largest.user_key().ToString()` * user_key() function does: 1. assert(size > 8) (ooops, no assert), 2. returns `Slice(internal_key.data(), internal_key.size() - 8)` * since `internal_key.size()` is unsigned int, this call translates to `Slice(whatever, 1298471928561892576182756)`. Bazinga. Fix: * VersionEdit checks if InternalKey is valid in `VersionEdit::GetInternalKey()`. If it's invalid, returns corruption. Lessons learned: * Always keep in mind that even if you `assert()`, production code will continue execution even if assert fails. * Never `assert` based on DB corruption. Assert only if the code should guarantee that assert can't fail. Test Plan: dumped offending manifest. Before: assert. Now: corruption Reviewers: dhruba, haobo, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D18507main
parent
313b2e5da1
commit
768d424dd9
Loading…
Reference in new issue