From 5f4391dda2a9bc3ac48eb4399815a7419432d337 Mon Sep 17 00:00:00 2001
From: Peter Dillinger
Date: Mon, 3 Oct 2022 22:23:38 -0700
Subject: [PATCH] Some clean-up of secondary cache (#10730)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Summary:
This is intended as a step toward possibly separating secondary cache
integration from the Cache implementation as much as possible, to
(hopefully) minimize code duplication in adding secondary cache support
to HyperClockCache.

* Major clarifications to API docs of secondary cache compatible parts of
  Cache. For example, previously the docs seemed to suggest that Wait() was
  not needed if IsReady()==true. And it wasn't clear what operations were
  actually supported on pending handles.
* Add some assertions related to these requirements, such as that we don't
  Release() before Wait() (which would leak a secondary cache handle).
* Fix a leaky abstraction with dummy handles, which are supposed to be
  internal to the Cache. Previously, these just used value=nullptr to
  indicate a dummy handle, which meant that they could be confused with
  legitimate value=nullptr cases like cache reservations. Also fixed
  blob_source_test, which was relying on this leaky abstraction.
* Drop "incomplete" terminology, which was another name for "pending".
* Split handle flags into "mutable" ones requiring mutex and "immutable"
  ones which do not. Because of single-threaded access to pending handles,
  the "Is Pending" flag can be in the "immutable" set. This allows removal
  of a TSAN work-around and removing a mutex acquire-release in IsReady().
* Remove some unnecessary handling of charges on handles of failed lookups.
  Keeping total_charge=0 means no special handling is needed. (Removed one
  unnecessary mutex acquire/release.)
* Simplify handling of the dummy handle in Lookup(). There is no need to
  explicitly Ref & Release w/Erase if we generally overwrite the dummy
  anyway. (Removed one mutex acquire/release, a call to Release().)

Intended follow-up:
* Clarify APIs in secondary_cache.h
  * Doesn't SecondaryCacheResultHandle transfer ownership of the Value() on
    success (implementations should not release the value in destructor)?
  * Does Wait() need to be called if IsReady() == true? (This would be
    different from Cache.)
  * Do Value() and Size() have undefined behavior if IsReady() == false?
  * Why have a custom API for what is essentially a std::future<...>?
* Improve unit testing of the standalone handle case
* Apparent null `e` bug in `free_standalone_handle` case
* Clean up secondary cache testing in lru_cache_test
  * Why does TestSecondaryCacheResultHandle hold on to a Cache::Handle?
  * Why does TestSecondaryCacheResultHandle::Wait() do nothing? Shouldn't
    it establish the post-condition IsReady() == true?
  * (Assuming that is sorted out...) Shouldn't TestSecondaryCache::WaitAll
    simply wait on each handle in order (no casting required)? How about
    making that the default implementation?
  * Why does TestSecondaryCacheResultHandle::Size() check Value() first? If
    the API is intended to return 0 before IsReady(), then that is weird
    but should at least be documented. Otherwise, if it's intended to be
    undefined behavior, we should assert IsReady().
* Consider replacing "standalone" and "dummy" entries with a single kind of
  "weak" entry that deletes its value when it reaches zero refs. Suppose
  you are using compressed secondary cache and have two iterators at
  similar places. It will probably be common for one iterator to have
  standalone results pinned (out of cache) when the second iterator needs
  those same blocks and has to re-load them from secondary cache and
  duplicate the memory. Combining the dummy and the standalone should fix
  this.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10730

Test Plan: existing tests (minor update), and crash test with sanitizers
and secondary cache

Performance test for any regressions in LRUCache (primary only):
Create DB with
```
TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=30000000 -disable_wal=1 -bloom_bits=16
```
Test before & after (run at same time) with
```
TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=readrandom[-X100] -readonly -num=30000000 -bloom_bits=16 -cache_index_and_filter_blocks=1 -cache_size=233000000 -duration 30 -threads=16
```
Before: readrandom [AVG 100 runs] : 22234 (± 63) ops/sec; 1.6 (± 0.0) MB/sec
After: readrandom [AVG 100 runs] : 22197 (± 64) ops/sec; 1.6 (± 0.0) MB/sec
That's within 0.2%, which is not significant by the confidence intervals.

Reviewed By: anand1976

Differential Revision: D39826010

Pulled By: anand1976

fbshipit-source-id: 3202b4a91f673231c97648ae070e502ae16b0f44
---
 cache/lru_cache.cc          | 118 ++++++++++++++-------
 cache/lru_cache.h           | 154 ++++++++++++++++--------------------
 db/blob/blob_source_test.cc |  16 +++-
 include/rocksdb/cache.h     |  62 ++++++++++-----
 4 files changed, 185 insertions(+), 165 deletions(-)

diff --git a/cache/lru_cache.cc b/cache/lru_cache.cc
index 922716d8c..afecedde9 100644
--- a/cache/lru_cache.cc
+++ b/cache/lru_cache.cc
@@ -22,6 +22,9 @@ namespace ROCKSDB_NAMESPACE {
 
 namespace lru_cache {
 
+// A distinct pointer value for marking "dummy" cache entries
+void* const kDummyValueMarker = const_cast<char*>("kDummyValueMarker");
+
 LRUHandleTable::LRUHandleTable(int max_upper_hash_bits)
     : length_bits_(/* historical starting size*/ 4),
       list_(new LRUHandle* [size_t{1} << length_bits_] {}),
@@ -425,16 +428,20 @@ void LRUCacheShard::Promote(LRUHandle* e) {
   SecondaryCacheResultHandle* secondary_handle = e->sec_handle;
   assert(secondary_handle->IsReady());
 
-  e->SetIncomplete(false);
-  e->SetInCache(false);
+  // e is not thread-shared here; OK to modify "immutable" fields as well as
+  // "mutable" (normally requiring mutex)
+  e->SetIsPending(false);
   e->value = secondary_handle->Value();
-  e->CalcTotalCharge(secondary_handle->Size(), metadata_charge_policy_);
+  assert(e->total_charge == 0);
+  size_t value_size = secondary_handle->Size();
   delete secondary_handle;
 
   if (e->value) {
+    e->CalcTotalCharge(value_size, metadata_charge_policy_);
     Status s;
-    if (secondary_cache_ && secondary_cache_->SupportForceErase() &&
-        e->IsStandalone()) {
+    if (e->IsStandalone()) {
+      assert(secondary_cache_ && secondary_cache_->SupportForceErase());
+      // Insert a dummy handle and return a standalone handle to caller.
       // Charge the standalone handle.
       autovector<LRUHandle*> last_reference_list;
@@ -464,14 +471,15 @@ void LRUCacheShard::Promote(LRUHandle* e) {
 
       // Insert a dummy handle into the primary cache. This dummy handle is
       // not IsSecondaryCacheCompatible().
+      // FIXME? This should not overwrite an existing non-dummy entry in the
+      // rare case that one exists
       Cache::Priority priority =
           e->IsHighPri() ? Cache::Priority::HIGH : Cache::Priority::LOW;
-      s = Insert(e->key(), e->hash, /*value=*/nullptr, 0,
+      s = Insert(e->key(), e->hash, kDummyValueMarker, /*charge=*/0,
                  /*deleter=*/nullptr, /*helper=*/nullptr, /*handle=*/nullptr,
                  priority);
     } else {
       e->SetInCache(true);
-      e->SetIsStandalone(false);
       Cache::Handle* handle = reinterpret_cast<Cache::Handle*>(e);
       // This InsertItem() could fail if the cache is over capacity and
       // strict_capacity_limit_ is true. In such a case, we don't want
@@ -489,15 +497,11 @@ void LRUCacheShard::Promote(LRUHandle* e) {
       // When the handle is released, the item should get deleted.
       assert(!e->InCache());
     }
-
   } else {
-    // Since the secondary cache lookup failed, mark the item as not in cache
-    // Don't charge the cache as its only metadata that'll shortly be released
-    DMutexLock l(mutex_);
-    // TODO
-    e->CalcTotalCharge(0, metadata_charge_policy_);
-    e->SetInCache(false);
-    e->SetIsStandalone(false);
+    // Secondary cache lookup failed.
+    // The caller will take care of detecting this and eventually releasing e.
+    assert(!e->value);
+    assert(!e->InCache());
   }
 }
 
@@ -513,21 +517,22 @@ Cache::Handle* LRUCacheShard::Lookup(
     e = table_.Lookup(key, hash);
     if (e != nullptr) {
       assert(e->InCache());
-      if (!e->HasRefs()) {
-        // The entry is in LRU since it's in hash and has no external
-        // references.
-        LRU_Remove(e);
-      }
-      e->Ref();
-      e->SetHit();
-
-      // For a dummy handle, if it was retrieved from secondary cache,
-      // it may still exist in secondary cache.
-      // If the handle exists in secondary cache, the value should be
-      // erased from sec cache and be inserted into primary cache.
-      if (!e->value && secondary_cache_ &&
-          secondary_cache_->SupportForceErase()) {
+      if (e->value == kDummyValueMarker) {
+        // For a dummy handle, if it was retrieved from secondary cache,
+        // it may still exist in secondary cache.
+        // If the handle exists in secondary cache, the value should be
+        // erased from sec cache and be inserted into primary cache.
         found_dummy_entry = true;
+        // Let the dummy entry be overwritten
+        e = nullptr;
+      } else {
+        if (!e->HasRefs()) {
+          // The entry is in LRU since it's in hash and has no external
+          // references.
+          LRU_Remove(e);
+        }
+        e->Ref();
+        e->SetHit();
       }
     }
   }
@@ -541,21 +546,13 @@ Cache::Handle* LRUCacheShard::Lookup(
   // standalone handle is returned to the caller. Only if the block is hit
   // again, we erase it from CompressedSecondaryCache and add it into the
   // primary cache.
-  //
-  // Only support synchronous for now.
-  // TODO: Support asynchronous lookup in secondary cache
-  if ((!e || found_dummy_entry) && secondary_cache_ && helper &&
-      helper->saveto_cb) {
+  if (!e && secondary_cache_ && helper && helper->saveto_cb) {
     // For objects from the secondary cache, we expect the caller to provide
     // a way to create/delete the primary cache object.
     // The only case where a deleter would not be required is for dummy
     // entries inserted for accounting purposes, which we won't demote to
     // the secondary cache anyway.
     assert(create_cb && helper->del_cb);
-    // Release the dummy handle.
-    if (e) {
-      Release(reinterpret_cast<Cache::Handle*>(e), true /*erase_if_last_ref*/);
-    }
     bool is_in_sec_cache{false};
     std::unique_ptr<SecondaryCacheResultHandle> secondary_handle =
         secondary_cache_->Lookup(key, create_cb, wait, found_dummy_entry,
@@ -564,7 +561,8 @@ Cache::Handle* LRUCacheShard::Lookup(
       e = reinterpret_cast<LRUHandle*>(
           new char[sizeof(LRUHandle) - 1 + key.size()]);
 
-      e->flags = 0;
+      e->m_flags = 0;
+      e->im_flags = 0;
       e->SetSecondaryCacheCompatible(true);
       e->info_.helper = helper;
       e->key_length = key.size();
@@ -578,10 +576,8 @@ Cache::Handle* LRUCacheShard::Lookup(
       e->total_charge = 0;
       e->Ref();
       e->SetIsInSecondaryCache(is_in_sec_cache);
-
-      if (secondary_cache_->SupportForceErase() && !found_dummy_entry) {
-        e->SetIsStandalone(true);
-      }
+      e->SetIsStandalone(secondary_cache_->SupportForceErase() &&
+                         !found_dummy_entry);
 
       if (wait) {
         Promote(e);
@@ -599,14 +595,16 @@ Cache::Handle* LRUCacheShard::Lookup(
       } else {
         // If wait is false, we always return a handle and let the caller
         // release the handle after checking for success or failure.
-        e->SetIncomplete(true);
+        e->SetIsPending(true);
         // This may be slightly inaccurate, if the lookup eventually fails.
         // But the probability is very low.
         PERF_COUNTER_ADD(secondary_cache_hit_count, 1);
         RecordTick(stats, SECONDARY_CACHE_HITS);
       }
     } else {
-      e = nullptr;
+      // Caller will most likely overwrite the dummy entry with an Insert
+      // after this Lookup fails
+      assert(e == nullptr);
     }
   }
   return reinterpret_cast<Cache::Handle*>(e);
@@ -617,6 +615,8 @@ bool LRUCacheShard::Ref(Cache::Handle* h) {
   DMutexLock l(mutex_);
   // To create another reference - entry must be already externally referenced.
   assert(e->HasRefs());
+  // Pending handles are not for sharing
+  assert(!e->IsPending());
   e->Ref();
   return true;
 }
@@ -641,6 +641,9 @@ bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
   }
   LRUHandle* e = reinterpret_cast<LRUHandle*>(handle);
   bool last_reference = false;
+  // Must Wait or WaitAll first on pending handles. Otherwise, would leak
+  // a secondary cache handle.
+  assert(!e->IsPending());
   {
     DMutexLock l(mutex_);
     last_reference = e->Unref();
@@ -658,12 +661,8 @@ bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
       last_reference = false;
     }
   }
-  // If it was the last reference, and the entry is either not secondary
-  // cache compatible (i.e a dummy entry for accounting), or is secondary
-  // cache compatible and has a non-null value, then decrement the cache
-  // usage. If value is null in the latter case, that means the lookup
-  // failed and we didn't charge the cache.
-  if (last_reference && (!e->IsSecondaryCacheCompatible() || e->value)) {
+  // If it was the last reference, then decrement the cache usage.
+  if (last_reference) {
     assert(usage_ >= e->total_charge);
     usage_ -= e->total_charge;
   }
@@ -688,14 +687,17 @@ Status LRUCacheShard::Insert(const Slice& key, uint32_t hash, void* value,
       new char[sizeof(LRUHandle) - 1 + key.size()]);
 
   e->value = value;
-  e->flags = 0;
+  e->m_flags = 0;
+  e->im_flags = 0;
   if (helper) {
+    // Use only one of the two parameters
+    assert(deleter == nullptr);
+    // value == nullptr is reserved for indicating failure for when secondary
+    // cache compatible
+    assert(value != nullptr);
     e->SetSecondaryCacheCompatible(true);
     e->info_.helper = helper;
   } else {
-#ifdef __SANITIZE_THREAD__
-    e->is_secondary_cache_compatible_for_tsan = false;
-#endif  // __SANITIZE_THREAD__
     e->info_.deleter = deleter;
   }
   e->key_length = key.size();
@@ -738,7 +740,6 @@ void LRUCacheShard::Erase(const Slice& key, uint32_t hash) {
 
 bool LRUCacheShard::IsReady(Cache::Handle* handle) {
   LRUHandle* e = reinterpret_cast<LRUHandle*>(handle);
-  DMutexLock l(mutex_);
   bool ready = true;
   if (e->IsPending()) {
     assert(secondary_cache_);
@@ -823,7 +824,10 @@ const CacheShard* LRUCache::GetShard(uint32_t shard) const {
 }
 
 void* LRUCache::Value(Handle* handle) {
-  return reinterpret_cast<LRUHandle*>(handle)->value;
+  auto h = reinterpret_cast<LRUHandle*>(handle);
+  assert(!h->IsPending() || h->value == nullptr);
+  assert(h->value != kDummyValueMarker);
+  return h->value;
 }
 
 size_t LRUCache::GetCharge(Handle* handle) const {
diff --git a/cache/lru_cache.h b/cache/lru_cache.h
index 6e642d04d..062cbcb67 100644
--- a/cache/lru_cache.h
+++ b/cache/lru_cache.h
@@ -38,21 +38,14 @@ namespace lru_cache {
 //     (refs == 0 && in_cache == true)
 //  3. Referenced externally AND not in hash table.
 //     In that case the entry is not in the LRU list and not in hash table.
-//     The entry can be freed when refs becomes 0.
+//     The entry must be freed if refs becomes 0 in this state.
 //     (refs >= 1 && in_cache == false)
-//  4.
-//     The handle is never inserted into the LRUCache (both hash table and
-//     LRU list) and it doesn't experience the above three states.
-//     The entry can be freed when refs becomes 0.
-//     (refs >= 1 && in_cache == false && IS_STANDALONE == true)
-// All newly created LRUHandles are in state 1 or 4. If you call
-// LRUCacheShard::Release on entry in state 1, it will go into state 2.
-// To move from state 1 to state 3, either call LRUCacheShard::Erase or
-// LRUCacheShard::Insert with the same key (but possibly different value).
-// To move from state 2 to state 1, use LRUCacheShard::Lookup.
-// Before destruction, make sure that no handles are in state 1. This means
-// that any successful LRUCacheShard::Lookup/LRUCacheShard::Insert have a
-// matching LRUCache::Release (to move into state 2) or LRUCacheShard::Erase
-// (to move into state 3).
+// If you call LRUCacheShard::Release enough times on an entry in state 1, it
+// will go into state 2. To move from state 1 to state 3, either call
+// LRUCacheShard::Erase or LRUCacheShard::Insert with the same key (but
+// possibly different value). To move from state 2 to state 1, use
+// LRUCacheShard::Lookup.
+// While refs > 0, public properties like value and deleter must not change.
 
 struct LRUHandle {
   void* value;
@@ -77,40 +70,39 @@ struct LRUHandle {
   // The number of external refs to this entry. The cache itself is not counted.
   uint32_t refs;
 
-  enum Flags : uint16_t {
+  // Mutable flags - access controlled by mutex
+  // The m_ and M_ prefixes (and im_ and IM_ later) are to hopefully avoid
+  // checking an M_ flag on im_flags or an IM_ flag on m_flags.
+  uint8_t m_flags;
+  enum MFlags : uint8_t {
     // Whether this entry is referenced by the hash table.
-    IN_CACHE = (1 << 0),
-    // Whether this entry is high priority entry.
-    IS_HIGH_PRI = (1 << 1),
-    // Whether this entry is in high-pri pool.
-    IN_HIGH_PRI_POOL = (1 << 2),
+    M_IN_CACHE = (1 << 0),
     // Whether this entry has had any lookups (hits).
-    HAS_HIT = (1 << 3),
+    M_HAS_HIT = (1 << 1),
+    // Whether this entry is in high-pri pool.
+    M_IN_HIGH_PRI_POOL = (1 << 2),
+    // Whether this entry is in low-pri pool.
+    M_IN_LOW_PRI_POOL = (1 << 3),
+  };
+
+  // "Immutable" flags - only set in single-threaded context and then
+  // can be accessed without mutex
+  uint8_t im_flags;
+  enum ImFlags : uint8_t {
+    // Whether this entry is high priority entry.
+    IM_IS_HIGH_PRI = (1 << 0),
+    // Whether this entry is low priority entry.
+    IM_IS_LOW_PRI = (1 << 1),
     // Can this be inserted into the secondary cache.
-    IS_SECONDARY_CACHE_COMPATIBLE = (1 << 4),
+    IM_IS_SECONDARY_CACHE_COMPATIBLE = (1 << 2),
     // Is the handle still being read from a lower tier.
-    IS_PENDING = (1 << 5),
+    IM_IS_PENDING = (1 << 3),
     // Whether this handle is still in a lower tier
-    IS_IN_SECONDARY_CACHE = (1 << 6),
-    // Whether this entry is low priority entry.
-    IS_LOW_PRI = (1 << 7),
-    // Whether this entry is in low-pri pool.
-    IN_LOW_PRI_POOL = (1 << 8),
-    // Whether this entry is not inserted into the cache (both hash table and
-    // LRU list).
-    IS_STANDALONE = (1 << 9),
+    IM_IS_IN_SECONDARY_CACHE = (1 << 4),
+    // Marks result handles that should not be inserted into cache
+    IM_IS_STANDALONE = (1 << 5),
   };
 
-  uint16_t flags;
-
-#ifdef __SANITIZE_THREAD__
-  // TSAN can report a false data race on flags, where one thread is writing
-  // to one of the mutable bits and another thread is reading this immutable
-  // bit. So precisely suppress that TSAN warning, we separate out this bit
-  // during TSAN runs.
-  bool is_secondary_cache_compatible_for_tsan;
-#endif  // __SANITIZE_THREAD__
-
   // Beginning of the key (MUST BE THE LAST FIELD IN THIS STRUCT!)
   char key_data[1];
 
@@ -129,104 +121,94 @@ struct LRUHandle {
   // Return true if there are external refs, false otherwise.
   bool HasRefs() const { return refs > 0; }
 
-  bool InCache() const { return flags & IN_CACHE; }
-  bool IsHighPri() const { return flags & IS_HIGH_PRI; }
-  bool InHighPriPool() const { return flags & IN_HIGH_PRI_POOL; }
-  bool IsLowPri() const { return flags & IS_LOW_PRI; }
-  bool InLowPriPool() const { return flags & IN_LOW_PRI_POOL; }
-  bool HasHit() const { return flags & HAS_HIT; }
+  bool InCache() const { return m_flags & M_IN_CACHE; }
+  bool IsHighPri() const { return im_flags & IM_IS_HIGH_PRI; }
+  bool InHighPriPool() const { return m_flags & M_IN_HIGH_PRI_POOL; }
+  bool IsLowPri() const { return im_flags & IM_IS_LOW_PRI; }
+  bool InLowPriPool() const { return m_flags & M_IN_LOW_PRI_POOL; }
+  bool HasHit() const { return m_flags & M_HAS_HIT; }
   bool IsSecondaryCacheCompatible() const {
-#ifdef __SANITIZE_THREAD__
-    return is_secondary_cache_compatible_for_tsan;
-#else
-    return flags & IS_SECONDARY_CACHE_COMPATIBLE;
-#endif  // __SANITIZE_THREAD__
+    return im_flags & IM_IS_SECONDARY_CACHE_COMPATIBLE;
+  }
+  bool IsPending() const { return im_flags & IM_IS_PENDING; }
+  bool IsInSecondaryCache() const {
+    return im_flags & IM_IS_IN_SECONDARY_CACHE;
   }
-  bool IsPending() const { return flags & IS_PENDING; }
-  bool IsInSecondaryCache() const { return flags & IS_IN_SECONDARY_CACHE; }
-  bool IsStandalone() const { return flags & IS_STANDALONE; }
+  bool IsStandalone() const { return im_flags & IM_IS_STANDALONE; }
 
   void SetInCache(bool in_cache) {
     if (in_cache) {
-      flags |= IN_CACHE;
+      m_flags |= M_IN_CACHE;
     } else {
-      flags &= ~IN_CACHE;
+      m_flags &= ~M_IN_CACHE;
     }
   }
 
   void SetPriority(Cache::Priority priority) {
     if (priority == Cache::Priority::HIGH) {
-      flags |= IS_HIGH_PRI;
-      flags &= ~IS_LOW_PRI;
+      im_flags |= IM_IS_HIGH_PRI;
+      im_flags &= ~IM_IS_LOW_PRI;
     } else if (priority == Cache::Priority::LOW) {
-      flags &= ~IS_HIGH_PRI;
-      flags |= IS_LOW_PRI;
+      im_flags &= ~IM_IS_HIGH_PRI;
+      im_flags |= IM_IS_LOW_PRI;
    } else {
-      flags &= ~IS_HIGH_PRI;
-      flags &= ~IS_LOW_PRI;
+      im_flags &= ~IM_IS_HIGH_PRI;
+      im_flags &= ~IM_IS_LOW_PRI;
     }
   }
 
   void SetInHighPriPool(bool in_high_pri_pool) {
     if (in_high_pri_pool) {
-      flags |= IN_HIGH_PRI_POOL;
+      m_flags |= M_IN_HIGH_PRI_POOL;
     } else {
-      flags &= ~IN_HIGH_PRI_POOL;
+      m_flags &= ~M_IN_HIGH_PRI_POOL;
     }
   }
 
   void SetInLowPriPool(bool in_low_pri_pool) {
     if (in_low_pri_pool) {
-      flags |= IN_LOW_PRI_POOL;
+      m_flags |= M_IN_LOW_PRI_POOL;
     } else {
-      flags &= ~IN_LOW_PRI_POOL;
+      m_flags &= ~M_IN_LOW_PRI_POOL;
     }
   }
 
-  void SetHit() { flags |= HAS_HIT; }
+  void SetHit() { m_flags |= M_HAS_HIT; }
 
   void SetSecondaryCacheCompatible(bool compat) {
     if (compat) {
-      flags |= IS_SECONDARY_CACHE_COMPATIBLE;
+      im_flags |= IM_IS_SECONDARY_CACHE_COMPATIBLE;
     } else {
-      flags &= ~IS_SECONDARY_CACHE_COMPATIBLE;
+      im_flags &= ~IM_IS_SECONDARY_CACHE_COMPATIBLE;
     }
-#ifdef __SANITIZE_THREAD__
-    is_secondary_cache_compatible_for_tsan = compat;
-#endif  // __SANITIZE_THREAD__
   }
 
-  void SetIncomplete(bool incomp) {
-    if (incomp) {
-      flags |= IS_PENDING;
+  void SetIsPending(bool pending) {
+    if (pending) {
+      im_flags |= IM_IS_PENDING;
     } else {
-      flags &= ~IS_PENDING;
+      im_flags &= ~IM_IS_PENDING;
     }
   }
 
   void SetIsInSecondaryCache(bool is_in_secondary_cache) {
     if (is_in_secondary_cache) {
-      flags |= IS_IN_SECONDARY_CACHE;
+      im_flags |= IM_IS_IN_SECONDARY_CACHE;
     } else {
-      flags &= ~IS_IN_SECONDARY_CACHE;
+      im_flags &= ~IM_IS_IN_SECONDARY_CACHE;
    }
   }
 
   void SetIsStandalone(bool is_standalone) {
     if (is_standalone) {
-      flags |= IS_STANDALONE;
+      im_flags |= IM_IS_STANDALONE;
     } else {
-      flags &= ~IS_STANDALONE;
+      im_flags &= ~IM_IS_STANDALONE;
     }
   }
 
   void Free() {
     assert(refs == 0);
-#ifdef __SANITIZE_THREAD__
-    // Here we can safely assert they are the same without a data race reported
-    assert(((flags & IS_SECONDARY_CACHE_COMPATIBLE) != 0) ==
-           is_secondary_cache_compatible_for_tsan);
-#endif  // __SANITIZE_THREAD__
     if (!IsSecondaryCacheCompatible() && info_.deleter) {
       (*info_.deleter)(key(), value);
     } else if (IsSecondaryCacheCompatible()) {
diff --git a/db/blob/blob_source_test.cc b/db/blob/blob_source_test.cc
index 0f2d19260..30f2aa02a 100644
--- a/db/blob/blob_source_test.cc
+++ b/db/blob/blob_source_test.cc
@@ -1304,12 +1304,20 @@ TEST_F(BlobSecondaryCacheTest, GetBlobsFromSecondaryCache) {
                                              blob_offsets[1]));
 
     // key1's dummy handle is in the primary cache and key1's item is still
-    // in the secondary cache. So, the primary cache's Lookup() can only
-    // get a dummy handle.
+    // in the secondary cache. So, the primary cache's Lookup() without
+    // secondary cache support cannot see it. (NOTE: The dummy handle used
+    // to be a leaky abstraction but not anymore.)
+    handle1 = blob_cache->Lookup(key1, statistics);
+    ASSERT_EQ(handle1, nullptr);
+
+    // But after another access, it is promoted to primary cache
+    ASSERT_TRUE(blob_source.TEST_BlobInCache(file_number, file_size,
+                                             blob_offsets[1]));
+
+    // And Lookup() can find it (without secondary cache support)
     handle1 = blob_cache->Lookup(key1, statistics);
     ASSERT_NE(handle1, nullptr);
-    // handl1 is a dummy handle.
-    ASSERT_EQ(blob_cache->Value(handle1), nullptr);
+    ASSERT_NE(blob_cache->Value(handle1), nullptr);
     blob_cache->Release(handle1);
   }
 }
diff --git a/include/rocksdb/cache.h b/include/rocksdb/cache.h
index 48ee22f91..11c7da77e 100644
--- a/include/rocksdb/cache.h
+++ b/include/rocksdb/cache.h
@@ -569,9 +569,11 @@ class Cache {
   // over time.
 
   // Insert a mapping from key->value into the cache and assign it
-  // the specified charge against the total cache capacity.
-  // If strict_capacity_limit is true and cache reaches its full capacity,
-  // return Status::MemoryLimit.
+  // the specified charge against the total cache capacity. If
+  // strict_capacity_limit is true and cache reaches its full capacity,
+  // return Status::MemoryLimit. `value` must be non-nullptr for this
+  // Insert() because Value() == nullptr is reserved for indicating failure
+  // with secondary-cache-compatible mappings.
   //
   // The helper argument is saved by the cache and will be used when the
   // inserted object is evicted or promoted to the secondary cache. It,
@@ -619,11 +621,31 @@ class Cache {
   // saved and used later when the object is evicted. Therefore, it must
   // outlive the cache.
   //
-  // The handle returned may not be ready. The caller should call IsReady()
-  // to check if the item value is ready, and call Wait() or WaitAll() if
-  // its not ready. The caller should then call Value() to check if the
-  // item was successfully retrieved. If unsuccessful (perhaps due to an
-  // IO error), Value() will return nullptr.
+  // ======================== Async Lookup (wait=false) ======================
+  // When wait=false, the handle returned might be in any of three states:
+  // * Present - If Value() != nullptr, then the result is present and
+  // the handle can be used just as if wait=true.
+  // * Pending, not ready (IsReady() == false) - secondary cache is still
+  // working to retrieve the value. Might become ready any time.
+  // * Pending, ready (IsReady() == true) - secondary cache has the value
+  // but it has not been loaded into primary cache. Call to Wait()/WaitAll()
+  // will not block.
+  //
+  // IMPORTANT: Pending handles are not thread-safe, and only these functions
+  // are allowed on them: Value(), IsReady(), Wait(), WaitAll(). Even Release()
+  // can only come after Wait() or WaitAll() even though a reference is held.
+  //
+  // Only Wait()/WaitAll() gets a Handle out of a Pending state. (Waiting is
+  // safe and has no effect on other handle states.) After waiting on a Handle,
+  // it is in one of two states:
+  // * Present - if Value() != nullptr
+  // * Failed - if Value() == nullptr, such as if the secondary cache
+  // initially thought it had the value but actually did not.
+  //
+  // Note that given an arbitrary Handle, the only way to distinguish the
+  // Pending+ready state from the Failed state is to Wait() on it.
+  // A cache entry not compatible with secondary cache can also have
+  // Value()==nullptr like the Failed state, but this is not generally a
+  // concern.
   virtual Handle* Lookup(const Slice& key,
                          const CacheItemHelper* /*helper_cb*/,
                          const CreateCallback& /*create_cb*/,
                          Priority /*priority*/, bool /*wait*/,
@@ -634,27 +656,31 @@ class Cache {
   // Release a mapping returned by a previous Lookup(). The "useful"
   // parameter specifies whether the data was actually used or not,
   // which may be used by the cache implementation to decide whether
-  // to consider it as a hit for retention purposes.
+  // to consider it as a hit for retention purposes. As noted elsewhere,
+  // "pending" handles require Wait()/WaitAll() before Release().
   virtual bool Release(Handle* handle, bool /*useful*/,
                        bool erase_if_last_ref) {
     return Release(handle, erase_if_last_ref);
   }
 
-  // Determines if the handle returned by Lookup() has a valid value yet. The
-  // call is not thread safe and should be called only by someone holding a
-  // reference to the handle.
+  // Determines if the handle returned by Lookup() can give a value without
+  // blocking, though Wait()/WaitAll() might be required to publish it to
+  // Value(). See secondary cache compatible Lookup() above for details.
+  // This call is not thread safe on "pending" handles.
   virtual bool IsReady(Handle* /*handle*/) { return true; }
 
-  // If the handle returned by Lookup() is not ready yet, wait till it
-  // becomes ready.
-  // Note: A ready handle doesn't necessarily mean it has a valid value. The
-  // user should call Value() and check for nullptr.
+  // Convert a "pending" handle into a full thread-shareable handle by
+  // * If necessary, wait until secondary cache finishes loading the value.
+  // * Construct the value for primary cache and set it in the handle.
+  // Even after Wait() on a pending handle, the caller must check for
+  // Value() == nullptr in case of failure. This call is not thread-safe
+  // on pending handles.
+  // This call has no effect on non-pending handles.
+  // See secondary cache compatible Lookup() above for details.
   virtual void Wait(Handle* /*handle*/) {}
 
   // Wait for a vector of handles to become ready. As with Wait(), the user
   // should check the Value() of each handle for nullptr. This call is not
-  // thread safe and should only be called by the caller holding a reference
-  // to each of the handles.
+  // thread-safe on pending handles.
   virtual void WaitAll(std::vector<Handle*>& /*handles*/) {}
 
  private: