Summary:
The range deletion meta-block iterators weren't getting cleaned up properly since they don't support arena allocation. I didn't implement arena support since, in the general case, each iterator is used only once and separately from all other iterators, so there should be no benefit to data locality.
Anyways, this diff fixes up #2370 by treating range deletion iterators as non-arena-allocated.
Closes https://github.com/facebook/rocksdb/pull/2399
Differential Revision: D5171119
Pulled By: ajkr
fbshipit-source-id: bef6f5c4c5905a124f4993945aed4bd86e2807d8
Summary:
Previously we returned NotSupported when ingesting files into a database containing any range deletions. This diff adds the support.
- Flush if any memtable contains range deletions overlapping the to-be-ingested file
- Place to-be-ingested file before any level that contains range deletions overlapping it.
- Added support for `Version` to return iterators over range deletions in a given level. Previously, we piggybacked getting range deletions onto `Version`'s `Get()` / `AddIterator()` functions by passing them a `RangeDelAggregator*`. But file ingestion needs to get iterators over range deletions, not populate an aggregator (since the aggregator does collapsing and doesn't expose the actual ranges).
Closes https://github.com/facebook/rocksdb/pull/2370
Differential Revision: D5127648
Pulled By: ajkr
fbshipit-source-id: 816faeb9708adfa5287962bafdde717db56e3f1a
Summary:
First cut for early review; there are few conceptual points to answer and some code structure issues.
For conceptual points -
- restriction-wise, we're going to disallow ingest_behind if (use_seqno_zero_out=true || disable_auto_compaction=false), the user is responsible to properly open and close DB with required params
- we wanted to ingest into reserved bottom most level. Should we fail fast if bottom level isn't empty, or should we attempt to ingest if file fits there key-ranges-wise?
- Modifying AssignLevelForIngestedFile seems the place we we'd handle that.
On code structure - going to refactor GenerateAndAddExternalFile call in the test class to allow passing instance of IngestionOptions, that's just going to incur lots of changes at callsites.
Closes https://github.com/facebook/rocksdb/pull/2144
Differential Revision: D4873732
Pulled By: lightmark
fbshipit-source-id: 81cb698106b68ef8797f564453651d50900e153a
Summary:
Support buck load with universal compaction.
More test cases to be added.
Closes https://github.com/facebook/rocksdb/pull/2202
Differential Revision: D4935360
Pulled By: lightmark
fbshipit-source-id: cc3ca1b6f42faa503207dab1408d6bcf393ee5b5
Summary:
Replacement of #2147
The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194
Differential Revision: D4929799
Pulled By: siying
fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
Summary:
merger.h was always a confusing name for me, simply give the file a better name
Closes https://github.com/facebook/rocksdb/pull/1836
Differential Revision: D4505357
Pulled By: IslamAbdelRahman
fbshipit-source-id: 07b28d8
Summary:
File copying happens when creating checkpoints and bulkloading files from different FS partition. We should fsync the files when copying them to guarantee durability. A side effect will be that the dirty pages in file system buffers won't grow too large.
Closes https://github.com/facebook/rocksdb/pull/1728
Differential Revision: D4371083
Pulled By: siying
fbshipit-source-id: 579e14c
Summary:
When we Ingest an external file we open it to read some metadata and first/last key
during doing that we insert blocks into the block cache with global_seqno = 0
If we move the file (did not copy it) into the DB, we will use these blocks with the wrong seqno in the read path
Closes https://github.com/facebook/rocksdb/pull/1627
Differential Revision: D4293332
Pulled By: yiwu-arbug
fbshipit-source-id: 3ce5523
Summary:
Allow user to explicitly specify that the generated file by SstFileWriter will be ingested in a specific CF.
This allow us to persist the CF id in the generated file
Closes https://github.com/facebook/rocksdb/pull/1615
Differential Revision: D4270422
Pulled By: IslamAbdelRahman
fbshipit-source-id: 7fb954e
Summary:
Note: reviewed in https://reviews.facebook.net/D65115
- DBIter maintains a range tombstone accumulator. We don't cleanup obsolete tombstones yet, so if the user seeks back and forth, the same tombstones would be added to the accumulator multiple times.
- DBImpl::NewInternalIterator() (used to make DBIter's underlying iterator) adds memtable/L0 range tombstones, L1+ range tombstones are added on-demand during NewSecondaryIterator() (see D62205)
- DBIter uses ShouldDelete() when advancing to check whether keys are covered by range tombstones
Closes https://github.com/facebook/rocksdb/pull/1464
Differential Revision: D4131753
Pulled By: ajkr
fbshipit-source-id: be86559
Summary:
Changes in the diff
API changes:
- Introduce IngestExternalFile to replace AddFile (I think this make the API more clear)
- Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file)
- Deprecate AddFile() API
Logic changes:
- If our file overlap with the memtable we will flush the memtable
- We will find the first level in the LSM tree that our file key range overlap with the keys in it
- We will find the lowest level in the LSM tree above the the level we found in step 2 that our file can fit in and ingest our file in it
- We will assign a global sequence number to our new file
- Remove AddFile restrictions by using global sequence numbers
Other changes:
- Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
Test Plan:
unit tests (still need to add more)
addfile_stress (https://reviews.facebook.net/D65037)
Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
Reviewed By: sdong
Subscribers: jkedgar, hcz, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D65061