Remove reallocation of AlignedBuffer in direct_io sync reads if already aligned (#11600)

Summary:
Remove reallocation of AlignedBuffer in direct_io sync reads in RandomAccessFileReader::Read if the buffer passed by the caller is already aligned.
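
For orientation, here is a minimal standalone restatement (illustrative names, not the RocksDB implementation itself) of the check this change adds: a direct-I/O read can go straight into the caller's `scratch` buffer only when the offset, the length, and the buffer address are all multiples of the required alignment.

```cpp
#include <cstddef>
#include <cstdint>

// True if a direct-I/O read of `n` bytes at `offset` can be served directly
// into `scratch`. `alignment` is assumed to be a power of two, so
// `x & (alignment - 1)` is equivalent to `x % alignment`.
bool IsAlignedForDirectIO(uint64_t offset, size_t n, const char* scratch,
                          size_t alignment) {
  return (offset & (alignment - 1)) == 0 && (n & (alignment - 1)) == 0 &&
         (reinterpret_cast<uintptr_t>(scratch) & (alignment - 1)) == 0;
}

int main() {
  constexpr size_t kAlignment = 4096;       // hypothetical required alignment
  alignas(4096) static char scratch[8192];  // caller-provided aligned buffer
  // Aligned offset + aligned length + aligned buffer => the new fast path.
  return IsAlignedForDirectIO(/*offset=*/8192, /*n=*/4096, scratch, kAlignment)
             ? 0
             : 1;
}
```

When this check fails, the reader falls back to the pre-existing path: read into an internal AlignedBuffer and copy the requested range out.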

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11600

Test Plan:
Setup: `TEST_TMPDIR=./tmp-db/ ./db_bench -benchmarks=filluniquerandom -disable_auto_compactions=true -target_file_size_base=1048576 -write_buffer_size=1048576 -compression_type=none`
Benchmark: `TEST_TMPDIR=./tmp-db/ perf record ./db_bench --cache_size=8388608 --use_existing_db=true --disable_auto_compactions=true --benchmarks=seekrandom --use_direct_reads=true -use_direct_io_for_flush_and_compaction=true -reads=1000 -seek_nexts=1 -max_auto_readahead_size=131072 -initial_auto_readahead_size=16384 -adaptive_readahead=true -num_file_reads_for_auto_readahead=0`

Perf profile:
Before:
```
8.73% db_bench libc.so.6 [.] __memmove_evex_unaligned_erms
3.34% db_bench [kernel.vmlinux] [k] filemap_get_read_batch
```

After:
```
2.50% db_bench [kernel.vmlinux] [k] filemap_get_read_batch
2.29% db_bench libc.so.6 [.] __memmove_evex_unaligned_erms
```

`make crash_test -j` with direct_io enabled completed successfully locally.

Ran a few benchmarks with direct_io, with seek_nexts varying between 912 and 327680 and different readahead_size parameters, and saw no regression so far.

Reviewed By: ajkr

Differential Revision: D47478598

Pulled By: akankshamahajan15

fbshipit-source-id: 6a48e21cb34696f5d09c22a6311a3a1cb5f9cf33
akankshamahajan, committed by Facebook GitHub Bot
parent b1b6f87fbe
commit 749b179c04

Changed files:
1. file/random_access_file_reader.cc (18 lines changed)
2. unreleased_history/performance_improvements/avoid_memcpy_directio.md (1 line changed)

```diff
--- a/file/random_access_file_reader.cc
+++ b/file/random_access_file_reader.cc
@@ -97,6 +97,15 @@ IOStatus RandomAccessFileReader::Read(
   IOStatus io_s;
   uint64_t elapsed = 0;
+  size_t alignment = file_->GetRequiredBufferAlignment();
+  bool is_aligned = false;
+  if (scratch != nullptr) {
+    // Check if offset, length and buffer are aligned.
+    is_aligned = (offset & (alignment - 1)) == 0 &&
+                 (n & (alignment - 1)) == 0 &&
+                 (uintptr_t(scratch) & (alignment - 1)) == 0;
+  }
   {
     StopWatch sw(clock_, stats_, hist_type_,
                  (opts.io_activity != Env::IOActivity::kUnknown)
@@ -106,8 +115,7 @@ IOStatus RandomAccessFileReader::Read(
                  true /*delay_enabled*/);
     auto prev_perf_level = GetPerfLevel();
     IOSTATS_TIMER_GUARD(read_nanos);
-    if (use_direct_io()) {
-      size_t alignment = file_->GetRequiredBufferAlignment();
+    if (use_direct_io() && is_aligned == false) {
       size_t aligned_offset =
           TruncateToPageBoundary(alignment, static_cast<size_t>(offset));
       size_t offset_advance = static_cast<size_t>(offset) - aligned_offset;
@@ -182,9 +190,9 @@ IOStatus RandomAccessFileReader::Read(
         if (rate_limiter_->IsRateLimited(RateLimiter::OpType::kRead)) {
           sw.DelayStart();
         }
-        allowed = rate_limiter_->RequestToken(n - pos, 0 /* alignment */,
-                                              rate_limiter_priority, stats_,
-                                              RateLimiter::OpType::kRead);
+        allowed = rate_limiter_->RequestToken(
+            n - pos, (use_direct_io() ? alignment : 0), rate_limiter_priority,
+            stats_, RateLimiter::OpType::kRead);
         if (rate_limiter_->IsRateLimited(RateLimiter::OpType::kRead)) {
           sw.DelayStop();
         }
```

```diff
--- /dev/null
+++ b/unreleased_history/performance_improvements/avoid_memcpy_directio.md
@@ -0,0 +1 @@
+In case of direct_io, if the buffer passed by the caller is already aligned, RandomAccessFileReader::Read will avoid allocating a new aligned buffer and the extra memcpy, and will read directly into the passed buffer.
```
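
As a caller-side illustration (plain POSIX, with an assumed file name and block size, not the RocksDB API), the constraint behind this change looks like this: the scratch buffer, the offset, and the length must all be alignment multiples for a direct read, which is exactly the case where the new code path can skip the intermediate AlignedBuffer and memcpy.

```cpp
// Compile on Linux with -D_GNU_SOURCE (O_DIRECT is a GNU extension).
#include <fcntl.h>
#include <unistd.h>

#include <cstdio>
#include <cstdlib>

int main() {
  const size_t kAlignment = 4096;      // assumed logical block size
  const size_t kLen = 4 * kAlignment;  // read length, a multiple of alignment
  const off_t kOffset = 0;             // aligned offset

  int fd = open("testfile", O_RDONLY | O_DIRECT);  // hypothetical input file
  if (fd < 0) {
    std::perror("open");
    return 1;
  }

  // Aligned scratch buffer; std::aligned_alloc requires the size to be a
  // multiple of the alignment.
  char* scratch = static_cast<char*>(std::aligned_alloc(kAlignment, kLen));
  if (scratch == nullptr) {
    close(fd);
    return 1;
  }

  ssize_t got = pread(fd, scratch, kLen, kOffset);
  if (got < 0) {
    std::perror("pread");
  } else {
    std::printf("read %zd bytes\n", got);
  }

  std::free(scratch);
  close(fd);
  return got < 0 ? 1 : 0;
}
```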