RocksDB Trace Analyzer (#4091)
Summary:
A framework for analyzing RocksDB traces.
After collecting a trace with the tool from [PR #3837](https://github.com/facebook/rocksdb/pull/3837), users can use the Trace Analyzer to interpret, analyze, and characterize the collected workload.
**Input:**
1. trace file
2. Whole keys space file
**Statistics:**
1. Access count of each operation (Get, Put, Delete, SingleDelete, DeleteRange, Merge) in each column family.
2. Key hotness (access count) of each one
3. Key space separation based on given prefix
4. Key size distribution
5. Value size distribution, if applicable
6. Top K accessed keys
7. QPS statistics including the average QPS and peak QPS
8. Top K accessed prefix
9. The query correlation analysis, which outputs the number of X after Y and the corresponding average time intervals
**Output:**
1. key access heat map (either in the accessed key space or whole key space)
2. trace sequence file (interprets the raw trace file into a line-based text file for future use)
3. Time series (the key space ID and its access time)
4. Key access count distribution
5. Key size distribution
6. Value size distribution (in each interval)
7. whole key space separation by the prefix
8. Accessed key space separation by the prefix
9. QPS of each operation and each column family
10. Top K QPS and their accessed prefix range
**Test:**
1. Added the unit test of analyzing Get, Put, Delete, SingleDelete, DeleteRange, Merge
2. Generated a trace and analyzed it
**Implemented but not tested (due to the limitation of trace_replay):**
1. Analyzing Iterator, supporting Seek() and SeekForPrev() analyzing
2. Analyzing the number of keys found by Get
**Future Work:**
1. Support execution time analysis of each request
2. Support cache hit situation and block read situation of Get
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4091
Differential Revision: D9256157
Pulled By: zhichao-cao
fbshipit-source-id: f0ceacb7eedbc43a3eee6e85b76087d7832a8fe6
7 years ago
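The "Key access count distribution" output listed above can be illustrated with a small sketch. This is not the analyzer's actual implementation: `AccessCountDistribution` is a hypothetical helper introduced only for illustration, and it assumes the input is simply the list of keys touched by one operation type.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of trace_analyzer): maps each access count
// to the number of distinct keys that were accessed exactly that many times.
std::map<uint64_t, uint64_t> AccessCountDistribution(
    const std::vector<std::string>& accessed_keys) {
  // First pass: count how often each key was accessed.
  std::map<std::string, uint64_t> per_key;
  for (const auto& key : accessed_keys) {
    ++per_key[key];
  }
  // Second pass: count how many keys share each access count.
  std::map<uint64_t, uint64_t> distribution;
  for (const auto& entry : per_key) {
    ++distribution[entry.second];
  }
  return distribution;
}
```

For the two Get calls issued by the test below (keys "a" and "g"), this sketch yields a single bucket, access count 1 with 2 keys, which matches the shape of the expected line `access_count: 1 num: 2`.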
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2012 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#ifndef ROCKSDB_LITE
#ifndef GFLAGS
#include <cstdio>
int main() {
  fprintf(stderr, "Please install gflags to run trace_analyzer test\n");
  return 1;
}
#else

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <sstream>
#include <thread>

#include "db/db_test_util.h"
#include "file/read_write_util.h"
#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/status.h"
#include "rocksdb/trace_reader_writer.h"
#include "test_util/testharness.h"
#include "test_util/testutil.h"
#include "tools/trace_analyzer_tool.h"
#include "trace_replay/trace_replay.h"

namespace rocksdb {

namespace {
static const int kMaxArgCount = 100;
static const size_t kArgBufferSize = 100000;
}  // namespace

// The helper functions for the test
class TraceAnalyzerTest : public testing::Test {
 public:
  TraceAnalyzerTest() : rnd_(0xFB) {
    // test_path_ = test::TmpDir() + "trace_analyzer_test";
    test_path_ = test::PerThreadDBPath("trace_analyzer_test");
    env_ = rocksdb::Env::Default();
    env_->CreateDir(test_path_);
    dbname_ = test_path_ + "/db";
  }

  ~TraceAnalyzerTest() override {}

  void GenerateTrace(std::string trace_path) {
    Options options;
    options.create_if_missing = true;
    options.merge_operator = MergeOperators::CreatePutOperator();
    ReadOptions ro;
    WriteOptions wo;
    TraceOptions trace_opt;
    DB* db_ = nullptr;
    std::string value;
    std::unique_ptr<TraceWriter> trace_writer;
    Iterator* single_iter = nullptr;

    ASSERT_OK(
        NewFileTraceWriter(env_, env_options_, trace_path, &trace_writer));
    ASSERT_OK(DB::Open(options, dbname_, &db_));
    ASSERT_OK(db_->StartTrace(trace_opt, std::move(trace_writer)));

    WriteBatch batch;
    ASSERT_OK(batch.Put("a", "aaaaaaaaa"));
    ASSERT_OK(batch.Merge("b", "aaaaaaaaaaaaaaaaaaaa"));
    ASSERT_OK(batch.Delete("c"));
    ASSERT_OK(batch.SingleDelete("d"));
    ASSERT_OK(batch.DeleteRange("e", "f"));
    ASSERT_OK(db_->Write(wo, &batch));

    ASSERT_OK(db_->Get(ro, "a", &value));
    single_iter = db_->NewIterator(ro);
    single_iter->Seek("a");
    single_iter->SeekForPrev("b");
    delete single_iter;
    std::this_thread::sleep_for(std::chrono::seconds(1));

    db_->Get(ro, "g", &value);

    ASSERT_OK(db_->EndTrace());

    ASSERT_OK(env_->FileExists(trace_path));

    std::unique_ptr<WritableFile> whole_f;
    std::string whole_path = test_path_ + "/0.txt";
    ASSERT_OK(env_->NewWritableFile(whole_path, &whole_f, env_options_));
    std::string whole_str = "0x61\n0x62\n0x63\n0x64\n0x65\n0x66\n";
    ASSERT_OK(whole_f->Append(whole_str));
    delete db_;
    ASSERT_OK(DestroyDB(dbname_, options));
  }

  void RunTraceAnalyzer(const std::vector<std::string>& args) {
    char arg_buffer[kArgBufferSize];
    char* argv[kMaxArgCount];
    int argc = 0;
    int cursor = 0;

    for (const auto& arg : args) {
      ASSERT_LE(cursor + arg.size() + 1, kArgBufferSize);
      ASSERT_LE(argc + 1, kMaxArgCount);
      snprintf(arg_buffer + cursor, arg.size() + 1, "%s", arg.c_str());

      argv[argc++] = arg_buffer + cursor;
      cursor += static_cast<int>(arg.size()) + 1;
    }

    ASSERT_EQ(0, rocksdb::trace_analyzer_tool(argc, argv));
  }

  void CheckFileContent(const std::vector<std::string>& cnt,
                        std::string file_path, bool full_content) {
    ASSERT_OK(env_->FileExists(file_path));
    std::unique_ptr<SequentialFile> f_ptr;
    ASSERT_OK(env_->NewSequentialFile(file_path, &f_ptr, env_options_));

    std::string get_line;
    std::istringstream iss;
    bool has_data = true;
    std::vector<std::string> result;
    uint32_t count;
    Status s;
Introduce a new storage specific Env API (#5761)
Summary:
The current Env API encompasses both storage/file operations and OS-related operations. Most of the APIs return a Status, which does not carry enough metadata about an error, such as whether it is retryable or the scope (i.e., fault domain) of the error, that may be required to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, and hints about placement and redundancy.
This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.
The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.
This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.
The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761
Differential Revision: D18868376
Pulled By: anand1976
fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
5 years ago
    std::unique_ptr<FSSequentialFile> file =
        NewLegacySequentialFileWrapper(f_ptr);
    for (count = 0; ReadOneLine(&iss, file.get(), &get_line, &has_data, &s);
         ++count) {
      ASSERT_OK(s);
      result.push_back(get_line);
    }

    ASSERT_EQ(cnt.size(), result.size());
    for (int i = 0; i < static_cast<int>(result.size()); i++) {
      if (full_content) {
        ASSERT_EQ(result[i], cnt[i]);
      } else {
        ASSERT_EQ(result[i][0], cnt[i][0]);
      }
    }

    return;
  }

  void AnalyzeTrace(std::vector<std::string>& paras_diff,
                    std::string output_path, std::string trace_path) {
    std::vector<std::string> paras = {"./trace_analyzer",
                                      "-convert_to_human_readable_trace",
                                      "-output_key_stats",
                                      "-output_access_count_stats",
                                      "-output_prefix=test",
                                      "-output_prefix_cut=1",
                                      "-output_time_series",
                                      "-output_value_distribution",
                                      "-output_qps_stats",
                                      "-no_key",
                                      "-no_print"};
    for (auto& para : paras_diff) {
      paras.push_back(para);
    }
    Status s = env_->FileExists(trace_path);
    if (!s.ok()) {
      GenerateTrace(trace_path);
    }
    env_->CreateDir(output_path);
    RunTraceAnalyzer(paras);
  }

  rocksdb::Env* env_;
  EnvOptions env_options_;
  std::string test_path_;
  std::string dbname_;
  Random rnd_;
};

TEST_F(TraceAnalyzerTest, Get) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/get";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_get"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);

  // Check the key_stats file
  std::vector<std::string> k_stats = {"0 10 0 1 1.000000", "0 10 1 1 1.000000"};
  file_path = output_path + "/test-get-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 2"};
  file_path = output_path + "/test-get-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
|
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30",
                                       "1 1 1 1.000000 1.000000 0x61"};
  file_path = output_path + "/test-get-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"0 1533000630 0", "0 1533000630 1"};
  file_path = output_path + "/test-get-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"0 1"};
  file_path = output_path + "/test-get-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-get-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 0 0 0 0 0 0 0 1"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of get
  std::vector<std::string> get_qps = {"1"};
  file_path = output_path + "/test-get-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 1",
                                      "The prefix: 0x61 Access count: 1"};
  file_path = output_path + "/test-get-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);
}

// Test analyzing of Put
TEST_F(TraceAnalyzerTest, Put) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/put";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_put"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);

  // check the key_stats file
  std::vector<std::string> k_stats = {"0 9 0 1 1.000000"};
  file_path = output_path + "/test-put-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 1"};
  file_path = output_path + "/test-put-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30"};
  file_path = output_path + "/test-put-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"1 1533056278 0"};
  file_path = output_path + "/test-put-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"0 1"};
  file_path = output_path + "/test-put-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-put-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 1 0 0 0 0 0 0 2"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of Put
  std::vector<std::string> get_qps = {"1"};
  file_path = output_path + "/test-put-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 1",
                                      "The prefix: 0x61 Access count: 1"};
  file_path = output_path + "/test-put-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);

  // Check the value size distribution
  std::vector<std::string> value_dist = {
      "Number_of_value_size_between 0 and 16 is: 1"};
  file_path = output_path + "/test-put-0-accessed_value_size_distribution.txt";
  CheckFileContent(value_dist, file_path, true);
}

// Test analyzing of delete
TEST_F(TraceAnalyzerTest, Delete) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/delete";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_delete"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);

  // check the key_stats file
  std::vector<std::string> k_stats = {"0 0 0 1 1.000000"};
  file_path = output_path + "/test-delete-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 1"};
  file_path =
      output_path + "/test-delete-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30"};
  file_path = output_path + "/test-delete-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"2 1533000630 0"};
  file_path = output_path + "/test-delete-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"2 1"};
  file_path = output_path + "/test-delete-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-delete-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 1 1 0 0 0 0 0 3"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of Delete
  std::vector<std::string> get_qps = {"1"};
  file_path = output_path + "/test-delete-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 1",
                                      "The prefix: 0x63 Access count: 1"};
  file_path = output_path + "/test-delete-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);
}

// Test analyzing of Merge
TEST_F(TraceAnalyzerTest, Merge) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/merge";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_merge"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);

  // check the key_stats file
  std::vector<std::string> k_stats = {"0 20 0 1 1.000000"};
  file_path = output_path + "/test-merge-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 1"};
  file_path = output_path + "/test-merge-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
|
|
|
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30"};
  file_path = output_path + "/test-merge-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"5 1533000630 0"};
  file_path = output_path + "/test-merge-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"1 1"};
  file_path = output_path + "/test-merge-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-merge-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 1 1 0 0 1 0 0 4"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of Merge
  std::vector<std::string> get_qps = {"1"};
  file_path = output_path + "/test-merge-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 1",
                                      "The prefix: 0x62 Access count: 1"};
  file_path = output_path + "/test-merge-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);

  // Check the value size distribution
  std::vector<std::string> value_dist = {
      "Number_of_value_size_between 0 and 24 is: 1"};
  file_path =
      output_path + "/test-merge-0-accessed_value_size_distribution.txt";
  CheckFileContent(value_dist, file_path, true);
}
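
The checks in this test pass `true` when every expected line should match exactly, and `false` for files such as the trace sequence and the time series, where only the leading type ID is stable across runs. The following is a minimal sketch of that kind of comparison. It assumes that `false` means only the first character of each line is compared; `FileContentMatches` is a hypothetical stand-in, not the fixture's actual `CheckFileContent`, which is not shown in this part of the file.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the fixture's CheckFileContent helper.
// Returns true when the file at `file_path` has exactly as many lines as
// `expected` and each line matches its expected counterpart.
// Assumption: with full_content == false only the first character of each
// line is compared, so volatile fields such as timestamps are ignored.
bool FileContentMatches(const std::vector<std::string>& expected,
                        const std::string& file_path, bool full_content) {
  std::ifstream in(file_path);
  if (!in.is_open()) {
    return false;
  }
  std::vector<std::string> actual;
  std::string line;
  while (std::getline(in, line)) {
    actual.push_back(line);
  }
  if (actual.size() != expected.size()) {
    return false;
  }
  for (std::size_t i = 0; i < actual.size(); ++i) {
    if (full_content) {
      if (actual[i] != expected[i]) {
        return false;
      }
    } else if (actual[i].substr(0, 1) != expected[i].substr(0, 1)) {
      // Only the leading character (for example the operation type ID)
      // has to agree.
      return false;
    }
  }
  return true;
}
```

Under that assumption, `FileContentMatches({"5 0 0"}, path, false)` accepts a file whose only line is `5 1533000630 0`, while the same call with `true` rejects it.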

// Test analyzing of SingleDelete
TEST_F(TraceAnalyzerTest, SingleDelete) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/single_delete";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_single_delete"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);
  // check the key_stats file
  std::vector<std::string> k_stats = {"0 0 0 1 1.000000"};
  file_path = output_path + "/test-single_delete-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 1"};
  file_path =
      output_path + "/test-single_delete-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30"};
  file_path = output_path + "/test-single_delete-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"3 1533000630 0"};
  file_path = output_path + "/test-single_delete-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"3 1"};
  file_path = output_path + "/test-single_delete-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-single_delete-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 1 1 1 0 1 0 0 5"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of SingleDelete
  std::vector<std::string> get_qps = {"1"};
  file_path = output_path + "/test-single_delete-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 1",
                                      "The prefix: 0x64 Access count: 1"};
  file_path =
      output_path + "/test-single_delete-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);
}
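
Every per-operation output file checked in these tests is named `<output_dir>/test-<operation>-0-<suffix>`, for example `test-single_delete-0-accessed_key_stats.txt`. The sketch below spells that pattern out. `OutputFileName` is a hypothetical helper that is not part of the test file, and reading `test` as an output prefix and `0` as a column family ID is an assumption drawn from the file names alone.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds "<dir>/<prefix>-<op>-<cf_id>-<suffix>".
// Assumption: the "0" in the file names checked by the tests is a column
// family ID and "test" is a configurable output prefix.
std::string OutputFileName(const std::string& output_dir,
                           const std::string& prefix,
                           const std::string& op_name, unsigned cf_id,
                           const std::string& suffix) {
  return output_dir + "/" + prefix + "-" + op_name + "-" +
         std::to_string(cf_id) + "-" + suffix;
}
```

With such a helper, each `file_path = output_path + "/test-single_delete-0-..."` assignment could be written as `OutputFileName(output_path, "test", "single_delete", 0, "accessed_key_stats.txt")`, which keeps the operation name in one place per test.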

// Test analyzing of DeleteRange
TEST_F(TraceAnalyzerTest, DeleteRange) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/range_delete";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_range_delete"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);
  // check the key_stats file
  std::vector<std::string> k_stats = {"0 0 0 1 1.000000", "0 0 1 1 1.000000"};
  file_path = output_path + "/test-range_delete-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 2"};
  file_path =
      output_path + "/test-range_delete-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30",
                                       "1 1 1 1.000000 1.000000 0x65"};
  file_path = output_path + "/test-range_delete-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"4 1533000630 0", "4 1533060100 1"};
  file_path = output_path + "/test-range_delete-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"4 1", "5 1"};
  file_path = output_path + "/test-range_delete-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-range_delete-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 1 1 1 2 1 0 0 7"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of DeleteRange
  std::vector<std::string> get_qps = {"2"};
  file_path = output_path + "/test-range_delete-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 2",
                                      "The prefix: 0x65 Access count: 1",
                                      "The prefix: 0x66 Access count: 1"};
  file_path =
      output_path + "/test-range_delete-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);
}
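
In every expected `test-qps_stats.txt` line in these tests, the final column equals the sum of the eight columns before it; for the DeleteRange test, 1 + 1 + 1 + 1 + 2 + 1 + 0 + 0 = 7. The sketch below checks that relationship for a single line. That the last column is a total across operation types is an inference from the expected values, not something this file states, and `ParseCounts` and `TotalMatchesSum` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Splits a whitespace-separated line of non-negative integers.
std::vector<uint64_t> ParseCounts(const std::string& line) {
  std::istringstream iss(line);
  std::vector<uint64_t> values;
  uint64_t v = 0;
  while (iss >> v) {
    values.push_back(v);
  }
  return values;
}

// Returns true when the last column equals the sum of the preceding columns.
// Assumption: the last column of a qps_stats line is the overall total.
bool TotalMatchesSum(const std::string& line) {
  const std::vector<uint64_t> values = ParseCounts(line);
  if (values.size() < 2) {
    return false;
  }
  uint64_t sum = 0;
  for (std::size_t i = 0; i + 1 < values.size(); ++i) {
    sum += values[i];
  }
  return sum == values.back();
}
```

A check like this can catch an expectation string that was updated for one operation but not for the total, for instance when a new operation type is added to the trace.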

// Test analyzing of Iterator
TEST_F(TraceAnalyzerTest, Iterator) {
  std::string trace_path = test_path_ + "/trace";
  std::string output_path = test_path_ + "/iterator";
  std::string file_path;
  std::vector<std::string> paras = {"-analyze_iterator"};
  paras.push_back("-output_dir=" + output_path);
  paras.push_back("-trace_path=" + trace_path);
  paras.push_back("-key_space_dir=" + test_path_);
  AnalyzeTrace(paras, output_path, trace_path);

  // Check the output of Seek
  // check the key_stats file
  std::vector<std::string> k_stats = {"0 0 0 1 1.000000"};
  file_path = output_path + "/test-iterator_Seek-0-accessed_key_stats.txt";
  CheckFileContent(k_stats, file_path, true);

  // Check the access count distribution
  std::vector<std::string> k_dist = {"access_count: 1 num: 1"};
  file_path =
      output_path + "/test-iterator_Seek-0-accessed_key_count_distribution.txt";
  CheckFileContent(k_dist, file_path, true);

  // Check the trace sequence
  std::vector<std::string> k_sequence = {"1", "5", "2", "3", "4",
                                         "0", "6", "7", "0"};
  file_path = output_path + "/test-human_readable_trace.txt";
  CheckFileContent(k_sequence, file_path, false);

  // Check the prefix
  std::vector<std::string> k_prefix = {"0 0 0 0.000000 0.000000 0x30"};
  file_path = output_path + "/test-iterator_Seek-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  std::vector<std::string> k_series = {"6 1 0"};
  file_path = output_path + "/test-iterator_Seek-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  std::vector<std::string> k_whole_access = {"0 1"};
  file_path = output_path + "/test-iterator_Seek-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  std::vector<std::string> k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63",
                                             "3 0x64", "4 0x65", "5 0x66"};
  file_path = output_path + "/test-iterator_Seek-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the overall qps
  std::vector<std::string> all_qps = {"1 1 1 1 2 1 1 1 9"};
  file_path = output_path + "/test-qps_stats.txt";
  CheckFileContent(all_qps, file_path, true);

  // Check the qps of Iterator_Seek
  std::vector<std::string> get_qps = {"1"};
  file_path = output_path + "/test-iterator_Seek-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  std::vector<std::string> top_qps = {"At time: 0 with QPS: 1",
                                      "The prefix: 0x61 Access count: 1"};
  file_path =
      output_path + "/test-iterator_Seek-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);

  // Check the output of SeekForPrev
// check the key_stats file
|
|
|
|
k_stats = {"0 0 0 1 1.000000"};
|
|
|
|
file_path =
|
|
|
|
output_path + "/test-iterator_SeekForPrev-0-accessed_key_stats.txt";
|
|
|
|
CheckFileContent(k_stats, file_path, true);
|
|
|
|
|
|
|
|
// Check the access count distribution
|
|
|
|
k_dist = {"access_count: 1 num: 1"};
|
|
|
|
file_path =
|
|
|
|
output_path +
|
|
|
|
"/test-iterator_SeekForPrev-0-accessed_key_count_distribution.txt";
|
|
|
|
CheckFileContent(k_dist, file_path, true);
|
|
|
|
|
|
|
|
  // Check the prefix
  k_prefix = {"0 0 0 0.000000 0.000000 0x30"};
  file_path =
      output_path + "/test-iterator_SeekForPrev-0-accessed_key_prefix_cut.txt";
  CheckFileContent(k_prefix, file_path, true);

  // Check the time series
  k_series = {"7 0 0"};
  file_path = output_path + "/test-iterator_SeekForPrev-0-time_series.txt";
  CheckFileContent(k_series, file_path, false);

  // Check the accessed key in whole key space
  k_whole_access = {"1 1"};
  file_path = output_path + "/test-iterator_SeekForPrev-0-whole_key_stats.txt";
  CheckFileContent(k_whole_access, file_path, true);

  // Check the whole key prefix cut
  k_whole_prefix = {"0 0x61", "1 0x62", "2 0x63", "3 0x64", "4 0x65", "5 0x66"};
  file_path =
      output_path + "/test-iterator_SeekForPrev-0-whole_key_prefix_cut.txt";
  CheckFileContent(k_whole_prefix, file_path, true);

  // Check the qps of Iterator_SeekForPrev
  get_qps = {"1"};
  file_path = output_path + "/test-iterator_SeekForPrev-0-qps_stats.txt";
  CheckFileContent(get_qps, file_path, true);

  // Check the top k qps prefix cut
  top_qps = {"At time: 0 with QPS: 1", "The prefix: 0x62 Access count: 1"};
  file_path = output_path +
              "/test-iterator_SeekForPrev-0-accessed_top_k_qps_prefix_cut.txt";
  CheckFileContent(top_qps, file_path, true);
}
}  // namespace rocksdb

int main(int argc, char** argv) {
  ::testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();
}
#endif  // GFLAG
#else
#include <stdio.h>

int main(int /*argc*/, char** /*argv*/) {
  fprintf(stderr, "Trace_analyzer test is not supported in ROCKSDB_LITE\n");
  return 0;
}

#endif  // !ROCKSDB_LITE