Advisor: README and blog, and also tests for DBBenchRunner, DatabaseOptions (#4201)

Summary:
This pull request adds a README file and a blog post for the Advisor tool. It also adds the missing tests for some Optimizer modules. Some comments are added to the classes being tested for improved readability.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4201

Reviewed By: maysamyabandeh

Differential Revision: D9125311

Pulled By: poojam23

fbshipit-source-id: aefcf2f06eaa05490cc2834ef5aa6e21f0d1dc55
main
Pooja Malik 6 years ago committed by Facebook Github Bot
parent f8f6983f89
commit 892a156267
  1. docs/_posts/2018-08-01-rocksdb-tuning-advisor.markdown (58 changed lines)
  2. tools/advisor/README.md (96 changed lines)
  3. tools/advisor/advisor/db_bench_runner.py (141 changed lines)
  4. tools/advisor/advisor/db_options_parser.py (93 changed lines)
  5. tools/advisor/advisor/db_stats_fetcher.py (101 changed lines)
  6. tools/advisor/advisor/rule_parser_example.py (89 changed lines)
  7. tools/advisor/test/input_files/OPTIONS-000005 (20 changed lines)
  8. tools/advisor/test/input_files/log_stats_parser_keys_ts (3 changed lines)
  9. tools/advisor/test/input_files/test_rules.ini (2 changed lines)
  10. tools/advisor/test/test_db_bench_runner.py (147 changed lines)
  11. tools/advisor/test/test_db_log_parser.py (5 changed lines)
  12. tools/advisor/test/test_db_options_parser.py (216 changed lines)
  13. tools/advisor/test/test_db_stats_fetcher.py (126 changed lines)
  14. tools/advisor/test/test_rule_parser.py (10 changed lines)

@ -0,0 +1,58 @@
---
title: Rocksdb Tuning Advisor
layout: post
author: poojam23
category: blog
---
The performance of Rocksdb is contingent on its tuning. However, because
of the complexity of its underlying technology and a large number of
configurable parameters, a good configuration is sometimes hard to obtain. The aim of
the Python command-line tool, Rocksdb Advisor, is to automate the process of
suggesting improvements in the configuration based on advice from Rocksdb
experts.
### Overview
Experts share their wisdom as rules comprising conditions and suggestions in the INI format (refer to
[rules.ini](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rules.ini)).
Users provide the Rocksdb configuration that they want to improve upon (as the
familiar Rocksdb OPTIONS file —
[example](https://github.com/facebook/rocksdb/blob/master/examples/rocksdb_option_file_example.ini))
and the path of the file which contains Rocksdb logs and statistics.
The [Advisor](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rule_parser_example.py)
creates appropriate DataSource objects (for Rocksdb
[logs](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/db_log_parser.py),
[options](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/db_options_parser.py),
[statistics](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/db_stats_fetcher.py) etc.)
and provides them to the [Rules Engine](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rule_parser.py).
The Rules Engine uses the expert-specified rules to parse these data sources and trigger the appropriate rules.
The Advisor's output gives information about which rules were triggered,
why they were triggered and what each of them suggests. Each suggestion
provided by a triggered rule advises some action on a Rocksdb
configuration option, for example, increase CFOptions.write_buffer_size,
set bloom_bits to 2 etc.
### Usage
An example command to run the tool:
```shell
cd rocksdb/tools/advisor
python3 -m advisor.rule_parser_example --rules_spec=advisor/rules.ini --rocksdb_options=test/input_files/OPTIONS-000005 --log_files_path_prefix=test/input_files/LOG-0 --stats_dump_period_sec=20
```
Sample output where a Rocksdb log-based rule has been triggered:
```shell
Rule: stall-too-many-memtables
LogCondition: stall-too-many-memtables regex: Stopping writes because we have \d+ immutable memtables \(waiting for flush\), max_write_buffer_number is set to \d+
Suggestion: inc-bg-flush option : DBOptions.max_background_flushes action : increase suggested_values : ['2']
Suggestion: inc-write-buffer option : CFOptions.max_write_buffer_number action : increase
scope: col_fam:
{'default'}
```
### Read more
For more information, refer to [advisor](https://github.com/facebook/rocksdb/tree/master/tools/advisor/README.md).

@ -0,0 +1,96 @@
# Rocksdb Tuning Advisor
## Motivation
The performance of Rocksdb is contingent on its tuning. However,
because of the complexity of its underlying technology and a large number of
configurable parameters, a good configuration is sometimes hard to obtain. The aim of
the Python command-line tool, Rocksdb Advisor, is to automate the process of
suggesting improvements in the configuration based on advice from Rocksdb
experts.
## Overview
Experts share their wisdom as rules comprising conditions and suggestions in the INI format (refer to
[rules.ini](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rules.ini)).
Users provide the Rocksdb configuration that they want to improve upon (as the
familiar Rocksdb OPTIONS file —
[example](https://github.com/facebook/rocksdb/blob/master/examples/rocksdb_option_file_example.ini))
and the path of the file which contains Rocksdb logs and statistics.
The [Advisor](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rule_parser_example.py)
creates appropriate DataSource objects (for Rocksdb
[logs](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/db_log_parser.py),
[options](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/db_options_parser.py),
[statistics](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/db_stats_fetcher.py) etc.)
and provides them to the [Rules Engine](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rule_parser.py).
The Rules Engine uses the expert-specified rules to parse these data sources and trigger the appropriate rules.
The Advisor's output gives information about which rules were triggered,
why they were triggered and what each of them suggests. Each suggestion
provided by a triggered rule advises some action on a Rocksdb
configuration option, for example, increase CFOptions.write_buffer_size,
set bloom_bits to 2 etc.
## Usage
### Prerequisites
The tool needs the following to run:
* python3
### Running the tool
An example command to run the tool:
```shell
cd rocksdb/tools/advisor
python3 -m advisor.rule_parser_example --rules_spec=advisor/rules.ini --rocksdb_options=test/input_files/OPTIONS-000005 --log_files_path_prefix=test/input_files/LOG-0 --stats_dump_period_sec=20
```
### Command-line arguments
The most important inputs that the Advisor needs are the rules
spec and the starting Rocksdb configuration. The configuration is provided as the
familiar Rocksdb Options file (refer [example](https://github.com/facebook/rocksdb/blob/master/examples/rocksdb_option_file_example.ini)).
The Rules spec is written in the INI format (more details in
[rules.ini](https://github.com/facebook/rocksdb/blob/master/tools/advisor/advisor/rules.ini)).
In brief, a Rule is made of conditions and is triggered when all its
constituent conditions are triggered. When triggered, a Rule suggests changes
(increase/decrease/set to a suggested value) to certain Rocksdb options that
aim to improve Rocksdb performance. Every Condition has a 'source', i.e.
the data source that is checked to decide whether that condition triggers.
For example, a log Condition (with 'source=LOG') is triggered if a particular
'regex' is found in the Rocksdb LOG files. As of now the Rules Engine
supports 3 types of Conditions (and consequently data-sources):
LOG, OPTIONS, TIME_SERIES. The TIME_SERIES data can be sourced from the
Rocksdb [statistics](https://github.com/facebook/rocksdb/blob/master/include/rocksdb/statistics.h)
or [perf context](https://github.com/facebook/rocksdb/blob/master/include/rocksdb/perf_context.h).
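For illustration, a rule of the kind shown in the sample output below might be spelled
out roughly as follows in the rules spec. This is only a sketch: the section and field
names follow the format visible in this change (test_rules.ini and the Advisor output),
but the exact rule shipped in rules.ini may differ.
```ini
[Rule "stall-too-many-memtables"]
suggestions=inc-bg-flush:inc-write-buffer
conditions=stall-too-many-memtables

[Condition "stall-too-many-memtables"]
source=LOG
regex=Stopping writes because we have \d+ immutable memtables \(waiting for flush\), max_write_buffer_number is set to \d+

[Suggestion "inc-bg-flush"]
option=DBOptions.max_background_flushes
action=increase
suggested_values=2

[Suggestion "inc-write-buffer"]
option=CFOptions.max_write_buffer_number
action=increase
```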
For more information about the remaining command-line arguments, run:
```shell
cd rocksdb/tools/advisor
python3 -m advisor.rule_parser_example --help
```
### Sample output
Here, a Rocksdb log-based rule has been triggered:
```shell
Rule: stall-too-many-memtables
LogCondition: stall-too-many-memtables regex: Stopping writes because we have \d+ immutable memtables \(waiting for flush\), max_write_buffer_number is set to \d+
Suggestion: inc-bg-flush option : DBOptions.max_background_flushes action : increase suggested_values : ['2']
Suggestion: inc-write-buffer option : CFOptions.max_write_buffer_number action : increase
scope: col_fam:
{'default'}
```
## Running the tests
Tests for the code have been added to the
[test/](https://github.com/facebook/rocksdb/tree/master/tools/advisor/test)
directory. For example, to run the unit tests for db_log_parser.py:
```shell
cd rocksdb/tools/advisor
python3 -m unittest -v test.test_db_log_parser
```

@ -1,11 +1,8 @@
from advisor.bench_runner import BenchmarkRunner
from advisor.db_log_parser import DataSource, DatabaseLogs, NO_COL_FAMILY
from advisor.db_options_parser import DatabaseOptions
from advisor.db_stats_fetcher import (
LogStatsParser, OdsStatsFetcher, DatabasePerfContext
)
import os
import re
import shutil
import subprocess
import time
@ -30,6 +27,8 @@ class DBBenchRunner(BenchmarkRunner):
@staticmethod
def get_opt_args_str(misc_options_dict):
# given a dictionary of options and their values, return a string
# that can be appended as command-line arguments
optional_args_str = ""
for option_name, option_value in misc_options_dict.items():
if option_value:
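The behavior expected of this helper is pinned down by `test_get_opt_args_str` further below: options with a falsy value are skipped and the rest are rendered as ` --name=value` pairs. A minimal standalone sketch of that conversion (illustrative, not the committed implementation):
```python
def opt_args_str(misc_options_dict):
    # render " --<option>=<value>" for every option that has a value
    args = ""
    for option_name, option_value in misc_options_dict.items():
        if option_value:
            args += " --" + option_name + "=" + str(option_value)
    return args

# mirrors the expectation in test_get_opt_args_str
assert opt_args_str(
    {'bloom_bits': 2, 'empty_opt': None, 'rate_limiter': 3}
) == ' --bloom_bits=2 --rate_limiter=3'
```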
@ -43,12 +42,10 @@ class DBBenchRunner(BenchmarkRunner):
self.db_bench_binary = positional_args[0]
self.benchmark = positional_args[1]
self.db_bench_args = None
# TODO(poojam23): move to unittest with method get_available_workloads
self.supported_benchmarks = None
if len(positional_args) > 2:
# options list with each option given as "<option>=<value>"
self.db_bench_args = positional_args[2:]
# save ods_args, if provided
self.ods_args = ods_args
def _parse_output(self, get_perf_context=False):
@ -67,21 +64,29 @@ class DBBenchRunner(BenchmarkRunner):
with open(self.OUTPUT_FILE, 'r') as fp:
for line in fp:
if line.startswith(self.benchmark):
# line from sample output:
# readwhilewriting : 16.582 micros/op 60305 ops/sec; \
# 4.2 MB/s (3433828 of 5427999 found)\n
print(line)  # print output of the benchmark run
token_list = line.strip().split()
for ix, token in enumerate(token_list):
if token.startswith(self.THROUGHPUT):
# in above example, throughput = 60305 ops/sec
output[self.THROUGHPUT] = (
float(token_list[ix - 1])
)
break
elif get_perf_context and line.startswith(self.PERF_CON):
# the following lines in the output contain perf context
# statistics (refer example above)
perf_context_begins = True
elif get_perf_context and perf_context_begins:
# Sample perf_context output:
# user_key_comparison_count = 500, block_cache_hit_count =\
# 468, block_read_count = 580, block_read_byte = 445, ...
token_list = line.strip().split(',')
# token_list = ['user_key_comparison_count = 500',
# 'block_cache_hit_count = 468','block_read_count = 580'...
perf_context = {
tk.split('=')[0].strip(): tk.split('=')[1].strip()
for tk in token_list
@ -99,6 +104,8 @@ class DBBenchRunner(BenchmarkRunner):
output[self.PERF_CON] = perf_context_ts
perf_context_begins = False
elif line.startswith(self.DB_PATH):
# line from sample output:
# DB path: [/tmp/rocksdbtest-155919/dbbench]\n
output[self.DB_PATH] = (
line.split('[')[1].split(']')[0]
)
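Putting the throughput parsing together: the summary line is tokenized on whitespace and the value just before the `ops/sec` marker is taken as the throughput. A small self-contained sketch using the sample line from the comments above (the real code reads `self.THROUGHPUT`, assumed here to be the `ops/sec` token):
```python
def parse_throughput(line):
    # e.g. "readwhilewriting : 16.582 micros/op 60305 ops/sec; 4.2 MB/s (...)"
    token_list = line.strip().split()
    for ix, token in enumerate(token_list):
        if token.startswith('ops/sec'):
            # the token preceding "ops/sec" is the throughput value
            return float(token_list[ix - 1])
    return None

sample = (
    "readwhilewriting : 16.582 micros/op 60305 ops/sec; "
    "4.2 MB/s (3433828 of 5427999 found)"
)
assert parse_throughput(sample) == 60305.0
```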
@ -111,8 +118,10 @@ class DBBenchRunner(BenchmarkRunner):
stats_freq_sec = None
logs_file_prefix = None
# fetch frequency at which the stats are dumped in the Rocksdb logs
dump_period = 'DBOptions.stats_dump_period_sec'
# fetch the directory, if specified, in which the Rocksdb logs are
# dumped, by default logs are dumped in same location as database
log_dir = 'DBOptions.db_log_dir'
log_options = db_options.get_options([dump_period, log_dir])
if dump_period in log_options:
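The two options fetched here determine where the LOG files are looked for and how often statistics appear in them. The expectations in `test_get_log_options` and `test_get_info_log_file_name` below suggest how the info-log file name is derived when `db_log_dir` is set; a sketch of that derivation, inferred from the tests rather than copied from the implementation:
```python
def info_log_file_name(log_dir, db_path):
    # without db_log_dir the info log is simply '<db_path>/LOG'
    if not log_dir:
        return 'LOG'
    # with db_log_dir set, the db path is encoded into the file name:
    # '/tmp/rocksdbtest-155919/dbbench/' -> 'tmp_rocksdbtest-155919_dbbench_LOG'
    return db_path.strip('/').replace('/', '_') + '_LOG'

assert info_log_file_name(None, 'random_path') == 'LOG'
assert info_log_file_name(
    '/dev/shm/', '/tmp/rocksdbtest-155919/dbbench/'
) == 'tmp_rocksdbtest-155919_dbbench_LOG'
```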
@ -153,6 +162,7 @@ class DBBenchRunner(BenchmarkRunner):
shutil.rmtree(db_path, ignore_errors=True)
except OSError as e:
print('Error: rmdir ' + e.filename + ' ' + e.strerror)
# setup database with a million keys using the fillrandom benchmark
command = "%s --benchmarks=fillrandom --db=%s --num=1000000" % (
self.db_bench_binary, db_path
)
@ -164,15 +174,16 @@ class DBBenchRunner(BenchmarkRunner):
command = "%s --benchmarks=%s --statistics --perf_level=3 --db=%s" % ( command = "%s --benchmarks=%s --statistics --perf_level=3 --db=%s" % (
self.db_bench_binary, self.benchmark, db_path self.db_bench_binary, self.benchmark, db_path
) )
# fetch the command-line arguments string for providing Rocksdb options
args_str = self._get_options_command_line_args_str(curr_options) args_str = self._get_options_command_line_args_str(curr_options)
# handle the command-line args passed in the constructor # handle the command-line args passed in the constructor, these
# arguments are specific to db_bench
for cmd_line_arg in self.db_bench_args: for cmd_line_arg in self.db_bench_args:
args_str += (" --" + cmd_line_arg) args_str += (" --" + cmd_line_arg)
command += args_str command += args_str
return command return command
def _run_command(self, command): def _run_command(self, command):
# run db_bench and return the
out_file = open(self.OUTPUT_FILE, "w+") out_file = open(self.OUTPUT_FILE, "w+")
err_file = open(self.ERROR_FILE, "w+") err_file = open(self.ERROR_FILE, "w+")
print('executing... - ' + command) print('executing... - ' + command)
@ -181,18 +192,21 @@ class DBBenchRunner(BenchmarkRunner):
err_file.close()
def run_experiment(self, db_options, db_path):
# setup the Rocksdb database before running experiment
self._setup_db_before_experiment(db_options, db_path)
# get the command to run the experiment
command = self._build_experiment_command(db_options, db_path)
# run experiment
self._run_command(command)
# parse the db_bench experiment output
parsed_output = self._parse_output(get_perf_context=True)
# get the log files path prefix and frequency at which Rocksdb stats
# are dumped in the logs
logs_file_prefix, stats_freq_sec = self.get_log_options(
db_options, parsed_output[self.DB_PATH]
)
# create the Rocksdb LOGS object
db_logs = DatabaseLogs(
logs_file_prefix, db_options.get_column_families()
)
@ -202,6 +216,7 @@ class DBBenchRunner(BenchmarkRunner):
db_perf_context = DatabasePerfContext(
parsed_output[self.PERF_CON], 0, False
)
# create the data-sources dictionary
data_sources = {
DataSource.Type.DB_OPTIONS: [db_options],
DataSource.Type.LOG: [db_logs],
@ -214,99 +229,5 @@ class DBBenchRunner(BenchmarkRunner):
self.ods_args['entity'],
self.ods_args['key_prefix']
))
# return the experiment's data-sources and throughput
return data_sources, parsed_output[self.THROUGHPUT]
# TODO: this method is for testing, shift it out to unit-tests when ready
def get_available_workloads(self):
if not self.supported_benchmarks:
self.supported_benchmarks = []
command = '%s --help' % self.db_bench_binary
self._run_command(command)
with open(self.OUTPUT_FILE, 'r') as fp:
start = False
for line in fp:
if re.search('available benchmarks', line, re.IGNORECASE):
start = True
continue
elif start:
if re.search('meta operations', line, re.IGNORECASE):
break
benchmark_info = line.strip()
if benchmark_info:
token_list = benchmark_info.split()
if len(token_list) > 2 and token_list[1] == '--':
self.supported_benchmarks.append(token_list[0])
else:
continue
self.supported_benchmarks = sorted(self.supported_benchmarks)
return self.supported_benchmarks
# TODO: remove this method, used only for testing
def main():
pos_args = [
'/home/poojamalik/workspace/rocksdb/db_bench',
'readwhilewriting',
'use_existing_db=true',
'duration=10'
]
db_bench_helper = DBBenchRunner(pos_args)
# populate benchmarks with the available ones in the db_bench tool
benchmarks = db_bench_helper.get_available_workloads()
print(benchmarks)
print()
options_file = (
'/home/poojamalik/workspace/rocksdb/tools/advisor/temp/' +
'OPTIONS_temp.tmp'
)
misc_options = ["rate_limiter_bytes_per_sec=1024000", "bloom_bits=2"]
db_options = DatabaseOptions(options_file, misc_options)
data_sources, _ = db_bench_helper.run_experiment(db_options)
print(data_sources[DataSource.Type.DB_OPTIONS][0].options_dict)
print()
print(data_sources[DataSource.Type.LOG][0].logs_path_prefix)
if os.path.isfile(data_sources[DataSource.Type.LOG][0].logs_path_prefix):
print('log file exists!')
else:
print('error: log file does not exist!')
print(data_sources[DataSource.Type.LOG][0].column_families)
print()
print(data_sources[DataSource.Type.TIME_SERIES][0].logs_file_prefix)
if (
os.path.isfile(
data_sources[DataSource.Type.TIME_SERIES][0].logs_file_prefix
)
):
print('log file exists!')
else:
print('error: log file does not exist!')
print(data_sources[DataSource.Type.TIME_SERIES][0].stats_freq_sec)
print(data_sources[DataSource.Type.TIME_SERIES][1].keys_ts)
db_options = DatabaseOptions(options_file, None)
data_sources, _ = db_bench_helper.run_experiment(db_options)
print(data_sources[DataSource.Type.DB_OPTIONS][0].options_dict)
print()
print(data_sources[DataSource.Type.LOG][0].logs_path_prefix)
if os.path.isfile(data_sources[DataSource.Type.LOG][0].logs_path_prefix):
print('log file exists!')
else:
print('error: log file does not exist!')
print(data_sources[DataSource.Type.LOG][0].column_families)
print()
print(data_sources[DataSource.Type.TIME_SERIES][0].logs_file_prefix)
if (
os.path.isfile(
data_sources[DataSource.Type.TIME_SERIES][0].logs_file_prefix
)
):
print('log file exists!')
else:
print('error: log file does not exist!')
print(data_sources[DataSource.Type.TIME_SERIES][0].stats_freq_sec)
print(data_sources[DataSource.Type.TIME_SERIES][1].keys_ts)
print(data_sources[DataSource.Type.TIME_SERIES][1].stats_freq_sec)
if __name__ == "__main__":
main()

@ -6,7 +6,6 @@
import copy
from advisor.db_log_parser import DataSource, NO_COL_FAMILY
from advisor.ini_parser import IniParser
from advisor.rule_parser import Condition, OptionCondition
import os
@ -28,10 +27,12 @@ class OptionsSpecParser(IniParser):
@staticmethod
def get_section_name(line):
# example: get_section_name('[CFOptions "default"]')
token_list = line.strip()[1:-1].split('"')
# token_list = ['CFOptions', 'default', '']
if len(token_list) < 3:
return None
return token_list[1]  # return 'default'
@staticmethod
def get_section_str(section_type, section_name):
@ -73,6 +74,9 @@ class DatabaseOptions(DataSource):
@staticmethod
def is_misc_option(option_name):
# these are miscellaneous options that are not yet supported by the
# Rocksdb options file, hence they are not prefixed with any section
# name
return '.' not in option_name
@staticmethod
@ -352,88 +356,3 @@ class DatabaseOptions(DataSource):
# field
if col_fam_options_dict:
cond.set_trigger(col_fam_options_dict)
# TODO(poojam23): remove these methods once the unit tests for this class are
# in place
def main():
options_file = 'temp/OPTIONS_default.tmp'
misc_options = ["misc_opt1=10", "misc_opt2=100", "misc_opt3=1000"]
db_options = DatabaseOptions(options_file, misc_options)
print(db_options.get_column_families())
get_op = db_options.get_options([
'DBOptions.db_log_dir',
'DBOptions.is_fd_close_on_exec',
'CFOptions.memtable_prefix_bloom_size_ratio',
'TableOptions.BlockBasedTable.verify_compression',
'misc_opt1',
'misc_opt3'
])
print(get_op)
get_op['DBOptions.db_log_dir'][NO_COL_FAMILY] = 'some_random_path'
get_op['CFOptions.memtable_prefix_bloom_size_ratio']['default'] = 2.31
get_op['TableOptions.BlockBasedTable.verify_compression']['default'] = 4.4
get_op['misc_opt2'] = {}
get_op['misc_opt2'][NO_COL_FAMILY] = 2
db_options.update_options(get_op)
print('options updated in ' + db_options.generate_options_config(123))
print('misc options ' + repr(db_options.get_misc_options()))
options_file = 'temp/OPTIONS_123.tmp'
db_options = DatabaseOptions(options_file, misc_options)
# only CFOptions
cond1 = Condition('opt-cond-1')
cond1 = OptionCondition.create(cond1)
cond1.set_parameter(
'options', [
'CFOptions.level0_file_num_compaction_trigger',
'CFOptions.write_buffer_size',
'CFOptions.max_bytes_for_level_base'
]
)
cond1.set_parameter(
'evaluate',
'int(options[0])*int(options[1])-int(options[2])>=0'
)
# only DBOptions
cond2 = Condition('opt-cond-2')
cond2 = OptionCondition.create(cond2)
cond2.set_parameter(
'options', [
'DBOptions.max_file_opening_threads',
'DBOptions.table_cache_numshardbits',
'misc_opt2',
'misc_opt3'
]
)
cond2_expr = (
'(int(options[0])*int(options[2]))-' +
'((4*int(options[1])*int(options[3]))/10)==0'
)
cond2.set_parameter('evaluate', cond2_expr)
# mix of CFOptions and DBOptions
cond3 = Condition('opt-cond-3')
cond3 = OptionCondition.create(cond3)
cond3.set_parameter(
'options', [
'DBOptions.max_background_jobs', # 2
'DBOptions.write_thread_slow_yield_usec', # 3
'CFOptions.num_levels', # 7
'misc_opt1' # 10
]
)
cond3_expr = (
'(int(options[3])*int(options[2]))-' +
'(int(options[1])*int(options[0]))==64'
)
cond3.set_parameter('evaluate', cond3_expr)
db_options.check_and_trigger_conditions([cond1, cond2, cond3])
print(cond1.get_trigger()) # {'col-fam-B': ['4', '10', '10']}
print(cond2.get_trigger()) # {'DB_WIDE': ['16', '4']}
# {'col-fam-B': ['2', '3', '10'], 'col-fam-A': ['2', '3', '7']}
print(cond3.get_trigger())
if __name__ == "__main__":
main()

@ -5,7 +5,6 @@
from advisor.db_log_parser import Log
from advisor.db_timeseries_parser import TimeSeriesData, NO_ENTITY
from advisor.rule_parser import Condition, TimeSeriesCondition
import copy
import glob
import re
@ -123,7 +122,7 @@ class LogStatsParser(TimeSeriesData):
class DatabasePerfContext(TimeSeriesData):
# TODO(poojam23): check if any benchrunner provides PerfContext sampled at
# regular intervals
def __init__(self, perf_context_ts, stats_freq_sec, cumulative):
'''
perf_context_ts is expected to be in the following format:
Dict[metric, Dict[timestamp, value]], where for
@ -321,101 +320,3 @@ class OdsStatsFetcher(TimeSeriesData):
with open(self.OUTPUT_FILE, 'r') as fp:
url = fp.readline()
return url
# TODO(poojam23): remove these blocks once the unittests for LogStatsParser are
# in place
def main():
# populating the statistics
log_stats = LogStatsParser('temp/db_stats_fetcher_main_LOG.tmp', 20)
print(log_stats.type)
print(log_stats.keys_ts)
print(log_stats.logs_file_prefix)
print(log_stats.stats_freq_sec)
print(log_stats.duration_sec)
statistics = [
'rocksdb.number.rate_limiter.drains.count',
'rocksdb.number.block.decompressed.count',
'rocksdb.db.get.micros.p50',
'rocksdb.manifest.file.sync.micros.p99',
'rocksdb.db.get.micros.p99'
]
log_stats.fetch_timeseries(statistics)
print()
print(log_stats.keys_ts)
# aggregated statistics
print()
print(log_stats.fetch_aggregated_values(
NO_ENTITY, statistics, TimeSeriesData.AggregationOperator.latest
))
print(log_stats.fetch_aggregated_values(
NO_ENTITY, statistics, TimeSeriesData.AggregationOperator.oldest
))
print(log_stats.fetch_aggregated_values(
NO_ENTITY, statistics, TimeSeriesData.AggregationOperator.max
))
print(log_stats.fetch_aggregated_values(
NO_ENTITY, statistics, TimeSeriesData.AggregationOperator.min
))
print(log_stats.fetch_aggregated_values(
NO_ENTITY, statistics, TimeSeriesData.AggregationOperator.avg
))
# condition 'evaluate_expression' that evaluates to true
cond1 = Condition('cond-1')
cond1 = TimeSeriesCondition.create(cond1)
cond1.set_parameter('keys', statistics)
cond1.set_parameter('behavior', 'evaluate_expression')
cond1.set_parameter('evaluate', 'keys[3]-keys[2]>=0')
cond1.set_parameter('aggregation_op', 'avg')
# condition 'evaluate_expression' that evaluates to false
cond2 = Condition('cond-2')
cond2 = TimeSeriesCondition.create(cond2)
cond2.set_parameter('keys', statistics)
cond2.set_parameter('behavior', 'evaluate_expression')
cond2.set_parameter('evaluate', '((keys[1]-(2*keys[0]))/100)<3000')
cond2.set_parameter('aggregation_op', 'latest')
# condition 'evaluate_expression' that evaluates to true; no aggregation_op
cond3 = Condition('cond-3')
cond3 = TimeSeriesCondition.create(cond3)
cond3.set_parameter('keys', [statistics[2], statistics[3]])
cond3.set_parameter('behavior', 'evaluate_expression')
cond3.set_parameter('evaluate', '(keys[1]/keys[0])>23')
# check remaining methods
conditions = [cond1, cond2, cond3]
print()
print(log_stats.get_keys_from_conditions(conditions))
log_stats.check_and_trigger_conditions(conditions)
print()
print(cond1.get_trigger())
print(cond2.get_trigger())
print(cond3.get_trigger())
# TODO(poojam23): shift this code to the unit tests for DatabasePerfContext
def check_perf_context_code():
string = (
" user_key_comparison_count = 675903942, " +
"block_cache_hit_count = 830086, " +
"get_from_output_files_time = 85088293818, " +
"seek_on_memtable_time = 0,"
)
token_list = string.split(',')
perf_context = {
token.split('=')[0].strip(): int(token.split('=')[1].strip())
for token in token_list
if token
}
timestamp = int(time.time())
perf_ts = {}
for key in perf_context:
perf_ts[key] = {}
start_val = perf_context[key]
for ix in range(5):
perf_ts[key][timestamp+(ix*10)] = start_val + (2 * ix)
db_perf_context = DatabasePerfContext(perf_ts, 10, True)
print(db_perf_context.keys_ts)
if __name__ == '__main__':
main()
check_perf_context_code()

@ -0,0 +1,89 @@
# Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
# This source code is licensed under both the GPLv2 (found in the
# COPYING file in the root directory) and Apache 2.0 License
# (found in the LICENSE.Apache file in the root directory).
from advisor.rule_parser import RulesSpec
from advisor.db_log_parser import DatabaseLogs, DataSource
from advisor.db_options_parser import DatabaseOptions
from advisor.db_stats_fetcher import LogStatsParser, OdsStatsFetcher
import argparse
def main(args):
# initialise the RulesSpec parser
rule_spec_parser = RulesSpec(args.rules_spec)
rule_spec_parser.load_rules_from_spec()
rule_spec_parser.perform_section_checks()
# initialize the DatabaseOptions object
db_options = DatabaseOptions(args.rocksdb_options)
# Create DatabaseLogs object
db_logs = DatabaseLogs(
args.log_files_path_prefix, db_options.get_column_families()
)
# Create the Log STATS object
db_log_stats = LogStatsParser(
args.log_files_path_prefix, args.stats_dump_period_sec
)
data_sources = {
DataSource.Type.DB_OPTIONS: [db_options],
DataSource.Type.LOG: [db_logs],
DataSource.Type.TIME_SERIES: [db_log_stats]
}
if args.ods_client:
data_sources[DataSource.Type.TIME_SERIES].append(OdsStatsFetcher(
args.ods_client,
args.ods_entity,
args.ods_tstart,
args.ods_tend,
args.ods_key_prefix
))
triggered_rules = rule_spec_parser.get_triggered_rules(
data_sources, db_options.get_column_families()
)
rule_spec_parser.print_rules(triggered_rules)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Use this script to get\
suggestions for improving Rocksdb performance.')
parser.add_argument(
'--rules_spec', required=True, type=str,
help='path of the file containing the expert-specified Rules'
)
parser.add_argument(
'--rocksdb_options', required=True, type=str,
help='path of the starting Rocksdb OPTIONS file'
)
parser.add_argument(
'--log_files_path_prefix', required=True, type=str,
help='path prefix of the Rocksdb LOG files'
)
parser.add_argument(
'--stats_dump_period_sec', required=True, type=int,
help='the frequency (in seconds) at which STATISTICS are printed to ' +
'the Rocksdb LOG file'
)
# ODS arguments
parser.add_argument(
'--ods_client', type=str, help='the ODS client binary'
)
parser.add_argument(
'--ods_entity', type=str,
help='the servers for which the ODS stats need to be fetched'
)
parser.add_argument(
'--ods_key_prefix', type=str,
help='the prefix that needs to be attached to the keys of time ' +
'series to be fetched from ODS'
)
parser.add_argument(
'--ods_tstart', type=int,
help='start time of timeseries to be fetched from ODS'
)
parser.add_argument(
'--ods_tend', type=int,
help='end time of timeseries to be fetched from ODS'
)
args = parser.parse_args()
main(args)

@ -12,7 +12,8 @@
manual_wal_flush=false
allow_ingest_behind=false
db_write_buffer_size=0
db_log_dir=
random_access_max_buffer_size=1048576
[CFOptions "default"]
ttl=0
@ -29,3 +30,20 @@
[TableOptions/BlockBasedTable "default"] [TableOptions/BlockBasedTable "default"]
block_align=false block_align=false
index_type=kBinarySearch index_type=kBinarySearch
[CFOptions "col_fam_A"]
ttl=0
max_bytes_for_level_base=268435456
max_bytes_for_level_multiplier=10.000000
level0_file_num_compaction_trigger=5
level0_stop_writes_trigger=36
write_buffer_size=1024000
min_write_buffer_number_to_merge=1
num_levels=5
compaction_filter_factory=nullptr
compaction_style=kCompactionStyleLevel
[TableOptions/BlockBasedTable "col_fam_A"]
block_align=true
block_restart_interval=16
index_type=kBinarySearch

@ -0,0 +1,3 @@
rocksdb.number.block.decompressed.count: 1530896335 88.0, 1530896361 788338.0, 1530896387 1539256.0, 1530896414 2255696.0, 1530896440 3009325.0, 1530896466 3767183.0, 1530896492 4529775.0, 1530896518 5297809.0, 1530896545 6033802.0, 1530896570 6794129.0
rocksdb.db.get.micros.p50: 1530896335 295.5, 1530896361 16.561841, 1530896387 16.20677, 1530896414 16.31508, 1530896440 16.346602, 1530896466 16.284669, 1530896492 16.16005, 1530896518 16.069096, 1530896545 16.028746, 1530896570 15.9638
rocksdb.manifest.file.sync.micros.p99: 1530896335 649.0, 1530896361 835.0, 1530896387 1435.0, 1530896414 9938.0, 1530896440 9938.0, 1530896466 9938.0, 1530896492 9938.0, 1530896518 1882.0, 1530896545 1837.0, 1530896570 1792.0

@ -32,7 +32,7 @@ regex=Stalling writes because of estimated pending compaction bytes \d+
[Condition "options-1-false"] [Condition "options-1-false"]
source=OPTIONS source=OPTIONS
options=CFOptions.level0_file_num_compaction_trigger:CFOptions.write_buffer_size:random_access_max_buffer_size options=CFOptions.level0_file_num_compaction_trigger:CFOptions.write_buffer_size:DBOptions.random_access_max_buffer_size
evaluate=int(options[0])*int(options[1])-int(options[2])<0 # should evaluate to a boolean evaluate=int(options[0])*int(options[1])-int(options[2])<0 # should evaluate to a boolean
[Suggestion "inc-bg-flush"] [Suggestion "inc-bg-flush"]

@ -0,0 +1,147 @@
# Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
# This source code is licensed under both the GPLv2 (found in the
# COPYING file in the root directory) and Apache 2.0 License
# (found in the LICENSE.Apache file in the root directory).
from advisor.db_bench_runner import DBBenchRunner
from advisor.db_log_parser import NO_COL_FAMILY, DataSource
from advisor.db_options_parser import DatabaseOptions
import os
import unittest
class TestDBBenchRunnerMethods(unittest.TestCase):
def setUp(self):
self.pos_args = [
'./../../db_bench',
'overwrite',
'use_existing_db=true',
'duration=10'
]
self.bench_runner = DBBenchRunner(self.pos_args)
this_path = os.path.abspath(os.path.dirname(__file__))
options_path = os.path.join(this_path, 'input_files/OPTIONS-000005')
self.db_options = DatabaseOptions(options_path)
def test_setup(self):
self.assertEqual(self.bench_runner.db_bench_binary, self.pos_args[0])
self.assertEqual(self.bench_runner.benchmark, self.pos_args[1])
self.assertSetEqual(
set(self.bench_runner.db_bench_args), set(self.pos_args[2:])
)
def test_get_info_log_file_name(self):
log_file_name = DBBenchRunner.get_info_log_file_name(
None, 'random_path'
)
self.assertEqual(log_file_name, 'LOG')
log_file_name = DBBenchRunner.get_info_log_file_name(
'/dev/shm/', '/tmp/rocksdbtest-155919/dbbench/'
)
self.assertEqual(log_file_name, 'tmp_rocksdbtest-155919_dbbench_LOG')
def test_get_opt_args_str(self):
misc_opt_dict = {'bloom_bits': 2, 'empty_opt': None, 'rate_limiter': 3}
optional_args_str = DBBenchRunner.get_opt_args_str(misc_opt_dict)
self.assertEqual(optional_args_str, ' --bloom_bits=2 --rate_limiter=3')
def test_get_log_options(self):
db_path = '/tmp/rocksdb-155919/dbbench'
# when db_log_dir is present in the db_options
update_dict = {
'DBOptions.db_log_dir': {NO_COL_FAMILY: '/dev/shm'},
'DBOptions.stats_dump_period_sec': {NO_COL_FAMILY: '20'}
}
self.db_options.update_options(update_dict)
log_file_prefix, stats_freq = self.bench_runner.get_log_options(
self.db_options, db_path
)
self.assertEqual(
log_file_prefix, '/dev/shm/tmp_rocksdb-155919_dbbench_LOG'
)
self.assertEqual(stats_freq, 20)
update_dict = {
'DBOptions.db_log_dir': {NO_COL_FAMILY: None},
'DBOptions.stats_dump_period_sec': {NO_COL_FAMILY: '30'}
}
self.db_options.update_options(update_dict)
log_file_prefix, stats_freq = self.bench_runner.get_log_options(
self.db_options, db_path
)
self.assertEqual(log_file_prefix, '/tmp/rocksdb-155919/dbbench/LOG')
self.assertEqual(stats_freq, 30)
def test_build_experiment_command(self):
# add some misc_options to db_options
update_dict = {
'bloom_bits': {NO_COL_FAMILY: 2},
'rate_limiter_bytes_per_sec': {NO_COL_FAMILY: 128000000}
}
self.db_options.update_options(update_dict)
db_path = '/dev/shm'
experiment_command = self.bench_runner._build_experiment_command(
self.db_options, db_path
)
opt_args_str = DBBenchRunner.get_opt_args_str(
self.db_options.get_misc_options()
)
opt_args_str += (
' --options_file=' +
self.db_options.generate_options_config('12345')
)
for arg in self.pos_args[2:]:
opt_args_str += (' --' + arg)
expected_command = (
self.pos_args[0] + ' --benchmarks=' + self.pos_args[1] +
' --statistics --perf_level=3 --db=' + db_path + opt_args_str
)
self.assertEqual(experiment_command, expected_command)
class TestDBBenchRunner(unittest.TestCase):
def setUp(self):
# Note: the db_bench binary should be present in the rocksdb/ directory
self.pos_args = [
'./../../db_bench',
'overwrite',
'use_existing_db=true',
'duration=20'
]
self.bench_runner = DBBenchRunner(self.pos_args)
this_path = os.path.abspath(os.path.dirname(__file__))
options_path = os.path.join(this_path, 'input_files/OPTIONS-000005')
self.db_options = DatabaseOptions(options_path)
def test_experiment_output(self):
update_dict = {'bloom_bits': {NO_COL_FAMILY: 2}}
self.db_options.update_options(update_dict)
db_path = '/dev/shm'
data_sources, throughput = self.bench_runner.run_experiment(
self.db_options, db_path
)
self.assertEqual(
data_sources[DataSource.Type.DB_OPTIONS][0].type,
DataSource.Type.DB_OPTIONS
)
self.assertEqual(
data_sources[DataSource.Type.LOG][0].type,
DataSource.Type.LOG
)
self.assertEqual(len(data_sources[DataSource.Type.TIME_SERIES]), 2)
self.assertEqual(
data_sources[DataSource.Type.TIME_SERIES][0].type,
DataSource.Type.TIME_SERIES
)
self.assertEqual(
data_sources[DataSource.Type.TIME_SERIES][1].type,
DataSource.Type.TIME_SERIES
)
self.assertEqual(
data_sources[DataSource.Type.TIME_SERIES][1].stats_freq_sec, 0
)
if __name__ == '__main__':
unittest.main()

@ -1,3 +1,8 @@
# Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
# This source code is licensed under both the GPLv2 (found in the
# COPYING file in the root directory) and Apache 2.0 License
# (found in the LICENSE.Apache file in the root directory).
from advisor.db_log_parser import DatabaseLogs, Log, NO_COL_FAMILY
from advisor.rule_parser import Condition, LogCondition
import os

@ -0,0 +1,216 @@
# Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
# This source code is licensed under both the GPLv2 (found in the
# COPYING file in the root directory) and Apache 2.0 License
# (found in the LICENSE.Apache file in the root directory).
from advisor.db_log_parser import NO_COL_FAMILY
from advisor.db_options_parser import DatabaseOptions
from advisor.rule_parser import Condition, OptionCondition
import os
import unittest
class TestDatabaseOptions(unittest.TestCase):
def setUp(self):
self.this_path = os.path.abspath(os.path.dirname(__file__))
self.og_options = os.path.join(
self.this_path, 'input_files/OPTIONS-000005'
)
misc_options = [
'bloom_bits = 4', 'rate_limiter_bytes_per_sec = 1024000'
]
# create the options object
self.db_options = DatabaseOptions(self.og_options, misc_options)
# perform clean-up before running tests
self.generated_options = os.path.join(
self.this_path, '../temp/OPTIONS_testing.tmp'
)
if os.path.isfile(self.generated_options):
os.remove(self.generated_options)
def test_get_options_diff(self):
old_opt = {
'DBOptions.stats_dump_freq_sec': {NO_COL_FAMILY: '20'},
'CFOptions.write_buffer_size': {
'default': '1024000',
'col_fam_A': '128000',
'col_fam_B': '128000000'
},
'DBOptions.use_fsync': {NO_COL_FAMILY: 'true'},
'DBOptions.max_log_file_size': {NO_COL_FAMILY: '128000000'}
}
new_opt = {
'bloom_bits': {NO_COL_FAMILY: '4'},
'CFOptions.write_buffer_size': {
'default': '128000000',
'col_fam_A': '128000',
'col_fam_C': '128000000'
},
'DBOptions.use_fsync': {NO_COL_FAMILY: 'true'},
'DBOptions.max_log_file_size': {NO_COL_FAMILY: '0'}
}
diff = DatabaseOptions.get_options_diff(old_opt, new_opt)
expected_diff = {
'DBOptions.stats_dump_freq_sec': {NO_COL_FAMILY: ('20', None)},
'bloom_bits': {NO_COL_FAMILY: (None, '4')},
'CFOptions.write_buffer_size': {
'default': ('1024000', '128000000'),
'col_fam_B': ('128000000', None),
'col_fam_C': (None, '128000000')
},
'DBOptions.max_log_file_size': {NO_COL_FAMILY: ('128000000', '0')}
}
self.assertDictEqual(diff, expected_diff)
def test_is_misc_option(self):
self.assertTrue(DatabaseOptions.is_misc_option('bloom_bits'))
self.assertFalse(
DatabaseOptions.is_misc_option('DBOptions.stats_dump_freq_sec')
)
def test_set_up(self):
options = self.db_options.get_all_options()
self.assertEqual(22, len(options.keys()))
expected_misc_options = {
'bloom_bits': '4', 'rate_limiter_bytes_per_sec': '1024000'
}
self.assertDictEqual(
expected_misc_options, self.db_options.get_misc_options()
)
self.assertListEqual(
['default', 'col_fam_A'], self.db_options.get_column_families()
)
def test_get_options(self):
opt_to_get = [
'DBOptions.manual_wal_flush', 'DBOptions.db_write_buffer_size',
'bloom_bits', 'CFOptions.compaction_filter_factory',
'CFOptions.num_levels', 'rate_limiter_bytes_per_sec',
'TableOptions.BlockBasedTable.block_align', 'random_option'
]
options = self.db_options.get_options(opt_to_get)
expected_options = {
'DBOptions.manual_wal_flush': {NO_COL_FAMILY: 'false'},
'DBOptions.db_write_buffer_size': {NO_COL_FAMILY: '0'},
'bloom_bits': {NO_COL_FAMILY: '4'},
'CFOptions.compaction_filter_factory': {
'default': 'nullptr', 'col_fam_A': 'nullptr'
},
'CFOptions.num_levels': {'default': '7', 'col_fam_A': '5'},
'rate_limiter_bytes_per_sec': {NO_COL_FAMILY: '1024000'},
'TableOptions.BlockBasedTable.block_align': {
'default': 'false', 'col_fam_A': 'true'
}
}
self.assertDictEqual(expected_options, options)
def test_update_options(self):
# add new, update old, set old
# before updating
expected_old_opts = {
'DBOptions.db_log_dir': {NO_COL_FAMILY: None},
'DBOptions.manual_wal_flush': {NO_COL_FAMILY: 'false'},
'bloom_bits': {NO_COL_FAMILY: '4'},
'CFOptions.num_levels': {'default': '7', 'col_fam_A': '5'},
'TableOptions.BlockBasedTable.block_restart_interval': {
'col_fam_A': '16'
}
}
get_opts = list(expected_old_opts.keys())
options = self.db_options.get_options(get_opts)
self.assertEqual(expected_old_opts, options)
# after updating options
update_opts = {
'DBOptions.db_log_dir': {NO_COL_FAMILY: '/dev/shm'},
'DBOptions.manual_wal_flush': {NO_COL_FAMILY: 'true'},
'bloom_bits': {NO_COL_FAMILY: '2'},
'CFOptions.num_levels': {'col_fam_A': '7'},
'TableOptions.BlockBasedTable.block_restart_interval': {
'default': '32'
},
'random_misc_option': {NO_COL_FAMILY: 'something'}
}
self.db_options.update_options(update_opts)
update_opts['CFOptions.num_levels']['default'] = '7'
update_opts['TableOptions.BlockBasedTable.block_restart_interval'] = {
'default': '32', 'col_fam_A': '16'
}
get_opts.append('random_misc_option')
options = self.db_options.get_options(get_opts)
self.assertDictEqual(update_opts, options)
expected_misc_options = {
'bloom_bits': '2',
'rate_limiter_bytes_per_sec': '1024000',
'random_misc_option': 'something'
}
self.assertDictEqual(
expected_misc_options, self.db_options.get_misc_options()
)
def test_generate_options_config(self):
# make sure file does not exist from before
self.assertFalse(os.path.isfile(self.generated_options))
self.db_options.generate_options_config('testing')
self.assertTrue(os.path.isfile(self.generated_options))
def test_check_and_trigger_conditions(self):
# options only from CFOptions
# setup the OptionCondition objects to check and trigger
update_dict = {
'CFOptions.level0_file_num_compaction_trigger': {'col_fam_A': '4'},
'CFOptions.max_bytes_for_level_base': {'col_fam_A': '10'}
}
self.db_options.update_options(update_dict)
cond1 = Condition('opt-cond-1')
cond1 = OptionCondition.create(cond1)
cond1.set_parameter(
'options', [
'CFOptions.level0_file_num_compaction_trigger',
'TableOptions.BlockBasedTable.block_restart_interval',
'CFOptions.max_bytes_for_level_base'
]
)
cond1.set_parameter(
'evaluate',
'int(options[0])*int(options[1])-int(options[2])>=0'
)
# only DBOptions
cond2 = Condition('opt-cond-2')
cond2 = OptionCondition.create(cond2)
cond2.set_parameter(
'options', [
'DBOptions.db_write_buffer_size',
'bloom_bits',
'rate_limiter_bytes_per_sec'
]
)
cond2.set_parameter(
'evaluate',
'(int(options[2]) * int(options[1]) * int(options[0]))==0'
)
# mix of CFOptions and DBOptions
cond3 = Condition('opt-cond-3')
cond3 = OptionCondition.create(cond3)
cond3.set_parameter(
'options', [
'DBOptions.db_write_buffer_size', # 0
'CFOptions.num_levels', # 5, 7
'bloom_bits' # 4
]
)
cond3.set_parameter(
'evaluate', 'int(options[2])*int(options[0])+int(options[1])>6'
)
self.db_options.check_and_trigger_conditions([cond1, cond2, cond3])
cond1_trigger = {'col_fam_A': ['4', '16', '10']}
self.assertDictEqual(cond1_trigger, cond1.get_trigger())
cond2_trigger = {NO_COL_FAMILY: ['0', '4', '1024000']}
self.assertDictEqual(cond2_trigger, cond2.get_trigger())
cond3_trigger = {'default': ['0', '7', '4']}
self.assertDictEqual(cond3_trigger, cond3.get_trigger())
if __name__ == '__main__':
unittest.main()

@ -0,0 +1,126 @@
# Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
# This source code is licensed under both the GPLv2 (found in the
# COPYING file in the root directory) and Apache 2.0 License
# (found in the LICENSE.Apache file in the root directory).
from advisor.db_stats_fetcher import LogStatsParser, DatabasePerfContext
from advisor.db_timeseries_parser import NO_ENTITY
from advisor.rule_parser import Condition, TimeSeriesCondition
import os
import time
import unittest
from unittest.mock import MagicMock
class TestLogStatsParser(unittest.TestCase):
def setUp(self):
this_path = os.path.abspath(os.path.dirname(__file__))
stats_file = os.path.join(
this_path, 'input_files/log_stats_parser_keys_ts'
)
# populate the keys_ts dictionary of LogStatsParser
self.stats_dict = {NO_ENTITY: {}}
with open(stats_file, 'r') as fp:
for line in fp:
stat_name = line.split(':')[0].strip()
self.stats_dict[NO_ENTITY][stat_name] = {}
token_list = line.split(':')[1].strip().split(',')
for token in token_list:
timestamp = int(token.split()[0])
value = float(token.split()[1])
self.stats_dict[NO_ENTITY][stat_name][timestamp] = value
self.log_stats_parser = LogStatsParser('dummy_log_file', 20)
self.log_stats_parser.keys_ts = self.stats_dict
def test_check_and_trigger_conditions_bursty(self):
# mock fetch_timeseries() because 'keys_ts' has been pre-populated
self.log_stats_parser.fetch_timeseries = MagicMock()
# condition: bursty
cond1 = Condition('cond-1')
cond1 = TimeSeriesCondition.create(cond1)
cond1.set_parameter('keys', 'rocksdb.db.get.micros.p50')
cond1.set_parameter('behavior', 'bursty')
cond1.set_parameter('window_sec', 40)
cond1.set_parameter('rate_threshold', 0)
self.log_stats_parser.check_and_trigger_conditions([cond1])
expected_cond_trigger = {
NO_ENTITY: {1530896440: 0.9767546362322214}
}
self.assertDictEqual(expected_cond_trigger, cond1.get_trigger())
# ensure that fetch_timeseries() was called once
self.log_stats_parser.fetch_timeseries.assert_called_once()
def test_check_and_trigger_conditions_eval_agg(self):
# mock fetch_timeseries() because 'keys_ts' has been pre-populated
self.log_stats_parser.fetch_timeseries = MagicMock()
# condition: evaluate_expression
cond1 = Condition('cond-1')
cond1 = TimeSeriesCondition.create(cond1)
cond1.set_parameter('keys', 'rocksdb.db.get.micros.p50')
cond1.set_parameter('behavior', 'evaluate_expression')
keys = [
'rocksdb.manifest.file.sync.micros.p99',
'rocksdb.db.get.micros.p50'
]
cond1.set_parameter('keys', keys)
cond1.set_parameter('aggregation_op', 'latest')
# condition evaluates to FALSE
cond1.set_parameter('evaluate', 'keys[0]-(keys[1]*100)>200')
self.log_stats_parser.check_and_trigger_conditions([cond1])
expected_cond_trigger = {NO_ENTITY: [1792.0, 15.9638]}
self.assertIsNone(cond1.get_trigger())
# condition evaluates to TRUE
cond1.set_parameter('evaluate', 'keys[0]-(keys[1]*100)<200')
self.log_stats_parser.check_and_trigger_conditions([cond1])
expected_cond_trigger = {NO_ENTITY: [1792.0, 15.9638]}
self.assertDictEqual(expected_cond_trigger, cond1.get_trigger())
# ensure that fetch_timeseries() was called
self.log_stats_parser.fetch_timeseries.assert_called()
def test_check_and_trigger_conditions_eval(self):
# mock fetch_timeseries() because 'keys_ts' has been pre-populated
self.log_stats_parser.fetch_timeseries = MagicMock()
# condition: evaluate_expression
cond1 = Condition('cond-1')
cond1 = TimeSeriesCondition.create(cond1)
cond1.set_parameter('keys', 'rocksdb.db.get.micros.p50')
cond1.set_parameter('behavior', 'evaluate_expression')
keys = [
'rocksdb.manifest.file.sync.micros.p99',
'rocksdb.db.get.micros.p50'
]
cond1.set_parameter('keys', keys)
cond1.set_parameter('evaluate', 'keys[0]-(keys[1]*100)>500')
self.log_stats_parser.check_and_trigger_conditions([cond1])
expected_trigger = {NO_ENTITY: {
1530896414: [9938.0, 16.31508],
1530896440: [9938.0, 16.346602],
1530896466: [9938.0, 16.284669],
1530896492: [9938.0, 16.16005]
}}
self.assertDictEqual(expected_trigger, cond1.get_trigger())
self.log_stats_parser.fetch_timeseries.assert_called_once()
class TestDatabasePerfContext(unittest.TestCase):
def test_unaccumulate_metrics(self):
perf_dict = {
"user_key_comparison_count": 675903942,
"block_cache_hit_count": 830086,
}
timestamp = int(time.time())
perf_ts = {}
for key in perf_dict:
perf_ts[key] = {}
start_val = perf_dict[key]
for ix in range(5):
perf_ts[key][timestamp+(ix*10)] = start_val + (2 * ix * ix)
db_perf_context = DatabasePerfContext(perf_ts, 10, True)
timestamps = [timestamp+(ix*10) for ix in range(1, 5, 1)]
values = [val for val in range(2, 15, 4)]
inner_dict = {timestamps[ix]: values[ix] for ix in range(4)}
expected_keys_ts = {NO_ENTITY: {
'user_key_comparison_count': inner_dict,
'block_cache_hit_count': inner_dict
}}
self.assertDictEqual(expected_keys_ts, db_perf_context.keys_ts)
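The expected values in `test_unaccumulate_metrics` follow from pairwise differencing: with `cumulative=True`, `DatabasePerfContext` converts the running counters into per-interval deltas and drops the first timestamp. A minimal sketch of that transformation for a single metric (illustrative only, not the committed implementation):
```python
def unaccumulate(ts_dict):
    # ts_dict: Dict[timestamp, cumulative_value] for one metric.
    # Returns per-interval deltas keyed by the later timestamp of each
    # adjacent pair; the first sample has no predecessor and is dropped.
    timestamps = sorted(ts_dict.keys())
    return {
        t2: ts_dict[t2] - ts_dict[t1]
        for t1, t2 in zip(timestamps, timestamps[1:])
    }

# cumulative values 0, 2, 8, 18, 32 become deltas 2, 6, 10, 14,
# matching the expected_keys_ts built in test_unaccumulate_metrics
cumulative = {100: 0, 110: 2, 120: 8, 130: 18, 140: 32}
assert unaccumulate(cumulative) == {110: 2, 120: 6, 130: 10, 140: 14}
```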

@ -52,7 +52,10 @@ class TestAllRulesTriggered(unittest.TestCase):
db_options_parser = DatabaseOptions(options_path)
self.column_families = db_options_parser.get_column_families()
db_logs_parser = DatabaseLogs(log_path, self.column_families)
self.data_sources = {
DataSource.Type.DB_OPTIONS: [db_options_parser],
DataSource.Type.LOG: [db_logs_parser]
}
def test_triggered_conditions(self):
conditions_dict = self.db_rules.get_conditions_dict()
@ -106,7 +109,10 @@ class TestConditionsConjunctions(unittest.TestCase):
db_options_parser = DatabaseOptions(options_path)
self.column_families = db_options_parser.get_column_families()
db_logs_parser = DatabaseLogs(log_path, self.column_families)
self.data_sources = {
DataSource.Type.DB_OPTIONS: [db_options_parser],
DataSource.Type.LOG: [db_logs_parser]
}
def test_condition_conjunctions(self):
conditions_dict = self.db_rules.get_conditions_dict()
