Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
// Copyright (c) 2013, Facebook, Inc. All rights reserved.
|
|
|
|
// This source code is licensed under the BSD-style license found in the
|
|
|
|
// LICENSE file in the root directory of this source tree. An additional grant
|
|
|
|
// of patent rights can be found in the PATENTS file in the same directory.
|
|
|
|
//
|
|
|
|
#include "db/write_controller.h"
|
|
|
|
|
|
|
|
#include "rocksdb/env.h"
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
#include "util/testharness.h"
|
|
|
|
|
|
|
|
namespace rocksdb {
|
|
|
|
|
|
|
|
class WriteControllerTest : public testing::Test {};
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
|
|
|
|
class TimeSetEnv : public EnvWrapper {
|
|
|
|
public:
|
|
|
|
explicit TimeSetEnv() : EnvWrapper(nullptr) {}
|
|
|
|
uint64_t now_micros_ = 6666;
|
|
|
|
virtual uint64_t NowMicros() override { return now_micros_; }
|
|
|
|
};
|
|
|
|
|
|
|
|
TEST_F(WriteControllerTest, SanityTest) {
|
|
|
|
WriteController controller(10000000u);
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
auto stop_token_1 = controller.GetStopToken();
|
|
|
|
auto stop_token_2 = controller.GetStopToken();
|
|
|
|
|
maint: use ASSERT_TRUE, not ASSERT_EQ(true; same for false
Summary:
The usage I'm fixing here caused trouble on Fedora 21 when
compiling with the current gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC):
db/write_controller_test.cc: In member function ‘virtual void rocksdb::WriteControllerTest_SanityTest_Test::TestBody()’:
db/write_controller_test.cc:23:165: error: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Werror=conversion-null]
ASSERT_EQ(false, controller.IsStopped());
^
This change was induced mechanically via:
git grep -l -E 'ASSERT_EQ\(false'|xargs perl -pi -e 's/ASSERT_EQ\(false, /ASSERT_FALSE(/'
git grep -l -E 'ASSERT_EQ\(true'|xargs perl -pi -e 's/ASSERT_EQ\(true, /ASSERT_TRUE(/'
Except for the three in utilities/backupable/backupable_db_test.cc for which
I ended up reformatting (joining lines) in the result.
As for why this problem is exhibited with that version of gcc, and none
of the others I've used (from 4.8.1 through gcc-5.0.0 and newer), I suspect
it's a bug in F21's gcc that has been fixed in gcc-5.0.0.
Test Plan:
"make" now succeed on Fedora 21
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37329
10 years ago
|
|
|
ASSERT_TRUE(controller.IsStopped());
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
stop_token_1.reset();
|
maint: use ASSERT_TRUE, not ASSERT_EQ(true; same for false
Summary:
The usage I'm fixing here caused trouble on Fedora 21 when
compiling with the current gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC):
db/write_controller_test.cc: In member function ‘virtual void rocksdb::WriteControllerTest_SanityTest_Test::TestBody()’:
db/write_controller_test.cc:23:165: error: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Werror=conversion-null]
ASSERT_EQ(false, controller.IsStopped());
^
This change was induced mechanically via:
git grep -l -E 'ASSERT_EQ\(false'|xargs perl -pi -e 's/ASSERT_EQ\(false, /ASSERT_FALSE(/'
git grep -l -E 'ASSERT_EQ\(true'|xargs perl -pi -e 's/ASSERT_EQ\(true, /ASSERT_TRUE(/'
Except for the three in utilities/backupable/backupable_db_test.cc for which
I ended up reformatting (joining lines) in the result.
As for why this problem is exhibited with that version of gcc, and none
of the others I've used (from 4.8.1 through gcc-5.0.0 and newer), I suspect
it's a bug in F21's gcc that has been fixed in gcc-5.0.0.
Test Plan:
"make" now succeed on Fedora 21
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37329
10 years ago
|
|
|
ASSERT_TRUE(controller.IsStopped());
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
stop_token_2.reset();
|
maint: use ASSERT_TRUE, not ASSERT_EQ(true; same for false
Summary:
The usage I'm fixing here caused trouble on Fedora 21 when
compiling with the current gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC):
db/write_controller_test.cc: In member function ‘virtual void rocksdb::WriteControllerTest_SanityTest_Test::TestBody()’:
db/write_controller_test.cc:23:165: error: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Werror=conversion-null]
ASSERT_EQ(false, controller.IsStopped());
^
This change was induced mechanically via:
git grep -l -E 'ASSERT_EQ\(false'|xargs perl -pi -e 's/ASSERT_EQ\(false, /ASSERT_FALSE(/'
git grep -l -E 'ASSERT_EQ\(true'|xargs perl -pi -e 's/ASSERT_EQ\(true, /ASSERT_TRUE(/'
Except for the three in utilities/backupable/backupable_db_test.cc for which
I ended up reformatting (joining lines) in the result.
As for why this problem is exhibited with that version of gcc, and none
of the others I've used (from 4.8.1 through gcc-5.0.0 and newer), I suspect
it's a bug in F21's gcc that has been fixed in gcc-5.0.0.
Test Plan:
"make" now succeed on Fedora 21
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37329
10 years ago
|
|
|
ASSERT_FALSE(controller.IsStopped());
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
|
|
|
|
TimeSetEnv env;
|
|
|
|
|
|
|
|
auto delay_token_1 = controller.GetDelayToken();
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(2000000),
|
|
|
|
controller.GetDelay(&env, 20000000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 1999900u; // sleep debt 1000
|
|
|
|
auto delay_token_2 = controller.GetDelayToken();
|
|
|
|
// One refill: 10240 bytes allowed, 1000 used, 9240 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(1124), controller.GetDelay(&env, 1000u));
|
|
|
|
env.now_micros_ += 1124u; // sleep debt 0
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
|
|
|
|
delay_token_2.reset();
|
|
|
|
// 1000 used, 8240 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0), controller.GetDelay(&env, 1000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 100u; // sleep credit 100
|
|
|
|
// 1000 used, 7240 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0), controller.GetDelay(&env, 1000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 100u; // sleep credit 200
|
|
|
|
// One refill: 10240 fileed, sleep credit generates 2000. 8000 used
|
|
|
|
// 7240 + 10240 + 2000 - 8000 = 11480 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(1024u), controller.GetDelay(&env, 8000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 200u; // sleep debt 824
|
|
|
|
// 1000 used, 10480 left.
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0), controller.GetDelay(&env, 1000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 200u; // sleep debt 624
|
|
|
|
// Out of bound sleep, still 10480 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(3000624u),
|
|
|
|
controller.GetDelay(&env, 30000000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 3000724u; // sleep credit 100
|
|
|
|
// 6000 used, 4480 left.
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0), controller.GetDelay(&env, 6000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 200u; // sleep credit 300
|
|
|
|
// One refill, credit 4480 balance + 3000 credit + 10240 refill
|
|
|
|
// Use 8000, 9720 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(1024u), controller.GetDelay(&env, 8000u));
|
|
|
|
|
|
|
|
env.now_micros_ += 3024u; // sleep credit 2000
|
|
|
|
|
|
|
|
// 1720 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0u), controller.GetDelay(&env, 8000u));
|
|
|
|
|
|
|
|
// 1720 balance + 20000 credit = 20170 left
|
|
|
|
// Use 8000, 12170 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0u), controller.GetDelay(&env, 8000u));
|
|
|
|
|
|
|
|
// 4170 left
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0u), controller.GetDelay(&env, 8000u));
|
|
|
|
|
|
|
|
// Need a refill
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(1024u), controller.GetDelay(&env, 9000u));
|
|
|
|
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
delay_token_1.reset();
|
|
|
|
ASSERT_EQ(static_cast<uint64_t>(0), controller.GetDelay(&env, 30000000u));
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
delay_token_1.reset();
|
maint: use ASSERT_TRUE, not ASSERT_EQ(true; same for false
Summary:
The usage I'm fixing here caused trouble on Fedora 21 when
compiling with the current gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC):
db/write_controller_test.cc: In member function ‘virtual void rocksdb::WriteControllerTest_SanityTest_Test::TestBody()’:
db/write_controller_test.cc:23:165: error: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Werror=conversion-null]
ASSERT_EQ(false, controller.IsStopped());
^
This change was induced mechanically via:
git grep -l -E 'ASSERT_EQ\(false'|xargs perl -pi -e 's/ASSERT_EQ\(false, /ASSERT_FALSE(/'
git grep -l -E 'ASSERT_EQ\(true'|xargs perl -pi -e 's/ASSERT_EQ\(true, /ASSERT_TRUE(/'
Except for the three in utilities/backupable/backupable_db_test.cc for which
I ended up reformatting (joining lines) in the result.
As for why this problem is exhibited with that version of gcc, and none
of the others I've used (from 4.8.1 through gcc-5.0.0 and newer), I suspect
it's a bug in F21's gcc that has been fixed in gcc-5.0.0.
Test Plan:
"make" now succeed on Fedora 21
Reviewers: ljin, rven, igor.sugak, yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37329
10 years ago
|
|
|
ASSERT_FALSE(controller.IsStopped());
|
Push- instead of pull-model for managing Write stalls
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
10 years ago
|
|
|
}
|
|
|
|
|
|
|
|
} // namespace rocksdb
|
|
|
|
|
|
|
|
int main(int argc, char** argv) {
|
|
|
|
::testing::InitGoogleTest(&argc, argv);
|
|
|
|
return RUN_ALL_TESTS();
|
|
|
|
}
|