feat: [MR-617] Enforce subnet-wide best-effort message memory limit #1835

Open
wants to merge 13 commits into base: master

Conversation

@alin-at-dfinity (Contributor) commented Oct 3, 2024:

At the end of each round, if the total best-effort message memory usage is above a configured limit, build a priority queue and keep shedding the largest message of the canister with the highest best-effort message memory usage until we're back under the limit.

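
For illustration, here is a minimal, self-contained sketch of that scheme against a toy model (a HashMap of best-effort message sizes per canister) rather than the real ReplicatedState; every name below (State, shed_largest_message, enforce_best_effort_message_memory_limit, the u64 aliases) is a placeholder, not the PR's actual API:

use std::collections::{BinaryHeap, HashMap};

type CanisterId = u64;
type NumBytes = u64;

/// Toy model: each canister's best-effort messages, by size in bytes.
struct State {
    messages: HashMap<CanisterId, Vec<NumBytes>>,
    limit: NumBytes,
}

impl State {
    /// Total best-effort message memory usage across all canisters.
    fn best_effort_message_memory_taken(&self) -> NumBytes {
        self.messages.values().flatten().sum()
    }

    /// Best-effort message memory usage of a single canister.
    fn canister_memory_usage(&self, id: CanisterId) -> NumBytes {
        self.messages.get(&id).into_iter().flatten().sum()
    }

    /// Sheds (drops) the largest best-effort message of `id`, returning the
    /// bytes freed, or 0 if the canister has no best-effort messages left.
    fn shed_largest_message(&mut self, id: CanisterId) -> NumBytes {
        let Some(msgs) = self.messages.get_mut(&id) else { return 0 };
        let Some(pos) = (0..msgs.len()).max_by_key(|&i| msgs[i]) else { return 0 };
        msgs.swap_remove(pos)
    }

    /// End-of-round enforcement: while the total usage is above the limit,
    /// shed the largest message of the canister with the highest usage.
    fn enforce_best_effort_message_memory_limit(&mut self) {
        let mut usage = self.best_effort_message_memory_taken();
        if usage <= self.limit {
            return;
        }
        // Max-heap of canisters, ordered by best-effort memory usage.
        let mut heap: BinaryHeap<(NumBytes, CanisterId)> = self
            .messages
            .keys()
            .map(|&id| (self.canister_memory_usage(id), id))
            .collect();
        while usage > self.limit {
            let Some((_, id)) = heap.pop() else { break };
            let freed = self.shed_largest_message(id);
            usage = usage.saturating_sub(freed);
            let remaining = self.canister_memory_usage(id);
            if remaining > 0 {
                // Re-insert with its updated usage so the next iteration
                // again picks whichever canister is currently heaviest.
                heap.push((remaining, id));
            }
        }
    }
}

The BinaryHeap keeps the canister with the highest best-effort message memory usage on top; after each shed, the canister is re-inserted with its updated usage rather than being drained in one go.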
rs/messaging/src/state_machine.rs (resolved thread, outdated)
rs/replicated_state/src/replicated_state.rs (3 resolved threads, outdated)
@@ -668,6 +668,18 @@ impl ReplicatedState {
canisters_memory_usage + subnet_memory_usage
}

/// Computes the memory taken by best-effort response messages.
pub fn best_effort_message_memory_taken(&self) -> NumBytes {
Contributor:

It is somewhat confusing that we include the subnet queue usage here, but then don't shed from the subnet queues below?

Should we maybe exclude the usage related to the subnet queues here and re-introduce it in the PR that will start shedding messages from the subnet queues?

Contributor Author (alin-at-dfinity):

The reason subnet queues were included here is that this is a copy-pasta of guaranteed_response_message_memory_taken() just above (which had it).

The reason why the shedding logic doesn't have it is that I forgot about it. I wrote that code before this, so it wasn't front of mind; and I only wrote this method while writing the test.

I'll try extending the shedding logic to subnet queues in this PR. If it turns out to be a sizable change, I'll break it out into its own PR.

Contributor Author (alin-at-dfinity):

Done. Also extended the test to shed from the subnet queues.

rs/replicated_state/src/replicated_state.rs (resolved thread, outdated)
rs/replicated_state/src/replicated_state.rs (resolved thread)
@oggy-dfin (Member) left a comment:

Left a bunch of nitpicks. What's the plan for subnet queues? Is that a separate ticket?

rs/messaging/src/state_machine.rs (resolved thread)
@@ -668,6 +668,18 @@ impl ReplicatedState {
canisters_memory_usage + subnet_memory_usage
}

/// Computes the memory taken by best-effort response messages.
pub fn best_effort_message_memory_taken(&self) -> NumBytes {
Member:

Nitpick: why is this called "taken" but the methods on the state/queues are called "usage"?

Member:

Also, this seems to be used only in tests; I forget whether we mark such methods somehow?

Contributor Author (alin-at-dfinity):

It's to make it consistent with guaranteed_response_message_memory_taken() above (most (good) programming style guides will tell you to prioritize consistency over other rules). I believe Execution came up with that one, so you can bring it up with them. (o:

As for it only being used in tests, that's true. If it were to remain so, we could add a #[cfg(test)] attribute. But I can easily imagine that we'd e.g. want to expose this as a metric, in this PR or a follow-up.
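
For reference, the #[cfg(test)] marking alluded to above looks roughly like this (a generic, hypothetical example, not the PR's code):

pub struct Accounting {
    bytes: u64,
}

impl Accounting {
    /// Compiled only when this crate's tests are built; regular and release
    /// builds don't carry the symbol at all.
    #[cfg(test)]
    pub fn bytes_for_tests(&self) -> u64 {
        self.bytes
    }
}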

Member:

Well, I'm asking exactly because it's inconsistent :) just between the different layers. I'd prefer it if you could change it to something uniform everywhere, but it's not a blocker for me.

rs/replicated_state/src/replicated_state.rs (2 resolved threads, outdated)
 * Switch to using `best_effort_message_memory_taken()` instead of a hand-rolled calculation.
 * Add a `debug_assert!()` to ensure that our running `memory_usage` amount matches the actual memory usage.
@stiegerc (Contributor) left a comment:

LGTM, except that the PR description could be more precise too.

Comment on lines 1049 to 1051
let memory_usage_delta = memory_usage_before - memory_usage_after;
debug_assert!(memory_usage_delta > ZERO_BYTES);
memory_usage -= memory_usage_delta;
@stiegerc (Contributor) commented Nov 3, 2024:

What happens here if memory_usage_after > memory_usage_before? I know it shouldn't be possible, and if it were, the accounting would be screwed up anyway. But I'd replace this debug_assert!() with debug_assert!(memory_usage_before > memory_usage_after) and put it in front.

That is, unless ZERO_BYTES is intended to possibly not actually be zero bytes at some point in the future.

Contributor Author (alin-at-dfinity):

Good point. Unless memory_usage_after were equal to memory_usage_before, the debug_assert!() wouldn't do anything useful as is, because the two subtractions would underflow first. (OTOH, I believe that in a debug build an integer overflow will panic anyway.)

Still, there is one case where shedding a message may result in (marginally) higher memory usage: when we shed an outbound request, we enqueue an actual reject response; if the request was really tiny, the reject response may be larger than it (even though the payload consists only of the error message "Request timed out."). It is not clear to me what the right thing to do is at that point (apart from not enqueuing an actual response). I suppose we should keep going regardless, because the alternative is to end up with arbitrarily many tiny messages that are never shed.

So I've changed the debug_assert!() as suggested, but left everything else as it was.
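
A minimal sketch of the assertion ordering that was settled on, with plain u64 byte counts standing in for NumBytes and a hypothetical helper name:

/// Updates a running total after shedding one message, asserting (in debug
/// builds) that shedding actually freed memory before any subtraction runs.
fn apply_shed(memory_usage: &mut u64, usage_before: u64, usage_after: u64) {
    // Asserting this first reports the broken invariant directly; asserting
    // on the delta after computing `usage_before - usage_after` would have
    // underflowed already (which also panics in debug builds, but with a far
    // less informative message).
    debug_assert!(usage_before > usage_after);
    let delta = usage_before.saturating_sub(usage_after);
    *memory_usage = memory_usage.saturating_sub(delta);
}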

Comment on lines 972 to 974
/// Enforces the best-effort message limit by repeatedly shedding the largest
/// best-effort messages of the canisters with the largest best-effort memory
/// usage until the memory usage drops below the limit.
Contributor:

This seems imprecise. I had to really look at the code to understand what exactly this means.

IIUC the priority queue holds the canisters by memory usage, i.e. the one with the largest usage in front. It then sheds the largest message from this canister until we either go below the limit or a different canister becomes the one with the largest memory usage (or we run out of messages altogether). So it's not some round-robin scheme over the canisters with the largest best-effort memory usage, where each has its largest message shed, or something like that (which is what I was sort of guessing from reading the comment).
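
For concreteness, a toy walk-through of that order, reusing the illustrative State sketch from earlier on this page (made-up numbers; not the PR's code):

// Canister 1 holds messages of 60 and 50 bytes, canister 2 holds one
// 70-byte message, and the limit is 100 bytes.
fn main() {
    let mut state = State {
        messages: [(1, vec![60, 50]), (2, vec![70])].into_iter().collect(),
        limit: 100,
    };
    state.enforce_best_effort_message_memory_limit();
    // Canister 1 (110 bytes) is heaviest, so its 60-byte message is shed
    // first (usage 180 -> 120). Canister 2 (70 bytes) is now heaviest, so
    // its message is shed next (usage 120 -> 50); no round-robin involved.
    assert_eq!(state.best_effort_message_memory_taken(), 50);
}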

Contributor Author (alin-at-dfinity):

Changed it to say "repeatedly shedding the largest best-effort message of the canister with the highest best-effort message memory usage" (i.e. dropped the plurals).

If you have a better suggestion, I'm happy to apply it.

rs/replicated_state/tests/replicated_state.rs (resolved thread, outdated)
@derlerd-dfinity (Contributor) left a comment:

Thank you. LGTM.

rs/replicated_state/src/replicated_state.rs (resolved thread, outdated)
Comment on lines 1021 to 1023
let message_shed;
let memory_usage_after;
if canister_id.get() == self.metadata.own_subnet_id.get() {
Contributor:

I would find it easier to read if we changed this as follows:

Suggested change:
-let message_shed;
-let memory_usage_after;
-if canister_id.get() == self.metadata.own_subnet_id.get() {
+let (message_shed, memory_usage_after) = if canister_id.get() == self.metadata.own_subnet_id.get() {
...

Contributor Author (alin-at-dfinity):

I only found out about this now and I find it a lot cleaner than the usual

let (message_shed, memory_usage_after) = if foo() {
    let message_shed = bar();
    let memory_usage_after = baz();
    // More code here.
    (message_shed, memory_usage_after)
} else {
    let message_shed = qux();
    let memory_usage_after = quux();
    // Other code here.
    (message_shed, memory_usage_after)
};

Changed, nonetheless, for the sake of consistency.

Contributor:

I don't have a strong opinion, so if you feel strongly you can also leave it as it was. What prompted me to comment was that I wasn't aware of the possibility of uninitialized variables in Rust either, and I had to Google it to make sure I understood the semantics.

Contributor Author (alin-at-dfinity):

They're not uninitialized, they just look like it. If you were to not initialize them on every code path (i.e. in both if branches), the compiler would complain. That is also why they don't need to be mut: they're simply declared at the top and initialized exactly once further down.
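
For readers unfamiliar with the pattern, a small self-contained example of such deferred initialization (generic names, unrelated to the canister code):

fn describe(flag: bool) -> (&'static str, u32) {
    // Declared without initializers and without `mut`: the compiler accepts
    // this only because every path below assigns each variable exactly once
    // before it is read.
    let label;
    let code;
    if flag {
        label = "enabled";
        code = 1;
    } else {
        label = "disabled";
        code = 0;
    }
    (label, code)
}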

rs/replicated_state/tests/replicated_state.rs (resolved thread, outdated)