
Envoy resource monitoring & overload manager configuration #5106

Merged: 10 commits merged from kralicky/memory-limits into main on May 28, 2024

Conversation

@kralicky (Contributor) commented May 6, 2024

Summary

This adds a new "resource manager" feature that continuously monitors cgroup memory saturation and feeds it to envoy's overload manager. It supports cgroup v1/v2/hybrid and container-style layouts.

It works by using envoy's injected resource monitor, effectively giving us manual control over the overload action states. On its own, envoy can only monitor its own memory usage; since pomerium knows more about the runtime environment, it can supply envoy with a more accurate measurement of memory pressure.

The most common use case will be when both pomerium and envoy are running in the same container, and the container has memory restrictions configured. Pomerium will monitor the memory usage of the cgroup containing both processes (since that's what matters from the oom-killer's perspective) and envoy will use that value to control its overload actions.

A runtime flag envoy_resource_manager_enabled (default true) can be used to soft-disable the resource manager. Since the envoy overload manager settings are part of the bootstrap config, they can't be changed without a restart. If this flag is set to false, pomerium will always treat the memory saturation value as 0, ensuring all overload actions stay disabled.

If the current cgroup has an unbounded memory limit ("max"), the saturation is also treated as 0. The resource manager watches for changes to the memory limit so the value can be updated at runtime (such as via k8s 1.30 dynamic resource allocation).
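
For illustration, the saturation calculation described above could look roughly like the following sketch for a cgroup v2 layout. This is not the PR's code; the function name and the clamping are assumptions, while memory.current, memory.max, and the "max" sentinel are standard cgroup v2 files.

package main

import (
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// cgroupV2Saturation is an illustrative helper (not pomerium's actual code):
// it reads usage and limit from a cgroup v2 directory and returns usage/limit,
// reporting zero saturation when no limit is configured.
func cgroupV2Saturation(cgroupDir string) (float64, error) {
	readValue := func(name string) (string, error) {
		b, err := os.ReadFile(filepath.Join(cgroupDir, name))
		if err != nil {
			return "", err
		}
		return strings.TrimSpace(string(b)), nil
	}

	limitStr, err := readValue("memory.max")
	if err != nil {
		return 0, err
	}
	if limitStr == "max" {
		// No memory limit configured: report zero saturation so all
		// overload actions stay disabled.
		return 0, nil
	}
	limit, err := strconv.ParseFloat(limitStr, 64)
	if err != nil {
		return 0, err
	}
	if limit <= 0 {
		return 0, nil
	}

	usageStr, err := readValue("memory.current")
	if err != nil {
		return 0, err
	}
	usage, err := strconv.ParseFloat(usageStr, 64)
	if err != nil {
		return 0, err
	}

	// Clamp defensively to the [0, 1] range envoy expects.
	saturation := usage / limit
	if saturation > 1 {
		saturation = 1
	} else if saturation < 0 {
		saturation = 0
	}
	return saturation, nil
}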

TODOs:

  • Adjust and document thresholds for various overload actions
  • Add new metrics for memory usage and overload manager actions
  • Write tests
  • Document how this interacts with pod resource restrictions

Related issues

User Explanation

Checklist

  • reference any related issues
  • updated docs
  • updated unit tests
  • updated UPGRADING.md
  • add appropriate tag (improvement / bug / etc)
  • ready for review

@coveralls commented May 6, 2024

Coverage Status

coverage: 56.722% (+0.3%) from 56.38% when pulling 77111cf on kralicky/memory-limits into 8269a72 on main.

Instead of reading memory.max/limit_in_bytes on every tick, we
read it once, then again only when it is modified.

To support this change, logic for computing the saturation was moved out
of the cgroup driver and into the resource monitor, and the driver
interface now has separate methods for reading memory usage and limit.
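
As a rough sketch of that interface split (the method names here are hypothetical, not necessarily the ones used in the PR):

package main

// cgroupDriver sketches the split described above: the driver only reads raw
// values, and the resource monitor combines them into a saturation figure.
type cgroupDriver interface {
	// MemoryUsage returns the cgroup's current memory usage in bytes,
	// read on every tick.
	MemoryUsage() (uint64, error)
	// MemoryLimit returns the cgroup's configured memory limit in bytes,
	// read once at startup and again only when the limit file changes.
	MemoryLimit() (uint64, error)
}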
@kralicky added the "enhancement" (New feature or request) label May 10, 2024
@kralicky marked this pull request as ready for review May 10, 2024 19:16
@kralicky requested a review from a team as a code owner May 10, 2024 19:16
@kralicky changed the title from "[WIP] Envoy resource monitoring & overload manager configuration" to "Envoy resource monitoring & overload manager configuration" May 10, 2024
}

	saturationStr := fmt.Sprintf("%.6f", saturation)
	nextInterval := (monitorMaxTickInterval - (time.Duration(float64(monitorMaxTickInterval-monitorMinTickInterval) * saturation))).
Contributor:
maybe we could pull this into a separate function?

Also do we need to handle saturation > 1?

Contributor (author):
Yeah I think we should handle that, good catch
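
A standalone helper along the lines discussed might look like this sketch; the constant values are placeholders and the function name is made up, but the interpolation mirrors the snippet above with the saturation clamped to [0, 1]:

package main

import "time"

// Placeholder values: the real bounds are whatever the PR defines.
const (
	monitorMinTickInterval = 250 * time.Millisecond
	monitorMaxTickInterval = 5 * time.Second
)

// nextTickInterval clamps saturation into [0, 1] and scales the tick interval
// so the monitor polls more often as memory pressure rises.
func nextTickInterval(saturation float64) time.Duration {
	if saturation < 0 {
		saturation = 0
	} else if saturation > 1 {
		saturation = 1
	}
	span := float64(monitorMaxTickInterval - monitorMinTickInterval)
	return monitorMaxTickInterval - time.Duration(span*saturation)
}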

}

func (w *memoryLimitWatcher) Watch(ctx context.Context) error {
	fd, err := syscall.InotifyInit1(syscall.IN_CLOEXEC)
Contributor:
We have a file watcher in fileutil. But I guess it's made for more of the cross-platform case and would also start a polling watcher, which is not ideal. Would using the underlying package simplify things? Or is it really no different?

Contributor (author):

Sure, I did this manually since the usage here seemed simple enough not to warrant a library, but I can rewrite it to use github.com/fsnotify/fsnotify if you'd like. It might make the code a bit more readable.

Contributor:

No preference from me. Whichever way seems best.

Contributor (author):

Looks like there are some wrappers around the low-level syscall functions in golang.org/x/sys/unix that I can use here; that might be good for now.
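
For reference, a watcher built on those golang.org/x/sys/unix wrappers could be sketched roughly as follows. This is not the PR's implementation; the function and callback names are illustrative, and a production version would also need a way to unblock the read when the context is cancelled (for example, a non-blocking fd polled alongside a wakeup pipe).

package main

import (
	"context"

	"golang.org/x/sys/unix"
)

// watchLimitFile (illustrative name) watches a single file with inotify and
// invokes onChange whenever it is modified.
func watchLimitFile(ctx context.Context, path string, onChange func()) error {
	fd, err := unix.InotifyInit1(unix.IN_CLOEXEC)
	if err != nil {
		return err
	}
	defer unix.Close(fd)

	wd, err := unix.InotifyAddWatch(fd, path, unix.IN_MODIFY)
	if err != nil {
		return err
	}
	defer unix.InotifyRmWatch(fd, uint32(wd))

	buf := make([]byte, unix.SizeofInotifyEvent*16)
	for ctx.Err() == nil {
		// Blocks until the watched file is modified; a single read can
		// return multiple queued events.
		if _, err := unix.Read(fd, buf); err != nil {
			return err
		}
		onChange()
	}
	return ctx.Err()
}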

@calebdoxsey (Contributor) left a comment:

LGTM

I did see these notices in the docs:

This extension is functional but has not had substantial production burn time, use only with this caveat.
The injected resource monitor allows injecting a synthetic resource pressure into Envoy via a text file, which must contain a floating-point number in the range [0..1] representing the resource pressure and be updated atomically by a symbolic link swap. This is intended primarily for integration tests to force Envoy into an overloaded state.

If we're ok with those caveats I think this is a good solution!
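
For context, the symlink-swap update the quoted docs require could be done along these lines. This is a sketch of the general technique, not necessarily how pomerium implements it; the function name is illustrative, and cleanup of old data files is omitted for brevity.

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeResourcePressure writes the pressure value to a fresh file, then
// atomically renames a new symlink over the path envoy reads, so envoy never
// observes a partially written value.
func writeResourcePressure(linkPath string, saturation float64) error {
	dir := filepath.Dir(linkPath)

	dataFile, err := os.CreateTemp(dir, "pressure-*")
	if err != nil {
		return err
	}
	if _, err := fmt.Fprintf(dataFile, "%.6f", saturation); err != nil {
		dataFile.Close()
		return err
	}
	if err := dataFile.Close(); err != nil {
		return err
	}

	// Symlinks can be replaced atomically via rename; rewriting the file in
	// place could let envoy read a torn value.
	tmpLink := dataFile.Name() + ".link"
	if err := os.Symlink(dataFile.Name(), tmpLink); err != nil {
		return err
	}
	return os.Rename(tmpLink, linkPath)
}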

@wasaga (Contributor) left a comment:

Looks great!

I have a couple of follow-ups that I'll open separate tickets for.

@kralicky merged commit 927f24e into main May 28, 2024
16 checks passed
@kralicky deleted the kralicky/memory-limits branch May 28, 2024 20:57
Labels: enhancement (New feature or request)
Projects: none
Linked issues: none
4 participants