Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipsec: cache xfrm state list #32588

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

marseel
Copy link
Contributor

@marseel marseel commented May 16, 2024

Reduces GC CPU usage and memory allocations coming from XfrmStateList.
To ensure we have up-to-date cache, wrap all XfrmState related
functions inside cache, which is invalidated whenever XfrmState changes.

This is follow-up to #32577
While that PR averages out CPU usage over time, in large cluster 100+
nodes amount of allocations coming from netlink.XfrmStateList() is high
due to backgroundSync where we usually don't change any Xfrm states.
This becomes more and more expensive as number of nodes increases.

Some pprofs/GC CPU time graphs.
Before:
Number of alloc objects:
image
Total GC time for 100 nodes:
image
After:
Number of alloc objects:
image
Total GC time for 100 nodes:
image

This would become even more important for larger clusters, as XfrmStateList allocations in backgroundSync are essentially O(n^2).

ipsec: Improve CPU usage of cilum-agent in large clusters

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 16, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from d4637f1 to b90b538 Compare May 17, 2024 11:43
@marseel marseel changed the title Improve background check and cache xfrm test Cache XfrmStateList May 17, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from b90b538 to f5dd892 Compare May 17, 2024 11:54
@marseel marseel changed the title Cache XfrmStateList ipsec: cache xfrm state list May 17, 2024
@marseel marseel added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 17, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 17, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from f5dd892 to 76a9b0b Compare May 17, 2024 12:03
@marseel
Copy link
Contributor Author

marseel commented May 17, 2024

/test

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch 2 times, most recently from 4ce9faf to 0bb20b5 Compare May 17, 2024 12:16
@marseel
Copy link
Contributor Author

marseel commented May 17, 2024

/test

@marseel marseel marked this pull request as ready for review May 17, 2024 12:59
@marseel marseel requested a review from a team as a code owner May 17, 2024 12:59
@marseel marseel requested a review from pchaigno May 17, 2024 12:59
@marseel
Copy link
Contributor Author

marseel commented May 17, 2024

@pchaigno I wonder which CI test runs IPSec privileged test?
I wanted to double-check it runs, I was thinking that probably Conformance Runtime (ci-runtime) but junit shows 0 for privileged:

<testsuite name="github.com/cilium/cilium/pkg/datapath/linux/ipsec" tests="0" failures="0" errors="0" id="118" hostname="kind-bpf-next" time="0.130" timestamp="2024-05-17T12:41:11Z"></testsuite>

🤷

@pchaigno
Copy link
Member

@pchaigno I wonder which CI test runs IPSec privileged test? I wanted to double-check it runs, I was thinking that probably Conformance Runtime (ci-runtime) but junit shows 0 for privileged:

<testsuite name="github.com/cilium/cilium/pkg/datapath/linux/ipsec" tests="0" failures="0" errors="0" id="118" hostname="kind-bpf-next" time="0.130" timestamp="2024-05-17T12:41:11Z"></testsuite>

But it prints tests="0" for every single packages, doesn't it? I'm not sure we can rely on that.
In the output, it does look like it's doing something: https://github.com/cilium/cilium/actions/runs/9128109980/job/25099903131#step:17:367.

Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Couple questions below, but nothing I believe is blocking.

pkg/datapath/linux/ipsec/xfrm_state_cache.go Show resolved Hide resolved
pkg/datapath/linux/ipsec/xfrm_state_cache.go Show resolved Hide resolved
pkg/datapath/linux/ipsec/ipsec_linux.go Show resolved Hide resolved
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 17, 2024
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 0bb20b5 to 522c126 Compare May 17, 2024 14:43
@sjdot
Copy link
Contributor

sjdot commented May 17, 2024

@marseel @pchaigno I originally reported this issue and I've been playing around with a patch on my side also. Instead of calling xfrm state list here and looping until we find a match, can we just use XfrmStateGet instead? This seems to find states just fine in my testing and avoids getting the whole list of states back in the first place.

I haven't tested the impact of this on the GC on a large cluster however, but wondering why we wouldn't want to do this instead of the looping logic?

I'm a bit wary of caching the states in general just because it feels like troubleshooting issues with the cache in the future may be difficult or hard to find.

@marseel
Copy link
Contributor Author

marseel commented May 21, 2024

@sjdot I do agree that we should use XfrmStateGet and probably this part needs some refactoring in general.

While XfrmStateGet would work for background-sync triggered events which usually don't change anything, I am a bit worried about interleaving for other cases:

  • we use states list later in xfrmTemporarilyRemoveState and xfrmDeleteConflictingState which are fetched before performing any initial update/delete/add etc.
  • We ignore errors from XfrmStateDel in the initial part

I would say adding caching should be safer than refactoring that part (also safer for backports), as long as we ensure we don't have other calls that are modifying xfrmState with CI test.
If something external to Cilium modifies xfrmStates, then caching for 1 minute won't make any significant difference, as this is going to be only resolved by background-sync, which runs less often than 1 minute.

@sjdot
Copy link
Contributor

sjdot commented May 21, 2024

Thanks @marseel

Yeah in my patch there are still cases where I have to call list and pass to the functions you mentioned (e.g. a state that needs to be deleted an re-added), I've also not handled the migration of old state so I agree back porting is painful/difficult.

I see the usefulness of the caching given this, I'm just nervous given our long history of very painful ipsec issues. Given it's a pretty significant change, do you think it would be feasible to put this behind a feature flag initially? I would like the ability to turn this off if we hit an issue we suspect might be related to be able to quickly verify if this is the cause or not.

@marseel
Copy link
Contributor Author

marseel commented May 21, 2024

I see the usefulness of the caching given this, I'm just nervous given our long history of very painful ipsec issues. Given it's a pretty significant change, do you think it would be feasible to put this behind a feature flag initially? I would like the ability to turn this off if we hit an issue we suspect might be related to be able to quickly verify if this is the cause or not.

Yup, I think that definitely makes sense.
Let me get that flag + CI coverage before merging.

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 522c126 to ccf72a2 Compare May 22, 2024 10:07
@marseel marseel requested review from a team as code owners May 22, 2024 10:07
@marseel marseel requested a review from ldelossa May 22, 2024 10:07
@marseel
Copy link
Contributor Author

marseel commented May 22, 2024

So I've tested it manually as well to make sure that flag works, we can disable caching by following helm values:

extraArgs:
  - --enable-ipsec-xfrm-state-caching=false

I intentionally made this flag hidden as I don't expect that we will need to turn it off.

For backports, I am considering turning it off as default and switching to true by default later on once we get more confidence on main branch.

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch 3 times, most recently from 35541fe to 306901f Compare May 22, 2024 10:18
@marseel
Copy link
Contributor Author

marseel commented May 22, 2024

/test

@marseel marseel requested a review from pchaigno May 22, 2024 11:37
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 22, 2024
Copy link
Member

@asauber asauber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config changes look good

@marseel
Copy link
Contributor Author

marseel commented May 23, 2024

I just checked and it seems we don't run privileged runtime tests on main:
PR for testing: #32686
Succeded: https://github.com/cilium/cilium/actions/runs/9208640026/job/25331399284#logs

I've checked that while trying to make a backport to hotix branch I've noticed that it fails: #32665
as we run tests there on Jenkins.

@sayboras is looking into fixing running these tests on main.

@sayboras
Copy link
Member

@sayboras is looking into fixing running these tests on main.

Sorry for the trouble, this was missed as part of recent go test migration, I have scanned for all the files, seems like only two test suites are missing. The below PR should help.

#32687

@marseel
Copy link
Contributor Author

marseel commented May 23, 2024

Converting to draft till tests are being correctly run.

@marseel marseel marked this pull request as draft May 23, 2024 15:17
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 306901f to d7ea263 Compare May 23, 2024 16:15
@marseel
Copy link
Contributor Author

marseel commented May 23, 2024

/ci-runtime

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from d7ea263 to 8ce2344 Compare May 23, 2024 16:51
@marseel
Copy link
Contributor Author

marseel commented May 23, 2024

/ci-runtime

@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 8ce2344 to 0a4d0bc Compare May 23, 2024 17:03
@marseel
Copy link
Contributor Author

marseel commented May 23, 2024

/ci-runtime

Reduces GC CPU usage and memory allocations coming from XfrmStateList.
To ensure we have up-to-date cache, wrap all XfrmState related
functions inside cache, which is invalidated whenever XfrmState changes.

This is follow-up to #32577
While that PR averages out CPU usage over time, in large cluster 100+
nodes amount of allocations coming from netlink.XfrmStateList() is high
due to backgroundSync where we usually don't change any Xfrm states.
This becomes more and more expensive as number of nodes increases.

Added CI test to make sure that we accidentally don't add calls that
modify XFRMState without going through cache.

Also, added hidden option that allows to turn of caching.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
@marseel marseel force-pushed the improve_background_check_and_cache_xfrm_test branch from 0a4d0bc to 1164b7a Compare May 24, 2024 10:49
@marseel
Copy link
Contributor Author

marseel commented May 24, 2024

Rebased to pick up #32687

@marseel
Copy link
Contributor Author

marseel commented May 24, 2024

/test

@marseel marseel marked this pull request as ready for review May 27, 2024 08:42
Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my codeowners look good.

Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@marseel
Copy link
Contributor Author

marseel commented May 29, 2024

@borkmann friendly ping :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants