background-sync: spread node updates over time. #32577

marseel · 2024-05-16T12:14:32Z

Previously, the agent triggered an update for every node at a fixed
interval that depended on cluster size. Because all nodes were updated
at once, this caused spikes in agent CPU usage. Background sync exists
to repair state that has gone stale and should not be the primary source
of updates, so it makes sense to spread these updates over time and
average out CPU usage.

Also, re-enable the backgroundSync test.

GC CPU usage on 100-node cluster with IPSec enabled before:

After:

Improved background resynchronization of nodes. Previously, all nodes were updated at the same time; updates are now spread over time to average out CPU usage.

marseel · 2024-05-17T09:58:26Z

/test

pkg/node/manager/manager.go

marseel · 2024-05-17T10:04:17Z

Removing @danehans from reviews due to cilium/community#118

Previously, the agent triggered an update for every node at a fixed interval that depended on cluster size. This caused spikes in agent CPU usage. Background sync exists to repair state that has gone stale and should not be the primary source of updates, so it makes sense to spread these updates over time and average out CPU usage. Also, re-enable the backgroundSync test. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

marseel · 2024-05-17T10:36:37Z

/test

pkg/node/manager/manager.go

Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to #32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
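The caching approach in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from #32588: `state`, `stateCache`, and `newInMemoryCache` are hypothetical names, and injected functions stand in for the real netlink calls. The point is that reads are served from the cached list until a mutating call goes through the wrapper and invalidates it.

```go
package main

import (
	"fmt"
	"sync"
)

// state is a hypothetical stand-in for one XFRM state entry.
type state struct{ SPI int }

// stateCache caches the result of an expensive list call and drops the
// cached copy whenever a state is added or deleted through the wrapper.
type stateCache struct {
	mu     sync.Mutex
	valid  bool
	cached []state

	// Hypothetical backends standing in for the real netlink calls.
	list func() ([]state, error)
	add  func(state) error
	del  func(state) error
}

// List returns the cached states, refreshing them only when invalidated.
func (c *stateCache) List() ([]state, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.valid {
		return c.cached, nil
	}
	s, err := c.list()
	if err != nil {
		return nil, err
	}
	c.cached, c.valid = s, true
	return s, nil
}

// Add invalidates the cache before mutating, so even a failed call cannot
// leave stale data behind.
func (c *stateCache) Add(s state) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.valid, c.cached = false, nil
	return c.add(s)
}

// Del invalidates the cache in the same way as Add.
func (c *stateCache) Del(s state) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.valid, c.cached = false, nil
	return c.del(s)
}

// newInMemoryCache builds a cache over a fake in-memory backend and returns
// a pointer to the number of backend list calls made so far.
func newInMemoryCache() (*stateCache, *int) {
	var states []state
	calls := 0
	c := &stateCache{
		list: func() ([]state, error) {
			calls++
			return append([]state(nil), states...), nil
		},
		add: func(s state) error { states = append(states, s); return nil },
		del: func(s state) error {
			for i := range states {
				if states[i] == s {
					states = append(states[:i], states[i+1:]...)
					break
				}
			}
			return nil
		},
	}
	return c, &calls
}

func main() {
	c, calls := newInMemoryCache()
	c.Add(state{SPI: 1})
	c.List()
	c.List() // served from the cache, no backend call
	fmt.Println("backend list calls:", *calls)
}
```

During a background sync that changes nothing, every `List` after the first is a cache hit, which is where the allocation savings would come from.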

Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to #32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Added a CI test to make sure we don't accidentally add calls that modify XfrmState without going through the cache. Also added a hidden option that allows turning off caching. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
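A check of the kind mentioned in this commit message could look something like the sketch below. How the actual CI test is implemented is not stated here, so this is only one possible approach: the hypothetical helper `directXfrmStateCalls` parses Go source with the standard `go/parser` and `go/ast` packages and reports calls of the form `netlink.XfrmState*`, which would bypass a caching wrapper.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// directXfrmStateCalls parses Go source and returns the names of functions
// called as netlink.XfrmState*, i.e. calls that do not go through a wrapper.
func directXfrmStateCalls(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var found []string
	ast.Inspect(f, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		// Match only selector calls whose receiver is the identifier "netlink".
		pkg, ok := sel.X.(*ast.Ident)
		if ok && pkg.Name == "netlink" && strings.HasPrefix(sel.Sel.Name, "XfrmState") {
			found = append(found, sel.Sel.Name)
		}
		return true
	})
	return found, nil
}

// sample is illustrative input: one direct call, one call through a
// hypothetical cache wrapper, and one unrelated netlink call.
const sample = `package p

func f() {
	netlink.XfrmStateAdd(nil)
	cache.XfrmStateAdd(nil)
	netlink.LinkList()
}
`

func main() {
	calls, err := directXfrmStateCalls(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("direct calls found:", calls)
}
```

A real check would also need an allowlist for the file that implements the cache itself, since that is the one place where direct calls are expected.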

[ upstream commit 2019ebe ] During bootstrap, the number of nodes is not yet known, and the new implementation was essentially hot-looping until nodes were fetched. Also, in a single-node cluster, the rate limiter was not rate-limiting. Fixes cilium#32577 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 3a4c57f ] [ Backporter's notes: switch default to false - not enabled ] Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to cilium#32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Added a CI test to make sure we don't accidentally add calls that modify XfrmState without going through the cache. Also added a hidden option that allows turning off caching. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 3a4c57f ] [ Backporter's notes: switch default to false - so not enabled by default. Switch from testing package to checkmate in unit tests ] Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to cilium#32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Added a CI test to make sure we don't accidentally add calls that modify XfrmState without going through the cache. Also added a hidden option that allows turning off caching. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 3a4c57f ] [ Backporter's notes: switch default to false - so not enabled by default. Switch from testing package to checkmate in unit tests. Flags use Vp instead of vp. Minor conflicts with netlink.XfrmState* calls ] Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to #32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Added a CI test to make sure we don't accidentally add calls that modify XfrmState without going through the cache. Also added a hidden option that allows turning off caching. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 3a4c57f ] [ Backporter's notes: switch default to false - so not enabled by default. Switch from testing package to checkmate in unit tests. Flags use Vp instead of vp. Minor conflicts with netlink.XfrmState* calls. Switched from pkg/time to time ] Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to #32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Added a CI test to make sure we don't accidentally add calls that modify XfrmState without going through the cache. Also added a hidden option that allows turning off caching. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 2019ebe ] [ Backporter's notes: minor conflicts because this branch lacks context and uses closeChan instead. ] During bootstrap, the number of nodes is not yet known, and the new implementation was essentially hot-looping until nodes were fetched. Also, in a single-node cluster, the rate limiter was not rate-limiting. Fixes #32577 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

[ upstream commit 3a4c57f ] [ Backporter's notes: switch default to false - so not enabled by default. Switch from testing package to checkmate in unit tests. Flags use Vp instead of vp. Minor conflicts with netlink.XfrmState* calls. Switched from pkg/time to time. Switch from checkmate to check.v1 ] Reduces GC CPU usage and memory allocations coming from XfrmStateList. To keep the cache up to date, all XfrmState-related functions are wrapped in a cache that is invalidated whenever XfrmState changes. This is a follow-up to #32577. While that PR averages out CPU usage over time, in large clusters (100+ nodes) the number of allocations coming from netlink.XfrmStateList() is high because of backgroundSync, where we usually don't change any Xfrm states. This becomes more expensive as the number of nodes increases. Added a CI test to make sure we don't accidentally add calls that modify XfrmState without going through the cache. Also added a hidden option that allows turning off caching. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 16, 2024

marseel force-pushed the improve_background_sync branch 3 times, most recently from 2cd729a to 959d5ee Compare May 17, 2024 09:50

marseel changed the title ~~Improve background sync~~ background-sync: spread node updates over time. May 17, 2024

marseel added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 17, 2024

maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 17, 2024

marseel force-pushed the improve_background_sync branch from 959d5ee to 0388bf0 Compare May 17, 2024 09:58

marseel marked this pull request as ready for review May 17, 2024 10:01

marseel requested a review from a team as a code owner May 17, 2024 10:01

marseel requested review from danehans and removed request for danehans May 17, 2024 10:01

tklauser self-requested a review May 17, 2024 10:28

marseel force-pushed the improve_background_sync branch from 0388bf0 to 0d4f591 Compare May 17, 2024 10:35

tklauser approved these changes May 17, 2024

marseel mentioned this pull request May 17, 2024

ipsec: cache xfrm state list #32588

Merged

tklauser enabled auto-merge May 17, 2024 12:10

tklauser added this pull request to the merge queue May 17, 2024

Merged via the queue into main with commit 1963879 May 17, 2024
271 checks passed

tklauser deleted the improve_background_sync branch May 17, 2024 12:21

marseel mentioned this pull request Jun 4, 2024

v1.14 Backports - background node sync improvements #32874

Merged

2 tasks

marseel added the backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. label Jun 4, 2024

maintainer-s-little-helper bot added this to Backport done to v1.15 in 1.15.6 Jun 4, 2024

maintainer-s-little-helper bot added this to Backport pending to v1.14 in 1.14.12 Jun 4, 2024

marseel mentioned this pull request Jun 4, 2024

v1.13 Backports - background node sync improvements #32885

Merged

2 tasks

marseel added the backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. label Jun 4, 2024

maintainer-s-little-helper bot added this to Backport pending to v1.13 in 1.13.17 Jun 4, 2024

github-actions bot added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Jun 5, 2024

maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.17 Jun 5, 2024

maintainer-s-little-helper bot removed this from Backport pending to v1.13 in 1.13.17 Jun 5, 2024

github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Jun 5, 2024

maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.12 Jun 5, 2024

maintainer-s-little-helper bot added this to Backport done to v1.14 in 1.14.12 Jun 5, 2024

maintainer-s-little-helper bot added this to Backport done to v1.13 in 1.13.17 Jun 5, 2024
