Add graceful-restart support #2386
base: main
Conversation
Note to discuss:
it seems it fails, and the speaker/frr-reloader retries to apply it.
The above might be related to FRRouting/frr#8403

In my test I achieved zero downtime with graceful restart enabled either globally or per neighbor (test: at runtime I delete the speaker and check whether the client loses connectivity). The second command, the one that enables the F-bit, is only available in global mode, not per neighbor. Therefore a solution where we add graceful-restart at the neighbor level and the F-bit globally (because it is not safe to have it on by default) does not seem optimal. @fedepaol how much do you object to having both global (one setting to enable graceful restart, one to turn the F-bit on/off), plus the option to opt out per neighbor? Any other ideas?

I created FRRouting/frr#15880 upstream, even though we are not really blocked on it (we will get an error, but it does not look like it has an impact).
Won't the global F-bit affect only peers with GR enabled? If so, it doesn't sound too terrible to always enable it.
My preference would be for this to be configurable, to be on the safe side. I cannot find an argument against always having it (or having it on by default), but I have no idea for which case this option exists. Should we go with always having it in the config? On the same topic, should we allow tuning the timer? The default is 120 sec, which seems long (ideally we need it to be < holdtime).
I'd try to keep the UX as good as we can, which means not having to set an extra parameter. We can either set the global option when at least one peer needs GR, or just always enable it. I am leaning towards the second, as we'll have to deal with peers without GR and the global option anyway, so if that doesn't work we'll have a problem.
You mean the graceful restart timer? That should be meaningful only for the helper side, I guess (but we may need it in frr-k8s, where we might be helpers).
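For reference, a minimal sketch of the FRR knobs under discussion (ASN and peer address are made up; `preserve-fw-state` and `restart-time` are global-only, while plain `graceful-restart` can also be set per neighbor):

```
router bgp 64512
 bgp graceful-restart
 bgp graceful-restart preserve-fw-state
 bgp graceful-restart restart-time 120
 neighbor 172.30.0.3 remote-as 64512
 neighbor 172.30.0.3 graceful-restart
```

With `preserve-fw-state`, the restarting speaker signals via the F-bit that its forwarding state survived, so the peer keeps using the stale routes for the duration of the restart.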
cc @oribon |
force-pushed from 14ff603 to 0b43ca8
api/v1beta2/bgppeer_types.go
Outdated
// known routes while the routing protocol information is being restored.
// Needed for FRR mode only.
// +optional
GracefulRestart bool `json:"gracefulRestart,omitempty"`
nit: EnableGracefulRestart
I can change that, sure
done
nit on the commit subject: this commit does more than adding the field to the API; we are adding the support end to end (which is totally fine, but let's change the subject to "add graceful restart support").
I have three commits, and the e2e is added in its own commit, no?
done
err := ConfigUpdater.Update(resources)
Expect(err).NotTo(HaveOccurred())

for _, c := range FRRContainers {
I think it'd make sense to have a real e2e test where we shut down a speaker and check that the traffic is still flowing.
First thing that comes to my mind: focus on a node, exclude it from the daemonset selector, have a pod with etp=local running only there, and ensure the traffic flows when the speaker is down.
It's probably easier if we raise the graceful restart timers of the peered neighbors, so we'll have control
Not sure, because in the end we test whether FRR supports the functionality. Nevertheless, we can implement that; is there any similar test?
Why do you think we should raise the timers (the default is 120 sec)?
Let's start with the default and see if it's enough then
force-pushed from 72e3c27 to d65bb88
api/v1beta2/bgppeer_types.go
Outdated
// Needed for FRR mode only.
// +optional
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="EnableGracefulRestart cannot be changed after creation, because GR requires restarting the BGP session"
nit: "Needed" should be more about "Supported" here?
also I'd put in the description the point about "this field is immutable because it requires restart of the BGP session" and keep the validation message "EnableGracefulRestart cannot be changed after creation"
sounds good, done
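Putting the naming, description, and immutability nits together, the field could end up roughly like this (a sketch; the exact wording in the PR may differ):

```go
// EnableGracefulRestart, when set, lets the BGP peer continue to forward
// data packets along known routes while the routing protocol information
// is being restored (RFC 4724). This field is immutable because changing
// it requires a restart of the BGP session.
// Supported for FRR mode only.
// +optional
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="EnableGracefulRestart cannot be changed after creation"
EnableGracefulRestart bool `json:"enableGracefulRestart,omitempty"`
```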
internal/config/validation.go
Outdated
@@ -41,6 +41,9 @@ func DiscardFRROnly(c ClusterResources) error {
	if p.Spec.ConnectTime != nil {
		return fmt.Errorf("peer %s has connect time set on native bgp mode", p.Spec.Address)
	}
	if p.Spec.EnableGracefulRestart {
		return fmt.Errorf("peer %s has enabled GR (GracefulRestart) flag set on native bgp mode", p.Spec.Address)
Suggested change:
-	return fmt.Errorf("peer %s has enabled GR (GracefulRestart) flag set on native bgp mode", p.Spec.Address)
+	return fmt.Errorf("peer %s has GracefulRestart flag set on native bgp mode", p.Spec.Address)
done
force-pushed from 50f6988 to 363fa20
BGP graceful restart, as defined in RFC 4724, provides the mechanisms that allow a BGP speaker to continue to forward data packets along known routes while the routing protocol information is being restored. This allows the speaker DaemonSet to be updated without routes being retracted on the peer side. When GR is enabled, we enable the F-bit (preserve-fw-state) by default. We make gracefulRestart immutable following https://kubernetes.io/blog/2022/09/29/enforce-immutability-using-cel/#immutablility-after-first-modification

Signed-off-by: karampok <karampok@gmail.com>
force-pushed from ef87f52 to 847394f
Signed-off-by: karampok <karampok@gmail.com>
force-pushed from 63944e7 to a7cbd33
- Restarts the speakers and waits for them to be healthy
- Once healthy, cancels the context to finish faster in the happy path

Signed-off-by: karampok <karampok@gmail.com>
Signed-off-by: karampok <karampok@gmail.com>
eg, ctx := errgroup.WithContext(ctx)
for _, c := range FRRContainers {
	validateService(svc, allNodes.Items, c)
	cc := c // https://go.dev/doc/faq#closures_and_goroutines
can you use cc above (or just c := c) and use c? Seems a bit less confusing.
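i.e. roughly this shape (a sketch; it assumes the validation is what runs in the errgroup goroutine):

```go
for _, c := range FRRContainers {
	c := c // shadow the loop variable for the goroutine (not needed from Go 1.22 on)
	eg.Go(func() error {
		validateService(svc, allNodes.Items, c)
		return nil
	})
}
```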
ret := true
for _, p := range pods {
	ret = ret && k8s.PodIsReady(p)
can you just exit early if the pod is not ready?
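i.e. something like this sketch, assuming the loop sits inside the poll condition returning (bool, error):

```go
for _, p := range pods {
	if !k8s.PodIsReady(p) {
		return false, nil // short-circuit on the first pod that is not ready
	}
}
return true, nil
```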
}

f := func(context.Context) (bool, error) {
	pods, err := SpeakerPods(cs)
this may be quick enough to pick up the pods being deleted (and see them as ready). We should check that the names changed, or add a non-existing node selector to the daemonset, in order to guarantee a full restart.
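A possible shape for the names-changed check (a sketch; assumes the pods are k8s.io/api/core/v1 `*Pod`s and `oldPods` was captured before the delete):

```go
// Remember the names of the pods we deleted...
oldNames := map[string]bool{}
for _, p := range oldPods {
	oldNames[p.Name] = true
}

// ...and only consider the restart complete once every current speaker
// pod has a fresh name, i.e. was actually recreated.
allRecreated := func(pods []*corev1.Pod) bool {
	for _, p := range pods {
		if oldNames[p.Name] {
			return false
		}
	}
	return true
}
```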
@@ -32,6 +32,38 @@ func init() {
	}
}

func RestartSpeakerPods(cancel context.CancelFunc, cs clientset.Interface) error {
This will return nil the moment we see the pods restarted (bugs aside).
I'd keep this func context-unaware and let it do what it promises (restart the pods), and handle the async part from the calling site.
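Roughly what that suggestion amounts to (a sketch; `SpeakerPods` is the assumed existing helper, and the caller owns any waiting/cancellation):

```go
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clientset "k8s.io/client-go/kubernetes"
)

// RestartSpeakerPods deletes the speaker pods and returns; whether and how
// to wait for the replacements is left to the calling site.
func RestartSpeakerPods(cs clientset.Interface) error {
	pods, err := SpeakerPods(cs) // assumed helper returning the speaker pods
	if err != nil {
		return err
	}
	for _, p := range pods {
		err := cs.CoreV1().Pods(p.Namespace).Delete(context.TODO(), p.Name, metav1.DeleteOptions{})
		if err != nil {
			return err
		}
	}
	return nil
}
```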
	}
}

f := func(context.Context) (bool, error) {
this can be embedded as an anonymous function in the call below. No need to assign it to a variable.
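e.g., assuming the condition is consumed by the standard apimachinery poll helper:

```go
// sketch: inline the condition instead of binding it to f
err := wait.PollUntilContextTimeout(ctx, time.Second, 2*time.Minute, false,
	func(ctx context.Context) (bool, error) {
		pods, err := SpeakerPods(cs)
		if err != nil {
			return false, err
		}
		for _, p := range pods {
			if !k8s.PodIsReady(p) {
				return false, nil
			}
		}
		return true, nil
	})
```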
})
}

err = metallb.RestartSpeakerPods(cancel, cs)
I'd try to simplify this, because it adds complexity and is a bit hard to follow.
How about restarting the pods, running an eventually assertion that checks all the pods are restarted, and inside the eventually body we check that the service keeps being accessible?
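Rough shape of the proposed simplification (a sketch; `speakersFullyRestarted` is a hypothetical helper):

```go
err := metallb.RestartSpeakerPods(cs)
Expect(err).NotTo(HaveOccurred())

Eventually(func() error {
	// the service must keep answering while the new speakers come up,
	// thanks to the peers holding the stale routes via graceful restart
	for _, c := range FRRContainers {
		validateService(svc, allNodes.Items, c)
	}
	// non-nil until every speaker pod has been recreated and is ready
	return speakersFullyRestarted(cs)
}, 2*time.Minute, time.Second).Should(Succeed())
```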
@@ -61,6 +62,21 @@ var _ = ginkgo.Describe("BGP metrics", func() {
})

ginkgo.BeforeEach(func() {
	clientconfig := k8sclient.RestConfig()
why was this moved?
@@ -98,9 +101,25 @@ var _ = ginkgo.Describe("BGP", func() {
})

ginkgo.BeforeEach(func() {
	clientconfig := k8sclient.RestConfig()
	var err error
why was this moved?
/kind feature
What this PR does / why we need it:
Covers issue #2368
Special notes for your reviewer:
Release note: