
Add graceful-restart support #2386

Open
wants to merge 4 commits into main

Conversation

karampok
Contributor

/kind feature

What this PR does / why we need it:
Issue covers #2368

Special notes for your reviewer:

Release note:

TODO

@karampok
Contributor Author

Note to discuss:

Graceful restart configuration changed, reset this peer to take effect
% The Graceful Restart command used is not valid at this moment.
line 76: Failure to communicate[13] to bgpd, line:  neighbor 11.11.11.1 graceful-restart

[311|bfdd] done
[294|zebra] done
[300|bgpd] Configuration file[/etc/frr/frr.conf] processing failure: 13
2024-04-26 13:18:54,515 WARNING: frr-reload.py failed due to
vtysh (exec file) exited with status 13
Failed to fully apply configuration file 1 seconds
Caught SIGHUP and acquired lock! Reloading FRR..
Checking the configuration file syntax

It seems to fail, and the speaker's frr-reloader keeps retrying to apply the configuration.

@karampok
Contributor Author

karampok commented Apr 29, 2024

The above might be related to FRRouting/frr#8403

 vtysh -c configure -c 'router bgp 7003' -c 'bgp graceful-restart'
Graceful restart configuration changed, reset all peers to take effect
k00-worker:/# echo $?
0
k00-worker:/# vtysh -c configure -c 'router bgp 7003' -c 'neighbor 11.11.11.1 graceful-restart'
Graceful restart configuration changed, reset this peer to take effect
% The Graceful Restart command used is not valid at this moment.
k00-worker:/# echo $?
1

@karampok
Contributor Author

In my test I achieved zero downtime if graceful restart is enabled either globally (bgp graceful-restart) or per neighbor (neighbor x.y.z.k graceful-restart), combined with bgp graceful-restart preserve-fw-state.

(Test: at runtime I delete the speaker pod and check whether the client loses connectivity.)

The second command, which enables the F-bit, is only available in global mode, not per neighbor. Therefore a solution where we add graceful-restart at the neighbor level but the F-bit globally (because it is not safe to have it there by default) seems suboptimal.

@fedepaol How much do you object to having it global (one knob to enable graceful restart, one to toggle the F-bit), plus an option to opt out per neighbor? Any other ideas?
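For reference, the FRR configuration shape under discussion would look roughly like this. This is a sketch, not the PR's actual rendered config: the ASN and neighbor address are reused from the snippets above, and the per-neighbor opt-out relies on FRR's graceful-restart-disable command.

```
router bgp 7003
 bgp graceful-restart
 bgp graceful-restart preserve-fw-state
 ! per-neighbor opt-out knob
 neighbor 11.11.11.1 graceful-restart-disable
```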

@karampok
Contributor Author

I created FRRouting/frr#15880 upstream, even though we are not really blocked on it (we will get an error, but it does not look like it has an impact).

@fedepaol
Member

fedepaol commented May 7, 2024

The second command, which enables the F-bit, is only available in global mode, not per neighbor. Therefore a solution where we add graceful-restart at the neighbor level but the F-bit globally (because it is not safe to have it there by default) seems suboptimal.

@fedepaol How much do you object to having it global (one knob to enable graceful restart, one to toggle the F-bit), plus an option to opt out per neighbor? Any other ideas?

Won't the global F-bit affect only peers with GR enabled? If so, it doesn't sound too terrible to always enable it.

@karampok
Contributor Author

karampok commented May 7, 2024

My preference would be to make it configurable, to be on the safe side. I cannot find an argument against always having it (or defaulting it to on), but I have no idea for which use case this option exists. Should we go with always having it in the config?

On the same topic, shouldn't we allow tuning the restart timer? The default is 120 seconds, which seems long (ideally we want it below the hold time).
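For context, FRR exposes the restart timer as a global knob; a hedged sketch of lowering it from the 120-second default (the value here is illustrative):

```
router bgp 7003
 bgp graceful-restart restart-time 60
```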

@fedepaol
Member

fedepaol commented May 8, 2024

My preference would be to make it configurable, to be on the safe side. I cannot find an argument against always having it (or defaulting it to on), but I have no idea for which use case this option exists. Should we go with always having it in the config?

I'd try to keep the UX as simple as we can, which means not having to set an extra parameter. We can either set the global flag when at least one peer needs GR, or just always enable it. I am leaning towards the second: we'll have to deal with peers without GR alongside the global flag anyway, so if that doesn't work we'll have a problem either way.
Also, hopefully CI will tell us if there's a problem.

On the same topic, shouldn't we allow tuning the restart timer? The default is 120 seconds, which seems long (ideally we want it below the hold time).

You mean the graceful restart timer? That should be meaningful only for the helper side I guess (but we may need it in frr-k8s where we might be helpers).

@fedepaol
Member

fedepaol commented May 8, 2024

cc @oribon

@karampok karampok force-pushed the main branch 4 times, most recently from 14ff603 to 0b43ca8 Compare May 13, 2024 13:32
@karampok karampok marked this pull request as ready for review May 13, 2024 14:03
// known routes while the routing protocol information is being restored.
// Needed for FRR mode only.
// +optional
GracefulRestart bool `json:"gracefulRestart,omitempty"`
Member

nit: EnableGracefulRestart

Contributor Author

I can change that, sure

Contributor Author

done

Member

nit on the commit subject: this commit does more than add the field to the API; we are adding the support end to end (which is totally fine, but let's change the subject to "add graceful restart support").

Contributor Author

I have three commits, and the e2e test is added in its own commit, no?

Contributor Author

done
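For illustration, the new API field would be consumed from a BGPPeer resource roughly like this. This is a sketch: the resource name, ASNs, and address are hypothetical, and only the gracefulRestart key comes from the json tag above.

```yaml
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: peer-sample
  namespace: metallb-system
spec:
  myASN: 64512
  peerASN: 7003
  peerAddress: 11.11.11.1
  gracefulRestart: true
```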

err := ConfigUpdater.Update(resources)
Expect(err).NotTo(HaveOccurred())

for _, c := range FRRContainers {
Member

I think it'd make sense to have a real e2e test where we shut down a speaker and check that traffic is still flowing.

First thing that comes to mind: focus on one node, exclude it from the daemonset selector, have a pod running only there with externalTrafficPolicy=Local, and ensure traffic keeps flowing while the speaker is down.

Member

It's probably easier if we raise the graceful restart timers of the peered neighbors, so we'll have control.

Contributor Author

Not sure, because in the end we'd be testing whether FRR supports the functionality. Nevertheless, we can implement that; is there any similar test?

Why do you think we should raise the timers (the default is 120 sec)?

Member

Let's start with the default and see if it's enough then

@karampok karampok force-pushed the main branch 2 times, most recently from 72e3c27 to d65bb88 Compare May 14, 2024 11:47
Comment on lines 90 to 92
// Needed for FRR mode only.
// +optional
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="EnableGracefulRestart cannot be changed after creation, because GR requires a BGP session reset"
Member

nit: "Needed" should be more like "Supported" here?
Also, I'd put the point about immutability in the description ("this field is immutable because changing it requires restarting the BGP session") and keep the validation message as "EnableGracefulRestart cannot be changed after creation".

Contributor Author

sounds good, done

@@ -41,6 +41,9 @@ func DiscardFRROnly(c ClusterResources) error {
if p.Spec.ConnectTime != nil {
return fmt.Errorf("peer %s has connect time set on native bgp mode", p.Spec.Address)
}
if p.Spec.EnableGracefulRestart {
return fmt.Errorf("peer %s has enabled GR (GracefulRestart) flag set on native bgp mode", p.Spec.Address)
Member

Suggested change
return fmt.Errorf("peer %s has enabled GR (GracefulRestart) flag set on native bgp mode", p.Spec.Address)
return fmt.Errorf("peer %s has GracefulRestart flag set on native bgp mode", p.Spec.Address)

Contributor Author

done

@karampok karampok force-pushed the main branch 4 times, most recently from 50f6988 to 363fa20 Compare May 21, 2024 14:11
BGP graceful restart, as defined in RFC 4724, defines the mechanisms that allow a BGP speaker to continue forwarding data packets along known routes while the routing protocol information is being restored. This allows the speaker DaemonSet to be updated without routes being retracted on the peer side.

When GR is enabled, we also set the F-bit (preserve-fw-state) by default.

We make gracefulRestart immutable according to
https://kubernetes.io/blog/2022/09/29/enforce-immutability-using-cel/#immutablility-after-first-modification

Signed-off-by: karampok <karampok@gmail.com>
@karampok karampok force-pushed the main branch 2 times, most recently from ef87f52 to 847394f Compare May 22, 2024 14:54
Signed-off-by: karampok <karampok@gmail.com>
@karampok karampok force-pushed the main branch 2 times, most recently from 63944e7 to a7cbd33 Compare May 23, 2024 12:54
- Restarts the speaker pods and waits for them to be healthy
- Once healthy, cancels the context to finish faster in the happy path

Signed-off-by: karampok <karampok@gmail.com>
Signed-off-by: karampok <karampok@gmail.com>
eg, ctx := errgroup.WithContext(ctx)
for _, c := range FRRContainers {
validateService(svc, allNodes.Items, c)
cc := c // https://go.dev/doc/faq#closures_and_goroutines
Member

can you use cc above (or just c := c) and use c? Seems a bit less confusing.


ret := true
for _, p := range pods {
ret = ret && k8s.PodIsReady(p)
Member

can you just exit early if the pod is not ready?
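A minimal sketch of the early-exit version the reviewer suggests; here pod and its ready field are stand-ins for the real corev1.Pod and k8s.PodIsReady used in the test helpers:

```go
package main

import "fmt"

// pod is a stand-in for the real corev1.Pod used in the e2e helpers.
type pod struct {
	name  string
	ready bool
}

// allReady returns false as soon as one pod is not ready, instead of
// folding the whole slice into a boolean as the original loop did.
func allReady(pods []pod) bool {
	for _, p := range pods {
		if !p.ready {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allReady([]pod{{"speaker-a", true}, {"speaker-b", false}}))
}
```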

}

f := func(context.Context) (bool, error) {
pods, err := SpeakerPods(cs)
Member

this may be quick enough to pick up the pods being deleted (and see them as ready). We should check that the names changed, or add a non-existing node selector to the daemonset, in order to guarantee a full restart.
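A sketch of the name-based check this comment suggests; the helper name is hypothetical, and real code would compare pod names taken from the speaker pod lists:

```go
package main

import "fmt"

// namesChanged reports whether no new pod name overlaps the old ones,
// i.e. the daemonset really recreated every pod rather than us racing
// the deletion and observing the old pods as ready.
func namesChanged(oldNames, newNames []string) bool {
	old := make(map[string]bool, len(oldNames))
	for _, n := range oldNames {
		old[n] = true
	}
	for _, n := range newNames {
		if old[n] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(namesChanged([]string{"speaker-abc"}, []string{"speaker-xyz"}))
}
```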

@@ -32,6 +32,38 @@ func init() {
}
}

func RestartSpeakerPods(cancel context.CancelFunc, cs clientset.Interface) error {
Member

This will return nil the moment we see the pods restarted (bugs aside).
I'd keep this func context-unaware and let it do what it promises (restart the pods), and handle the async part from the calling site.

}
}

f := func(context.Context) (bool, error) {
Member

this can be embedded as anonymous function in the call below. No need to assign it to a variable.

})
}

err = metallb.RestartSpeakerPods(cancel, cs)
Member

I'd try to simplify this, because it adds complexity and is a bit hard to follow.

How about restarting the pods, running an Eventually assertion that checks all the pods are restarted, and inside the Eventually body checking that the service keeps being accessible?

@@ -61,6 +62,21 @@ var _ = ginkgo.Describe("BGP metrics", func() {
})

ginkgo.BeforeEach(func() {
clientconfig := k8sclient.RestConfig()
Member

why was this moved?

@@ -98,9 +101,25 @@ var _ = ginkgo.Describe("BGP", func() {
})

ginkgo.BeforeEach(func() {

clientconfig := k8sclient.RestConfig()
var err error
Member

why was this moved?
