Fix #2311: change layer2status object namespace and add owner reference #2351
base: main
Conversation
Unit tests and e2e tests are coming soon. Please have a look at the implementation first, thanks!
thanks a lot for pushing this forward!
Without deep-diving into the splitting logic here, this seems a bit cumbersome. On the issue I mentioned that a reasonable approach might be naming the statuses with GenerateName: <node-name>-, and from there we can rely on the labels of the resource (to understand which service it belongs to). Then we can add the service's name/namespace to the kubectl output and it should be readable. wdyt?
I think most of the complexity came from determining the actual namespace of the service. As we designed it before, the reconciler receives a name/namespace from both the channel and the cluster CR changes; after moving the status objects to the speaker namespace, it can't determine the service namespace directly from the request. Also, if the reconciler is triggered by the channel event, in which case the CR has not been created yet, getting the information from labels seems not possible. So in my implementation I went with the current approach. Maybe you can give me an example of the approach you have in mind?
oh I see now what's the problem, I haven't thought about it all the way through. It might be that generating a name will also not be so easy. I'll think about it and try to go over this tomorrow 😅 thanks!
I thought about this for a while and might have a reasonable alternative that should work in theory.
What I like about this resource naming change is that we'll get a more compact result and won't have any logic for the name itself (or promises). This also avoids some corner cases, such as trying to create a resource with a name that is too long. With these changes, the reconciliation won't change too much. wdyt?
Sounds good. Let me implement this and see if there is any further problem.
@oribon Please take a look at the latest implementation
thanks a lot for this! overall looks good, still need to go over the e2es.
is there a reason you went with the "%s-%s-%s" format instead of a generated name for the resource? I'm concerned the names will be too long (or just messy in a -o wide output), wonder if you hit something that you didn't like
config/controllers/speaker.yaml
@@ -23,11 +23,16 @@ spec:
      - args:
        - --port=7472
        - --log-level=info
        - --enable-l2-service-status=true
I think we can flip the flag to default to true instead. Also if it's not there already can you add a knob for helm please?
e2etest/go.mod
replace (
	k8s.io/kubectl => k8s.io/kubectl v0.29.3
	k8s.io/kubelet => k8s.io/kubelet v0.29.3
)
why is this necessary?
Something similar to this occurred last time: #2198 (comment)
For further context: while I am developing with a JetBrains IDE, it runs go list -m -json all to obtain all dependencies at project load, I guess.
Without the replaces, if the project requires some Kubernetes code whose version is set to v0.0.0, the go list command would fail.
So this is something like a workaround I think, but I'm not sure if it's the best way. Also, I am a little confused why this problem is not triggered in the GitHub CI env.
yeah it's a bit odd, can you try removing it from the commit and see what happens in ci?
> yeah it's a bit odd, can you try removing it from the commit and see what happens in ci?

CI will work fine because this actually doesn't affect go mod tidy. Maybe I can put this in a temporary changelist to avoid the IDE error.
What if you go get those deps instead of replacing? Replacing pins the deps and will require manual updates (plus, we made a considerable effort to avoid doing that); with require we can rely on dependabot.
this has still to be replaced with require, right?
> this has still to be replaced with require, right?

Seems right. Without the replaces, go list still reports errors:
❯ cd e2etest
❯ go list -m -json all
go: k8s.io/kubectl@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/kubelet@v0.0.0: invalid version: unknown revision v0.0.0
even with require, you mean?
> even with require, you mean?

yes
internal/k8s/k8s.go
@@ -280,10 +280,16 @@ func New(cfg *Config) (*Client, error) {

	// metallb controller doesn't need this reconciler
	if cfg.EnableL2Status && cfg.Layer2StatusChan != nil {
		selfPod, err := clientset.CoreV1().Pods(cfg.Namespace).Get(context.TODO(), os.Getenv("METALLB_POD_NAME"), metav1.GetOptions{})
I think we should read the env var in the speaker's main.go (where the other env vars are read) and propagate it
state.Status = desiredStatus
err = controllerutil.SetOwnerReference(r.SpeakerPod, state, r.Scheme())
if err != nil {
	level.Warn(r.Logger).Log("error setting owner ref", err)
why not return an err here?
I was considering maintaining the owner reference as an extra function in addition to the core function, rather than as part of the core function.
So I didn't fail fast when facing errors about the owner reference.
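For illustration, here is a minimal, self-contained sketch of the fail-fast alternative being discussed. setOwnerReference is a stand-in stub for controllerutil.SetOwnerReference (the real call takes the owner object, the object, and a scheme); only the error-propagation shape matters, so a failed reconcile would be retried with backoff:

```go
package main

import (
	"errors"
	"fmt"
)

// setOwnerReference is a stub standing in for controllerutil.SetOwnerReference.
func setOwnerReference(fail bool) error {
	if fail {
		return errors.New("owner is in a different namespace")
	}
	return nil
}

// reconcileStatus sketches the fail-fast variant: wrap and return the
// error instead of only logging it, so the caller requeues the request.
func reconcileStatus(fail bool) error {
	if err := setOwnerReference(fail); err != nil {
		return fmt.Errorf("failed to set owner reference on ServiceL2Status: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(reconcileStatus(false))
	fmt.Println(reconcileStatus(true))
}
```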
statusObjFetcherFunc := func() ([]v1beta1.ServiceL2Status, error) {
	statusList := v1beta1.ServiceL2StatusList{}
	err := k8sClient.List(context.TODO(), &statusList,
		client.InNamespace(speakerNamespace),
can you make this in all namespaces (just so we're sure we aren't creating unnecessary ones)
	return err.Error()
}
if len(statuses) != 1 {
	return ""
nit: have an fmt.Errorf(...) explaining what happened
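Something like this sketch of the suggested check (the types are simplified stand-ins for the real status objects, not the actual e2e helpers):

```go
package main

import "fmt"

// checkSingleStatus returns a descriptive error instead of an empty
// string when the count is off, so the Eventually failure message says
// what was actually found.
func checkSingleStatus(statuses []string) error {
	if len(statuses) != 1 {
		return fmt.Errorf("expected exactly 1 ServiceL2Status, got %d: %v", len(statuses), statuses)
	}
	return nil
}

func main() {
	fmt.Println(checkSingleStatus([]string{"svc-node1"}))
	fmt.Println(checkSingleStatus(nil))
}
```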
if len(statuses) != 1 {
	return ""
}
if len(statuses[0].OwnerReferences) != 1 {
	return ""
same, also can you verify the owner ref is the pod?
	return err.Error()
}
if len(statuses) != 1 {
	return ""
same
if len(statuses) != 1 {
	return ""
}
if len(statuses[0].Status.Interfaces) == 0 {
	return ""
same + missing owner ref validation
Considering the name of the CR, I have not paid much attention to it, so I just chose this because it was the first thing that came to mind. We can decide the final name for the CRs later. The current logic has removed the dependency on CR names, so I think this is a relatively independent part.
charts/metallb/values.yaml
@@ -270,6 +270,7 @@ speaker:
  enabled: true
  # ignore the exclude-from-external-loadbalancer label
  ignoreExcludeLB: false
  layer2Status: false
nit: wdyt about calling this enableLayer2ServiceStatus and default to true?
if we are confident with the implementation, this should not be optional. I added the feature flag only because then we could release a metallb version.
We should even roll back the two commits from #2312
I actually like the idea of keeping the option to disable if someone really wants to (for the sake of "performance impact" or w.e) while defaulting to enabled, wdyt?
I'd rather have it and that's it. If we want to leave the code that disables it and the parameter, fine (even though it comes with a complexity cost, even if little), but I would hide the parameter from helm until we realize this doesn't scale well (and even if it does, maybe it will be better to optimize it rather than allowing users to disable).
speaker/main.go
	// TODO: we are hiding the feature behind a feature flag because of https://github.com/metallb/metallb/issues/2311
	// This flag can be removed once the issue is fixed.
-	enableL2ServiceStatus = flag.Bool("enable-l2-service-status", false, "enables the experimental l2 service status feature")
+	enableL2ServiceStatus = flag.Bool("enable-l2-service-status", true, "enables the experimental l2 service status feature")
nit: remove the comments, also I'd say this is not experimental once we default to true
-	level.Debug(r.Logger).Log("controller", "Layer2StatusReconciler", "enqueueing", "object", evt)
-	// unify req.Name to svcName-nodeName
+	level.Debug(r.Logger).Log("controller", "Layer2StatusReconciler", "enqueueing", "object", object)
+	label := object.GetLabels()
what I mean is doing the same as before + what you added to the predicate: verifying the object is an l2status, that the labels aren't nil etc. (I guess being validated by the predicate is enough, this is just to be on the safe side)
e2etest/go.mod
replace (
	k8s.io/kubectl => k8s.io/kubectl v0.29.3
	k8s.io/kubelet => k8s.io/kubelet v0.29.3
)
yeah it's a bit odd, can you try removing it from the commit and see what happens in ci?
		return ""
	}
	return ss[0].Status.Node
}, 5*time.Second).Should(Equal(speakingNode))
})
can we also add validation that the status is deleted after the service is deleted? (I think it's not there, ignore this if I'm wrong)
@oribon pls take a look again
Left a few nits, overall this looks good from my side.
Regarding the object's name I'm ok with the current approach, but I think we should opt for some generated name ("l2-" / "NODE-NAME-" / etc.) as I believe a node-service-namespace format is a bit cumbersome. If we go that route we should also make sure the service's name/namespace is visible when running kubectl get -o wide.
Also, can you organize the commits as we did last time?
I'll be away for the next ~2.5 weeks, thanks a lot for your contributions with this feature we really appreciate it!
/cc @fedepaol
	}
	return err
}, 2*time.Minute, 1*time.Second).ShouldNot(HaveOccurred())
Expect(l2Statuses).To(HaveLen(1))
why this change? I think keeping it in the eventually is clearer (returning an error with what we got if it's not 1)
e2etest/l2tests/l2.go
	}
	return err
}, 2*time.Minute, 1*time.Second).ShouldNot(HaveOccurred())
Expect(l2Statuses).To(HaveLen(0))
same (have it in eventually)
	return ctrl.Result{}, nil
}

func (r *Layer2StatusReconciler) SetupWithManager(mgr ctrl.Manager) error {
	p := predicate.NewPredicateFuncs(func(object client.Object) bool {
		if s, ok := object.(*v1beta1.ServiceL2Status); ok {
-			return strings.HasSuffix(s.Name, r.NodeName)
+			return r.NodeName == s.GetLabels()[LabelAnnounceNode]
nit: add a verification that the labels aren't nil
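A quick sketch of the guarded predicate check (the label key constant is illustrative, not necessarily the real value). Worth noting that indexing a nil map in Go already yields the zero value, so the explicit guard is mostly for readability:

```go
package main

import "fmt"

// LabelAnnounceNode is a hypothetical label key used for illustration.
const LabelAnnounceNode = "metallb.io/announce-node"

// matchesNode sketches the predicate with an explicit nil-labels guard.
// A lookup on a nil map returns "" anyway, so this cannot panic either way;
// the guard just makes the intent obvious.
func matchesNode(labels map[string]string, nodeName string) bool {
	if labels == nil {
		return false
	}
	return labels[LabelAnnounceNode] == nodeName
}

func main() {
	fmt.Println(matchesNode(nil, "node-a"))
	fmt.Println(matchesNode(map[string]string{LabelAnnounceNode: "node-a"}, "node-a"))
}
```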
-	level.Debug(r.Logger).Log("controller", "Layer2StatusReconciler", "enqueueing", "object", evt)
-	// unify req.Name to svcName-nodeName
+	level.Debug(r.Logger).Log("controller", "Layer2StatusReconciler", "enqueueing", "object", object)
+	label := object.GetLabels()
I think this is still the same since last time
// test service is located in testNamespace
// speaker pod is deployed in speakerNamespace
// the layer2 status crs are maintained in speakerNamespace
err = func(names ...string) error {
nit: having this as a loop over []string{testNamespace, speakerNamespace} is more readable imo
@fedepaol
-	level.Debug(r.Logger).Log("controller", "Layer2StatusReconciler", "enqueueing", "object", evt)
-	// unify req.Name to svcName-nodeName
+	level.Debug(r.Logger).Log("controller", "Layer2StatusReconciler", "enqueueing", "object", object)
+	label := object.GetLabels()
See my comment above. I think it does make sense to not validate against labels, because some external actor could mess with them. On the other hand, I'd (either here or in the predicate) at least log if the object is not of the right type. Maybe even panic, as it is something that we are not expecting.
@@ -14,10 +19,10 @@ require (
	github.com/prometheus/client_golang v1.19.0 // indirect
	go.uber.org/zap v1.26.0 // indirect
	golang.org/x/sys v0.18.0 // indirect
-	k8s.io/api v0.29.3
+	k8s.io/api v0.29.4
is there a reason for these bumps?
These are submitted with this: #2351 (comment)
@fedepaol So should I downgrade this dependency version?
@lwabish as always, no pressure, just checking you are not waiting on something from us.
Oh, yes. I have been working on something related to a job switch and am almost done. I should be able to continue in the coming days this week 🤓
@fedepaol Please take a look at the current implementation. 😄
will do by eod, thanks!
-	svcName := strings.TrimSuffix(req.Name, fmt.Sprintf("-%s", r.NodeName))
-	ipAdvS := r.StatusFetcher(types.NamespacedName{Name: svcName, Namespace: req.Namespace})
+	serviceName, serviceNamespace := req.Name, req.Namespace
+	statusName, statusNamespace := fmt.Sprintf("%s-%s-%s", req.Name, req.Namespace, r.NodeName), r.Namespace
nit: using serviceName as the argument in the formatting string would make it clearer how the statusName is built.
Also, @oribon had concerns about generating the name this way, with the risk of going over the maximum size. The alternative would be to use a generated name, but doing so would make the status-to-service/node binding rely only on something mutable like labels.
If we do that, we'd risk somebody changing the label and leaving a rogue status around. A possible alternative would be:
- generated name
- node and service name as labels (for better discoverability)
- node and service name explicit in the status part (to provide immutability of the information)
If we do that, we'll probably have to add an index as well so we can retrieve the entry easily (we won't be able to do that by name anymore).
Thoughts?
Based on the outcome of this discussion, we'll have to choose what to do with the predicate.
What do you mean by adding an index? Can you give me a quick example for better understanding?
Something along the line of what we do here https://github.com/metallb/metallb/blob/main/internal/k8s/k8s.go#L246 where we set a key to retrieve the ep slice based on the service name
+1 for that idea!
Let me try to tidy this up.
With completely generated names for the CRs, along with an index that maintains the node->svc relationship, we would use something like this in the predicate to determine if a CR belongs to a specific node (speaker), am I right?
metallb/internal/k8s/controllers/endpoints.go
Lines 16 to 18 in 83b5dd4
if err := cl.List(ctx, &slices, client.MatchingFields{epslices.SlicesServiceIndexName: serviceName.String()}); err != nil {
	return nil, err
}
yep, that's correct
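Conceptually, the field index discussed above acts like a map from the indexed key (the service name) to the matching objects, so a List with MatchingFields becomes a lookup instead of a scan over all statuses. A minimal stdlib sketch of that idea (the types and names are illustrative, not the real controller-runtime API):

```go
package main

import "fmt"

// status is a simplified stand-in for a ServiceL2Status object with a
// generated name.
type status struct {
	name    string // generated, carries no meaning
	service string // the field we index on
}

// buildIndex mimics what a controller-runtime field indexer maintains:
// indexed-field value -> the objects carrying that value.
func buildIndex(all []status) map[string][]status {
	idx := make(map[string][]status)
	for _, s := range all {
		idx[s.service] = append(idx[s.service], s)
	}
	return idx
}

func main() {
	idx := buildIndex([]status{
		{name: "a1b2c", service: "svc-a"},
		{name: "d3e4f", service: "svc-a"},
		{name: "g5h6i", service: "svc-b"},
	})
	// Retrieval by service no longer depends on the object name.
	fmt.Println(len(idx["svc-a"]), len(idx["svc-b"]))
}
```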
In order to avoid dangling status resources, we add an owner reference on each ServiceL2Status pointing to the speaker pod that owns it. Because of that, we also move the instances to MetalLB's namespace. Labels are added to make it easier to understand which node/speaker is announcing a given service. Signed-off-by: lwabish <wubw@pku.edu.cn>
Added validation for the owner reference on l2status instances. Validate l2status instances in the MetalLB speaker namespace instead of the service namespace. Signed-off-by: lwabish <wubw@pku.edu.cn>
We validate that the owner reference is added to l2status instances correctly and that the l2status instances are deleted correctly, without dangling, with the help of owner references. Also, we validate the status instances in the speaker namespace instead of the service namespace because of the latest update of the implementation. Signed-off-by: lwabish <wubw@pku.edu.cn>
…config We use the k8s downward API to expose the speaker pod name as an env variable. Meanwhile, we update the RBAC config to allow the speaker pod to get speaker pods. Finally, `inv generatemanifests` was run to regenerate the manifests. With these manifest modifications, the owner reference implementation can work in a Kubernetes cluster. Signed-off-by: lwabish <wubw@pku.edu.cn>
Use the k8s downward API to expose the speaker pod name as an env variable. Update the RBAC config to allow the speaker to get the speaker pod. With these modifications, a MetalLB installation from helm can work correctly. Signed-off-by: lwabish <wubw@pku.edu.cn>
Is this a BUG FIX or a FEATURE?:
/kind bug
What this PR does / why we need it:
fix #2311
Special notes for your reviewer:
Release note: