pilot: update pods in EndpointShardz when labels change #50432

Merged: 2 commits, Apr 16, 2024

howardjohn
Member

Fixes #43694
Fixes #50431

@howardjohn howardjohn requested a review from a team as a code owner April 12, 2024 18:04
@howardjohn howardjohn added the release-notes-none Indicates a PR that does not require release notes. label Apr 12, 2024
@istio-testing istio-testing added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 12, 2024
Comment on lines +138 to +139
// Annotations are only used in endpoints in one case, so just compare that one
relevantAnnotationsChanged := old.Annotations[constants.AmbientRedirection] != cur.Annotations[constants.AmbientRedirection]
Contributor

@bleggett bleggett Apr 12, 2024

Oh this is why you wanted to ditch the annotation.

Can you add a TODO here that links to #50355?

Alternatively, we could just compare all annotations and labels, but I'm assuming that's measurably slower.

Member Author

This isn't actually related to my discussion around the annotation -- even if we do that, we still need this IMO. For instance, if I enroll the entire namespace, I think the CNI will still annotate the pod even with that proposal?

Contributor

@bleggett bleggett Apr 12, 2024

I'm saying we could make the annotation the CNI applies a label instead (which that issue discusses as an option), and then we wouldn't need to check annotations JUST for ambient status.

If we still choose to do that in #50355, we would need to update this anyway.

Member Author

Ah ok, at one point it had been discussed "humans use labels, machines (CNI) use annotations". I don't mind the annotation check here really -- I wouldn't weigh it too much on #50355
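
For readers following this thread, the check under discussion can be sketched as below. This is a minimal, self-contained illustration rather than the PR's actual code: `podMeta`, `endpointRelevantChange`, and the `ambientRedirectionKey` string are hypothetical stand-ins (the real value of `constants.AmbientRedirection` is not shown in this conversation), and combining the two conditions with `||` is an assumption based on the "labels/annotations updated" comment in the diff.

```go
package main

import (
	"fmt"
	"maps"
)

// ambientRedirectionKey is a placeholder for constants.AmbientRedirection;
// the real key string is not shown in this conversation.
const ambientRedirectionKey = "example.io/redirection"

// podMeta is a hypothetical stand-in for the label and annotation maps on v1.Pod.
type podMeta struct {
	Labels      map[string]string
	Annotations map[string]string
}

// endpointRelevantChange reports whether a pod update should refresh endpoints.
// Any label change counts, but only a single annotation key is compared.
// Joining the two conditions with || is an assumption of this sketch.
func endpointRelevantChange(old, cur podMeta) bool {
	labelsChanged := !maps.Equal(old.Labels, cur.Labels)
	relevantAnnotationsChanged := old.Annotations[ambientRedirectionKey] != cur.Annotations[ambientRedirectionKey]
	return labelsChanged || relevantAnnotationsChanged
}

func main() {
	old := podMeta{
		Labels:      map[string]string{"app": "a"},
		Annotations: map[string]string{"note": "x"},
	}
	cur := podMeta{
		Labels:      map[string]string{"app": "a"},
		Annotations: map[string]string{"note": "y"},
	}
	// Only an unrelated annotation differs here.
	fmt.Println(endpointRelevantChange(old, cur))

	// Now the one annotation that matters differs as well.
	cur.Annotations[ambientRedirectionKey] = "enabled"
	fmt.Println(endpointRelevantChange(old, cur))
}
```

Comparing one annotation key keeps the check cheap and avoids reacting to annotation churn that has no effect on endpoints; if the ambient marker became a label, the label comparison alone would cover it.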

@howardjohn howardjohn force-pushed the pilot/label-update-eds-update branch from 457f635 to ff923fc Compare April 12, 2024 21:15
@howardjohn
Member Author

/retest

Contributor

@ramaraochavali ramaraochavali left a comment

Couple of minor comments

pilot/pkg/serviceregistry/kube/controller/controller.go (outdated, resolved)
pilot/pkg/serviceregistry/kube/controller/pod.go (outdated, resolved)
func (c *Controller) recomputeServiceForPod(pod *v1.Pod) {
	allServices := c.services.List(pod.Namespace, klabels.Everything())
	cu := sets.New[model.ConfigKey]()
	services := getPodServices(allServices, pod)
Member

getPodServices is very costly, especially when the number of services is huge.

Member Author

This function is called extremely rarely -- only if a pod's labels change. We already call this WAY more often in other codepaths (GetProxyServiceTargets). I don't think it's so bad to list a namespace's services each time that happens. Even a large cluster isn't going to have >1k services in one namespace probably?

Optimizing this is not worth it IMO -- it would be very complex.
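
To make the cost being debated concrete, here is a rough sketch of what matching one pod against every service in a namespace involves. It is not Istio's `getPodServices`; the `service` type and the `selects` and `servicesForPod` helpers are hypothetical simplifications, and treating an empty selector as matching nothing is an assumption of this sketch.

```go
package main

import "fmt"

// service is a hypothetical, simplified stand-in for a Kubernetes Service.
type service struct {
	Name     string
	Selector map[string]string
}

// selects reports whether every key/value pair in selector appears in labels.
// An empty selector matches nothing in this sketch.
func selects(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// servicesForPod does a linear scan over the services of one namespace,
// so its cost grows with the number of services and the size of their selectors.
func servicesForPod(all []service, podLabels map[string]string) []string {
	var out []string
	for _, s := range all {
		if selects(s.Selector, podLabels) {
			out = append(out, s.Name)
		}
	}
	return out
}

func main() {
	all := []service{
		{Name: "web", Selector: map[string]string{"app": "web"}},
		{Name: "db", Selector: map[string]string{"app": "db"}},
	}
	fmt.Println(servicesForPod(all, map[string]string{"app": "web", "version": "v1"}))
}
```

The scan is linear in the number of services in the namespace, which is why the argument above turns on how often it runs and how many services a single namespace realistically holds.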

Member

> even a large cluster isn't going to have >1k services in one namespace probably?

Maybe

> This function is called extremely rarely

Suggest we add a feature flag, this could be a breaking change.

Member Author

> Suggest we add a feature flag, this could be a breaking change.

How can it be a breaking change? Or do you just mean a performance regression?

Member

I mean a performance regression.

// If labels/annotations updated, trigger proxy push
labelsChanged := !maps.Equal(old.Labels, cur.Labels)
// Annotations are only used in endpoints in one case, so just compare that one
relevantAnnotationsChanged := old.Annotations[constants.AmbientRedirection] != cur.Annotations[constants.AmbientRedirection]
Member

I do not think we should change the redirection behavior in flight.

Member Author

In envoy or in all of ambient?

If we do it in ztunnel but not envoy it causes an outage when you switch

Member

All of ambient. Remember, we have some fixed operations based on labels for ambient in the CNI.

Member

You can change the control plane, but you cannot make the CNI redo its work.

Member Author

This topic has already been thoroughly discussed, and the conclusion was that supporting dynamic changes is a hard requirement for ambient. See #48876 for some of the past discussion.

if len(endpoints) > 0 {
	c.opts.XDSUpdater.EDSCacheUpdate(shard, string(hostname), svc.Namespace, endpoints)
}
cu.Insert(model.ConfigKey{
Member

We do not need to insert here; if a service->endpoint relationship changed, the endpoint handler will do this.

Member Author

It's not the service->endpoint relationship. Neither the service nor the endpoint changed in k8s; the pod did.

Istio's endpoint representation augments EndpointSlice with Pod info, so this ensures if only the pod changes the IstioEndpoint updates.
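
The point about augmentation can be illustrated with a small sketch. The types below (`sliceAddress`, `pod`, `endpoint`) and the `buildEndpoints` helper are hypothetical simplifications, not Istio's real `IstioEndpoint` or its builder; they only show how a derived endpoint can change when the pod changes while the slice data stays identical.

```go
package main

import "fmt"

// sliceAddress is a hypothetical stand-in for one address taken from an EndpointSlice.
type sliceAddress struct {
	IP      string
	PodName string
}

// pod is a hypothetical stand-in carrying only the fields used in this sketch.
type pod struct {
	Name   string
	Labels map[string]string
}

// endpoint is a simplified stand-in for an IstioEndpoint-like record:
// address data from the slice plus labels copied from the backing pod.
type endpoint struct {
	Address string
	Labels  map[string]string
}

// buildEndpoints joins slice addresses with pod labels looked up by pod name.
func buildEndpoints(addrs []sliceAddress, pods map[string]pod) []endpoint {
	out := make([]endpoint, 0, len(addrs))
	for _, a := range addrs {
		ep := endpoint{Address: a.IP}
		if p, ok := pods[a.PodName]; ok {
			ep.Labels = p.Labels
		}
		out = append(out, ep)
	}
	return out
}

func main() {
	addrs := []sliceAddress{{IP: "10.0.0.1", PodName: "web-0"}}
	pods := map[string]pod{
		"web-0": {Name: "web-0", Labels: map[string]string{"version": "v1"}},
	}
	before := buildEndpoints(addrs, pods)

	// Only the pod changes; the slice addresses are untouched.
	pods["web-0"] = pod{Name: "web-0", Labels: map[string]string{"version": "v2"}}
	after := buildEndpoints(addrs, pods)

	fmt.Println(before[0].Labels["version"], after[0].Labels["version"])
}
```

Because the slice input is the same in both calls, a handler that reacts only to slice events would never rebuild the endpoint, which is the gap a pod-triggered recompute is meant to close.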

@howardjohn
Member Author

/retest

@stevenctl
Contributor

LGTM

@istio-testing istio-testing merged commit c537c34 into istio:master Apr 16, 2024
28 checks passed