Allow setting priorityClassName on ServiceLB daemonset. #10033

Open

josephshanak opened this issue Apr 27, 2024 · 5 comments

@josephshanak
Is your feature request related to a problem? Please describe.
I would like to set priorityClassName on all of the pods in my cluster so I can control the order in which they are preempted. The pods created by the ServiceLB daemonsets do not have a priorityClassName, so they receive the default priority of 0, which is lower than the other priority classes I have defined. This means these pods will likely be preempted when the cluster is over-committed.

Describe the solution you'd like
I would like the ability to set a priorityClassName on the pods created by ServiceLB / k3s:

Template: core.PodTemplateSpec{
	ObjectMeta: meta.ObjectMeta{
		Labels: labels.Set{
			"app":             name,
			svcNameLabel:      svc.Name,
			svcNamespaceLabel: svc.Namespace,
		},
	},
	Spec: core.PodSpec{
		ServiceAccountName:           "svclb",
		AutomountServiceAccountToken: utilsptr.To(false),
		SecurityContext: &core.PodSecurityContext{
			Sysctls: sysctls,
		},
		Tolerations: []core.Toleration{
			{
				Key:      util.MasterRoleLabelKey,
				Operator: "Exists",
				Effect:   "NoSchedule",
			},
			{
				Key:      util.ControlPlaneRoleLabelKey,
				Operator: "Exists",
				Effect:   "NoSchedule",
			},
			{
				Key:      "CriticalAddonsOnly",
				Operator: "Exists",
			},
		},
	},
},

Perhaps via a command-line option --servicelb-priority-class=my-priority-class.
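For illustration only, a minimal sketch of what that could look like in the ServiceLB controller, assuming a hypothetical helper and flag value (PriorityClassName is the standard field on core.PodSpec; the helper and flag name are made up for the example, not taken from k3s):

package servicelb

import (
	core "k8s.io/api/core/v1"
)

// withPriorityClass is a hypothetical helper: it copies the value of a flag
// like --servicelb-priority-class into the standard PriorityClassName field
// of the pod spec built for the svclb DaemonSet. An empty value leaves the
// pods at the default priority of 0, as today.
func withPriorityClass(spec core.PodSpec, priorityClassName string) core.PodSpec {
	if priorityClassName != "" {
		spec.PriorityClassName = priorityClassName
	}
	return spec
}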

Describe alternatives you've considered

  1. I could use a Priority Class with globalDefault: true to define a global default. However, this means pods without a priorityClassName will be scheduled with the same priority, which is not ideal because a priorityClassName could be forgotten. (A sketch of such a PriorityClass follows this list.)

  2. I could create a priority class with a negative value (per https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass this should be fine) and make it the global default, so that in practice only pods without an explicit priorityClassName, such as the ServiceLB pods, would receive it (however, this is not ideal for the same reason as above).

  3. k3s could create the pods with system-cluster-critical or system-node-critical priority classes.

  4. I could disable ServiceLB with --disable=servicelb and install another load balancer provider like MetalLB, which seems to support priorityClassName via a Helm option (metallb/metallb#995).
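For reference on alternatives 1 and 2, this is roughly the object involved. This is only a sketch; the name and value below are placeholders, not anything proposed in this issue.

package servicelb

import (
	scheduling "k8s.io/api/scheduling/v1"
	meta "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// globalDefaultPriority sketches the PriorityClass from alternative 1: a
// cluster-wide default applied to any pod that does not set its own
// priorityClassName. Alternative 2 would instead use a negative Value here.
var globalDefaultPriority = scheduling.PriorityClass{
	ObjectMeta: meta.ObjectMeta{
		Name: "cluster-default", // placeholder name
	},
	Value:         0, // placeholder value
	GlobalDefault: true,
	Description:   "Fallback priority for pods without an explicit priorityClassName.",
}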

@ChristianCiach

You could probably also use a mutating admission controller like Kyverno to modify the pod-spec based on custom rules. See: https://kyverno.io/docs/writing-policies/mutate/

This is surely not an attractive option, but it's a possibility nonetheless.

@brandond
Contributor

Seems reasonable. See the linked PR.

@josephshanak
Author

PR looks good to me! And an annotation seems much more flexible!
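For readers following along, a rough sketch of the annotation-driven approach being discussed. The annotation key and helper below are hypothetical, chosen for illustration; the real key is defined in the linked PR and is not quoted in this thread.

package servicelb

import (
	core "k8s.io/api/core/v1"
)

// Hypothetical annotation key, for illustration only.
const priorityClassNameAnnotationKey = "example.com/servicelb-priorityclassname"

// applyPriorityClassFromService shows how a per-Service annotation could
// drive the priorityClassName of the svclb pods created for that Service,
// which is what makes the annotation approach more flexible than a single
// cluster-wide flag.
func applyPriorityClassFromService(svc *core.Service, spec *core.PodSpec) {
	if pc, ok := svc.Annotations[priorityClassNameAnnotationKey]; ok && pc != "" {
		spec.PriorityClassName = pc
	}
}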

@brandond
Contributor

brandond commented May 6, 2024

The pods created by ServiceLB daemonsets do not have a priorityClassName so they receive the default priority of 0, which is lower than other priority classes I have defined.

I will note that the svclb pods have no requests or reservations and consume basically no resources, since all they do is go to sleep after adding iptables rules.

root@k3s-server-1:~# kubectl top pod -n kube-system
NAME                                      CPU(cores)   MEMORY(bytes)
coredns-6799fbcd5-zxktb                   2m           13Mi
local-path-provisioner-6c86858495-dpfb6   1m           6Mi
metrics-server-54fd9b65b-9xqxs            5m           21Mi
svclb-traefik-49baafe9-xwvrd              0m           0Mi
traefik-7d5f6474df-hfhwd                  1m           26Mi

This means these pods will likely be preempted when the cluster is over-committed.

Are you actually seeing the svclb pods get preempted, or is this a theoretical problem?

@josephshanak
Author

Are you actually seeing the svclb pods get preempted, or is this a theoretical problem?

This is theoretical. I have not experienced this. I came upon this while attempting to assign priority classes to all pods.
