Gateway API stops working after restart of single node 'cluster' #32596

Open
3 tasks done
Lennie opened this issue May 16, 2024 · 1 comment
Labels
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • need-more-info: More information is required to further debug or fix the issue.
  • needs/triage: This issue requires triaging to establish severity and next steps.

Comments

Lennie commented May 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

On a single-node cluster, after I reboot the machine and everything has started again, Gateway API does not work correctly, in particular when deploying new Gateway resources. The cause seems fairly simple: at startup, Cilium queries the Kubernetes API server for the Gateway API CRDs and gets a timeout. It retries, but never gets the response it needs. Maybe the non-existence of the CRD is cached? It repeats for a while until it gives up.

Describing a newly created Gateway resource only reports: "waiting for controller".

Cilium Version

tested with:
1.15.4
1.15.5
1.16.0-pre.2

1.16.0-pre.2 has no output

Kernel Version

tested with:
6.1.0-18-amd64

Kubernetes Version

tested with: v1.29.5

Regression

No response

Sysdump

No response

Relevant log output

level=info msg="Checking for required GatewayAPI resources" requiredGVK="[gateway.networking.k8s.io/v1, Kind=gatewayclasses gateway.networking.k8s.io/v1, Kind=gateways gateway.networking.k8s.io/v1, Kind=httproutes gateway.networking.k8s.io/v1beta1, Kind=referencegrants gateway.networking.k8s.io/v1alpha2, Kind=grpcroutes gateway.networking.k8s.io/v1alpha2, Kind=tlsroutes]" subsys=gateway-api
level=error msg="Required GatewayAPI resources are not found, please refer to docs for installation instructions" error="Get \"https://10.96.0.1:443/apis/apiextensions.k8s.io/v1/customresourcedefinitions/gatewayclasses.gateway.networking.k8s.io\": dial tcp 10.96.0.1:443: i/o timeout" subsys=gateway-api

10.96.0.1 is the ClusterIP of the kube-apiserver.

Later on it does retry, but the retries fail (maybe the non-existence of the CRD is cached?). This repeats for a while until it gives up:

level=error msg="kind must be registered to the Scheme" error="no kind is registered for the type v1.Gateway in scheme \"k8s.io/client-go/kubernetes/scheme/register.go:80\"" logger=controller-runtime.source.EventHandler subsys=controller-runtime
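
One possible reading of the two log excerpts above is that the startup check treats a connection timeout the same way as a missing CRD. The following Go sketch is not Cilium's code; under that assumption, it only illustrates how a startup probe could keep retrying transient failures with backoff while still stopping at once on a definitive "not found" answer. All names (waitForCRDs, simulate, errTimeout, errNotFound) are hypothetical, and the API server is simulated by a simple counter.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors standing in for the two outcomes discussed above.
var (
	errTimeout  = errors.New("i/o timeout")   // transient: API server not reachable yet
	errNotFound = errors.New("crd not found") // definitive: CRD is really not installed
)

// waitForCRDs calls check until it succeeds, returns errNotFound, or ctx expires.
// Any other error is treated as transient and retried with exponential backoff.
// It returns the number of attempts made and the final error.
func waitForCRDs(ctx context.Context, check func(context.Context) error, initial, maxDelay time.Duration) (int, error) {
	delay := initial
	for attempt := 1; ; attempt++ {
		err := check(ctx)
		if err == nil || errors.Is(err, errNotFound) {
			return attempt, err
		}
		select {
		case <-ctx.Done():
			return attempt, fmt.Errorf("gave up after %d attempts: %w", attempt, err)
		case <-time.After(delay):
		}
		// Double the delay, capped at maxDelay.
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

// simulate runs waitForCRDs against a fake API server that times out for the
// first `failures` probes and succeeds afterwards.
func simulate(failures int) (int, error) {
	calls := 0
	check := func(context.Context) error {
		calls++
		if calls <= failures {
			return errTimeout
		}
		return nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return waitForCRDs(ctx, check, time.Millisecond, 10*time.Millisecond)
}

func main() {
	attempts, err := simulate(2)
	fmt.Println(attempts, err) // prints "3 <nil>"
}
```

The point of the sketch is the distinction in the first branch of the loop: only a definitive answer ends the check, so an API server that comes up a few seconds late does not leave the controller permanently disabled.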

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct
Lennie added the kind/bug, kind/community-report, and needs/triage labels May 16, 2024
Contributor

lmb commented May 21, 2024

Are you running into this on kind or similar? A way to reproduce this would be great.

lmb added the need-more-info label May 21, 2024