[BUG] Trailing / in Rancher server URL makes Harvester clusters registration fail #45403

Open
m-ildefons opened this issue May 7, 2024 · 0 comments
Labels
kind/bug Issues that are defects reported by users or that we know have reached a real release


Rancher Server Setup

  • Rancher version: any
  • Installation option (Docker install/Helm Chart): vcluster / helm
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc):
  • Proxy/Cert Details: behind Nginx ingress proxy

Information about the Cluster

  • Kubernetes version: v1.27.12-k3s1
  • Cluster Type (Local/Downstream): vcluster on Harvester
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): Imported

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom)
    • If custom, define the set of permissions: admin

Describe the bug

When the Rancher server is deployed behind an Nginx ingress proxy and initialized with a server URL with a trailing /, the Rancher agent fails to register downstream clusters.

To Reproduce

  1. Install Harvester
  2. Install rancher-vcluster plugin
  3. Initialize Rancher with a URL with a trailing /
  4. Try to register the Harvester cluster under virtualization management

Result
The Rancher agent is deployed, but the cluster remains in the "Waiting for API" state indefinitely.

Expected Result
The cluster registration should complete without errors, and the Harvester cluster should reach the "Active" state in Rancher.

Additional context
If the Rancher server URL is configured with a trailing /, e.g. https://rancher.example.com/, the Rancher agent is deployed with that same URL as the value of its environment variable CATTLE_SERVER. During initialization, the agent uses this value to generate the secret stv-aggregation, which contains a key url. The value stored under url is the plain string concatenation of CATTLE_SERVER and the fixed string /v3/connect, which yields e.g. https://rancher.example.com//v3/connect. This is the URL the agent then uses to finalize the registration process with the Rancher server.
However, if Rancher is deployed behind an Nginx ingress proxy, the proxy answers any HTTP request to this URL with an HTTP status code 301, redirecting to /v3/connect (i.e. without the double / in the path). The Rancher agent ignores this response, so the registration never completes.
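
To make the failure mode concrete, here is a minimal sketch of the concatenation described above. It is written in Python purely for illustration and is not the agent's actual code; the function name naive_connect_url is hypothetical, and only the /v3/connect suffix and the example URL come from this report.

```python
from urllib.parse import urlsplit

CONNECT_PATH = "/v3/connect"  # fixed suffix named in this report


def naive_connect_url(cattle_server: str) -> str:
    """Hypothetical stand-in for the plain concatenation described above."""
    return cattle_server + CONNECT_PATH


for server in ("https://rancher.example.com", "https://rancher.example.com/"):
    url = naive_connect_url(server)
    # The trailing slash in the server URL ends up as a double slash in the path.
    print(f"{server!r:35} -> {url}  (path: {urlsplit(url).path})")
```

With the trailing / the resulting path is //v3/connect, which is the request the proxy answers with the 301.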

There are several problems in the Rancher agent connected to this bug:

  • The agent does not log the unexpected HTTP status code, or even the fact that the connection attempt failed. This makes the situation hard to debug.
  • Rancher makes assumptions about the format of the URL (such as that it doesn't have a trailing /) without enforcing or even verifying them. If a server URL with a trailing / is invalid, it should be impossible to set it, and an appropriate message should be displayed to the user. Any assumptions about the format of this URL should be asserted through checks before the configuration is applied.
  • The Rancher agent talks to an HTTP API but, unlike a typical HTTP client, does not follow redirects. This makes the connection fragile as soon as a proxy is introduced.
  • The agent constructs URLs from user input by primitive string concatenation, which is brittle. It would be better to either normalize the user input or use URL handling functions (e.g. from a library) for further processing.
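
As a rough sketch of the last two points about input handling, the server URL could be validated and normalized once, before anything is derived from it. The example below uses Python's standard urllib.parse for illustration only; the helper names normalize_server_url and connect_url, and the exact validation rules (http/https only, no query or fragment), are assumptions and not existing Rancher behaviour.

```python
from urllib.parse import urlsplit, urlunsplit

CONNECT_PATH = "/v3/connect"  # fixed suffix named in this report


def normalize_server_url(raw: str) -> str:
    """Validate a server URL and strip trailing slashes (hypothetical helper)."""
    parts = urlsplit(raw.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {raw!r}")
    if parts.query or parts.fragment:
        raise ValueError(f"server URL must not carry a query or fragment: {raw!r}")
    # Drop trailing slashes so appending a path can never produce "//".
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def connect_url(server_url: str) -> str:
    """Build the connect URL from a normalized base (hypothetical helper)."""
    return normalize_server_url(server_url) + CONNECT_PATH


# Both spellings of the server URL now yield the same connect URL.
print(connect_url("https://rancher.example.com"))
print(connect_url("https://rancher.example.com/"))
```

Rejecting the value outright at configuration time would work equally well; the point is only that the assumption is checked in one place instead of being implied by later concatenation.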