[BUG] rancher/shell:v0.1.24 reporting manifest invalid #45424

Closed
slickwarren opened this issue May 8, 2024 · 2 comments
Labels
area/documentation kind/bug Issues that are defects reported by users or that we know have reached a real release kind/bug-qa Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement priority/1 release-note Note this issue in the milestone's release notes status/release-note-added

Comments

@slickwarren
Contributor

slickwarren commented May 8, 2024

Rancher Server Setup

  • Rancher version: 2.8.4-rc3, 2.7.13-rc4
  • Installation option (Docker install/Helm Chart): helm
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): rke2, 1.27.13 also tested on 1.28.9
  • Proxy/Cert Details: letsencrypt signed

Information about the Cluster

  • Kubernetes version: any
  • Cluster Type (Local/Downstream): local
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider):

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom)
    • If custom, define the set of permissions: this step is done before RBAC is a factor

Describe the bug

Pulling the image rancher/shell:v0.1.24 fails because of a problem with its manifest. This breaks airgap setups.

To Reproduce

  • pull rancher/shell:v0.1.24 with crane

Result
unable to copy the image as the manifest is reported as invalid

Expected Result

image should pull without issue

Screenshots

Additional context

for image rancher/shell:v0.1.24 I'm seeing a strange error with the manifest. It is causing issues in the airgap setup. It appears as though the manifest wasn't populated correctly (this is happening on both rancher 2.7.13 and 2.8.4)

docker manifest inspect rancher/shell:v0.1.24
no such manifest: docker.io/rancher/shell:v0.1.24
...
crane copy rancher/shell:v0.1.24 <registry>/rancher/shell:v0.1.24
Error: PUT https://<registry>/v2/rancher/shell/manifests/sha256:b3bfeb9e3a7e673bfaab317be86119c9122b4208b014148e041132dbde10bfc1: MANIFEST_INVALID: manifest invalid; map[]

The issue doesn't happen for v0.1.22 or v0.1.19.

The issue seems to be ignored when docker is used for the image pull/push, but it can be seen with tools like crane.


  • on docker 20.10.7 and/or crane v0.19.1 the issue is reproducible
  • on docker 23.10.6, the manifest inspect command will properly load the manifest for the image rancher/shell:v0.1.24
@slickwarren slickwarren added kind/bug Issues that are defects reported by users or that we know have reached a real release kind/bug-qa Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement status/release-blocker labels May 8, 2024
@slickwarren slickwarren added this to the v2.8-Next1 milestone May 8, 2024
@slickwarren slickwarren self-assigned this May 8, 2024
@slickwarren slickwarren changed the title [BUG] rancher/shell:v0.1.24 reporting manifest invalid in airgap setups [BUG] rancher/shell:v0.1.24 reporting manifest invalid May 9, 2024
@mallardduck
Member

To be succinct, the root of the issue here seems to come down to a change from Docker-based manifests to OCI-based manifests. Because the OCI format is "the future", it's unclear which fix we should take: telling end users to update Docker is just as valid a solution as us applying a workaround.

It works with Docker 23 because that is the CLI version that added support for OCI. See here: docker/cli#3990 (comment)

With that in mind, our support matrix (for current Rancher versions) does cite a minimum Docker version of 23.0.x for RKE1 installs. So it might be wise to start indicating that the Docker client (or other container clients) needs to match Docker 23.0 support. However, nothing that Docker publishes explicitly defines a compatibility chart (unlike k8s/kubectl). Yet Docker does always ship/install the same client version as the engine being installed.


The longer version

Within the context of (most of) our Rancher container images, it appears that any image produced by Drone uses the older Docker format. However, any projects moved to GHA for CI will be using the approved docker setup-buildx and build-push actions, and these actions use the current version of buildx, which produces OCI format images.

We can see this by comparing the mediaType on the manifest of a rancher/shell image created via Drone:

docker manifest inspect rancher/shell:v0.1.22|grep mediaType
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",

And compare this to a newer image created via GHA:

docker manifest inspect rancher/shell:v0.1.24|grep mediaType
   "mediaType": "application/vnd.oci.image.index.v1+json",
         "mediaType": "application/vnd.oci.image.manifest.v1+json",
         "mediaType": "application/vnd.oci.image.manifest.v1+json",

However, the OCI specs are the future of these formats: they are open standards based directly on the Docker ones, which is why Docker has also updated its tooling to produce OCI images over the last few years. With that in mind, any workaround should be seen as a temporary stopgap to give customers using old tools more time to update. (Those users MUST update eventually, just like we need to switch to OCI eventually.)

If we need to produce Docker-style container images, then we can potentially apply a workaround. Eventually I found this issue: docker/build-push-action#771. I have tested and confirmed that setting provenance: false will produce Docker images again.
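For anyone who wants to script the mediaType comparison above rather than eyeball the grep output, here is a minimal sketch. It only looks at the top-level mediaType of the JSON printed by docker manifest inspect, and it only knows the four media types quoted in this comment. The function name manifest_format is made up for illustration and is not part of any Rancher or Docker tooling.

```python
import json

# Media types copied from the `docker manifest inspect` output quoted above.
DOCKER_TYPES = {
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.docker.distribution.manifest.v2+json",
}
OCI_TYPES = {
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.oci.image.manifest.v1+json",
}


def manifest_format(manifest_json: str) -> str:
    """Classify a manifest by its top-level mediaType.

    Returns "docker", "oci", or "unknown" (hypothetical helper, not an
    official API). Nested per-platform entries are not inspected.
    """
    media_type = json.loads(manifest_json).get("mediaType", "")
    if media_type in DOCKER_TYPES:
        return "docker"
    if media_type in OCI_TYPES:
        return "oci"
    return "unknown"
```

Passing it the saved output of docker manifest inspect for a tag would flag images that older clients may reject, assuming the client used to fetch the manifest can read it in the first place.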

@MKlimuszka
Collaborator

Engineering requested release note:

Please note that support for Docker CLI 20.x has been sunset in Rancher due to it being EOL. Please update your local Docker CLI versions to 23.0.x or higher. Versions below this may not recognize OCI compliant Rancher image manifests.
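For airgap scripts that want to fail early on an old client, the 23.0 threshold from the release note can be checked with a small helper. This is a sketch under assumptions: the version string is assumed to come from something like docker version --format '{{.Client.Version}}', the function name cli_meets_minimum is hypothetical, and the check only compares version numbers against the release note. It does not probe actual OCI support.

```python
import re

# Minimum Docker CLI version taken from the release note above (23.0.x).
MIN_DOCKER_CLI = (23, 0)


def cli_meets_minimum(version: str) -> bool:
    """Return True if a Docker CLI version string is at least 23.0.

    Only the leading "major.minor" part is compared; suffixes such as
    "-ce" or a patch number are ignored. Hypothetical helper.
    """
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Docker version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_DOCKER_CLI
```

With the versions mentioned in this issue, 20.10.7 would be rejected and any 23.0.x client would pass.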
