Error pushing manifest list/index #191

Open
aamato80 opened this issue Nov 2, 2022 · 8 comments

@aamato80

aamato80 commented Nov 2, 2022

Hi all,

I am trying to use manifest-tool to create a multi-architecture Docker image.
As explained in your guide, I built the per-architecture images (arm64 and amd64 in my case) with kaniko and then tried to run manifest-tool.
I tried both YAML and spec mode, without success.
Either way I get an error like this one:
Error pushing manifest list/index to registry: sha256:0c91a4e37f4765d431b50d62439ba660b8b57ae75412fd45c371d9174c38e3df: manifest list/index references to blobs and/or manifests are missing in your target registry...

Here is an example of the YAML I used:

image: myrepo.com/myservice:latest
manifests:
  - image: myrepo.com/myservice-arm64:latest
    platform:
      architecture: arm64
      os: linux
  - image: myrepo.com/myservice-amd64:latest
    platform:
      architecture: amd64
      os: linux

Any suggestions? Is there something wrong in my configuration?
Many thanks!

@estesp
Owner

estesp commented Nov 8, 2022

Can you run the push with --debug and provide the output? It sounds like it thinks a required component of the full tree of contained images (configs, manifests, and layers) does not exist in the target repo.

It would also be good to know which registry you are pushing to (self-hosted? based on distribution/distribution? which version?)

@estesp
Owner

estesp commented Dec 1, 2022

Hi @aamato80 have you been able to try the command with --debug so I can help figure out your issue?

@b-morgenthaler

Hi @estesp I am taking over here since I think I am seeing the same issue as the OP and there was no progress regarding error/debug messages.
Here's the error/debug message I am facing:

level=debug msg="do request" digest="sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192" mediatype=application/vnd.docker.distribution.manifest.list.v2+json request.header.content-type=application/vnd.docker.distribution.manifest.list.v2+json request.header.user-agent=containerd/1.6.18+unknown request.method=PUT size=699 url="https://self_hosted:5001/v2/image_name/manifests/v1.5.0"
level=debug msg="fetch response received" digest="sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192" mediatype=application/vnd.docker.distribution.manifest.list.v2+json response.header.content-length=156 response.header.content-security-policy="sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation" response.header.content-type=application/json response.header.date="Thu, 20 Jul 2023 13:44:25 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.server="Nexus/3.41.0-01 (OSS)" response.header.strict-transport-security="max-age=7776000" response.header.x-content-type-options=nosniff response.header.x-xss-protection="1; mode=block" response.status="400 Bad Request" size=699 url="https://self_hosted:5001/v2/image_name/manifests/v1.5.0"
level=debug msg="unexpected response" body="{\"errors\":[{\"code\":\"BLOB_UNKNOWN\",\"message\":\"blob unknown to registry\",\"detail\":\"sha256:f5ef2458f9f1d711e98db59f2239e1171bc9ecf2c442a068e11fc62e105d2b0c\"}]}" digest="sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192" mediatype=application/vnd.docker.distribution.manifest.list.v2+json resp="&{400 Bad Request 400 HTTP/1.1 1 1 map[Content-Length:[156] Content-Security-Policy:[sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation] Content-Type:[application/json] Date:[Thu, 20 Jul 2023 13:44:25 GMT] Docker-Distribution-Api-Version:[registry/2.0] Server:[Nexus/3.41.0-01 (OSS)] Strict-Transport-Security:[max-age=7776000] X-Content-Type-Options:[nosniff] X-Xss-Protection:[1; mode=block]] 0xc0003a8700 156 [] false false map[] 0xc0000aed00 0xc00039c210}" size=699
level=fatal msg="Error pushing manifest list/index to registry: sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192: manifest list/index references to blobs and/or manifests are missing in your target registry: failed commit on ref \"index-self_hosted:5001/image_name:v1.5.0@sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192\": unexpected status: 400 Bad Request"

It's worth noting that

  • the push of the images before deploying the multi-arch index with your tool was successful (pulling the images works fine)
  • we are using a self-hosted registry (Sonatype Nexus Repository Manager OSS 3.41.0-01)
  • I am running your tool from a GitLab pipeline (as I do with kaniko for building/pushing the images) within a "docker" executor from a "curlimages/curl" based Docker image in version 2.0.8
    • Call is:
    • ./manifest-tool --debug --insecure --username ${DOCKER_DEPLOY_USER} --password ${DOCKER_DEPLOY_TOKEN} push from-args --ignore-missing --platforms linux/amd64,linux/arm64/v8 --template ${TARGET_DOCKER_REGISTRY}/${TARGET_DOCKER_GROUP}/ARCHVARIANT/${IMAGE_NAME}:${CI_COMMIT_TAG} --target ${TARGET_DOCKER_REGISTRY}/${TARGET_DOCKER_GROUP}/${IMAGE_NAME}:${CI_COMMIT_TAG}
  • I see this error only the first time new images are built and pushed. Subsequent calls to generate and push the multi-arch index never fail
  • Waiting before deploying the multi-arch index, to give the registry time to process the just-pushed images, does not prevent the error

A subsequent and passing call to your tool (without building/pushing images) produces this debug log:

level=debug msg="do request" digest="sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192" mediatype=application/vnd.docker.distribution.manifest.list.v2+json request.header.content-type=application/vnd.docker.distribution.manifest.list.v2+json request.header.user-agent=containerd/1.6.18+unknown request.method=PUT size=699 url="https://self_hosted:5001/v2/image_name/manifests/v1.5.0"
level=debug msg="fetch response received" digest="sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192" mediatype=application/vnd.docker.distribution.manifest.list.v2+json response.header.content-length=699 response.header.content-security-policy="sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation" response.header.content-type=application/vnd.docker.distribution.manifest.list.v2+json response.header.date="Thu, 20 Jul 2023 13:49:34 GMT" response.header.docker-content-digest="sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192" response.header.docker-distribution-api-version=registry/2.0 response.header.last-modified="Thu, 20 Jul 2023 13:49:34 GMT" response.header.server="Nexus/3.41.0-01 (OSS)" response.header.strict-transport-security="max-age=7776000" response.header.x-content-type-options=nosniff response.header.x-xss-protection="1; mode=block" response.status="201 Created" size=699 url="https://self_hosted:5001/v2/image_name/manifests/v1.5.0"
Digest: sha256:1d40802bba338d4abdc4ef8395f8827a04fc5b9591e941c52eae27382f5f1192 699
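
The registry's actual answer sits in the body= field of the failing debug output above. A minimal sketch for pulling it out of such a line, assuming the log keeps this key=value layout; `parse_logfmt` and `registry_errors` are hypothetical helper names (not part of manifest-tool), and the sample line is abridged from the failing log:

```python
import json
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value debug line into a dict (quoted values may contain spaces)."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def registry_errors(line: str) -> list:
    """Return (code, detail) pairs from the JSON error body, if the line has one."""
    body = parse_logfmt(line).get("body")
    if not body:
        return []
    return [(e.get("code"), e.get("detail")) for e in json.loads(body).get("errors", [])]


# Abridged from the failing push above (most fields dropped for brevity).
SAMPLE = (
    r'level=debug msg="unexpected response" '
    r'body="{\"errors\":[{\"code\":\"BLOB_UNKNOWN\",\"message\":\"blob unknown to registry\",'
    r'\"detail\":\"sha256:f5ef2458f9f1d711e98db59f2239e1171bc9ecf2c442a068e11fc62e105d2b0c\"}]}" '
    r'size=699'
)

for code, detail in registry_errors(SAMPLE):
    print(code, detail)
```

If several log records end up on one physical line, later keys overwrite earlier ones in this sketch; that is fine for body=, which appears only in the "unexpected response" record.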

@b-morgenthaler

Interesting data point: I ran an inspect with your tool before pushing the multi-arch index:

$ ./manifest-tool --insecure inspect ${TARGET_DOCKER_REGISTRY}/${TARGET_DOCKER_GROUP}/arm64v8/${IMAGE_NAME}:${CI_COMMIT_TAG}
Name: self_hosted:5001/image_name:v1.5.0 (Type: application/vnd.docker.distribution.manifest.v2+json)
Digest: sha256:4b06cb1d532a350ee47dca2766a5940af4111fcbef40ab412ad6ecf399ad1af6
Size: 1364
OS: linux
Arch: arm64
# Layers: 5
layer 01: digest = sha256:5af00eab97847634d0b3b8a5933f52ca8378f5f30a2949279d682de1e210d78b
layer 02: digest = sha256:7e982ec86ba103af9415fb80b62fb4d3b7256fe818db532dd6cc41bb337d182f
layer 03: digest = sha256:0c7b7546e0fe2ad9e0aa6b3d3b05ed37bcb158441cb0e95713dd3b76c8f35800
layer 04: digest = sha256:8e49e2b7b3144d6e26befd9944c0c2fca9d0cb0ce47959961e68828152f07eec
layer 05: digest = sha256:4fd84ae77ddde0a7bd16ee010d145da9f1ca4b3192eee7f973e5a083610fee76

$ ./manifest-tool --insecure inspect ${TARGET_DOCKER_REGISTRY}/${TARGET_DOCKER_GROUP}/amd64/${IMAGE_NAME}:${CI_COMMIT_TAG}
Name: self_hosted:5001/image_name:v1.5.0 (Type: application/vnd.oci.image.manifest.v1+json)
Digest: sha256:f8ef6f81748de0da98d902eb7da4aafc36590ed8bcd8cb15806c79453b0dd2bc
Size: 893
OS: linux
Arch: amd64
# Layers: 4
layer 01: digest = sha256:01085d60b3a624c06a7132ff0749efc6e6565d9f2531d7685ff559fb5d0f669f
layer 02: digest = sha256:f597caf2f79756536e25d4ff08317f77b988e141a34b41b525fde27cf84e9f76
layer 03: digest = sha256:1cca692a2d6413570af817aa136807de39b1b34bd5621c67758eb6c7188ffef6
layer 04: digest = sha256:b9829aedd8e18c8f886495b60d79db107335928f8952e412b20c184e4219bcd3

But the push still says something is missing:

time="2023-07-21T11:43:57Z" level=debug msg="do request" digest="sha256:ab9b3b1d72688fbbe37cf9bd5233ea351049f860b9012b2e9023f782899835f1" mediatype=application/vnd.docker.distribution.manifest.list.v2+json request.header.content-type=application/vnd.docker.distribution.manifest.list.v2+json request.header.user-agent=containerd/1.6.18+unknown request.method=PUT size=699 url="https://self_hosted:5001/v2/image_name/manifests/v1.5.0"
time="2023-07-21T11:43:57Z" level=debug msg="fetch response received" digest="sha256:ab9b3b1d72688fbbe37cf9bd5233ea351049f860b9012b2e9023f782899835f1" mediatype=application/vnd.docker.distribution.manifest.list.v2+json response.header.content-length=156 response.header.content-security-policy="sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation" response.header.content-type=application/json response.header.date="Fri, 21 Jul 2023 11:43:57 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.server="Nexus/3.41.0-01 (OSS)" response.header.strict-transport-security="max-age=7776000" response.header.x-content-type-options=nosniff response.header.x-xss-protection="1; mode=block" response.status="400 Bad Request" size=699 url="https://self_hosted:5001/v2/image_name/manifests/v1.5.0"
time="2023-07-21T11:43:57Z" level=debug msg="unexpected response" body="{\"errors\":[{\"code\":\"BLOB_UNKNOWN\",\"message\":\"blob unknown to registry\",\"detail\":\"sha256:4b06cb1d532a350ee47dca2766a5940af4111fcbef40ab412ad6ecf399ad1af6\"}]}" digest="sha256:ab9b3b1d72688fbbe37cf9bd5233ea351049f860b9012b2e9023f782899835f1" mediatype=application/vnd.docker.distribution.manifest.list.v2+json resp="&{400 Bad Request 400 HTTP/1.1 1 1 map[Content-Length:[156] Content-Security-Policy:[sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation] Content-Type:[application/json] Date:[Fri, 21 Jul 2023 11:43:57 GMT] Docker-Distribution-Api-Version:[registry/2.0] Server:[Nexus/3.41.0-01 (OSS)] Strict-Transport-Security:[max-age=7776000] X-Content-Type-Options:[nosniff] X-Xss-Protection:[1; mode=block]] 0xc0002bd400 156 [] false false map[] 0xc0001a9100 0xc0002c1080}" size=699
time="2023-07-21T11:43:57Z" level=fatal msg="Error pushing manifest list/index to registry: sha256:ab9b3b1d72688fbbe37cf9bd5233ea351049f860b9012b2e9023f782899835f1: manifest list/index references to blobs and/or manifests are missing in your target registry: failed commit on ref \"index-self_hosted:5001/image_name:v1.5.0@sha256:ab9b3b1d72688fbbe37cf9bd5233ea351049f860b9012b2e9023f782899835f1\": unexpected status: 400 Bad Request"

@estesp
Owner

estesp commented Jul 27, 2023

@b-morgenthaler interesting; so the specific piece of content reported as missing in the first error you included is:

{
   "errors":
       [
           { "code": "BLOB_UNKNOWN",
             "message": "blob unknown to registry",
             "detail": "sha256:f5ef2458f9f1d711e98db59f2239e1171bc9ecf2c442a068e11fc62e105d2b0c"
           }
       ]
}

The flow of creating the manifest list/index is to first make sure that all referenced content is in the target image repo, using either cross-repo blob mount (for blobs) or pushing the actual ref into the target repo (without a tag). After those steps are complete, the manifest list/index referring to all that content is pushed. It seems there is a possible timing issue that the content is not fully committed in your chosen registry implementation such that the registry throws a "missing content" error when it is pushed immediately after all the dependent content within the "tree" of member images. I can only assume it works when you run it again because by then the content has been stably committed to the registry's data store/index, however Nexus has implemented that.

Your second example shows that the missing content is the first manifest object (the linux/arm64 manifest digest), which would possibly be the last thing pushed in the ordering of steps that manifest-tool takes. Curious if you have any way to confirm that with other examples (e.g. that it's always a manifest and not a blob ref) which might confirm the timing issue.

I would prefer not to generate any artificial delays in manifest-tool and I'm not sure exactly what the OCI distribution spec states about the consistency of the registry's content following the return of a POST/push operation. But, it might be worth raising with Nexus as this is the first I've heard of any issues with the flow of operations in manifest-tool with a registry implementation.
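
The check asked about above (is the missing digest a manifest or a blob?) can be done by comparing the "detail" digest in the error body with the digests that manifest-tool inspect prints. A minimal sketch, assuming the error body has the JSON shape shown in this thread; `classify_missing` is a hypothetical helper, and the digests are copied from the inspect output and the second failure posted earlier:

```python
import json

# Manifest digests as printed by `manifest-tool inspect` earlier in this thread.
MANIFESTS = {
    "linux/arm64": "sha256:4b06cb1d532a350ee47dca2766a5940af4111fcbef40ab412ad6ecf399ad1af6",
    "linux/amd64": "sha256:f8ef6f81748de0da98d902eb7da4aafc36590ed8bcd8cb15806c79453b0dd2bc",
}

# Error body returned by the registry in the second failing push.
ERROR_BODY = (
    '{"errors":[{"code":"BLOB_UNKNOWN","message":"blob unknown to registry",'
    '"detail":"sha256:4b06cb1d532a350ee47dca2766a5940af4111fcbef40ab412ad6ecf399ad1af6"}]}'
)


def classify_missing(error_body: str, manifests: dict) -> list:
    """Label each missing digest as a known per-platform manifest or something else."""
    platform_by_digest = {digest: platform for platform, digest in manifests.items()}
    results = []
    for err in json.loads(error_body).get("errors", []):
        digest = err.get("detail")
        platform = platform_by_digest.get(digest)
        kind = f"manifest for {platform}" if platform else "not a known manifest"
        results.append((digest, kind))
    return results


for digest, kind in classify_missing(ERROR_BODY, MANIFESTS):
    print(digest, "->", kind)
```

For the second failure this reports the linux/arm64 manifest. The digest in the first failure (sha256:f5ef2458...) does not appear in the inspect output posted here, so it cannot be classified from the data in this thread.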

@b-morgenthaler

@estesp

> It seems there is a possible timing issue that the content is not fully committed in your chosen registry implementation such that the registry throws a "missing content" error when it is pushed immediately after all the dependent content within the "tree" of member images.

My first thought was also a timing issue, but I ruled it out in the end after trying the following, none of which made the error go away:

  • Adding a delay between building/pushing the amd64/arm64 images and deploying the multi-arch manifest, to let Nexus process the pushed images first. In the end I had a 40-second delay (I even waited several minutes once).
  • Verifying that the pushed images are "existing" by calling manifest-tool inspect prior to manifest-tool push for all images. manifest-tool inspect returned no errors (see my log above in the first post) but manifest-tool push still did
  • Manually pulling the images right after they had been pushed (without any delay), which worked perfectly

On a side note: I activated --ignore-missing for pushing the multi-arch manifest. Shouldn't this switch prevent the error?

Interesting as well: The time between pushing the images and pushing the multi-arch manifest seems not to be important at all. No matter how fast or slow these two things happen after each other, a subsequent call to the manifest-tool push command always succeeds without any additional changes.

> Curious if you have any way to confirm that with other examples (e.g. that it's always a manifest and not a blob ref) which might confirm the timing issue

It is always the same type of error I am seeing. Regarding the order of the pushed images and the error: the images are built/pushed in parallel on dedicated GitLab runners, so I would have to go through the logs to see which one finished first.

@estesp
Owner

estesp commented Jul 27, 2023

> verifying that the pushed images are "existing" by calling manifest-tool inspect prior to manifest-tool push for all images. manifest-tool inspect returned no errors (see my log above in the first post) but manifest-tool push still did

I think I didn't do a great job separating the concepts of pushed content (as standalone images) and pushed content that gets created during the manifest list creation steps. You are correct that there is no issue with the standalone image content you created, nor any timing issue there. The --ignore-missing flag is about source content being missing, not target content.

To be clearer, when you assemble a target manifest list from multiple source images, the target (final) repository must contain references to any source content that is outside that specific repository reference. Those references are pushed during the operations that manifest-tool performs before that final PUT operation that you included the debug output for a few comments ago. If you look at the debug logs that come before it, you will see several additional HTTP transactions with the registry. Those transactions are pushing additional content references based on your source images into the target repo so that the registry will "find" all the right pieces of the DAG (content tree) when it creates that final manifest list entry.

In your case the target repo is something like myrepo/image_name:some_version, and the source images are coming from myrepo/arm64v8/image_name:some_version and myrepo/amd64/image_name:some_version. If we want to test this theory about the commit state of the pushes of content references into the target, one option is to temporarily try using the same repo for source and target by using tags instead of distinct image repo names. That way these extra content references won't need to be pushed into the target repo, and if the commit state of these extra reference pushes is the problem with Nexus, then you won't see it anymore when using the tag method on the same repo—because they won't be performed at all.

For example, you can try creating source images named myrepo/image_name:some_version_arm64v8 and myrepo/image_name:some_version_amd64 and keep the target as myrepo/image_name:some_version. This keeps sources and target all in the same repo (myrepo/image_name) and won't be doing any additional content reference pushes before doing the PUT of the final manifest list.
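
Whether a given layout avoids the extra content reference pushes comes down to whether every source reference shares the target's repository. A minimal sketch of that comparison, using simple string handling rather than a full image reference parser; `repository` and `sources_outside_target` are hypothetical helpers, not manifest-tool functions:

```python
def repository(ref: str) -> str:
    """Return the repository part of an image reference (tag and digest dropped)."""
    name = ref.split("@", 1)[0]            # drop any @sha256:... digest
    head, sep, last = name.rpartition("/")
    last = last.split(":", 1)[0]           # drop the tag; a registry port stays in `head`
    return f"{head}{sep}{last}"


def sources_outside_target(target: str, sources: list) -> list:
    """List the source refs whose repository differs from the target's repository."""
    target_repo = repository(target)
    return [src for src in sources if repository(src) != target_repo]


target = "myrepo/image_name:some_version"

# Layout from the failing pipeline: one repository per architecture.
per_arch_repos = [
    "myrepo/arm64v8/image_name:some_version",
    "myrepo/amd64/image_name:some_version",
]

# Layout suggested above: one repository, architecture encoded in the tag.
per_arch_tags = [
    "myrepo/image_name:some_version_arm64v8",
    "myrepo/image_name:some_version_amd64",
]

print(sources_outside_target(target, per_arch_repos))
print(sources_outside_target(target, per_arch_tags))
```

With the per-repository layout both sources fall outside the target repository; with the tag layout none do, which is the case where no additional content needs to be pushed into the target before the final PUT.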

@b-morgenthaler

b-morgenthaler commented Jul 31, 2023

@estesp

> For example, you can try creating source images named myrepo/image_name:some_version_arm64v8 and myrepo/image_name:some_version_amd64 and keep the target as myrepo/image_name:some_version. This keeps sources and target all in the same repo (myrepo/image_name) and won't be doing any additional content reference pushes before doing the PUT of the final manifest list.

This seems to work. I didn't deploy the multi-arch manifest with manifest-tool often enough to tell if it's working for good. But so far, I did not see the error (not even once when I decreased the artificial delay to a minimum of 5 seconds). Thanks for this suggestion, I may use this as a work-around for now.

EDIT: after a few more build/push and deployments with manifest-tool (even removing the artificial delay completely), I am fairly sure that having everything within the same repo is properly working for a stable build pipeline.

How should we move on from here? Is this a registry issue, or an interaction between manifest-tool and the registry we use (Nexus)?
