Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip invalid link path during gc #4343

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

microyahoo
Copy link
Contributor

Signed-off-by: Liang Zheng zhengliang0901@gmail.com

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
Copy link
Member

@milosgajdos milosgajdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe what is it you are trying to achieve with this PR?

Specifically, can you give us an example of when a stored registry link contains an invalid digest? That'd be mildly worrying, tbh

@microyahoo
Copy link
Contributor Author

Can you describe what is it you are trying to achieve with this PR?

Specifically, can you give us an example of when a stored registry link contains an invalid digest? That'd be mildly worrying, tbh

hi @milosgajdos, we use Harbor as our image repository, and Harbor uses Registry to manage images underneath. I've noticed that after running for a while, Harbor tends to leave behind a lot of residual data, as described in the #4330. I tried running garbage collection (GC) to reclaim resources, but encountered errors. It turns out that there are many invalid link files in the Registry. These link files are either empty or have invalid digests, which directly cause the GC to interrupt. Actually, we can skip these invalid link files instead of directly interrupting the GC.

~/go/src/github.com/distribution/distribution (link ✗) bin/registry garbage-collect cmd/registry/config-dev.yml -m
.......
deeproute-all/docker-repo/data-ocean/nginx
deeproute-all/docker-repo/data-ocean/prod/data-ocean-front-end
deeproute-all/docker-repo/data-ocean/qa/data-ocean-front-end
deeproute-all/docker-repo/dataocean/adas-web-term
deeproute-all/docker-repo/dataocean/dataocean-account-management-service
deeproute-all/docker-repo/dataocean/dataocean-adas-monitor
deeproute-all/docker-repo/dataocean/dataocean-adas-tar
deeproute-all/docker-repo/dataocean/dataocean-api-docs
deeproute-all/docker-repo/dataocean/dataocean-api-gateway
deeproute-all/docker-repo/dataocean/dataocean-authentication-service
deeproute-all/docker-repo/dataocean/dataocean-availability-service
deeproute-all/docker-repo/dataocean/dataocean-bag-construction
deeproute-all/docker-repo/dataocean/dataocean-bag-constructor
deeproute-all/docker-repo/dataocean/dataocean-car-transmit
deeproute-all/docker-repo/dataocean/dataocean-cloud-vehicle-communication
deeproute-all/docker-repo/dataocean/dataocean-data-agent
deeproute-all/docker-repo/dataocean/dataocean-data-agent-overseer
deeproute-all/docker-repo/dataocean/dataocean-data-construction
deeproute-all/docker-repo/dataocean/dataocean-data-migration
deeproute-all/docker-repo/dataocean/dataocean-data-provider-service
deeproute-all/docker-repo/dataocean/dataocean-data-transformation-service
deeproute-all/docker-repo/dataocean/dataocean-data-tube-service
deeproute-all/docker-repo/dataocean/dataocean-data-tunnel
deeproute-all/docker-repo/dataocean/dataocean-database-migration
deeproute-all/docker-repo/dataocean/dataocean-database-migration-data-managment-v2
deeproute-all/docker-repo/dataocean/dataocean-download-test
deeproute-all/docker-repo/dataocean/dataocean-dr-pipeline
deeproute-all/docker-repo/dataocean/dataocean-drfile-archive-serivce
deeproute-all/docker-repo/dataocean/dataocean-drtag
deeproute-all/docker-repo/dataocean/dataocean-file-download-service
deeproute-all/docker-repo/dataocean/dataocean-file-galaxy
failed to garbage collect: failed to mark: filesystem: filesystem: invalid checksum digest format
~/go/src/github.com/distribution/distribution (link ✗) ll /mnt/cephfs/docker/registry/v2/repositories/infra-operation/image/build-env-image/_manifests/revisions/sha256/a68799c5a48d4cfe46ea1e122245097441f14720b0488d1d162eea637e3b96f7/link
-rw-r--r-- 1 10000 71 Apr 28  2022 /mnt/cephfs/docker/registry/v2/repositories/infra-operation/image/build-env-image/_manifests/revisions/sha256/a68799c5a48d4cfe46ea1e122245097441f14720b0488d1d162eea637e3b96f7/link
~/go/src/github.com/distribution/distribution (link ✗) cat /mnt/cephfs/docker/registry/v2/repositories/infra-operation/image/build-env-image/_manifests/revisions/sha256/a68799c5a48d4cfe46ea1e122245097441f14720b0488d1d162eea637e3b96f7/link
~/go/src/github.com/distribution/distribution (link ✗) ll /mnt/cephfs/docker/registry/v2/repositories/deeproute-simulation/single-frame-task-images/_manifests/revisions/sha256/20f5baba4c1fdd30c9e13984365ed873b06a945afb877569219daee8870da30b/link
-rw-r--r-- 1 10000 0 Jun 24  2022 /mnt/cephfs/docker/registry/v2/repositories/deeproute-simulation/single-frame-task-images/_manifests/revisions/sha256/20f5baba4c1fdd30c9e13984365ed873b06a945afb877569219daee8870da30b/link

@milosgajdos
Copy link
Member

These link files are either empty or have invalid digests

I'd prefer looking into the underlying issue rather than adding a bandaid on a symptom.

@microyahoo
Copy link
Contributor Author

microyahoo commented May 8, 2024

These link files are either empty or have invalid digests

I'd prefer looking into the underlying issue rather than adding a bandaid on a symptom.

Maybe this should be considered quite common. Taking the example of pushing an image, it involves several HTTP requests. For the push process, the entire operation is non-atomic, so it's very likely that failures or other anomalies midway could lead to data corrupt or incompleteness. This is when running GC would cause errors. I think we can skip the anomalous data.

@milosgajdos
Copy link
Member

What happens to anomalous data if you leave them dangling in the storage and then a push with the same references is attempted?

@microyahoo
Copy link
Contributor Author

What happens to anomalous data if you leave them dangling in the storage and then a push with the same references is attempted?

the link file will be overwritten if push the same references.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants