[bug] conan-2 backup sources workflow does not attempt to fetch from multiple remotes #16287

Closed
maxnbk opened this issue May 17, 2024 · 5 comments

Comments

@maxnbk

maxnbk commented May 17, 2024

Describe the bug

The way that this page is written: https://docs.conan.io/2/devops/backup_sources/sources_backup.html

It sounds as though configuring something like..

core.sources:download_urls = ["https://myremote", "origin"]

..should produce the effect that "myremote" is checked first, and failing that, a retry would hit the origin URL in the source data, so that the source can be put into the source cache locally, and a subsequent conan upload can populate the cache for future queries.

However, the behavior I experience is simply that the first remote is queried, and, not being populated yet, fails, and the build stops, without performing any retry against the origin URL.

I can of course make things fetch by inverting the URL order, but then I'll only ever download from origin and never the cache, even if the origin URL broke, somewhat obviating the purpose of such a feature.

Assuming I am not configuring something wrong, I suspect this would be a bug. I am using conan-2.3.0.

Thanks for your support in investigating.

How to reproduce it

  1. Set global.conf's core.sources:download_urls as above, ensuring that the first remote is not "origin", and does not actually exist, so as to mimic the behavior of a missing source.
  2. Perform any trivial build.
  3. Observe that the download fails and the build stops without fetching any source.
@RubenRBS
Member

After some Slack chats, it turns out that @maxnbk's expectation is that if the remote does not exist, Conan would fall back to the origin, whereas we designed this feature expecting that if your backup server fails, you'll want the command to fail. Note that this is different from the backup not having the file: in that case, Conan will move on to the next entry without an issue.
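
The distinction between "the backup does not have the file" and "the backup cannot be reached" can be sketched as follows. This is only an illustration of the behavior described in this thread, not Conan's actual implementation: the exception classes, `resolve_source`, and the `fetch_backup` / `fetch_origin` callables are all hypothetical names.

```python
class FileNotInBackup(Exception):
    """Hypothetical: the server answered, but does not hold the file."""


class BackupUnreachable(Exception):
    """Hypothetical: the server could not be contacted (typo in URL, network error)."""


def resolve_source(download_urls, fetch_backup, fetch_origin):
    """Try each configured entry in order and return the first successful result.

    A miss on a backup server moves on to the next entry.
    An unreachable backup server is NOT caught, so it aborts the whole run.
    """
    for entry in download_urls:
        if entry == "origin":
            # "origin" stands for the URL declared in the recipe's source data.
            return fetch_origin()
        try:
            return fetch_backup(entry)
        except FileNotInBackup:
            continue  # a miss is fine: try the next configured entry
    raise FileNotInBackup("no configured entry could provide the file")
```

With `["https://myremote", "origin"]`, a reachable but empty `myremote` leads to a download from the origin, while an unreachable `myremote` stops the command, which matches what the report above observed.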

A question arose about whether users' expectations are in line with ours, but that's something we can keep discussing.

@memsharded
Member

"After some Slack chats, it turns out that @maxnbk's expectation is that if the remote does not exist, Conan would fall back to the origin, whereas we designed this feature expecting that if your backup server fails, you'll want the command to fail. Note that this is different from the backup not having the file: in that case, Conan will move on to the next entry without an issue."

This behavior is consistent with how Conan remotes behave, and it has been considered and discussed a few times. Silently ignoring remote connection errors and moving on to the next remote can have quite unexpected and unpleasant effects. When a remote is configured, it is expected to exist, and Conan will fail if it cannot connect to it.

The same applies to the backup-sources configuration. If a user defines a server URL as the priority in their configuration, that server must exist. Otherwise, if there is any issue, such as a typo in the URL or a network connection error, the user's builds would silently fall back to downloading from the internet, with possible extra costs and a false sense of security (the backup sources may not be working at all, but nobody knows until someone removes sources from the internet and something really fails, and by then it is too late). As this is a configuration defined by the user, the user has full control: if they don't want to use that server, they can remove it from their conf, but if it is defined, it must exist. We think this default behavior is more robust and less problematic than the opposite.

@memsharded
Member

Hi @maxnbk

If there are no further questions, I think this ticket can be closed as resolved.

Please @maxnbk don't hesitate to re-open or create new tickets for any further issue. Many thanks for your feedback!

@maxnbk
Author

maxnbk commented May 29, 2024

Yes, sorry, as @RubenRBS described, there was a difference of expectations.
I do think that it would be wise to outline the expectations / reasoning for such expectations in the documentation, as it wasn't immediately grokkable from my perspective.

Although I don't necessarily think the remainder of this request is actionable, I think it's worth outlining that there are two potential use cases, and that one of them is not necessarily being accounted for.

In very short form, the source backup workflow as implemented is for backing up sources with deterministic building, and explicitly not for "optional caching / performance speedup but don't care if your remote isn't available". In certain scenarios, for example when moving on or off a corporate VPN, it could be useful to use a source backup server purely as a caching server, without caring about the determinism aspect and being fine with the remote being temporarily missing.
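
Since the configuration is under the user's control, one way to approximate the "optional cache" use case without any change to Conan would be to decide outside of Conan whether the backup server is listed at all. A minimal sketch under stated assumptions: `choose_download_urls` and `render_conf_line` are hypothetical helpers, not part of Conan, and how reachability is determined (for example, whether the VPN is up) is left to the caller.

```python
import json


def choose_download_urls(backup_url, backup_is_reachable):
    """Hypothetical helper: list the backup server only when it is known to be reachable."""
    if backup_is_reachable:
        return [backup_url, "origin"]
    return ["origin"]


def render_conf_line(urls):
    """Render the list in the same form as the conf line quoted in this issue."""
    return "core.sources:download_urls = " + json.dumps(urls)


# Off the VPN, the backup server is left out, so Conan never tries to contact it.
print(render_conf_line(choose_download_urls("https://myremote", False)))
# On the VPN, the backup server is listed first, with the origin as the next entry.
print(render_conf_line(choose_download_urls("https://myremote", True)))
```

This trades away the guarantee that the backup server is always used, so it only fits the caching use case and not the deterministic-backup one.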

Hope that makes it clear where the disconnect was. Appreciate y'all!

@memsharded
Member

Thanks for the feedback, I understand your reasoning better now.

Still, let me give another example:

"optional caching / performance speedup but don't care if your remote isn't available"

OK, you want to use the feature for performance. Still, your CI in the cloud is very large, does tons of builds per day, and your ingress cost is very high. Someone makes a typo in the conf that defines the backup sources URL, all the CI builds then default to downloading files from the internet, and your CI invoice suddenly increases a lot. That makes some people in the org angry, and Conan gets the blame.

So our reasoning is that the feature prioritizes correctness over convenience, making sure that it either works or clearly makes users aware that it doesn't. Yes, we could add warnings when the server is not found, but our experience also tells us that most people don't read warnings...
