New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow not using symlinks when fetching files #2476
base: develop
Are you sure you want to change the base?
Conversation
Ping @pplantinga and @Gastron as they both have a good understanding of the current issue and SB codebase. Could you please share your thoughts on this issue so that we can move in a valuable direction please? Thanks :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great change, LGTM!
speechbrain/utils/fetching.py
Outdated
@@ -58,38 +158,68 @@ def fetch( | |||
use_auth_token=False, | |||
revision=None, | |||
huggingface_cache_dir=None, | |||
local_strategy: LocalStrategy = LocalStrategy.SYMLINK, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious if we still want the default to be SYMLINK
here. I appreciate that we might want to keep the default the same but perhaps we could add a warning and change it in the next version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this more and I think it might make sense to switch to NO_LINK
by default. I was wary of doing such a change but I don't think most things implicitly rely on this, and it's usually a simple fix or workaround.
CI passes when changing the default after one change in parameter transfer logic.
It should be plenty documented though... I don't want users to be too surprised when they see an empty directory where the pretrained stuff should be.
This is not model data but files we might want to infer, there is never really any point in making a symlink to the file there. If there is, the user can fetch themselves.
What does this PR do?
Currently this is just mocking up APIs and writing proofs-of-concept.
The goal of this PR would ultimately be to explore ways to avoid using symbolic links in most cases in SB (which should e.g. definitely not be done when using
load_audio
against local paths in inference APIs...)As of the current state of the PR:
fetch
is extended to accept alocal_strategy
parameter. This parameter controls the behavior for downloading and copying/linking files around.fetch
now accepts and forwards the same parameter.SYMLINK
toNO_LINK
on all platforms (can be debated).NO_LINK
by default since polluting the working directory with audio files is largely undesired.NO_LINK
behavior.Resolves #1070, resolves #1278, resolves #2197, resolves #1885, resolves #1155, resolves #1586
Supersedes #1303
Before submitting
PR review
Reviewer checklist