
When CMD downloads large models, it always interrupts. #232

Open
martjay opened this issue Dec 13, 2023 · 18 comments

Comments

@martjay

martjay commented Dec 13, 2023

When CMD downloads large models, it always interrupts. I don't know how to solve this problem, so can you tell me all the models link that need to be downloaded and the save path?

@Woisek

Woisek commented Dec 14, 2023

Tbh, it's never a good idea to download large files without a download manager. It may seem convenient for smaller files, but large models, like those used for SD, text generation, and now speech, are just too error-prone to download that way, especially if your connection is not the fastest.
Therefore, a manual download should always be available to users. The devs should always consider that.

@martjay
Author

martjay commented Dec 14, 2023

> Tbh, it's never a good idea to download large files without a download manager. It may seem convenient for smaller files, but large models, like those used for SD, text generation, and now speech, are just too error-prone to download that way, especially if your connection is not the fastest. Therefore, a manual download should always be available to users. The devs should always consider that.

I agree.

@rsxdalv
Owner

rsxdalv commented Dec 16, 2023

This issue is linked to the other issue opened by martjay in the suno repository - suno-ai/bark#505.
As this is an integration project, I often depend on what the other projects have done. That said, I agree that the way the models are downloaded should not be locked into the TTS projects, as that makes it hard for end users to control how they download these things.

@rsxdalv
Owner

rsxdalv commented Dec 16, 2023

@martjay I recommend creating an issue here as well - https://github.com/huggingface/huggingface_hub/issues since this is the "source" of the failing download.
If they want to be the main way of downloading models, they should make it more robust for several GB downloads.

@martjay
Author

martjay commented Dec 17, 2023

> @martjay I recommend creating an issue here as well - https://github.com/huggingface/huggingface_hub/issues since this is the "source" of the failing download. If they want to be the main way of downloading models, they should make it more robust for several-GB downloads.

Did you all download the model without interruption? It's so frustrating.

@rsxdalv
Owner

rsxdalv commented Dec 17, 2023

> @martjay I recommend creating an issue here as well - https://github.com/huggingface/huggingface_hub/issues since this is the "source" of the failing download. If they want to be the main way of downloading models, they should make it more robust for several-GB downloads.

> Did you all download the model without interruption? It's so frustrating.

I did see many installation problems when I was elsewhere, but where I live, gigabit connections are the standard.

@Woisek

Woisek commented Dec 18, 2023

> But where I live, gigabit connections are the standard.

It's not a good idea to assume things based on your own situation. You always have to consider the worst case. And since gigabit connections are nowhere near "standard" all around the world, there is always the risk of downloads failing on slow connections.
Also, this isn't directly an issue with Hugging Face. The site hosts hundreds of thousands of GB-sized files; I'm sure they provide more than enough hardware for all of us. You can't really blame them for a failed download. Downloads are (almost) always within the scope of the downloading user. The best hosting can't do anything if the user has a poor, error-prone connection paired with an error-prone tool that has no resume and/or error-checking ability.

Therefore, the best option right now is to show the needed URL and the save location on disk, in case the user has to download manually. The devs just have to surface this already-known information, for example like this:

```
Downloading now:
http://www.someserver.com/and/this/file
saving to
G:/folder/to/save
```
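A minimal sketch of what surfacing that information could look like, assuming the public Hugging Face Hub URL scheme `https://huggingface.co/{repo_id}/resolve/{revision}/{filename}`; the helper names and the print format below are hypothetical, not part of any project:

```python
# Hypothetical helpers: build the direct download URL for a file on the
# Hugging Face Hub, so a user can fetch it manually with a download manager.
# Assumes the public URL scheme https://huggingface.co/{repo}/resolve/{rev}/{file}.
def hf_resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

def announce_download(repo_id: str, filename: str, save_dir: str) -> str:
    # Mirror the "Downloading now: ... saving to ..." format suggested above.
    return (
        f"Downloading now:\n{hf_resolve_url(repo_id, filename)}\n"
        f"saving to\n{save_dir}"
    )

if __name__ == "__main__":
    print(announce_download(
        "facebook/musicgen-melody",
        "pytorch_model.bin",
        r"C:\Users\<user>\.cache\huggingface\hub",
    ))
```

With the URL printed, the user can paste it into any download manager with resume support and drop the finished file into the shown directory.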

@rsxdalv
Owner

rsxdalv commented Dec 18, 2023

>> But where I live, gigabit connections are the standard.
>
> It's not a good idea to assume things based on your own situation. You always have to consider the worst case. And since gigabit connections are nowhere near "standard" all around the world, there is always the risk of downloads failing on slow connections. Also, this isn't directly an issue with Hugging Face. The site hosts hundreds of thousands of GB-sized files; I'm sure they provide more than enough hardware for all of us. You can't really blame them for a failed download. Downloads are (almost) always within the scope of the downloading user. The best hosting can't do anything if the user has a poor, error-prone connection paired with an error-prone tool that has no resume and/or error-checking ability.
>
> Therefore, the best option right now is to show the needed URL and the save location on disk, in case the user has to download manually. The devs just have to surface this already-known information, for example like this: Downloading now: http://www.someserver.com/and/this/file saving to G:/folder/to/save

HuggingFace creates the downloader software that is built into the models, which are then integrated into this project.

@Woisek

Woisek commented Dec 18, 2023

> HuggingFace creates the downloader software that is built into the models, which are then integrated into this project.

A downloader would not be needed if the model were downloadable from their website. And if it is, it makes what I wrote even more important:
There is no reason not to show the user WHAT is downloaded and WHERE it's saved.

@rsxdalv
Owner

rsxdalv commented Dec 18, 2023

>> HuggingFace creates the downloader software that is built into the models, which are then integrated into this project.
>
> A downloader would not be needed if the model were downloadable from their website. And if it is, it makes what I wrote even more important: there is no reason not to show the user WHAT is downloaded and WHERE it's saved.

(screenshot)
downloading "4dda87e5dfafc1b59351131d9610002b06fbecc50793a0e3c8ab2e534176cd7f" to
C:\Users\<user>\.cache\huggingface\hub\models--facebook--musicgen-melody\blobs

@martjay
Author

martjay commented Dec 21, 2023

>>> HuggingFace creates the downloader software that is built into the models, which are then integrated into this project.
>>
>> A downloader would not be needed if the model were downloadable from their website. And if it is, it makes what I wrote even more important: there is no reason not to show the user WHAT is downloaded and WHERE it's saved.
>
> (screenshot) downloading "4dda87e5dfafc1b59351131d9610002b06fbecc50793a0e3c8ab2e534176cd7f" to C:\Users\<user>\.cache\huggingface\hub\models--facebook--musicgen-melody\blobs

Sometimes it's not a matter of internet speed, you know? Some countries can't connect to the Hugging Face website at all and require VPN access, but I have also ruled that out: even when I switch between several VPNs, the download still gets interrupted. If I download the models from the Hugging Face website directly, it never gets interrupted, and even if it does, I can resume. But when downloading in CMD, an interruption means everything starts over, and it keeps looping. I don't understand Hugging Face's mechanism; maybe the download address refreshes automatically? I'm not sure, but the model URLs and save directories are really necessary.

@rsxdalv
Owner

rsxdalv commented Dec 21, 2023 via email

@martjay
Author

martjay commented Dec 21, 2023

> Just to reclarify - I'm not saying that the download process is amazing and doesn't screw people over. What I am saying is that it's not my code that a) does the download, b) chooses how things are downloaded. I can only change very few downloads, and those are not the big ones like Bark etc. Yes, Bark is open source and could be modified, and it could even be me who modifies it, but that's the point - Bark, Tortoise, Audiocraft, and more projects would need to be changed. This isn't Stable Diffusion, where it's just a checkpoint. Bark and Tortoise, for example, are two entirely different projects on almost every level, while something like Realistic Vision and AnythingV5 are basically the same thing - same model, same architecture, just a different checkpoint (ckpt) file. The only model that can reasonably be self-downloaded is Tortoise, probably because there are different checkpoints and fine-tunes; my repo does not prevent this, and you can choose your own checkpoint files. That's why I'm saying: if the HuggingFace download is so prevalent, they ought to make it work. Obviously, downloading big files shouldn't be an issue - we have had open-source torrent software for ages. If they can't make it work, we need to start pressuring the projects with an alternative that does work.

Are you also unable to determine the name of each project's downloaded model and the folder where it is saved? It's really frustrating. I need to find a solution to the download issue for all AI projects, but currently that's all I have in mind.

@rsxdalv
Owner

rsxdalv commented Dec 21, 2023

Maybe I can test it with a similar VPN at the end of next month. I know it's frustrating, with that VPN I had to download and install automatic1111 multiple times since several things kept failing.

Because of how HuggingFace works, the model names are hashes, like my screenshot above. It might be possible to determine this. Maybe there's a better solution as well. There must be many people facing the same problem.
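As far as I understand the cache layout, the blob filenames for large (LFS) files are the SHA-256 digests of the file contents, so a manually downloaded file can be checked against the blob name it should replace. A stdlib-only sketch (the helper names are hypothetical):

```python
import hashlib
from pathlib import Path

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream the file in chunks so multi-GB models don't need to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_blob_name(path: str) -> bool:
    # Assumption: in the HF cache, blobs/<digest> is named after the file's
    # SHA-256, so a correctly downloaded blob's digest equals its filename.
    return sha256_of_file(path) == Path(path).name
```

If the digest matches, dropping the file into the `blobs` directory shown in the screenshot above should let the cache pick it up; if not, the download is corrupt.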

@martjay
Author

martjay commented Dec 21, 2023

> Maybe I can test it with a similar VPN at the end of next month. I know it's frustrating, with that VPN I had to download and install automatic1111 multiple times since several things kept failing.
>
> Because of how HuggingFace works, the model names are hashes, like my screenshot above. It might be possible to determine this. Maybe there's a better solution as well. There must be many people facing the same problem.

Yes, I hope to solve this problem. It's really a big problem, because almost all project models rely on Hugging Face downloads.

@martjay
Author

martjay commented Dec 21, 2023

```
pytorch_model.bin:  12%|███████▎  | 157M/1.26G [01:00<07:02, 2.61MB/s]
Traceback (most recent call last):
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\urllib3\response.py", line 710, in _error_catcher
    yield
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\urllib3\response.py", line 835, in _raw_read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(162115360 bytes read, 1100409233 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\requests\models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\urllib3\response.py", line 936, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\urllib3\response.py", line 907, in read
    data = self._raw_read(amt)
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\urllib3\response.py", line 813, in _raw_read
    with self._error_catcher():
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\urllib3\response.py", line 727, in _error_catcher
    raise ProtocolError(f"Connection broken: {e!r}", e) from e
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(162115360 bytes read, 1100409233 more expected)', IncompleteRead(162115360 bytes read, 1100409233 more expected))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\gradio\routes.py", line 437, in run_predict
    output = await app.get_blocks().process_api(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\gradio\blocks.py", line 1352, in process_api
    result = await self.call_function(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\gradio\blocks.py", line 1077, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\tts-generation-webui\src\tortoise\gen_tortoise.py", line 49, in switch_model
    get_tts(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\tts-generation-webui\src\tortoise\gen_tortoise.py", line 84, in get_tts
    MODEL = TextToSpeech(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\tortoise\api.py", line 231, in __init__
    self.aligner = Wav2VecAlignment()
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\tortoise\utils\wav2vec_alignment.py", line 53, in __init__
    self.model = Wav2Vec2ForCTC.from_pretrained("jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli").cpu()
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 2539, in from_pretrained
    resolved_archive_file = cached_file(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\transformers\utils\hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\huggingface_hub\file_download.py", line 1461, in hf_hub_download
    http_get(
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\huggingface_hub\file_download.py", line 541, in http_get
    for chunk in r.iter_content(chunk_size=DOWNLOAD_CHUNK_SIZE):
  File "F:\AI\Bert-VITS\one-click-installers-tts-6.0\installer_files\env\lib\site-packages\requests\models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(162115360 bytes read, 1100409233 more expected)', IncompleteRead(162115360 bytes read, 1100409233 more expected))
```

@martjay
Author

martjay commented Jan 2, 2024

> Maybe I can test it with a similar VPN at the end of next month. I know it's frustrating, with that VPN I had to download and install automatic1111 multiple times since several things kept failing.
>
> Because of how HuggingFace works, the model names are hashes, like my screenshot above. It might be possible to determine this. Maybe there's a better solution as well. There must be many people facing the same problem.

I found a solution today.

Edit `file_download.py` to set:

```
force_download=True,
resume_download=False,
```

The download now completes successfully.
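For anyone hitting the same `IncompleteRead`, editing the installed library shouldn't be necessary: `hf_hub_download` accepts `force_download` and `resume_download` as call-site arguments (e.g. `hf_hub_download(repo_id, filename, force_download=True, resume_download=False)`), so the flags can be passed where the project triggers the download. Another workaround is retrying the whole download when the connection breaks; `download_with_retries` below is a hypothetical helper, not part of huggingface_hub:

```python
# Hypothetical retry wrapper: re-invoke a flaky download callable a few times
# before giving up. requests' exceptions (like ChunkedEncodingError above)
# derive from IOError/OSError, so catching OSError should cover them.
def download_with_retries(download_fn, attempts: int = 3):
    last_err = None
    for _ in range(attempts):
        try:
            return download_fn()
        except OSError as e:
            last_err = e
    raise last_err
```

Usage would be something like `download_with_retries(lambda: hf_hub_download(repo_id, filename, force_download=True, resume_download=False))`, restarting the transfer from scratch on each broken connection instead of looping forever on a corrupted partial file.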

@rsxdalv
Owner

rsxdalv commented Jan 2, 2024

>> Maybe I can test it with a similar VPN at the end of next month. I know it's frustrating, with that VPN I had to download and install automatic1111 multiple times since several things kept failing. Because of how HuggingFace works, the model names are hashes, like my screenshot above. It might be possible to determine this. Maybe there's a better solution as well. There must be many people facing the same problem.
>
> I found a solution today.
>
> Edit `file_download.py` to set: force_download=True, resume_download=False,
>
> The download now completes successfully.

That's awesome!
