Download hangs at End #402

Open
zanussbaum opened this issue Feb 11, 2024 · 0 comments

I'm seeing the download hang at the end, similar to #74.

If I hit Ctrl-C, I see this error:

KeyboardInterrupt
KeyboardInterrupt
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
KeyboardInterrupt
KeyboardInterrupt
KeyboardInterrupt
KeyboardInterrupt
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/queues.py", line 365, in get
    res = self._reader.recv_bytes()
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/connection.py", line 221, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/connection.py", line 384, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Process SpawnPoolWorker-28039:
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/pool.py", line 856, in next
    item = self._items.popleft()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/site-packages/img2dataset/downloader.py", line 126, in __call__
    self.download_shard(row)
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/site-packages/img2dataset/downloader.py", line 199, in download_shard
    for key, img_stream, error_message in thread_pool.imap_unordered(
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/multiprocessing/pool.py", line 861, in next
    self._cond.wait(timeout)
  File "/home/ubuntu/miniconda3/envs/datacomp/lib/python3.10/threading.py", line 320, in wait
    waiter.acquire()
KeyboardInterrupt
34it [14:25:16, 1526.95s/it]

I am on an Ubuntu machine (a c6a.16xlarge instance), following the instructions for DataComp, and ran this command:
/home/ubuntu/datacomp/download_upstream.py --scale datacomp_1b --data_dir s3://datacomp --metadata_dir /tmp/metadata --enable_wandb --wandb_project=datacomp
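
In the traceback above, the worker is blocked inside `thread_pool.imap_unordered(...)`, waiting on a result that never arrives. One way to tell a stalled item from slow progress might be to pull results with an explicit timeout. The following is a minimal sketch using only the standard library; it is not img2dataset code, and `slow_fetch` and `drain_with_timeout` are hypothetical names introduced for illustration.

```python
import time
from multiprocessing import TimeoutError
from multiprocessing.pool import ThreadPool


def slow_fetch(seconds):
    # Hypothetical stand-in for a per-URL download: sleeps instead of doing I/O.
    time.sleep(seconds)
    return seconds


def drain_with_timeout(pool, func, items, timeout):
    """Collect imap_unordered results, stopping if none arrives within `timeout` seconds.

    Returns (results, timed_out).
    """
    results = []
    it = pool.imap_unordered(func, items)
    while True:
        try:
            # The iterator returned by imap_unordered accepts an optional timeout.
            results.append(it.next(timeout=timeout))
        except StopIteration:
            return results, False  # every item finished
        except TimeoutError:
            return results, True   # at least one item is still pending


if __name__ == "__main__":
    pool = ThreadPool(2)
    done, timed_out = drain_with_timeout(pool, slow_fetch, [0.01, 0.02, 1.0], timeout=0.3)
    pool.terminate()
    print(sorted(done), timed_out)
```

With a guard like this, a hang shows up as a timeout that can be logged along with the number of completed items, rather than as a process that never returns. Whether that fits the downloader here is an open question, since the cause of the stall is not established in this report.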
