@carmocca If I want to keep using litgpt for pre-training, what should I do in the meantime? Should I wait until the bug is fixed before using litgpt? Thank you!
How can I solve this problem?
I downloaded pythia-160m from Hugging Face.
I downloaded my data as described in the official documentation.
The program is running in an NVIDIA Docker container.
I followed the official documentation for continued pre-training, but an error occurred: RuntimeError: Failed processing /tmp/data.
litgpt pretrain \
  --model_name pythia-160m \
  --tokenizer_dir checkpoints/EleutherAI/pythia-160m \
  --initial_checkpoint_dir checkpoints/EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts" \
  --out_dir out/custom_model
I got:
RuntimeError:
We found the following error Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 628, in _handle_data_chunk_recipe
    for item_data in item_data_or_generator:
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/functions.py", line 151, in _prepare_item_generator
    yield from self._fn(item_metadata) # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/litgpt/data/text_files.py", line 124, in tokenize
    with open(filename, "r", encoding="utf-8") as file:
IsADirectoryError: [Errno 21] Is a directory: '/tmp/data'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 423, in run
    self._loop()
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 472, in _loop
    self._handle_data_chunk_recipe(index)
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 638, in _handle_data_chunk_recipe
    raise RuntimeError(f"Failed processing {self.items[index]}") from e
RuntimeError: Failed processing /tmp/data
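
For anyone trying to narrow this down: the inner exception is an IsADirectoryError, so open() was handed the directory /tmp/data instead of a text file. The traceback alone does not say where that path comes from. As a first check, here is a minimal sketch that uses only the Python standard library, not litgpt's or litdata's API. The helper names split_entries and can_open_as_text are hypothetical, and it assumes the training data is a flat folder of .txt files like the custom_texts folder in the command above. It lists which entries are regular text files and shows that opening a directory fails in the same way as line 124 of text_files.py.

```python
import sys
from pathlib import Path


def split_entries(data_dir):
    """Separate the regular .txt files in data_dir from every other entry."""
    text_files, others = [], []
    for entry in sorted(Path(data_dir).iterdir()):
        if entry.is_file() and entry.suffix == ".txt":
            text_files.append(entry)
        else:
            # Sub-directories and non-.txt files end up here.
            others.append(entry)
    return text_files, others


def can_open_as_text(path):
    """Try the same open() call that appears in the traceback."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            file.read(1)
        return True
    except OSError:
        # On Linux, opening a directory raises IsADirectoryError,
        # which is a subclass of OSError.
        return False


if __name__ == "__main__":
    # Defaults to the folder name used in the command above.
    data_dir = Path(sys.argv[1] if len(sys.argv) > 1 else "custom_texts")
    if not data_dir.is_dir():
        print(f"{data_dir} is not a directory")
    else:
        text_files, others = split_entries(data_dir)
        print(f"{len(text_files)} .txt file(s) found")
        for entry in others:
            print(f"not a regular .txt file: {entry}")
        for entry in text_files:
            if not can_open_as_text(entry):
                print(f"cannot be opened as text: {entry}")
```

If this reports only regular, readable .txt files, the input folder is probably not what is being opened, and the /tmp/data path is more likely introduced during processing than by the data itself.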