Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

在colab上运行finetune.ipynb的时候会报一个huggingface登录的错误,有人遇到同样的错误吗? #272

Open
lee376 opened this issue Feb 12, 2024 · 1 comment

Comments

@lee376
Copy link

lee376 commented Feb 12, 2024

Downloading and preparing dataset generator/default to /root/.cache/huggingface/datasets/generator/default-2eec05f7b1485a75/0.0.0...
Generating train split: 0 examples [00:00, ? examples/s]Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
response.raise_for_status()
File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/model_path/chatglm/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 430, in cached_file
resolved_file = hf_hub_download(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1368, in hf_hub_download
raise head_call_error
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1238, in hf_hub_download
metadata = get_hf_file_metadata(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1631, in get_hf_file_metadata
r = _request_wrapper(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 385, in _request_wrapper
response = _request_wrapper(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 409, in _request_wrapper
hf_raise_for_status(response)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 323, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-65ca176f-5253ecd507530db441e7bd66;124fc4dc-0217-4158-9d74-b4e7e5f1053e)

Repository Not Found for url: https://huggingface.co/model_path/chatglm/resolve/main/tokenizer_config.json.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1608, in _prepare_split_single
for key, record in generator:
File "/usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/generator/generator.py", line 30, in _generate_examples
for idx, ex in enumerate(self.config.generator(**gen_kwargs)):
File "/content/ChatGLM-Tuning/tokenize_dataset_rows.py", line 43, in read_jsonl
tokenizer = transformers.AutoTokenizer.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 718, in from_pretrained
tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 550, in get_tokenizer_config
resolved_config_file = cached_file(
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 451, in cached_file
raise EnvironmentError(
OSError: model_path/chatglm is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with huggingface-cli login or by passing token=<your_token>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/content/ChatGLM-Tuning/tokenize_dataset_rows.py", line 75, in
main()
File "/content/ChatGLM-Tuning/tokenize_dataset_rows.py", line 68, in main
dataset = datasets.Dataset.from_generator(
File "/usr/local/lib/python3.10/dist-packages/datasets/arrow_dataset.py", line 1020, in from_generator
).read()
File "/usr/local/lib/python3.10/dist-packages/datasets/io/generator.py", line 47, in read
self.builder.download_and_prepare(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 872, in download_and_prepare
self._download_and_prepare(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1649, in _download_and_prepare
super()._download_and_prepare(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 967, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1488, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1644, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset

@lee376
Copy link
Author

lee376 commented Feb 12, 2024

报错的那段代码是:!python tokenize_dataset_rows.py
--jsonl_path data/alpaca_data.jsonl
--save_path data/alpaca
--max_seq_length 128

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant