Error log
Traceback (most recent call last):
  File "/work/PaddleNLP/model_zoo/uie/finetune.py", line 262, in <module>
    main()
  File "/work/PaddleNLP/model_zoo/uie/finetune.py", line 193, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/opt/py39/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 888, in train
    self._maybe_log_save_evaluate(tr_loss, model, epoch, ignore_keys_for_eval, inputs=inputs)
  File "/opt/py39/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 1024, in _maybe_log_save_evaluate
    tr_loss_scalar = self._nested_gather(tr_loss).mean().item()
  File "/opt/py39/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 2544, in _nested_gather
    tensors = distributed_concat(tensors)
  File "/opt/py39/lib/python3.9/site-packages/paddlenlp/trainer/utils/helper.py", line 41, in distributed_concat
    output_tensors = [t if len(t.shape) > 0 else t.reshape_([-1]) for t in output_tensors]
  File "/opt/py39/lib/python3.9/site-packages/paddlenlp/trainer/utils/helper.py", line 41, in <listcomp>
    output_tensors = [t if len(t.shape) > 0 else t.reshape_([-1]) for t in output_tensors]
  File "/opt/py39/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/opt/py39/lib/python3.9/site-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/opt/py39/lib/python3.9/site-packages/paddle/utils/inplace_utils.py", line 45, in __impl__
    return func(*args, **kwargs)
  File "/opt/py39/lib/python3.9/site-packages/paddle/tensor/manipulation.py", line 4635, in reshape_
    out = _C_ops.reshape_(x, shape)
OSError: (External) ACL error, the error code is : 100000. (at /work/PaddleCustomDevice/backends/npu/kernels/funcs/npu_op_runner.cc:223)
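For readers following the traceback: the failing line in `distributed_concat` promotes each 0-d (scalar) tensor to shape `[1]` before the gathered losses are concatenated, and the ACL error is raised inside the in-place `reshape_` call. The sketch below only illustrates that shape logic, using NumPy arrays as a stand-in for Paddle tensors; the helper name `promote_scalars` and the sample values are hypothetical, and it uses the out-of-place `reshape` instead of `reshape_`. Whether swapping to the out-of-place call avoids the NPU error is an assumption that has not been verified here.

```python
import numpy as np


def promote_scalars(tensors):
    """Return the inputs with every 0-d entry reshaped to shape (1,).

    Mirrors the list comprehension at helper.py line 41, but with the
    out-of-place reshape (hypothetical helper, NumPy stand-in).
    """
    return [t if t.ndim > 0 else t.reshape([-1]) for t in tensors]


# One scalar loss per worker; the values are made up for illustration.
gathered = [np.asarray(0.25), np.asarray(0.75)]

promoted = promote_scalars(gathered)
merged = np.concatenate(promoted, axis=0)  # 0-d arrays cannot be concatenated directly

print([p.shape for p in promoted], merged.shape, merged.mean())
```

Without the promotion step, concatenating 0-d values along axis 0 fails, which is why the trainer reshapes scalar losses before calling `.mean().item()`.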
Software environment
Duplicate issues
Error description
Steps to reproduce & code
Launch script
python -u -m paddle.distributed.launch --gpus "0,1,2,3" finetune.py --device gpu --logging_steps 10 --save_steps 100 --eval_steps 100 --seed 42 --model_name_or_path uie-base --output_dir $finetuned_model --train_path data/train.txt --dev_path data/dev.txt --max_seq_length 512 --per_device_eval_batch_size 21 --per_device_train_batch_size 32 --num_train_epochs 50 --learning_rate 1e-2 --label_names "start_positions" "end_positions" --do_train --do_eval --do_export --export_model_dir $finetuned_model --overwrite_output_dir --disable_tqdm True --metric_for_best_model eval_f1 --load_best_model_at_end True --save_total_limit 1