Error reported:
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 127, in forward
    embeddings = self.layer_norm(embeddings)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddleslim/nas/ofa/layers.py", line 1301, in forward
    out, _, _ = paddle._C_ops.layer_norm(
ValueError: too many values to unpack (expected 3)
Please ask your question
Location: applications/text_classification/multi_class, running the model pruning step
Environment:
paddle-bfloat 0.1.7
paddle2onnx 1.1.0
paddlefsl 1.1.0
paddlenlp 2.8.0
paddleocr 2.7.0.3
paddlepaddle 2.6.1
paddleslim 2.6.0
scikit-learn 1.4.2
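Because the failure sits at the boundary between paddleslim and paddle, the exact installed versions matter. The following is a minimal sketch for collecting them with the standard library only. The helper name `installed_versions` and the choice of packages are illustrative and not part of PaddleNLP.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is absent from this interpreter's environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the environment list above.
    names = ["paddlepaddle", "paddlenlp", "paddleslim"]
    for name, ver in installed_versions(names).items():
        print(f"{name} {ver if ver is not None else 'not installed'}")
```

Running it with the same interpreter that runs train.py (here python3) avoids reporting versions from a different environment.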
Command used for pruning:
python3 train.py \
    --do_compress \
    --device cpu \
    --model_name_or_path checkpoint \
    --output_dir checkpoint/prune \
    --learning_rate 3e-5 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --num_train_epochs 1 \
    --max_length 128 \
    --logging_steps 5 \
    --save_steps 100 \
    --width_mult_list '3/4' '2/3' '1/2' \
    --train_path "20231120data-mul-clas/train.txt" \
    --dev_path "20231120data-mul-clas/dev.txt" \
    --test_path "20231120data-mul-clas/test.txt" \
    --label_path "20231120data-mul-clas/label.txt"
What is causing this error?
The complete log follows:
python3 train.py \
[2024-04-28 11:01:41,103] [ INFO] - The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2024-04-28 11:01:41,104] [ INFO] - ============================================================
[2024-04-28 11:01:41,104] [ INFO] - Model Configuration Arguments
[2024-04-28 11:01:41,104] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:41,104] [ INFO] - export_model_dir :None
[2024-04-28 11:01:41,104] [ INFO] - model_name_or_path :checkpoint
[2024-04-28 11:01:41,104] [ INFO] -
[2024-04-28 11:01:41,104] [ INFO] - ============================================================
[2024-04-28 11:01:41,104] [ INFO] - Data Configuration Arguments
[2024-04-28 11:01:41,104] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:41,105] [ INFO] - bad_case_path :./data/bad_case.txt
[2024-04-28 11:01:41,105] [ INFO] - debug :False
[2024-04-28 11:01:41,105] [ INFO] - dev_path :20231120data-mul-clas/dev.txt
[2024-04-28 11:01:41,105] [ INFO] - early_stopping :False
[2024-04-28 11:01:41,105] [ INFO] - early_stopping_patience :4
[2024-04-28 11:01:41,105] [ INFO] - label_path :20231120data-mul-clas/label.txt
[2024-04-28 11:01:41,105] [ INFO] - max_length :128
[2024-04-28 11:01:41,105] [ INFO] - test_path :20231120data-mul-clas/test.txt
[2024-04-28 11:01:41,105] [ INFO] - train_path :20231120data-mul-clas/train.txt
[2024-04-28 11:01:41,105] [ INFO] -
[2024-04-28 11:01:41,106] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.modeling.ErnieForSequenceClassification'> to load 'checkpoint'.
[2024-04-28 11:01:41,106] [ INFO] - Loading configuration file checkpoint/config.json
[2024-04-28 11:01:41,106] [ INFO] - Loading weights file checkpoint/model_state.pdparams
[2024-04-28 11:01:41,571] [ INFO] - Loaded weights file from disk, setting weights to model.
[2024-04-28 11:01:45,845] [ INFO] - All model checkpoint weights were used when initializing ErnieForSequenceClassification.
[2024-04-28 11:01:45,846] [ INFO] - All the weights of ErnieForSequenceClassification were initialized from the model checkpoint at checkpoint.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ErnieForSequenceClassification for predictions without further training.
[2024-04-28 11:01:45,874] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'checkpoint'.
[2024-04-28 11:01:45,899] [ INFO] - The global seed is set to 42, local seed is set to 43 and random seed is set to 42.
[2024-04-28 11:01:45,976] [ DEBUG] - ============================================================
[2024-04-28 11:01:45,976] [ DEBUG] - Training Configuration Arguments
[2024-04-28 11:01:45,976] [ DEBUG] - paddle commit id : fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:45,976] [ DEBUG] - paddlenlp commit id : 3105c18
[2024-04-28 11:01:45,976] [ DEBUG] - _no_sync_in_gradient_accumulation: True
[2024-04-28 11:01:45,977] [ DEBUG] - activation_quantize_type : None
[2024-04-28 11:01:45,977] [ DEBUG] - adam_beta1 : 0.9
[2024-04-28 11:01:45,977] [ DEBUG] - adam_beta2 : 0.999
[2024-04-28 11:01:45,977] [ DEBUG] - adam_epsilon : 1e-08
[2024-04-28 11:01:45,977] [ DEBUG] - algo_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_custom_black_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_custom_white_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_master_grad : False
[2024-04-28 11:01:45,977] [ DEBUG] - batch_num_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - batch_size_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - bf16 : False
[2024-04-28 11:01:45,978] [ DEBUG] - bf16_full_eval : False
[2024-04-28 11:01:45,978] [ DEBUG] - bias_correction : False
[2024-04-28 11:01:45,978] [ DEBUG] - current_device : cpu
[2024-04-28 11:01:45,978] [ DEBUG] - data_parallel_config :
[2024-04-28 11:01:45,978] [ DEBUG] - data_parallel_rank : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataloader_drop_last : False
[2024-04-28 11:01:45,978] [ DEBUG] - dataloader_num_workers : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataset_rank : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataset_world_size : 1
[2024-04-28 11:01:45,978] [ DEBUG] - device : cpu
[2024-04-28 11:01:45,978] [ DEBUG] - disable_tqdm : False
[2024-04-28 11:01:45,979] [ DEBUG] - distributed_dataloader : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_compress : True
[2024-04-28 11:01:45,979] [ DEBUG] - do_eval : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_export : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_predict : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_train : False
[2024-04-28 11:01:45,979] [ DEBUG] - enable_auto_parallel : False
[2024-04-28 11:01:45,979] [ DEBUG] - eval_accumulation_steps : None
[2024-04-28 11:01:45,979] [ DEBUG] - eval_batch_size : 32
[2024-04-28 11:01:45,979] [ DEBUG] - eval_steps : None
[2024-04-28 11:01:45,979] [ DEBUG] - evaluation_strategy : IntervalStrategy.NO
[2024-04-28 11:01:45,979] [ DEBUG] - flatten_param_grads : False
[2024-04-28 11:01:45,980] [ DEBUG] - force_reshard_pp : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16 : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16_full_eval : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16_opt_level : O1
[2024-04-28 11:01:45,980] [ DEBUG] - gradient_accumulation_steps : 1
[2024-04-28 11:01:45,980] [ DEBUG] - greater_is_better : None
[2024-04-28 11:01:45,980] [ DEBUG] - hybrid_parallel_topo_order : pp_first
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_data_skip : False
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_load_lr_and_optim : False
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_save_lr_and_optim : False
[2024-04-28 11:01:45,980] [ DEBUG] - input_dtype : int64
[2024-04-28 11:01:45,981] [ DEBUG] - input_infer_model_path : None
[2024-04-28 11:01:45,981] [ DEBUG] - label_names : None
[2024-04-28 11:01:45,981] [ DEBUG] - lazy_data_processing : True
[2024-04-28 11:01:45,981] [ DEBUG] - learning_rate : 3e-05
[2024-04-28 11:01:45,981] [ DEBUG] - load_best_model_at_end : False
[2024-04-28 11:01:45,981] [ DEBUG] - load_sharded_model : False
[2024-04-28 11:01:45,981] [ DEBUG] - local_process_index : 0
[2024-04-28 11:01:45,981] [ DEBUG] - local_rank : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_level : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_level_replica : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_on_each_node : True
[2024-04-28 11:01:45,981] [ DEBUG] - logging_dir : checkpoint/prune/runs/Apr28_11-01-41_meng-paddle2-6-2
[2024-04-28 11:01:45,982] [ DEBUG] - logging_first_step : False
[2024-04-28 11:01:45,982] [ DEBUG] - logging_steps : 5
[2024-04-28 11:01:45,982] [ DEBUG] - logging_strategy : IntervalStrategy.STEPS
[2024-04-28 11:01:45,982] [ DEBUG] - logical_process_index : 0
[2024-04-28 11:01:45,982] [ DEBUG] - lr_end : 1e-07
[2024-04-28 11:01:45,982] [ DEBUG] - lr_scheduler_type : SchedulerType.LINEAR
[2024-04-28 11:01:45,982] [ DEBUG] - max_evaluate_steps : -1
[2024-04-28 11:01:45,982] [ DEBUG] - max_grad_norm : 1.0
[2024-04-28 11:01:45,982] [ DEBUG] - max_steps : -1
[2024-04-28 11:01:45,982] [ DEBUG] - metric_for_best_model : None
[2024-04-28 11:01:45,982] [ DEBUG] - minimum_eval_times : None
[2024-04-28 11:01:45,983] [ DEBUG] - moving_rate : 0.9
[2024-04-28 11:01:45,983] [ DEBUG] - no_cuda : False
[2024-04-28 11:01:45,983] [ DEBUG] - num_cycles : 0.5
[2024-04-28 11:01:45,983] [ DEBUG] - num_train_epochs : 1.0
[2024-04-28 11:01:45,983] [ DEBUG] - onnx_format : True
[2024-04-28 11:01:45,983] [ DEBUG] - optim : OptimizerNames.ADAMW
[2024-04-28 11:01:45,983] [ DEBUG] - optimizer_name_suffix : None
[2024-04-28 11:01:45,983] [ DEBUG] - output_dir : checkpoint/prune
[2024-04-28 11:01:45,983] [ DEBUG] - overwrite_output_dir : False
[2024-04-28 11:01:45,983] [ DEBUG] - past_index : -1
[2024-04-28 11:01:45,983] [ DEBUG] - per_device_eval_batch_size : 32
[2024-04-28 11:01:45,983] [ DEBUG] - per_device_train_batch_size : 32
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_config :
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_degree : -1
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_rank : 0
[2024-04-28 11:01:45,984] [ DEBUG] - power : 1.0
[2024-04-28 11:01:45,984] [ DEBUG] - prediction_loss_only : False
[2024-04-28 11:01:45,984] [ DEBUG] - process_index : 0
[2024-04-28 11:01:45,984] [ DEBUG] - prune_embeddings : False
[2024-04-28 11:01:45,984] [ DEBUG] - recompute : False
[2024-04-28 11:01:45,984] [ DEBUG] - remove_unused_columns : True
[2024-04-28 11:01:45,984] [ DEBUG] - report_to : ['visualdl']
[2024-04-28 11:01:45,984] [ DEBUG] - resume_from_checkpoint : None
[2024-04-28 11:01:45,985] [ DEBUG] - round_type : round
[2024-04-28 11:01:45,985] [ DEBUG] - run_name : checkpoint/prune
[2024-04-28 11:01:45,985] [ DEBUG] - save_on_each_node : False
[2024-04-28 11:01:45,985] [ DEBUG] - save_sharded_model : False
[2024-04-28 11:01:45,985] [ DEBUG] - save_steps : 100
[2024-04-28 11:01:45,985] [ DEBUG] - save_strategy : IntervalStrategy.STEPS
[2024-04-28 11:01:45,985] [ DEBUG] - save_total_limit : None
[2024-04-28 11:01:45,985] [ DEBUG] - scale_loss : 32768
[2024-04-28 11:01:45,985] [ DEBUG] - seed : 42
[2024-04-28 11:01:45,985] [ DEBUG] - sep_parallel_degree : -1
[2024-04-28 11:01:45,985] [ DEBUG] - sharding : []
[2024-04-28 11:01:45,985] [ DEBUG] - sharding_degree : -1
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_config :
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_degree : -1
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_rank : 0
[2024-04-28 11:01:45,986] [ DEBUG] - should_load_dataset : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_load_sharding_stage1_model: False
[2024-04-28 11:01:45,986] [ DEBUG] - should_log : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save_model_state : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save_sharding_stage1_model: False
[2024-04-28 11:01:45,986] [ DEBUG] - skip_memory_metrics : True
[2024-04-28 11:01:45,986] [ DEBUG] - skip_profile_timer : True
[2024-04-28 11:01:45,987] [ DEBUG] - strategy : dynabert
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_config :
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_degree : -1
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_rank : 0
[2024-04-28 11:01:45,987] [ DEBUG] - to_static : False
[2024-04-28 11:01:45,987] [ DEBUG] - train_batch_size : 32
[2024-04-28 11:01:45,987] [ DEBUG] - unified_checkpoint : False
[2024-04-28 11:01:45,987] [ DEBUG] - unified_checkpoint_config :
[2024-04-28 11:01:45,987] [ DEBUG] - use_hybrid_parallel : False
[2024-04-28 11:01:45,987] [ DEBUG] - use_pact : True
[2024-04-28 11:01:45,987] [ DEBUG] - wandb_api_key : None
[2024-04-28 11:01:45,987] [ DEBUG] - warmup_ratio : 0.1
[2024-04-28 11:01:45,988] [ DEBUG] - warmup_steps : 0
[2024-04-28 11:01:45,988] [ DEBUG] - weight_decay : 0.0
[2024-04-28 11:01:45,988] [ DEBUG] - weight_name_suffix : None
[2024-04-28 11:01:45,988] [ DEBUG] - weight_quantize_type : channel_wise_abs_max
[2024-04-28 11:01:45,988] [ DEBUG] - width_mult_list : ['3/4', '2/3', '1/2']
[2024-04-28 11:01:45,988] [ DEBUG] - world_size : 1
[2024-04-28 11:01:45,988] [ DEBUG] -
Traceback (most recent call last):
  File "/nlp/PaddleNLP/applications/text_classification/multi_class/train.py", line 230, in <module>
    main()
  File "/nlp/PaddleNLP/applications/text_classification/multi_class/train.py", line 216, in main
    trainer.compress()
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 73, in compress
    _dynabert(self, self.model)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 158, in _dynabert
    ofa_model, teacher_model = _dynabert_init(self, model, eval_dataloader)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 300, in _dynabert_init
    head_importance, neuron_importance = compute_neuron_head_importance(
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ofa_utils.py", line 307, in compute_neuron_head_importance
    logits = model(**batch)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 462, in forward
    outputs = self.ernie(
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 907, in auto_model_dynabert_forward
    embedding_output = self.embeddings(**embedding_kwargs)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 127, in forward
    embeddings = self.layer_norm(embeddings)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddleslim/nas/ofa/layers.py", line 1301, in forward
    out, _, _ = paddle._C_ops.layer_norm(
ValueError: too many values to unpack (expected 3)
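The final ValueError is ordinary Python unpacking behaviour: a statement of the form `out, _, _ = ...` requires the right-hand side to yield exactly three items. The sketch below reproduces the same kind of error without Paddle. `fake_layer_norm` is a hypothetical stand-in, and the idea that the real `paddle._C_ops.layer_norm` returned something other than three values in this environment is an assumption drawn only from the error message, not a confirmed diagnosis.

```python
def fake_layer_norm(rows, as_triple):
    """Hypothetical stand-in; NOT the real paddle._C_ops.layer_norm."""
    out = [[0.0] * 4 for _ in range(rows)]  # pretend normalized output
    if as_triple:
        return out, None, None  # three values, as the caller expects
    return out  # a single object; iterating it yields `rows` items


def unpack_three(result):
    """Mirror the failing statement and report what happened."""
    try:
        out, _, _ = result
        return "ok"
    except ValueError as exc:
        return str(exc)


if __name__ == "__main__":
    # rows=32 is an arbitrary size larger than three.
    print(unpack_three(fake_layer_norm(32, as_triple=True)))
    print(unpack_three(fake_layer_norm(32, as_triple=False)))
```

If that assumption holds, the mismatch would lie between what paddleslim 2.6.0 expects from the operator and what paddlepaddle 2.6.1 returns, which is why the exact versions of both packages are worth stating in the report.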