ValueError: too many values to unpack (expected 3) raised during the pruning step of a multi-class classification task #8345

mengxiangyu-png · 2024-04-29T01:59:53Z

Please describe your question

Location: applications/text_classification/multi_class, running the model pruning (compression) step
Environment:
paddle-bfloat 0.1.7
paddle2onnx 1.1.0
paddlefsl 1.1.0
paddlenlp 2.8.0
paddleocr 2.7.0.3
paddlepaddle 2.6.1
paddleslim 2.6.0
scikit-learn 1.4.2

Command used for pruning:
python3 train.py \
--do_compress \
--device cpu \
--model_name_or_path checkpoint \
--output_dir checkpoint/prune \
--learning_rate 3e-5 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 32 \
--num_train_epochs 1 \
--max_length 128 \
--logging_steps 5 \
--save_steps 100 \
--width_mult_list '3/4' '2/3' '1/2' \
--train_path "20231120data-mul-clas/train.txt" \
--dev_path "20231120data-mul-clas/dev.txt" \
--test_path "20231120data-mul-clas/test.txt" \
--label_path "20231120data-mul-clas/label.txt"

Error reported:
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 127, in forward
    embeddings = self.layer_norm(embeddings)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddleslim/nas/ofa/layers.py", line 1301, in forward
    out, _, _ = paddle._C_ops.layer_norm(
ValueError: too many values to unpack (expected 3)

What is causing this error?
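For context, Python raises this message when the right-hand side of a tuple assignment yields more items than there are names on the left. The sketch below reproduces the same kind of failure without Paddle. `fake_layer_norm_three` and `fake_layer_norm_single` are hypothetical stand-ins, not Paddle APIs. The idea that `paddle._C_ops.layer_norm` returns a different number of values in the installed Paddle than paddleslim 2.6.0 expects is an assumption that fits the traceback; it is not confirmed in this thread.

```python
def fake_layer_norm_three(rows):
    """Hypothetical stand-in for an op that returns (out, mean, variance)."""
    return rows, None, None


def fake_layer_norm_single(rows):
    """Hypothetical stand-in for an op that returns only the output."""
    return rows


def unpack_three(op, rows):
    """Mimic `out, _, _ = op(...)` and report what happens."""
    try:
        out, _, _ = op(rows)
    except ValueError as exc:
        return f"ValueError: {exc}"
    return f"ok, {len(out)} rows"


# A batch of 32 rows, matching --per_device_eval_batch_size 32 in the command.
batch = [[0.0] * 4 for _ in range(32)]

print(unpack_three(fake_layer_norm_three, batch))
print(unpack_three(fake_layer_norm_single, batch))
```

The first call unpacks cleanly because the stand-in returns exactly three values. The second call fails because Python tries to spread the 32 rows of the single returned object over three names.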

The full log is attached below:
python3 train.py \
--do_compress \
--device cpu \
--model_name_or_path checkpoint \
--output_dir checkpoint/prune \
--learning_rate 3e-5 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 32 \
--num_train_epochs 1 \
--max_length 128 \
--logging_steps 5 \
--save_steps 100 \
--width_mult_list '3/4' '2/3' '1/2' \
--train_path "20231120data-mul-clas/train.txt" \
--dev_path "20231120data-mul-clas/dev.txt" \
--test_path "20231120data-mul-clas/test.txt" \
--label_path "20231120data-mul-clas/label.txt"

[2024-04-28 11:01:41,103] [ INFO] - The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2024-04-28 11:01:41,104] [ INFO] - ============================================================
[2024-04-28 11:01:41,104] [ INFO] - Model Configuration Arguments
[2024-04-28 11:01:41,104] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:41,104] [ INFO] - export_model_dir :None
[2024-04-28 11:01:41,104] [ INFO] - model_name_or_path :checkpoint
[2024-04-28 11:01:41,104] [ INFO] -
[2024-04-28 11:01:41,104] [ INFO] - ============================================================
[2024-04-28 11:01:41,104] [ INFO] - Data Configuration Arguments
[2024-04-28 11:01:41,104] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:41,105] [ INFO] - bad_case_path :./data/bad_case.txt
[2024-04-28 11:01:41,105] [ INFO] - debug :False
[2024-04-28 11:01:41,105] [ INFO] - dev_path :20231120data-mul-clas/dev.txt
[2024-04-28 11:01:41,105] [ INFO] - early_stopping :False
[2024-04-28 11:01:41,105] [ INFO] - early_stopping_patience :4
[2024-04-28 11:01:41,105] [ INFO] - label_path :20231120data-mul-clas/label.txt
[2024-04-28 11:01:41,105] [ INFO] - max_length :128
[2024-04-28 11:01:41,105] [ INFO] - test_path :20231120data-mul-clas/test.txt
[2024-04-28 11:01:41,105] [ INFO] - train_path :20231120data-mul-clas/train.txt
[2024-04-28 11:01:41,105] [ INFO] -
[2024-04-28 11:01:41,106] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.modeling.ErnieForSequenceClassification'> to load 'checkpoint'.
[2024-04-28 11:01:41,106] [ INFO] - Loading configuration file checkpoint/config.json
[2024-04-28 11:01:41,106] [ INFO] - Loading weights file checkpoint/model_state.pdparams
[2024-04-28 11:01:41,571] [ INFO] - Loaded weights file from disk, setting weights to model.
[2024-04-28 11:01:45,845] [ INFO] - All model checkpoint weights were used when initializing ErnieForSequenceClassification.
[2024-04-28 11:01:45,846] [ INFO] - All the weights of ErnieForSequenceClassification were initialized from the model checkpoint at checkpoint.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ErnieForSequenceClassification for predictions without further training.
[2024-04-28 11:01:45,874] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'checkpoint'.
[2024-04-28 11:01:45,899] [ INFO] - The global seed is set to 42, local seed is set to 43 and random seed is set to 42.
[2024-04-28 11:01:45,976] [ DEBUG] - ============================================================
[2024-04-28 11:01:45,976] [ DEBUG] - Training Configuration Arguments
[2024-04-28 11:01:45,976] [ DEBUG] - paddle commit id : fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:45,976] [ DEBUG] - paddlenlp commit id : 3105c18
[2024-04-28 11:01:45,976] [ DEBUG] - _no_sync_in_gradient_accumulation: True
[2024-04-28 11:01:45,977] [ DEBUG] - activation_quantize_type : None
[2024-04-28 11:01:45,977] [ DEBUG] - adam_beta1 : 0.9
[2024-04-28 11:01:45,977] [ DEBUG] - adam_beta2 : 0.999
[2024-04-28 11:01:45,977] [ DEBUG] - adam_epsilon : 1e-08
[2024-04-28 11:01:45,977] [ DEBUG] - algo_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_custom_black_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_custom_white_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_master_grad : False
[2024-04-28 11:01:45,977] [ DEBUG] - batch_num_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - batch_size_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - bf16 : False
[2024-04-28 11:01:45,978] [ DEBUG] - bf16_full_eval : False
[2024-04-28 11:01:45,978] [ DEBUG] - bias_correction : False
[2024-04-28 11:01:45,978] [ DEBUG] - current_device : cpu
[2024-04-28 11:01:45,978] [ DEBUG] - data_parallel_config :
[2024-04-28 11:01:45,978] [ DEBUG] - data_parallel_rank : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataloader_drop_last : False
[2024-04-28 11:01:45,978] [ DEBUG] - dataloader_num_workers : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataset_rank : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataset_world_size : 1
[2024-04-28 11:01:45,978] [ DEBUG] - device : cpu
[2024-04-28 11:01:45,978] [ DEBUG] - disable_tqdm : False
[2024-04-28 11:01:45,979] [ DEBUG] - distributed_dataloader : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_compress : True
[2024-04-28 11:01:45,979] [ DEBUG] - do_eval : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_export : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_predict : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_train : False
[2024-04-28 11:01:45,979] [ DEBUG] - enable_auto_parallel : False
[2024-04-28 11:01:45,979] [ DEBUG] - eval_accumulation_steps : None
[2024-04-28 11:01:45,979] [ DEBUG] - eval_batch_size : 32
[2024-04-28 11:01:45,979] [ DEBUG] - eval_steps : None
[2024-04-28 11:01:45,979] [ DEBUG] - evaluation_strategy : IntervalStrategy.NO
[2024-04-28 11:01:45,979] [ DEBUG] - flatten_param_grads : False
[2024-04-28 11:01:45,980] [ DEBUG] - force_reshard_pp : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16 : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16_full_eval : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16_opt_level : O1
[2024-04-28 11:01:45,980] [ DEBUG] - gradient_accumulation_steps : 1
[2024-04-28 11:01:45,980] [ DEBUG] - greater_is_better : None
[2024-04-28 11:01:45,980] [ DEBUG] - hybrid_parallel_topo_order : pp_first
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_data_skip : False
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_load_lr_and_optim : False
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_save_lr_and_optim : False
[2024-04-28 11:01:45,980] [ DEBUG] - input_dtype : int64
[2024-04-28 11:01:45,981] [ DEBUG] - input_infer_model_path : None
[2024-04-28 11:01:45,981] [ DEBUG] - label_names : None
[2024-04-28 11:01:45,981] [ DEBUG] - lazy_data_processing : True
[2024-04-28 11:01:45,981] [ DEBUG] - learning_rate : 3e-05
[2024-04-28 11:01:45,981] [ DEBUG] - load_best_model_at_end : False
[2024-04-28 11:01:45,981] [ DEBUG] - load_sharded_model : False
[2024-04-28 11:01:45,981] [ DEBUG] - local_process_index : 0
[2024-04-28 11:01:45,981] [ DEBUG] - local_rank : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_level : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_level_replica : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_on_each_node : True
[2024-04-28 11:01:45,981] [ DEBUG] - logging_dir : checkpoint/prune/runs/Apr28_11-01-41_meng-paddle2-6-2
[2024-04-28 11:01:45,982] [ DEBUG] - logging_first_step : False
[2024-04-28 11:01:45,982] [ DEBUG] - logging_steps : 5
[2024-04-28 11:01:45,982] [ DEBUG] - logging_strategy : IntervalStrategy.STEPS
[2024-04-28 11:01:45,982] [ DEBUG] - logical_process_index : 0
[2024-04-28 11:01:45,982] [ DEBUG] - lr_end : 1e-07
[2024-04-28 11:01:45,982] [ DEBUG] - lr_scheduler_type : SchedulerType.LINEAR
[2024-04-28 11:01:45,982] [ DEBUG] - max_evaluate_steps : -1
[2024-04-28 11:01:45,982] [ DEBUG] - max_grad_norm : 1.0
[2024-04-28 11:01:45,982] [ DEBUG] - max_steps : -1
[2024-04-28 11:01:45,982] [ DEBUG] - metric_for_best_model : None
[2024-04-28 11:01:45,982] [ DEBUG] - minimum_eval_times : None
[2024-04-28 11:01:45,983] [ DEBUG] - moving_rate : 0.9
[2024-04-28 11:01:45,983] [ DEBUG] - no_cuda : False
[2024-04-28 11:01:45,983] [ DEBUG] - num_cycles : 0.5
[2024-04-28 11:01:45,983] [ DEBUG] - num_train_epochs : 1.0
[2024-04-28 11:01:45,983] [ DEBUG] - onnx_format : True
[2024-04-28 11:01:45,983] [ DEBUG] - optim : OptimizerNames.ADAMW
[2024-04-28 11:01:45,983] [ DEBUG] - optimizer_name_suffix : None
[2024-04-28 11:01:45,983] [ DEBUG] - output_dir : checkpoint/prune
[2024-04-28 11:01:45,983] [ DEBUG] - overwrite_output_dir : False
[2024-04-28 11:01:45,983] [ DEBUG] - past_index : -1
[2024-04-28 11:01:45,983] [ DEBUG] - per_device_eval_batch_size : 32
[2024-04-28 11:01:45,983] [ DEBUG] - per_device_train_batch_size : 32
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_config :
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_degree : -1
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_rank : 0
[2024-04-28 11:01:45,984] [ DEBUG] - power : 1.0
[2024-04-28 11:01:45,984] [ DEBUG] - prediction_loss_only : False
[2024-04-28 11:01:45,984] [ DEBUG] - process_index : 0
[2024-04-28 11:01:45,984] [ DEBUG] - prune_embeddings : False
[2024-04-28 11:01:45,984] [ DEBUG] - recompute : False
[2024-04-28 11:01:45,984] [ DEBUG] - remove_unused_columns : True
[2024-04-28 11:01:45,984] [ DEBUG] - report_to : ['visualdl']
[2024-04-28 11:01:45,984] [ DEBUG] - resume_from_checkpoint : None
[2024-04-28 11:01:45,985] [ DEBUG] - round_type : round
[2024-04-28 11:01:45,985] [ DEBUG] - run_name : checkpoint/prune
[2024-04-28 11:01:45,985] [ DEBUG] - save_on_each_node : False
[2024-04-28 11:01:45,985] [ DEBUG] - save_sharded_model : False
[2024-04-28 11:01:45,985] [ DEBUG] - save_steps : 100
[2024-04-28 11:01:45,985] [ DEBUG] - save_strategy : IntervalStrategy.STEPS
[2024-04-28 11:01:45,985] [ DEBUG] - save_total_limit : None
[2024-04-28 11:01:45,985] [ DEBUG] - scale_loss : 32768
[2024-04-28 11:01:45,985] [ DEBUG] - seed : 42
[2024-04-28 11:01:45,985] [ DEBUG] - sep_parallel_degree : -1
[2024-04-28 11:01:45,985] [ DEBUG] - sharding : []
[2024-04-28 11:01:45,985] [ DEBUG] - sharding_degree : -1
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_config :
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_degree : -1
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_rank : 0
[2024-04-28 11:01:45,986] [ DEBUG] - should_load_dataset : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_load_sharding_stage1_model: False
[2024-04-28 11:01:45,986] [ DEBUG] - should_log : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save_model_state : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save_sharding_stage1_model: False
[2024-04-28 11:01:45,986] [ DEBUG] - skip_memory_metrics : True
[2024-04-28 11:01:45,986] [ DEBUG] - skip_profile_timer : True
[2024-04-28 11:01:45,987] [ DEBUG] - strategy : dynabert
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_config :
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_degree : -1
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_rank : 0
[2024-04-28 11:01:45,987] [ DEBUG] - to_static : False
[2024-04-28 11:01:45,987] [ DEBUG] - train_batch_size : 32
[2024-04-28 11:01:45,987] [ DEBUG] - unified_checkpoint : False
[2024-04-28 11:01:45,987] [ DEBUG] - unified_checkpoint_config :
[2024-04-28 11:01:45,987] [ DEBUG] - use_hybrid_parallel : False
[2024-04-28 11:01:45,987] [ DEBUG] - use_pact : True
[2024-04-28 11:01:45,987] [ DEBUG] - wandb_api_key : None
[2024-04-28 11:01:45,987] [ DEBUG] - warmup_ratio : 0.1
[2024-04-28 11:01:45,988] [ DEBUG] - warmup_steps : 0
[2024-04-28 11:01:45,988] [ DEBUG] - weight_decay : 0.0
[2024-04-28 11:01:45,988] [ DEBUG] - weight_name_suffix : None
[2024-04-28 11:01:45,988] [ DEBUG] - weight_quantize_type : channel_wise_abs_max
[2024-04-28 11:01:45,988] [ DEBUG] - width_mult_list : ['3/4', '2/3', '1/2']
[2024-04-28 11:01:45,988] [ DEBUG] - world_size : 1
[2024-04-28 11:01:45,988] [ DEBUG] -
Traceback (most recent call last):
  File "/nlp/PaddleNLP/applications/text_classification/multi_class/train.py", line 230, in <module>
    main()
  File "/nlp/PaddleNLP/applications/text_classification/multi_class/train.py", line 216, in main
    trainer.compress()
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 73, in compress
    _dynabert(self, self.model)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 158, in _dynabert
    ofa_model, teacher_model = _dynabert_init(self, model, eval_dataloader)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 300, in _dynabert_init
    head_importance, neuron_importance = compute_neuron_head_importance(
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ofa_utils.py", line 307, in compute_neuron_head_importance
    logits = model(**batch)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 462, in forward
    outputs = self.ernie(
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 907, in auto_model_dynabert_forward
    embedding_output = self.embeddings(**embedding_kwargs)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 127, in forward
    embeddings = self.layer_norm(embeddings)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddleslim/nas/ofa/layers.py", line 1301, in forward
    out, _, _ = paddle._C_ops.layer_norm(
ValueError: too many values to unpack (expected 3)


w5688414 · 2024-04-29T07:26:25Z

Try downgrading paddle to a 2.5.x version.
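A quick way to see whether an environment already matches this suggestion is to compare the installed version against the 2.5.x series. The sketch below is only an illustration: the helper names `installed_version` and `is_in_series` are hypothetical, and the distribution names are the ones from the environment list in the report (a GPU build installed under a different distribution name would not be found by this check).

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_in_series(version: str, major: int, minor: int) -> bool:
    """True if `version` begins with `major.minor`, e.g. 2.5.2 is in the 2.5 series."""
    parts = version.split(".")
    if len(parts) < 2:
        return False
    try:
        return int(parts[0]) == major and int(parts[1]) == minor
    except ValueError:
        # Non-numeric components (unusual version strings) are treated as no match.
        return False


if __name__ == "__main__":
    # Report the packages that appear in the traceback.
    for name in ("paddlepaddle", "paddleslim", "paddlenlp"):
        print(name, installed_version(name))

    paddle_version = installed_version("paddlepaddle")
    if paddle_version is not None and not is_in_series(paddle_version, 2, 5):
        print(f"paddlepaddle {paddle_version} is outside the suggested 2.5.x series")
```

With the versions listed in the report, `is_in_series("2.6.1", 2, 5)` is false, so the script would flag the installed paddlepaddle as outside the suggested series.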

mengxiangyu-png added the "question" label (Further information is requested) Apr 29, 2024

paddle-bot bot assigned lugimzzz Apr 29, 2024

w5688414 closed this as completed May 13, 2024
