ValueError: too many values to unpack (expected 3) when pruning a model for a multi-class classification task #8345

Closed
mengxiangyu-png opened this issue Apr 29, 2024 · 1 comment

@mengxiangyu-png

Please ask your question

Location: applications/text_classification/multi_class, running the model pruning step
Environment:
paddle-bfloat 0.1.7
paddle2onnx 1.1.0
paddlefsl 1.1.0
paddlenlp 2.8.0
paddleocr 2.7.0.3
paddlepaddle 2.6.1
paddleslim 2.6.0
scikit-learn 1.4.2

Command used for pruning:
python3 train.py \
--do_compress \
--device cpu \
--model_name_or_path checkpoint \
--output_dir checkpoint/prune \
--learning_rate 3e-5 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 32 \
--num_train_epochs 1 \
--max_length 128 \
--logging_steps 5 \
--save_steps 100 \
--width_mult_list '3/4' '2/3' '1/2' \
--train_path "20231120data-mul-clas/train.txt" \
--dev_path "20231120data-mul-clas/dev.txt" \
--test_path "20231120data-mul-clas/test.txt" \
--label_path "20231120data-mul-clas/label.txt"

Error reported:
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 127, in forward
    embeddings = self.layer_norm(embeddings)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddleslim/nas/ofa/layers.py", line 1301, in forward
    out, _, _ = paddle._C_ops.layer_norm(
ValueError: too many values to unpack (expected 3)

What is causing this error?

The full log is attached below:
python3 train.py \
--do_compress \
--device cpu \
--model_name_or_path checkpoint \
--output_dir checkpoint/prune \
--learning_rate 3e-5 \
--per_device_train_batch_size 32 \
--per_device_eval_batch_size 32 \
--num_train_epochs 1 \
--max_length 128 \
--logging_steps 5 \
--save_steps 100 \
--width_mult_list '3/4' '2/3' '1/2' \
--train_path "20231120data-mul-clas/train.txt" \
--dev_path "20231120data-mul-clas/dev.txt" \
--test_path "20231120data-mul-clas/test.txt" \
--label_path "20231120data-mul-clas/label.txt"

[2024-04-28 11:01:41,103] [ INFO] - The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2024-04-28 11:01:41,104] [ INFO] - ============================================================
[2024-04-28 11:01:41,104] [ INFO] - Model Configuration Arguments
[2024-04-28 11:01:41,104] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:41,104] [ INFO] - export_model_dir :None
[2024-04-28 11:01:41,104] [ INFO] - model_name_or_path :checkpoint
[2024-04-28 11:01:41,104] [ INFO] -
[2024-04-28 11:01:41,104] [ INFO] - ============================================================
[2024-04-28 11:01:41,104] [ INFO] - Data Configuration Arguments
[2024-04-28 11:01:41,104] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:41,105] [ INFO] - bad_case_path :./data/bad_case.txt
[2024-04-28 11:01:41,105] [ INFO] - debug :False
[2024-04-28 11:01:41,105] [ INFO] - dev_path :20231120data-mul-clas/dev.txt
[2024-04-28 11:01:41,105] [ INFO] - early_stopping :False
[2024-04-28 11:01:41,105] [ INFO] - early_stopping_patience :4
[2024-04-28 11:01:41,105] [ INFO] - label_path :20231120data-mul-clas/label.txt
[2024-04-28 11:01:41,105] [ INFO] - max_length :128
[2024-04-28 11:01:41,105] [ INFO] - test_path :20231120data-mul-clas/test.txt
[2024-04-28 11:01:41,105] [ INFO] - train_path :20231120data-mul-clas/train.txt
[2024-04-28 11:01:41,105] [ INFO] -
[2024-04-28 11:01:41,106] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.modeling.ErnieForSequenceClassification'> to load 'checkpoint'.
[2024-04-28 11:01:41,106] [ INFO] - Loading configuration file checkpoint/config.json
[2024-04-28 11:01:41,106] [ INFO] - Loading weights file checkpoint/model_state.pdparams
[2024-04-28 11:01:41,571] [ INFO] - Loaded weights file from disk, setting weights to model.
[2024-04-28 11:01:45,845] [ INFO] - All model checkpoint weights were used when initializing ErnieForSequenceClassification.

[2024-04-28 11:01:45,846] [ INFO] - All the weights of ErnieForSequenceClassification were initialized from the model checkpoint at checkpoint.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ErnieForSequenceClassification for predictions without further training.
[2024-04-28 11:01:45,874] [ INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load 'checkpoint'.
[2024-04-28 11:01:45,899] [ INFO] - The global seed is set to 42, local seed is set to 43 and random seed is set to 42.
[2024-04-28 11:01:45,976] [ DEBUG] - ============================================================
[2024-04-28 11:01:45,976] [ DEBUG] - Training Configuration Arguments
[2024-04-28 11:01:45,976] [ DEBUG] - paddle commit id : fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-04-28 11:01:45,976] [ DEBUG] - paddlenlp commit id : 3105c18
[2024-04-28 11:01:45,976] [ DEBUG] - _no_sync_in_gradient_accumulation: True
[2024-04-28 11:01:45,977] [ DEBUG] - activation_quantize_type : None
[2024-04-28 11:01:45,977] [ DEBUG] - adam_beta1 : 0.9
[2024-04-28 11:01:45,977] [ DEBUG] - adam_beta2 : 0.999
[2024-04-28 11:01:45,977] [ DEBUG] - adam_epsilon : 1e-08
[2024-04-28 11:01:45,977] [ DEBUG] - algo_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_custom_black_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_custom_white_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - amp_master_grad : False
[2024-04-28 11:01:45,977] [ DEBUG] - batch_num_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - batch_size_list : None
[2024-04-28 11:01:45,977] [ DEBUG] - bf16 : False
[2024-04-28 11:01:45,978] [ DEBUG] - bf16_full_eval : False
[2024-04-28 11:01:45,978] [ DEBUG] - bias_correction : False
[2024-04-28 11:01:45,978] [ DEBUG] - current_device : cpu
[2024-04-28 11:01:45,978] [ DEBUG] - data_parallel_config :
[2024-04-28 11:01:45,978] [ DEBUG] - data_parallel_rank : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataloader_drop_last : False
[2024-04-28 11:01:45,978] [ DEBUG] - dataloader_num_workers : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataset_rank : 0
[2024-04-28 11:01:45,978] [ DEBUG] - dataset_world_size : 1
[2024-04-28 11:01:45,978] [ DEBUG] - device : cpu
[2024-04-28 11:01:45,978] [ DEBUG] - disable_tqdm : False
[2024-04-28 11:01:45,979] [ DEBUG] - distributed_dataloader : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_compress : True
[2024-04-28 11:01:45,979] [ DEBUG] - do_eval : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_export : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_predict : False
[2024-04-28 11:01:45,979] [ DEBUG] - do_train : False
[2024-04-28 11:01:45,979] [ DEBUG] - enable_auto_parallel : False
[2024-04-28 11:01:45,979] [ DEBUG] - eval_accumulation_steps : None
[2024-04-28 11:01:45,979] [ DEBUG] - eval_batch_size : 32
[2024-04-28 11:01:45,979] [ DEBUG] - eval_steps : None
[2024-04-28 11:01:45,979] [ DEBUG] - evaluation_strategy : IntervalStrategy.NO
[2024-04-28 11:01:45,979] [ DEBUG] - flatten_param_grads : False
[2024-04-28 11:01:45,980] [ DEBUG] - force_reshard_pp : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16 : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16_full_eval : False
[2024-04-28 11:01:45,980] [ DEBUG] - fp16_opt_level : O1
[2024-04-28 11:01:45,980] [ DEBUG] - gradient_accumulation_steps : 1
[2024-04-28 11:01:45,980] [ DEBUG] - greater_is_better : None
[2024-04-28 11:01:45,980] [ DEBUG] - hybrid_parallel_topo_order : pp_first
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_data_skip : False
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_load_lr_and_optim : False
[2024-04-28 11:01:45,980] [ DEBUG] - ignore_save_lr_and_optim : False
[2024-04-28 11:01:45,980] [ DEBUG] - input_dtype : int64
[2024-04-28 11:01:45,981] [ DEBUG] - input_infer_model_path : None
[2024-04-28 11:01:45,981] [ DEBUG] - label_names : None
[2024-04-28 11:01:45,981] [ DEBUG] - lazy_data_processing : True
[2024-04-28 11:01:45,981] [ DEBUG] - learning_rate : 3e-05
[2024-04-28 11:01:45,981] [ DEBUG] - load_best_model_at_end : False
[2024-04-28 11:01:45,981] [ DEBUG] - load_sharded_model : False
[2024-04-28 11:01:45,981] [ DEBUG] - local_process_index : 0
[2024-04-28 11:01:45,981] [ DEBUG] - local_rank : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_level : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_level_replica : -1
[2024-04-28 11:01:45,981] [ DEBUG] - log_on_each_node : True
[2024-04-28 11:01:45,981] [ DEBUG] - logging_dir : checkpoint/prune/runs/Apr28_11-01-41_meng-paddle2-6-2
[2024-04-28 11:01:45,982] [ DEBUG] - logging_first_step : False
[2024-04-28 11:01:45,982] [ DEBUG] - logging_steps : 5
[2024-04-28 11:01:45,982] [ DEBUG] - logging_strategy : IntervalStrategy.STEPS
[2024-04-28 11:01:45,982] [ DEBUG] - logical_process_index : 0
[2024-04-28 11:01:45,982] [ DEBUG] - lr_end : 1e-07
[2024-04-28 11:01:45,982] [ DEBUG] - lr_scheduler_type : SchedulerType.LINEAR
[2024-04-28 11:01:45,982] [ DEBUG] - max_evaluate_steps : -1
[2024-04-28 11:01:45,982] [ DEBUG] - max_grad_norm : 1.0
[2024-04-28 11:01:45,982] [ DEBUG] - max_steps : -1
[2024-04-28 11:01:45,982] [ DEBUG] - metric_for_best_model : None
[2024-04-28 11:01:45,982] [ DEBUG] - minimum_eval_times : None
[2024-04-28 11:01:45,983] [ DEBUG] - moving_rate : 0.9
[2024-04-28 11:01:45,983] [ DEBUG] - no_cuda : False
[2024-04-28 11:01:45,983] [ DEBUG] - num_cycles : 0.5
[2024-04-28 11:01:45,983] [ DEBUG] - num_train_epochs : 1.0
[2024-04-28 11:01:45,983] [ DEBUG] - onnx_format : True
[2024-04-28 11:01:45,983] [ DEBUG] - optim : OptimizerNames.ADAMW
[2024-04-28 11:01:45,983] [ DEBUG] - optimizer_name_suffix : None
[2024-04-28 11:01:45,983] [ DEBUG] - output_dir : checkpoint/prune
[2024-04-28 11:01:45,983] [ DEBUG] - overwrite_output_dir : False
[2024-04-28 11:01:45,983] [ DEBUG] - past_index : -1
[2024-04-28 11:01:45,983] [ DEBUG] - per_device_eval_batch_size : 32
[2024-04-28 11:01:45,983] [ DEBUG] - per_device_train_batch_size : 32
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_config :
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_degree : -1
[2024-04-28 11:01:45,984] [ DEBUG] - pipeline_parallel_rank : 0
[2024-04-28 11:01:45,984] [ DEBUG] - power : 1.0
[2024-04-28 11:01:45,984] [ DEBUG] - prediction_loss_only : False
[2024-04-28 11:01:45,984] [ DEBUG] - process_index : 0
[2024-04-28 11:01:45,984] [ DEBUG] - prune_embeddings : False
[2024-04-28 11:01:45,984] [ DEBUG] - recompute : False
[2024-04-28 11:01:45,984] [ DEBUG] - remove_unused_columns : True
[2024-04-28 11:01:45,984] [ DEBUG] - report_to : ['visualdl']
[2024-04-28 11:01:45,984] [ DEBUG] - resume_from_checkpoint : None
[2024-04-28 11:01:45,985] [ DEBUG] - round_type : round
[2024-04-28 11:01:45,985] [ DEBUG] - run_name : checkpoint/prune
[2024-04-28 11:01:45,985] [ DEBUG] - save_on_each_node : False
[2024-04-28 11:01:45,985] [ DEBUG] - save_sharded_model : False
[2024-04-28 11:01:45,985] [ DEBUG] - save_steps : 100
[2024-04-28 11:01:45,985] [ DEBUG] - save_strategy : IntervalStrategy.STEPS
[2024-04-28 11:01:45,985] [ DEBUG] - save_total_limit : None
[2024-04-28 11:01:45,985] [ DEBUG] - scale_loss : 32768
[2024-04-28 11:01:45,985] [ DEBUG] - seed : 42
[2024-04-28 11:01:45,985] [ DEBUG] - sep_parallel_degree : -1
[2024-04-28 11:01:45,985] [ DEBUG] - sharding : []
[2024-04-28 11:01:45,985] [ DEBUG] - sharding_degree : -1
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_config :
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_degree : -1
[2024-04-28 11:01:45,986] [ DEBUG] - sharding_parallel_rank : 0
[2024-04-28 11:01:45,986] [ DEBUG] - should_load_dataset : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_load_sharding_stage1_model: False
[2024-04-28 11:01:45,986] [ DEBUG] - should_log : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save_model_state : True
[2024-04-28 11:01:45,986] [ DEBUG] - should_save_sharding_stage1_model: False
[2024-04-28 11:01:45,986] [ DEBUG] - skip_memory_metrics : True
[2024-04-28 11:01:45,986] [ DEBUG] - skip_profile_timer : True
[2024-04-28 11:01:45,987] [ DEBUG] - strategy : dynabert
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_config :
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_degree : -1
[2024-04-28 11:01:45,987] [ DEBUG] - tensor_parallel_rank : 0
[2024-04-28 11:01:45,987] [ DEBUG] - to_static : False
[2024-04-28 11:01:45,987] [ DEBUG] - train_batch_size : 32
[2024-04-28 11:01:45,987] [ DEBUG] - unified_checkpoint : False
[2024-04-28 11:01:45,987] [ DEBUG] - unified_checkpoint_config :
[2024-04-28 11:01:45,987] [ DEBUG] - use_hybrid_parallel : False
[2024-04-28 11:01:45,987] [ DEBUG] - use_pact : True
[2024-04-28 11:01:45,987] [ DEBUG] - wandb_api_key : None
[2024-04-28 11:01:45,987] [ DEBUG] - warmup_ratio : 0.1
[2024-04-28 11:01:45,988] [ DEBUG] - warmup_steps : 0
[2024-04-28 11:01:45,988] [ DEBUG] - weight_decay : 0.0
[2024-04-28 11:01:45,988] [ DEBUG] - weight_name_suffix : None
[2024-04-28 11:01:45,988] [ DEBUG] - weight_quantize_type : channel_wise_abs_max
[2024-04-28 11:01:45,988] [ DEBUG] - width_mult_list : ['3/4', '2/3', '1/2']
[2024-04-28 11:01:45,988] [ DEBUG] - world_size : 1
[2024-04-28 11:01:45,988] [ DEBUG] -
Traceback (most recent call last):
  File "/nlp/PaddleNLP/applications/text_classification/multi_class/train.py", line 230, in <module>
    main()
  File "/nlp/PaddleNLP/applications/text_classification/multi_class/train.py", line 216, in main
    trainer.compress()
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 73, in compress
    _dynabert(self, self.model)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 158, in _dynabert
    ofa_model, teacher_model = _dynabert_init(self, model, eval_dataloader)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 300, in _dynabert_init
    head_importance, neuron_importance = compute_neuron_head_importance(
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ofa_utils.py", line 307, in compute_neuron_head_importance
    logits = model(**batch)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 462, in forward
    outputs = self.ernie(
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/trainer/trainer_compress.py", line 907, in auto_model_dynabert_forward
    embedding_output = self.embeddings(**embedding_kwargs)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddlenlp/transformers/ernie/modeling.py", line 127, in forward
    embeddings = self.layer_norm(embeddings)
  File "/usr/local/lib64/python3.9/site-packages/paddle/nn/layer/layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/paddleslim/nas/ofa/layers.py", line 1301, in forward
    out, _, _ = paddle._C_ops.layer_norm(
ValueError: too many values to unpack (expected 3)
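
The final message is Python's generic unpacking error: the statement `out, _, _ = ...` received more than three items from its right-hand side. The sketch below reproduces that class of error without Paddle. Both functions are hypothetical stand-ins, not the real `paddle._C_ops.layer_norm`, and the idea that the op's return value differs between Paddle versions is an assumption that the traceback alone does not confirm.

```python
def three_outputs():
    # Hypothetical stand-in for an op that returns exactly three values.
    return "out", "mean", "variance"


def batch_like_output(batch_size=32):
    # Hypothetical stand-in for a single result that is itself iterable,
    # for example one object holding 32 rows.
    return [f"row{i}" for i in range(batch_size)]


def try_unpack(fn):
    """Unpack fn()'s result into three names; return the error text, if any."""
    try:
        out, _, _ = fn()
        return None
    except ValueError as exc:
        return str(exc)


# Three values unpack cleanly, so no error text is returned.
print(try_unpack(three_outputs))
# More than three items cannot be unpacked into three names.
print(try_unpack(batch_like_output))
```

If the callee's return shape changed between releases, the fix belongs either in the caller (here `paddleslim/nas/ofa/layers.py`) or in pinning a version pair that matches.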

@mengxiangyu-png mengxiangyu-png added the question Further information is requested label Apr 29, 2024
@w5688414
Contributor

Try downgrading paddle to a 2.5.x version.
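
Before and after changing versions, it may help to confirm what is actually installed. The following is a minimal sketch using only the standard library. The package names come from the environment list in this issue; the helpers `in_series` and `installed_version` are illustrative names, not part of Paddle.

```python
from importlib import metadata


def in_series(version, series):
    """Return True if `version` belongs to a release series such as "2.5"."""
    wanted = series.split(".")
    return version.split(".")[: len(wanted)] == wanted


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the packages involved in the traceback.
for name in ("paddlepaddle", "paddleslim", "paddlenlp"):
    print(name, installed_version(name))

print(in_series("2.6.1", "2.5"))  # False: the version reported in this issue
print(in_series("2.5.2", "2.5"))  # True: an example of a 2.5.x version
```

A pinned install along the lines of `pip install "paddlepaddle==2.5.*"` is one way to follow the suggestion. Whether a given 2.5.x build works with paddleslim 2.6.0 is not confirmed in this thread.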
