
fsdp-qlora Yi-34B-Chat throws "ValueError: Cannot flatten integer dtype tensors" #3470

Closed
hellostronger opened this issue Apr 26, 2024 · 6 comments
Labels
solved This problem has been already solved.

Comments


Reminder

  • I have read the README and searched the existing issues.

Reproduction

CUDA_VISIBLE_DEVICES=0,1 accelerate launch \
    --config_file config.yaml \
    src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path /workspace/models/Yi-34B-Chat \
    --dataset law_with_basis \
    --dataset_dir data \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir /workspace/ckpt/Yi-34B-Chat-sft \
    --overwrite_cache \
    --overwrite_output_dir \
    --cutoff_len 1024 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --eval_steps 100 \
    --evaluation_strategy steps \
    --load_best_model_at_end \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --max_samples 3000 \
    --val_size 0.1 \
    --quantization_bit 4 \
    --plot_loss \
    --fp16

config.yaml

compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch: BACKWARD_PRE
  fsdp_cpu_ram_efficient_loading: true
  fsdp_forward_prefetch: false
  fsdp_offload_params: true
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sync_module_states: true
  fsdp_use_orig_params: false
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
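
(For reference, a quick way to confirm which accelerate version and which config values are actually picked up is accelerate's own environment report; a minimal sketch, assuming the accelerate CLI is on PATH and config.yaml sits in the working directory:)

# print the accelerate version, platform info, and the parsed config
accelerate env --config_file config.yaml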

Expected behavior

FSDP + QLoRA fine-tuning of Yi-34B-Chat should run without errors.

System Info

transformers 4.39.3, torch 2.1.2, CUDA 12.1, Python 3.8

Others

[screenshots of the error traceback: ValueError: Cannot flatten integer dtype tensors]


hellostronger commented Apr 26, 2024

I have seen an existing issue from March, but I could not find any useful information there to figure out why this error occurs. Hoping for your suggestions.


hiyouga commented Apr 26, 2024

Please provide your versions of accelerate and bitsandbytes.
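
(A minimal sketch of how to collect those versions with pip; the package names are as published on PyPI:)

# show installed versions of the two packages in question
pip show accelerate bitsandbytes | grep -E "^(Name|Version)"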

hiyouga added the pending (This problem is yet to be addressed.) label on Apr 26, 2024

hellostronger commented Apr 28, 2024

@hiyouga accelerate==0.28.0, bitsandbytes==0.43.0. Are there any problems with these versions? Hoping for your suggestion.


hiyouga commented Apr 28, 2024

Did you use the latest code?
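
(If the checkout is behind, a minimal sketch of updating to the latest LLaMA-Factory code and reinstalling it in editable mode; the clone directory name is an assumption and may differ from your setup:)

cd LLaMA-Factory        # path of the existing git clone (assumed)
git pull                # fetch the latest code
pip install -e .        # reinstall the package in editable mode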


etemiz commented Apr 29, 2024

While trying to train https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-70b I was getting the same error, "ValueError: Cannot flatten integer dtype tensors".
The error was resolved after I reinstalled LLaMA-Factory. These are the versions:

accelerate 0.29.3
bitsandbytes 0.43.1
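
(If a full reinstall is not desired, pinning the same dependency versions reported above should give an equivalent environment; a sketch:)

pip install "accelerate==0.29.3" "bitsandbytes==0.43.1"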

@hellostronger
Author

@hiyouga Sorry for my late reply on this. Using the newest LLaMA-Factory code, it works correctly now.

hiyouga added the solved (This problem has been already solved.) label and removed the pending label on May 4, 2024
hiyouga closed this as completed on May 4, 2024