
Following up on the earlier issue #691: after the follow-up improvements, the model still does not seem to reply according to the fine-tuning. #880

Open
xiaolvtongxue-zt opened this issue May 8, 2024 · 3 comments


@xiaolvtongxue-zt

xiaolvtongxue-zt commented May 8, 2024


(Earlier advice:) Merging LoRA into a quantized model causes large precision loss. We recommend quantizing after training, then using vLLM.

Hello. Following the previous feedback, this time I fine-tuned with LoRA directly and did not use quantization (quantization_bit = 0).
After training, I merged the model (so that it can later be served with vLLM):

CUDA_VISIBLE_DEVICES=0,1 swift export --ckpt_dir './swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69500' --sft_type 'lora' --merge_lora true --model_id_or_path './models/models/qwen/Qwen1___5-7B-Chat'

I then found that the merged model still does not reply in line with the actual fine-tuned result (i.e., the model before merging).
May I ask which step went wrong? It is quite puzzling.
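For context, a LoRA merge on a full-precision base is mathematically exact: the merged weight W' = W + (alpha/r)·B·A produces the same outputs as applying the adapter on the fly, so in principle merging alone should not change the model's replies. A minimal numpy sketch of this identity (illustrative names only, not ms-swift internals):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16            # hidden size, LoRA rank, lora_alpha

W = rng.normal(size=(d, d))       # frozen base weight
A = rng.normal(size=(r, d))       # LoRA down-projection
B = rng.normal(size=(d, r))       # LoRA up-projection
x = rng.normal(size=(d,))         # a dummy activation

scale = alpha / r
y_adapter = W @ x + scale * (B @ (A @ x))   # base + adapter applied on the fly
W_merged = W + scale * (B @ A)              # weights after merging
y_merged = W_merged @ x

# In full precision the two paths agree to numerical tolerance.
print(np.allclose(y_adapter, y_merged))     # → True
```

Any behavior difference after merging therefore has to come from something else, e.g. precision of the stored weights or a mismatch in which base checkpoint the adapter is merged into.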

@tastelikefeet
Collaborator

So it works fine without merging, but not once merged?

@xiaolvtongxue-zt
Author

So it works fine without merging, but not once merged?

Yes. Previously I was fine-tuning with quantization, so it was understandable that it didn't work; but after I dropped quantized fine-tuning, it still doesn't work once merged? I genuinely can't figure it out.
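The earlier advice about quantized merges can be illustrated: when the base weights are already on a low-precision grid, the small LoRA delta is partly rounded away during the merge, whereas merging into a full-precision base is exact. A rough numpy simulation, where fake_quant is a hypothetical crude symmetric 4-bit quantizer and not ms-swift's actual quantization:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 4, 32                                # hidden size, LoRA rank, lora_alpha

W = rng.normal(size=(d, d)).astype(np.float32)         # base weight
A = rng.normal(size=(r, d)).astype(np.float32) * 0.05  # LoRA down-projection
B = rng.normal(size=(d, r)).astype(np.float32) * 0.05  # LoRA up-projection
delta = (alpha / r) * (B @ A)                          # the adapter's weight update

def fake_quant(w, bits=4):
    # crude symmetric uniform quantization, for illustration only
    step = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / step) * step

W_true = W + delta                                     # ideal merged weight (fp32 base)

# Merging into an already-quantized base and storing the result back on the
# same low-precision grid loses much of the small delta.
W_q_merged = fake_quant(fake_quant(W) + delta)
rel_err = np.linalg.norm(W_q_merged - W_true) / np.linalg.norm(W_true)
print(rel_err)  # substantially above float rounding error
```

This is why the recommendation is to merge in full precision and quantize afterwards; it does not, however, explain a mismatch when no quantization is involved.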

To make it easier for you to check, here is the log output from my merge process:

run sh: `python /root/anaconda3/envs/python3.9/lib/python3.9/site-packages/swift/cli/export.py --ckpt_dir ./swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69000 --sft_type lora --merge_lora true --model_id_or_path ./models/models/qwen/Qwen1___5-7B-Chat`
2024-05-09 14:29:08,784 - modelscope - INFO - PyTorch version 2.1.2 Found.
2024-05-09 14:29:08,785 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-05-09 14:29:08,839 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 e31c85192ca44a9fd57e7f94f1a1180d and a total number of 976 components indexed
[INFO:swift] Start time of running main: 2024-05-09 14:29:09.570211
[INFO:swift] ckpt_dir: /home/centos/xiaolv/太安模型微调/swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69000
[INFO:swift] Setting model_info['revision']: master
[INFO:swift] Setting self.eval_human: True
[INFO:swift] Setting overwrite_generation_config: True
[INFO:swift] Setting args.dataset: ['alpaca-zh', 'alpaca-en']
[INFO:swift] args: ExportArguments(model_type='qwen1half-7b-chat', model_id_or_path='./models/models/qwen/Qwen1___5-7B-Chat', model_revision='master', sft_type='lora', template_type='qwen', infer_backend='vllm', ckpt_dir='/home/centos/xiaolv/太安模型微调/swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69000', load_args_from_ckpt_dir=True, load_dataset_config=False, eval_human=True, seed=42, dtype='bf16', dataset=['alpaca-zh', 'alpaca-en'], dataset_seed=42, dataset_test_ratio=0.01, val_dataset_sample=10, save_result=True, system='You are a helpful assistant.', max_length=None, truncation_strategy='delete', check_dataset_strategy='none', custom_train_dataset_path=[], custom_val_dataset_path=[], quantization_bit=0, bnb_4bit_comp_dtype='bf16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=2048, do_sample=True, temperature=0.3, top_k=20, top_p=0.7, repetition_penalty=1.0, num_beams=1, stop_words=None, use_flash_attn=None, ignore_args_error=False, stream=True, merge_lora=True, merge_device_map='auto', save_safetensors=True, overwrite_generation_config=True, verbose=None, gpu_memory_utilization=0.9, tensor_parallel_size=1, max_model_len=None, vllm_enable_lora=False, vllm_max_lora_rank=16, vllm_lora_modules=[], show_dataset_sample=10, safe_serialization=None, model_cache_dir=None, merge_lora_and_save=None, to_peft_format=False, quant_bits=0, quant_method='awq', quant_n_samples=256, quant_seqlen=2048, quant_device_map='cpu', push_to_hub=False, hub_model_id=None, hub_token=None, hub_private_repo=False, commit_message='update files')
[INFO:swift] Global seed set to 42
[INFO:swift] replace_if_exists: False
[INFO:swift] merged_lora_path: `/home/centos/xiaolv/太安模型微调/swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69000-merged`
[INFO:swift] device_count: 2
[INFO:swift] Loading the model using model_dir: ./models/models/qwen/Qwen1___5-7B-Chat
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:08<00:00,  2.07s/it]
[INFO:swift] model.max_model_len: 32768
[INFO:swift] generation_config: GenerationConfig {
  "do_sample": true,
  "eos_token_id": 151645,
  "max_new_tokens": 2048,
  "pad_token_id": 151643,
  "temperature": 0.3,
  "top_k": 20,
  "top_p": 0.7
}

[INFO:swift] SwiftModel: 7733.9075M Params (12.5829M Trainable [0.1627%]), 268.4375M Buffers.
[INFO:swift] system: You are a helpful assistant.
[INFO:swift] Merge LoRA...
[INFO:swift] Saving merged weights...
[INFO:swift] Successfully merged LoRA and saved in /home/centos/xiaolv/太安模型微调/swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69000-merged.
[INFO:swift] Setting args.sft_type: 'full'
[INFO:swift] Setting args.ckpt_dir: /home/centos/xiaolv/太安模型微调/swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240416-160243/checkpoint-69000-merged
[INFO:swift] End time of running main: 2024-05-09 14:30:00.366159

@xiaolvtongxue-zt
Author

@Jintao-Huang
