
LoRA + FlashAttention2 speed up? #677

Open
zhoumengbo opened this issue Nov 11, 2023 · 1 comment

Comments

@zhoumengbo

When fine-tuning Mistral with LoRA, does FlashAttention2 help speed up the process? If so, how significant is the acceleration, and where does the primary speedup come from?

@research4pan (Contributor)

Thanks for your interest in LMFlow! Theoretically it should help: FlashAttention improves the memory-access pattern (cache-friendliness) of the attention operation, and that benefit also applies to the forward pass of the frozen base model during LoRA fine-tuning. However, we haven't run empirical tests on this, which is indeed an interesting topic 😄
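For anyone who wants to try this, here is a minimal sketch of enabling FlashAttention-2 when loading Mistral for LoRA fine-tuning with the Hugging Face transformers and peft libraries. This is not LMFlow's actual pipeline; the model name and LoRA hyperparameters below are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load Mistral with FlashAttention-2 enabled. This requires the
# flash-attn package and a recent GPU, and FA2 only supports
# fp16/bf16 weights.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)

# Wrap the frozen base model with LoRA adapters. Only the adapter
# weights are trained, but every forward pass still runs the base
# model's attention, which is where FlashAttention-2 saves time.
lora_config = LoraConfig(
    r=16,  # illustrative rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```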
