[LoRA] Roadmap of LoRA operators #199

Open · 3 tasks

yzh119 opened this issue Apr 8, 2024 · 1 comment
yzh119 (Collaborator) commented Apr 8, 2024

  1. Reducing the latency of LoRA operators (per lorax feedback, LoRA operators introduce ~20% end-to-end overhead); the underlying computation is sketched below.
  2. Fixing numerical issues in LoRA operators at large batch sizes.
  3. Using fp8 tensor cores for LoRA operators.
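
For reference, here is a minimal PyTorch sketch of what a batched multi-adapter LoRA operator computes (names and shapes are illustrative, not FlashInfer's actual API). Each request selects its own low-rank pair (A, B), which is why the operator is effectively a grouped GEMM, and the low-rank update path is where both the latency overhead and any large-batch numerical issues would surface:

```python
# Illustrative reference only, not FlashInfer's API: each request i picks
# its own low-rank adapter (A_i, B_i), so the kernel is a grouped GEMM.
import torch

def lora_reference(
    x: torch.Tensor,            # [batch, d_in] input activations
    w: torch.Tensor,            # [d_in, d_out] shared base weight
    lora_a: torch.Tensor,       # [num_adapters, d_in, r] down-projections
    lora_b: torch.Tensor,       # [num_adapters, r, d_out] up-projections
    adapter_idx: torch.Tensor,  # [batch] adapter id per request
    scaling: float = 1.0,       # typically alpha / r
) -> torch.Tensor:
    base = x @ w
    # Gather each request's adapter and apply the low-rank update.
    a = lora_a[adapter_idx]     # [batch, d_in, r]
    b = lora_b[adapter_idx]     # [batch, r, d_out]
    delta = torch.bmm(torch.bmm(x.unsqueeze(1), a), b).squeeze(1)
    return base + scaling * delta
```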
tgaddair commented

Thanks for filing this issue @yzh119! Happy to help out in any way I can.

yzh119 added a commit that referenced this issue Jun 5, 2024
First step towards #199 .

Grouped GEMM should also be helpful for MoE.
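
The same grouped-GEMM primitive generalizes to MoE: replace per-request adapters with per-expert weight matrices and segment rows by expert assignment. A loop-based PyTorch reference (again illustrative, not the actual kernel):

```python
# Grouped GEMM reference (illustrative): rows of x are pre-sorted into
# contiguous segments, and segment g is multiplied by its own weight
# matrix. For LoRA the groups are adapters; for MoE they are experts.
import torch

def grouped_gemm_reference(
    x: torch.Tensor,            # [total_rows, d_in], rows grouped by segment
    weights: torch.Tensor,      # [num_groups, d_in, d_out]
    seg_offsets: torch.Tensor,  # [num_groups + 1] row offsets per segment
) -> torch.Tensor:
    out = x.new_empty(x.shape[0], weights.shape[-1])
    for g in range(weights.shape[0]):
        lo, hi = seg_offsets[g].item(), seg_offsets[g + 1].item()
        out[lo:hi] = x[lo:hi] @ weights[g]  # one GEMM per segment
    return out
```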