Support for Volta / Turing architectures #160

tgaddair · 2024-03-07T00:31:56Z

I saw that support for sm75 / sm70 is listed in progress (https://docs.flashinfer.ai/installation.html) but didn't see an issue to track. Is this something planned in the near-term or further out on the roadmap? Thanks!

aliencaocao · 2024-03-12T10:26:45Z

It's tracked in #19, but so far I don't think there has been any movement in the codebase.

yzh119 · 2024-03-13T01:07:53Z

@aliencaocao @tgaddair Part of the work was done in #128, but there is still some work to do to accommodate the small shared memory size of sm75.

Regarding sm70, I have made some local attempts, but the performance is not good because I'm using a software emulation of the ldmatrix intrinsic. My plan is to write standalone prefill/decode kernels for sm70, because it supports neither asynchronous memory copy nor the native ldmatrix intrinsic.

Both are still on my TODO list. I expect to finish sm75 support soon, but sm70 will take more effort to debug and tune for performance. I will try my best.
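
To make the hardware constraints above concrete, here is a minimal sketch (not FlashInfer code) that maps a CUDA compute capability to the two features mentioned: the ldmatrix instruction, which is available from sm75, and asynchronous global-to-shared copy (cp.async), which is available from sm80. The names `KernelFeatures` and `kernel_features` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple, Tuple


class KernelFeatures(NamedTuple):
    """Hypothetical summary of the two hardware features discussed above."""
    has_ldmatrix: bool   # native ldmatrix instruction (sm75 and newer)
    has_cp_async: bool   # asynchronous global->shared copy (sm80 and newer)


def kernel_features(capability: Tuple[int, int]) -> KernelFeatures:
    """Map a (major, minor) compute capability to feature availability."""
    return KernelFeatures(
        has_ldmatrix=capability >= (7, 5),
        has_cp_async=capability >= (8, 0),
    )


if __name__ == "__main__":
    # Volta (sm70), Turing (sm75), and Ampere (sm80) for comparison.
    for name, cap in [("sm70", (7, 0)), ("sm75", (7, 5)), ("sm80", (8, 0))]:
        print(name, kernel_features(cap))
```

On a machine with PyTorch installed, the tuple could come from `torch.cuda.get_device_capability()`. The sketch shows why sm70 is the harder target: it lacks both features, whereas sm75 lacks only the asynchronous copy.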

aliencaocao · 2024-03-13T01:27:30Z

Thanks for the update, looking forward to sm70.

K-Mistele · 2024-03-18T21:14:01Z

This would be really great, since I would love to be able to use this on my Volta devices.
