interesting new finetuning approach from stanford - ReFT #658

fblissjr · 2024-04-06T15:35:20Z

It uses flash-attn and pyvene (https://github.com/stanfordnlp/pyvene), but I don't see any custom kernels aside from flash-attn. I tried it on my CUDA machine and it's neat. Not sure how effective it is at scale yet, but it seems worth exploring. Anyone else looking into this?
