Integrating ReFT #1654
Comments
Hi, thanks for bringing this to our attention. We had already looked at (Lo)ReFT internally and had some discussion about whether it would be a good addition to PEFT. IIRC, the ReFT repo was heavily relying on pyvene, does your fork do that too or are you integrating the pyvene code? Maybe you can open a draft PR so that we can more easily discuss what has been changed. Regarding the paper itself, I've only skimmed it, so I don't have the full picture. I'll quote myself on what I had to say internally:
Do you have any further insights into that?
Thank you for the feedback. I will make a draft PR for discussion.
@raven38 @BenjaminBossan hey! I was randomly browsing on GitHub and found this ticket, super exciting to see the PEFT library potentially supporting ReFT. Although the current pyreft + pyvene could support more schematic ReFT designs, integrating with the PEFT library could scale up simple ReFT experiments very effectively (i.e., different levels of parallelism, checkpointing)! One input here that might be helpful is batching. Another comment that might be helpful is the KV cache: we currently only intervene on the prompt tokens, so the intervened KV for the prompt tokens can be cached, and there should be no inference overhead when generating (this differs from adapters, depending on the implementation). You guys might already take care of that though. Thanks again!
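To illustrate the prompt-only intervention point: during prefill the intervention edits the prompt hidden states, whose keys/values then land in the cache, while decode steps pass through untouched. A minimal sketch (function and argument names are illustrative, not pyreft/PEFT API):

```python
import torch

def intervene_prompt_only(hidden_states, intervention, prompt_len, cache_position):
    """Apply a ReFT-style intervention only to prompt-token positions.

    hidden_states: (batch, seq, dim). cache_position is the index of the
    first token in this forward pass: 0 during prefill, >= prompt_len
    during generation once past keys/values are cached.
    """
    if cache_position >= prompt_len:
        # Decode step: tokens are past the prompt, nothing to intervene on.
        # The prompt's intervened KV entries are already in the cache,
        # so generation incurs no extra overhead.
        return hidden_states
    # Prefill: intervene on the prompt positions present in this pass.
    end = min(prompt_len - cache_position, hidden_states.shape[1])
    out = hidden_states.clone()
    out[:, :end] = intervention(hidden_states[:, :end])
    return out
```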
Thanks for sharing that information @frankaging. I haven't yet checked the PR in detail or compared it to the pyreft/pyvene code; @raven38 should be better positioned to answer your question. One thing I wondered: from your perspective, do you think pyreft/pyvene is structured in a way that we could add it as an optional dependency to PEFT and re-use its code, or would it be easier to re-implement it from scratch, as the PR currently does?
@frankaging Thank you for the feedback. I am also thinking about the issue of batching. In pyreft, intervention locations are set on the dataset side, but considering compatibility with other adapters' APIs, I don't think this solution is appropriate. I don't yet have a solution for this issue, but I would like to hear if you have any ideas. I'm also interested in the KV cache. Could you point me to KV cache implementations in other adapters?
Good point. We could put the burden of providing the intervention information on the user, but ideally they would only need to set the parameters and we would apply the intervention automatically. I'm not sure how generalizable that is (beyond language models, for instance); we may want to offer an option to intervene on all of the input data, perhaps?
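One way to avoid dataset-side location metadata would be to derive intervention positions from the attention mask at runtime, so batching works without per-example annotations. A minimal sketch of that idea (the "first k and last k non-pad tokens" scheme and all names here are hypothetical, not the pyreft API):

```python
import torch

def positions_from_mask(attention_mask, k=2):
    """Derive per-example intervention positions from the attention mask
    alone (hypothetical scheme: first k and last k non-pad tokens).
    Returns a (batch, 2k) LongTensor, assuming left- or right-padded
    sequences each with at least 2k real tokens."""
    positions = []
    for mask in attention_mask:          # loop for clarity; vectorizable
        idx = mask.nonzero(as_tuple=True)[0]
        positions.append(torch.cat([idx[:k], idx[-k:]]))
    return torch.stack(positions)

def apply_at_positions(hidden_states, positions, intervention):
    """Scatter an intervention back into the selected positions."""
    b = torch.arange(hidden_states.shape[0]).unsqueeze(-1)  # (batch, 1)
    out = hidden_states.clone()
    out[b, positions] = intervention(hidden_states[b, positions])
    return out
```

This keeps the user-facing API at "set k" while each example in a padded batch still gets its own positions.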
If you mean inside of PEFT, note that we don't have any PEFT-specific KV cache. If transformers adds one to a model, we want to make sure it can be used properly, but we don't maintain anything of our own. If you see potential for performance gains by adding something in PEFT, let us know.
Hello all, and thank you for your great work!
ReFT, a representation finetuning framework that is more parameter-efficient than popular PEFT methods like LoRA, was announced earlier this month.
I implemented LoReFT, an instance of ReFT, on top of your PEFT library, referencing the original implementation. My implementation is available at https://github.com/raven38/peft
Would you be interested in integrating ReFT into PEFT? I would be happy to work on this if there is interest from you and the community.
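For reference, the core LoReFT intervention from the ReFT paper is Φ(h) = h + Rᵀ(Wh + b − Rh), where R ∈ ℝ^(r×d) has orthonormal rows and W, b define a learned projected source. A minimal PyTorch sketch of that formula (class and attribute names are illustrative, not the linked fork's code):

```python
import torch
import torch.nn as nn

class LoReFTIntervention(nn.Module):
    """Sketch of the LoReFT edit phi(h) = h + R^T (W h + b - R h).

    R (rank x dim) is kept row-orthonormal via torch's orthogonal
    parametrization; proj computes W h + b.
    """
    def __init__(self, dim, rank):
        super().__init__()
        self.R = nn.utils.parametrizations.orthogonal(
            nn.Linear(dim, rank, bias=False))
        self.proj = nn.Linear(dim, rank)  # W h + b

    def forward(self, h):
        rh = self.R(h)                    # R h, shape (..., rank)
        # + R^T (W h + b - R h): project the edit back to model dim.
        return h + (self.proj(h) - rh) @ self.R.weight
```

Because only R, W, and b are trained (O(rd) parameters per intervened position set), this is where the parameter-efficiency claim comes from.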