
Community contribution - BetterTransformer integration for more models! #488

Open
9 of 15 tasks
younesbelkada opened this issue Nov 18, 2022 · 24 comments · Fixed by #923 · 4 remaining pull requests
Labels
good first issue Good for newcomers

Comments

@younesbelkada
Contributor

younesbelkada commented Nov 18, 2022

BetterTransformer integration for more models!

BetterTransformer API provides faster inference on CPU & GPU through a simple interface!

Models can benefit from significant speedups with a simple one-liner, provided the latest version of PyTorch is installed. A complete guide on how to convert a new model is available in the BetterTransformer documentation!

Here is a list of models that could potentially be supported; pick one of the architectures below and let's discuss the conversion!

Text models 🖊️ :

Vision models 📷 :

Audio models 🔉 :

Also let us know if you think we missed an architecture that could be supported. Note that for the encoder-decoder models below, we expect to convert the encoder only.

Support for decoder-based models coming soon!

cc @michaelbenayoun @fxmarty

huggingface/transformers#20372

@Sumanth077
Contributor

Hi @younesbelkada, I would love to contribute to this issue and can work on FSMT.

@younesbelkada
Contributor Author

younesbelkada commented Nov 19, 2022

Hey @Sumanth077, thanks a bunch for your interest in this issue! 🚀 I would love to assist you with the integration; let's make this happen!
I have updated the table above and am linking the contribution tutorial here ;)
Would you mind forking this repo and opening a draft pull request so that I can start guiding you there?
Also, please do not hesitate to ping us here about any issue you face during the integration 💪

@Sumanth077
Contributor

Thank you for the reply @younesbelkada. I just opened a draft pull request; I haven't made any significant changes yet.

In Step 1 (identifying the source layer to change), I couldn't find an entry in BETTER_TRANFORMER_LAYERS_MAPPING_DICT mapping the FSMT module to its BetterTransformer equivalent.

Should I start creating that? I would love your assistance.
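For other contributors hitting the same question, the missing piece is a new entry in the mapping dict. A hypothetical sketch of what such an entry looks like (the key and class names here are placeholders, not the actual optimum source):

```python
# Hypothetical sketch only: the real dict lives in optimum's BetterTransformer
# integration; the key and value names below are illustrative placeholders.
BETTER_TRANFORMER_LAYERS_MAPPING_DICT = {
    # source `transformers` layer class name -> BetterTransformer wrapper
    "EncoderLayer": "FSMTEncoderLayerBetterTransformer",
}

print("EncoderLayer" in BETTER_TRANFORMER_LAYERS_MAPPING_DICT)  # -> True
```

The contribution guide linked above walks through where the real dict lives and what the wrapper class must implement.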

@younesbelkada
Contributor Author

Hi @Sumanth077 , I have just replied on your PR, let's continue the discussion there ;)

@ka00ri
Contributor

ka00ri commented Nov 22, 2022

Hi, I would like to contribute as well. This would be my first contribution to open source, so I might need some hand holding 🤚

I followed the documentation and the progress made on FSMT in #494 to better understand the task.

I looked into ViLT via

```python
model = AutoModel.from_pretrained("dandelin/vilt-b32-mlm")
```

and as I understand the documentation, this should be the source layer to make changes to, including its attributes:

```
(0): ViltLayer(
  (attention): ViltAttention(
    (attention): ViltSelfAttention(
      (query): Linear(in_features=768, out_features=768, bias=True)
      (key): Linear(in_features=768, out_features=768, bias=True)
      (value): Linear(in_features=768, out_features=768, bias=True)
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (output): ViltSelfOutput(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (dropout): Dropout(p=0.0, inplace=False)
    )
  )
  (intermediate): ViltIntermediate(
    (dense): Linear(in_features=768, out_features=3072, bias=True)
    (intermediate_act_fn): GELUActivation()
  )
  (output): ViltOutput(
    (dense): Linear(in_features=3072, out_features=768, bias=True)
    (dropout): Dropout(p=0.0, inplace=False)
  )
  (layernorm_before): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
  (layernorm_after): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
)
```
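As a generic way to double-check which class to target, one can walk the module tree. A minimal, self-contained sketch with a toy stand-in layer (`ToyLayer` is illustrative, not a `transformers` class):

```python
import torch.nn as nn

# Toy stand-in for an encoder block (illustrative only, not a transformers
# class): the point is how to walk named_modules() to locate every instance
# of the layer class that a BetterTransformer conversion would replace.
class ToyLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.query = nn.Linear(768, 768)
        self.key = nn.Linear(768, 768)
        self.value = nn.Linear(768, 768)

encoder = nn.ModuleList([ToyLayer() for _ in range(2)])
targets = [name for name, mod in encoder.named_modules()
           if isinstance(mod, ToyLayer)]
print(targets)  # -> ['0', '1']
```

On a real checkpoint the same loop, filtered on the encoder-layer class (e.g. `ViltLayer`), lists every block the conversion would need to swap.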

I could give the ViltLayer a go, if that's ok with you @younesbelkada 🙂

@younesbelkada
Contributor Author

younesbelkada commented Nov 23, 2022

Hi @ka00ri!
Thanks a lot for your message and interest in contributing! I would love to assist you with integrating ViLT into BetterTransformer 💪
That is correct; this layer is the source layer to change!
Would you mind opening a PR and tagging us (myself, @michaelbenayoun & @fxmarty)? Thanks a bunch!

@younesbelkada younesbelkada pinned this issue Dec 7, 2022
@adit299 adit299 linked a pull request Dec 27, 2022 that will close this issue
3 tasks
@adit299
Contributor

adit299 commented Dec 27, 2022

Hello, apologies for the delay, but I just opened a draft PR to start a discussion on adding BetterTransformer support for the ProphetNet encoder layer. I had a couple of questions about how to do this, so I was wondering who would be the best person to reach out to. @michaelbenayoun @fxmarty @younesbelkada

@fxmarty
Collaborator

fxmarty commented Dec 29, 2022

Hi @adit299 , thanks for adding the support for this architecture! Feel free to ask any question in the PR you opened.

@JanFidor

Hi @younesbelkada, could I pick up the RoFormer?

@soma2000-lang

@younesbelkada doing Detr - DetrLayer

@younesbelkada
Contributor Author

Hello @JanFidor
Yes sure!
@soma2000-lang perfect, let us know when you open a PR 💪 !

@JanFidor

JanFidor commented Mar 6, 2023

@younesbelkada Hi, thanks for responding. I'm not 100% certain, but I think RemBert, RoFormer and RocBert are already implemented, as they're already added to init.py, overview.mdx and the test file. If that's the case, the list of models left to implement would need to be updated; let me know if you agree!

@younesbelkada
Contributor Author

I see, thanks for clarifying. I will double-check that and let you know.

@younesbelkada
Contributor Author

Thanks for letting me know! Indeed, these are already implemented.
I can propose that you add BetterTransformer support for BLIP (I updated the table above).

@JanFidor

JanFidor commented Mar 7, 2023

Thanks for the suggestion, I'll get on it!

@ravenouse
Contributor

Hi @fxmarty and @younesbelkada !

Thank you so much for your previous help and support on my implementation of MBart support for BetterTransformer.

I want to follow up on my PR on ASTLayer support for BetterTransformer.

Specifically, I would like to check with you if it is still possible to work on this and have it reviewed and merged into the package. If it is, I would be happy to continue working on it.

I realized the whole BetterTransformer part and its testing have changed a lot in the last several months. Once confirmed, I will start editing my code accordingly to match these changes.

Thank you so much for your time and help, and I look forward to hearing back from you soon.

Sincerely,

@fxmarty fxmarty reopened this Mar 29, 2023
@miyu386 miyu386 linked a pull request Apr 12, 2023 that will close this issue
3 tasks
@awinml awinml linked a pull request May 27, 2023 that will close this issue
@rajveer43

rajveer43 commented Aug 4, 2023

@younesbelkada I would like to work on FlavaLayer; can you confirm whether it is done or not?

@mszsorondo
Contributor

mszsorondo commented Aug 7, 2023

Hi! @JanFidor, are you going to finish BLIP? If not, I can take it over, with the permission of @younesbelkada @fxmarty

@fxmarty
Copy link
Collaborator

fxmarty commented Aug 11, 2023

Hi,

@mszsorondo Looking into the PRs, BLIP has been implemented in #1125. I just ticked it in the first post.
@rajveer43 For Flava, there is this ongoing PR: #907

@rajveer43

@fxmarty is any other model available to work on?

@mszsorondo
Contributor

@fxmarty same here, if there's still any model available

@issamarabi issamarabi linked a pull request Oct 1, 2023 that will close this issue
3 tasks
@hackpk

hackpk commented Nov 20, 2023

@younesbelkada Can I work on ASTLayer?

@karandua2016

Any plans to add support for MPT?
