[FEATURE] Support FasterViT #1842

Open
seefun opened this issue Jun 12, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@seefun
Contributor

seefun commented Jun 12, 2023

“FasterViT: Fast Vision Transformers with Hierarchical Attention”

https://github.com/NVlabs/FasterViT

The code is built on timm and provides pretrained ImageNet-1k weights. However, many of its layers are custom and differ from timm's implementations, so I'm not sure whether we would need to make significant adjustments to the code.

It looks interesting, but it doesn't seem like the paper has been released.

@seefun seefun added the enhancement New feature or request label Jun 12, 2023
@rwightman
Collaborator

Yeah, I noticed this one. It is timm-oriented but, as always, it bakes in square image size assumptions and puts the downsample at the end of the blocks, so it needs a decent amount of attention to fix and remap :(

I really truly don't understand the obsession with putting downsample at the end of vit/hybrid blocks :(

The other thing is, I've never found gcvit (same authors) to be particularly easy to train or fine-tune (including reproducing the original results) compared to vit, swin, or convnext (which I've successfully managed to reproduce and improve on). I wonder how this compares... Given the complexity of the model code, I found the throughput numbers surprising, as more code usually means more activations and slower speeds.
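
To illustrate the square image size assumption mentioned above, here is a minimal sketch. It is not taken from the FasterViT code; both helper names are hypothetical. It shows why recovering the token grid from the token count with a square root is fragile, and how carrying the `(H, W)` grid explicitly avoids the problem.

```python
import math
from typing import Tuple


def grid_from_tokens_square(num_tokens: int) -> Tuple[int, int]:
    """Hypothetical helper: recover (H, W) from a token count,
    assuming the grid is square."""
    side = math.isqrt(num_tokens)
    if side * side != num_tokens:
        raise ValueError(f"{num_tokens} tokens do not form a square grid")
    return side, side


def grid_from_image(img_size: Tuple[int, int], patch_size: int) -> Tuple[int, int]:
    """Hypothetical helper: derive (H, W) from the actual image size,
    with no square assumption."""
    height, width = img_size
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch size")
    return height // patch_size, width // patch_size


# A 224x320 image with 16x16 patches gives a 14x20 grid (280 tokens),
# which the square version rejects outright.
print(grid_from_image((224, 320), 16))

# A 128x512 image gives an 8x32 grid (256 tokens). The square version
# accepts 256 tokens but returns the wrong shape without any error.
print(grid_from_image((128, 512), 16), grid_from_tokens_square(256))
```

The second case is the more troublesome one in practice: the token count happens to be a perfect square, so a reshape based on it succeeds but scrambles the spatial layout.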

@tp-nan

tp-nan commented Aug 11, 2023

Hi guys, is there any update on this issue? The throughput is really high.

@youssefadr

Hi, I can take this one. I'll begin by moving the downsamples as mentioned here
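
As a rough sketch of what moving the downsamples could mean for pretrained weights: if the downsample at the end of stage `i` becomes the downsample at the start of stage `i + 1`, the checkpoint keys need the same shift. The key layout below (`levels.N.downsample.*` renamed to `stages.N+1.downsample.*`) and the function name are assumptions for illustration only and have not been checked against the actual FasterViT checkpoints.

```python
import re
from typing import Any, Dict


def remap_downsample_keys(
    state_dict: Dict[str, Any],
    old_prefix: str = "levels",   # assumed name of the stage list in the source model
    new_prefix: str = "stages",   # assumed name of the stage list in the target model
) -> Dict[str, Any]:
    """Hypothetical remap: move each stage-end downsample to the start
    of the following stage and rename the stage prefix."""
    pattern = re.compile(rf"^{re.escape(old_prefix)}\.(\d+)\.downsample\.(.+)$")
    remapped = {}
    for key, value in state_dict.items():
        match = pattern.match(key)
        if match:
            # Downsample of stage i now belongs to stage i + 1.
            stage_idx = int(match.group(1)) + 1
            remapped[f"{new_prefix}.{stage_idx}.downsample.{match.group(2)}"] = value
        elif key.startswith(old_prefix + "."):
            # Other stage parameters keep their index, only the prefix changes.
            remapped[new_prefix + key[len(old_prefix):]] = value
        else:
            # Everything outside the stages is passed through unchanged.
            remapped[key] = value
    return remapped


checkpoint = {
    "levels.0.blocks.0.weight": 1,
    "levels.0.downsample.conv.weight": 2,
    "head.weight": 3,
}
print(remap_downsample_keys(checkpoint))
```

Plain integers stand in for tensors here so the sketch runs without any dependencies; the same function works on a real state dict since it only touches the keys.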
