Sliding window attention #159

Open
WoosukKwon opened this issue Mar 6, 2024 · 2 comments

Comments

@WoosukKwon

I saw this item in the roadmap, and I'm wondering whether this feature will be supported in the near future.

@yzh119
Collaborator

yzh119 commented Mar 6, 2024

I skipped the item because we don't need special support for SWA if we set page_size to 1.
For larger page_size, I think it's still necessary to have SWA support, so I've added it to the v0.0.4 release plan.
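
To illustrate the page_size point, here is a minimal sketch, not the library's actual code. The helper name window_pages and its return convention are hypothetical, and it assumes the window covers the last `window` tokens of a sequence stored in fixed-size pages. With page_size 1 the window always starts on a page boundary, so trimming the page table is enough; with a larger page_size the first page can contain tokens that fall outside the window and would have to be masked inside the kernel.

```python
def window_pages(seq_len, window, page_size):
    """Hypothetical helper: locate the pages covering the last `window` tokens.

    Assumes seq_len >= 1, window >= 1, page_size >= 1, and that token t
    lives in page t // page_size.
    Returns (first_page, last_page, skip), where `skip` is the number of
    leading tokens in first_page that lie outside the window.
    """
    start = max(0, seq_len - window)          # first token inside the window
    first_page = start // page_size           # page holding that token
    last_page = (seq_len - 1) // page_size    # page holding the newest token
    skip = start - first_page * page_size     # out-of-window tokens in first_page
    return first_page, last_page, skip


# page_size == 1: every page is one token, so nothing needs masking.
print(window_pages(seq_len=10, window=4, page_size=1))
# page_size == 4: the window starts mid-page, so some tokens must be skipped.
print(window_pages(seq_len=10, window=4, page_size=4))
```

When skip is 0 an unmodified decode kernel can simply be handed the trimmed page list; a non-zero skip is the case that needs explicit SWA support.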

@WoosukKwon
Author

@yzh119 Oh yes, we don't need a new kernel for decode. However, if I understand correctly, we need a new kernel for prefills?
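
A small sketch of why prefill looks different, under assumptions that are mine and not confirmed in this thread: the function name sliding_window_mask is hypothetical, the queries are taken to be the last q_len positions of the sequence, and the window is taken to include the current token. Each query row then gets its own window start, which cannot be expressed by trimming a single page list the way it can for one decode query.

```python
def sliding_window_mask(q_len, kv_len, window):
    """Hypothetical helper: boolean mask for causal sliding-window attention.

    mask[qi][kj] is True when query qi may attend to key kj.
    Assumes q_len <= kv_len and that the queries occupy the last q_len
    positions of the kv sequence.
    """
    offset = kv_len - q_len  # absolute position of the first query
    mask = []
    for qi in range(q_len):
        pos = offset + qi    # absolute position of this query
        # Causal (kj <= pos) and within the window (kj > pos - window).
        mask.append([pos - window < kj <= pos for kj in range(kv_len)])
    return mask


# Each row starts its window at a different key index.
for row in sliding_window_mask(q_len=4, kv_len=4, window=2):
    print(["x" if allowed else "." for allowed in row])
```

With q_len equal to 1 the mask collapses to a single contiguous range of keys, which matches the decode case discussed above.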
