
Why does SpatialLinearAttention multiply k and v first? #30

Open
TtuHamg opened this issue Nov 23, 2023 · 0 comments
Comments


TtuHamg commented Nov 23, 2023

Hello, I'm a newcomer to diffusion generation. I'd like to ask why, in SpatialLinearAttention, the context is computed first from 'k' and 'v'. This seems different from the typical self-attention mechanism, where attention coefficients are computed from 'q' and 'k'. Is there a specific reason for this approach, or does another paper describe its use? I hope to receive your explanation. Thank you!
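
For context, the pattern being asked about is the linear-attention trick: since matrix multiplication is associative, attention can be evaluated as q · (kᵀv) instead of (qkᵀ) · v, replacing the O(n²·d) attention map with two O(n·d²) contractions. Below is a minimal sketch of that pattern, assuming a (batch, heads, dim_head, n) tensor layout; the function name and shapes are illustrative, not the repository's exact implementation.

```python
import torch

def linear_attention_sketch(q, k, v):
    """Linear attention: contract k with v first, then apply q.

    q, k, v: (batch, heads, dim_head, n) -- illustrative layout.
    """
    q = q.softmax(dim=-2)  # normalize queries over the feature dimension
    k = k.softmax(dim=-1)  # normalize keys over the sequence dimension
    # Step 1: aggregate values against keys first, giving a small
    # (dim_head x dim_head) "context" per head at O(n * d^2) cost,
    # instead of the O(n^2 * d) q @ k^T map of standard self-attention.
    context = torch.einsum('bhdn,bhen->bhde', k, v)
    # Step 2: read the context out with the queries, again O(n * d^2).
    out = torch.einsum('bhde,bhdn->bhen', context, q)
    return out
```

This q-softmax / k-softmax formulation follows "Efficient Attention: Attention with Linear Complexities" (Shen et al.), one paper that motivates computing kᵀv before involving q.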
