[BUG/DEPRECATION] Remove fused attention/mlp #659

Open
wants to merge 2 commits into
base: main

Qubitium
Contributor

@Qubitium Qubitium commented Apr 28, 2024

Reason for PR:

  • Broken since at least March 1st with no fix in sight
  • No longer offers a performance advantage over marlin and the new marlin kernels (vllm)
  • Why fix/maintain something that no one would want to use in a production environment?

Ref: Issue #655

Changes:

  • Remove all fused attention/mlp code

TESTS

  • PASS: test_quantization
  • PASS: test_serialization
  • PASS: test_triton
  • PASS: test_q4

@Qubitium Qubitium changed the title from "[WIP] [BUG/DEPRECATION] Remove fused attention/mlp" to "[BUG/DEPRECATION] Remove fused attention/mlp" Apr 28, 2024
@Qubitium Qubitium marked this pull request as ready for review April 28, 2024 23:47
@fxmarty
Collaborator

fxmarty commented Apr 30, 2024

Hey @Qubitium, sorry for taking so long to review. I've had too many other things on my plate; I'll try to review everything by Friday. Do you have an email/slack/twitter I could reach you at? I feel like for some non-huge PRs/bugfixes it would be simpler if you had write access to the main branch. What do you think?

@Qubitium
Contributor Author

Qubitium commented Apr 30, 2024

@fxmarty Sure thing! You can reach me via twitter @ https://twitter.com/qubitium or email at qubitium@lbx.dev. I'm willing to help out any way I can with the project. Most importantly, I am willing to put in the grunt work to get all the non-tensor (cuda/triton) math level things working, fixed, and added.
