Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend SmoothQuant support (exclude nodes, fuse into layernorm) #1357

Closed
wants to merge 3 commits into from

Conversation

fxmarty
Copy link

@fxmarty fxmarty commented Oct 27, 2023

Type of Change

As per title.

Description

This PR includes two new features for SmoothQuant (that I was too lazy to split into two PRs):

How has this PR been tested?

Locally - I did not test that output from the fusion match but I just reused the code from the mul method. Let me know if I should add tests.

@fxmarty
Copy link
Author

fxmarty commented Oct 27, 2023

@chensuyue @mengniwang95 @PenghuiCheng @xin3he happy to get a review on this one!

@chensuyue chensuyue added the enhancement New feature or request label Oct 31, 2023
@fxmarty
Copy link
Author

fxmarty commented Oct 31, 2023

Thank you @chensuyue, will have a look!

@@ -145,6 +146,7 @@ def transform(
calib_iter=100,
quantize_config=None,
auto_alpha_args={"alpha_min": 0.3, "alpha_max": 0.7, "alpha_step": 0.05, "attn_method": "min"},
nodes_to_exclude: Optional[List[str]] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is nodes_to_exclude setting passed from user facing API of neural-compressor?

@@ -330,6 +336,37 @@ def conv(node, scale): # pragma: no cover
)
return True

def mul_add(node, scale): # pragma: no cover
node_parent = self.model.get_parent(node, 0)
if not len(self.model.get_parents(node)) == 1 or node_parent.op_type != "Mul":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

below comment is "Add has itself a MatMul before" but here is "node_parent.op_type != 'Mul'", could you align it?

@mengniwang95
Copy link
Collaborator

Further, could you add a ut to make sure it can work correctly?

@chensuyue
Copy link
Contributor

@fxmarty would you like to follow this PR?

@chensuyue
Copy link
Contributor

We will have code freeze on 11/22, if this PR could be merged before the date, it can be packaged into v2.4 release.

@chensuyue
Copy link
Contributor

@fxmarty will you fix the PR?

@fxmarty
Copy link
Author

fxmarty commented Nov 21, 2023

@chensuyue Sorry I did not get time to fix it, I won't be able before the release unfortunately.

@chensuyue chensuyue added this to the v2.5 milestone Nov 23, 2023
@chensuyue chensuyue removed this from the v2.5 milestone Mar 21, 2024
@chensuyue
Copy link
Contributor

Close the PR first due to pending for a long time, feel free to reopen when you have time to handle the issue.

@chensuyue chensuyue closed this May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request pending
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants