
Bug in ORTSmoothQuant._adjust_weights() #1435

Open
pavelkochnev opened this issue Nov 30, 2023 · 5 comments

@pavelkochnev (Contributor)
I think there is a bug in ORTSmoothQuant._adjust_weights(). Part of this method is shown below:

    def _adjust_weights(self, scales):
        """Adjust the weights with scale.

        Args:
            scales (dict): The input scales
        """
        for tensor_name, nodes in self.tensors_to_node.items():
            for node_info in nodes:
                key = node_info[0] if self.scales_per_op else tensor_name
                if key not in scales:
                    continue
                input = node_info[1][1]
                node = self.model.input_name_to_nodes[input][0]
                weight = numpy_helper.to_array(
                    self.model.get_initializer(input),
                    base_dir=os.path.dirname(self.model.model_path) if self.model.model_path is not None else "",
                )

The error occurs on this line:

node = self.model.input_name_to_nodes[input][0]

self.model.input_name_to_nodes[input] can return a list with more than one element, but the code always takes the first one.

I think this line should look something like this:

node = self.model.get_node(node_info[0])

This line has the same issue, but it can simply be removed.

I also suggest checking the ORTSmoothQuant._get_smooth_scales() method: the same issue appears here and here.
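
To illustrate the failure mode, here is a minimal standalone sketch (not from the library; the model and the mapping are built by hand here, and the mapping is only assumed to be equivalent to what self.model.input_name_to_nodes provides). Two Conv nodes share one initializer, so the weight name maps to two consumer nodes, and indexing [0] silently drops the second:

    import numpy as np
    from onnx import TensorProto, helper, numpy_helper

    # Two Conv nodes consuming the same initializer "W" (a shared weight).
    shared_w = numpy_helper.from_array(np.ones((1, 1, 3, 3), dtype=np.float32), name="W")
    conv1 = helper.make_node("Conv", ["X", "W"], ["Y1"], name="conv1")
    conv2 = helper.make_node("Conv", ["X", "W"], ["Y2"], name="conv2")
    graph = helper.make_graph(
        [conv1, conv2],
        "shared_weight_demo",
        [helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 1, 8, 8])],
        [
            helper.make_tensor_value_info("Y1", TensorProto.FLOAT, None),
            helper.make_tensor_value_info("Y2", TensorProto.FLOAT, None),
        ],
        initializer=[shared_w],
    )

    # Rebuild the input-name-to-consumers mapping by hand.
    input_name_to_nodes = {}
    for node in graph.node:
        for inp in node.input:
            input_name_to_nodes.setdefault(inp, []).append(node)

    print([n.name for n in input_name_to_nodes["W"]])  # ['conv1', 'conv2']
    # ...[0] would pick conv1 and silently ignore conv2.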

@mengniwang95 (Collaborator)

Hi @pavelkochnev, thank you for your suggestions.

input = node_info[1][1]

is the weight name of the node.
If each node has its own weight, self.model.input_name_to_nodes[input] returns a list of length 1, and the subsequent processing is correct.
If more than one node shares a single weight tensor, the result will be wrong because that weight gets updated more than once.
Maybe adding a warning and handling for shared weights is what I need to do. 😃

@pavelkochnev (Contributor, Author)

@mengniwang95, thanks for the reply. I think you are right. I have examined several models, and in each of them every conv has its own weight. However, I still think that replacing the line would make the code more readable, and the code should raise an error when a model has shared weights (for now).

@mengniwang95 (Collaborator)

Hi @pavelkochnev, replacing the line would indeed make the code more readable.
But in more detail: get_node iterates over the nodes of the model until it finds the target, so its time complexity is O(n), while input_name_to_nodes[input] fetches the node from a dict by key, which is O(1).
That is why I use input_name_to_nodes[input] rather than get_node. What do you think about it?
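
For reference, here is a minimal sketch of the two lookup strategies being compared (simplified stand-ins, not the actual ONNXModel implementations):

    def get_node(graph, name):
        """O(n): scan every node until the name matches."""
        for node in graph.node:
            if node.name == name:
                return node
        return None

    def build_input_name_to_nodes(graph):
        """Built once up front; afterwards each lookup is an O(1) dict access."""
        mapping = {}
        for node in graph.node:
            for inp in node.input:
                mapping.setdefault(inp, []).append(node)
        return mapping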

@pavelkochnev (Contributor, Author)

@mengniwang95 You are right again. Would you like to add an additional method to the class?
Something like this:

    def _get_node_by_weight(self, weight_name):
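        """Return the single node that consumes weight_name; models with shared weights are rejected."""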
        nodes = self.model.input_name_to_nodes[weight_name]
        if len(nodes) == 1:
            return nodes[0]
        else:
            raise NotImplementedError("Models with shared weights not supported")
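
With such a helper, the call site in _adjust_weights() could become (hypothetical, based on the snippet above):

    node = self._get_node_by_weight(node_info[1][1])

which keeps the O(1) dict lookup while failing loudly on shared weights.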

@mengniwang95 (Collaborator)

Hi @pavelkochnev, sure, we will put this into our enhancement plan.
