Add Qwen2MoE #593

Open · wants to merge 3 commits into main

Conversation

bozheng-hit

Adding Qwen2MoE

This PR adds support for the upcoming Qwen2MoE models. For information about Qwen, please visit https://github.com/QwenLM/Qwen.

@@ -737,6 +737,21 @@ def get_checkpoints(model_name_or_path: str, extensions: List[str], possible_mod

return False, resolved_archive_file, true_model_basename


def get_moe_inside_layer_modules(inside_layer_modules, num_experts):
Collaborator:

Could you add type hints and a small description here?

Comment on lines +409 to +411
if hasattr(self.model.config, "num_experts"):
    inside_layer_modules = get_moe_inside_layer_modules(
        self.inside_layer_modules, self.model.config.num_experts
    )
Collaborator:


Why is self.inside_layer_modules not enough?

Author:


We cannot write out the names of all the parameters in our MoE model, because the number of experts can be large and different model sizes may use different numbers of experts.
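
For illustration only, a minimal sketch of what such a helper might look like with the type hints and description requested above; the `{expert_idx}` placeholder convention and the expansion logic are assumptions, not necessarily what this PR implements:

```python
from typing import List


def get_moe_inside_layer_modules(
    inside_layer_modules: List[List[str]], num_experts: int
) -> List[List[str]]:
    """Expand per-expert module name templates for an MoE model.

    Module names containing the (hypothetical) "{expert_idx}" placeholder
    are duplicated once per expert; all other names are kept unchanged.
    """
    expanded = []
    for group in inside_layer_modules:
        new_group: List[str] = []
        for name in group:
            if "{expert_idx}" in name:
                # One entry per expert, e.g. "mlp.experts.0.gate_proj", ...
                new_group.extend(
                    name.format(expert_idx=i) for i in range(num_experts)
                )
            else:
                new_group.append(name)
        expanded.append(new_group)
    return expanded
```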

Contributor:


I agree it is a bit clunky, but you could just list the highest number of experts you might have, and models with fewer experts would still work.

Author:


> I agree it is a bit clunky, but you could just list the highest number of experts you might have, and models with fewer experts would still work.

It would work, but I think the current code looks much better than writing hundreds of parameter names in the modeling file.

Contributor:


I don't disagree, but I would prefer a separate PR for something like this so that others could rely on it. I know that only makes for a slightly cleaner git history; I just prefer PRs to stay focused and make the minimum changes required. Feel free to ignore me, it's not that important.

LaaZa (Contributor) commented Mar 23, 2024

My review isn't going to help here: I can't merge, and I don't have the memory to even try this. Actually, I'm not even quite sure where the reference model is. Since this adds a new feature, would you consider making a tiny version of the model and creating a test for it?
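
For reference, a tiny randomly initialized Qwen2MoE checkpoint for such a test could be sketched roughly like this, assuming a transformers release that ships Qwen2MoeConfig (the exact config field names may differ between versions, and the dimensions here are arbitrary):

```python
from transformers import Qwen2MoeConfig, Qwen2MoeForCausalLM

# Deliberately tiny dimensions so the test fixture fits in CI memory.
config = Qwen2MoeConfig(
    vocab_size=1024,
    hidden_size=32,
    intermediate_size=64,
    moe_intermediate_size=32,
    shared_expert_intermediate_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
    num_experts=4,
    num_experts_per_tok=2,
)
model = Qwen2MoeForCausalLM(config)  # random weights, quality irrelevant for a test
model.save_pretrained("tiny-qwen2moe")  # reusable fixture for a quantization test
```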

LaaZa (Contributor) commented Apr 22, 2024

> TypeError: qwen2_moe isn't supported yet.
>
> Name         Version    Build     Channel
> auto-gptq    0.7.1      pypi_0    pypi

Are you even using this PR? Also, you need at least transformers>=4.39.0.

wellcasa commented

No, I haven't used this PR yet. I saw this branch and I'm very happy; let's wait for it to be merged into the main branch. Qwen2MoE is very fast and the results are not bad. Great.

However, vLLM currently does not support Qwen2MoE int4, so I'm also waiting and hoping for support there.

LaaZa (Contributor) commented Apr 23, 2024

> No, I haven't used this PR yet. I saw this branch and I'm very happy; let's wait for it to be merged into the main branch. Qwen2MoE is very fast and the results are not bad. Great.
>
> However, vLLM currently does not support Qwen2MoE int4, so I'm also waiting and hoping for support there.

Then don't post errors here; we know it isn't supported until this is merged. You can say you want this, but posting random errors is not the way.

wellcasa commented

> No, I haven't used this PR yet. I saw this branch and I'm very happy; let's wait for it to be merged into the main branch. Qwen2MoE is very fast and the results are not bad. Great.
>
> However, vLLM currently does not support Qwen2MoE int4, so I'm also waiting and hoping for support there.

> Then don't post errors here; we know it isn't supported until this is merged. You can say you want this, but posting random errors is not the way.

Okay, sorry, I deleted it.
