Add Qwen2MoE #593
base: main
Conversation
@@ -737,6 +737,21 @@ def get_checkpoints(model_name_or_path: str, extensions: List[str], possible_mod
    return False, resolved_archive_file, true_model_basename


def get_moe_inside_layer_modules(inside_layer_modules, num_experts):
Could you add type hints and a small description here?
if hasattr(self.model.config, "num_experts"):
    inside_layer_modules = get_moe_inside_layer_modules(self.inside_layer_modules,
                                                        self.model.config.num_experts)
Why is self.inside_layer_modules not enough?
We cannot write out the names of all the parameters in our MoE model, because there may be many experts. Besides, different model sizes may have different numbers of experts.
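For illustration, here is a minimal sketch of what such an expansion helper could look like, with the type hints requested above. The "{expert_idx}" placeholder and the expansion scheme are assumptions for illustration, not necessarily this PR's actual convention:

from typing import List

def get_moe_inside_layer_modules(
    inside_layer_modules: List[List[str]], num_experts: int
) -> List[List[str]]:
    """Expand per-expert module-name templates into concrete names.

    Entries containing the (assumed) "{expert_idx}" placeholder are
    replicated once per expert; plain entries are kept unchanged.
    """
    expanded = []
    for block in inside_layer_modules:
        new_block = []
        for name in block:
            if "{expert_idx}" in name:
                # e.g. "mlp.experts.{expert_idx}.gate_proj" ->
                # "mlp.experts.0.gate_proj", ..., "mlp.experts.<N-1>.gate_proj"
                new_block.extend(name.format(expert_idx=i) for i in range(num_experts))
            else:
                new_block.append(name)
        expanded.append(new_block)
    return expanded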
I agree it is a bit clunky, but you could just add the highest number of experts you may have, and the models with fewer will work too.
> I agree it is a bit clunky, but you could just add the highest number of experts you may have, and the models with fewer will work too.

It will work, but I think the current code looks much better than writing hundreds of parameter names in the modeling file.
I don't disagree, but I would prefer a separate PR for something like this that others could rely on. I know that is just for slightly easier git history; I just prefer PRs to stay focused and make the minimum changes required. Feel free to ignore me, it's not that important.
My review isn't going to help here: I can't merge, and I don't have the memory to even try this. Actually, I'm not even quite sure where the reference model is. Since this is adding a new feature, would you consider making a tiny version of the model and creating a test for it?
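For reference, a tiny test model could be built roughly like this. This is a sketch assuming transformers ships Qwen2MoeConfig/Qwen2MoeForCausalLM (names per the transformers docs); the exact sizes and the output path are arbitrary choices for illustration:

from transformers import Qwen2MoeConfig, Qwen2MoeForCausalLM

# Arbitrary tiny sizes so the test model fits comfortably in CI memory.
tiny_config = Qwen2MoeConfig(
    vocab_size=1024,
    hidden_size=32,
    intermediate_size=64,
    moe_intermediate_size=32,
    shared_expert_intermediate_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=4,
    num_experts=4,
    num_experts_per_tok=2,
)
tiny_model = Qwen2MoeForCausalLM(tiny_config)
tiny_model.save_pretrained("tiny-qwen2moe")  # quantize this checkpoint in the test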
Are you even using this PR? Also, you need at least
No, I haven't used this PR yet. I saw this branch and I'm very glad it exists; let's wait for it to be merged into the main branch. However, vLLM currently does not support Qwen2MoE int4 either, so I'm also waiting and hoping for support there.
Don't post errors here, then; we know it isn't supported until this is merged. You can say you want this, but posting random errors is not the way.
Okay, sorry, I deleted it.
Adding Qwen2MoE
This PR adds support for the upcoming Qwen2MoE models. For information about Qwen, please visit https://github.com/QwenLM/Qwen.
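As a usage sketch (assuming this branch and the library's usual GPTQ quantization flow; the checkpoint ID and output path are illustrative assumptions), quantizing a Qwen2MoE model would look like:

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "Qwen/Qwen1.5-MoE-A2.7B"  # illustrative checkpoint ID, an assumption
quantize_config = BaseQuantizeConfig(bits=4, group_size=128)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# A real run would use a proper calibration set, not a single sentence.
examples = [tokenizer("auto-gptq is an easy-to-use model quantization library.")]

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)  # runs GPTQ over the calibration examples
model.save_quantized("qwen2moe-gptq-4bit")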