
support alternative parallelism #2

Open
152334H opened this issue Dec 10, 2023 · 1 comment

Comments

@152334H
Contributor

152334H commented Dec 10, 2023

--num-gpus is implemented by sharding each expert layer across GPUs, i.e. expert parallelism.

This is probably not advisable for local experimentation, especially at batch size 1, where EP only adds communication overhead with no speed benefit versus naive model/pipeline parallelism.
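For context on the alternative being suggested: naive pipeline parallelism keeps each expert layer intact and instead places contiguous blocks of whole layers on each GPU, so a batch-size-1 forward pass only crosses devices at block boundaries rather than doing a per-layer all-to-all. A minimal placement sketch (hypothetical helper, not code from this repository):

```python
def pipeline_partition(num_layers: int, num_gpus: int) -> list[int]:
    """Assign transformer layers to GPUs in contiguous blocks
    (naive pipeline parallelism). Entry i is the GPU holding layer i.
    """
    base, extra = divmod(num_layers, num_gpus)
    assignment: list[int] = []
    for gpu in range(num_gpus):
        # earlier GPUs take one extra layer when the split is uneven
        count = base + (1 if gpu < extra else 0)
        assignment.extend([gpu] * count)
    return assignment

# e.g. 32 transformer blocks over 4 GPUs: 8 consecutive layers per device,
# so activations cross a device boundary only 3 times per token.
print(pipeline_partition(32, 4))
```

With this placement, expert weights stay co-located with their router, which is why it avoids the EP communication overhead the comment above describes.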

@tonysy
Contributor

tonysy commented Dec 10, 2023

Good suggestion; I am working on other parallelism methods. Contributions are also welcome.
