Speed up the creation of attention mask #797

Open
wants to merge 1 commit into
base: main

@yuantailing commented Apr 29, 2024

Prefer the in-place variants `triu_`/`tril_`: since torch 2.3.0 they are faster than the out-of-place `triu`/`tril` (pytorch/pytorch#115013).
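For illustration, here is a minimal sketch of the two ways to build a causal attention mask. The function names, the `batch` parameter, and the mask layout are assumptions made for this example and are not taken from the diff. Both functions should produce the same mask; the in-place version just avoids allocating a second tensor for the triangular result.

```python
import torch


def causal_mask_out_of_place(seq_len: int, batch: int = 1) -> torch.Tensor:
    """Hypothetical helper: build the mask with the out-of-place torch.tril."""
    ones = torch.ones((batch, seq_len, seq_len))
    lower = torch.tril(ones)  # returns a new tensor; `ones` is left untouched
    # True marks positions that must NOT be attended to (future tokens).
    return (lower < 0.5).view(batch, 1, seq_len, seq_len)


def causal_mask_in_place(seq_len: int, batch: int = 1) -> torch.Tensor:
    """Hypothetical helper: same mask, using the in-place tril_."""
    lower = torch.ones((batch, seq_len, seq_len)).tril_()  # overwrites the ones tensor
    return (lower < 0.5).view(batch, 1, seq_len, seq_len)


if __name__ == "__main__":
    a = causal_mask_out_of_place(4)
    b = causal_mask_in_place(4)
    print(torch.equal(a, b))
```

Any speed difference depends on the torch version, device, and sequence length, so it is worth timing both on the target setup before relying on it.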

@ethanhe42 (Member) commented

Generally, the mask is created inside Transformer Engine when `--use-mcore-models` is set.
