
MultiHeadAttention parameter setting #180

Open
LXXiaogege opened this issue Apr 30, 2023 · 2 comments
@LXXiaogege commented Apr 30, 2023
Is the output linear layer of the MultiHeadAttention class in mha.py configured incorrectly? Shouldn't its in_features be heads * d_k?
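
For illustration, here is a minimal sketch of the shape mismatch being described (the dimensions are invented, not taken from mha.py): the concatenated head outputs have width heads * d_k, so an output projection declared with in_features = d_model only works when heads * d_k happens to equal d_model.

```python
import torch
import torch.nn as nn

# Hypothetical illustration of the reported mismatch; the numbers are made up.
# After attention, the per-head outputs are concatenated to width heads * d_k,
# so an output projection declared as nn.Linear(d_model, d_model) only works
# when heads * d_k == d_model.
heads, d_model, d_k = 8, 512, 32             # heads * d_k = 256 != 512
concat = torch.randn(10, heads * d_k)        # concatenated head outputs

out_right = nn.Linear(heads * d_k, d_model)  # in_features = heads * d_k
out_wrong = nn.Linear(d_model, d_model)      # silently assumes heads * d_k == d_model

out_right(concat)  # fine: (10, 256) @ (256, 512) -> (10, 512)
out_wrong(concat)  # RuntimeError: mat1 and mat2 shapes cannot be multiplied
```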

@LXXiaogege (Author)
The get_positional_encoding method of the positional encoder also raises an error when d_model is set to an odd number.
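
A hedged reconstruction of why a sinusoidal get_positional_encoding in this style fails for odd d_model (this is a sketch along the lines of the repository's implementation, not its exact code): sines fill the even columns and cosines the odd columns, and with an odd d_model there is one more sine frequency than there are odd columns, so the cosine assignment is a shape mismatch.

```python
import math
import torch

# Sketch of a standard sinusoidal positional encoding. For d_model = 5,
# position * div_term has 3 columns; the even slice [:, 0::2] has 3 columns
# (fits), but the odd slice [:, 1::2] has only 2, so the cosine line fails.
def get_positional_encoding(d_model: int, max_len: int = 5000) -> torch.Tensor:
    encodings = torch.zeros(max_len, d_model)
    position = torch.arange(0, max_len, dtype=torch.float32).unsqueeze(1)
    two_i = torch.arange(0, d_model, 2, dtype=torch.float32)
    div_term = torch.exp(two_i * -(math.log(10000.0) / d_model))
    encodings[:, 0::2] = torch.sin(position * div_term)  # d_model=5: 3 columns, fits
    encodings[:, 1::2] = torch.cos(position * div_term)  # d_model=5: 2 slots, 3 values
    return encodings

get_positional_encoding(d_model=5)  # RuntimeError: shape mismatch at dimension 1
```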

@vpj (Member) commented Jun 30, 2023

Our implementation assumes that heads * d_k = d_model. We need to change that.
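
For context, one possible way to remove that assumption (a sketch, not the repository's actual fix): take d_k as an independent parameter, project queries, keys, and values to heads * d_k, and size the output projection from heads * d_k back to d_model.

```python
import torch
import torch.nn as nn

# A minimal sketch (not the repository's code) of attention without the
# heads * d_k == d_model assumption: d_k is an independent parameter, and the
# output projection maps the concatenated heads (heads * d_k) back to d_model.
class MultiHeadAttention(nn.Module):
    def __init__(self, heads: int, d_model: int, d_k: int):
        super().__init__()
        self.heads = heads
        self.d_k = d_k
        self.qkv = nn.Linear(d_model, 3 * heads * d_k)
        self.output = nn.Linear(heads * d_k, d_model)  # in_features = heads * d_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        seq_len, batch_size, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(seq_len, batch_size, self.heads, self.d_k)
        k = k.view(seq_len, batch_size, self.heads, self.d_k)
        v = v.view(seq_len, batch_size, self.heads, self.d_k)
        # Scaled dot-product attention per head
        scores = torch.einsum('ibhd,jbhd->ijbh', q, k) / self.d_k ** 0.5
        attn = scores.softmax(dim=1)
        out = torch.einsum('ijbh,jbhd->ibhd', attn, v)
        # Concatenated width is heads * d_k, which need not equal d_model
        out = out.reshape(seq_len, batch_size, self.heads * self.d_k)
        return self.output(out)
```

With this layout, d_k no longer has to be d_model // heads; for example heads=8, d_model=512, d_k=32 works even though heads * d_k = 256.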

vpj self-assigned this Jun 30, 2023