Hi there, I see that in `ALBEF/models/model_vqa.py`, line 198 (commit b9727e4), you are using a sum to accumulate the loss over the tokens in the answer sequence. How does this behave if the candidate answers have varying lengths? Shouldn't the loss be divided by the sequence length to get the average loss per token? Otherwise, won't the ranking be biased towards shorter sequences?
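
To make the concern concrete, here is a minimal sketch with made-up numbers. It is not taken from the ALBEF code: the candidate names, the per-token log-probabilities, and the helpers `sum_loss` and `mean_loss` are all hypothetical. It only illustrates how a summed loss and a length-normalized loss can rank the same two candidates differently.

```python
# Hypothetical per-token log-probabilities for two candidate answers.
# The values are invented for illustration only.
candidates = {
    "short": [-1.2],              # one token, fairly unlikely
    "long": [-0.5, -0.5, -0.5],   # three tokens, each fairly likely
}


def sum_loss(token_log_probs):
    """Negative log-likelihood summed over all tokens."""
    return -sum(token_log_probs)


def mean_loss(token_log_probs):
    """Negative log-likelihood averaged per token (length-normalized)."""
    return -sum(token_log_probs) / len(token_log_probs)


# Rank candidates by lowest loss under each scheme.
best_by_sum = min(candidates, key=lambda k: sum_loss(candidates[k]))
best_by_mean = min(candidates, key=lambda k: mean_loss(candidates[k]))

print(best_by_sum, best_by_mean)  # prints "short long"
```

With the summed loss the one-token answer wins (1.2 vs. 1.5), even though every token of the longer answer is more probable; with the per-token average the longer answer wins (0.5 vs. 1.2).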