Describe the bug
It would probably be worth sanity-checking the `SpectrogramDrop` code (and others) for issues, and reading the reference papers carefully (I haven't checked them yet). I noticed some odd code:
The `max()` statement seems wrong: the `-mask_len.max()` value is always negative, so it is always disregarded. As a result, a frequency mask can cover far fewer bins than the length selected with `drop_length_low/high` if it starts near the top of the frequency range. For example, a mask that starts at the 5th-highest frequency bin would cover only 5 bins. This might be OK.
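To make the point concrete, here is a minimal sketch in plain Python. It is not the library code: `masked_bins`, `D = 80` and `mask_len = 20` are hypothetical, and the form `max(1, D, -mask_len)` is only my assumption about how the negative term enters the `max()` call. `max(1, D - mask_len)` is one guess at what might have been intended.

```python
def masked_bins(start, mask_len, n_bins):
    """Return the bin indices covered by a mask of mask_len bins starting at start."""
    # The mask cannot extend past the last bin, so it is cut off at n_bins.
    stop = min(start + mask_len, n_bins)
    return list(range(start, stop))


D = 80         # hypothetical number of frequency bins
mask_len = 20  # hypothetical mask length drawn for one mask

# Assumed form of the questioned expression: the negative argument never wins.
upper_as_written = max(1, D, -mask_len)

# One possible intent (a guess): keep the whole mask inside the spectrogram.
upper_restricted = max(1, D - mask_len)

# A mask starting at the 5th-highest bin is truncated to 5 bins.
truncated = masked_bins(D - 5, mask_len, D)

print(upper_as_written, upper_restricted, len(truncated))
```

With the restricted upper bound no mask would be truncated, but the bins at the top would then be covered less often, so whether that is preferable is part of what needs checking against the papers.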
Opposite problem at the low end: since `mask_pos` is a starting position, the lowest frequency bins are significantly less likely to be masked than bins in the middle of the spectrogram. For example, assuming the start is drawn uniformly over `D` bins, the lowest frequency bin has only a `1/D` chance of being covered by a given mask, the next bin `2/D`, and so on up to the mask length. This might slightly reduce the benefits of SpecAugment?
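The imbalance can be worked out exactly under simple assumptions. The sketch below is illustrative only: it assumes a single mask of fixed length whose start is drawn uniformly from all `D` bins and which is cut off at the top of the range. `mask_probability`, `D = 80` and `L = 20` are hypothetical names and values, not taken from the library.

```python
from fractions import Fraction


def mask_probability(bin_index, mask_len, n_bins):
    """Probability that one mask covers bin_index, for a uniform start in 0..n_bins-1."""
    # Bin k is covered when the start lies in [k - mask_len + 1, k];
    # starts below 0 do not exist, so the interval is clipped at 0.
    first_start = max(0, bin_index - mask_len + 1)
    n_starts = bin_index - first_start + 1
    return Fraction(n_starts, n_bins)


D = 80  # hypothetical number of frequency bins
L = 20  # hypothetical fixed mask length

probs = [mask_probability(k, L, D) for k in range(D)]

print("lowest bin: ", probs[0])
print("middle bin: ", probs[D // 2])
print("highest bin:", probs[D - 1])
```

Under these assumptions only the lowest `L - 1` bins are under-covered; the highest bins keep the full `L/D` probability because truncated masks still reach them.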
Example of a few SpecAugment applications (including time-wise, ignore that):
Expected behaviour
Not sure! The proper behavior is arguable; I need to read the reference papers first.
To Reproduce
No response
Environment Details
No response
Relevant Log Output
No response
Additional Context
No response