oscillations in the loss #822

Open
nicolas-dufour opened this issue Feb 19, 2024 · 2 comments

@nicolas-dufour

Hi!
I see oscillations in the loss curve for the cc3m model.
I get the same oscillations when using the same webdataset pipeline. In my case it is more of a problem because the oscillation amplitude is larger than the decrease in loss per epoch.

Do you know what could cause this behaviour?
From my experiments, it seems to be linked to webdataset, since a traditional dataloader doesn't suffer from this issue. In my case the period of the oscillation equals the number of steps per epoch (on cc12m).

Thanks for the help!

@rom1504
Collaborator

rom1504 commented Feb 19, 2024 via email

@nicolas-dufour
Author

Hey @rom1504
Yes, I'm shuffling both shards and samples. I use the following settings, which are the OpenCLIP defaults:

        shard_shuffle_size=2000,
        shard_shuffle_initial=500,
        sample_shuffle_size=5000,
        sample_shuffle_initial=1000,

Thanks for the help!
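
For readers unfamiliar with the "size" and "initial" shuffle parameters quoted above, here is a minimal sketch of a streaming shuffle with a bounded buffer. It is an illustration written for this thread, not webdataset's or OpenCLIP's actual code; the names `buffered_shuffle`, `bufsize`, `initial`, and `seed` are hypothetical. It shows one general property of this kind of shuffle: mixing is local, so a sample can never be emitted more than `bufsize - 1` positions earlier than it appears in the input stream. Whether this property explains the oscillation reported here is not established in this thread.

```python
import random

_SENTINEL = object()  # marks an exhausted input stream


def _pop_random(buf, rng):
    """Remove and return a uniformly random element of buf in O(1)."""
    i = rng.randrange(len(buf))
    buf[i], buf[-1] = buf[-1], buf[i]
    return buf.pop()


def buffered_shuffle(stream, bufsize, initial, seed=0):
    """Illustrative streaming shuffle (hypothetical helper, not a library API).

    Waits until `initial` items are buffered before emitting anything,
    then lets the buffer grow toward `bufsize` while emitting random items.
    """
    rng = random.Random(seed)
    it = iter(stream)
    buf = []
    for item in it:
        buf.append(item)
        if len(buf) < bufsize:
            # Pull one extra item so the buffer keeps growing toward bufsize.
            extra = next(it, _SENTINEL)
            if extra is not _SENTINEL:
                buf.append(extra)
        if len(buf) >= min(initial, bufsize):
            yield _pop_random(buf, rng)
    # Input exhausted: drain whatever is left, still in random order.
    while buf:
        yield _pop_random(buf, rng)


# Usage: shuffle a stream of 1000 sample indices with a small buffer.
shuffled = list(buffered_shuffle(range(1000), bufsize=50, initial=10))
```

Because the buffer never holds more than `bufsize` items, the output stays correlated with the input order at scales larger than the buffer, unlike a full permutation of the dataset.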
