my loss jump and decrease #12718
Comments
👋 Hello @xuxiaolin-github, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users, where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Install
Pip install the ultralytics package: pip install ultralytics
Environments
YOLOv8 may be run in any up-to-date verified environment with all dependencies (including CUDA/CUDNN, Python and PyTorch) preinstalled.
Status
If the CI badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
It sounds like you're experiencing instability in your loss during training, which often indicates issues with the learning rate or batch size settings. Here are a couple of suggestions:
Example of setting learning rate warmup in Python:

```python
from ultralytics import YOLO

# Load your model
model = YOLO('path/to/best.pt')

# Train with custom learning rate and warmup
results = model.train(data='your_dataset.yaml', lr0=0.001, epochs=50, warmup_epochs=5)
```

Lastly, ensure your dataset is correctly annotated and normalized, as issues there can also cause unstable loss. Let us know how it goes after trying these changes!
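As a rough picture of what the `warmup_epochs` setting in the example above is meant to achieve, here is a minimal sketch of a linear learning rate warmup. It is a generic illustration and not the exact schedule Ultralytics implements; the function name `warmup_lr` and the parameters `warmup_steps` and `start_lr` are hypothetical.

```python
def warmup_lr(step: int, warmup_steps: int, target_lr: float, start_lr: float = 0.0) -> float:
    """Linearly ramp the learning rate from start_lr to target_lr.

    Generic sketch only: names and the step-based formulation are
    assumptions, not the Ultralytics implementation.
    """
    # After warmup (or if warmup is disabled) use the target rate unchanged.
    if warmup_steps <= 0 or step >= warmup_steps:
        return target_lr
    # Fraction of the warmup completed so far, in [0, 1).
    frac = step / warmup_steps
    return start_lr + frac * (target_lr - start_lr)


# Learning rates at the start, middle and end of a 100-step warmup to 0.001.
schedule = [warmup_lr(s, 100, 0.001) for s in (0, 50, 100)]
```

Starting from a small rate limits the size of the first updates, which is the usual motivation for warmup when fine-tuning from existing weights such as `best.pt`.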
Thanks, it works when I change to optimizer=SGD and lr to 0.001 with batch 4. Sorry, I can't upload the training loss picture because of my company's network, so I will draw it this way.
Now, with lr=0.001, the loss looks like this (train and val): epoch 10 is the best.
I would like to know why the loss is still unstable. I remember that I changed some other code and trained on a mini dataset: because I want to prune the network through BN gamma, I disabled AMP and added code in the trainer:
Is this the reason for the loss problem? I will delete the code and try again to see what happens to the loss.
@xuxiaolin-github it sounds like you're making good progress with your adjustments! Switching to SGD and reducing the learning rate to 0.001 for a smaller batch size seems to have helped stabilize your training to some extent. 🚀
Regarding the instability in loss you're still experiencing, the additional code you added for pruning through BN gamma could indeed be influencing the training dynamics. Modifying gradients directly during training with a regularization term like the one you've added can introduce significant variability in the loss, particularly if the lambda value isn't carefully tuned relative to your learning rate and dataset size.
Removing or adjusting the pruning code is a good next step to see if it stabilizes the loss. Keep an eye on how the loss trends without these modifications, and adjust the regularization strength if you decide to reintroduce it. Good luck, and let us know how it goes!
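The trainer code itself is not shown in this thread, so the following is only a sketch of one common way to sparsify BN gamma for pruning: adding the L1 subgradient `lam * sign(gamma)` to each BN scale gradient. The names `sign`, `add_l1_penalty`, `task_grads` and the numeric values are hypothetical and chosen purely to show how the penalty compares with the task gradient.

```python
def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def add_l1_penalty(grads, gammas, lam):
    """Add the L1 subgradient lam * sign(gamma) to each BN-scale gradient.

    Assumed form of the pruning regularizer; the code actually used in
    the thread is not shown.
    """
    return [g + lam * sign(w) for g, w in zip(grads, gammas)]


# Hypothetical BN scale factors and small task-loss gradients.
gammas = [0.5, -0.2, 0.0]
task_grads = [0.001, 0.001, 0.001]
lam = 0.01

penalized = add_l1_penalty(task_grads, gammas, lam)
```

In this toy case the penalty term has magnitude 0.01 for every non-zero gamma, ten times the assumed task gradient of 0.001, so it dominates the update. That is one way a poorly tuned lambda could disturb the loss, which is consistent with the advice above to tune it against the learning rate.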
OK, thanks. I get the point; I will delete the code and train again.
Great decision! Removing the pruning code should help clarify if that's impacting your loss stability. Keep us posted on how the training progresses after making this change. If you encounter any further issues or have questions, feel free to reach out. Happy training! 🚀
Question
I trained a best.pt on my car & person dataset and used this best.pt as the pretrained model to train on another car & person dataset of the same kind, but the loss grows: after jumping for 2 epochs, it decreases slowly. The loss can't get back down to the first-epoch loss within 50 epochs (training stops after 50 epochs without improvement).
My batch is 4. I tried changing lr in default.yaml to 0.0025, but the optimizer reports: (optimizer: 'optimizer=auto' found, ignoring 'lr0=0.0025...
I want to know how to train so that the loss does not grow but goes down from the start.
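The log line quoted above indicates that `lr0` is ignored while `optimizer=auto` is in effect. A minimal sketch of guarding against that when assembling training arguments follows; `build_train_overrides` is a hypothetical helper, while `data`, `batch`, `lr0` and `optimizer` are Ultralytics training arguments. The values `SGD` and `0.001` are the ones reported to work later in this thread.

```python
def build_train_overrides(data: str, batch: int, lr0: float, optimizer: str = "SGD") -> dict:
    """Collect training arguments, refusing optimizer='auto' when lr0 is set.

    Hypothetical helper: it only builds a dict and does not call Ultralytics.
    """
    if optimizer.lower() == "auto":
        # Per the log in this thread, 'auto' causes lr0 to be ignored.
        raise ValueError("optimizer='auto' ignores lr0; choose an optimizer explicitly")
    return {"data": data, "batch": batch, "lr0": lr0, "optimizer": optimizer}


overrides = build_train_overrides("your_dataset.yaml", batch=4, lr0=0.001)

# Intended use (requires the ultralytics package, so left as a comment):
# from ultralytics import YOLO
# YOLO("path/to/best.pt").train(**overrides)
```

Passing the optimizer explicitly keeps the chosen `lr0` from being silently replaced.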
Additional
Is the learning rate the cause, given that my batch is 4?
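One heuristic that relates batch size to learning rate is linear scaling: shrink the learning rate in proportion to the batch size relative to a reference configuration. The sketch below is a generic rule of thumb, not something stated in this thread; `scaled_lr` is a hypothetical helper, and the reference values (0.01 at batch 64) are assumptions for illustration.

```python
def scaled_lr(ref_lr: float, ref_batch: int, batch: int) -> float:
    """Scale a reference learning rate linearly with the batch size.

    Heuristic sketch only; the reference values passed in are assumptions.
    """
    if ref_batch <= 0 or batch <= 0:
        raise ValueError("batch sizes must be positive")
    return ref_lr * batch / ref_batch


# Assumed reference: lr 0.01 tuned for batch 64, rescaled for batch 4.
lr_for_batch_4 = scaled_lr(0.01, 64, 4)
```

Under these assumptions the rule suggests a rate well below 0.0025 for a batch of 4, in the same order of magnitude as the 0.001 that was reported to work in the replies.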