SNMP panic #15200
Comments
@llamafilm could you please try to reproduce this with the latest master, and maybe with only the SNMP (and a file output) plugin? We shifted the SNMP code quite a bit between v1.29 and v1.30...
You need to look at the correct version of the source code. Going through the stack trace, the panic actually happens here: octets := v.Bytes()
Thanks for calling that out. I guess the question then is if Telegraf should make a change as well? If the value is nil, should Telegraf even be calling the format value function?
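To make the nil question concrete, here is a minimal sketch of a guard in front of a Bytes() call. It is an assumption that the value is inspected through Go's reflect package; formatOctets is a hypothetical stand-in and is not the gosmi or Telegraf API.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// formatOctets is a HYPOTHETICAL stand-in for a library format function.
// It is not the gosmi or Telegraf API; it only assumes that the raw value
// is inspected via reflect before Bytes() is called.
func formatOctets(value interface{}) (string, error) {
	v := reflect.ValueOf(value)
	// reflect.ValueOf(nil) yields the zero Value, and calling Bytes() on
	// the zero Value panics, so reject it first.
	if !v.IsValid() {
		return "", errors.New("nil value")
	}
	// Bytes() also panics for anything that is not a byte slice.
	if v.Kind() != reflect.Slice || v.Type().Elem().Kind() != reflect.Uint8 {
		return "", fmt.Errorf("unsupported type %T", value)
	}
	octets := v.Bytes()
	// "% x" renders the bytes as space-separated hex pairs.
	return fmt.Sprintf("% x", octets), nil
}

func main() {
	for _, in := range []interface{}{[]byte{0xde, 0xad}, nil, 42} {
		out, err := formatOctets(in)
		fmt.Printf("out=%q err=%v\n", out, err)
	}
}
```

With a guard like this, a nil or unexpected value turns into an error that the caller can log and skip, instead of a panic that takes down the whole process. Whether that check belongs in the caller or in the library is the open question in this thread.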
I would try to get that fixed upstream and see what the maintainers say.
I have put up issue sleepinggenius2/gosmi#44 and a PR sleepinggenius2/gosmi#45. Happy to have reviews or comments on those. I did not realize this library had not had a lot of updates in a while, so let's see if we get a response.
It appears the maintainer hasn't been very active lately. Let's see indeed.
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf 1.29.5-66b924ec, Ubuntu 22.04.4
Docker
No response
Steps to reproduce
Unknown
Expected behavior
no crash
Actual behavior
Telegraf has been running for several days under systemd, and this weekend it crashed. Systemd tried to restart it several times, and it kept crashing repeatedly. This log snippet from journald shows a full cycle, beginning after the first crash, until it crashes again. My telegraf config is several thousand lines long, so I'm not sure which part is relevant here. I have dozens of different SNMP devices with different input configs and processors.
There was a power outage Saturday morning, about 24 hours before this crash occurred, so it's likely some of the SNMP devices were in a bad state, but I can't reproduce it. This morning after restarting the service it's working fine.
Additional info
I built this Telegraf binary using the custom builder to reduce the set of input and output plugins, but I did not customize anything else. So it's weird that the log references lines that don't exist, like snmp.go:323.