vmagent job flapping up/down with no errors #6203
Comments
Hello, |
@Haleygo thanks for your response! As I posted, vmagent has 6 shards. Unfortunately, it's not possible to post all the targets from |
No problem, I'm only concerned about the vmagent scrape job. |
We use the vm-stack chart for our setup; vmagent is scraped via VMServiceScrape (Kubernetes SD).
I don't understand what I should post here. Can you provide the exact command or query, please? The UI targets page is really huge because the target count is near 9000. We definitely have unhealthy targets, actually hundreds of them. Is that related to the vmagent job's up/down flapping? |
@Haleygo can you please tell me whether there is any recommendation to fix this? I've tried increasing the scrape timeout with no luck; what else |
Looks like you have a lot of scrape failures (3062/4783; 2733/4846). They could be caused by resource pressure or a slow network. Could you also check vmagent's CPU usage? |
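To put those two figures in perspective, here is a minimal sketch that turns them into failure percentages. It assumes each pair reads as failed scrapes over total scrapes, which the comment does not state explicitly; `failure_ratio` is a hypothetical helper, not part of vmagent.

```python
def failure_ratio(failed: int, total: int) -> float:
    """Return the fraction of scrapes that failed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total


# Pairs quoted in the comment above; reading them as (failed, total)
# is an assumption.
pairs = [(3062, 4783), (2733, 4846)]

for failed, total in pairs:
    print(f"{failed}/{total} = {failure_ratio(failed, total):.1%} failed")
```

Under that reading, more than half of the scrapes fail on both shards, which is far more than a handful of unhealthy targets would explain.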
@Haleygo thanks! vmagent shows extremely high CPU usage, and the nodes are at almost 100% CPU usage. I didn't expect that scrape failures could be caused by high CPU usage. How can I determine whether a scrape failure is caused specifically by a lack of resources? |
Is your question request related to a specific component?
vmagent
Describe the question in detail
VictoriaMetrics cluster 1.96
VictoriaMetrics vmagent 1.96
vmagent clustered across 6 members
Everything seems to be fine, but the built-in dashboard always shows that the vmagent job is flapping up/down. The yellow bars on the graph represent this query:
up{job="vmagent-vm-stack"}
and VMUI really shows this job going up and down all the time,
but I can't find the reason at all. There are no pod/container restarts
and the pods don't even show any errors.
Please help me debug this.
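To quantify the flapping rather than eyeball it on the graph, the samples returned for `up{job="vmagent-vm-stack"}` could be post-processed along these lines. This is only a sketch: the sample values are made up for illustration, and `count_flaps` and `down_ratio` are hypothetical helpers, not part of VictoriaMetrics.

```python
def count_flaps(samples):
    """Count transitions between up (1) and down (0) in consecutive samples."""
    flaps = 0
    for prev, cur in zip(samples, samples[1:]):
        if prev != cur:
            flaps += 1
    return flaps


def down_ratio(samples):
    """Return the fraction of samples in which the target was down."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s == 0) / len(samples)


# Hypothetical `up` values for one target, one value per scrape interval.
example = [1, 1, 0, 1, 0, 0, 1, 1]

print("flaps:", count_flaps(example))
print("down ratio:", down_ratio(example))
```

A high flap count with no pod restarts would point at failed scrapes rather than at the vmagent process itself going away.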
Troubleshooting docs