Outlier Detection for non-error status codes #18789

Open
anikamukherji opened this issue Oct 27, 2021 · 6 comments · May be fixed by #34154
Labels: area/outlier_detection, enhancement, help wanted

Comments

@anikamukherji
Contributor

Title: Support outlier detection of other status codes (particularly 4xx).

Description:
Outliers can be hosts returning an abnormal rate of any status code, not just 5xx. Although 4xx responses are generally considered client errors, a host that starts returning a large number of them may itself have a problem (possibly related to authz, authn, etc.) and should be considered an outlier. At Pinterest, we are interested in identifying 4xx outliers in addition to 5xx outliers (although I can imagine a general solution covering all 300+ status codes).
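
To illustrate the request, here is a minimal sketch of flagging hosts by their rate of 4xx responses. It is not Envoy's outlier detection algorithm: the function names, the `threshold` and `min_requests` parameters, and the sample data are all hypothetical and chosen only for illustration.

```python
def error_rate(status_codes, is_error):
    """Return the fraction of responses for which is_error(code) is true."""
    if not status_codes:
        return 0.0
    return sum(1 for code in status_codes if is_error(code)) / len(status_codes)


def find_outliers(responses_by_host, is_error, threshold=0.5, min_requests=10):
    """Return the sorted hosts whose error rate is at least `threshold`.

    Hosts with fewer than `min_requests` responses are skipped so that a
    handful of errors on a quiet host does not mark it as an outlier.
    Both parameters are assumptions, not Envoy settings.
    """
    outliers = []
    for host, codes in responses_by_host.items():
        if len(codes) < min_requests:
            continue
        if error_rate(codes, is_error) >= threshold:
            outliers.append(host)
    return sorted(outliers)


def is_4xx(code):
    """Treat any 4xx response as an error for outlier purposes."""
    return 400 <= code <= 499


# Hypothetical per-host response codes collected over one interval.
responses = {
    "host-a": [200] * 18 + [404] * 2,   # 10% 4xx
    "host-b": [200] * 5 + [401] * 15,   # 75% 4xx
    "host-c": [403] * 3,                # too few requests to judge
}

outliers = find_outliers(responses, is_4xx)
```

With this sample data only `host-b` is flagged. Passing a different `is_error` predicate is what would let the same check cover 5xx, 4xx, or any other set of codes.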


anikamukherji added the enhancement and triage labels on Oct 27, 2021
snowp added the area/outlier_detection and help wanted labels and removed the triage label on Oct 27, 2021
@snowp
Contributor

snowp commented Oct 27, 2021

Seems reasonable to me, I don't think this would be that hard to do

@cpakulski
Contributor

Sounds good. I will implement this. I think that the API should be extended to define status codes considered as errors, so one can specify exact codes which will cause a node to be considered an outlier.

cpakulski self-assigned this on Dec 1, 2021
@gauravojha

@cpakulski I wanted to check whether there are any plans to support the above. This would be an amazingly helpful feature.

"I think that the API should be extended to define status codes considered as errors, so one can specify exact codes which will cause a node to be considered an outlier."

This will be really helpful for cases where, let's say, we want to eject on all 5xx except 502 for some reason, or something like that if required 🙏
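
As a rough sketch of what "all 5xx except 502" could look like as a configurable rule, the snippet below models a matcher built from inclusive code ranges plus a set of excluded codes. The `ErrorCodeMatcher` class and its `ranges` and `excluded` fields are hypothetical; they are not part of Envoy's API and do not reflect the design of any eventual PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorCodeMatcher:
    """Hypothetical rule: codes inside `ranges` count as errors unless excluded."""

    # Inclusive (low, high) status code ranges treated as errors.
    ranges: tuple = ((500, 599),)
    # Individual codes that never count as errors, even inside a range.
    excluded: frozenset = frozenset()

    def is_error(self, code: int) -> bool:
        if code in self.excluded:
            return False
        return any(low <= code <= high for low, high in self.ranges)


# Eject on every 5xx except 502.
five_xx_except_502 = ErrorCodeMatcher(
    ranges=((500, 599),),
    excluded=frozenset({502}),
)

# Treat both 4xx and 5xx as errors, as in the original request.
four_and_five_xx = ErrorCodeMatcher(ranges=((400, 499), (500, 599)))
```

Here `five_xx_except_502.is_error(503)` is true while `five_xx_except_502.is_error(502)` is false, and `four_and_five_xx.is_error(404)` is true. Keeping exclusions separate from ranges avoids having to split a range around each excepted code.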

@cpakulski
Contributor

@gauravojha I still plan to work on this. Your example of excluding 502 is a very good point. Please keep an eye on this issue; I should land a PR within a few weeks.

@nzt4567

nzt4567 commented Nov 10, 2023

@cpakulski Any progress on this pls? 🙂

@cpakulski
Contributor

I wrote a proposal and coded a working prototype some time ago. It was then put on hold, but I plan to open a formal PR within the next month.


5 participants