[BUG] Teams alerts spam 5x of the same failure with suppression set. #1511

Open
christopherekfeldt opened this issue May 3, 2024 · 5 comments
Labels
Alerts Created by Linear-GitHub Sync Bug Something isn't working Triage 👀 Urgent Created by Linear-GitHub Sync

Comments

@christopherekfeldt

Describe the bug
We have set up a number of Teams alert webhooks with Elementary. A job runs hourly to check whether any new test failures have arisen.
The workflow runs on the cloud schedule "30 6-19 * * 1-5", i.e. every hour from 06:30 to 19:30 on weekdays.
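To make the schedule concrete, here is a minimal sketch that expands the hour field of that cron expression. It assumes standard cron semantics, where a range such as `6-19` includes both endpoints; the helper name `expand_hour_field` is made up for illustration.

```python
def expand_hour_field(field: str) -> list[int]:
    """Expand a simple cron hour range such as "6-19" into the hours it covers."""
    start, end = (int(part) for part in field.split("-"))
    return list(range(start, end + 1))  # cron ranges include both endpoints


# Split the five cron fields: minute, hour, day of month, month, day of week.
minute, hours, _dom, _month, _dow = "30 6-19 * * 1-5".split()

run_times = [f"{hour:02d}:{int(minute):02d}" for hour in expand_hour_field(hours)]
print(run_times[0], run_times[-1], len(run_times))  # prints: 06:30 19:30 14
```

So the job fires 14 times per weekday, with the last run at 19:30.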

That job iterates over a list of teams and their filters, webhook URLs, etc., like this:

```
%{for team in var.elementary_alert_config}
- name: '${var.elementary_image}'
  id: elementary-teams-alert-${team.name}
  waitFor:
    - generate-elementary-report
  args: ['monitor', '--teams-webhook', '${team.webhook_url}', '--filters', '${team.filter}', '--suppression-interval', '${team.suppression}', '--profiles-dir', '.dbt-profiles', '--profile-target', '${var.dbt_config.environment}']
%{endfor}
```

A team config could then look like this:

elementary_alert_config = [
  {
    name        = "marketing"
    webhook_url = "<webhook_url_>"
    filter      = "owners:BICA Marketing"
    suppression = 24
  },

With this config, I expect each test failure to show up in the Marketing Teams channel only once per 24 hours, rather than being posted 5x for every failure, for every team, in every channel.

Expected behavior
Each failing test should produce only one alert.

Screenshots
[Screenshot not available]

Every failure is always posted 5x, every day.

Environment (please complete the following information):

  • edr Version: 0.14.1
  • dbt-core 1.7.10
  • dbt-bigquery 17.6
@christopherekfeldt christopherekfeldt added Bug Something isn't working Triage 👀 labels May 3, 2024
@haritamar
Collaborator

Hi @christopherekfeldt !
Thanks for surfacing this, and sorry about the delay.
We will try to look into it as soon as possible.

@haritamar haritamar added Urgent Created by Linear-GitHub Sync Alerts Created by Linear-GitHub Sync and removed Triage 👀 labels May 15, 2024
@haritamar
Collaborator

Hi @christopherekfeldt !
I'd like to follow up with a couple of questions:

  • Do the commands per team run in parallel or sequentially? In general, `edr monitor` does not support parallel runs, and running it in parallel may cause duplicate alerts.
  • Just to clarify: does the 5x spamming happen only for the correct team, or is every alert sent 5 times to every team?

Thanks,
Itamar

@christopherekfeldt
Author

We are running them in parallel. Oh okay, so that could be the issue then? And yes, every alert is sent 5 times to every team!

@haritamar
Collaborator

Hi @christopherekfeldt - interesting.
I think the parallel issue is definitely the cause for the 5X, but it doesn't explain why you are getting every alert in all channels - that sounds like a filtering issue.

In any case, it would be great if you could try running them non-parallel and write here what happens in that case.

@christopherekfeldt
Author

It worked once I stopped running them in parallel. Thanks for the suggestion! I only ran them in parallel because I thought the filter would separate them either way, and because I wanted the scheduled job to be faster and more efficient :)
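For anyone hitting the same problem, the fix confirmed in this thread (one `edr monitor` run per team, one after another) could be sketched roughly as below. This is a minimal illustration, not the reporter's actual setup: the `TEAMS` list, the function names, and the `run` parameter are hypothetical, and only the CLI flags already shown in the issue are used.

```python
import subprocess

# Hypothetical team list mirroring the elementary_alert_config entry in the issue.
TEAMS = [
    {
        "name": "marketing",
        "webhook_url": "<webhook_url_>",
        "filter": "owners:BICA Marketing",
        "suppression": 24,
    },
]


def build_monitor_command(team: dict, profiles_dir: str, profile_target: str) -> list[str]:
    """Build the edr monitor command for one team, using the flags from the issue."""
    return [
        "edr", "monitor",
        "--teams-webhook", team["webhook_url"],
        "--filters", team["filter"],
        "--suppression-interval", str(team["suppression"]),
        "--profiles-dir", profiles_dir,
        "--profile-target", profile_target,
    ]


def run_sequentially(teams, profiles_dir, profile_target, run=subprocess.run):
    """Run edr monitor once per team, waiting for each run before starting the next."""
    executed = []
    for team in teams:
        command = build_monitor_command(team, profiles_dir, profile_target)
        run(command, check=True)  # subprocess.run blocks until the command has finished
        executed.append(command)
    return executed
```

Calling `run_sequentially(TEAMS, ".dbt-profiles", "<environment>")` would then start each team's run only after the previous one has exited. In the Cloud Build template from the issue, a similar effect can probably be had by making each alert step wait for the previous one instead of having all of them wait only on `generate-elementary-report`.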
