Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for non-integer values to be returned in the should_error and should_warn fields of generic test SQL; ie. do not store failures as an integer or enforce integer data types in this schema #1453

Open
garfieldthesam opened this issue Mar 9, 2024 · 1 comment

Comments

@garfieldthesam
Copy link

garfieldthesam commented Mar 9, 2024

Is your feature request related to a problem? Please describe.

One can define a custom fail_calc for their generic test. One can define a calculation that returns a non-integer value. For example, I modified the dbt_utils.not_null_proportion test to use this config: {{ config(fail_calc = 'coalesce(not_null_proportion, 0)') }}

This allows me to configure warn and error thresholds like so:

      - name: test_name
        test_name: dbt_utils.not_null_proportion
        column_name: "some_column"
        where: some_where_clause
        config:
          severity: warn
          warn_if: <= 0.98
          error_if: <= 0.95

Now compiled test SQL looks like this in Databricks:

select
  coalesce(not_null_proportion, 0) as failures,
  coalesce(not_null_proportion, 0) <= 0.98 as should_warn,
  coalesce(not_null_proportion, 0) <= 0.95 as should_error
from
  (
    with validation as (
      select
        sum(
          case
            when some_column is null then 0
            else 1
          end
        ) / cast(count(*) as numeric) as not_null_proportion
      from
        some_table
    ),
    validation_errors as (
      select
        not_null_proportion
      from
        validation
      where
        not_null_proportion < 1.00000000001
        or not_null_proportion > 1
    )
    select
      *
    from
      validation_errors
  ) dbt_internal_test

This would work, except Elementary throws this error in the on-run-end hook:

0.98657576579 is not of type 'integer'

Failed validating 'type' in schema['properties']['failures']:
    {'type': 'integer'}

On instance['failures']:
    0.98657576579

Describe the solution you'd like

Allow for non-integer values in the failures field when running an on-run-end-hook

Describe alternatives you've considered

Unfortunately I have to not use the preferred fail_calc for several of my tests

Would you be willing to contribute this feature?

Possibly, but not any time soon

@haritamar
Copy link
Collaborator

Hi @garfieldthesam !

Thanks for opening this issue and sorry for the late response.
Very interesting use case.

However, are you sure that this error is actually coming from Elementary? From what I can see - dbt's base result class requires the failures value to be an integer - which I believe is the cause of this error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants