elementary.schema_changes_from_baseline Test Fails for Certain Source Tables #1460

Open
ahmedhamidibr opened this issue Mar 13, 2024 · 3 comments
Labels: Bug (Something isn't working), dbt package, Triage 👀

ahmedhamidibr commented Mar 13, 2024

Describe the bug
The schema_changes_from_baseline test behaves inconsistently on source tables: it passes for some tables and fails for others. For example:

sources:
  - name: raw
    tables:
      - name: source_table_one
        columns:
          - name: NAME
            data_type: VARCHAR
          - name: CREATEDBYID
            data_type: VARCHAR
          - name: CREATEDDATE
            data_type: TIMESTAMP_TZ
          - name: ID
            data_type: VARCHAR
        tests:
          - elementary.schema_changes_from_baseline

      - name: source_table_two
        columns:
          - name: NAME
            data_type: VARCHAR
          - name: CREATEDBYID
            data_type: VARCHAR
          - name: CREATEDDATE
            data_type: TIMESTAMP_TZ
          - name: ID
            data_type: VARCHAR
        tests:
          - elementary.schema_changes_from_baseline

Although both tables are defined with the same columns and data types in the dbt project's schema configuration, the test succeeds for one table and fails for the other. The error I am getting is:

Compilation Error in test elementary_source_schema_changes_from_baseline_raw_source_table_two_ (models/schema.yml)
  'str object' has no attribute 'database'
  
  > in macro load_relation (macros/adapters/relation.sql)
  > called by macro get_model_baseline_columns (macros/edr/tests/test_configuration/get_model_baseline_columns.sql)
  > called by macro test_schema_changes_from_baseline (macros/edr/tests/test_schema_changes_from_baseline.sql)
  > called by test elementary_source_schema_changes_from_baseline_raw_salesforce_campaignmember_ (models/schema.yml)

To Reproduce
Steps to reproduce the behavior:

  1. Generate the columns and their data types for the source table:
    dbt run-operation elementary.generate_schema_baseline_test --args '{"name": "source_table"}'
  2. Run the schema_changes_from_baseline test on the source table:
    dbt test --select source:raw.source_table

Expected behavior
The elementary.schema_changes_from_baseline test should run successfully for all source tables, provided that all columns and their respective data types are accurately defined in the table configurations.

Environment (please complete the following information):

  • edr Version: 0.14.1, can be found by running pip show elementary-data
  • dbt package Version: 0.14.1, can be found in packages.yml file
ahmedhamidibr added the Bug and Triage 👀 labels Mar 13, 2024
haritamar (Collaborator) commented

Hi @ahmedhamidibr!
Thanks for posting this issue, and sorry for the late response.

Is this issue still relevant to you?
If so, one question: does your project override the source macro?

ahmedhamidibr (Author) commented

Hi @haritamar,

Yes, indeed, my project overrides the source macro. It turns out that was the cause. After resetting back to the default source macro, it worked as expected. However, I remain interested in any workaround you might suggest to ensure the tests continue to work even when the source macro is overridden.

haritamar (Collaborator) commented May 20, 2024

Hi @ahmedhamidibr,
The main issue is that for the schema_changes_from_baseline test to work, Elementary needs to obtain a dbt Relation object, which allows us to get the current columns of the table.
When you override the source macro (or ref, for that matter), you can return a query string instead of a Relation object. That works for many dbt use cases but is problematic for Elementary's tests, which need the Relation for the reason above.
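
To illustrate the failure mode, here is a hypothetical override that returns a string. The filter and the `_deleted` column are assumptions made up for illustration; they are not taken from the project in this issue. Because the macro returns a query string rather than a Relation, any caller that reads `.database` on the result would be expected to hit the `'str object' has no attribute 'database'` error shown above.

```jinja
{# Hypothetical override: returns a query string, not a Relation #}
{% macro source(source_name, table_name) %}
    {# builtins.source is dbt's original implementation and returns a Relation #}
    {% set rel = builtins.source(source_name, table_name) %}
    {# Concatenating with ~ renders the Relation to text, so the result is a plain string #}
    {% do return("(select * from " ~ rel ~ " where not _deleted)") %}
{% endmacro %}
```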

So there are two workarounds that can be done until we support this use case, depending on what exactly your overriding source macro does:

  • If it's possible to change your macro to return a dbt Relation object instead of a string (here's a dbt example for the ref macro, but a similar thing can be done for sources) then it should help solve the issue. This may be doable if your macro returns a database table (and not a select query).
  • If on the other hand you actually return a query (e.g. if you're doing some filtering), then a possible (hacky) workaround would be to make the macro only return a Relation object when it's run from the context of the schema_changes_from_baseline test, e.g.:
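{# Place at the top of the overriding source macro; source_name and table_name are that macro's arguments. #}
{% set model_node = context.get("model") %}
{# When the calling node is the schema_changes_from_baseline test, return the built-in Relation. #}
{% if model_node and model_node.get("resource_type") == "test" and model_node["test_metadata"]["name"] == "schema_changes_from_baseline" %}
    {% do return(builtins.source(source_name, table_name)) %}
{% endif %}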

... do more complex logic that returns a query ...
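
A minimal sketch of the first workaround, following the pattern in dbt's documentation for the `builtins` variable: call the built-in source and adjust the returned Relation instead of building a string. The `dev` target name and the `dev_raw` database are hypothetical placeholders; substitute whatever your override actually needs.

```jinja
{# Sketch of an override that always returns a Relation #}
{% macro source(source_name, table_name) %}
    {# Resolve the source with dbt's built-in implementation #}
    {% set rel = builtins.source(source_name, table_name) %}
    {% if target.name == 'dev' %}
        {# Hypothetical customization: read from another database on the dev target #}
        {% do return(rel.replace_path(database='dev_raw')) %}
    {% endif %}
    {% do return(rel) %}
{% endmacro %}
```

Since both branches return a Relation, `{{ source('raw', 'source_table_two') }}` should still render as a table name in models, and Elementary should be able to read the database, schema, and identifier it needs.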
