Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Unit tests fail when input contain reserved-word-named columns #10112

Closed
2 tasks done
mpatek opened this issue May 8, 2024 · 4 comments
Closed
2 tasks done

[Bug] Unit tests fail when input contain reserved-word-named columns #10112

mpatek opened this issue May 8, 2024 · 4 comments
Labels
backport 1.8.latest bug Something isn't working duplicate This issue or pull request already exists Medium Severity bug with minor impact that does not have resolution timeframe requirement quoting Issues related to dbt's quoting behavior unit tests Issues related to built-in dbt unit testing functionality

Comments

@mpatek
Copy link

mpatek commented May 8, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

In BigQuery:

Supposing an input table with a reserved-word-named column like:

-- stg
SELECT id, loaded_at, `from` FROM {{ source('some', source') }}

And a downstream table like:

-- some_model
SELECT id, loaded_at FROM {{ ref('stg') }}
qualify row_number() over (partition by id order by loaded_at desc) = 1

And a unit test like:

version: 2

unit_tests:
  - name: test__some_model
    model: some_model
    given:
      - input: ref('stg')
        rows:
          - {id: "a", loaded_at: "2024-01-01"}
          - {id: "a", loaded_at: "2024-01-02"}
          - {id: "a", loaded_at: "2024-01-02"}
    expect:
      rows:
        - {id: "a", loaded_at: "2024-01-02"}

If I try dbt test --select test_type:unit

I get: Syntax error: Unexpected keyword FROM

Expected Behavior

Expect tests to pass without errors.

Steps To Reproduce

  1. Create input model with keyword-named column (e.g. from)
  2. Create output model that selects from the input model (not necessarily including the reserved-word-named column).
  3. Add unit test that puts data into input model and runs expectations on output model
  4. Run unit test

Relevant log output

No response

Environment

- OS: macOS Monterey
- Python: 3.11.7
- dbt: 1.8.0-rc1 w/ bigquery 1.8.0b2

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

@mpatek mpatek added bug Something isn't working triage labels May 8, 2024
@dbeatty10 dbeatty10 added unit tests Issues related to built-in dbt unit testing functionality quoting Issues related to dbt's quoting behavior labels May 9, 2024
@dbeatty10 dbeatty10 self-assigned this May 9, 2024
@dbeatty10
Copy link
Contributor

Thanks for reporting this @mpatek !

When I ran your example, I got the same error as you.

When I looked at the compiled SQL (in target/compiled/my_project/models/_unit.yml/models/test__some_model.sql for me because my dbt project name in dbt_project.yml is "my_project"), it looked like this:

with __dbt__cte__stg as (

-- Fixture for stg
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-01''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
union all
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-02''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
union all
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-02''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
) -- some_model
SELECT id, loaded_at FROM __dbt__cte__stg
qualify row_number() over (partition by id order by loaded_at desc) = 1

And when I copy that compiled output to an analyses file, I get the same error:

cp target/compiled/my_project/models/_unit.yml/models/test__some_model.sql analyses             
dbt show -s analyses/test__some_model.sql 

But if I modify that analysis file to have backticks around all the column names, then it can run successfully:

with __dbt__cte__stg as (

-- Fixture for stg
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-01''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
union all
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-02''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
union all
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-02''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
) -- some_model
SELECT id, loaded_at FROM __dbt__cte__stg
qualify row_number() over (partition by id order by loaded_at desc) = 1

Output

21:52:49  Previewing node 'test__some_model':
| id |           loaded_at |
| -- | ------------------- |
| a  | 2024-01-02 00:00:00 |

So I'm wondering if we just need to apply quoted in just the right place.

@dbeatty10 dbeatty10 removed the triage label May 9, 2024
@dbeatty10 dbeatty10 removed their assignment May 9, 2024
@jtcohen6
Copy link
Contributor

@dbeatty10 I think you're right, and I think the right spots might be here and here:

{%- for column_name, column_value in ... %} {{ column_value }} as {{ adapter.quote(column_name) }}{% if not loop.last -%},{%- endif %}

@dbeatty10
Copy link
Contributor

dbeatty10 commented May 10, 2024

Minimal reprex

models/my_model.sql

select 1 as {{ adapter.quote("from") }}

models/_unit.yml

unit_tests:

  - name: test___my___model
    model: my_model
    given: []
    expect:
      rows:
        - {from: 1}

Run the model and its tests:

dbt build -s my_model

Implementation prototype

We'd probably choose to actually implement it differently than this, but here is a prototype that worked with the reprex above:

@mpatek
Copy link
Author

mpatek commented May 12, 2024

Closing, since this should proabably be a dbt-adapter issue: dbt-labs/dbt-adapters#205

@mpatek mpatek closed this as completed May 12, 2024
@dbeatty10 dbeatty10 closed this as not planned Won't fix, can't repro, duplicate, stale May 13, 2024
@dbeatty10 dbeatty10 added the duplicate This issue or pull request already exists label May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.8.latest bug Something isn't working duplicate This issue or pull request already exists Medium Severity bug with minor impact that does not have resolution timeframe requirement quoting Issues related to dbt's quoting behavior unit tests Issues related to built-in dbt unit testing functionality
Projects
None yet
Development

No branches or pull requests

4 participants