Understanding how elementary resolves dbt models with CI jobs #1474

Open
wusanny opened this issue Mar 21, 2024 · 0 comments
Labels
Bug Something isn't working Triage 👀

Comments

wusanny commented Mar 21, 2024

Describe the bug
We are not sure whether this is a bug or expected behaviour, but the current behaviour is confusing, so we are raising this issue to get clarification.

We have observed that the first time a CI job runs after the elementary dbt package is installed, the elementary models are built into a temporary PR schema instead of the custom elementary schema specified in dbt_project.yml.

For context, in dbt Cloud, CI jobs materialize models in a temporary schema unique to the PR, which is dropped once the PR is merged or closed (docs for reference here). We expected the Elementary models to still be written to the schema defined for them in dbt_project.yml; that is, Elementary should override the temporary PR schema and write its models into its own schema, not the temp PR schema.
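To make the schema names concrete, here is a minimal sketch of how dbt's built-in `generate_schema_name` macro composes a schema name by default: the target schema, an underscore, then the custom schema. It is written in Python purely for illustration. The target schemas `dbt_cloud_pr_537847_8` and `dbt_sanny` and the custom schema `elementary_new` are assumptions inferred from the debug-log line quoted in the reproduction steps. The sketch also assumes the project does not override the macro, and it only illustrates name composition. It does not explain why later CI runs write to the non-PR schema.

```python
from typing import Optional


def generate_schema_name(target_schema: str, custom_schema: Optional[str]) -> str:
    """Python rendering of dbt's default generate_schema_name behaviour.

    With no custom schema, dbt uses the target schema as-is; otherwise it
    concatenates "<target schema>_<custom schema>".
    """
    if custom_schema is None:
        return target_schema
    return f"{target_schema}_{custom_schema.strip()}"


# Assumed inputs, inferred from the schema names seen in the debug logs.
ci_schema = generate_schema_name("dbt_cloud_pr_537847_8", "elementary_new")
prod_schema = generate_schema_name("dbt_sanny", "elementary_new")

print(ci_schema)
print(prod_schema)
```

Under these assumptions, a CI run whose target schema is the temporary PR schema would yield a PR-prefixed elementary schema, which matches what the first CI run shows.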

After that PR has been merged and a production job has run, Elementary models from all subsequent CI jobs are written into the expected schema.

To Reproduce
Prerequisite:

  • dbt Cloud account

Steps to reproduce the behavior:

  1. In dbt Cloud
  1. Install elementary dbt package, following the instructions here
  2. In dbt Cloud's IDE > Commit & sync > Create pull-request
  3. This will kick off the CI job created in step 1
  4. After the run is complete > open up the run page > click on the last step 'Invoke dbt build --select state:modified+' > Debug Logs > Download full debug logs
  5. Search for any of the elementary models, e.g., dbt_invocations, and we can see that it is built into the temporary PR schema - create or replace table DEVELOPMENT.dbt_cloud_pr_537847_8_elementary_new.dbt_invocations - instead of DEVELOPMENT.dbt_sanny_elementary_new.dbt_invocations.
  6. Merge the PR and run the main production job
  7. Download the debug logs for the main job and we can see that elementary models are built into the correct elementary schema
  8. Create a new pull request that will trigger another CI job
  9. Download the debug logs for that new CI job > elementary models are now built into the correct elementary schema
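The manual log search in the steps above can be scripted. The following is a rough sketch, not part of elementary or dbt: the helper names `find_relations` and `is_pr_schema` are hypothetical. It assumes the debug log contains `create or replace table <database>.<schema>.<table>` statements like the one quoted in step 5, and that temporary PR schemas start with `dbt_cloud_pr_`.

```python
import re
from typing import Iterable, List, Tuple

# Matches e.g. "create or replace table DB.SCHEMA.TABLE"; an optional
# "transient" keyword is tolerated because some warehouses emit it.
_DDL = re.compile(
    r"create\s+or\s+replace\s+(?:transient\s+)?table\s+([\w$]+)\.([\w$]+)\.([\w$]+)",
    re.IGNORECASE,
)


def find_relations(lines: Iterable[str], table: str) -> List[Tuple[str, str, str]]:
    """Return (database, schema, table) for every DDL statement that builds `table`."""
    found = []
    for line in lines:
        for db, schema, name in _DDL.findall(line):
            if name.lower() == table.lower():
                found.append((db, schema, name))
    return found


def is_pr_schema(schema: str, prefix: str = "dbt_cloud_pr_") -> bool:
    """True if the schema name looks like a temporary dbt Cloud PR schema (assumed prefix)."""
    return schema.lower().startswith(prefix)


# Sample line copied from the first CI run's debug log.
sample = [
    "create or replace table DEVELOPMENT.dbt_cloud_pr_537847_8_elementary_new.dbt_invocations",
]
hits = find_relations(sample, "dbt_invocations")
for db, schema, name in hits:
    print(db, schema, name, "PR schema" if is_pr_schema(schema) else "custom schema")
```

To check a downloaded log, pass an open file handle instead of `sample`, for example `find_relations(open("debug.log"), "dbt_invocations")`, where `debug.log` is a placeholder path.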

Expected behavior
The expected behaviour is for all elementary models to be built into the custom elementary schema, regardless of whether it is the initial CI run. Note that before that first PR is merged, any metadata inserted into the temporary PR schema disappears when the PR is merged and the temporary PR schema is dropped (the default behaviour for dbt's CI jobs). The client's data is then lost and cannot be recovered.

Environment (please complete the following information):

  • dbt package version: 0.14.1

Additional context
Debug logs for reference:
1. Debug log run 261771270 (1st CI run).log
2. Debug log run 261772372 (after merge, main prod run).log
3. Debug log run 261772795 (2nd CI run).log

This is the behaviour observed across multiple tests:

  • First installation of the elementary dbt package > run a CI job > elementary models are built in the temp PR schema (e.g., database.pr_9_elementary_schema.model_name)
  • Trigger the CI job for the same PR multiple times > models are still built in the same temp PR schema (i.e., database.pr_9_elementary_schema.model_name)
  • Merge the PR, with no main prod job run yet > run another CI job > elementary models are built in a new temp PR schema (i.e., database.pr_10_elementary_schema.model_name)
  • Main prod job runs > elementary models are built in the normal custom elementary schema (i.e., database.dbt_sanny_elementary_schema.model_name)
  • If the previous PR was still unmerged when the main job ran > trigger another CI run on that open PR > elementary models are built in that unmerged PR's temp schema (i.e., database.pr_10_elementary_schema.model_name)
  • Once that last open PR has been merged > any new PR from then on is built into the normal custom elementary schema (i.e., database.dbt_sanny_elementary_schema.model_name)
@wusanny wusanny added Bug Something isn't working Triage 👀 labels Mar 21, 2024