[Bug] Seed properties .yml files compile twice and clash when seed directory is under model directory #10064

Open
2 tasks done
mbarnathan-os opened this issue Apr 29, 2024 · 1 comment
Labels: bug (Something isn't working), Medium Severity (bug with minor impact that does not have resolution timeframe requirement)

Comments

@mbarnathan-os

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When seed properties are specified in a .yml file and the seed-paths directory is nested under the model-paths directory, the .yml file is compiled twice and dbt fails with "dbt found two schema.yml entries for the same resource", reporting the same file both times even though there is no actual duplication. This doesn't happen if the seed is defined solely by its .csv file, i.e. when the .yml file is absent.

Expected Behavior

Nesting seeds under the model directory is a supported use case per the docs, so I would expect the model parsing phase to detect the duplication and skip the seed the second time.

Steps To Reproduce

  1. In dbt_project.yml, set model-paths: ["models"]
  2. In the same file, set seed-paths: ["models/seeds"]
  3. Add a .yml properties file in the seed path that declares at least one seed
  4. Add a .csv file for that seed
  5. Run dbt parse; it fails with "dbt found two schema.yml entries for the same resource", pointing at the seed .yml file from step 3
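
The path overlap set up in steps 1 and 2 can be illustrated without running dbt. The following is a minimal sketch and not dbt's actual file discovery logic: it assumes each configured path is scanned recursively for .yml files, and shows that a properties file under models/seeds is then reachable from both paths. The helper name `find_yaml` and the temporary directory layout are hypothetical.

```shell
#!/bin/sh
# Illustration only: this is NOT dbt's implementation, just a model of
# "scan every configured path recursively for .yml files".
set -eu

# Build a throwaway project layout mirroring the steps above.
workdir=$(mktemp -d)
mkdir -p "$workdir/models/seeds"
printf 'id\n1\n' > "$workdir/models/seeds/my_seed.csv"
printf 'seeds:\n  - name: my_seed\n' > "$workdir/models/seeds/_seeds.yml"

# Hypothetical helper: list .yml files under one configured path,
# relative to the project root.
find_yaml() {
  (cd "$workdir" && find "$1" -type f -name '*.yml' | sort)
}

from_model_paths=$(find_yaml models)        # model-paths: ["models"]
from_seed_paths=$(find_yaml models/seeds)   # seed-paths: ["models/seeds"]

# Files that appear in both listings are the ones seen twice.
overlap=$(printf '%s\n%s\n' "$from_model_paths" "$from_seed_paths" | sort | uniq -d)
echo "$overlap"   # prints: models/seeds/_seeds.yml

rm -rf "$workdir"
```

Under that assumption, the same _seeds.yml is found once per configured path, which is consistent with the "two schema.yml entries" error naming a single file.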

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

@mbarnathan-os mbarnathan-os added bug Something isn't working triage labels Apr 29, 2024
@dbeatty10 dbeatty10 self-assigned this Apr 30, 2024
@dbeatty10
Contributor

Thanks for reporting this @mbarnathan-os 👍

I was able to reproduce what you described. See details below.

Reprex

dbt_project.yml

name: "my_project"
version: "1.0.0"
config-version: 2
profile: "some_profile"

model-paths: ["models"]
seed-paths: ["models/seeds"]

Create the model and seed directories:

mkdir -p models
mkdir -p models/seeds

Create a seed file:

cat <<EOF > models/seeds/my_seed.csv
id
1
EOF

Add an associated YAML file:

cat <<EOF > models/seeds/_seeds.yml
seeds:
  - name: my_seed
EOF

See that it never works when partial parsing is disabled (but may work in certain situations when it is enabled):

dbt parse --no-partial-parse
dbt list --no-partial-parse

Side note

In my case, I ran an initial dbt parse before adding the YAML file. After that, dbt commands kept working as long as partial parsing was enabled and I didn't run dbt clean to remove my partial parsing artifacts.

@dbeatty10 dbeatty10 removed the triage label Apr 30, 2024
@dbeatty10 dbeatty10 removed their assignment Apr 30, 2024
@dbeatty10 dbeatty10 added the Medium Severity bug with minor impact that does not have resolution timeframe requirement label Apr 30, 2024