[Bug] dbt deps automatically recognizes projects in subdirectories #9719

Open
2 tasks done
djbelknapaw opened this issue Mar 1, 2024 · 4 comments · May be fixed by #9734
Labels
bug (Something isn't working), windows (Everyone's favorite OS that's sometimes a little weird)

Comments

@djbelknapaw

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I'm attempting to build an integration_tests sub-project similar to the one in dbt_utils, then install the parent project when running the integration tests. The my_project/integration_tests/packages.yml file is the same as in dbt_utils:

packages:
    - local: ../

This installs the parent project as a package, but the installed copy includes the integration_tests child project. dbt recognizes that child as a project and attempts to install its dependencies again, resulting in an endless chain of nested installs.

I end up with a directory structure of:

my_project
/integration_tests
/integration_tests/dbt_packages/my_project
/integration_tests/dbt_packages/my_project/integration_tests
/integration_tests/dbt_packages/my_project/integration_tests/dbt_packages/my_project
...

Eventually deps fails with an error:
"[WinError 206] The filename or extension is too long: 'dbt_packages\\\\my_project\\\\integration_tests\\\\dbt_packages\\\\my_project\\\\integration_tests\\\\dbt_packages\\\\my_project\\\\integration_tests\\\\dbt_packages\\\\my_project\\\\integration_tests\\\\dbt_packages\\\\my_project\\\\integration_tests'"

Expected Behavior

When running dbt deps against a local project, only recognize the dbt_project.yml and packages.yml of the directly referenced project, not those in sub-project directories. In this example, dbt should only look at ../packages.yml and not at ../integration_tests/packages.yml.

Steps To Reproduce

  1. Create a project containing a sub-project
  2. In the sub-project, add a package pointing to the parent project
  3. Run dbt deps

Relevant log output

No response

Environment

- OS: Windows 10
- Python: 3.11.7
- dbt: 1.7.9

Which database adapter are you using with dbt?

No response

Additional Context

No response

djbelknapaw added the bug and triage labels on Mar 1, 2024
dbeatty10 added the windows label on Mar 6, 2024
dbeatty10 self-assigned this on Mar 6, 2024
@dbeatty10
Contributor

Thanks for raising this issue @djbelknapaw !

Does this only happen on Windows when using a PowerShell or cmd.exe terminal? Or does it also happen when using WSL (Windows Subsystem for Linux)?

Suspected root cause

Here's what I think is happening:

👉 When dbt installs a local package, it uses a symlink if it can. Otherwise, it makes a copy of the entire directory.

My understanding is that some Windows environments don't allow creation of a symlink, so dbt installs the package via the copy approach instead. Since the install location is a subdirectory of the package being installed, the copy exhibits the recursive behavior you observed.
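
The condition described above (the install location sitting inside the package being installed) can be checked directly. This is a minimal sketch, not dbt code: `dest_is_inside_src` is a hypothetical helper name, and it assumes both paths are given as strings.

```python
from pathlib import Path


def dest_is_inside_src(src_path: str, dest_path: str) -> bool:
    """Return True when dest_path is src_path itself or lies beneath it.

    Hypothetical helper for illustration; not part of dbt.
    """
    src = Path(src_path).resolve()
    dest = Path(dest_path).resolve()
    # Path.parents lists every ancestor directory of dest.
    return dest == src or src in dest.parents
```

For the layout in this issue, `dest_is_inside_src("my_project", "my_project/integration_tests/dbt_packages/my_project")` is true, which is exactly the case where a plain recursive copy keeps finding its own output.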

Potential solution

One approach we can consider when a symlink is not possible:

  1. Instead of copying directly to the dest_path, use a temporary location as an intermediate.
  2. Then move it from the intermediate location to the final dest_path location.

Here's the relevant source code:

try:
    fire_event(DepsCreatingLocalSymlink())
    system.make_symlink(src_path, dest_path)
except OSError:
    fire_event(DepsSymlinkNotAvailable())
    shutil.copytree(src_path, dest_path)
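
The two-step approach listed under "Potential solution" might look roughly like the sketch below. It is an illustration under stated assumptions, not the dbt implementation: `copy_via_staging` is a hypothetical name, the system temp directory is assumed to live outside `src_path`, and `dest_path` is assumed not to exist yet.

```python
import shutil
import tempfile
from pathlib import Path


def copy_via_staging(src_path: str, dest_path: str) -> None:
    """Copy src_path to dest_path through a staging directory outside src_path.

    Hypothetical helper for illustration; not part of dbt.
    """
    dest = Path(dest_path)
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp) / "package"
        # The staging directory is not under src_path, so the copy cannot
        # walk into its own output.
        shutil.copytree(src_path, staging)
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Move the finished snapshot into its final location.
        shutil.move(str(staging), str(dest))
```

One caveat with this sketch: anything already present under the install directory from an earlier run is still copied once, so it bounds the recursion rather than excluding installed packages altogether.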

dbeatty10 removed the triage label on Mar 6, 2024
dbeatty10 removed their assignment on Mar 6, 2024
@djbelknapaw
Author

Correct: WSL successfully sets up the symlink and doesn't error out. Also, when I run PowerShell as admin it's able to create a symlink, so the problem is specific to the copytree code path.

It seems like you could just use the ignore parameter of copytree so that it stops recursing into installed packages when it finds the project_root of the project calling deps in the source tree? This worked in a really quick local copy, but I'm not sure if it works more broadly.

shutil.copytree(src_path, dest_path,
                # ignore must return a collection of entry names, not a bare string
                ignore=lambda directory, contents:
                    [project.packages_install_path] if directory == project.project_root
                    else [])
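
A self-contained sketch of the same idea follows, for anyone who wants to try it outside dbt. The function name `copy_skipping_install_dir` and the default directory name `dbt_packages` are assumptions made for illustration. `shutil.copytree` calls the `ignore` callable once per directory with that directory's path and its entry names, and skips whichever names the callable returns.

```python
import shutil
from pathlib import Path


def copy_skipping_install_dir(src_path, dest_path, project_root,
                              install_dir_name="dbt_packages"):
    """Copy src_path to dest_path, skipping install_dir_name under project_root.

    Hypothetical helper for illustration; not part of dbt.
    """
    root = Path(project_root).resolve()

    def ignore(directory, contents):
        # Called once per directory; returns the entry names to skip there.
        # Paths are resolved so the comparison does not depend on how the
        # caller happened to spell them.
        if Path(directory).resolve() == root and install_dir_name in contents:
            return {install_dir_name}
        return set()

    shutil.copytree(src_path, dest_path, ignore=ignore)
```

With the layout from this issue, calling it with the parent project as `src_path` and `integration_tests` as `project_root` copies the parent once and leaves `integration_tests/dbt_packages` out of the copy, so the nesting never starts.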

@dbeatty10
Contributor

💡 Great idea about the ignore parameter @djbelknapaw !

No pressure, but are you interested in opening a PR that includes your solution, by any chance?

@djbelknapaw
Author

@dbeatty10 Opened! First time contributing here, so let me know if there's anything to do differently.
