[Feature] Support dispatch for materializations to use implementations defined in installed packages #10090

Open
3 tasks done
jtcohen6 opened this issue May 5, 2024 · 4 comments
Labels
enhancement New feature or request user docs [docs.getdbt.com] Needs better documentation

@jtcohen6

jtcohen6 commented May 5, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

This is a way of preserving a more granular version of the previous behavior, which would implicitly (and somewhat surprisingly) use materializations from packages that override builtins (view, table, incremental, test, etc).

We are opting for this syntax:

dispatch:
  - type: materialization
    # macro_namespace is not needed, because all materializations are global
    search_order: ['elementary', 'dbt']
  - type: macro
    macro_namespace: dbt_utils
    search_order: [...]

Considerations

If not specified, the default would remain:

  1. Root package first
  2. Then builtin implementations within dbt-core / adapters

This is how it already works for materializations when the behavior change flag is enabled, and for dispatched macros.
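The lookup order described above can be sketched in Python. This is an illustration only, not dbt-core's actual `find_materialization_macro_by_name`: the function name `resolve_materialization`, the `candidates` mapping, and the package name `my_project` are hypothetical, and the sketch assumes that a configured search order is used exactly as given.

```python
from typing import Dict, List, Optional

# Hypothetical constant: stands in for implementations shipped with dbt-core / adapters.
BUILTIN = "dbt"


def resolve_materialization(
    candidates: Dict[str, str],
    root_package: str,
    search_order: Optional[List[str]] = None,
) -> Optional[str]:
    """Return the implementation from the first package in the order that defines one.

    `candidates` maps a package name to that package's implementation of a
    single materialization (for example, `table` on one adapter).
    """
    # Default when no materialization dispatch is configured:
    # root package first, then the builtin implementation.
    order = search_order if search_order is not None else [root_package, BUILTIN]
    for package in order:
        if package in candidates:
            return candidates[package]
    return None


candidates = {
    "dbt": "builtin table",
    "elementary": "elementary table",
}

# No dispatch config: the installed package is skipped and the builtin wins.
default_choice = resolve_materialization(candidates, root_package="my_project")

# Explicit search order, as in the proposed `type: materialization` entry.
dispatched_choice = resolve_materialization(
    candidates, root_package="my_project", search_order=["elementary", "dbt"]
)
```

In this sketch, `elementary` is only considered once it is listed explicitly, which is the opt-in behavior the proposal is after.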

Implementation

The relevant method is find_materialization_macro_by_name.

Michelle spiked a similar capability, though by pulling the "allowlist" from Project.flags rather than Project.dispatch: 99998a5

Describe alternatives you've considered

(1) Not doing this

Users must reimplement materializations one-by-one, by defining them in their root project and calling the implementations in packages.

{% materialization table, snowflake %}
  {{ return(elementary.materialization_table_snowflake()) }}
{% endmaterialization %}

{% materialization incremental, snowflake %}
  {{ return(elementary.materialization_incremental_snowflake()) }}
{% endmaterialization %}

-- etc
-- differs by adapter

(2) Bundling with existing macro dispatch

Under the hood, these materializations are macros, and defined within the 'dbt' namespace. But I like the idea of continuing to keep the two separate, for three reasons:

  1. Avoid tying ourselves forever to this implementation detail (that materializations are macros)
  2. Materializations are only callable from the "global" namespace, whereas all other macros can be called by namespace
  3. It's conceivable that users would want different behavior for materializations versus other built-in macros

(3) Different syntax that wouldn't require evolving the type of dispatch config

# 'materialization' is just a special macro_namespace
dispatch:
  - macro_namespace: materialization
    search_order: ['elementary', 'dbt']

Edge case: this would not play well with an installed package named materialization.
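To make the edge case concrete, here is a small Python sketch of the ambiguity. The helper `is_ambiguous_namespace` and the package lists are hypothetical and only illustrate the collision; they are not part of dbt-core.

```python
from typing import Dict, List

# Under alternative (3), this namespace value would be reserved to mean
# "all materializations" rather than a real package.
RESERVED_NAMESPACE = "materialization"


def is_ambiguous_namespace(entry: Dict[str, object], installed_packages: List[str]) -> bool:
    """Return True when the entry could mean either the reserved keyword or a real package."""
    namespace = entry.get("macro_namespace")
    return namespace == RESERVED_NAMESPACE and namespace in installed_packages


entry = {"macro_namespace": "materialization", "search_order": ["elementary", "dbt"]}

# No collision: no installed package shares the reserved name.
no_collision = is_ambiguous_namespace(entry, ["elementary", "dbt_utils"])

# Collision: a package literally named `materialization` is installed.
collision = is_ambiguous_namespace(entry, ["elementary", "materialization"])
```

A separate `type` attribute avoids this collision because the package namespace and the kind of dispatch are stored in different keys.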

Who will this benefit?

Are you interested in contributing this feature?

Yes, with the help of the elementarians

Anything else?

We should do this for dbt Core v1.8.x only. No backports. While we introduced the deprecation warning in dbt-core v1.6 + v1.7, it's still the default behavior in those versions to use implementations defined in packages.

We will document in the v1.8 upgrade guide and "legacy behaviors" the recommended sequence of:

  • Any users currently seeing the deprecation warning should set the behavior change flag to False to preserve existing behavior (regardless of the version they're running on), and to unblock their upgrade to v1.8
  • Then, at their leisure, they may set materialization dispatch, set the flag to True / remove the flag setting, and confirm that they're seeing the same behavior as before
@dbeatty10

Overall, either of these options seems like it could work.

I'm more drawn to the second option because it seems closest conceptually to the current approach.

The first option has a slight hack feel to it along with the edge case where it would not play well with an installed package named materialization.

If we choose the second option, would users be able to mix-n-match like the following? Or would - type become required?

# second option: new attribute
dispatch:

  - type: materialization
    # macro_namespace is not needed, because all materializations are global
    search_order: ['elementary', 'dbt']

  - macro_namespace: dbt_utils
    # type is not needed here because `macro` is the default when not specified
    search_order: ['my_project', 'dbt_utils']

Out of curiosity, would/could either of these work from a technical perspective if we choose the second option?

# second option: new attribute
dispatch:
  - macro_namespace: dbt
    type: materialization
    search_order: ['elementary', 'dbt']

# second option: new attribute
dispatch:
  - macro_namespace: none
    type: materialization
    search_order: ['elementary', 'dbt']

@jtcohen6

jtcohen6 commented May 7, 2024

@dbeatty10 I agree with your preference for a new type attribute.

I think users could "mix-n-match". The default type would be macro, and we already have custom validation logic where we could apply special handling: "If type: materialization and macro_namespace is not specified, it is none or 'dbt' by default."
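A rough Python sketch of that special handling follows. It is not dbt-core's validation code: `normalize_dispatch_entry` is a hypothetical helper, choosing 'dbt' (rather than none) as the default namespace is an assumption made for the sketch, and so is treating `macro_namespace` as required for `type: macro` entries.

```python
from typing import Any, Dict

VALID_TYPES = {"macro", "materialization"}


def normalize_dispatch_entry(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in defaults for one entry of the `dispatch` list."""
    normalized = dict(entry)

    # `type` is optional; entries without it behave like today's macro dispatch.
    normalized.setdefault("type", "macro")
    if normalized["type"] not in VALID_TYPES:
        raise ValueError(f"Unknown dispatch type: {normalized['type']!r}")

    if normalized["type"] == "materialization":
        # Assumption: a missing or `none` namespace falls back to 'dbt'.
        if normalized.get("macro_namespace") in (None, "none"):
            normalized["macro_namespace"] = "dbt"
    elif not normalized.get("macro_namespace"):
        # Assumption: plain macro entries still need an explicit namespace.
        raise ValueError("macro_namespace is required when type is 'macro'")

    return normalized


# "Mix-n-match": one entry with `type`, one without.
materialization_entry = normalize_dispatch_entry(
    {"type": "materialization", "search_order": ["elementary", "dbt"]}
)
macro_entry = normalize_dispatch_entry(
    {"macro_namespace": "dbt_utils", "search_order": ["my_project", "dbt_utils"]}
)
```

With defaults applied this way, both of the "out of curiosity" spellings (`macro_namespace: dbt` and `macro_namespace: none`) would normalize to the same entry.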

And I like where your "out of curiosity" thought is going. This syntax would be extensible to eventually support namespacing for materializations, and finer-grained search orders for each namespace. That's a good future possibility, out of scope for whatever we're doing here. For now, this will be a single search order for all materializations.

@jtcohen6

jtcohen6 commented May 7, 2024

@Maayan-s @haritamar Does this proposal make sense to you both? Would you still be up for helping with the contribution?

@haritamar

Hi @jtcohen6, yes, this proposal sounds good on our end, and we'll be happy to help with the contribution!

@dbeatty10 dbeatty10 added the user docs [docs.getdbt.com] Needs better documentation label May 8, 2024