[FEATURE] Support String Value for Record Source and LDTS in HUBS/SATS/LINKS as we do for Staging #189

Open
RonniePitts opened this issue Apr 13, 2023 · 5 comments
@RonniePitts

RonniePitts commented Apr 13, 2023

Currently the Staging method supports string values, e.g.

derived_columns:
  SOURCE_SYSTEM: '!{{ somevalue }}'
  INSTANCE: '!someinstance'

It would be nice to support string values in the Hubs/Links/Sats methods as well, instead of only a column name.

Describe the solution you'd like
Support the use of string values (with the ! prefix, as in staging) for LDTS and RECORD SOURCE.

AB#5363

@RonniePitts RonniePitts added the feature This is is requesting a new feature label Apr 13, 2023
@RonniePitts
Author

There is probably some benefit to this, instead of having unnecessary columns for LDTS and RECORD SOURCE in the staging layer.

@DVAlexHiggs
Member

DVAlexHiggs commented Apr 13, 2023

Hi,

Welcome and thanks for your suggestion.

Whilst the idea makes sense on paper, I can't think of a use case where we wouldn't want the RECORD_SOURCE or LDTS in the staging layer.

There are a few reasons for the current implementation at this time:

  1. If we have it hard-coded in the raw vault, then we have no traceability and no audit.

  2. RECORD_SOURCE and LDTS are often shared by multiple tables in the raw vault (units of work). Rather than repeating yourself everywhere, referencing the column is much cleaner, especially since that specific stage object will likely be loading multiple raw vault objects.

  3. You may be loading your hubs and links from multiple staging objects, in which case hard-coding the LDTS or RECORD_SOURCE is unwanted behavior, as it's likely those staging objects have multiple sources between them (each record could be from any of the sources).

For the above reasons I'm not sure having these columns in the stage is unnecessary. It is in fact quite standard.
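
A minimal sketch of the pattern described above: the constant is declared once in the stage with the `!` prefix, and the raw vault model refers to the resulting column by name. The model names (`raw_customer`, `v_stg_customer`, `hub_customer`), the column names, and the `CRM` value are hypothetical, and the metadata keys are assumed to match the `stage` and `hub` macro parameters of the installed package version.

```yaml
# Stage metadata (hypothetical model: v_stg_customer)
source_model: raw_customer
derived_columns:
  RECORD_SOURCE: '!CRM'            # '!' prefix: string literal, not a column
  LOAD_DATETIME: INGESTION_TIME    # no prefix: source column (assumed to exist)
hashed_columns:
  CUSTOMER_HK: CUSTOMER_ID
---
# Hub metadata (hypothetical model: hub_customer)
source_model: v_stg_customer
src_pk: CUSTOMER_HK
src_nk: CUSTOMER_ID
src_ldts: LOAD_DATETIME            # column produced by the stage above
src_source: RECORD_SOURCE          # column produced by the stage above
```

Any link or satellite loaded from the same stage can point at the same two columns, so the literal value lives in one place and stays visible in the staging layer.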

Having said all this, we're not against adding this if it's seen to be useful by the wider community. Could you help us understand where this might be used?

Thanks!

@RonniePitts
Author

RonniePitts commented Apr 13, 2023 via email

@DVAlexHiggs
Member

Yeah, this makes sense. Sometimes record sources are actually codes, and it is good practice to have a lookup table which maps these to system names.

I presume this is what you mean? It sounds like you have a macro to do this kind of lookup.

@RonniePitts
Author

RonniePitts commented Apr 13, 2023 via email

@DVAlexHiggs DVAlexHiggs added feature This is is requesting a new feature and removed feature This is is requesting a new feature labels May 15, 2024