-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check_output_custom
could pass the information about the feature definition to the output validators
#809
Comments
@lucamattiazzi yep seems like a reasonable feature -- there might be a bit of surgery required to pass that information through (haven't scoped it). @elijahbenizzy might also have thoughts. Some questions to get your input on:
Reason I ask is there are other implementation options:
So just figuring out which of the three approaches would be better to invest in / look at. |
Thank you for the quick answer! 1 - it does not matter, it's nice but it would not change a lot for us the lifecycle API seems great, and I think we might make it work with our current implementation, but the results would be stored elsewhere than the driver, which looks a little bit worse in my opinion. the current structure of DataValidator does not seem to allow the current_node value to be passed easily without breaking current child classes |
Thanks for the responses, makes sense. Some follow ups:
Yes, that lifecycle adapter could house the results that you would then inspect. How are you getting the results now? (or how would like to get them?)
None of the approaches would directly enable this I don't think. My Hamilton philosophy here is that this should then be explicitly part of the DAG -- or if part of a decorator maybe a different one or a flag to enable it.
Yep. But nothing that can't be changed since I think we can do it in a backwards compatible way. :) |
Chiming in -- so yes, this does make inherent sense. We could expose It's interesting -- usually we recommend that users put the configuration closer to their code (I.E. in the parameters themselves), but if its a lot this can get tricky. So I think this is a reasonable approach. Your workaround is pretty good for now, but it's reasonable to have the name of the node as the input. |
hello, we're using the decorator
check_output_custom
to add custom validation to our features, but while the validator class is defined only once, each feature has its own rules for validationthe problem is that the validator class does not receive any information on the feature it will be validating, only the resulting pd.Series, and we would need at least the name of the feature to be able to validate it correctly without explicitly using the name of the feature on the validator
the
DataValidator
child class could receive either in the__init__
or in thevalidate
method the function whose results it should validateI'll glady open a PR with the change
I'd like to be possible to do this:
instead of needing to do this:
The text was updated successfully, but these errors were encountered: