Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature eng dev mmt #157

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

hengrumay
Copy link
Collaborator

Updating code with feature store (fs) to feature engineering(fe) wrt https://databricks.atlassian.net/browse/MLOPS-675

@@ -103,7 +103,7 @@ raw_data = spark.read.format("delta").load(input_table_path)
# Compute the features. This is done by dynamically loading the features module.
from importlib import import_module

mod = import_module(features_module)
mod = import_module(features_module) ##??
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this comment for?

Copy link
Collaborator Author

@hengrumay hengrumay May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Should have been omitted -- maybe I deleted after the commit -- will recommit


fs = feature_store.FeatureStoreClient()

from databricks.feature_engineering import FeatureEngineeringClient
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only do this if UC is enabled, you can see other examples where we use:
{{ if (eq .input_include_models_in_unity_catalog `no`) }}, or does this client work for non-UC cases as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK -- I think need to update with conditionals. I was wondering where they were but they need to be added!

@@ -116,11 +116,11 @@ features_df = compute_features_fn(
# COMMAND ----------
# DBTITLE 1, Write computed features.

{{- if (eq .input_include_models_in_unity_catalog "no") }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you changed this from your last commit?

Suggested change
{{- if (eq .input_include_models_in_unity_catalog "no") }}
{{- if (eq .input_include_models_in_unity_catalog `no`) }}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some failed tests and I noted previous conditionals in section above related to dbutils.wigets with {{- if (eq .input_include_models_in_unity_catalog "no") }} so thought to replace... should it be 'no' instead of "no" ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm generally we use backticks (`) instead of quotes (") or ('). You can see other examples here: https://github.com/search?q=repo%3Adatabricks%2Fmlops-stacks%20input_include_models_in_unity_catalog%20&type=code

label="fare_amount",
exclude_columns=exclude_columns,
)
{{end -}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the - at the end means to remove space after this line. You should try generating this notebook in your local Stacks repo by using databricks bundle init and add screenshots of both UC and non-UC to make sure they look as expected

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try generating -- I didn't realize I could! Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants