Add data quality part for parking_sensor_synapse using great expectations library #641

Open
cchenshu wants to merge 8 commits into main

Contributor

@cchenshu cchenshu commented Jun 20, 2023

Type of PR

  • Code changes

Purpose

  • Add the data validation part for parking_sensor_synapse using the Great Expectations library, following logic similar to the Databricks version.
  • For both the 02_standardize and 03_transform notebooks, add the following data validation steps:
    0. Create a mount point path for the Spark job
    1. Configure the DataContext
    2. Create a BatchRequest based on the dataframe
    3. Define an Expectation Suite and the corresponding data expectations
    4. Configure a checkpoint and run the Expectation Suite using the checkpoint
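
For reviewers unfamiliar with the pattern, steps 2 to 4 above can be illustrated roughly as follows. This is a plain-Python sketch of the batch / expectation suite / checkpoint flow, not the Great Expectations API and not the code in the notebooks. The `Expectation` class, the `run_checkpoint` function, and the column names and values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Expectation:
    """One named rule applied to every row of a batch (hypothetical type)."""
    name: str
    check: Callable[[Row], bool]


def run_checkpoint(batch: List[Row], suite: List[Expectation]) -> Dict[str, object]:
    """Evaluate every expectation in the suite against the batch and aggregate."""
    results = {}
    for exp in suite:
        # Count the rows that violate this expectation.
        failed = sum(1 for row in batch if not exp.check(row))
        results[exp.name] = {"success": failed == 0, "unexpected_count": failed}
    overall = all(r["success"] for r in results.values())
    return {"success": overall, "results": results}


# Step 2: a "batch" of rows, standing in for the Spark dataframe (made-up data).
batch = [
    {"bay_id": 1, "status": "Present"},
    {"bay_id": 2, "status": "Unoccupied"},
    {"bay_id": None, "status": "Unknown"},
]

# Step 3: an expectation suite is a named collection of rules (made-up rules).
suite = [
    Expectation("bay_id_not_null", lambda r: r["bay_id"] is not None),
    Expectation("status_in_set", lambda r: r["status"] in {"Present", "Unoccupied"}),
]

# Step 4: the checkpoint runs the whole suite against the batch in one call.
report = run_checkpoint(batch, suite)
print(report["success"])
for name, outcome in report["results"].items():
    print(name, outcome)
```

In the real notebooks the library plays each of these roles: the BatchRequest wraps the dataframe, the Expectation Suite holds the rules, and the checkpoint produces the validation result.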

Does this introduce a breaking change? If yes, details on what can break

No

Author pre-publish checklist

  • Executed test to prove my fix is effective or new feature works
  • No PII in logs
  • Made corresponding changes to the documentation

Validation steps

  • Run the notebooks: 02_standardize, 03_transform

Issues Closed or Referenced

  • Closes #issue_number
  • References #issue_number

Adding linkedService: sywsdev77-WorkspaceDefaultSqlServer
Adding integrationRuntime: AutoResolveIntegrationRuntime
Adding linkedService: sywsdev77-WorkspaceDefaultStorage
Adding linkedService: Ls_KeyVault_01
Adding linkedService: Ls_AdlsGen2_01
Adding linkedService: Ls_Rest_MelParkSensors_01
Adding pipeline: P_Ingest_MelbParkingData
Adding dataset: Ds_REST_MelbParkingData
Adding dataset: Ds_AdlsGen2_MelbParkingData
Adding notebook: 02_standardize
Adding notebook: 03_transform
Adding trigger: T_Sched
Adding notebook: 00_setup
Adding notebook: 01a_explore
Adding notebook: 01b_explore_sqlserverless