etl

Here are 3,672 public repositories matching this topic...

We build an ETL pipeline using Airflow that does the following: downloads data from an AWS S3 bucket, runs a Spark/Spark SQL job on the downloaded data to produce a cleaned-up dataset of orders that missed their delivery deadline, and then uploads the cleaned-up dataset back to the same S3 bucket, in a folder primed for higher-level analytics.

  • Updated Feb 25, 2023
  • Python
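
The three stages described above (download, clean, upload) can be sketched in plain Python to show the shape of the pipeline. This is only an illustration under stated assumptions, not the repository's code: it uses no Airflow, Spark, or S3 calls, the in-memory CSV stands in for the bucket, and the column names (`order_id`, `promised_date`, `delivered_date`) and the `days_late` field are hypothetical.

```python
import csv
import io
from datetime import date

# Hypothetical sample data; the real dataset's schema is not shown in the source.
SAMPLE_ORDERS = """order_id,promised_date,delivered_date
1001,2023-01-10,2023-01-09
1002,2023-01-12,2023-01-15
1003,2023-01-20,
1004,2023-01-22,2023-01-22
"""


def extract(raw_csv):
    """Stand-in for the S3 download step: parse CSV text into row dicts."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Stand-in for the Spark SQL job: keep orders delivered after the promised date."""
    missed = []
    for row in rows:
        if not row["delivered_date"]:
            continue  # orders with no delivery date are skipped in this sketch
        promised = date.fromisoformat(row["promised_date"])
        delivered = date.fromisoformat(row["delivered_date"])
        if delivered > promised:
            missed.append(
                {"order_id": row["order_id"], "days_late": (delivered - promised).days}
            )
    return missed


def load(rows):
    """Stand-in for the S3 upload step: serialize the cleaned rows back to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["order_id", "days_late"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def run_pipeline(raw_csv):
    """Chain the three stages in the order an orchestrator would schedule them."""
    return load(transform(extract(raw_csv)))


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_ORDERS))
```

In an orchestrated version, each of the three functions would typically become its own task so that a failed stage can be retried independently of the others.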
