Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Superset is a Data Visualization and Data Exploration Platform
Learn how to design, develop, deploy and iterate on production-grade ML applications.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Workflow Engine for Kubernetes
The Data Engineering Cookbook
Always know what to expect from your data.
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
An orchestration platform for the development, production, and observation of data assets.
Roadmap to becoming a data engineer in 2021
The Open Source Feature Store for Machine Learning
Fancy stream processing made operationally mundane
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Implementing best practices for PySpark ETL jobs and applications.
Turns Data and AI algorithms into production-ready web applications in no time.
A collection of scientific methods, processes, algorithms, and systems to build stories & models. Whether you are a fresher in the field or an experienced professional who wants to transition into Data Science & AI