agutiernc/data-eng-zoomcamp

Data Engineering Zoomcamp 2024

This repository contains the projects, assignments, and code I worked on during the Data Engineering Zoomcamp offered by DataTalks.Club. The Zoomcamp is an online program that teaches the core skills and knowledge needed for a career in data engineering.

Course Overview

The Data Engineering Zoomcamp covers a wide range of topics, including data modeling, data pipelines, batch and stream processing, data warehousing, and various data engineering tools and technologies. Throughout the course, I gained hands-on experience with industry-standard tools such as Apache Spark, Apache Kafka, Docker, Mage, PostgreSQL, Redpanda, dbt Cloud, dlt, and more.

Repository Structure

The repository is organized into several folders, each representing a module or workshop covered in the Zoomcamp:

  • Module 1: Containerization and Infrastructure as Code
  • Module 2: Workflow Orchestration with Mage
  • Module 3: Data Warehouse and Big Data
  • Module 4: Analytics Engineering with dbt
  • Module 5: Batch Processing with Apache Spark
  • Module 6: Streaming Data with Apache Spark, Apache Kafka, and Redpanda
  • Workshop 1: Data Ingestion with dlt
  • Workshop 2: Stream Processing with RisingWave

Each folder contains the assignments, code examples, and documentation for its topic.

Learning Outcomes

Through this course, I built a solid understanding of data engineering principles and best practices. Key areas of learning include:

  • Data modeling techniques
  • Building robust and scalable data pipelines
  • Batch and stream processing with Apache Spark and Apache Kafka
  • Data warehousing concepts and implementation with cloud-based solutions
  • Containerization with Docker, workflow orchestration with Mage, and working with PostgreSQL
  • Proficiency in SQL, Python, and related data engineering tools

This repository showcases my work and demonstrates my proficiency with the data engineering tools and technologies listed above.

Contact

If you have any questions or would like to discuss my work further, feel free to reach out to me on LinkedIn.