data-engineering-pipeline

Two data frames from different Kaggle datasets covering disease cases and weather in Brazil. The project aims to clean both data frames and build a new one in order to analyse the correlation between dengue (a serious disease transmitted by mosquitoes), rainfall, and temperature.

python weather pipeline data-engineering health-data data-cleaning dengue dengue-cases data-engineering-pipeline diesease vitrinedev

Updated Apr 3, 2023

vspatil / citibike-data-pipeline

Analysis of NYC's Citi Bike data. Technologies: Python, Prefect, dbt, Terraform, Looker Data Studio

python docker bigquery data looker sql terraform gcp data-visualization dbt dataengineering dataanalysis prefect gcs-bucket data-engineering-pipeline

Updated Apr 19, 2023
Python

andersonesanto / igti-edd-m5-desafio

IGTI Data Engineer - Module 5 Final Challenge

docker airflow postgresql pandas data-engineering data-engineer data-engineering-pipeline igti engenharia-de-dados

Updated Dec 19, 2021
Jupyter Notebook

Savimbi / etl-batchprocess

Data ingestion solution using Spring Batch, with PostgreSQL as the data warehouse.

java spring-data postgresql spring-batch-jobs data-engineering-pipeline

Updated Mar 12, 2023
Java

MRLintern / Sparkify_ETL

ETL Pipeline for Music Analysis

python sql data-engineering etl-pipeline data-engineering-pipeline

Updated Sep 23, 2022
Jupyter Notebook

desanti / airflow-examples

Airflow pipelines - example code

python airflow examples data-engineer data-engineering-pipeline

Updated Sep 4, 2022
Python

NatanDuarte / sega_games_pipeline

Experimenting with Data Pipelines in Python

python data-engineering study-project data-engineering-pipeline

Updated Nov 1, 2023
Python

amva13 / monofeed

Cryptocurrency ticker data pipeline

data-science coinbase cryptocurrency data-engineering trading-platform data-engineering-pipeline

Updated Feb 16, 2023
Python

Cognizant-Technology-Innovation / lakehouseops-sra-for-databricks

The Security Reference Architecture (SRA) implements typical security features as Terraform Templates that are deployed by most high-security organizations, and enforces controls for the largest risks that customers ask about most often.

data-engineering databricks data-engineering-pipeline

Updated May 8, 2024
HCL

AyersAuthentic / Data_Engineering

Projects and exercises for the Udacity Data Engineering Nanodegree

data sql database etl data-engineering software-engineering data-engineering-pipeline

Updated Aug 25, 2021
HTML

ZahidGalea / data-engineering-in-gcp-challenge

terraform gcp data-engineering workflows data-engineering-pipeline github-actions

Updated Dec 30, 2021
Python

rivas-j / Big_Data_Marketing_Analysis-AWS-Spark-SQL

Builds a data pipeline with pgAdmin, AWS Cloud and Apache Spark to analyze and determine bias in Amazon Vine reviews

sql apache-spark aws-s3 postgresql python3 data-analysis aws-ec2 aws-rds etl-pipeline colab-notebook data-engineering-pipeline

Updated Jul 2, 2023
Jupyter Notebook

Ruth-Mwangi / youtube-data-etl

The purpose of the project is to efficiently collect, process, and store Twitter data using a combination of Apache Airflow, Apache Spark, and Amazon S3.

python data-science airflow apache-spark mongodb etl pyspark data-engineering apache-airflow data-engineering-pipeline

Updated Jan 4, 2024
Python

Susanhuynh / aws_etl_from_s3_to_redshift

Building an ETL pipeline for a database hosted on Redshift. Extracting data from S3 into staging tables on Redshift. Transforming data by executing SQL statements that build the analytics tables from these staging tables as a star schema. Loading the star schema tables into Redshift.

infrastructure-as-code data-modeling amazon-redshift amazon-s3 etl-pipeline data-engineering-pipeline star-schema-tables
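
The staging-then-star-schema flow described for aws_etl_from_s3_to_redshift can be illustrated with a minimal sketch. This is not the repository's code: it uses Python's built-in `sqlite3` as a stand-in for Redshift, an in-memory list as a stand-in for files in S3 (on Redshift the extract step would typically be a `COPY` from S3), and every table name, column name, and record below is hypothetical.

```python
import sqlite3

# Hypothetical raw records standing in for files extracted from S3.
RAW_ORDERS = [
    ("2023-01-01", "alice", "keyboard", 2),
    ("2023-01-01", "bob", "mouse", 1),
    ("2023-01-02", "alice", "mouse", 3),
]


def run_etl(conn, rows):
    """Load rows into a staging table, then build a small star schema."""
    cur = conn.cursor()

    # Extract: land the raw rows in a staging table without reshaping them.
    cur.execute(
        "CREATE TABLE staging_orders ("
        "order_date TEXT, customer TEXT, product TEXT, quantity INTEGER)"
    )
    cur.executemany("INSERT INTO staging_orders VALUES (?, ?, ?, ?)", rows)

    # Transform: derive one dimension table per descriptive attribute.
    cur.execute(
        "CREATE TABLE dim_customer ("
        "customer_id INTEGER PRIMARY KEY, customer TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE dim_product ("
        "product_id INTEGER PRIMARY KEY, product TEXT UNIQUE)"
    )
    cur.execute(
        "INSERT INTO dim_customer (customer) "
        "SELECT DISTINCT customer FROM staging_orders"
    )
    cur.execute(
        "INSERT INTO dim_product (product) "
        "SELECT DISTINCT product FROM staging_orders"
    )

    # Load: the fact table keeps measures plus keys into the dimensions.
    cur.execute(
        "CREATE TABLE fact_order ("
        "order_id INTEGER PRIMARY KEY, order_date TEXT, "
        "customer_id INTEGER REFERENCES dim_customer(customer_id), "
        "product_id INTEGER REFERENCES dim_product(product_id), "
        "quantity INTEGER)"
    )
    cur.execute(
        "INSERT INTO fact_order (order_date, customer_id, product_id, quantity) "
        "SELECT s.order_date, c.customer_id, p.product_id, s.quantity "
        "FROM staging_orders s "
        "JOIN dim_customer c ON c.customer = s.customer "
        "JOIN dim_product p ON p.product = s.product"
    )
    conn.commit()


def quantity_per_customer(conn):
    """Example analytics query: total quantity ordered by each customer."""
    rows = conn.execute(
        "SELECT c.customer, SUM(f.quantity) "
        "FROM fact_order f "
        "JOIN dim_customer c ON c.customer_id = f.customer_id "
        "GROUP BY c.customer ORDER BY c.customer"
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
run_etl(conn, RAW_ORDERS)
```

Calling `quantity_per_customer(conn)` afterwards queries the fact table through its customer dimension, which is the kind of join a star schema is meant to keep simple.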