Streaming data pipeline in AWS
Updated May 2, 2022 - Python
A data pipeline that runs ETL processes into AWS Redshift, using Spark and orchestrated by Apache Airflow.
A Java API that gathers historical cryptocurrency pricing data (via the CryptoCompare API) and makes predictions (via the AWS Machine Learning API).
The goal of this repository is to provide clear examples of AWS CLI commands, together with the AWS CDK, for easily creating AWS services and resources.
Platzi. School of Amazon Web Services. Redshift for Big Data management.
The goal of this project is to build a data pipeline for gathering real-time carpark lot availability and weather datasets from Data.gov.sg. The data are extracted via API and stored in an S3 bucket before being ingested into the data warehouse.
Projects for Udacity's Data Engineering Nanodegree
Load data from the Million Song Dataset into AWS Redshift.
Udacity Data Engineering Nanodegree Project #3.
Data Pipeline Analytics Platform is an end-to-end generic Big Data pipeline. It uses the following tech stack: AWS S3, AWS Redshift, AWS EMR Cluster, Apache Spark, Apache Airflow.
Remove duplicate entries from a Redshift cluster
Data Warehousing in AWS with Redshift
An implementation of a data warehouse on AWS Redshift. This project builds an ETL pipeline for a database hosted on AWS Redshift that extracts data from multiple JSON files residing in S3 buckets, stages it in Redshift, and transforms it into a set of dimensional tables for the analytics team to continue finding insights in…
A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from locally hosted Airflow containers. The end product is a Superset dashboard and a Postgres database, hosted on an EC2 instance at this address (powered down):
ETL pipeline with AWS Redshift orchestrated with Airflow
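Several of the projects listed above share one pattern: stage JSON files from S3 in Redshift, then clean up duplicate rows before building dimensional tables. The following is a minimal sketch of that pattern, not taken from any of the listed repositories. It only builds SQL strings, so it runs without a cluster. The function names, the table name, the bucket path, and the IAM role ARN are all hypothetical placeholders, and the deduplication approach (copy distinct rows to a temporary table, then reload) is one common option that assumes an unqualified table name and rows that are exact duplicates.

```python
from typing import List


def build_copy_statement(
    table: str,
    s3_path: str,
    iam_role_arn: str,
    region: str = "us-west-2",   # assumed region, adjust as needed
    json_option: str = "auto",   # 'auto' maps JSON keys to column names
) -> str:
    """Build a Redshift COPY statement that loads JSON files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON '{json_option}'\n"
        f"REGION '{region}';"
    )


def build_dedup_statements(table: str) -> List[str]:
    """Build statements that remove exact duplicate rows from a table.

    Distinct rows are copied to a temporary table, the original table is
    emptied with DELETE (so the work stays inside one transaction), and
    the distinct rows are inserted back.
    """
    tmp = f"{table}_dedup"  # hypothetical temp table name
    return [
        "BEGIN;",
        f"CREATE TEMP TABLE {tmp} AS SELECT DISTINCT * FROM {table};",
        f"DELETE FROM {table};",
        f"INSERT INTO {table} SELECT * FROM {tmp};",
        f"DROP TABLE {tmp};",
        "COMMIT;",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_copy_statement(
        "staging_events",
        "s3://example-bucket/log_data",
        "arn:aws:iam::123456789012:role/example-redshift-role",
    ))
    for statement in build_dedup_statements("staging_events"):
        print(statement)
```

The generated statements could be sent to the cluster with any PostgreSQL-compatible client or from an Airflow task. Table names are interpolated directly into the SQL here, so they should come from trusted configuration rather than user input.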