Build clickstream analytics on AWS for your mobile and web applications
Updated May 27, 2024 - TypeScript
This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics.
Data pipeline from RDBMS to AWS
Data Pipelines with Airflow
Data Warehousing with AWS
This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
Redshift Python Connector. It supports Python Database API Specification v2.0.
Building ETL pipelines to migrate music JSON data and metadata files (semi-structured data) into a relational database hosted on an AWS Redshift cluster
Virtual Schema for connecting Redshift as a data source to Exasol
Build an ETL pipeline for a database hosted on AWS Redshift.
Load local files into AWS Redshift using Python and unleash insights with Power BI
PySpark RDD and DataFrame Examples
Scan databases and data warehouses for PII. Tag tables and columns in data catalogs like Amundsen and DataHub
ETL data pipelining using AWS
A data pipeline that conducts ETL processes to AWS Redshift, utilizing Spark and coordinated by Apache Airflow.
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
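The Redshift Python Connector entry above mentions Python Database API Specification v2.0. As a rough illustration of what that interface looks like in practice, here is a minimal sketch that uses the standard library's sqlite3 module as a stand-in, since it also implements DB-API 2.0 and runs without a cluster. The `events` table, its columns, and the `load_and_count` helper are hypothetical names chosen for this example. Swapping in a Redshift connection is an assumption that would need checking against that driver's documentation, in particular its `paramstyle`, which the specification allows to differ between drivers.

```python
import sqlite3
from contextlib import closing


def load_and_count(conn, rows):
    """Insert rows and return per-action counts using only DB-API 2.0 calls.

    `conn` is any DB-API 2.0 connection. The "?" placeholders below match
    sqlite3's qmark paramstyle; other drivers may expect a different style.
    """
    with closing(conn.cursor()) as cur:
        # Hypothetical table used only for this illustration.
        cur.execute("CREATE TABLE events (user_id INTEGER, action TEXT)")
        cur.executemany(
            "INSERT INTO events (user_id, action) VALUES (?, ?)", rows
        )
        conn.commit()
        cur.execute(
            "SELECT action, COUNT(*) FROM events GROUP BY action ORDER BY action"
        )
        return cur.fetchall()


# sqlite3 stands in for a warehouse connection so the sketch is runnable locally.
conn = sqlite3.connect(":memory:")
result = load_and_count(conn, [(1, "click"), (2, "click"), (1, "view")])
conn.close()
print(result)
```

The helper only touches `cursor()`, `execute()`, `executemany()`, `commit()`, and `fetchall()`, all of which are defined by the specification, so the same shape should carry over to other compliant drivers once the placeholder style and SQL dialect are adjusted.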