Data pipeline using S3, Glue, Athena, Lambda and QuickSight to analyze a YouTube dataset

shreeyajoshi2013/AWS_Data_Engineering_YouTube_Data

AWS Data Engineering Project - YouTube Data Analysis

The goal of this project is to help an imaginary customer launch her marketing campaign by preparing data that is ready to be analyzed for the top category of trending YouTube videos.

Dataset : link

Highlights

  • Data ingestion
  • Data cleansing
  • Data transformation
  • Data Catalog
  • Data querying and analysis
  • ETL
  • Scalability through a Lambda function trigger
  • Data partitioning
  • Visualization - BI Dashboard
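As a rough illustration of the data partitioning highlight, the sketch below builds an S3 object key with a Hive-style partition folder, a layout that Glue crawlers and Athena can recognize as a partition column. The prefix `youtube/raw_statistics`, the partition column `region`, and the file name `CAvideos.csv` are assumptions made for this example and are not taken from the project.

```python
def partitioned_key(prefix: str, region: str, filename: str) -> str:
    """Build an S3 object key that uses a Hive-style partition folder.

    The partition column name ("region") is a hypothetical choice for this sketch.
    """
    # Lower-case the region code so partition values stay consistent across uploads.
    return f"{prefix.strip('/')}/region={region.lower()}/{filename}"


# Hypothetical prefix and file name, used only to show the resulting layout.
print(partitioned_key("youtube/raw_statistics", "CA", "CAvideos.csv"))
```

Keeping the partition value in the key itself means queries that filter on that column only need to scan the matching folder.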

AWS services used:

  • Extraction:
    • CLI
    • S3
    • IAM
  • Transformation:
    • S3
    • Lambda (with Trigger)
    • Glue (Crawler, Database, ETL Job)
    • Athena
    • IAM
  • Loading:
    • S3
    • Glue (Database, ETL Job)
    • IAM
  • Business Intelligence:
    • S3
    • Glue (Database)
    • QuickSight
    • IAM
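To make the "Lambda (with Trigger)" entry more concrete, here is a minimal sketch of a handler that reads the bucket and object key from an S3 event notification. It is not the project's actual function: the helper name `extract_s3_objects` and the return value are hypothetical, and the transformation step is left as a comment. The event layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys) follows the standard S3 notification format.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for every record in an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event, so decode them first.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Hypothetical entry point: list the objects that triggered this invocation."""
    objects = extract_s3_objects(event)
    # A real handler would read each object here, clean it, and write the result
    # back to S3 (for example as Parquet); that step is omitted in this sketch.
    return {"processed": [f"s3://{bucket}/{key}" for bucket, key in objects]}
```

Because S3 invokes the function for each new upload, newly arriving files are processed without a manual step.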

File Formats Handled:

  • CSV
  • JSON
  • Parquet
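Moving from nested JSON to a columnar format such as Parquet usually requires flattening the records first. The sketch below shows one way to do that with the standard library only. The input layout (a top-level `items` array whose entries carry an `id` and a nested `snippet` object) is an assumption about the reference data, and the function name and output column names are hypothetical.

```python
import json
from typing import Dict, List


def flatten_categories(raw_json: str) -> List[Dict[str, object]]:
    """Flatten a nested category document into one flat row per item.

    Assumes a top-level "items" array with a nested "snippet" object per entry.
    """
    document = json.loads(raw_json)
    rows = []
    for item in document.get("items", []):
        snippet = item.get("snippet", {})
        rows.append({
            "id": item.get("id"),
            "snippet_title": snippet.get("title"),
            "snippet_assignable": snippet.get("assignable"),
        })
    return rows


# Made-up sample input, only to show the shape of the result.
sample = '{"items": [{"id": "1", "snippet": {"title": "Film & Animation", "assignable": true}}]}'
print(flatten_categories(sample))
```

Flat rows like these map directly onto table columns, which is what a Glue Data Catalog table and Athena queries expect.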

Other Tools

  • Miro board - used for the data pipeline diagram
