ayanhussain81/Olympics-Data-ETL

Tokyo Olympic Azure Data Engineering Project

Overview

This project builds a data engineering pipeline for Tokyo Olympic Games data on four Azure services: Data Lake Gen2, Data Factory, Databricks, and Synapse Analytics. The pipeline ingests raw data, transforms it into curated datasets, and loads it into a warehouse where it can be queried and analyzed.

Technologies Used

  • Azure Data Lake Gen2: Storage for raw and processed data.
  • Azure Data Factory: Orchestration and automation of data workflows.
  • Azure Databricks: Advanced analytics and data transformation.
  • Azure Synapse Analytics: Data warehousing and analytics.

Project Structure

  1. Data Ingestion: Raw data from various sources is ingested into Data Lake Gen2.
  2. ETL Pipeline: Data is processed and transformed using Azure Data Factory, leading to curated datasets.
  3. Advanced Analytics: Complex analytics and transformations are performed in Azure Databricks.
  4. Data Warehousing: Synapse Analytics stores the curated data in a scalable warehouse and serves queries against it.
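
The stages above read from and write to locations in Data Lake Gen2, which are addressed with `abfss://` URIs. The following is a minimal sketch of how such paths might be built, not the repository's actual layout: the account name `examplestorage`, the container `olympics`, the zone names `raw` and `processed`, and the file `athletes.csv` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Zone names are an assumption; the README only mentions "raw and processed data".
ZONES = ("raw", "processed")


@dataclass(frozen=True)
class LakeLayout:
    account: str    # hypothetical storage account name
    container: str  # hypothetical container (filesystem) name

    def uri(self, zone: str, dataset: str) -> str:
        """Return the abfss:// URI for a dataset inside one zone of the lake."""
        if zone not in ZONES:
            raise ValueError(f"unknown zone: {zone!r}")
        host = f"{self.account}.dfs.core.windows.net"
        return f"abfss://{self.container}@{host}/{zone}/{dataset}"


layout = LakeLayout(account="examplestorage", container="olympics")
raw_path = layout.uri("raw", "athletes.csv")
processed_path = layout.uri("processed", "athletes")
print(raw_path)
print(processed_path)
```

Keeping path construction in one place means ingestion, transformation, and warehousing steps can agree on locations without repeating string literals.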

Setup Instructions

  1. Azure Account: Ensure you have an active Azure account.
  2. Azure Resources: Create the required Azure resources: Data Lake Gen2, Data Factory, Databricks, and Synapse Analytics.
  3. Configuration: Update configuration files with your Azure credentials and project-specific details.
  4. Run Pipelines: Execute the Data Factory pipelines for ETL, monitor the Databricks jobs, and query the results in Synapse Analytics.
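
For the configuration step, one option is to read credentials from environment variables instead of committing them to a file. The sketch below assumes a service principal whose details are exposed as `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET`; `STORAGE_ACCOUNT_NAME` and the `load_settings` helper are hypothetical names, and the repository's own configuration format may differ.

```python
import os
from typing import Dict, Mapping

# Names the sketch expects; STORAGE_ACCOUNT_NAME is a hypothetical, project-specific setting.
REQUIRED = (
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "AZURE_CLIENT_SECRET",
    "STORAGE_ACCOUNT_NAME",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect required settings, failing early with the list of missing names."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}


# Demonstration with placeholder values rather than real credentials.
demo_env = {name: "placeholder" for name in REQUIRED}
settings = load_settings(demo_env)
print(sorted(settings))
```

Failing before any pipeline runs makes a missing or empty credential visible immediately, rather than partway through an ETL job.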

Usage

  • Follow the documentation provided in the 'docs' directory for detailed instructions on setting up, running, and maintaining the project.
  • For any issues or inquiries, refer to the 'issues' section in this repository.

Contribution

Contributions are welcome! Please follow the guidelines in the 'CONTRIBUTING.md' file.

License

This project is licensed under the MIT License.


Feel free to reach out for any questions or clarifications.

Happy coding!
