Garcel/nyc-collision-elt

nyc-collision-elt

🕵️‍♂️ Overview

ELT for the New York City (NYC) collision dataset.

About me

Please visit my profile 😀

About the project

I originally conceived this project in 2022, while working at a data analytics company, but other commitments kept me from completing it.

I've been thinking a lot about finishing it lately, so I've decided to go ahead.

It fascinates me and serves as a good illustration of how a straightforward data integration can be carried out.

I sincerely hope you find this as fascinating as I do, and any help would be appreciated.

Why an ELT?

First and foremost, I am aware that a simple Jupyter notebook would have sufficed for this project. However, the objective is to develop a more intricate data integration process.

Although I have previous experience in a data analysis environment, I must admit that I do not have extensive knowledge of data integration. I therefore recommend consulting additional resources for a comprehensive understanding of the subject. :)

When designing this project, I considered both the ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) approaches.

The key distinction is where the transformations run. ELT loads the raw data first and transforms it directly within the data warehouse. In contrast, ETL transforms the data in a separate staging area before it is loaded into the warehouse.

Because I prefer not to manage multiple systems for data storage, I decided to stick with the ELT approach.
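
To make the load-then-transform order concrete, here is a minimal sketch of the ELT idea. It is not this project's actual pipeline. It assumes an in-memory SQLite database as a stand-in for a real warehouse, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw rows: values arrive as strings, as they might from a CSV export.
RAW_ROWS = [
    ("2023-01-01", "BROOKLYN", "2"),
    ("2023-01-01", "QUEENS", "0"),
    ("2023-01-02", "BROOKLYN", "1"),
]


def extract():
    """Extract: return the source rows untouched."""
    return list(RAW_ROWS)


def load(conn, rows):
    """Load: copy the raw rows into the warehouse without transforming them."""
    conn.execute(
        "CREATE TABLE raw_collisions (crash_date TEXT, borough TEXT, injured TEXT)"
    )
    conn.executemany("INSERT INTO raw_collisions VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform: build a derived table with SQL that runs inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE injuries_by_borough AS
        SELECT borough, SUM(CAST(injured AS INTEGER)) AS total_injured
        FROM raw_collisions
        GROUP BY borough
        """
    )


def run_elt():
    """Run extract, load, transform in that order and return the derived rows."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for a real warehouse
    load(conn, extract())
    transform(conn)
    rows = conn.execute(
        "SELECT borough, total_injured FROM injuries_by_borough ORDER BY borough"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_elt())
```

In an ETL variant, the type casting and aggregation would happen in Python (or a staging store) before `load` is called, so the warehouse would only ever hold the transformed table.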

🗄️ Data sources

⚙️ The ELT

ELT diagram

📈 The Dashboard

In progress…

💻 Development environment

I'll be using Docker to set up the development environment, since I'm used to it and I like it a lot.

📖 Prerequisites

You must have these installed on your system:

  • Python 3.12 or Docker.

  • pre-commit.

🚀 Running it

docker-compose up

✅ Running the tests

Python

pytest

Docker alternative

docker build -t nyc . && docker run -v .:/app --rm nyc pytest

🆘 Contributing

I would love your contributions, and I'll do my best to provide you with mentorship and support. If you are looking for an issue to tackle, take a look at issues labeled Good first issue.

Get more details in the Contributing Guide.

🛡️ Security

Please do not open a regular issue to report a security issue.

See the Security Policy to learn more about the procedure.

🪪 License

Apache License 2.0.

✍🏼 Author

June 17th, 2023.