- Student Name: Wanxuan Zhang
- Student ID: 1079686
- Due Date: Friday 13th of August 11:59:00 am (AEST).
- Report Link: https://www.overleaf.com/project/60ceae2418c1c92954bd0a6e
- Language: Python 3.9
- Packages / Libraries: pandas, sklearn, statsmodels, os, urllib, folium, numpy, matplotlib, glob, geopandas, datetime, seaborn, warnings
- NYC TLC: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
- External dataset 1: https://www.ncdc.noaa.gov/cdo-web/datasets/GHCND/stations/GHCND:USW00094728/detail
- raw_data: Contains all the raw data files.
- preprocessed_data: Contains all the preprocessed data files.
- plots: Output plots.
- code: Keep all notebooks and scripts in this folder. Ensure that you have notebooks for each stage of code. Here's an example:
  - Notebook 1 for "Preprocessing"
  - Notebook 2 for "Feature Engineering & Visualisation"
  - Notebook 3 for "Statistical Model"
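
The folder layout described above could be set up with a short helper like the one below. This is only a sketch: the folder names come from this README, while the function name `ensure_project_dirs` and the choice of project root are assumptions made for illustration.

```python
from pathlib import Path

# Folder names taken from this README; the project root is an assumption.
PROJECT_DIRS = ("raw_data", "preprocessed_data", "plots", "code")


def ensure_project_dirs(root="."):
    """Create any missing project folders under `root` and return their paths."""
    root_path = Path(root)
    folders = []
    for name in PROJECT_DIRS:
        folder = root_path / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

Calling `ensure_project_dirs()` from the repository root would create the four folders if they do not exist yet and leave existing ones untouched.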
Run the notebooks in this order: Preprocessing -> Feature Engineering & Visualisation -> Statistical Model.
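
If you prefer to execute the notebooks from a script rather than opening them one by one, something like the following may work. It is a sketch under assumptions: the notebook filenames are hypothetical placeholders (this README does not state them), and it relies on the `jupyter nbconvert` command being installed, which is not in the package list above.

```python
import subprocess

# Hypothetical filenames: replace with the actual notebook names in the code folder.
NOTEBOOKS = [
    "code/1_preprocessing.ipynb",
    "code/2_feature_engineering_visualisation.ipynb",
    "code/3_statistical_model.ipynb",
]


def build_command(notebook):
    """Return the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_all(notebooks=NOTEBOOKS):
    """Execute the notebooks one after another, stopping at the first failure."""
    for notebook in notebooks:
        # check=True raises CalledProcessError if a notebook fails to execute.
        subprocess.run(build_command(notebook), check=True)
```

Calling `run_all()` from the repository root runs the three stages in sequence. Stopping at the first failure matters here because each stage presumably depends on the files written by the previous one.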