samirsaci/ml-forecast-features-eng

Machine Learning for Retail Sales Forecasting — Features Engineering 📈

Understand the impact of additional features related to stock-outs, store closing dates, or cannibalization on a Machine Learning model for sales forecasting.

Based on feedback from the latest Makridakis Forecasting Competitions, Machine Learning models can reduce forecasting error by 20% to 60% compared to benchmark statistical models.

Their major advantage is the capacity to include external features that heavily impact the variability of your sales.

For example, e-commerce cosmetics sales are driven by special events (promotions) and by how you advertise a reference on the website (first page, second page, …).

This process, called feature engineering, relies on analytical concepts and business insights to understand what could drive your sales.

Article

In this article, we will try to understand the impact of several features on the accuracy of a model, using the M5 Forecasting competition dataset.

Experiment

Based on business insights or common sense, we will add features built from existing ones to help the model capture the key factors that impact customer demand.
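To make the idea of building features from existing ones concrete, here is a minimal sketch using pandas. It is not the notebook's implementation: the long-format columns (`store_id`, `item_id`, `date`, `sales`), the function name `add_demand_features`, and both heuristics (a store is treated as closed when nothing sells that day, and an item is treated as out of stock after `min_zero_run` consecutive zero-sales days while the store is open) are assumptions chosen for illustration.

```python
import pandas as pd


def _zero_run_length(is_zero: pd.Series) -> pd.Series:
    """Length of the zero-sales run each row belongs to (0 outside a run)."""
    z = is_zero.astype(int)
    # A new run starts every time the flag changes value.
    run_id = z.ne(z.shift()).cumsum()
    # Summing the flag inside a run gives its length for zero runs, 0 otherwise.
    return z.groupby(run_id).transform("sum")


def add_demand_features(df: pd.DataFrame, min_zero_run: int = 3) -> pd.DataFrame:
    """Add hypothetical `store_closed` and `stock_out` flags to daily sales."""
    out = df.sort_values(["store_id", "item_id", "date"]).copy()

    # Assumption: if no item sold in a store on a given day, the store was closed.
    store_total = out.groupby(["store_id", "date"])["sales"].transform("sum")
    out["store_closed"] = (store_total == 0).astype(int)

    # Zero sales on a day when the store was open.
    out["_zero_open"] = out["sales"].eq(0) & out["store_closed"].eq(0)
    run_len = out.groupby(["store_id", "item_id"])["_zero_open"].transform(
        _zero_run_length
    )

    # Assumption: a long enough run of zero sales signals a stock-out.
    out["stock_out"] = (run_len >= min_zero_run).astype(int)
    return out.drop(columns="_zero_open")


# Toy data (made up for this example): one store, two items, six days.
dates = pd.date_range("2016-01-01", periods=6)
toy = pd.DataFrame(
    {
        "store_id": ["S1"] * 12,
        "item_id": ["A"] * 6 + ["B"] * 6,
        "date": list(dates) * 2,
        "sales": [5, 0, 0, 0, 4, 0, 2, 3, 1, 2, 1, 0],
    }
)
features = add_demand_features(toy)
print(features)
```

In this toy data, item A is flagged as out of stock on days 2 to 4, and the last day is flagged as a store closure because neither item sold. The thresholds are arbitrary and would need to be tuned against the real data.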

Data set

This analysis is based on the M5 Forecasting dataset of Walmart store sales records (Link).

Code

  1. Create a folder named Data in the directory where the notebook is located.
  2. Download all the files of the Kaggle forecasting competition (Link) into that folder.
  3. Launch the notebook.
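Before launching the notebook, a quick check that the data is in place can save a failed run. The sketch below is an illustrative helper, not part of the repository: the function name `missing_m5_files` is hypothetical, and the list of file names assumes the main CSV files of the M5 competition download. The notebook may need a different subset.

```python
from pathlib import Path

# Assumed file names from the M5 competition download; adjust to what the notebook reads.
EXPECTED_FILES = [
    "calendar.csv",
    "sales_train_validation.csv",
    "sell_prices.csv",
]


def missing_m5_files(data_dir: str = "Data") -> list[str]:
    """Return the expected files that are not present in `data_dir`."""
    folder = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_m5_files("Data")
    if missing:
        print("Missing files in Data/:", ", ".join(missing))
    else:
        print("All expected files found.")
```

Run it from the directory that contains the notebook, so that the relative path `Data` resolves to the folder created in step 1.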

About me 🤓

Senior Supply Chain Engineer with international experience in Logistics and Transportation operations.
Have a look at my portfolio: Data Science for Supply Chain Portfolio
Data Science for Warehousing📦, Transportation 🚚 and Demand Forecasting 📈