Add Developments folder via upload #286

Open. Wants to merge 22 commits into base: main

facundoallia

The Developments folder contains the Ensemble notebook.

review-notebook-app bot commented Nov 9, 2022

MMenchero commented on 2022-11-09T16:54:13Z
----------------------------------------------------------------

Suggested title: How to do an ensemble model for time series forecasting.

StatsForecast allows you to create ensemble models in a very easy way. First, we need to import the data that we are going to use, in this case the M4 dataset. After that, we'll generate the forecasts via the generate_forecast() function. Finally, we'll create the ensemble models using the previously generated forecasts. In this notebook we'll implement and benchmark an ensemble model of AutoARIMA, AutoETS and AutoCES:
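
As a rough illustration of the last step, here is a minimal sketch of a median ensemble built from forecasts that already exist. It assumes the ensemble is the row-wise median of the model columns (suggested by the `Ensemble_median` column name used later in this review) and uses made-up numbers; the notebook's actual code may differ.

```python
import pandas as pd

# Toy forecasts for one series and two horizons (made-up values, for illustration only)
fcst = pd.DataFrame({
    'unique_id': ['Y1', 'Y1'],
    'ds': [1, 2],
    'AutoARIMA': [10.0, 12.0],
    'ETS': [11.0, 11.0],
    'CES': [15.0, 10.0],
})

# Assumption: the ensemble is the row-wise median of the individual model forecasts
model_cols = ['AutoARIMA', 'ETS', 'CES']
fcst['Ensemble_median'] = fcst[model_cols].median(axis=1)
```

The median is less sensitive than the mean to one model producing an extreme forecast, which is one common reason to prefer it for small ensembles.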


review-notebook-app bot commented Nov 9, 2022

MMenchero commented on 2022-11-09T16:54:13Z
----------------------------------------------------------------

Let's try to summarize the information in a table:

| Frequency | Min # observations in training set | Forecasting horizon |
|---|---|---|
| Yearly | 13 | 6 |


MMenchero commented on 2022-11-09T16:54:14Z
----------------------------------------------------------------

Give a brief introduction to this section, saying that we'll now generate the forecasts and the ensemble for every frequency.


review-notebook-app bot commented Nov 9, 2022

MMenchero commented on 2022-11-09T16:54:15Z
----------------------------------------------------------------

Use overall instead of total.


review-notebook-app bot commented Nov 9, 2022

MMenchero commented on 2022-11-09T16:54:16Z
----------------------------------------------------------------

Expand the conclusions and add the table comparing our results with the other competitors. Mention the ease of use of StatsForecast for generating multiple models in one go and for creating the ensembles. 


review-notebook-app bot commented Nov 15, 2022

MMenchero commented on 2022-11-15T07:20:37Z
----------------------------------------------------------------

This table needs to be checked. For Naive2, the following values were obtained:

  • sMAPE = 13.564
  • MASE = 1.912

Therefore

OWA = 1/2*(sMAPE_Nixtla/sMAPE_Naive2 + MASE_Nixtla/MASE_Naive2) = 0.841 != 0.853

Using the Naive2 values above, for the first place in M4 we get

OWA = 1/2*(11.374/13.564 + 1.536/1.912) = 0.8209

This is the value that appears in the results table:

https://www.sciencedirect.com/science/article/pii/S0169207019301128

I also suggest adding the authors of the methods, since several affiliations are listed as Individual.
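
The check above can be reproduced with a small helper. This is only a sketch: the function name `owa` is hypothetical, and the inputs are the Naive2 and M4 first-place values quoted in this comment.

```python
def owa(smape, mase, smape_naive2, mase_naive2):
    """Overall Weighted Average: mean of the sMAPE and MASE ratios relative to Naive2."""
    return 0.5 * (smape / smape_naive2 + mase / mase_naive2)

# Naive2 reference values quoted in this comment
SMAPE_NAIVE2 = 13.564
MASE_NAIVE2 = 1.912

# M4 first place, using the sMAPE and MASE quoted in this comment
winner_owa = owa(11.374, 1.536, SMAPE_NAIVE2, MASE_NAIVE2)
print(round(winner_owa, 4))  # 0.8209
```

Passing the notebook's own sMAPE and MASE to the same helper would show whether 0.841 or 0.853 belongs in the table.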


MMenchero commented on 2022-11-18T06:27:51Z
----------------------------------------------------------------

Line #1.    # Get trarin dataframe:

Typo: "trarin" should be "train". The same typo seems to appear in the other frequencies as well.


MMenchero commented on 2022-11-18T06:27:52Z
----------------------------------------------------------------

I think we can compute the accuracy of every frequency and model more efficiently since the code for it seems to be the same, just with different names. One way of doing this is with this function. The only argument it takes is the name of the frequency.

import numpy as np
import pandas as pd
from datasetsforecast.m4 import M4Evaluation

def compute_accuracy(freq):
  """Compute the M4 accuracy metrics of every model for a given frequency."""
  # Load the forecasts saved for this frequency, e.g. 'yearly_df_forecast.csv'
  data = pd.read_csv(freq.lower() + '_df_forecast.csv')
  data = data.drop(columns=['Unnamed: 0'])  # add 'Unnamed: 0.1' if required
  # Replace the timestamps with the forecast step (1..h) within each series
  data['ds'] = data.groupby('unique_id')['ds'].transform(lambda s: np.arange(1, len(s) + 1))
  # Reshape to one row per (series, model) and one column per forecast step
  data = pd.melt(data, id_vars=['unique_id', 'ds'], var_name='model')
  data = pd.pivot(data, index=['unique_id', 'model'], columns='ds', values='value').reset_index()

  models = ['Ensemble_median', 'AutoARIMA', 'ETS', 'CES', 'AutoTheta']
  res = {}

  for model in models:
    # Keep one model, index by series id, and drop the non-forecast column
    df = data[data['model'] == model]
    df = df.set_index('unique_id').drop(columns='model')
    # Rows are sorted by series id before being passed to the evaluator
    res[model] = M4Evaluation.evaluate('data2', freq, df.sort_index().values)

  total_metrics = pd.concat([res[m] for m in models])
  total_metrics['model'] = models

  return total_metrics

We can call it using

compute_accuracy('Yearly')

This should produce the same table as above.
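
To see what the reshape inside the function produces, here is a small sketch on a toy frame with made-up values. It uses `groupby(...).cumcount() + 1` in place of the `transform` with `np.arange`, which should give the same step numbers; the column names follow the ones used above.

```python
import pandas as pd

# Toy forecasts: two series, two steps each, two model columns (made-up numbers)
toy = pd.DataFrame({
    'unique_id': ['Y1', 'Y1', 'Y2', 'Y2'],
    'ds': [2001, 2002, 2001, 2002],
    'AutoARIMA': [1.0, 2.0, 3.0, 4.0],
    'ETS': [1.5, 2.5, 3.5, 4.5],
})

# Replace the timestamps with the forecast step (1..h) within each series
toy['ds'] = toy.groupby('unique_id').cumcount() + 1

# Long format, then one row per (series, model) and one column per step
long_df = pd.melt(toy, id_vars=['unique_id', 'ds'], var_name='model')
wide = long_df.pivot(index=['unique_id', 'model'], columns='ds', values='value').reset_index()

# Matrix of forecasts for a single model: rows are series, columns are steps
y_hat = (
    wide[wide['model'] == 'ETS']
    .set_index('unique_id')
    .drop(columns='model')
    .sort_index()
    .values
)
```

Here `y_hat` has one row per series and one column per forecast step, which is the array passed to `M4Evaluation.evaluate` in the function above.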

I think the following are good practices to keep in mind: 

- Keep the variable names as short as possible. 

- If you need to do a process more than once, write a function for it if possible. 


