
model.predict gives different forecast depending on forecast_length #199

sebros-sandvik opened this issue Sep 26, 2023 · 5 comments

@sebros-sandvik

model.predict(forecast_length=n) and model.predict(forecast_length=m) give me different forecast values depending on the forecast length. What am I missing — is this expected behavior, a mistake on my part, or a bug?

code:

model = AutoTS(
    forecast_length=3,
    frequency='MS',
    ensemble='all',
    model_list="best",
    n_jobs="auto",
    transformer_list="fast",
    holiday_country="PL",
    max_generations=4,
    num_validations=1,
    verbose=0,
)
model = model.fit(
    df,  # multiple time series, end date 2023-08-01
    date_col='Date',
    value_col='Sales',
    id_col='Customer Number',
)

(Two screenshots comparing the differing forecast outputs were attached here.)

Thanks in advance!

@winedarksea
Owner

Understanding the expected behavior is a bit complicated.

When AutoTS.fit() runs, it selects the final model but does not train it on the full data (cross validation does train models, but only on the validation samples, never on the full dataset).

When AutoTS.predict() runs, it trains/fits the final model on the full dataset. Some models depend on the forecast length: for example, regression models that output the full forecast horizon at once will likely produce different outputs for different forecast lengths, while many other models should give the same values regardless of the horizon. Overall, it depends on which models you are using. If you need the same results across forecast lengths, try limiting the model list.
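To illustrate the distinction with a toy example (hypothetical stand-in models, not AutoTS internals): a recursive one-step model produces the same leading values for any horizon, while a "direct" model whose fitted mapping depends on the horizon generally does not.

```python
# Toy illustration (not AutoTS code): why forecast values can depend on
# the requested forecast_length for some model families.

def recursive_forecast(history, horizon):
    """One-step model applied repeatedly: next value = 0.9 * last value."""
    out, last = [], history[-1]
    for _ in range(horizon):
        last = 0.9 * last
        out.append(last)
    return out

def direct_forecast(history, horizon):
    """Direct multi-horizon model: here it forecasts the mean of the last
    `horizon` observations, so its output depends on the horizon itself."""
    window = history[-horizon:]
    level = sum(window) / len(window)
    return [level] * horizon

history = [10.0, 12.0, 11.0, 13.0, 12.5, 14.0]

# The recursive model's 6-step forecast starts with exactly its 3-step forecast...
assert recursive_forecast(history, 6)[:3] == recursive_forecast(history, 3)

# ...but the direct model gives different values for horizons 3 and 6.
print(direct_forecast(history, 3))
print(direct_forecast(history, 6))
```

The same prefix-consistency question is what distinguishes the model families the maintainer describes: only the second kind changes its near-term values when you ask for a longer horizon.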

On a related note, there is a selection of update_fit models that allow .fit_data followed by .predict to update on new data without rerunning training, which is faster and also useful for consistency.
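A rough, untested sketch of that workflow (it assumes fit_data takes a DataFrame in the same long format as fit — check the AutoTS docs for the exact signature and for which models support update_fit):

```python
from autots import AutoTS

model = AutoTS(forecast_length=3, frequency='MS')
model = model.fit(df, date_col='Date', value_col='Sales',
                  id_col='Customer Number')
prediction = model.predict()

# Later, when new observations arrive: refresh the data without
# rerunning the model search (update_fit models only).
model.fit_data(df_updated)  # signature assumed; see AutoTS docs
new_prediction = model.predict()
```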

@sebros-sandvik
Author

Thanks for getting back so fast, and for a clear answer. What you are suggesting seems fair, but I wonder if it is right for me.

Here's the rub: I need forecasts for 18 months ahead, but they should be optimized for 3 months ahead.
Could I do the following:

1.) Train with AutoTS(forecast_length = 3); model = model.fit(all_data)
2.) new_data = model_forecast(all_data, model, forecast_length = 3)
3.) all_data = concat(all_data, new_data)
4.) new_data = model_forecast(all_data, model)
5.) repeat until I have 18 months ahead forecast (6 loops)
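The loop in steps 1–5 can be sketched as follows (make_forecast is a toy stand-in for a call to autots.model_forecast with a fitted model; the names are illustrative):

```python
# Sketch of the proposed iterative extension: forecast 3 steps, append
# them to the training data, and repeat until 18 steps are covered.

def make_forecast(history, forecast_length=3):
    """Toy stand-in for model_forecast: damped continuation of last value."""
    out, last = [], history[-1]
    for _ in range(forecast_length):
        last = 0.95 * last
        out.append(last)
    return out

def extend_forecast(history, total_horizon=18, step=3):
    data = list(history)
    forecast = []
    while len(forecast) < total_horizon:
        new_points = make_forecast(data, forecast_length=step)  # step 2
        forecast.extend(new_points)
        data.extend(new_points)  # step 3: concat forecast onto the data
    return forecast[:total_horizon]

result = extend_forecast([100.0, 102.0, 101.0], total_horizon=18, step=3)
assert len(result) == 18  # 6 loops of 3 steps each
```

One caveat with this scheme: each loop feeds the model its own uncertain forecasts back in as if they were observations, so errors compound over the 18-month horizon.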

Thanks again!

Best,

Seb

@winedarksea
Owner

I would just stick with the original plan: AutoTS(forecast_length=3), then .predict(forecast_length=18). If the minor variation concerns you (it really doesn't concern me, since forecasts are highly uncertain by definition and some variation is to be expected even between similar models), then choose a model_list of models that won't change based on forecast_length. I can suggest some if you want.

@sebros-sandvik
Author

Thank you sir, that would be very helpful!

(The variations are minor in the examples above, but major when considering all the time series in the data.) I understand this could also be a convergence issue.

A job well done on the package, thank you kindly for taking the time and answering!

@winedarksea
Owner

Thanks. If you continue to see "major" variations, feel free to post more. It's possible there is a bug in the specific model being used; in that case, sharing the model would help (for an ensemble, that's largely a large chunk of JSON).
