Using autots model for predicting,not forecasting #191

Open
OMERTBITU opened this issue Jul 19, 2023 · 5 comments

Comments

@OMERTBITU

Hello.
I spent some time with autots for my time series prediction task. It seems to be designed only for forecasting the next "n" timesteps from past data. However, I need to find the optimum model and then use it over the same time interval as the training dataset, because my task is predicting some graphs which cover the same time interval but have varying input features. I am not sure whether this is possible, but I am curious about it.

Thanks in advance.

@winedarksea
Owner

winedarksea commented Jul 19, 2023

If I understand you correctly, it sounds like the only model here that will work is Cassandra, which was designed for explainability. You could optimize it with the AutoTS class (model_list=['Cassandra'], ensemble=None, transformer_max_depth=0) and then copy the selected params to Cassandra. See a Cassandra example here:
https://github.com/winedarksea/AutoTS/blob/7c5d35cdcaf6695e7b936564908516a782461eb5/autots/models/cassandra.py#L2472C3-L2472C3
It allows various types of input features.

Make sure to set include_history=True to get the back forecast (or 'in sample' forecast), which is what I believe you are looking for. There are other models which might work, such as DatepartRegression, but for now this looks like your best bet.
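
The suggested workflow could be sketched roughly as follows. This is an untested outline rather than the maintainer's own code: the function name `backcast_with_cassandra`, the `SEARCH_KWARGS` constant, and the default `forecast_length=14` are hypothetical, and the import path for Cassandra is inferred from the file linked in this thread. It also assumes that `best_model_params` can be passed straight to the Cassandra constructor, and that `df` is already a wide frame with a pd.DatetimeIndex.

```python
# Search settings mentioned in this thread: restrict the search to Cassandra,
# with no ensembling and no extra transformers.
SEARCH_KWARGS = dict(model_list=["Cassandra"], ensemble=None, transformer_max_depth=0)


def backcast_with_cassandra(df, forecast_length=14):
    """Tune Cassandra through AutoTS, refit it directly, and request history.

    `df` is assumed to be wide: a pd.DatetimeIndex with one column per series.
    """
    # Imported lazily so this module can be loaded without autots installed.
    from autots import AutoTS
    from autots.models.cassandra import Cassandra  # path inferred from the linked file

    # Step 1: let AutoTS pick Cassandra hyperparameters.
    search = AutoTS(forecast_length=forecast_length, **SEARCH_KWARGS)
    search.fit(df)
    best_params = search.best_model_params

    # Step 2: build a standalone Cassandra from the selected parameters.
    model = Cassandra(n_jobs=1, **best_params)
    model.fit(df)

    # Step 3: include_history=True asks for the in-sample (back) forecast
    # alongside the future steps.
    return model.predict(forecast_length=forecast_length, include_history=True)
```

The in-sample values would then be read from the returned prediction object, restricted to the dates of the training index.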

@OMERTBITU
Author

I think the "varying input features" part of my issue was misleading. I just meant that I will predict some samples in the same interval as the training data, not the future. The input features are the same, but of course their values change between samples. In this case, is Cassandra still your advice, or is "include_history=True" enough?

@winedarksea
Owner

I think so. Try it and see, then get back to me if that isn't what you were looking for

@OMERTBITU
Author

To my understanding, AutoTS must be used to find the optimum Cassandra model, and its hyperparameters must be stored. After that, a new Cassandra model must be created and fitted. I tried to do this with the code below:

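from autots import AutoTS
from autots.models.cassandra import Cassandra

# Search for the best Cassandra configuration
auto_ts_model = AutoTS(model_list=['Cassandra'], ensemble=None, transformer_max_depth=0)
auto_ts_model.fit(windowed_tek_ornek_train, date_col="tarih", value_col="ru")
best_params = auto_ts_model.best_model_params
# Refit a standalone Cassandra with the selected parameters
mod = Cassandra(n_jobs=1, **best_params)
mod.fit(windowed_tek_ornek_train)  # this call raises the ValueError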

I managed to fit auto_ts_model and take the parameters from it. But when I try to fit the Cassandra model, it raises "ValueError: infer_frequency failed due to input not being pandas DF or DT index". My input contains 6 columns: 4 are input columns, 1 is the output, and the other is the date column. Can you give me some idea of a solution? Thanks.

@winedarksea
Owner

The issue is just with data formatting. Your AutoTS run is probably wrong too, even though it runs successfully.
You've mostly got wide-style data: your 4 time series covariates should be 4 columns, with a pd.DatetimeIndex used instead of a date column.
Assuming 'tarih' is your date col:

windowed_tek_ornek_train['tarih'] = pd.to_datetime(windowed_tek_ornek_train['tarih'])
windowed_tek_ornek_train = windowed_tek_ornek_train.set_index('tarih')

Then pass it without the date_col and value_col params.
You might want to pass weights to .fit() to show that your output is the primary target. If the covariates are known for the future, you can pass them in as a future_regressor.
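
As a minimal sketch of the formatting described above, using only pandas: the sample values and the input column names `x1` to `x4` are invented stand-ins for the poster's data, `to_wide` is a hypothetical helper, and the weight values (10 for the output, 1 for the rest) are arbitrary illustrations. The sketch assumes the weights are supplied as a dict keyed by column name.

```python
import pandas as pd

# Hypothetical stand-in for the poster's frame: a string date column ('tarih'),
# four input columns (names invented here), and the output column ('ru').
raw = pd.DataFrame({
    "tarih": ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05"],
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [0.5, 0.4, 0.3, 0.2, 0.1],
    "x3": [10, 11, 12, 13, 14],
    "x4": [7, 7, 8, 8, 9],
    "ru": [2.1, 2.9, 4.2, 5.1, 5.8],
})


def to_wide(df, date_col):
    """Return a copy indexed by a DatetimeIndex built from `date_col`."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    return out.set_index(date_col).sort_index()


wide = to_wide(raw, "tarih")

# Sanity checks on the index: it must be a DatetimeIndex, and a regular
# frequency should be inferable from it.
is_datetime_index = isinstance(wide.index, pd.DatetimeIndex)
freq = pd.infer_freq(wide.index)

# Illustrative weights marking 'ru' as the primary target.
weights = {col: (10 if col == "ru" else 1) for col in wide.columns}
```

If `pd.infer_freq` returns None on real data, the dates are probably irregular or duplicated, which is worth fixing before fitting.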


2 participants