Running out of RAM in 0.6.3 #210

Open
emobs opened this issue Nov 29, 2023 · 6 comments

emobs commented Nov 29, 2023

Upgraded to 0.6.3 today and also updated all dependencies.

Using AutoTS 0.6.3, model.fit() runs fine as it did before on the same data set. Average memory usage stays under 4 GB across model generations and validations, except for this model, which tries to allocate 49 GB of RAM and therefore crashes AutoTS:

{"model_number": 386, "model_name": "MultivariateMotif", "model_param_dict": {"window": 10, "point_method": "midhinge", "distance_metric": "hamming", "k": 20, "max_windows": 10000}, "model_transform_dict": {"fillna": "SeasonalityMotifImputerLinMix", "transformations": {"0": "bkfilter", "1": "AlignLastValue", "2": "AlignLastValue", "3": "bkfilter"}, "transformation_params": {"0": {}, "1": {"rows": 7, "lag": 1, "method": "additive", "strength": 1.0, "first_value_only": false}, "2": {"rows": 1, "lag": 28, "method": "additive", "strength": 1.0, "first_value_only": false}, "3": {}}}}

This never happened across many test runs on 0.6.2 with the same AutoTS configuration and dataset.
Is this a bug, or can I prevent this validation from causing a crash somehow?

Would it be possible to add a precaution in a future version of AutoTS that skips validation methods or models when the system is about to run out of RAM?

winedarksea (Owner) commented Nov 29, 2023

Thanks for the quick and specific identification.
I have had RAM issues with MultivariateMotif in the past, but it is unchanged in 0.6.3; my workaround was BallTreeMultivariateMotif, which you could try dropping in its place.
However, I suspect the real issue is SeasonalityMotifImputerLinMix, which I changed with the aim of making it more RAM friendly, but apparently that isn't the case.
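A rough sketch of what that swap could look like with an explicit model_list (everything besides the motif model name is a placeholder, not a recommended search space):

```python
from autots import AutoTS

# Replace MultivariateMotif with its BallTree variant in the search space.
# The other model names are only illustrative.
model_list = [
    "BallTreeMultivariateMotif",
    "SeasonalNaive",
    "ETS",
    "GLM",
]

model = AutoTS(
    forecast_length=28,   # placeholder horizon
    frequency="infer",
    model_list=model_list,
)
```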

Let me run it and see.

winedarksea (Owner) commented Nov 29, 2023

--- some running later ---
It appears MultivariateMotif is indeed the issue. The ONLY change I made was that it previously used Parallel(self.n_jobs - 1) and I switched it to the standard Parallel(self.n_jobs),
so you should try setting your n_jobs one lower and see if that helps. It ultimately comes down to a memory issue and the amount of memory available per worker. Reducing your n_jobs is the most expedient fix at the moment.
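A minimal sketch of that workaround, assuming n_jobs is set explicitly in the constructor (the forecast_length and frequency values are placeholders):

```python
import multiprocessing

from autots import AutoTS

# Leave one core idle so each joblib worker has more memory headroom.
n_jobs = max(1, multiprocessing.cpu_count() - 1)

model = AutoTS(
    forecast_length=28,   # placeholder horizon
    frequency="infer",
    n_jobs=n_jobs,
)
```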

emobs (Author) commented Nov 30, 2023

Thanks Colin, I tried running the fit again with n_jobs - 1, and this time it didn't crash on the MultivariateMotif model in Validation 1, so that's good. However, it does crash on the GLM model, according to the last saved CurrentModel:

{"model_number": 323, "model_name": "GLM", "model_param_dict": {"family": "Gaussian", "constant": false, "regression_type": "datepart"}, "model_transform_dict": {"fillna": "SeasonalityMotifImputerLinMix", "transformations": {"0": "AlignLastValue", "1": "AnomalyRemoval", "2": "SeasonalDifference"}, "transformation_params": {"0": {"rows": 1, "lag": 1, "method": "additive", "strength": 1.0, "first_value_only": false}, "1": {"method": "zscore", "method_params": {"distribution": "chi2", "alpha": 0.1}, "fillna": "linear", "transform_dict": {"fillna": "rolling_mean_24", "transformations": {"0": "RegressionFilter"}, "transformation_params": {"0": {"sigma": 2, "rolling_window": 90, "run_order": "season_first", "regression_params": {"regression_model": {"model": "DecisionTree", "model_params": {"max_depth": 3, "min_samples_split": 0.05}}, "datepart_method": "simple_binarized", "polynomial_degree": null, "transform_dict": null, "holiday_countries_used": false}, "holiday_params": null}}}}, "2": {"lag_1": 7, "method": "Mean"}}}}

Also here, memory suddenly peaks after holding steady at roughly 10% usage throughout the full run:
First peak: [screenshot of the memory spike]
Crash: [screenshot at the moment of the crash]
(Screenshots taken from the real-time Ubuntu Resources monitor)

Hope this helps pinpoint the cause of the crash. Thanks in advance for any reply.

emobs (Author) commented Nov 30, 2023

--- after another run ---
I tested once again, Colin, this time without a transformer_list parameter set in the model initiation (it used to be 'superfast' on 0.6.2 and I changed it to 'scalable' on 0.6.3, as per your advice by email a little while ago). With no value set for transformer_list, AutoTS doesn't crash. I also tried running with transformer_list 'superfast' (which I used on 0.6.2) again on 0.6.3, and that didn't crash either. So it seems the issue is related to the new 'scalable' transformer_list in this case.

winedarksea (Owner) commented Nov 30, 2023

Scalable is a much larger group of transformers than superfast, so it should be more accurate, but it looks like I still have more work to do in chasing down parameter combinations that lead to excessive memory use. I suspect it is the "AnomalyRemoval" --> "RegressionFilter" combination that is causing the problems.
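In the meantime, assuming transformer_list accepts an explicit list of transformer names in place of a preset string (as the parameter docs describe), a sketch like this could keep broader coverage while leaving out the suspected pair (the subset shown is only illustrative):

```python
from autots import AutoTS

# Assumption: transformer_list can be an explicit list of transformer names.
# This illustrative subset deliberately omits AnomalyRemoval and RegressionFilter.
trimmed_transformers = [
    "bkfilter",
    "AlignLastValue",
    "SeasonalDifference",
    "MinMaxScaler",
]

model = AutoTS(
    forecast_length=28,   # placeholder horizon
    transformer_list=trimmed_transformers,
)
```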

emobs (Author) commented Nov 30, 2023

I probably can't help you with this, but if there's a way I can contribute, let me know!
