
[BUG] time_limit is displayed wrong in logs #4148

Closed
mglowacki100 opened this issue Apr 27, 2024 · 1 comment · Fixed by #4208
Assignees: Innixma
Labels: API & Doc (Improvements or additions to documentation), module: tabular
Milestone: 1.2 Release

Comments

@mglowacki100 (Contributor)

Describe the bug
I wanted to run the example from https://auto.gluon.ai/stable/tutorials/tabular/tabular-quick-start.html with:

  • time_limit = 3600
  • presets = "best"

In the logs, time_limit is divided by 4 (900s instead of 3600s), which seems to me like either a bug or an unclear message:
Beginning AutoGluon training ... Time limit = 900s
...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 599.53s of the 899.47s of remaining time.
..

Notes:

  • there is a warning about ray not being installed (see the install sketch after this list).
  • one message is displayed correctly: Sub-fit(s) time limit is: 3600 seconds.
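
For reference, the warning itself already names the fix; on Colab that would be the following (a sketch mirroring the version pinned in the warning text, not verified in this issue):

!pip install ray==2.10.0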

Expected behavior
The time_limit specified in fit should be the same as the time limit shown in the logs.

To Reproduce
Everything was done on Google Colab with the newest version:

!pip install autogluon.tabular
from autogluon.tabular import TabularDataset, TabularPredictor
data_url = 'https://raw.githubusercontent.com/mli/ag-docs/main/knot_theory/'
train_data = TabularDataset(f'{data_url}train.csv')

label = 'signature'

predictor = TabularPredictor(label=label).fit(train_data,
                                              presets='best',
                                              time_limit=3600)
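
A possible workaround sketch (an assumption drawn from the "Setting dynamic_stacking from 'auto' to True ... (use_bag_holdout=False)" log line below, not verified here): enabling use_bag_holdout should keep dynamic stacking off, so the full 3600s budget would appear in the logs.

# Hypothetical variant: per the logged reason, use_bag_holdout=True should
# leave dynamic_stacking disabled, so the top-level "Time limit" line would
# show the full user-specified 3600s.
predictor = TabularPredictor(label=label).fit(train_data,
                                              presets='best',
                                              use_bag_holdout=True,
                                              time_limit=3600)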

Screenshots / Logs

No path specified. Models will be saved in: "AutogluonModels/ag-20240427_161329"
Preset alias specified: 'best' maps to 'best_quality'.
Presets specified: ['best']
Setting dynamic_stacking from 'auto' to True. Reason: Enable dynamic_stacking when use_bag_holdout is disabled. (use_bag_holdout=False)
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=1
Dynamic stacking is enabled (dynamic_stacking=True). AutoGluon will try to determine whether the input data is affected by stacked overfitting and enable or disable stacking as a consequence.
Detecting stacked overfitting by sub-fitting AutoGluon on the input data. That is, copies of AutoGluon will be sub-fit on subset(s) of the data. Then, the holdout validation data is used to detect stacked overfitting.
Sub-fit(s) time limit is: 3600 seconds.
Starting holdout-based sub-fit for dynamic stacking. Context path is: AutogluonModels/ag-20240427_161329/ds_sub_fit/sub_fit_ho.
/usr/local/lib/python3.10/dist-packages/autogluon/tabular/predictor/predictor.py:1213: UserWarning: Failed to use ray for memory safe fits. Falling back to normal fit. Error: ImportError('ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.10.0`')
  stacked_overfitting = self._sub_fit_memory_save_wrapper(
Beginning AutoGluon training ... Time limit = 900s
AutoGluon will save models to "AutogluonModels/ag-20240427_161329/ds_sub_fit/sub_fit_ho"
=================== System Info ===================
AutoGluon Version:  1.1.0
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sat Nov 18 15:31:17 UTC 2023
CPU Count:          2
Memory Avail:       11.15 GB / 12.67 GB (88.0%)
Disk Space Avail:   81.37 GB / 107.72 GB (75.5%)
===================================================
Train Data Rows:    8889
Train Data Columns: 18
Label Column:       signature
Problem Type:       multiclass
Preprocessing data ...
Warning: Some classes in the training set have fewer than 10 examples. AutoGluon will only keep 9 out of 13 classes for training and will not try to predict the rare classes. To keep more classes, increase the number of datapoints from these rare classes in the training data or reduce label_count_threshold.
Fraction of data from classes with at least 10 examples that will be kept for training models: 0.9983125210934863
Train Data Class Count: 9
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    11422.74 MB
	Train Data (Original)  Memory Usage: 1.22 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 5 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Useless Original Features (Count: 1): ['Symmetry_D8']
		These features carry no predictive signal and should be manually investigated.
		This is typically a feature which has the same value for all rows.
		These features do not need to be present at inference time.
	Types of features in original data (raw dtype, special dtypes):
		('float', []) : 14 | ['chern_simons', 'cusp_volume', 'injectivity_radius', 'longitudinal_translation', 'meridinal_translation_imag', ...]
		('int', [])   :  3 | ['Unnamed: 0', 'hyperbolic_adjoint_torsion_degree', 'hyperbolic_torsion_degree']
	Types of features in processed data (raw dtype, special dtypes):
		('float', [])     : 9 | ['chern_simons', 'cusp_volume', 'injectivity_radius', 'longitudinal_translation', 'meridinal_translation_imag', ...]
		('int', [])       : 3 | ['Unnamed: 0', 'hyperbolic_adjoint_torsion_degree', 'hyperbolic_torsion_degree']
		('int', ['bool']) : 5 | ['Symmetry_0', 'Symmetry_D3', 'Symmetry_D4', 'Symmetry_D6', 'Symmetry_Z/2 + Z/2']
	0.4s = Fit runtime
	17 features in original data used to generate 17 features in processed data.
	Train Data (Processed) Memory Usage: 0.85 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.46s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric parameter of Predictor()
Large model count detected (112 configs) ... Only displaying the first 3 models of each family. To see all, set `verbosity=3`.
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}, {'activation': 'elu', 'dropout_prob': 0.10077639529843717, 'hidden_size': 108, 'learning_rate': 0.002735937344002146, 'num_layers': 4, 'use_batchnorm': True, 'weight_decay': 1.356433327634438e-12, 'ag_args': {'name_suffix': '_r79', 'priority': -2}}, {'activation': 'elu', 'dropout_prob': 0.11897478034205347, 'hidden_size': 213, 'learning_rate': 0.0010474382260641949, 'num_layers': 4, 'use_batchnorm': False, 'weight_decay': 5.594471067786272e-10, 'ag_args': {'name_suffix': '_r22', 'priority': -7}}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
	'CAT': [{}, {'depth': 6, 'grow_policy': 'SymmetricTree', 'l2_leaf_reg': 2.1542798306067823, 'learning_rate': 0.06864209415792857, 'max_ctr_complexity': 4, 'one_hot_max_size': 10, 'ag_args': {'name_suffix': '_r177', 'priority': -1}}, {'depth': 8, 'grow_policy': 'Depthwise', 'l2_leaf_reg': 2.7997999596449104, 'learning_rate': 0.031375015734637225, 'max_ctr_complexity': 2, 'one_hot_max_size': 3, 'ag_args': {'name_suffix': '_r9', 'priority': -5}}],
	'XGB': [{}, {'colsample_bytree': 0.6917311125174739, 'enable_categorical': False, 'learning_rate': 0.018063876087523967, 'max_depth': 10, 'min_child_weight': 0.6028633586934382, 'ag_args': {'name_suffix': '_r33', 'priority': -8}}, {'colsample_bytree': 0.6628423832084077, 'enable_categorical': False, 'learning_rate': 0.08775715546881824, 'max_depth': 5, 'min_child_weight': 0.6294123374222513, 'ag_args': {'name_suffix': '_r89', 'priority': -16}}],
	'FASTAI': [{}, {'bs': 256, 'emb_drop': 0.5411770367537934, 'epochs': 43, 'layers': [800, 400], 'lr': 0.01519848858318159, 'ps': 0.23782946566604385, 'ag_args': {'name_suffix': '_r191', 'priority': -4}}, {'bs': 2048, 'emb_drop': 0.05070411322605811, 'epochs': 29, 'layers': [200, 100], 'lr': 0.08974235041576624, 'ps': 0.10393466140748028, 'ag_args': {'name_suffix': '_r102', 'priority': -11}}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 110 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 599.53s of the 899.47s of remaining time.
	0.2116	 = Validation score   (accuracy)
	0.07s	 = Training   runtime
	0.11s	 = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 599.29s of the 899.24s of remaining time.
	0.2214	 = Validation score   (accuracy)
	0.05s	 = Training   runtime
	0.09s	 = Validation runtime
@mglowacki100 added the labels "bug: unconfirmed" (Something might not be working) and "Needs Triage" (Issue requires Triage) on Apr 27, 2024
@Innixma added the label "API & Doc" (Improvements or additions to documentation) and removed the labels "bug: unconfirmed" and "Needs Triage" on Apr 30, 2024
@Innixma Innixma added this to the 1.2 Release milestone Apr 30, 2024
@Innixma Innixma self-assigned this Apr 30, 2024
@Innixma (Contributor) commented Apr 30, 2024

Thanks for reporting the issue!

The 900 seconds is the time budget for dynamic stacking. Normally dynamic stacking does not appear in the logs because it runs through ray and we hide its output, but since ray isn't installed, the dynamic-stacking logs are shown.

I think we can improve the logs emitted before dynamic stacking starts so it is clearer what is going on, particularly when ray is not present.
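
For anyone hitting this in the meantime, here is a minimal sketch (assuming the dynamic_stacking fit argument seen in the logs above can be passed directly; treat it as illustrative, not an official recommendation): disabling dynamic stacking skips the sub-fit whose smaller budget produced the "Time limit = 900s" line.

from autogluon.tabular import TabularDataset, TabularPredictor

data_url = 'https://raw.githubusercontent.com/mli/ag-docs/main/knot_theory/'
train_data = TabularDataset(f'{data_url}train.csv')

# With dynamic stacking disabled there is no holdout sub-fit, so the
# top-level log should report the full user-specified time_limit (3600s).
predictor = TabularPredictor(label='signature').fit(train_data,
                                                    presets='best',
                                                    dynamic_stacking=False,
                                                    time_limit=3600)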
