
Lightgbm trains much slower than catboost. #6456

Open
fengshansi opened this issue May 16, 2024 · 13 comments

@fengshansi

On Ubuntu 22.04.2 LTS, Python 3.11.4, LightGBM 4.3.0.
The dataset has 3,000 rows.
params:

{
    "boosting_type": "gbdt",
    "objective": "binary",
    "verbose": -1,
    "n_jobs": -1,
    "device": "cpu",
    "random_state": 1,
    "metric": "None",
    "learning_rate": 0.03,
}

feval:

from sklearn.metrics import f1_score

def weighted_f1_score(preds, train_data):
    # with the built-in "binary" objective, preds are probabilities
    labels = train_data.get_label()
    preds_binary = (preds > 0.5).astype(int)
    f1 = f1_score(labels, preds_binary, average="weighted")
    return "weighted_f1_score", f1, True  # name, value, is_higher_better

I use several categorical features.

lightgbm.train(
    params=params,
    train_set=train_dataset,
    num_boost_round=iterations,
    feval=weighted_f1_score,
    categorical_feature=current_cat_feature,
    callbacks=[
        lightgbm.early_stopping(50, first_metric_only=False),
        lightgbm.log_evaluation(period=20, show_stdv=True),
    ],
)

Training takes about 5 minutes, much slower than CatBoost.

@jmoralez
Collaborator

Hey @fengshansi, thanks for using LightGBM. Unfortunately, this isn't enough information; we'd also need the following:

  • How many iterations are you running?
  • At which iteration is LightGBM stopping?
  • At which iteration is CatBoost stopping?
  • Which parameters are you using for CatBoost?
  • How many features do you have?
  • Are you also using your custom metric in CatBoost?

For 3,000 samples, 5 minutes sounds like a lot, so I'm guessing your custom metric is the bottleneck here, but it's very hard to tell from just this information. One quick check is to time a single call of the metric in isolation, as in the sketch below.
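A minimal sketch with made-up data of the reported size (the 3,000 rows come from this thread; the round count is an assumption):

import time

import numpy as np
from sklearn.metrics import f1_score

preds = np.random.rand(3000)                 # fake probabilities
labels = np.random.randint(0, 2, size=3000)  # fake binary labels

start = time.perf_counter()
f1_score(labels, (preds > 0.5).astype(int), average="weighted")
per_call = time.perf_counter() - start

# the metric runs once per boosting round and validation set
n_rounds = 300  # assumed upper bound on boosting rounds
print(f"one call: {per_call * 1e3:.2f} ms, ~{per_call * n_rounds:.2f} s over {n_rounds} rounds")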

@fengshansi
Author

fengshansi commented May 16, 2024

Thank you for your help.
My answer is:

  • The maximum number of iterations is 300, with early stopping after 60 rounds, for both LightGBM and CatBoost.
  • LightGBM's best iteration is 90, so it runs 150 iterations in total (90 + 60 early-stopping rounds).
  • For CatBoost, I actually use Optuna with 50 trials to search parameters. Even running it 50 times takes 1 minute 3 seconds.
    A single run of 70 iterations without early stopping takes 0.2 seconds.
    All timings are from a Jupyter notebook.
  • The search space for the 50 trials is:

    "learning_rate": trial.suggest_float("learning_rate", 0.001, 0.1, log=True),
    "depth": trial.suggest_int("depth", 1, 10),
    "subsample": trial.suggest_float("subsample", 0.05, 1.0),
    "colsample_bylevel": trial.suggest_float("colsample_bylevel", 0.05, 1.0),
    "min_data_in_leaf": trial.suggest_int("min_data_in_leaf", 1, 100),

  • 10 categorical features and 10 numeric features.
  • I don't use a custom metric in CatBoost; CatBoost offers macro F1 natively.

@jmoralez
Collaborator

How long does it take if you remove your custom metric?

@shiyu1994
Collaborator

Thanks for using LightGBM. Could you also provide information about how CatBoost is configured? In my experience, the speed of CatBoost varies a lot depending on the tree structure and boosting mode you select; these choices often trade speed against accuracy. See the illustration below.
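For illustration, a hypothetical configuration touching the settings that matter most for speed (the parameter names are real CatBoost options; the values are just examples, not a recommendation):

from catboost import CatBoostClassifier

# "Ordered" boosting is noticeably slower than "Plain"; the grow policy
# changes the tree structure and therefore the cost per iteration.
clf = CatBoostClassifier(
    boosting_type="Plain",        # alternative: "Ordered"
    grow_policy="SymmetricTree",  # alternatives: "Depthwise", "Lossguide"
    iterations=300,
)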

@fengshansi
Author

fengshansi commented May 16, 2024

How long does it take if you remove your custom metric?

Also 5 minutes; I use metric="binary_logloss".

@fengshansi
Author

Could you also provide information about how CatBoost is used?

{
    "iterations": 300,
    "learning_rate": 0.07116892811065063,
    "depth": 5,
    "loss_function": "Logloss",
    "verbose": 20,
    "eval_metric": "TotalF1:average=Macro",
    "subsample": 0.2697512982046929,
    "colsample_bylevel": 0.932255235452595,
    "early_stopping_rounds": 60,
    "min_data_in_leaf": 98,
}
I use a CatBoostClassifier.

@mayer79
Contributor

mayer79 commented May 27, 2024

Without data and working code, I fear we are stuck here.

@fengshansi
Author

Here are the code and data: https://github.com/fengshansi/lgbm_compare.

@jmoralez
Collaborator

@fengshansi can you try using the same parameters in both? For example, you're setting 0.3 as the learning rate for LightGBM but 0.7 for CatBoost, and the higher rate converges faster. Also, the default num_leaves in LightGBM is 31, while your CatBoost depth of 6 produces trees with 64 leaves. Something like the sketch below would make the runs more comparable.
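A hypothetical alignment (the mapping between the two libraries is approximate, not exact; num_leaves=64 only mimics a full depth-6 symmetric tree):

# hypothetical LightGBM params matched to the CatBoost notebook settings
lgb_params = {
    "objective": "binary",
    "learning_rate": 0.7,  # match the CatBoost run instead of 0.3
    "num_leaves": 64,      # a full depth-6 tree has 2**6 = 64 leaves
    "max_depth": 6,        # cap depth too, since CatBoost trees are symmetric
    "verbose": -1,
}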

@mayer79
Contributor

mayer79 commented Jun 1, 2024

@fengshansi: On my laptop (8 threads), running your two notebooks gives:

LightGBM

[timing screenshot]

CatBoost

[timing screenshot]

Thus, LightGBM is 4-5 times faster (installed via pip).

@fengshansi
Author


Unbelievable, my LightGBM run takes nearly 5 minutes.

@mayer79
Contributor

mayer79 commented Jun 1, 2024

Oops :-). I reset the notebook kernels before running each of them.

@fengshansi
Author


I reinstalled LightGBM, but it is still very slow, with Python 3.11.4 and LightGBM 4.3.0.
