Base function to check if the model is a clusterer (analogous to `base.is_classifier()` and `base.is_regressor()`)? #28960

adrinjalali · 2024-05-06T11:14:48Z

Discussed in #28904

Originally posted by aoot April 26, 2024
According to the note on figuring out the model type, it is recommended to use the `sklearn.base.is_classifier()` or `sklearn.base.is_regressor()` function instead of checking the `_estimator_type` attribute directly.

However, since the `_estimator_type` attribute can be "classifier", "regressor", or "clusterer", is there a base function such as `sklearn.base.is_clusterer()` to check whether the model is a clusterer?

Thanks for your input!

#28936 is an effort to fix this. Not sure what to think of it.
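For context, a minimal sketch of what such a helper could look like, assuming it simply mirrors the attribute check that `is_classifier()` and `is_regressor()` perform on `_estimator_type`. The `ToyClusterer` and `ToyRegressor` classes are hypothetical stand-ins, not scikit-learn estimators, and this is not necessarily what #28936 implements.

```python
def is_clusterer(estimator):
    """Return True if the estimator declares itself as a clusterer.

    Assumption: the type is exposed through the `_estimator_type`
    attribute, as described in the original post.
    """
    return getattr(estimator, "_estimator_type", None) == "clusterer"


# Hypothetical toy estimators, used only to exercise the helper.
class ToyClusterer:
    _estimator_type = "clusterer"


class ToyRegressor:
    _estimator_type = "regressor"


class ToyTransformer:
    # No `_estimator_type` attribute at all.
    pass


print(is_clusterer(ToyClusterer()))
print(is_clusterer(ToyRegressor()))
print(is_clusterer(ToyTransformer()))
```

Using `getattr` with a default means objects that do not declare a type at all are reported as "not a clusterer" rather than raising an `AttributeError`.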


jeremiedbb · 2024-05-06T20:56:17Z

It makes sense, but at the same time I'm not sure we want to move in that direction.

There are more types of estimators (decomposition, outlier detectors, density estimators, ...). Do we want to do it for these as well?
I thought that at some point we wanted to deprecate `_estimator_type` in favor of tags. Is that still the case? (I think we should.)

adrinjalali · 2024-05-07T10:12:23Z

I don't mind deprecating _estimator_type and adding it to tags. But that's independent of having a helper function to check if an estimator is a classifier, regressor, or a clusterer. I think we don't have to cover all cases with these helper functions. We only need the ones most commonly used.

ChVeen · 2024-05-07T14:11:28Z

For the estimator types not so commonly used, one could imagine a more generic function like

def is_estimator_type(estimator, expected_type: str):
    return getattr(estimator, "_estimator_type", None) == expected_type

in order to check for a given category.
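A short usage sketch of that generic check, for illustration only. The function is repeated so the snippet runs on its own; the toy classes and the `"outlier_detector"` label used here are assumptions for the example, not a statement about which values scikit-learn guarantees.

```python
def is_estimator_type(estimator, expected_type: str) -> bool:
    # Compare the declared type (if any) against the expected label.
    return getattr(estimator, "_estimator_type", None) == expected_type


# Hypothetical estimators covering a common and a less common category.
class ToyClusterer:
    _estimator_type = "clusterer"


class ToyOutlierDetector:
    _estimator_type = "outlier_detector"


for est in (ToyClusterer(), ToyOutlierDetector()):
    for label in ("clusterer", "outlier_detector"):
        print(type(est).__name__, label, is_estimator_type(est, label))
```

The trade-off is that callers have to know the exact string for each category, whereas dedicated helpers like `is_classifier()` hide that detail.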

adrinjalali · 2024-05-07T14:45:04Z

What we're planning to do would be more like:

get_tags(estimator)["estimator_type"] == expected_type
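To make the tag-based idea concrete, here is a self-contained sketch. Everything in it is an assumption for illustration: `get_tags` below is a hypothetical stand-in that merges a per-class `tags` dict over defaults, and it is not scikit-learn's actual tag machinery, which was still being designed at the time of this thread.

```python
# Hypothetical defaults; a real tag system would carry many more entries.
DEFAULT_TAGS = {"estimator_type": None}


def get_tags(estimator):
    """Hypothetical tag lookup: defaults overridden by the estimator's own tags."""
    tags = dict(DEFAULT_TAGS)
    tags.update(getattr(estimator, "tags", {}))
    return tags


def is_estimator_type(estimator, expected_type):
    # Same comparison as in the comment above, wrapped in a helper.
    return get_tags(estimator)["estimator_type"] == expected_type


# Hypothetical toy estimators.
class ToyClusterer:
    tags = {"estimator_type": "clusterer"}


class ToyTransformer:
    # Declares no tags, so it falls back to the defaults.
    pass


print(is_estimator_type(ToyClusterer(), "clusterer"))
print(is_estimator_type(ToyTransformer(), "clusterer"))
```

With this shape, the estimator type becomes one tag among others, and helpers such as `is_clusterer()` could be thin wrappers around the tag lookup.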

This fix allows fitting unsupervised estimators with the assumption that they will always predict to shape (n_samples,). Output dtype is now determined based on the `_estimator_type` attribute. This is likely a temporary solution as `_estimator_type` is planned for deprecation in favor of tags and explicit estimator type checking functions, but neither of those solutions is fully implemented yet. See scikit-learn/scikit-learn#28960

adrinjalali added the Needs Triage Issue requires triage label May 6, 2024

adrinjalali mentioned this issue May 6, 2024

ENH Add missing base.is_clusterer() function #28936

Merged

jeremiedbb added New Feature and removed Needs Triage Issue requires triage labels May 6, 2024

aazuspan mentioned this issue May 17, 2024

Fix unsupervised fitting lemma-osu/sknnr-spatial#21

Merged

Charlie-XIAO closed this as completed in #28936 May 22, 2024
