Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT consolidate target type tags #28927

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

adrinjalali
Copy link
Member

This consolidates the following tags:

binary_only (default=False)
    whether estimator supports binary classification but lacks multi-class
    classification support.

multilabel (default=False)
    whether the estimator supports multilabel output

multioutput (default=False)
    whether a regressor supports multi-target outputs or a classifier supports
    multi-class multi-output.

multioutput_only (default=False)
    whether estimator supports only multi-output classification or regression.

into

target_type (default=["multiclass", "single-output"])
    the supported target types of the estimator, which can be a subset of:
        - 'binary': binary classification. All estimators supporting
          'multi-class' also support 'binary'; classifiers only;
        - 'multi-class': multiclass classification
        - 'multi-label': multilabel output
        - 'multi-output': multi-output regression or classification, i.e. `y`
          has shape `(n_samples, n_outputs)`
        - 'single-output': single-output regression or classification, i.e. `y`
          has shape `(n_samples,)`

Towards #28910

cc @glemaitre @thomasjpfan

Copy link

github-actions bot commented May 1, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 8e35307. Link to the linter CI: here

Comment on lines 547 to 550
- 'multi-output': multi-output regression or classification, i.e. `y`
has shape `(n_samples, n_outputs)`
- 'single-output': single-output regression or classification, i.e. `y`
has shape `(n_samples,)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should have an instance to explain that you can multi-output and not support a single output.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note:

Note that an estimator can support multi-output but not single-output, or vice
versa. The same goes for binary and multi-class support.

Comment on lines 543 to 545
- 'binary': binary classification. All estimators supporting
'multi-class' also support 'binary'; classifiers only;
- 'multi-class': multiclass classification
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Together with the other comment, we might benefit of an example to show what we mean.

Comment on lines 547 to 550
- 'multi-output': multi-output regression or classification, i.e. `y`
has shape `(n_samples, n_outputs)`
- 'single-output': single-output regression or classification, i.e. `y`
has shape `(n_samples,)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should have an instance to explain that you can multi-output and not support a single output.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the logic behind it, but it's unfortunate we can use the names in type_of_target.


multioutput_only (default=False)
whether estimator supports only multi-output classification or regression.
target_type (default=["multiclass", "single-output"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the estimator is a regressor, then having a "multiclass" tag feels strange to me.

@adrinjalali
Copy link
Member Author

An alternative here is to have a set of binary flags, instead of a set of flags, as in, introduce 5 flags which are all binary. It makes updating the flags in inheritance slightly easier. Right now once we implement the inheritance for __sklearn_flags__, the user should do something like

    flags["target_type"] = set(flags["target_type"]) | {"multi-output"}

I'm not sure which one would be better.

@adrinjalali
Copy link
Member Author

Another alternative is to have two flags: classification-type and output-shape/type maybe?

@thomasjpfan
Copy link
Member

An alternative here is to have a set of binary flags, instead of a set of flags, as in, introduce 5 flags which are all binary.

I like the binary flags approach.

@adrinjalali
Copy link
Member Author

If we go back to binary flags, then the only change I see necessary is to:

multioutput_only (default=False)
    whether estimator supports only multi-output classification or regression.

to

single_output (default=True)
    whether estimator supports single output.

Note that an estimator can support `single_output` and/or `multi_output`.

@thomasjpfan
Copy link
Member

I'm okay with single_output. What do you think of having two binary tags for single-output classification? i.e. binary and multi-class.

  • binary=True, multi-class=False == binary_only
  • binary=False, multi-class=True == impossible
  • binary=False, multi-class=False == does not support single output classification
  • binary=True, multi-class=True == most classifiers

Although adding another tag is more complexity, I do not like how binary_only implies that multi-class is not support.

@adrinjalali
Copy link
Member Author

binary=False, multi-class=False == does not support single output classification

I think single and multi output aspect is independent from binary/multi class aspect. You could be a multi output only classifier that only supports binary outputs for instance.

@glemaitre glemaitre self-requested a review May 18, 2024 10:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants