New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MNT consolidate target type tags #28927
base: main
Are you sure you want to change the base?
Conversation
doc/developers/develop.rst
Outdated
- 'multi-output': multi-output regression or classification, i.e. `y` | ||
has shape `(n_samples, n_outputs)` | ||
- 'single-output': single-output regression or classification, i.e. `y` | ||
has shape `(n_samples,)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should have an instance to explain that you can multi-output and not support a single output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a note:
Note that an estimator can support multi-output but not single-output, or vice
versa. The same goes for binary and multi-class support.
doc/developers/develop.rst
Outdated
- 'binary': binary classification. All estimators supporting | ||
'multi-class' also support 'binary'; classifiers only; | ||
- 'multi-class': multiclass classification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Together with the other comment, we might benefit of an example to show what we mean.
doc/developers/develop.rst
Outdated
- 'multi-output': multi-output regression or classification, i.e. `y` | ||
has shape `(n_samples, n_outputs)` | ||
- 'single-output': single-output regression or classification, i.e. `y` | ||
has shape `(n_samples,)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should have an instance to explain that you can multi-output and not support a single output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the logic behind it, but it's unfortunate we can use the names in type_of_target
.
doc/developers/develop.rst
Outdated
|
||
multioutput_only (default=False) | ||
whether estimator supports only multi-output classification or regression. | ||
target_type (default=["multiclass", "single-output"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the estimator is a regressor, then having a "multiclass" tag feels strange to me.
An alternative here is to have a set of binary flags, instead of a set of flags, as in, introduce 5 flags which are all binary. It makes updating the flags in inheritance slightly easier. Right now once we implement the inheritance for flags["target_type"] = set(flags["target_type"]) | {"multi-output"} I'm not sure which one would be better. |
Another alternative is to have two flags: classification-type and output-shape/type maybe? |
I like the binary flags approach. |
If we go back to binary flags, then the only change I see necessary is to:
to
|
I'm okay with
Although adding another tag is more complexity, I do not like how |
I think single and multi output aspect is independent from binary/multi class aspect. You could be a multi output only classifier that only supports binary outputs for instance. |
This consolidates the following tags:
into
Towards #28910
cc @glemaitre @thomasjpfan