Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH check_scoring() has raise_exc for multimetric scoring #28992

Merged
merged 12 commits into from May 24, 2024

Conversation

StefanieSenger
Copy link
Contributor

Reference Issues/PRs

#28360
#28359

What does this implement/fix? Explain your changes.

This PR suggests to expose raise_exc in check_scoring to facilitate a public API for multrimetric scoring.
Thus check_scoring can also be used in cross_validate in a simpler way (this is actually how I came across it).

Copy link

github-actions bot commented May 10, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2a30caf. Link to the linter CI: here

@StefanieSenger
Copy link
Contributor Author

StefanieSenger commented May 13, 2024

If this is wanted, then we need to add a changelog entry. And since the CI "Check Changelog" did not alarm, should we move check_scoring() somewhere where it's recognised by the CI as part of the public API?

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need a test. And since it's touching public API, it needs a changelog. Otherwise LGTM.

@StefanieSenger
Copy link
Contributor Author

Thanks @adrinjalali, I have now added a test.


raise_exc : bool, default=True
Whether to raise an exception if a subset of the scorers in multimetric scoring
fails or to return an error code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something weird here maybe the "to" should not be here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I believe it's correct English. Whether to do A if condition or to do B.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, do you mind inverting both part around the "or":

Whether to return an error code or to raise an ...

Having a really long sentence before the "or" make my parsing algorithm fail :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see what you mean.

However, I really think that with those kinds of speaking params (raise_exc) we should document what happens when True first, because this is the natural meaning (and the default). Otherwise we force people to switch signs while reading.

What about using parenthesis like so:

Whether to raise an exception (if a subset of the scorers in multimetric scoring fails) or to return an error code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parenthesis make the trick for me.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good apart from little nitpicks.

sklearn/metrics/_scorer.py Outdated Show resolved Hide resolved
sklearn/metrics/_scorer.py Outdated Show resolved Hide resolved
sklearn/metrics/_scorer.py Outdated Show resolved Hide resolved
sklearn/metrics/_scorer.py Show resolved Hide resolved
sklearn/metrics/_scorer.py Show resolved Hide resolved
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Copy link
Contributor Author

@StefanieSenger StefanieSenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @glemaitre. I have added an example, but had to exclude it from doctest. Would you mind having another look?


raise_exc : bool, default=True
Whether to raise an exception if a subset of the scorers in multimetric scoring
fails or to return an error code.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I believe it's correct English. Whether to do A if condition or to do B.

sklearn/metrics/_scorer.py Show resolved Hide resolved
@glemaitre glemaitre self-requested a review May 21, 2024 13:21
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the two suggestions, I think that it will be enough.

StefanieSenger and others added 2 commits May 21, 2024 15:37
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, otherwise LGTM.

- a callable returning a dictionary where the keys are the metric
names and the values are the metric scorers;
- a dictionary with metric names as keys and callables a values.
- a list, tuple or set of unique strings;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we replace list, tuple, or set to iterable of strings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I actually think we should tell the users explicitly what the can pass in. Any other than those wouldn't show the correct behaviour in li. 1054-56:

    if isinstance(scoring, (list, tuple, set, dict)):
        scorers = _check_multimetric_scoring(estimator, scoring=scoring)
        return _MultimetricScorer(scorers=scorers, raise_exc=raise_exc)

And, any other iterable of strings would raise during the @validate_params().

So I would rather leave it as it is.

@adrinjalali adrinjalali enabled auto-merge (squash) May 24, 2024 09:26
@adrinjalali adrinjalali merged commit e7be614 into scikit-learn:main May 24, 2024
28 checks passed
@StefanieSenger StefanieSenger deleted the multimetric branch May 24, 2024 09:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants