ENH add possibility to have a callable for verbose_feature_names_out of ColumnTransformer #28934

MarcBresson · 2024-05-02T14:35:51Z

What does this implement?

This brings the possibility to pass a callable to the verbose_feature_names_out parameter of ColumnTransformer. Instead of the new feature name being "transormer_name__feature_name", we could have "feature_name$this is amazing$TRANSFORMER_NAME".

Any other comments?

I have a few questions:

How do you type sklearn as there are no stub files ?
What is the version number I should put in .. versionchanged ?

In advance, thank you for your time.

…of ColumnTransformer

github-actions · 2024-05-02T14:37:14Z

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here

`ruff`

ruff detected issues. Please run ruff check --fix --output-format=full . locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.4.4.


sklearn/compose/_column_transformer.py:10:1: I001 [*] Import block is un-sorted or un-formatted
   |
 8 |   #         Joris Van den Bossche
 9 |   # License: BSD
10 | / import warnings
11 | | from collections import Counter, UserList
12 | | from itertools import chain
13 | | from functools import partial
14 | | from numbers import Integral, Real
15 | | 
16 | | import numpy as np
17 | | from scipy import sparse
18 | | 
19 | | from ..base import TransformerMixin, _fit_context, clone
20 | | from ..pipeline import _fit_transform_one, _name_estimators, _transform_one
21 | | from ..preprocessing import FunctionTransformer
22 | | from ..utils import Bunch
23 | | from ..utils._estimator_html_repr import _VisualBlock
24 | | from ..utils._indexing import _determine_key_type, _get_column_indices
25 | | from ..utils._metadata_requests import METHODS
26 | | from ..utils._param_validation import HasMethods, Hidden, Interval, StrOptions
27 | | from ..utils._set_output import (
28 | |     _get_container_adapter,
29 | |     _get_output_config,
30 | |     _safe_set_output,
31 | | )
32 | | from ..utils.metadata_routing import (
33 | |     MetadataRouter,
34 | |     MethodMapping,
35 | |     _raise_for_params,
36 | |     _routing_enabled,
37 | |     process_routing,
38 | | )
39 | | from ..utils.metaestimators import _BaseComposition
40 | | from ..utils.parallel import Parallel, delayed
41 | | from ..utils.validation import (
42 | |     _check_feature_names_in,
43 | |     _get_feature_names,
44 | |     _is_pandas_df,
45 | |     _num_samples,
46 | |     check_array,
47 | |     check_is_fitted,
48 | | )
49 | | 
50 | | __all__ = ["ColumnTransformer", "make_column_transformer", "make_column_selector"]
   | |_^ I001
   |
   = help: Organize imports

Found 1 error.
[*] 1 fixable with the `--fix` option.

_{Generated for commit: fe4f428. Link to the linter CI: here}

adrinjalali · 2024-05-08T09:46:37Z

cc @thomasjpfan

pretty sure we had the conversation about this at some point.

thomasjpfan

Thank you for the PR!

sklearn/compose/_column_transformer.py

…ColumnTransformer also: improve doc for the parameter, fix validation

MarcBresson · 2024-05-13T09:39:38Z

Hey,

just added the possibility to use string format. I also added unit tests.

One of the test I created creates name clashing, and I remember seeing something about that somewhere in sklearn's doc. However, I can't seem to put my hand on it.

From the behaviour of ColumnTransformer, I know it is not important since it uses np arrays rather than pandas dataframe, but do you think we should issue a warning somewhere in the doc?

ENH add possibility to have a callable for verbose_feature_names_out …

d81e70f

…of ColumnTransformer

github-actions bot added the module:compose label May 2, 2024

thomasjpfan reviewed May 11, 2024

View reviewed changes

sklearn/compose/_column_transformer.py Show resolved Hide resolved

Marc Bresson added 2 commits May 13, 2024 11:35

ENH add possibility to provide a str to verbose_feature_names_out of …

55c8d87

…ColumnTransformer also: improve doc for the parameter, fix validation

TST: add tests for verbose_feature_names_out using callables or strings

fe4f428

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH add possibility to have a callable for verbose_feature_names_out of ColumnTransformer #28934

ENH add possibility to have a callable for verbose_feature_names_out of ColumnTransformer #28934

MarcBresson commented May 2, 2024 •

edited

github-actions bot commented May 2, 2024 •

edited

adrinjalali commented May 8, 2024

thomasjpfan left a comment

MarcBresson commented May 13, 2024 •

edited

ENH add possibility to have a callable for verbose_feature_names_out of ColumnTransformer #28934

Are you sure you want to change the base?

ENH add possibility to have a callable for verbose_feature_names_out of ColumnTransformer #28934

Conversation

MarcBresson commented May 2, 2024 • edited

What does this implement?

Any other comments?

github-actions bot commented May 2, 2024 • edited

❌ Linting issues

ruff

adrinjalali commented May 8, 2024

thomasjpfan left a comment

Choose a reason for hiding this comment

MarcBresson commented May 13, 2024 • edited

MarcBresson commented May 2, 2024 •

edited

github-actions bot commented May 2, 2024 •

edited

`ruff`

MarcBresson commented May 13, 2024 •

edited