Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] AutoGluon ignores logging configuration #4140

Open
1 of 3 tasks
opcode81 opened this issue Apr 25, 2024 · 3 comments
Open
1 of 3 tasks

[BUG] AutoGluon ignores logging configuration #4140

opcode81 opened this issue Apr 25, 2024 · 3 comments
Assignees
Labels
enhancement New feature or request priority: 1 High priority
Milestone

Comments

@opcode81
Copy link

opcode81 commented Apr 25, 2024

Bug Report Checklist

  • I provided code that demonstrates a minimal reproducible example.
  • I confirmed bug exists on the latest mainline of AutoGluon via source install.
  • I confirmed bug exists on the latest stable version of AutoGluon.

Describe the bug

AutoGluon ignores the user's logging configuration and sets a very peculiar configuration for the autogluon logger.

autogluon.common.__init__ calls the function autogluon.utils.log_utils._add_stream_handler:

def _add_stream_handler():
# Add stream_handler to AG logger if it doesn't already exist
# This is necessary so that the modification of logging level can take effect
# Also this adjust the logging format
# This function is supposed to be called before any logging from autogluon happens
if not any(isinstance(h, logging.StreamHandler) for h in _logger_ag.handlers):
stream_handler = logging.StreamHandler()
formatter = logging.Formatter("%(message)s")
stream_handler.setFormatter(formatter)
_logger_ag.addHandler(stream_handler)
_logger_ag.propagate = False

This causes the user's logging configuration to be fully ignored, as propagate = False will prevent the user's log handlers from receiving any log messages from the autogluon logger. Furthermore, it imposes a new log handler for autogluon which writes to stderr.

This behaviour is not appropriate for a library. When integrating a library into a project that uses logging, I expect the library to respect my configuration.

I have worked around the issue by using the following function in my project:

def _fix_autogluon_logging():
    ag_logger = logging.getLogger("autogluon")
    ag_logger.propagate = True
    ag_logger.handlers = []

I should not have to do this (and I should not have to spend my time figuring out that this is what I need to do).

Expected behavior

AutoGluon respects the user-defined logging configuration and does not impose its own idea of how things should be logged.

@opcode81 opcode81 added bug: unconfirmed Something might not be working Needs Triage Issue requires Triage labels Apr 25, 2024
@Innixma Innixma added this to the 1.2 Release milestone Apr 26, 2024
@Innixma Innixma added enhancement New feature or request priority: 1 High priority and removed bug: unconfirmed Something might not be working Needs Triage Issue requires Triage labels Apr 26, 2024
@Innixma
Copy link
Contributor

Innixma commented Apr 26, 2024

Thanks for this issue @opcode81!

Indeed logging is a bit odd in AutoGluon (and previously it was even odder). I remember that we chose the current method because of the ways in which other packages sometimes altered logging functionality, which would then also alter AutoGluon's logging in unexpected ways and thus we got GitHub issues from users that were the inverse of yours. Therefore, we isolated the logging logic to be harder to accidentally override in other packages as they are imported. (One of the big ones is Kaggle, which previously hid all AutoGluon logs due to how it altered the logger).

This definitely has side effects, such as what you encountered, and we are interested in making things work with less friction.

I'd be interested in a small code snippet that includes your user-defined logging configuration followed by an import of AutoGluon and calling logger or fitting TabularPredictor so we can use that for testing.

@opcode81
Copy link
Author

opcode81 commented Apr 30, 2024

My assumption is that you

  • add the log handler to the autogluon logger to ensure that users receive output even IF they have not configured logging
  • set propagate = False to avoid duplicate logs IF they have configured logging.

To gracefully handle both cases, I would suggest to add a handler to the autogluon logger which will produce output only if the root logger is not associated with any handlers, i.e. to use a handler like this one I've added to sensAI:

class FallbackHandler(logging.Handler):
    """
    A log handler which applies the given handler only if the root logger has no defined handlers 
    (or has no handlers other than this fallback handler itself)
    """
    def __init__(self, handler: logging.Handler, level=logging.NOTSET):
        """
        :param handler: the handler to which messages shall be passed on
        :param level: the minimum level to output; if NOTSET, pass on everything
        """
        super().__init__(level=level)
        self._handler = handler

    def emit(self, record):
        root_handlers = logging.getLogger().handlers
        if len(root_handlers) == 0 or (len(root_handlers) == 1 and root_handlers[0] == self):
            self._handler.emit(record)

With this type of handler, you can do this:

    ag_logger = logging.getLogger("autogluon")
    ag_logger.setLevel(logging.INFO)

    ag_logger.addHandler(FallbackHandler(logging.StreamHandler(sys.stderr)))
    ag_logger.info("Foo")

    logging.basicConfig(level=logging.INFO, stream=sys.stdout)

    logging.getLogger("foo").info("I am some other logger")
    ag_logger.info("Bar")

and you will get

image

i.e. as long as no root log handler is configured, the autogluon logger will print to stderr (as indicated by the red output colour), and after logging has been configured by the user, the logs will end up wherever it was configured instead (stdout with a different format in this case).

I don't really think a test is needed, as once this is handled properly, it's unlikely to break, but a simple test you could add is this:

def test_root_logger_receives_autogluon_logs():
    import autogluon
    root_logger = logging.getLogger()
    stream = io.StringIO()
    stream_handler = logging.StreamHandler(stream=stream)
    root_logger.addHandler(stream_handler)
    try:
        ag_logger = logging.getLogger("autogluon")
        ag_logger.setLevel(logging.INFO)
        ag_logger.info("Foobar")
        assert "Foobar" in stream.getvalue()
    finally:
        root_logger.removeHandler(stream_handler)

It will currently fail, but it will pass if you apply the function _fix_autogluon_logging I had pasted above.

I can issue a PR if you like, but I really don't know what you were trying to achieve wrt the kaggle case.

@Innixma
Copy link
Contributor

Innixma commented Apr 30, 2024

Awesome write-up, this is very helpful @opcode81! We will look into addressing this for the next minor release (v1.2).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority: 1 High priority
Projects
None yet
Development

No branches or pull requests

3 participants