[BUG] AutoGluon ignores logging configuration #4140

opcode81 · 2024-04-25T11:59:39Z

Bug Report Checklist

I provided code that demonstrates a minimal reproducible example.
I confirmed bug exists on the latest mainline of AutoGluon via source install.
I confirmed bug exists on the latest stable version of AutoGluon.

Describe the bug

AutoGluon ignores the user's logging configuration and sets a very peculiar configuration for the autogluon logger.

autogluon.common.__init__ calls the function autogluon.utils.log_utils._add_stream_handler:

autogluon/common/src/autogluon/common/utils/log_utils.py

Lines 99 to 109 in 7fa5dc3

    
           def _add_stream_handler(): 
        
               # Add stream_handler to AG logger if it doesn't already exist 
        
               # This is necessary so that the modification of logging level can take effect 
        
               # Also this adjust the logging format 
        
               # This function is supposed to be called before any logging from autogluon happens 
        
               if not any(isinstance(h, logging.StreamHandler) for h in _logger_ag.handlers): 
        
                   stream_handler = logging.StreamHandler() 
        
                   formatter = logging.Formatter("%(message)s") 
        
                   stream_handler.setFormatter(formatter) 
        
                   _logger_ag.addHandler(stream_handler) 
        
                   _logger_ag.propagate = False

This causes the user's logging configuration to be fully ignored, as propagate = False will prevent the user's log handlers from receiving any log messages from the autogluon logger. Furthermore, it imposes a new log handler for autogluon which writes to stderr.

This behaviour is not appropriate for a library. When integrating a library into a project that uses logging, I expect the library to respect my configuration.

I have worked around the issue by using the following function in my project:

def _fix_autogluon_logging():
    ag_logger = logging.getLogger("autogluon")
    ag_logger.propagate = True
    ag_logger.handlers = []

I should not have to do this (and I should not have to spend my time figuring out that this is what I need to do).

Expected behavior

AutoGluon respects the user-defined logging configuration and does not impose its own idea of how things should be logged.

The text was updated successfully, but these errors were encountered:

Innixma · 2024-04-26T22:01:30Z

Thanks for this issue @opcode81!

Indeed logging is a bit odd in AutoGluon (and previously it was even odder). I remember that we chose the current method because of the ways in which other packages sometimes altered logging functionality, which would then also alter AutoGluon's logging in unexpected ways and thus we got GitHub issues from users that were the inverse of yours. Therefore, we isolated the logging logic to be harder to accidentally override in other packages as they are imported. (One of the big ones is Kaggle, which previously hid all AutoGluon logs due to how it altered the logger).

This definitely has side effects, such as what you encountered, and we are interested in making things work with less friction.

I'd be interested in a small code snippet that includes your user-defined logging configuration followed by an import of AutoGluon and calling logger or fitting TabularPredictor so we can use that for testing.

opcode81 · 2024-04-30T09:03:31Z

My assumption is that you

add the log handler to the autogluon logger to ensure that users receive output even IF they have not configured logging
set propagate = False to avoid duplicate logs IF they have configured logging.

To gracefully handle both cases, I would suggest to add a handler to the autogluon logger which will produce output only if the root logger is not associated with any handlers, i.e. to use a handler like this one I've added to sensAI:

class FallbackHandler(logging.Handler):
    """
    A log handler which applies the given handler only if the root logger has no defined handlers 
    (or has no handlers other than this fallback handler itself)
    """
    def __init__(self, handler: logging.Handler, level=logging.NOTSET):
        """
        :param handler: the handler to which messages shall be passed on
        :param level: the minimum level to output; if NOTSET, pass on everything
        """
        super().__init__(level=level)
        self._handler = handler

    def emit(self, record):
        root_handlers = logging.getLogger().handlers
        if len(root_handlers) == 0 or (len(root_handlers) == 1 and root_handlers[0] == self):
            self._handler.emit(record)

With this type of handler, you can do this:

    ag_logger = logging.getLogger("autogluon")
    ag_logger.setLevel(logging.INFO)

    ag_logger.addHandler(FallbackHandler(logging.StreamHandler(sys.stderr)))
    ag_logger.info("Foo")

    logging.basicConfig(level=logging.INFO, stream=sys.stdout)

    logging.getLogger("foo").info("I am some other logger")
    ag_logger.info("Bar")

and you will get

i.e. as long as no root log handler is configured, the autogluon logger will print to stderr (as indicated by the red output colour), and after logging has been configured by the user, the logs will end up wherever it was configured instead (stdout with a different format in this case).

I don't really think a test is needed, as once this is handled properly, it's unlikely to break, but a simple test you could add is this:

def test_root_logger_receives_autogluon_logs():
    import autogluon
    root_logger = logging.getLogger()
    stream = io.StringIO()
    stream_handler = logging.StreamHandler(stream=stream)
    root_logger.addHandler(stream_handler)
    try:
        ag_logger = logging.getLogger("autogluon")
        ag_logger.setLevel(logging.INFO)
        ag_logger.info("Foobar")
        assert "Foobar" in stream.getvalue()
    finally:
        root_logger.removeHandler(stream_handler)

It will currently fail, but it will pass if you apply the function _fix_autogluon_logging I had pasted above.

I can issue a PR if you like, but I really don't know what you were trying to achieve wrt the kaggle case.

Innixma · 2024-04-30T18:23:05Z

Awesome write-up, this is very helpful @opcode81! We will look into addressing this for the next minor release (v1.2).

opcode81 added bug: unconfirmed Something might not be working Needs Triage Issue requires Triage labels Apr 25, 2024

Innixma added this to the 1.2 Release milestone Apr 26, 2024

Innixma added enhancement New feature or request priority: 1 High priority and removed bug: unconfirmed Something might not be working Needs Triage Issue requires Triage labels Apr 26, 2024

Innixma assigned rey-allan and Innixma Apr 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] AutoGluon ignores logging configuration #4140

[BUG] AutoGluon ignores logging configuration #4140

opcode81 commented Apr 25, 2024 •

edited

Innixma commented Apr 26, 2024

opcode81 commented Apr 30, 2024 •

edited

Innixma commented Apr 30, 2024

[BUG] AutoGluon ignores logging configuration #4140

[BUG] AutoGluon ignores logging configuration #4140

Comments

opcode81 commented Apr 25, 2024 • edited

Innixma commented Apr 26, 2024

opcode81 commented Apr 30, 2024 • edited

Innixma commented Apr 30, 2024

opcode81 commented Apr 25, 2024 •

edited

opcode81 commented Apr 30, 2024 •

edited