[Feature request] Allow the use of logging instead of print #3729

Open
christophertubbs opened this issue May 10, 2024 · 3 comments
Labels
feature request feature requests for making TTS better.

Comments

@christophertubbs

🚀 Feature Description

The print function is used in several places, most noticeably (to me) in utils.synthesizer.Synthesizer.tts, with lines like:

        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

This is great when messing around, but it'd be nice to have the option to use different types of loggers (or even just the root). For instance, if I have a distributed application, I can have this writing to something that would send the messages through a pubsub setup so that another application may read and interpret the output in real time.
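As a rough sketch of the pubsub idea, assuming the library logged through the standard logging module: a small handler can forward each formatted record to any publish-style callable. The PublishHandler class and the "tts_demo" logger name are hypothetical, and a plain list stands in for the real pubsub client.

```python
import logging
from typing import Callable, List


class PublishHandler(logging.Handler):
    """Forward each formatted log record to a publish-style callable."""

    def __init__(self, publish: Callable[[str], object]) -> None:
        super().__init__()
        self._publish = publish

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._publish(self.format(record))
        except Exception:
            # Standard logging convention: never let a handler crash the caller.
            self.handleError(record)


# Stand-in for a real pubsub client: messages are collected in a list.
published: List[str] = []

logger = logging.getLogger("tts_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(PublishHandler(published.append))

logger.info(" > Processing time: %s", 1.25)
```

With a handler like this, the consuming application only has to subscribe to wherever the callable publishes; the library itself would not need to know about the transport.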

Solution

utils.synthesizer.Synthesizer's signature can be changed to look like:

    def __init__(
        self,
        tts_checkpoint: str = "",
        tts_config_path: str = "",
        tts_speakers_file: str = "",
        tts_languages_file: str = "",
        vocoder_checkpoint: str = "",
        vocoder_config: str = "",
        encoder_checkpoint: str = "",
        encoder_config: str = "",
        vc_checkpoint: str = "",
        vc_config: str = "",
        model_dir: str = "",
        voice_dir: str = None,
        use_cuda: bool = False,
        logger: logging.Logger = None
    ) -> None:

and the tts function can look like:

    if self.__logger:
        self.__logger.info(f" > Processing time: {process_time}")
        self.__logger.info(f" > Real-time factor: {process_time / audio_time}")
    else:
        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

A Protocol for the logger might work better than just the hint of logging.Logger - it'd allow programmers to put in some wackier functionality, such as writing non-loggers that just so happen to have a similar signature.
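A minimal sketch of what such a Protocol could look like. InfoLogger, ListLogger, and report are hypothetical names used only for illustration; report mirrors the if/else branch proposed above rather than any existing code in the library.

```python
from typing import List, Optional, Protocol


class InfoLogger(Protocol):
    """Anything exposing a logging.Logger-style info method."""

    def info(self, msg: str, *args: object, **kwargs: object) -> None: ...


class ListLogger:
    """A non-logger that just so happens to have a matching signature."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def info(self, msg: str, *args: object, **kwargs: object) -> None:
        self.messages.append(msg % args if args else msg)


def report(process_time: float, audio_time: float,
           logger: Optional[InfoLogger] = None) -> None:
    """Use the logger when one is given, otherwise fall back to print."""
    lines = [
        f" > Processing time: {process_time}",
        f" > Real-time factor: {process_time / audio_time}",
    ]
    for line in lines:
        if logger is not None:
            logger.info(line)
        else:
            print(line)


sink = ListLogger()
report(2.0, 4.0, logger=sink)
```

A real logging.Logger satisfies the same protocol, so logging.getLogger(__name__) could be passed in unchanged. Checking `is not None` rather than truthiness avoids surprises with custom objects that define `__len__` or `__bool__`.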

Alternative Solutions

An alternative solution would be to pass the writing function to tts itself, something like:

    def tts(
        self,
        text: str = "",
        speaker_name: str = "",
        language_name: str = "",
        speaker_wav=None,
        style_wav=None,
        style_text=None,
        reference_wav=None,
        reference_speaker_name=None,
        split_sentences: bool = True,
        logging_function: typing.Callable[[str], typing.Any] = None,
        **kwargs,
    ) -> List[int]:

    ...
   
    if logging_function:
        logging_function(f" > Processing time: {process_time}")
        logging_function(f" > Real-time factor: {process_time / audio_time}")
    else:
        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

This will enable code like:

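    import functools
    import pathlib
    from redis import Redis

    def output_sound(text: str, output_path: pathlib.Path, connection: Redis):
        from TTS.api import TTS
        speech_model = TTS(DEFAULT_MODEL).to("cpu")
        # Redis.publish takes (channel, message), so bind a channel first; "tts-log" is a placeholder name.
        speech_model.tts_to_file(text=text, speaker="p244", file_path=str(output_path), logging_function=functools.partial(connection.publish, "tts-log"))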

Additional context

I don't believe that utils.synthesizer.Synthesizer.tts is the only place that uses the standard print function. A consistent solution should be applied to the other locations as well.
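Separately from the proposals above, one possible workaround for code that still calls print is to redirect standard output into a callable. This is a different technique from the proposed parameter, sketched here with a hypothetical LineForwarder class; it relies only on contextlib.redirect_stdout from the standard library.

```python
import contextlib
import io
from typing import Any, Callable, List


class LineForwarder(io.TextIOBase):
    """File-like object that passes each complete printed line to a callable."""

    def __init__(self, emit: Callable[[str], Any]) -> None:
        super().__init__()
        self._emit = emit
        self._buffer = ""

    def writable(self) -> bool:
        return True

    def write(self, text: str) -> int:
        # print() writes the message and the newline separately, so buffer
        # until a full line is available.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            if line:
                self._emit(line)
        return len(text)


lines: List[str] = []
with contextlib.redirect_stdout(LineForwarder(lines.append)):
    print(" > Processing time: 1.0")
    print(" > Real-time factor: 0.5")
```

Passing something like logging.getLogger(__name__).info instead of lines.append would route the output into logging. Note that redirect_stdout replaces sys.stdout for the whole process, so it is a blunt tool in multi-threaded applications and would also swallow any progress output written to stdout.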

The parameter for the logging functionality will need to be passed through the objects and functions that lead to the current print statements. For instance, TTS.api.TTS.tts_to_file would require a logging_function parameter so that it can pass the function on to self.synthesizer.tts from within the tts function.
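The pass-through described here could be sketched roughly as follows. FakeSynthesizer and FakeTTS are hypothetical stand-ins, not the real classes, and they skip all actual synthesis and file writing; the point is only how an optional logging_function travels from tts_to_file down to the place where print is called today.

```python
import time
from typing import Any, Callable, List, Optional, Tuple

LogFunction = Callable[[str], Any]


class FakeSynthesizer:
    """Hypothetical stand-in for the synthesizer."""

    def tts(self, text: str,
            logging_function: Optional[LogFunction] = None) -> List[int]:
        start = time.perf_counter()
        wav = [0] * len(text)  # placeholder for real audio samples
        process_time = time.perf_counter() - start
        # Opt-in behaviour: fall back to print when nothing is supplied.
        emit = logging_function if logging_function is not None else print
        emit(f" > Processing time: {process_time}")
        return wav


class FakeTTS:
    """Hypothetical stand-in for the API object that wraps the synthesizer."""

    def __init__(self) -> None:
        self.synthesizer = FakeSynthesizer()

    def tts(self, text: str,
            logging_function: Optional[LogFunction] = None) -> List[int]:
        return self.synthesizer.tts(text, logging_function=logging_function)

    def tts_to_file(self, text: str, file_path: str,
                    logging_function: Optional[LogFunction] = None) -> Tuple[str, int]:
        wav = self.tts(text, logging_function=logging_function)
        return file_path, len(wav)  # a real version would write the file


messages: List[str] = []
result = FakeTTS().tts_to_file("hello", "out.wav", logging_function=messages.append)
```

Every layer keeps None as the default, so callers that never pass the argument still get the existing print output.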

The general vibe of the solutions I've provided will make sure that pre-existing code behaves no differently, making the new functionality purely opt-in.

I haven't written anything using a progress bar like the one that this uses, so I can't speak to that aside from the fact that it might need to be excluded.

@christophertubbs christophertubbs added the feature request feature requests for making TTS better. label May 10, 2024
@eginhard
Contributor

In our fork (pip install coqui-tts) all prints have been switched to Python logging. Feel free to try it out and let us know if it works for you. (also a duplicate of #1691)

@christophertubbs
Author

Thanks for your work there! I wasn't aware that TTS was essentially dead here when I posted. Is there anything I need to know when migrating over to your version?

@eginhard
Contributor

No, there aren't any major changes and you can use it in the same way.
