TTS API improvements #2086

blob42 · 2024-04-20T13:38:05Z

Description

Improvements to the Coqui TTS API/backend.

- tts coqui xtts_v2 not working without speaker_idx #2073: Allow passing speaker_id to models
- Add optional language parameter to TTS endpoint/schema
- ~~[ ] TTS Info endpoint: List available models, speakers and languages~~ (will start a new PR for this one)
- Update swagger documentation
- Define TTS models with config files
- Update docs

Notes for Reviewers

Signed commits

Yes, I signed my commits.

netlify · 2024-04-20T13:38:22Z

✅ Deploy Preview for localai canceled.

Name	Link
🔨 Latest commit	`b2361dc`
🔍 Latest deploy log	https://app.netlify.com/sites/localai/deploys/6632c4cbace1270008899180

mudler · 2024-04-20T18:22:14Z

backend/python/coqui/coqui_server.py

        # List available 🐸TTS models
-        print(TTS().list_models())
+        print(TTS().list_models().list_models())


This looks like a leftover, or is it wanted?

blob42 · Apr 20, 2024

I was not sure, so I will remove it.

I am planning to include an endpoint to list models/speakers in this PR.

mudler · 2024-04-20T18:22:46Z

I don't see how the changeset can fix #2073 - is there something missing in the PR?

blob42 · 2024-04-22T05:02:32Z

@mudler I haven't pushed those changes yet. I will remove the draft status when I am done.

blob42 · 2024-04-22T12:52:31Z

@mudler I am trying to understand where and when the Go gRPC server -> TTS service is used. Is this a work in progress?

blob42 · 2024-04-23T02:16:40Z

backend/python/coqui/coqui_server.py

@@ -66,7 +66,19 @@ def LoadModel(self, request, context):

    def TTS(self, request, context):
        try:
-            self.tts.tts_to_file(text=request.text, speaker_wav=self.AudioPath, language=COQUI_LANGUAGE, file_path=request.dst)
+            # if the model is multilingual, use the language from the request, with the env var as fallback
+            lang = request.Lang or COQUI_LANGUAGE


Can I add a new Lang field to the protobuf definition? It would be optional.

dave-gray101 · Apr 24, 2024

If the language is truly independent of the model, voice, and input text, I see no reason not to have a Language parameter. Personally, I would prefer to spell it out rather than name it Lang.

blob42 · Apr 24, 2024

Agreed, it is better to have a clearly defined parameter.

Does it make sense to keep the COQUI_LANGUAGE env var? What use case does it serve?
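The fallback order discussed in this thread can be sketched as below. This is a minimal illustration, not the code in this PR: `resolve_language`, `build_tts_kwargs`, and the `is_multilingual` flag are hypothetical names. Only the `COQUI_LANGUAGE` variable, the request-first order, and the `text`/`file_path`/`language` keyword names come from the diff above. Skipping the language for single-language models is an assumption.

```python
import os


def resolve_language(request_language, is_multilingual, environ=None):
    """Pick the language for a TTS call, or None if none should be passed.

    Hypothetical helper: the request value wins, COQUI_LANGUAGE is the fallback.
    """
    environ = os.environ if environ is None else environ
    if not is_multilingual:
        # Assumption: single-language models get no language argument at all.
        return None
    # An empty string counts as "not set" for both sources.
    return request_language or environ.get("COQUI_LANGUAGE") or None


def build_tts_kwargs(text, dst, language=None):
    """Assemble keyword arguments, leaving out `language` when it is unset."""
    kwargs = {"text": text, "file_path": dst}
    if language:
        kwargs["language"] = language
    return kwargs
```

With this shape, the env var only matters when a request carries no language, which is one possible use case for keeping it as a server-wide default.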

blob42 · 2024-04-23T13:17:37Z

I didn't push the swagger docs because they gave me a lot of changes.

A quick way to test the language-switching capability of multilingual models is something like this:

Without specifying lang:

The voice uses an English accent.

curl -L http://localai:8080/tts \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer 2708b7c21129e408899d5a38e6d1af8d" \
    -d '{
        "backend": "coqui",
        "input": "Bonjour Madame ! Comment allez-vous ?",
        "model": "tts_models/multilingual/multi-dataset/xtts_v2",
        "voice": "Ana Florence"
    }' | aplay -D pipewire -

With lang:

The proper language accent is used.

curl -L http://localai:8080/tts \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer 2708b7c21129e408899d5a38e6d1af8d" \
    -d '{
        "backend": "coqui",
        "input": "Bonjour Madame ! Comment allez-vous ?",
        "model": "tts_models/multilingual/multi-dataset/xtts_v2",
        "voice": "Ana Florence",
        "lang": "fr"
    }' | aplay -D pipewire -
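For scripted testing, the same two requests can be built from Python. This is only a sketch using the standard library: `build_tts_request` is a hypothetical helper, and the endpoint path and JSON field names are taken from the curl commands above. The request is built but not sent.

```python
import json
import urllib.request


def build_tts_request(base_url, text, model, voice, lang=None, token=None):
    """Build (but do not send) a POST request for the /tts endpoint."""
    payload = {"backend": "coqui", "input": text, "model": model, "voice": voice}
    if lang:
        # Only include the optional field when a language was chosen.
        payload["lang"] = lang
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_tts_request(
    "http://localai:8080",
    "Bonjour Madame ! Comment allez-vous ?",
    "tts_models/multilingual/multi-dataset/xtts_v2",
    "Ana Florence",
    lang="fr",
)
# To actually send it: audio = urllib.request.urlopen(req).read()
```

Dropping the `lang` argument reproduces the first curl command, so both cases can be compared from one script.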

blob42 · 2024-04-29T15:53:01Z

Quick update regarding the TTS Info endpoint: I am dropping this feature from this PR, as it would involve too many changes that are out of scope.

Context:

The goal is to make it possible to query available models, speakers, or other types of information, depending on the backend.

My first attempt was to add a gRPC service, TTSInfoRequest, to query the backend. I later found out that the backend gRPC service is loaded at the same time as the model; however, Info requests might not send any model information.

My proposal is to allow a backend's gRPC service to be spawned without a model, and to add a service called Info() or Query that backends can use to send arbitrary information. A model could then be loaded later using the same spawned service, or the service could be torn down and a new one started for the designated model.

I will start a PR or Discussion for this proposal.
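To make the proposal concrete, here is a toy sketch of a backend that is spawned without a model, answers info queries, and loads a model later. Everything in it is hypothetical: `BackendService` and its methods are plain Python stand-ins for illustration, not LocalAI's gRPC interface or any existing API.

```python
class BackendService:
    """Toy stand-in for a backend process; not LocalAI's gRPC API."""

    def __init__(self, name, known_models):
        self.name = name
        self._known_models = dict(known_models)  # model name -> speakers
        self.loaded_model = None  # nothing is loaded at spawn time

    def info(self):
        """Answer an info query whether or not a model is loaded."""
        return {
            "backend": self.name,
            "models": sorted(self._known_models),
            "loaded_model": self.loaded_model,
        }

    def load_model(self, model):
        """Load a model into the already-running service."""
        if model not in self._known_models:
            raise ValueError(f"unknown model: {model}")
        self.loaded_model = model

    def tts(self, text):
        """Refuse to synthesize until a model has been loaded."""
        if self.loaded_model is None:
            raise RuntimeError("no model loaded; call load_model first")
        return f"[{self.loaded_model}] {text}"
```

The point of the sketch is the ordering: `info()` works right after spawn, while `tts()` requires a prior `load_model()` call.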

Signed-off-by: blob42 <contact@blob42.xyz>

core/schema/localai.go

core/backend/tts.go

mudler · 2024-04-29T20:59:40Z

Overall looks good, thanks! Just a few nits/open questions above.

mudler reviewed Apr 20, 2024

blob42 force-pushed the tts_api branch from c09617a to c70cb9a Compare April 23, 2024 02:14

blob42 commented Apr 23, 2024

blob42 force-pushed the tts_api branch 2 times, most recently from 55251d3 to 66e1cd4 Compare April 23, 2024 13:15

blob42 marked this pull request as ready for review April 23, 2024 13:16

blob42 marked this pull request as draft April 26, 2024 00:12

blob42 force-pushed the tts_api branch 5 times, most recently from 3ce3154 to 970de10 Compare April 26, 2024 01:59

blob42 changed the title ~~Coqui TTS API improvements~~ TTS API improvements Apr 26, 2024

blob42 force-pushed the tts_api branch from 1fdb597 to 970de10 Compare April 29, 2024 15:47

blob42 marked this pull request as ready for review April 29, 2024 15:49

blob42 force-pushed the tts_api branch 2 times, most recently from 1a2d0cb to fa6e144 Compare April 29, 2024 16:18

blob42 added 4 commits April 29, 2024 18:19

swagger: show /tts endpoint

8389e25

Signed-off-by: blob42 <contact@blob42.xyz>

update doc on COQUI_LANGUAGE env variable

2037734

Signed-off-by: blob42 <contact@blob42.xyz>

return errors from tts gRPC backend

02ab66b

Signed-off-by: blob42 <contact@blob42.xyz>

wip: handling speaker_id and language in coqui TTS backend

9c9ead0

Signed-off-by: blob42 <contact@blob42.xyz>

blob42 force-pushed the tts_api branch from fa6e144 to 628502f Compare April 29, 2024 16:19

blob42 mentioned this pull request Apr 29, 2024

Add grpc service to query info about backend #2185

Open

mudler reviewed Apr 29, 2024

core/schema/localai.go (outdated, resolved)

mudler reviewed Apr 29, 2024

core/schema/localai.go (resolved)

mudler reviewed Apr 29, 2024

core/backend/tts.go (outdated, resolved)

blob42 force-pushed the tts_api branch from 628502f to 5a87ee9 Compare May 1, 2024 22:34

blob42 added 5 commits May 2, 2024 00:40

TTS endpoint: add optional language paramter

4448d9a

Signed-off-by: blob42 <contact@blob42.xyz>

tts fix: empty language string breaks non-multilingual models

c5281dd

Signed-off-by: blob42 <contact@blob42.xyz>

allow tts param definition in config file

e9720c3

- consolidate TTS options under `tts` config entry Signed-off-by: blob42 <contact@blob42.xyz>

log: error when loading backend config

2448370

Signed-off-by: blob42 <contact@blob42.xyz>

tts: update doc

b2361dc

Signed-off-by: blob42 <contact@blob42.xyz>

blob42 force-pushed the tts_api branch from 5a87ee9 to b2361dc Compare May 1, 2024 22:40

blob42 closed this May 13, 2024

blob42 deleted the tts_api branch May 13, 2024 07:15

blob42 mentioned this pull request May 13, 2024

TTS API improvements #2308

Open

6 tasks
