
Add missing pipelines to API #552

Open
semack opened this issue Sep 11, 2023 · 10 comments

@semack commented Sep 11, 2023

Hi guys,

First of all, thank you for the amazing job you do.

I didn't find an API for Text-To-Speech. I think a workflow can be used for this, but are there any plans to implement it in the API?

Kind regards,
/Andriy

@davidmezzetti (Member)

Thank you for the issue.

The plan moving forward was to run pipelines through workflows rather than calling them directly when using the API.
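
For example, a workflow defined in the server config can be called through the API's /workflow endpoint. A minimal sketch in Python (the "summary" workflow name is an assumption, it must exist in the config):

import requests

# Run a pipeline through a workflow via the API instead of a direct
# pipeline endpoint. Assumes the server config defines a "summary" workflow.
response = requests.post(
    "http://localhost:8000/workflow",
    json={"name": "summary", "elements": ["Text to summarize goes here"]}
)

print(response.json())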

@davidmezzetti davidmezzetti changed the title No TTS on http API Add missing pipelines to API Sep 25, 2023
@davidmezzetti
Copy link
Member

Upon further review, there are only a few pipelines that aren't in the API, and it makes sense to have the routers. I've been pushing things more toward workflows, but it doesn't hurt to have pipelines as well, especially in the case of an LLM pipeline.

@davidmezzetti davidmezzetti self-assigned this Sep 25, 2023
@davidmezzetti davidmezzetti added this to the v6.2.0 milestone Sep 25, 2023
@semack (Author) commented Sep 26, 2023

Another thing I've faced: in my setup, txtai is hosted in a separate remote environment with a powerful GPU, and my custom software needs to use it remotely through the API. Some pipelines, like Textractor and Transcription, need a file name as an argument. Textractor works well with remote sources, but Transcription doesn't. Could it be fixed?

@davidmezzetti (Member)

The pipelines are focused on a single task by design. That's where workflows come in. There are workflow steps for reading from URLs and cloud object storage.
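
For example, a minimal sketch using the Python workflow API (RetrieveTask downloads remote files to local storage before the next step runs; the URL below is a placeholder):

from txtai.pipeline import Transcription
from txtai.workflow import RetrieveTask, Task, Workflow

# Sketch: download remote audio to a local file, then transcribe it.
# RetrieveTask mirrors the "task: retrieve" option in YAML workflows.
workflow = Workflow([RetrieveTask(), Task(Transcription())])

for text in workflow(["https://example.com/audio.wav"]):
    print(text)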

@semack (Author) commented Oct 10, 2023

Hi David,

Thank you for pointing me in the right direction. The retrieve task helped me and transcription works well.
I'm now having another problem with a workflow while trying to get tts to work in a Docker container.

docker-compose file

version: '3.4'
services:
  txtai-api:
    build:
      context: .
      dockerfile: txtai-api.Dockerfile
    ports:
      - 8000:8000
    volumes:
      - ./app.yml:/app/app.yaml:ro
      - ./.cache:/models
    environment:
      - CONFIG=/app/app.yaml
      - TRANSFORMERS_CACHE=/models
    # command: python -c "import tensorflow as tf;tf.test.gpu_device_name()"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

txtai-api.Dockerfile

# Set base image
ARG BASE_IMAGE=neuml/txtai-gpu:latest
FROM $BASE_IMAGE

# Start server and listen on all interfaces
ENTRYPOINT ["uvicorn", "--host", "0.0.0.0", "txtai.api:app"]

app.yml

# Index file path
path: /tmp/index

# Allow indexing of documents
writable: True

# Embeddings index
embeddings:
  path: sentence-transformers/nli-mpnet-base-v2

# Extractive QA
extractor:
  path: distilbert-base-cased-distilled-squad

# Zero-shot labeling
labels:

# Similarity
similarity:

# Text segmentation
segmentation:
  sentences: true

# Text summarization
summary:

# Text extraction
textractor:
  join: true
  lines: false
  minlength: 100
  paragraphs: true
  sentences: false

# Transcribe audio to text
transcription:

# Text To Speech
texttospeech:

# Translate text between languages
translation:

# Workflow definitions
workflow:
  sumfrench:
    tasks:
      - action: textractor
        task: url
      - action: summary
      - action: translation
        args: ["fr"]
  sumspanish:
    tasks:
      - action: textractor
        task: url
      - action: summary
      - action: translation
        args: ["es"]
  tts:
    tasks:
      - action: texttospeech
  stt:
    tasks:
      - task: retrieve
      - action: transcription

Here is my call in C# (sorry, not Python), included to show the calling context.

        public async Task<TextToSpeechResponse> Handle(TextToSpeechCommand request, CancellationToken cancellationToken)
        {
            var wf = new Workflow(_settings.BaseUrl);

            var elements = new List<string> { request.Text };

            // Run the "tts" workflow on the remote txtai API
            var data = await wf.WorkflowActionAsync("tts", elements);

            var result = new TextToSpeechResponse
            {
                Binary = (byte[])data.FirstOrDefault()
            };

            return result;
        }
Logs from the container

root@debian-AI:/opt/docker/txtai# docker compose up
[+] Running 2/1
✔ Network txtai_default Created 0.1s
✔ Container txtai-txtai-api-1 Created 0.0s
Attaching to txtai-txtai-api-1
txtai-txtai-api-1 | [nltk_data] Downloading package averaged_perceptron_tagger to
txtai-txtai-api-1 | [nltk_data] /root/nltk_data...
txtai-txtai-api-1 | [nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
txtai-txtai-api-1 | [nltk_data] Downloading package cmudict to /root/nltk_data...
txtai-txtai-api-1 | [nltk_data] Unzipping corpora/cmudict.zip.
txtai-txtai-api-1 | INFO: Started server process [1]
txtai-txtai-api-1 | INFO: Waiting for application startup.
txtai-txtai-api-1 | No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
txtai-txtai-api-1 | Using a pipeline without specifying a model name and revision in production is not recommended.
txtai-txtai-api-1 | No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
txtai-txtai-api-1 | Using a pipeline without specifying a model name and revision in production is not recommended.
Downloading (…)lve/main/config.yaml: 100%|██████████| 1.10k/1.10k [00:00<00:00, 540kB/s]
Downloading model.onnx: 100%|██████████| 133M/133M [00:02<00:00, 48.3MB/s]
txtai-txtai-api-1 | No model was supplied, defaulted to facebook/wav2vec2-base-960h and revision 55bb623 (https://huggingface.co/facebook/wav2vec2-base-960h).
txtai-txtai-api-1 | Using a pipeline without specifying a model name and revision in production is not recommended.
txtai-txtai-api-1 | Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
txtai-txtai-api-1 | You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
txtai-txtai-api-1 | INFO: Application startup complete.
txtai-txtai-api-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
txtai-txtai-api-1 | INFO: 10.20.255.4:54510 - "POST /workflow HTTP/1.1" 500 Internal Server Error
txtai-txtai-api-1 | ERROR: Exception in ASGI application
txtai-txtai-api-1 | Traceback (most recent call last):
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 230, in jsonable_encoder
txtai-txtai-api-1 | data = dict(obj)
txtai-txtai-api-1 | TypeError: cannot convert dictionary update sequence element #0 to a sequence
txtai-txtai-api-1 |
txtai-txtai-api-1 | During handling of the above exception, another exception occurred:
txtai-txtai-api-1 |
txtai-txtai-api-1 | Traceback (most recent call last):
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 235, in jsonable_encoder
txtai-txtai-api-1 | data = vars(obj)
txtai-txtai-api-1 | TypeError: vars() argument must have __dict__ attribute
txtai-txtai-api-1 |
txtai-txtai-api-1 | The above exception was the direct cause of the following exception:
txtai-txtai-api-1 |
txtai-txtai-api-1 | Traceback (most recent call last):
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
txtai-txtai-api-1 | result = await app( # type: ignore[func-returns-value]
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
txtai-txtai-api-1 | return await self.app(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/applications.py", line 292, in __call__
txtai-txtai-api-1 | await super().__call__(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/applications.py", line 122, in __call__
txtai-txtai-api-1 | await self.middleware_stack(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 184, in __call__
txtai-txtai-api-1 | raise exc
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 162, in __call__
txtai-txtai-api-1 | await self.app(scope, receive, _send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
txtai-txtai-api-1 | raise exc
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
txtai-txtai-api-1 | await self.app(scope, receive, sender)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
txtai-txtai-api-1 | raise e
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
txtai-txtai-api-1 | await self.app(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 718, in __call__
txtai-txtai-api-1 | await route.handle(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 276, in handle
txtai-txtai-api-1 | await self.app(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 66, in app
txtai-txtai-api-1 | response = await func(request)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/routing.py", line 291, in app
txtai-txtai-api-1 | content = await serialize_response(
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/routing.py", line 179, in serialize_response
txtai-txtai-api-1 | return jsonable_encoder(response_content)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 209, in jsonable_encoder
txtai-txtai-api-1 | jsonable_encoder(
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 238, in jsonable_encoder
txtai-txtai-api-1 | raise ValueError(errors) from e
txtai-txtai-api-1 | ValueError: [TypeError('cannot convert dictionary update sequence element #0 to a sequence'), TypeError('vars() argument must have __dict__ attribute')]

Could you please help me figure out the problem? I feel that something is missing.

Thank you in advance,
Andriy

@semack (Author) commented Oct 11, 2023

When I use curl, I get the same error.

curl -X POST "http://localhost:8000/workflow" -H "Content-Type: application/json" -d '{"name":"tts", "elements":["Say something here"]}'

I figured out that the problem is in building the response on the server when using tts.

@davidmezzetti (Member)

I'll have to look at this more closely, but it seems like it might be an issue with returning binary data as JSON.

@semack (Author) commented Oct 11, 2023

Yes, I have the same suspicion.

@davidmezzetti (Member)

Well, instead of binary, I should say NumPy arrays, which is what is returned.

You can add your own custom pipeline that converts the waveforms to Python floats, which are JSON serializable.

class Converter:
    # Convert NumPy waveform arrays to lists of Python floats
    def __call__(self, inputs):
        return [x.tolist() for x in inputs]
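
Then chain it after the TTS step. A sketch using the Python workflow API (the wiring here is illustrative):

from txtai.pipeline import TextToSpeech
from txtai.workflow import Task, Workflow

# Sketch: run TTS, then convert NumPy waveforms to JSON-serializable lists
workflow = Workflow([Task(TextToSpeech()), Task(Converter())])

results = list(workflow(["Say something here"]))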

Or perhaps something that even writes the audio to a WAV file and then base64 encodes that data, like what's in this notebook - https://github.com/neuml/txtai/blob/master/examples/40_Text_to_Speech_Generation.ipynb
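
Something along these lines (a sketch; the soundfile dependency and the 22050 Hz sample rate are assumptions to verify against the model):

import base64
import io

import soundfile

class WavEncoder:
    # Hypothetical pipeline: wrap raw waveforms in WAV containers and
    # base64 encode them so they can be returned as JSON strings
    def __init__(self, rate=22050):
        # Sample rate is model-dependent; 22050 Hz is an assumption
        self.rate = rate

    def __call__(self, inputs):
        outputs = []
        for waveform in inputs:
            buffer = io.BytesIO()
            soundfile.write(buffer, waveform, self.rate, format="WAV")
            outputs.append(base64.b64encode(buffer.getvalue()).decode("utf-8"))

        return outputs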

Ultimately, I think options to write WAV output and base64 encode it would be good additions to the TTS pipeline.

@semack (Author) commented Oct 12, 2023

> Ultimately, I think options to write WAV output and base64 encode it would be good additions to the TTS pipeline.

This could be the best solution, IMHO. Also, I guess it could be a Task.
Thanks.

@davidmezzetti davidmezzetti modified the milestones: v6.2.0, v6.3.0 Nov 8, 2023
@davidmezzetti davidmezzetti modified the milestones: v6.3.0, v6.4.0 Jan 2, 2024
@davidmezzetti davidmezzetti modified the milestones: v7.0.0, v7.1.0 Feb 20, 2024
@davidmezzetti davidmezzetti modified the milestones: v7.1.0, v7.2.0 Apr 19, 2024