
Add missing pipelines to API #552

Open
semack opened this issue Sep 11, 2023 · 10 comments

@semack commented Sep 11, 2023

Hi guys,

First of all, thank you for the amazing job you do.

I didn't find an API for Text-To-Speech. I think a workflow can be used for this, but are there any plans to implement it in the API?

Kind regards,
/Andriy

@davidmezzetti (Member)

Thank you for the issue.

The plan moving forward was to run pipelines through workflows rather than calling them directly when using the API.
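
For example, a workflow defined in the server config can be called through the API's /workflow endpoint. A minimal sketch in Python (the "summary" workflow name is an assumption, it must exist in the config):

import requests

# Run a pipeline through a workflow via the API instead of a direct
# pipeline endpoint. Assumes the server config defines a "summary" workflow.
response = requests.post(
    "http://localhost:8000/workflow",
    json={"name": "summary", "elements": ["Text to summarize goes here"]}
)

print(response.json())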

@davidmezzetti davidmezzetti changed the title No TTS on http API Add missing pipelines to API Sep 25, 2023
@davidmezzetti
Copy link
Member

Upon further review, there are only a few pipelines that aren't in the API, and it makes sense to have the routers. I've been pushing things more toward workflows, but it doesn't hurt to have pipelines as well, especially in the case of an LLM pipeline.

@davidmezzetti davidmezzetti self-assigned this Sep 25, 2023
@davidmezzetti davidmezzetti added this to the v6.2.0 milestone Sep 25, 2023
@semack (Author) commented Sep 26, 2023

Another thing I've faced: in my setup, txtai is hosted in a separate remote environment with a powerful GPU, and my custom software needs to use it remotely through the API. Some pipelines, like Textractor and Transcription, need a file name as an argument. Textractor works well with remote sources, but Transcription doesn't. Could it be fixed?

@davidmezzetti (Member)

The pipelines are focused on a single task by design. That's where workflows come in. There are workflow steps for reading from URLs and cloud object storage.
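
For example, a minimal sketch using the Python workflow API (RetrieveTask downloads remote files to local storage before the next step runs; the URL below is a placeholder):

from txtai.pipeline import Transcription
from txtai.workflow import RetrieveTask, Task, Workflow

# Sketch: download remote audio to a local file, then transcribe it.
# RetrieveTask mirrors the "task: retrieve" option in YAML workflows.
workflow = Workflow([RetrieveTask(), Task(Transcription())])

for text in workflow(["https://example.com/audio.wav"]):
    print(text)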

@semack (Author) commented Oct 10, 2023

Hi David,

Thank you for pointing me in the right direction. The retrieve task helped me and transcription works well.
I'm now having another problem with a workflow while trying to get tts to work in a Docker container.

docker-compose file

version: '3.4'
services:
  txtai-api:
    build:
      context: .
      dockerfile: txtai-api.Dockerfile
    ports:
      - 8000:8000
    volumes:
      - ./app.yml:/app/app.yaml:ro
      - ./.cache:/models
    environment:
      - CONFIG=/app/app.yaml
      - TRANSFORMERS_CACHE=/models
    # command: python -c "import tensorflow as tf;tf.test.gpu_device_name()"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

txtai-api.Dockerfile

# Set base image
ARG BASE_IMAGE=neuml/txtai-gpu:latest
FROM $BASE_IMAGE

# Start server and listen on all interfaces
ENTRYPOINT ["uvicorn", "--host", "0.0.0.0", "txtai.api:app"]

app.yml

# Index file path
path: /tmp/index

# Allow indexing of documents
writable: True

# Embeddings index
embeddings:
  path: sentence-transformers/nli-mpnet-base-v2

# Extractive QA
extractor:
  path: distilbert-base-cased-distilled-squad

# Zero-shot labeling
labels:

# Similarity
similarity:

# Text segmentation
segmentation:
  sentences: true

# Text summarization
summary:

# Text extraction
textractor:
  join: true
  lines: false
  minlength: 100
  paragraphs: true
  sentences: false

# Transcribe audio to text
transcription:

# Text To Speech
texttospeech:

# Translate text between languages
translation:

# Workflow definitions
workflow:
  sumfrench:
    tasks:
      - action: textractor
        task: url
      - action: summary
      - action: translation
        args: ["fr"]
  sumspanish:
    tasks:
      - action: textractor
        task: url
      - action: summary
      - action: translation
        args: ["es"]
  tts:
    tasks:
      - action: texttospeech
  stt:
    tasks:
      - task: retrieve
      - action: transcription

Here is my call in C# (sorry, not Python), included to show the calling context.

        public async Task<TextToSpeechResponse> Handle(TextToSpeechCommand request, CancellationToken cancellationToken)
        {
            var wf = new Workflow(_settings.BaseUrl);

            var elements = new List<string> { request.Text };

            // Run the "tts" workflow on the remote txtai API
            var data = await wf.WorkflowActionAsync("tts", elements);

            var result = new TextToSpeechResponse
            {
                Binary = (byte[])data.FirstOrDefault()
            };

            return result;
        }
Logs from the container

root@debian-AI:/opt/docker/txtai# docker compose up
[+] Running 2/1
✔ Network txtai_default Created 0.1s
✔ Container txtai-txtai-api-1 Created 0.0s
Attaching to txtai-txtai-api-1
txtai-txtai-api-1 | [nltk_data] Downloading package averaged_perceptron_tagger to
txtai-txtai-api-1 | [nltk_data] /root/nltk_data...
txtai-txtai-api-1 | [nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
txtai-txtai-api-1 | [nltk_data] Downloading package cmudict to /root/nltk_data...
txtai-txtai-api-1 | [nltk_data] Unzipping corpora/cmudict.zip.
txtai-txtai-api-1 | INFO: Started server process [1]
txtai-txtai-api-1 | INFO: Waiting for application startup.
txtai-txtai-api-1 | No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
txtai-txtai-api-1 | Using a pipeline without specifying a model name and revision in production is not recommended.
txtai-txtai-api-1 | No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
txtai-txtai-api-1 | Using a pipeline without specifying a model name and revision in production is not recommended.
Downloading (…)lve/main/config.yaml: 100%|██████████| 1.10k/1.10k [00:00<00:00, 540kB/s]
Downloading model.onnx: 100%|██████████| 133M/133M [00:02<00:00, 48.3MB/s]
txtai-txtai-api-1 | No model was supplied, defaulted to facebook/wav2vec2-base-960h and revision 55bb623 (https://huggingface.co/facebook/wav2vec2-base-960h).
txtai-txtai-api-1 | Using a pipeline without specifying a model name and revision in production is not recommended.
txtai-txtai-api-1 | Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
txtai-txtai-api-1 | You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
txtai-txtai-api-1 | INFO: Application startup complete.
txtai-txtai-api-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
txtai-txtai-api-1 | INFO: 10.20.255.4:54510 - "POST /workflow HTTP/1.1" 500 Internal Server Error
txtai-txtai-api-1 | ERROR: Exception in ASGI application
txtai-txtai-api-1 | Traceback (most recent call last):
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 230, in jsonable_encoder
txtai-txtai-api-1 | data = dict(obj)
txtai-txtai-api-1 | TypeError: cannot convert dictionary update sequence element #0 to a sequence
txtai-txtai-api-1 |
txtai-txtai-api-1 | During handling of the above exception, another exception occurred:
txtai-txtai-api-1 |
txtai-txtai-api-1 | Traceback (most recent call last):
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 235, in jsonable_encoder
txtai-txtai-api-1 | data = vars(obj)
txtai-txtai-api-1 | TypeError: vars() argument must have __dict__ attribute
txtai-txtai-api-1 |
txtai-txtai-api-1 | The above exception was the direct cause of the following exception:
txtai-txtai-api-1 |
txtai-txtai-api-1 | Traceback (most recent call last):
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
txtai-txtai-api-1 | result = await app( # type: ignore[func-returns-value]
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
txtai-txtai-api-1 | return await self.app(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/applications.py", line 292, in __call__
txtai-txtai-api-1 | await super().__call__(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/applications.py", line 122, in __call__
txtai-txtai-api-1 | await self.middleware_stack(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 184, in __call__
txtai-txtai-api-1 | raise exc
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/errors.py", line 162, in __call__
txtai-txtai-api-1 | await self.app(scope, receive, _send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
txtai-txtai-api-1 | raise exc
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
txtai-txtai-api-1 | await self.app(scope, receive, sender)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
txtai-txtai-api-1 | raise e
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
txtai-txtai-api-1 | await self.app(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 718, in __call__
txtai-txtai-api-1 | await route.handle(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 276, in handle
txtai-txtai-api-1 | await self.app(scope, receive, send)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 66, in app
txtai-txtai-api-1 | response = await func(request)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/routing.py", line 291, in app
txtai-txtai-api-1 | content = await serialize_response(
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/routing.py", line 179, in serialize_response
txtai-txtai-api-1 | return jsonable_encoder(response_content)
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 209, in jsonable_encoder
txtai-txtai-api-1 | jsonable_encoder(
txtai-txtai-api-1 | File "/usr/local/lib/python3.8/dist-packages/fastapi/encoders.py", line 238, in jsonable_encoder
txtai-txtai-api-1 | raise ValueError(errors) from e
txtai-txtai-api-1 | ValueError: [TypeError('cannot convert dictionary update sequence element #0 to a sequence'), TypeError('vars() argument must have __dict__ attribute')]

Could you please help me figure out the problem? I feel that something is missing.

Thank you in advance,
Andriy

@semack (Author) commented Oct 11, 2023

When I use curl, I get the same error.

curl -X POST "http://localhost:8000/workflow" -H "Content-Type: application/json" -d '{"name":"tts", "elements":["Say something here"]}'

I figured out that the problem is in building the response on the server when using tts.

@davidmezzetti (Member)

I'll have to look at this more closely, but it seems like it might be an issue with returning binary data as JSON.

@semack (Author) commented Oct 11, 2023

Yes, I have the same suspicion.

@davidmezzetti (Member)

Well, instead of binary, I should say NumPy arrays, which is what is returned.

You can add your own custom pipeline that converts the waveforms to Python floats, which are JSON serializable.

class Converter:
    # Convert NumPy waveform arrays to lists of Python floats
    def __call__(self, inputs):
        return [x.tolist() for x in inputs]
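
Then chain it after the TTS step. A sketch using the Python workflow API (the wiring here is illustrative):

from txtai.pipeline import TextToSpeech
from txtai.workflow import Task, Workflow

# Sketch: run TTS, then convert NumPy waveforms to JSON-serializable lists
workflow = Workflow([Task(TextToSpeech()), Task(Converter())])

results = list(workflow(["Say something here"]))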

Or perhaps something that even writes the audio to a WAV file and then base64 encodes that data, like what's in this notebook - https://github.com/neuml/txtai/blob/master/examples/40_Text_to_Speech_Generation.ipynb
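
Something along these lines (a sketch; the soundfile dependency and the 22050 Hz sample rate are assumptions to verify against the model):

import base64
import io

import soundfile

class WavEncoder:
    # Hypothetical pipeline: wrap raw waveforms in WAV containers and
    # base64 encode them so they can be returned as JSON strings
    def __init__(self, rate=22050):
        # Sample rate is model-dependent; 22050 Hz is an assumption
        self.rate = rate

    def __call__(self, inputs):
        outputs = []
        for waveform in inputs:
            buffer = io.BytesIO()
            soundfile.write(buffer, waveform, self.rate, format="WAV")
            outputs.append(base64.b64encode(buffer.getvalue()).decode("utf-8"))

        return outputs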

Ultimately, I think options to write WAV output and base64 encode it would be good additions to the TTS pipeline.

@semack (Author) commented Oct 12, 2023

> Ultimately, I think options to write WAV output and base64 encode it would be good additions to the TTS pipeline.

This could be the best solution, IMHO. Also, I guess it could be a Task.
Thanks.

@davidmezzetti davidmezzetti modified the milestones: v6.2.0, v6.3.0 Nov 8, 2023
@davidmezzetti davidmezzetti modified the milestones: v6.3.0, v6.4.0 Jan 2, 2024
@davidmezzetti davidmezzetti modified the milestones: v7.0.0, v7.1.0 Feb 20, 2024
@davidmezzetti davidmezzetti modified the milestones: v7.1.0, v7.2.0 Apr 19, 2024