Add Async Client to be able to make asynchronous requests #88

StefanBogdan · 2021-12-15T09:55:15Z

No description provided.

reppertj · 2022-02-25T17:47:17Z

We'll need to do this very soon for our use case and are happy to contribute something back.

A lot of projects have found that maintaining both sync and async versions of clients or libraries in Python is tricky (Elasticsearch, httpx, Mongo/Motor, to name a few). Any thoughts on the preferred implementation?

From our point of view, the easiest thing to do is "fake" async by just running requests in a separate threadpool and awaiting them from the event loop, which is what Motor/Mongo does. This results in minimal code duplication and maintenance overhead with a small performance penalty. The major downside from our point of view is that without better support for variadic generics it's impossible to infer types for MyPy users via existing types in the sync version.
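
A minimal sketch of the thread-pool approach described above, under stated assumptions: `SyncClient` and its `search` method are hypothetical stand-ins for a blocking client, not the Weaviate API, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


class SyncClient:
    """Hypothetical blocking client; a stand-in, not the Weaviate API."""

    def search(self, text: str) -> dict:
        time.sleep(0.05)  # simulates a blocking HTTP round trip
        return {"query": text, "hits": []}


class ThreadedAsyncClient:
    """Async facade that runs each blocking call in a worker thread."""

    def __init__(self, sync_client: SyncClient) -> None:
        self._sync = sync_client

    async def search(self, text: str) -> dict:
        # asyncio.to_thread hands the call to the loop's default thread pool,
        # so the event loop stays free while the request is in flight.
        return await asyncio.to_thread(self._sync.search, text)


async def main() -> list:
    client = ThreadedAsyncClient(SyncClient())
    # The two calls overlap because each one blocks its own worker thread.
    return await asyncio.gather(client.search("a"), client.search("b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The facade adds almost no code, but every method signature has to be repeated by hand, which is where the typing concern mentioned above comes from.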

Elasticsearch-py uses code generation (unasync), but that can add a lot of flakiness/complexity and would also require a fairly large refactor.

The other solution is to accept some level of duplication while trying to minimize the number of places in the code & API where async vs sync actually makes a difference. This would require a larger refactor and possibly breaking API changes, but it is, for example, more or less the solution httpx ended up with. This works better when anticipating a major release or designing something from scratch.

Just thought I would get your thoughts on any preferences here before starting something!

bobvanluijt · 2022-03-08T22:38:03Z

cc: @michaverhagen and @StefanBogdan

StefanBogdan · 2022-03-09T13:06:50Z

Hi @reppertj, we are going to use the aiohttp library for the AsyncClient because its request methods are very similar to those of requests. The structure I have in mind is an abstract class for all the classes that make HTTP requests to Weaviate, with a sync and an async implementation of each. All the pre-processing is done either in the abstract class or in a function shared by both the sync and async classes, so the sync/async classes only have to send the request and handle the errors.
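
One way this layout could look, as a rough sketch only and not the actual v4 code: every class name, the `send` transport callables, and the `(status, body)` return shape are assumptions made for illustration. The fake transports exist only so the sketch runs without a server.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict, Tuple

Request = Dict[str, Any]
Response = Tuple[int, Any]  # (HTTP status, decoded body)


class BaseObjectGetter(ABC):
    """Holds the shared pre-processing; subclasses only do I/O and errors."""

    def build_request(self, class_name: str, uuid: str) -> Request:
        # Validation and request building live here once, for both flavours.
        if not class_name or not uuid:
            raise ValueError("class_name and uuid must be non-empty")
        return {"method": "GET", "path": f"/v1/objects/{class_name}/{uuid}"}

    @abstractmethod
    def get(self, class_name: str, uuid: str) -> Any:
        """Fetch one object; the async subclass returns an awaitable."""


class SyncObjectGetter(BaseObjectGetter):
    def __init__(self, send: Callable[[Request], Response]) -> None:
        self._send = send  # hypothetical blocking transport

    def get(self, class_name: str, uuid: str) -> Any:
        status, body = self._send(self.build_request(class_name, uuid))
        if status != 200:
            raise RuntimeError(f"request failed with status {status}")
        return body


class AsyncObjectGetter(BaseObjectGetter):
    def __init__(self, send: Callable[[Request], Awaitable[Response]]) -> None:
        self._send = send  # hypothetical awaitable transport

    async def get(self, class_name: str, uuid: str) -> Any:
        status, body = await self._send(self.build_request(class_name, uuid))
        if status != 200:
            raise RuntimeError(f"request failed with status {status}")
        return body


# Fake transports so the sketch runs without a server.
def fake_send(request: Request) -> Response:
    return 200, {"path": request["path"]}


async def fake_send_async(request: Request) -> Response:
    return 200, {"path": request["path"]}
```

With this split, `SyncObjectGetter(fake_send).get("Article", "123")` and `await AsyncObjectGetter(fake_send_async).get("Article", "123")` go through the same `build_request` code path.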

The current status of the version 4.0.0 is 80% done, but not tested.

creatorrr · 2023-03-05T09:49:13Z

@StefanBogdan @bobvanluijt any news/plans for this?

bobvanluijt · 2023-03-05T15:03:49Z

Cc @dirkkul

dirkkul · 2023-03-06T09:39:28Z

No concrete plans, sorry. This is something I'd like to explore at some point, but we need to see when we'll find the time.

creatorrr · 2023-03-07T03:17:01Z

Gotcha. It can be a big blocker for a lot of people (including us): the code currently blocks during the request, and when the cluster is struggling with too many requests it could lock up the handler processes. Please see if you can prioritize this at some point. :)
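
To illustrate the blocking problem described above, here is a small self-contained sketch. It uses `time.sleep` as a stand-in for a synchronous client call and `asyncio.sleep` as a stand-in for an awaited one; all function names are made up for the example. A background heartbeat task counts how often the event loop gets a chance to run while each handler is in progress.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple


async def heartbeat(ticks: List[float], interval: float = 0.01) -> None:
    """Record a timestamp every `interval` seconds while the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_handler() -> None:
    time.sleep(0.2)  # stand-in for a synchronous client call


async def cooperative_handler() -> None:
    await asyncio.sleep(0.2)  # stand-in for an awaited, non-blocking call


async def count_ticks(handler: Callable[[], Awaitable[None]]) -> int:
    """Return how many heartbeat ticks happen while `handler` runs."""
    ticks: List[float] = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    before = len(ticks)
    await handler()
    during = len(ticks) - before
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return during


async def main() -> Tuple[int, int]:
    blocked = await count_ticks(blocking_handler)
    free = await count_ticks(cooperative_handler)
    return blocked, free


if __name__ == "__main__":
    print(asyncio.run(main()))
```

While the blocking handler runs, the heartbeat never gets scheduled; every other request served by the same event loop would be stalled in the same way.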

kubre · 2023-03-15T12:40:24Z

Is there no other way around this?

netapy · 2023-05-02T13:53:08Z

Any news on this? My service is slowing down lots of requests because of this :/

kubre · 2023-05-02T15:34:49Z

@netapy Nope, I decided to simply write this part using the aiohttp and aiographql clients. It's not as clean-looking as it might have been with the lib itself, but wrapping all the connection and querying code inside a single class would help you a lot.

netapy · 2023-05-03T09:49:47Z

@kubre I'm sorry, could you show me some code?
I'm just trying out a basic example, but it doesn't seem to handle the queries concurrently.

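import asyncio
import json

import aiohttp

# `app` is the web application object, defined elsewhere.
@app.get("/search/{query_string}")
async def read_search(query_string: str):

    url = 'http://api.example.tech:8080/v1/graphql'
    headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer MYTOKEN',
    }

    # GraphQL hybrid search; "xxxxxx" is replaced with the user's query string.
    query = '''
        query {
            Get {
                Article(
                    hybrid: {
                        query: "xxxxxx"
                        alpha: 0.7
                    }) {
                    titre
                    pathTitle
                    texte
                    code
                }
            }
        }
    '''.replace("xxxxxx", query_string)

    # Artificial 3-second delay before the request is sent.
    await asyncio.sleep(3)

    # POST the query as a JSON body and return the decoded response.
    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, data=json.dumps({'query': query})) as response:
            return await response.json()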

kubre · 2023-05-03T10:05:04Z

@netapy Do you mean your entire thread is blocked until this request finishes? I can't see any issues with the above code. Just calling the API with aiohttp should keep it from blocking the main thread. In my case, even while the code below is executing, any other request coming into the server is processed without issues, including DB queries such as reading objects while others are being inserted.

  async with self.session.post(
      f"{self.url}/v1/batch/objects?consistency_level=ALL",
      json={"objects": batch},
  ) as response:
      await response.json()

barbu110 · 2023-09-10T20:08:36Z

Hi people, is there any update on this?

netapy · 2023-09-11T05:59:38Z

@barbu110 I don't think they have integrated async into the client yet. However, I managed to make it work using a classic HTTP call with aiohttp.

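import json

import aiohttp

# `url` and `headers` are defined elsewhere, as in the earlier example.
async def search(query_string: str):

    # `_additional` belongs inside the Article selection, next to the properties.
    query = '''
        {
            Get {
                Article(
                    limit: 20
                    hybrid: {
                        query: "xxxxxx"
                        alpha: 0.75
                        fusionType: relativeScoreFusion
                    }) {
                    texte
                    title
                    _additional {
                        score
                    }
                }
            }
        }
    '''.replace("xxxxxx", query_string)

    # POST the query as a JSON body and return the decoded response.
    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, data=json.dumps({'query': query})) as response:
            return await response.json()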

This doesn't block the main thread and works flawlessly.
I don't use the client in production anymore; I find it useful only for batch data upload.

I hope this helps.

andersfylling · 2023-09-18T07:22:32Z

I'm successfully using aiohttp as well, and we saw an improvement from around 40 rps to 130 rps. Of course, our service does more than just query Weaviate, but it really helps.

Some details to note:

- Create the aiohttp.ClientSession only once and reuse it for all future Weaviate queries.
- You can reuse the Weaviate Python client as a query builder.

# example
query = weaviate_client.query.get(class_name=class_name, properties=properties).with_limit(1).build()
async with aiohttp_client_session.post(url, headers=headers, data=json.dumps({"query": request})) as response:
    result = await response.json()

andersfylling · 2023-09-18T09:09:12Z

Any reason you don't commit to a pure async implementation, and then have a lightweight sync wrapper around it?
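
A rough sketch of what that could look like, with made-up names: `AsyncClient`, `SyncClient`, and `search` are hypothetical and only show the shape of the idea, with `asyncio.sleep` standing in for real async I/O.

```python
import asyncio
from typing import Optional


class AsyncClient:
    """Hypothetical async-first client; all real logic would live here."""

    async def search(self, text: str) -> dict:
        await asyncio.sleep(0.01)  # stand-in for an awaited HTTP request
        return {"query": text, "hits": []}


class SyncClient:
    """Thin blocking wrapper that delegates every call to the async client."""

    def __init__(self, async_client: Optional[AsyncClient] = None) -> None:
        self._async = async_client or AsyncClient()

    def search(self, text: str) -> dict:
        # asyncio.run starts a fresh event loop for this call. It raises
        # RuntimeError when called from code already running in a loop.
        return asyncio.run(self._async.search(text))


if __name__ == "__main__":
    print(SyncClient().search("hello"))
```

The main caveat with a wrapper like this is that it cannot be used from code that is already running inside an event loop, and starting a new loop per call adds overhead.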

plv · 2023-09-26T23:23:10Z

@dirkkul Would you accept design proposals/PRs? We have a patched private fork of the Weaviate Python client with async capability. This is quite dangerous and hacky so I would rather just work on upstreaming it :)

Also, for what it's worth, I think we can start with just an async client for reads. This is the use case most people seem to have in mind when they ask for async support.

kranthi419 · 2023-10-31T19:47:32Z

@andersfylling what is the url value here?

# example
query = weaviate_client.query.get(class_name=class_name, properties=properties).with_limit(1).build()
async with aiohttp_client_session.post(url, headers=headers, data=json.dumps({"query": request})) as response:
    result = await response.json()

mohit-sarvam · 2023-11-01T17:20:44Z

@kranthi419 the url looks like 'http://api.example.tech:8080/v1/graphql'.

TweedBeetle · 2023-11-13T22:05:16Z

Would love an update on this!

doomuch · 2023-12-03T08:06:49Z

@andersfylling so how do you use the query built with weaviate_client.query? Later on you pass json.dumps({"query": request}), not the query variable.
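
A possible reading of the snippet in question, as a sketch rather than the original author's code: the string returned by `.build()` is what goes into the request body under the `"query"` key. The helper name `post_graphql` is made up, `session` is assumed to be a long-lived `aiohttp.ClientSession` created once, and the fake session below exists only so the sketch runs without a server.

```python
import asyncio
import json
from typing import Any, Dict, Optional


async def post_graphql(session: Any, url: str, headers: Dict[str, str], query: str) -> Any:
    """POST a GraphQL query string and return the decoded JSON response.

    `session` is assumed to be an aiohttp.ClientSession (or any object with
    a compatible `post` method) that is created once and reused.
    """
    payload = json.dumps({"query": query})  # send the built query string
    async with session.post(url, headers=headers, data=payload) as response:
        return await response.json()


class FakeResponse:
    """Test double that echoes the request body back as JSON."""

    def __init__(self, sent: str) -> None:
        self._sent = sent

    async def __aenter__(self) -> "FakeResponse":
        return self

    async def __aexit__(self, *exc_info: Any) -> bool:
        return False

    async def json(self) -> Any:
        return {"echo": json.loads(self._sent)}


class FakeSession:
    """Test double exposing the one method the helper relies on."""

    def post(self, url: str, headers: Optional[Dict[str, str]] = None,
             data: Optional[str] = None) -> FakeResponse:
        return FakeResponse(data or "{}")
```

In real use, `query` would be the string produced by the query builder, and `url` would point at the `/v1/graphql` endpoint of the Weaviate instance.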

dirkkul · 2023-12-13T07:54:57Z

We are currently working on bringing v4 out of beta as fast as possible; async is the next item right afterwards.

TweedBeetle · 2024-02-26T09:26:57Z

@dirkkul Loving v4!
Any news on async functionality? :D 👉👈

dirkkul · 2024-02-27T09:21:52Z

It is on the roadmap for this quarter

netapy · 2024-04-17T21:26:11Z

Any news on this? :)

dirkkul · 2024-04-17T21:27:43Z

There is a draft PR, but it is not ready yet. We have some other important things to work on, but it is on its way.

StefanBogdan added the enhancement label Dec 15, 2021

bobvanluijt assigned StefanBogdan Mar 8, 2022

JoanFM mentioned this issue Oct 3, 2022

Investigate capacity to have async method for match and find specially for Storage Backends docarray/docarray#446

Closed

StefanBogdan removed their assignment Aug 17, 2023

dirkkul added the 24Q1 target quarter label Feb 28, 2024

tsmith023 linked a pull request Apr 18, 2024 that will close this issue

Introduce WeaviateAsyncClient as async alternative to WeaviateClient #1007

Draft
