Add Async Client to be able to make asynchronous requests #88

Open
StefanBogdan opened this issue Dec 15, 2021 · 26 comments · May be fixed by #1007
Labels
24Q1 target quarter enhancement New feature or request

Comments

@StefanBogdan
Member

No description provided.

@StefanBogdan StefanBogdan added the enhancement New feature or request label Dec 15, 2021
@reppertj

We'll need to do this very soon for our use case and are happy to contribute something back.

A lot of projects have found that maintaining both sync and async versions of clients or libraries in Python is tricky (Elasticsearch, httpx, Mongo/Motor, to name a few) -- any thoughts on the preferred implementation?

From our point of view, the easiest thing to do is "fake" async by just running requests in a separate threadpool and awaiting them from the event loop, which is what Motor/Mongo does. This results in minimal code duplication and maintenance overhead with a small performance penalty. The major downside from our point of view is that without better support for variadic generics it's impossible to infer types for MyPy users via existing types in the sync version.
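
A minimal sketch of the thread-pool approach described above, assuming Python 3.9+ for `asyncio.to_thread`. `SyncClient`, `AsyncClient` and `get_object` are hypothetical stand-ins for a blocking client and are not the actual weaviate-client API.

```python
import asyncio
import time


class SyncClient:
    """Hypothetical blocking client; stands in for any synchronous HTTP client."""

    def get_object(self, uuid: str) -> dict:
        time.sleep(0.1)  # simulates a blocking network round trip
        return {"id": uuid}


class AsyncClient:
    """Async facade that runs each blocking call in a worker thread."""

    def __init__(self, sync_client: SyncClient) -> None:
        self._sync = sync_client

    async def get_object(self, uuid: str) -> dict:
        # to_thread hands the call to the default thread pool, so the
        # event loop stays free while the request is in flight.
        return await asyncio.to_thread(self._sync.get_object, uuid)


async def fetch_many(uuids: list[str]) -> list[dict]:
    client = AsyncClient(SyncClient())
    # gather preserves the order of the inputs in its result list.
    return await asyncio.gather(*(client.get_object(u) for u in uuids))


if __name__ == "__main__":
    print(asyncio.run(fetch_many(["a", "b", "c"])))
```

Writing each wrapper method by hand, as here, keeps the type hints intact but reintroduces some of the duplication that a generic wrapper would avoid.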

Elasticsearch-py uses code generation (unasync), but that can add a lot of flakiness/complexity and would also require a fairly large refactor.

The other solution is to accept some level of duplication while trying to minimize the number of places in the code & API where async vs sync actually makes a difference. This would require a larger refactor and possibly breaking API changes, but it is, for example, more or less the solution httpx ended up with. This works better when anticipating a major release or designing something from scratch.

Just thought I would get your thoughts on any preferences here before starting something!

@bobvanluijt
Member

cc: @michaverhagen and @StefanBogdan

@StefanBogdan
Member Author

Hi @reppertj, we are going to use the aiohttp library for the AsyncClient because it has very similar request methods. The way I want to structure it is to have an abstract class for all the classes that make HTTP requests to Weaviate, and then implement sync and async versions, with all the pre-processing done either in the abstract class or by a function that is shared by both the sync and async classes. The sync/async classes will only have to handle the errors.

Version 4.0.0 is currently about 80% done, but not tested.
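
The structure described above could look roughly like the following sketch. All names here (`BaseObjects`, `SyncObjects`, `AsyncObjects`, the `transport` objects and their `get(path, params)` method) are hypothetical and only illustrate the idea; they are not taken from the client's code base.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseObjects(ABC):
    """Shared request building; subclasses only send the request."""

    def _build_get(self, uuid: str) -> tuple[str, dict]:
        # Validation and path construction live here once, for both variants.
        if not uuid:
            raise ValueError("uuid must not be empty")
        return f"/v1/objects/{uuid}", {}

    @abstractmethod
    def get(self, uuid: str):
        """Fetch one object; sync subclasses return it, async ones await it."""


class SyncObjects(BaseObjects):
    def __init__(self, transport) -> None:
        self._transport = transport  # hypothetical: blocking get(path, params)

    def get(self, uuid: str) -> dict:
        path, params = self._build_get(uuid)
        return self._transport.get(path, params)


class AsyncObjects(BaseObjects):
    def __init__(self, transport) -> None:
        self._transport = transport  # hypothetical: awaitable get(path, params)

    async def get(self, uuid: str) -> dict:
        path, params = self._build_get(uuid)
        return await self._transport.get(path, params)


class FakeSyncTransport:
    """Stand-in for a blocking HTTP layer; echoes the requested path."""

    def get(self, path: str, params: dict) -> dict:
        return {"path": path}


class FakeAsyncTransport:
    """Stand-in for an async HTTP layer; echoes the requested path."""

    async def get(self, path: str, params: dict) -> dict:
        return {"path": path}


if __name__ == "__main__":
    print(SyncObjects(FakeSyncTransport()).get("abc"))
    print(asyncio.run(AsyncObjects(FakeAsyncTransport()).get("abc")))
```

Only the last line of each `get` differs between the two variants, which is the duplication this design accepts.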

@creatorrr

@StefanBogdan @bobvanluijt any news/plans for this?

@bobvanluijt
Member

Cc @dirkkul

@dirkkul
Collaborator

dirkkul commented Mar 6, 2023

No concrete plans, sorry. This is something I'd like to explore at some point, but we need to see when we'll find the time

@creatorrr

Gotcha. It can be a big blocker for a lot of people (including us) because the code currently blocks during the request and when the cluster is struggling with too many requests it could lock up the handler processes. Please see if you can prioritize this at some point. :)

@kubre

kubre commented Mar 15, 2023

Is there no other way around this?

@netapy

netapy commented May 2, 2023

Any news on this ? My service is slowing down lots of requests because of this :/

@kubre

kubre commented May 2, 2023

@netapy Nope, I decided to simply write this part using the aiohttp and aiographql clients. It's not as clean as it might have been with the lib itself, but wrapping all the connection and querying parts inside a single class would help you a lot.

@netapy

netapy commented May 3, 2023

@netapy Nope, I decided to simply write this part using the aiohttp and aiographql clients. It's not as clean as it might have been with the lib itself, but wrapping all the connection and querying parts inside a single class would help you a lot.

@kubre I'm sorry could you show me some code ?
I'm just trying out a basic example but it doesn't seem to treat the queries concurrently.

@app.get("/search/{query_string}")
async def read_search(query_string: str):

    url = 'http://api.example.tech:8080/v1/graphql'
    headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer MYTOKEN'
    }

    query = '''
        query {
            Get {
                Article(
                    hybrid: {
                        query: "xxxxxx"
                        alpha: 0.7
                    }) {
                    titre
                    pathTitle
                    texte
                    code
                }
            }
        }
    '''.replace("xxxxxx", query_string)

    await asyncio.sleep(3)

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, data=json.dumps({'query': query})) as response:
            return await response.json()

@kubre

kubre commented May 3, 2023

@netapy Do you mean your entire thread is blocked until this request finishes? I can't see any issues with the above code. Just calling the API with aiohttp should stop it from blocking the main thread. In my case, while the code below is executing, any other request coming into the server is processed without any issues, even DB queries such as reading objects while others are being inserted:

  async with self.session.post(
      f"{self.url}/v1/batch/objects?consistency_level=ALL",
      json={"objects": batch},
  ) as response:
      await response.json()

@StefanBogdan StefanBogdan removed their assignment Aug 17, 2023
@barbu110

Hi people, is there any update on this?

@netapy

netapy commented Sep 11, 2023

@barbu110 Hi people, is there any update on this?

I don't think they have integrated async into the client yet. However, I managed to make it work using a classic HTTP call with aiohttp.

async def search(query_string: str):   

    query = '''
        {
            Get{
                Article(
                    limit: 20
                    hybrid: {
                        query: "xxxxxx"
                        alpha: 0.75
                        fusionType: relativeScoreFusion
                    }){
                        texte
                        title
                        _additional {
                            score
                        }
                }
            }
        }
    '''.replace("xxxxxx", query_string)

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, data=json.dumps({'query': query})) as response:
            return await response.json()

This doesn't block the main thread and works flawlessly.
I don't use the client in production anymore, I find it useful only for data batch upload.

I hope this helps.
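
One caveat with substituting the search text via `str.replace`: a double quote or backslash in the user's input would produce an invalid query. The sketch below quotes the text with `json.dumps` instead, on the assumption that JSON string escaping is also acceptable as a GraphQL string literal (the common escapes are the same). The function names are hypothetical; the `Article` class and its `texte`/`title` fields are taken from the snippet above.

```python
import json


def build_hybrid_query(query_string: str, alpha: float = 0.75, limit: int = 20) -> str:
    """Return a GraphQL document with the search text safely quoted."""
    # json.dumps adds the surrounding quotes and escapes quotes, backslashes
    # and newlines, which a plain str.replace would leave untouched.
    quoted = json.dumps(query_string, ensure_ascii=False)
    return (
        "{ Get { Article("
        f"limit: {limit} "
        f"hybrid: {{ query: {quoted} alpha: {alpha} }}"
        ") { texte title _additional { score } } } }"
    )


def build_payload(query_string: str) -> str:
    """JSON body for a POST to the GraphQL endpoint."""
    return json.dumps({"query": build_hybrid_query(query_string)})


if __name__ == "__main__":
    print(build_hybrid_query('say "hi"'))
```

The string returned by `build_payload` can be passed as `data=` to `session.post`, in place of the inline `json.dumps` call used above.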

@andersfylling

andersfylling commented Sep 18, 2023

I'm successfully using aiohttp as well, and we saw throughput improve from around 40 rps to 130 rps. Of course our service does more than just query Weaviate, but it really helps.

Some details to note:

  • create aiohttp.ClientSession only once for all future weaviate queries
  • you can re-use the weaviate python client as a query builder
# example
query = weaviate_client.query.get(class_name=class_name, properties=properties).with_limit(1).build()
async with aiohttp_client_session.post(url, headers=headers, data=json.dumps({"query": request})) as response:
    result = await response.json()
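
The first point, creating the session only once, can be sketched as a small wrapper that receives the session from outside. `GraphQLSession` and its `run` method are hypothetical names, and the URL in `main` is an assumed local endpoint. The session only needs an aiohttp-style `post()`, so aiohttp itself is imported inside `main` rather than at module level.

```python
import asyncio
from typing import Any, Optional


class GraphQLSession:
    """Thin async wrapper that reuses one HTTP session for every query."""

    def __init__(self, session: Any, url: str, headers: Optional[dict] = None) -> None:
        self._session = session  # e.g. an aiohttp.ClientSession created once at startup
        self._url = url
        self._headers = headers or {}

    async def run(self, query: str) -> dict:
        # json= lets the session serialize the body and set the content type.
        async with self._session.post(
            self._url, headers=self._headers, json={"query": query}
        ) as response:
            response.raise_for_status()
            return await response.json()


async def main() -> None:
    import aiohttp  # third-party; imported here so the class has no hard dependency

    # One session for the whole application, shared by concurrent queries.
    async with aiohttp.ClientSession() as session:
        gql = GraphQLSession(session, "http://localhost:8080/v1/graphql")
        results = await asyncio.gather(
            gql.run("{ Get { Article { title } } }"),
            gql.run("{ Get { Article { texte } } }"),
        )
        print(results)
```

The query string passed to `run` can be the output of the query builder's `.build()` call shown above.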

@andersfylling

Any reason you don't commit to a pure async implementation, and then have a lightweight sync wrapper around it?
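
For illustration, an async-first design with a thin sync wrapper might look like the sketch below. `AsyncSearch` and `SyncSearch` are hypothetical names, and the network call is replaced by a placeholder.

```python
import asyncio


class AsyncSearch:
    """Hypothetical async-first client; the real work would be an HTTP call."""

    async def search(self, text: str) -> list[str]:
        await asyncio.sleep(0)  # placeholder for awaiting a network response
        return [f"result for {text}"]


class SyncSearch:
    """Blocking facade that drives the async client to completion."""

    def __init__(self) -> None:
        self._inner = AsyncSearch()

    def search(self, text: str) -> list[str]:
        # asyncio.run starts a fresh event loop per call; it raises
        # RuntimeError if this thread is already running an event loop.
        return asyncio.run(self._inner.search(text))


if __name__ == "__main__":
    print(SyncSearch().search("cats"))
```

The restriction noted in the comment is the main cost of this design: the sync wrapper cannot be called from code that is already running inside an event loop.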

@plv

plv commented Sep 26, 2023

@dirkkul Would you accept design proposals/PRs? We have a patched private fork of the Weaviate Python client with async capability. This is quite dangerous and hacky so I would rather just work on upstreaming it :)

Also, for what it's worth I think we can start with just an async client for reads. This is the use case that most people seem to want when they talk about wanting async support

@kranthi419

@andersfylling what is the url value here?

# example
query = weaviate_client.query.get(class_name=class_name, properties=properties).with_limit(1).build()
async with aiohttp_client_session.post(url, headers=headers, data=json.dumps({"query": request})) as response:
    result = await response.json()

@mohit-sarvam

@kranthi419 the url looks like 'http://api.example.tech:8080/v1/graphql'.

@TweedBeetle

Would love an update on this!

@doomuch

doomuch commented Dec 3, 2023

@andersfylling so how do you use the query from weaviate_client.query? Later on you use json.dumps({"query": request}), not the query variable.

@dirkkul
Collaborator

dirkkul commented Dec 13, 2023

we are currently working on bringing v4 out of beta as fast as possible - async is the next item right afterwards

@TweedBeetle

@dirkkul Loving v4!
Any news on async functionality? :D 👉👈

@dirkkul
Collaborator

dirkkul commented Feb 27, 2024

It is on the roadmap for this quarter

@dirkkul dirkkul added the 24Q1 target quarter label Feb 28, 2024
@netapy

netapy commented Apr 17, 2024

Any news on this ? :)

@dirkkul
Collaborator

dirkkul commented Apr 17, 2024

There is a draft PR but it is not ready yet - we have some other important things to work on, but it is on its way
