Memory leak #1246
Comments
It's possible to reuse the same client for many requests; a single one for the lifetime of the process can work fine, at least that's what I've been doing in production so far (using Python 3.11). So you could move your client creation to the init, maybe? I don't know Tornado, though; I'm using Starlette. To be clear, creating and closing clients should not leak either, and bugs there have been fixed before. But just a note.
@antont May I kindly ask: in your service, is every call made with the same set of client initialization parameters? In my service, different request sources have their own api_version, api_key, and azure_endpoint, so I initialize a new client object for each request. I've also noticed the client.with_options() method, which can change these parameters dynamically. What I'm uncertain about is: if only a single client is used and called concurrently, could client.with_options() erroneously override the client's parameters?
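A minimal sketch of the suggested pattern — create the client once and hand the same instance to every request. The stub class and getter name are illustrative; in real code the stub would be `openai.AsyncAzureOpenAI` constructed with your credentials:

```python
# Sketch: one client for the whole process lifetime.
# StubClient stands in for AsyncAzureOpenAI (illustrative only).

class StubClient:
    instances = 0

    def __init__(self):
        StubClient.instances += 1


_client = None


def get_client():
    """Lazily create the client on first use, then always return it."""
    global _client
    if _client is None:
        _client = StubClient()
    return _client


# Every request handler calls get_client(); only one client is ever built.
for _ in range(100):
    get_client()
assert StubClient.instances == 1
```

With a web framework you would typically do this in the app's startup/lifespan hook instead of a lazy getter, but the effect is the same: no per-request client construction.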
Spot on, that is the case for us.
Yep, that's why that method exists. AFAIK it works correctly, so you can reuse the same client but use, e.g., different API keys for different requests. I haven't used it myself, though, only seen it mentioned in previous similar issues here. Some people are wary of it; at least one person here explained how he creates new client objects just to be sure. I would probably read the implementation to check that it seems clear and trustworthy, and then use it. I guess there are tests for it too, though strange bug cases can be hard to cover if one were to happen with it.
Can you share a repository that demonstrates a minimal reproduction? (The code you shared is a helpful starting point, but something we can download, run, and see the error would be very helpful.) I'd also +1 @antont's suggestion to reuse the client.
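To illustrate why per-request overrides need not clobber a shared client: `with_options` returns a new client with the overrides applied rather than mutating the one it is called on. Here is a toy model of that copy-on-override semantics (not the library's actual implementation — just the concept):

```python
from dataclasses import dataclass, replace


# Toy stand-in for a client whose with_options() returns a modified
# copy, leaving the shared base instance untouched. Because nothing
# is mutated in place, concurrent requests holding their own copies
# cannot override each other's parameters.
@dataclass(frozen=True)
class Client:
    api_key: str
    azure_endpoint: str

    def with_options(self, **overrides):
        return replace(self, **overrides)


base = Client(api_key="key-A", azure_endpoint="https://a.example")
per_request = base.with_options(api_key="key-B")

assert base.api_key == "key-A"         # shared client unchanged
assert per_request.api_key == "key-B"  # override lives only on the copy
```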
@a383615194 have you fixed this issue?
I reused the client and the issue is gone. The version I used is 1.23.6.
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
I am using the AsyncAzureOpenAI class to instantiate a client and making a streaming call to client.chat.completions.create. Even after calling close() on both the client and the response within a try-finally block, I am still seeing a memory leak that eventually crashes the server.
I tried the solution outlined in #1181, where the pydantic package was upgraded to 2.6.3, but this hasn't resolved my issue.
Using the gc library, I noticed that memory usage increases after each call to this service. Our service is used for centralized management of AzureOpenAI accounts, hence a client is instantiated for every incoming request. Given the concurrent nature of this service, I'm wondering whether client.with_options supports concurrent usage. Do you have any good solutions to address this memory leak?
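Memory growth like this can be quantified with the standard-library tracemalloc module, which also helps turn the observation into a minimal reproduction. `do_request` below is a placeholder for the code under suspicion (create a client, make the streamed call, close everything):

```python
import tracemalloc


def do_request():
    """Placeholder for the code under test (client creation + streamed call)."""
    pass


tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

# Repeat the suspect operation; a leak shows up as growth proportional
# to the iteration count rather than a flat line.
for _ in range(100):
    do_request()

current, _ = tracemalloc.get_traced_memory()
growth = current - baseline
print(f"net growth after 100 calls: {growth} bytes")
tracemalloc.stop()
```

With the no-op placeholder the growth is near zero; replacing it with the real per-request client creation and comparing runs against a shared-client variant would isolate where the retained memory comes from.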
To Reproduce
Several calls in a row, for example to embeddings, wrapped with async.
Code snippets
OS
CentOS
Python version
Python 3.8
Library version
openai v1.12.0