
Multiple Async calls to the api fail catastrophically #1195

Open
1 task done
cenedella opened this issue Feb 26, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@cenedella

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

For a resume-writing program with multiple levels of async calls, launching relatively small-scale async processing causes the API calls to fail catastrophically.

I attempted the OpenAIClient and httpx.AsyncClient solutions that were suggested in #769 and elsewhere.

When called synchronously, the code processes 50 resumes sequentially with no problem, with perhaps 3 or 4 'Timeout' failures in aggregate that are successfully retried using exponential backoff. The average completion time per document is 50 seconds, with a standard deviation of perhaps 10 seconds.

When the same 50 documents are run simultaneously using asyncio:
await asyncio.gather(*tasks)

Several hundred to several thousand timeout errors occur in aggregate, and most of the time the processing fails catastrophically: the OpenAI API returns None, which then cascades into failures throughout the system.

Average completion time rises to 240 seconds, with a standard deviation of perhaps 30 seconds.
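For context, the pattern described above (documents fanned out via asyncio.gather, with exponential backoff around each call) can be sketched as follows. This is a generic illustration, not code from the issue; `call_with_backoff`, `fake_call`, and the delay constants are hypothetical stand-ins for the real per-document API call:

```python
import asyncio
import random


async def call_with_backoff(coro_factory, max_retries=5, base_delay=0.1):
    """Retry an async call with exponential backoff plus jitter on timeout."""
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except asyncio.TimeoutError:
            # Sleep 0.1s, 0.2s, 0.4s, ... plus a little jitter, then retry.
            await asyncio.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.05))
    raise RuntimeError("exhausted retries")


async def main():
    async def fake_call():  # stand-in for one OpenAI request
        return "ok"

    # Fan out all 50 documents at once, as the issue describes.
    tasks = [call_with_backoff(fake_call) for _ in range(50)]
    results = await asyncio.gather(*tasks)
    print(results.count("ok"))


asyncio.run(main())
```

With a real API call in place of `fake_call`, every task starts immediately, so the concurrency spike lands on the API all at once rather than being spread over time.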

I've confirmed that unique clients are created for each document:
OpenAIClient object at 0x7f9a57762fb0
OpenAIClient object at 0x7f9a5764f430
OpenAIClient object at 0x7f9a57249870
...

Running with a clean new environment updated today:
python==3.10.13
openai==1.12.0
httpx==0.27.0

#769 seems to indicate that the problem was resolved in openai 1.3.8, but we haven't been able to fix it on our end.

To Reproduce

  1. Initiate 50 top-level tasks, each of which fires off approximately 100 tasks, each of which may fire 0-5 additional tasks and may reiterate
  2. Create an AsyncOpenAI Client for each of the 50 toplevel tasks
  3. Observe that OpenAI repeatedly returns thousands of timeout errors

Code snippets

import os

import httpx
import openai


class OpenAIClient:
    def __init__(self, account_info):
        # One AsyncOpenAI client per top-level task, each with its own httpx pool.
        # Note: max_keepalive_connections is set higher than max_connections here;
        # httpx expects keepalive connections to be at most max_connections.
        self.aclient = openai.AsyncOpenAI(
            api_key=os.environ.get("OPENAI_API_KEY"),
            http_client=httpx.AsyncClient(
                limits=httpx.Limits(
                    max_keepalive_connections=10000,
                    max_connections=1000,
                ),
                timeout=15,
            ),
        )
        self.account_info = account_info



Typical error message of the hundreds / thousands received:
API call exceeded the time limit: 
Recalling OpenAI API 1. Error: . iDelay: 0.074. Delay: 0.074
 (<utils.OpenAIClient object at 0x7efd3bd35690>,)

OS

Amazon Linux

Python version

3.10.13

Library version

openai 1.12.0

@cenedella cenedella added the bug Something isn't working label Feb 26, 2024
@tonybaloney

Spawning 5000-25000 requests concurrently would likely hit many of the rate-limiting caps like requests/sec and tokens/sec. Are you ramping up requests or just starting thousands all at once?
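One way to ramp rather than firing everything at once is to cap the number of in-flight requests with a semaphore. This is a generic asyncio sketch, not code from the issue; `bounded_gather` and `make_request` are hypothetical names:

```python
import asyncio


async def bounded_gather(coros, limit=10):
    """Run coroutines concurrently, but with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def runner(coro):
        async with sem:  # wait for a free slot before starting the call
            return await coro

    # gather preserves input order, so results line up with the inputs.
    return await asyncio.gather(*(runner(c) for c in coros))


async def main():
    async def make_request(i):  # stand-in for one OpenAI call
        await asyncio.sleep(0.01)
        return i

    results = await bounded_gather([make_request(i) for i in range(50)], limit=5)
    print(len(results))


asyncio.run(main())
```

Tuning `limit` against the account's RPM/TPM quota keeps the burst rate bounded even when thousands of tasks are queued.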

@cenedella
Author

Good call, and that's worth looking into. In PROD we run fewer than 1 request/second. Peak testing has been 16,000/hour, so we haven't come close to our rate limit of 10,000 RPM (see screenshot)
[screenshot: IMG_1137]

@tonybaloney

You can use with_raw_response and then extract the remaining requests and tokens for your quota. There are more details in the docs.

response = await openai_client.embeddings.with_raw_response.create(
    model="ada-text-002-etc",
    input="Ground control to Major Tom",
)
# Get rate limit information from the response headers
print(response.headers.get("x-ratelimit-remaining-requests"))
print(response.headers.get("x-ratelimit-remaining-tokens"))
embeddings = response.parse()  # get back the Embeddings object

If you're trying to simulate a production environment at load, I'd recommend ramping up requests or using something like locust. That's what we're using to load test OpenAI endpoints and models.

@rattrayalex
Collaborator

rattrayalex commented Feb 28, 2024

Can you share a repro script? Assuming this is not, in fact, just rate-limits?

(you might run with OPENAI_LOG=debug to see whether you're hitting 429's and retrying them)
