Describe the bug
environment: Python 3.10
usage:
openllm start NousResearch/llama-2-13b-chat-hf
llm = OpenLLMAPI(address="http://some_address:3000/")
llm.complete("What are some hazards crude oil stored in tank?")
error:
| aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
|
| The above exception was the direct cause of the following exception:
|
| Traceback (most recent call last):
| File "/home/ib.gaga/.local/lib/python3.10/site-packages/starlette/responses.py", line 261, in wrap
| await func()
| File "/home/ib.gaga/.local/lib/python3.10/site-packages/starlette/responses.py", line 250, in stream_response
| async for chunk in self.body_iterator:
| File "/home/ib.gaga/.local/lib/python3.10/site-packages/openllm/_service.py", line 28, in generate_stream_v1
| async for it in llm.generate_iterator(**llm_model_class(**input_dict).model_dump()):
| File "/home/ib.gaga/.local/lib/python3.10/site-packages/openllm/_llm.py", line 127, in generate_iterator
| raise RuntimeError(f'Exception caught during generation: {err}') from err
| RuntimeError: Exception caught during generation: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
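For what it's worth, `TransferEncodingError` means the HTTP response body was cut off before the declared transfer length was sent, i.e. the server-side stream died mid-generation. A small retry wrapper (a generic sketch of my own; `complete_with_retry` and its parameters are not part of the OpenLLM or llama-index API) can help tell a transient connection drop from a crash that reproduces on every call:

```python
import time

def complete_with_retry(call, attempts=3, delay=2.0):
    """Retry a flaky zero-argument callable a few times before giving up.

    Wrap the failing call, e.g.:
        complete_with_retry(lambda: llm.complete("..."))
    If every attempt fails with the same TransferEncodingError, the
    server-side generation is consistently crashing rather than the
    connection being flaky.
    """
    last_err = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as err:  # ideally narrow to aiohttp.ClientPayloadError
            last_err = err
            if attempt < attempts - 1:
                time.sleep(delay)  # back off briefly before retrying
    raise last_err
```

If the error persists across retries, the server logs from `openllm start` (not just the client traceback) would show the underlying generation failure.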
To reproduce
No response
Logs
No response
Environment
System information
bentoml: 1.1.11
python: 3.10.12
platform: Linux-6.5.0-1017-azure-x86_64-with-glibc2.35
uid_gid: 2206643:100
pip_packages:
transformers version: 4.39.3
System information (Optional)
No response