
[Bug]: OpenSearch ConnectionError(Timeout context manager should be used inside a task) #13358

Open
chrfthr opened this issue May 8, 2024 · 3 comments
Labels
bug Something isn't working P1

Comments


chrfthr commented May 8, 2024

Bug Description

After upgrading from 0.9.3, I get a connection error when querying my OpenSearch vector store. I'm not sure whether I should post this here or open an opensearch-py issue.

I have llama-index installed in the following conda environment with Python 3.10.0:

channels:

  • conda-forge

dependencies:

  • _libgcc_mutex=0.1=conda_forge
  • _openmp_mutex=4.5=2_gnu
  • bzip2=1.0.8=hd590300_5
  • ca-certificates=2024.2.2=hbcca054_0
  • ld_impl_linux-64=2.40=h55db66e_0
  • libffi=3.4.2=h7f98852_5
  • libgcc-ng=13.2.0=h77fa898_6
  • libgomp=13.2.0=h77fa898_6
  • libnsl=2.0.1=hd590300_0
  • libsqlite=3.45.3=h2797004_0
  • libuuid=2.38.1=h0b41bf4_0
  • libzlib=1.2.13=hd590300_5
  • ncurses=6.4.20240210=h59595ed_0
  • openssl=3.3.0=hd590300_0
  • pip=24.0=pyhd8ed1ab_0
  • python=3.10.0=h543edf9_3_cpython
  • readline=8.2=h8228510_1
  • setuptools=69.5.1=pyhd8ed1ab_0
  • sqlite=3.45.3=h2c6b66d_0
  • tk=8.6.13=noxft_h4845f30_101
  • wheel=0.43.0=pyhd8ed1ab_1
  • xz=5.2.6=h166bdaf_0
  • pip:
    • aiohttp==3.9.5
    • aiosignal==1.3.1
    • altair==5.3.0
    • annotated-types==0.6.0
    • antlr4-python3-runtime==4.9.3
    • anyio==4.3.0
    • async-timeout==4.0.3
    • attrs==23.2.0
    • backoff==2.2.1
    • bcrypt==4.1.3
    • beautifulsoup4==4.12.3
    • blinker==1.8.1
    • cachetools==5.3.3
    • certifi==2024.2.2
    • cffi==1.16.0
    • chardet==5.2.0
    • charset-normalizer==3.3.2
    • click==8.1.7
    • coloredlogs==15.0.1
    • contourpy==1.2.1
    • cryptography==42.0.6
    • cycler==0.12.1
    • dataclasses-json==0.6.5
    • deepdiff==7.0.1
    • deprecated==1.2.14
    • dirtyjson==1.0.8
    • distro==1.9.0
    • effdet==0.4.1
    • emoji==2.11.1
    • et-xmlfile==1.1.0
    • exceptiongroup==1.2.1
    • extra-streamlit-components==0.1.71
    • filelock==3.14.0
    • filetype==1.2.0
    • flatbuffers==24.3.25
    • fonttools==4.51.0
    • frozenlist==1.4.1
    • fsspec==2024.3.1
    • gitdb==4.0.11
    • gitpython==3.1.43
    • google-api-core==2.19.0
    • google-auth==2.29.0
    • google-cloud-vision==3.7.2
    • googleapis-common-protos==1.63.0
    • greenlet==3.0.3
    • grpcio==1.63.0
    • grpcio-status==1.62.2
    • h11==0.14.0
    • httpcore==1.0.5
    • httpx==0.27.0
    • huggingface-hub==0.23.0
    • humanfriendly==10.0
    • idna==3.7
    • iopath==0.1.10
    • jinja2==3.1.4
    • joblib==1.4.2
    • jsonpath-python==1.0.6
    • jsonschema==4.22.0
    • jsonschema-specifications==2023.12.1
    • kiwisolver==1.4.5
    • langdetect==1.0.9
    • layoutparser==0.3.4
    • llama-index==0.10.34
    • llama-index-agent-openai==0.2.3
    • llama-index-cli==0.1.12
    • llama-index-core==0.10.34
    • llama-index-embeddings-huggingface==0.2.0
    • llama-index-embeddings-openai==0.1.9
    • llama-index-indices-managed-llama-cloud==0.1.6
    • llama-index-legacy==0.9.48
    • llama-index-llms-openai==0.1.16
    • llama-index-multi-modal-llms-openai==0.1.5
    • llama-index-program-openai==0.1.6
    • llama-index-question-gen-openai==0.1.3
    • llama-index-readers-file==0.1.20
    • llama-index-readers-llama-parse==0.1.4
    • llama-index-retrievers-bm25==0.1.3
    • llama-index-vector-stores-opensearch==0.1.8
    • llama-parse==0.4.2
    • llamaindex-py-client==0.1.19
    • lxml==5.2.1
    • markdown==3.6
    • markdown-it-py==3.0.0
    • markupsafe==2.1.5
    • marshmallow==3.21.2
    • matplotlib==3.8.4
    • mdurl==0.1.2
    • minijinja==2.0.1
    • mpmath==1.3.0
    • msg-parser==1.2.0
    • multidict==6.0.5
    • mypy-extensions==1.0.0
    • nest-asyncio==1.6.0
    • networkx==3.3
    • nltk==3.8.1
    • numpy==1.26.4
    • nvidia-cublas-cu12==12.1.3.1
    • nvidia-cuda-cupti-cu12==12.1.105
    • nvidia-cuda-nvrtc-cu12==12.1.105
    • nvidia-cuda-runtime-cu12==12.1.105
    • nvidia-cudnn-cu12==8.9.2.26
    • nvidia-cufft-cu12==11.0.2.54
    • nvidia-curand-cu12==10.3.2.106
    • nvidia-cusolver-cu12==11.4.5.107
    • nvidia-cusparse-cu12==12.1.0.106
    • nvidia-nccl-cu12==2.20.5
    • nvidia-nvjitlink-cu12==12.4.127
    • nvidia-nvtx-cu12==12.1.105
    • olefile==0.47
    • omegaconf==2.3.0
    • onnx==1.16.0
    • onnxruntime==1.17.3
    • openai==1.25.2
    • opencv-python==4.9.0.80
    • openpyxl==3.1.2
    • opensearch-py==2.5.0
    • ordered-set==4.1.0
    • packaging==24.0
    • pandas==2.2.2
    • pdf2image==1.17.0
    • pdfminer-six==20231228
    • pdfplumber==0.11.0
    • pikepdf==8.15.1
    • pillow==10.3.0
    • pillow-heif==0.16.0
    • portalocker==2.8.2
    • proto-plus==1.23.0
    • protobuf==4.25.3
    • pyarrow==16.0.0
    • pyasn1==0.6.0
    • pyasn1-modules==0.4.0
    • pycocotools==2.0.7
    • pycparser==2.22
    • pydantic==2.7.1
    • pydantic-core==2.18.2
    • pydeck==0.9.0
    • pygments==2.18.0
    • pyjwt==2.8.0
    • pypandoc==1.13
    • pyparsing==3.1.2
    • pypdf==4.2.0
    • pypdfium2==4.29.0
    • pytesseract==0.3.10
    • python-dateutil==2.9.0.post0
    • python-docx==1.1.2
    • python-iso639==2024.4.27
    • python-magic==0.4.27
    • python-multipart==0.0.9
    • python-pptx==0.6.23
    • pytz==2024.1
    • pyyaml==6.0.1
    • rank-bm25==0.2.2
    • rapidfuzz==3.9.0
    • referencing==0.35.1
    • regex==2024.4.28
    • requests==2.31.0
    • rich==13.7.1
    • rpds-py==0.18.0
    • rsa==4.9
    • safetensors==0.4.3
    • scikit-learn==1.4.2
    • scipy==1.13.0
    • sentence-transformers==2.7.0
    • six==1.16.0
    • smmap==5.0.1
    • sniffio==1.3.1
    • soupsieve==2.5
    • sqlalchemy==2.0.30
    • streamlit==1.34.0
    • streamlit-authenticator==0.3.2
    • streamlit-chat==0.1.1
    • streamlit-feedback==0.1.3
    • striprtf==0.0.26
    • sympy==1.12
    • tabulate==0.9.0
    • tenacity==8.2.3
    • threadpoolctl==3.5.0
    • tiktoken==0.6.0
    • timm==0.9.16
    • tokenizers==0.19.1
    • toml==0.10.2
    • toolz==0.12.1
    • torch==2.3.0
    • torchvision==0.18.0
    • tornado==6.4
    • tqdm==4.66.4
    • transformers==4.40.1
    • triton==2.3.0
    • typing-extensions==4.11.0
    • typing-inspect==0.9.0
    • tzdata==2024.1
    • unstructured==0.13.6
    • unstructured-client==0.22.0
    • unstructured-inference==0.7.29
    • unstructured-pytesseract==0.3.12
    • urllib3==1.26.18
    • watchdog==4.0.0
    • wrapt==1.16.0
    • xlrd==2.0.1
    • xlsxwriter==3.2.0
    • yarl==1.9.4

Version

0.10.34

Steps to Reproduce

Query a RetrieverQueryEngine built from a VectorIndexRetriever, a VectorStoreIndex, and an OpensearchVectorStore. I implemented a custom version of the VectorIndexRetriever, but that should not be relevant: the error also appears when using the library's retriever. The custom vector retriever is additionally wrapped in a custom retriever.
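Stripped of library code, the call path in the traceback below is a synchronous `query()` wrapper that drives an async `aquery()` to completion on an event loop. A minimal stdlib-only sketch of that shape (all names here are illustrative stand-ins, not the library's classes); under plain asyncio this pattern works, and it is the combination with Streamlit's script runner, nest_asyncio, and aiohttp's timeout helper that raises the error:

```python
import asyncio


class FakeVectorStore:
    """Illustrative stand-in for the vector store's sync/async split."""

    async def aquery(self, query: str) -> str:
        # The real aquery awaits self._os_client.search(...); aiohttp's
        # timeout helper then requires asyncio.current_task() to be set.
        await asyncio.sleep(0)
        return f"results for {query!r}"

    def query(self, query: str) -> str:
        # The library drives the coroutine to completion with
        # run_until_complete on the current loop; a fresh loop is used
        # here so the sketch runs standalone.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(self.aquery(query))
        finally:
            loop.close()


FakeVectorStore().query("test")  # → "results for 'test'"
```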

Relevant Logs/Tracebacks

File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 600, in _run_script
    exec(code, module.__dict__)
File "/home/christian/llmproject/src/frontend/web.py", line 151, in <module>
    response, context = generate_response(prompt, reasoning, keywords, toggle, file_name, options)
File "/home/christian/llmproject/src/frontend/web.py", line 21, in generate_response
    res, context = agent.query(index_name, prompt, reasoning, keywords, toggle, file_name, options)
File "/home/christian/llmproject/src/retrieval/Agent.py", line 174, in query
    response = query_engine.query(query_bundle)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/base/base_query_engine.py", line 53, in query
    query_result = self._query(str_or_query_bundle)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 189, in _query
    nodes = self.retrieve(query_bundle)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 144, in retrieve
    nodes = self._retriever.retrieve(query_bundle)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/base/base_retriever.py", line 244, in retrieve
    nodes = self._retrieve(query_bundle)
File "/home/christian/llmproject/src/retrieval/retrievers/CustomRetriever.py", line 62, in _retrieve
    vector_nodes = self._vector_retriever.retrieve(retrieval_bundle)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/base/base_retriever.py", line 244, in retrieve
    nodes = self._retrieve(query_bundle)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
File "/home/christian/llmproject/src/retrieval/retrievers/CustomVectorIndexRetriever.py", line 100, in _retrieve
    return self._get_nodes_with_embeddings(query_bundle)
File "/home/christian/llmproject/src/retrieval/retrievers/CustomVectorIndexRetriever.py", line 189, in _get_nodes_with_embeddings
    query_result = self._vector_store.query(query, **self._kwargs)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/vector_stores/opensearch/base.py", line 525, in query
    return asyncio.get_event_loop().run_until_complete(self.aquery(query, **kwargs))
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/vector_stores/opensearch/base.py", line 539, in aquery
    return await self._client.aquery(
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/llama_index/vector_stores/opensearch/base.py", line 399, in aquery
    res = await self._os_client.search(
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/opensearchpy/_async/client/__init__.py", line 1748, in search
    return await self.transport.perform_request(
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/opensearchpy/_async/transport.py", line 425, in perform_request
    raise e
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/opensearchpy/_async/transport.py", line 389, in perform_request
    status, headers_response, data = await connection.perform_request(
File "/home/christian/miniforge3/envs/streamlit/lib/python3.10/site-packages/opensearchpy/_async/http_aiohttp.py", line 329, in perform_request
    raise ConnectionError("N/A", str(e), e)
@chrfthr added the bug (Something isn't working) and triage (Issue needs to be triaged/prioritized) labels May 8, 2024

dosubot bot commented May 8, 2024

The error you're encountering, "ConnectionError(Timeout context manager should be used inside a task)," suggests that the asynchronous operation with OpenSearch is being executed outside of an async function or context. To resolve this, ensure your code interacts with OpenSearch within an async context. Here's a structured approach:

  1. Async Function: Wrap your OpenSearch query logic within an async function. This ensures that the asynchronous code is executed within the correct context.

  2. Await Async Operations: Use await for asynchronous calls, such as querying OpenSearch, to wait for their completion within the async function.

  3. Async Context Manager: If using a context manager (e.g., for managing connections), use async with to ensure it's handled asynchronously.

Here's an example to guide you:

import asyncio
from opensearchpy import AsyncOpenSearch

async def query_opensearch():
    opensearch_url = 'http://localhost:9200'  # Replace with your details
    client = AsyncOpenSearch(hosts=[opensearch_url], http_compress=True)
    try:
        response = await client.search(
            index="index_name", body={"query": {"match_all": {}}}
        )
        print(response)
    finally:
        # The async client must be closed explicitly to release
        # its aiohttp session.
        await client.close()

if __name__ == "__main__":
    asyncio.run(query_opensearch())

This code demonstrates the correct way to perform asynchronous operations with OpenSearch, ensuring they're executed within an async context to avoid the "ConnectionError" you're experiencing.

For further troubleshooting, especially if adapting the code doesn't resolve the issue, providing more context about your execution environment (e.g., script, Jupyter notebook) could help identify additional factors contributing to the problem.


logan-markewich (Collaborator) commented

Is this a timeout? Your traceback just shows some connection error.

@logan-markewich added the P1 label and removed the triage (Issue needs to be triaged/prioritized) label May 8, 2024

chrfthr commented May 10, 2024

Is this a timeout? Your traceback just shows some connection error.

Connection issues should not be a problem. Everything is run locally at this stage, and I check the health of the OpenSearch cluster before doing anything:

res = requests.get(endpoint + '/_cluster/health?wait_for_status=yellow&timeout=30s')

It looks more like some sort of async error in the library.
