-
Would adjusting these params suffice to cover your use case?
-
Very similar problems here. I can't upload a document from chat; I need to do it from Documents. When I upload 20 PDFs, everything appears to upload very quickly, but when I start chatting with the docs it only uses the first page of the first document and doesn't seem to know anything about the rest. How do I know that RAG has ingested my documents and that they have actually been read?
-
Any movement or further thoughts on this?
-
Bug Report
Description
Bug Summary:
The full content of a document is not passed to the model, because the document is chunked and only the chunks retrieved by RAG are returned.
I appreciate that this is suitable for summarisation across documents, and that the internal chunks should still be generated. However, it seems to be incorrect behavior (or the wrong context to provide) when the task potentially works on the whole content of a single document, where a similarity-based approach may not be appropriate.
Steps to Reproduce:
INFO:apps.rag.utils:query_doc:result {'ids': [['f83b3943-9854-4806-8112-3c1b1d276a99', 'f85bea6f-e3b1-4c27-835a-dea22f986d97', 'a79509b0-ea09-41bf-b155-ca18c495ba65', '4b0b10b9-ce41-4439-9742-532de27ca972', '47d39c80-14e1-40a3-9816-5423c88b98a9']], 'distances': [[1.431112589434609, 1.4432387351989746, 1.4436806440353394, 1.443746566772461, 1.4461387395858765]], 'metadatas': [[{'row': 102, 'source': '/app/backend/data/uploads/charging_sessions.csv', 'start_index': 0}, {'row': 42, 'source': '/app/backend/data/uploads/charging_sessions.csv', 'start_index': 0}, {'row': 2, 'source': '/app/backend/data/uploads/charging_sessions.csv', 'start_index': 0}, {'row': 32, 'source': '/app/backend/data/uploads/charging_sessions.csv', 'start_index': 0}, {'row': 1, 'source': '/app/backend/data/uploads/charging_sessions.csv', 'start_index': 0}]], 'embeddings': None, 'documents': [['Start: 2023-09-07 16:16:20\nEnd: 2023-09-07 18:53:43\nDuration: 02:37:23\nSite: Craigpark\nConnector Type: AC\nPayment Status: Paid\nConsum(kWh): 28.04\nCurrency: £\nAmount: 12.22', 'Start: 2024-02-11 18:34:42\nEnd: 2024-02-11 21:09:04\nDuration: 02:34:22\nSite: Craigpark\nConnector Type: AC\nPayment Status: Paid\nConsum(kWh): 28.28\nCurrency: £\nAmount: 12.31', 'Start: 2024-05-08 20:06:46\nEnd: 2024-05-08 22:06:23\nDuration: 01:59:37\nSite: Craigpark\nConnector Type: AC\nPayment Status: Paid\nConsum(kWh): 22.02\nCurrency: £\nAmount: 9.81', 'Start: 2024-03-09 11:14:00\nEnd: 2024-03-09 13:51:40\nDuration: 02:37:40\nSite: Craigpark\nConnector Type: AC\nPayment Status: Paid\nConsum(kWh): 29.32\nCurrency: £\nAmount: 12.73', 'Start: 2024-05-11 09:15:14\nEnd: 2024-05-11 11:05:00\nDuration: 01:49:46\nSite: Craigpark\nConnector Type: AC\nPayment Status: Paid\nConsum(kWh): 20.29\nCurrency: £\nAmount: 9.12']], 'uris': None, 'data': None}
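To make the log above easier to read, here is a minimal sketch that summarises a query result of that shape. The helper name `summarize_query_result` is hypothetical (it is not part of Open WebUI), and the dictionary is a trimmed copy of the logged values with the `documents` field left out.

```python
from typing import Any


def summarize_query_result(result: dict[str, Any]) -> dict[str, Any]:
    """Summarise a single-query result shaped like the logged query_doc output."""
    # The logged structure nests one list per query; only one query was issued.
    metadatas = result["metadatas"][0]
    distances = result["distances"][0]
    rows = sorted(m["row"] for m in metadatas)
    return {
        "chunks_returned": len(metadatas),
        "rows": rows,
        "max_row_seen": max(rows),
        "sources": sorted({m["source"] for m in metadatas}),
        "distance_spread": max(distances) - min(distances),
    }


SOURCE = "/app/backend/data/uploads/charging_sessions.csv"

# Trimmed copy of the values from the log line (documents omitted for brevity).
result = {
    "distances": [[
        1.431112589434609,
        1.4432387351989746,
        1.4436806440353394,
        1.443746566772461,
        1.4461387395858765,
    ]],
    "metadatas": [[
        {"row": 102, "source": SOURCE, "start_index": 0},
        {"row": 42, "source": SOURCE, "start_index": 0},
        {"row": 2, "source": SOURCE, "start_index": 0},
        {"row": 32, "source": SOURCE, "start_index": 0},
        {"row": 1, "source": SOURCE, "start_index": 0},
    ]],
}

summary = summarize_query_result(result)
print(summary["chunks_returned"], summary["rows"], summary["max_row_seen"])
```

Only five chunks (one CSV row each) reach the model, although a row index of 102 shows the file holds at least 102 rows. The distances are also nearly identical, which suggests the ranking is not separating relevant rows from irrelevant ones for this kind of query.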
Expected Behavior:
Given that only a single file is involved, the full contents of the file should be processed.
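One way the expected behavior could look, as a rough sketch only: use the whole document when it fits a size budget, and fall back to the retrieved chunks otherwise. This is not how Open WebUI works today; `build_context` and `max_chars` are hypothetical names, and a character count stands in for a real token budget.

```python
def build_context(
    all_chunks: list[str],
    retrieved_chunks: list[str],
    max_chars: int = 8000,
) -> str:
    """Return the full document if it fits the budget, else the retrieved chunks.

    max_chars is an assumed stand-in for the model's context budget.
    """
    full_text = "\n\n".join(all_chunks)
    if len(full_text) <= max_chars:
        # Small document: give the model every row, not just the top-k matches.
        return full_text
    # Large document: keep the similarity-based selection.
    return "\n\n".join(retrieved_chunks)


rows = [f"Site: Craigpark\nRow: {i}" for i in range(10)]
top_k = rows[:5]

small_budget_context = build_context(rows, top_k, max_chars=50)
large_budget_context = build_context(rows, top_k, max_chars=8000)
print(large_budget_context.count("Row:"), small_budget_context.count("Row:"))
```

With the larger budget all ten rows are included; with the smaller one only the five retrieved rows are, which matches the behavior reported here.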
Actual Behavior:
The LLM responds with a statement indicating fewer rows in the document than there really are.
Environment
Reproduction Details
Confirmation:
Logs and Screenshots
Browser Console Logs:
browser.log
Docker Container Logs:
docker.logs.txt
Screenshots (if applicable):
LLM response
Sample of CSV format > 5 rows identified in response.
Installation Method
Docker
Additional Information