feat: Added delete document functionality #464
base: main
Added delete functionality for all document types (Files, Texts, Q&A and Websites). Deleting a document removes it from the S3 upload bucket, the S3 processed bucket, the DynamoDB documents table and the OpenSearch index, and also updates the DynamoDB workspaces table.

Major code changes:
1. Added a delete button to each document row in the UI.
2. Added a confirmation dialog (Modal) so that the user can cancel or confirm the deletion.
3. Created an AWS Step Functions state machine for the delete document workflow. This keeps the whole process organised, and it is rolled back automatically if any operation in the step function fails.

Major components and how they work:
1. documents-tab.tsx contains the delete button and the handling of the confirmation Modal.
2. documents-client.ts has the deleteDocument function, which calls the backend API.
3. The delete_document function in lib/chatbot-api/functions/api-handler/routes/documents.py handles the API request.
4. deleteDocumentWorkflow is created in lib/rag-engines/workspaces/index.ts.
5. delete-document.ts defines the internal structure of the delete document workflow.
6. The Lambda function that handles the workflow is in lib/rag-engines/workspaces/functions/delete-document-workflow/delete/index.py.
7. The state machine execution is started in the delete_document function of lib/shared/layers/python-sdk/python/genai_core/documents.py.
8. The actual deletion of documents happens in the delete_open_search_document function of lib/shared/layers/python-sdk/python/genai_core/opensearch/delete.py.

Request flow: documents-client -> documents.py (api handler) -> documents.py (genai_core) -> index.py (delete-document-workflow) -> delete.py (genai_core/opensearch)

As part of this change, I also updated the version of opensearch-py. The update was originally needed because calling HTTP methods directly was not allowed in the earlier version, but calling HTTP methods later turned out not to be required. I kept the update for future use, as it has no impact.
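For readers following the request flow, here is a minimal sketch of what step 7 (starting the state machine execution) could look like. It is not the code from this PR: the function name start_delete_workflow, the state_machine_arn parameter and the input payload keys are assumptions made for illustration. The only real API used is the Step Functions start_execution call, which takes stateMachineArn and a JSON string as input and returns a response containing executionArn.

```python
import json


def start_delete_workflow(sfn_client, state_machine_arn, workspace_id, document_id):
    """Start one execution of the delete workflow and return its ARN.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    The payload keys below are hypothetical; the real workflow may expect
    a different input shape.
    """
    payload = {"workspace_id": workspace_id, "document_id": document_id}
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        # Step Functions expects the execution input as a JSON string.
        input=json.dumps(payload),
    )
    return response["executionArn"]
```

In a deployed environment the client would be created with boto3.client("stepfunctions"). Passing the client in as a parameter keeps the function easy to exercise with a stub in unit tests.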
@massi-ang @bigadsoleiman Could you please review this?
sql.SQL("DELETE FROM {table} WHERE document_id = %s").format(table=table_name),
(document_id,),
)
print(f"Deleted document {document_id} from {table_name}")
There is a missing cursor commit. Add cursor.connection.commit()
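To illustrate why the commit matters, here is a small runnable sketch. It uses the standard-library sqlite3 module rather than the PostgreSQL driver from the diff, so that it runs without a database server; the documents table and the helper names are made up for the example. The behaviour it demonstrates is the same in spirit: a DELETE issued inside an open transaction is discarded if the connection is closed before cursor.connection.commit() is called.

```python
import os
import sqlite3
import tempfile


def make_db():
    """Create a temporary database file holding one committed document row."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE documents (document_id TEXT PRIMARY KEY)")
    conn.execute("INSERT INTO documents VALUES ('doc-1')")
    conn.commit()
    conn.close()
    return path


def delete_document(db_path, document_id, commit):
    """Delete a row, committing only when commit is True."""
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    cursor.execute("DELETE FROM documents WHERE document_id = ?", (document_id,))
    if commit:
        # Without this call the pending transaction is rolled back on close.
        cursor.connection.commit()
    conn.close()


def count_documents(db_path):
    """Return the number of rows visible to a fresh connection."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
    finally:
        conn.close()
```

Calling delete_document with commit=False leaves the row in place for any new connection, while commit=True removes it.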
Thanks for pointing that out, @gbone-restore. I have added the commit statement.
@bigadsoleiman @massi-ang would you be able to have a look at this PR?
Issue #149:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.