Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add code for AstraDB #197

Open
wants to merge 26 commits into
base: dev
Choose a base branch
from
Open

Add code for AstraDB #197

wants to merge 26 commits into from

Conversation

Tylersuard
Copy link

Allows use of AstraDB, similar to other databases already included.

Test plan: Use the example code in mirascope/astra/vectorstore.py

Copy link
Collaborator

@brenkao brenkao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you have any questions or if there is any clarification needed.

Thanks for contributing.

application_token: str = "your token here"
collection_name: str = "your collection name here"

class AstraParams(BaseModel):
Copy link
Collaborator

@brenkao brenkao May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to extend BaseVectorStoreParams, this will make it easier for us to integrate with tools such as weave and logfire easier.

mirascope/astra/types.py Show resolved Hide resolved
mirascope/astra/types.py Show resolved Hide resolved
mirascope/astra/vectorstores.py Show resolved Hide resolved
mirascope/astra/vectorstores.py Outdated Show resolved Hide resolved
mirascope/astra/types.py Show resolved Hide resolved
limit=kwargs.get('limit', 10), # Example of additional parameter
fields=["text", "source"]
)
return AstraQueryResult.convert(results)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove convert and construct the pydantic model here.

return AstraQueryResult(documents=..., embeddings=...)

Each document is expected to have text, embeddings, and a source.
"""
if not text:
logging.error("No text provided for addition.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can raise ValueError here instead of logging, user can then try except as necessary and handle it themselves with logger.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but also the arguments requires text, so maybe unnecessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is not necessary since the typing requires text. If not provided, this will throw an error anyway that the user needs to resolve rather than log.

pyproject.toml Outdated
@@ -28,6 +28,7 @@ mistralai = { version = ">=0.1.6,<1.0.0", optional = true }
groq = { version = ">=0.4.2,<1.0.0", optional = true }
cohere = { version = "^5.2.5", optional = true }
pinecone-client = { version = "^3.2.2", optional = true }
astrapy = "^1.1.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add astra as an optional dependency

poetry add --optional astrapy

Then update [tool.poetry.extras] with

astrapy = ["astrapy"]

and update all to include astrapy

mirascope/astra/__init__.py Show resolved Hide resolved
Copy link
Contributor

@willbakst willbakst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is missing tests. The tests should be located in a tests/astra directory.

Please note that all code should be tested with 100% coverage. Ideally we also test additional flows beyond just coverage, but this is not required for a PR (and we will add additional tests should bugs be discovered in the future).

To check coverage, the easiest way is to run the following command from the root directory:

poetry run pytest tests --cov=./ --cov-report-html

This will produce a directory named htmlcov inside of which there will be an index.html file that is interactive and shows you all uncovered lines by file.

I recommend looking in the tests/ directory for a reference of how to write tests. For this PR, I would specifically look at the chroma tests and the pinecone tests.

If you have any questions, let us know! We're here to help.

Each document is expected to have text, embeddings, and a source.
"""
if not text:
logging.error("No text provided for addition.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is not necessary since the typing requires text. If not provided, this will throw an error anyway that the user needs to resolve rather than log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add docstrings to the module and all public classes and functions. They should follow the Google docstring style

"""A vector store for AstraDB.

Example:
from mirascope.openai import OpenAIEmbedder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please wrap the example in a python code block, e.g.

Example:

```python
{code here}
```

@@ -31,6 +31,7 @@ pinecone-client = { version = "^3.2.2", optional = true }
logfire = { version = ">=0.26.0,<1.0.0", optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please include astrapy with the correct version and optional = true like we do for other optional dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants