csv_semantic_search

Semantic search app using streamlit and txtai.

To use the app, upload a CSV file that you would like to search. The app builds an embeddings index over the rows of the CSV. Enter a plain-text query and get the best-matching rows back!

How it works

pandas reads the CSV file and each row is converted to a string, producing a one-dimensional array of strings. Each string in this array receives a high-dimensional embedding (a 1D vector of floats) using txtai embeddings. To save compute, the data is hashed and a pickle file is created so that the same data doesn't need to be reindexed. Lastly, the user enters a query to search the data, which uses approximate nearest neighbor search in the backend (implemented within the txtai search function).
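The hash-then-pickle caching step described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names, the SHA-256 digest, and the cache file naming are all assumptions, `build_index` is a hypothetical stand-in for whatever builds the txtai index, and the standard-library `csv` module is used in place of pandas only to keep the sketch dependency-free.

```python
import csv
import hashlib
import io
import pickle
from pathlib import Path


def rows_to_strings(csv_bytes: bytes) -> list[str]:
    """Turn each CSV data row into one string (the header row is skipped)."""
    reader = csv.reader(io.StringIO(csv_bytes.decode("utf-8")))
    next(reader, None)  # skip the header row
    return [" ".join(row) for row in reader]


def load_or_build(csv_bytes: bytes, build_index, cache_dir: Path):
    """Return the cached index for this exact file content, building it at most once.

    build_index is a hypothetical callable that takes the list of row strings
    and returns a picklable index object.
    """
    # Assumption: the cache key is a SHA-256 digest of the raw file bytes.
    digest = hashlib.sha256(csv_bytes).hexdigest()
    cache_file = cache_dir / f"{digest}.pkl"

    if cache_file.exists():
        # Same content was indexed before, so reuse the stored result.
        with cache_file.open("rb") as f:
            return pickle.load(f)

    index = build_index(rows_to_strings(csv_bytes))
    cache_dir.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(index, f)
    return index
```

Because the key depends only on the file content, uploading the same CSV twice skips the expensive indexing step, while any change to the data produces a new key and a fresh index. Pickle files should only be loaded if they were written by the app itself.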

Run locally

Run it yourself to save me some resources :)

Note: NVIDIA drivers must be configured properly for GPU access.

Just python:

$ git clone https://github.com/Trevato/csv_semantic_search.git
$ cd csv_semantic_search
$ pip install -r requirements.txt
$ streamlit run app.py

docker-compose

$ docker-compose up --build
