CourtBot

Description

CourtBot is a chatbot that can access all decisions of the Supreme Court of the United States. It answers questions using the information in these decisions and cites the specific decisions it relied on for each answer.

How to Run

Install Dependencies

pip install -r requirements.txt

Run

./run.sh
    [-s : scrape the Supreme Court cases]
    [-e : embed the court decisions]
    [-d : start the backend]
    [-c cases|db : delete all cases or the database]

Run ./run.sh -s to scrape the Supreme Court cases from FindLaw and preprocess them for embedding. The scrape time depends on the number of threads you allocate; with 50 threads it takes around 3 hours. The data is stored as .txt files in the SupremeCourtCases directory.
Run ./run.sh -e to embed the court cases for the Chroma database. This process runs on the GPU but could also be run on the CPU. On the GPU, embedding all 21 thousand cases takes about 14 hours. The Chroma database is saved in the .chroma directory.
Run ./run.sh -d to start the server. All queries to the vector database and the chatbot are made through this API.
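
The thread count in the scrape step trades wall-clock time against load on the machine and the site. A minimal sketch of that pattern with a thread pool is below. It is not the repository's scraper (which drives Selenium): `fetch_case`, `scrape_all`, and the case ids are hypothetical names used only for illustration, and the fetch is a stub that returns placeholder text.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_case(case_id: str) -> str:
    """Hypothetical stand-in for scraping one case page; returns placeholder text."""
    return f"text of case {case_id}"


def scrape_all(case_ids: list[str], threads: int = 50) -> dict[str, str]:
    """Fetch every case with a pool of worker threads and map id -> text."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so ids and texts line up.
        texts = list(pool.map(fetch_case, case_ids))
    return dict(zip(case_ids, texts))


if __name__ == "__main__":
    cases = scrape_all(["id-1", "id-2", "id-3"], threads=2)
    print(len(cases), "cases fetched")
```

Because page fetches are I/O-bound, adding threads helps until the network or the remote site becomes the bottleneck, which is why the run time above is quoted for a specific thread count.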

Sources

The data for the Supreme Court cases is scraped from FindLaw using Selenium and a Chromium web driver.
The embeddings are calculated using all-mpnet-base-v2, a fine-tuned version of Microsoft's Masked and Permuted Language Modeling (MPNet) model (arXiv).
The vector database is Chroma, which lets you search for vectors similar to a query vector. It is built on top of DuckDB and Apache Parquet.
The chatbot is built using Microsoft's Grounded Open Dialogue Language Model (GODEL) (arXiv). This model is trained on 551 million dialogs from Reddit and 5 million instruction and knowledge dialogs.
The API is built using FastAPI, served by a Uvicorn web server.
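
To illustrate how similar-vector search lets the bot find decisions to cite, here is a toy sketch in plain Python. It is not Chroma's API or the project's code: the three-dimensional vectors and the names `case_a`, `case_b`, `case_c`, `cosine_similarity`, and `top_k` are made up for the example, whereas the real pipeline stores embeddings produced by all-mpnet-base-v2 and queries them through Chroma.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: hypothetical case names mapped to made-up embedding vectors.
toy_index = {
    "case_a": [1.0, 0.0, 0.0],
    "case_b": [0.0, 1.0, 0.0],
    "case_c": [0.9, 0.1, 0.0],
}

if __name__ == "__main__":
    for name, score in top_k([1.0, 0.0, 0.0], toy_index):
        print(name, round(score, 3))
```

The names returned by the search are what would be passed on as the cited decisions; a real vector database replaces the linear scan here with an index suited to tens of thousands of documents.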
