The LLM Readme Creator

Introduction

The project aims to develop a conversational retrieval system using natural language processing techniques. It leverages the OpenAI GPT-3.5 Turbo model and various language processing libraries to create a chat-based interface for retrieving relevant information from a dataset.

Setup

To run the code successfully, please follow these steps:

1. Install the required dependencies by running pip install -r requirements.txt.
2. Set the necessary environment variables (an example is shown below):
   - OPENAI_API_KEY: the OpenAI API key required for accessing the language model.
   - ACTIVELOOP_TOKEN: the Activeloop token used for dataset management.
   - ACTIVELOOP_USERNAME: the Activeloop username used for dataset management.
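For illustration, the variables can also be set programmatically before the rest of the code runs; the values below are placeholders, not real credentials.

```python
import os

# Placeholder values for illustration only; replace them with your own credentials.
os.environ["OPENAI_API_KEY"] = "sk-..."            # OpenAI API key
os.environ["ACTIVELOOP_TOKEN"] = "<your-token>"    # Activeloop token
os.environ["ACTIVELOOP_USERNAME"] = "<your-user>"  # Activeloop username
```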

Code Structure

The code consists of the following main components:

1. Data Loading and Preprocessing

The load_folder function loads documents from a specified folder and applies text splitting to segment them into smaller chunks. The resulting chunks are then passed to the upload_data function, which adds them to the dataset.
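As a rough sketch, load_folder could be implemented with LangChain's document loaders and text splitters along these lines; the loader choice, chunk size, and overlap shown here are assumptions for illustration, not necessarily the values used in the repository.

```python
import os
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

def load_folder(folder_path: str):
    """Load every file in folder_path and split it into smaller chunks."""
    documents = []
    for name in os.listdir(folder_path):
        path = os.path.join(folder_path, name)
        if os.path.isfile(path):
            documents.extend(TextLoader(path, encoding="utf-8").load())
    # Split long documents into chunks small enough to embed individually.
    splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    return splitter.split_documents(documents)
```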

2. Embeddings and Vectorization

The embeddings module provides an interface to the OpenAI GPT-3.5 Turbo model, allowing text embeddings to be generated for each chunk. The DeepLake class serves as a vector store, enabling efficient storage and retrieval of those text representations.
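A minimal sketch of how the embeddings and the vector store could be wired together with LangChain; the dataset path is a hypothetical placeholder, and upload_data is shown only in outline, not as the repository's exact implementation.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import DeepLake

# Embedding interface backed by OpenAI's embedding endpoint.
embeddings = OpenAIEmbeddings()

# Hypothetical dataset path; the real one depends on your Activeloop account.
DATASET_PATH = "hub://<ACTIVELOOP_USERNAME>/llm-readme-creator"

def upload_data(chunks, dataset_path: str = DATASET_PATH):
    """Embed the document chunks and store them in a DeepLake dataset."""
    return DeepLake.from_documents(chunks, embeddings, dataset_path=dataset_path)
```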

3. Conversational Retrieval Chain

The ConversationalRetrievalChain class combines the language model and the vector store to create a chat-based retrieval system. It utilizes the ChatOpenAI model for generating responses based on user queries and the DeepLake vector store for retrieving relevant documents.
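Assuming LangChain's standard building blocks, the chain could be assembled roughly as follows; the model name, retrieval parameters, and dataset path here are assumptions made for the sketch.

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import DeepLake

# Reopen the existing DeepLake dataset and expose it as a retriever.
db = DeepLake(
    dataset_path="hub://<ACTIVELOOP_USERNAME>/llm-readme-creator",  # placeholder path
    embedding_function=OpenAIEmbeddings(),
    read_only=True,
)
retriever = db.as_retriever()
retriever.search_kwargs["k"] = 4  # number of chunks retrieved per question

# Pair the chat model with the retriever to answer questions over the dataset.
model = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = ConversationalRetrievalChain.from_llm(model, retriever=retriever)
```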

4. Main Execution

The code begins by setting up the required environment variables. It then attempts to load the dataset from an existing location. If the dataset does not exist, it loads documents from a specified folder and creates the dataset using the upload_data function.

Next, the code initializes the retrieval component by creating an instance of the DeepLake class and configuring the retrieval parameters.

Finally, the code sets up the conversational interface, allowing users to interact with the system by asking questions. The chat history is stored and updated with each interaction, and the responses are written to a file named TEST.md.
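Continuing the sketch above (reusing qa_chain from section 3), the interaction loop might look roughly like this; the prompt text and the Markdown layout written to TEST.md are illustrative.

```python
chat_history = []

while True:
    question = input("Ask a question (or type 'exit' to quit): ")
    if question.strip().lower() == "exit":
        break
    # The chain receives the question plus the accumulated history and returns an answer.
    result = qa_chain({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    # Append each question/answer pair to the output file.
    with open("TEST.md", "a", encoding="utf-8") as f:
        f.write(f"## {question}\n\n{result['answer']}\n\n")
```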

Usage

To use the code, follow these steps:

1. Set the necessary environment variables as described in the Setup section.
2. Run the code and provide the path to the folder containing the documents when prompted.
3. Interact with the system by asking questions in the command line.
4. Review the responses, which are stored in the TEST.md file.

Conclusion

This README provides an overview of the project, explaining its purpose, structure, and components. By following the instructions above, you can set up and use the code effectively. Feel free to explore and customize the code according to your specific requirements.
